Now that we are aware of the basic networking concepts, let’s take a deeper look at the main protocol of the web - HTTP.
History of HTTP⌗
The Hypertext Transfer Protocol is an application layer protocol in the TCP/IP protocol family, built for transmitting hypermedia documents for the World Wide Web. It was initiated by Tim Berners-Lee at CERN in 1989; the first documented version (HTTP/0.9) appeared in 1991.
It’s a request-response protocol (typical client-server model). The client (for example the browser) sends a request to the server which sends back a response. It’s a stateless protocol, meaning that the server does not keep any state between two requests.
- 0.9 - 1991 (had no version number at that time)
  - the original, one-line protocol
- 1.0 - 1996
  - version information
  - concept of headers for both request and response
  - status codes
  - not just HTML content (thanks to the `Content-Type` header)
- 1.1 - 1997
  - the first properly standardized version
  - used actively even today, easily extensible
  - connections can be reused
  - chunked responses
  - cache control mechanisms
  - content negotiation
  - more domains from one IP (thanks to the mandatory `Host` header)
- 2.0 - 2015
  - based on SPDY (by Google)
  - binary protocol (instead of the older text based ones)
  - multiplexed, parallel requests can be made over the same connection
  - header compression - many headers are the same/similar for a lot of requests
  - server push
- 3.0 - 2022
  - uses QUIC (designed by Google…) instead of TCP
  - otherwise very similar to HTTP/2
Personal note about newer HTTP versions
For small/medium websites, HTTP/1.1 is pretty much enough. Implementing it (the protocol itself) is pretty easy, you can easily issue simple HTTP calls even by hand (via `telnet`, for example), and a basic HTTP/1.1 server that reports the temperature, for example, can be implemented in a few lines of C code running on a microcontroller.
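To illustrate just how small such a server can be, here is a rough sketch in Python (rather than C, for brevity): a throwaway server that answers any single request with a made-up temperature value, plus a hand-written client request of the kind you could also type into `telnet`. Error handling and request parsing are deliberately omitted.

```python
import socket
import threading

def serve_once(srv):
    # Accept a single connection and answer any request with a
    # made-up temperature reading, speaking plain HTTP/1.1.
    conn, _ = srv.accept()
    conn.recv(1024)  # read (and ignore) the request - fine for a sketch
    body = b"21.5 C\n"
    conn.sendall(
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/plain\r\n"
        b"Content-Length: " + str(len(body)).encode() + b"\r\n"
        b"Connection: close\r\n"
        b"\r\n" + body
    )
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
srv.listen(1)
threading.Thread(target=serve_once, args=(srv,), daemon=True).start()

# The client side: a hand-written HTTP/1.1 request
cli = socket.create_connection(srv.getsockname())
cli.sendall(b"GET /temperature HTTP/1.1\r\nHost: localhost\r\n\r\n")
response = b""
while chunk := cli.recv(4096):
    response += chunk
cli.close()
print(response.decode().splitlines()[0])  # HTTP/1.1 200 OK
```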
HTTP/2 and 3 are, on the other hand, huge beasts, with (basically mandatory, and computation heavy) TLS, header compression, connection multiplexing and so on. You most probably don’t want to implement any part of them for your hobby project - and as I said, you pretty much won’t need them for any low traffic site. Big websites (like Google or Facebook) are the ones who desperately need them.
So if you are in an environment - using standard webserver software, like `nginx` - where HTTP/2 or 3 is available, use it. But if you can only use HTTP/1.1 (on embedded systems, for example), it will still work. And if you want to connect it to the public internet, there’s always the option to put it behind a proxy server that supports HTTP/2+ and strong TLS.
```
┌──────────┐       ┌─────────┐          ┌───────────────────┐
│  Client  │◄─────►│  Proxy  │◄────────►│       Proxy       │
│          │       │         │          │  (load balancer)  │
└──────────┘       └─────────┘          └───────────────────┘
                                          ▲               ▲
                                          │               │
                           ┌────────┐     │               │     ┌────────┐
                           │ Server │◄────┘               └────►│ Server │
                           └────────┘                           └────────┘
```
Client⌗
The user agent - a tool that acts on behalf of the user. In most cases the browser, or other programs performing communication over HTTP.
The client is always the one initiating the request. (Even in the case of HTTP/2 server push, the content pushed by the server is just cached in the browser, and won’t be used until a formal request is performed for that specific resource - it just won’t use the network, but will be served from the cache.)
To display a basic webpage - a hypertext document - the browser sends a request to fetch the HTML document that represents the page. While it parses this document, it discovers other resources (scripts, CSS files, images, videos, …) that need to be fetched too, and retrieves them (when and in what order resources are fetched is a tricky topic in itself…). Later, scripts executed by the browser can fetch more resources, too.
Server⌗
The opposite side of the communication channel, which serves the documents requested by the client. From the client’s point of view the server appears to be a single entity, but it may actually be a collection of servers sharing the load in some way, or other software (like caches) delivering the requested resource.
Also, a single server can host multiple websites, using the `Host` header of HTTP/1.1 and SNI for TLS.
Proxies⌗
Between the client and the server, numerous nodes are relaying the HTTP messages. Due to the layered structure of the internet, most of them operate at some lower (transport, network or physical) layer, like routers and switches, and are thus transparent to the application layer HTTP actors. Those that operate at the application layer are generally called proxies.
Proxies can perform the following actions:
- caching (like browsers do)
- filtering (antivirus scan, parental controls, corporate filters)
- load balancing (to allow multiple servers to share the load)
- terminating HTTPS or downgrading the protocol version (in front of simple HTTP servers)
The HTTP flow⌗
I’m going to use HTTP/1.1 here; the concepts are the same for newer versions, and the optimizations (header compression, QUIC vs TCP, etc.) are basically hidden from the user.
The client opens a TCP connection (3-way handshake) - and builds a secure TLS channel over it if required. This is an “expensive” process, so it’s a good idea to reuse the connection later. Multiple connections can be opened, but most browsers limit this to a maximum of 6 per domain.
After the connection is established, the request is sent by the client to the server. For example:
```
POST /ide/megy/az/adat HTTP/1.1
Host: cica.hu
Accept-Encoding: gzip, deflate, compress
Content-Type: application/x-www-form-urlencoded
Content-Length: 26

alma=12&beka=Beka%20vagyok
```
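To make the line-based structure of such a message explicit, here is a small Python sketch that reassembles the request above from its parts (the helper name is made up):

```python
# Build a raw HTTP/1.1 request: request line, headers, blank line, body.
def build_request(method, path, headers, body=b""):
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers]
    # CRLF after each line, then an empty line separating headers from body
    return ("\r\n".join(lines) + "\r\n\r\n").encode() + body

body = b"alma=12&beka=Beka%20vagyok"
req = build_request(
    "POST", "/ide/megy/az/adat",
    [("Host", "cica.hu"),
     ("Accept-Encoding", "gzip, deflate, compress"),
     ("Content-Type", "application/x-www-form-urlencoded"),
     ("Content-Length", str(len(body)))],
    body,
)
print(req.decode())
```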
Method⌗
Basically verbs with different semantics.

- `GET` - requests a resource, should not send data
- `HEAD` - requests the same resource as `GET`, but without the response body
- `POST` - sends an entity to the specified resource (often causes changes on the server side, not idempotent)
- `PUT` - replaces the current representation of the resource with the payload (idempotent)
- `DELETE` - deletes the specified resource

Less frequently used, but still standardised:

- `CONNECT` - establishes a tunnel to another server (for proxies)
- `OPTIONS` - describes the communication options
- `TRACE` - performs a message loop-back along the path to the target server (through proxies)
- `PATCH` - partial modification of a resource
Path⌗
The resource to fetch. Basically the URL without the protocol, domain, port and hash.
Version⌗
Well, most probably HTTP/1.1 if you can read it ;)
Headers⌗
See HTTP Headers.
Body⌗
The data sent by the client to the server (where it makes sense, e.g. for `POST` or `PUT`, but should be empty for `GET`).
The server sends back a response:
```
HTTP/1.1 200 OK
Date: Wed, 13 May 2015 11:12:13 GMT
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Server: nginx/1.7.1

}P������0������+@`������G<������x2+T���m���b���_oyH���
```
Status code⌗
The status code indicates whether the request was successful or not, and why:

- `1xx` - informational
- `2xx` - success
- `3xx` - redirection
- `4xx` - client error
- `5xx` - server error
Headers⌗
See HTTP Headers.
Body⌗
The resource sent back to the client (optional).
The connection can be closed⌗
… or kept open for further communication.
HTTP headers⌗
The HTTP headers provide additional information about the request/response. A header consists of a case-insensitive name followed by a colon (`:`), then its value.
Headers can be related to (among many other things):

- authentication (`WWW-Authenticate`, `Authorization`)
- caching (`Cache-Control`, `Age`)
- whether the resource has been changed or not (`ETag`, `Last-Modified`)
- content negotiation (`Accept`, `Accept-Encoding`, `Accept-Language`)
- cookies (`Cookie`, `Set-Cookie`)
- CORS (`Access-Control-Allow-Origin` and friends)
- message body information (`Content-Type`, `Content-Length`, `Content-Encoding`)
- proxies (`Via`, `Forwarded`)
- redirects (`Location`)
- request context (`Host`, `User-Agent`, `Referer`)
- response context (`Server`, `Allow`)
- security (`Strict-Transport-Security`, `Content-Security-Policy`)
Also, you can use your own custom headers, too. Earlier it was enough to start them with `X-...`, but since some x-headers have been standardized (like `X-Frame-Options`) or became widely used (`X-Forwarded-For`) in the meantime, you’d better check whether a name is already taken. But of course, something like `X-YourGithubUsername-Whatever` should be pretty safe to use.
Let’s see some features implemented with the help of headers.

Compression⌗
The client lists the compression algorithms it supports:

```
Accept-Encoding: br, gzip
```

The server picks one of them, compresses the response body, and indicates its choice (the `Vary` header tells caches that the response depends on the request’s `Accept-Encoding`):

```
Content-Encoding: gzip
Vary: Accept-Encoding
```
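In code, the server-side compression step and the client-side decompression are roughly this (a Python sketch using the stdlib `gzip` module; the body is made up):

```python
import gzip

# What a server may do when the client sent Accept-Encoding: gzip:
# compress the body and label it with Content-Encoding.
body = b"<html><body>hello hello hello</body></html>" * 50
compressed = gzip.compress(body)
headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
print(len(body), "->", len(compressed))  # repetitive content compresses well

# ...and what the client does on receipt:
assert gzip.decompress(compressed) == body
```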
Authentication⌗
While modern frameworks/websites use more sophisticated/secure methods for authentication, the classic HTTP authentication is still an option, and in some cases it can be useful. For example, URLs can contain the credentials in the form of `https://user:password@host/`.
If the server responds with a `401 Unauthorized` status code and there’s at least one `WWW-Authenticate` header, the client will typically prompt the user for credentials. The `WWW-Authenticate` header usually has a type and a realm (which is the name of the protected area).

Then the client responds including an `Authorization: <type> <credentials>` header.
- `Basic` - base64 encoded credentials
- `Digest` - md5/sha/… hashed credentials (very minimal added security compared to `Basic`)
- `Bearer` - some kind of token (OAuth, JWT)
As with any other kind of authentication on the internet, only provide your credentials over a secure channel - HTTPS in this case!
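For example, a `Basic` credential is nothing more than base64 encoding - not encryption, which is exactly why HTTPS matters here. A sketch (the username and password are obviously made up):

```python
import base64

# Basic auth is just base64("user:password") - encoding, not encryption.
user, password = "alice", "s3cret"
token = base64.b64encode(f"{user}:{password}".encode()).decode()
header = f"Authorization: Basic {token}"
print(header)
```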
Content negotiation⌗
The client can specify some properties of the response which it prefers, using the `Accept-*` request headers.

The response should contain the `Vary` header with the `Accept-*` headers involved, or the server might use the `300 Multiple Choices` or `406 Not Acceptable` response status codes.
```
Accept: <MIME types>
Accept-Encoding: <compression algorithms>
Accept-Language: <language codes>
```
All headers above can contain a list of (optionally) weighted options. For example, if the client prefers German, but understands English (with 0.8 priority) or Hungarian (with 0.4 priority) as well, it can use:

```
Accept-Language: de, en;q=0.8, hu;q=0.4
```
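A server can turn such a weighted list into an ordered preference list. A minimal Python sketch (a real server should use a battle-tested parser for the edge cases):

```python
# Parse a weighted Accept-* header value into (option, q) pairs,
# highest priority first.
def parse_accept_language(value):
    prefs = []
    for part in value.split(","):
        item = part.strip().split(";")
        lang = item[0].strip()
        q = 1.0  # no q parameter means highest priority
        for param in item[1:]:
            key, _, val = param.strip().partition("=")
            if key == "q":
                q = float(val)
        prefs.append((lang, q))
    return sorted(prefs, key=lambda p: p[1], reverse=True)

print(parse_accept_language("de, en;q=0.8, hu;q=0.4"))
# [('de', 1.0), ('en', 0.8), ('hu', 0.4)]
```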
Caching⌗
To save network bandwidth and server resources, caching content that rarely changes in the user agent is pretty useful. But most content does change sometimes, so it can be hard to decide what to cache and for how long.
Content can be cached at multiple places:

- the browser cache
  - it has a size limit
  - users can turn it off
  - some (especially older) browsers might misunderstand cache related headers
- caching proxies, CDNs
  - users cannot turn them off
  - they can do other kinds of fancy stuff (like image resizing/recompression, minifying resources, …)
  - if something is cached here that shouldn’t be, it can cause tricky, hard to discover issues
So properly controlling the caching of our resources is an important thing.
Typical cacheable content⌗
Browsers tend to cache these responses even without any special header:
- simple documents, images that rarely change (`GET` => `200 OK`)
- permanent redirects (`GET` => `301`, `308`)
- some error responses (`GET` => `404`)
The `Cache-Control` header can contain directives like:

- `no-store, no-cache, must-revalidate` - don’t cache at all
- `no-cache` - the browser might store it for a short period (for back navigation), but requests validation before using the cached copy
- `private` - single user only; the browser can cache it, proxies shouldn’t
- `public` - can be stored in a public cache (proxy, CDN)
- `must-revalidate` - can be cached, but must be validated before use
- `max-age=<seconds>` - can be stored for the given time

Older mechanisms:

- the `Expires` header with a date (which is interpreted against the client’s clock, so it’s not very reliable)
- the `Pragma` header was for HTTP/1.0, don’t use it anymore
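The `max-age` logic itself is trivial; a cache roughly does this (a simplified Python sketch - real caches also account for `Age`, `Date`, heuristics, etc.):

```python
import time

# Is a cached response still fresh under max-age?
def is_fresh(stored_at, max_age, now=None):
    now = time.time() if now is None else now
    return (now - stored_at) < max_age

t0 = 1_000_000.0  # arbitrary example timestamp
print(is_fresh(t0, 3600, now=t0 + 100))   # True: 100 s old, 1 h allowed
print(is_fresh(t0, 3600, now=t0 + 7200))  # False: 2 h old, must refetch/revalidate
```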
Validation⌗

- `ETag` - “strong” validator
  - the server sends an `ETag` header with an opaque identifier of the resource’s current version
  - the client uses `If-None-Match` with the cached `ETag` value
- `Last-Modified` - “weak” validator
  - the server sends a `Last-Modified` header with a date
  - the client uses `If-Modified-Since` with the cached date
If the content wasn’t modified (so the cached data can be used), the server responds with a `304 Not Modified` response. Otherwise a normal `200 OK` response is sent back with the updated content (and with updated validation related headers).
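Server-side, revalidation can be sketched like this (deriving the `ETag` from a content hash is just one possible, assumed strategy):

```python
import hashlib

# Compare the client's If-None-Match value against the current ETag.
def handle_get(content, if_none_match=None):
    etag = '"' + hashlib.sha256(content).hexdigest()[:16] + '"'
    if if_none_match == etag:
        return 304, etag, b""        # client's cached copy is still valid
    return 200, etag, content        # fresh content plus the new ETag

status1, etag, _ = handle_get(b"<h1>hello</h1>")        # first request
status2, _, _ = handle_get(b"<h1>hello</h1>", etag)     # revalidation
print(status1, status2)  # 200 304
```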
If the server uses the `Vary` header, the content for the given resource can differ based on the headers listed in `Vary`, so the client must make those request headers part of the cache key.
Cookies⌗
The server can send cookies via the `Set-Cookie` header; these are stored by the browser and sent back in the `Cookie` header of subsequent requests. One `Set-Cookie` header can set one cookie, but a response can have multiple such headers:
```
Set-Cookie: <name>=<value>; [directives separated by ;]
Set-Cookie: <name2>=<value2>; [directives]
```
A request can only contain one `Cookie` header, but it can have multiple cookies separated by semicolons:

```
Cookie: name1=value1; name2=value2
```
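Python’s stdlib can parse such a header, for example:

```python
from http.cookies import SimpleCookie  # stdlib cookie parser

# Parse the Cookie request header from the example above
cookie = SimpleCookie()
cookie.load("name1=value1; name2=value2")
print({name: morsel.value for name, morsel in cookie.items()})
# {'name1': 'value1', 'name2': 'value2'}
```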
The most important directives:

- `Expires=<date>`
  - the cookie expires on the given date, the client should remove it afterwards
  - if not set, the cookie will have the lifetime of the session (it’s deleted when the tab is closed)
- `Max-Age=<seconds>`
  - similar to `Expires`, but has higher priority
- `Domain=<domain>`
  - if not set, the current domain is used (subdomains not included)
  - if set, subdomains are included
- `Path=<path>`
  - the root of the path where the cookie is sent
- `Secure`
  - the cookie is only sent over secure (HTTPS) connections
- `SameSite=<value>` - some kind of CSRF protection
  - `Strict`: the cookie is only sent with requests initiated by the cookie’s origin site
  - `Lax`: (default) the cookie is also sent when the user navigates to the cookie’s origin (from an external site)
  - `None`: the cookie is sent on cross-site requests as well (but only in secure contexts, so `Secure` should be set too)
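These directives can also be generated with stdlib tools; for example (the cookie name and values are made up):

```python
from http.cookies import SimpleCookie

# Build a Set-Cookie header value with some of the directives above
c = SimpleCookie()
c["session"] = "abc123"
c["session"]["max-age"] = 3600
c["session"]["path"] = "/"
c["session"]["secure"] = True
c["session"]["samesite"] = "Strict"  # supported since Python 3.8
print(c["session"].OutputString())
```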
Security⌗
Security related headers include:

- CORS related headers (`Access-Control-Allow-Origin` and friends)
- `Content-Security-Policy`
  - to prevent XSS
- `Public-Key-Pins` (HPKP, now deprecated)
  - tells the client to trust only the given TLS public key, to prevent MITM attacks
- `Strict-Transport-Security` (HSTS)
  - tells the client that it should use a secure channel (HTTPS) to communicate with this server
  - prevents some MITM attacks with HTTPS termination, or attacks hijacking the HTTP->HTTPS redirects
- `X-Content-Type-Options: nosniff`
  - blocks requests with `script` destinations if the content type of the resource is not a JavaScript MIME type
  - blocks requests with `style` destinations if the content type is not `text/css`