Now that we are aware of the basic networking concepts, let’s take a deeper look at the main protocol of the web - HTTP.

History of HTTP

The Hypertext Transfer Protocol is an application layer protocol in the Internet protocol suite, built for transmitting hypermedia documents for the World Wide Web. Its development was initiated by Tim Berners-Lee at CERN in 1989.

It’s a request-response protocol (typical client-server model). The client (for example the browser) sends a request to the server which sends back a response. It’s a stateless protocol, meaning that the server does not keep any state between two requests.

Versions:

  • 0.9 (had no version number at the time) - 1991
    • the original one-line protocol: GET /something.html
  • 1.0 - 1996
    • version information
    • concept of headers for both request and response
    • status code
    • not just HTML content (thanks to the Content-Type header)
  • 1.1 - 1997
    • the first properly standardized version
    • used actively even today, easily extensible
    • connections can be reused
    • chunked responses
    • cache control mechanisms
    • content negotiation
    • multiple domains on one IP address (thanks to the mandatory Host header)
  • 2.0 - 2015
    • based on SPDY (by Google)
    • binary protocol (instead of the older text-based ones)
    • multiplexed, parallel requests can be made over the same connection
    • header compression - many headers are the same/similar for a lot of requests
    • server push
  • 3.0 - 2022
    • uses QUIC (also designed by Google…) instead of TCP
    • otherwise very similar to HTTP/2

Personal note about newer HTTP versions

For small/medium websites, HTTP/1.1 is pretty much enough. Implementing the protocol itself is pretty easy: you can issue simple HTTP calls even by hand (via telnet for example), and a basic HTTP/1.1 server that reports, say, a temperature reading can be implemented in a few lines of C code running on a microcontroller.
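
To illustrate, here is a minimal sketch of such a server in C. It assumes a POSIX socket API (a real microcontroller would use its vendor's TCP/IP stack instead, but the HTTP part stays the same), and read_temperature() is a made-up stand-in for the actual sensor read:

/* Minimal HTTP/1.1 "temperature server" sketch - POSIX sockets assumed,
 * error handling kept to the bare minimum. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

static float read_temperature(void) { return 21.5f; }  /* hypothetical sensor read */

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 4);

    for (;;) {
        int cli = accept(srv, NULL, NULL);
        if (cli < 0) continue;

        char req[1024];
        read(cli, req, sizeof(req) - 1);  /* read (a prefix of) the request, then ignore it */

        char body[64], resp[256];
        int blen = snprintf(body, sizeof(body), "temperature: %.1f C\n", read_temperature());
        int rlen = snprintf(resp, sizeof(resp),
                            "HTTP/1.1 200 OK\r\n"
                            "Content-Type: text/plain\r\n"
                            "Content-Length: %d\r\n"
                            "Connection: close\r\n"
                            "\r\n%s", blen, body);
        write(cli, resp, rlen);
        close(cli);
    }
}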

HTTP/2 and 3, on the other hand, are huge beasts, with (basically mandatory, and computationally heavy) TLS, header compression, connection multiplexing and so on. You most probably don’t want to implement even a minor part of them for your hobby project - and as I said, you pretty much won’t need them for any low traffic site. Big websites (like Google or Facebook) are the ones who desperately need them.

So if you are in an environment - using standard webserver software, like nginx - where HTTP/2 or 3 is available, use it. But if you can only use HTTP/1.1, for example on embedded systems, it will still work. And if you want to connect such a device to the public internet, there’s always the option to put it behind a proxy server that supports HTTP/2+ and strong TLS.
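
For example, with nginx as the fronting proxy, a minimal configuration sketch could look roughly like this (the domain, the certificate paths and the backend address are placeholders):

server {
    listen 443 ssl http2;                          # HTTP/2 + TLS towards the clients
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.pem;
    ssl_certificate_key /etc/ssl/example.com.key;

    location / {
        proxy_pass http://192.0.2.10:8080;         # the plain HTTP/1.1 backend
        proxy_http_version 1.1;
        proxy_set_header Host $host;
    }
}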

Actors

┌──────────┐       ┌─────────┐          ┌───────────────────┐
│  Client  │◄─────►│  Proxy  │◄────────►│  Proxy            │
│          │       │         │          │  (load balancer)  │
└──────────┘       └─────────┘          └───────────────────┘
                                             ▲     ▲
                                             │     │
                               ┌────────┐    │     │  ┌────────┐
                               │ Server │◄───┘     └─►│ Server │
                               └────────┘             └────────┘

The Client

The user agent - a tool that acts on behalf of the user. In most cases this is the browser, but it can be any program communicating over HTTP.

The client is always the one initiating the request. (Even in case of HTTP/2 server push, the content pushed by the server is just cached in the browser, and won’t be used until a formal request is performed to that specific resource - it just won’t use the network, but will be served from the cache.)

To display a basic webpage - a hypertext document - the browser sends a request to fetch the HTML document that represents the page. While it parses this document, it discovers other resources (scripts, CSS files, images, videos, …) that need to be fetched too, and retrieves them (when, what and in which order resources are fetched is a tricky topic on its own…). Later, scripts executed by the browser can fetch more resources, too.

The Server

The opposite side of the communication channel, which serves the documents requested by the client. From the client’s point of view the server appears to be a single entity, but it may actually be a collection of servers sharing the load in some way, or other software (like caches) delivering the requested resource.

Also a single server can host multiple websites using the Host header of HTTP/1.1 and SNI for TLS.

Proxies

Between the client and the server, numerous nodes relay the HTTP messages. Due to the layered structure of the internet, most of them operate at some lower (transport, network or physical) layer, like routers and switches, and are thus transparent to the application layer HTTP actors. Those that operate in the application layer are generally called proxies.

Proxies can perform the following actions:

  • caching (like the browsers)
  • filtering (antivirus scan, parental controls, corporate filters)
  • load balancing (to allow multiple servers to share the load)
  • HTTPS termination or version downgrading (so that simple, HTTP-only backend servers can still be used)
  • authentication
  • logging

The HTTP flow

I’m going to use HTTP/1.1 here; the concepts are the same for newer versions, and the optimizations (header compression, QUIC vs TCP, etc.) are basically hidden from the user.

TCP connection

The client opens a TCP connection (3-way handshake) - and builds the secure TLS channel over it if required. This is an “expensive” process, so it is a good idea to reuse this connection later. Multiple connections can be opened, but most browsers limit this to a maximum of 6 per domain.
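
For example, the whole flow can be performed by hand (assuming a plain HTTP server listens on port 80 of example.com):

$ telnet example.com 80
GET / HTTP/1.1
Host: example.com
Connection: close

HTTP/1.1 200 OK
Content-Type: text/html
...

For HTTPS, openssl s_client -connect example.com:443 can play the role of telnet.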

The request

After the connection is established, the request is sent by the client to the server. For example:

POST /ide/megy/az/adat HTTP/1.1
Host: cica.hu
Accept-Encoding: gzip, deflate, compress
Content-Type: application/x-www-form-urlencoded
Content-Length: 26

alma=12&beka=Beka%20vagyok

Method

Basically verbs with different semantics.

  • GET - requests a resource, should not send data
  • HEAD - requests the same resource as GET, but without the response body
  • POST - sends an entity to the specified resource (often causes a change on the server side; not idempotent)
  • PUT - replaces the current representation of the resource with the payload (idempotent)
  • DELETE - deletes the specified resource

Less frequently used, but still standardised:

  • CONNECT - establishes a tunnel to another server (for proxies)
  • OPTIONS - describes the communication options
  • TRACE - performs a message loop-back test along the path to the target server (through proxies)
  • PATCH - partial modification on a resource

Path

The resource to fetch. Basically the URL without the protocol, domain, port and hash.
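
For example, for https://example.com:8080/search?q=http#results the request line contains only /search?q=http - the query string is included, but the fragment is never sent to the server.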

HTTP version

Well, most probably HTTP/1.1 if you can read it ;)

Headers

See HTTP Headers.

Body

The data sent by the client to the server (where it makes sense, for POST or PUT, but should be empty for GET or HEAD).

The response

The server sends back a response:

HTTP/1.1 200 OK
Date: Wed, 13 May 2015 11:12:13 GMT
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Server: nginx/1.7.1

<binary, gzip-compressed response body>

HTTP version

HTTP/1.1

Status code

The status code indicates whether the request was successful or not, and why:

  • 1xx - informational
  • 2xx - success
  • 3xx - redirection
  • 4xx - client error
  • 5xx - server error
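
A few common examples: 200 OK, 301 Moved Permanently, 304 Not Modified, 403 Forbidden, 404 Not Found, 500 Internal Server Error.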

Headers

See HTTP Headers.

Body

The resource sent back to the client (optional).

The connection can be closed

… or kept open for further communication.

HTTP Headers

The HTTP headers provide additional information about the request/response. A header consists of a case-insensitive name followed by a colon (:), then its value.

Headers can be related to (among many other things):

  • authentication (WWW-Authenticate, Authorization, …)
  • caching (Expires, Cache-Control, …)
  • whether the resource has been changed or not (If-Modified-Since/Last-Modified, If-Match/ETag, …)
  • content negotiation (Accept, Accept-Encoding, …)
  • cookies (Cookie, Set-Cookie)
  • CORS (Origin, Access-Control-...)
  • message body information (Content-Type, Content-Length, Content-Encoding, …)
  • proxies (Via, Forwarded, …)
  • redirects (Location)
  • request context (Host, User-Agent, …)
  • response context (Server, Allow, …)
  • security (Content-Security-Policy, …)

You can use your own custom headers, too. Earlier the convention was to start them with X-..., but since some X- headers have been standardized (like X-Frame-Options) or became widely used (X-Forwarded-For) in the meantime, you’d better check whether the name is already taken. But of course, something like X-YourGithubUsername-Whatever should be pretty safe to use.

Let’s see some features implemented with the help of headers.

Data compression

Request:

Accept-Encoding: br, gzip

Response:

Content-Encoding: gzip
Vary: Accept-Encoding

HTTP authentication

While modern frameworks/websites use more sophisticated/secure methods for authentication, the classic HTTP authentication is still an option, and in some cases it can be useful. For example URLs can contain the credentials in the form of https://username:password@www.example.com/.

If the server responds with 401 Unauthorized status code and there’s at least one WWW-Authenticate header, the client will typically prompt the user for credentials. The WWW-Authenticate header usually has a type and a realm (which is the name of the protected area).

Then the client responds including an Authorization: <type> <credential> header.
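
For example, a Basic authentication exchange could look like this (the credential is the base64 encoding of the made-up user:passwd pair):

Request:

GET /secret/ HTTP/1.1
Host: example.com

Response:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Secret area"

Request (retried with credentials):

GET /secret/ HTTP/1.1
Host: example.com
Authorization: Basic dXNlcjpwYXNzd2Q=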

Authentication types

  • Basic - base64 encoded credentials
  • Digest - md5/sha/… hashed credentials (adds very little security compared to Basic)
  • Bearer - some kind of token (OAuth, JWT)

As with any other kind of authentication on the internet, only provide your credentials over a secure channel - HTTPS in this case!

Content negotiation

The client can specify some properties of the response which it prefers, using the Accept-* headers.

The response should contain the Vary header listing the Accept-* headers involved, or the server might use the 300 Multiple Choices or 406 Not Acceptable status codes.

Some Accept headers

  • Accept: <mime-type>/<mime-subtype>
  • Accept-Encoding: <encoding>
  • Accept-Language: <language code>

All the headers above can contain a list of (optionally) weighted options. For example, if the client prefers English, but also understands German (with 0.8 priority) or Hungarian (with 0.4 priority), it can use:

Accept-Language: en, de;q=0.8, hu;q=0.4
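
The server then indicates the chosen variant in the response, for example:

Content-Language: en
Vary: Accept-Language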

Caching

To save network bandwidth and server resources, caching content that rarely changes in the user agent is pretty useful. But most content does change sometimes, so it can be hard to decide what to cache and for how long.

Cache levels

  • the browser cache
    • it has a size limit
    • users can turn it off
    • some (especially older) browsers might misunderstand cache related headers
  • caching proxies, CDNs
    • users cannot turn them off
    • they can do other kinds of fancy stuff (like image resizing/recompression, minifying resources, …)
    • if something is cached here that shouldn’t be, it can cause tricky, hard-to-discover issues

So properly controlling the caching of our resources is an important thing.

Typical cacheable content

Browsers tend to cache these responses even without any special header:

  • simple documents, images that rarely change (GET => 200 OK)
  • permanent redirects (GET => 301, 308)
  • some error responses (GET => 404)

Cache-Control header

  • no-store, no-cache, must-revalidate - the combination used to prevent caching entirely
  • no-cache - the browser might still store this for a short period (for back navigation), but it revalidates with the server before using the cached copy
  • private - single user only, browser can cache it, proxies shouldn’t
  • public - can be stored in a public cache (proxy, CDN)
  • must-revalidate - can be cached, with validation
  • max-age=<seconds> - can be stored for the given time
  • the Expires header with a date (which is evaluated against the client’s clock, so not very reliable)

The old Pragma header was for HTTP/1.0; don’t use it anymore.
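
For example, a static asset that any cache may store for a day, versus a resource that must be revalidated on every use:

Cache-Control: public, max-age=86400

Cache-Control: no-cache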

Validation

  • ETags
    • “strong” validator
    • server sends ETag header
    • client uses If-None-Match with the cached ETag value
  • Last-Modified
    • “weak” validator
    • server sends Last-Modified header with date
    • client uses If-Modified-Since with the cached date

If the content wasn’t modified (so the cached data can be used), the server responds with a 304 Not Modified response. Otherwise a normal 200 OK response is sent back with the updated content (and with updated validation related headers).
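
An example exchange with an ETag (the value is made up):

Response:

HTTP/1.1 200 OK
ETag: "33a64df5"

Conditional request later:

GET /style.css HTTP/1.1
Host: example.com
If-None-Match: "33a64df5"

Response, if the resource hasn’t changed:

HTTP/1.1 304 Not Modified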

Vary

If the server uses the Vary header, it means that the content of the given resource can differ based on the request headers listed in Vary, so caches must take those headers into account when storing and matching responses.

Cookies

The server can send cookies via the Set-Cookie header; they are stored by the browser and sent back in the Cookie header with subsequent requests.

One Set-Cookie header can set one cookie, but a response can have multiple such headers:

Set-Cookie: <name>=<value>; [directives separated by ;]
Set-Cookie: <name2>=<value2>; [directives]

A request can only contain one Cookie header, but it can have multiple cookies separated by semicolons:

Cookie: name1=value1; name2=value2

Directives

  • Expires=<date>
    • the cookie expires on date, the client should remove it afterwards
    • if not set, the cookie will have the lifetime of the session (it’s deleted when the browser session ends)
  • Max-Age=<seconds>
    • similar to Expires, but has higher priority
  • Domain=<domain>
    • if not set, current domain is used (subdomains not included)
    • if set, subdomains are included
  • Path=<path>
    • the root of the path where the cookie is sent
  • Secure
    • the cookie is only sent over secure (HTTPS) connection
  • HttpOnly
    • not accessible via JavaScript (through the document.cookie API)
  • SameSite=<ssoption>
    • some kind of CSRF protection
    • Strict: cookie is only sent with requests initiated by the cookie’s origin site.
    • Lax: (default) cookie is also sent when the user navigates to the cookie’s origin (from an external site).
    • None: cookies are sent on cross-site requests as well (but only in secure contexts, so Secure should be set too).
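
Putting a few of these together, a session cookie could for example be set like this (the name and value are made up):

Set-Cookie: sessionid=38afes7a8; Max-Age=3600; Path=/; Secure; HttpOnly; SameSite=Lax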

Security

  • CORS related headers (Access-Control-...)
  • CSP
    • to prevent XSS
  • HPKP
    • tells the client to trust only the given TLS public key, to prevent MITM attacks (deprecated, no longer supported by major browsers)
  • HSTS
    • tells the client that it should use secure channel (HTTPS) to communicate with this server
    • prevents some MITM attacks based on stripping/terminating HTTPS, or on hijacking the HTTP->HTTPS redirects
  • X-Content-Type-Options: nosniff
    • blocks requests with style and script destinations if the declared content type of the resource is not text/css or a JavaScript MIME type, respectively (i.e. it disables MIME type sniffing)
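
A few example response headers (the values are just illustrations):

Strict-Transport-Security: max-age=31536000; includeSubDomains
Content-Security-Policy: default-src 'self'
X-Content-Type-Options: nosniff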