Web Basics - HTTP
Now that we are aware of the basic networking concepts, let’s take a deeper look at the main protocol of the web - HTTP.
History of HTTP⌗
The Hypertext Transfer Protocol is an application layer protocol in the IP protocol family, built for transmitting hypermedia documents for the World Wide Web. Its first version was developed by Tim Berners-Lee at CERN in 1989.
It’s a request-response protocol (typical client-server model). The client (for example the browser) sends a request to the server which sends back a response. It’s a stateless protocol, meaning that the server does not keep any state between two requests.
Versions:
- 0.9 (had no version number at the time) - 1989
  - the original, one-line protocol:
    `GET /something.html`
- 1.0 - 1996
  - version information
  - concept of headers for both request and response
  - status codes
  - not just HTML content (thanks to the `Content-Type` header)
- 1.1 - 1997
  - the first properly standardized version
  - actively used even today, easily extensible
  - connections can be reused
  - chunked responses
  - cache control mechanisms
  - content negotiation
  - more domains from one IP (thanks to the mandatory `Host` header)
- 2.0 - 2015
  - based on SPDY (by Google)
  - binary protocol (instead of the older text-based ones)
  - multiplexed: parallel requests can be made over the same connection
  - header compression - many headers are the same/similar across requests
  - server push
- 3.0
  - uses QUIC (designed by Google…) instead of TCP
  - otherwise very similar to HTTP/2
Personal note about newer HTTP versions
For small/medium websites, HTTP/1.1 is pretty much enough. Implementing the protocol itself is pretty easy: you can issue simple HTTP calls even by hand (via `telnet` for example), and a basic HTTP/1.1 server that reports, say, a temperature can be implemented in a few lines of C code running on a microcontroller.
HTTP/2 and 3 are, on the other hand, huge beasts, with (basically mandatory, and computation-heavy) TLS, header compression, connection multiplexing and so on. You most probably don’t want to implement even a minor part of them for your hobby project - and as I said, you pretty much won’t need them for any low-traffic site. Big websites (like Google or Facebook) are the ones that desperately need them.
So if you are in an environment - using standard webserver software, like `nginx` - where HTTP/2 or 3 is available, use it. But if you can only use HTTP/1.1 (on embedded systems, for example), it will still work. And if you want to connect such a device to the public internet, there’s always the option to put it behind a proxy server that supports HTTP/2+ and strong TLS.
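To give a feeling for how small such a server can be, here is a minimal sketch in Python (the note above talks about C on a microcontroller; this is just an analogous toy example, and the reported temperature is made up):

```python
# Minimal HTTP/1.1 server sketch: accepts a connection, ignores the request
# details and always answers with a tiny plain-text body.
import socket

def serve(port=8080):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        conn.recv(4096)                      # read (and ignore) the request
        body = b"temperature: 21.5 C\n"      # made-up measurement
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n"
            b"\r\n" + body
        )
        conn.close()

if __name__ == "__main__":
    serve()
```

A real implementation would at least parse the request line and handle errors, but the point stands: the protocol itself is just a little bit of text.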
Actors⌗
┌──────────┐       ┌─────────┐          ┌───────────────────┐
│  Client  │◄─────►│  Proxy  │◄────────►│       Proxy       │
│          │       │         │          │  (load balancer)  │
└──────────┘       └─────────┘          └───────────────────┘
                                             ▲           ▲
                                             │           │
                               ┌────────┐    │           │  ┌────────┐
                               │ Server │◄───┘           └─►│ Server │
                               └────────┘                   └────────┘
The Client⌗
The user agent - a tool that acts on behalf of the user. In most cases this is the browser, but it can be any other program communicating over HTTP.
The client is always the one initiating the request. (Even in case of HTTP/2 server push, the content pushed by the server is just cached in the browser, and won’t be used until a formal request is performed to that specific resource - it just won’t use the network, but will be served from the cache.)
To display a basic webpage - a hypertext document - the browser sends a request to fetch the HTML document that represents the page. While parsing this document, it discovers other resources (scripts, CSS files, images, videos, …) that need to be fetched too, and retrieves them (when and in what order the resources are fetched is a tricky topic on its own…). Later, scripts executed by the browser can fetch more resources, too.
The Server⌗
The opposite side of the communication channel, which serves the documents requested by the client. From the client’s point of view the server appears to be a single entity, but it may actually be a collection of servers sharing the load, or other software (like caches) delivering the requested resource.
Also, a single server can host multiple websites, using the `Host` header of HTTP/1.1 and SNI for TLS.
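On the wire this simply means that requests to the same address carry different `Host` headers. A hedged sketch (the IP address and host names are made up for illustration; for HTTPS the TLS SNI extension plays the analogous role):

```python
# Two plain-HTTP requests to the same IP, distinguished only by the Host header.
import http.client

for host in ("site-a.example", "site-b.example"):          # hypothetical names
    conn = http.client.HTTPConnection("203.0.113.10", 80)  # hypothetical shared IP
    conn.request("GET", "/", headers={"Host": host})        # overrides the automatic Host
    resp = conn.getresponse()
    print(host, resp.status, resp.getheader("Content-Type"))
    conn.close()
```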
Proxies⌗
Between the client and the server, numerous nodes are relaying the HTTP messages. Due to the layered structure of the internet, most of them operate at some lower (transport, network or physical) layer, like routers and switches, thus being transparent to the application-layer HTTP actors. Those that operate in the application layer are generally called proxies.
Proxies can perform the following actions:
- caching (like the browsers)
- filtering (antivirus scan, parental controls, corporate filters)
- load balancing (to let multiple servers share the load)
- HTTPS termination or version downgrade (for simple, plain-HTTP backend servers)
- authentication
- logging
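To make the role of an application-layer proxy concrete, here is a minimal sketch of a reverse proxy in Python that just forwards GET requests to a single upstream backend (the upstream address is hypothetical); real proxies like nginx add caching, TLS termination, load balancing and much more on top of this idea:

```python
# Minimal reverse-proxy sketch: forward GET requests to one upstream backend
# and relay a simplified version of the response back to the client.
from http.server import BaseHTTPRequestHandler, HTTPServer
import http.client

UPSTREAM_HOST, UPSTREAM_PORT = "127.0.0.1", 8080        # hypothetical backend

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        upstream = http.client.HTTPConnection(UPSTREAM_HOST, UPSTREAM_PORT)
        upstream.request(
            "GET", self.path,
            headers={"Host": self.headers.get("Host", UPSTREAM_HOST)},
        )
        resp = upstream.getresponse()
        body = resp.read()

        self.send_response(resp.status)
        self.send_header("Content-Length", str(len(body)))
        content_type = resp.getheader("Content-Type")
        if content_type:
            self.send_header("Content-Type", content_type)
        self.end_headers()
        self.wfile.write(body)
        upstream.close()

if __name__ == "__main__":
    HTTPServer(("", 8000), ProxyHandler).serve_forever()
```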
The HTTP flow⌗
I’m going to use HTTP/1.1 here; the concepts are the same for newer versions, and the optimizations (header compression, QUIC vs TCP, etc.) are basically hidden from the user.
TCP connection⌗
The client opens a TCP connection (3-way handshake) - and builds the secure TLS channel over it if required. This is an “expensive” process, so it is a good idea to reuse the connection later. Multiple connections can be opened, but most browsers limit this to a maximum of 6 per domain.
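A quick sketch of this reuse with Python’s standard library: the TCP + TLS handshake happens when the connection is first used, and - as long as the server allows keep-alive - both requests below travel over that same connection (example.com and the second path are placeholders):

```python
# Two requests over one TCP + TLS connection (HTTP/1.1 keep-alive).
import http.client

conn = http.client.HTTPSConnection("example.com")   # handshake happens on first use

conn.request("GET", "/")
first = conn.getresponse()
first.read()                    # consume the body before reusing the connection

conn.request("GET", "/about")   # hypothetical path; reuses the connection if kept alive,
second = conn.getresponse()     # otherwise http.client transparently reconnects
print(first.status, second.status)

conn.close()
```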
The request⌗
After the connection is established, the request is sent by the client to the server. For example:
POST /ide/megy/az/adat HTTP/1.1
Host: cica.hu
Accept-Encoding: gzip, deflate, compress
Content-Type: application/x-www-form-urlencoded
Content-Length: 26
alma=12&beka=Beka%20vagyok
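It really is just text over a TCP connection - the following sketch sends (almost) exactly the request above using a raw socket. A `Connection: close` header is added so the read loop terminates; cica.hu is just the example host used in this document, so don’t expect a meaningful answer:

```python
# Send the example POST request above as raw bytes and print the raw response.
import socket

request = (
    "POST /ide/megy/az/adat HTTP/1.1\r\n"
    "Host: cica.hu\r\n"
    "Accept-Encoding: gzip, deflate, compress\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 26\r\n"
    "Connection: close\r\n"
    "\r\n"
    "alma=12&beka=Beka%20vagyok"
)

with socket.create_connection(("cica.hu", 80)) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

print(response.decode("latin-1", errors="replace"))
```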
Method⌗
Basically verbs with different semantics.
- `GET` - requests a resource, should not send data
- `HEAD` - requests the same resource as `GET`, but without the response body
- `POST` - sends an entity to the specified resource (often causes a change on the server side, not idempotent)
- `PUT` - replaces the current representation of the resource with the payload (idempotent)
- `DELETE` - deletes the specified resource

Less frequently used, but still standardized:

- `CONNECT` - establishes a tunnel to another server (for proxies)
- `OPTIONS` - describes the communication options
- `TRACE` - basically traces the path to the target server (through proxies)
- `PATCH` - applies a partial modification to a resource
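A small sketch of the `GET` vs `HEAD` difference with Python’s standard library (example.com is a placeholder): both return the same status and headers, but the `HEAD` response has no body.

```python
# Compare GET and HEAD for the same resource.
import http.client

conn = http.client.HTTPSConnection("example.com")

conn.request("GET", "/")
get_resp = conn.getresponse()
get_body = get_resp.read()

conn.request("HEAD", "/")
head_resp = conn.getresponse()
head_body = head_resp.read()              # always empty for HEAD

print(get_resp.status, len(get_body))     # e.g. 200 and a non-empty body
print(head_resp.status, len(head_body))   # e.g. 200 and 0 bytes

conn.close()
```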
Path⌗
The resource to fetch. Basically the URL without the protocol, domain, port and hash.
HTTP version⌗
Well, most probably HTTP/1.1 if you can read it ;)
Headers⌗
See HTTP Headers.
Body⌗
The data sent by the client to the server (where it makes sense, for `POST` or `PUT`, but should be empty for `GET` or `HEAD`).
The response⌗
The server sends back a response:
HTTP/1.1 200 OK
Date: Wed, 13 May 2015 11:12:13 GMT
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Server: nginx/1.7.1
}P������0������+@`������G<������x2+T���m���b���_oyH���
HTTP version⌗
HTTP/1.1
Status code⌗
The status code indicates whether the request was successful or not, and why:

- `1xx` - informational
- `2xx` - success
- `3xx` - redirection
- `4xx` - client error
- `5xx` - server error
Headers⌗
See HTTP Headers.
Body⌗
The resource sent back to the client (optional).
The connection can be closed⌗
… or kept open for further communication.
HTTP Headers⌗
The HTTP headers provide additional information about the request/response. A header consists of a case-insensitive name followed by a colon (`:`), then its value.
Headers can be related to (among many other things):
- authentication (`WWW-Authenticate`, `Authorization`, …)
- caching (`Expires`, `Cache-Control`, …)
- whether the resource has been changed or not (`If-Modified-Since`/`Last-Modified`, `If-Match`/`ETag`, …)
- content negotiation (`Accept`, `Accept-Encoding`, …)
- cookies (`Cookie`, `Set-Cookie`)
- CORS (`Origin`, `Access-Control-...`)
- message body information (`Content-Type`, `Content-Length`, `Content-Encoding`, …)
- proxies (`Via`, `Forwarded`, …)
- redirects (`Location`)
- request context (`Host`, `User-Agent`, …)
- response context (`Server`, `Allow`, …)
- security (`Content-Security-Policy`, …)
- …
You can also use your own custom headers. Earlier it was enough to start them with `X-...`, but since some X- headers have been standardized (like `X-Frame-Options`) or became widely used (`X-Forwarded-For`) in the meantime, you’d better check whether the name is already taken. But of course, something like `X-YourGithubUsername-Whatever` should be pretty safe to use.
Let’s see some features implemented with the help of headers.
Data compression⌗
Request:
Accept-Encoding: br, gzip
Response:
Content-Encoding: gzip
Vary: Accept-Encoding
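The same negotiation from a client’s perspective, as a hedged sketch using only the Python standard library (example.com is a placeholder, and the server is free to ignore the requested encoding):

```python
# Ask for a gzip-compressed response and decompress it manually.
import gzip
import urllib.request

req = urllib.request.Request(
    "https://example.com/",
    headers={"Accept-Encoding": "gzip"},
)

with urllib.request.urlopen(req) as resp:
    raw = resp.read()
    if resp.headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(raw)
    else:
        body = raw                      # the server chose not to compress

print(len(raw), "bytes on the wire,", len(body), "bytes decompressed")
```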
HTTP authentication⌗
While modern frameworks/websites use more sophisticated/secure methods for authentication, the classic HTTP authentication is still an option, and in some cases it can be useful. For example, URLs can contain the credentials in the form of `https://username:password@www.example.com/`.

If the server responds with a `401 Unauthorized` status code and there’s at least one `WWW-Authenticate` header, the client will typically prompt the user for credentials. The `WWW-Authenticate` header usually has a type and a realm (which is the name of the protected area).

Then the client responds including an `Authorization: <type> <credential>` header.
Authentication types⌗
- `Basic` - base64 encoded credentials
- `Digest` - md5/sha/… hashed credentials (very little added security compared to Basic)
- `Bearer` - some kind of token (OAuth, JWT)
As with any other kind of authentication on the internet, only provide your credentials over a secure channel - HTTPS in this case!
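A minimal sketch of the `Basic` scheme from the client side - it really is just a base64-encoded `username:password` pair in a header (the URL and the credentials below are made up):

```python
# Build and send a Basic Authorization header by hand.
import base64
import urllib.error
import urllib.request

username, password = "alice", "s3cret"           # hypothetical credentials
token = base64.b64encode(f"{username}:{password}".encode()).decode()

req = urllib.request.Request(
    "https://example.com/protected",             # hypothetical protected URL
    headers={"Authorization": f"Basic {token}"},
)

try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status)
except urllib.error.HTTPError as err:
    print(err.code)                              # e.g. 401 if the credentials are rejected
```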
Content negotiation⌗
The client can specify some properties of the response which it prefers, using the `Accept-*` headers.
The response should contain the `Vary` header with the `Accept-*` headers involved, or might use the `300 Multiple Choices` or `406 Not Acceptable` response status codes.
Some `Accept` headers⌗
Accept: <mime-type>/<mime-subtype>
Accept-Encoding: <encoding>
Accept-Language: <language code>
All the headers above can contain a list of (optionally) weighted options. For example, if the client prefers English, but also accepts German (with 0.8 priority) or Hungarian (with 0.4 priority), it can use:
Accept-Language: en, de;q=0.8, hu;q=0.4
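A hedged sketch of sending such a preference and checking what the server chose (example.com is a placeholder; many servers simply ignore `Accept-Language` for static pages):

```python
# Send a weighted language preference and inspect the server's choice.
import urllib.request

req = urllib.request.Request(
    "https://example.com/",
    headers={"Accept-Language": "en, de;q=0.8, hu;q=0.4"},
)

with urllib.request.urlopen(req) as resp:
    print("Content-Language:", resp.headers.get("Content-Language"))
    print("Vary:", resp.headers.get("Vary"))
```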
Caching⌗
To save network bandwidth and server resources, it is pretty useful to cache rarely changing content in the user agent. But most content does change sometimes, so it can be hard to decide what to cache and for how long.
Cache levels⌗
- the browser cache
- it has a size limit
- users can turn it off
- some (especially older) browsers might misunderstand cache related headers
- caching proxies, CDNs
- users cannot turn them off
- they can do other kinds of fancy stuff (like image resizing/recompression, minifying resources, …)
- if something is cached here that shouldn’t be, it can cause tricky, hard-to-discover issues
So properly controlling the caching of our resources is an important thing.
Typical cacheable content⌗
Browsers tend to cache these responses even without any special header:
- simple documents, images that rarely change (GET => 200 OK)
- permanent redirects (GET => 301, 308)
- some error responses (GET => 404)
`Cache-Control` header⌗
- `no-store, no-cache, must-revalidate` - don’t cache at all
- `no-cache` - the browser might store this for a short period (for back navigation), but requests validation before using the cached copy
- `private` - single user only, the browser can cache it, proxies shouldn’t
- `public` - can be stored in a public cache (proxy, CDN)
- `must-revalidate` - can be cached, with validation
- `max-age=<seconds>` - can be stored for the given time
- the `Expires` header with a date (which is interpreted against the client-side clock, so not very reliable)
The old `Pragma` header was for HTTP/1.0; don’t use it anymore.
Validation⌗
- `ETag` - “strong” validator
  - the server sends the `ETag` header
  - the client uses `If-None-Match` with the cached `ETag` value
- `Last-Modified` - “weak” validator
  - the server sends the `Last-Modified` header with a date
  - the client uses `If-Modified-Since` with the cached date
If the content wasn’t modified - so the cached data can be used - the server responds with a `304 Not Modified` response. Otherwise a normal `200 OK` response is sent back with the updated content (and with updated validation-related headers).
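A minimal sketch of this validation dance on the server side, using Python’s standard library (the document content and the hash-based ETag are placeholders):

```python
# Toy server: sends an ETag with the content and answers 304 Not Modified
# when the client revalidates with a matching If-None-Match header.
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

CONTENT = b"<html><body>hello</body></html>"          # placeholder document
ETAG = '"' + hashlib.sha1(CONTENT).hexdigest() + '"'

class CachingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)                    # client's copy is still valid
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Cache-Control", "max-age=60, must-revalidate")
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(CONTENT)))
        self.end_headers()
        self.wfile.write(CONTENT)

if __name__ == "__main__":
    HTTPServer(("", 8000), CachingHandler).serve_forever()
```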
Vary⌗
If the server uses the `Vary` header, that means the content for the given resource can differ based on the headers listed in `Vary`, so the client must take those headers into account when caching.
Cookies⌗
The server can send cookies via the `Set-Cookie` header; these are stored by the browser and sent back in the `Cookie` header.
One `Set-Cookie` header can set one cookie, but a response can have multiple such headers:
Set-Cookie: <name>=<value>; [directives separated by ;]
Set-Cookie: <name2>=<value2>; [directives]
A request can only contain one `Cookie` header, but it can have multiple cookies separated by semicolons:
Cookie: name1=value1; name2=value2
Directives⌗
- `Expires=<date>`
  - the cookie expires on the given date, the client should remove it afterwards
  - if not set, the cookie will have the lifetime of the session (it’s deleted when the tab is closed)
- `Max-Age=<seconds>`
  - similar to `Expires`, but has higher priority
- `Domain=<domain>`
  - if not set, the current domain is used (subdomains not included)
  - if set, subdomains are included
- `Path=<path>`
  - the root of the path where the cookie is sent
- `Secure`
  - the cookie is only sent over a secure (HTTPS) connection
- `HttpOnly`
  - not accessible via JavaScript (through the `document.cookie` API)
- `SameSite=<ssoption>`
  - some kind of CSRF protection
  - `Strict`: the cookie is only sent with requests initiated by the cookie’s origin site
  - `Lax`: (default) the cookie is also sent when the user navigates to the cookie’s origin (from an external site)
  - `None`: cookies are sent on cross-site requests as well (but only in secure contexts, so `Secure` should be set too)
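How a server might put these directives together, as a hedged sketch with Python’s standard library (the cookie name, value and attribute choices are only illustrative):

```python
# Build a Set-Cookie header value with some common directives.
from http import cookies

jar = cookies.SimpleCookie()
jar["session_id"] = "abc123"                 # hypothetical session identifier
jar["session_id"]["max-age"] = 3600          # one hour
jar["session_id"]["path"] = "/"
jar["session_id"]["secure"] = True           # HTTPS only
jar["session_id"]["httponly"] = True         # hidden from document.cookie
jar["session_id"]["samesite"] = "Strict"     # needs Python 3.8+

# Prints something like (attribute order may vary):
# Set-Cookie: session_id=abc123; HttpOnly; Max-Age=3600; Path=/; SameSite=Strict; Secure
print(jar.output())
```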
Security⌗
- CORS related headers (`Access-Control-...`)
- CSP
  - to prevent XSS
- HPKP (deprecated by now)
  - tells the client to trust only the given TLS public key, to prevent MITM attacks
- HSTS
  - tells the client that it should use a secure channel (HTTPS) to communicate with this server
  - prevents some MITM attacks with HTTPS termination, or attacks hijacking the HTTP->HTTPS redirects
- `X-Content-Type-Options: nosniff`
  - blocks requests with `style` and `script` destinations if the content type of the resource is not `text/css` or a JavaScript MIME type (respectively)
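Finally, a sketch of how a server built on Python’s standard library could attach a few of these headers to every response (the policy values here are only illustrative, not recommendations, and HSTS only makes sense when the site is actually served over HTTPS):

```python
# Attach some common security headers to every response.
from http.server import BaseHTTPRequestHandler, HTTPServer

class SecureHeadersHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        # illustrative policy values only
        self.send_header("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
        self.send_header("Content-Security-Policy", "default-src 'self'")
        self.send_header("X-Content-Type-Options", "nosniff")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), SecureHeadersHandler).serve_forever()
```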