# HTTP
    PUT /new.html HTTP/1.1
    Host: example.com
    Content-type: text/html
    Content-length: 16

    <p>New File</p>

    HTTP/1.1 201 Created
    Content-Location: /new.html
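As a minimal sketch of how the request above is laid out on the wire, the snippet below serializes the same `PUT` by hand. The helper name `build_put_request` is hypothetical and only for illustration; it assumes the body ends with a newline, which is what makes the length 16.

```python
def build_put_request(path: str, host: str, body: str,
                      content_type: str = "text/html") -> bytes:
    """Serialize a minimal HTTP/1.1 PUT request (illustrative, not a full client)."""
    payload = body.encode("utf-8")
    head = (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        # Content-Length counts bytes of the body, not characters.
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"  # an empty line separates the headers from the body
    )
    return head.encode("ascii") + payload


raw = build_put_request("/new.html", "example.com", "<p>New File</p>\n")
print(raw.decode("utf-8"))
```

The empty line between the headers and the body is required; without it the receiver cannot tell where the headers end.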
# Introduction
What happens after entering a URL and pressing Enter
version history: for details see Evolution of HTTP | MDN
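- HTTP/0.9 era: short-lived connections
  - Every HTTP request goes through its own DNS resolution, TCP three-way handshake, transfer, and four-way teardown. Repeatedly creating and closing TCP connections is very expensive, and by today's standards this is a badly inefficient way to transfer data.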
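- HTTP/1.0 era: persistent connections introduced
  - Once the cost of short-lived connections was recognized, persistent connections were proposed and got initial support in HTTP/1.0. A persistent connection is one TCP connection that serves multiple requests: the client asks for one by sending `Connection: Keep-Alive` in the request headers. If the server accepts, it sends `Connection: Keep-Alive` in the response headers as well, and the client keeps using the same TCP connection for the following requests. (A typical server default, for example in Apache httpd, is `Keep-Alive: timeout=5, max=100`: an idle connection is kept open for 5 seconds, and one connection serves at most 100 requests.)
  - When the server closes a persistent connection (or does not support persistent connections), it sends `Connection: Close` in the headers, telling the client to stop using that connection.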
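- HTTP/1.1 era: persistent connections become the default; pipelining introduced
  - From HTTP/1.1 on, connections are persistent by default, even when the request does not carry `Connection: Keep-Alive`.
  - A drawback of persistent connections was identified, head-of-line (HOL) blocking: requests on one persistent connection are still serial, so if one request stalls (for example because of network congestion), every later request on the same connection is blocked behind it.
  - Pipelining was introduced: the client may send the next request without waiting for the previous response. The server still has to return responses in the order the requests arrived, so responses remain serial and HOL blocking persists. As a result pipelining was never widely adopted: almost no proxies support it, some browsers do not support it, and most of those that do ship with it disabled by default.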
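- SPDY and HTTP/2: multiplexing. Multiplexing was introduced in SPDY and adopted in HTTP/2. It lets multiple requests and responses be interleaved on one connection, told apart by stream ID. This removes HOL blocking at the HTTP layer (a lost TCP packet can still stall every stream on the connection) and also allows each request to carry a priority, so that the server answers higher-priority requests first.
  - multiplexing, binary, stream, message, frame
  - server push: populate data in a client cache, in advance of it being required. For example, when the client requests page.html, the server also sends related resources such as script.js and style.css
  - header compression and delta update: client and server each maintain and update a table of previously seen header fields to avoid transmitting them again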
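The difference between pipelining and multiplexing can be sketched with a deliberately simplified timing model. It assumes each response becomes ready at a given time and ignores transfer time and bandwidth sharing; the function names are hypothetical.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """HTTP/1.1 pipelining: responses leave in request order,
    so each one waits for every earlier response to be ready."""
    return list(accumulate(ready_times, max))


def multiplexed_delivery(ready_times):
    """HTTP/2 multiplexing: each stream is delivered as soon as it is ready."""
    return list(ready_times)


# Three requests sent back to back; the first response is slow.
ready = [5.0, 1.0, 2.0]
print(pipelined_delivery(ready))    # the slow first response holds back the rest
print(multiplexed_delivery(ready))  # fast responses are not blocked
```

In this model one slow response at the head of the queue delays everything behind it under pipelining, which is the HOL blocking described above.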
# Methods
request method attributes
- has body or not
- safe: does not alter the state of the server, implies idempotent; `GET`, `HEAD`, or `OPTIONS`
- idempotent: an identical request can be made once or several times in a row with the same effect while leaving the server in the same state; safe methods and `PUT`, `DELETE`, conditionally `PATCH`
- cacheable: a cacheable response is an HTTP response that can be cached; `GET`, `HEAD`, more in rare cases
  - controlled by response header `Cache-Control`
  - cacheable status codes: 200, 203, 204, 206, 300, 301, 404, 405, 410, 414, and 501
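One way to keep these attributes straight is a small lookup table. The sketch below encodes the values listed in these notes for each method; `MethodInfo`, `METHODS`, and `can_retry_automatically` are hypothetical names, and `PATCH` is recorded as not idempotent because that is only conditionally true.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodInfo:
    safe: bool
    idempotent: bool
    cacheable: bool


# Values follow the per-method attribute lists in these notes.
METHODS = {
    "GET": MethodInfo(safe=True, idempotent=True, cacheable=True),
    "HEAD": MethodInfo(safe=True, idempotent=True, cacheable=True),
    "OPTIONS": MethodInfo(safe=True, idempotent=True, cacheable=False),
    "TRACE": MethodInfo(safe=True, idempotent=True, cacheable=False),
    "PUT": MethodInfo(safe=False, idempotent=True, cacheable=False),
    "DELETE": MethodInfo(safe=False, idempotent=True, cacheable=False),
    "POST": MethodInfo(safe=False, idempotent=False, cacheable=False),
    "PATCH": MethodInfo(safe=False, idempotent=False, cacheable=False),
    "CONNECT": MethodInfo(safe=False, idempotent=False, cacheable=False),
}


def can_retry_automatically(method: str) -> bool:
    """A client may repeat an idempotent request after a dropped connection."""
    return METHODS[method.upper()].idempotent


# Every safe method is also idempotent.
assert all(info.idempotent for info in METHODS.values() if info.safe)
print(can_retry_automatically("PUT"), can_retry_automatically("POST"))
```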
`GET`: requests a representation of the specified resource
- attributes: request has no body, response has body, safe, idempotent, cacheable, allowed in forms
`HEAD`: requests the headers that would be returned if the resource were requested with `GET`
- attributes: request has no body, response has no body, safe, idempotent, cacheable, not in forms
`POST`: sends data to the server; not idempotent
- attributes: request has body, response has body, not safe, not idempotent, typically not cacheable, allowed in forms
- related responses
  - `201 Created`
`PUT`: replaces all current representations of the target resource with the request payload, or creates the resource if it does not exist; idempotent
- attributes: request has body, response has no body, not safe, idempotent, not cacheable, not in forms
- related responses
  - `201 Created`: the resource was created
  - `200 OK` or `204 No Content`: an existing resource was replaced
`DELETE`: deletes the specified resource
- attributes: request may have body, response may have body, not safe, idempotent, not cacheable, not in forms
- related responses
  - `202 Accepted`: the action will likely succeed but has not yet been enacted
  - `204 No Content`: the action has been enacted and no further information is to be supplied
  - `200 OK`: the action has been enacted and the response body includes a representation describing the status
`PATCH`: applies partial modifications to a resource
- attributes: request has body, response has body, not safe, not idempotent, not cacheable, not in forms
- idempotent: can be idempotent if the patch involves nothing like an auto-incrementing counter field and has no side effects on other resources
- related headers and response
  - `Accept-Patch` response header: advertises which media types the server is able to understand, and implies that `PATCH` is allowed on the resource identified by the Request-URI; can be sent together with `415`
  - `415 Unsupported Media Type`: response to a `PATCH` request with an unsupported media type
`OPTIONS`: describes the communication options for the target resource; the request URL can be `*` to refer to the entire server
- attributes: request has no body, response has body, safe, idempotent, not cacheable, not in forms
- related
  - `Allow` response header
  - CORS preflight request method, and request headers `Access-Control-Request-Method` and `Access-Control-Request-Headers`
  - CORS preflight response headers: `Access-Control-Allow-Origin`, `Access-Control-Allow-Methods`, `Access-Control-Allow-Headers`, `Access-Control-Max-Age`
- example

      curl -X OPTIONS https://example.org -i

      HTTP/1.1 204 No Content
      Allow: OPTIONS, GET, HEAD, POST
      Cache-Control: max-age=604800
      Date: Thu, 13 Oct 2016 11:45:00 GMT
      Server: EOS (lax004/2813)
`TRACE`: performs a message loop-back test along the path to the target resource
- attributes: request has no body, response has body, safe, idempotent, not cacheable, not in forms
- response: `200 OK` with `Content-Type: message/http`, whose body echoes the request message as it was received at the end of the path
- `Max-Forwards` request header: decremented by one before each forwarding; a recipient that receives zero stops forwarding and responds itself
`CONNECT`: a hop-by-hop method (in contrast to end-to-end methods) that establishes a tunnel to the server identified by the target resource, for example for SSL through a proxy
- attributes: request has no body, response has body, not safe, not idempotent, not cacheable, not in forms
- related headers and responses
  - `Proxy-Authorization` request header: credentials to authenticate a user agent to a proxy server, usually after a `407` response
  - `407 Proxy Authentication Required`
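The `Max-Forwards` rule used with `TRACE` can be sketched as a walk along a chain of proxies. This is a toy model with hypothetical node names; it only shows which node ends up answering the request.

```python
def trace_responder(proxies, max_forwards):
    """Return the name of the node that answers a TRACE request.

    Each proxy checks Max-Forwards: at zero it must answer itself,
    otherwise it decrements the value and forwards the request.
    """
    for name in proxies:
        if max_forwards == 0:
            return name
        max_forwards -= 1
    return "origin"


chain = ["proxy-a", "proxy-b"]
print(trace_responder(chain, 0))  # answered by the first proxy
print(trace_responder(chain, 1))  # forwarded once, answered by the second proxy
print(trace_responder(chain, 5))  # reaches the origin server
```

Sending `TRACE` repeatedly with increasing `Max-Forwards` values is a way to probe each hop in turn.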
# Headers
headers
- proprietary headers: historically `X-` prefixed, but this convention was deprecated in June 2012
- header contexts
  - general headers: apply to both requests and responses, but with no relation to the data transmitted in the body
  - request headers
  - response headers
  - entity headers: contain information about the body of the resource, like `Content-Length`, `Content-Type`, `Content-Encoding`, `Content-Language`, `Content-Location`, `Allow`, `Expires` and more
- proxy handling
  - end-to-end headers: these headers must be transmitted to the final recipient of the message
  - hop-by-hop headers: these headers are meaningful only for a single transport-level connection, and must not be retransmitted by proxies or cached
authentication
- `WWW-Authenticate`
- `Authorization`
- more
cache
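- blog
- HTTP header
  - strong caching: a caching policy that needs no validation. Two response headers express the rules, `Expires` and `Cache-Control`.
    - `Expires`: the point in time at which the cached copy expires; after that time the resource is considered stale. Because it is an absolute time, a wrong clock or a failed time zone conversion can make the cache lifetime wrong. `Expires` comes from HTTP/1.0; `Cache-Control`, defined in HTTP/1.1, is now preferred, and when both are present `Cache-Control` takes precedence.
    - `Cache-Control`: can combine several directives
      - `max-age`: a duration in seconds during which the cached copy is valid, for example `Cache-Control: max-age=31536000`
      - `s-maxage`: like `max-age`, and overrides `max-age` and `Expires`, but applies only to shared caches and is ignored by private caches
      - `public`: the response may be stored by any cache (the requesting client, proxy servers, and so on)
      - `private`: the response may be stored only for a single user (for example an operating system user or a browser user); it is not shared and must not be stored by proxy servers
      - `no-cache`: forces every cache holding the response to send a request with validators to the server before using the stored copy. It does not literally mean "do not cache".
      - `no-store`: forbids caching; every request must fetch the data from the server again
    - other headers: `Pragma`, `Warning` and more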
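  - negotiated caching (revalidation): client and server use a validation mechanism to decide whether the cached copy can be used for the current request
    - `Last-Modified` / `If-Modified-Since`
      - `Last-Modified`: the time the resource was last modified on the server, sent in the response headers. After the first request the browser records this time
      - On later requests the request headers carry `If-Modified-Since` set to the recorded time. The server compares it with the resource's last modification time: if the resource has changed it returns the latest version with status `200`, otherwise it returns `304 Not Modified`.
    - `ETag` / `If-None-Match`: a string, typically a hash, generated by the server. The first response carries `ETag: "abcd"`, later requests carry `If-None-Match: "abcd"`, and the server checks the `ETag` and returns `304` or `200`.
    - differences
      - Some servers cannot determine the exact last modification time of a resource, so it cannot be used to tell whether the resource was updated.
      - `Last-Modified` is only precise to the second.
      - Some resources get a new modification time although their content did not change; `Last-Modified` cannot show that the content is unchanged.
      - A strong `ETag` is more precise than `Last-Modified`: it is a strong validator that requires byte-for-byte identity, and it takes precedence. If the server provides an `ETag`, the conditional request on the `ETag` must be evaluated first.
      - In practice, keep `ETag` / `Last-Modified` consistent; behind load balancers and reverse proxies they may differ between nodes. Computing an `ETag` also costs resources, so if the content does not change very often, consider whether `Cache-Control` alone meets your needs.
    - other headers: `If-Unmodified-Since`, `Vary`
- other: give build output files a hash suffix or version number, so that changed content is effectively a request for a new file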
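The two caching strategies above can be sketched in a few lines. This is a minimal illustration, not a complete cache: `is_fresh` takes the view of a private cache (so `s-maxage` is ignored), `revalidate` compares entity tags by exact string match, and all function names are hypothetical.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') or True
    return directives


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """Strong caching: may the stored copy be reused without asking the server?"""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return isinstance(max_age, str) and max_age.isdigit() and age_seconds < int(max_age)


def revalidate(request_headers: dict, etag: str, last_modified: str) -> int:
    """Negotiated caching on the server: 304 if the client's copy is still valid."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The ETag check takes precedence over the date check.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in tags or "*" in tags else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        unchanged = parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since)
        return 304 if unchanged else 200
    return 200


print(is_fresh("public, max-age=31536000", age_seconds=3600))
print(revalidate({"If-None-Match": '"abcd"'}, '"abcd"', "Wed, 21 Oct 2015 07:28:00 GMT"))
```

A cache would call `is_fresh` first and only send a conditional request, which the server answers along the lines of `revalidate`, once the stored copy is no longer fresh.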
client hints — new standard, experimental, tbd
connection
- `Connection`
- `Keep-Alive`
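As a small illustration of the `Keep-Alive` header's `timeout` and `max` parameters, the sketch below parses a header value into integers. The function name is hypothetical, and unknown or malformed parameters are simply skipped.

```python
def parse_keep_alive(value: str) -> dict:
    """Parse a Keep-Alive header value such as 'timeout=5, max=100'."""
    params = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        if sep and arg.strip().isdigit():
            params[name.strip().lower()] = int(arg)
    return params


print(parse_keep_alive("timeout=5, max=100"))
```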
content negotiation
- `Accept`
- `Accept-Charset`
- `Accept-Encoding`
- `Accept-Language`
request and response context
- requester
  - `From`
  - `Host`
  - `Referer`
  - `Referrer-Policy`
  - `User-Agent`
- responder
  - `Allow`
  - `Server`
range request: respond with part of the document
- `Accept-Ranges`: indicates if the server supports range requests, and if so in which unit the range can be expressed
- `Range`: indicates the part of a document that the server should return
- `If-Range`: creates a conditional range request that is only fulfilled if the given etag or date matches the remote resource. Used to prevent downloading two ranges from incompatible versions of the resource
- `Content-Range`: indicates where in a full body message a partial message belongs
- response: `206 Partial Content`
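How these headers fit together can be sketched with a function that answers a single byte range. This is a simplified, hypothetical helper: it handles one `bytes=` range only, falls back to the full body for anything it does not understand, and answers `416` for any range it cannot satisfy.

```python
def serve_range(body: bytes, range_header: str):
    """Answer a single-range request; returns (status, headers, payload)."""
    size = len(body)
    unit, _, spec = range_header.strip().partition("=")
    first, dash, last = spec.strip().partition("-")
    well_formed = (
        unit.strip() == "bytes"
        and dash == "-"
        and all(part == "" or part.isdigit() for part in (first, last))
        and (first != "" or last != "")
    )
    if not well_formed:
        # Unsupported units and multi-range requests get the full body here.
        return 200, {"Accept-Ranges": "bytes"}, body
    if first == "":
        # Suffix form "bytes=-N": the last N bytes.
        start, end = size - min(int(last), size), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start >= size or start > end:
        return 416, {"Content-Range": f"bytes */{size}"}, b""
    headers = {
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
    }
    return 206, headers, body[start:end + 1]


status, headers, payload = serve_range(b"0123456789", "bytes=2-5")
print(status, headers["Content-Range"], payload)
```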
frontend
- cookies: `Cookie`, `Set-Cookie`
- CORS: `Access-Control-` prefixed headers, `Origin`, `Timing-Allow-Origin`
security
- CORS policies
- CSP, content security policies
- more
other headers
- controls: `Expect`, `Max-Forwards`
- track: `DNT`, `Tk`
- redirection: `Location`
- more
# Status Codes
response status codes
- 1xx — informational
- 2xx — successful
- 3xx — redirection
- 4xx — client error
- 5xx — server error
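Because the class is determined by the first digit, classifying a status code is a one-line lookup. A minimal sketch, with hypothetical names:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a status code to its class using the hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


print(status_class(204), status_class(404), status_class(503))
```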
1xx informational
- `100 Continue`: everything so far is OK and the client should continue the request
- `101 Switching Protocols`: in response to the `Upgrade` header, as in WebSockets
- more
2xx successful
- `200 OK`
- `201 Created`: typically for `POST`, `PUT`
- `202 Accepted`: the request has been received but not yet acted upon
- `204 No Content`
- more
3xx redirection
- `300 Multiple Choices`: the request has more than one possible response; the user agent or user should choose one of them
- `301 Moved Permanently`: the requested resource has been permanently moved to the URL given by the `Location` header
- `302 Found`: the requested resource has been moved temporarily to the URL given by the `Location` header; a browser redirects to that URL, but search engines do not update their links to the resource
- `303 See Other`: like `302`, but the client must follow the redirection with `GET`
- `304 Not Modified`: response to a conditional request, such as one with `If-None-Match` or `If-Modified-Since`
- `307 Temporary Redirect`: like `302`, but the user agent must not change the HTTP method and body
4xx client error
- `400 Bad Request`: the server could not understand the request due to invalid syntax
- `401 Unauthorized`: unauthenticated; sent with a `WWW-Authenticate` header that describes how to authenticate
- `403 Forbidden`: authenticated but not authorized
- `404 Not Found`
- `405 Method Not Allowed`
- `406 Not Acceptable`: used in content negotiation
- `418 I'm a teapot`: April Fools' Easter egg
5xx server error
- `500 Internal Server Error`
- `501 Not Implemented`: the server does not support the functionality required to fulfill the request; similar to `405`, except that the server does not implement the method at all rather than deliberately disallowing it
- `502 Bad Gateway`: the server, while acting as a gateway or proxy, received an invalid response from the upstream server
- `503 Service Unavailable`: the server is not ready to handle the request, for example due to maintenance or overload; send with `Retry-After` if possible
- `504 Gateway Timeout`: the server, while acting as a gateway or proxy, did not receive a response from the upstream server in time
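The difference between the redirect codes in the 3xx list is mostly about which method the client uses for the follow-up request. The sketch below encodes the `303` and `307` rules from these notes; the `301`/`302` branch reflects the common browser behaviour of turning `POST` into `GET`, which is an assumption added here and not stated in the notes. The function name is hypothetical.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a client would use when following a redirect."""
    method = method.upper()
    if status == 303:
        # See Other: follow with GET (a HEAD request stays HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status == 307:
        # Temporary Redirect: method and body must not change.
        return method
    if status in (301, 302):
        # Assumption: typical browsers rewrite POST to GET here.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not a redirect handled by this sketch")


print(method_after_redirect(303, "POST"))
print(method_after_redirect(307, "POST"))
```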
# Frontend Related
Why analytics (tracking) requests are usually sent as a 1x1 transparent GIF image: github
Navigator.sendBeacon()
# HTTPS
encryption
- client requests an HTTPS connection
- server sends its public key (for authentication, see below)
- client generates a session key and encrypts it with the received public key
- client sends the encrypted session key, which the server decrypts with its private key
- communication continues, symmetrically encrypted with the session key
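The steps above can be sketched with toy numbers. This is only an illustration of the idea and is not secure: the RSA key pair is a tiny textbook example, the symmetric cipher is an ad hoc XOR keystream built from SHA-256, and all names are hypothetical. Real TLS negotiates keys differently depending on the protocol version.

```python
import hashlib
import secrets

# Toy RSA key pair (p = 61, q = 53): public (e, n), private (d, n).
SERVER_PUBLIC = (17, 3233)
SERVER_PRIVATE = (2753, 3233)


def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: generate a session key and encrypt it with the server's public key.
e, n = SERVER_PUBLIC
session_key = secrets.randbelow(n - 2) + 2
wrapped_key = pow(session_key, e, n)

# Server: decrypt the session key with its private key.
d, _ = SERVER_PRIVATE
unwrapped_key = pow(wrapped_key, d, n)

# Both sides now share the session key and switch to symmetric encryption.
ciphertext = keystream_xor(session_key, b"GET / HTTP/1.1")
plaintext = keystream_xor(unwrapped_key, ciphertext)
print(plaintext)
```

The asymmetric step is used only once to transport the session key; the bulk of the traffic uses the cheaper symmetric cipher.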
authentication
- CA, Certificate Authority: third parties trusted by both the client and the server, who sign certificates
- signing: a CA signs the public keys of servers to generate certificates, which contain the data and the signature
- signature: hash of the data encrypted with the private key of the signer
- authentication in HTTPS: the server sends the certificate, the client verifies the certificate
- verification: compare the hash of the data with the hash decrypted from the signature using the signer's public key
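Signing and verification as described above can be sketched with the same kind of toy RSA key. The key is a tiny textbook example and the certificate bytes are made up, so this shows only the shape of the computation, not a real certificate check; all names are hypothetical.

```python
import hashlib

# Toy CA key pair (p = 61, q = 53): public (e, n), private (d, n).
CA_PUBLIC = (17, 3233)
CA_PRIVATE = (2753, 3233)


def digest_int(data: bytes, n: int) -> int:
    """Hash the data and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private_key) -> int:
    """Signature: the hash of the data, encrypted with the signer's private key."""
    d, n = private_key
    return pow(digest_int(data, n), d, n)


def verify(data: bytes, signature: int, public_key) -> bool:
    """Compare the hash of the data with the hash recovered from the signature."""
    e, n = public_key
    return pow(signature, e, n) == digest_int(data, n)


# Hypothetical certificate contents: a subject name plus the server's public key.
certificate_data = b"subject=example.com;public_key=(17,3233)"
signature = sign(certificate_data, CA_PRIVATE)
print(verify(certificate_data, signature, CA_PUBLIC))
```

With a modulus this small, different inputs can collide after reduction, which is one reason real signatures use keys of thousands of bits.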
integrity: `Digest` header. Over plain HTTP, integrity can still be breached by altering the message and the digest at the same time; over HTTPS, an attacker sees only the encrypted message and cannot compute a matching digest
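A minimal sketch of producing and checking a `Digest` value, assuming the `SHA-256=<base64>` form of the header; the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def digest_header(body: bytes) -> str:
    """Build a Digest header value for the given body."""
    encoded = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    return "SHA-256=" + encoded


def body_matches(body: bytes, header_value: str) -> bool:
    """Recompute the digest and compare it with the received header value."""
    return hmac.compare_digest(digest_header(body), header_value)


value = digest_header(b"<p>New File</p>\n")
print("Digest:", value)
print(body_matches(b"<p>New File</p>\n", value))
```

Anyone who can rewrite both the body and the header can recompute the value, which is why the check adds little on an unencrypted connection.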