5、HTTP · 前端躬行记

&emsp;&emsp;HTTP（HyperText Transfer Protocol）即超文本传输协议，是一种获取网络资源（例如图像、HTML文档）的应用层协议，它是互联网数据通信的基础，由请求和响应构成。 &emsp;&emsp;在 Node.js 中，提供了 3 个与之相关的模块，分别是 HTTP、HTTP2 和 HTTPS，后两者分别是对 HTTP/2.0 和 HTTPS 两个协议的实现。 &emsp;&emsp;HTTP/2.0 是 HTTP/1.1 的扩展版本，主要基于 Google 发布的 SPDY 协议，引入了全新的二进制分帧层，保留了 1.1 版本的大部分语义。 &emsp;&emsp;HTTPS（HTTP Secure）是一种构建在SSL或TLS上的HTTP协议，简单的说，HTTPS就是HTTP的安全版本。 &emsp;&emsp;本节主要分析的是 HTTP 模块，它是 Node.js 网络的关键模块。 &emsp;&emsp;本系列所有的示例源码都已上传至Github，[点击此处](https://github.com/pwstrick/node)获取。 ## 一、搭建 Web 服务器 &emsp;&emsp;Web 服务器是一种让网络用户可以访问托管文件的软件，常用的有 IIS、Nginx 等。 &emsp;&emsp;Node.js 与 ASP.NET、PHP 等不同，它不需要额外安装 Web 服务器，因为通过它自身包含的模块就能快速搭建出 Web 服务器。 &emsp;&emsp;运行下面的代码，在浏览器地址栏中输入 http://localhost:1234 就能访问一张纯文本内容的网页。 ~~~ const http = require('http'); const server = http.createServer((req, res) => { res.statusCode = 200; res.setHeader('Content-Type', 'text/plain'); res.end('strick'); }) server.listen(1234); ~~~ &emsp;&emsp;　res.end() 在[流一节](https://www.cnblogs.com/strick/p/16225418.html)中已分析过，用于关闭写入流。 **1）createServer()** &emsp;&emsp;createServer() 用于创建一个 Web 服务器，源码存于[lib/http.js](https://github.com/nodejs/node/blob/master/lib/http.js)文件中，内部就一行代码，实例化一个 Server 类。 ~~~ function createServer(opts, requestListener) { return new Server(opts, requestListener); } ~~~ &emsp;&emsp;Server 类的实现存于[lib/\_http\_server.js](https://github.com/nodejs/node/blob/master/lib/_http_server.js)文件中，由源码可知，http.Server 继承自 net.Server，而 net 模块可创建基于流的 TCP 和 IPC 服务器。 &emsp;&emsp;http.createServer() 在实例化 net.Server 的过程中，会监听 request 和 connection 两个事件。 ~~~ function Server(options, requestListener) { if (!(this instanceof Server)) return new Server(options, requestListener); // 当 createServer() 第一个参数类型是函数时的处理（上面示例中的用法） if (typeof options === 'function') { requestListener = options; options = {}; } else if (options == null || typeof options === 'object') { options = { ...options }; } else { throw new ERR_INVALID_ARG_TYPE('options', 'object', options); } storeHTTPOptions.call(this, options); // 继承于 net.Server 类 net.Server.call( this, { allowHalfOpen: true, noDelay: options.noDelay, keepAlive: options.keepAlive, keepAliveInitialDelay: options.keepAliveInitialDelay }); if (requestListener) { // 当 req 和 res 两个参数都生成后，就会触发该事件 this.on('request', requestListener); } // 官方注释：与此类似的选项，懒得写自己的文档 // http://www.squid-cache.org/Doc/config/half_closed_clients/ // https://wiki.squid-cache.org/SquidFaq/InnerWorkings#What_is_a_half-closed_filedescriptor.3F this.httpAllowHalfOpen = false; // 三次握手后触发 connection 事件 this.on('connection', connectionListener); this.timeout = 0; // 超时时间，默认禁用 this.maxHeadersCount = null; // 最大响应头数，默认不限制 this.maxRequestsPerSocket = 0; setupConnectionsTracking(this); } ~~~ **2）listen()** &emsp;&emsp;listen() 方法用于监听端口，它就是 net.Server 中的 server.listen() 方法。 ~~~ ObjectSetPrototypeOf(Server.prototype, net.Server.prototype); ~~~ **3）req 和 res** &emsp;&emsp;实例化 Server 时的 requestListener() 回调函数中有两个参数 req（请求对象）和 res（响应对象），它们的生成过程比较复杂。 &emsp;&emsp;简单概括就是通过 TCP 协议传输过来的二进制数据，会被 http\_parser 模块解析成符合 HTTP 协议的报文格式。 &emsp;&emsp;在将请求首部解析完毕后，会触发一个 parserOnHeadersComplete() 回调函数，在回调中会创建 http.IncomingMessage 实例，也就是 req 参数。 &emsp;&emsp;而在这个回调的最后，会调用 parser.onIncoming() 方法，在这个方法中会创建 http.ServerResponse 实例，也就是 res 参数。 &emsp;&emsp;最后触发在实例化 Server 时注册的 request 事件，并将 req 和 res 两个参数传递到 requestListener() 回调函数中。 &emsp;&emsp;生成过程的顺序如下所示，源码细节在此不做展开。 ~~~ lib/_http_server.js : connectionListener() lib/_http_server.js : connectionListenerInternal() lib/_http_common.js : parsers = new FreeList('parsers', 1000, function parsersCb() {}) lib/_http_common.js : parserOnHeadersComplete() => parser.onIncoming() lib/_http_server.js : parserOnIncoming() => server.emit('request', req, res) ~~~ &emsp;&emsp;在上述过程中，parsers 变量使用了[FreeList](https://zh.wikipedia.org/wiki/%E8%87%AA%E7%94%B1%E8%A1%A8)数据结构（如下所示），一种动态分配内存的方案，适合由大小相同的对象组成的内存池。 ~~~ class FreeList { constructor(name, max, ctor) { this.name = name; this.ctor = ctor; this.max = max; this.list = []; } alloc() { return this.list.length > 0 ? this.list.pop() : ReflectApply(this.ctor, this, arguments); // 执行回调函数 } free(obj) { if (this.list.length < this.max) { this.list.push(obj); return true; } return false; } } ~~~ &emsp;&emsp;parsers 维护了一个固定长度（1000）的队列（内存池），队列中的元素都是实例化的 HTTPParser。 &emsp;&emsp;当 Node.js 接收到一个请求时，就从队列中索取一个 HTTPParser 实例，即调用 parsers.alloc()。 &emsp;&emsp;解析完报文后并没有将其马上释放，如果队列还没满就将其压入其中，即调用 parsers.free(parser)。 &emsp;&emsp;如此便实现了 parser 实例的反复利用，当并发量很高时，就能大大减少实例化所带来的性能损耗。 ## 二、通信 &emsp;&emsp;Node.js 提供了[request()](https://nodejs.org/dist/latest-v18.x/docs/api/http.html#httprequestoptions-callback)方法显式地发起 HTTP 请求，著名的第三方库[axios](https://github.com/axios/axios)的服务端版本就是基于 request() 方法封装的。 **1）GET 和 POST** &emsp;&emsp;GET 和 POST 是两个最常用的请求方法，主要区别包含4个方面： * 语义不同，GET是获取数据，POST是提交数据。 * HTTP协议规定GET比POST安全，因为GET只做读取，不会改变服务器中的数据。但这只是规范，并不能保证请求方法的实现也是安全的。 * GET请求会把附加参数带在URL上，而POST请求会把提交数据放在报文内。在浏览器中，URL长度会被限制，所以GET请求能传递的数据有限，但HTTP协议其实并没有对其做限制，都是浏览器在控制。 * HTTP协议规定GET是幂等的，而POST不是，所谓幂等是指多次请求返回的相同结果。实际应用中，并不会这么严格，当GET获取动态数据时，每次的结果可能会有所不同。 &emsp;&emsp;在下面的例子中，发起了一次 GET 请求，访问上一小节中创建的 Server，options 参数中包含域名、端口、路径、请求方法。 ~~~ const http = require('http'); const options = { hostname: 'localhost', port: 1234, path: '/test?name=freedom', method: 'GET' }; const req = http.request(options, res => { console.log(res.statusCode); res.on('data', d => { console.log(d.toString()); // strick }); }); req.end(); ~~~ &emsp;&emsp;res 和 req 都是可写流，res 注册了 data 事件接收数据，而在请求的最后，必须手动关闭 req 可写流。 &emsp;&emsp;POST 请求的构造要稍微复杂点，在 options 参数中，会添加请求首部，下面增加了内容的MIME类型和内容长度。 &emsp;&emsp;req.write() 方法可发送一块请求内容，如果没有设置 Content-Length，则数据将自动使用 HTTP 分块传输进行编码，以便服务器知道数据何时结束。 Transfer-Encoding: chunked 标头会被添加。 ~~~ const http = require('http'); const data = JSON.stringify({ name: 'freedom' }); const options = { hostname: 'localhost', port: 1234, path: '/test', method: 'POST', headers: { 'Content-Type': 'application/json', 'Content-Length': data.length } }; const req = http.request(options, res => { console.log(res.statusCode); res.on('data', d => { console.log(d.toString()); // strick }); }); req.write(data); req.end(); ~~~ &emsp;&emsp;在 Server 中，若要接收请求的参数，需要做些处理。 &emsp;&emsp;GET 请求比较简单，读取 req.url 属性，解析 url 中的参数就能得到请求参数。 &emsp;&emsp;POST 请求就需要注册 data 事件，下面代码中只考虑了最简单的场景，直接获取然后字符串格式化。 ~~~ const server = http.createServer((req, res) => { console.log(req.url); // /test?name=freedom req.on('data', d => { console.log(d.toString()); // {"name":"freedom"} }); }) ~~~ &emsp;&emsp;在 KOA 的插件中有一款[koa-bodyparser](https://github.com/koajs/bodyparser)，基于[co-body](https://github.com/cojs/co-body)库，可解析 POST 请求的数据，将结果附加到 ctx.request.body 属性中。 &emsp;&emsp;而 co-body 依赖了[raw-body](https://github.com/stream-utils/raw-body)库，它能将多块二进制数据流组合成一块整体，刚刚的请求数据可以像下面这样接收。 ~~~ const getRawBody = require('raw-body'); const server = http.createServer((req, res) => { getRawBody(req).then(function (buf) { // <Buffer 7b 22 6e 61 6d 65 22 3a 22 66 72 65 65 64 6f 6d 22 7d> console.log(buf); }); }) ~~~ **2）路由** &emsp;&emsp;在开发实际的 Node.js 项目时，路由是必不可少的。 &emsp;&emsp;下面是一个极简的路由演示，先实例化[URL](https://nodejs.org/dist/latest-v18.x/docs/api/url.html)类，再读取路径名称，最后根据 if-else 语句返回响应。 ~~~ const server = http.createServer((req, res) => { // 实例化 URL 类 const url = new URL(req.url, 'http://localhost:1234'); const { pathname } = url; // 简易路由 if(pathname === '/') { res.end('main'); }else if(pathname === '/test') { res.end('test'); } }); ~~~ &emsp;&emsp;上述写法，不能应用于实际项目中，无论是在维护性，还是可读性方面都欠妥。下面通过一个开源库，来简单了解下路由系统的运行原理。 &emsp;&emsp;在 KOA 的插件中，有一个专门用于路由的[koa-router](https://github.com/ZijianHe/koa-router)（如下所示），先实例化 Router 类，然后注册一个路由，再挂载路由中间件。 ~~~ var Koa = require('koa'); var Router = require('koa-router'); var app = new Koa(); var router = new Router(); router.get('/', (ctx, next) => { // ctx.router available }); app.use(router.routes()).use(router.allowedMethods()); ~~~ &emsp;&emsp;Router() 构造函数中仅仅是初始化一些变量，在注册路由时会调用 register() 方法，将路径和回调函数绑定。 ~~~ methods.forEach(function (method) { Router.prototype[method] = function (name, path, middleware) { var middleware; if (typeof path === 'string' || path instanceof RegExp) { middleware = Array.prototype.slice.call(arguments, 2); } else { middleware = Array.prototype.slice.call(arguments, 1); path = name; name = null; } this.register(path, [method], middleware, { name: name }); return this; }; }); ~~~ &emsp;&emsp;在 register() 函数中，会将实例化一个 Layer 类，就是一个路由实例，并加到内部的数组中，下面是删减过的源码。 ~~~ Router.prototype.register = function (path, methods, middleware, opts) { opts = opts || {}; // 路由数组 var stack = this.stack; // 实例化路由 var route = new Layer(path, methods, middleware, { end: opts.end === false ? opts.end : true, name: opts.name, sensitive: opts.sensitive || this.opts.sensitive || false, strict: opts.strict || this.opts.strict || false, prefix: opts.prefix || this.opts.prefix || "", ignoreCaptures: opts.ignoreCaptures }); // add parameter middleware Object.keys(this.params).forEach(function (param) { route.param(param, this.params[param]); }, this); // 加到数组中 stack.push(route); return route; }; ~~~ &emsp;&emsp;在注册中间件时，首先会调用 router.routes() 方法，在该方法中会执行匹配到的路由（路径和请求方法相同）的回调。 &emsp;&emsp;其中 layerChain 是一个数组，它会先添加一个处理数组的回调函数，再合并一个或多个路由回调（一条路径可以声明多个回调）， &emsp;&emsp;在处理完匹配路由的所有回调函数后，再去运行下一个中间件。 ~~~ Router.prototype.routes = Router.prototype.middleware = function () { var router = this; var dispatch = function dispatch(ctx, next) { var path = router.opts.routerPath || ctx.routerPath || ctx.path; /** * 找出所有匹配的路由，可能声明了相同路径和请求方法的路由 * matched = { * path: [], 路径匹配 * pathAndMethod: [], 路径和方法匹配 * route: false 路由是否匹配 * } */ var matched = router.match(path, ctx.method); var layerChain, layer, i; if (ctx.matched) { ctx.matched.push.apply(ctx.matched, matched.path); } else { ctx.matched = matched.path; } // 将 router 挂载到 ctx 上，供其他中间件使用 ctx.router = router; // 没有匹配的路由，就运行下一个中间件 if (!matched.route) return next(); var matchedLayers = matched.pathAndMethod // 路径和请求方法都匹配的数组 // 最后一个 matchedLayer var mostSpecificLayer = matchedLayers[matchedLayers.length - 1] ctx._matchedRoute = mostSpecificLayer.path; if (mostSpecificLayer.name) { ctx._matchedRouteName = mostSpecificLayer.name; } /** * layerChain 是一个数组，先添加一个处理数组的回调函数，再合并一个或多个路由回调 * 目的是在运行路由回调之前，将请求参数挂载到 ctx.params 上 */ layerChain = matchedLayers.reduce(function(memo, layer) { memo.push(function(ctx, next) { // 正则匹配的捕获数组 ctx.captures = layer.captures(path, ctx.captures); // 请求参数对象，key 是参数名，value 是参数值 ctx.params = layer.params(path, ctx.captures, ctx.params); ctx.routerName = layer.name; return next(); }); // 注册路由时的回调，stack 有可能是数组 return memo.concat(layer.stack); }, []); // 在处理完匹配路由的所有回调函数后，运行下一个中间件 return compose(layerChain)(ctx, next); }; dispatch.router = this; return dispatch; }; ~~~ &emsp;&emsp;另一个 router.allowedMethods() 会对异常行为做统一的默认处理，例如不支持的请求方法，不存在的状态码等。参考资料： [饿了么网络面试题](https://github.com/ElemeFE/node-interview/tree/master/sections/zh-cn#network) [深入理解Node.js源码之HTTP](https://yjhjstz.gitbooks.io/deep-into-node/content/chapter10/chapter10-1.html) [官网HTTP](http://nodejs.cn/learn/build-an-http-server) [Node HTTP Server 源码解读](https://zhuanlan.zhihu.com/p/161680744) [node http server源码解析](https://segmentfault.com/a/1190000039273594) [Node 源码 —— http 模块](https://juejin.cn/post/6844903977239183368) [通过源码解析 Node.js 中一个 HTTP 请求到响应的历程](https://github.com/DavidCai1993/my-blog/issues/29) [koa-router源码解析](https://juejin.cn/post/6844903573851996167) [koa-router源码解读](https://zhuanlan.zhihu.com/p/91480087) ***** > 原文出处： [博客园-Node.js精进](https://www.cnblogs.com/strick/category/2154090.html) [知乎专栏-前端性能精进](https://www.zhihu.com/column/c_1611672656142725120) 已建立一个微信前端交流群，如要进群，请先加微信号freedom20180706或扫描下面的二维码，请求中需注明“看云加群”，在通过请求后就会把你拉进来。还搜集整理了一套[面试资料](https://github.com/pwstrick/daily)，欢迎浏览。 ![](https://box.kancloud.cn/2e1f8ecf9512ecdd2fcaae8250e7d48a_430x430.jpg =200x200) 推荐一款前端监控脚本：[shin-monitor](https://github.com/pwstrick/shin-monitor)，不仅能监控前端的错误、通信、打印等行为，还能计算各类性能参数，包括 FMP、LCP、FP 等。