A Study of Nginx-Generated RSTs (Continued)
In the previous post, A Study of Nginx-Generated RSTs, we analyzed why the RST is generated at the kernel level. This post looks at the cause from the application layer, and at how the problem can be avoided.
To close the TCP connection, the TLS session has to be shut down first, so let's start with how SSL closes a session.
SSL Close Notify
When one side sends an SSL close notify, it is declaring that it has no more data to send, which means the side that receives the close notify may stop receiving.
The RFCs from TLS 1.0 through TLS 1.2 require the side that receives a close notify to reply with its own close notify immediately, but the side that sent the first close notify may close its read side without waiting for that reply.
The TLS 1.3 RFC changes this: neither side needs to wait for the peer's close notify before closing its read side, at the cost of possible data truncation.
OpenSSL closes an SSL session with SSL_shutdown(), which sends an SSL close notify alert. The shutdown state of a session can take the following values:
- 0
No shutdown setting, yet.
- SSL_SENT_SHUTDOWN
A "close notify" shutdown alert was sent to the peer, the connection is being considered closed and the session is closed and correct.
- SSL_RECEIVED_SHUTDOWN
A shutdown alert was received from the peer, either a normal "close notify" or a fatal error.
https://docs.openssl.org/1.0.2/man3/SSL_set_shutdown/#description
SSL_SENT_SHUTDOWN means this end has sent a close notify, SSL_RECEIVED_SHUTDOWN means this end has received the peer's close notify, and 0 means a close notify has been neither sent nor received.
The first SSL_shutdown() call sets SSL_SENT_SHUTDOWN to record that the close notify has been sent, and returns 0. If you do not want to wait for the peer's close notify, you can close the TCP connection at this point; if you do want to wait, call SSL_shutdown() again, and it returns 1 once the peer's close notify has been received.
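Put together, the two-call pattern looks roughly like the sketch below (a blocking socket is assumed; tls_close, ssl, and fd are illustrative names, and error handling is omitted):

#include <openssl/ssl.h>
#include <unistd.h>

/* Sketch of the two-call SSL_shutdown() flow: send our close notify,
 * then optionally wait for the peer's. */
static void
tls_close(SSL *ssl, int fd)
{
    int  ret;

    ret = SSL_shutdown(ssl);            /* sends our close notify */

    if (ret == 0) {
        /* SSL_SENT_SHUTDOWN is now set but the peer's close notify has not
         * arrived.  We could close(fd) right here without waiting, or call
         * SSL_shutdown() again: it returns 1 once the peer's close notify
         * has been received (SSL_RECEIVED_SHUTDOWN is then set as well). */
        ret = SSL_shutdown(ssl);
    }

    close(fd);
}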
Nginx Behavior
For an SSL session, Nginx defines two corresponding flags, no_send_shutdown and no_wait_shutdown:
- no_wait_shutdown corresponds to SSL_RECEIVED_SHUTDOWN
- no_send_shutdown corresponds to SSL_SENT_SHUTDOWN
Nginx takes "no need to wait for the peer's close notify" as its working principle, which means that when Nginx initiates the close notify, it closes the TCP connection right after ssl_shutdown. Concretely, as soon as the SSL handshake completes, Nginx sets no_wait_shutdown, marking the peer's close notify as if it had already been received. In the code comment the developers note that many client (browser) implementations do not reply to a close notify; in my own tests it is not just browsers, many Python HTTP libraries do not reply with a close notify either.
static void
ngx_http_ssl_handshake_handler(ngx_connection_t *c)
{
if (c->ssl->handshaked) {
/*
* The majority of browsers do not send the "close notify" alert.
* Among them are MSIE, old Mozilla, Netscape 4, Konqueror,
* and Links. And what is more, MSIE ignores the server's alert.
*
* Opera and recent Mozilla send the alert.
*/
c->ssl->no_wait_shutdown = 1;
With this flag set, when Nginx later shuts the session down, it performs a single ssl_shutdown and then closes the TCP connection immediately. The TCP connection is closed with the close() system call, which tears down both reading and writing at once; but in this scenario the client still has data to send (its SSL close notify), so this way of closing is not graceful.
Solutions
Client side
On the client side, the problem can be avoided by using keep-alive requests; after all, keep-alive has been the default since HTTP/1.1. But for health-check style requests there is really no need for a persistent connection, so let's look at what can be done on the server side.
Server side
On the server side, Nginx already assumes that no client will reply with a close notify, so when it does meet a strictly well-behaved client such as curl or Route53, the only option is to leave the connection a chance to read that data. Nginx can indeed leave that chance:
lingering_close
https://nginx.org/en/docs/http/ngx_http_core_module.html
The default value “on” instructs nginx to wait for and process additional data from a client before fully closing a connection, but only if heuristics suggests that a client may be sending more data.
The value “always” will cause nginx to unconditionally wait for and process additional client data.
The value “off” tells nginx to never wait for more data and close the connection immediately. This behavior breaks the protocol and should not be used under normal circumstances.
In our case, changing lingering_close on to lingering_close always in the Nginx configuration makes Nginx keep waiting for a while so it can receive the close notify that arrives later. Although this counts as "receiving", Nginx in fact just reads the data and discards it.
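For example, a minimal configuration fragment might look like the following (the surrounding server block is illustrative; lingering_time and lingering_timeout are shown with their documented defaults):

server {
    listen 443 ssl;

    # ... certificates, locations, etc. ...

    lingering_close   always;   # default: on
    lingering_time    30s;      # cap on the whole lingering phase (default)
    lingering_timeout 5s;       # max wait for each additional read (default)
}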
At the code level, after Nginx finishes handling a request, ngx_http_finalize_connection decides whether lingering close should be used; if so, instead of releasing the connection it enters ngx_http_set_lingering_close:
ngx_http_finalize_connection
if (clcf->lingering_close == NGX_HTTP_LINGERING_ALWAYS
|| (clcf->lingering_close == NGX_HTTP_LINGERING_ON
&& (r->lingering_close
|| r->header_in->pos < r->header_in->last
|| r->connection->read->ready
|| r->connection->pipeline)))
{
ngx_http_set_lingering_close(r->connection);
return;
}
ngx_http_set_lingering_close first calls ssl_shutdown to close the SSL session, then calls the shutdown() system call to close only the write side, which preserves the chance to keep reading from the socket, and finally arms a lingering_timeout timer:
if (ngx_shutdown_socket(c->fd, NGX_WRITE_SHUTDOWN) == -1) {
ngx_connection_error(c, ngx_socket_errno,
ngx_shutdown_socket_n " failed");
ngx_http_close_request(r, 0);
return;
}
***
ngx_add_timer(rev, clcf->lingering_timeout);
ngx_http_lingering_close_handler then handles whatever data arrives afterwards and finally closes the connection:
do {
n = c->recv(c, buffer, NGX_HTTP_LINGERING_BUFFER_SIZE);
ngx_log_debug1(NGX_LOG_DEBUG_HTTP, c->log, 0, "lingering read: %z", n);
if (n == NGX_AGAIN) {
break;
}
if (n == NGX_ERROR || n == 0) {
ngx_http_close_request(r, 0);
return;
}
} while (rev->ready);
The connection is ultimately released by calling the close() system call on the socket's fd; only then is the socket truly freed. If you only shutdown() and never close(), the socket resources stay occupied. Without lingering close, Nginx calls close() directly. Here is a comparison captured with strace:
lingering_close on
# strace -p `pidof "nginx: worker process"` -e trace=network,close,shutdown
strace: Process 13479 attached
accept4(7, {sa_family=AF_INET, sin_port=htons(55872), sin_addr=inet_addr("172.31.27.54")}, [112->16], SOCK_NONBLOCK) = 4
recvfrom(4, "\26", 1, MSG_PEEK, NULL, NULL) = 1
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
close(9) = 0
close(4) = 0
lingering_close always
# strace -p `pidof "nginx: worker process"` -e trace=network,close,shutdown
accept4(7, {sa_family=AF_INET, sin_port=htons(55920), sin_addr=inet_addr("172.31.27.54")}, [112->16], SOCK_NONBLOCK) = 3
recvfrom(3, "\26", 1, MSG_PEEK, NULL, NULL) = 1
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
shutdown(3, SHUT_WR) = 0
recvfrom(3, "\25\3\3\0\0321\216\t\"\376'G#\10\253X\307\360\357\27\23-\220\360\317\247\332\251&\354\322", 4096, 0, NULL, NULL) = 31
recvfrom(3, "", 4096, 0, NULL, NULL) = 0
close(8) = 0
close(3) = 0
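At the socket level, the always trace is the classic half-close-and-drain pattern. Below is a minimal sketch with plain blocking sockets (graceful_close is an illustrative name; nginx itself does the same thing with non-blocking reads plus the lingering_timeout timer shown above):

#include <sys/socket.h>
#include <unistd.h>

/* Half-close our write side, drain whatever the peer still sends
 * (the close notify, then FIN), and only then release the fd. */
static void
graceful_close(int fd)
{
    char  buf[4096];

    shutdown(fd, SHUT_WR);                  /* send FIN, keep the read side open */

    while (read(fd, buf, sizeof(buf)) > 0) {
        /* discard the data; stop on EOF (0) or error (-1) */
    }

    close(fd);                              /* only now is the socket freed */
}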
And here is the packet capture after switching to always; the RST is indeed gone:
15:21:54.447491 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [S], seq 493477799, win 26883, options [mss 8460,sackOK,TS val 2121925640 ecr 0,nop,wscale 7], length 0
15:21:54.447538 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [S.], seq 1558193578, ack 493477800, win 62643, options [mss 8961,sackOK,TS val 2282294913 ecr 2121925640,nop,wscale 7], length 0
15:21:54.448955 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [.], ack 1, win 211, options [nop,nop,TS val 2121925641 ecr 2282294913], length 0
15:21:54.449487 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [P.], seq 1:518, ack 1, win 211, options [nop,nop,TS val 2121925642 ecr 2282294913], length 517
15:21:54.449502 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [.], ack 518, win 486, options [nop,nop,TS val 2282294915 ecr 2121925642], length 0
15:21:54.450465 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [P.], seq 1:1522, ack 518, win 486, options [nop,nop,TS val 2282294916 ecr 2121925642], length 1521
15:21:54.451217 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [.], ack 1522, win 234, options [nop,nop,TS val 2121925644 ecr 2282294916], length 0
15:21:54.452418 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [P.], seq 518:644, ack 1522, win 234, options [nop,nop,TS val 2121925645 ecr 2282294916], length 126
15:21:54.452608 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [P.], seq 1522:1573, ack 644, win 486, options [nop,nop,TS val 2282294918 ecr 2121925645], length 51
15:21:54.454038 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [P.], seq 644:764, ack 1573, win 234, options [nop,nop,TS val 2121925647 ecr 2282294918], length 120
15:21:54.454276 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [P.], seq 1573:1847, ack 764, win 486, options [nop,nop,TS val 2282294920 ecr 2121925647], length 274
15:21:54.454321 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [P.], seq 1847:1878, ack 764, win 486, options [nop,nop,TS val 2282294920 ecr 2121925647], length 31
15:21:54.454340 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [F.], seq 1878, ack 764, win 486, options [nop,nop,TS val 2282294920 ecr 2121925647], length 0
15:21:54.455955 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [.], ack 1879, win 258, options [nop,nop,TS val 2121925648 ecr 2282294920], length 0
15:21:54.456015 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [P.], seq 764:795, ack 1879, win 258, options [nop,nop,TS val 2121925649 ecr 2282294920], length 31
15:21:54.456081 IP 172.31.27.54.59918 > 10.0.0.125.443: Flags [F.], seq 795, ack 1879, win 258, options [nop,nop,TS val 2121925649 ecr 2282294920], length 0
15:21:54.456090 IP 10.0.0.125.443 > 172.31.27.54.59918: Flags [.], ack 796, win 486, options [nop,nop,TS val 2282294922 ecr 2121925649], length 0
Of course, if the client is one that never sends a close notify, the connection still gets closed once lingering_timeout expires; the cost is only that the connection is kept around a little longer.
In practice, though, my tests show that clients which skip the close notify still send a FIN or an RST to tear down the TCP connection right away; they do not keep the connection open.
To sum up, setting lingering_close always; in the Nginx configuration gives a graceful close for short-lived connections from clients that do reply with a close notify.
One side note: why do keep-alive connections not produce an RST in this scenario?
Unlike the short-connection case, keep-alive makes Nginx hold the connection open, so it is the client that initiates the close. After receiving the client's close notify, Nginx sets both no_wait_shutdown and no_send_shutdown, which means there is nothing left for ssl_shutdown to do and Nginx can go straight to closing the TCP connection, much like the clients mentioned above that never send a close notify. Doing so does violate the TLS standard, though. (I have to grumble here: the standard requires sending a close notify but not waiting to receive one; if receiving is not required, what is the point of sending it? 🤷♂️)