This is an interesting blog post for me, since it's about the root cause of an issue we saw with Tomcat back in May 2011, which remained unresolved. Under high concurrency and load, Tomcat would reset (i.e. TCP level RST) client connections, instead of refusing to accept them - as one would expect. I raised this again on the Tomcat user list a few days back, but then wanted to find the root cause for myself, since it would surely come up again in the future.
Background
This issue initially became evident when we ran high concurrency load tests at a customer location in Europe, where the customer had back-end services deployed on multiple Tomcat instances and wanted to use the UltraESB for routing messages with load balancing and fail-over. For the ESB Performance Benchmark, we had been using an EchoService written over the Apache HttpComponents/Core NIO library, which scaled extremely well and behaved well at the TCP level, even under load. However, at the client site they wanted the test run against real services deployed on Tomcat, to analyse a more realistic scenario. To generate load we used 'Java Bench', a Java based clone of ApacheBench that is also part of the Apache HttpComponents project. The client would go up to concurrency levels of 2560, pushing as many messages as possible through the ESB to back-end services deployed over Tomcat.

Under high load, the ESB would start to see errors while talking to Tomcat, and the cause would be IO errors such as "Connection reset by peer". The problem for the ESB is that it had already started to send an HTTP request / payload over an accepted TCP connection, so it does not know whether it can safely fail over to another node by default, since the back-end service might have performed some processing on the part of the request it had already received. Of course, the ESB could be configured to retry on such errors as well, but our default behaviour was to fail over only on the safer "connection refused" or connect timeout errors (i.e. a connection could not be established within the allocated time), which ensures correct operation even for non-idempotent services.
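To make that fail-over rule concrete, here is a minimal sketch (hypothetical code, not the actual UltraESB implementation) of how a sender could classify the two kinds of errors:

import java.io.IOException;
import java.io.OutputStream;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration of the fail-over decision described above -
// not the actual UltraESB code.
public class FailoverDecision {

    static void send(String host, int port, byte[] payload) {
        Socket socket = new Socket();
        try {
            // Failures here (connection refused / connect timeout) are safe to
            // fail over: not a single byte of the request has reached the node.
            socket.connect(new InetSocketAddress(host, port), 3000);
        } catch (ConnectException | SocketTimeoutException e) {
            System.out.println("Safe to fail over to another node: " + e);
            return;
        } catch (IOException e) {
            System.out.println("Connect failed: " + e);
            return;
        }

        try (OutputStream out = socket.getOutputStream()) {
            // A "Connection reset by peer" here is NOT safe to retry blindly -
            // the backend may already have processed part of the request, so
            // by default only refused / timed-out connects trigger fail-over.
            out.write(payload);
            out.flush();
        } catch (IOException e) {
            System.out.println("Request partially sent; cannot fail over safely by default: " + e);
        }
    }
}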
Recent Observations
We recently experienced the same issue with Tomcat when a customer wanted to perform a load test where a back-end service would block for 1 to 5 seconds at random, to simulate realistic behaviour. Here again we saw Tomcat resetting accepted TCP connections, and we were able to capture this with Wireshark as follows, using JavaBench directly against a Tomcat based servlet.

As can be seen in the trace, the client initiated a TCP connection with the source port 9386, and Tomcat, running on port 9000, accepted the connection - note “1”. The client kept sending packets of a 100K request, and Tomcat kept acknowledging them; the last such case is annotated with note “2”. Note that the request payload was not yet complete at this point - note “3”. Suddenly, Tomcat resets the connection - note “4”.
Understanding the root cause
After failing to locate any code in the Tomcat source that resets established connections, I wanted to simulate the behaviour with a very simple Java program. Luckily, the problem was easy to reproduce with a program as simple as the following:

import java.net.ServerSocket;
import java.net.Socket;

public class TestAccept1 {

    public static void main(String[] args) throws Exception {
        // listen on port 8280 with a backlog of 0
        ServerSocket serverSocket = new ServerSocket(8280, 0);
        // accept a single connection, then do nothing with it
        Socket socket = serverSocket.accept();
        Thread.sleep(3000000);
    }
}
We just open a server socket on port 8280 with a backlog of 0 and start listening for connections. Since the backlog is 0, one could assume that only one client connection would be allowed - BUT I could open more than that via telnet as follows, and even send some data afterwards by typing it in and pressing the enter key.
telnet localhost 8280
hello world
A netstat command now confirms that more than one connection is open:
netstat -na | grep 8280
tcp 0 0 127.0.0.1:34629 127.0.0.1:8280 ESTABLISHED
tcp 0 0 127.0.0.1:34630 127.0.0.1:8280 ESTABLISHED
tcp6 0 0 :::8280 :::* LISTEN
tcp6 13 0 127.0.0.1:8280 127.0.0.1:34630 ESTABLISHED
tcp6 13 0 127.0.0.1:8280 127.0.0.1:34629 ESTABLISHED
However, the Java program has accepted only ONE socket, although at the OS level two connections appear. It seems the OS allows more connections to be opened even when the backlog is specified as 0. (In fact, the ServerSocket javadoc notes that a backlog of 0 or less falls back to an implementation-specific default, so the effective listen queue was almost certainly larger than 0.) On Ubuntu 12.04 x64, the netstat command would not show me the actual listen queue length, but before and after this test I did not see any difference in the reported "listen queue" overflow statistics, which I could check with the "netstat -sp tcp | fgrep listen" command.
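The same observation can be reproduced programmatically with a companion client (again a hypothetical sketch; it assumes TestAccept1 above is still running on port 8280):

import java.net.InetSocketAddress;
import java.net.Socket;

// Companion client for TestAccept1: opens several connections to port 8280,
// even though the server accept()s only one and asked for a backlog of 0.
public class TestConnect1 {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 3; i++) {
            Socket socket = new Socket();
            socket.connect(new InetSocketAddress("localhost", 8280), 3000);
            // each connect() is expected to complete at the TCP level,
            // mirroring what the telnet test above showed
            System.out.println("connection " + (i + 1) + " established: " + socket.isConnected());
        }
        Thread.sleep(300000); // keep the sockets open while running netstat
    }
}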
Next, I used the JavaBench from the SOA ToolBox and issued a small payload at a concurrency of 1024, with a single iteration, against the same port 8280.
As expected, all requests failed, but my Wireshark trace on port 8280 did not detect any connection resets. Pushing the concurrency to 2560 and the iterations to 10 started to show TCP level RSTs - similar to those seen with Tomcat, though not exactly the same.
Can Tomcat do better?
Yes, possibly. What an end user would expect from Tomcat is that it refuses to accept new connections when under load, rather than accepting connections and then resetting them halfway through. But is that achievable, especially considering the behaviour seen with the simple Java example we just discussed?

Well, the solution could be to handle the low level HTTP connections and sockets more carefully, and this is already done by the free and open source high performance Enterprise Service Bus UltraESB, which uses the excellent Apache HttpComponents project underneath.
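To illustrate the idea (a simplified sketch only, assuming a plain Java NIO server; this is not the actual UltraESB or HttpComponents code), the accept interest can be dropped once the number of open connections reaches a limit, and restored when a connection closes:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Simplified sketch of "pause accepting when loaded" - not the actual
// UltraESB implementation.
public class PausingAcceptor {
    private static final int MAX_OPEN = 2;   // e.g. a limit of 2 open connections
    private static int open = 0;

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8280));
        server.configureBlocking(false);
        SelectionKey acceptKey = server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(8192);
        while (true) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                    if (++open >= MAX_OPEN) {
                        // "maintenance mode": stop accepting, so further
                        // connection attempts back up in the OS listen backlog
                        // instead of being accepted and later reset
                        acceptKey.interestOps(0);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    // this sketch only watches for the peer closing; data is discarded
                    if (client.read(buf) == -1) {
                        key.cancel();
                        client.close();
                        if (--open < MAX_OPEN) {
                            // a slot freed up - resume accepting connections
                            acceptKey.interestOps(SelectionKey.OP_ACCEPT);
                        }
                    }
                }
            }
        }
    }
}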
How does the UltraESB behave
One could easily test this by using the 'stopNewConnectionsAt' property of our NIO listener. If you set it to 2, you won't be able to open even a Telnet session to the socket beyond two: the first would work, and so would the second.
But the third would see a "Connection refused"
And the UltraESB would report the following on its logs:
INFO HttpNIOListener HTTP Listener http-8280 paused
WARN HttpNIOListener$EventLogger Enter maintenance mode as open connections reached : 2
Although it refuses to accept new connections, already accepted connections execute to completion without any hindrance. Thus a hardware load balancer in front of an UltraESB cluster can safely load balance when an UltraESB node is loaded beyond its configured limits, without having to deal with any connection resets. Once a connection slot becomes free, the UltraESB starts accepting new connections again as applicable.
Analysing a corresponding TCP dump
To analyse the corresponding behaviour, we wrote a simple Echo proxy service on the UltraESB that also slept for 1 to 5 seconds before replying, and tested it with the same JavaBench under 2560 concurrent users, each trying to push 10 messages in iteration.

Out of the 25600 requests, 7 completed successfully while 25593 failed, as expected. We also saw many TCP level RSTs in the Wireshark dump - which must have been issued by the underlying operating system.
However, what's interesting to note is the difference: the RSTs occur immediately on receiving the SYN packet from the client. These are not resets of established TCP or HTTP connections, but clean "Connection refused" errors - exactly what the client would expect. Thus the client can safely fail over to another node without any doubt, overhead or delay.
Appendix : Supporting high concurrency in general
During testing we also saw that the Linux OS could detect the opening of many concurrent connections at the same time as a SYN flood attack, and then start using SYN cookies. If this happens, you would see messages such as the following in the output of "sudo dmesg":
Possible SYN flooding on port 9000. Sending cookies
Hence, for a realistic load test it is better to disable SYN cookies by turning them off as follows, as the root user:
# echo 0 > /proc/sys/net/ipv4/tcp_syncookies
To make the change persist across reboots, add the following line to your /etc/sysctl.conf:
net.ipv4.tcp_syncookies = 0
To allow the Linux OS to accept more connections, it is also recommended to increase 'net.core.somaxconn', which usually defaults to 128 or so. This can be done by the root user as follows:
# echo 1024 > /proc/sys/net/core/somaxconn
To persist the change, append the following to /etc/sysctl.conf:
net.core.somaxconn = 1024
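Note that net.core.somaxconn is only an upper bound - the listening application must also request a matching backlog when it binds its server socket (Tomcat, for example, exposes this as the acceptCount attribute of its connector). A minimal Java illustration, in the style of the earlier test program:

import java.net.InetSocketAddress;
import java.net.ServerSocket;

// The backlog requested here is silently capped by net.core.somaxconn, so
// raising the kernel limit only helps if the application asks for a larger
// backlog as well.
public class LargeBacklogListener {
    public static void main(String[] args) throws Exception {
        ServerSocket serverSocket = new ServerSocket();
        serverSocket.bind(new InetSocketAddress(8280), 1024); // request a backlog of 1024
        System.out.println("Listening on 8280 with a requested backlog of 1024");
        Thread.sleep(3000000);
    }
}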