Monday, November 5, 2012

How to stop biting, when you can't chew more..

This is a follow-up to my earlier post 'Does Tomcat bite more than it can chew?'. It illustrates a pure Java program that uses Java NIO to stop accepting new connections when it is not able to handle the load, without any dependence on the TCP backlog etc.

Program Implementation

We open a selector and invoke the startListening() method, which opens a ServerSocketChannel and binds it to port 8280. The channel is configured as non-blocking, and finally we register our interest in OP_ACCEPT to handle incoming connections.

    private void startListening() throws IOException {
        server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8280), 0);
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        System.out.println("\nI am ready to listen for new messages now..");
    }


If you telnet to the port via a command line after the server starts up, you would see a message "Hi there! type a word". The server accepts the incoming connection as a non-blocking connection, and registers OP_READ to read the content typed in.


To illustrate how the server can prevent another client from connecting while it serves the currently connected client, it prints "I accepted this one.. but not any more now" on its console, then cancels the OP_ACCEPT SelectionKey and closes the ServerSocketChannel (the listening channel, not the client's channel).

A new telnet session will see the "Connection refused" error as expected.

Next, I would type a small word into the first telnet session, and the server would print it in its console, and close the connection.

At the same time, it prints the message "I am ready to listen for new messages now.." on its console, and invokes the above startListening() method again - making it ready to listen and accept a new client.

We re-try the connection from the same command prompt that received the 'Connection refused' earlier, and as expected are greeted with the welcome message again.

What did we learn?

A low level NIO server can stop accepting new connections if it can determine that it's not able to serve new clients. Once it stops accepting connections this way, any client connection attempt will see a 'Connection Refused' error. This may not be the case if our server was implemented differently (see my last article, its example, and how Tomcat behaves under load).
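The client's view of this can be sketched in a few lines. The class and method names below (RefusedProbe, probe, demo) are made up for illustration and are not part of the program in this post. The sketch opens a listener on an ephemeral loopback port, probes it once, closes the listener, and probes again. A refused connection surfaces in Java as a java.net.ConnectException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, for illustration only
class RefusedProbe {

    /** Makes one connect attempt and reports "connected", "refused" or "error: ...". */
    static String probe(int port) {
        try (Socket s = new Socket()) {
            // 2 second connect timeout, chosen arbitrarily for this sketch
            s.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 2000);
            return "connected";
        } catch (ConnectException e) {
            // Nothing is listening on the port: the OS refused the connection
            return "refused";
        } catch (IOException e) {
            return "error: " + e;
        }
    }

    /** Probes a port while a listener is open, and again after it has been closed. */
    static String[] demo() {
        try {
            ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
            int port = listener.getLocalPort();
            String whileListening = probe(port);
            listener.close(); // same effect as closing the ServerSocketChannel above
            String afterClose = probe(port);
            return new String[] { whileListening, afterClose };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("while listening: " + result[0]);
        System.out.println("after close:     " + result[1]);
    }
}
```

Because the refusal happens before any request bytes leave the client, the caller knows the server has processed nothing.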

Complete source code

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class TestAccept2 {

    private ServerSocketChannel server = null;
    private Selector selector = null;

    public static void main(String[] args) throws Exception {
        new TestAccept2().run();
    }

    private void run() throws Exception {
        selector = Selector.open();
        startListening();

        while (true) {
            selector.select();

            // A typed Iterator is required here: with a raw Iterator, next() returns Object
            for (Iterator<SelectionKey> i = selector.selectedKeys().iterator(); i.hasNext(); ) {
                SelectionKey key = i.next();
                i.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.socket().setTcpNoDelay(true);
                    client.register(selector, SelectionKey.OP_READ);

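                    System.out.println("I accepted this one.. but not any more now");
                    // Stop listening: deregister OP_ACCEPT and close the server channel,
                    // so that further connection attempts are refused
                    key.cancel();
                    key.channel().close();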
                    sayHello(client);

                } else if (key.isReadable()) {
                    readDataFromSocket(key);
                }
            }
        }
    }

    private void startListening() throws IOException {
        server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8280), 0);
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        System.out.println("\nI am ready to listen for new messages now..");
    }

    private void sayHello(SocketChannel channel) throws Exception {
        channel.write(ByteBuffer.wrap("Hi there! type a word\r\n".getBytes()));
    }

    private void readDataFromSocket(SelectionKey key) throws Exception {
        SocketChannel socketChannel = (SocketChannel) key.channel();
        ByteBuffer buffer = ByteBuffer.allocate(32);
        if (socketChannel.read(buffer) > 0) {
            buffer.flip();
            byte[] bytearr = new byte[buffer.remaining()];
            buffer.get(bytearr);
            System.out.print(new String(bytearr));
            socketChannel.close();

            startListening();
        }
    }

}


Wednesday, October 31, 2012

Does Tomcat bite more than it can chew?

This is an interesting blog post for me, since it's about the root cause of an issue we saw with Tomcat back in May 2011, which remained unresolved. Under high concurrency and load, Tomcat would reset (i.e. TCP level RST) client connections it had already accepted, instead of refusing to accept them as one would expect. I posted this again to the Tomcat user list a few days back, but then wanted to find out the root cause for myself, since it would surely come up again in the future.

Background

This issue initially became evident when we ran high concurrency load tests at a customer location in Europe, where the customer had back-end services deployed on multiple Tomcat instances, and wanted to use the UltraESB for routing messages with load balancing and fail-over. For the ESB Performance Benchmark, we had been using an EchoService written over the Apache HttpComponents/Core NIO library that scaled extremely well and behaved well at the TCP level, even under load. However, at the client site, they wanted the test run against real services deployed on Tomcat - to analyse a more realistic scenario. We used a Java based clone of ApacheBench called the 'Java Bench' which is also a part of the Apache HttpComponents project, to generate load. The client would go up to concurrency levels of 2560, pushing as many messages as possible through the ESB, to back end services deployed over Tomcat.

Under high load, the ESB would start to see errors while talking to Tomcat, and the cause would be IO errors such as "Connection reset by peer". The problem for the ESB is that it had already started to send out an HTTP request / payload over an accepted TCP connection, and thus it does not know if it can safely fail over to another node by default, since the backend service might have performed some processing over the part of the request it may have already received. Of course, the ESB could be configured to retry on such errors as well, but our default behaviour was to fail over only on the safer connection refused or connect timeout errors (i.e. a connection could not be established within the allocated time) - which ensures correct operation, even for non-idempotent services.
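The fail-over rule described above can be written as a small predicate. This is only a sketch under my own assumptions, not the UltraESB implementation: the class name FailoverPolicy, the method isSafeToFailOver and the requestBytesSent flag are all hypothetical. The flag is needed because a SocketTimeoutException can also be raised by a read that times out after the request was sent.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Hypothetical policy class, for illustration only
class FailoverPolicy {

    /**
     * Decides whether a failed request may be retried on another node
     * without risking duplicate processing by a non-idempotent service.
     *
     * @param e                the I/O error reported for the request
     * @param requestBytesSent true if any part of the request was already written
     */
    static boolean isSafeToFailOver(IOException e, boolean requestBytesSent) {
        if (requestBytesSent) {
            // The backend may have seen (and acted on) part of the request
            return false;
        }
        // Connection refused, or the connection was not established in time
        return e instanceof ConnectException || e instanceof SocketTimeoutException;
    }

    public static void main(String[] args) {
        System.out.println(isSafeToFailOver(new ConnectException("Connection refused"), false));
        System.out.println(isSafeToFailOver(new IOException("Connection reset by peer"), true));
    }
}
```

A "Connection reset by peer" on an established connection falls into the unsafe branch, which is why resets from Tomcat were a problem while refusals would not have been.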

Recent Observations

We recently experienced the same issue with Tomcat when a customer wanted to perform a load test scenario where a back-end service would block for 1-5 seconds randomly, to simulate realistic behaviour. Here, again we saw that Tomcat was resetting accepted TCP connections, and we were able to capture this with Wireshark as follows, using JavaBench directly against a Tomcat based servlet.


As can be seen in the trace, the client initiated a TCP connection with the source port 9386, and Tomcat running on port 9000 accepted the connection - note “1”. The client kept sending packets of a 100K request, and Tomcat kept acknowledging them. The last such case is annotated with note “2”. Note that the request payload from the client was not complete at this time – note “3”. Suddenly, Tomcat resets the connection – note “4”.

Understanding the root cause

After failing to locate any code in the Tomcat source code that resets established connections, I wanted to simulate the behaviour with a very simple Java program. Luckily the problem was easy to reproduce with a simple program as follows:

import java.net.ServerSocket;
import java.net.Socket;

public class TestAccept1 {

    public static void main(String[] args) throws Exception {
        ServerSocket serverSocket = new ServerSocket(8280, 0);
        Socket socket = serverSocket.accept();
        Thread.sleep(3000000); // do nothing
    }
}


We just open a server socket on port 8280, with a backlog of 0 and start listening for connections. Since the backlog is 0, one could assume that only one client connection would be allowed - BUT, I could open more than that via telnet as follows, and even send some data afterwards by typing it in and pressing the enter key.

telnet localhost 8280
hello world

A netstat command now confirms that more than one connection is open:

netstat -na | grep 8280
tcp        0      0 127.0.0.1:34629         127.0.0.1:8280          ESTABLISHED
tcp        0      0 127.0.0.1:34630         127.0.0.1:8280          ESTABLISHED
tcp6       0      0 :::8280                 :::*                    LISTEN    
tcp6      13      0 127.0.0.1:8280          127.0.0.1:34630         ESTABLISHED
tcp6      13      0 127.0.0.1:8280          127.0.0.1:34629         ESTABLISHED

However, the Java program has accepted only ONE socket, although two appear as established at the OS level. It seems the OS allows even more connections to be opened, even when the backlog is specified as 0. On Ubuntu 12.04 x64, the netstat command would not show me the actual listen queue length - but I believe it was not 0. However, before and after this test, I did not see a difference in the reported statistics for "listen queue" overflow, which I could see with the "netstat -sp tcp | fgrep listen" command.
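One likely reason the backlog of 0 has no limiting effect: the Javadoc for java.net.ServerSocket states that a backlog less than or equal to 0 selects an implementation specific default, so 0 does not mean "no queue" (as far as I know, OpenJDK uses 50). The sketch below, with the hypothetical names BacklogDemo and connectsWithoutAccept, shows that the OS completes a connection and takes data from the client even though the program never calls accept(). It uses an ephemeral loopback port rather than 8280.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, for illustration only
class BacklogDemo {

    private static final int ANY_FREE_PORT = 0;
    private static final int REQUESTED_BACKLOG = 0; // <= 0 selects the implementation default

    /** Returns true if a client could connect and send data although accept() was never called. */
    static boolean connectsWithoutAccept() {
        try (ServerSocket listener = new ServerSocket(
                     ANY_FREE_PORT, REQUESTED_BACKLOG, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {
            // The handshake was completed by the OS; the bytes sit in the receive queue
            OutputStream out = client.getOutputStream();
            out.write("hello world\r\n".getBytes());
            out.flush();
            return client.isConnected();
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("connected without accept(): " + connectsWithoutAccept());
    }
}
```

This matches the netstat output above, where the server side shows 13 bytes queued (Recv-Q) on connections the Java program has not read from.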

Next I used the JavaBench from the SOA ToolBox and issued a small payload at concurrency 1024, with a single iteration against the same port 8280


As expected, all requests failed, but my Wireshark trace on port 8280 did not detect any connection resets. Pushing the concurrency to 2560 and the iterations to 10 started to show TCP level RSTs - which were similar to those seen on Tomcat, though not exactly the same.

 

Can Tomcat do better?

Yes, possibly. What an end user would expect from Tomcat is that it refuses to accept new connections when under load, rather than accepting connections and then resetting them halfway through. But one would ask if that is achievable, especially considering the behaviour seen with the simple Java example we discussed.

Well, the solution could be to perform better handling of the low level HTTP connections and the sockets, and this is already done by the free and open source high performance Enterprise Service Bus UltraESB, which utilizes the excellent Apache HttpComponents project underneath.

How does the UltraESB behave

One could easily test this by using the 'stopNewConnectionsAt' property of our NIO listener. If you set it to 2, you won't be able to open even a Telnet session to the socket beyond the second connection.

The first would work, the second too
But the third would see a "Connection refused"
And the UltraESB would report the following on its logs:

  INFO HttpNIOListener HTTP Listener http-8280 paused  
  WARN HttpNIOListener$EventLogger Enter maintenance mode as open connections reached : 2

Although it refuses to accept new connections, already accepted connections execute to completion without any hindrance. Thus a hardware level load balancer in front of an UltraESB cluster can safely load balance if an UltraESB node is loaded beyond its configured limits, without having to deal with any connection resets. Once a connection slot becomes free, the UltraESB will start accepting new connections as applicable.
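The pause and resume decision behind such a limit can be sketched as a simple counter. This is not the UltraESB code, only an illustration under my own assumptions: the class AcceptGate and its methods are hypothetical. In a program like TestAccept2 from the follow-up post, one would close the ServerSocketChannel when onAccepted() returns true, and call startListening() again when onClosed() returns true.

```java
// Hypothetical helper, for illustration only
class AcceptGate {

    private final int limit; // plays the role of 'stopNewConnectionsAt'
    private int open;        // connections currently being served
    private boolean paused;  // true while the listener is closed

    AcceptGate(int limit) {
        this.limit = limit;
    }

    /** Call after accepting a connection; returns true if the listener should now be closed. */
    synchronized boolean onAccepted() {
        open++;
        if (!paused && open >= limit) {
            paused = true;
            return true;
        }
        return false;
    }

    /** Call after a connection ends; returns true if the listener should be reopened. */
    synchronized boolean onClosed() {
        open--;
        if (paused && open < limit) {
            paused = false;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AcceptGate gate = new AcceptGate(2);
        System.out.println("pause after 1st accept? " + gate.onAccepted());
        System.out.println("pause after 2nd accept? " + gate.onAccepted());
        System.out.println("resume after a close?   " + gate.onClosed());
    }
}
```

With a limit of 2, the first two connections are accepted and the listener is then closed, so a third attempt is refused by the OS until a slot becomes free.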

Analysing a corresponding TCP dump

To analyse the corresponding behaviour, we wrote a simple Echo proxy service on the UltraESB, that also slept for 1 to 5 seconds before it replied, and tested this with the same JavaBench under 2560 concurrent users, each trying to push 10 messages in iteration.

Out of the 25600 requests, 7 completed successfully, while 25593 failed, as expected. We also saw many TCP level RSTs on the Wireshark dump - which must have been issued by the underlying operating system.


However, what's interesting to note is the difference - the RSTs occur immediately on receiving the SYN packet from the client - and are not established HTTP or TCP connections, but elegant "Connection Refused" errors - which would be what the client can expect. Thus the client can safely fail-over to another node without any doubt, overhead or delay.

Appendix : Supporting high concurrency in general

During testing we also saw that the Linux OS could detect the opening of many concurrent connections at the same time as a SYN flood attack, and then start using SYN cookies. You would see messages such as 
Possible SYN flooding on port 9000. Sending cookies

displayed on the output of a "sudo dmesg", if this happens. Hence, for a real load, it would be better to disable SYN cookies by turning them off as follows as the root user
# echo 0 > /proc/sys/net/ipv4/tcp_syncookies

To make the change persist over reboots, add the following line to your /etc/sysctl.conf
net.ipv4.tcp_syncookies = 0

To allow the Linux OS to accept more connections, it's also recommended that the 'net.core.somaxconn' be increased - as it usually defaults to 128 or so. This could be performed by the root user as follows,
# echo 1024 > /proc/sys/net/core/somaxconn

To persist the change, append the following to the /etc/sysctl.conf
net.core.somaxconn = 1024


Kudos!

The UltraESB could not have behaved gracefully without the support of the underlying Apache HttpComponents library, and the help and support received from that project community, especially from Oleg Kalnichevski - whose code and help have always fascinated me!



Friday, September 7, 2012


Thursday, August 23, 2012

First they ignore you, then they try to ridicule you, then they fight you, then you win

"First they ignore you, then they try to ridicule you, then they fight you, then you win" - is thought to have been said by Gandhi.


Tuesday, August 21, 2012

AdroitLogic Announces API Management Solution Based On Its High Performance ESB

We've just announced the APIDirector! an API and Services management solution based on our high performance Enterprise Service Bus UltraESB.


One of the key differences of the APIDirector is that it will offer both API and Services management features for enterprises, including support for AS2 and other legacy/traditional B2B and service protocols.

We announced the results of the 6th Round of ESB Performance Benchmarking earlier in August, although I missed blogging about it at the time, as my father was ill during those few days. The benchmark results showed the extreme performance as well as the stability of the UltraESB, which are both key elements of any API management solution.

Since an API management solution will be the entry point for your trading partners, customers and users accessing your exposed APIs - it MUST be capable of withstanding extreme load, as well as deliberate security attacks, without crashing by itself. The Round 6 results and the related information show how some of the ESBs fail to withstand even legitimate and relatively small amounts of load, when compared to an external attack.

The APIDirector performs functions such as credential management, service logging, auditing and performance management support, with an easy to use graphical administration interface that also provides analytical features. We also ship the AS2Gateway as an optional module of the APIDirector, and this allows users to deploy a custom AS2 trading gateway - similar to the AS2Gateway we have hosted publicly.

Next week we will be deploying the AS2Director at one of our beta customer sites in the US, and it will initially deal with defining S/FTP file exchanges and SOAP based services, as well as AS2 based B2B exchanges. The APIDirector will be generally available to the public during the first quarter of 2013, although we will be happy to work with customers with beta releases prior to the general availability. Please contact us to learn more about the APIDirector, and how you could participate in the beta!

Original News Release: http://www.prweb.com/releases/prweb9820781.htm

Tuesday, July 31, 2012

Electronic Invoicing Announced on the AS2Gateway

We've just announced support for electronic invoicing on the cloud based Free B2B Trading Gateway AS2Gateway! This allows users to invoice trading partners via EDIFACT INVOIC messages over AS2, using a simple and intuitive web interface


Recurrent invoicing is simplified as any invoice can be saved as a template and re-used multiple times. The AS2Gateway converts all the invoice details into an EDIFACT D93A INVOIC message right now, and soon will support the other versions, as well as X12 based messages such as the 810.

In the near future, the platform will add support for parsing and generation of more message types, which will allow users to easily generate an invoice based on a purchase order received; or an advanced shipping notification (ASN) etc.

The AS2Gateway offers a free tier to support most SMEs that are required to electronically invoice trading partners for payment. For larger users, a premium tier is available with advanced options (to be announced shortly!), and for retailers or large corporations, the AS2Gateway is available for on-premise private deployment.

Tuesday, July 17, 2012

UltraESB Automates AS2/EDI Based B2B Trading for SKB Europe

We've just published a new case study on how SKB Europe, a leader in high-quality aluminium and plastic cases and boxes, has deployed the UltraESB as an AS2/EDI integration platform for Business-to-Business (B2B) integration.

The system initially integrates the ERP system of SKB Europe with one of its key trading partners Amazon, via EDIFACT based EDI documents exchanged over the AS2 protocol.

Read the news release and Case study from here
http://www.prweb.com/releases/2012/7/prweb9706746.htm

Thursday, June 28, 2012

ESB Performance Benchmarking Round 6 is coming..

AdroitLogic has just announced the Sixth round of ESB Performance Benchmarking today!

Round 5 introduced a public Amazon EC2 AMI image of all the ESBs that were included in the benchmark, and the next round has moved further by requesting vendors and/or contributors to submit better optimized configurations for the different ESBs to be tested.

Stability and Performance are both important to any ESB, and the last round saw half of the selected ESBs not being able to complete the benchmark successfully. Each ESB has since released new versions, and we look forward to testing each one successfully!

For more information, visit http://esbperformance.org and email info@adroitlogic.com if you are interested in participating closely with us during this round!

Monday, June 25, 2012

UltraESB Connects Coupon Publishers and Major Retailers in Europe in B2B Solution for HighCo Data Spain

Today we are happy to announce that Storelabs, a European startup specialized in FMCG/Retail marketing and operations, has deployed an AS2/EDI - FTP/XML Gateway for HighCo Data Spain. The Cloud hosted gateway 'PromoHub' initially integrates the retailer Carrefour Spain, with multiple coupon publishers utilizing the HighCo Data coupon clearing services.

Read the full Press Release and the Case Study

Wednesday, June 20, 2012

We've just launched our Free EDI/AS2 B2B Gateway - AS2Gateway.ORG

Since its launch in January 2010, the UltraESB has been the only free and open source ESB that natively supports the widely used B2B trading protocol AS2 (Applicability Statement 2). AS2 is used extensively in the US, Europe and Asia for B2B integration, especially in the Retail and Manufacturing industries, and the UltraESB has been used by many organizations to integrate EDI/AS2 systems with S/FTP, Web Services and XML style systems.

The UltraESB is now used by multiple organizations to integrate with some of the largest retailers in Europe and the US, and it was natural for us to see the potential in offering a Cloud hosted EDI/AS2 solution as a Service (SaaS).

The AS2Gateway is powered by our flagship product UltraESB, and offers free EDI/AS2 connectivity for SMEs. It certainly disrupts the traditional market place in both the features offered and the cost savings. For example, the free tier allows SMEs to trade electronically with up to 5 trading partners and offers 60 messages per month.

We've also just launched AS2Integration.ORG which is a resource site on AS2/EDI integration, as well as the documentation and user guide for the use of the AS2Gateway.

Utilizing the UltraESB underneath means that the flexibility is virtually limitless, and we will soon expand this service to include Electronic Invoicing, EDI document creation and processing as well!

The AS2Gateway begins its 'beta' testing phase today, and you can utilize all of its features for free until we move into production in a few more months! No credit cards are necessary for sign up - so get your AS2/EDI trading account from the Free AS2Gateway!

Read our Press Release about the launch for more details!

Sunday, May 20, 2012

Webinar: UltraESB - The Future of Enterprise Integration

Register for our free Webinar on using the UltraESB, on Tuesday the 29th of May 2012.

Click Here for more details and to Register!

Tuesday, January 3, 2012

OSGi - "Seduced by technology in search of a problem" :) ?

I was listening to a very interesting talk at QCon by Rod Johnson of SpringSource. The talk was titled "Things I Wish I'd known" and was about lessons he learned as an entrepreneur, which was very interesting!

Around 37:00 into the talk, he had a slide "Some things we got wrong at SpringSource" and it contained bullets about the over investment in OSGi for 2 years, which I quote below:

 - Seduced by technology in search of a problem
 - Clever technology but didn't solve most pressing customer problems


"Classic example of falling in love with a technology without considering the business implications.

We significantly over invested in OSGi and our dm-server technology for a couple of years. It wasn't a bad technology. Unfortunately, it simply solved a set of problems that a very small number of customers had, and yeah, we should have spent more time listening to the pressing problems our potential customers actually had.

That ultimately was something that we worked around, but we spent millions of dollars that we could have spent elsewhere."
It's interesting to note that about a year back, Ross Mason of MuleSource blogged about using OSGi in the Mule ESB in the post 'OSGi? No Thanks' and said:
OSGi is a great specification for middleware vendors, but a terrible specification for the end user. This is because OSGi was never build for application developer consumption