Performance Comparison Between Node.js and Java EE For Reading JSON Data from CouchDB


Node.js has impressed me several times with high performance right out of the box. In my last Node.js project it was the same: we beat the given performance targets without having to tweak the application at all. I never really experienced this in Java EE projects. Granted, the project was perfectly suited for Node.js: a small project centered around fetching JSON documents from CouchDB. Still, I wanted to know: how would Java EE compare to Node.js in this particular case?

TL;DR: I wanted to know how the performance of the application built on a vanilla Java stack would compare. I ran some simple performance tests. It turns out the Node.js application was about 20% faster than a similar Java servlet application running on Tomcat 7. Not bad. You cannot generalise these results, though.

The Original Project

The Node.js project had to meet the following performance targets: 150 requests/second at 200ms average response time. I am not a performance guru, but 200ms response time sounded pretty fast, and my feeling was that we would have to tweak the application to reach those goals.

A separate team ran performance tests against our application, and when the results came back the application actually had exceeded all performance targets: 200 requests/second at 100ms average response time. That was much better than the targets. I was quite amazed that Node.js was outperforming the requirements by such a margin, and all of this without any performance optimisation.

I asked myself: Is this really good performance given the functionality of the application? Is Node.js just magically fast? What would the performance be if we had gone with the established platform of Java EE?

I really couldn’t answer that question. Many Java EE applications I have worked on had response times that felt more like 1000ms, but they had more complex functionality than our Node.js application did. The core of our application only fetched JSON documents by ID from a single CouchDB database. No complex SQL, no table joins, and no data manipulation. I didn’t know how a Java EE application would perform given those requirements. So I set out to answer the question: Can the perceived performance of Node.js vs. a traditional Java EE system be backed up by hard performance tests?

To answer this question I designed a set of performance tests to be run against both a Java EE application and a Node.js application, both backed by the same CouchDB database, and looked at how the two systems compared.

Preparation

I ran the same performance tests against both a Node.js application and a Java servlet application. Both applications used the same backend as our original Node.js application: CouchDB. I used Couchbase Single Server version 1.1.3. I created 10,000 sample documents of 4KB each with random text. The test machine was an iMac with a 2.4 GHz Intel Core 2 Duo, 4 GB RAM, and Mac OS X.

I used Apache JMeter running on a separate machine as a test driver. The JMeter scripts fetched random documents from each application at various levels of concurrency.

Java EE

The Java servlet ran on Apache Tomcat version 7.0.21 in its default configuration, on Java 1.6. The database driver was CouchDB4J version 0.30. The driver has no caching options available, so no configuration was done.

The following Java code is a servlet that fetches a document from CouchDB by ID and returns the data as a JSON object.


package com.shinetech.couchDB;

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.log4j.Logger;

import com.fourspaces.couchdb.Database;
import com.fourspaces.couchdb.Document;
import com.fourspaces.couchdb.Session;

@SuppressWarnings("serial")
public class MyServlet extends HttpServlet {
  Logger logger = Logger.getLogger(this.getClass());
  Session s = new Session("localhost",5984);
  Database db = s.getDatabase("testdb");

  // Fetch the document whose ID is given in the request path and return it as JSON.
  public void doGet(HttpServletRequest req, HttpServletResponse res)
    throws IOException {
    String id = req.getPathInfo().substring(1); // strip the leading "/"
    PrintWriter out = res.getWriter();
    Document doc = db.getDocument(id);
    if (doc == null) {
      // Report missing documents as 404 rather than a 200 with an error text
      res.setStatus(HttpServletResponse.SC_NOT_FOUND);
      res.setContentType("text/plain");
      out.println("Error: no document with id " + id + " found.");
    } else {
      res.setContentType("application/json");
      out.println(doc.getJSONObject());
    }
    out.close();
  }
}

I ran the JMeter tests against this servlet at various levels of concurrency. The following table shows the number of concurrent requests, the average response time, and the requests that were served per second.

Concurrent Requests | Average Response Time (ms) | Requests/second
--------------------|----------------------------|----------------
10                  | 23                         | 422
50                  | 119                        | 416
100                 | 243                        | 408
150                 | 363                        | 411

What can be seen is that the response time deteriorates as the number of concurrent requests increases. The response time was 23 ms on average at 10 concurrent requests, and 243 ms on average at 100 concurrent requests.

The interesting part is that the average response time correlates almost linearly with the number of concurrent requests: a tenfold increase in concurrent requests leads to a tenfold increase in response time per request. This keeps the number of requests that can be handled per second pretty constant, regardless of whether we have 10 concurrent requests or 150. At all observed concurrency levels the number of requests served per second was roughly 420.
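This is exactly what Little's Law predicts for a saturated system: throughput ≈ concurrency / average response time. A quick sanity check against the measurements above:

```javascript
// Little's Law for a saturated system: throughput ≈ concurrency / response time.
// Pairs of [concurrent requests, average response time in ms] from the table above.
var measurements = [[10, 23], [50, 119], [100, 243], [150, 363]];

var predicted = measurements.map(function (m) {
  return Math.round(m[0] * 1000 / m[1]); // response time converted from ms to s
});

console.log(predicted); // → [ 435, 420, 412, 413 ]
```

Each predicted value is close to the observed throughput of roughly 420 requests/second, which suggests the server itself, not the load driver, was the limiting factor.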

Node.js

The Node.js application ran on Node.js 0.10.20 using the Cradle CouchDB driver version 0.57. Caching was turned off in the driver to create equal conditions.

The following shows the Node.js program that delivers the same JSON document from CouchDB for a given ID:


var http = require('http'),
  url = require('url'),
  cradle = require('cradle'),
  c = new(cradle.Connection)(
          '127.0.0.1',5984,{cache: false, raw: false}),
  db = c.database('testdb'),
  port=8081;

process.on('uncaughtException', function (err) {
  console.log('Caught exception: ' + err);
});

http.createServer(function(req,res) {
  var id = url.parse(req.url).pathname.substring(1);
  db.get(id,function(err, doc) {
    if (err) {
      console.log('Error: ' + err.message);
      res.writeHead(500,{'Content-Type': 'text/plain'});
      res.write('Error: ' + err.message);
      res.end();
    } else {
      res.writeHead(200,{'Content-Type': 'application/json'});
      res.write(JSON.stringify(doc));
      res.end();
    }
  });
}).listen(port);

The numbers for the Node.js system were as follows:

Concurrent Requests | Average Response Time (ms) | Requests/second
--------------------|----------------------------|----------------
10                  | 19                         | 509
50                  | 109                        | 453
100                 | 196                        | 507
150                 | 294                        | 506

As before, the average response time correlates linearly with the number of concurrent requests, keeping the number of requests that can be served per second pretty constant. Node.js is roughly 20% faster, e.g. 509 requests/second vs. 422 requests/second at ten concurrent requests.
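The 20% figure follows directly from the throughput numbers at ten concurrent requests:

```javascript
// Throughput at ten concurrent requests, taken from the two tables above.
var nodeThroughput = 509; // req/s, Node.js
var javaThroughput = 422; // req/s, Java servlet on Tomcat 7

var speedup = (nodeThroughput / javaThroughput - 1) * 100;
console.log(speedup.toFixed(1) + '%'); // → 20.6%
```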

Conclusion

The Node.js solution is 20% faster than the Java EE solution for the problem at hand. That amazed me: an interpreted language as fast as or faster than a compiled language running on a VM into which years of optimisation have gone. Not bad at all.

It is important to take this with a grain of salt: this type of application is perfectly suited for Node.js. I would be wary of extending the findings here to other applications. I believe that, because of the interpreted nature of JavaScript and the lack of established patterns for programming in the large, Node.js applications are best kept small.

Both Node.js and Java EE scale beyond what a normal server needs. 400-500 requests per second is quite a lot. Google, the largest website in the world, gets about 5 billion requests per day. Divide that by 24 hours, 60 minutes, and 60 seconds and it comes out to roughly 57,870 requests/second. That is the number of requests across all Google domains worldwide, so if you have a website handling 400 requests per second on one machine, your website is already pretty big. 1 million requests per day on average means about 11.6 requests per second. Keep that in mind.
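The arithmetic behind these figures is just a division by the number of seconds in a day:

```javascript
var secondsPerDay = 24 * 60 * 60; // 86,400 seconds

// Roughly 5 billion requests per day across all Google domains:
console.log(Math.round(5e9 / secondsPerDay)); // → 57870 requests/second

// A site averaging 1 million requests per day:
console.log((1e6 / secondsPerDay).toFixed(1)); // → 11.6 requests/second
```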

In this test the different concurrency models of single-threaded Node.js and multi-threaded Java EE made no difference. To test Node.js at higher concurrency levels – where it is supposed to outshine multi-threading – other problems like the number of open files need to be considered. I was not able to run these tests beyond 150 concurrent requests because the OS complained about too many open files. This could have been solved through configuration, but that is beyond the scope of this article.

For a general comparison of Node.js and Java EE see my blog post Node.js From the Enterprise Java Perspective.

About Marc Fasel

Marc is a Senior Consultant with Shine Technologies. He has written code in 19 programming languages, but can only speak two natural languages. He enjoys referring to himself in the third person.

24 Responses to Performance Comparison Between Node.js and Java EE For Reading JSON Data from CouchDB

  1. Noah White says:

    It’s an interesting test. At the very least could you make the src, test data, and configuration available (with an open license) on GitHub? Also, technically Tomcat 7 is not a full Java EE application server; it is simply a Servlet container, which is one small part of the EE spec. There are other Servlet containers out there, like Jetty, which will undoubtedly give you different results. There are also other choices in the EE stack, like JAX-RS (REST), which can run using only the Grizzly NIO server. I think you need to conduct a bit more experimentation before making such sweeping performance generalizations. Lastly, you are only looking at performance, which is only one axis. There’s reliability, scalability, TCO etc.

    • Marc Fasel says:

      I think that’s exactly what I didn’t do: make sweeping performance generalisations. We had a very specific task and chose Node.js to do it. I wanted to know what would have happened if we had chosen a vanilla Java stack, and that’s all I documented here: “The Node.js solution is 20% faster than the Java EE solution for the problem at hand.”

      For a broader performance comparison of languages and frameworks check out “http://www.techempower.com/blog/2013/03/28/frameworks-round-1/” (thanks AOM for the link)

      • Noah White says:

        My comment about the sweeping conclusions regarding performance is simply based on the fact that you titled your piece, “…Performance Comparison Between Node.js and Java EE…”.

        What I was trying to get across is that Tomcat7 is not JavaEE. JavaEE is an umbrella spec for a group of specs. Tomcat7 is one implementation of one piece of the spec. To achieve your goal of evaluating performance against a ‘vanilla Java stack’ (which is also a pretty broad statement but I get what you really mean in this case) you would have benchmarked this not only against Tomcat7 but against the half dozen other servlet containers out there like for example Resin which is what was used in the link of framework comparisons you posted.

        I think a more accurate title for this piece would be, “…Performance Comparison Between Node.js and Tomcat7…”, otherwise yes this would be a sweeping conclusion given you only compared it with one impl. of one piece of the JavaEE spec.

  2. Have you warmed up the JVM? Let it process 10k requests and ONLY THEN measure your requests/second. I am sure you will get different results.

  3. It doesn’t matter if you use Tomcat, Jetty or a full Java EE stack like GlassFish or JBoss, because you are using just the servlet part of the stack.

  4. Dmitry says:

    REALLY awesome test. But let’s extend it. So, you’re querying CouchDB, and it responds fairly fast, below 20ms. That’s not too realistic. Could you please emulate the case where CouchDB has to do some more ‘back end’ work, spending, say, 150ms?
    Under these conditions we might see even better performance from Node.js.

  5. Tamas says:

    Sorry, but in my opinion you only measured the performance of the underlying CouchDB drivers. It would be good to have some profiling in the application to figure out why the Java one performs that badly.

  6. Nicolas says:

    In my mind a useless comparison:
    How is Tomcat configured: IO/NIO connector, pool, threads…?
    How is Node.js configured?
    The Node.js CouchDB driver uses callbacks. Does the Java driver have callbacks too? Servlet 3 has async features.
    What is the cost of the CouchDB driver?
    Why Java 6 and not the latest version, 1.7?

  7. Alex says:

    I think that here you are comparing the multi-threaded blocking IO model vs non-blocking IO model.
    Maybe it would be more interesting to also choose a non-blocking IO Web server on the JVM, such as Play Framework 2. Moreover, I agree that “raw performance” benchmarks are only one axis of a benchmark for a Web stack (robustness, code base scaling, etc. could be as important as raw performance).

  8. Waddle says:

    “The Node.js solution is 20% faster than the Java EE solution” => totally wrong sentence (as said by others before).
    You’re just benching [Catalina + Couch Java driver] vs [Node.js + Couch Node.js driver]. Besides, this driver doesn’t belong to the Java EE stack either, so it’s not *the* Java EE solution (nor *a* Java EE solution).
    Redo the bench with Resin or Jetty (which are faster than Tomcat) and you’ll probably conclude the opposite.
    Nice bench, wrong conclusion (as usual with benchmarks).

  9. Regarding the `too many open files` problem — Mac OS X has `ulimit -n` set to 256 by default. You can open up the process limit by `sudo ulimit -n 16384` to relax the per-process concurrent open file limit.

  10. btd says:

    Stupid test:
    1. You are not using Java EE, but servlets only.
    2. You did not post the JVM options – they are important.
    3. You posted nothing about the test methodology.
    4. You compare a threaded server with an evented one.

    Concluding: just a yellow header.

  11. Sorry to be so critical, but it’s benchmarks like this one that give benchmarking a really bad name. I don’t really know where to start, but maybe this is a good question: you double the number of concurrent requests, yet the tx rate remains about constant. It seems to me that something is throttling the bench, and it’s not the application. In fact it looks as if the increase in response time is related to additional dead time, due to maybe a thread/connection pool that is set at the default value? Sorry, this really isn’t intended to be a plug for my article on InfoQ but.. (http://www.infoq.com/articles/Java-Thread-Pool-Performance-Tuning)

    There might be an interesting and useful bench lurking about here, but I think to get there you’re going to have to publish all the sources so that we can work on validation.

  12. A less “Apples and Oranges” test would be to compare Node.js and Vert.x (http://vertx.io/); they are both event-based frameworks built upon the same ideas. I don’t know what this test proves? That multithreaded, synchronized frameworks with an old, unsupported version of Java are slower than the latest version of Node?

    • Kirk says:

      You’re missing the point. The point is all of the numbers are dominated by dead time, time waiting for a connection to couchdb which means even if this were apples to apples you still couldn’t make a comparison until the bottleneck was taken out of the test harness….

  13. farhdine says:

    Hi.
    Why did you use Couchbase version 1.1.3 when the latest version of Couchbase is 2.2.0?
    It is so old that it is not even possible to download it from the website…

  14. Pingback: OCTO talks ! » L’art du benchmark

  15. Pingback: OCTO talks ! » The art of benchmarking

  16. Anas says:

    The title of your benchmark is wrong; the right title is: “The non-blocking IO solution Node.js is 20% faster than the blocking IO solution implemented with a Java servlet running on Tomcat 7 and a non-optimized JVM”…. a very long title, isn’t it?

  17. Bignouf says:

    It is not a good idea to compare a drone with a B-52, because they are not comparable. What about comparing Node.js and the Grizzly NIO server?

  18. Pingback: Conheça o Tyrus – sua implementação para WebSocket com backend em Node e Java | blog.caelum.com.br
