JavaScript – Is Node.js the correct choice for a JSON-parsing, IO-driven program?

javascript, node.js, profiling, scalability

My program should perform the following task: it listens on an HTTP port and, after receiving a request, does the following:

  1. Connect to gearman
  2. Parse the gearman payload as JSON (up to 100 bytes)
  3. Connect to Redis
  4. Parse the Redis payload as JSON (256 bytes to 10 KB; in 80% of cases it will be ~256 bytes)
  5. Put some data in MySQL
  6. Put data in Redis server

As my program seems to be IO driven, I chose Node.js for development. But after developing it, I am facing a CPU spike issue with Node.js.

My program takes 70%-100% CPU with 20 parallel clients. At first I thought JSON parsing could be the issue. I was targeting about 1K-3K requests per second, as my Redis server is able to process that many requests in one second.

But for profiling, I started with one sample HTTP server in node.

Example code:

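var http = require('http');
var url = require('url');

http.createServer(function (req, res) {
    // Path of the requested URL (parsed but not used further in this sample)
    var uri = url.parse(req.url).pathname;

    // Accumulate the request body as chunks arrive
    var body = "";
    req.on('data', function (chunk) {
        body += chunk;
    });

    // Reply once the whole request has been read
    req.on('end', function () {
        res.writeHead(200, {'Content-Type': 'text/plain'});
        res.end('hi vivek');
    });
}).listen(9097, "127.0.0.1");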

Now my concern is with this hello-world HTTP server: node's CPU usage spikes between 17% and 20%.

My node version is v0.10.0
My OS is Ubuntu 12.04

My CPU information is:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 2992.491
cache size  : 6144 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bogomips    : 5984.98
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 2992.491
cache size  : 6144 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bogomips    : 5984.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

My questions are:

  1. Is node the correct choice for my problem description?
  2. If nodejs is not the correct choice for my problem description, what would be a better alternative? A thread-based approach will not scale for an IO-driven application.

  3. How can I find out what is causing such high CPU usage in the complete integrated application and in the simple HTTP program?

  4. According to some node blog, I can support up to 10K parallel requests with nodejs. But if node is spiking to around 20% CPU with only a simple HTTP server, how will I be able to support 10K users?

Best Answer

Is node the correct choice for my problem description?

Nodejs seems like a good fit for what you're doing; Node was built for exactly this sort of scenario.

Node is a young technology, and you often find yourself sacrificing comfort for performance. It's often a lot more work, but once you learn how to work with it, it starts to be rewarding.

That said, other technologies might be able to accommodate your needs as well.

Pros for node

  • Fast
  • Optimized for this sort of IO driven task
  • Has an enthusiastic community which is very interested in helping beginners.
  • Fun to work with (That's totally subjective and is my personal opinion).

Cons for node

  • Often has less stable and mature drivers. If you're writing a production project, this is a biggie in my opinion.
  • New, sometimes has rough edges, sometimes APIs change.
  • Often requires tweaking and reading source code to get working in a satisfactory way.

Your task:

Connect to gearman

Node does this nicely. node-gearman works well and is pretty stable.

Parse the gearman payload as JSON (up to 100 bytes)

JS engines are extremely fast at parsing JSON. JSON's syntax is derived from JavaScript object literal notation (hence the name!), and V8, the engine node runs on, processes JSON reliably fast.
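If you want to check this on your own machine, a rough micro-benchmark along these lines can help. It is only a sketch: `makePayload` and `timeParse` are names I made up, the payload shape is invented, and the numbers will vary with hardware and node version.

```javascript
// Build a JSON string of at least `targetBytes` characters (ASCII only, so
// characters == bytes). The payload shape here is invented for illustration.
function makePayload(targetBytes) {
    var items = [];
    var json = JSON.stringify({ items: items });
    while (json.length < targetBytes) {
        items.push({ id: items.length, name: 'user' + items.length });
        json = JSON.stringify({ items: items });
    }
    return json;
}

// Parse the same payload `iterations` times and return the mean time
// per parse in microseconds.
function timeParse(json, iterations) {
    var start = process.hrtime();
    for (var i = 0; i < iterations; i++) {
        JSON.parse(json);
    }
    var diff = process.hrtime(start); // [seconds, nanoseconds]
    var totalMicros = diff[0] * 1e6 + diff[1] / 1e3;
    return totalMicros / iterations;
}

var small = makePayload(256);
var large = makePayload(10 * 1024);

console.log('~256 B payload: ' + timeParse(small, 10000).toFixed(2) + ' us per parse');
console.log('~10 KB payload: ' + timeParse(large, 10000).toFixed(2) + ' us per parse');
```

Multiplying the per-parse time by your target request rate gives a rough idea of how much of one CPU core JSON parsing alone would consume.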

Connect to Redis

node-redis lets you do that, and it also works nicely.

Parse the Redis payload as JSON (256 bytes to 10 KB; in 80% of cases it will be ~256 bytes)

Again, JSON is not an issue for V8.
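One thing worth guarding against regardless of speed: `JSON.parse` throws a `SyntaxError` on malformed input, and an exception thrown inside a callback can take the whole server process down. A small wrapper like the one below is one way to handle that; `safeParse` is a hypothetical helper name, not part of any library.

```javascript
// Parse JSON text without throwing.
// Returns { ok: true, value: ... } on success,
// or { ok: false, error: ... } when the text is not valid JSON.
function safeParse(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (err) {
        return { ok: false, error: err };
    }
}

var good = safeParse('{"name":"vivek","visits":3}');
var bad = safeParse('{"name":"vivek",');

console.log(good.ok, good.value.name);
console.log(bad.ok, bad.error.name);
```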

Put some data in MySQL

node-mysql is getting better. It still lacks support for prepared statements, but it does transactions and emulates prepared statements with internal escaping.
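To give an idea of what "emulating prepared statements with internal escaping" means: the values are escaped and substituted into the SQL text on the client before the query is sent. The sketch below is a deliberately simplified illustration of that idea, not node-mysql's actual code; `escapeValue` and `formatQuery` are hypothetical names, and in a real application you should rely on the driver's own placeholder support rather than something like this.

```javascript
// Escape one value for inclusion in SQL text (simplified, illustration only).
function escapeValue(value) {
    if (value === null || value === undefined) {
        return 'NULL';
    }
    if (typeof value === 'number') {
        if (!isFinite(value)) {
            throw new Error('Cannot escape a non-finite number');
        }
        return String(value);
    }
    if (typeof value === 'boolean') {
        return value ? 'true' : 'false';
    }
    // Everything else is treated as a string: backslash-escape the
    // characters that could terminate or alter the quoted literal.
    var escaped = String(value).replace(/[\0\n\r\b\t\\'"\x1a]/g, function (ch) {
        switch (ch) {
            case '\0': return '\\0';
            case '\n': return '\\n';
            case '\r': return '\\r';
            case '\b': return '\\b';
            case '\t': return '\\t';
            case '\x1a': return '\\Z';
            default: return '\\' + ch; // backslash, single quote, double quote
        }
    });
    return "'" + escaped + "'";
}

// Replace each `?` in the SQL text, in order, with the escaped value.
// Note: this naive version also replaces a `?` that appears inside a
// string literal in the SQL itself.
function formatQuery(sql, values) {
    var index = 0;
    return sql.replace(/\?/g, function (match) {
        if (index >= values.length) {
            return match; // no value left: leave the placeholder untouched
        }
        return escapeValue(values[index++]);
    });
}

console.log(formatQuery('INSERT INTO events (user, note) VALUES (?, ?)', ['vivek', "it's done"]));
```

The server still receives a plain SQL string, so this gives you the safety of escaping but not the parse-once, execute-many benefit of real prepared statements.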

Put data in Redis server

Again, node-redis.
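Put together, the six steps above can be sketched as a chain of non-blocking calls. Everything below is a stand-in: `fetchGearmanPayload`, `fetchRedisPayload`, `insertIntoMysql` and `writeToRedis` are hypothetical functions that only simulate IO with `setImmediate`, and the payloads are invented. The real drivers have their own APIs; the point is just the shape of the flow.

```javascript
// Hypothetical stand-ins for the real gearman, Redis and MySQL client calls.
// Each simulates non-blocking IO by invoking its callback on a later tick.
function fakeIo(result, callback) {
    setImmediate(function () {
        callback(null, result);
    });
}

function fetchGearmanPayload(callback) {
    fakeIo('{"userId":42}', callback);
}

function fetchRedisPayload(userId, callback) {
    fakeIo('{"name":"vivek","visits":3}', callback);
}

function insertIntoMysql(row, callback) {
    fakeIo({ affectedRows: 1 }, callback);
}

function writeToRedis(key, value, callback) {
    fakeIo('OK', callback);
}

// Parse JSON and report a malformed payload through the callback
// instead of throwing.
function parseJson(text, callback) {
    var value;
    try {
        value = JSON.parse(text);
    } catch (err) {
        return callback(err);
    }
    callback(null, value);
}

// One request: gearman -> parse -> Redis -> parse -> MySQL -> Redis.
function handleRequest(done) {
    fetchGearmanPayload(function (err, jobText) {
        if (err) { return done(err); }
        parseJson(jobText, function (err, job) {
            if (err) { return done(err); }
            fetchRedisPayload(job.userId, function (err, profileText) {
                if (err) { return done(err); }
                parseJson(profileText, function (err, profile) {
                    if (err) { return done(err); }
                    insertIntoMysql({ userId: job.userId, visits: profile.visits }, function (err) {
                        if (err) { return done(err); }
                        writeToRedis('user:' + job.userId, JSON.stringify(profile), function (err, reply) {
                            if (err) { return done(err); }
                            done(null, { userId: job.userId, reply: reply });
                        });
                    });
                });
            });
        });
    });
}

handleRequest(function (err, result) {
    if (err) {
        return console.error('request failed: ' + err.message);
    }
    console.log('request handled: ' + JSON.stringify(result));
});
```

While one request is waiting on any of these calls, the event loop is free to serve other requests, which is why this kind of workload suits node. The only CPU-bound parts are the two `JSON.parse` calls and the glue code.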
