Node.js – Socket.io Websockets on a TCP configured Amazon Elastic Load Balancer

Tags: amazon-web-services, node.js, socket.io, websocket

I'm planning to set up a group of NodeJS application servers running Socket.io on EC2, and I'd like to use the Elastic Load Balancer to spread load between them. I know ELB doesn't support Websockets out of the box, but I can use the setup described here in Scenario 2.

As described in the blog post, though, I notice that this setup offers no session affinity or source IP info:

We can not have Session Affinity nor X-Forward headers with this setup
because ELB is not parsing the HTTP messages, so its impossible to
match the cookies to ensure Session Affinity nor Inject special
X-Forward headers.

Will Socket.io still work under these circumstances? Or is there another way to have a set of Socket.io app servers behind a load balancer with SSL?

EDIT: Tim Caswell talks about doing this already here. Are there any posts explaining how to set this up? Again there's no session stickiness here, but things seem to be working fine.

As an aside, are sticky sessions actually necessary with websockets? Does information travel as new and separate requests or is there only one request + connection that all the information moves along?

Best Answer

Socket.io does not work out of the box even with a TCP ELB because it makes two HTTP requests before upgrading the connection to websockets.

The first connection is used to establish protocol, since socket.io supports more than just websockets.

GET /socket.io/1/?t=1360136617252 HTTP/1.1
User-Agent: node-XMLHttpRequest
Accept: */*
Host: localhost:9999
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: text/plain
Date: Wed, 06 Feb 2013 07:43:37 GMT
Connection: keep-alive
Transfer-Encoding: chunked

47
xX_HbcG1DN_nufWddblv:60:60:websocket,htmlfile,xhr-polling,jsonp-polling
0
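The handshake body above is a colon-separated record. As a minimal sketch, assuming the Socket.IO 0.9 field order (session id, heartbeat timeout, close timeout, supported transports), it could be parsed like this. The function name `parseHandshake` is hypothetical and not part of Socket.IO.

```javascript
// Parse the body of a Socket.IO 0.9-style handshake response.
// Assumed field order: sid : heartbeat timeout : close timeout : transports.
// parseHandshake is an illustrative helper, not a Socket.IO API.
function parseHandshake(body) {
  const parts = body.trim().split(':');
  if (parts.length !== 4) {
    throw new Error('unexpected handshake format: ' + body);
  }
  const [sid, heartbeat, close, transports] = parts;
  return {
    sid,
    heartbeatTimeoutSec: Number(heartbeat),
    closeTimeoutSec: Number(close),
    transports: transports.split(','),
  };
}

const info = parseHandshake(
  'xX_HbcG1DN_nufWddblv:60:60:websocket,htmlfile,xhr-polling,jsonp-polling'
);

// The client reuses the session id in the path of its next request.
const upgradePath = '/socket.io/1/websocket/' + info.sid;
console.log(info.transports, upgradePath);
```

The session id returned here is the value the client places in the URL of its follow-up request.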

The second request is used to actually upgrade the connection:

GET /socket.io/1/websocket/xX_HbcG1DN_nufWddblv HTTP/1.1
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: MTMtMTM2MDEzNjYxNzMxOA==
Host: localhost:9999

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: 249I3zzVp0SzEn0Te2RLp0iS/z0=
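For reference, the Sec-WebSocket-Accept value in a 101 response is derived from the client's Sec-WebSocket-Key as described in RFC 6455: append a fixed GUID, hash with SHA-1, and base64-encode the digest. The sketch below uses Node's built-in crypto module and the sample key from the RFC rather than the key in the capture above; the helper name `computeAccept` is my own.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept header value from a Sec-WebSocket-Key.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Sample key from RFC 6455; prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
```

Note that this computation needs no server-side state, which is why a plain WebSocket handshake can be answered by any backend, unlike the Socket.IO session id.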

You can see in the above example that xX_HbcG1DN_nufWddblv is a key shared between the two requests. This is the problem. A TCP-mode ELB distributes connections round-robin, so the upgrade request can land on a server that did not participate in the initial negotiation. That server then has no idea who the client is.

In-memory stateful data is the enemy of load balancing. Thankfully, socket.io supports using Redis to store the session data instead. If all of your servers use the same Redis instance, they essentially share the sessions of all clients.
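To illustrate why per-server memory breaks the two-step handshake and a shared store fixes it, here is a toy simulation. It does not use Socket.IO or Redis: a plain Map stands in for the store, and `AppServer`, `roundRobin` and `connectThrough` are hypothetical names made up for this sketch.

```javascript
// Toy model of the two-request Socket.IO handshake behind a round-robin balancer.
// A Map stands in for the session store (per-server memory or shared Redis).
class AppServer {
  constructor(name, store) {
    this.name = name;
    this.store = store;
    this.counter = 0;
  }
  // First request: create a session id and remember it.
  handshake() {
    const sid = this.name + '-' + (this.counter++);
    this.store.set(sid, { createdBy: this.name });
    return sid;
  }
  // Second request: only succeeds if this server can find the session.
  upgrade(sid) {
    return this.store.has(sid);
  }
}

// Returns a picker that cycles through the servers in order.
function roundRobin(servers) {
  let i = 0;
  return () => servers[i++ % servers.length];
}

// Send the handshake to one server and the upgrade to the next one.
function connectThrough(servers) {
  const next = roundRobin(servers);
  const sid = next().handshake();
  return next().upgrade(sid);
}

// Each server keeps sessions in its own memory.
const isolated = [new AppServer('a', new Map()), new AppServer('b', new Map())];

// All servers share one store (the role Redis plays in the real setup).
const sharedStore = new Map();
const clustered = [new AppServer('a', sharedStore), new AppServer('b', sharedStore)];

console.log('isolated stores:', connectThrough(isolated)); // false
console.log('shared store:', connectThrough(clustered));   // true
```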

See the socket.io wiki page for details on setting up Redis.

Misconceptions

There are a few common misconceptions regarding WebSocket and Socket.IO:

The first misconception is that using Socket.IO is significantly easier than using WebSocket which doesn't seem to be the case. See examples below.
The second misconception is that WebSocket is not widely supported in the browsers. See below for more info.
The third misconception is that Socket.IO downgrades the connection as a fallback on older browsers. It actually assumes that the browser is old and starts with an AJAX connection to the server, which is later upgraded on browsers that support WebSocket, after some traffic has been exchanged. See below for details.

My experiment

I wrote an npm module, websocket-vs-socket.io, to demonstrate the difference between WebSocket and Socket.IO.

It is a simple example of server-side and client-side code - the client connects to the server using either WebSocket or Socket.IO and the server sends three messages in 1s intervals, which are added to the DOM by the client.

Server-side

Compare the server-side example of using WebSocket and Socket.IO to do the same in an Express.js app:

WebSocket Server

WebSocket server example using Express.js:

var path = require('path');
var app = require('express')();
var ws = require('express-ws')(app);
app.get('/', (req, res) => {
  console.error('express connection');
  res.sendFile(path.join(__dirname, 'ws.html'));
});
app.ws('/', (s, req) => {
  console.error('websocket connection');
  for (var t = 0; t < 3; t++)
    setTimeout(() => s.send('message from server', ()=>{}), 1000*t);
});
app.listen(3001, () => console.error('listening on http://localhost:3001/'));
console.error('websocket example');

Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.js

Socket.IO Server

Socket.IO server example using Express.js:

var path = require('path');
var app = require('express')();
var http = require('http').Server(app);
var io = require('socket.io')(http);
app.get('/', (req, res) => {
  console.error('express connection');
  res.sendFile(path.join(__dirname, 'si.html'));
});
io.on('connection', s => {
  console.error('socket.io connection');
  for (var t = 0; t < 3; t++)
    setTimeout(() => s.emit('message', 'message from server'), 1000*t);
});
http.listen(3002, () => console.error('listening on http://localhost:3002/'));
console.error('socket.io example');

Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.js

Client-side

Compare the client-side example of using WebSocket and Socket.IO to do the same in the browser:

WebSocket Client

WebSocket client example using vanilla JavaScript:

var l = document.getElementById('l');
var log = function (m) {
    var i = document.createElement('li');
    i.innerText = new Date().toISOString()+' '+m;
    l.appendChild(i);
}
log('opening websocket connection');
var s = new WebSocket('ws://'+window.location.host+'/');
s.addEventListener('error', function (m) { log("error"); });
s.addEventListener('open', function (m) { log("websocket connection open"); });
s.addEventListener('message', function (m) { log(m.data); });

Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.html
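One caveat if the page is served over HTTPS, as it would be behind an SSL-terminating load balancer: the client above hard-codes ws://, and browsers generally refuse insecure WebSocket connections from secure pages. A small sketch of choosing the scheme from the page's own protocol follows; `webSocketUrl` is a hypothetical helper, not part of the original example.

```javascript
// Build a WebSocket URL that matches the page's scheme:
// https pages get wss://, everything else gets ws://.
// `loc` is expected to look like window.location (protocol and host fields).
function webSocketUrl(loc, path) {
  const scheme = loc.protocol === 'https:' ? 'wss://' : 'ws://';
  return scheme + loc.host + (path || '/');
}

// In the browser: var s = new WebSocket(webSocketUrl(window.location, '/'));
console.log(webSocketUrl({ protocol: 'https:', host: 'example.com' }, '/'));
console.log(webSocketUrl({ protocol: 'http:', host: 'localhost:3001' }, '/'));
```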

Socket.IO Client

Socket.IO client example using vanilla JavaScript:

var l = document.getElementById('l');
var log = function (m) {
    var i = document.createElement('li');
    i.innerText = new Date().toISOString()+' '+m;
    l.appendChild(i);
}
log('opening socket.io connection');
var s = io();
s.on('connect_error', function (m) { log("error"); });
s.on('connect', function (m) { log("socket.io connection open"); });
s.on('message', function (m) { log(m); });

Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.html

Network traffic

To see the difference in network traffic you can run my test. Here are the results that I got:

WebSocket Results

2 requests, 1.50 KB, 0.05 s

From those 2 requests:

HTML page itself
connection upgrade to WebSocket

(The connection upgrade request is visible on the developer tools with a 101 Switching Protocols response.)

Socket.IO Results

6 requests, 181.56 KB, 0.25 s

From those 6 requests:

the HTML page itself
Socket.IO's JavaScript (180 kilobytes)
first long polling AJAX request
second long polling AJAX request
third long polling AJAX request
connection upgrade to WebSocket

Screenshots

[Screenshot: WebSocket results on localhost]

[Screenshot: Socket.IO results on localhost]

Test yourself

Quick start:

# Install:
npm i -g websocket-vs-socket.io
# Run the server:
websocket-vs-socket.io

Open http://localhost:3001/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the WebSocket version.

Open http://localhost:3002/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the Socket.IO version.

To uninstall:

# Uninstall:
npm rm -g websocket-vs-socket.io

Browser compatibility

As of June 2016, WebSocket works in every browser except Opera Mini, including IE 10 and later.

See http://caniuse.com/websockets for up-to-date info.

Node.js – Proxying WebSockets with TCP load balancer without sticky sessions

I think what we need to understand in order to answer this question is how exactly the underlying TCP connection evolves during the whole WebSocket creation process. You will realize that the sticky part of a WebSocket connection is the underlying TCP connection itself. I am not sure what you mean by "session" in the context of WebSockets.

At a high level, initiating a "WebSocket connection" requires the client to send an HTTP GET request that includes the Upgrade header field to an HTTP server. For this request to happen, the client must already have established a TCP connection to the HTTP server (that might be obvious, but I think it is important to point this out explicitly here). The subsequent HTTP server response is then sent over the same TCP connection.
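As a rough illustration of what distinguishes that initial request from an ordinary GET, the check below looks at the method plus the Upgrade and Connection headers. It is a simplified sketch (a real server also validates Sec-WebSocket-Key and Sec-WebSocket-Version), and `isWebSocketUpgrade` is a name I made up.

```javascript
// Decide whether a request is a WebSocket upgrade attempt.
// `headers` is a plain object; header names are matched case-insensitively.
function isWebSocketUpgrade(method, headers) {
  const get = (name) => {
    const key = Object.keys(headers).find((k) => k.toLowerCase() === name);
    return key ? String(headers[key]) : '';
  };
  // Connection may carry several comma-separated tokens, e.g. "keep-alive, Upgrade".
  const connectionTokens = get('connection')
    .toLowerCase()
    .split(',')
    .map((t) => t.trim());
  return (
    method === 'GET' &&
    get('upgrade').toLowerCase() === 'websocket' &&
    connectionTokens.includes('upgrade')
  );
}

console.log(isWebSocketUpgrade('GET', { Connection: 'Upgrade', Upgrade: 'websocket' })); // true
console.log(isWebSocketUpgrade('GET', { Connection: 'keep-alive' }));                    // false
```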

Note that now, after the server response has been sent, the TCP connection is still open/alive if not actively closed by either the client or the server.

Now, according to RFC 6455, the WebSocket standard, at the end of section 4.1:

If the server's response is validated as provided for above, it is
said that The WebSocket Connection is Established and that the
WebSocket Connection is in the OPEN state

I take this to mean that the same TCP connection that the client opened before sending the initial HTTP GET (Upgrade) request is simply left open and from then on serves as the transport layer for the full-duplex WebSocket connection. And this makes sense!

With respect to your question, this means that a load balancer only plays a role before the initial HTTP GET (Upgrade) request is made, i.e. before the one and only TCP connection involved in creating said WebSocket connection is established between the two communication endpoints. Thereafter, the TCP connection stays established and cannot be "redirected" by a network device in between.

We can conclude that, in your session terminology, the TCP connection defines the session. As long as a WebSocket connection is alive (i.e. has not been terminated), it by definition provides and lives in its own session. Nothing can change this session. In this picture, however, two independent WebSocket connections cannot share the same session.

If you referred to something else with "session", then it probably is a session that is introduced by the application layer and we cannot comment on that one.

Edit with respect to your comments:

so you're saying that the load balancer is not involved in the TCP connection

No, that is not true, at least in general. It can definitely influence TCP connection establishment, in the sense that it can decide what to do with the client's connection attempt. The specifics depend on the exact type of load balancer (*, see below). Important: after the connection is established between the two endpoints (by which I mean the WebSocket client and the WebSocket server, not the load balancer), the two endpoints will not change for the lifetime of the WebSocket connection. The load balancer might (*) still be in the network path, but it can be assumed not to interfere anymore.

Therefore the full-duplex connection is between the client and the end server?

Yes!

* There are different types of load balancing. Depending on the type, the role of the load balancer is different after connection establishment between the two endpoints. Examples:

If the load balancing happens on a DNS basis, then the load balancer is not involved in the final TCP connection at all. It just tells the client which host it has to connect to directly.
If the load balancer works like the Layer 4 ELB from AWS (docs here), then it effectively proxies the TCP connection. The client actually sees the ELB itself as the server. What happens, however, is that the ELB just forwards the packets in both directions, unchanged. Hence, it is still heavily involved in the TCP connection, just transparently. In this case there are actually two long-lived TCP connections: one from you to the ELB, and one from the ELB to the server. Both persist for the lifetime of your WebSocket connection.
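The Layer 4 case can be sketched with Node's built-in net module. This is not how ELB is implemented, only a minimal illustration of the idea: a backend is picked once per incoming TCP connection, a second TCP connection is opened to it, and bytes are piped in both directions for as long as both stay open. The names `makeRoundRobin` and `createTcpProxy`, and the backend addresses, are assumptions for this example.

```javascript
const net = require('net');

// Returns a function that cycles through the backends in order.
function makeRoundRobin(backends) {
  let i = 0;
  return () => backends[i++ % backends.length];
}

// Create (but do not start) a TCP proxy that forwards each client
// connection to one backend chosen at connection time.
function createTcpProxy(backends) {
  const next = makeRoundRobin(backends);
  return net.createServer((clientSocket) => {
    const target = next(); // chosen once; never changes for this connection
    const upstream = net.connect(target.port, target.host);

    // Forward bytes unchanged in both directions.
    clientSocket.pipe(upstream);
    upstream.pipe(clientSocket);

    // If either leg fails, tear down both.
    const closeBoth = () => {
      clientSocket.destroy();
      upstream.destroy();
    };
    clientSocket.on('error', closeBoth);
    upstream.on('error', closeBoth);
  });
}

// Hypothetical backends: the two example servers from earlier in this page.
const proxy = createTcpProxy([
  { host: '127.0.0.1', port: 3001 },
  { host: '127.0.0.1', port: 3002 },
]);
// To actually run it: proxy.listen(8080);
```

Because the backend is selected in the connection callback rather than per message, everything sent over an established WebSocket keeps reaching the same server, which is the "stickiness" described above.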