Misconceptions
There are a few common misconceptions regarding WebSocket and Socket.IO:
The first misconception is that using Socket.IO is significantly easier than using WebSocket, which doesn't seem to be the case. See examples below.
The second misconception is that WebSocket is not widely supported in browsers. See below for more info.
The third misconception is that Socket.IO downgrades the connection as a fallback on older browsers. It actually assumes that the browser is old and starts with an AJAX connection to the server, which is later upgraded on browsers supporting WebSocket after some traffic is exchanged. See below for details.
My experiment
I wrote an npm module to demonstrate the difference between WebSocket and Socket.IO:
It is a simple example of server-side and client-side code - the client connects to the server using either WebSocket or Socket.IO and the server sends three messages in 1s intervals, which are added to the DOM by the client.
Server-side
Compare the server-side example of using WebSocket and Socket.IO to do the same in an Express.js app:
WebSocket Server
WebSocket server example using Express.js:
var path = require('path');
var app = require('express')();
var ws = require('express-ws')(app);
app.get('/', (req, res) => {
  console.error('express connection');
  res.sendFile(path.join(__dirname, 'ws.html'));
});
app.ws('/', (s, req) => {
  console.error('websocket connection');
  for (var t = 0; t < 3; t++)
    setTimeout(() => s.send('message from server', ()=>{}), 1000*t);
});
app.listen(3001, () => console.error('listening on http://localhost:3001/'));
console.error('websocket example');
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.js
Socket.IO Server
Socket.IO server example using Express.js:
var path = require('path');
var app = require('express')();
var http = require('http').Server(app);
var io = require('socket.io')(http);
app.get('/', (req, res) => {
  console.error('express connection');
  res.sendFile(path.join(__dirname, 'si.html'));
});
io.on('connection', s => {
  console.error('socket.io connection');
  for (var t = 0; t < 3; t++)
    setTimeout(() => s.emit('message', 'message from server'), 1000*t);
});
http.listen(3002, () => console.error('listening on http://localhost:3002/'));
console.error('socket.io example');
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.js
Client-side
Compare the client-side example of using WebSocket and Socket.IO to do the same in the browser:
WebSocket Client
WebSocket client example using vanilla JavaScript:
var l = document.getElementById('l');
var log = function (m) {
  var i = document.createElement('li');
  i.innerText = new Date().toISOString()+' '+m;
  l.appendChild(i);
}
log('opening websocket connection');
var s = new WebSocket('ws://'+window.location.host+'/');
s.addEventListener('error', function (m) { log("error"); });
s.addEventListener('open', function (m) { log("websocket connection open"); });
s.addEventListener('message', function (m) { log(m.data); });
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.html
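The client above hard-codes the ws:// scheme, which is fine for the localhost test. A page served over HTTPS would normally need wss:// instead, since browsers generally treat a plain ws:// connection from a secure page as mixed content. A minimal sketch of a helper that derives the scheme from the page URL; wsUrl is a hypothetical name and is not part of the module:

```javascript
// Hypothetical helper (not part of the original example): derive the
// WebSocket URL from a location-like object such as window.location.
function wsUrl(loc, path) {
  // Use the secure scheme when the page itself was loaded over HTTPS.
  var scheme = loc.protocol === 'https:' ? 'wss://' : 'ws://';
  return scheme + loc.host + (path || '/');
}

// In the browser this would replace the hard-coded string:
//   var s = new WebSocket(wsUrl(window.location));
```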
Socket.IO Client
Socket.IO client example using vanilla JavaScript:
var l = document.getElementById('l');
var log = function (m) {
  var i = document.createElement('li');
  i.innerText = new Date().toISOString()+' '+m;
  l.appendChild(i);
}
log('opening socket.io connection');
var s = io();
s.on('connect_error', function (m) { log("error"); });
s.on('connect', function (m) { log("socket.io connection open"); });
s.on('message', function (m) { log(m); });
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.html
Network traffic
To see the difference in network traffic you can run my test. Here are the results that I got:
WebSocket Results
2 requests, 1.50 KB, 0.05 s
From those 2 requests:
- HTML page itself
- connection upgrade to WebSocket
(The connection upgrade request is visible in the developer tools with a 101 Switching Protocols response.)
Socket.IO Results
6 requests, 181.56 KB, 0.25 s
From those 6 requests:
- the HTML page itself
- Socket.IO's JavaScript (180 kilobytes)
- first long polling AJAX request
- second long polling AJAX request
- third long polling AJAX request
- connection upgrade to WebSocket
Screenshots
WebSocket results that I got on localhost:
Socket.IO results that I got on localhost:
Test yourself
Quick start:
# Install:
npm i -g websocket-vs-socket.io
# Run the server:
websocket-vs-socket.io
Open http://localhost:3001/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the WebSocket version.
Open http://localhost:3002/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the Socket.IO version.
To uninstall:
# Uninstall:
npm rm -g websocket-vs-socket.io
Browser compatibility
As of June 2016 WebSocket works on everything except Opera Mini, including IE higher than 9.
This is the browser compatibility of WebSocket on Can I Use as of June 2016:
See http://caniuse.com/websockets for up-to-date info.
I think what we need to understand in order to answer this question is how exactly the underlying TCP connection evolves during the whole WebSocket creation process. You will realize that the sticky part of a WebSocket connection is the underlying TCP connection itself. I am not sure what you mean by "session" in the context of WebSockets.
At a high level, initiating a "WebSocket connection" requires the client to send an HTTP GET request to an HTTP server, where the request includes the Upgrade header field. Now, for this request to happen the client needs to have established a TCP connection to the HTTP server (that might be obvious, but I think it is important to point this out explicitly here). The subsequent HTTP server response is then sent through the same TCP connection.
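To make the request concrete, here is a rough sketch of the opening handshake a client sends, built by hand with Node's crypto module. The header names come from RFC 6455; the function name, host, and path are placeholders chosen for illustration, and a browser does all of this for you when you call new WebSocket(...).

```javascript
var crypto = require('crypto');

// Hypothetical helper: build the text of a WebSocket opening handshake.
function buildUpgradeRequest(host, path) {
  // RFC 6455 asks for a random 16-byte nonce, base64-encoded.
  var key = crypto.randomBytes(16).toString('base64');
  var lines = [
    'GET ' + path + ' HTTP/1.1',
    'Host: ' + host,
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Key: ' + key,
    'Sec-WebSocket-Version: 13',
    '', '' // a blank line terminates the header block
  ];
  return { key: key, text: lines.join('\r\n') };
}

// Placeholder host and path, matching the localhost test server.
var req = buildUpgradeRequest('localhost:3001', '/');
console.log(req.text);
```

The key is different on every call, so the printed request will vary from run to run.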
Note that now, after the server response has been sent, the TCP connection is still open/alive if not actively closed by either the client or the server.
Now, according to RFC 6455, the WebSocket standard, at the end of section 4.1:
If the server's response is validated as provided for above, it is
said that The WebSocket Connection is Established and that the
WebSocket Connection is in the OPEN state
I read this as saying that the same TCP connection that was initiated by the client before sending the initial HTTP GET (Upgrade) request will just be left open and will from now on serve as the transport layer for the full-duplex WebSocket connection. And this makes sense!
With respect to your question this means that a load balancer will only play a role before the initial HTTP GET (Upgrade) request is made, i.e. before the one and only TCP connection involved in said WebSocket connection creation is established between the two communication endpoints. Thereafter, the TCP connection stays established and cannot be "redirected" by a network device in between.
We can conclude that, in your session terminology, the TCP connection defines the session. As long as a WebSocket connection is alive (i.e. is not terminated), it by definition provides and lives in its own session. Nothing can change this session. In this picture, however, two independent WebSocket connections cannot share the same session.
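The server's side of that handshake can be sketched in a few lines. This is an illustration under assumptions, not how any particular library implements it: acceptKey and handshakeResponse are hypothetical names, and the commented-out wiring relies on the 'upgrade' event of Node's built-in http server. The point is that the 101 response is written to the very socket the GET arrived on, and that socket then stays open.

```javascript
var crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
var WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: derive the accept token from the client's key.
function acceptKey(key) {
  return crypto.createHash('sha1').update(key + WS_GUID).digest('base64');
}

// Hypothetical helper: the 101 response that completes the handshake.
function handshakeResponse(key) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Accept: ' + acceptKey(key),
    '', ''
  ].join('\r\n');
}

// Wiring sketch with Node's built-in http server (not started here):
//   server.on('upgrade', function (req, socket) {
//     socket.write(handshakeResponse(req.headers['sec-websocket-key']));
//     // `socket` is the same TCP connection the GET arrived on; from
//     // here on it carries WebSocket frames in both directions.
//   });

// Sample key from RFC 6455, section 1.3:
console.log(acceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```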
If you referred to something else with "session", then it probably is a session that is introduced by the application layer and we cannot comment on that one.
Edit with respect to your comments:
so you're saying that the load balancer is not involved in the TCP
connection
No, that is not true, at least in general. It can definitely influence TCP connection establishment, in the sense that it can decide what to do with the client's connection attempt. The specifics depend on the exact type of load balancer (*, see below). Important: after the connection is established between two endpoints (by which I mean the WebSocket client and the WebSocket server; I don't consider the load balancer to be an endpoint), the two endpoints will not change anymore for the lifetime of the WebSocket connection. The load balancer might (*) still be in the network path, but can be assumed to no longer influence the connection.
Therefore the full-duplex connection is between the client and the
end server?
Yes!
(*) There are different types of load balancing. Depending on the type, the role of the load balancer is different after connection establishment between the two endpoints. Examples:
- If the load balancing happens on a DNS basis, then the load balancer is not involved in the final TCP connection at all. It just tells the client which host it has to connect to directly.
- If the load balancer works like the Layer 4 ELB from AWS, then it effectively proxies the TCP connection. The client would actually see the ELB itself as the server. What happens, however, is that the ELB just forwards the packets in both directions, without change. Hence, it is still heavily involved in the TCP connection, just transparently. In this case there are actually two permanent TCP connections involved: one from you to the ELB, and one from the ELB to the server. These are again permanent for the lifetime of your WebSocket connection.
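The second case can be pictured with a small pass-through proxy built on Node's net module. This is only a sketch of the idea, not a description of how ELB works internally; the backend addresses and the function names are made up for illustration. Each client connection is paired with exactly one backend connection at the moment it is accepted, and bytes are piped both ways for as long as both sides stay open.

```javascript
var net = require('net');

// Hypothetical helper: hand out backends in round-robin order.
function roundRobin(backends) {
  var i = 0;
  return function next() {
    var b = backends[i];
    i = (i + 1) % backends.length;
    return b;
  };
}

// Hypothetical Layer 4 proxy: the backend is chosen once per TCP connection.
function createTcpProxy(backends) {
  var next = roundRobin(backends);
  return net.createServer(function (client) {
    var upstream = net.connect(next());
    // Forward bytes unchanged in both directions.
    client.pipe(upstream);
    upstream.pipe(client);
    // If one side fails, tear down the other as well.
    client.on('error', function () { upstream.destroy(); });
    upstream.on('error', function () { client.destroy(); });
  });
}

// Example wiring (placeholder addresses; not started here):
//   createTcpProxy([{ host: '10.0.0.1', port: 3001 },
//                   { host: '10.0.0.2', port: 3001 }]).listen(8080);
```

Because the backend is picked per connection rather than per message, an established WebSocket keeps talking to the same server until it closes.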
Best Answer
Socket.io does not work out of the box even with a TCP ELB because it makes two HTTP requests before upgrading the connection to websockets.
The first request is used to establish the protocol, since socket.io supports more than just websockets.
The second request is used to actually upgrade the connection:
You can see in the above example that
xX_HbcG1DN_nufWddblv
is a shared key between requests. This is the problem. ELBs do round-robin routing, meaning the upgrade request may hit a server that did not participate in the initial negotiation. As such, the server has no idea who the client is.
In-memory stateful data is the enemy of load balancing. Thankfully, socket.io supports using Redis to store the data instead. If you share your Redis connection with multiple servers, they essentially share the sessions of all clients.
See the socket.io wiki page for details on setting up Redis.
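The failure mode is easy to reproduce in a toy model. The sketch below is not Socket.IO or Redis code; makeServer, handshake, upgrade, and connect are hypothetical names, and a plain Map stands in for whatever shared store is used. It only shows why a session id issued by one server means nothing to another unless both look it up in the same place.

```javascript
// Hypothetical stand-in for a server process; `store` holds its sessions.
function makeServer(name, store) {
  return {
    // First request: negotiate and record the session id.
    handshake: function (sid) { store.set(sid, name); return sid; },
    // Later request: only succeeds if this server can find the sid.
    upgrade: function (sid) { return store.has(sid); }
  };
}

// Send the handshake to servers[0] and the upgrade to servers[1],
// which is what strict round-robin routing does with two backends.
function connect(servers, sid) {
  servers[0].handshake(sid);
  return servers[1].upgrade(sid);
}

// Each server keeps its own in-memory sessions: the upgrade fails.
var isolated = [makeServer('a', new Map()), makeServer('b', new Map())];
console.log(connect(isolated, 'sid-1')); // false

// Both servers use one shared store: the upgrade succeeds.
var shared = new Map();
var pooled = [makeServer('a', shared), makeServer('b', shared)];
console.log(connect(pooled, 'sid-1')); // true
```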