Proxmox – Monitoring Softirqs RCU Spikes with NetData

Tags: cpu-usage, irq, netdata, proxmox, tcp

I have a server with the following characteristics (https://www.soyoustart.com/it/offerte/1801sysgame05.xml):

Processor: Intel i7-4790K
RAM: 32 GB DDR3 1333 MHz
Traffic: Unlimited
Anti-DDoS: Included
Disks: 1x 240 GB SSD
Bandwidth: 250 Mbps

I've installed Proxmox, which runs an Ubuntu Server container hosting a real-time TCP game server written in C++. It currently has around 1000 online users, and we expect to double that number soon.

The problem is that we hit a strange performance bottleneck as soon as the number of online users reaches roughly 850; once it drops back to about 800 or fewer, the bottleneck disappears. In practice, new players have to wait about 30 seconds to connect to the server, while players who are already connected experience no issues at all (no latency, no freezes, etc.). It looks like network congestion, a connection cap, or something similar that prevents further connections to the same process and puts pressure on our CPU (as you can see in the screenshots below).

Below are some graphs from our Netdata instance in which I noticed the same pattern. The softirqs RCU chart seems particularly meaningful, but I do not know exactly what it indicates.

softirqs RCU:
[screenshot]

CPU usage/pressure:
[screenshot]

CPU frequency:
[screenshot]

CPU temperature:
[screenshot]

I do not believe the CPU itself is at fault; as mentioned above, it looks like something related to a per-process limitation or something similar.

Do you have any idea of what's going on?

UPDATE:

Another related graph:

[screenshot]

Best Answer

I solved this problem by increasing the ulimit.

In my specific case, the ulimit had to be increased (both the hard and the soft limit) and configured permanently under /etc/ on both the host and the LXC container.
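
As an illustration only, and assuming the limit being raised is the open-file-descriptor limit (nofile) with a placeholder value, the persistent configuration can look roughly like this:

```
# /etc/security/limits.conf (on the host and inside the container)
# Raise both the soft and the hard open-file limit for all users.
# 65535 is an illustrative value, not a recommendation.
*    soft    nofile    65535
*    hard    nofile    65535

# Note: these PAM limits apply to login sessions. A server started as a
# systemd service would instead need LimitNOFILE=65535 in its unit file.
```

The new limits only take effect for sessions started after the change (or after a container restart); running `ulimit -n` as the user that owns the game server process shows the resulting soft limit.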

I also changed my container to a privileged one, but I'm not sure that is really needed; there may be a way to fix this for an unprivileged container as well, but I could not achieve that.
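
For what it's worth, one approach sometimes suggested for keeping the container unprivileged is to raise the limit from the host side via a raw LXC option in the container's configuration. I have not verified this myself, and the container ID placeholder and value below are assumptions:

```
# /etc/pve/lxc/<CTID>.conf on the Proxmox host (raw LXC option, untested here)
# lxc.prlimit.nofile sets RLIMIT_NOFILE for the container's processes.
lxc.prlimit.nofile: 65535
```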
