It looks like you've added an AllowUsers directive to the /etc/ssh/sshd_config configuration file.
To resolve this issue, you'll need to attach the boot disk of your VM instance to a healthy instance as a second disk, mount it, and edit the configuration file to fix the problem.
Here are the steps you can take to resolve the issue:
First of all, take a snapshot of your instance's disk, so that if loss or corruption happens you can recover the disk.
In the Developers Console, click on your instance, uncheck "Delete boot disk when instance is deleted", and then delete the instance. The boot disk will remain under "Disks", and you can now attach it to another instance. You can also do this step using the gcloud command:
$ gcloud compute instances delete NAME --keep-disks all
Now attach the disk to a healthy instance as an additional disk. You can do this through the Developers Console or using the gcloud command:
$ gcloud compute instances attach-disk EXAMPLE-INSTANCE --disk DISK --zone ZONE
SSH into your healthy instance.
Determine where the secondary disk lives:
$ ls -l /dev/disk/by-id/google-*
Mount the disk:
$ sudo mkdir /mnt/tmp
$ sudo mount /dev/disk/by-id/google-persistent-disk-1-part1 /mnt/tmp
Here google-persistent-disk-1 is the device name of the attached disk; adjust it to match what the previous ls command showed.
Edit the sshd_config configuration file, remove the AllowUsers line, and save it:
$ sudo nano /mnt/tmp/etc/ssh/sshd_config
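If you prefer a non-interactive edit, the same change can be made with sed. In the real scenario the target would be /mnt/tmp/etc/ssh/sshd_config (run with sudo); the sketch below demonstrates the edit on a throwaway copy, and comments the directive out rather than deleting it so the original setting is easy to restore:

```shell
# Create a throwaway config to demonstrate the edit on.
cfg=$(mktemp)
printf 'Port 22\nAllowUsers alice bob\nPermitRootLogin no\n' > "$cfg"

# The actual edit: prefix any AllowUsers line with '#', keeping a .bak backup.
sed -i.bak 's/^AllowUsers/#AllowUsers/' "$cfg"

# The directive is now commented out and no longer restricts logins.
grep '^#AllowUsers' "$cfg"
```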
Now unmount the disk:
$ sudo umount /mnt/tmp
Then detach the disk from the VM instance. This can be done through the Developers Console or using the command below:
$ gcloud compute instances detach-disk EXAMPLE-INSTANCE --disk DISK
Now create a new instance using your fixed boot disk.
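This last step can also be done with gcloud. A sketch, with EXAMPLE-INSTANCE, DISK, and ZONE as placeholders (double-check the flags against your gcloud version):

```shell
$ gcloud compute instances create EXAMPLE-INSTANCE \
    --disk name=DISK,boot=yes \
    --zone ZONE
```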
Caveat: I'm also not a server / sysadmin person but had to dive in earlier this year.
I've encountered performance issues like this when running Node.js processes, so there may be parallels to what you're seeing. In my case, based on the changes I experimented with, it appears to have been related to hitting maximum open-file limits.
These are the configuration changes I made that helped resolve the issues:
In /etc/security/limits.d/custom.conf:
root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000
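One caveat from my own poking around: settings under limits.d only take effect for new login sessions. A quick way to confirm what limits the current shell actually got:

```shell
# Show the soft and hard open-file limits for the current shell.
# These reflect limits.d changes only after logging out and back in.
ulimit -Sn   # soft limit
ulimit -Hn   # hard limit
```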
In /etc/sysctl.d/99-sysctl.conf:
fs.file-max = 1000000
fs.nr_open = 1000000
net.nf_conntrack_max = 1048576
To apply the new kernel settings immediately, without waiting for a reboot:
sudo sysctl -w fs.file-max=1000000
sudo sysctl -w fs.nr_open=1000000
sudo sysctl -w net.nf_conntrack_max=1048576
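Alternatively, on systemd-based distributions `sudo sysctl --system` reloads every file under /etc/sysctl.d/ in one go. Either way, the running values can be read back from /proc without root to confirm they took effect (net.nf_conntrack_max only appears once the conntrack module is loaded, so it's omitted here):

```shell
# Read the live kernel values back (no root needed):
cat /proc/sys/fs/file-max
cat /proc/sys/fs/nr_open
```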
As root (this raises the limit for the current shell and its child processes):
ulimit -n 1000000
Your mileage may vary based on what's managing your processes.
Here's some documentation with further sysctl tweaks, some of which I plan to research and implement: https://easyengine.io/tutorials/linux/sysctl-conf/
Best Answer
In this case, I suggest you go to "Monitoring" -> "Metrics explorer" and create two metrics, so you can analyze CPU utilization from both the host and the container points of view.
From Google Cloud docs: