It looks like you've added an AllowUsers directive to the /etc/ssh/sshd_config configuration file.
To resolve this issue, you'll need to attach the boot disk of your VM instance to a healthy instance as a second disk, mount it, and edit the configuration file to fix the problem.
Here are the steps you can take to resolve the issue:
First of all, take a snapshot of your instance's disk, so that if loss or corruption happens you can recover the disk.
In the Developers Console, click on your instance, uncheck "Delete boot disk when instance is deleted", and then delete the instance. The boot disk will remain under "Disks", and you can now attach it to another instance. You can also do this step using the gcloud command:
$ gcloud compute instances delete NAME --keep-disks all
Now attach the disk to a healthy instance as an additional disk. You can do this through the Developers Console or using the gcloud command:
$ gcloud compute instances attach-disk EXAMPLE-INSTANCE --disk DISK --zone ZONE
SSH into your healthy instance.
Determine where the secondary disk lives:
$ ls -l /dev/disk/by-id/google-*
Mount the disk:
$ sudo mkdir /mnt/tmp
$ sudo mount /dev/disk/by-id/google-persistent-disk-1-part1 /mnt/tmp
Here google-persistent-disk-1 is the device name of the attached disk; adjust it to match what the previous ls command showed.
Edit the sshd_config configuration file, remove the AllowUsers line, and save it:
$ sudo nano /mnt/tmp/etc/ssh/sshd_config
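If you prefer a non-interactive edit, the same change can be made with sed. In the real scenario the target would be /mnt/tmp/etc/ssh/sshd_config (run with sudo); the sketch below demonstrates the edit on a throwaway copy, and comments the directive out rather than deleting it so the original setting is easy to restore:

```shell
# Create a throwaway config to demonstrate the edit on.
cfg=$(mktemp)
printf 'Port 22\nAllowUsers alice bob\nPermitRootLogin no\n' > "$cfg"

# The actual edit: prefix any AllowUsers line with '#', keeping a .bak backup.
sed -i.bak 's/^AllowUsers/#AllowUsers/' "$cfg"

# The directive is now commented out and no longer restricts logins.
grep '^#AllowUsers' "$cfg"
```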
Now unmount the disk:
$ sudo umount /mnt/tmp
Then detach the disk from the VM instance. This can be done through the Developers Console or using the command below:
$ gcloud compute instances detach-disk EXAMPLE-INSTANCE --disk DISK
Now create a new instance using your fixed boot disk.
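This last step can also be done with gcloud. A sketch, with EXAMPLE-INSTANCE, DISK, and ZONE as placeholders (double-check the flags against your gcloud version):

```shell
$ gcloud compute instances create EXAMPLE-INSTANCE \
    --disk name=DISK,boot=yes \
    --zone ZONE
```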
Caveat: I'm also not a server / sysadmin person but had to dive in earlier this year.
I've encountered performance issues like this when running Node.js processes, so there may be parallels to what you're seeing. In my case, based on the changes I experimented with, it appears to have been related to hitting maximum open-file limits.
These are the configuration changes I made that helped resolve the issues:
In /etc/security/limits.d/custom.conf:
root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000
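One caveat from my own poking around: settings under limits.d only take effect for new login sessions. A quick way to confirm what limits the current shell actually got:

```shell
# Show the soft and hard open-file limits for the current shell.
# These reflect limits.d changes only after logging out and back in.
ulimit -Sn   # soft limit
ulimit -Hn   # hard limit
```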
In /etc/sysctl.d/99-sysctl.conf:
fs.file-max = 1000000
fs.nr_open = 1000000
net.nf_conntrack_max = 1048576
To apply the new kernel settings immediately, without waiting for a reboot:
sudo sysctl -w fs.file-max=1000000
sudo sysctl -w fs.nr_open=1000000
sudo sysctl -w net.nf_conntrack_max=1048576
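Alternatively, on systemd-based distributions `sudo sysctl --system` reloads every file under /etc/sysctl.d/ in one go. Either way, the running values can be read back from /proc without root to confirm they took effect (net.nf_conntrack_max only appears once the conntrack module is loaded, so it's omitted here):

```shell
# Read the live kernel values back (no root needed):
cat /proc/sys/fs/file-max
cat /proc/sys/fs/nr_open
```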
As root (this raises the limit for the current shell and its child processes):
ulimit -n 1000000
Your mileage may vary based on what's managing your processes.
Here's some documentation with further sysctl tweaks, some of which I plan to research and implement: https://easyengine.io/tutorials/linux/sysctl-conf/
Best Answer
In this case, I suggest you go to "Monitoring" -> "Metrics explorer" and create two metrics, so you can analyze CPU utilization from both the host and the container points of view.
From Google Cloud docs: