Is Ceph too slow and how to optimize it

Tags: ceph, performance, proxmox

The setup is 3 clustered Proxmox nodes for compute and 3 clustered Ceph storage nodes:

ceph01: 8 × 150 GB SSDs (1 used for OS, 7 for storage)
ceph02: 8 × 150 GB SSDs (1 used for OS, 7 for storage)
ceph03: 8 × 250 GB SSDs (1 used for OS, 7 for storage)

When I create a VM on a Proxmox node using Ceph storage, I get the speeds below (network bandwidth is NOT the bottleneck).

Writing to a VM whose disk is on Ceph

[root@localhost ~]# dd if=/dev/zero of=./here bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 46.7814 s, 23.0 MB/s

[root@localhost ~]# dd if=/dev/zero of=./here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 15.5484 s, 69.1 MB/s

Writing to a VM whose disk is on local Proxmox storage
For comparison, below is a VM on local Proxmox storage, using the same SSD model:

[root@localhost ~]# dd if=/dev/zero of=./here bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 10.301 s, 104 MB/s

[root@localhost ~]# dd if=/dev/zero of=./here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 7.22211 s, 149 MB/s

The Ceph pool is configured as follows:

size/min = 3/2
pg_num = 2048
ruleset = 0
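
For reference, these settings can be inspected and adjusted with the ceph CLI. A minimal sketch, assuming the pool is named rbd (substitute the actual pool name):

# Read back the current replication and placement-group settings
ceph osd pool get rbd size
ceph osd pool get rbd min_size
ceph osd pool get rbd pg_num

# Adjust them if needed; note that pg_num can only be increased, never decreased,
# and pgp_num should be raised to match
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2
ceph osd pool set rbd pg_num 2048
ceph osd pool set rbd pgp_num 2048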

The 3 monitors run on the same hosts, and each journal is stored on its own OSD.
Running the latest Proxmox with Ceph Hammer.
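
To see where the journals actually live, the OSD data directory and the running daemon can be queried; a minimal sketch (osd.0 / ceph-0 is an example ID, and the path is the FileStore default):

# If "journal" is a plain file, it shares the OSD's data disk; if it is a symlink
# to a separate partition, the journal has been moved off that disk
ls -l /var/lib/ceph/osd/ceph-0/journal

# Ask the running OSD for its journal-related settings via the admin socket
ceph daemon osd.0 config show | grep journal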

Any suggestions on where we should look for improvements? Is it the Ceph pool? Is it the journals? Does it matter whether the journal is on the same drive as the OS (/dev/sda) or on the OSD drive (/dev/sdX)?

Best Answer

You can increase disk throughput (MB/s) by setting the MTU to 9000 and changing the I/O scheduler to noop.
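
A minimal sketch of both changes, assuming eth0 is the NIC carrying Ceph traffic and sdb is an OSD data disk (substitute your actual interface and device names, and make sure every host and switch port on the storage network uses the same MTU):

# Raise the MTU on the storage interface (run on every node)
ip link set dev eth0 mtu 9000

# Switch the I/O scheduler to noop for each OSD data disk
echo noop > /sys/block/sdb/queue/scheduler

# To persist across reboots, add "mtu 9000" to the interface stanza in
# /etc/network/interfaces and set the scheduler via the "elevator=noop"
# kernel boot parameter or a udev rule.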
