MySQL high “sy” (kernel) CPU time

MySQL, performance

My MySQL server runs about 100 writes and 500 reads per second. The CPU usage looks strange to me. "us" (user) time shows 2-8%, while "sy" (kernel/system) time often shows 50+%. Here's some vmstat output:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 5  1 3088828 153744    492 238764  152    0   224   578 3225 2343  7 61  5  3
 6  0 3088792 153356    492 239016   60    0    96  1955 3001 2102  8 59  5  3
16  1 3088792 153140    492 239008   32    0    80  1115 4906 3850  6 54 18 14
 2  1 3088792 153248    492 239124    8    0    44  1114 4529 3407  4 55 19 12
 2  0 3088792 152768    624 239208    0    0   180   914 3984 3052  4 39 48  8
 0  1 3088788 152736    624 239260   32    0    76   797 3683 2713  4 48 29  8
16  0 3088788 152644    624 239356    4    0    36   983 4042 2995  4 55 21  7
 3  0 3088788 153044    624 239412    8    0    48   891 3981 2928  5 51 29  7
 1  0 3088788 153016    624 239500    0    0    16   384 3581 2301  3 52 39  3
 1  1 3088768 150852    628 239524   32    0    72   830 3804 2826  4 48 33  9
 4  2 3088752 152604    632 239584   32    0    72   744 3423 2467  6 61  7  3
 3  0 3088704 152024    632 239664   80    0   152  1272 3641 2729  5 51 22  9
12  1 3088704 150000    632 239760    0    0    44  1037 4049 2989  4 53 19 12

And here is some mpstat output:

05:10:32 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
05:10:33 PM  all    5.65    0.43   55.65   10.87    0.00    1.30   16.52    0.00    9.57
05:10:34 PM  all    2.11    0.00   36.14    5.96    0.00    0.35    2.46    0.00   52.98
05:10:35 PM  all    1.98    0.00   28.05    7.26    0.00    0.33    0.00    0.00   62.38
05:10:36 PM  all    2.01    0.67   27.09    2.68    0.00    0.67    4.35    0.00   62.54
05:10:37 PM  all    2.04    0.68   32.65    6.12    0.00    0.34    2.04    0.00   56.12
05:10:38 PM  all    4.13    0.00   50.41   10.33    0.00    0.83   15.29    0.00   19.01
05:10:39 PM  all    2.71    0.00   32.54    4.75    0.00    0.34    0.34    0.00   59.32
05:10:40 PM  all    1.03    1.03   31.62    4.12    0.00    0.34    4.12    0.00   57.73
05:10:41 PM  all    3.02    0.34   27.85    8.05    0.00    0.67    0.34    0.00   59.73
05:10:42 PM  all    1.69    1.69   27.70    8.45    0.00    0.34    4.39    0.00   55.74
05:10:43 PM  all    3.19    0.00   38.65    3.90    0.00    0.35    2.13    0.00   51.77
05:10:44 PM  all    2.50    0.36   37.14    7.50    0.00    0.36    2.50    0.00   49.64
05:10:45 PM  all    4.17    0.38   42.42    7.20    0.00    0.38    4.92    0.00   40.53
05:10:46 PM  all    4.42    1.20   49.40    9.24    0.00    0.40    5.22    0.00   30.12
05:10:47 PM  all    3.32    0.00   44.65   18.82    0.00    0.37    0.37    0.00   32.47
05:10:48 PM  all    2.72    0.78   48.64    5.45    0.00    0.78    5.06    0.00   36.58
05:10:49 PM  all    3.69    0.00   48.36    4.92    0.00    0.82   13.11    0.00   29.10
05:10:50 PM  all    4.52    0.00   59.28   10.86    0.00    0.90   19.91    0.00    4.52
05:10:51 PM  all    4.02    0.89   58.04    3.57    0.00    0.45   22.32    0.00   10.71
05:10:52 PM  all    4.02    0.89   56.25    5.80    0.00    1.34   19.20    0.00   12.50
05:10:53 PM  all    5.38    1.15   39.62    8.08    0.00    0.77    7.69    0.00   37.31

Is this normal? How could I troubleshoot very high "sy" CPU usage? I'm running on an EC2 "large" instance.

This is one of a master-master pair, so replication runs both ways, if that matters.

Best Answer

vmstat is a good start, but mpstat gives a finer breakdown (iowait, irq, soft, steal). Can you post its output as well?

UPD. on mpstat: it looks like I/O bursts noticeably affect sys, though one can't yet say that's the main cause. What storage are you using, which filesystem is on it, and have you considered reducing the I/O load?
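One rough way to test that impression against the numbers in the question is to correlate %sys with the other columns. The following is only a sketch: the rows are pasted from the mpstat output above, the helper names `parse_mpstat` and `pearson` are made up for illustration, and 21 one-second samples are far too few to prove anything. It also prints the correlation with %steal, since on a virtualized host such as EC2 that column shows time taken by the hypervisor.

```python
from math import sqrt

# Samples copied from the mpstat output in the question.
MPSTAT = """\
05:10:33 PM  all    5.65    0.43   55.65   10.87    0.00    1.30   16.52    0.00    9.57
05:10:34 PM  all    2.11    0.00   36.14    5.96    0.00    0.35    2.46    0.00   52.98
05:10:35 PM  all    1.98    0.00   28.05    7.26    0.00    0.33    0.00    0.00   62.38
05:10:36 PM  all    2.01    0.67   27.09    2.68    0.00    0.67    4.35    0.00   62.54
05:10:37 PM  all    2.04    0.68   32.65    6.12    0.00    0.34    2.04    0.00   56.12
05:10:38 PM  all    4.13    0.00   50.41   10.33    0.00    0.83   15.29    0.00   19.01
05:10:39 PM  all    2.71    0.00   32.54    4.75    0.00    0.34    0.34    0.00   59.32
05:10:40 PM  all    1.03    1.03   31.62    4.12    0.00    0.34    4.12    0.00   57.73
05:10:41 PM  all    3.02    0.34   27.85    8.05    0.00    0.67    0.34    0.00   59.73
05:10:42 PM  all    1.69    1.69   27.70    8.45    0.00    0.34    4.39    0.00   55.74
05:10:43 PM  all    3.19    0.00   38.65    3.90    0.00    0.35    2.13    0.00   51.77
05:10:44 PM  all    2.50    0.36   37.14    7.50    0.00    0.36    2.50    0.00   49.64
05:10:45 PM  all    4.17    0.38   42.42    7.20    0.00    0.38    4.92    0.00   40.53
05:10:46 PM  all    4.42    1.20   49.40    9.24    0.00    0.40    5.22    0.00   30.12
05:10:47 PM  all    3.32    0.00   44.65   18.82    0.00    0.37    0.37    0.00   32.47
05:10:48 PM  all    2.72    0.78   48.64    5.45    0.00    0.78    5.06    0.00   36.58
05:10:49 PM  all    3.69    0.00   48.36    4.92    0.00    0.82   13.11    0.00   29.10
05:10:50 PM  all    4.52    0.00   59.28   10.86    0.00    0.90   19.91    0.00    4.52
05:10:51 PM  all    4.02    0.89   58.04    3.57    0.00    0.45   22.32    0.00   10.71
05:10:52 PM  all    4.02    0.89   56.25    5.80    0.00    1.34   19.20    0.00   12.50
05:10:53 PM  all    5.38    1.15   39.62    8.08    0.00    0.77    7.69    0.00   37.31
"""

COLUMNS = ["usr", "nice", "sys", "iowait", "irq", "soft", "steal", "guest", "idle"]


def parse_mpstat(text):
    """Turn 'HH:MM:SS AM/PM all <9 percentages>' lines into dicts of floats."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and per-CPU rows.
        if len(fields) != 12 or fields[2] != "all":
            continue
        rows.append(dict(zip(COLUMNS, map(float, fields[3:]))))
    return rows


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant column has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    rows = parse_mpstat(MPSTAT)
    sys_pct = [r["sys"] for r in rows]
    for other in ("iowait", "steal"):
        r = pearson(sys_pct, [row[other] for row in rows])
        print(f"corr(sys, {other}) = {r:+.2f}")
```

A coefficient near +1 would mean the two columns rise and fall together in these samples; it says nothing about which one drives the other.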

UPD. on FS: While XFS is a great filesystem for database storage, fairly recent versions suffered from poor metadata performance. I have seen 100% CPU usage with some access patterns, and switching to any filesystem other than XFS solved the problem. Nowadays, though, it has a silver bullet: the delaylog mount option. Check whether your kernel version supports it. I also hope you at least have noatime set already.
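To see which options the data volume is actually mounted with, you can look the path up in /proc/mounts. Here is a minimal sketch of that lookup; the sample mount table, the device names and the /var/lib/mysql path are assumptions for illustration, not taken from the question, and the parser ignores the octal escapes that /proc/mounts uses for spaces in paths.

```python
# Made-up sample in /proc/mounts format: device, mount point, type, options, dump, pass.
SAMPLE_MOUNTS = """\
/dev/xvda1 / ext4 rw,relatime 0 0
/dev/xvdf /var/lib/mysql xfs rw,noatime,attr2,delaylog,noquota 0 0
"""


def parse_mounts(text):
    """Return a list of (mount_point, fstype, set_of_options) tuples."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fstype, options = fields[:4]
        mounts.append((mount_point, fstype, set(options.split(","))))
    return mounts


def mount_for(path, mounts):
    """Pick the longest mount point that contains the given absolute path."""
    best = None
    for mount_point, fstype, options in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path.rstrip("/") + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype, options)
    return best


if __name__ == "__main__":
    datadir = "/var/lib/mysql"  # hypothetical MySQL data directory
    mount_point, fstype, options = mount_for(datadir, parse_mounts(SAMPLE_MOUNTS))
    print(f"{datadir} is on {mount_point} ({fstype})")
    for wanted in ("noatime", "delaylog"):
        state = "set" if wanted in options else "not set"
        print(f"  {wanted}: {state}")
```

On the real server, replace `SAMPLE_MOUNTS` with `open("/proc/mounts").read()` and point `datadir` at the actual data directory. Whether delaylog appears in the list at all depends on the kernel version, so a missing entry is not conclusive by itself.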

So, to sum up: I'd suggest working on reducing I/O.