I'm troubleshooting some issues with my RHEL 5 server. This is an Oracle DB server which has been running for a while now without much issue. Lately I noticed that the server load is relatively high due to kswapd processes causing high CPU usage. Upon checking, I noticed the server has a lot of swapping activity.
The server specs are:
12 x 2 CPU & 64GB RAM
bash-3.2$ uname -a
Linux 2.6.18-408.el5 #1 SMP Fri Dec 11 14:03:08 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
When I view top, I can see the server still has more than 10GB of free physical memory, so I'm not sure why it's swapping. I'd appreciate it if someone could point me in the right direction to troubleshoot.
top - 15:31:35 up 231 days, 5:22, 2 users, load average: 13.27, 13.97, 14.12
Tasks: 1443 total, 12 running, 1431 sleeping, 0 stopped, 0 zombie
Cpu(s): 29.2%us, 17.2%sy, 0.0%ni, 47.5%id, 5.4%wa, 0.0%hi, 0.6%si, 0.0%st
Mem: 65839252k total, 53587688k used, 12251564k free, 122936k buffers
Swap: 68059128k total, 4535508k used, 63523620k free, 45719164k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9423 oraitxnp 17 0 8403m 167m 166m R 98.7 0.3 0:57.51 oracle
12348 oraitxnp 17 0 8405m 242m 240m R 98.7 0.4 0:39.11 oracle
8942 oraitxnp 20 0 8404m 174m 171m R 95.6 0.3 1:59.77 oracle
9049 oraitxnp 25 0 8404m 170m 167m R 95.6 0.3 1:33.17 oracle
9402 oraitxnp 25 0 8404m 161m 158m R 95.6 0.3 1:24.03 oracle
13280 oraitxnp 17 0 8403m 161m 159m R 95.6 0.3 1:04.59 oracle
13227 oraitxnp 17 0 8403m 165m 162m R 92.4 0.3 0:40.65 oracle
1431 root 11 -5 0 0 0 R 82.8 0.0 2802:41 kswapd2
11395 oraitxnp 16 0 8403m 192m 191m R 66.9 0.3 0:15.55 oracle
sar -r
02:20:02 PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
02:30:11 PM 12860252 52979000 80.47 122888 45721248 63711928 4347200 6.39 853652
02:40:02 PM 12591216 53248036 80.88 122876 45728156 63467408 4591720 6.75 860892
02:50:01 PM 12648836 53190416 80.79 122928 45729408 63717800 4341328 6.38 913284
03:00:02 PM 12489840 53349412 81.03 122932 45727364 63558884 4500244 6.61 941220
03:10:05 PM 12380352 53458900 81.20 123064 45735548 63541648 4517480 6.64 879124
03:20:12 PM 12195596 53643656 81.48 123124 45732364 63358440 4700688 6.91 901656
03:30:02 PM 12425600 53413652 81.13 122936 45718624 63582308 4476820 6.58 964544
Average: 12406342 53432910 81.16 121691 45498460 63646323 4412805 6.48 952204
sar -B
02:20:02 PM pgpgin/s pgpgout/s fault/s majflt/s
02:30:11 PM 36386.86 4421.45 14369.55 2242.21
02:40:02 PM 41398.13 5570.15 17610.94 2555.90
02:50:01 PM 51600.70 4681.47 14093.22 1675.94
03:00:02 PM 48850.39 5340.96 15636.23 2251.99
03:10:05 PM 53043.46 4755.90 17506.83 2378.80
03:20:12 PM 39151.42 5297.79 14383.58 1816.64
03:30:02 PM 47760.58 5099.56 14774.31 2236.45
Average: 47687.94 4831.93 15128.85 2191.29
-bash-3.2$ free -m
total used free shared buffers cached
Mem: 64296 52281 12014 0 120 44655
-/+ buffers/cache: 7506 56789
Swap: 66463 4545 61918
Best Answer
What is your vm.swappiness set at? The default is 60 (on Ubuntu anyway). As I understand it, the lower the number, the more your system will prefer RAM over swap.

This is, of course, assuming the high CPU load is due to disk swap. If I'm reading that output correctly, those 8 oraitxnp processes each show 8G+ of virtual memory (VIRT). That looks like physical RAM contention, but VIRT counts mapped address space rather than RAM in use, and RES for each process is only 160-240M, nearly all of it shared (SHR), so I'm not sure that it is.

I would cat /proc/meminfo to get a better idea of how much "physical" RAM is being used. It's hard to tell from some of the sar output due to the way it mashes the 64G physical + 66G swap together, but I would venture a guess that adding another 64G of RAM to that box would help, and maybe reducing that disk swap down to 8G or something. Ideally, you never want to hit disk swap. If you do, you need to add more physical RAM or incur performance penalties.

Years ago, the Linux standard for swap was to "just make it double your RAM", but this was when most desktop systems were only running 1-2G. Even Red Hat has changed its tune, suggesting 20% of physical is "usually a good idea".
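
To make the swappiness and /proc/meminfo checks concrete, here is a minimal sketch. The function names meminfo_summary and show_swappiness are made up for illustration. The "used by applications" figure assumes the same arithmetic as the "-/+ buffers/cache" line in the free output from the question (52281 - 120 - 44655 = 7506 MB), which suggests applications hold only about 7.5G while roughly 44G is page cache.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any standard tool.

# Summarize a meminfo-format file (defaults to /proc/meminfo), values in MB.
# app_used follows the "-/+ buffers/cache" arithmetic seen in the question:
#   MemTotal - MemFree - Buffers - Cached
meminfo_summary() {
    awk '
        $1 == "MemTotal:"  { total   = $2 }
        $1 == "MemFree:"   { free    = $2 }
        $1 == "Buffers:"   { buffers = $2 }
        $1 == "Cached:"    { cached  = $2 }
        $1 == "SwapTotal:" { stotal  = $2 }
        $1 == "SwapFree:"  { sfree   = $2 }
        END {
            printf "app_used_mb=%d\n",    (total - free - buffers - cached) / 1024
            printf "reclaimable_mb=%d\n", (buffers + cached) / 1024
            printf "free_mb=%d\n",        free / 1024
            printf "swap_used_mb=%d\n",   (stotal - sfree) / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Print the current swappiness value (0-100) from the kernel sysctl file.
show_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Run against the live system only when the files are readable.
if [ -r /proc/meminfo ]; then meminfo_summary; fi
if [ -r /proc/sys/vm/swappiness ]; then show_swappiness; fi
```

If swappiness turns out to be the default, it can be lowered at runtime with sysctl -w vm.swappiness=N, where N is whatever value you choose to test. Whether that helps here is a guess, not something the output above proves.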