Linux high load with ImageMagick convert utility and server freezes (blktrack output attached)

centos5, high-load, imagemagick, linux, linux-kernel

I'm using ImageMagick to convert a JPG to a TIF file, with the -limit options as below:

/usr/bin/convert -limit memory 256 -limit map 512 subjectfile.jpg -colorspace Gray -depth 8 -resample 200x200 output.tif

When I run the above command the load on the server suddenly spikes and the CPUs spend most of their time in I/O wait, as shown below:

Tasks: 245 total,   3 running, 241 sleeping,   0 stopped,   1 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,100.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  1.0%us,  1.0%sy,  0.0%ni,  0.0%id, 93.1%wa,  0.0%hi,  5.0%si,  0.0%st
Cpu3  :  6.9%us,  1.0%sy,  0.0%ni, 90.1%id,  0.0%wa,  1.0%hi,  1.0%si,  0.0%st
Mem:   4148160k total,  3980380k used,   167780k free,    18012k buffers
Swap:  4096552k total,       96k used,  4096456k free,  3339884k cached

The iostat output during this period shows the following:

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  7361.00 62.00 137.00  3712.00 37180.00   410.97   128.13  120.48   5.04 100.20
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00  7361.00 62.00 137.00  3712.00 37180.00   410.97   128.13  120.48   5.04 100.20
sdb               0.00  7368.00  0.00 144.00     0.00 33136.00   460.22   133.84  203.48   6.96 100.20
sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb3              0.00  7368.00  0.00 144.00     0.00 33136.00   460.22   133.84  203.48   6.96 100.20
md1               0.00     0.00 61.00 17711.00  3648.00 70844.00     8.38     0.00    0.00   0.00   0.00
md0               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1193.00  0.00 470.00     0.00 14200.00    60.43    91.07  216.34   2.02  95.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00  1193.00  0.00 470.00     0.00 14200.00    60.43    91.07  216.34   2.02  95.00
sdb               0.00  1138.00  0.00 410.00     0.00  8700.00    42.44   141.31  119.61   2.44 100.20
sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb3              0.00  1138.00  0.00 410.00     0.00  8700.00    42.44   141.31  119.61   2.44 100.20
md1               0.00     0.00  0.00 5226.00     0.00 20904.00     8.00     0.00    0.00   0.00   0.00
md0               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1472.28  0.00 483.17     0.00  7821.78    32.38     5.52   11.43   0.52  25.05
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda3              0.00  1472.28  0.00 483.17     0.00  7821.78    32.38     5.52   11.43   0.52  25.05
sdb               0.00  1511.88  0.00 410.89     0.00 10047.52    48.91   143.60  171.46   2.42  99.31
sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb3              0.00  1511.88  0.00 410.89     0.00 10047.52    48.91   143.60  171.46   2.42  99.31
md1               0.00     0.00  0.00 778.22     0.00  3112.87     8.00     0.00    0.00   0.00   0.00
md0               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

I'm not very familiar with Linux I/O performance, but from reading on the internet I managed to capture some stats with blktrace while this was happening, which show:

==================== All Devices ====================
          ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q               0.000000499   0.000486353   1.158217913      172004
Q2G               0.000000258   0.000059510   0.198865402      343500
S2G               0.000128922   0.010945336   0.198863747        1840
G2I               0.000000214   0.000000517   0.000168407      343504
Q2M               0.000000190   0.000000519   0.000122999      344516
I2D               0.000000879   0.016310824   0.305521347      342948
M2D               0.000000951   0.007473560   0.205691209      344492
D2C               0.000083899   0.002041770   0.160452919      171859
Q2C               0.000092851   0.013953825   0.317186332      171859

==================== Device Overhead ====================

       DEV |       Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
 (  8,  0) |   0.8524%   0.0074%   0.0075% 233.2591%  14.6323%
---------- | --------- --------- --------- --------- ---------
   Overall |   0.8524%   0.0074%   0.0075% 233.2591%  14.6323%

==================== Device Merge Information ====================

       DEV |       #Q       #D   Ratio |   BLKmin   BLKavg   BLKmax    Total
---------- | -------- -------- ------- | -------- -------- -------- --------
 (  8,  0) |   343516   343516     1.0 |        8       16     1024  5650976

==================== Device Q2Q Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8,  0) |          172005      27058614.9               0 | 0(123703)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |          172005      27058614.9               0 | 0(123703)

==================== Device D2D Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8,  0) |          343516       9204796.3               0 | 0(310240)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |          343516       9204796.3               0 | 0(310240)

Running btt with the -A option shows this:

==================== Per Process ====================

          Q2Qdm           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

          Q2Adm           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

          Q2Cdm           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------


            Q2Q           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------
convert           0.085368267   9.765798951  24.050329666           3
md1_raid1         0.000000730   0.000493657   1.158217913      169459
mysqld            0.000001386   0.018154085  14.221072636        2146
sh                0.005889458   0.322064972   1.423632298           5

            Q2A           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

            Q2G           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------
convert           0.000000539   0.000003194   0.000005260          16
md1_raid1         0.000000258   0.000060580   0.198865402      333440
mysqld            0.000000270   0.000028381   0.058359194        8476
sh                0.000000506   0.000000827   0.000001610          24
            S2G           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------
md1_raid1         0.000128922   0.010842039   0.198863747        1836
mysqld            0.058358625   0.058358625   0.058358625           4

I'm using the following I/O scheduler:

# cat /sys/block/sd*/queue/scheduler 
noop anticipatory deadline [cfq] 
noop anticipatory deadline [cfq]

So my question is: why is the average (AVG) Q2Q value so high for the convert (ImageMagick) utility when I use it with the -limit options?

convert           0.085368267   9.765798951  24.050329666           3

I don't see this high-load problem when I use convert without the -limit options. Can you please help me understand why the load shoots up when I try to limit the resources used by ImageMagick's convert utility with the -limit option, and how I can fix the issue?

Best Answer

I ran your exact command line (although with a different picture, I presume ;-) ) with and without the limit options, and I found the problem: the limit values are interpreted in bytes. What does that mean?

You set a maximum of 256 bytes for memory and 512 bytes for the file mapping. So instead of reading nice big buffered chunks, convert reads at most one filesystem block at a time (or even less on a 4K-sector disk), which generates a huge number of unnecessary I/Os.
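To get a feel for how large the difference is, here is a small shell sketch (the to_bytes helper is purely illustrative, not part of ImageMagick) that expands limit strings into byte counts:

```shell
# to_bytes: hypothetical helper that expands an ImageMagick-style
# limit string (e.g. "256MiB") into a byte count, to illustrate
# the gap between a bare "256" and "256MiB".
to_bytes() {
    case "$1" in
        *GiB) echo $(( ${1%GiB} * 1024 * 1024 * 1024 )) ;;
        *MiB) echo $(( ${1%MiB} * 1024 * 1024 )) ;;
        *KiB) echo $(( ${1%KiB} * 1024 )) ;;
        *)    echo "$1" ;;   # bare number: interpreted as bytes
    esac
}

to_bytes 256      # -> 256: smaller than a single filesystem block
to_bytes 256MiB   # -> 268435456: room for large buffered reads
```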

What you want is 256MiB and 512MiB (be careful to match the case exactly: MiB, not mib or MIB):

/usr/bin/convert -limit memory 256MiB -limit map 512MiB subjectfile.jpg -colorspace Gray -depth 8 -resample 200x200 output.tif

With this command I get roughly the same wall-clock time as when not using -limit, and more or less the same I/O activity as without the limit options.
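The iostat samples in your question are consistent with this explanation: avgrq-sz is reported in 512-byte sectors, and the later samples show md1 averaging exactly 8.00 sectors per request, i.e. one filesystem block per write:

```shell
# avgrq-sz of 8.00 on md1 means 8 sectors of 512 bytes each,
# so every request carries only a single 4 KiB filesystem block:
echo $(( 8 * 512 ))   # -> 4096
```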