I'm experiencing a similar problem running QEMU/KVM on a Debian machine (although I am running it through libvirt).
I triggered the NIC freeze by cloning a disk over FTP to one of the three VMs running on this host. Only that VM seems to be affected; the other two VMs and the host keep working fine. To me it also looks like virtio is causing the freeze.
host kernel (Debian Lenny 5.0.6):
Linux host_machine_1 2.6.32-bpo.5-amd64 #1 SMP Thu Oct 21 10:02:18 UTC 2010 x86_64 GNU/Linux
guest kernel (Ubuntu Hardy Heron 8.04 LTS):
Linux virtual_machine_1 2.6.24-26-server #1 SMP Tue Dec 1 18:26:43 UTC 2009 x86_64 GNU/Linux
syslog guest:
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.151904] swapper: page allocation failure. order:1, mode:0x4020
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.151919] Pid: 0, comm: swapper Not tainted 2.6.24-26-server #1
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.151920]
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.151921] Call Trace:
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.151925] [__alloc_pages+0x2fd/0x3d0] __alloc_pages+0x2fd/0x3d0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152256] [new_slab+0x220/0x260] new_slab+0x220/0x260
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152260] [__slab_alloc+0x2f5/0x410] __slab_alloc+0x2f5/0x410
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152281] [virtio_net:__netdev_alloc_skb+0x2b/0x2eb0] __netdev_alloc_skb+0x2b/0x50
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152285] [virtio_net:__netdev_alloc_skb+0x2b/0x2eb0] __netdev_alloc_skb+0x2b/0x50
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152287] [__kmalloc_node_track_caller+0x121/0x130] __kmalloc_node_track_caller+0x121/0x130
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152290] [ipv6:__alloc_skb+0x7b/0x4f0] __alloc_skb+0x7b/0x160
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152293] [virtio_net:__netdev_alloc_skb+0x2b/0x2eb0] __netdev_alloc_skb+0x2b/0x50
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152312] [virtio_net:try_fill_recv+0x61/0x1b0] :virtio_net:try_fill_recv+0x61/0x1b0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152336] [ktime_get_ts+0x1b/0x50] ktime_get_ts+0x1b/0x50
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152341] [virtio_net:virtnet_poll+0x18c/0x350] :virtio_net:virtnet_poll+0x18c/0x350
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152346] [tick_program_event+0x35/0x60] tick_program_event+0x35/0x60
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152355] [net_rx_action+0x128/0x230] net_rx_action+0x128/0x230
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152358] [virtio_net:skb_recv_done+0x2c/0x40] :virtio_net:skb_recv_done+0x2c/0x40
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152369] [__do_softirq+0x75/0xe0] __do_softirq+0x75/0xe0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152379] [call_softirq+0x1c/0x30] call_softirq+0x1c/0x30
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152386] [do_softirq+0x35/0x90] do_softirq+0x35/0x90
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152389] [irq_exit+0x88/0x90] irq_exit+0x88/0x90
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152391] [do_IRQ+0x80/0x100] do_IRQ+0x80/0x100
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152393] [default_idle+0x0/0x40] default_idle+0x0/0x40
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152395] [default_idle+0x0/0x40] default_idle+0x0/0x40
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152396] [ret_from_intr+0x0/0x0a] ret_from_intr+0x0/0xa
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152398] [default_idle+0x29/0x40] default_idle+0x29/0x40
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152404] [cpu_idle+0x48/0xe0] cpu_idle+0x48/0xe0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152471] [start_kernel+0x2c5/0x350] start_kernel+0x2c5/0x350
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152475] [x86_64_start_kernel+0x12e/0x140] _sinittext+0x12e/0x140
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152482]
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152483] Mem-info:
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152484] Node 0 DMA per-cpu:
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152486] CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152487] Node 0 DMA32 per-cpu:
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152489] CPU 0: Hot: hi: 186, btch: 31 usd: 122 Cold: hi: 62, btch: 15 usd: 55
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152492] Active:35252 inactive:200609 dirty:11290 writeback:193 unstable:0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152492] free:1597 slab:11996 mapped:2986 pagetables:3395 bounce:0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152494] Node 0 DMA free:3988kB min:40kB low:48kB high:60kB active:1320kB inactive:4128kB present:10476kB pages_scanned:0 all_unreclaimable? no
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152497] lowmem_reserve[]: 0 994 994 994
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152499] Node 0 DMA32 free:2400kB min:4012kB low:5012kB high:6016kB active:139688kB inactive:798308kB present:1018064kB pages_scanned:0 all_unreclaimable? no
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152502] lowmem_reserve[]: 0 0 0 0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152504] Node 0 DMA: 3*4kB 1*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3988kB
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152509] Node 0 DMA32: 412*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2400kB
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152514] Swap cache: add 188, delete 187, find 68/105, race 0+0
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152516] Free swap = 3084140kB
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152517] Total swap = 3084280kB
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.152517] Free swap: 3084140kB
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158388] 262139 pages of RAM
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158390] 4954 reserved pages
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158391] 269600 pages shared
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158392] 1 pages swap cached
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158461] swapper: page allocation failure. order:1, mode:0x4020
Feb 21 09:00:22 virtual_machine_1 kernel: [63114.158464] Pid: 0, comm: swapper Not tainted 2.6.24-26-server #1
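As a reading aid for the Mem-info dump above: the failing request is order:1, i.e. two contiguous 4kB pages, and it is made from softirq context (see the call trace), where the kernel cannot wait for memory to be reclaimed. The following is a minimal Python sketch that parses the DMA32 buddy line from the log and shows how little of the free memory sits in blocks of order 1 or higher. The helper names are my own, and the sketch does not reproduce the kernel's actual watermark checks; it only does the arithmetic on the numbers in the log.

```python
import re

# Buddy-allocator summary copied from the DMA32 line of the guest syslog above.
DMA32_LINE = ("Node 0 DMA32: 412*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB "
              "0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2400kB")

PAGE_KB = 4  # assumption: page size taken from the smallest block size in the log


def parse_buddy_line(line):
    """Return ({order: free_block_count}, reported_total_kb) for one log line."""
    blocks = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # A block of size_kb holds 2**order pages; recover the order from the size.
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        blocks[order] = int(count)
    total_kb = int(re.search(r"=\s*(\d+)kB", line).group(1))
    return blocks, total_kb


def free_kb_at_or_above(blocks, order):
    """Sum the free memory held in blocks of at least the given order."""
    return sum(n * PAGE_KB * (1 << o) for o, n in blocks.items() if o >= order)


blocks, total_kb = parse_buddy_line(DMA32_LINE)
print("free blocks per order:", blocks)
print("total free kB:", free_kb_at_or_above(blocks, 0), "reported:", total_kb)
print("free kB in blocks of order >= 1:", free_kb_at_or_above(blocks, 1))
```

Most of the free DMA32 memory is in single 4kB pages, and there is no free 8kB block at all, which fits a failing order:1 allocation while the zone is already below its min watermark (free:2400kB, min:4012kB).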
Guest config for qemu:
<domain type='kvm'>
  <name>virtual_machine_1</name>
  <uuid>41c1bf76-2aaa-3b32-8868-f28748db750a</uuid>
  <memory>2097152</memory>
  <currentMemory>2097152</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu'/>
      <source dev='/dev/drbd1'/>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' unit='0'/>
    </disk>
    <disk type='block' device='cdrom'>
      <driver name='qemu'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>
    <controller type='ide' index='0'/>
    <interface type='bridge'>
      <mac address='52:54:00:2d:95:e5'/>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' keymap='en-us'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
    </video>
  </devices>
</domain>
kvm command:
/usr/bin/kvm -S -M pc
-enable-kvm
-m 2048
-smp 1,sockets=1,cores=1,threads=1
-name virtual_machine_1
-uuid 41c1bf76-2aaa-3b32-8868-f28748db750a
-nodefaults
-chardev socket,id=monitor,path=/var/lib/libvirt/qemu/virtual_machine_1.monitor,server,nowait
-mon chardev=monitor,mode=readline -rtc base=utc
-boot c -drive file=/dev/drbd1,if=none,id=drive-ide0-0-0,boot=on,format=raw
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
-drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
-device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:2d:95:e5,bus=pci.0,addr=0x3
-net tap,fd=17,vlan=0,name=hostnet0
-chardev pty,id=serial0
-device isa-serial,chardev=serial0 -usb
-vnc 0.0.0.0:1
-k en-us
-vga cirrus
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
This post seems to be related:
http://www.mail-archive.com/kvm@vger.kernel.org/msg26033.html
This patch is also mentioned (I haven't tested it yet but it should solve the problem):
http://www.mail-archive.com/kvm@vger.kernel.org/msg26279.html
It's quite simple, really. For homogeneous clusters and single-host setups, use the host option. For mixed clusters, use the lowest available CPU version: if one host is Penryn and the other Nehalem, use Penryn on both.
If you are using RHEV or oVirt, this is already built in. VMware calls this "EVC" and positions it as a huge feature.
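For setups without such a built-in feature, the rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name is made up, and the generation list is a short, hand-written ordering of Intel model names (oldest first), not a complete or authoritative table.

```python
# Hypothetical, hand-written ordering of Intel CPU models, oldest first.
# Illustration only; extend or replace it to match the hosts you actually have.
GENERATIONS = ["Conroe", "Penryn", "Nehalem", "Westmere", "SandyBridge"]


def pick_guest_cpu_model(host_models):
    """Pick the CPU model to expose to guests, given each host's CPU model."""
    if not host_models:
        raise ValueError("need at least one host")
    unique = set(host_models)
    if len(unique) == 1:
        # Homogeneous cluster or single host: use the host option.
        return "host"
    # Mixed cluster: use the oldest model present, so every host can provide it.
    # Raises ValueError if a model is missing from GENERATIONS.
    return min(unique, key=GENERATIONS.index)


print(pick_guest_cpu_model(["Nehalem"]))
print(pick_guest_cpu_model(["Penryn", "Nehalem"]))
```

The first call covers the single-host case and the second the Penryn/Nehalem example from above, where Penryn is chosen for both hosts.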
Getting back to performance, you definitely need virtio everywhere you can put it. If you still hit performance bottlenecks, those can usually be addressed on a case-by-case basis, depending on where they occur.
[offtop]On your choice of distribution I have already commented in another thread[/offtop]
Best Answer
Proxmox is essentially a management interface on top of a hypervisor, so you are probably using KVM as the hypervisor.
Try changing the disk driver and test. In theory you should have no problems, but GRUB may fail to boot; in that case, update the GRUB config, for example:
rm /boot/grub/device.map
grub-mkdevicemap
update-grub2
On some newer Linux versions GRUB no longer has a device map file; in that case, just run update-grub2 or the equivalent.
To do this easily, you can boot SystemRescueCd and work on your guest VM in a chroot environment. You see the mapper path because you are using LVM partitions in your guest.
(I'm posting this as an answer because I can't comment.)