I just built a server on a Supermicro X8DAH+-F board running Ubuntu 10.04 Server 64-bit. The board has an Intel 82576 dual-port controller (one port is disabled). Since this is a server, remote access is imperative.
The server is connected to a switch (DLink), and the switch is connected to a router running DD-WRT (Netgear WNR3500v2/U/L).
eth1      Link encap:Ethernet  HWaddr 00:25:90:03:c9:b9
          inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:7655 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5772 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7179394 (7.1 MB)  TX bytes:919727 (919.7 KB)
          Memory:fbc60000-fbc80000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:637 errors:0 dropped:0 overruns:0 frame:0
          TX packets:637 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:96955 (96.9 KB)  TX bytes:96955 (96.9 KB)
I am pulling my hair out. This server randomly drops all connections. If I am logged in via SSH, the session gets disconnected anywhere from immediately after login to 30 minutes later. Once the connections drop, it takes several minutes for services to come back up.
I ran a 24-hour ping test from the server to the router. The disconnections coincide with random periods of high packet loss between the NIC and the router.
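For anyone wanting to reproduce this kind of test, here is a minimal sketch in Python of a ping monitor that timestamps each probe and groups consecutive losses into bursts, so they can be lined up against the SSH disconnects. It is not the exact test run above. The function names (`ping_once`, `find_loss_bursts`, `monitor`) and the three-probe burst threshold are illustrative assumptions, and the `-c`/`-W` flags assume the Linux iputils `ping`.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Send one ICMP echo with the system ping; True if a reply came back."""
    # -c 1: send one packet, -W: reply timeout in seconds (Linux iputils ping).
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_loss_bursts(samples, min_len=3):
    """Group consecutive failed probes into bursts.

    samples: list of (timestamp, ok) tuples in time order.
    Returns (first_ts, last_ts, lost_count) for every run of at least
    min_len consecutive failures. min_len=3 is an arbitrary choice.
    """
    bursts = []
    run = []
    for ts, ok in samples:
        if not ok:
            run.append(ts)
            continue
        if len(run) >= min_len:
            bursts.append((run[0], run[-1], len(run)))
        run = []
    # A burst may still be open when the capture ends.
    if len(run) >= min_len:
        bursts.append((run[0], run[-1], len(run)))
    return bursts


def monitor(host, duration_s, interval_s=1.0):
    """Probe host once per interval for duration_s seconds; return samples."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append((time.time(), ping_once(host)))
        time.sleep(interval_s)
    return samples
```

Usage would look like `find_loss_bursts(monitor("192.168.0.1", 24 * 3600))`, where 192.168.0.1 is only a guess at the router's address (the post does not state it). Each returned tuple gives the start and end time of a loss burst, which can then be compared with the times the SSH sessions dropped.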
The server is not overloaded with I/O or CPU load, and I am the only one using it.
Things I have tried, to no avail:
- Swapping cables
- Swapping routers
- Swapping ports on the router
- Removing network-manager (Ubuntu)
- Disabling all firewalls
- Disabling iptables
- Restarting all of the services manually
I am considering buying a PCIe NIC, but I want to ask in case there is something I am overlooking.
Best Answer
One thing you might want to verify is that no other machine or device on the network is "stealing" the server's IP address. If you cannot find that information in your network equipment, there is always the option of running an arpwatch daemon on some suitable server on that local network.
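If installing arpwatch is not convenient, the same idea can be roughly approximated by hand. This is a sketch, not a replacement for arpwatch: from another Linux machine on the same LAN, save the contents of /proc/net/arp a few times (ideally once while the server is reachable and once during an outage) and check whether 192.168.0.100 ever maps to more than one MAC address. The function names below are hypothetical, and the parser assumes the usual six-column layout of /proc/net/arp.

```python
from collections import defaultdict


def parse_proc_net_arp(text):
    """Parse the contents of /proc/net/arp into a {ip: mac} dict."""
    table = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, flags, mac = fields[0], fields[2], fields[3]
        # Flags 0x0 marks an incomplete entry with no resolved MAC.
        if flags == "0x0" or mac == "00:00:00:00:00:00":
            continue
        table[ip] = mac.lower()
    return table


def find_ip_conflicts(snapshots):
    """Return {ip: [macs]} for every IP seen with more than one MAC.

    snapshots: iterable of {ip: mac} dicts taken at different times.
    """
    macs_by_ip = defaultdict(set)
    for table in snapshots:
        for ip, mac in table.items():
            macs_by_ip[ip].add(mac)
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}
```

If `find_ip_conflicts` reports 192.168.0.100 with a MAC other than 00:25:90:03:c9:b9 (the HWaddr shown for eth1 in the question), some other device is answering ARP for the server's address. An empty result does not rule a conflict out, since the snapshots may simply have missed it.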