Windows – Exchange accounts in Outlook 2010 freeze randomly due to high latency

exchange-2010 outlook windows windows-sbs

One of our clients is running Windows SBS 2011, with users both in a Terminal Services (TS) environment and on local LAN computers.
This is a small environment with approximately 35 users.

The symptoms we are experiencing are as follows:

Individual users will experience Outlook not responding for approximately 5-10 minutes at a time, at roughly the same time every day as well as at random times.
This happens to both TS and local users.
There usually isn't any overlap between users, i.e. different users experience the issue at different times while others continue to work without incident.

What we have observed:

  • Store.exe on the server running Exchange has constantly high disk access, varying between 5-10 MB/s read and 1-5 MB/s write (according to Resource Monitor). This remains constant regardless of whether anyone is experiencing a lockout. The activity disappears at night when no one is accessing the system.

  • During a user's Outlook lockout, ExMon reports very high average server latency for that specific user (around 60 seconds on average during the lockout, versus the normal 0 to 500 ms). Other, unaffected users do not show high latency.

  • During a lockout, the affected user has a high session count.

Additional notes:

  • The event log doesn't appear to report anything related.
  • Nothing is scheduled on the Exchange server during the times people experience the lockout.
  • All Windows updates and service packs have been applied to both servers and to Office.
  • While a user is locked out, you cannot even open the Mail control panel item to reach the Exchange settings page for that user.

Server Spec:

  • Dual CPU, Xeon E5606 @ 2.13 GHz
  • 24 GB memory
  • SAS 15K (Seagate Cheetah) – 300 GB pair in RAID 1

We are at a loss as to what is causing the issue at this stage.

Any suggestions or direction would be greatly appreciated.

Best Answer

  1. It's not TCP connection exhaustion.
  2. Use Perfmon to log and check your key server metrics. I'd start with disk, as I'm fairly confident the issue is a very long disk queue length.
  3. Check your hardware, both to make sure it hasn't gone bad and to get an idea of what performance you should be capable of.
    • The last time I saw this happen (5-10 minute, sporadic freezes for certain users), the issue was bad blocks on one of the server's disks.
    • The time before that, it was a massive disk queue caused by users opening >10 GB .pst files over a network share.
    • A disk configuration that cannot support the level of IOPS your users are throwing at the server can also cause this type of behavior in use cases similar to yours.
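
For step 2, once Perfmon has logged the disk counters, something like the following can help line up queue spikes with the times users report freezes. It is only a rough sketch in Python, assuming the counter log has been exported to CSV (for example with `relog -f CSV`), where the first column is the timestamp and each remaining column header is a counter path. The function name, the server name `SBS`, the sample values, and the threshold of 2.0 are all made up for illustration; pick a threshold that makes sense for your own disks.

```python
import csv
import io


def find_queue_spikes(csv_text, counter_suffix="Avg. Disk Queue Length", threshold=2.0):
    """Return (timestamp, counter, value) for every sample above the threshold.

    Hypothetical helper: assumes a Perfmon-style CSV where column 0 is the
    timestamp and every other column header is a full counter path.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Pick every counter column whose path ends with the requested counter name.
    wanted = [i for i, name in enumerate(header) if i > 0 and name.endswith(counter_suffix)]
    spikes = []
    for row in reader:
        for i in wanted:
            if i >= len(row):
                continue  # short or truncated row
            try:
                value = float(row[i])
            except ValueError:
                continue  # blank cell means the sample was missed
            if value > threshold:
                spikes.append((row[0], header[i], value))
    return spikes


# Made-up sample data in the assumed layout, not real measurements.
SAMPLE = r'''"(PDH-CSV 4.0) (GMT Standard Time)(0)","\\SBS\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"03/01/2012 09:00:00.000","0.41"
"03/01/2012 09:00:15.000","37.90"
"03/01/2012 09:00:30.000"," "
"03/01/2012 09:00:45.000","1.20"
'''

for timestamp, counter, value in find_queue_spikes(SAMPLE):
    print(f"{timestamp}  {counter}  {value:.2f}")
```

If the flagged timestamps match the windows where ExMon shows a user's latency jumping, that points at the disk subsystem; if they don't, the disk queue is probably not your culprit and the other metrics deserve a look.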