I have been monitoring this for about 6-7 weeks now and can finally give a definitive answer to the problem.
First, the Nonpaged Bytes counters for individual processes didn't tell me anything useful, as they all appeared fairly static. There were spikes, but usage always returned to the baseline afterward.
The Nonpaged Bytes Memory total was also static for a while, but then started gradually increasing and then spiking. After a spike about half the memory was freed, and then it remained static again (at the higher level) for a while until the pattern repeated. Looking at the graph, I noticed that these spikes seemed fairly regularly spaced; as it turns out, they were happening 2 weeks apart and always on a Sunday.
So the next question was: what runs every two weeks on a Sunday? I had a look in Event Viewer, and every time a spike occurred McAfee was running. I also think that by logging onto the server frequently to monitor the issue we inadvertently made the problem worse, because McAfee has a real-time scanner and I believe this was causing the smaller increases we were seeing.
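If you want to check for this kind of regular spacing without eyeballing a graph, something like the following rough sketch could work. It assumes the counter has already been exported as (timestamp, value) pairs; the function names, the 7-sample window, the 1.5x threshold, and the demo numbers are all made up for illustration and are not our real data.

```python
from datetime import datetime, timedelta
from statistics import median

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def find_spikes(samples, window=7, factor=1.5):
    """Return timestamps whose value exceeds factor x the median
    of the preceding `window` samples (hypothetical thresholds)."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = median(v for _, v in samples[i - window:i])
        when, value = samples[i]
        if value > factor * baseline:
            spikes.append(when)
    return spikes

def describe_spikes(spikes):
    """Return (weekday name, days since previous spike) per spike."""
    described = []
    previous = None
    for when in spikes:
        gap = (when - previous).days if previous is not None else None
        described.append((DAY_NAMES[when.weekday()], gap))
        previous = when
    return described

def demo_samples():
    """Synthetic daily samples: a spike every 14 days, after which
    the baseline settles at a higher level than before."""
    start = datetime(2012, 1, 1)  # this date is a Sunday
    level = 100.0
    samples = []
    for day in range(43):
        when = start + timedelta(days=day)
        if day > 0 and day % 14 == 0:
            samples.append((when, level * 2))  # the spike itself
            level *= 1.2                       # new, higher baseline
        else:
            samples.append((when, level))
    return samples

if __name__ == "__main__":
    for weekday, gap in describe_spikes(find_spikes(demo_samples())):
        print(weekday, gap)
```

Comparing against a trailing median rather than a fixed threshold is deliberate here, since the baseline itself steps up after every spike.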
I think the scans being scheduled tasks also explains why we saw the NP Memory increases attached to the Event objects tag in PoolMon instead of the McAfee-specific tag. This was the main thing that really led us down the garden path.
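For anyone doing the same kind of tag hunting, comparing two PoolMon snapshots taken some time apart makes the growing tags stand out. Here is a minimal sketch of that comparison, assuming the per-tag nonpaged byte counts have already been copied into dictionaries by hand. The parsing step is left out, and the function name `growing_tags` and all the numbers are hypothetical.

```python
def growing_tags(before, after, min_growth=0):
    """Return (tag, growth in bytes) pairs, largest growth first,
    for tags whose usage grew by more than min_growth."""
    growth = {tag: after[tag] - before.get(tag, 0) for tag in after}
    grown = [(tag, delta) for tag, delta in growth.items() if delta > min_growth]
    return sorted(grown, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    # Made-up byte counts keyed by pool tag, purely for illustration.
    snapshot_monday = {"Even": 1000, "AfdB": 500, "Ntfs": 300}
    snapshot_friday = {"Even": 5000, "AfdB": 700, "Ntfs": 300}
    for tag, delta in growing_tags(snapshot_monday, snapshot_friday):
        print(tag, delta)
```

Keep in mind that, as happened to us, the tag that grows is not necessarily named after the software responsible.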
Now that we finally know what is causing the leaks we can do something about it. It's incredible that it took this long to track it down though.
UPDATE: Just as a final note, McAfee was updated on the weekend and this completely resolved our Non-Paged Memory problem.
UPDATE 2: Since I just got an upvote for this, I'll add a further update. Initially the update to McAfee did appear to fix our problem, i.e. we no longer see the massive spikes in NP Memory at regular intervals. I have also noticed that since the update McAfee no longer seems to write logs to the Event Viewer by default, which hides when it is actively scanning.
But we are still seeing gradual increases in NP memory usage. It has gotten to the point where we now need to reboot our server every 2 weeks or so. It's so bad that we recently acquired a new server in the hope that updated hardware and software would make this problem go away, BUT our completely new server with only Windows Server 2008, SQL Server 2008 R2, and McAfee installed was STILL showing an NP Memory leak. It was only after I completely removed McAfee that the leak stopped, and NP memory has remained static even after we set up the server with all our software in preparation to switch over to it.
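With a gradual increase like this, a rough growth rate is handy for judging how long you have before the next reboot. The sketch below fits a straight line through (timestamp, value) samples and projects when a limit would be reached. The function names, the limit, and the demo figures are assumptions for illustration, and a real leak may well not be linear.

```python
from datetime import datetime, timedelta

def growth_per_day(samples):
    """Least-squares slope of value against elapsed days."""
    start = samples[0][0]
    xs = [(when - start).total_seconds() / 86400 for when, _ in samples]
    ys = [value for _, value in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def days_until(samples, limit):
    """Days until `limit` is reached at the fitted rate, counted from
    the last sample; None if usage is not growing."""
    slope = growth_per_day(samples)
    if slope <= 0:
        return None
    return (limit - samples[-1][1]) / slope

if __name__ == "__main__":
    # Made-up daily readings in MB: 100 MB growing by 5 MB per day.
    start = datetime(2012, 1, 1)
    readings = [(start + timedelta(days=d), 100 + 5 * d) for d in range(10)]
    print(growth_per_day(readings), days_until(readings, limit=200))
```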
I have since read, and I don't know if this is true, that the problem isn't with McAfee, but with some Windows routine that McAfee uses that causes NP Memory to leak. Apparently, network activity is the cause of the leak i.e. more network activity => bigger leaks. This does seem to be consistent with our experience, in that the leak has gotten worse as our server has gotten busier.
According to what I found through Google, RAM shortages can also cause this, among other things. One error condition that came up while I was finding some of the links below was low memory, where the base operating system had little RAM available to it. My guess is that the same type of problem could easily be recreated in a virtual environment that is starved for RAM.
A more fundamental troubleshooting question is quite simply - what is different with your production environment?
Have you tested the application in Windows 2003 x64 or Windows 2008?
On to the second part of your question.
The following tools can be used for troubleshooting and fixing Winsock errors.
Sniffers:
http://www.wireshark.org/
Shims:
http://www.sstinc.com/winsock.html
http://www.win-tech.com/html/socktspy.htm
General-purpose tools to track system status and resources:
http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
http://technet.microsoft.com/en-us/sysinternals/bb896645
Tools to detect API calls:
http://www.apimonitor.com/
http://www.nektra.com/products/spystudio-api-monitor/
Debuggers:
http://www.ollydbg.de/
http://www.immunitysec.com/products-immdbg.shtml
Reversing tools or decompilers:
http://www.hex-rays.com/products/ida/index.shtml
http://www.hex-rays.com/products/decompiler/index.shtml
Your standard IDE and compiler:
http://www.microsoft.com/visualstudio/en-us
Here is a list of other tools:
http://www.sockets.com/devtools.htm
Other references found:
https://stackoverflow.com/questions/8118870/howto-debug-winsock-api-calls
http://brandon.fuller.name/archives/2007/01/24/19.44.29/
http://tangentsoft.net/ <---- Probably the best one
Isn't it this?
AfdP tag problem -> http://support.microsoft.com/kb/917114/
AfdB tag problem -> http://support.microsoft.com/kb/931311