Intermittently failing PPTP VPN connections

pptp rras vpn

I'm two weeks in as the System Administrator at a midsize manufacturing company with executives and sales staff who make heavy use of remote access. From almost the day I arrived, I started getting complaints about VPN performance problems. Here's our setup: our gateway is an Astaro appliance configured with DNAT rules that forward PPTP and GRE traffic to an internal server we'll call COMPANY-VPN01. COMPANY-VPN01 is a Windows Server 2003 box configured with RRAS. Our client computers (mostly XP, but some Vista and 7) have Windows PPTP VPN connections set up to point to vpn.company.com, which resolves to the public IP of the Astaro box's external interface.

Here's the really strange part: every time users connect to the VPN, it appears to work. On the client machine, the VPN connection shows "connected" (and ipconfig shows a PPP adapter with a DHCP-issued internal IP address), and I can see the connection on COMPANY-VPN01 as well. However, more than half the time, these connections are no good: users can't access any internal resources. Since I've experienced this myself, I've tested some of the obvious things. I've tried using FQDNs of internal resources instead of NetBIOS names, tried using IP addresses, and tried pinging internal resources by name and by IP address. Nothing goes through. But if I disconnect and reconnect the VPN connection a few times, I'll eventually get a connection that works flawlessly, and it will continue to work flawlessly no matter how long I leave it connected. So, whatever the problem is, it's a) intermittent and b) occurs when the VPN connection is being set up, because if you get a "good" connection, it remains good.

Does anyone have any ideas about what might be going on here, or suggestions for things we might troubleshoot?

Best Answer

I guess I shouldn't just leave this unanswered forever when I have in fact solved the problem.

When I was hired, I was told that the LAN was "running out of scope". The network had a 24-bit subnet mask and used DHCP for client addressing. The DHCP server wasn't set to check DNS before handing out addresses. DNS wasn't set to scavenge old records. RRAS was set to get addresses for VPN clients from DHCP, and by default it grabs them in blocks of 10. So it would grab a block of 10 and start handing those addresses out to VPN clients, but as our address space ran out, some of them were "good" and some were already registered to other clients in DNS. That matches the symptom in the question: whether a connection worked depended on which address the client happened to be given when it connected. I changed the subnet mask from 24 bits to 22 bits, which quadrupled the address space, and the problem was solved.
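
For anyone who wants to see how much room that change buys, here is a minimal Python sketch of the arithmetic. The 192.168.0.0 network is purely an assumption for illustration (the actual address range isn't stated above), and `usable_hosts` is a hypothetical helper name.

```python
import ipaddress


def usable_hosts(network: str) -> int:
    """Return the number of assignable host addresses in an IPv4 subnet."""
    net = ipaddress.ip_network(network, strict=True)
    # Subtract the network address and the broadcast address.
    return net.num_addresses - 2


# Assumed example range; substitute the real LAN network.
old = ipaddress.ip_network("192.168.0.0/24")
# Widen the same network from a 24-bit to a 22-bit mask.
new = old.supernet(new_prefix=22)

for net in (old, new):
    print(net, net.netmask, usable_hosts(str(net)))
```

Under that assumption, the 24-bit network has 254 usable addresses and the 22-bit network has 1022. Every host, the DHCP scope, and the router interfaces on the LAN need the new mask for this to work consistently.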