Java UDP hole punching example – connecting through firewall

hole-punchingjavanetworkingstun

Lets say I have two computers.

They know each others public and private IPs via ice4j.

One client listening and the other one sending some string.

I'd like to see this happen via UPD hole punching:

Let A be the client requesting the connection

Let B be the client that is responding to the request

Let S be the ice4j STUN server that they contact to initiate the connection
--
A sends a connection request to S

S responds with B's IP and port info, and sends A's IP and port info to B

A sends a UDP packet to B, which B's router firewall drops but it still
punches a hole in A's own firewall where B can connect

B sends a UDP packet to A, that both punches a hole in their own firewall,
and reaches A through the hole that they punched in their own firewall

A and B can now communicate through their established connection without 
the help of S

Could any one post pseudo examples of how to go about doing hole punching through symmetric NAT? Assuming there will be server S that will help to guess the port numbers and establish connection between the client A and B.

It would be nice if you accounted for double NAT as well.

NOTE:

You can use STUN to discover the IP and Port but you have to write your own code that would send the IP:Port to your server via keepalive technique.

Once one client identifies the other via unique ID on the server it will be provided with the other's client IP:port info to UDP hole punch the data it needs to send and receive.

Little update:

There is library that is showing up on the horizon for java check it out:
https://github.com/htwg/UCE#readme

Best Answer

This example is in C#, not in Java, but the concepts of NAT traversal are language-agnostic.

See Michael Lidgren's network library which has NAT traversal built in.

Link: http://code.google.com/p/lidgren-network-gen3/ Specific C# File Dealing with NAT Traversal: http://code.google.com/p/lidgren-network-gen3/source/browse/trunk/Lidgren.Network/NetNatIntroduction.cs

The process you've posted is correct. It will work, for only 3 out of 4 general types of NAT devices (I say general because NAT behavior isn't really standardized): Full-Cone NATs, Restricted-Cone NATs, and Port-Restricted-Cone NATs. NAT traversal will not work with Symmetric NATs, which are found mostly in corporate networks for enhanced security. If one party uses a Symmetric NAT and the other party doesn't, it's still possible to traverse the NAT but it requires more guesswork. A Symmetric NAT to Symmetric NAT traversal is extremely difficult - you can read a paper about it here.

But really, the process you've described works exactly. I've implemented it for my own remote screen sharing program (also in C#, unfortunately). Just make sure you've disabled Windows firewall (if you're using Windows) and third-party firewalls. But yes, I can happily confirm that it will work.

Clarifying the Process of NAT Traversal

I'm writing this update to clarify the process of NAT traversal for you and future readers. Hopefully, this can be a clear summary of the history and the process.

Some Reference Sources: http://think-like-a-computer.com/2011/09/16/types-of-nat/, and http://en.wikipedia.org/wiki/Network_address_translation, http://en.wikipedia.org/wiki/IPv4, http://en.wikipedia.org/wiki/IPv4_address_exhaustion.

IPv4 addresses, with the capacity to uniquely name approximately 4.3 billion computers, have run out. Smart people foresaw this problem, and, among other reasons, invented routers to combat IPv4 address exhaustion, by assigning a network of computers connected to itself 1 shared IP address.

There are LAN IPs. And then there are WAN IPs. LAN IPs are Local Area Network IPs which uniquely identify computers in a local network, say the desktops, laptops, printers, and smartphones connected to a home router. WAN IPs uniquely identify computers outside of the local area network in a wide area network - commonly taken to mean The Internet. So these routers assign a group of computers 1 WAN IP. Each computer still has its own LAN IP. LAN IPs are what you see when you type ipconfig in your Command Prompt and get IPv4 Address . . . . . . . . 192.168.1.101. WAN IPs are what you see when you connect to cmyip.com and get 128.120.196.204.

Just as the radio spectrum is bought out, so entire IP ranges are bought out and reserved as well by agencies and organizations, as well as port numbers. The short message is, again, that we don't have any more IPv4 addresses to spare.

What does this have to do with NAT traversal? Well, since routers were invented, direct connections (end-to-end connectivity) have been somewhat ... impossible, without a few hacks. If you have a network of 2 computers (Computer A and Computer B) both sharing the WAN IP of 128.120.196.204, to which computer does a connection go? I'm talking about an external computer (say google.com) initiating a connection to 128.120.196.204. The answer is: nobody knows, and neither does the router, which is why the router drops the connection. If Computer A initiates a connection to, say, google.com, then that's a different story. The router then remembers that Computer A with LAN IP 192.168.1.101 intiated a connection to 74.125.227.64 (google.com). As Computer A's request packet leaves the router, the router actually re-writes LAN IP 192.168.1.101 to the router's WAN IP of 128.120.196.204. So, when google.com receives Computer A's request packet, it sees the sender IP that the router re-wrote, not the LAN IP of Computer A (google.com sees 128.120.196.204 as the IP to reply to). When google.com finally replies, the packet reaches the router, the router remembers (it has a state table) that it was expecting a reply from google.com, and it appropriately forwards the packet to Computer A.

In other words, your router has no problem when you initiate the connection - your router will remember to forward the replying packet back to your computer (through that whole process described above). But, when an external server initiates a connection to you, the router can't know which computer the connection was meant for, since Computer A and Computer B both share the WAN IP of 128.120.196.204 ... unless, there's a clear rule that instructs the router to forward all packets originally going to destination port X, now to go to Computer A, destination port Y. This is known as port-forwarding. Unfortunately, if you're thinking of using port-forwarding for your networking applications, it's not practical, as your users may not understand how to enable it, and may be reluctant to enable it if they think it's a security risk. UPnP simply refers to the technology that allows you to programatically enable port-forwarding. Unfortunately, if you're thinking of using UPnP to port-forward your networking applications, it's not practical either, as UPnP is not always available, and when it is, it may not turned on by default.

So what's the solution then? The solution is to either proxy your entire traffic over your own computer (which you have carefully pre-configured to be globally reachable), or to come up with a way to beat the system. The first solution is (I believe) called TURN, and magically solves all connectivity issues at the price of providing a farm of servers with the available bandwidth. The second solution is called NAT traversal, and it's what we'll be exploring next.

Earlier, I described the process of an external server (say google.com) initiating a connection to 128.120.196.204. I said that, without the router having specific rules to understand which computer to forward google's connection request to, the router would simply drop the connection. This was a generalized scenario, and is not accurate because there are different types of NATs. (Note: A router is the actual physical device that you can drop on the floor. NAT (Network Address Translation) is a software process programmed into the router which helps save IPv4 addresses like trees). So, depending on which NAT the router employs, connection scenarios vary. A router may even combine NAT processes.

There are four types of NATs with standardized behavior: Full-Cone NATs, Restricted-Cone NATs, Port-Restricted-Cone NATs, and Symmetric NATs. Aside from these types, there can be other types of NATs with non-standardized behavior, but it's rarer.

Note: I'm not really too familiar with NATs...it seems like there are many ways of looking at routers, and information on the internet is very spread out on this topic. Classifying NATs by full, restricted, and port-restricted cones has been somewhat deprecated, says Wikipedia? There's something called static and dynamic NATs...just a bunch of various concepts that I can't reconcile together. Nevertheless, the following model worked for my own application. You can find out more about NATs by reading the links below and above and throughout this post. I can't post more about them because I don't really understand much about them.

Hoping for some network gurus to correct/add input, so that we can all learn more about this mysterious process.

To answer your question about gathering the external IP and Port of each client:

The headers of all UDP packets are structured the same with one source IP and one source port. UDP packet headers do not contain an "internal" source IP and an "external" source IP. UDP packet headers only contain one source IP. If you want to get an "internal" and "external" source IP, you need to actually send the internal source IP as part of your payload. But it doesn't sound like you need an internal source IP and port. It sounds like you only need an external IP and port, as your question stated. Which means that your solution it to simply read the source IP and port off the packet like the fields they are.

Two scenarios below (they don't really explain anything else):

LAN Communication

Computer A has a LAN IP of 192.168.1.101. Computer B has a LAN IP of 192.168.1.102. Computer A sends a packet, from port 3000, to Computer B at port 6000. The source IP on the UDP packet will be 192.168.1.101. And that will be the only IP. "External" has no context here, because the network is purely a local area network. In this example, a wide area network (like the Internet) doesn't exist. About ports though, because I'm unsure about NATs, I'm not sure if the port inscribed on the packet will be 3000. The NAT device may re-write the packet's port from 3000 to something random like 49826. Either way, you should use whatever port inscribed on the packet to reply - it's what you're supposed to use to reply. So in this example of LAN communication, you need send only one IP - the LAN IP, because that's all that matters. You don't have to worry about the port - the router takes care of that for you. When you receive the packet, you gather the only IP and port simply by reading it off the packet.

WAN Communication

Computer A has a LAN IP, again, of 192.168.1.101. Computer B has a LAN IP, again, of 192.168.1.102. Both Computer A and Computer B will share a WAN IP of 128.120.196.204. Server S is a server, a globally reachable computer on, let's say, an Amazon EC2 server, with a WAN IP of 1.1.1.1. Server S may have a LAN IP, but it's irrelevant. Computer B is irrelevant too.

Computer A sends a packet, from port 3000, to Server S. On the way out the router, the packet's source LAN IP from Computer A gets re-written to the WAN IP of the router. The router also re-writes the source port of 300 to 32981. What does Server S see, in terms of the external IP and port? Server S sees 128.120.196.204 as the IP, not 192.168.1.101, and Server S sees 32981 as the port, not 3000. Although these aren't the original IP and ports Computer A used to send the packet, these are the correct IPs and ports to reply to. When you receive the packet, you can only know the WAN IP and rewritten port. If that's what you want (you were asking for just the external IP and port), then you're set. Otherwise, if you also wanted the internal IP of the sender, you would need to have transmitted that as normal data separate from your header.

Code:

As stated above (below To answer your question about gathering the external IP), to gather the External IP and Port of each client, you simply read them off the packet. Each datagram sent always has the source IP and source port of the sender; you don't even need a fancy custom protocol because these two fields are always included - every single UDP packet must, by definition, have these two fields.

// Java language
// Buffer for receiving incoming data
byte[] inboundDatagramBuffer = new byte[1024];
DatagramPacket inboundDatagram = new DatagramPacket(inboundDatagramBuffer, inboundDatagramBuffer.length);
// Source IP address
InetAddress sourceAddress = inboundDatagram.getAddress();
// Source port
int sourcePort = inboundDatagram.getPort();
// Actually receive the datagram
socket.receive(inboundDatagram);

Because getAddress() and getPort() can return either the destination or source port, depending on what you set it to be, on the client (sending) machine, call setAddress() and setPort() to the server (receiving) machine, and on the server (receiving) machine, call setAddress() and setPort() back to the client (sending) machine. There must be a way to do this in receive(). Please elaborate if this (getAddress() and getPort() don't return the source IP and port you expect) is your actual roadblock. This is assuming the server to be a "standard" UDP server (it's not a STUN server).

Further Update:

I read your update about "how to use STUN to take the IP and port from one client and give it to the other"? A STUN server isn't designed to exchange endpoints or perform NAT traversal. A STUN server is designed to tell you your public IP, public port, and type of NAT device (whether it's a Full-Cone NAT, Restricted-Cone NAT, or Port-Restricted Cone NAT). I'd call the middleman server responsible for exchanging endpoints and performing the actual NAT traversal the "introducer". In my personal project, I don't actually need to use STUN to perform NAT traversing. My "introducer" (the middleman server that introduces clients A and B) is a standard server listening for UDP datagrams. As both clients A and B register themselves with the introducer, the introducer reads off their public IP and port and private IP (in case they're on a LAN). The public IP is read off the datagram header, like for all standard UDP datagrams. The private IP is written as part of the datagram payload, and the introducer just reads it as part of the payload. So, about STUN's usefulness, you don't need to rely on STUN to get the public IP and public port of each of your clients - any connected socket can tell you this. I'd say STUN is useful only for determining what type of NAT device your client is under so that you know whether to perform NAT traversal (if the NAT device type is Full-Cone, Restricted, or Port-Restricted), or to perform all-out TURN traffic proxying (if the NAT device type is Symmetric).

Please elaborate on your roadblock: if you want advice on best practices for designing an application messaging protocol, and advice on reading the fields off received messages in an orderly and systematic fashion (based on the comment you posted below), could you share your current method?

Related Topic