When the internet pioneers designed the network architecture and protocols still in use today (in the early days, before the mid-1990s), one of the primary uses they had in mind was end-to-end connectivity between any two hosts connected to the internet.
The drastic expansion of the network made public IP addresses a valuable resource. End-to-end connectivity also had a security problem, because it exposes your computer or device almost directly to potential abuse. For these reasons NAT (network address translation) devices became popular in the mid-1990s, and today they are common in our homes and offices.
NAT devices (routers) let multiple devices on a local network share a single IP address, making internet access easy to distribute to any computer or device in a home or office network.
NAT also provides a basic shield for the local network against possible outside attacks, because it does not let any traffic from outside reach a computer or device on the local network unless that device initiates the connection by sending a request to a remote service. The NAT keeps a record of outgoing requests and lets data from a remote service back in only if a local client initiated the transfer. Routers commonly also have a built-in firewall that further increases the protection the router provides.
This is great, but what about end-to-end connectivity? The original concept had flaws that made it impossible to use without consequences, but no doubt we need it, and it is irreplaceable for some kinds of communication.
Client-controlled port mapping on NAT devices
Utilities to enable end-to-end connectivity were added to NAT (router) devices: protocols that let client applications request a port mapping from the router, so that all data targeting the mapped port reaches the client that requested the mapping. These protocols are:
UPnP - the most common; port mappings are created using the Universal Plug and Play protocol (based on XML messages)
NAT-PMP (NAT Port Mapping Protocol) - an older protocol you will rarely find in your router. It saw intensive use in some large airline companies.
PCP (Port Control Protocol) - proposed by Apple; the latest of the three to be adopted as an RFC standard. It is compatible with NAT-PMP. You can commonly find it in Apple's newer NAT devices.
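To give a feeling for what a client-requested port mapping looks like on the wire, here is a minimal sketch of the NAT-PMP mapping messages, assuming the layout from RFC 6886 (12-byte request, 16-byte response, gateway listening on UDP port 5351). The function names are hypothetical, and the sketch only packs and unpacks bytes; finding your gateway address and sending the datagram are left out.

```python
import struct

NATPMP_PORT = 5351              # UDP port the gateway listens on (RFC 6886)
OP_MAP_UDP, OP_MAP_TCP = 1, 2   # mapping opcodes


def build_map_request(internal_port, suggested_external_port,
                      lifetime_s=3600, tcp=False):
    """Pack a 12-byte NAT-PMP mapping request (protocol version 0)."""
    opcode = OP_MAP_TCP if tcp else OP_MAP_UDP
    # Fields: version, opcode, reserved, internal port,
    # suggested external port, requested lifetime in seconds.
    return struct.pack("!BBHHHI", 0, opcode, 0,
                       internal_port, suggested_external_port, lifetime_s)


def parse_map_response(data):
    """Unpack a 16-byte mapping response into a dict."""
    if len(data) < 16:
        raise ValueError("response too short")
    (_version, opcode, result, _epoch,
     internal, external, lifetime) = struct.unpack("!BBHIHHI", data[:16])
    if opcode < 128:
        raise ValueError("not a response (opcode below 128)")
    return {"result": result, "internal_port": internal,
            "external_port": external, "lifetime": lifetime}
```

A result code of 0 means the router accepted the mapping; the external port in the response may differ from the suggested one, so the client should always use the value the router returned.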
At present (2013) you will most likely find UPnP in your router settings, unfortunately disabled by default, because when enabled it can be abused by viruses and trojans already present on your computer. You are expected to enable one of these protocols only if you need it, for example when you want to play an online multiplayer game.
These solutions also make sense only if your router has a public IP, which is becoming rare these days because internet providers tend to share one public IP between several users. It is also not rare that they use devices and routing software that can fool your WAN device into believing it is on a public IP.
So if you design software that relies only on these methods to create direct connections, in the best case maybe 10% of your users will be able to use it, and that only if you explicitly tell them to enable the particular protocol. This holds for devices accessing the internet in general. If your users are people behind home routers who play some multiplayer online game on a computer, they are more likely to have UPnP enabled, because at least some friend will have helped them configure it, so maybe you will reach 20% usability in that case.
Traversal using intentional NAT table manipulation
So we see that port mapping protocols can be used only in a small number of cases. What can we do now? We can trick our NAT device into creating a mapping by sending a packet to the remote client and then instructing the remote client to send us a packet that looks like a response (with matching source address and port). If all goes well and the packet matches a record in the NAT mapping table (we are really lucky then), the packet from the remote client will reach us. This is the oldest known technique of NAT traversal by intentional NAT table manipulation, referred to as "UDP hole punching". Earlier this technique was really usable and had a great success rate. A TCP connection could even be created afterwards using the same ports, and even TCP hole punching was fairly successful.
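The basic UDP hole punch described above can be sketched in a few lines. This is a minimal illustration, not a production routine: the function name `punch`, the payload and the retry parameters are assumptions, and it presumes both peers have already learned each other's external address and port through some rendezvous channel and run the same loop at roughly the same time.

```python
import socket


def punch(sock, peer_addr, attempts=5, timeout=0.5, payload=b"punch"):
    """Send probes to peer_addr until a datagram from that address arrives.

    Returns True if a packet from the peer got through, False otherwise.
    """
    sock.settimeout(timeout)
    for _ in range(attempts):
        # Each outgoing probe creates (or refreshes) the NAT table entry
        # that the peer's packet has to match.
        sock.sendto(payload, peer_addr)
        try:
            _data, addr = sock.recvfrom(2048)
        except OSError:
            # Timeout, or an ICMP error reported by the OS: just retry.
            continue
        if addr == peer_addr:
            return True
    return False
```

The same UDP socket must be kept for the later data transfer, because the NAT mapping belongs to its local port. On a symmetric NAT this plain version will usually fail, since the external port towards the peer differs from the one learned earlier.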
UDP hole punching became even more usable when the STUN technique was introduced (RFC 3489). STUN is used to learn whether a computer is behind NAT, how the NAT behaves, and which external ports the router holding the public IP has mapped. (Every NAT device can change a packet's source port; the change is recorded in a table so the device knows how to rewrite the port on response packets.) RFC 3489 describes four observable classes of NAT behavior:
- A full cone NAT is one where all requests from the same internal IP address and port are
mapped to the same external IP address and port. Furthermore, any external host can send
a packet to the internal host, by sending a packet to the mapped external address.
- A restricted cone NAT is one where all requests from the same internal IP address and
port are mapped to the same external IP address and port. Unlike a full cone NAT, an external
host (with IP address X) can send a packet to the internal host only if the internal host
had previously sent a packet to IP address X.
- A port restricted cone NAT is like a restricted cone NAT, but the restriction
includes port numbers. Specifically, an external host can send a packet, with source IP
address X and source port P, to the internal host only if the internal host had previously
sent a packet to IP address X and port P.
- A symmetric NAT is one where all requests from the same internal IP address and port,
to a specific destination IP address and port, are mapped to the same external IP address and
port. If the same host sends a packet with the same source address and port, but to
a different destination, a different mapping is used. Furthermore, only the external host that
receives a packet can send a UDP packet back to the internal host.
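The four definitions above differ only in how the mapping is keyed and in which remote hosts are let back in. The following toy model makes that explicit. It is a simulation for illustration only, with hypothetical class and method names; it is not how any particular router is implemented.

```python
class SimulatedNat:
    """Toy model of the four NAT classes listed above (not a real NAT)."""

    KINDS = ("full_cone", "restricted", "port_restricted", "symmetric")

    def __init__(self, kind, first_port=40000):
        if kind not in self.KINDS:
            raise ValueError("unknown NAT kind: %r" % (kind,))
        self.kind = kind
        self.next_port = first_port
        self.mappings = {}    # mapping key -> external port
        self.contacted = {}   # external port -> set of (ip, port) contacted

    def outbound(self, internal, remote):
        """Record an outgoing packet and return the external port used."""
        # A symmetric NAT keys the mapping on the destination as well,
        # so every new destination gets a new external port.
        key = (internal, remote) if self.kind == "symmetric" else internal
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.contacted.setdefault(port, set()).add(remote)
        return port

    def inbound_allowed(self, external_port, remote):
        """Would a packet from `remote` to `external_port` be let in?"""
        seen = self.contacted.get(external_port)
        if seen is None:
            return False                  # no mapping exists at all
        if self.kind == "full_cone":
            return True                   # anyone may use the mapping
        if self.kind == "restricted":
            return remote[0] in {ip for ip, _port in seen}
        return remote in seen             # port_restricted and symmetric
```

For example, with `nat = SimulatedNat("restricted")`, after `nat.outbound(("192.168.1.10", 5000), ("1.1.1.1", 3478))` a packet from any port of 1.1.1.1 is accepted on the returned external port, while a packet from 2.2.2.2 is not.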
Since router (NAT) design is not standardized in these terms, it quickly became clear that this classification is not reliable, because it can lead us the wrong way. The classification was probably "more valid" at the time it was invented, because it is based on empirical conclusions, but it eventually became outdated due to new NAT designs. Commonly, when you run a STUN testing client on a computer behind NAT, you can get four different results for the classification of the same router. So this simply cannot be taken as valid information. What you can use is the fact that if you get any of the above four results, you can be sure a NAT is present. The mapped ports you get from the responses are also usable, because they tell you the most probable range of values the next mapping will take. (Note that even if there is a NAT, very rarely a STUN query may tell you that you are on the open internet with a public IP.)
Why did these techniques become outdated? Unfortunately, security administrators and we IT engineers developing NAT traversal software are in a constant struggle. Basically both sides are right and wrong. They claim NAT traversal is used only by crackers and pirates; we claim NAT traversal is simply sometimes needed and that the security level does not degrade, because NAT traversal picks some random port for communication and carries application-specific data. So even if an abuser manages to guess one of 65536 ports, his data will land in an application process that will throw an exception because of the invalid data or simply break. So, speaking about security, transferring data over direct channels is far more secure than using an intermediate server. Communicating with servers is less secure than communicating with a host directly, because servers are well known and are a target of crackers' attacks. Client-server communication is also commonly based on well-known protocols, which makes it suitable for injecting arbitrary code. Besides all that, you can never be sure that someone at the cloud hosting company does not peek at your data.
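To show where the mapped address and port come from, here is a minimal sketch of a STUN Binding Request and of extracting the mapped address from a Binding Success response. It assumes the newer RFC 5389 message layout (magic cookie plus 12-byte transaction ID), which remains wire compatible with RFC 3489 servers, and it handles IPv4 only. The function names are hypothetical, and sending the request to an actual STUN server is left out.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST, BINDING_SUCCESS = 0x0001, 0x0101
ATTR_MAPPED_ADDRESS, ATTR_XOR_MAPPED_ADDRESS = 0x0001, 0x0020


def build_binding_request():
    """Return (packet, transaction_id) for a STUN Binding Request."""
    txn_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + txn_id, txn_id


def parse_mapped_address(packet, txn_id):
    """Return the (ip, port) the server saw, or None if not found."""
    if len(packet) < 20:
        return None
    msg_type, length, _cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or packet[8:20] != txn_id:
        return None
    pos, end = 20, min(20 + length, len(packet))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        value = packet[pos + 4:pos + 4 + attr_len]
        is_addr = attr_type in (ATTR_MAPPED_ADDRESS, ATTR_XOR_MAPPED_ADDRESS)
        if is_addr and len(value) >= 8 and value[1] == 0x01:  # 0x01 = IPv4
            port, ip_int = struct.unpack("!HI", value[2:8])
            if attr_type == ATTR_XOR_MAPPED_ADDRESS:
                # The XOR variant hides the address from NATs that
                # rewrite IP addresses found inside payloads.
                port ^= MAGIC_COOKIE >> 16
                ip_int ^= MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", ip_int)), port
        pos += 4 + attr_len + (-attr_len % 4)  # values are padded to 4 bytes
    return None
```

Sending the same request from one local socket to two different servers (or two ports of one server) and comparing the returned ports is a simple way to see whether the NAT keeps one mapping or allocates a new one per destination.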
To return to our story: later NAT devices and networks are not so friendly to NAT traversal, probably because some engineers designing routers recognized it as a security threat. TCP traversal is almost impossible unless you can use raw sockets, which is impractical because most operating systems enforce strict security rules for their use or do not support them at all. TCP uses a three-way handshake involving sequence numbers, acknowledgment numbers and flags, and in most cases you need to match all of them to trick your NAT device, not to mention the possibility that an ICMP "Destination Port Unreachable" packet resets your attempt. Basic UDP hole punching will work in a small number of cases. Usually, if you have a router from a quality company like Cisco (Linksys) or NETGEAR, the chances of success are greater, because the engineers who design their devices are better and probably recognize the need, so they design their devices properly. For example, STUN and basic UDP hole punching will be enough to traverse a Linksys router's NAT. Linksys will preserve the original source port if possible or take some nearby value, so NAT traversal on such a quality router is fairly easy.
A modern method of NAT traversal by intentional NAT table manipulation should involve prediction of the next external ports, precise packet TTL manipulation (the key factor in cases of symmetric and port restricted cone NAT), multiple retries and side swapping. As we already said, NAT behavior is not standardized, so designing a good "piercing" method involves a lot of testing on different NAT combinations between peers, so that a good routine can be designed based on empirical conclusions.
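One simple way to approach next-port prediction is to look at the external ports observed in a few consecutive STUN queries and assume the NAT keeps allocating with the same step. The sketch below is only such a heuristic with a hypothetical function name; real NATs may allocate randomly, in which case no prediction of this kind helps.

```python
from collections import Counter


def predict_next_ports(observed, count=5):
    """Guess upcoming external ports from ports observed in order."""
    if not observed:
        return []
    if len(observed) == 1:
        step = 1  # assumption: with a single sample, guess sequential
    else:
        # Differences between consecutive mappings, wrapped to 16 bits
        # so that decreasing allocation is handled as well.
        deltas = [(b - a) % 65536 for a, b in zip(observed, observed[1:])]
        step = Counter(deltas).most_common(1)[0][0]
    last = observed[-1]
    if step == 0:
        return [last]  # the NAT reuses the same port (cone-like mapping)
    candidates = [(last + i * step) % 65536 for i in range(1, count + 1)]
    return [port for port in candidates if port != 0]
```

The peer would then send its probes to all returned candidates rather than to a single port, which is one reason multiple retries are part of the method.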
Final NAT traversal solution
An industrial-standard NAT traversal solution should be able to apply all the methods mentioned above. It should inspect the network environment of both peers and decide which method of NAT traversal to apply. If one method fails, it should be able to try other methods or repeat the attempt with peer sides swapped. If none of the direct tunnel methods succeeds, relay is the last solution, one we know for sure must work because it is based on the standard client-server model. Relay is the most expensive resource in a peer-to-peer network, so better success with direct tunnel methods makes the system more flexible and cheaper to maintain. A simple calculation demonstrates this:
Let's say we have one server with a connection bandwidth of 100 Mb/s. We want to support our Video-over-IP application, which requires, let's say, 500 KB/s per peer pair = 4000 Kb/s = 4 Mb/s.
We want a quality service guaranteeing 500 KB/s for each peer-to-peer session under any conditions.
If we don't use NAT traversal, every session is relayed, so we can support 100/4 = 25 sessions at once per server.
If we use NAT traversal and 5% of all tunnels are made by relay, we can support (100/4) + 95%/5% * (100/4) = 25 + 19 * 25 = 500 sessions at once per server.
Also, if the number of users goes over this limit, the quality of our service will degrade at a 20 times slower rate on the system equipped with NAT traversal.
So the cost of a system equipped with NAT traversal would be about 20 times less than that of a pure relay system.
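The same calculation can be written as a small function, so other bandwidths or relay percentages can be tried. This is a sketch under the simplification used above, namely that each relayed session costs the server exactly the per-session bandwidth and direct sessions cost it nothing; the function name is hypothetical.

```python
def sessions_per_server(server_mbps, session_mbps, relay_fraction):
    """Concurrent sessions one relay server can back.

    relay_fraction is the share of sessions that need the relay
    (1.0 means no NAT traversal at all, every session is relayed).
    """
    if not 0 < relay_fraction <= 1:
        raise ValueError("relay_fraction must be in (0, 1]")
    relayed_capacity = server_mbps / session_mbps  # sessions the link carries
    return relayed_capacity / relay_fraction


pure_relay = sessions_per_server(100, 4, 1.0)
with_traversal = sessions_per_server(100, 4, 0.05)
print(pure_relay, with_traversal, with_traversal / pure_relay)
```

The ratio between the two results depends only on the relay fraction, so every percentage point shaved off the relay share pays off directly in server cost.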
This is just one advantage of NAT traversal equipped peer-to-peer systems.
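The fallback order described in this section (try direct methods, retry with peer sides swapped, use relay last) can be sketched as a small driver. Everything here is hypothetical: the function name, the idea that each method is a callable taking an `initiator` flag, and the convention that it returns a connection object or None.

```python
def establish_tunnel(methods, relay, swap_sides=True):
    """Try direct methods in order, then fall back to relay.

    methods: list of (name, attempt) pairs; attempt(initiator) returns
             a connection object on success or None on failure.
    relay:   callable returning a relayed connection; assumed to succeed.
    Returns (method_name, connection).
    """
    roles = (True, False) if swap_sides else (True,)
    for name, attempt in methods:
        for initiator in roles:
            # Some NAT combinations only work in one direction,
            # so each method is retried with the peer roles swapped.
            conn = attempt(initiator)
            if conn is not None:
                return name, conn
    return "relay", relay()
```

Ordering `methods` from cheapest to most expensive (for example port mapping, then basic hole punch, then hole punch with port prediction) keeps connection setup time low for the easy cases.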