First: What you describe is NAT, not firewalling. A firewall just filters what can go through; a NAT device changes addresses in packets.
You almost answer the first question yourself. Yes, a NAT device needs to keep track of every session going through it. Most communication on the internet uses TCP or UDP. Both of these protocols use port numbers. A session is defined by source address, source port number, destination address and destination port number. The NAT device needs to maintain a mapping between which numbers on the inside correspond to which numbers on the outside. And then it has to match every packet to an entry in its mapping table and adjust the packet accordingly.
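As a rough illustration of that mapping table, here is a minimal Python sketch. The class name, the method names, the starting port 40000 and all addresses are made up for the example. Real NAT devices choose ports and expire entries in their own ways, so treat this as a toy model rather than how any particular device works.

```python
from itertools import count


class SourceNat:
    """Toy source NAT keyed on the full session tuple (illustrative only)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = count(first_port)  # hypothetical port allocation strategy
        # (proto, inside ip, inside port, remote ip, remote port) -> public port
        self.out_map = {}
        # (proto, remote ip, remote port, public port) -> (inside ip, inside port)
        self.in_map = {}

    def translate_out(self, proto, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of an outbound packet, creating an entry if needed."""
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key not in self.out_map:
            public_port = next(self._ports)
            self.out_map[key] = public_port
            self.in_map[(proto, dst_ip, dst_port, public_port)] = (src_ip, src_port)
        return (proto, self.public_ip, self.out_map[key], dst_ip, dst_port)

    def translate_in(self, proto, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the destination of an inbound packet, or return None to drop it."""
        inside = self.in_map.get((proto, src_ip, src_port, dst_port))
        if inside is None or dst_ip != self.public_ip:
            return None
        return (proto, src_ip, src_port, inside[0], inside[1])


nat = SourceNat("203.0.113.1")
out = nat.translate_out("tcp", "192.168.1.10", 51000, "198.51.100.7", 443)
reply = nat.translate_in("tcp", "198.51.100.7", 443, "203.0.113.1", 40000)
stray = nat.translate_in("tcp", "198.51.100.9", 443, "203.0.113.1", 40000)
```

The reply from the remote host matches an entry and is rewritten back to the inside address and port. The packet from an unrelated host matches nothing and is dropped.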
This is also why NAT devices are less than optimal: a normal router is stateless. It doesn't need to keep track of what happened previously and it doesn't need to adjust the numbers and addresses in the packet. If a router fails, another router can take over immediately. When a NAT device fails, the device that takes over doesn't have the same mapping table, so all sessions break and have to be re-established.
Your second question is more complex. One option is to configure port forwarding in one of the NAT devices. Then you let A send a packet to the forwarded port on C. B will change the source address and port to one of its own. When the packet arrives at C, C adjusts the destination address and port so the packet is forwarded to D. Reply packets go through the same translations in the opposite direction.
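To make the two translations concrete, here is a small Python sketch that traces one packet from A through B and C to D. The function names, addresses and port numbers are all invented for the example, and the public port B picks is hard-coded rather than allocated.

```python
def source_nat(packet, public_ip, public_port):
    """B: rewrite the source address and port to one of B's own."""
    _src, dst = packet
    return ((public_ip, public_port), dst)


def port_forward(packet, forwards):
    """C: rewrite the destination using a static forwarding table."""
    src, dst = packet
    inside = forwards.get(dst)
    return None if inside is None else (src, inside)


# Made-up addresses: A and D are private hosts, B and C are the NAT devices.
A = ("192.168.1.10", 50000)
B_PUBLIC = "198.51.100.1"
C_PUBLIC = "203.0.113.1"
D = ("10.0.0.5", 22)
C_FORWARDS = {(C_PUBLIC, 2222): D}  # port 2222 on C is forwarded to D

sent = (A, (C_PUBLIC, 2222))                 # what A sends
after_b = source_nat(sent, B_PUBLIC, 40000)  # source rewritten by B
after_c = port_forward(after_b, C_FORWARDS)  # destination rewritten by C
print(after_c)
```

D sees a packet that appears to come from B's public address, and never learns A's private address.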
If there is no port forwarding then it gets more difficult. You need to have the assistance of an external server E. Both A and D have to initiate connections to E. Then E has to coordinate setting up the session between A and D. A and D both send outbound packets to trick B and C into adding entries to their mapping tables. Once those mappings are in place they can communicate directly.
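The sequence can be simulated in a few lines of Python. This is only a sketch under a loud assumption: both NAT devices reuse the same public port for an inside host regardless of the destination, and only accept inbound packets from remote endpoints the inside host has already sent to. Not every NAT behaves this way, and the technique fails against those that allocate a new port per destination. The class, its methods and all addresses are hypothetical.

```python
class Nat:
    """Toy NAT: one public port per inside endpoint, filtering by remote endpoint."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000
        self.mapping = {}     # inside (ip, port) -> public port
        self.reverse = {}     # public port -> inside (ip, port)
        self.allowed = set()  # (public port, remote endpoint) pairs seen outbound

    def outbound(self, inside, remote):
        """Record an outbound packet and return the public endpoint it leaves from."""
        if inside not in self.mapping:
            self.mapping[inside] = self.next_port
            self.reverse[self.next_port] = inside
            self.next_port += 1
        port = self.mapping[inside]
        self.allowed.add((port, remote))
        return (self.public_ip, port)

    def inbound(self, remote, public_port):
        """Return the inside endpoint for an inbound packet, or None to drop it."""
        if (public_port, remote) not in self.allowed:
            return None
        return self.reverse[public_port]


B, C = Nat("198.51.100.1"), Nat("203.0.113.1")
A, D, E = ("192.168.1.10", 5000), ("10.0.0.5", 5000), ("192.0.2.50", 3478)

a_pub = B.outbound(A, E)  # A contacts E; E learns A's public endpoint
d_pub = C.outbound(D, E)  # D contacts E; E learns D's public endpoint

before = C.inbound(a_pub, d_pub[1])  # C has no entry for A yet, so it drops

B.outbound(A, d_pub)  # E has told A where D is; A sends towards D
C.outbound(D, a_pub)  # E has told D where A is; D sends towards A

a_to_d = C.inbound(a_pub, d_pub[1])  # now accepted and delivered to D
d_to_a = B.inbound(d_pub, a_pub[1])  # now accepted and delivered to A
```

Before the two outbound packets, C drops A's traffic. Afterwards both directions are accepted without E being in the path.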
To summarise: the way things usually work is that for outbound packets you have a device that performs source NAT. It changes the source address and port of the internal device to one of its own. For inbound packets you have a device that performs destination NAT. It changes the destination address and port to what is in its mapping table. The mapping table is filled either by manual configuration, by a protocol that lets internal systems request a mapping (I haven't covered those here; look up UPnP and PCP) or automatically when the NAT device creates an entry for an outbound packet.
First, just because your two computers are connected via the Internet at two separate locations doesn't necessarily mean that you are using private addresses. That is certainly the most likely scenario with IPv4, given the IPv4 address shortage, but it is still not necessarily true. If you are running IPv6, you are probably using public IPv6 addresses.
Let's assume you are using private IPv4 addressing behind a router using NAT.
Knowing the other private address does nothing for you at all, so just take that out of the equation.
Under normal, non-hacker circumstances, the NAT routers at each end would need to have port forwarding enabled for each PC's private address, or the routers would need to be using one-to-one NAT, to enable the two PCs to communicate via the public addresses. You could also use a VPN between the two PCs to get around the NAT problem. This all assumes that there are no firewalls in place to block traffic from the Internet into the private networks.
In one-to-one NAT, each network would have multiple public addresses, each of which translates to exactly one private address, so using a public address will get you to the corresponding private address. This is also an unlikely scenario given the shortage of IPv4 addresses, but it is done in some places.
Port forwarding configures NAT so that incoming traffic sent to a router's public address on a given port number is forwarded to a particular inside private address at a given port number.
A VPN is a tunnel. Usually, traffic from one inside network is encapsulated within packets addressed to the public address of the other network and sent there, where it is decapsulated and delivered to the other inside network. This can be configured in such a way as to make the foreign network appear local to the tunnel interface of the local network.
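The encapsulation idea can be sketched in Python with packets modelled as plain dictionaries. The field names and addresses are invented for the example, and real VPNs usually also encrypt and authenticate the inner packet, which is left out here.

```python
def encapsulate(inner, local_public, remote_public):
    """Wrap an inside packet in an outer packet between the two public addresses."""
    return {"src": local_public, "dst": remote_public, "payload": inner}


def decapsulate(outer):
    """Strip the outer header at the far end and recover the inside packet."""
    return outer["payload"]


# Made-up addresses: private hosts on each side, public addresses on the routers.
inner = {"src": "192.168.1.10", "dst": "10.0.0.5", "data": b"hello"}
outer = encapsulate(inner, "198.51.100.1", "203.0.113.1")
delivered = decapsulate(outer)
```

Devices between the two networks only ever see the outer public addresses. The private addresses travel inside the payload and come out unchanged at the far end.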
A firewall on either end, or anywhere along the path, may be configured to block any or all of these methods.
That is incorrect. Routing protocols are only one of three ways routers populate their routing tables. The other two are directly connected networks and statically configured routes.
Routing protocols are used to exchange routes between routers, but they do not route the packets.
That's what routers do. Your router inherently knows about both networks because they are directly connected, so it will populate its routing table with both networks, and it will default to routing packets between the networks.
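As a small illustration, here is a Python sketch of a routing table that contains only directly connected networks, with a longest-prefix lookup. The interface names and prefixes are hypothetical, and a real router's forwarding logic is considerably more involved.

```python
import ipaddress

# Hypothetical router with two directly connected networks and no routing protocol.
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth0", "connected"),
    (ipaddress.ip_network("192.168.2.0/24"), "eth1", "connected"),
]


def lookup(destination, routes=ROUTES):
    """Return (interface, route source) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [route for route in routes if addr in route[0]]
    if not matches:
        return None  # no route: the packet cannot be forwarded
    _net, iface, source = max(matches, key=lambda route: route[0].prefixlen)
    return (iface, source)


print(lookup("192.168.2.7"))
print(lookup("8.8.8.8"))
```

A packet arriving on eth0 for 192.168.2.7 is sent out of eth1 purely because that network is connected. A destination outside both networks has no matching entry until a static route or a routing protocol supplies one.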