Just as BGP attributes are packed when sending BGP UPDATEs, they are stored in memory in a rather compact format where each prefix holds only references to the attributes that apply to that route. The AS path and the communities applied to a route are typically the largest attributes, and since they are often identical across a large number of routes, a great deal of memory can be saved by storing an AS path once and having every route to which it applies reference it. The exact in-memory format varies between versions of IOS/JUNOS.
A route for the same prefix received via different eBGP sessions will likely share many attributes (especially if the eBGP sessions are to the same upstream) and so they can be "packed" quite efficiently in memory.
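To make the idea of shared attributes concrete, here is a minimal Python sketch. It is not how IOS or JUNOS actually lay out memory; `PathAttributes`, `AttributeStore` and the AS numbers are hypothetical names and values chosen for illustration only.

```python
# Sketch only: hypothetical structures, not the real IOS/JUNOS memory layout.
from typing import Dict, NamedTuple, Tuple


class PathAttributes(NamedTuple):
    as_path: Tuple[int, ...]
    communities: Tuple[str, ...]
    local_pref: int


class AttributeStore:
    """Keeps one shared copy of each distinct attribute set."""

    def __init__(self) -> None:
        self._pool: Dict[PathAttributes, PathAttributes] = {}

    def intern(self, attrs: PathAttributes) -> PathAttributes:
        # Return the already-stored object if an equal one exists,
        # otherwise store this one and return it.
        return self._pool.setdefault(attrs, attrs)

    def __len__(self) -> int:
        return len(self._pool)


store = AttributeStore()
rib: Dict[str, PathAttributes] = {}

# Three prefixes received from the same upstream carry equal attributes,
# so each RIB entry ends up referencing the same stored object.
for prefix in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"):
    attrs = PathAttributes((64500, 64501), ("64500:100",), 100)
    rib[prefix] = store.intern(attrs)
```

After the loop the RIB has three entries but the store holds a single attribute set, which is the saving described above.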
Your explanation of BGP behaviour is mostly correct. R1 has sent the content of its entire BGP RIB to R2. Once R2's eBGP session is enabled, R2 will compare every route it receives to the one in its BGP RIB (which mostly contains routes from R1). When R2 finds a prefix from its new eBGP peer that is better than the one in its BGP RIB, it will use the new eBGP route and announce it to R1. If R1 finds the route received from R2 to be better than its own route (received from its eBGP session), then R1 will withdraw the route that it announced to R2. The routers will only announce their best route, and if the best route is not from eBGP but from another iBGP neighbor, they will not announce it unless you have some route reflection going on as well. You can easily end up in a scenario where R1 and R2 each select all the routes learnt from their respective eBGP session as the best routes, which means they will always announce a full table to their iBGP buddies.
There's also something called "best external", which means you always announce your best eBGP route regardless of whether that's the one you have chosen as best path. Best external improves convergence time by not having to go through "BGP path hunting" when something breaks. Obviously it consumes more memory.
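The announce-or-not decision can be sketched in a few lines of Python. This is a heavily simplified model: real BGP best-path selection has many more tie-breakers (origin, MED, IGP metric, router ID and so on), and `Route`, `best_path` and `announce_to_ibgp` are hypothetical names, not any vendor's API.

```python
# Simplified sketch of "announce only the best route, and only if it came from eBGP".
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]
    local_pref: int
    learned_via: str  # "ebgp" or "ibgp"


def preference_key(route: Route):
    # Higher local-pref wins, then shorter AS path, then eBGP over iBGP.
    return (-route.local_pref, len(route.as_path), route.learned_via != "ebgp")


def best_path(candidates: List[Route]) -> Route:
    return min(candidates, key=preference_key)


def announce_to_ibgp(candidates: List[Route]) -> Optional[Route]:
    """Return the route to advertise to an iBGP peer, or None.

    Without route reflection, a best path that was itself learned over
    iBGP is not re-advertised to other iBGP peers.
    """
    best = best_path(candidates)
    return best if best.learned_via == "ebgp" else None


# R2's view of one prefix: the route from R1 (iBGP) and one from its new eBGP peer.
from_r1 = Route("203.0.113.0/24", (64500, 64510, 64520), 100, "ibgp")
from_ebgp = Route("203.0.113.0/24", (64501, 64520), 100, "ebgp")

r2_announcement = announce_to_ibgp([from_r1, from_ebgp])
```

Here R2 prefers the eBGP route because of its shorter AS path and announces it to R1. If the iBGP route had won instead, `announce_to_ibgp` would return `None`, and R2 would have nothing to announce for that prefix.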
Personally I don't think BGP Soft reset enhancement is much of an enhancement. RAM is cheap, so keeping the routes, exactly as received from your neighbor, in memory is quite easy, especially on a modern router. JUNOS doesn't even offer you the option to not store it - it always keeps the Adj-RIB-IN, but JUNOS was designed in the late 90s when RAM was plentiful and not in the 80s when IOS made the scene. If someone out there designing a router thinks "let's save on RAM because we have BGP soft reset enhancement", I'd like to shoot that person ;)
On the other hand, if you are running a network of old routers that are low on memory, I totally understand if you don't enable soft-reconf.
Bottom line, it's difficult to say and depends on your environment.
Sometimes, I wish they'd stop teaching the OSI model. It seems to confuse more than it helps.
When we say that layers communicate with each other, we mean the data created by a particular layer (say, transport) on host A is processed by the same layer on host B. This is a logical connection.
The actual data (in this case, the segment containing the SYN flag) is encapsulated in the network PDU (IP packet), then encapsulated in the data-link PDU (Ethernet frame), then finally transmitted on the Ethernet cable (physical layer). Host B reverses this process, decapsulating the PDU at each layer until it reaches the transport layer. The transport layer processes the SYN flag and creates a new PDU containing the SYN and ACK flags. Then it sends it to A using the same encapsulation process.
The only way data is actually sent from one host to another is via the physical wire. Layer to layer communication is just a mental construct.
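The encapsulation and decapsulation steps can be modelled with nested objects. This is a toy Python sketch, not a real network stack: the class names, addresses and ports are made up for illustration, and real headers carry far more fields.

```python
# Toy model of encapsulation: each PDU simply wraps the one from the layer above.
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Segment:  # transport-layer PDU
    src_port: int
    dst_port: int
    flags: FrozenSet[str]


@dataclass(frozen=True)
class Packet:  # network-layer PDU
    src_ip: str
    dst_ip: str
    payload: Segment


@dataclass(frozen=True)
class Frame:  # data-link PDU, the thing that goes on the wire
    src_mac: str
    dst_mac: str
    payload: Packet


def encapsulate(segment: Segment, src_ip: str, dst_ip: str,
                src_mac: str, dst_mac: str) -> Frame:
    # Transport -> network -> data link.
    return Frame(src_mac, dst_mac, Packet(src_ip, dst_ip, segment))


def decapsulate(frame: Frame) -> Segment:
    # Data link -> network -> transport.
    return frame.payload.payload


def host_b_handle(frame: Frame) -> Frame:
    """Host B: strip the headers, process the SYN, send back SYN+ACK."""
    segment = decapsulate(frame)
    if segment.flags != frozenset({"SYN"}):
        raise ValueError("this sketch only handles an initial SYN")
    reply = Segment(segment.dst_port, segment.src_port, frozenset({"SYN", "ACK"}))
    packet = frame.payload
    return encapsulate(reply, packet.dst_ip, packet.src_ip,
                       frame.dst_mac, frame.src_mac)


syn = Segment(49152, 80, frozenset({"SYN"}))
frame_a_to_b = encapsulate(syn, "192.0.2.10", "192.0.2.20",
                           "00:00:5e:00:53:01", "00:00:5e:00:53:02")
frame_b_to_a = host_b_handle(frame_a_to_b)
```

Only `Frame` objects travel between the two hosts. The transport layers see each other's segments only after the lower layers have wrapped and unwrapped them, which is what "logical connection" means here.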
There is a lot of confusion regarding what part of the OSI model is in use here. Let me see if I can help:
Remember that the OSI model is just a model. It doesn't represent anything in actual use. The TCP/IP model is a better fit for protocols in use on the Internet.
The statement "routers are just depicted as having and using the networking protocols and IP" is an oversimplification -- and that's where some of the confusion comes from. Router software has the full stack in order to run routing protocols like BGP as well as management functions (telnet, SSH, SNMP, etc.).
Is BGP a network protocol or an application protocol? The BGP process that runs on a router talks to BGP processes on other routers. BGP makes use of TCP/IP to facilitate that communication. It establishes sessions between peers and has its own messaging format and syntax. In that sense, BGP is an application that runs on a router.
BGP's purpose is to populate the forwarding table of the router. When the router makes a forwarding decision for an IP packet, it looks in the table for the next hop address, adds the layer 2 header, and transmits it out an interface. That process only involves layers 1-3. So if you're talking about how routers route, that's all you need to discuss -- and that's probably where the confusion comes from.
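As a rough illustration of that forwarding decision, here is a small Python sketch of a longest-prefix-match lookup using the standard `ipaddress` module. The table contents, next hops and interface names are invented, and a real router uses specialised data structures (tries, TCAM) rather than a linear scan.

```python
# Sketch of a forwarding lookup: pick the most specific prefix that matches.
import ipaddress
from typing import Dict, Optional, Tuple

# prefix -> (next-hop address, outgoing interface); values are hypothetical.
ForwardingTable = Dict[ipaddress.IPv4Network, Tuple[str, str]]


def lookup(table: ForwardingTable, destination: str) -> Optional[Tuple[str, str]]:
    addr = ipaddress.IPv4Address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None  # no route: the packet would be dropped
    # Longest prefix match: the most specific network wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


# Entries like these could have been installed by BGP, an IGP, or static config;
# the lookup does not care where they came from.
fib: ForwardingTable = {
    ipaddress.IPv4Network("0.0.0.0/0"): ("192.0.2.1", "eth0"),
    ipaddress.IPv4Network("198.51.100.0/24"): ("192.0.2.2", "eth1"),
    ipaddress.IPv4Network("198.51.100.128/25"): ("192.0.2.3", "eth2"),
}

next_hop, interface = lookup(fib, "198.51.100.200")
```

Nothing in `lookup` knows about TCP or BGP sessions. It consults only the table, which is why the forwarding path stays within layers 1-3 even though the process that filled the table is an application.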
In summary, the forwarding of packets involves layers 1-3. The information used to forward packets comes from many sources -- one of which could be the BGP application running on the router.