Assuming it's a real 5520 and not a converted 5510 being passed off as one (yep, a bunch of those were floating around; you can tell because it only has a single RAM slot), you'll want PC3200 running at 400MT/s with a CAS latency (CL) of 3, unbuffered, ECC.
I'm not sure whether the ASAs are also picky about chip arrangement, so if you want to play it super safe, make sure the RAM you buy has the memory spread across 8 chips.
TCAM is a type of memory that takes 10-12 transistors to store a single bit. By way of comparison, Static RAM (SRAM) takes only 6 transistors per bit, and Dynamic RAM (DRAM) takes one transistor and a capacitor. All of these memory types can be either internal or external to an ASIC. One reason to put the memories on the chip is that they can be run at higher clock rates than when they are external to it. Why choose one type of memory over another? It comes down to the characteristics of each: SRAM can be accessed every clock; DRAM requires periodic refresh, so it cannot be accessed every clock; and TCAM gives you ternary capability, meaning each stored bit can be 0, 1, or "don't care".
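To make "ternary capability" concrete, here is a minimal software sketch of a ternary match. It is only a model: the entry layout (value, mask, result), the function name `tcam_lookup`, and the rule names are made up for illustration, and a real TCAM compares the key against every entry in parallel in a single access rather than looping.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (value, mask, result).
# A mask bit of 1 means "this bit must match"; a mask bit of 0 means
# "don't care", which is the third (ternary) state.
Entry = Tuple[int, int, str]

def tcam_lookup(table: List[Entry], key: int) -> Optional[str]:
    """Return the result of the first (highest-priority) matching entry."""
    # Hardware checks all entries at once; this loop only models the outcome.
    for value, mask, result in table:
        if (key & mask) == (value & mask):
            return result
    return None

table = [
    (0b1010_0000, 0b1111_0000, "rule-A"),   # matches 1010xxxx
    (0b1000_0000, 0b1100_0000, "rule-B"),   # matches 10xxxxxx
    (0b0000_0000, 0b0000_0000, "default"),  # matches anything
]

print(tcam_lookup(table, 0b1010_1111))  # rule-A
print(tcam_lookup(table, 0b1001_0001))  # rule-B
print(tcam_lookup(table, 0b0111_0000))  # default
```

Entry order stands in for priority here, which is why the most specific rule is listed first.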
TCAMs are scalable as long as you have space on the chip to instantiate them, or pins on the package to connect to external ones. The issue with TCAM is that it takes 2x the space of SRAM and 12x the space of DRAM. It does not always make sense to use TCAM for operations that you can do algorithmically (hashes, various kinds of tries) with other memory types. Which one to choose comes down to a tradeoff between how effectively the algorithm utilizes memory and how much space is available on the chip. TCAM power consumption grows linearly with size. The majority of large TCAMs (greater than 2M entries) now use algorithmic techniques to achieve power savings.
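As a rough illustration of the algorithmic alternative, here is a minimal sketch of a binary trie doing a longest-prefix match in ordinary RAM. The class names and the bit-string key format are assumptions made for the example; real implementations use compressed or multibit tries rather than one node per bit.

```python
from typing import Dict, Optional

class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.next_hop: Optional[str] = None  # set only where a prefix ends

class PrefixTrie:
    """One node per bit; prefixes are given as strings of '0' and '1'."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, prefix_bits: str, next_hop: str) -> None:
        node = self.root
        for bit in prefix_bits:
            node = node.children.setdefault(bit, TrieNode())
        node.next_hop = next_hop

    def lookup(self, addr_bits: str) -> Optional[str]:
        """Walk the address bit by bit, remembering the longest match seen."""
        node = self.root
        best = node.next_hop
        for bit in addr_bits:
            node = node.children.get(bit)
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best

trie = PrefixTrie()
trie.insert("10", "hop-B")
trie.insert("1010", "hop-A")

print(trie.lookup("10101111"))  # hop-A
print(trie.lookup("10010001"))  # hop-B
print(trie.lookup("01110000"))  # None
```

Each step down the trie is a separate memory read, which is the cost you trade against TCAM's single-access match.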
NAT/PAT is a complex feature that generally needs a CPU or Network Processor (NPU) to handle fixups. The general packet flow for NAT is that the first packet goes to the CPU/NPU, which installs an entry in the flow table or ACL table describing how to translate subsequent packets in the flow. There are many different forms of NAT/PAT, and just as many ways to optimize each one in a chip. The simplest form of NAT just rewrites the IP addresses and does not worry about breaking addresses embedded in the payload, i.e. no fixups.
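The first-packet-to-CPU pattern can be sketched roughly as follows. This is a toy model, not how any particular ASIC or ASA does it: the class `NatSketch`, the port allocation scheme, and the `slow_path_hits` counter are all invented for illustration, and it ignores payload fixups, timeouts, and port exhaustion.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, int]      # (inside source IP, inside source port)
Translation = Tuple[str, int]  # (outside source IP, outside source port)

class NatSketch:
    def __init__(self, outside_ip: str, first_port: int = 20000) -> None:
        self.outside_ip = outside_ip
        self.next_port = first_port
        self.flow_table: Dict[FlowKey, Translation] = {}
        self.slow_path_hits = 0  # how many packets were punted to the "CPU"

    def _slow_path(self, key: FlowKey) -> Translation:
        """Models the CPU/NPU: pick a translation and install a flow entry."""
        self.slow_path_hits += 1
        translation = (self.outside_ip, self.next_port)
        self.next_port += 1
        self.flow_table[key] = translation
        return translation

    def translate(self, src_ip: str, src_port: int) -> Translation:
        """Models the fast path: use the flow entry if one exists."""
        key = (src_ip, src_port)
        translation = self.flow_table.get(key)
        if translation is None:
            translation = self._slow_path(key)
        return translation

nat = NatSketch("203.0.113.1")
print(nat.translate("10.0.0.5", 40000))  # ('203.0.113.1', 20000), via slow path
print(nat.translate("10.0.0.5", 40000))  # ('203.0.113.1', 20000), via flow table
print(nat.slow_path_hits)                # 1
```

Only the first packet of the flow touches the slow path; every later packet is a plain table lookup plus a header rewrite.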
There is another version of BRKARC-3466, presented at Cisco Live 2013 in Melbourne, that covers some of the high-level ideas behind lookups that are missing from the 2013 Orlando one. A good reference book in this area is Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices by George Varghese.
The best case for a search in RAM is that the data is stored in a hash table: you spend the cycles to calculate the hash, then you go to that location in the table and read the value.
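Those steps (compute the hash, go to that slot, read the value) look roughly like this in software. It is a minimal sketch with chaining for collisions; the class name, the bucket count, and the MAC-to-port example data are assumptions for illustration only.

```python
from typing import List, Optional, Tuple

class HashTable:
    def __init__(self, num_buckets: int = 16) -> None:
        self.buckets: List[List[Tuple[str, str]]] = [[] for _ in range(num_buckets)]

    def _index(self, key: str) -> int:
        # Step 1: spend cycles computing the hash and reduce it to a slot.
        return hash(key) % len(self.buckets)

    def insert(self, key: str, value: str) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def lookup(self, key: str) -> Optional[str]:
        # Step 2: go to that slot. Step 3: read and compare what is stored.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

mac_table = HashTable()
mac_table.insert("00:11:22:33:44:55", "Gi0/1")
mac_table.insert("66:77:88:99:aa:bb", "Gi0/2")

print(mac_table.lookup("00:11:22:33:44:55"))  # Gi0/1
print(mac_table.lookup("de:ad:be:ef:00:00"))  # None
```

In the best case the bucket holds one entry and the lookup is a single read; collisions add extra reads and comparisons on top of the hash calculation.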
There are other RAM storage methods, but a full discussion of data structures and searching methods is beyond the scope of this site.