Is it possible to build a CPU from content-addressable memory?

chip-design, cpu, design, microprocessor

Most modern scripting languages use objects rather than C/C++-style structs/classes.
In C/C++ every data member has a fixed size and offset, so addressing a struct member is basically a base memory address plus an offset. But scripting languages such as JavaScript and many others instead use an object ID that refers to a collection of key/value pairs, and a value in such a pair can itself be another object ID.

So today's hardware memory is addressed either directly, with a base address plus offset, or indirectly, through a lookup table where key/value pairs are stored.

A memory address can contain data or machine code that is executable by a CPU.

So is it possible to build hardware that contains only content-addressable memory (CAM)?

I can imagine the following:

Data:

id: 23
  -key: 123, value: 7
  -key: 456, value: 8
  -key: 789, value: 0

This is a scheme of a content-addressable memory block. The block can be written with a SetValue(id, key, value) opcode and read with a GetValue(id, key) opcode.
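The proposed interface can be sketched in software. This is a minimal Python model under stated assumptions: the names SetValue/GetValue and the id 23 example come from the question; the class name, method names, and use of dictionaries are illustrative choices, not anything the question specifies.

```python
# Minimal software model of the proposed content-addressable block memory.
# Each block is identified by an id and holds key/value pairs; a value may
# itself be another block id, forming an object graph.

class BlockMemory:
    def __init__(self):
        self.blocks = {}  # block id -> {key: value}

    def set_value(self, block_id, key, value):
        # SetValue(id, key, value): creates the block on first write
        self.blocks.setdefault(block_id, {})[key] = value

    def get_value(self, block_id, key):
        # GetValue(id, key): returns None for a missing id or key
        return self.blocks.get(block_id, {}).get(key)

mem = BlockMemory()
mem.set_value(23, 123, 7)
mem.set_value(23, 456, 8)
mem.set_value(23, 789, 0)
print(mem.get_value(23, 456))  # -> 8
```

In hardware, the dictionary lookups above are exactly what the CAM would have to perform in parallel across all cells.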

My idea is that if I specify special key/value pairs for an executable memory block, I get a CPU.

If a memory block contains special key/value pairs, it acts like an opcode. For example, if it contains a key that is < 100, then that key is an opcode:

id: 45
  -key: 12, value: 7  // key 12 opcode for add two data
  -key: 10, value: 8  // key 10 is arg1, value is a data
  -key: 11, value: 9  // key 11 is arg2, value is a data
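The "keys below 100 are opcodes" rule above can be sketched as follows. The key assignments (12 = add, 10/11 = operands) are taken from the example block; the function name and error handling are assumptions for illustration.

```python
# Sketch of the proposed "every block is a minimal CPU" rule: keys below
# 100 are interpreted as opcode/operand slots within a block.

OP_ADD = 12        # key 12: opcode for adding two values (per the example)
ARG1, ARG2 = 10, 11  # keys 10 and 11: the two operands

def execute_block(block):
    """Interpret one CAM block (a dict of key -> value) as an instruction."""
    if OP_ADD in block:
        return block[ARG1] + block[ARG2]
    raise ValueError("no recognized opcode key in block")

block_45 = {12: 7, 10: 8, 11: 9}  # the "id: 45" block from the example
print(execute_block(block_45))    # -> 17 (adds the values of keys 10 and 11)
```

Note that even this tiny interpreter must first search the block for an opcode key, which is the kind of associative lookup the question proposes doing in hardware for every instruction.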

In this architecture every memory block is an object, and every memory block is a minimal CPU at the same time. Do you think it is possible to build such hardware?

Best Answer

Building and using large content-addressable memories is expensive and difficult. The most common design for a CAM is to have a number of addressable memory cells that compare their contents to a search value that is passed in; if they match, they trigger a set of associated cells to produce output on a shared bus. This means two things:

  • As well as the value to output, we need memory cells for the addressable content, and additional logic to perform the comparison between that content and the value being searched for. Also, because there is no single predetermined location each value can be stored in, some mechanism of allocating space for new entries to be added is required[1]. Taken together, these make the size of a CAM much larger than the size of a traditional memory that stores the same amount of data.

  • Because the units output to a shared bus, they must be arranged so that only one at a time can match. This means that before an item can be added to a CAM the CAM must first be searched to ensure that it won't conflict with existing entries. If there is a conflict, the existing entry must be deleted before the new entry can be added. If the data is required to persist indefinitely (i.e. the CAM isn't being used as a cache, as most current applications of CAMs in processors are) then this means the addressable content must always be unique -- this requirement usually means that the size of addressable content needs to be quite large, further adding to the cost of implementation.

There is also a development process reason why architectures that work like this are likely to remain rare for the foreseeable future: a large proportion of CPU development work these days is performed using FPGAs to run the processor being designed. FPGAs do not usually have the kind of shared bus arrangement that CAMs are implemented with -- to select among multiple possible source values, an FPGA uses multiplexers instead, but large trees of multiplexers are slower than such a shared bus, so large CAMs in FPGAs are not usually very fast, making developing architectures that use them particularly difficult.

For these reasons, while it is possible to use CAMs to accelerate applications that use small amounts of memory, building an entire modern general purpose computer using them for all storage seems to be unjustifiably tricky.

[1] - this can be mitigated a little by the use of set-associative CAMs, but that's only really useful for caches as it makes it more likely for conflicts to arise that require an element to be deleted from the memory when it is still needed, which is an acceptable tradeoff for a cache but not for the kind of use proposed here.