Electronic – Multi-Port RAM (1 write port, many read ports)

portramverilogvhdlxilinx

I have a project where I may need a 128 KB lookup RAM. I have 1 write port which writes the lookup values at the start of the application. I will have more than 2 read ports (I am assuming 4). I do not want to replicate the 128 KB RAM 4 times for the sake of space.

I wanted to know if it is possible to build a multiport RAM using Xilinx BRAMs by simply adding n RdAddr/RdDataOut ports ? Is this possible. What do I need to take care of when I try to do this ?

Is there an open-source project which has done this, so I dont have to reinvent the wheel ?

Thanks for your info.

RRS

Best Answer

Assuming you need a read cycle on each port on each clock cycle, each BRAM will give you two read ports. Beyond that, you have to replicate the contents of the memory.

Is the bandwidth required at each port less than the raw bandwidth of the BRAM? In that case, you might consider multiplexing the ports. Use a counter that runs at the full speed of the BRAM to drive a multiplexer that scans the address bus for each port, feed these addresses to the BRAM, and then deliver the data (typically 2 clocks later) to the corresponding data bus for each port.

The downside of this approach is that the access latency for each port is now N clocks longer than the non-multiplexed case. There are various ways to deal with this latency, including adding additional pipeline stages to the other data paths.

Note that with a 2-port BRAM inside the module, you can scan two of the external ports at a time.

Problem with Design #1

I have noticed that you must specify the two ports in two separate processes for XST to infer dual-port RAM - if you don't you won't get the two ports. Separate processes is also how Xilinx suggests infering Dual-port RAM in XST User Guide. Hence your Design #1 will only infer single-port ram.

You can see my general VHDL for infering dual-port RAM with XST at the bottom of this post. (Details: http://www.fpga-dev.com/infering-dual-port-blockram-with-xst/)

Problem with Design #2

In your Design #2, you register the addres twice, probably unintentionally. <= signal assignments are made at the end of the process, not immediately. This code is equivalent to yours, only with simpler signal names:

-- sequential context (A, B, C are signals):
if rising_edge(clk) then
  B <= A;
  C <= B;
end if;

Here C <= B; will not assign to C what was assigned to B on the previous line, since that assignment only takes effect at the end of the process. If the signals are bits and the stimuli is a pulse on A, this would be the result of the above code:

clk _|"|_|"|_|"|_|"|_|"|_|"|
A   ______|"""|_____________
B   __________|"""|_________
C   ______________|"""|_____

Declaring B a variable instead and assigning with := will assign immediately:

-- sequential context (A, C are signals; B is variable):
if rising_edge(clk) then
  B := A;
  C <= B;
end if;

yielding

clk _|"|_|"|_|"|_|"|_|"|_|"|
A   ______|"""|_____________
B   __________|"""|_________
C   __________|"""|_________

Infering dual-port BlockRam with XST

(More details on this at http://www.fpga-dev.com/infering-dual-port-blockram-with-xst/.)

Below is my parameterized module for generic dual-port RAM. It will successfully infer dual-port RAM, as desired, with XST.

(Remove the write enable-signals and write logic to get ROM instead of RAM.)

Specify width and depth with width and highAddr (one less than desired depth) generics.

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity genRAM is
  generic(
    width     : integer;
    highAddr  : integer -- highest address (= size-1)
  );
  port(
    -- Two sets of ports (A and B), each set having ports Adress, Data in,
    -- Data out and Write enable:
    Aaddr     : in  integer range 0 to highAddr        := 0;
    ADI       : in  std_logic_vector(width-1 downto 0) := (others => '0');
    ADO       : out std_logic_vector(width-1 downto 0) := (others => '0');
    AWE       : in  std_logic                          := '0';
    Baddr     : in  integer range 0 to highAddr        := 0;
    BDI       : in  std_logic_vector(width-1 downto 0) := (others => '0');
    BDO       : out std_logic_vector(width-1 downto 0) := (others => '0');
    BWE       : in  std_logic                          := '0';
    clk       : in  std_logic
  );
end genRAM;

architecture arch of genRAM is
  subtype TmemWord is bit_vector(width-1 downto 0);
  type    Tmem     is array(0 to highAddr) of TmemWord;
  shared variable memory: Tmem;

  process(clk) is
  begin
    if (rising_edge(clk)) then
      ADO <= To_StdLogicVector(memory(Aaddr));
      if (AWE = '1') then
        memory(Aaddr) := To_bitvector(std_logic_vector(ADI));
      end if;
    end if;
  end process;

  process(clk) is
  begin
    if (rising_edge(clk)) then    
      BDO <= To_StdLogicVector(memory(Baddr));
      if (BWE = '1') then
        memory(Baddr) := To_bitvector(std_logic_vector(BDI));
      end if;
    end if;
  end process;
end arch;

The code above implements read-first behavior. That means that if address 0x00 contains 0xcafe and you write 0xbabe to 0x00, the cycle after the write will display 0xcafe on the data-out port ("data is read to output port before being written to memory").

If you desire write-first behaviour, change order of the reading and writing for both processes, below is how it would be for port A:

-- excerpt for write-first behaviour:
if (AWE = '1') then
  memory(Aaddr) := To_bitvector(std_logic_vector(ADI));
end if;
ADO <= To_StdLogicVector(memory(Aaddr));

In the above case, data-out would display 0xbabe one cycle after the write ("data is written to memory before reading memory contents to output port").

Electronic – Reading RAM externally in a running system by intercepting the memory bus or replacing the RAM chips

Is it possible at all?

Well, maybe yes.

Is it feasible to do this on commercially produced motherboards with soldered RAM chips found on consumer electronic devices?

Definitely no. Connecting wires to the lines between the CPU and the RAM without disturbing communications is extremely difficult. The faster the bus is, the more difficult this is. Replacing the RAM chips with something (a larger FPGA + external RAM would probably suffice) that "simulates" a RAM chip to the modified system and does whatever you want with the data is probably a lot more feasible.

Are there any off-the-shelf devices that can do this?

Most probably no.

Best Answer

Related Solutions

Electronic – Inferring Dual-Port Block RAM

Problem with Design #1

Problem with Design #2

Infering dual-port BlockRam with XST

Electronic – Reading RAM externally in a running system by intercepting the memory bus or replacing the RAM chips

Related Topic