Electrical – Fastest way of transferring data from VHDL section to Microblaze processor

fpga microblaze

My basic application sends ADC samples to a PC via Ethernet. The ADC sampling and storage happen in the VHDL section, while the Ethernet socket code runs on the Microblaze processor.

The Microblaze is running at 60MHz and the FPGA at 100MHz (I've taken care of the CDC issue). Transferring 1k of data takes 740us. The samples are stored in a dual-port BRAM, and on an interrupt the Microblaze reads them from the second port. For simplicity I keep the BRAM read port always enabled and read the data by address only.

Any ideas as to how I can transfer the data in a faster way?

I've added the BRAM connection details for reference.
The BRAM is of the simple dual-port type.

Buf_2k : Ethernet_test_data
  PORT MAP (
    clka => ADC_clock,                      --10MHz clock
    wea(0) => BRAM_wr_en,                   --Write enable generated synchronously to ADC_clock
    addra => buf_wr_addr(10 downto 0),      --BRAM address generated similarly as write enable
    dina => ADC_data_trig,                  --Registered ADC data to be stored in BRAM
    clkb => CLK_50M,                        --50MHz clock generated from DCM
    addrb => BRAM_addr(10 downto 0),        --Address coming from Microblaze to read the data
    doutb => BRAM_dout                      --This data out is directly connected to Microblaze
  );

Best Answer

Any ideas as to how I can transfer the data in a faster way?

You could potentially use the FSL (or AXI-Stream in newer Microblaze versions), as that gives a single-cycle "read from FIFO" straight into a register.
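
To give a feel for how the software side changes with a stream link, here is a minimal host-side sketch. Everything in it is a hypothetical stand-in: `stream_fifo`, `stream_get` and `drain_stream` are names made up for illustration, and the FIFO is just an array in memory. On real hardware the body of `stream_get` would be replaced by the processor's blocking `get` instruction (normally reached through the get macros in the vendor BSP), so treat this as an outline of the loop structure rather than working driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for a hardware stream FIFO. */
typedef struct {
    const uint32_t *data;   /* words waiting in the FIFO */
    size_t          count;  /* total number of words     */
    size_t          next;   /* index of the next word    */
} stream_fifo;

/* Pop one word. On hardware this would be one blocking "get". */
static uint32_t stream_get(stream_fifo *f)
{
    assert(f->next < f->count);  /* real hardware would stall instead */
    return f->data[f->next++];
}

/* Drain n_words from the stream into dst. Note that no read address
 * is generated: each access simply returns the next sample. */
static void drain_stream(stream_fifo *f, uint32_t *dst, size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = stream_get(f);
    }
}
```

The point of the sketch is only that the processor no longer computes a BRAM address per sample; whether that saves much depends on what else the loop is doing.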


But really the question is what do you mean by faster transfers?

The data is already "in the Microblaze subsystem" once it is in a BRAM that the MB can access. The processor can only go as fast as it can go, and you cannot get the data any more tightly coupled than it already is (assuming the BRAM is directly connected to an LMB); the BRAM access itself is very quick. Have a look at the assembly code for the loop in which you are reading the data: there is probably a lot of stuff going on (all of which is necessary if you have the C optimiser turned on).
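
For reference, a bare read loop of the kind worth inspecting might look like the sketch below. It assumes the BRAM appears to the processor as a plain memory-mapped array of 32-bit words; the question does not give the base address, so it is passed in as a pointer, and `read_samples` is a hypothetical name. Compiling something like this with optimisation enabled and disassembling it with the toolchain's objdump shows how many instructions each word really costs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n_words 32-bit samples from a memory-mapped buffer into dst.
 * "volatile" tells the compiler that every read must really happen,
 * since the hardware side may change the contents at any time. */
static void read_samples(volatile const uint32_t *bram,
                         uint32_t *dst,
                         size_t n_words)
{
    assert(bram != NULL && dst != NULL);
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = bram[i];  /* one load and one store per word */
    }
}
```

If the real loop does much more than this per word (function calls, byte swapping, packet assembly), that extra work is a likely place for the cycles to be going.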

You are measuring roughly 42 cycles per 32-bit read, which sounds quite slow. Some thoughts:

  • Are you processing the data in the loop which reads it?
  • Is your loop code in cache, or are you running from some external memory?

In the best case of just reading the data to a register and then ignoring it, you could achieve a few cycles per word.
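
The arithmetic behind these figures can be sketched as follows. The 740us and 60MHz come from the question; taking "1k" to mean 1024 32-bit words, and using 4 cycles per word as a stand-in for "a few", are assumptions made here purely for illustration. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles elapsed in duration_us at cpu_mhz (MHz = cycles per us). */
static uint32_t cycles_elapsed(uint32_t duration_us, uint32_t cpu_mhz)
{
    return duration_us * cpu_mhz;
}

/* Average cycles spent per word, rounded down. */
static uint32_t cycles_per_word(uint32_t duration_us, uint32_t cpu_mhz,
                                uint32_t n_words)
{
    assert(n_words > 0);
    return cycles_elapsed(duration_us, cpu_mhz) / n_words;
}

/* Transfer time in us for n_words at cycles_each per word, rounded down. */
static uint32_t transfer_time_us(uint32_t n_words, uint32_t cycles_each,
                                 uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0);
    return (n_words * cycles_each) / cpu_mhz;
}
```

With these assumptions, `cycles_elapsed(740, 60)` is 44400 and `cycles_per_word(740, 60, 1024)` is 43, in the same range as the figure quoted above, while `transfer_time_us(1024, 4, 60)` is 68. So a tight loop could plausibly bring the transfer down from 740us to well under 100us.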