Electronic – Unstable/sensitive ARM- External SRAM connection

arm pcb sram

I designed a board for a specific purpose. There are different active components on the board, including:

  • ATMEL SAM4S16C Controller (runs on 120MHz)
  • W5100 ethernet controller (connected to the SAM via SPI)
  • AT25DF321A FLASH (connected to the SAM via SPI)
  • RMLV0408EGSP 512Kx8 SRAM (connected to the SMC of SAM)

Schematic

Power:
[Image: power supply schematic]

I have checked the power lines with a scope. There is no noise and there are no spikes on them. The supply is a stable 3.25 V.

SAM<->SRAM:
[Image: SAM to SRAM connection schematic]

Board:
I didn't do impedance matching or track length matching. I thought it was not yet necessary at this frequency. Maybe that was a mistake. See UPDATE-4 for the exact trace lengths.

Issue:
Communication over SPI and USB works just fine. However, the data transfer between the controller and the SRAM is unstable and only works about 10 times slower than it should. I wrote a two-phase test function which does the following:

Phase-1:

  1. Clears the RAM by filling it with 0xFF
  2. Writes 0x55 (01010101) to every odd address (1, 3, 5, etc.) and 0xAA (10101010) to every even address
  3. Iterates over the full address range again, reads back the values and compares them to the expected values

Phase-2:

  1. 'Clears' the RAM by filling it with 0x00
  2. Writes 0x55 (01010101) to every even address (0, 2, 4, etc.) and 0xAA (10101010) to every odd address (the other way around than previously)
  3. Iterates over the full address range again, reads back the values and compares them to the expected values
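A minimal C sketch of the two phases described above, written so that it can run against any byte buffer. The function names (`ram_fill`, `ram_test_phase`, `ram_test_two_phase`) are made up for illustration and are not the original test code. On the target, the pointer would be the base address of the SMC chip-select region, which depends on the board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Clear" step: fill the whole range with one value. */
static void ram_fill(volatile uint8_t *ram, size_t size, uint8_t value)
{
    for (size_t i = 0; i < size; i++) {
        ram[i] = value;
    }
}

/* One phase: clear, write the alternating pattern, read back, count mismatches.
 * odd_value is written to odd addresses, even_value to even addresses. */
static size_t ram_test_phase(volatile uint8_t *ram, size_t size,
                             uint8_t clear_value,
                             uint8_t odd_value, uint8_t even_value)
{
    assert(ram != NULL);
    size_t errors = 0;

    ram_fill(ram, size, clear_value);

    for (size_t i = 0; i < size; i++) {
        ram[i] = (i & 1u) ? odd_value : even_value;
    }

    for (size_t i = 0; i < size; i++) {
        uint8_t expected = (i & 1u) ? odd_value : even_value;
        if (ram[i] != expected) {
            errors++;
        }
    }
    return errors;
}

/* Runs both phases and returns the total number of mismatches. */
size_t ram_test_two_phase(volatile uint8_t *ram, size_t size)
{
    size_t errors = 0;
    errors += ram_test_phase(ram, size, 0xFF, 0x55, 0xAA); /* phase 1 */
    errors += ram_test_phase(ram, size, 0x00, 0xAA, 0x55); /* phase 2 */
    return errors;
}
```

Swapping the pattern between the phases means every cell has to hold both 0x55 and 0xAA at some point, so a bit stuck at either level shows up in at least one phase.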

If I set the timing according to the SRAM's datasheet, the test function reports a few errors in phase 1 and a great many in phase 2, where errors occur at almost every even address. When an error occurs, the controller reads 0x00 instead of the expected value.

If I instruct the controller to drive the SRAM 10 times slower, the test function doesn't report any errors. I put it inside an endless loop and let it write and read back approximately 5 GB. It finished without a single error. However, if I lightly touch the surface of either the SRAM or the controller's package (without touching any pin), it starts producing errors. Sometimes I don't even need to touch the chip; holding my finger 2-3 mm above it is enough.
At this speed the errors look different: when the expected value is 0xAA the result is 0x55, and vice versa.

The same code works just fine on the SAM4S Xplained board (with a slower access time). That board has a different SRAM, which is a little slower (55 ns).

Question:
What is wrong?
I don't actually expect concrete answers here, since the problem is quite complex. What I need is some guidance on where to start and how I could find the error.

UPDATE-1

These are the timings I used.

Write access times:

NWE_SETUP = 1 cycle = 8.33 ns

NWE_PULSE = 5 cycles = 41.67 ns

NWE_CYCLE = 6 cycles = 50 ns

Read access times:

NRD_SETUP = 3 cycles = 25 ns

NRD_PULSE = 3 cycles = 25 ns

NRD_CYCLE = 6 cycles = 50 ns
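The cycle-to-time conversion above can be checked with a small helper. This is only a sketch: the function names are hypothetical, and it assumes the SMC is clocked at the 120 MHz mentioned at the top. It also derives the hold time, relying on the SMC relation cycle = setup + pulse + hold as I understand it from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of `cycles` master-clock cycles in picoseconds (rounded down). */
uint64_t smc_cycles_to_ps(uint32_t cycles, uint32_t mck_hz)
{
    assert(mck_hz > 0);
    return ((uint64_t)cycles * 1000000000000ULL) / mck_hz;
}

/* Hold cycles left over after setup and pulse, assuming
 * cycle = setup + pulse + hold. Returns -1 if they do not fit. */
int smc_hold_cycles(uint32_t setup, uint32_t pulse, uint32_t cycle)
{
    if (setup + pulse > cycle) {
        return -1;
    }
    return (int)(cycle - setup - pulse);
}
```

With the values listed, both `smc_hold_cycles(1, 5, 6)` and `smc_hold_cycles(3, 3, 6)` come out as 0, so under that assumption neither the write nor the read access has any hold cycles.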

The SMC mode register is set to 3, which means that read and write operations are timed according to the WE/RD signals instead of CS.

I would also point out the following:
Since I attached only one device to the SMC, the CS line of the SRAM is permanently wired to GND (see schematics). I think that should be fine, but I'm no longer sure.

UPDATE-2

Since I cannot think of any cause other than bad board design, I am adding a picture of the board layout. It's a standard 2-layer FR4 board. I would ask you to point out the obvious mistakes I managed to make.
For example, I'm not sure if it's acceptable to have bus traces that cross each other.

[Image: board layout]

UPDATE-3

I did some further investigation. Here are my findings:

I wrote a new test function which can provide more precise results. It works as follows:

  1. Fills the RAM with 0xFF
  2. Fills the RAM with test data taken from a byte array with 8 items
  3. Reads back the values and compares them with the expected values

The sample data array is:

sample_data[0]=0;
sample_data[1]=84;
sample_data[2]=21;
sample_data[3]=85;
sample_data[4]=150;
sample_data[5]=170;
sample_data[6]=98;
sample_data[7]=255;
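A rough C sketch of that test, using the same sample data. Two things are assumptions on my side and not taken from the original code: that the byte written to an address is `sample_data[address % 8]`, and the CSV column layout (address, expected, read). The function name `ram_test_pattern` is also made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const uint8_t sample_data[8] = { 0, 84, 21, 85, 150, 170, 98, 255 };

/* Assumed mapping: the pattern repeats every 8 addresses. */
static uint8_t expected_at(size_t addr)
{
    return sample_data[addr % 8u];
}

/* Fills, writes the pattern, reads back. Each mismatch is counted and,
 * if `csv` is not NULL, written as "address,expected,read". */
size_t ram_test_pattern(volatile uint8_t *ram, size_t size, FILE *csv)
{
    assert(ram != NULL);
    size_t errors = 0;

    for (size_t i = 0; i < size; i++) {
        ram[i] = 0xFF;
    }
    for (size_t i = 0; i < size; i++) {
        ram[i] = expected_at(i);
    }
    for (size_t i = 0; i < size; i++) {
        uint8_t got = ram[i];
        if (got != expected_at(i)) {
            errors++;
            if (csv != NULL) {
                fprintf(csv, "0x%05lX,0x%02X,0x%02X\n",
                        (unsigned long)i,
                        (unsigned)expected_at(i), (unsigned)got);
            }
        }
    }
    return errors;
}
```

Logging the address together with the expected and read values is what makes it possible to look for patterns in the failing address bits afterwards.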

For the test I used both my original timings and the ones provided by FRob in his answer. The results were always very similar.

The test function prints all the errors on USB in CSV format. The output can be downloaded from here

After analyzing the output it turned out that:

  • RAM access works fine up to address 0x2FF
  • failures start at 0x300
  • the A8 and/or A9 lines are involved in every error (whenever an error occurs, A8 and/or A9 is set in the failing address)
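This kind of analysis can be done with a small helper that counts how often each address bit is set among the failing addresses. It is only a sketch: the function names are hypothetical, and `ADDR_BITS` is taken as 19 because the 512Kx8 SRAM uses A0..A18.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_BITS 19u /* A0..A18 for a 512K x 8 device */

/* counts[b] = number of failing addresses in which address bit b is set. */
void count_address_bits(const uint32_t *failed, size_t n,
                        size_t counts[ADDR_BITS])
{
    assert(counts != NULL);
    for (unsigned b = 0; b < ADDR_BITS; b++) {
        counts[b] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        for (unsigned b = 0; b < ADDR_BITS; b++) {
            if ((failed[i] >> b) & 1u) {
                counts[b]++;
            }
        }
    }
}

/* Nonzero if every failing address has at least one bit of `mask` set. */
int all_failures_touch_mask(const uint32_t *failed, size_t n, uint32_t mask)
{
    for (size_t i = 0; i < n; i++) {
        if ((failed[i] & mask) == 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling `all_failures_touch_mask(failed, n, 0x300)` on the addresses from the CSV would confirm or refute the A8/A9 observation, since 0x300 is the mask for A8 and A9.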

Further observations:

  • Increasing the CYCLE times reduces the number of errors
  • Setting the CYCLE times to 350 results in a successful RAM test

UPDATE-4
I measured the exact trace lengths of the address and data lines (see below). I claimed earlier that the longest track is about 37 mm, which was obviously a wrong estimate.
If somebody could confirm that the large differences in length can cause the described issue at this frequency, that would be helpful.

Net Length(mm)
A0  11.325
A1  11.437
A2  11.818
A3  12.56
A4  11.772
A5  13.156
A6  45.02
A7  37.645
A8  48.549
A9  51.861
A10 60.938
A11 44.396
A12 38.862
A13 43.397
A14 50.029
A15 46.972
A16 35.047
A17 44.777
A18 55.394

D0  51.166
D1  31.642
D2  34.394
D3  83.119
D4  81.543
D5  75.678
D6  62.609
D7  52.769

OE  43.215
WE  60.582
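To put these lengths into perspective, here is a rough sketch that converts length differences into propagation delay skew. The figure of 6.7 ps/mm is an assumed rule of thumb for traces on FR4, not a measured value for this board, and the function names are made up.

```c
#include <assert.h>

/* Assumed rule-of-thumb propagation delay for a trace on FR4. */
#define PS_PER_MM 6.7

/* Propagation delay of a single trace in picoseconds. */
double trace_delay_ps(double length_mm)
{
    assert(length_mm >= 0.0);
    return length_mm * PS_PER_MM;
}

/* Delay difference between the longest and shortest trace of a group. */
double skew_ps(const double *lengths_mm, int n)
{
    assert(n > 0);
    double min = lengths_mm[0];
    double max = lengths_mm[0];
    for (int i = 1; i < n; i++) {
        if (lengths_mm[i] < min) min = lengths_mm[i];
        if (lengths_mm[i] > max) max = lengths_mm[i];
    }
    return trace_delay_ps(max) - trace_delay_ps(min);
}
```

Under that assumption, the spread across all the nets listed above (11.325 mm to 83.119 mm) works out to roughly 0.5 ns, compared with a master clock period of about 8.33 ns at 120 MHz.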

Best Answer

I think you may be getting some capacitive crosstalk on the WE line. The Atmel reference design has a pair of internal planes that may help limit the crosstalk. You may need to consider doing a 4-layer board to fix this properly. Have you tried lengthening the setup time?

Also, where are your decoupling caps?

It also looks like the small pieces of ground plane you do have are cut by ground traces. Adjust your layout software configuration to allow traces to touch planes of the same signal.