So I have been successfully able to interface a Sandisk SD Card with Spartan 3 FPGA (using verilog). The card after power up initializes correctly and is able to read and write data sector wise correctly. The only problem that remains is that after power up if I press the reset button the card hangs and stops working. I have also not been able to find out what a hard reset means for the SD card. Should it get reinitialized somehow? What command should I send to the card for that? Any help is appreciated.
Electronic – Hard reset for SD Card interfaced with FPGA
fpgaresetsdverilogxilinx
Related Solutions
Looking at the datasheet:
14.3.4.5 Watchdog Reset The Watchdog Reset is entered when a watchdog fault occurs. This state lasts 3 Slow Clock cycles.
When in Watchdog Reset, assertion of the reset signals depends on the WDRPROC bit in WDT_MR: If WDRPROC is 0, the Processor Reset and the Peripheral Reset are asserted. The NRST line is also asserted, depending on the programming of the field ERSTL. However, the resulting low level on NRST does not result in a User Reset state.
Could it be that the watchdog is firing and driving the reset line?
If external signals are coming with higher frequency then clock frequency (or if processing of a signal takes multiple clock cycles), what are generic approaches to process such signals?
One approach is to use a faster clock. Most FPGAs (including your Spartan 6) have dedicated logic for generating different clocks from a source clock. Xilinx has called them DCM (Digital Clock Managers) in the past, although I think Spartan 6 has an actual PLL and DCM-like blocks (may be called something else these days).
These are nice. You can put a 12 MHz crystal on your PCB to minimize emissions, and then inside the FPGA you can multiply the clock up to e.g. 96 MHz (a multiplication of 8x)
You could also try clocking logic on the falling edge and the rising edge. This would give you the effective appearance of a 2x clock (also called DDR), but your logic gets complicated because you must split the design between falling-edge and rising-edge logic.
DCMs can also generate clocks that are 90, 180, or 270 degrees out-of-phase with the incoming clock. So if you really needed to sample a signal faster than the FPGA can handle, you could use the 90 degree clock and do the same DDR trick on it. This would give you four chances to sample an incoming signal with the fastest possible FPGA clock, but again this gets complicated because now you're negotiating logic between two sets of rising and falling edges.
.
Should the clock always be involved? For example, can I achieve the same thing without using clock at all?
You could give pure combinatorial logic a shot. But in general it represents a timing nightmare. FPGAs have a sea of flip flops and they're very important for achieving timing closure. It really is best practice to clock everything that you can afford to clock. FPGA IOBs (the input output buffers, which send or receive signals to the outside world) have integrated flip flops for the express purpose of making timing closure easier to achieve.
I'm not even really sure how you'd do this without clocking.
.
Is this what is called "crossing a clock domain"?
Almost. Crossing a clock domain, as mentioned above, happens when the output and input do not share a common clock. The reason this is bad is that those clocks are typically asynchronous. As an engineer, you should always consider that asynchronous I/O will go out of its way to change at the worst possible time.
Button inputs are asynchronous, so they suffer from the same problems as crossing clock domains. The problem with an async signal is that flip flops have parameters known as setup and hold times. Setup time is how long the signal must be steady before the clock edge. Hold time is how long the signal must be steady after the clock edge. If you have a setup or hold violation, the flip flop can go meta-stable. This means that you have no idea what the output will be. At the next clock edge, it generally fixes itself. (flip flops can also go meta-stable if they have a voltage between Vih and Vil applied during the clock edge, but that's not a problem in the FPGA) I believe I also once read somewhere that hold time in FPGAs is typically considered to be 0, and the primary reason you don't get timing closure is insufficient setup time.
To handle async inputs, we use synchronizers. Essentially we put multiple flip flops between the async input and the logic it's connected to. If the first flip flop goes meta stable, it won't screw up the second flip flop...hopefully. It still can, though, but the mean time between failures could be years. If that's not enough, you put even more flip flops to reduce the MTBF to acceptable levels.
Now, in terms of "crossing clock domains", you only need the synchronizer if your clocks are async to each other. One of the benefits of the DCM is that it can generate multiple clock frequencies that are perfectly synchronized. So if you have a "big" calculation that takes a long time to complete, you can use a 12 MHz clock, while the 96 MHz clock samples the inputs. Since they both came from the DCM, every 12 MHz edge perfectly lines up with a 96 MHz edge, so there's no need to use a synchronizer.
EDIT: response to comment
So clocks cannot pass my flip-flop asynchronously, obviously because there can be one signal at a time only (pure physics). So the synchronization is needed to handle one clock "ticking in between" of the other clock cycles. If that is correct, what happens if my logic (combined gate delay) is greater than input clock? Do I get silently screwed, does the signal just "disappears" on its way through the gates, or input clock tick is getting lost?
I'm going to be a bit picky about the language here because it's important to make distinctions sometimes. The clocks are async to each other, but every flip flop is sync to its own clock. The async signal is the output of the clock in one domain and the input of a clock in the other domain.
Say you have two crystals on a board at 48 MHz. They are not both precisely 48 MHz. So if you clock two series flip flops with the two crystals and an inverter at the end to make them switch every cycle, you will eventually cause the first flip flop to output a signal that violates the setup time of the second flip flop's input relative to the second flip flop's clock; but as far as the first flip flop's clock, everything is going according to plan. The second clock will glitch for one cycle; it may go high, it may go low, it may go somewhere inbetween, it might start going one direction and then abruptly change to another halfway through the clock cycle, etc. After the next second clock tick, it will probably correctly sample its input and update its output...probably. Which is why for really high reliability they put more flip flops in the synchronizer, because then you have probably*probably*probably = all but certain.
That's somewhat contrived for an example, but it's easy to see how those clocks would have a periodic meta-stable incident that's a function of the difference in the crystal speeds. For things like buttons, it happens basically totally randomly. This is also why some protocols carry their own clock somehow (e.g. RS232, USB has implicit clock in the data; I2C, SPI have explicit clock trace), to avoid crossing clock domains.
.
When your design is synthesized/mapped/placed/routed, the tools will tell you the maximum propagation delay for any logic that touches a flip flop in your design. If you go higher than this frequency, you broke the rules and the FPGA isn't responsible. In fact, you generally want to go a little slower. So what you do is tell the design tools (for Xilinx this uses a constraint file) "When you're doing place and route, try to keep everything at most 19ns apart," (~52 MHz), then you actually use a 48 MHz crystal (20.8 ns), giving you 1.8 ns of slack (about 9%).
Getting your design to route at the desired speed is called "achieving timing closure". For more complex designs, you have to consider where to put extra flip flops to help pipeline the process and keep the clock speed up. The Pentium 4 is an extreme example; pipeline stages 5 and 20 existed for the sole purpose of allowing signals to propagate across the chip, allowing them to make the clock speed very high. Since the FPGA has a DCM, if you need to clock a very large piece of logic in one stage (e.g. a very large decoder or mux), you could have the DCM do a "slow logic" clock to handle that part, and then have a "fast logic" clock for the other parts. This also kinda helps with power consumption as well, for any stage that's running with the slow logic clock.
For pure combinatorial logic, the tools will tell you the maximum pad-to-pad delay; that is, how much time a signal takes to get from the input pad, through all the logic, and back to the output pad. The problem with such logic is that it has no concept of the past. It cannot remember whether the light was on or off. You could probably design your own feedback elements (e.g. SR latch), perhaps even using the set/resets of the integrated flip flops, but at this point you're starting to become sequential logic, so you might as well just take the plunge and get everything clocked reliably.
Related Topic
- Spartan 6 FPGA IO changes state disregarding design
- Electronic – Is it possible for an FPGA to “partially” configure
- Electronic – STM32F042 failed hardware reset
- Electrical – microSD Class 4 card in SPI mode returns 0x000000021F after `CMD8`
- Electronic – Need help with SD Card Initialization (CLK Syncronization)
- Electronic – MicroSD raw write/read mode. After sector write once, can’t write more sectors and read always 0’s
Best Answer
You will have to simulate your Verilog code or use analyzer to see things going on in live system.
In my opinion, there're only two circumstances when SD-card appears "hung":
bad specification implementation, when card does not understand what is going on, or is unable to understand that previous command must be cancelled. However card must anyway tell you something other than all 1 during execution of these commands - for example during multiple block read there's CRC transmitted, which will contain zeroes, during multiple block write host must transmit CRC otherwise card responses with error, which will also contain 0s. Your task here is to check set of SD-card commands your controller uses, and adjust initialization sequence (number of spare clock cycle before you proceed to CMD0) accordingly;
your FPGA state machine fails after you press reset. For example, clock stops working at the SD-card clock input pin, MOSI gets stuck etc. This is most suspect for hung cards, because SD-card will output nothing if clock does not toggle, or there's no activation of CS and no useful information on MOSI line.