Bouncing, as I am sure you are aware, occurs when the contacts of a switch or button literally bounce off each other when you activate it. In a digital circuit this shows up as a rapid succession of on-off-on-off-on signals before the input finally settles at the intended steady state.
There are two basic methods of debouncing (literally: removing the bounce): software and hardware. Hardware methods can be broken down into two types: RC filters and flip-flops. The former effectively treats the switch signal as an AC waveform and low-pass filters it (filters out the high-frequency switching noise and leaves the basic HIGH/LOW signals intact). The latter requires a two-pole input that is used to toggle the inputs of a bistable flip-flop.
Neither of those is really applicable when writing Verilog, as you don't have capacitors and your joystick isn't a two-pole switch, but it's useful to know the hardware options so you can see how they relate to software.
Software debouncing basically involves emulating a low-pass filter in software. The most common way of doing it is to look at the input and say "How long has the input been at this state?", and that of course requires some form of timing.
The simplest method is to do the following:
- Notice when the input has changed state
- Flag that it has changed and clear a counter.
At the same time, driven by the clock, for any inputs with the "changed" flag set:
- Increment the input's counter
- If the counter exceeds a certain limit then clear the "changed" flag and set an output variable to the state of the switch.
That means that every time the switch changes state, which will be multiple times during the pressing of that switch, the counter is cleared, but the changed flag is only set once. Because the bouncing keeps resetting the counter, it can only count high enough to exceed the threshold after the last bounce has happened, and only then will the switch's state be passed on to the rest of your code.
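The steps above can be sketched as a small behavioural model. This is written in Python rather than Verilog purely so it can be run and checked; the class name `Debouncer`, the `tick` method and the `threshold` parameter are made up for illustration, and the model assumes the raw input is sampled once per clock tick.

```python
class Debouncer:
    """Behavioural model of the counter-based debouncer described above.

    Assumption: the raw input is sampled once per clock tick, so
    "noticing a change" and "incrementing the counter" share one clock.
    """

    def __init__(self, threshold, initial=0):
        self.threshold = threshold  # ticks the input must stay stable
        self.output = initial       # debounced state seen by the rest of the design
        self.last_raw = initial     # raw input seen on the previous tick
        self.changed = False        # the "changed" flag
        self.counter = 0

    def tick(self, raw):
        """Advance one clock tick with the current raw input; return the debounced output."""
        if raw != self.last_raw:
            # Input changed state: flag it and clear the counter.
            self.changed = True
            self.counter = 0
        elif self.changed:
            # Input is holding steady: count how long it has done so.
            self.counter += 1
            if self.counter >= self.threshold:
                # Stable for long enough: accept the new state.
                self.changed = False
                self.output = raw
        self.last_raw = raw
        return self.output
```

Feeding it a bouncy press such as `[0]*5 + [1,0,1,0,1] + [1]*30` with `threshold=20` leaves the output low throughout the bouncing and only raises it once the input has been steady for 20 ticks. A short glitch that returns to the original level never reaches the output at all.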
The clock wants to be considerably faster than 1Hz. It is generally accepted that any two events that happen less than 20ms apart appear (to us) to occur at the same time. 50ms starts to become really noticeable, so a debounce period of 10-20ms is usually quite good. So your clock needs to tick much more often than that to increment the counter and give good press response resolution. To keep things simple a 1kHz clock is good. That gives a 1ms tick, so when your counter reaches 20 that will be 20ms, which is a good threshold to have.
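As a quick check of that arithmetic, here is a tiny Python sketch. The function names and the 50MHz system clock are assumptions for illustration only; use whatever clock your board actually provides.

```python
def debounce_ticks(debounce_ms, tick_hz):
    """Number of tick-clock cycles that make up the debounce period."""
    return debounce_ms * tick_hz // 1000


def prescaler(system_hz, tick_hz):
    """System-clock cycles per tick when dividing a fast clock down to the tick rate."""
    return system_hz // tick_hz


# 20ms debounce period on a 1kHz tick
threshold = debounce_ticks(20, 1_000)

# Hypothetical 50MHz system clock divided down to that 1kHz tick
divide_by = prescaler(50_000_000, 1_000)
```

Here `threshold` comes out as 20 and `divide_by` as 50000, so with that assumed clock a 16-bit prescaler counter would be enough to generate the 1ms tick.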
No two clocks will ever perfectly match. The method of determining the true clock frequency from the data is called "clock recovery".
If you know the nominal bit rate, then one straightforward method that doesn't require use of a PLL/DCM block is to over-sample the data and look for edges. Normally you would need to over-sample at 4X the bit rate or more. Here is how it works...
Create a clock in your part that is 4X the bit rate. In the case of a 65MHz bit rate this is a 260MHz clock.
Using the 260MHz clock, double or triple register the incoming bits to avoid metastability issues. These issues can occur if an input signal changes very close to a clock edge, which is almost guaranteed to happen when you sample data with a clock other than the one that generated it.
Optionally add two extra register stages and do a 2 out of 3 majority vote on the last three stages. This will reduce false detection of edges due to noise, which becomes important in the next step since you are using edges in the data to find the clock rate.
Make a two bit free running counter that counts from 0 to 3 and then rolls over back to 0. The counter is clocked by the 260MHz clock.
Whenever you see a 0 to 1 or 1 to 0 transition on the input data then assume you are at a clock edge and reset the counter to 1 (cnt <= "01").
Whenever the counter has a value of 2 (cnt = "10") use the output of your majority vote as your input sample. And if you are keeping a pixel count, also increment that.
I have personally used the above method to successfully recover the clock on serial data up to 100Mbps.
Depending on whether the incoming data is slightly faster or slower than your clock, the counter will skip one tick or hold an extra tick to adjust the count rate to match the data.
For a slower data rate you will see something like...
...0,1,2,3,0,1,2,3,0,1,1,2,3,0,1,2,3,0,1,2,3...
For a faster data rate you will see something like...
...0,1,2,3,0,1,2,3,1,2,3,0,1,2,3,0,1,2,3...
There is another method where you can do the 4X over-sampling using two clocks that are the same rate as your pixel clock but 90 degrees out of phase. By sampling into four registers (one on rising and one on falling for each clock) you can achieve the same effect as is done with the counter based setup above. The maximum possible pixel rates are higher for that method, but the logic is a little more complex.
Yes - if some part of your output data is available later than other parts, you have to delay the other parts so they line up.
It's not a fudge, or a "bad" thing to do - it's just what has to be done to make the outputs right.
That's what I'd do. (EDIT: And as Yann reminded me, delaying signals can be very cheap in Xilinx FPGAs: 16 ticks can fit in a single look-up table, plus 1 more in the flip-flop that's next to the LUT.)
That's another option, but will probably take more logic.
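To illustrate the idea of delaying the early outputs so that everything lines up, here is a minimal behavioural sketch of a fixed-depth delay line in Python. The `DelayLine` class and its `depth` parameter are hypothetical names for illustration; in an FPGA this would be a chain of registers or a LUT-based shift register, not software.

```python
from collections import deque


class DelayLine:
    """Fixed-depth shift register: a value emerges `depth` ticks after it went in."""

    def __init__(self, depth, fill=0):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        # The deque's maxlen makes each append drop the oldest entry.
        self.regs = deque([fill] * depth, maxlen=depth)

    def tick(self, value):
        """Clock in `value` and return the value that went in `depth` ticks ago."""
        out = self.regs[0]
        self.regs.append(value)
        return out
```

If one part of the output takes, say, 3 ticks longer to compute than the rest, passing the faster parts through `DelayLine(3)` makes them arrive on the same tick as the slow part.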