Code Examples
Hop over to OpenCores and you will find dozens of open source projects. Many are written in Verilog, and they run the gamut from I/O devices through to processors.
Also, do not forget the many Application Notes available from Xilinx. They are very helpful when working with Xilinx's own devices.
Design Flow
Pick up a book or two on design flow so that you get an overview of the steps involved in FPGA design. In summary, they will involve:
- Design entry - in your case, Verilog.
- Functional simulation - using various tools.
- Synthesis - in your case, using the Xilinx ISE tools.
- Simulation - to verify your post-synthesis design because some aspects of Verilog are not synthesisable.
- Place & Route - using the Xilinx ISE tools.
- Configuration - downloading the design onto the FPGA.
- Testing.
FPGA Components
As for using the FPGA components, there are different ways to use them. Assuming that you are using Verilog for design entry, you can either infer or instantiate the different components.
Inference generally involves getting the synthesis tool to pick the best components to use based on the functionality that you require. The best example of this would be to design an adder.
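By writing q <= a + b
or q = a + b
you can infer an adder. Both forms infer the same adder; the difference is where each assignment belongs. The non-blocking form (<=) is normally used inside clocked always blocks, so the sum ends up in a register, while the blocking form (=) is normally used inside combinational always blocks.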
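As a rough sketch of the two styles, here are two small modules. The module names, port names and the WIDTH parameter are made up for illustration and are not taken from any Xilinx example; what the synthesis tool actually builds from them depends on the device and tool settings.

```verilog
// Combinational adder: blocking assignment inside always @*.
// Hypothetical names; WIDTH is an illustrative parameter.
module adder_comb #(
    parameter WIDTH = 8
) (
    input  wire [WIDTH-1:0] a,
    input  wire [WIDTH-1:0] b,
    output reg  [WIDTH:0]   q    // one extra bit keeps the carry
);
    always @* begin
        q = a + b;               // blocking: q follows a and b immediately
    end
endmodule

// Registered adder: non-blocking assignment inside a clocked block.
module adder_reg #(
    parameter WIDTH = 8
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] a,
    input  wire [WIDTH-1:0] b,
    output reg  [WIDTH:0]   q    // one extra bit keeps the carry
);
    always @(posedge clk) begin
        q <= a + b;              // non-blocking: q updates on the clock edge
    end
endmodule
```

Running both through synthesis and comparing the reports is a quick way to see the inferred adder, plus the extra register in the second case.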
Instantiation generally involves calling the exact library component in code. Some components just cannot be easily inferred in code - such as the DCM. You can use the ISE tools and examples to learn more about this.
The actual list of components is provided by Xilinx in the Libraries Guide.
Protip
The best way to learn this is actually to experiment with short bits of code and run them through the ISE synthesis to see what it spits out. There are also plenty of examples in the ISE toolset itself.
So I think I found some answers to the problem and want to share them.
I started to simulate the GTXE2_CHANNEL hard macro. The simulation misbehaves in the same way as the hardware. So I tried to simulate the MGT in Verilog and used an instance template from here:
http://forums.xilinx.com/t5/7-Series-FPGAs/Using-v7gtx-as-sata-host-PHY-and-there-is-issue-bout-ALIGN/td-p/374203
This template simulates ElectricalIDLE conditions and OOB sequences nearly correctly. So I started to diff both solutions:
- TXPDELECIDLEMODE, a port that selects the behavior of TXElectricalIDLE, is not working as expected. So now I'm using the synchronous mode.
- PCS_RSVD_ATTR is an unconstrained bit_vector generic of 48 bits. If you look into the wrapper code of the secureip GTXE2_CHANNEL component, you will find a conversion from bit_vector => std_logic_vector => string. Internally, all generics are treated as DOWNTO ranged, so it's important to pass a DOWNTO constant to the GTXE2 generics!
So now you could ask: why is he using TO-ranged constants and generics?
Xilinx ISE, up to the latest version 14.7, has a major bug in handling vectors of user-defined types in unconstrained generics. The default direction of vectors is TO. If you pass vectors of enums as DOWNTO to unconstrained generics of a component, ISE reverses the vector elements and "emits" a TO-ranged vector in the component!
This is especially "funny" if the design hierarchy, which uses this generic, is not a balanced tree...
If you are using enums of 2 elements, the problem does not occur; maybe such an enum is mapped to a boolean.
Which bugs are left?
- TXComFinish is still not acknowledging the sent OOB sequences.
- I have to investigate these two bug fixes in synthesis and measure the OOB sequences with a scope - this may take some days :)
Edit 1 - more bugs:
There is another bug in the reset behavior of the GTXE2. If the GTXE2 is used with output clock dividers set to 1 (TX_ and RX_RateSelection = "000"), then the GTXE2 boots up and emits only 3 clock cycles (with a wrong clock period) on TX_OutClock. After that, TX_OutClock is 'X'. If you reset the GTXE2 after that wrong output, it boots up a second time with no error and a correct clock on TX_OutClock.
In addition to this bug, the GTXE2 ignores all assigned resets (CPLL as well as TX/RX_RESETs) until 'X' can be seen on TX_OutClock. So you MUST wait for circa 2.5 us before issuing a reset.
If you are using clock dividers of 2 or 4 (8 and 16 are not tested yet), this problem will not occur.
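One possible way to enforce the waiting time is a small reset sequencer that holds the reset back for a fixed number of cycles. This is only a sketch under assumptions: the module name, the port names, the 150 MHz free-running clock and the pulse length are all hypothetical and not taken from the original controller.

```verilog
// Sketch: issue a reset pulse only after an initial waiting time.
// All names and values below are assumptions for illustration.
module delayed_reset #(
    parameter integer WAIT_CYCLES  = 375,  // 2.5 us at an assumed 150 MHz clock
    parameter integer PULSE_CYCLES = 16    // assumed reset pulse length
) (
    input  wire clk,       // free-running clock, independent of TX_OutClock
    output reg  gt_reset   // reset request towards the transceiver
);
    reg [15:0] count;

    // Power-up values; FPGA registers take these from the bitstream.
    initial begin
        count    = 16'd0;
        gt_reset = 1'b0;
    end

    always @(posedge clk) begin
        // Count up once, then stop at the end of the pulse window.
        if (count < WAIT_CYCLES + PULSE_CYCLES)
            count <= count + 16'd1;

        // Reset is active only after the waiting time has passed.
        gt_reset <= (count >= WAIT_CYCLES) &&
                    (count <  WAIT_CYCLES + PULSE_CYCLES);
    end
endmodule
```

WAIT_CYCLES has to be recalculated for whatever clock actually drives the counter.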
Edit 2 - problems solved:
Solution for Bug 1:
I have added a timeout counter whose timeout depends on the current generation (clock frequency) and the current COM sequence which is to be sent. If the timeout is reached, I generate my own TXComFinished signal. Don't OR the timeout signal with the original TXComFinished signal from the GTX, because sometimes this signal is high while COMWAKE is to be sent, but this finished strobe still belongs to the previous COMRESET sequence!
Solution for another bug:
RXElectricalIDLE is not glitch free! To solve this problem I added a filter element on this wire, which suppresses spikes on that line.
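A minimal sketch of such a timeout counter could look like the following. The module and port names are hypothetical, and the value fed into timeout_cycles (chosen per generation and per COM sequence) is left to the surrounding logic; this is not the original controller code.

```verilog
// Sketch: locally generated "finished" strobe for an OOB sequence.
// Names are assumptions; timeout_cycles is selected outside this module.
module oob_timeout (
    input  wire        clk,
    input  wire        rst,
    input  wire        start,           // one-cycle strobe: OOB sequence requested
    input  wire [31:0] timeout_cycles,  // wait time in clock cycles, sampled on start
    output reg         finished         // one-cycle strobe replacing TXComFinished
);
    reg [31:0] remaining;
    reg        running;

    always @(posedge clk) begin
        finished <= 1'b0;                   // default: strobe is low
        if (rst) begin
            running   <= 1'b0;
            remaining <= 32'd0;
        end else if (start) begin
            running   <= 1'b1;
            remaining <= timeout_cycles;    // latch the timeout for this sequence
        end else if (running) begin
            if (remaining <= 32'd1) begin
                running  <= 1'b0;
                finished <= 1'b1;           // timeout reached: emit own strobe
            end else begin
                remaining <= remaining - 32'd1;
            end
        end
    end
endmodule
```

The strobe from this counter is used on its own here and is deliberately not combined with the transceiver's finished signal, for the reason given above.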
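One common way to build such a filter is to synchronize the signal and only accept a new level after it has been stable for a number of clock cycles. The sketch below assumes exactly that; the module name, port names and STABLE_CYCLES value are hypothetical, and the original filter may have been built differently.

```verilog
// Sketch: spike suppression for a glitchy status line.
// Names and STABLE_CYCLES are assumptions for illustration.
module glitch_filter #(
    parameter integer STABLE_CYCLES = 4   // cycles the new level must persist
) (
    input  wire clk,
    input  wire rst,
    input  wire d,    // raw input, e.g. RXElectricalIDLE
    output reg  q     // filtered output
);
    reg [1:0]  sync;  // two-flop synchronizer for the raw input
    reg [15:0] count; // how long the input has differed from q

    always @(posedge clk) begin
        if (rst) begin
            sync  <= 2'b00;
            count <= 16'd0;
            q     <= 1'b0;
        end else begin
            sync <= {sync[0], d};
            if (sync[1] == q) begin
                count <= 16'd0;               // no change pending, or spike ended
            end else if (count == STABLE_CYCLES - 1) begin
                q     <= sync[1];             // level held long enough: accept it
                count <= 16'd0;
            end else begin
                count <= count + 16'd1;       // still waiting for a stable level
            end
        end
    end
endmodule
```

The filter adds a delay of a few cycles to every real transition, so STABLE_CYCLES has to stay well below the shortest idle period that must still be detected.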
So currently my controller is running at SATA Gen1 with 1.5 Gbit/s on a KC705 board with an SFP2SATA adapter, and I think this question is solved.
Best Answer
5 µs is a lot slower than I would expect for a naive implementation, so you may have some other issues. Check the Floorplanner or FPGA Editor to see what the routed design actually looks like. You can try adding a "pad to pad" timing constraint if you have not done so already.
But in general, asynchronous designs are discouraged for FPGAs. I am not sure what your purpose is, but for inter-chip communication, the best practice is to implement some form of clocked interface.