Method 1: Create ROMs in your FPGA design
Because you have the same data in every board, one option is to use block RAMs in the FPGA, configured as ROM. To do this you instantiate a block RAM, but don't connect to the write pins. Use a synthesis directive in your HDL code or UCF file to specify the initial contents of the RAM. Read the Spartan-3 Generation User's Guide (Chapter 4) to see how to instantiate the RAM and how to access the data from the RAM. If you use Xilinx ISE, there is probably also a "wizard" to generate the RAM block and set up the initial contents for you.
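As a rough illustration of preparing the initial contents, here is a minimal Python sketch that turns a binary data file into one hex word per line, the text layout read by Verilog's `$readmemh`. The function name and the 16-bit big-endian word layout are assumptions for illustration, and whether your synthesis flow accepts such a file for block RAM initialization is something to check in the user's guide.

```python
def rom_hex_lines(data: bytes, word_bytes: int = 2) -> list:
    """Split raw bytes into fixed-width words, one hex string per word.

    Assumes big-endian byte order inside each word (an arbitrary choice
    made for this sketch, not something the FPGA tools mandate).
    """
    if len(data) % word_bytes != 0:
        raise ValueError("data length must be a multiple of the word size")
    return [data[i:i + word_bytes].hex()
            for i in range(0, len(data), word_bytes)]


if __name__ == "__main__":
    # Hypothetical four-byte payload, printed as two 16-bit words.
    for line in rom_hex_lines(bytes([0x12, 0x34, 0xAB, 0xCD])):
        print(line)
```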
Unfortunately, the Spartan-3E you are using has only 350 kbits of block RAM, not the 8 Mbits you require. For this method to work, you'll have to work out a scheme to compress your data to fit in 350 kbits, which is a ratio of more than 20:1. The details of how to do this depend on what kind of data you have. If your data is close to random, that much compression may not be achievable.
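Before designing anything, you can get a quick feel for whether the data is compressible at all by running it through a general-purpose compressor on your PC. The sketch below uses Python's `zlib` purely as an estimator; the function name and the 350,000-bit budget are assumptions taken from the figure above. A DEFLATE decompressor may well be too heavy for your FPGA, so treat the result as an optimistic indication rather than a design.

```python
import os
import zlib

BLOCK_RAM_BITS = 350_000  # budget quoted above; check your device's datasheet


def fits_in_block_ram(data: bytes, budget_bits: int = BLOCK_RAM_BITS) -> bool:
    """Return True if zlib-compressed data fits in the given bit budget."""
    compressed = zlib.compress(data, 9)  # level 9 = best compression
    return len(compressed) * 8 <= budget_bits


if __name__ == "__main__":
    one_mebibyte = 1024 * 1024  # 8 Mbit of data
    print("all zeros fits:", fits_in_block_ram(bytes(one_mebibyte)))
    print("random data fits:", fits_in_block_ram(os.urandom(one_mebibyte)))
```

Highly repetitive data passes this check easily, while random data does not shrink at all, which is the situation where Method 1 is not practical.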
Method 2: Store data in external memory
You say you have a 128 Mbit parallel flash and a 16 Mbit SPI flash. You will need to read the datasheets for these parts and understand how they work. Then write a state machine into your FPGA that can access these devices. But this is your job as the FPGA designer. Some random strangers on the internet are not going to design your FPGA for you.
To get the data into the flash in the first place, you have two choices. First, if you are building these boards in volume, you can have your board assembly shop pre-program the flash devices before assembling them onto the boards. Typically you give them a data file in whatever format they request, and they charge a small extra fee to program the parts before assembly.
Second option: Read the datasheet for the flash device. Write an FPGA design that lets you send data in over some other interface available on your board (Ethernet, USB, SPI, I2C, whatever) and load it into the flash. At manufacturing time, you load this design temporarily into your FPGA and program your flash. Then you store a different "run-time" FPGA design in the on-board configuration PROM. Because the run-time design has no ability to modify the flash, your users can't mess up the data.
Other answers have focused on why you might be approaching this the wrong way. Although I agree with those answers, what you're asking for does exist, so I'll go ahead and give you a straight answer. You'll likely find that this approach is more expensive than alternatives though.
What you want is a 2 GHz voltage-controlled oscillator (VCO) with 3.3-V LVPECL outputs. There are many vendors out there who make such parts.
If you don't find one with LVPECL output, since this is a clock signal, it's relatively easy to adjust the levels to something compatible with LVPECL by AC coupling and rebiasing. Any RF level between -3 and +2 dBm should be usable with an LVPECL input.
LVPECL parts like your 100EP016A can also accept single-ended inputs if you bias the complementary input to the midpoint between the normal logic levels (often there's even a pin called VBB that outputs this level for your convenience, but I didn't check if the 'EP016A has it).
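To relate those dBm figures to voltage swing, here is a small Python sketch that converts sine-wave power into peak-to-peak voltage. It assumes a 50-ohm system, which the answer does not state explicitly, and the function name is made up for illustration.

```python
import math


def dbm_to_vpp(dbm: float, r_ohms: float = 50.0) -> float:
    """Peak-to-peak voltage of a sine wave of the given power into r_ohms."""
    p_watts = 1e-3 * 10 ** (dbm / 10)      # dBm is referenced to 1 mW
    v_rms = math.sqrt(p_watts * r_ohms)    # P = Vrms^2 / R
    return 2 * math.sqrt(2) * v_rms        # sine wave: Vpp = 2*sqrt(2)*Vrms


if __name__ == "__main__":
    for level in (-3, 0, 2):
        print(f"{level:+d} dBm -> {dbm_to_vpp(level):.3f} Vpp")
```

Under that assumption the quoted range works out to roughly 0.45 to 0.80 V peak-to-peak.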
You will then need to build a phase-locked loop to maintain the VCO output frequency accurately by comparing it with a low-drift reference oscillator, which could be anywhere from 10 to 100 MHz.
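As a sketch of the arithmetic behind such a loop, an integer-N PLL locks when f_vco / N equals f_ref / R, where both sides equal the phase-detector comparison frequency. The Python below picks R and N for a chosen comparison frequency. The function name is hypothetical, and it ignores the prescaler and divider-range limits that any real synthesizer chip imposes.

```python
def integer_n_dividers(f_vco_hz: int, f_ref_hz: int, f_pfd_hz: int):
    """Return (R, N) such that f_ref / R == f_vco / N == f_pfd.

    Raises ValueError if the frequencies are not integer multiples of the
    phase-detector frequency, since an integer-N loop cannot hit them.
    """
    if f_ref_hz % f_pfd_hz != 0 or f_vco_hz % f_pfd_hz != 0:
        raise ValueError("frequencies must be integer multiples of f_pfd")
    return f_ref_hz // f_pfd_hz, f_vco_hz // f_pfd_hz


if __name__ == "__main__":
    # 2 GHz VCO locked to a 10 MHz reference, compared directly at 10 MHz.
    print(integer_n_dividers(2_000_000_000, 10_000_000, 10_000_000))
```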
One part that provides both the VCO and PLL in one chip is Analog Devices' ADF4360-2.
A couple more notes:
I noticed that the maximum guaranteed switching frequency of the MC100EP016A is only 1.2 GHz, so if you really want to do this at 2 GHz, you might want to look for another part. Maybe MC100E137, but then you'll need to have a 5 V supply and you'll also need to deal with the unequal timing of the different outputs for a ripple counter.
Finally, you'll need to deal with latching in all the bits of the count at exactly the same instant, so you don't capture some bits before a transition and some bits after. One solution to this is to use a Gray-coded counter instead of a binary counter. Then only one bit changes on any transition, and the maximum error from latching delay variation is only a single count.
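To illustrate the Gray-code property, here is a short Python sketch of the standard binary-reflected Gray code, with the conversion back to binary that you would apply after latching. The function names are made up for this example, and in hardware the same logic would be a handful of XOR gates.

```python
def binary_to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse of binary_to_gray: XOR together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


if __name__ == "__main__":
    for count in range(8):
        print(f"{count}: {binary_to_gray(count):03b}")
```

Successive codes differ in exactly one bit, so a latch that catches the counter mid-transition reads either the old count or the new one, never an unrelated value.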
Best Answer
15 years ago I designed a two-parameter digitizer (energy and time) to measure time of flight. For this system I used a constant current source into a cap held in reset by a JFET. On receiving the trigger (NIM fast logic, with level shifting kept in the analog regime, as opposed to saturated switching), the JFET opened, and I was able to achieve 50 ps resolution by digitizing the linear ramp and interpolating from a 62.5 MSPS ADC in an FPGA. The circuit was quite simple, and matched simulations perfectly.
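The answer doesn't say how the interpolation was done, so the following Python is only one plausible sketch: fit a straight line through the ADC samples that lie on the ramp and extrapolate back to the baseline to find when the ramp started. The function name, the least-squares fit, and the zero baseline are all assumptions for illustration.

```python
def ramp_start_time(samples, sample_period_s, baseline=0.0):
    """Estimate when a linear ramp left the baseline.

    samples must all lie on the linear part of the ramp. The result is in
    seconds relative to the first sample (negative means before it).
    """
    n = len(samples)
    times = [i * sample_period_s for i in range(n)]
    mean_t = sum(times) / n
    mean_v = sum(samples) / n
    # Ordinary least-squares slope and intercept of v = slope * t + intercept.
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in zip(times, samples))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_v - slope * mean_t
    return (baseline - intercept) / slope


if __name__ == "__main__":
    period = 1 / 62.5e6               # 16 ns between samples
    true_start = -3.2e-9              # hypothetical: ramp began 3.2 ns early
    volts = [1e7 * (i * period - true_start) for i in range(4)]
    print(ramp_start_time(volts, period))
```

Because the ramp slope is set by the current source and the capacitor, sub-sample timing comes from the amplitude resolution of the ADC rather than from its sample rate.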