\$5\mu s\$ is a lot slower than I would expect for a naive implementation, so you may have some other issues. Check the Floorplanner or FPGA Editor to see what the routed design actually looks like. You can try adding a "pad to pad" timing constraint if you have not done so already.
But in general, asynchronous designs are discouraged for FPGAs. I am not sure what your purpose is, but for inter-chip communication, the best practice is to implement some form of clocked interface.
Other answers have focused on why you might be approaching this the wrong way. Although I agree with those answers, what you're asking for does exist, so I'll go ahead and give you a straight answer. You'll likely find that this approach is more expensive than alternatives though.
What you want is a 2 GHz voltage-controlled oscillator (VCO) with 3.3-V LVPECL outputs. There are many vendors out there who make such parts.
If you don't find one with LVPECL output, since this is a clock signal, it's relatively easy to adjust the levels to something compatible with LVPECL by AC coupling and rebiasing. Any RF level between -3 and +2 dBm should be usable with an LVPECL input.
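To get a feel for what those power levels mean as a voltage swing, here is a minimal sketch that converts dBm to peak-to-peak volts. It assumes a sine wave into a 50 Ω resistive load, which the text above does not state; `dbm_to_vpp` is just an illustrative helper name.

```python
import math

def dbm_to_vpp(dbm, impedance_ohms=50.0):
    """Peak-to-peak voltage of a sine wave of the given power into a resistive load.

    Assumption: 50 ohm system unless told otherwise.
    """
    power_watts = 1e-3 * 10 ** (dbm / 10.0)           # dBm is referenced to 1 mW
    v_rms = math.sqrt(power_watts * impedance_ohms)   # P = Vrms^2 / R
    return 2.0 * math.sqrt(2.0) * v_rms               # sine: Vpp = 2 * sqrt(2) * Vrms

for level in (-3.0, 2.0):
    print(f"{level:+.0f} dBm -> {dbm_to_vpp(level):.2f} V p-p")
```

Under those assumptions the range works out to roughly 0.45 to 0.80 V peak-to-peak, which is in the neighborhood of the single-ended swing usually quoted for LVPECL. Check the input specifications in the datasheet of whatever part you end up driving.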
LVPECL parts like your 100EP016A can also accept single-ended inputs if you bias the complementary input to the midpoint between the normal logic levels (often there's even a pin called VBB that outputs this level for your convenience, but I didn't check if the 'EP016A has it).
You will then need to build a phase-locked loop to maintain the VCO output frequency accurately by comparing it with a low-drift reference oscillator, which could be anywhere from 10 to 100 MHz.
One part that provides both the VCO and PLL in one chip is Analog Devices' ADF4360-2.
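For a rough idea of the numbers involved, here is a sketch of the divider arithmetic for a generic integer-N loop, where the reference is divided by R to get the comparison frequency and the VCO is divided by N to match it. This is not the register map of any particular chip; the function name and the 1 MHz comparison frequency are assumptions for illustration, and real parts restrict which divider values are allowed.

```python
from fractions import Fraction

def integer_n_settings(f_out_hz, f_ref_hz, f_pfd_hz):
    """Return (R, N) for a generic integer-N loop.

    f_pfd = f_ref / R is the phase-detector comparison frequency,
    and the loop locks when f_out = N * f_pfd.
    """
    r = Fraction(f_ref_hz, f_pfd_hz)
    n = Fraction(f_out_hz, f_pfd_hz)
    if r.denominator != 1 or n.denominator != 1:
        raise ValueError("frequencies are not integer multiples of the comparison frequency")
    return int(r), int(n)

# 2 GHz output from a 10 MHz reference, compared at 1 MHz (hypothetical choice)
print(integer_n_settings(2_000_000_000, 10_000_000, 1_000_000))  # (10, 2000)
```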
A couple more notes:
I noticed that the maximum guaranteed switching frequency of the MC100EP016A is only 1.2 GHz, so if you really want to do this at 2 GHz, you might want to look for another part. Maybe MC100E137, but then you'll need to have a 5 V supply and you'll also need to deal with the unequal timing of the different outputs for a ripple counter.
Finally, you'll need to deal with latching in all the bits of the count at exactly the same instant, so you don't capture some bits before a transition and some bits after. One solution to this is to use a Gray-coded counter instead of a binary counter. Then only one bit changes for any transition, and the maximum error from latching delay variation is only a single count.
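The single-bit-change property is easy to check in software. The sketch below uses the standard reflected binary Gray code (`n ^ (n >> 1)`); a hardware Gray counter would be built differently, and the function names and 4-bit width here are only illustrative.

```python
def binary_to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Inverse of binary_to_gray: XOR together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

for count in range(8):
    print(f"{count}: binary {count:03b}  gray {binary_to_gray(count):03b}")

# Every transition, including the wrap from the last code back to zero,
# flips exactly one bit.
WIDTH = 4
for count in range(2 ** WIDTH):
    nxt = (count + 1) % (2 ** WIDTH)
    changed = binary_to_gray(count) ^ binary_to_gray(nxt)
    assert bin(changed).count("1") == 1
```

Compare this with a binary counter going from 0111 to 1000, where all four bits change at once and a badly timed latch could capture any value in between.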
Best Answer
If you want to measure frequency, you need a reference frequency you can compare it to.
The usual way to do this is to set up two counter chains, one of which is clocked by your reference frequency \$F_{ref}\$, and the other clocked by your unknown frequency \$F_{unk}\$ (input signal conditioning is an exercise left for the student). You let the two counters run for the same amount of time \$T\$, then read their values.
$$Count_{ref} = F_{ref} \cdot T$$
$$Count_{unk} = F_{unk} \cdot T$$
Combine the two equations by dividing them, and solve for \$F_{unk}\$:
$$F_{unk} = \frac{Count_{unk}}{Count_{ref}} F_{ref}$$
That's the general concept, but in practice, the process is often simplified by using the reference counter itself to determine the time period \$T\$, which makes \$Count_{ref}\$ a known constant, and \$F_{ref}\$ is also chosen to be a "nice round" number. This turns the multiplication and division into trivial operations, and the frequency can be read more or less directly from \$Count_{unk}\$. For example, if \$F_{ref}/Count_{ref} = 1\$ Hz, then \$Count_{unk} = F_{unk}\$ directly.
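As a quick numeric check of the formula, here is a minimal sketch. The 10 MHz reference and the count values are made-up numbers chosen so that the gate time is exactly one second.

```python
def frequency_from_counts(count_unk, count_ref, f_ref_hz):
    """F_unk = (Count_unk / Count_ref) * F_ref, with both counters gated for the same time T."""
    return count_unk * f_ref_hz / count_ref

# Hypothetical setup: a 10 MHz reference counted to 10,000,000 gives T = 1 s,
# so F_ref / Count_ref = 1 Hz and the unknown count reads directly in hertz.
F_REF_HZ = 10_000_000
COUNT_REF = 10_000_000

print(frequency_from_counts(123_456, COUNT_REF, F_REF_HZ))  # 123456.0
```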
There are many subtleties associated with optimizing the performance of this process that I won't get into here, but there's one trick that's often used when you want to measure relatively low frequencies (such as RPM) with high resolution without requiring a long time \$T\$. Instead of making \$T\$ depend on \$F_{ref}\$, making \$Count_{ref}\$ constant, you make \$T\$ depend on \$F_{unk}\$, making \$Count_{unk}\$ a constant. Also, make \$F_{ref}\$ as high as you reasonably can. Now you need to do a division operation to get the answer, but you can get it in a fraction of a second with many digits of precision.
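To see why this helps, compare the effect of a one-count uncertainty in each scheme. The sketch below is only a first-order estimate, and the 50 Hz input (3000 RPM at one pulse per revolution), the 10 MHz reference and the 0.1 s measurement time are assumed example values.

```python
def direct_resolution_hz(gate_time_s):
    """Fixed gate time T: one count of the unknown input corresponds to 1/T hertz."""
    return 1.0 / gate_time_s

def reciprocal_resolution_hz(f_unk_hz, f_ref_hz, periods):
    """Gate spans a fixed number of input periods: the one-count uncertainty is
    in the reference counter, so the error is about F_unk / Count_ref."""
    count_ref = f_ref_hz * periods / f_unk_hz   # reference counts accumulated during the gate
    return f_unk_hz / count_ref

F_UNK_HZ = 50.0    # assumed input frequency
F_REF_HZ = 10e6    # assumed reference frequency
PERIODS = 5        # 5 input periods at 50 Hz = 0.1 s of measurement

print(direct_resolution_hz(0.1))                                 # about 10 Hz
print(reciprocal_resolution_hz(F_UNK_HZ, F_REF_HZ, PERIODS))     # about 5e-05 Hz
```

Both measurements take the same 0.1 s, but counting the reference during a fixed number of input periods resolves the 50 Hz input to a few parts per million, where counting the input directly would only resolve it to 10 Hz.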