Talking about signal termination is like opening a can of worms. This is a HUGE subject that is difficult to summarize in just a couple hundred words. Therefore, I won't. I am going to leave a huge amount of stuff out of this answer. But I will also give you a big warning: There is much misinformation about terminating resistors on the net. In fact, I would say that most of what's found on the net is wrong or misleading. Some day I'll write up something big and post it to my blog, but not today.
The first thing to note is that the resistor value to use for your termination must be related to your trace impedance. Most of the time the resistor value is the same as your trace impedance. If you don't know what the trace impedance is then you should figure it out. There are many online impedance calculators available; a Google search will bring up dozens.
Most PCB traces have an impedance from 40 to 120 ohms, which is why you found that a 1k termination resistor did almost nothing and a 100-ish ohm resistor was much better.
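To put rough numbers on that: the fraction of an incoming edge that bounces back off a resistive termination is the reflection coefficient, (R - Z0) / (R + Z0). The sketch below assumes a 100 ohm trace purely for illustration (yours may well differ), and the function name is made up for this example.

```python
def reflection_coefficient(r_term, z0):
    """Fraction of the incident voltage reflected by a resistive termination."""
    return (r_term - z0) / (r_term + z0)

Z0 = 100.0  # assumed trace impedance in ohms, for illustration only

for r in (1000.0, 120.0, 100.0):
    gamma = reflection_coefficient(r, Z0)
    print(f"{r:6.0f} ohm termination: {gamma * 100:5.1f}% of the edge reflects")
```

With these assumed values the 1k resistor still reflects roughly 82% of the edge, which is why it looked like it did almost nothing, while a resistor close to the trace impedance reflects very little.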
There are many types of termination, but we can roughly put them into two categories: Source and End termination. Source termination is at the driver, end termination is at the far end. Within each category, there are many types of termination. Each type is best for different uses, with no one type good for everything.
Your termination, a single resistor to ground at the far end, is actually not very good. In fact, it's wrong. People do it, but it isn't ideal. Ideally that resistor would go to a separate termination rail at half of your I/O voltage. So if the I/O voltage is 3.3 V then that resistor will not go to GND, but to another power rail at half of 3.3 V (a.k.a. 1.65 V). The voltage regulator for this rail has to be special because it needs to source AND sink current, where most regulators only source current. Regulators that work for this use will mention something about termination on the first page of the datasheet.
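A rough DC-only sketch shows why that regulator has to sink as well as source. It assumes an ideal rail-to-rail driver with no output resistance, a 50 ohm termination resistor, and 3.3 V I/O; those numbers and the function name are assumptions for illustration, not values from your board.

```python
def termination_current_ma(v_driver, v_rail, r_term):
    """DC current in mA through an end-termination resistor.

    Positive: current flows from the driver into the termination rail.
    Negative: current flows from the termination rail back into the driver.
    """
    return (v_driver - v_rail) / r_term * 1000.0

VCC = 3.3      # assumed I/O voltage
R_TERM = 50.0  # assumed termination resistor, ohms

for name, rail in (("GND", 0.0), ("VCC/2", VCC / 2)):
    i_high = termination_current_ma(VCC, rail, R_TERM)
    i_low = termination_current_ma(0.0, rail, R_TERM)
    print(f"resistor to {name:5}: {i_high:6.1f} mA when high, {i_low:6.1f} mA when low")
```

Under these assumptions the resistor to GND pulls about 66 mA from the driver whenever the line is high. The half-rail version halves the peak to about 33 mA, but the current changes direction with the logic level, so the rail has to absorb current half of the time.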
The big problem with most end termination is that it consumes lots of current. There is a reason for this, but I won't go into it. For low-current use we must look at source termination. The easiest and most common form of source termination is a simple series resistor at the output of the driver. The value of this resistor is the same as the trace impedance.
Source termination works differently than end termination, but the net effect is the same. It works by controlling signal reflections, not preventing the reflections in the first place. Because of this, it only works if a driver output is feeding a single load. If there are multiple loads then something else should be done (like using end termination or multiple source termination resistors). The huge benefit of source termination is that it does not load down your driver like end termination does.
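Here is a minimal sketch of that mechanism, assuming an ideal lossless trace, a single receiver that looks like an open circuit, and a 3.3 V step. The edge launches at a reduced amplitude, doubles when it reflects off the open far end, and whatever comes back is either absorbed at the source (matched) or bounced again (mismatched). The function name and the example values are illustrative assumptions.

```python
def receiver_voltage_steps(v_step, r_source, z0, arrivals=4):
    """Receiver voltage after each arrival of the edge at an open-ended far end."""
    gamma_src = (r_source - z0) / (r_source + z0)  # reflection at the driver end
    gamma_load = 1.0                               # open circuit reflects everything
    wave = v_step * z0 / (r_source + z0)           # amplitude first launched into the trace
    v_rx = 0.0
    steps = []
    for _ in range(arrivals):
        v_rx += wave * (1.0 + gamma_load)  # incident wave plus its reflection at the receiver
        steps.append(round(v_rx, 3))
        wave *= gamma_load * gamma_src     # goes back to the driver and re-reflects there
    return steps

# Total source resistance equal to the trace impedance: one clean step.
print(receiver_voltage_steps(3.3, r_source=50.0, z0=50.0))
# Only an assumed 20 ohm driver resistance, no series resistor: overshoot and ringing.
print(receiver_voltage_steps(3.3, r_source=20.0, z0=50.0))
```

In the matched case the receiver sees the full 3.3 V on the first arrival and nothing afterwards. In the mismatched case the first arrival overshoots the rail and later arrivals ring around it before settling.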
I said before that your series resistor for source termination must be located at the driver, and it must have the same value as your trace impedance. That was an oversimplification. There is one important detail to know about this. Most drivers have some resistance on their outputs. That resistance is usually in the 10-30 ohm range. The sum of the output resistance and your resistor must equal your trace impedance. Let's say that your trace is 50 ohms, and your driver has 20 ohms. In this case your resistor would be 30 ohms since 30+20=50. If the datasheets do not say what the output impedance/resistance of the driver is then you can assume it to be 20 ohms, then look at the signals on the PCB and see if it needs to be adjusted.
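The arithmetic is trivial, but here it is as a small helper. The 20 ohm default is just the fallback guess mentioned above, not a datasheet value, and the function name is made up for this sketch.

```python
def series_termination_resistor(z0, r_driver=20.0):
    """Series resistor (ohms) so that driver resistance plus resistor equals z0.

    r_driver defaults to the 20 ohm guess used when the datasheet gives no number.
    Returns 0 if the driver alone already meets or exceeds the trace impedance.
    """
    return max(z0 - r_driver, 0.0)

print(series_termination_resistor(50.0))        # 50 ohm trace, assumed 20 ohm driver
print(series_termination_resistor(50.0, 10.0))  # same trace, stronger 10 ohm driver
```

Treat the result as a starting point: pick the nearest resistor value you have and tune it while watching the signal at the receiver.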
Another important thing: when you look at these signals on an o-scope you MUST probe at the receiver. Probing anywhere else will likely give you a distorted waveform and trick you into thinking that things are worse than they really are. Also, make sure that your ground clip is as short as possible.
Conclusion: Switch to source termination with a 33 to 50 ohm resistor and you should be fine. The usual caveats apply.
Should I connect nCE of the JTAG directly to nCE of the FPGA, or should nCE of the FPGA be connected to ground?
I'm not sure what you mean by "nCE" of "JTAG", but the nCE pin of the FPGA should be tied low. It is only used in multi-device configuration scenarios, where you daisy chain devices by connecting nCEO of one device to nCE of the next.
About the clock: I see that there are 16 clocks (clk0 to clk15). To which clock should I connect the output from the oscillator?
It doesn't matter; choose whichever is easiest for you to route.
In my board, I only use 3.3V (the label VCC means 3.3V). Is that ok?
I doubt it. Doesn't the device require 1.2V for core logic and 2.5V for PLL?
By the way, would you please show me how to add a flashing LED to indicate that the JTAG is working?
You can just connect an LED via a resistor between 3.3V and TMS. The pin is driven by the programmer and will be low most of the time during programming.
Best Answer
There are two reasons for using an RC end termination topology (which can only be used on signals with a 50% duty cycle, such as clocks).
If you do not care about either of these then a single far-end resistor will do (skip the cap), and you can tie the resistor to either Vcc or Gnd.
If you do care, then connecting the capacitor to either Gnd or Vcc should be fine. The capacitor will appear to be a short circuit during the edges, which is what you want, and the average voltage at the node between the capacitor and the resistor will be 1/2 Vcc, which reduces your power.
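A back-of-the-envelope sketch of that power saving, assuming an ideal rail-to-rail driver, a capacitor large enough to hold its average voltage, and ignoring everything that happens during the edges. The 3.3 V and 50 ohm values and both function names are assumptions for illustration.

```python
def resistor_to_gnd_power(vcc, r_term, duty=0.5):
    """Average power (W) in a far-end resistor tied to GND: it only conducts while the line is high."""
    return duty * vcc ** 2 / r_term

def rc_termination_power(vcc, r_term, duty=0.5):
    """Average power (W) in the resistor of an RC termination.

    The capacitor settles at the average line voltage (duty * vcc), so the
    resistor only sees the difference between the line and that average.
    """
    v_cap = duty * vcc
    p_high = (vcc - v_cap) ** 2 / r_term  # while the line is high
    p_low = v_cap ** 2 / r_term           # while the line is low
    return duty * p_high + (1.0 - duty) * p_low

VCC, R_TERM = 3.3, 50.0  # assumed values
print(f"resistor to GND: {resistor_to_gnd_power(VCC, R_TERM) * 1000:.1f} mW")
print(f"RC termination:  {rc_termination_power(VCC, R_TERM) * 1000:.1f} mW")
```

With a 50% duty cycle the capacitor sits at 1/2 Vcc and, under these assumptions, the resistor dissipates about half of what the plain resistor to GND does.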