Ouch. Well, I've never worked with FireWire at this level before, but here are some thoughts that may help:
I read through the TI PHY datasheet that you linked, and it looks like it's designed to work with the TI Link Layer Controllers (LLCs). Have you considered just using such a part rather than trying to implement its functionality from scratch?
Also, the PHY <-> LLC link runs at full FireWire speed, which for 400 Mbps and an 8-bit link is ~50 MHz. The camera may let you get away with running the link much slower, maybe not. The point is that you can't just take the pins of that PHY, blue-wire them to the FPGA pins, and expect a remotely functional or stable connection. You'll very likely have major signal integrity issues.
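The ~50 MHz figure is just the serial bit rate divided by the bus width. Here is a minimal sketch of that arithmetic, assuming one bus word is transferred per clock cycle. The function name is made up for illustration, and a real PHY-LLC interface may instead fix the clock and vary the number of data lines, so check the datasheet.

```python
def link_clock_mhz(bit_rate_mbps: float, bus_width_bits: int) -> float:
    """Clock rate (MHz) needed to carry bit_rate_mbps over a parallel bus.

    Assumption: exactly one bus word is transferred per clock cycle.
    """
    if bus_width_bits <= 0:
        raise ValueError("bus width must be positive")
    return bit_rate_mbps / bus_width_bits


# Nominal FireWire speeds over an 8-bit link.
for rate in (100, 200, 400):
    print(f"{rate} Mbps over 8 bits: {link_clock_mhz(rate, 8):.1f} MHz")
```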
It looks like the LLCs from TI come in various configurations, some with rather simple, generic 8/16-bit microcontroller-style interfaces, which should be easy to implement. You can probably find freely available Verilog blocks for such an interface. These links would still need to be fast, ~50 MHz, so you'll still have the signal integrity issue. You could run it slower, I guess, but if the data feed overruns the FIFO in the LLC, you're SOL.
Once you're done with the physical interface, you still have an entire FireWire driver stack to implement. I guess you'd have to do this in the DSP? Or put a small soft core into the FPGA to handle this work?
What I would really do is tell them they are crazy for forcing a FireWire interface onto something of this size and capability. It's going to take a significant amount of resources to build the FireWire interface for no gain, since you'll never use anywhere near its bandwidth.
If that fails, I'd try something like this, which combines the FireWire PHY, LLC, and an ARM7 core in a single chip. It offers a parallel data bus to get the information into the FPGA. This way you write the FireWire driver that talks to the camera, plunk it in the ARM7 core, and all that has to get transferred to the FPGA are the raw images, with no overhead work in the FPGA. You still need to carefully design a PCB for this, since you're still dealing with a very high-speed FireWire bus.
EDIT:
At 100 Mbit/s the FireWire bus runs at roughly 100 MHz, so you have to deal with moving 100 MHz differential signals from the PHY to the FireWire connector. On the PHY<->LLC<->FPGA side: I wouldn't personally try to breadboard a 13 MHz parallel data bus, though it may be possible if you're careful.
The critical issue for signal integrity is the rise/fall time of the signal, not its clock rate. A high clock rate usually means faster rise/fall times, but if you take a transceiver that's designed to run at high frequency and run it at a lower frequency, the rise/fall times often don't actually slow down.
If the wire carrying the signal is longer than:
Tr/(2*Td) with
Tr = the signal rise time at the source and
Td = the propagation delay per unit length of the wire/cable you are using.
then you need to consider transmission line effects. You'll have to deal with reflections on the wire, which will cause all sorts of junk on the line.
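As a rough illustration of the Tr/(2*Td) rule above, here is a small sketch that computes the critical length. The 1 ns rise time and 5 ns/m propagation delay are assumed example values, not figures from any datasheet, and the function name is hypothetical.

```python
def critical_length_m(rise_time_ns: float, delay_ns_per_m: float) -> float:
    """Wire length (m) above which transmission line effects matter.

    Implements Tr / (2 * Td), with Tr the rise time at the source and
    Td the propagation delay per unit length of the wire.
    """
    if rise_time_ns <= 0 or delay_ns_per_m <= 0:
        raise ValueError("rise time and delay must be positive")
    return rise_time_ns / (2.0 * delay_ns_per_m)


# Assumed example values: 1 ns rise time, 5 ns/m propagation delay.
length_m = critical_length_m(1.0, 5.0)
print(f"critical length: {length_m * 100:.1f} cm")
```

Note that the result depends only on the rise time and the wire, not on the clock frequency, which is the point made above.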
You also need to make sure all the wires of a parallel bus are the same length, with the tolerance for variation depending on the clock frequency of the bus.
Is this thing really going to end up in a UAV/RC aircraft? If so, you've got to deal with vibrations and G forces as well.
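One way to put a number on the length-matching tolerance is to convert the length mismatch into timing skew and compare it with the clock period. This is only a sketch: the 10% skew budget, the 7 ps/mm delay, and the trace lengths are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def skew_ns(lengths_mm, delay_ps_per_mm):
    """Worst-case arrival-time difference (ns) between bus wires."""
    mismatch_mm = max(lengths_mm) - min(lengths_mm)
    return mismatch_mm * delay_ps_per_mm / 1000.0


def within_budget(lengths_mm, delay_ps_per_mm, clock_mhz, budget_fraction=0.1):
    """True if the skew fits in an assumed fraction of the clock period."""
    period_ns = 1000.0 / clock_mhz
    return skew_ns(lengths_mm, delay_ps_per_mm) <= budget_fraction * period_ns


# Assumed example: four traces, 7 ps/mm delay, 50 MHz bus clock.
traces_mm = [50, 52, 55, 51]
print(f"skew: {skew_ns(traces_mm, 7.0):.3f} ns")
print("within budget:", within_budget(traces_mm, 7.0, 50.0))
```

The faster the bus clock, the shorter the period and the smaller the mismatch you can tolerate.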
Learning how FPGAs work from the transistor up is very ambitious. There's a lot of stuff inside an FPGA, and for the most part you never have to understand it at the level of detail that you are seeking. In fact, it's probably better if you ignore that stuff at first and learn some practical FPGA design instead.
The reason you don't need to know those fine details is that the FPGA compiler handles them for you. Using VHDL or Verilog, you tell the FPGA compiler what you want, and it figures out how to do it. You don't need to know what gets programmed into the LUTs, where each LUT is located, or how to route the signals to and from it. This also helps when you move from one FPGA to another. A Spartan-6 has a different LUT architecture than the Spartan-3, and you don't want to have to learn a completely new architecture for each chip you use.
Then, as you get into it more, you will learn more of the internal workings of the FPGA. Not down to the transistor level, but you will learn about the different kinds of signal routing resources, RAMs, I/O blocks, carry logic, etc. Knowing about these resources will help you make better use of the FPGA, making your logic smaller and faster.
One really cool way to find out about the internals of a Xilinx FPGA is to write some VHDL/Verilog code and compile it. Then use the Xilinx FPGA Editor to go in and examine what the compiler did, looking at signal routing, slice usage, LUTs, and flip-flops. This is most useful for me in understanding why my logic was bigger than I thought it should be. I would guess that 95% of the time you don't have to understand the inner workings of an FPGA in more detail than what FPGA Editor will give you.
Best Answer
The FT2232H would probably be a good choice. It provides two interfaces, which are configurable for UART, FIFO, and JTAG, among others. So you can use one port in JTAG mode to configure the FPGA (using OpenOCD) and another port in FIFO mode for reasonably high-speed data transfer. The FIFOs can run in asynchronous mode (8 MBps, or 64 Mbps) or synchronous mode (40 MBps, or 320 Mbps). How much bandwidth do you need?
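To answer the bandwidth question, you can compare your data rate against the two FIFO figures quoted above. A minimal sketch follows; the 10 MS/s stream of 16-bit samples is an assumed example, the names are hypothetical, and the quoted rates are headline numbers, so sustained throughput will likely be lower.

```python
# FIFO rates in MB/s, taken from the figures quoted above.
FIFO_MBYTES_PER_S = {"async": 8.0, "sync": 40.0}


def required_mbytes_per_s(sample_rate_hz: float, bytes_per_sample: int) -> float:
    """Raw data rate in MB/s (1 MB = 10**6 bytes), ignoring protocol overhead."""
    return sample_rate_hz * bytes_per_sample / 1e6


def modes_that_fit(required: float) -> list:
    """FIFO modes whose quoted rate is at least the required rate."""
    return [mode for mode, cap in FIFO_MBYTES_PER_S.items() if cap >= required]


# Assumed example: 10 MS/s of 16-bit samples.
need = required_mbytes_per_s(10e6, 2)
print(f"need {need:.1f} MB/s ({need * 8:.0f} Mbps), modes: {modes_that_fit(need)}")
```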