There are two criteria you can use to evaluate a digital project and decide which part best matches it. The first is design size/complexity: how much logic is involved. The second is the input and output requirements in terms of pin count. Speed can be factored in if you can estimate what your slowest function would be. The vendor tools (Altera Quartus II, Xilinx ISE, etc.) will help you once you get in the right ballpark.
PAL/PLA/GAL: These are intended to replace small to medium size circuits that you might normally implement with SSI/MSI logic chips (7400, 4000 series). They can offer better board layouts because the I/O can be remapped, and they fit lots of simple logic functions into one package. These chips contain non-volatile memory (or one-time programmable fuses) and require no power-up configuration time. They may not contain data storage elements.
CPLD: These are larger cousins of the PLA. The designs can be small state machines, or even a very simple microprocessor core. Most of the CPLD chips that I have seen do not have any on-chip SRAM, although the large Cypress CPLD you linked does. CPLDs are more likely to be re-programmable with flash memory, and they also do not require configuration time on power-up.
FPGA: Unlike the CPLD, the logic blocks are based on SRAM instead of flash memory, resulting in faster logic operations. The major downside with FPGAs is that since the configuration is stored in SRAM, every time the device is powered up the FPGA must load its programming into this SRAM. Depending on the size of your design and the speed of your non-volatile storage, this can cause a noticeable delay from power-on to fully functioning. Some FPGAs have on-chip flash for storing their configuration, but most use separate memory chips. FPGAs will often have hard-wired multipliers, PLLs, and other logic functions to improve computing speed. Large blocks of on-chip RAM are also available. You will also be able to use high-performance I/O standards like LVDS, PCI, and PCI-Express.
FPGA with Microprocessor Hard Core: I'm not familiar with these, but I would imagine that your design would center around the microcontroller programming, and the FPGA would augment the microcontroller. The parts you identified make it look like you would start your design with a microcontroller and an FPGA, and then combine the two into one chip/package.
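As a rough sketch of the power-up delay described above: the load time is bounded below by the bitstream size divided by the configuration bandwidth. The function name, bitstream size, clock rate, and bus width below are illustrative assumptions, not figures for any particular device, and real parts add power-on reset and initialization time on top.

```python
def config_time_ms(bitstream_bits, clock_hz, bus_width_bits=1):
    """Lower bound on configuration time in milliseconds.

    Assumes one bus-width word is transferred per configuration clock.
    """
    if bitstream_bits < 0 or clock_hz <= 0 or bus_width_bits <= 0:
        raise ValueError("sizes and rates must be positive")
    return 1000.0 * bitstream_bits / (clock_hz * bus_width_bits)


# Hypothetical example: a 4 Mbit bitstream read from serial flash at 20 MHz.
serial_ms = config_time_ms(4_000_000, 20_000_000, bus_width_bits=1)
# Same bitstream over a hypothetical 4-bit wide configuration interface.
quad_ms = config_time_ms(4_000_000, 20_000_000, bus_width_bits=4)
print(f"1-bit: {serial_ms:.0f} ms, 4-bit: {quad_ms:.0f} ms")
```

Plugging in the bitstream size and configuration clock from your device's datasheet gives a first estimate of whether the delay matters for your application.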
How to decide which is right for you:
The best way is to have your code (Verilog/VHDL) finished, and then use the vendor's tools to try and fit it into the smallest part possible. I know Altera's tool lets you change programming targets fairly easily, so you could keep picking smaller FPGAs, and then smaller CPLDs until your design usage gets close to about 75%. If you require performance, then try to pick devices that have features (fast multipliers) that decrease the speed requirements of the logic. Again, the vendor tools will help you identify if you need to upgrade or if you can downgrade.
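The "keep picking smaller parts until usage is about 75%" procedure can be sketched as a simple search. This is only an illustration: the part names and capacities are made up, and a single "logic cells" number is a simplifying assumption (CPLD macrocells and FPGA logic elements are not directly comparable). In practice the utilization figure comes from the vendor tool's fit report for each target.

```python
from typing import NamedTuple, Optional, Sequence


class Part(NamedTuple):
    name: str          # hypothetical part name
    logic_cells: int   # abstract capacity figure (assumption)
    user_io: int       # usable I/O pins


def smallest_fit(parts: Sequence[Part], cells_needed: int, io_needed: int,
                 max_utilization: float = 0.75) -> Optional[Part]:
    """Return the smallest part that has enough I/O and keeps logic
    utilization at or below max_utilization, or None if nothing fits."""
    candidates = [
        p for p in parts
        if io_needed <= p.user_io
        and cells_needed <= max_utilization * p.logic_cells
    ]
    return min(candidates, key=lambda p: p.logic_cells, default=None)


# Entirely hypothetical catalogue, for illustration only.
CATALOGUE = [
    Part("cpld_small", 64, 34),
    Part("cpld_large", 512, 100),
    Part("fpga_small", 5000, 90),
    Part("fpga_mid", 20000, 150),
]

print(smallest_fit(CATALOGUE, cells_needed=300, io_needed=40))
```

The 75% ceiling leaves headroom for later changes and makes routing easier, which is why the search stops short of a completely full part.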
Another factor in choosing a part is ease of use. Using PAL/PLA/GAL logic is probably more effort than constructing the function using discrete logic gates (74HC*, 4000, etc). CPLDs typically require only a single supply voltage, and don't require additional circuitry. They are effectively stand-alone. FPGAs begin to use multiple power supplies for I/O and the logic core, complex I/O standards, separate program memory, multi-layer (>2) PCBs, and BGA packages.
Steps to narrowing down your design requirements would include:
- Identify all inputs and outputs for your FPGA/CPLD. This is usually an easy part of the design stage. This way you know what package you're looking at, and how close you can cut it to that margin.
- Draw a block diagram of the internal logic. If your blocks look simple (each block would have a handful of logic gates and registers), then you probably can use a CPLD. If, however, your blocks have labels such as "Ethernet transceiver", "PCI-Express x16 interface", "DDR2 Controller", or "h264 Encode/Decode", then you are almost certainly looking at an FPGA and using HDL.
- Look and see if your interfaces have special I/O requirements, such as special voltages, LVDS, DDR, or high speed SERDES. It's easier to get a chip that supports it than to get an additional translator chip.
Example CPLD Applications:
- Multi-channel PWM with SPI interface
- I/O Expander
- CPU Address Space Decoding
- Clocks (Time keeping)
- Display Multiplexors
- Simple DSP
- Some simple programs can be converted into a CPLD design
Example Hobbyist FPGA Applications:
- Small System-on-Chip (SoC) designs
- Video
- Complex protocol bridges
- Signal processing
- Encryption/Decryption
- Legacy system emulation
- Logic Analyzer/Pattern Generator
For most hobbyist work, you'll be limited to relatively small FPGAs unless you want to solder BGA packages. I would choose between a large CPLD and a cheap FPGA, and the size/speed requirements would dictate which one I needed.
I'd say you're dreaming. The main problem will be the limited RAM.
In 2004, Eric Biederman managed to get a kernel booting with 2.5MB of RAM, with a lot of functionality removed.
However, that was on x86, and you're talking about ARM. So I tried to build the smallest possible ARM kernel, for the 'versatile' platform (one of the simplest). I turned off all configurable options, including the ones that you're looking for (USB, WiFi, SPI, I2C), to see how small it would get. Now, I'm just referring to the kernel here, and this does not include any userspace components.
The good news: it will fit in your flash. The resulting zImage is 383204 bytes.
The bad news: with 256kB of RAM, it won't be able to boot:
    $ size obj/vmlinux
       text    data     bss     dec     hex filename
     734580   51360   14944  800884   c3874 obj/vmlinux
The .text segment is bigger than your available RAM, so the kernel can't decompress, let alone allocate memory to boot, let alone run anything useful.
One workaround would be to use the execute-in-place support (CONFIG_XIP), if your system supports that (i.e., it can fetch instructions directly from flash). However, that means your kernel needs to fit uncompressed in flash, and 734kB > 700kB. Also, the .data and .bss sections total 66kB, leaving about 190kB for everything else (i.e., all dynamically-allocated data structures in the kernel).
That's just the kernel. Without the drivers you need, or any userspace.
So, yes, you're going to need a bit more RAM.
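The arithmetic behind those figures can be checked with a few lines. This is just a sketch of the budget, assuming "kB" means 1000 bytes (which matches the rounding used above); the conclusions are the same if you read it as 1024 bytes. The function names are illustrative.

```python
# Section sizes reported by `size obj/vmlinux` above.
TEXT, DATA, BSS = 734580, 51360, 14944

KB = 1000                # assumption: decimal kilobytes, as in the rounding above
RAM_BYTES = 256 * KB     # RAM available on the target
FLASH_BYTES = 700 * KB   # flash size used in the XIP comparison above


def fits_in_ram(text=TEXT, data=DATA, bss=BSS, ram=RAM_BYTES):
    """True if the whole uncompressed image could be held in RAM."""
    return text + data + bss <= ram


def xip_budget(text=TEXT, data=DATA, bss=BSS, ram=RAM_BYTES, flash=FLASH_BYTES):
    """For execute-in-place: (does .text fit in flash, RAM left after .data + .bss)."""
    return text <= flash, ram - (data + bss)


print("image total:", TEXT + DATA + BSS, "bytes")
print("fits in RAM:", fits_in_ram())
text_fits, ram_left = xip_budget()
print("XIP .text fits in flash:", text_fits, "| RAM left:", ram_left, "bytes")
```

Even in the XIP case the leftover RAM has to cover the stack, page tables, and every dynamic allocation, before any drivers or userspace are added.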
Best Answer
A textbook 32-bit RISC processor core capable of running the no-MMU version of Linux doesn't actually need to be that large. The real resource you need is far more RAM (tens of megabytes) than is available in any FPGA, so you'll probably want SDRAM on the board and a controller for that in the FPGA.
That said, if you want anything more than a trivial level of performance, you probably want a core with some optimizations (pipelining, etc.), and that starts to increase the size somewhat. Adding a full MMU will make memory (re-)allocation more efficient and enable the usual copy-on-write fork() behavior.
Both major FPGA vendors have soft processor cores with available Linux ports: MicroBlaze for Xilinx, Nios II for Altera. You should probably read their docs for specific platform recommendations, as it is of course a target that moves with time. A third-party core design might be somewhat larger for similar performance if it is written in a more portable way and not as specifically optimized for a given FPGA family.
Historically there have been chips available combining a hard processor core (often PowerPC) with a region of configurable FPGA fabric. Another option to look at would be a separate processor (likely ARM) on the same board as an FPGA.
A lot of the decision will depend on how tightly you need to couple the processor and FPGA. If you can reduce the problem to configuration registers and a stream of data, it could be as modular as hanging an FPGA board with a fast USB chip off the USB host port of an embedded Linux board like a BeagleBoard or Raspberry Pi. For tighter integration, you may want the FPGA on the same board and sitting on the processor's external bus. Or for low data rates, it's trivial to put an SPI register interface in an FPGA, and UART interfaces are entirely doable though a bit trickier.
Finally, there is the question of whether you actually need a full operating system such as Linux, or whether a more "micro-controller sized" embedded TCP stack would solve your problem while requiring less memory.
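To give a feel for how small an SPI register interface is, here is a behavioral model of one possible scheme. The frame layout is a made-up assumption for illustration (16-bit frames, MSB first: bit 15 = write flag, bits 14..8 = register address, bits 7..0 = data), not a standard and not taken from any vendor core. The FPGA side would be a shift register plus a small register file following the same rules.

```python
class SpiRegisterModel:
    """Behavioral model of a hypothetical SPI-accessible register file."""

    def __init__(self):
        self.regs = [0] * 128  # 7-bit address space, 8-bit registers

    def transfer(self, frame):
        """Process one 16-bit MOSI frame and return the 16-bit MISO word.

        On a read, the register value appears in the low 8 bits, i.e. it is
        shifted out after the address byte has been received.
        """
        if not 0 <= frame <= 0xFFFF:
            raise ValueError("frame must be a 16-bit value")
        is_write = bool((frame >> 15) & 1)
        addr = (frame >> 8) & 0x7F
        data = frame & 0xFF
        if is_write:
            self.regs[addr] = data
            return 0
        return self.regs[addr]


def write_frame(addr, value):
    """Build a write frame for the assumed layout."""
    return 0x8000 | ((addr & 0x7F) << 8) | (value & 0xFF)


def read_frame(addr):
    """Build a read frame for the assumed layout (data bits are don't-care)."""
    return (addr & 0x7F) << 8


fpga = SpiRegisterModel()
fpga.transfer(write_frame(0x10, 0xAB))
print(hex(fpga.transfer(read_frame(0x10))))
```

On the processor side, the same two frame builders could feed whatever SPI driver the embedded board provides.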