If you look through HDMI projects on a site like Hackaday, you'll find that just about every one of them involves an FPGA. I don't think I have seen a single DIY project with HDMI output that hasn't used one.
But why? As far as I can tell, FPGAs are expensive, around $70-$100. Compare that to a Raspberry Pi for $35, which can do far more complex things and outputs HDMI. Why isn't an ARM chip used? Or an even cheaper microcontroller?
In the case of upgrading the video on old game systems, the logic shouldn't be any more complicated than a cheap microcontroller can handle, yet I keep seeing HDMI treated as an impossible hurdle tackled only with FPGAs.
Basically, no microcontroller, not even the Raspberry Pi, is fast enough. The Raspberry Pi has an onboard GPU that generates its HDMI output. Beyond that, the I/O capability of the Raspberry Pi is incredibly limited - the highest-bandwidth interface aside from HDMI is USB. Many of the HDMI conversion projects involve taking another video stream in a strange format and reworking it into something that can be sent to a standard HDTV over HDMI. This requires custom interfacing logic to read in the video signal, signal processing logic to reformat it, TMDS encoding logic for HDMI, and then high-speed serializers to actually drive the HDMI port.
Working with streaming, uncompressed, high-definition video requires processing a huge amount of data, something which is not feasible on a general-purpose CPU. A 1080p video signal at 30 frames per second carries about 62 million pixels per second. The Raspberry Pi runs at 700 MHz, so you have, oh, 11 instructions per pixel. And that's 11 instructions to read in the oddball video format in real time, rescale it, etc., etc. Not possible. Period.
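The back-of-the-envelope budget works out like this (assuming, optimistically, roughly one instruction retired per cycle on the Pi's in-order ARM core):

```python
# Throughput budget for processing 1080p30 video in software
# on a 700 MHz CPU, assuming ~1 instruction per cycle.

width, height, fps = 1920, 1080, 30
cpu_hz = 700_000_000

pixels_per_second = width * height * fps          # 62,208,000
instructions_per_pixel = cpu_hz / pixels_per_second

print(f"{pixels_per_second:,} pixels/s")
print(f"~{instructions_per_pixel:.1f} instructions per pixel")
```

Eleven instructions is barely enough to load a pixel and store it again, never mind deinterlacing, scaling, or format conversion.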
On an FPGA, it is possible to build a long processing pipeline that handles one or more pixels per clock cycle and does so in a highly deterministic manner (no interrupts or task switching!), so that the pixel data is ready for transmission over HDMI at exactly the right time. If you have worked extensively with general-purpose CPUs running any sort of operating system, you will know that getting accurate timing at the millisecond level is more or less doable, but at the microsecond level it is pretty much impossible. For HDMI, you need nanosecond-scale precision. Not doable on a general-purpose CPU. Also, take a look at the HDMI audio/video project for the Neo Geo. That one not only has to rescale the video, it also has to resample the audio and insert it into the HDMI video stream, which requires some extremely precise orchestration to get working correctly.
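To put "nanosecond-scale" in concrete numbers: each TMDS lane carries 10 bits per pixel clock, so even at 1080p30 (74.25 MHz pixel clock per the standard CEA-861 timing) the bit period is on the order of a nanosecond:

```python
# Bit timing on one TMDS lane at 1080p30.
pixel_clock_hz = 74.25e6        # standard CEA-861 pixel clock for 1080p30
bits_per_pixel_per_lane = 10    # TMDS maps each 8-bit value to a 10-bit symbol

bit_rate = pixel_clock_hz * bits_per_pixel_per_lane   # bits/s per lane
bit_period_ns = 1e9 / bit_rate

print(f"{bit_rate/1e6:.1f} Mbit/s per lane, {bit_period_ns:.2f} ns per bit")
```

At 1080p60 the pixel clock doubles to 148.5 MHz and the bit period drops to about 0.67 ns.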
And this still isn't considering the custom logic required to read in whatever input data format you have; you'll need custom hardware to interpret it. Software is neither fast enough nor deterministic enough. You might be able to, say, reformat it into some sort of USB-based stream, but that would require custom digital logic anyway, so you might as well just output HDMI directly.
To implement all of this, digital logic is really the only feasible solution. And if you are doing digital logic, FPGAs are the only feasible option: the job is too fast and too complex for discrete 7400-series logic, and ASICs are, well, several orders of magnitude more expensive.
Another required component is the set of high-speed serializers and differential drivers that generate the serial data streams sent down the cable. It's not possible to bit-bang serial data on the order of a gigabit per second from a general-purpose CPU; this requires specialized hardware. The Raspberry Pi's onboard GPU does this, but it is limited in terms of what it is capable of, not to mention what is documented. Most FPGAs contain at least the differential drivers and DDR flip-flops needed to support low-resolution video, and quite a few FPGAs also contain the serializers (e.g. Xilinx OSERDES blocks) needed to generate full-HD streams. Don't forget that the serial stream is not 'baseband' like a normal serial port, where the actual data is sent verbatim with some framing information; the data is encoded with TMDS (transition-minimized differential signaling) to give the signal certain electrical characteristics, so a bit of logic is required on top of the actual high-speed serializers. All of this is relatively simple to do in pure digital logic (well, the encoding anyway - serializers are arguably analog, or at least mixed-signal) on either an ASIC or an FPGA.
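To give a feel for what that encoding logic does, here is a software sketch of the TMDS video-data coding from the DVI 1.0 spec (the same scheme HDMI uses for active pixels). In hardware this is just an XOR/XNOR chain and a small running-disparity counter, producing one 10-bit symbol per pixel clock:

```python
def tmds_encode(byte, disparity):
    """Encode one 8-bit value into a 10-bit TMDS symbol (DVI 1.0 coding).
    `disparity` is the lane's running ones-minus-zeros count;
    returns (symbol, new_disparity)."""
    bits = [(byte >> i) & 1 for i in range(8)]
    ones_in = sum(bits)

    # Stage 1: minimize transitions with an XOR or XNOR chain.
    use_xnor = ones_in > 4 or (ones_in == 4 and bits[0] == 0)
    q = [bits[0]]
    for i in range(1, 8):
        b = q[-1] ^ bits[i]
        q.append(1 - b if use_xnor else b)
    q.append(0 if use_xnor else 1)   # q[8] tells the decoder which chain was used

    # Stage 2: DC-balance by conditionally inverting the low 8 bits.
    ones = sum(q[:8])
    zeros = 8 - ones
    if disparity == 0 or ones == zeros:
        invert = q[8] == 0
        disparity += (zeros - ones) if invert else (ones - zeros)
    elif (disparity > 0) == (ones > zeros):
        invert = True
        disparity += 2 * q[8] + (zeros - ones)
    else:
        invert = False
        disparity += -2 * (1 - q[8]) + (ones - zeros)

    low = [1 - b for b in q[:8]] if invert else q[:8]
    symbol = (int(invert) << 9) | (q[8] << 8) | sum(b << i for i, b in enumerate(low))
    return symbol, disparity


def tmds_decode(symbol):
    """Inverse of tmds_encode (handy for checking it)."""
    low = [(symbol >> i) & 1 for i in range(8)]
    if (symbol >> 9) & 1:            # undo the DC-balance inversion
        low = [1 - b for b in low]
    used_xor = (symbol >> 8) & 1
    out = low[0]
    for i in range(1, 8):
        b = low[i] ^ low[i - 1]
        out |= (b if used_xor else 1 - b) << i
    return out
```

The running disparity stays bounded (within ±8 here), which is what gives the line its DC balance; the receiver only needs the top two bits of each symbol to undo both stages.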
It's actually a very important part of the overall digital/embedded system design process to figure out what parts of a system can be implemented in software and which ones require hardware, either in the form of off-the-shelf specialized chips, FPGAs, custom ASICs, hard or soft IP (HDL, netlists, GDSII), etc. In this case it is clear-cut: video signal generation requires specialized hardware, be it a GPU paired with a general purpose CPU, an FPGA with an integral hard or soft CPU core or paired with an external CPU, or something more specialized.
Edit: I just realized that the fpga4fun site and the Neo Geo video project both run at 640x480 instead of full HD. However, this doesn't really make the whole operation much simpler. The minimum HDMI pixel clock is 25 MHz, with a bit clock of 250 MHz. At that rate the FPGA actually does not require serializers to transmit HDMI, only DDR flip-flops. This still does not alleviate the issue of reading in the video data, though. If you wanted to do that on the Raspberry Pi with no hardware assistance, you would have to read from GPIO continuously at 25 MHz. That is one read every 28 instructions. Entering the realm of possibility, but the only way you would make that work is on bare metal (no Linux) with hand-coded assembly.
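The 640x480 numbers come from the standard VGA-style timing - 800x525 total pixels per frame (active plus blanking) at 60 Hz - and the per-read budget on a 700 MHz CPU works out as follows:

```python
# 640x480@60 timing budget: 800x525 total pixels per frame
# including blanking, which yields the familiar ~25 MHz pixel clock.

h_total, v_total, fps = 800, 525, 60
cpu_hz = 700e6

pixel_clock = h_total * v_total * fps        # ~25.2 MHz
bit_clock = pixel_clock * 10                 # TMDS: 10 bits per pixel per lane
instructions_per_pixel = cpu_hz / pixel_clock

print(f"pixel clock {pixel_clock/1e6:.1f} MHz, bit clock {bit_clock/1e6:.0f} MHz")
print(f"~{instructions_per_pixel:.0f} CPU instructions per pixel at 700 MHz")
```

Roughly 28 instructions per GPIO read is enough to sample and store a value, but leaves essentially nothing for processing it - hence bare metal and hand-coded assembly at best.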