I'm basing my answer entirely on the code and documentation of the dvi_decoder module, and assuming it actually works as advertised. This file seems to be a (possibly modified) copy of the IP in the app notes "Video Connectivity Using TMDS I/O in Spartan-3A FPGAs" and/or "Implementing a TMDS Video Interface in the Spartan-6 FPGA". These app notes are chock-full of important details, and I suggest you read them carefully.
As you indicated in the question, I will assume you are dealing with unencrypted (non-HDCP) streams. I'm fairly certain that the information in the NeTV project can be adapted to decrypt HDCP, but it would involve a non-trivial amount of additional work and be on questionable legal grounds depending on your jurisdiction.
It looks like you will be able to obtain the data you need from the outputs of the dvi_decoder block. The block outputs 24-bit color information on the wires red, green and blue, synced to the pixel clock pclk. The outputs hsync and vsync signal the end of a line and of a screen, respectively. In general, you should be able to do on-the-fly averaging using these outputs.
You will need some basic logic to translate hsync, vsync and the pixel clock into an (X,Y) location. Just instantiate two counters, one for X and one for Y. Increment X at every pixel clock. Reset X to zero at hsync. Increment Y at every hsync. Reset Y to zero at every vsync.
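As a rough illustration of the counter logic above, here is a small behavioral model in Python (not HDL). It assumes active-high sync signals and reacts to their rising edges, since a sync pulse normally lasts more than one pixel clock; the class and method names are made up for this sketch. It also ignores blanking, so X and Y here are raw counts; in a real design you would probably offset them or gate the counters with a data-enable signal if the decoder provides one.

```python
class PixelCounter:
    """Behavioral model of the X/Y counters. One tick() call is one pclk edge."""

    def __init__(self):
        self.x = 0
        self.y = 0
        self._prev_hsync = False
        self._prev_vsync = False

    def tick(self, hsync, vsync):
        hsync, vsync = bool(hsync), bool(vsync)
        # Assumption: syncs are active-high, so act on the rising edge only.
        hsync_rise = hsync and not self._prev_hsync
        vsync_rise = vsync and not self._prev_vsync
        self._prev_hsync, self._prev_vsync = hsync, vsync

        # Y: reset at vsync, otherwise advance once per hsync.
        if vsync_rise:
            self.y = 0
        elif hsync_rise:
            self.y += 1

        # X: reset at hsync, otherwise advance every pixel clock.
        if hsync_rise:
            self.x = 0
        else:
            self.x += 1
        return self.x, self.y
```

Feeding it one (hsync, vsync) pair per pixel clock yields the running (X, Y) position that the averaging logic compares against.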
Using red, green, blue, X and Y, you can do on-the-fly averaging. By comparing X and Y against the box boundaries, you can determine which box each individual pixel should contribute to, if any. Sum the color values into an accumulation register. To obtain the average value, divide the value in the register by the number of pixels in the box. If you are smart, you will make sure the number of pixels is a power of two. Then you can just wire the MSBs of the register to whatever you want to drive.
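To make the power-of-two trick concrete, here is a small Python model of the accumulation. The function names, the box list format (top-left corners) and the box sizes are assumptions for this sketch only. Dividing by a power of two becomes a right shift, which in hardware is just taking the upper bits of the accumulator.

```python
def box_index(x, y, boxes, box_w, box_h):
    """Return the index of the box containing (x, y), or None if no box does."""
    for i, (x0, y0) in enumerate(boxes):
        if x0 <= x < x0 + box_w and y0 <= y < y0 + box_h:
            return i
    return None


def average_boxes(frame, boxes, box_w, box_h):
    """Average the RGB values of each box in one frame.

    frame is a list of rows, each row a list of (r, g, b) tuples.
    boxes is a list of (x0, y0) top-left corners (hypothetical layout).
    """
    n = box_w * box_h
    if n & (n - 1):
        raise ValueError("pixels per box must be a power of two")
    shift = n.bit_length() - 1  # log2(n)

    acc = [[0, 0, 0] for _ in boxes]  # one accumulator per box per component
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            i = box_index(x, y, boxes, box_w, box_h)
            if i is not None:
                acc[i][0] += r
                acc[i][1] += g
                acc[i][2] += b

    # Dividing by 2**shift keeps only the upper bits of each sum.
    return [tuple(c >> shift for c in a) for a in acc]
```

For 8-bit components and a 64x64 box, each accumulator needs 8 + 12 = 20 bits, and the average is simply its top 8 bits.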
Because we want to drive displays while doing the accumulation, we will need to do double buffering. So we will need two registers per box per component. If you are using a 25-LED string, this means you will need 25*3*2=150 registers. That's quite a bit, so you might want to use block RAM instead of registers. It all depends on your exact requirements, so experiment!
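A minimal Python sketch of the double-buffered accumulators, assuming the banks are swapped once per frame at vsync. The class and method names are hypothetical; in the FPGA the two banks would be two sets of registers or two regions of block RAM.

```python
class DoubleBufferedAccumulators:
    """Two banks of per-box RGB sums: one being written, one being read."""

    def __init__(self, n_boxes):
        self.banks = [[[0, 0, 0] for _ in range(n_boxes)] for _ in range(2)]
        self.write = 0  # index of the bank currently accumulating

    def add(self, box, r, g, b):
        """Accumulate one pixel into the write bank."""
        acc = self.banks[self.write][box]
        acc[0] += r
        acc[1] += g
        acc[2] += b

    def swap(self):
        """Call once per frame (assumed: at vsync). The finished sums become
        readable and the other bank is cleared for the next frame."""
        self.write ^= 1
        for acc in self.banks[self.write]:
            acc[:] = [0, 0, 0]

    def read(self, box):
        """Read the stable sums of the last completed frame."""
        return tuple(self.banks[self.write ^ 1][box])
```

The output logic only ever calls read(), so it never sees a half-accumulated frame.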
I assume you will be driving an LED string like the one used in the original Adafruit project kit. You should be able to figure out how to drive it over SPI from the values in the registers quite easily.
The dvi_decoder module is a fairly complex piece of kit. I suggest you study the app notes in detail.
As an aside, if you have not yet purchased an NeTV for use in this project, I recommend you also have a look at Digilent's Atlys board. With two HDMI inputs and two HDMI outputs, it appears to be tailor made for projects of this kind.
In programming, this technique is called double buffering. You have two memory buffers between the camera and the monitor. While the camera fills one of them, the other is displayed on the monitor (for however long that takes: 1, 2 or more frames). When the first buffer is full (the whole frame has been read from the camera), the two buffers are swapped: the second buffer is now filled from the camera and the first one is displayed on the monitor.
This way, some camera frames will be displayed for 2 monitor frames, and some for only 1 (if the camera is faster than half of the monitor frame rate) or for 3 (if the camera is slower than half of the monitor frame rate), but the synchronization takes care of itself.
I hope this explanation is clear enough and that I understood the problem correctly.
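The frame-count behavior can be checked with a small Python simulation. It is only a sketch under simplifying assumptions: both devices run at perfectly constant periods measured in the same integer time unit, the buffer swap is instantaneous, and at each refresh the monitor shows the most recently completed camera frame. The function name and parameters are made up for the example.

```python
from collections import Counter


def display_counts(camera_period, monitor_period, n_refreshes):
    """Return how many monitor refreshes each completed camera frame is shown for."""
    counts = Counter()
    for j in range(n_refreshes):
        t = j * monitor_period
        # Camera frame k is complete at time (k + 1) * camera_period.
        latest = t // camera_period - 1
        if latest >= 0:  # nothing to show until the first frame is complete
            counts[latest] += 1
    return [counts[k] for k in sorted(counts)]
```

With camera_period=5 and monitor_period=2 (a camera slower than half the monitor rate), display_counts(5, 2, 13) gives a repeating pattern of 2 and 3 refreshes per camera frame.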
Best Answer
Seems like a reasonable idea. However, the data will not enter the FPGA bit-serial, it will be deserialized and decoded in parallel; the FPGA fabric can't go fast enough to deal with bit-serial 1920x1080 HDMI. You'll need to decode the TMDS encoding in order to mess with the pixel data anyway. And you'll also have to re-encode and re-serialize the signal to send it out the door again.
The data path should run on the order of one pixel per clock cycle, so you can trivially swap out colors as necessary. I would recommend using an indexed color configuration: one 2D array of small numbers, perhaps 3 or 4 bits, that indexes into an array of RGB values. One of the values could then correspond to 'passing through' the original color. Or you could combine colors by adding, or even use one bit as a 'mask' bit to select whether or not to consider the original color.
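A minimal sketch of the indexed-color idea in Python, with index 0 reserved for 'pass through'. The palette contents, the choice of 0 as the pass-through index and the function names are all assumptions for illustration; in hardware this would be a small lookup table plus a multiplexer in the pixel path.

```python
PASS_THROUGH = 0  # assumed: index 0 means "keep the original pixel"

# Hypothetical palette; the entry at PASS_THROUGH is never looked up.
PALETTE = [
    None,             # 0: pass through
    (0, 0, 0),        # 1: black
    (255, 255, 255),  # 2: white
    (255, 0, 0),      # 3: red
]


def overlay_pixel(original, index, palette=PALETTE):
    """Select the original color or a palette color for one pixel."""
    if index == PASS_THROUGH:
        return original
    return palette[index]


def overlay_frame(frame, index_map, palette=PALETTE):
    """Apply a 2D map of small indices to a frame of (r, g, b) tuples."""
    return [
        [overlay_pixel(px, idx, palette) for px, idx in zip(row, idx_row)]
        for row, idx_row in zip(frame, index_map)
    ]
```

With 3 or 4 bits per entry the index map stays small, and the per-pixel work is a single table lookup, which fits the one-pixel-per-clock data path.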