Raw RGB from CMOS image sensor to grayscale conversion

image-sensor image-processing rgb

I am trying to produce an image with an OV7740 color image sensor. As far as I know, the sensor is configured by default to provide 10-bit parallel RAW RGB data (due to its on-chip Bayer filter). How should this data be processed/scaled in order to obtain grayscale image data?

EDIT:
I guess the discussion of my issue can be found here.

Best Answer

A top-quality demosaic process is fraught with subtlety, but a very cheap answer can be had if you can accept a 50% loss in spatial resolution. Simply sum groups of four sensor pixels to get a single grey pixel. Assuming your sensor has a "typical" Bayer filter pattern, there will be twice as many green sensors as either red or blue. Summing square groups of four pixels will then make Y = 2*G + R + B, which does over-emphasize the blue component, but for simple cases like verifying optical system focus that will not be an issue.

So for a typical arrangement of pixels which looks like this:

G R G R G R....
B G B G B G
G R G R G R
B G B G B G
.
.
.

You would get one Y output for each group of four, at a lower spatial resolution:

Y   Y   Y

Y   Y   Y

The implementation of this can be done in hardware (as in an FPGA) with only enough memory to store half a scan line: sum the input pixels in pairs on the first line of each row pair, store those pair sums, then add the corresponding pair sums from the second line to produce the Y outputs.
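In software, the same quarter-resolution binning can be written directly. Below is a minimal sketch in C, assuming the GRBG layout shown above, 10-bit samples stored one per 16-bit word, and a complete frame already captured to RAM; the function name and buffer layout are illustrative, not part of the OV7740 interface.

#include <stddef.h>
#include <stdint.h>

/* Sum each 2x2 group of a GRBG Bayer frame: Y = G + R + B + G = 2*G + R + B.
 * src holds width*height raw samples; dst holds (width/2)*(height/2) sums.
 * With 10-bit inputs the sum fits comfortably in 12 bits. */
void bayer_bin2x2(const uint16_t *src, uint16_t *dst,
                  size_t width, size_t height)
{
    for (size_t y = 0; y + 1 < height; y += 2) {
        const uint16_t *row0 = src + y * width;        /* G R G R ... */
        const uint16_t *row1 = src + (y + 1) * width;  /* B G B G ... */
        for (size_t x = 0; x + 1 < width; x += 2) {
            uint16_t sum = row0[x] + row0[x + 1]       /* G + R */
                         + row1[x] + row1[x + 1];      /* B + G */
            dst[(y / 2) * (width / 2) + (x / 2)] = sum;
        }
    }
}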

You could also generate as many Y pixels as there are inputs by using a sliding 2x2 window; in hardware that would require a full scan line's worth of memory.
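A sketch of that sliding-window variant, under the same assumptions as above. It relies on the fact that every 2x2 window of a Bayer mosaic covers one red, one blue, and two green sites, so the sum is still 2*G + R + B.

/* One Y per input pixel, except along the last row and column. */
void bayer_window2x2(const uint16_t *src, uint16_t *dst,
                     size_t width, size_t height)
{
    for (size_t y = 0; y + 1 < height; y++) {
        for (size_t x = 0; x + 1 < width; x++) {
            dst[y * width + x] = src[y * width + x]
                               + src[y * width + x + 1]
                               + src[(y + 1) * width + x]
                               + src[(y + 1) * width + x + 1];
        }
    }
}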

Doing it "right" involves a number of processing steps, but the core operation is to apply spatial filters to estimate separate R, G, and B monochrome images at identical resolution, with care taken to align them properly. Given those aligned images, it is straightforward to compute Y at each pixel.
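That final luma step might look like the following sketch, which assumes some demosaic stage (the hard part, not shown) has already produced aligned full-resolution r, g, and b planes; the weights are the common BT.601 luma coefficients in 8.8 fixed point.

/* Y = 0.299*R + 0.587*G + 0.114*B, approximated with integer weights
 * that sum to 256 so a shift replaces the divide. */
void rgb_to_luma(const uint16_t *r, const uint16_t *g, const uint16_t *b,
                 uint16_t *y, size_t npixels)
{
    for (size_t i = 0; i < npixels; i++) {
        uint32_t acc = 77u * r[i] + 150u * g[i] + 29u * b[i];
        y[i] = (uint16_t)(acc >> 8);
    }
}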

An implementation of a fairly sophisticated demosaic system can be found in dcraw, which consumes raw sensor data and emits PPM or TIFF images. Raw sensor data can be had from many higher-end digital cameras, but it is usually encumbered by trade secrets, making its interpretation difficult for third parties. Dave Coffin and his user community have put a great deal of effort into reverse engineering the raw file formats written by such cameras, but dcraw can also be used in your context.

When developing camera projects in the past, I have taken raw pixels straight from CMOS sensors as captured in RAM, dumped them to a file, and passed the file through dcraw.

A related problem for a prototype camera is getting your first prototype optics into focus, so that the pixels can mean something and you can prove the whole thing works by showing a finished image of your bench.

The first step is to find and use the test-pattern modes in your sensor. With that, you can usually get reference patterns with known content to test your electronics and data flow. Then you can switch to real optics with some confidence that the raw pixels are meaningful and concentrate on getting the optics to give you an in-focus image.

A technique I've used in the past for a camera based on an M12-mount lens holder was to bring the lens into fairly decent focus with the lens holder mounted on an evaluation board whose software provided a live view on a PC. Then I locked down the lens and moved the whole lens holder to my prototype. Due to slight variations in sensor position relative to the PCB and mounting hole it wasn't perfect, but it was good enough to get a recognizable image.