8 Sensor Processing for Image Sensors

There are two primary types of sensors used to capture video data: CCD sensors and CMOS sensors. A significant difference between these two technologies is how the video data is read out. For CCD sensors the charge is shifted out, whereas for CMOS sensors the charge or voltage flows through column and row decoders, very much like a digital memory architecture.

CMOS sensors are becoming prevalent for a variety of reasons – not least because of their proliferation within consumer devices. This not only drives up the quality but also drives down the price. The latest iPhone has an 8 MP CMOS sensor – the kind that was found only in the most expensive digital SLR cameras just five years ago.

CMOS sensors are rapidly becoming the de facto standard for digital image capture. However, the output of an image sensor is not in a standard video format that can be processed directly: it cannot be resized (scaled), deinterlaced, or composited, for example.

The sensor image must first go through a processing pipeline that may include a range of functions such as pixel correction, noise reduction, Bayer-to-RGB conversion, color correction, gamma correction, and other functions, as shown in Figure 8.1.

Figure 8.1 A generic sensor processing pipeline for CMOS sensor processing.

Given the wide variety of image sensors used, and the range of application requirements, each processing pipeline will be unique. In many cases hardware customization is required to produce an optimum quality image.
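As a concrete example of a single stage in such a pipeline, the following minimal C sketch applies gamma correction to 8-bit samples using a precomputed lookup table. The function names and the choice of a 256-entry table are illustrative assumptions, not a description of any particular camera core.

#include <stdint.h>
#include <stddef.h>
#include <math.h>

/* Build a 256-entry lookup table implementing out = 255 * (in/255)^(1/gamma). */
static void build_gamma_lut(uint8_t lut[256], double gamma)
{
    for (int i = 0; i < 256; i++) {
        double v = pow(i / 255.0, 1.0 / gamma);
        lut[i] = (uint8_t)(v * 255.0 + 0.5);
    }
}

/* Apply the table to every 8-bit sample in a frame. */
static void gamma_correct(uint8_t *pixels, size_t count, const uint8_t lut[256])
{
    for (size_t i = 0; i < count; i++)
        pixels[i] = lut[pixels[i]];
}

A lookup table is used here because the same approach maps naturally to a small block RAM in hardware, avoiding a per-pixel power computation.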

8.1 CMOS Sensor Basics

A CMOS sensor generally consists of a grid of pixel sensors, each containing a photodetector and an active amplifier.

CMOS sensors record the image data in grayscale. Color information is gathered by applying a color filter over the pixel grid. Such a color filter array (CFA) allows only light of a given primary color (R, G, or B) to pass through – everything else is absorbed, as shown in Figure 8.2.

Figure 8.2 The Bayer color filter array that gives more green than red and blue data.

This means each pixel sensor collects information about only one color. This is important to recognize, as the data for each pixel is not composed of three colors (RGB) but of a single color. The other colors need to be “guessed” (interpolated) by the electronics behind the CMOS sensor.

There are many types of color filter arrays; the most common was invented by Dr. Bayer of Eastman Kodak. This color filter array, or the Bayer mosaic as it is more commonly known, utilizes a filter pattern that is half green, a quarter red and a quarter blue. This is based on the knowledge that the human eye is more sensitive to green.
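The mosaic pattern itself is simple to express in code. The short C sketch below reports which primary color a sensor with an RGGB Bayer layout samples at a given row and column; note that half of all positions return green, matching the half-green/quarter-red/quarter-blue split described above. The RGGB ordering is an assumption made for illustration, since real sensors may use RGGB, GRBG, GBRG or BGGR.

/* Color sampled at (row, col) of a Bayer mosaic, RGGB ordering assumed. */
typedef enum { CFA_RED, CFA_GREEN, CFA_BLUE } cfa_color_t;

static cfa_color_t bayer_color(int row, int col)
{
    if (row % 2 == 0)                       /* even rows:  R G R G ... */
        return (col % 2 == 0) ? CFA_RED : CFA_GREEN;
    else                                    /* odd rows:   G B G B ... */
        return (col % 2 == 0) ? CFA_GREEN : CFA_BLUE;
}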

The resultant image data does not have all three color components for each pixel. For some pixels we have only red data, for some only green and for others we have only blue. To do any meaningful video processing each pixel must have all the color plane data. So we must process this Bayer image to get three color planes for each pixel.

This processing is called Bayer-to-RGB conversion, or Bayer demosaicing, and it is the key standard function required in a sensor-processing signal chain. It is also one of the most computationally complex, particularly when implemented using a multi-tap filter.

Referring back to our discussion of video scaling, when you don’t have the pixel data, you must create it, not randomly through guesswork, but in an informed manner by looking at the data from the neighboring pixels (see Chapter 5 for details).

The simplest way would be to copy the missing color plane values from the nearest pixel of the required color and hope for the best. A second way would be to take an average of the neighboring pixels (see Figure 8.3).
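A minimal C sketch of the first (nearest-neighbor) approach, again assuming the RGGB layout and even frame dimensions: each output pixel simply reuses the red, green and blue samples found in its own 2 × 2 Bayer quad. The function name and buffer layout are illustrative.

#include <stdint.h>

/* Nearest-neighbor demosaic sketch (RGGB layout, even width/height assumed):
 * every pixel in a 2 x 2 Bayer quad reuses the R, G and B samples found
 * inside that quad.                                                        */
static void demosaic_nearest(const uint8_t *raw, uint8_t *rgb,
                             int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int qy = y & ~1;                            /* top-left corner  */
            int qx = x & ~1;                            /* of the 2x2 quad  */
            uint8_t r = raw[qy * width + qx];           /* R at (even,even) */
            uint8_t g = raw[qy * width + qx + 1];       /* G at (even,odd)  */
            uint8_t b = raw[(qy + 1) * width + qx + 1]; /* B at (odd,odd)   */
            uint8_t *out = &rgb[(y * width + x) * 3];
            out[0] = r;
            out[1] = g;
            out[2] = b;
        }
    }
}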

Figure 8.3 Interpolating to get the RGB value for each pixel.

In this figure focus on the center pixel: we will be looking at ways to calculate the missing color information for this pixel.

The figure shows four possible cases. In the first two cases the center pixel has only a green value, and in the next two cases the center pixel has either only the blue or only the red color information.

In (a) and (b), the missing red and blue color planes can be calculated by taking the average of the neighboring red and blue pixels, respectively. In (c) and (d), the center pixel has four green neighbors and four neighbors of the other missing color (blue when the center pixel is red, red when it is blue), so for a red center pixel the missing planes can be calculated as:

G = (G1 + G2 + G3 + G4) / 4 and B = (B1 + B2 + B3 + B4) / 4

with the corresponding red average used when the center pixel is blue.

This is called the bilinear interpolation technique, and it is essentially the same as the bilinear video scaling technique discussed earlier.
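As a sketch of what these averages look like in code, the following C fragment computes the missing green and blue values at an interior red pixel, assuming the RGGB layout used earlier; the blue- and green-center cases follow the same pattern with the appropriate neighbors. The function name and raw-buffer layout are illustrative.

#include <stdint.h>

/* Bilinear interpolation at an interior red pixel (RGGB layout assumed):
 * green is the average of the four direct neighbors, blue the average of
 * the four diagonal neighbors. Border pixels would need clamping.         */
static void interp_at_red(const uint8_t *raw, int width, int x, int y,
                          uint8_t *g_out, uint8_t *b_out)
{
    const uint8_t *p = &raw[y * width + x];

    *g_out = (uint8_t)((p[-width] + p[width] + p[-1] + p[1]) / 4);
    *b_out = (uint8_t)((p[-width - 1] + p[-width + 1] +
                        p[ width - 1] + p[ width + 1]) / 4);
}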

As in video scaling, the complexity of the interpolation can be increased to a 5 × 5, 6 × 6, or arbitrary n × n neighborhood. The disadvantage is the additional hardware resources needed to implement the larger filters.

8.2 A Simplistic HW Implementation of Bayer Demosaicing

These algorithms have to run at the pixel rate, which is very high for HD (high-definition) resolutions; 1080p60 video, for example, has an active pixel rate of roughly 1920 × 1080 × 60 ≈ 124 million pixels per second. FPGAs are therefore used to implement custom demosaicing algorithms, with the logic inside the FPGA structured as shown in Figure 8.4.

Figure 8.4 Line buffer implementation for Bayer Demosaicing.

The Bayer data from the camera is fed into custom line buffers built inside the FPGA. These line buffers are “n” wide, where “n” is the horizontal resolution of the sensor (how many pixels per line the sensor can provide).

The number of line buffers you need depends upon your Bayer demosaicing algorithm. A bilinear implementation may require just two line buffers, whereas a 5 × 5 interpolation needs four.
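In software terms, the same structure can be sketched as a pair of row buffers that always hold the two previous lines, so that a full 3 × 3 neighborhood is available as each new line arrives; this mirrors the hardware arrangement of Figure 8.4, with the shifting done by memcpy instead of dedicated line-buffer memories. The buffer names, fixed width and callback interface are illustrative assumptions.

#include <stdint.h>
#include <string.h>

#define MAX_WIDTH 1920   /* "n": pixels per line delivered by the sensor */

/* Two buffered lines plus the incoming line provide the 3 x 3 window. */
static uint8_t line_above[MAX_WIDTH];   /* line y - 2 */
static uint8_t line_mid[MAX_WIDTH];     /* line y - 1 */

/* Called once per incoming sensor line. window3x3() stands in for the
 * demosaicing arithmetic applied to the neighborhood centered on column x
 * of the middle line. The first two calls see zero-initialized history
 * lines; a real design would discard those start-up lines.               */
static void process_line(const uint8_t *incoming, int width,
                         void (*window3x3)(const uint8_t *above,
                                           const uint8_t *mid,
                                           const uint8_t *below,
                                           int x))
{
    for (int x = 1; x < width - 1; x++)
        window3x3(line_above, line_mid, incoming, x);

    /* Shift the buffers: the newest line becomes "previous" next time. */
    memcpy(line_above, line_mid, (size_t)width);
    memcpy(line_mid, incoming, (size_t)width);
}

A larger 5 × 5 kernel would simply add two more row buffers and widen the window passed to the callback.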

8.3 Sensor Processing in Military Electro-optical Infrared Systems

Military imaging systems are becoming increasingly sophisticated, incorporating multiple advanced sensors ranging from thermal infrared to visible, and even to UV (ultraviolet) focal planes. Not only do these sensor outputs need to be corrected, interpolated and so forth, often images from multiple sensors must be fused, overlaid, and further processed for local display and/or for transmission.

Figure 8.5 shows a high-level block diagram of a typical signal chain implemented in an electro-optical infrared (EOIR) system. As shown, the processed image is often compressed (usually using lossless techniques) before being transmitted over a communications link.

Figure 8.5 Typical top level structure of an EOIR system.

The first group of algorithms shown is responsible for the configuration and operation of the image sensor (also called focal plane array (FPA)). These algorithms include the generation of video timing and control signals for exposure control, readout logic, and synchronization.

Once this is completed, the pixel streams are processed by a second group of algorithms that addresses the imperfections of the focal plane. Functions such as non-uniformity correction, defective pixel replacement, noise filtering, and pixel binning may also be used to improve image quality. For a color-capable focal plane, demosaicing may be performed. The corrected video stream is then processed to implement functions such as automatic gain and exposure control, wide dynamic range (WDR) processing, white balancing and gamma correction.
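As one concrete example from this group, a two-point non-uniformity correction applies a per-pixel gain and offset obtained during flat-field calibration. The minimal C sketch below assumes 16-bit raw samples and precomputed floating-point gain/offset tables; a real FPGA implementation would use fixed-point arithmetic, and the names here are illustrative.

#include <stdint.h>
#include <stddef.h>

/* Two-point non-uniformity correction: corrected = gain * (raw - offset),
 * using per-pixel gain and offset tables measured during flat-field
 * calibration of the focal plane.                                        */
static void nuc_apply(const uint16_t *raw, uint16_t *corrected, size_t count,
                      const float *gain, const float *offset)
{
    for (size_t i = 0; i < count; i++) {
        float v = gain[i] * ((float)raw[i] - offset[i]);
        if (v < 0.0f)
            v = 0.0f;                 /* clamp to the valid 16-bit range */
        else if (v > 65535.0f)
            v = 65535.0f;
        corrected[i] = (uint16_t)(v + 0.5f);
    }
}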

In addition, FPGA-based camera cores are able to implement video processing algorithms that further enhance the output video. These processing stages may include functions such as image scaling (digital zoom), (de)warping, windowing, electronic image stabilization, super-resolution, external video overlay, image fusion, and on-screen display. In some cases, the captured and processed video stream may need to be compressed before it is transmitted to a control station.

The EOIR system implements high-quality sensor control and image processing within a tightly constrained power budget, while retaining the programmability required for last-minute specification changes and field upgradeability.

Combining exceptional image quality with low power consumption is the key challenge when designing EOIR systems. For hand-held and wearable systems, such as night-vision goggles (NVGs) or weapon sights, the critical specification is often the number of hours a unit can run on AA batteries.

Low-power FPGAs are the platform of choice for almost all state-of-the-art EOIR systems because they meet the need for programmability, real-time video processing performance, and low power consumption. In practice, each successive generation of low-power FPGAs has achieved both lower static and lower dynamic power consumption by combining architectural enhancements with lower core voltages. As process technology continues to shrink, the power consumed by these FPGAs has been dropping by as much as 30–50% per generation (as shown in Figure 8.6).

Figure 8.6 Power consumption trend of today’s leading FPGAs.

8.4 Conclusion

There are various other sensor processing functions that can be, and are, applied to image sensor data. Some of these functions include:

• Digital zoom/binning.
• Noise filtering.
• Non-uniformity correction.
• Wide dynamic range processing.
• Local-area adaptive contrast enhancement.
• Pixel-level adaptive image fusion.
• Electronic image stabilization.
• Super-resolution.
• Motion detection and multi-target tracking.

It is beyond the scope of an introductory text such as this to delve into all of these functions. Since most of the image sensors you will work with will be CMOS sensors with a Bayer output format, it is important to understand how this data is processed to create full three-color-plane information for each pixel.

High-performance sensor processing at very low power is increasingly important in various military applications, and we have seen that this kind of processing requires the inherently parallel architecture of FPGAs.
