A depth camera measures the distance from its sensor to the objects in front of it, providing information about the 3D structure of the scene. This information makes it much easier to analyze and recognize 3D objects, including humans and their body parts. As a result, depth cameras are today the sensors most commonly used for touchless interaction in interactive projects. In this chapter, we will learn how to use depth cameras in openFrameworks projects using the ofxOpenNI addon. We will also consider an example of using depth images to make a flat surface, such as a wall, a table, or a floor, interactive.
The topics covered are as follows:
A depth camera is a device that captures depth images. The value of each pixel in a depth image encodes not brightness or color, but the distance from the camera to the corresponding part of the scene. The main types of such cameras are the following:
A depth image is computed from the stereo correspondence between the projected infrared pattern and the captured image. As with passive stereo cameras, the measuring accuracy decreases with distance. Such cameras work well indoors, in both bright and dark environments.
Because they use low-energy lasers, such cameras do not work outdoors in direct sunlight. They also see transparent objects, such as glass, and light sources, such as lamps, poorly.
Today, these are the cheapest depth cameras. They are used in entertainment and gesture-controlled applications, as well as in many kinds of robotics and interactive experiments.
In this chapter, we will consider only active infrared stereo cameras. They work indoors, have an advanced SDK (OpenNI), and cost about $200. Time-of-flight cameras and passive stereo cameras can be considered more powerful because they also work outdoors, but currently their prices start at about $1,800.
There are several depth camera lines from different vendors: Microsoft Kinect, Asus Xtion, and PrimeSense Carmine. Each line, in turn, includes several camera models. Most of the cameras used today share the following characteristics:
The most notable differences between the cameras are in connectivity and size. The Microsoft Kinect connects to both USB 2.0 and USB 3.0 ports, but is quite big. The Asus Xtion and PrimeSense Carmine are smaller and hence more convenient for mounting, but currently have some issues when connected to USB 3.0 ports.
The depth images from these cameras can be used for 3D scene analysis, including human body recognition and hand gesture analysis. These capabilities are implemented in OpenNI, an open, cross-platform library developed by the not-for-profit OpenNI (Open Natural Interaction) consortium; see http://www.openni.org.
OpenNI, and particularly its subpart called NiTE, is centered on analyzing and recognizing human postures and gestures. If you need other 3D object processing capabilities, such as searching for specific shapes like spheres and cylinders, or stitching together data from multiple depth images, you should additionally use PCL, an open library for working with 3D point clouds obtained from depth images.
New depth cameras are expected to appear in the near future. We believe that the major principles discussed in this chapter will be applicable to these cameras too.
The simplest way to use OpenNI in an openFrameworks project is to link the ofxOpenNI addon. Let's discuss how to install the addon and explore its examples.
Note that openFrameworks has a core addon, ofxKinect, for working with Microsoft Kinect cameras. Currently, it does not use OpenNI. This addon is good for projects that use the depth image or the 3D point cloud obtained from the camera; for details, see the openFrameworks example examples/addons/kinectExample. In this chapter, we will use an OpenNI-based solution (implemented in the ofxOpenNI addon) because it has additional capabilities, such as tracking users and recognizing gestures, and works with all depth camera models.