ROS packages useful for Computer Vision tasks

The great advantage of doing Computer Vision in ROS is the fact that we do not have to re-invent the wheel. A lot of third-party software is available, and we can also connect our vision algorithms to real robots or run them in simulation. Here, we are going to enumerate interesting Computer Vision tools for the most common visual tasks, but we will only explain one of them in detail, including all the steps to set it up.

We will do this for visual odometry, but the other packages are also easy to install and start playing with; simply follow the tutorials or manuals at the links provided here:

  • Visual Servoing: Also known as Vision-based Robot Control, this is a technique that uses feedback information obtained from a vision sensor to control the motion of a robot, typically an arm used for grasping. In ROS, we have a wrapper of the Visual Servoing Platform (ViSP) software (http://www.irisa.fr/lagadic/visp/visp.html and http://www.ros.org/wiki/vision_visp). ViSP is a complete cross-platform library that allows you to prototype and develop applications in visual tracking and visual servoing. The ROS wrapper provides a tracker that can be run with the visp_tracker (moving edge tracker) node, as well as visp_auto_tracker (a model-based tracker). It also helps to calibrate the camera and perform hand-to-eye calibration, which is crucial for visual servoing in grasping tasks.
  • Augmented Reality: An Augmented Reality (AR) application involves overlaying virtual imagery on the real world. A well-known library for this purpose is ARToolkit (https://www.hitl.washington.edu/artoolkit/). The main problem in this application is tracking the user's viewpoint so that the virtual imagery is drawn from the viewpoint where the user is looking in the real world. The ARToolkit video tracking libraries calculate the real camera position and orientation relative to physical markers in real time. In ROS, we have a wrapper named ar_pose (http://www.ros.org/wiki/ar_pose). It allows us to track single or multiple markers on which we can render our virtual imagery (for example, a 3D model).
  • Perception and object recognition: Most basic perception and object recognition is possible with the OpenCV libraries. However, there are several packages that provide an object recognition pipeline, such as the object_recognition stack, which provides tabletop_object_detector to detect objects on a table; a more general solution is provided by the Object Recognition Kitchen (ORK), which can be found at http://wg-perception.github.io/object_recognition_core. It is also worth mentioning a tool called RoboEarth (http://www.roboearth.org), which allows you to detect and build 3D models of physical objects and store them in a global database accessible to any robot (or human) worldwide. The stored models can be 2D or 3D and can be used to recognize similar objects and their viewpoint, that is, to identify what the camera/robot is looking at. The RoboEarth project is integrated into ROS, and many tutorials are provided to get a running system (http://www.ros.org/wiki/roboearth), although it does not officially support the latest versions of ROS.
  • Visual odometry: A visual odometer is an algorithm that uses the images of the environment to track features and estimate the robot's movement, assuming a static environment. It can estimate the six-DoF pose of the robot with a monocular or stereo system, but it may require additional information in the monocular case. There are two main libraries for visual odometry: libviso2 (http://www.cvlibs.net/software/libviso2.html) and libfovis (http://www.ros.org/wiki/fovis_ros), both of which have wrappers for ROS. The wrappers simply expose these libraries to ROS; they are the viso2 and fovis packages, respectively. In the following section, we will see how we can do visual odometry with our homemade stereo camera using the viso2_ros node of viso2. We also show how to install these packages, which at the moment must be done from source because they do not officially support ROS Kinetic; however, there is no alternative for them integrated into ROS. The libviso2 library allows us to do monocular and stereo visual odometry. However, for monocular odometry, we also need the pitch and heading for the ground plane estimation. You can try the monocular case with one camera and an IMU (see Chapter 4, 3D Modeling and Simulation), but you will always obtain better results with a good stereo pair, correctly calibrated, as seen so far in this chapter. Finally, libfovis does not support the monocular case, but it does support RGBD cameras, such as the Kinect sensor (see Chapter 6, The Navigation Stack – Beyond Setups). Regarding the stereo case, it is possible to try both libraries and see which one works better in your case. Here, we will give you a step-by-step tutorial on how to install and run viso2 in ROS and fovis with the Kinect.

Visual odometry

Visual odometry is the name given to algorithms that use vision to estimate the relative displacement of a mobile robot or sensor. Odometry accumulates the consecutive relative displacements to give a global estimate of the robot or sensor pose in the environment with respect to the initial pose, but bear in mind that it accumulates drift, because the error in each relative displacement adds up and grows without bound. To solve this problem, a global localization or loop closure detection algorithm is needed. This is one of the components of a Visual SLAM system, which also needs visual odometry to provide an initial guess for the pose estimation.
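To make the drift argument concrete, here is a short formulation in our own notation (it is not part of the original text): if $T_k$ denotes the estimated relative displacement between frames $k-1$ and $k$, the odometry estimate of the pose at frame $N$ is the composition of all the relative displacements, so an error in any single $T_k$ is propagated into every subsequent pose:

$$\hat{T}_{0,N} = T_1 \, T_2 \cdots T_N = \prod_{k=1}^{N} T_k$$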

Using visual odometry with viso2

In order to use viso2, go to your catkin workspace (~/catkin_ws) and use the following commands:

$ cd src
$ wstool init
$ wstool set viso2 --git git://github.com/srv/viso2.git
$ wstool update

Now, to build it, run the following command:

$ cd ..
$ catkin_make

Once it is built, we set up our environment by using the following command:

$ source devel/setup.bash

Now we can run the viso2_ros nodes, such as stereo_odometer, which is the one we are going to use here. But before that, we need to publish the frame transform between our camera and the robot or its base link. The stereo camera driver is already prepared for this, but we will explain how it is done in the following sections.
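Before dealing with those transforms, here is for reference a minimal sketch of how a stereo_odometer node can be declared in a launch file; the remapping arguments follow the viso2_ros documentation, but the topic and frame names used here are assumptions that you should adapt to your own setup (our camera_stereo.launch already takes care of this when odometry is enabled):

<node pkg="viso2_ros" type="stereo_odometer" name="stereo_odometer" output="screen">
  <!-- Resolve the stereo namespace and the rectified image topic name -->
  <remap from="stereo" to="stereo"/>
  <remap from="image" to="image_rect"/>
  <!-- Frames used when publishing the odometry (assumed names) -->
  <param name="odom_frame_id" value="/odom"/>
  <param name="base_link_frame_id" value="/base_link"/>
</node>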

Camera pose calibration

In order to set the transforms between the different frames in our robot system, we must publish the tf messages for those transforms. The most appropriate and generic way to do this is to use the camera_pose stack; we will use the latest version from the repository at https://github.com/jbohren-forks/camera_pose. This stack offers a series of launch files that calibrate the camera poses with respect to each other. It comes with launch files for two, three, or more cameras. In our case, we only have two cameras (stereo), so we proceed as follows. First, we extend camera_stereo.launch with a calibrate_pose argument that calls calibration_tf_publisher.launch from camera_pose:

<include file="$(find camera_pose_calibration)/blocks/calibration_tf_publisher.launch">
  <arg name="cache_file" value="/tmp/camera_pose_calibration_cache.bag"/>
</include>
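A minimal sketch of how the calibrate_pose argument can gate this include inside camera_stereo.launch is shown below; the argument name matches the command that follows, while the default value is an assumption:

<arg name="calibrate_pose" default="false"/>
<!-- Start the calibration tf publisher only when calibrate_pose:=true -->
<include if="$(arg calibrate_pose)"
         file="$(find camera_pose_calibration)/blocks/calibration_tf_publisher.launch">
  <arg name="cache_file" value="/tmp/camera_pose_calibration_cache.bag"/>
</include>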

Now, run the following command:

$ roslaunch chapter5_tutorials camera_stereo.launch calibrate_pose:=true

The calibration_tf_publisher will publish the frame transforms (tf) as soon as the calibration has been performed correctly. The calibration is similar to the one we have seen so far, but it uses the specific tools from camera_pose, which are run using the following command:

$ roslaunch camera_pose_calibration calibrate_2_camera.launch camera1_ns:=/stereo/left camera2_ns:=/stereo/right checker_rows:=6 checker_cols:=8 checker_size:=0.03

With this call, we can use the same calibration pattern we used with our previous calibration tools. However, this approach requires the images to be static; bars move from one side of the image to the other and turn green once the images from all the cameras have been static for a sufficient period of time. This is shown in the following screenshot:

Camera pose calibration

With our noisy cameras, we will need support for the calibration pattern, such as a tripod or a panel, as shown in the following image:

Camera pose calibration

Then, we can calibrate, as shown in the following screenshot:

Camera pose calibration

This creates the tf transform from the left camera to the right camera. However, although this is the most appropriate way to perform the camera pose calibration, we are going to use a simpler approach that is sufficient for a stereo pair. This is also what viso2 requires, as it needs the frame of the whole stereo pair as a single unit/sensor; internally, it uses the stereo calibration results of cameracalibrator.py to retrieve the baseline.

We have a launch file that uses static_transform_publisher to publish one transform from the camera link to the base link (robot base) and another from the camera link to the camera optical link, because the optical frame requires a rotation; recall that the camera optical frame has the z axis pointing forward out of the camera's optical lens, while the other frames (world, navigation, or odometry) have the z axis pointing up. This launch file is in launch/frames/stereo_frames.launch:

<launch>
  <arg name="camera" default="stereo" />

  <arg name="baseline/2" value="0.06"/>
  <arg name="optical_translation" value="0 -$(arg baseline/2) 0"/>

  <arg name="pi/2" value="1.5707963267948966"/>
  <arg name="optical_rotation" value="-$(arg pi/2) 0 -$(arg pi/2)"/>

  <node pkg="tf" type="static_transform_publisher" name="$(arg camera)_link"
  args="0 0 0.1 0 0 0 /base_link /$(arg camera) 100"/>
  <node pkg="tf" type="static_transform_publisher" name="$(arg camera)_optical_link"
  args="$(argoptical_translation) $(argoptical_rotation) /$(arg camera) /$(arg camera)_optical 100"/>
</launch>

This launch file is included in our stereo camera launch file and publishes these static frame transforms. Hence, we only have to run the following command for them to be published:

$ roslaunch chapter5_tutorials camera_stereo.launch tf:=true

Then, you can check whether they are being published in rqt_rviz with the tf display, as we will see next when running viso2; you can also use rqt_tf_tree for this (see Chapter 3, Visualization and Debugging Tools).
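For a quick check from the command line, you can also query the transform chain directly; the frame names used here are the ones published by the launch file shown above:

$ rosrun tf tf_echo /base_link /stereo_optical
$ rosrun rqt_tf_tree rqt_tf_tree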

Running the viso2 online demo

At this point, we are ready to run the visual odometry algorithm: our stereo pair is calibrated, its frames have the appropriate names for viso2 (ending with _optical), and the tf transforms for the camera and optical frames are being published. However, before using our own stereo pair, we are going to test viso2 with the bag files provided at http://srv.uib.es/public/viso2_ros/sample_bagfiles/; just run bag/viso2_demo/download_amphoras_pool_bag_files.sh to obtain all the bag files (about 4 GB in total). Then, we have launch files for both the monocular and stereo odometers in the launch/visual_odometry folder. In order to run the stereo demo, we have a launch file on top of them that plays the bag files and also allows you to inspect and visualize their contents. For instance, to calibrate the disparity image algorithm, run the following command:

$ roslaunch chapter5_tutorials viso2_demo.launch config_disparity:=true view:=true

You will see the left, right, and disparity images, along with the rqt_reconfigure interface to configure the disparity algorithm. You need to perform this tuning because the bag files only contain the RAW images. We have found good parameters, which are stored in config/viso2_demo/disparity.yaml. In the following screenshot, you can see the results obtained with them, where you can clearly appreciate the depth of the rocks in the stereo images:

Running the viso2 online demo
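For reference, the parameters tuned through rqt_reconfigure are the stereo_image_proc disparity parameters. The following YAML sketch lists them with illustrative values (roughly the stock defaults), not the exact values stored in config/viso2_demo/disparity.yaml:

prefilter_size: 9
prefilter_cap: 31
correlation_window_size: 15
min_disparity: 0
disparity_range: 64
uniqueness_ratio: 15.0
texture_threshold: 10
speckle_size: 100
speckle_range: 4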

In order to run the stereo odometry and see the result in rqt_rviz, run the following command:

$ roslaunch chapter5_tutorials viso2_demo.launch odometry:=true rviz:=true

Note that we have provided an adequate configuration for rqt_rviz in config/viso2_demo/rviz.rviz, which is automatically loaded by the launch file. The following sequence of images shows different snapshots of the texturized 3D point cloud and the /odom and /stereo_optical frames, which show the camera pose estimated by the stereo odometer. The third image uses a decay time of three seconds for the point cloud, so we can see how the points accumulate over time. This way, with good images and odometry, we can even see a map drawn in rqt_rviz, but this is quite difficult and generally needs a SLAM algorithm (see Chapter 8, Using Sensors and Actuators with ROS). All of this is shown in the following screenshots:

Running the viso2 online demo

As new frames arrive, the algorithm is able to create a 3D reconstruction, as shown in the following screenshot, where we can see the heights of the rocks on the seabed:

Running the viso2 online demo

If we set a Decay Time of three seconds, the different point clouds from consecutive frames will be shown together, so we can see a map of the bottom of the sea. Remember that the map will contain some errors because of the drift of the visual odometry algorithm:

Running the viso2 online demo

Performing visual odometry with viso2 with a stereo camera

Finally, we can do what viso2_demo does with our own stereo pair. We only have to run the following command to start the stereo odometry and see the results in rqt_rviz (note that the tf tree is published by default):

$ roslaunch chapter5_tutorials camera_stereo.launch odometry:=true rviz:=true

The following image shows an example of the visual odometry system running with our low-cost stereo camera. If you move the camera, you should see the /odom frame moving. If the calibration is bad or the cameras are very noisy, the odometer may get lost, which is indicated by a warning message in the terminal. In that case, you should look for better cameras or recalibrate them to see whether better results can be obtained. You might also want to look for better parameters for the disparity algorithm.
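To verify that the odometer is actually producing estimates, you can also echo its output topic from another terminal; the topic name below assumes the node is called stereo_odometer and publishes its odometry under its private namespace:

$ rostopic echo /stereo_odometer/odometry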

Performing visual odometry with viso2 with a stereo camera