The great advantage of doing Computer Vision in ROS is that we do not have to reinvent the wheel. A lot of third-party software is available, and we can also connect our vision algorithms to real robots or perform simulations. Here, we are going to enumerate interesting Computer Vision tools for the most common visual tasks, but we will explain only one of them in detail, including all the steps to set it up. We will do this for visual odometry; the other packages are also easy to install and start playing with, so simply follow the tutorials or manuals in the links provided here:
- The visp_tracker (moving edge tracker) node, as well as visp_auto_tracker (a model-based tracker). It also helps to calibrate the camera and perform hand-to-eye calibration, which is crucial for visual servoing in grasping tasks.
- ar_pose (http://www.ros.org/wiki/ar_pose). It allows us to track single or multiple markers on which we can render our virtual imagery (for example, a 3D model).
- The object_recognition stack, which provides tabletop_object_detector to detect objects on a table; a more general solution is provided by the Object Recognition Kitchen (ORK), which can be found at http://wg-perception.github.io/object_recognition_core. It is also worth mentioning RoboEarth (http://www.roboearth.org), which allows you to detect and build 3D models of physical objects and store them in a global database accessible to any robot (or human) worldwide. The stored models can be 2D or 3D and can be used to recognize similar objects and their viewpoint, that is, to identify what the camera/robot is watching. The RoboEarth project is integrated into ROS, and many tutorials are provided to get a running system (http://www.ros.org/wiki/roboearth), although it does not officially support the latest versions of ROS.
- libviso2 (http://www.cvlibs.net/software/libviso2.html) and libfovis (http://www.ros.org/wiki/fovis_ros), both of which have wrappers that simply expose these libraries to ROS: the viso2 and fovis packages, respectively.

In the following section, we will see how to do visual odometry with our homemade stereo camera using the viso2_ros node of viso2. We will also show how to install these packages, which at the moment needs to be done from source because they do not officially support ROS Kinetic; however, there is no other alternative for them integrated in ROS. The libviso2 library allows us to do monocular and stereo visual odometry. For monocular odometry, however, we also need the pitch and heading for the ground plane estimation. You can try the monocular case with one camera and an IMU (see Chapter 4, 3D Modeling and Simulation), but you will always obtain better results with a good stereo pair, correctly calibrated, as seen so far in this chapter. Finally, libfovis does not allow the monocular case, but it does support RGBD cameras, such as the Kinect sensor (see Chapter 6, The Navigation Stack – Beyond Setups). For the stereo case, it is worth trying both libraries to see which one works better in your setup. Here, we will give you a step-by-step tutorial on how to install and run viso2 in ROS and fovis with the Kinect.

Visual odometry is the name given to algorithms that use vision to estimate the relative displacement of a mobile robot or sensor. Odometry accumulates these consecutive relative displacements to give a global estimate of the robot or sensor pose in the environment with respect to the initial pose, but bear in mind that it drifts: the error of each relative displacement accumulates and grows without bound. To solve this problem, a global localization or loop closure detection algorithm is needed. This is one of the components of a Visual SLAM system, but such a system also needs visual odometry to provide an initial guess for the pose estimation.
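The accumulation of relative displacements, and the unbounded drift it causes, can be illustrated with a minimal 2D (SE(2)) sketch. This is only a toy model of what a visual odometer computes, with made-up displacement values:

```python
import math

def compose(pose, delta):
    """Compose a global SE(2) pose with a relative displacement.

    Both are (x, y, theta) tuples; delta is expressed in the frame of
    the current pose, as a visual odometer would report it.
    """
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

# Drive a 1 m square: move forward 1 m, then turn 90 degrees, four times.
pose = (0.0, 0.0, 0.0)
for _ in range(4):
    pose = compose(pose, (1.0, 0.0, 0.0))          # forward step
    pose = compose(pose, (0.0, 0.0, math.pi / 2))  # turn in place

# With perfect relative displacements, we return to the origin.
print(pose[0], pose[1])

# With a small constant bias on every step, the error never cancels:
biased = (0.0, 0.0, 0.0)
for _ in range(4):
    biased = compose(biased, (1.01, 0.0, 0.002))   # 1 cm / 0.002 rad bias
    biased = compose(biased, (0.0, 0.0, math.pi / 2))
print(math.hypot(biased[0], biased[1]))  # nonzero drift from the origin
```

The biased run ends away from the origin, and the longer the trajectory, the larger that drift grows; this is exactly why loop closure or global localization is needed on top of odometry.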
In order to use viso2, go to your catkin workspace (~/catkin_ws) and use the following commands:
$ cd src
$ wstool init
$ wstool set viso2 --git git://github.com/srv/viso2.git
$ wstool update
Now, to build it, run the following command:
$ cd ..
$ catkin_make
Once it is built, we set up our environment by using the following command:
$ source devel/setup.bash
Now we can run the viso2_ros nodes, such as stereo_odometer, which is the one we are going to use here. Before that, however, we need to publish the frame transformation between our camera and the robot or its base link. The stereo camera driver is already prepared for that, but we will explain how it is done in the following sections.
In order to set the transformation between the different frames in our robot system, we must publish the tf message of such transforms. The most appropriate and generic way to do so is to use the camera_pose stack; we will use the latest version from the repository at https://github.com/jbohren-forks/camera_pose. This stack offers a series of launch files that calibrate the camera poses with respect to each other. It comes with launch files for two, three, or more cameras. In our case, we only have two cameras (stereo), so we proceed as follows. First, we extend camera_stereo.launch with a calibrate_pose argument that calls calibration_tf_publisher.launch from camera_pose:
<include file="$(find camera_pose_calibration)/blocks/calibration_tf_publisher.launch">
  <arg name="cache_file" value="/tmp/camera_pose_calibration_cache.bag"/>
</include>
Now, run the following command:
$ roslaunch chapter5_tutorials camera_stereo.launch calibrate_pose:=true
The calibration_tf_publisher will publish the frame transforms (tf) as soon as the calibration has been done correctly. The calibration is similar to the ones we have seen so far, but it uses the specific tools from camera_pose, which are run using the following command:
$ roslaunch camera_pose_calibration calibrate_2_camera.launch camera1_ns:=/stereo/left camera2_ns:=/stereo/right checker_rows:=6 checker_cols:=8 checker_size:=0.03
With this call, we can use the same calibration pattern we used with the previous calibration tools. However, this tool requires the images to be static: the interface shows bars that move from one side of the image to the other and turn green once the images from all the cameras have been static for a sufficient period of time. This is shown in the following screenshot:
With our noisy cameras, we will need support for the calibration pattern, such as a tripod or a panel, as shown in the following image:
Then, we can calibrate, as shown in the following screenshot:
This creates the tf transform from the left camera to the right camera. However, although this is the most appropriate way to perform the camera pose calibration, we are going to use a simpler approach that is sufficient for a stereo pair. This is also what viso2 requires, since it needs the frame of the whole stereo pair as a single unit/sensor; internally, it uses the stereo calibration results of cameracalibrator.py to retrieve the baseline.
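For reference, you can read the baseline off the stereo calibration results yourself: in ROS, the right camera's projection matrix P (published in sensor_msgs/CameraInfo) encodes it as P[0][3] = -fx * B. A minimal sketch with made-up numbers, not our camera's actual calibration:

```python
# In ROS stereo calibration results (sensor_msgs/CameraInfo), the right
# camera's projection matrix P encodes the baseline: P[0][3] = -fx * B.
# The numbers below are made up for illustration, not real calibration.
fx = 700.0           # focal length in pixels, P[0][0]
Tx = -42.0           # P[0][3] of the right camera
baseline = -Tx / fx  # baseline in meters
print(baseline)      # 0.06 -> a 6 cm baseline
```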
We have a launch file that uses static_transform_publisher to publish the transform from the base link (robot base) to the camera link, and another from the camera link to the camera optical link, because the optical frame requires a rotation; recall that the camera frame has the z axis pointing forward out of the camera's optical lens, while the other frames (world, navigation, or odometry) have the z axis pointing up. This launch file is launch/frames/stereo_frames.launch:
<launch>
  <arg name="camera" default="stereo" />

  <arg name="baseline/2" value="0.06"/>
  <arg name="optical_translation" value="0 -$(arg baseline/2) 0"/>

  <arg name="pi/2" value="1.5707963267948966"/>
  <arg name="optical_rotation" value="-$(arg pi/2) 0 -$(arg pi/2)"/>

  <node pkg="tf" type="static_transform_publisher" name="$(arg camera)_link"
        args="0 0 0.1 0 0 0 /base_link /$(arg camera) 100"/>
  <node pkg="tf" type="static_transform_publisher" name="$(arg camera)_optical_link"
        args="$(arg optical_translation) $(arg optical_rotation) /$(arg camera) /$(arg camera)_optical 100"/>
</launch>
This launch file is included in our stereo camera launch file and publishes these static frame transforms. Hence, we only have to run the following command to get the launch file to publish them:
$ roslaunch chapter5_tutorials camera_stereo.launch tf:=true
Then, you can check whether they are being published in rqt_rviz with the tf display, as we will see next when running viso2; you can also use rqt_tf_tree for this (see Chapter 3, Visualization and Debugging Tools).
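As a sanity check on the optical rotation used in stereo_frames.launch, recall that static_transform_publisher takes the rotation as yaw, pitch, roll. The following sketch verifies numerically that yaw = -pi/2, pitch = 0, roll = -pi/2 maps the optical z axis (out of the lens) onto the parent frame's forward x axis:

```python
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

# static_transform_publisher takes the rotation as yaw pitch roll, so
# "-$(arg pi/2) 0 -$(arg pi/2)" means R = Rz(-pi/2) * Rx(-pi/2)
# (pitch is zero, so the Ry factor drops out).
R = matmul(rot_z(-math.pi / 2), rot_x(-math.pi / 2))

# The optical z axis (out of the lens) must map to the parent frame's
# x axis (forward); the optical y axis (down) must map to -z (down).
forward = apply(R, [0.0, 0.0, 1.0])  # ~ [1, 0, 0]
down = apply(R, [0.0, 1.0, 0.0])     # ~ [0, 0, -1]
```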
At this point, we are ready to run the visual odometry algorithm: our stereo pair is calibrated, its frame has the appropriate name for viso2 (ending in _optical), and the tf for the camera and optical frames has been published. However, before using our own stereo pair, we are going to test viso2 with the bag files provided at http://srv.uib.es/public/viso2_ros/sample_bagfiles/; just run bag/viso2_demo/download_amphoras_pool_bag_files.sh to obtain all the bag files (about 4 GB in total). Then, we have launch files for both the monocular and stereo odometers in the launch/visual_odometry folder. In order to run the stereo demo, we have a launch file on top that plays the bag files and also allows you to inspect and visualize their contents. For instance, to calibrate the disparity image algorithm, run the following command:
$ roslaunch chapter5_tutorials viso2_demo.launch config_disparity:=true view:=true
You will see the left, right, and disparity images, along with the rqt_reconfigure interface to configure the disparity algorithm. You need to perform this tuning because the bag files only contain the RAW images. We have found good parameters, which are stored in config/viso2_demo/disparity.yaml. In the following screenshot, you can see the results obtained using them, where you can clearly appreciate the depth of the rocks in the stereo images:
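The disparity values you are tuning relate directly to depth through the standard rectified-stereo relation Z = fx * B / d, which is what makes the rocks' depth visible in the disparity image. A quick sketch with illustrative numbers, not our camera's calibration:

```python
# For a rectified stereo pair, depth comes from disparity as
#   Z = fx * B / d
# with fx the focal length in pixels, B the baseline in meters, and
# d the disparity in pixels. Illustrative numbers, not our calibration:
fx = 700.0   # focal length (pixels)
B = 0.06     # baseline (meters)
for d in (70.0, 35.0, 7.0):   # larger disparity -> closer point
    Z = fx * B / d
    print(f"disparity {d:4.0f} px -> depth {Z:.2f} m")
```

Note how depth resolution degrades with distance: halving the disparity doubles the depth, so far-away points are measured much more coarsely than close ones.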
In order to run the stereo odometry and see the result in rqt_rviz, run the following command:
$ roslaunch chapter5_tutorials viso2_demo.launch odometry:=true rviz:=true
Note that we have provided an adequate configuration for rqt_rviz in config/viso2_demo/rviz.rviz, which is automatically loaded by the launch file. The following sequence of images shows different instances of the texturized 3D point cloud, along with the /odom and /stereo_optical frames that show the camera pose estimated by the stereo odometer. The third image uses a decay time of three seconds for the point cloud, so we can see how the points overlay over time. This way, with good images and odometry, we can even see a map drawn in rqt_rviz, but this is quite difficult and generally needs a SLAM algorithm (see Chapter 8, Using Sensors and Actuators with ROS). All of this is shown in the following screenshots:
As new frames arrive, the algorithm is able to create a 3D reconstruction, as shown in the following screenshot, where we can see the heights of the rocks on the seabed:
If we set a Decay Time of three seconds, the point clouds from consecutive frames are shown together, so we can see a map of the seabed. Remember that the map will contain some errors because of the drift of the visual odometry algorithm:
Finally, we can do what viso2_demo does with our own stereo pair. We only have to run the following command to run the stereo odometry and see the results in rqt_rviz (note that the tf tree is published by default):
$ roslaunch chapter5_tutorials camera_stereo.launch odometry:=true rviz:=true
The following image shows an example of the visual odometry system running with our low-cost stereo camera. If you move the camera, you should see the /odom frame moving. If the calibration is poor or the cameras are very noisy, the odometer may get lost, which is indicated by a warning message in the terminal. In that case, you should look for better cameras or recalibrate them to see whether better results can be obtained. You may also have to look for better parameters for the disparity algorithm.