3.1 Introduction

The MPEG-x and H.26x video coding standards adopt a hybrid coding approach that employs block-matching algorithm (BMA) motion compensation and the discrete cosine transform (DCT). The reasons are that (a) a significant proportion of the motion trajectories found in natural video can be approximately described with a rigid translational motion model; (b) fewer bits are required to describe simple translational motion; and (c) the implementation is relatively straightforward and amenable to hardware solutions.
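Point (a) above can be made concrete with a minimal full-search block-matching sketch: for one block of the current frame, every candidate displacement within a search radius is scored by the sum of absolute differences (SAD), and the displacement with the lowest SAD becomes the motion vector. The function name `full_search_bma` and all parameters are illustrative, not part of any standard.

```python
import numpy as np

def full_search_bma(ref, cur, bx, by, block=8, radius=4):
    """Full-search block matching: find the displacement (dy, dx) into the
    reference frame that minimizes the SAD against the current block."""
    h, w = ref.shape
    cur_blk = cur[by:by + block, bx:bx + block].astype(np.int32)
    best, best_sad = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue  # candidate block falls outside the reference frame
            cand = ref[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(cur_blk - cand).sum())
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad

# Synthetic check: translate a random texture down/right by (1, 2) pixels.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32))
cur = np.roll(np.roll(ref, 1, axis=0), 2, axis=1)
mv, sad = full_search_bma(ref, cur, bx=8, by=8)
# The matching block in the reference sits at offset (-1, -2), SAD 0.
print(mv, sad)
```

Real encoders use fast search patterns and rate-distortion-weighted costs rather than exhaustive SAD search, but the principle is the same: purely translational motion is captured by a single two-component vector per block, which is cheap to code.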

Hybrid video coding systems have provided interoperability across heterogeneous networks. Since transmission bandwidth remains a valuable commodity, ongoing developments in video coding seek scalability solutions that achieve a one-coding–multiple-decoding feature. To this end, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) is standardizing a scalability extension to the existing H.264/AVC codec. H.264-based scalable video coding (SVC) allows partial transmission and decoding of the bitstream, offering a range of options for picture quality and spatio-temporal resolution.

In this chapter, several advanced features and techniques related to scalable video coding are described, mostly in the context of 3D video applications. Section 3.1.1 describes the applications and scenarios targeted by the scalable coding systems. Advances in scalable video coding for 3D video applications are discussed in Section 3.3. Subsection 3.3.1 discusses a nonstandardized scalable 2D-model-based video coding scheme applied to the texture and depth coding of 3D video. The adaptation of scalable video coding to stereoscopic 3D video applications is elaborated on in Subsection 3.3.2. Although the scalable extension of H.264/AVC was selected as the starting point for scalable video coding standardization, the wavelet research community has also contributed widely to scalable video coding; some of these contributions are described in Subsection 3.3.3. Section 3.4 elaborates on the proposed error-robustness techniques for scalable video and image coding. Subsection 3.4.1 advances state-of-the-art SVC by using correlated frames to increase error robustness in error-prone networks. A scalable multiple description coding (MDC) application for stereoscopic 3D video is investigated in Subsection 3.4.2. Subsection 3.4.3 elaborates on advances in wireless JPEG 2000.

3.1.1 Applications and Scenarios

This section describes the applications and scenarios targeted by the Visnet research on scalable video coding. The algorithm developments and test scenarios are guided by the constraints of these applications and scenarios. The main target application for this activity is TRT-UK's Virtual Desk (see Figure 3.1), one of the integration activities within Visnet II. The Virtual Desk is a collaborative working environment (CWE) that can connect multiple users in a single session. A key aspect of the collaboration is audiovisual conferencing. 3D scalable MDC video can provide significant benefits for videoconferencing in terms of:

  • 3D: a more immersive communication experience.
  • Scalability: adaptability to different terminal types (examples of terminals are shown in Figure 3.1).
  • MDC: improved robustness to packet losses.

This application places some constraints on algorithm developments and the scenarios used to test them. These constraints are considered throughout the research and are elaborated below.

(i) Source Sequences

The most appropriate source sequences for audiovisual conferencing applications are “head and shoulders” test sequences, such as Paris and Suzie. However, no standard “head and shoulders” sequences with depth maps are available. During the investigations, a variety of video sequences are experimented with in order to demonstrate the applicability of the proposed techniques to a range of sources. The sources used are described in Sections 3.2–3.6, where the performance of the proposed algorithms is discussed.

(ii) Available Bandwidth

The terminals may be connected by DSL (1–8 Mbps), a university intranet (100 Mbps), or a wireless network (WLAN: 54 Mbps; UMTS: 384 kbps). Therefore, video tests are run both for bit rates greater than 1 Mbps and for bit rates less than 200 kbps. CIF resolution sequences (352 × 288) are used in the initial experiments; however, VGA (640 × 480) and 4CIF (704 × 576) will be tested as the algorithms are finalized.


Figure 3.1 TRT-UK Virtual Desk

(iii) Channel Losses

Wired network losses are represented in simulations by the JVT packet loss simulator. Wireless network simulations are carried out using the WiMAX simulator developed under the IST SUIT project.

(iv) Low Delay

The video coding algorithms must feature low delay. The proposed schemes do not inherently incur significant delay; however, the use of hierarchical B frames introduces considerable structural delay, because a frame cannot be coded until its future reference frames are available. Low-delay temporal scalability with only I and P frames can be employed to avoid this.
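The delay penalty above can be illustrated with a back-of-the-envelope calculation. Assuming a dyadic hierarchical-B group of pictures (GOP), the encoder must buffer up to the next key picture before coding the intermediate B frames, whereas an IPPP structure references only past frames. The helper `structural_delay_frames` is a hypothetical illustration, not part of any codec API.

```python
def structural_delay_frames(gop_size, hierarchical_b=True):
    """Worst-case extra buffering delay, in frames, due to the prediction
    structure. With hierarchical B frames the encoder must receive the key
    picture gop_size frames ahead before coding the frames in between; with
    an IPPP structure no future frames are needed, so no extra buffering."""
    return gop_size - 1 if hierarchical_b else 0

fps = 30.0
for gop in (4, 8, 16):
    d = structural_delay_frames(gop)
    print(f"GOP {gop:2d}: hierarchical-B delay = {d} frames "
          f"({1000 * d / fps:.0f} ms at {fps:.0f} fps)")
# IPPP incurs no structural buffering delay.
print("IPPP delay =", structural_delay_frames(8, hierarchical_b=False), "frames")
```

For a GOP of 16 at 30 fps this amounts to hundreds of milliseconds of structural delay alone, before network and processing latency, which is why conferencing-oriented configurations favor I/P-only temporal scalability.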

The algorithms described in this chapter respect the constraints described above.
