Chapter 5

Compression of Digital Holographic Data

Abstract

Both lossless and lossy compression methods have been applied to digital holographic data. In the early phase, the performance of lossless coding and quantization methods was mainly investigated, while recent research works focus more on lossy compression with wavelet transforms, as summarized in this chapter.

In the second part, we discuss in more detail several wavelet-based holographic data compression methods based on Fresnelet (Liebling et al., 2003; Darakis and Soraghan, 2006a), separable and nonseparable vector lifting schemes (Xing et al., 2014a, 2015), arbitrary packet decomposition and directional wavelet transform (Blinder et al., 2013, 2014), and Morlet transform (Viswanathan et al., 2014).

Keywords

Digital holographic data compression

Quantization

Transform

Wavelets

Fresnelet

Vector lifting schemes

Packet decomposition

Directional wavelet transform

Morlet transform

Even though holography was invented more than 60 years ago, it has received renewed attention recently thanks to tremendous achievements in computer and digital technologies. Numerous research works have resulted in significant progress toward generating high-resolution digital holograms with cost-effective solutions. However, compression of digital holographic data is still a relatively new field of research.

Figure 5.1 shows the overall schematic of a complete digital holographic communication system. The different components include processing steps for digital hologram acquisition, compression, transmission, decompression, and display.

Figure 5.1 Digital holographic communication system components.

In detail, a digital hologram is first captured from a still 3D object by using a charge-coupled device. For moving 3D objects, a temporal sequence of holograms is generated. Alternatively, the hologram can also be obtained by computational methods from virtual 3D object models. Next, the interference pattern should be converted into a suitable representation. The resulting representation is then compressed and encoded with properly designed coding tools. The encoded data are then transmitted. At the receiver side, the transmitted data are first decoded. This allows reconstruction of the object wave, which can then be rendered on the holographic display device.

5.1 Overview of State of the Art for Digital Holographic Data Compression

Both lossless and lossy compression methods have been applied to digital holographic data. In the early phase, the performance of lossless coding and quantization methods was mainly investigated, while recent research works focus more on lossy compression with wavelet transforms. As summarized in the following, most of the research works on compressing holographic data fall into two kinds of approaches: quantization-based and transform-based methods.

5.1.1 Quantization Based

Some lossless and lossy data compression methods are applied to real-imaginary information obtained by phase-shifting interferometry (PSI) in Naughton et al. (2002). Lossless techniques such as Lempel-Ziv, Lempel-Ziv-Welch, Huffman, and Burrows-Wheeler are used to separately code real and imaginary components of the holographic data. However, due to the low spatial redundancies resulting from the speckle nature of holograms, lossless compression methods are usually inefficient (Darakis et al., 2006). As a consequence, lossy compression of holographic data seems essential.

Lossy compression techniques such as subsampling and quantization are applied as well in Naughton et al. (2002). Hologram resampling resulted in severe degradation of the reconstructed image quality, but for resizing by a factor of 0.5 per side, a compression rate of 18.6 could be achieved with a high degree of median filtering. Quantization proved to be a very effective technique. Each real and imaginary component could be reduced to 4 bits with an acceptable reconstruction error, resulting in a compression rate of 16.

Mills and Yamaguchi (2005) also confirmed the effectiveness of quantization in both numerical simulations and optical experiments. The influence of bit-depth reduction has been investigated by quantizing the spot-array patterns (the intensity information). They also pointed out that the use of 4 bits appears to be adequate for visual recognition. However, as the number of bits decreases, the resulting quality falls rapidly when only quantization is applied.

Furthermore, Naughton et al. (2003) improved the quantization of real-imaginary information with a bit packing operation for real-time networking applications. A speedup metric is also defined, which combines space gains due to compression with temporal overheads due to the compression routine and the transmission serialization. It has been reported that quantization to 4 bits results in a compression rate of 16, with normalized root mean square (NRMS) errors in the reconstructed object intensity lower than 0.06.
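As a concrete illustration of such bit-depth reduction, the following minimal sketch uniformly quantizes the real and imaginary components of a synthetic complex field to b bits and evaluates the NRMS error of the quantized field (NumPy; the array, bit depth, and 64-bit reference precision are illustrative assumptions, not the exact setup of the cited works):

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize a real-valued array to 2**bits levels over its range."""
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (levels - 1)
    codes = np.round((x - lo) / step)       # integer codes in [0, levels - 1]
    return codes * step + lo                # dequantized (reconstructed) values

rng = np.random.default_rng(0)
field = rng.standard_normal((256, 256)) + 1j * rng.standard_normal((256, 256))  # stand-in hologram

bits = 4                                    # 4 bits per real and per imaginary sample
rec = quantize_uniform(field.real, bits) + 1j * quantize_uniform(field.imag, bits)

# NRMS error between the original and quantized complex fields
nrms = np.sqrt(np.sum(np.abs(field - rec) ** 2) / np.sum(np.abs(field) ** 2))
print(f"compression rate vs. 64-bit floats: {64 / bits:.0f}, NRMS = {nrms:.4f}")
```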

A companding histogram approach is applied for nonuniform quantization in Shortt et al. (2006b). A companding quantizer is a nonuniform quantizer composed of a compressor, a uniform quantizer, and an expander. A pattern-determined, fixed-interval sampled grid is used for the compander. Its pattern is first determined by quantization experiments on some digital holograms in order to produce a hard-coded pattern of clusters. In this way, the computational burden of iterations can be significantly reduced by considering the cluster pattern as a codebook. The compressor and expander work depending on the density of the input data: if the input data are dense, the grid is compressed; conversely, if the input data are sparse, the grid is expanded. The companding approach combines the efficiency of uniform quantization with the improved performance of nonuniform quantization. It exploits a priori knowledge of the distribution of the values in the data and performs well when the data distribution can be described. Shortt et al. first applied the companding quantization method for two reasons: (1) the nonuniform distribution of holographic data; and (2) the computational cost and time delay of iterative techniques. They developed two kinds of companding grid: the diamond companding grid and the logarithmic spiral one. The first one is generated based on a logarithmic sampling distribution, and performs as well as some iterative techniques, for example, the k-means algorithm, with higher efficiency when the number of clusters is in the range [25, 100]. The second companding grid is developed to improve the performance for smaller numbers of clusters. It uses a logarithmic spiral function to produce the cluster pattern. Improvements have been obtained over the performance of the diamond grid.
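The compressor-quantizer-expander structure can be illustrated with a generic mu-law compander, a textbook nonuniform quantizer used here purely for illustration; the diamond and logarithmic-spiral grids of Shortt et al. are not reproduced, and the Laplacian test data are only a stand-in for the peaked distribution of holographic values:

```python
import numpy as np

def mu_law_compress(x, mu=255.0):
    """Compressor: logarithmic mapping that expands small amplitudes."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    """Expander: exact inverse of the compressor."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def companding_quantizer(x, bits=4, mu=255.0):
    """Compressor -> uniform quantizer -> expander, for data normalized to [-1, 1]."""
    levels = 2 ** bits
    y = mu_law_compress(x, mu)                        # nonuniform warp of the input range
    step = 2.0 / (levels - 1)
    y_q = np.round((y + 1.0) / step) * step - 1.0     # uniform quantization in [-1, 1]
    return mu_law_expand(y_q, mu)

rng = np.random.default_rng(1)
data = rng.laplace(scale=0.1, size=100_000)           # peaked distribution, stand-in for hologram values
data = np.clip(data / np.max(np.abs(data)), -1.0, 1.0)
rec = companding_quantizer(data, bits=4)
print("mean squared error:", np.mean((data - rec) ** 2))
```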

Similarly, histogram quantization is proposed in Shortt et al. (2007). The histogram quantization technique is a noniterative nonuniform quantization technique, specially designed for digital holographic data to better exploit the data distribution. In this technique, the highest peaks in a representation, for example, the real and imaginary data, are extracted to define clusters. The most frequently occurring value in each section is selected as its highest peak. The range ri of the ith section is defined by

$$r_i = \begin{cases} [a+i\delta,\ a+(i+1)\delta), & \text{if } i < N-1 \\ [a+i\delta,\ a+(i+1)\delta], & \text{if } i = N-1, \end{cases} \qquad (5.1)$$

where the range of the set of values is $[a,b]$, $N$ is the number of sections, $\delta = (b-a)/N$, and $i \in \{0,1,\ldots,N-1\}$. The peaks from the real values and the imaginary values are then paired to give complex-valued clusters. Each pixel is quantized to the value of its nearest cluster. The significant advantage of this quantization technique is its lower time complexity for the same reconstruction quality, compared to iterative techniques. This quantization technique can also be considered as joint quantization, similar to VQ, as the two sets of holographic data are jointly quantized. Shortt et al. also applied this technique to the amplitude-phase representation; however, better reconstruction performance is reported for the real-imaginary representation, in agreement with the conclusion obtained above.
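A minimal sketch of this section-wise peak extraction and nearest-cluster assignment is given below (NumPy). It follows Eq. (5.1) and uses one simple reading of the pairing step, namely pairing the peak of the ith real section with the peak of the ith imaginary section; the pairing rule, number of sections, and histogram resolution are illustrative assumptions rather than the exact choices of Shortt et al.:

```python
import numpy as np

def section_peaks(values, n_sections):
    """Most frequently occurring value in each of the N equal-width sections of Eq. (5.1)."""
    a, b = values.min(), values.max()
    delta = (b - a) / n_sections
    peaks = []
    for i in range(n_sections):
        lo, hi = a + i * delta, a + (i + 1) * delta
        mask = (values >= lo) & (values < hi) if i < n_sections - 1 else (values >= lo) & (values <= hi)
        section = values[mask]
        if section.size:
            hist, edges = np.histogram(section, bins=32)
            centers = 0.5 * (edges[:-1] + edges[1:])
            peaks.append(centers[np.argmax(hist)])   # highest histogram peak of the section
        else:
            peaks.append(0.5 * (lo + hi))            # fallback for an empty section
    return np.array(peaks)

rng = np.random.default_rng(2)
holo = rng.standard_normal((128, 128)) + 1j * rng.standard_normal((128, 128))  # stand-in data

N = 8
clusters = section_peaks(holo.real.ravel(), N) + 1j * section_peaks(holo.imag.ravel(), N)

# Quantize each pixel to its nearest complex-valued cluster
distances = np.abs(holo.ravel()[:, None] - clusters[None, :])
quantized = clusters[np.argmin(distances, axis=1)].reshape(holo.shape)
```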

Differently, based on SQ, a multiple description coding (MDC) method is applied to amplitude-phase information using maximum-a-posteriori estimation in Arrifano et al. (2013). It takes advantage of MDC to optimally distribute coded data among the available channels and mitigate channel errors.

A comparative study of several quantization schemes, USQ, ASQ, and VQ, is presented in Xing et al. (2014b). It is shown that jointly encoding the components can bring more benefits for real-imaginary and shifted distance representations that present higher inter-component redundancies. Conversely, the amplitude-phase representation is better encoded separately due to the larger influence of phase information on the reconstruction quality.

5.1.2 Transform Based

Transform-based coding is widely used for image compression, thanks to its ability to compact the signal energy efficiently. Therefore, researchers have naturally investigated the use of transform-based coding, and more specifically wavelet transforms, for the compression of holograms.

Shortt et al. (2006a) introduced wavelet analysis for the compression of real-imaginary components. A 1D-DWT is applied to each component with different resolution levels. The wavelet coefficients are then quantized. It has been found that three levels performed the best on average. In addition, compression of optically obtained and computer-generated phase-shifting interference patterns with the standard JPEG and JPEG 2000 techniques is addressed in Darakis and Soraghan (2006b), with reported compression ratios in the range 20-27 at acceptable reconstruction quality, and in Xing et al. (2013a), which shows the higher performance of JPEG 2000.

However, standard wavelets are typically designed to process piecewise smooth signals, while holograms contain features which are spread out from the objects. Therefore, applying standard wavelets directly to the hologram is not efficient. For this purpose, Liebling et al. (2003) developed a family of wavelet bases—Fresnelets—obtained by applying the Fresnel transform operator to B-spline bases, which is specially tailored to the specificities of DH. The particular suitability of the Fresnel B-splines for digital holography was thereby demonstrated.

Darakis and Soraghan (2006a) introduced the use of Fresnelets into phase-shifting digital holography (PSDH) for holographic data compression. The real-imaginary components are first separately decomposed into Fresnelet coefficients down to a required scale depth. The real and imaginary Fresnelet coefficients are then fed to the SPIHT algorithm. Experimental results verified its flexibility for the compression of PSI holographic data.

However, Viswanathan et al. (2013) showed that Fresnelets have limited frequency localization, which degrades viewpoint-based reconstruction. Instead, they proposed to use Gabor wavelets, which are suitable for measuring the local spatial frequencies so that the coefficients can be pruned according to a viewpoint selection. Also, the angular spectrum method is used to perform the reconstruction instead of the Fresnel transform. The experimental results show that Gabor wavelets are able to suppress the unwanted orders created in the reconstruction of off-axis holograms and have better time-frequency localization for view-dependent compression techniques. Moreover, they proposed to use Morlet wavelets for transforming a hologram and partly reconstructing a scene by using a sparse set of Morlet transformed coefficients in Viswanathan et al. (2014). It has been shown that a view-dependent representation together with Morlet wavelets forms a good starting point for compressing holographic data for next generation 3DTV applications.

On the other hand, Blinder et al. (2013) also investigated wavelet coding of off-axis holograms. In their approach, the properties of the off-axis holograms are first examined by independent component analysis, which reveals the importance of orientation and high frequencies in off-axis holograms. For this reason, standard decomposition schemes are not suitable for compressing holograms. Based on the standard JPEG 2000 algorithm, some alternative decomposition schemes are proposed to further decompose the high-frequency subbands. They are combined with direction-adaptive wavelets (Chang and Girod, 2007). Significant improvements have been reported for lossy compression compared with the standard DWT using the Mallat decomposition. Furthermore, they proposed a wavelet packet decomposition scheme combined with directional wavelet transforms for compressing microscopic off-axis holograms in Blinder et al. (2014). Again, the JPEG 2000 standard is modified to cope with holograms by applying the new wavelet decomposition with directional wavelet transforms. This extended JPEG 2000 algorithm shows higher compression efficiency.

Wavelet-based methods are considered effective for compressing holographic data. However, all the coding schemes described above encode the different holographic data independently. It could be more efficient to combine the wavelet transform with joint encoding methods to compress phase-shifting holographic data. Xing et al. (2014a) first introduced a joint coding method based on the concept of vector lifting scheme (VLS) to compress shifted distance information. Experimental results indicate the advantage of applying this joint coding scheme. A significant gain of about 2 dB in PSNR and 0.15 in structural similarity (SSIM) has been achieved compared to an independent JPEG 2000 coding scheme. Subsequently, the scheme has been further improved by changing the decomposition structure from a separable to a nonseparable scheme in Xing et al. (2015), resulting in significant improvements.

Several wavelet-based holographic data compression methods are further detailed in Section 5.2.

5.1.3 Extension to Digital Holographic Sequences

Some works on compressing hologram sequences have also been reported. However, most methods are simply based on 2D video compression techniques.

Seo et al. (2006) used a multiview prediction technique and a temporal motion prediction technique to remove the spatial and temporal data redundancies. The predicted and compensated data are then compressed by an MPEG-2 encoder. Enhanced performance has been reported.

The use of MPEG-4 advanced simple profile (ASP) (Pereira and Ebrahimi, 2002) for the compression of hologram sequences was investigated by Darakis and Naughton (2009). Although it has been designed for conventional video, the scheme is effective and achieves good reconstruction quality. Inter-frame coding is also shown to outperform intra-frame coding. In other words, the scheme successfully exploits temporal redundancies in the hologram sequences.

A 3D scanning method is introduced in Seo et al. (2007). More specifically, interference patterns are divided into blocks, and 2D discrete cosine transform (DCT) is performed. The resulting segments are then scanned in 3D, in order to form a video sequence. Finally, the resulting video sequence is encoded using the advanced video coding (AVC) standard—H.264/AVC (Wiegand et al., 2003). The authors show the effectiveness of the scheme and claim significantly improved compression performance when compared to earlier works.

Two video coding schemes, namely H.264/AVC (Wiegand et al., 2003) and the wavelet-based Dirac codec (Dirac, 2009), are compared in Darakis et al. (2010). The authors performed subjective experiments in order to identify the threshold of visually lossless quality. More specifically, the observers are successively shown two sequences in random order, the uncompressed reference and a compressed version, and are asked to identify the compressed one. The bitrate of the compressed sequence is varied until the just noticeable difference threshold is found. While performance is content-dependent, it is shown that compression ratios up to 7.5 can be achieved with visually lossless quality.

The recent state-of-the-art High Efficiency Video Coding (HEVC) standard (Sullivan et al., 2012) has also been investigated to compress hologram sequences. In Xing et al. (2013b), HEVC is applied to holograms obtained from animated virtual objects using computer-generated phase-shifting holography. HEVC is shown to achieve high reconstruction quality at a bit rate of 15 Mbps. Moreover, HEVC is demonstrated to be significantly superior to H.264/AVC. Subjective quality assessment experiments using HEVC are also reported in Ahar et al. (2015).

5.2 Wavelet-Based Holographic Data Compression Methods

In this section, we discuss in more detail several wavelet-based holographic data compression methods based on Fresnelet (Liebling et al., 2003; Darakis and Soraghan, 2006a), separable and nonseparable vector lifting schemes (Xing et al., 2014a, 2015), arbitrary packet decomposition and directional wavelet transform (Blinder et al., 2013, 2014), and Morlet transform (Viswanathan et al., 2014).

5.2.1 Fresnelet Scheme

Fresnelets are a family of B-spline wavelet basis functions, which are specially designed for processing digital holograms (Liebling et al., 2003). Wavelets are efficient for piecewise smooth signals, such as natural images. However, holograms have very different characteristics, and in this case, wavelets do not perform optimally. Fresnelets have been developed in order to address this drawback. They have been successfully applied to compress off-axis and phase-shifting holographic data (Darakis and Soraghan, 2006a). In this part, the basic principles of the design of Fresnelets will be presented.

As explained in Chapter 2, the Fresnel transform of the complex object wave U(x,y), propagated from the plane z = 0 to the distance z = d along the wave traveling direction, can be expressed as a 2D convolution integral:

$$\hat{U}_\tau(\xi,\eta) = \frac{\exp(ikd)}{i\lambda d}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} U(x,y)\,\exp\!\left[\frac{i\pi}{\lambda d}\big((\xi-x)^2+(\eta-y)^2\big)\right]dx\,dy = -i\exp(ikd)\,(U * K_\tau)(\xi,\eta), \qquad (5.2)$$

with

$$K_\tau(x,y) = \frac{1}{\tau^2}\exp\!\left[\frac{i\pi}{\tau^2}(x^2+y^2)\right] = k_\tau(x)\,k_\tau(y), \qquad \tau = \sqrt{\lambda d}, \qquad (5.3)$$

where $*$ denotes convolution and $K_\tau(x,y)$ is the separable kernel.
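Equations (5.2)-(5.3) can be evaluated numerically by sampling the kernel and performing the convolution with FFTs. The sketch below (NumPy) is a minimal illustration under assumed parameters — wavelength, distance, pixel pitch, and a square-aperture object wave are all illustrative choices, not values from the cited works; circular boundary effects of the FFT convolution are ignored:

```python
import numpy as np

def fresnel_propagate(U, wavelength, d, pitch):
    """Fresnel transform by FFT-based convolution with the kernel K_tau of Eq. (5.3)."""
    n = U.shape[0]
    tau2 = wavelength * d                                  # tau^2 = lambda * d
    x = (np.arange(n) - n // 2) * pitch
    X, Y = np.meshgrid(x, x, indexing="ij")
    K = (1.0 / tau2) * np.exp(1j * np.pi * (X**2 + Y**2) / tau2)   # sampled Fresnel kernel
    conv = np.fft.ifft2(np.fft.fft2(U) * np.fft.fft2(np.fft.ifftshift(K)))
    k = 2 * np.pi / wavelength
    return -1j * np.exp(1j * k * d) * conv * pitch**2      # pitch^2 approximates dx dy

# Illustrative parameters (not from the cited experiments)
wavelength, d, pitch = 632.8e-9, 0.05, 10e-6
U = np.zeros((512, 512), dtype=complex)
U[236:276, 236:276] = 1.0                                  # simple square aperture as object wave
U_hat = fresnel_propagate(U, wavelength, d, pitch)
```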

The Fresnelet bases are the Fresnel transform of B-splines. B-splines are selected to generate a multiresolution analysis of L2 because they satisfy the requirements of a valid scaling function of L2. One-dimensional B-splines of degree n are defined as the (n + 1)-fold convolution of a rectangular pulse (Unser, 1999):

$$\beta^n(x) = \underbrace{\beta^0 * \cdots * \beta^0}_{n+1\ \text{times}}(x), \qquad (5.4)$$

where

$$\beta^0(x) = \begin{cases} 1, & \text{if } -\tfrac{1}{2} < x < \tfrac{1}{2} \\ \tfrac{1}{2}, & \text{if } |x| = \tfrac{1}{2} \\ 0, & \text{otherwise.} \end{cases} \qquad (5.5)$$

An alternative equivalent definition of the B-splines is:

$$\beta^n(x) = \frac{\Delta^{n+1} * (x)_+^n}{n!}, \qquad (5.6)$$

where $\Delta^{n+1}$ is the (n + 1)th centered finite-difference operator:

$$\Delta^{n+1} = \sum_{k=0}^{n+1} (-1)^k C_{n+1}^k\, \delta\!\left(x + \frac{n+1}{2} - k\right) \qquad (5.7)$$

and $(x)_+^n = \max(0,x)^n$ is the one-sided power function. Hence, the centered B-splines can be written explicitly as

$$\beta^n(x) = \sum_{k=0}^{n+1} \frac{(-1)^k C_{n+1}^k \left(x + \frac{n+1}{2} - k\right)_+^n}{n!}. \qquad (5.8)$$

More specifically, the B-splines satisfy a two-scale relation of the form

$$\beta^n\!\left(\frac{x}{2}\right) = \sum_k h(k)\, \beta^n(x-k), \qquad (5.9)$$

where h(k) is the binomial filter:

$$h(k) = \frac{1}{2^n}\, C_{n+1}^k. \qquad (5.10)$$

Moreover, in Unser (1999), a general family of semi-orthogonal spline wavelets has been derived, satisfying a two-scale relation of the form

$$\psi^n\!\left(\frac{x}{2}\right) = \sum_k g(k)\, \beta^n(x-k), \qquad (5.11)$$

so a Riesz basis of L2 is formed by the functions

$$\left\{\psi^n_{i,k} = 2^{i/2}\, \psi^n(2^i x - k)\right\}. \qquad (5.12)$$

These wavelets are linear combinations of B-splines specified by g(k), where g(k) is the quadrature mirror filter of h(k); together they form Riesz bases of L2.

Fresnelets are then obtained as the Fresnel transform of the B-spline bases:

$$\hat{\beta}^n_\tau(x) = (\beta^n * k_\tau)(x). \qquad (5.13)$$

They can be used to decompose the signals into different scales. In particular, the two-scale relation becomes

$$\hat{\beta}^n_{\tau/2}\!\left(\frac{x}{2}\right) = \sum_k h(k)\, \hat{\beta}^n_\tau(x-k). \qquad (5.14)$$

The semi-orthogonal ones corresponding to Eq. (5.11) can be derived in the same way.

In order to apply Fresnelets to 2D data, the B-splines are simply extended to 2D as

$$\beta^n(x,y) = \beta^n(x) \otimes \beta^n(y), \qquad (5.15)$$

where $\otimes$ denotes the tensor product. The Fresnelet coefficients of the complex wave $\hat{U}_{\tau_0}$ at scale j = 1 can be obtained by

$$\begin{aligned}
LL\!\left(\tfrac{x}{2},\tfrac{y}{2}\right) &= \hat{U}_{\tau_0} * \left[\hat{\beta}^n_{\tau_0}\!\left(\tfrac{x}{2}\right) \otimes \hat{\beta}^n_{\tau_0}\!\left(\tfrac{y}{2}\right)\right] \\
HL\!\left(\tfrac{x}{2},\tfrac{y}{2}\right) &= \hat{U}_{\tau_0} * \left[\hat{\psi}^n_{\tau_0}\!\left(\tfrac{x}{2}\right) \otimes \hat{\beta}^n_{\tau_0}\!\left(\tfrac{y}{2}\right)\right] \\
LH\!\left(\tfrac{x}{2},\tfrac{y}{2}\right) &= \hat{U}_{\tau_0} * \left[\hat{\beta}^n_{\tau_0}\!\left(\tfrac{x}{2}\right) \otimes \hat{\psi}^n_{\tau_0}\!\left(\tfrac{y}{2}\right)\right] \\
HH\!\left(\tfrac{x}{2},\tfrac{y}{2}\right) &= \hat{U}_{\tau_0} * \left[\hat{\psi}^n_{\tau_0}\!\left(\tfrac{x}{2}\right) \otimes \hat{\psi}^n_{\tau_0}\!\left(\tfrac{y}{2}\right)\right].
\end{aligned} \qquad (5.16)$$

For a decomposition with more scales, the LL subband is decomposed further.

In Darakis and Soraghan (2006a), the Fresnelet coefficients obtained from the real-imaginary information are further quantized by uniform quantization and compressed by SPIHT coding. A comparison is conducted between the hologram coded losslessly with the B-spline wavelet transform and the SPIHT-coded Fresnelet coefficients. For the same compression rate, the Fresnelet-based scheme leads to a smaller NRMS error.

5.2.2 Separable Vector Lifting Scheme

As shown in Chapter 4, in the cases of shifted distance and real-imaginary representations, the two components of the hologram show similar visual content. Thus, it is expected that these kinds of data include redundancies, and from this point of view, efficient hologram compression schemes can be designed by exploiting the dependencies between these patterns. For this purpose, Xing et al. (2014a, 2015) first proposed to apply a joint coding method, based on the concept of a vector lifting scheme (VLS), to phase-shifting holographic data.

Most of the existing joint coding schemes, which have been developed in the literature for video and stereo/multiview data compression purposes, consist of two steps. Assuming that two correlated input images are available to be encoded, the first step consists of selecting one image as a reference and encoding it independently of the other one. Then, the second image, selected as a target image, is predicted from the first one, and the difference between the two images, called the residual, is encoded. Typically, the DCT or DWT can be used for encoding both the reference and residual images (Moellenhoff and Maier, 1998; Boulgouris and Strintzis, 2002). Contrary to this standard scheme, the main feature of VLS is that it does not generate a residual image, but two compact representations of both images. VLS is an extension of the lifting scheme (LS) (Sweldens, 1996; Hampson and Pesquet, 1998; Kaaniche et al., 2011a, 2012). LS is an alternative method for computing the DWT, which is simpler and faster than the classical one. The original signal can also be easily obtained by the inverse transform. With these advantages, LS has been proven to be an efficient tool for still image coding. Many extensions of LS have been proposed. One of them, known as the quincunx lifting scheme, has been found particularly useful for coding satellite images by using a quincunx grid (Gouze et al., 2004). Moreover, some directional transforms, such as the oriented wavelet transform (Chappelier and Guillemot, 2006) and grouplets (Mallat, 2009), have also been developed with LS for image processing.

Figure 5.2 shows a generic separable LS structure for a 1D signal with one resolution level. Biorthogonal wavelets can be constructed by a series of operators, including split, predict, and update, in the forward transform (analysis step).

Figure 5.2 Generic lifting structure.

Split First, the input 1D signal sj is partitioned into two sample sets: the even samples sj(2n) and the odd samples sj(2n + 1).

Predict Next, predict each sample in one of the subsets from the neighboring samples in the other subset. A prediction error, or detail signal, can be computed (here for the case of predicting the odd samples, as in the remainder of this section):

$$d_{j+1}(n) = s_j(2n+1) - \mathbf{p}^\top \mathbf{s}_j(n), \qquad (5.17)$$

where $\mathbf{p}$ is the prediction vector containing the prediction weights, $\mathbf{s}_j(n) = \big(s_j(2n-2k)\big)_{k\in\mathcal{P}}$ is the reference vector containing the even samples used for predicting the odd sample, and $\mathcal{P}$ is the support of the predictor.

Update Then, a coarser approximation of the original signal is generated by a smoothing operation on the even samples using the detail coefficients:

$$s_{j+1}(n) = s_j(2n) + \mathbf{u}^\top \mathbf{d}_{j+1}(n), \qquad (5.18)$$

where, similarly, $\mathbf{u}$ is the update vector containing the weights of the update operator, $\mathbf{d}_{j+1}(n) = \big(d_{j+1}(n-k)\big)_{k\in\mathcal{U}}$ is the reference vector containing the detail coefficients, and $\mathcal{U}$ is the support of the update operator.

It is obvious that the two sample subsets can be easily reconstructed from the forward transform by two inverse operators:

Undo update

$$s_j(2n) = s_{j+1}(n) - \mathbf{u}^\top \mathbf{d}_{j+1}(n), \qquad (5.19)$$

Undo predict

$$s_j(2n+1) = d_{j+1}(n) + \mathbf{p}^\top \mathbf{s}_j(n). \qquad (5.20)$$

Then, they are merged to obtain the original signal. Overall, the LS is easier to implement than the filter bank-based method.
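A minimal sketch of one level of this lifting structure follows Eqs. (5.17)-(5.20) with the fixed 5/3-type weights given later in this section (prediction by the average of the two neighboring even samples, update weights 1/4); the signal length is assumed even and borders are handled by simple sample replication, which is an illustrative simplification:

```python
import numpy as np

def lifting_forward(s):
    """One 1D lifting level: split, predict (Eq. 5.17), update (Eq. 5.18)."""
    even, odd = s[0::2].astype(float), s[1::2].astype(float)
    even_next = np.roll(even, -1)
    even_next[-1] = even[-1]                       # border handling by replication
    d = odd - 0.5 * (even + even_next)             # detail: predict odd from neighboring evens
    d_prev = np.roll(d, 1)
    d_prev[0] = d[0]
    a = even + 0.25 * (d + d_prev)                 # approximation: smooth evens with details
    return a, d

def lifting_inverse(a, d):
    """Undo update (Eq. 5.19) and undo predict (Eq. 5.20), then merge."""
    d_prev = np.roll(d, 1)
    d_prev[0] = d[0]
    even = a - 0.25 * (d + d_prev)
    even_next = np.roll(even, -1)
    even_next[-1] = even[-1]
    odd = d + 0.5 * (even + even_next)
    s = np.empty(2 * a.size)
    s[0::2], s[1::2] = even, odd
    return s

s = np.random.default_rng(3).standard_normal(64)
a, d = lifting_forward(s)
assert np.allclose(lifting_inverse(a, d), s)       # perfect reconstruction
```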

Similarly, the separable 2D-DWT can also be implemented by applying the 1D case to the lines and columns. Also, one can obtain a multiresolution representation by recursively repeating the analysis steps to the approximation coefficients.

Based on LS, VLS has been developed to encode jointly two sets of dependent signals which will be denoted by S(1) and S(2) (Benazza-Benyahia et al., 2002; Kaaniche et al., 2009). The block diagram of a separable vector lifting scheme is shown in Fig. 5.3.

Figure 5.3 Principle of the VLS decomposition.

The principle of this multiscale decomposition is described for a given line x. In what follows, $S^{(1)}_j$ and $S^{(2)}_j$ designate the approximation coefficients of the 2D signals S(1) and S(2) at resolution level j. While j = 0 corresponds to the initial (full resolution) signals S(1) and S(2), note that the dimensions of $S^{(1)}_j$ and $S^{(2)}_j$ (j ≥ 1) are divided by $2^j$ along the horizontal and vertical directions.

Decomposition of reference signal As can be seen in Fig. 5.3, the reference signal S(1) is first encoded using a classical lifting structure composed of a prediction and an update stage. To this end, for a given line x, the input signal $S^{(1)}_j(x,y)$ is first partitioned into two datasets formed by the even samples $S^{(1)}_j(x,2y)$ and the odd samples $S^{(1)}_j(x,2y+1)$, respectively. Then, during the prediction step, each sample of one of the two subsets (say the odd ones) is predicted from the neighboring even samples, yielding the detail coefficients $\tilde{d}^{(1)}_{j+1}$ at resolution (j + 1):

$$\tilde{d}^{(1)}_{j+1}(x,y) = S^{(1)}_j(x,2y+1) - \sum_{k\in\mathcal{P}^{(1)}_j} p^{(1)}_{j,k}\, S^{(1)}_j(x,2y-2k), \qquad (5.21)$$

where the coefficients $p^{(1)}_{j,k}$ and the set $\mathcal{P}^{(1)}_j$ represent, respectively, the weights and the support of the predictor of the odd samples $S^{(1)}_j(x,2y+1)$. After that, the update step aims at computing a coarser approximation $\tilde{S}^{(1)}_{j+1}$ of the original signal by smoothing the even samples $S^{(1)}_j(x,2y)$ as follows:

$$\tilde{S}^{(1)}_{j+1}(x,y) = S^{(1)}_j(x,2y) + \sum_{k\in\mathcal{U}^{(1)}_j} u^{(1)}_{j,k}\, \tilde{d}^{(1)}_{j+1}(x,y-k), \qquad (5.22)$$

where the set $\mathcal{U}^{(1)}_j$ denotes the spatial support of the update operator whose coefficients are $u^{(1)}_{j,k}$.

Decomposition of target signal Once the reference signal S(1) is encoded in intra mode, attention is turned to the target signal S(2). It is important to note that the main difference between a basic lifting scheme and the vector lifting scheme is that, for the target signal S(2), the prediction step uses samples from the signal itself as well as the corresponding samples taken from the reference signal S(1).
As shown in Fig. 5.3, a P-U-P structure is used for the target signal S(2). More precisely, a first intra-prediction step is applied to generate an intermediate detail signal $\check{d}^{(2)}_{j+1}$, which serves to compute the approximation signal $\tilde{S}^{(2)}_{j+1}$ through the update step. After that, a hybrid prediction is performed by exploiting simultaneously the intra- and inter-redundancies in order to compute the final detail signal $\tilde{d}^{(2)}_{j+1}$. Thus, the resulting decomposition implies the following equations:

$$\check{d}^{(2)}_{j+1}(x,y) = S^{(2)}_j(x,2y+1) - \sum_{k\in\mathcal{P}^{(2)}_j} p^{(2)}_{j,k}\, S^{(2)}_j(x,2y-2k), \qquad (5.23)$$

$$\tilde{S}^{(2)}_{j+1}(x,y) = S^{(2)}_j(x,2y) + \sum_{k\in\mathcal{U}^{(2)}_j} u^{(2)}_{j,k}\, \check{d}^{(2)}_{j+1}(x,y-k), \qquad (5.24)$$

$$\tilde{d}^{(2)}_{j+1}(x,y) = \check{d}^{(2)}_{j+1}(x,y) - \left(\sum_{k\in\mathcal{Q}_j} q_{j,k}\, \tilde{S}^{(2)}_{j+1}(x,y-k) + \sum_{k\in\mathcal{P}^{(1,2)}_j} p^{(1,2)}_{j,k}\, S^{(1)}_j(x,2y+1-k)\right), \qquad (5.25)$$

where $\mathcal{P}^{(2)}_j$ (respectively, $\mathcal{P}^{(1,2)}_j$) is the spatial support of the intra-signal (respectively, inter-signal) predictor, whose weights are designated by $p^{(2)}_{j,k}$ (respectively, $p^{(1,2)}_{j,k}$), and $\mathcal{Q}_j$ (respectively, $q_{j,k}$) is the support (respectively, are the weights) of the second intra-signal predictor.
Since a separable decomposition has been considered, these steps are iterated on the columns y of the resulting subbands $\tilde{S}^{(1)}_{j+1}$, $\tilde{d}^{(1)}_{j+1}$, $\tilde{S}^{(2)}_{j+1}$, $\tilde{d}^{(2)}_{j+1}$ in order to produce the approximation subbands $S^{(1)}_{j+1}$ and $S^{(2)}_{j+1}$ as well as three detail subbands per signal, oriented horizontally, vertically, and diagonally. This decomposition is repeated on the approximation subbands over J resolution levels, yielding the multiresolution representation of the two input signals.
Finally, at the last resolution level J, instead of encoding the approximation subband of the target signal $S^{(2)}_J$, it is proposed to encode the residual subband given by:

$$e^{(2)}_J(x,y) = S^{(2)}_J(x,y) - \sum_{k\in\mathcal{P}^{(1,2)}_J} p^{(1,2)}_{J,k}\, S^{(1)}_J(x,y-k). \qquad (5.26)$$

Prediction and update filters Prediction and update filters with spatial supports $\mathcal{P}^{(1)}_j = \{-1,0\}$ and $\mathcal{U}^{(1)}_j = \{0,1\}$ are considered for the signal S(1). In addition, the weights of the update filter are set for all resolution levels to $u^{(1)}_{j,0} = u^{(1)}_{j,1} = \frac{1}{4}$. However, it is proposed to optimize the prediction weights $p^{(1)}_{j,-1}$ and $p^{(1)}_{j,0}$ in order to design a coding scheme well adapted to the content of the hologram data. Since the detail coefficients can be viewed as prediction errors, the prediction filter coefficients can be optimized at each resolution level j by minimizing the variance of the detail signal $\tilde{d}^{(1)}_{j+1}$. More specifically, the well-known Yule-Walker equations are applied to solve the optimization problem:

$$\mathrm{E}\!\left[\mathbf{S}_j(x,y)\,\mathbf{S}_j(x,y)^\top\right]\mathbf{p}_j = \mathrm{E}\!\left[S^{(1)}_j(x,2y+1)\,\mathbf{S}_j(x,y)\right], \qquad (5.27)$$

where

$$\mathbf{S}_j(x,y) = \left(S^{(1)}_j(x,2y),\ S^{(1)}_j(x,2y+2)\right)^{\!\top}. \qquad (5.28)$$

Here, $\mathbf{p}_j = \big(p^{(1)}_{j,0},\ p^{(1)}_{j,-1}\big)^{\!\top}$ is the prediction weighting vector and E[⋅] denotes the mathematical expectation.
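In practice, the expectations in Eqs. (5.27)-(5.28) are replaced by empirical averages over the subband samples, and the resulting linear system is solved by least squares. A sketch of this weight optimization is given below (NumPy; the random input is only a stand-in for the approximation subband $S^{(1)}_j$, and the border handling simply drops the last odd column):

```python
import numpy as np

def optimal_prediction_weights(S):
    """Least-squares solution of Eq. (5.27) for the two prediction weights (p_0, p_-1)."""
    even = S[:, 0::2]
    odd = S[:, 1::2]
    # Reference samples S(x, 2y) and S(x, 2y+2) for each odd sample S(x, 2y+1)
    ref0 = even[:, :-1].ravel()        # S(x, 2y)
    ref1 = even[:, 1:].ravel()         # S(x, 2y+2)
    target = odd[:, :-1].ravel()       # S(x, 2y+1), last column dropped to align sizes
    A = np.stack([ref0, ref1], axis=1)
    # Empirical normal equations E[S S^T] p = E[S(x,2y+1) S]  <=>  least squares on A p ~ target
    p, *_ = np.linalg.lstsq(A, target, rcond=None)
    return p

rng = np.random.default_rng(4)
S1 = rng.standard_normal((128, 128))   # stand-in for the approximation subband S^(1)_j
p = optimal_prediction_weights(S1)
print("optimized prediction weights:", p)
```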
Concerning the second signal S(2), the same intra-prediction and update filters used with S(1) are employed to generate the signals $\check{d}^{(2)}_{j+1}$ and $\tilde{S}^{(2)}_{j+1}$. Then, the second prediction stage is performed by setting $\mathcal{Q}_j = \{-1,0\}$ and $\mathcal{P}^{(1,2)}_j = \{-1,0,1\}$ for j ∈ {0,…,J − 1}, and $\mathcal{P}^{(1,2)}_J = \{0\}$. The coefficients $q_{j,k}$ and $p^{(1,2)}_{j,k}$ are also optimized by minimizing the variance of the detail signal $\tilde{d}^{(2)}_{j+1}$ through the Yule-Walker equations.

The subband coefficients resulting from the VLS decomposition are then quantized and encoded using the EBCOT algorithm similarly to the JPEG 2000 standard (see Chapter 4).

The effectiveness of the joint coding scheme based on the separable VLS for holographic data compression purposes has been validated in Xing et al. (2014a). More specifically, the authors compared the performance of the VLS scheme with two reference methods using the shifted distance holographic data representation. The first method corresponds to the state-of-the-art hologram compression technique where the inputs D(1) and D(2) are separately encoded by using existing still image coders. To this end, the 9/7 wavelet transform retained in the lossy compression mode of JPEG 2000 has been used. In what follows, this scheme will be designated by “Independent.” The second one is the standard joint coding scheme, where the reference image D(1) and a residual one, given by D(2) − D(1), are also encoded by applying the 9/7 transform. We recall that this technique has been considered in most joint coding schemes developed in the context of stereo and video data compression. It will be designated by “Standard.” Note that JPEG 2000 has been used as an entropy encoder for these different hologram compression methods. Figure 5.4 shows the corresponding rate-distortion results in terms of PSNR and SSIM, respectively, versus the bitrate given in bits per pixel (bpp). Both PSNR and SSIM are computed between the original and reconstructed objects. For more details about the experimental results, readers can refer to Xing et al. (2014a).

Figure 5.4 Rate-distortion performance of three different hologram compression schemes: independent, standard, and VLS, applied on D(1) and D(2), for the objects: (a) “Bunny-1,” (b) “Bunny-2,” and (c) “Girl.”

5.2.3 Nonseparable Vector Lifting Scheme

The VLS approach presented in Section 5.2.2 has been performed in a separable way by cascading the 1D decomposition along the horizontal direction, then along the vertical direction. However, according to the visual patterns of the difference data D(1) and D(2), it can be noticed that these signals present some structures that are neither horizontal nor vertical.

Therefore, to better exploit the characteristics of the hologram data, a nonseparable decomposition has been proposed in Xing et al. (2015), with the objective to build an efficient content-adaptive decomposition. This 2D nonseparable vector lifting scheme (NS-VLS) is described in more detail in this section. We use the same notations as in Section 5.2.2.

5.2.3.1 Basic Concept of Decomposition Structure

At each pixel location (x,y), the approximation coefficients of the first (respectively, second) image, $D^{(1)}_j(x,y)$ (respectively, $D^{(2)}_j(x,y)$), are divided into four polyphase components denoted by

$$\begin{cases}
D^{(i)}_{0,j}(x,y) = D^{(i)}_j(2x,2y), \\
D^{(i)}_{1,j}(x,y) = D^{(i)}_j(2x,2y+1), \\
D^{(i)}_{2,j}(x,y) = D^{(i)}_j(2x+1,2y), \\
D^{(i)}_{3,j}(x,y) = D^{(i)}_j(2x+1,2y+1),
\end{cases} \qquad (5.29)$$

where i ∈{1,2}.
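The polyphase split of Eq. (5.29) amounts to simple strided slicing. A sketch in NumPy (the input is assumed to have even dimensions):

```python
import numpy as np

def polyphase_split(D):
    """Four polyphase components of Eq. (5.29)."""
    return (D[0::2, 0::2],   # D_0: even rows, even columns
            D[0::2, 1::2],   # D_1: even rows, odd columns
            D[1::2, 0::2],   # D_2: odd rows, even columns
            D[1::2, 1::2])   # D_3: odd rows, odd columns

def polyphase_merge(D0, D1, D2, D3):
    """Inverse of the split, recombining the four components."""
    D = np.empty((2 * D0.shape[0], 2 * D0.shape[1]), dtype=D0.dtype)
    D[0::2, 0::2], D[0::2, 1::2], D[1::2, 0::2], D[1::2, 1::2] = D0, D1, D2, D3
    return D

D = np.arange(64).reshape(8, 8)
assert np.array_equal(polyphase_merge(*polyphase_split(D)), D)
```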

Figure 5.5 shows the NS-VLS analysis structure. One image (here D(1)) is selected as a reference image and encoded independently of the other one. To this end, a classical nonseparable lifting structure (Kaaniche et al., 2011b), composed of three prediction steps and an update one, is applied to $D^{(1)}_j$ in order to generate the detail signals oriented diagonally $D^{(HH,1)}_{j+1}$, vertically $D^{(LH,1)}_{j+1}$, and horizontally $D^{(HL,1)}_{j+1}$, as well as the approximation one $D^{(1)}_{j+1}$. Thus, the output wavelet coefficients, at resolution level (j + 1), can be written as follows:

$$D^{(HH,1)}_{j+1}(x,y) = D^{(1)}_{3,j}(x,y) - \left(\big(\mathbf{P}^{(HH,1)}_{0,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{0,j} + \big(\mathbf{P}^{(HH,1)}_{1,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{1,j} + \big(\mathbf{P}^{(HH,1)}_{2,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{2,j}\right), \qquad (5.30)$$

$$D^{(LH,1)}_{j+1}(x,y) = D^{(1)}_{2,j}(x,y) - \left(\big(\mathbf{P}^{(LH,1)}_{0,j}\big)^{\!\top}\mathbf{D}^{(LH,1)}_{0,j} + \big(\mathbf{P}^{(LH,1)}_{1,j}\big)^{\!\top}\underline{\mathbf{D}}^{(HH,1)}_{j+1}\right), \qquad (5.31)$$

$$D^{(HL,1)}_{j+1}(x,y) = D^{(1)}_{1,j}(x,y) - \left(\big(\mathbf{P}^{(HL,1)}_{0,j}\big)^{\!\top}\mathbf{D}^{(HL,1)}_{0,j} + \big(\mathbf{P}^{(HL,1)}_{1,j}\big)^{\!\top}\overline{\mathbf{D}}^{(HH,1)}_{j+1}\right), \qquad (5.32)$$

$$D^{(1)}_{j+1}(x,y) = D^{(1)}_{0,j}(x,y) + \left(\big(\mathbf{U}^{(HL,1)}_{0,j}\big)^{\!\top}\mathbf{D}^{(HL,1)}_{j+1} + \big(\mathbf{U}^{(LH,1)}_{1,j}\big)^{\!\top}\mathbf{D}^{(LH,1)}_{j+1} + \big(\mathbf{U}^{(HH,1)}_{2,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{j+1}\right), \qquad (5.33)$$

where for every i ∈{0,1,2} and o ∈{HL, LH, HH},

 $\mathbf{P}^{(o,1)}_{i,j} = \big(p^{(o,1)}_{i,j}(s,t)\big)_{(s,t)\in\mathcal{P}^{(o,1)}_{i,j}}$ represents the prediction weighting vector whose support is denoted by $\mathcal{P}^{(o,1)}_{i,j}$;

 $\mathbf{D}^{(o,1)}_{i,j} = \big(D^{(1)}_{i,j}(x+s,y+t)\big)_{(s,t)\in\mathcal{P}^{(o,1)}_{i,j}}$ is a reference vector used to compute $D^{(o,1)}_{j+1}(x,y)$;

 $\underline{\mathbf{D}}^{(HH,1)}_{j+1} = \big(D^{(HH,1)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{P}^{(LH,1)}_{1,j}}$ and $\overline{\mathbf{D}}^{(HH,1)}_{j+1} = \big(D^{(HH,1)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{P}^{(HL,1)}_{1,j}}$ correspond to the two reference vectors used in the second and third prediction steps, respectively;

 $\mathbf{U}^{(o,1)}_{i,j} = \big(u^{(o,1)}_{i,j}(s,t)\big)_{(s,t)\in\mathcal{U}^{(o,1)}_{i,j}}$ is the vector of update coefficients whose support is designated by $\mathcal{U}^{(o,1)}_{i,j}$; and

 $\mathbf{D}^{(o,1)}_{j+1} = \big(D^{(o,1)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{U}^{(o,1)}_{i,j}}$ is the reference vector containing the set of detail samples used in the update step.

Figure 5.5 Proposed NS-VLS decomposition structure.

For the second image $D^{(2)}_j$, selected as the target one, a joint wavelet decomposition is performed by taking into account its correlation with the reference one $D^{(1)}_j$. More specifically, a lifting structure, similar to that used with $D^{(1)}_j$, is first applied to $D^{(2)}_j$ in order to produce three intermediate detail signals $\tilde{D}^{(HH,2)}_{j+1}$, $\tilde{D}^{(LH,2)}_{j+1}$, and $\tilde{D}^{(HL,2)}_{j+1}$, which are used to compute the approximation coefficients $D^{(2)}_{j+1}$. Then, a second prediction stage is added.

Indeed, three hybrid prediction steps, which aim at exploiting simultaneously the intra- and inter-image redundancies, are applied in order to generate the final detail coefficients $D^{(HH,2)}_{j+1}$, $D^{(LH,2)}_{j+1}$, and $D^{(HL,2)}_{j+1}$. Thus, the final detail wavelet coefficients, at resolution level (j + 1), are expressed as follows:

$$\begin{aligned}
D^{(HH,2)}_{j+1}(x,y) = \tilde{D}^{(HH,2)}_{j+1}(x,y) - \Big(&\big(\mathbf{Q}^{(HH,2)}_{0,j}\big)^{\!\top}\tilde{\mathbf{D}}^{(HH,2)}_{0,j+1} + \big(\mathbf{Q}^{(HH,2)}_{1,j}\big)^{\!\top}\tilde{\mathbf{D}}^{(HH,2)}_{1,j+1} + \big(\mathbf{Q}^{(HH,2)}_{2,j}\big)^{\!\top}\tilde{\mathbf{D}}^{(HH,2)}_{2,j+1} \\
&+ \big(\mathbf{P}^{(HH,1,2)}_{0,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{0,j} + \big(\mathbf{P}^{(HH,1,2)}_{1,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{1,j} + \big(\mathbf{P}^{(HH,1,2)}_{2,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{2,j} + \big(\mathbf{P}^{(HH,1,2)}_{3,j}\big)^{\!\top}\mathbf{D}^{(HH,1)}_{3,j}\Big),
\end{aligned}$$

$$D^{(LH,2)}_{j+1}(x,y) = \tilde{D}^{(LH,2)}_{j+1}(x,y) - \left(\big(\mathbf{Q}^{(LH,2)}_{0,j}\big)^{\!\top}\tilde{\mathbf{D}}^{(LH,2)}_{0,j+1} + \big(\mathbf{Q}^{(LH,2)}_{1,j}\big)^{\!\top}\underline{\mathbf{D}}^{(HH,2)}_{j+1} + \big(\mathbf{P}^{(LH,1,2)}_{0,j}\big)^{\!\top}\mathbf{D}^{(LH,1)}_{0,j} + \big(\mathbf{P}^{(LH,1,2)}_{2,j}\big)^{\!\top}\mathbf{D}^{(LH,1)}_{2,j}\right),$$

$$D^{(HL,2)}_{j+1}(x,y) = \tilde{D}^{(HL,2)}_{j+1}(x,y) - \left(\big(\mathbf{Q}^{(HL,2)}_{0,j}\big)^{\!\top}\tilde{\mathbf{D}}^{(HL,2)}_{0,j+1} + \big(\mathbf{Q}^{(HL,2)}_{1,j}\big)^{\!\top}\overline{\mathbf{D}}^{(HH,2)}_{j+1} + \big(\mathbf{P}^{(HL,1,2)}_{0,j}\big)^{\!\top}\mathbf{D}^{(HL,1)}_{0,j} + \big(\mathbf{P}^{(HL,1,2)}_{1,j}\big)^{\!\top}\mathbf{D}^{(HL,1)}_{1,j}\right),$$

where for every i ∈{0,1,2,3} and o ∈{HL, LH, HH},

 $\mathbf{Q}^{(o,2)}_{i,j} = \big(q^{(o,2)}_{i,j}(s,t)\big)_{(s,t)\in\mathcal{Q}^{(o,2)}_{i,j}}$ is the intra-prediction weighting vector whose support is designated by $\mathcal{Q}^{(o,2)}_{i,j}$;

 $\mathbf{P}^{(o,1,2)}_{i,j} = \big(p^{(o,1,2)}_{i,j}(s,t)\big)_{(s,t)\in\mathcal{P}^{(o,1,2)}_{i,j}}$ is the hybrid prediction weighting vector whose support is denoted by $\mathcal{P}^{(o,1,2)}_{i,j}$;

 $\tilde{\mathbf{D}}^{(o,2)}_{0,j+1} = \big(D^{(2)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{Q}^{(o,2)}_{0,j}}$ is a reference vector containing the approximation coefficients $D^{(2)}_{j+1}$ used to compute the detail ones $D^{(o,2)}_{j+1}(x,y)$;

 $\tilde{\mathbf{D}}^{(HH,2)}_{1,j+1} = \big(\tilde{D}^{(HL,2)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{Q}^{(HH,2)}_{1,j}}$ and $\tilde{\mathbf{D}}^{(HH,2)}_{2,j+1} = \big(\tilde{D}^{(LH,2)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{Q}^{(HH,2)}_{2,j}}$ are two reference vectors containing, respectively, the intermediate detail coefficients $\tilde{D}^{(HL,2)}_{j+1}$ and $\tilde{D}^{(LH,2)}_{j+1}$, used to compute the final detail coefficients $D^{(HH,2)}_{j+1}(x,y)$;

 $\underline{\mathbf{D}}^{(HH,2)}_{j+1} = \big(D^{(HH,2)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{Q}^{(LH,2)}_{1,j}}$ and $\overline{\mathbf{D}}^{(HH,2)}_{j+1} = \big(D^{(HH,2)}_{j+1}(x+s,y+t)\big)_{(s,t)\in\mathcal{Q}^{(HL,2)}_{1,j}}$ are two intra-prediction vectors used to compute $D^{(LH,2)}_{j+1}(x,y)$ and $D^{(HL,2)}_{j+1}(x,y)$; and

 $\mathbf{D}^{(o,1)}_{i,j} = \big(D^{(1)}_{i,j}(x+s,y+t)\big)_{(s,t)\in\mathcal{P}^{(o,1,2)}_{i,j}}$ is a vector containing samples of the reference image $D^{(1)}_j$ used to exploit the inter-image redundancies during the computation of the final detail coefficients $D^{(o,2)}_{j+1}(x,y)$.

By repeating the same decomposition strategy on the approximation subbands over J resolution levels, two multiresolution representations of D(1) and D(2) are obtained. Finally, at the last resolution level J, instead of encoding the approximation subband of the target image $D^{(2)}_J$, it is more interesting to exploit its correlation with $D^{(1)}_J$ and thus to encode the following residual subband $e^{(2)}_J$:

$$e^{(2)}_J(x,y) = D^{(2)}_J(x,y) - p^{(1,2)}_J\, D^{(1)}_J(x,y), \qquad (5.34)$$

where $p^{(1,2)}_J$ is a hybrid prediction coefficient that exploits the correlation between $D^{(2)}_J$ and $D^{(1)}_J$.

5.2.3.2 Optimization Techniques

As with the VLS-based approach, the ultimate aim is to design an NS-VLS-based decomposition well adapted to the characteristics of the holographic data. Consequently, the optimization of all the operators is necessary. Due to the complexity of the NS-VLS structure, different optimization strategies are applied:

Optimization of the predictors for D(1) For the first image D(1), the different prediction filters $\mathbf{P}^{(o,1)}_j$ (with o ∈ {HH, LH, HL}), used to generate the detail subbands $D^{(o,1)}_{j+1}$, are optimized at each resolution level by minimizing the variance of the detail coefficients. As mentioned before, the Yule-Walker equations should be satisfied.

Optimization of the update operators Unlike the fixed assignment of the update operators in the separable VLS, the update filter at each level is optimized here by minimizing the error between the approximation coefficients $D^{(1)}_{j+1}$ and the decimated version of the signal obtained with an ideal low-pass filter. Owing to the complexity of the corresponding equations, the details are not provided here; readers can refer to Kaaniche et al. (2011b).

Optimization of the operators for D(2) For the second image D(2), all the prediction and update operators can also be optimized by adopting the same strategy used with D(1). However, since D(1) and D(2) present similar content, the optimization process of the first three prediction filters $\mathbf{P}^{(o,2)}_j$ and the update one $\mathbf{U}^{(2)}_j$ can be omitted by imposing these operators to be equal to those obtained with D(1). Thus, we have:

$$\mathbf{P}^{(HH,2)}_j = \mathbf{P}^{(HH,1)}_j, \quad \mathbf{P}^{(LH,2)}_j = \mathbf{P}^{(LH,1)}_j, \quad \mathbf{P}^{(HL,2)}_j = \mathbf{P}^{(HL,1)}_j, \quad \mathbf{U}^{(2)}_j = \mathbf{U}^{(1)}_j. \qquad (5.35)$$

Note that this procedure presents the advantage of simplifying the optimization strategy and reducing the overhead cost corresponding to the number of filter coefficients that must be sent to the decoder. Finally, the remaining hybrid prediction filters $\mathbf{P}^{(o,1,2)}_j$ and $\mathbf{Q}^{(o,2)}_j$ are optimized by minimizing the variance of the detail coefficients $D^{(o,2)}_{j+1}$.

Figure 5.6 shows the improved performance of NS-VLS, compared to the independent coding scheme, the standard joint coding scheme, and the separable VLS, when using the shifted distance holographic data representation.

Figure 5.6 Rate-distortion performance of four different hologram compression schemes: independent, standard, separable VLS, and NS-VLS, applied on D(1) and D(2), for the objects: (a) “Luigi-1,” (b) “Luigi-2,” (c) “Bunny-1,” (d) “Bunny-2,” (e) “Girl,” and (f) “Teapot.”

In addition, different prediction filter lengths Lp have also been investigated. Indeed, increasing Lp may be interesting for two reasons. The first one is that the holographic data present repetitive circular structures similar to the propagation of waves. The second one is the objective of VLS, which consists of exploiting the inter-image redundancies through the prediction stage. In this respect, in addition to the case given by Lp = 2, other cases with Lp ∈ {6,12,20,32} are also considered. The structures are designated by NS-VLS(2,2), NS-VLS(6,2), NS-VLS(12,2), NS-VLS(20,2), and NS-VLS(32,2). Some results are given in Fig. 5.7. Compared to the case Lp = 2, it can be observed that increasing this length up to 12 or 20 leads to greatly improved compression performance. However, very long filters (Lp = 32) lead to worse performance at low bitrates because of the expensive cost of encoding the filter coefficients compared to the case of small Lp values. Note that the best filter length may differ according to the content of the holograms. For more details about the experimental results, readers can refer to Xing et al. (2015).

Figure 5.7 Rate-distortion performance of NS-VLS with different prediction filter lengths for the objects: (a) “Luigi-1,” (b) “Luigi-2,” (c) “Bunny-1,” (d) “Bunny-2,” (e) “Girl,” and (f) “Teapot.”

5.2.4 Arbitrary Packet Decomposition and Direction-Adaptive Discrete Wavelet Transform

Digital holographic data exhibit different characteristics when compared to natural images. For instance, whereas images typically have a 1/f² power spectrum distribution, holographic data are characterized by a significantly larger amount of high-frequency content. Moreover, the fringe patterns show a significant directionality. These characteristics therefore need to be taken into account when designing a coding scheme.

In order to address the higher spatial frequency content, packet decompositions can be applied. More specifically, in contrast with the Mallat decomposition, packet decompositions further decompose the higher-frequency subbands, in addition to the decomposition of the low-frequency subbands, in order to achieve higher energy compaction. Such packet decompositions are supported in JPEG 2000 Part 2; however, the standard does not support a homogeneous decomposition style when the number of wavelet levels increases. For this purpose, a JPEG 2000 architecture compliant decomposition mechanism is proposed in Blinder et al. (2013, 2014) to fully enable arbitrary packet decompositions.
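To illustrate the idea of a packet decomposition, the sketch below recursively applies a single-level Haar split to all four subbands instead of only to LL (NumPy). The Haar filters and the two-level depth are illustrative choices only; they are not the filters or decomposition styles used in the cited JPEG 2000 extension:

```python
import numpy as np

def haar_split(x):
    """One 2D Haar analysis level returning (LL, LH, HL, HH)."""
    a = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)       # rows: low-pass
    d = (x[0::2, :] - x[1::2, :]) / np.sqrt(2)       # rows: high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    LH = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    HL = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)
    HH = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return LL, LH, HL, HH

def full_packet(x, levels):
    """Full wavelet packet tree: every subband is split again at each level."""
    if levels == 0:
        return x
    return [full_packet(sub, levels - 1) for sub in haar_split(x)]

hologram = np.random.default_rng(5).standard_normal((256, 256))  # stand-in fringe pattern
tree = full_packet(hologram, levels=2)    # 16 leaf subbands instead of the 7 of a Mallat split
```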

Exploiting directionality is also clearly an important property. For this purpose, in Blinder et al. (2013, 2014), it is proposed to replace the classical JPEG 2000 wavelet transform by a direction-adaptive discrete wavelet transform (DA-DWT). More precisely, the direction-adaptive wavelet filters introduced by Chang and Girod (2007) are used, in combination with the above wavelet packet decomposition.

The resulting coding architecture is shown in Fig. 5.8, where multiple DA-DWTs are applied on evenly sized blocks of the hologram.

Figure 5.8 JPEG 2000 architecture extended with an arbitrary packet decomposition and direction-adaptive discrete wavelet transform. Source: Adapted from Blinder et al. (2014)

The DA-DWT is also based on the lifting scheme, more precisely on directional lifting. Moreover, it is able to represent the sharp features in images more efficiently. In order to explain the DA-DWT, we follow the notations in Chang and Girod (2007). Let $\pi = \{(l_x,l_y) \in \mathbb{Z}^2\}$ be a 2D orthogonal sampling grid composed of four subgrids $\pi_{pq} = \{(l_x,l_y) \in \pi \mid l_x \bmod 2 = p,\ l_y \bmod 2 = q\}$. $S = \{s[\mathbf{l}]\}$, where $s[\mathbf{l}] = s[l_x,l_y]$ and $\mathbf{l} = (l_x,l_y)$, denotes a set of image samples on $\pi$. $S_0 = \{s[\mathbf{l}_0] \mid \mathbf{l}_0 \in \pi_0 = \pi_{00} \cup \pi_{01}\}$ and $S_1 = \{s[\mathbf{l}_1] \mid \mathbf{l}_1 \in \pi_1 = \pi_{10} \cup \pi_{11}\}$ denote the samples in even and odd rows, respectively.

Similar to the description of the lifting scheme in Section 5.2.2, the detail signal $w_1 = \{w_1[\mathbf{l}_1],\ \mathbf{l}_1 \in \pi_1\}$ and the approximation signal $w_0 = \{w_0[\mathbf{l}_0],\ \mathbf{l}_0 \in \pi_0\}$ can be obtained by a prediction step and an update step in a 1D vertical lifting, respectively, as follows:

$$\begin{aligned}
w_1[\mathbf{l}_1] &= g_H\big(s[\mathbf{l}_1] - P_{s,\mathbf{l}_1}(S_0)\big), & \mathbf{l}_1 \in \pi_1, \\
w_0[\mathbf{l}_0] &= g_L\big(s[\mathbf{l}_0] + g_H^{-1}\, U_{s,\mathbf{l}_0}(w_1)\big), & \mathbf{l}_0 \in \pi_0,
\end{aligned} \qquad (5.36)$$

where $g_H$, $g_L$ are scaling factors and $P(\cdot)$, $U(\cdot)$ are the prediction and update operators, respectively. After further decomposition along the columns, four subbands in total are generated: $w_{00} = LL$ defined on $\pi_{00}$, $w_{01} = LH$ defined on $\pi_{01}$, $w_{10} = HL$ defined on $\pi_{10}$, and $w_{11} = HH$ defined on $\pi_{11}$.

In the DA-DWT, the prediction and update operators with direction $\mathbf{d} = (d_x,d_y)$ are defined as

$$\begin{aligned}
P^i_{s,\mathbf{l}_1}(S_0) &= \sum_{k=-K_P}^{K_P-1} c_{P,k}\, s[\mathbf{l}_1 - (2k+1)\mathbf{d}], \\
U_{s,\mathbf{l}_0}(w_1) &= \sum_{k=-K_U}^{K_U-1} c_{U,k} \sum_{\{\mathbf{l}_1 \,\mid\, \mathbf{l}_1-(2k+1)\mathbf{d}^*_{\mathbf{l}_1}=\mathbf{l}_0\}} w_1[\mathbf{l}_1],
\end{aligned} \qquad (5.37)$$

where $\mathbf{d}$ is defined so as to satisfy:

$$\mathbf{l}_1 - (2k+1)\mathbf{d} \in \pi_0, \quad \forall\, \mathbf{l}_1 \in \pi_1,\ k = -K_P,\ldots,K_P-1, \qquad (5.38)$$

and $i = 0,1,\ldots,N_c-1$ is the direction index; $N_c$ is the number of prediction directions; $K_P$, $K_U$, $c_{P,k}$, and $c_{U,k}$ are defined by the adopted wavelet kernel; and $\mathbf{d}^*_{\mathbf{l}_1}$ denotes the direction selected at location $\mathbf{l}_1$ for $P_{s,\mathbf{l}_1}(S_0)$.

For each image block, the optimum direction $\mathbf{d}$ is selected by minimizing the prediction error $w_1[\mathbf{l}_1]$. A speed-up process to search for the optimum direction has also been proposed in Stevens et al. (2008). For a multiscale decomposition, the directional lifting steps above are repeated for every LL subband.
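A sketch of this direction-selection step is given below: for each candidate offset d, the odd rows are predicted from the even rows above and below, displaced by +d and −d, and the offset minimizing the residual energy of the block is retained (NumPy; the candidate set is illustrative, borders wrap for simplicity, and the actual DA-DWT additionally constrains the directions so that Eq. (5.38) holds):

```python
import numpy as np

def best_direction(block, candidates=(-2, -1, 0, 1, 2)):
    """Pick the horizontal lifting offset d minimizing the prediction-residual energy."""
    even, odd = block[0::2, :], block[1::2, :]
    even_below = np.roll(even, -1, axis=0)          # even row below each odd row (wraps at the border)
    best, best_err = None, np.inf
    for d in candidates:
        upper = np.roll(even, -d, axis=1)           # even sample above, shifted by +d
        lower = np.roll(even_below, d, axis=1)      # even sample below, shifted by -d
        residual = odd - 0.5 * (upper + lower)      # directional prediction of the odd rows
        err = float(np.sum(residual ** 2))
        if err < best_err:
            best, best_err = d, err
    return best, best_err

block = np.random.default_rng(6).standard_normal((32, 32))
d_star, energy = best_direction(block)
print("selected offset:", d_star)
```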

Due to the high overhead cost of storing the direction indices of each subband, the DA-DWT is only applied to the three main decompositions involved in the generation of the 1LL, 2LL, and 3LL subbands (Blinder et al., 2013).

Significant improvements in coding performance have been reported for off-axis holography and computer-generated holography compared to the conventional JPEG 2000 standard, with Bjøntegaard delta-peak signal-to-noise ratio (BD-PSNR) improvements ranging up to 11 dB for lossy compression in the 0.125-2.00 bits per pixel (bpp) range, and bit-rate reductions of up to 1.6 bpp for lossless compression (Blinder et al., 2013, 2015). Table 5.1 demonstrates the significant improvements resulting from the wavelet packet decomposition and DA-DWT on five different test objects.

Table 5.1

BD-PSNR Improvements (in dB) Compared to the Mallat DWT, per Decomposition, With and Without DA-DWT

| Lossy | Pack1 | Pack2 | Full Packet | DA-DWT + Mallat | DA-DWT + Pack1 | DA-DWT + Pack2 | DA-DWT + Full Packet |
|---|---|---|---|---|---|---|---|
| Neuron | 3.45 | 3.34 | 2.79 | 5.75 | 3.40 | 3.28 | 2.75 |
| Erythocryte | 5.44 | 5.41 | 8.20 | 7.28 | 8.31 | 8.30 | 8.39 |
| Microlenses | 0.06 | 0.30 | 2.10 | 2.89 | 2.65 | 2.61 | 2.37 |
| Ball | 11.81 | 11.91 | 12.64 | 5.14 | 12.74 | 12.74 | 12.57 |
| Scratch | −7.13 | −7.07 | −5.26 | 1.98 | −4.59 | −4.57 | −4.41 |
| Average | 2.73 | 2.78 | 4.09 | 4.59 | 4.50 | 4.47 | 4.33 |


Source: Blinder et al. (2013).

5.2.5 Morlet Transform

The problem of viewpoint scalability is considered in Viswanathan et al. (2013). The authors address different aspects related to hologram display in the presence of a few viewers. In such a case, only a sparse set of holographic data is necessary to reconstruct a given viewpoint. In other words, the wavelet transform should allow coefficient pruning so that only relevant data are transmitted. To enable this view-based representation of the hologram, good space-frequency localization is needed. Since the wavelet families based on or derived from Gabor basis functions can provide the best space-frequency localization, in Viswanathan et al. (2014), Morlet wavelets are proposed to transform off-axis holograms.

Morlet wavelets are based on the concept of Gabor functions, so we start from their definition. A Gabor function is a Gaussian kernel multiplied by a sinusoid. Its 1D continuous form is defined as

$$g_{\beta,f_0}(x) = K\exp(-\beta^2 x^2)\exp(2i\pi f_0 x), \qquad \beta^2 = \frac{1}{2\sigma^2}, \qquad (5.39)$$

where f0 is the frequency, K is the norm of the basis function, and σ is the standard deviation of the Gaussian function. In order to have an equal number of oscillations for all frequencies, the setting

$$f_0 = \frac{1}{\sqrt{2}}\,\pi\beta \qquad (5.40)$$

is chosen. The 1D Morlet mother wavelet is further defined as

$$\psi(x) = K\exp(-\beta^2 x^2)\left(\exp(2i\pi f_0 x) - \exp\!\left(-\frac{\pi^2 f_0^2}{\beta^2}\right)\right), \qquad (5.41)$$

where the term $\exp\!\left(-\frac{\pi^2 f_0^2}{\beta^2}\right)$ is introduced to eliminate the DC term of the off-axis hologram (Zhong et al., 2009). By extending Eq. (5.41) to 2D, we obtain

$$\psi(x,y) = K\exp\!\big(-\beta^2(x^2+y^2)\big)\left(\exp\!\big(2i\pi(\mu_0 x+\nu_0 y)\big) - \exp\!\left(-\frac{\pi^2(\mu_0^2+\nu_0^2)}{\beta^2}\right)\right), \qquad (5.42)$$

where μ0 and ν0 are the spatial frequencies in the x and y directions, respectively, and K is the L2-norm of the basis. For the same purpose of having an equal number of oscillations for all frequencies, the setting

$$\mu_0^2 + \nu_0^2 = \pi^2\beta^2 \qquad (5.43)$$

is chosen. Consequently, we can obtain

$$\exp\!\left(-\frac{\pi^2(\mu_0^2+\nu_0^2)}{\beta^2}\right) = \exp(-\pi^4) \approx 0. \qquad (5.44)$$

The spatial frequencies μ0 and ν0 can be represented in polar form as

$$\mu_0 = F_0\cos\theta, \qquad \nu_0 = F_0\sin\theta, \qquad (5.45)$$

where the variation of θ describes the variation of the spatial frequencies in the x and y directions. Substituting Eqs. (5.40), (5.43), and (5.45) into Eq. (5.42), the 2D Morlet mother wavelet becomes

$$\psi(x,y) = K\exp\!\left(-\frac{2f_0^2(x^2+y^2)}{\pi^2}\right)\exp\!\big(2i\pi^2\beta(x\cos\theta+y\sin\theta)\big), \qquad (5.46)$$

where the mother wavelet is centered at f0. In order to build the family of Morlet wavelets, the scaling parameter s, which controls the spatial and frequency resolution of the decomposition, is introduced to span the frequency plane, with the scaled frequencies defined by

$$f = \frac{f_0}{s}. \qquad (5.47)$$

Therefore, the continuous 2D Morlet wavelet can be represented with the parameters s and θ by

$$\psi_{s,\theta}(x,y) = K\exp\!\left(-\frac{2f_0^2}{s^2\pi^2}(x^2+y^2)\right)\exp\!\left(2i\pi^2\frac{f_0}{s\pi}\big(x\cos\theta+y\sin\theta\big)\right). \qquad (5.48)$$

For practical application, the continuous transform has to be discretized to obtain the view-based representation of holograms. The discretization mainly concerns three aspects: the discretization of x and y, the discretization of s, and the discretization of θ.

The discretization of x and y is based on the extent of the Gaussian function. Assume that the length of the Gaussian is L, selected based on the tapering of the Gaussian function, and that there are N0 oscillations of the sinusoid of frequency f0; we then have

$$L = \frac{N_0}{f_0}. \qquad (5.49)$$

By the Nyquist criterion, the sampling frequency fs should be twice f0. The actual size of the analog window function is consequently

$$L_\omega = L\, f_s = \frac{N_0}{f_0}\, 2f_0 = 2N_0. \qquad (5.50)$$

The rotation parameter θ is discretized by performing Nθ rotations of the sinusoid from 0 to π. The discrete spatial frequencies fx and fy in the x and y directions are then represented by

$$f_x = \frac{f_c\cos(\Theta r)}{s}, \qquad f_y = \frac{f_c\sin(\Theta r)}{s}, \qquad (5.51)$$

respectively, where $\Theta = \frac{\pi}{N_\theta}$, r is the discrete rotation parameter, and $f_c = \frac{1}{\Delta}$, with Δ the pixel period of the holographic display.

The discretization of the scale parameter s depends on the discrete viewer positions. Assume that θx and θy are the diffraction angles along the x and y directions, respectively. Then the viewer plane can be divided into zones $R_{\theta_x,\theta_y}$, each of which corresponds to a unique Morlet transformed hologram zone $H_{\theta_x,\theta_y}$. The discrete diffraction angles are obtained by

$$\theta_x = \arcsin\!\left(\frac{\lambda f_c\cos(\Theta r)}{s}\right), \qquad \theta_y = \arcsin\!\left(\frac{\lambda f_c\sin(\Theta r)}{s}\right). \qquad (5.52)$$

Finally, the discrete Morlet wavelet can be written as

$$\psi_{s,r}(m,n) = K\exp\!\left(-\frac{f_c^2}{s^2\pi^2}(m^2+n^2)\right)\exp\!\left(2i\pi^2\frac{f_c}{s\pi}\big(m\cos(\Theta r)+n\sin(\Theta r)\big)\right), \qquad (5.53)$$

where $-N_0 \le m \le N_0$ and $-N_0 \le n \le N_0$. The Morlet wavelet transform is then performed on the off-axis holograms.
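A sketch that builds the discrete Morlet filter of Eq. (5.53), as reconstructed above, and applies it to a hologram by convolution (NumPy and SciPy's fftconvolve). The window size, scale, rotation index, number of rotations, and pixel pitch are illustrative assumptions, and the constant K is simply chosen to give the filter unit L2 norm:

```python
import numpy as np
from scipy.signal import fftconvolve

def morlet_filter(N0, s, r, N_theta, pitch):
    """Discrete 2D Morlet wavelet of Eq. (5.53) on the grid -N0 <= m, n <= N0."""
    f_c = 1.0 / pitch                         # f_c = 1 / Delta (pixel period of the display)
    Theta = np.pi / N_theta
    m = np.arange(-N0, N0 + 1)
    M, N = np.meshgrid(m, m, indexing="ij")
    envelope = np.exp(-(f_c**2) * (M**2 + N**2) / (s**2 * np.pi**2))
    carrier = np.exp(2j * np.pi**2 * f_c / (s * np.pi)
                     * (M * np.cos(Theta * r) + N * np.sin(Theta * r)))
    psi = envelope * carrier
    return psi / np.linalg.norm(psi)          # K chosen so that the filter has unit L2 norm

# Illustrative parameters (pitch in pixel units)
psi = morlet_filter(N0=8, s=4.0, r=3, N_theta=8, pitch=1.0)
hologram = np.random.default_rng(7).standard_normal((256, 256))   # stand-in off-axis hologram
coeffs = fftconvolve(hologram, psi, mode="same")                   # Morlet coefficients for (s, r)
```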

References

Ahar A., Blinder D., Bruylants T., Schretter C., Munteanu A., Schelkens P. Subjective quality assessment of numerically reconstructed compressed holograms. In: Proc. SPIE, Applications of Digital Image Processing XXXVIII. 2015 San Diego, CA, USA.

Arrifano A., Antonini M., Pereira M. Multiple description coding of digital holograms using maximum-a-posteriori. In: European Workshop on Visual Information Processing. 2013:232–237.

Benazza-Benyahia A., Pesquet J.C., Hamdi M. Vector lifting schemes for lossless coding and progressive archival of multispectral images. IEEE Trans. Geosci. Remote Sens. 2002;40(9):2011–2024.

Blinder D., Bruylants T., Stijns E., Ottevaere H., Schelkens P. Wavelet coding of off-axis holographic images. Proc. SPIE. 2013;8856:88561L. doi:10.1117/12.2027114.

Blinder D., Bruylants T., Ottevaere H., Munteanu A., Schelkens P. JPEG 2000-based compression of fringe patterns for digital holographic microscopy. Opt. Eng. 2014;53(12):123102. doi:10.1117/1.OE.53.12.123102.

Boulgouris N., Strintzis M. A family of wavelet-based stereo image coders. IEEE Trans. Circuits Syst. Video Technol. 2002;12(10):898–903.

Chang C.L., Girod B. Direction-adaptive discrete wavelet transform for image compression. IEEE Trans. Image Process. 2007;16(5):1289–1302.

Chappelier V., Guillemot C. Oriented wavelet transform for image compression and denoising. IEEE Trans. Image Process. 2006;15(10):2892–2903. doi:10.1109/TIP.2006.877526.

Darakis E., Naughton T.J. Compression of digital hologram sequences using MPEG-4. In: Proc. SPIE, Holography: Advances and Modern Trends. 2009;7358:735811. doi:10.1117/12.820632.

Darakis E., Soraghan J. Use of Fresnelets for phase-shifting digital hologram compression. IEEE Trans. Image Process. 2006a;15(12):3804–3811. doi:10.1109/TIP.2006.884918.

Darakis E., Soraghan J.J. Compression of interference patterns with application to phase-shifting digital holography. Appl. Opt. 2006b. ;45(11):2437–2443. doi:10.1364/AO.45.002437. URL http://ao.osa.org/abstract.cfm?URI=ao-45-11-2437.

Darakis E., Naughton T.J., Soraghan J.J., Javidi B. Measurement of compression defects in phase-shifting digital holographic data. In: Proc. SPIE, Optical Information Systems IV. 2006;6311. doi:10.1117/12.679445.

Darakis E., Kowiel M., Näsänen R., Naughton T.J. Visually lossless compression of digital hologram sequences. In: Proc. SPIE, Image Quality and System Performance VII. 2010.

Dirac. Dirac video compression. 2009. URL http://www.diracvideo.org/.

Gouze A., Antonini M., Barlaud M., Macq B. Design of signal-adapted multidimensional lifting scheme for lossy coding. IEEE Trans. Image Process. 2004;13(12):1589–1603. doi:10.1109/TIP.2004.837556.

Hampson F.J., Pesquet J.C. M-band nonlinear subband decompositions with perfect reconstruction. IEEE Trans. Image Process. 1998;7(11):1547–1560.

Kaaniche M., Benazza-Benyahia A., Pesquet-Popescu B., Pesquet J.C. Vector lifting schemes for stereo image coding. IEEE Trans. Image Process. 2009;18(11):2463–2475.

Kaaniche M., Benazza-Benyahia A., Pesquet-Popescu B., Pesquet J.C. Non separable lifting scheme with adaptive update step for still and stereo image coding. Signal Process. 2011a;91(12):2767–2782.

Kaaniche M., Benazza-Benyahia A., Pesquet-Popescu B., Pesquet J.C. Non-separable lifting scheme with adaptive update step for still and stereo image coding. Signal Process. 2011b. ;91(12):2767–2782. doi:10.1016/j.sigpro.2011.01.003. Advances in Multirate Filter Bank Structures and Multiscale Representations, URL http://www.sciencedirect.com/science/article/pii/S0165168411000041.

Kaaniche M., Pesquet-Popescu B., Benazza-Benyahia A., Pesquet J.C. Adaptive lifting scheme with sparse criteria for image coding. J. Adv. Signal Process. 2012;2012(1):10.

Liebling M., Blu T., Unser M. Fresnelets: new multiresolution wavelet bases for digital holography. IEEE Trans. Image Process. 2003;12(1):29–43. doi:10.1109/TIP.2002.806243.

Mallat S. Geometrical grouplets. Appl. Comput. Harmonic Anal. 2009. ;26(2):161–180. doi:10.1016/j.acha.2008.03.004. URL http://www.sciencedirect.com/science/article/pii/S1063520308000444.

Mills G., Yamaguchi I. Effects of quantization in phase-shifting digital holography. Appl. Opt. 2005. ;44(7):1216–1225. doi:10.1364/AO.44.001216. URL http://ao.osa.org/abstract.cfm?URI=ao-44-7-1216.

Moellenhoff M., Maier M. Transform coding of stereo image residuals. IEEE Trans. Image Process. 1998;7(6):804–812. doi:10.1109/83.679421.

Naughton T., Frauel Y., Javidi B., Tajahuerce E. Compression of digital holograms for three-dimensional object reconstruction and recognition. Appl. Opt. 2002;41(20):4124–4132.

Naughton T.J., McDonald J.B., Javidi B. Efficient compression of Fresnel fields for internet transmission of three-dimensional images. Appl. Opt. 2003. ;42(23):4758–4764. doi:10.1364/AO.42.004758. URL http://ao.osa.org/abstract.cfm?URI=ao-42-23-4758.

Pereira F., Ebrahimi T. The MPEG-4 Book. Upper Saddle River, NJ: Prentice Hall; 2002.

Seo Y.H., Choi H.J., Bae J.W., Yoo J.S., Kim D.W. Data compression technique for digital holograms using a temporally scalable coding method for 2-D images. In: 2006 IEEE International Symposium on Signal Processing and Information Technology. 2006:326–331.

Seo Y.H., Choi H.J., Kim D.W. 3D scanning-based compression technique for digital hologram video. Image Commun. 2007;22(2):144–156. doi:10.1016/j.image.2006.11.007.

Shortt A., Naughton T.J., Javidi B. Compression of digital holograms of three-dimensional objects using wavelets. Opt. Express. 2006a. ;14(7):2625–2630. doi:10.1364/OE.14.002625. URL http://www.opticsexpress.org/abstract.cfm?URI=oe-14-7-2625.

Shortt A.E., Naughton T.J., Javidi B. A companding approach for nonuniform quantization of digital holograms of three-dimensional objects. Opt. Express. 2006b. ;14(12):5129–5134. doi:10.1364/OE.14.005129. URL http://www.opticsexpress.org/abstract.cfm?URI=oe-14-12-5129.

Shortt A., Naughton T., Javidi B. Histogram approaches for lossy compression of digital holograms of three-dimensional objects. IEEE Trans. Image Process. 2007;16(6):1548–1556. doi:10.1109/TIP.2007.894269.

Stevens A.R., Munteanu J.C., Schelkens P. Optimized directional lifting with reduced complexity. In: EURASIP European Signal Processing Conference, EUSIPCO 2008, Lausanne, Switzerland; 2008:5.

Sullivan G., Ohm J., Han W.J., Wiegand T. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 2012;22(12):1649–1668. doi:10.1109/TCSVT.2012.2221191.

Sweldens W. The lifting scheme: a custom-design construction of biorthogonal wavelets. Appl. Comput. Harmonic Anal. 1996;3(2):186–200.

Unser M. Splines: a perfect fit for signal and image processing. IEEE Signal Process. Mag. 1999;16(6):22–38. doi:10.1109/79.799930.

Viswanathan K., Gioia P., Morin L. Wavelet compression of digital holograms: towards a view-dependent framework. Proc. SPIE. 2013;8856. doi:10.1117/12.2027199.

Viswanathan K., Gioia P., Morin L. Morlet wavelet transformed holograms for numerical adaptive view-based reconstruction. Proc. SPIE. 2014;9216:92160G. doi:10.1117/12.2061588.

Wiegand T., Sullivan G., Bjontegaard G., Luthra A. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003;13(7):560–576. doi:10.1109/TCSVT.2003.815165.

Xing Y., Pesquet-Popescu B., Dufaux F. Compression of computer generated hologram based on phase-shifting algorithm. In: European Workshop on Visual Information Processing. 2013:172–177.

Xing Y., Kaaniche M., Pesquet-Popescu B., Dufaux F. Vector lifting scheme for phase-shifting holographic data compression. Opt. Eng. 2014a;53(11):112312. doi:10.1117/1.OE.53.11.112312.

Xing Y., Pesquet-Popescu B., Dufaux F. Comparative study of scalar and vector quantization on different phase-shifting digital holographic data representations. In: 3DTV-Conference: The True Vision—Capture, Transmission and Display of 3D Video (3DTV-CON), 2014. 2014b:1–4.

Xing Y., Kaaniche M., Pesquet-Popescu B., Dufaux F. Adaptive nonseparable vector lifting scheme for digital holographic data compression. Appl. Opt. 2015. ;54(1):A98–A109. doi:10.1364/AO.54.000A98. URL http://ao.osa.org/abstract.cfm?URI=ao-54-1-A98.

Zhong J., Weng J., Hu C. Reconstruction of digital hologram by use of the wavelet transform. In: Advances in Imaging. Optical Society of America; 2009:DWB16. URL http://www.osapublishing.org/abstract.cfm?URI=DH-2009-DWB16.

