Maria G. Martini, Chaminda T.E.R. Hewage, Moustafa M. Nasralla and Ognen Ognenoski
Kingston University, UK
New multimedia systems and services have higher quality requirements, not limited to connectivity: users expect services to be delivered according to their demands in terms of quality. In recent years, the concept of Quality of Service (QoS) has been extended to the new concept of Quality of Experience (QoE) [1], as the former only focuses on network performance (e.g., packet loss, delay, and jitter) without a direct link to perceived quality, whereas the latter reflects the overall experience of the consumer accessing and using the provided service. Experience is user- and context-dependent, that is, it involves considerations about subjective quality and users' expectations based on the cost of the service, their location, the type of service, and the convenience of using the service. However, subjective QoE evaluation is time-consuming, costly, and not suitable for use in closed-loop adaptations; hence, there is a growing demand for objective QoE monitoring and control. Objective, rather than subjective, QoE evaluation enables the user-centric design of novel multimedia systems, including wireless systems based on recent standards (such as WiMAX and 3GPP LTE/LTE-A), through an optimal use of the available resources with objective QoE serving as the utility index.
The goal of achieving a satisfactory QoE for the users of a system can be pursued at different layers of the protocol stack. Dynamic rate control strategies, also optimized across the users, can be considered at the application layer in order to allocate the available resources according to users' requirements and transmission conditions. Rate control was originally adopted with the goal of achieving a constant bit rate, and later with the goal of adapting the source data to the available bandwidth [2]. Dynamic adaptation to variable channel and network conditions (i.e., by exploiting time-varying information from the lower layers) can also be performed.
Packet scheduling schemes across multiple users can be considered at the Medium Access Control (MAC) layer in order to adapt each stream to the available resources [3]. Content-aware scheduling can also be considered, as in [4]. At the physical layer, Adaptive Modulation and Coding (AMC) can be exploited to improve the system performance, by adapting the relevant parameters to both the channel and the source characteristics. In addition, throughput variations, resulting in lower QoE, can be smoothed out through a number of methods, including interference shaping [5], or compensated via appropriate buffering.
This chapter focuses on QoE monitoring, control, and management for different types of service. The remainder of the chapter is organized as follows. Section 7.2 focuses on QoE monitoring, describing subjective and objective methodologies, the need for real-time monitoring, and relevant technical solutions. Section 7.3 focuses on QoE management and control in different scenarios, including wireless scenarios and adaptive streaming over HTTP. The case of transmission to multiple users is addressed as an example requiring the joint management of different QoE requirements from different users. Finally, conclusions are drawn in Section 7.4.
QoE monitoring is the key to assessing the overall enjoyment or annoyance of emerging multimedia applications and services. This could lead further to designing a system which maximizes user experience. For instance, the monitored QoE at different nodes could be used to optimize the system parameters and to maximize the user QoE in general. However, monitoring QoE is a challenge due to a range of factors associated with QoE, such as human factors (e.g., demographic and socioeconomic background), system factors (e.g., content- and network-related influences), and contextual factors (e.g., duration, time of day, and frequency of use). The overall experience can be monitored, analyzed, and measured by QoE-related parameters, which quantify the user's overall satisfaction with a service [1, 6].
Therefore, QoE monitoring goes beyond conventional QoS monitoring, which focuses on the performance of the underlying network. QoS measurements are typically obtained using objective methods, whereas monitoring and understanding QoE requires a multi-disciplinary and multi-technological approach. The methods adopted to measure QoE should account for all the factors associated with user perception or experience; at the very least, the major factors need to be identified in order to comprehensively evaluate the user experience for a given application. While the monitoring of each individual aspect of QoE remains a challenge, understanding the interaction among these aspects and their overall effect is a far greater one. With the advancement of technology, measuring certain aspects of QoE has become feasible (e.g., via psychophysical and physiological measurements [7]). However, understanding the overall experience will require further, likely multi-disciplinary, research. The following discusses the state of the art in QoE monitoring and the technologies that can potentially enable QoE-driven applications and services.
A common challenge of any multimedia application or service provider is to ensure that the offered services meet at least the minimum quality expected by the users. The use of QoE measurements enables us to measure the overall performance of the system from the user's perspective [8]. The factors influencing QoE are specific to certain applications. For instance, perceptual video quality is the major QoE factor for video delivery services. In this case, accurate video quality measurements and monitoring at different system nodes will enable us to achieve maximum user QoE. A few solutions are commercially available to measure the QoE of different applications (e.g., Microsoft's Quality of Experience Monitoring Server).
QoE monitoring tools can be classified into two main categories, namely active monitoring and passive monitoring. In active monitoring, specific traffic is sent through the network for performance evaluation, whereas in passive monitoring, devices placed at measurement points observe the features of the traffic as it passes. Both methodologies have their own advantages and disadvantages. For instance, passive monitoring does not add overhead to the user traffic, whereas active monitoring injects traffic specifically for QoE monitoring. The monitoring of QoE can take place at different stages of the end-to-end application delivery chain, for example at the head/server end or at the last mile (e.g., media gateway, home gateway, or Set-Top Box (STB)). Even though the measurements taken at the receiver provide the best knowledge of the user experience, QoE monitoring at intermediate nodes can still provide a good indication of the quality ultimately delivered to the user.
Traditionally, QoE measuring and monitoring is a human-based process: due to the subjective nature of QoE, the outcome should be evaluated by human observers. Therefore, subjective quality evaluation tests are the gold standard for monitoring the QoE of novel media applications. Several international standards describe perceptual image and video quality evaluation procedures (e.g., ITU-R BT.500-13 [9]). However, specific standards targeting more general QoE monitoring have not been reported in the literature so far. Defining such a procedure is a challenge because of the several perceptual and physiological aspects attached to QoE. For instance, in 3D video, increased binocular disparity may provide the user with an enhanced depth sensation, but can also induce discomfort due to the increased parallax. Therefore, emerging QoE monitoring procedures should be able to measure the overall effect of these influencing factors. Even though subjective QoE measurement provides the best judgment, it comes with several disadvantages: it requires controlled test environments, a number of human subjects, and considerable time and effort. In addition, these standardized measurement methods cannot be deployed in real-time QoE monitoring scenarios.
Objective QoE monitoring procedures can overcome several disadvantages associated with subjective QoE measurements, although they may lack the accuracy of the results obtained with subjective procedures. Unlike subjective tests, objective methods can employ simple algorithms to calculate different factors of QoE. For instance, the Peak Signal-to-Noise Ratio (PSNR) can easily be calculated to estimate the quality of images/video. However, such measures may or may not correlate well with subjective measurements. Therefore, hybrid QoE measurement tools enable more reliable and accurate measurements, since they account for impairments which affect user perception. For stereoscopic video, the possibility of using objective quality measures of the individual left and right views to predict 3D video subjective quality is discussed in [10, 11]. These methodologies evaluate both image and depth perception-related artifacts, which is key to understanding the QoE perceived by end users.
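As an illustration, PSNR is obtained from the mean squared error between a reference frame and its distorted version. The sketch below is a minimal pure-Python version, using a hypothetical pair of tiny 8-bit frames represented as flat lists of pixel values.

```python
import math

def psnr(reference, distorted, max_value=255):
    """Peak Signal-to-Noise Ratio (dB) between two equally sized 8-bit
    frames, represented here as flat lists of pixel values."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames: no distortion
    return 10 * math.log10(max_value ** 2 / mse)

# A tiny hypothetical "frame" with slight distortion
ref = [100, 120, 140, 160]
dist = [101, 119, 141, 158]
print(psnr(ref, dist))  # roughly 45.7 dB
```

As the text notes, a high PSNR does not guarantee a high subjective score; the metric only quantifies pixel-level distortion.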
Objective QoE monitoring tools cannot always be used in real-time monitoring tasks. For instance, monitoring the objective image or video quality in real time needs a reference image or video for comparison with the received image or video (to calculate, e.g., PSNR). It is not practical to send the original video sequence to the receiver side to measure the quality, due to the high bandwidth demand. To overcome this problem, Reduced-Reference (RR) and No-Reference (NR) quality measurements are used to measure the quality in real time. RR methods compute a quality metric based on limited information about the original source, whereas NR – or blind – metrics do not need any reference information for quality evaluation.
The RR quality evaluation methods in [12, 13] extract features from the original image/video sequence. Such features are transmitted and compared with the result of evaluating the same features at the receiver side (see Figure 7.1). The two metrics in [12, 13] adopt different strategies for quality evaluation based on feature comparison. Similarly, the RR quality evaluation method for 3D video proposed in [14] evaluates the perceived quality at the receiver side using the edge information extracted from the original 3D video sequence as side information. This method accounts for both image artifacts and depth-map-related artifacts, which is important for describing the overall perception. Similar approaches are necessary for real-time QoE monitoring, in order to measure the quality at the receiver side with limited information from the sender side [15, 16]. Designing such a system will always be a challenge, because of the range of perceptual attributes attached to QoE. However, a careful identification of the influential QoE factors, together with an understanding of the effect of end-to-end technologies on these aspects, will enable researchers and developers to design optimal QoE measurement and monitoring tools.
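To illustrate the reduced-reference principle (rather than the specific metrics of [12-14]), in the sketch below the sender extracts a single low-cost feature (a mean absolute horizontal gradient, standing in for edge information), transmits it as side information, and the receiver compares it with the value computed on the received frame. The feature choice and the scoring rule are illustrative assumptions.

```python
def mean_abs_gradient(frame):
    """Mean absolute horizontal pixel difference of a 2D frame (list of
    rows): a crude stand-in for the edge features used by RR metrics."""
    diffs = [abs(row[i + 1] - row[i]) for row in frame for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def rr_quality_score(sender_feature, received_frame):
    """Reduced-reference score in (0, 1]: 1 when the receiver-side feature
    matches the transmitted side information exactly, lower as they diverge."""
    receiver_feature = mean_abs_gradient(received_frame)
    return 1 / (1 + abs(sender_feature - receiver_feature))

# Sender side: extract and transmit only the feature, not the frame itself
original = [[10, 50, 10], [10, 50, 10]]
feature = mean_abs_gradient(original)

# Receiver side: concealment of a lost packet has flattened the edges
received = [[20, 40, 20], [20, 40, 20]]
print(rr_quality_score(feature, received))
```

The bandwidth saving is the point: only one number travels alongside the stream, instead of the full reference sequence needed for PSNR.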
Quality assessment of the received multimedia information is crucial for two main purposes: the off-line evaluation and comparison of different systems and strategies, and real-time monitoring aimed at driving system adaptation.
In the first case, the goal is to assess the final quality, reflecting the subjective quality experienced by the users (through subjective tests or objective metrics well matching subjective results). In the second case, while matching subjective results is also of importance, the main requirements are the possibility of calculating the video quality metric in real time and without reference to the original transmitted signal. QoE-driven system adaptation is addressed in Section 7.3.
Different transmission technologies result in different types of quality impairment. For instance, in transmission via RTP/UDP, packet losses are the major source of impairments. In this case, the RTP Control Protocol (RTCP) and its extended version RTCP-XR [17] enable monitoring QoS (e.g., via reports on packet loss rates) and also QoE (e.g., with voice quality metrics for Voice over IP (VoIP)).
In transmission via TCP, due to retransmissions of lost packets, delay is the main reason for QoE reduction. The remainder of this section focuses on QoE monitoring for TCP-based video streaming, focusing in particular on QoE monitoring for Dynamic Adaptive Streaming over HTTP (DASH).
The recent standards for HTTP streaming (e.g., DASH [18–20], developed by the Moving Picture Experts Group (MPEG)) support a streaming client–server model with user-oriented control, where metadata is initially exchanged, allowing the user to learn about content type and availability. The client fetches content from the network according to its preferences and the network conditions, adaptively switching between the available versions of the multimedia content using the HTTP protocol. The advantages of this technique compared with previous streaming techniques (i.e., RTP/RTSP streaming, HTTP progressive streaming) make MPEG-DASH a promising framework for the adaptive transmission of multimedia data to end users.
There are two main sets of relevant parameters regarding the QoE paradigm for HTTP adaptive streaming. The first set refers to parameters that influence the overall QoE, whereas the second set refers to observable parameters that can be used directly for the derivation of QoE metrics. The important parameters from the first set are summarized on a per-level basis as follows.
The second set of parameters refers to parameters that are observed and taken into direct consideration for the derivation of QoE metrics (e.g., video quality, initial delay, frequency of rebuffering events).
With reference to the aforementioned second set of parameters, three levels of QoS for HTTP video streaming are addressed in [21]: network QoS, application QoS, and user QoS. The authors refer to user QoS as a set of observable parameters that reflect the overall QoE. Their general idea is to investigate the effect of network QoS on user QoS (i.e., QoE), with application QoS as a bridging level between them. Initially, they use analytical models and empirical evaluations to derive the correlation between application and network QoS, and then perform subjective experiments to evaluate the relationship between application QoS and user QoE. The network QoS is observed with active measurements and refers to the network path performance between the server and the client in terms of Round Trip Time (RTT), packet loss, and network bandwidth. Further, the following application-level QoS metrics are observed:
Tinit refers to the period between the start of loading the video and the start of playback, Trebuff measures the average duration of a rebuffering event, and frebuff denotes the frequency of rebuffering events.
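These three application-level metrics can be derived from a client-side playback event log. The sketch below assumes a hypothetical log format of (event, timestamp) pairs; the event names are illustrative.

```python
def app_qos_metrics(events):
    """Derive Tinit, Trebuff, and frebuff from a chronological client log of
    (name, time_in_seconds) tuples. Assumed (hypothetical) event names:
    'load', 'play', 'stall_start', 'stall_end', 'end'."""
    t_load = next(t for name, t in events if name == "load")
    t_play = next(t for name, t in events if name == "play")
    starts = [t for name, t in events if name == "stall_start"]
    ends = [t for name, t in events if name == "stall_end"]
    stalls = [e - s for s, e in zip(starts, ends)]     # stall durations
    session = events[-1][1] - t_play                   # watched duration (s)
    return {
        "T_init": t_play - t_load,                                # initial delay
        "T_rebuf": sum(stalls) / len(stalls) if stalls else 0.0,  # avg stall length
        "f_rebuf": len(stalls) / session if session else 0.0,     # stalls per second
    }

log = [("load", 0.0), ("play", 2.5),
       ("stall_start", 10.0), ("stall_end", 12.0),
       ("stall_start", 40.0), ("stall_end", 41.0),
       ("end", 102.5)]
print(app_qos_metrics(log))
```

For this hypothetical session, the initial delay is 2.5 s, the two stalls average 1.5 s, and the rebuffering frequency is 0.02 events per second.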
The user QoE is measured via the Mean Opinion Score (MOS) according to the ITU-T P.911 recommendation [22] via subjective measurements. This approach is adopted since objective metrics such as PSNR and MSE evaluate only distortion and do not take the Human Visual System (HVS) into account; hence, they are not well suited to assessing the QoE of video streaming.
The authors propose to estimate the QoE based on application-level QoS parameters as follows:
The most significant input within this model is the frequency of rebuffering, outlined as the main factor affecting users' QoE.
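The general shape of such a model can be sketched as a linear mapping from the three application-level QoS metrics to a MOS-like score on the 1-5 scale. The weights below are illustrative placeholders, not the coefficients fitted in [21]; note only that the penalty on rebuffering frequency dominates, in line with the observation above.

```python
def estimate_mos(t_init, t_rebuf, f_rebuf,
                 w_init=0.05, w_dur=0.10, w_freq=15.0, base=4.5):
    """Hypothetical linear QoE model: start from a 'perfect' base score and
    subtract penalties for initial delay, average stall duration, and stall
    frequency. All weights are illustrative, not fitted values."""
    mos = base - w_init * t_init - w_dur * t_rebuf - w_freq * f_rebuf
    return max(1.0, min(5.0, mos))  # clamp to the 1-5 MOS scale

# Metrics from a hypothetical session: 2.5 s startup, 1.5 s average stall,
# 0.02 stalls per second
print(estimate_mos(t_init=2.5, t_rebuf=1.5, f_rebuf=0.02))
```

Clamping keeps the estimate on the MOS scale even for badly impaired sessions; a fitted model would replace the placeholder weights with regression coefficients obtained from subjective tests.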
The authors in [23] propose a no-reference QoE model based on Random Neural Networks (RNNs). This QoE estimation approach models two main problems regarding adaptive video streaming: playout interruptions and video quality variations due to lossy compression during the encoding of different video quality levels. The QoE estimation model considers the following parameters: the quantization parameter used in the video compression and playout interruptions that occur during the video playout. This model is a no-reference model, and hence simple compared with full- or partial-reference QoE models.
The work in [24] similarly considers video playout interruptions, but not the effect of changing the video bit rate as in the case of adaptive video streaming. The elaborated QoE model utilizes the Pseudo-Subjective Quality Assessment (PSQA) method based on RNNs [23]. When the parameters affecting the QoE change, a new PSQA module is designed based on subjective tests. The idea is to have several distorted samples evaluated subjectively by a panel of human observers. The results of this evaluation are then used to train an RNN to capture the relationship between the parameters and the QoE. During these tests the authors keep the resolution of the videos and the frame rate constant, and different video qualities are produced via different quantization parameters. The effects of the network (i.e., packet losses, delay, jitter) are subsumed in the playout interruptions. Playout interruptions are represented as a function of three measurable parameters: the total number of playout interruptions N, the average value of the interruption delays Davg, and the maximum interruption delay Dmax. These parameters are measured over an interval containing a fixed duration of video data. The RNN-based QoE model estimation can be summarized in the following points:
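The three measurable parameters fed to the RNN can be extracted from the interruption delays observed within one fixed-duration measurement window; a minimal sketch (the RNN itself is omitted):

```python
def interruption_features(delays):
    """Summarize the playout interruptions observed in one measurement
    window. delays: list of interruption durations (seconds)."""
    n = len(delays)                          # N: number of interruptions
    d_avg = sum(delays) / n if n else 0.0    # Davg: mean interruption delay
    d_max = max(delays) if n else 0.0        # Dmax: worst interruption delay
    return n, d_avg, d_max

# A hypothetical window containing three stalls
print(interruption_features([0.8, 2.0, 1.4]))
```

In the scheme of [24], a trained RNN then maps this triple (together with the QP) to an estimated MOS.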
Furthermore, it is noted that, when the network bandwidth decreases, it is preferable to switch to a coarser representation (i.e., a lower bit rate) rather than risk even a single playout interruption. Hence, in the trade-off between QP and playout interruptions, the latter should be kept to a minimum, even at the cost of a coarser quantization (i.e., a higher QP).
Finally, the RNN-based QoE model is compared with a freeze distortion model [24] in terms of Mean Square Error (MSE). Since the freeze distortion model does not take the QP values into consideration, it is significantly outperformed, even for very high QP values.
An extensive study to understand the impact of video quality on user engagement is presented in [25], where a large data set of different content types is used and parameters at the client side (join time, buffering ratio, average bit rate, rendering quality, and rate of buffering events) are measured in order to observe their effect on the QoE. The analysis shows that the buffering ratio has the largest impact on the QoE regardless of the content type. Furthermore, application- and content-specific conclusions are derived; for example, the average bit rate is more significant for live content than for Video-on-Demand (VoD) content.
The authors in [26] present subjective tests for TCP streaming showing that buffer underrun alone is not sufficient to represent the viewers' QoE. A no-reference metric, pause intensity, is proposed, defined as the product of the average pause duration and the pause frequency. The metric is derived utilizing an equation-based TCP model.
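Since pause intensity is defined as the product of the average pause duration and the pause frequency, it can be computed directly from the pauses observed in a measurement window:

```python
def pause_intensity(pause_durations, window_s):
    """Pause intensity as defined in the text: average pause duration
    multiplied by pause frequency (pauses per second over the window)."""
    if not pause_durations:
        return 0.0
    avg_duration = sum(pause_durations) / len(pause_durations)
    frequency = len(pause_durations) / window_s
    return avg_duration * frequency

# Two pauses of 1 s and 3 s within a 60 s window
print(pause_intensity([1.0, 3.0], window_s=60.0))
```

Note that in this simplified sketch the product (sum/N) x (N/T) reduces to the total paused time divided by the window length, so the metric jointly reflects how often and how long playback stalls.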
Cross-Layer Design (CLD) solutions should be investigated in order to optimize the global system based on a QoE criterion. As an example, in [27] a CLD approach is considered with multi-user diversity, which explores source and channel heterogeneity for different users.
Typically, CLD is performed by jointly designing two layers in the protocol stack [28–31]. In [15], CLD takes the form of a network-aware joint source and channel coding approach, where source coding (at the application layer) and channel coding and modulation (at the physical layer) are jointly designed by taking the impact of the network into account. In [29], cross-layer optimization also involves two layers, the application layer and the MAC layer of a radio communications system. The proposed model for the MAC layer is suitable for a transmitter without instantaneous Channel State Information (CSI). A way of reducing the amount of exchanged control information is considered, by emulating the layer behavior in the optimizer based on a few model parameters to be exchanged. The parameters of the model are determined at the corresponding layer, and only these model parameters are transmitted as control information to the optimizer. The latter can tune the model to investigate several layer states without the need to exchange further control information with the layer. A significant reduction in the control information to be transmitted from a layer to the optimizer is achieved, at the expense of a possible slight increase in the control information from the optimizer to the layers.
The work in [28] includes in the analysis MAC-PHY and APP layers, presenting as an example a MAC/application-layer optimization strategy for video transmission over 802.11a wireless LANs based on classification.
The CONCERTO and OPTIMIX European projects addressed CLD strategies, the cross-layer information to be exchanged, and the strategies for passing such information among the layers in mobile networks. In order to control the system parameters based on the observed data, two controller units were proposed in the OPTIMIX project: one at the application layer (APP) and one at the base station (BSC) to control lower-layer parameters [32], and in particular the resource allocation among the different users based on the (aggregated) multiple feedback. A block diagram of the two controllers is shown in Figure 7.2.
The two controllers operate at different time scales, since more frequent updates are possible at the base station controller, and rely on different sets of observed parameters. For instance, the application-level controller outputs the parameters for video encoding and application-layer error protection based on the collected information on available bandwidth, packet loss ratio, and bit error rate. The base station controller performs resource allocation among users based on information on channel conditions and bit error rate, as well as quality information from the application layer. The goal of the proposed system is to provide a satisfactory quality of experience to video users, hence video quality is the major target and evaluation criterion, not neglecting efficient bandwidth use, real-time constraints, robustness, and backward compatibility.
Owing to the wide range of QoE-aware strategies (see the recent special issues [1, 33] for other examples), we focus in the following on two categories: QoE management for DASH and QoE management in wireless shared channels.
In dynamic adaptive video streaming, adaptation can be performed [34–36] based on the QoE parameters discussed in Section 7.2.1. The different system parameters can be selected based on QoE requirements. In addition, video configurations can be adapted in real time with the target of optimizing the QoE [37].
For instance, the work in [38] investigates the client buffer size required for a prescribed video quality, targeting memory-constrained applications. This approach considers video streaming over TCP and provides an analytical model for the buffer size as a function of the TCP transmission parameters, the network characteristics (packet loss and round trip time), and the probability of buffer underrun at playback.
A network-oriented proposal for QoE management is presented in [39], where video adaptation is performed at a proxy at the edge between the Internet and the wireless core, in order to improve the DASH QoE in cellular broadband access networks. The proxy performs global optimization over multiple DASH flows by splitting the connection toward the client (hence increasing the average throughput). Video quality-aware dynamic prioritization is adopted (low-rate streams have high priority to preserve a minimal QoE), and fairness among flows is enforced. Further, an adaptive controller is introduced in the network in order to minimize a cost function, defined as a weighted sum of the video distortion, the bit-rate variations, and the playback jitter at the clients.
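The weighted-sum cost minimized by such a controller can be sketched as follows; the three per-client terms mirror those named above, while the weights, field names, and units are illustrative assumptions rather than the actual formulation of [39].

```python
def controller_cost(clients, w_dist=1.0, w_var=0.5, w_jitter=0.5):
    """Weighted-sum cost over DASH flows: per-client video distortion,
    bit-rate variation, and playback jitter (all in hypothetical units).
    The controller would pick the adaptation minimizing this value."""
    return sum(w_dist * c["distortion"]
               + w_var * c["rate_variation"]
               + w_jitter * c["playback_jitter"]
               for c in clients)

# Two hypothetical DASH flows observed by the proxy
flows = [
    {"distortion": 2.0, "rate_variation": 0.4, "playback_jitter": 0.1},
    {"distortion": 1.5, "rate_variation": 1.0, "playback_jitter": 0.0},
]
print(controller_cost(flows))
```

The weights encode the operator's trade-off: raising `w_var`, for example, would penalize frequent representation switches more heavily relative to raw distortion.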
The work in [40] presents a network-driven QoE approach for DASH, which jointly considers the content characteristics and the wireless resource availability in Long-Term Evolution (LTE) networks in order to enhance the HTTP streaming user experience. This is achieved by rewriting the client's HTTP request at a proxy in the mobile network, since the network operator has better information regarding network conditions. The DASH client is unaware of the proxy and simply plays the segments obtained from the network.
A DASH system with client-driven QoE adaptation is presented in [41]. The adaptation logic proposed in this work combines TCP throughput, content-related metrics, buffer level, and user requirements and expectations, which enhances the viewing experience at the user side. The analysis shows that, for DASH, the representation switching rate and the media encoding parameters should be considered jointly when designing the QoE system. In the worst case, the proposed automated QoE model diverges by 0.5 points from the evaluation of human subjects.
An approach that exploits user-viewing activities to improve the QoE is presented in [42]. These activities are used to mitigate temporal structure impairments of the video. In this study, subjective tests are used to link these activities with the network measurements and the user QoE. The results show that network information alone is insufficient to capture the user's dissatisfaction with the video quality. In addition, impairments in the video can trigger user activities (pausing and reducing the screen size); therefore, the inclusion of pause events improves the prediction capability of the proposed model.
When transmitting multimedia signals to multiple users over wireless systems (see the example scenario in Figure 7.3), the trade-off between resource utilization and fairness among users has to be addressed. On the one hand, the interest of network operators is to maximize the exploitation of the resources (e.g., assigning more resources to the user(s) experiencing better channel conditions). On the other hand, this strategy can result in unsatisfied users, since users experiencing worse channel conditions would not be served and would not meet their QoE requirements. For this reason, fairness among users has to be considered in scheduling and resource allocation. In recent years, several approaches have been proposed, with the goal of jointly maximizing the quality experienced by different users.
In [32] we addressed the aforementioned trade-off by focusing on fairness, targeting the maximization of the minimum weighted quality among the different users. This approach results in a good level of fairness among users. However, without a proper admission control strategy, it could lead to weak exploitation of the resources; moreover, if a single user experiences very bad channel conditions, the attempt to serve this user with a reasonable quality may dramatically jeopardize the quality received by the other users.
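The max-min principle can be sketched with a toy greedy allocator: each resource unit goes to the user whose weighted quality is currently lowest, which pushes up the minimum weighted quality across users. The linear quality model and the user parameters are illustrative assumptions, not the formulation of [32].

```python
def maxmin_allocate(users, total_units):
    """users: dict name -> {'weight': w, 'gain': g}. Toy quality model:
    weighted quality = weight * gain * allocated_units. Each resource unit
    is greedily given to the user with the lowest weighted quality."""
    alloc = {u: 0 for u in users}
    for _ in range(total_units):
        worst = min(users,
                    key=lambda u: users[u]["weight"] * users[u]["gain"] * alloc[u])
        alloc[worst] += 1
    return alloc

users = {
    "good_channel": {"weight": 1.0, "gain": 2.0},  # 2 quality units per resource
    "bad_channel":  {"weight": 1.0, "gain": 0.5},  # poor channel: low gain
}
print(maxmin_allocate(users, total_units=10))
```

Running the sketch shows the effect warned about in the text: the user with the bad channel absorbs most of the resource units in order to keep the minimum quality up, starving the better-placed user.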
Content awareness is a key feature in providing QoE and, in recent years, a number of relevant approaches have emerged. The authors of [4, 43] investigated a content-aware resource allocation and packet scheduling for video transmission over wireless networks. They presented a cross-layer packet scheduling approach, transmitting pre-encoded video sequences over wireless networks to multiple users. This approach is used for Code Division Multiple Access (CDMA) systems, and it can be adapted for Orthogonal Frequency Division Multiple Access (OFDMA) systems such as IEEE 802.16 and LTE wireless networks. The data rates of the served users are dynamically adjusted depending on the channel quality and the gradient of a content-aware utility function, where the utility takes into account the distortion of the received video.
In multimedia applications, the content of a video packet is critical for determining the importance of the packets. Utility functions can be defined as a function of the head-of-line packet delay, of each flow's queue length, or of each user's current average throughput. In terms of content, the utility gained by transmitting the packet, the size of the packet in bits, and the decoding deadline of the packet (i.e., each frame's time stamp) can be considered. In addition, CSI can inform the scheduling decision. The method adopted in [4, 43] consists of ordering the packets of the encoded video according to their relative contribution to the final quality of the video, and then constructing a utility function for each packet whose gradient reflects the contribution of that packet to the perceived video quality. Hence, the utility function is defined as a function of the decoded video quality (i.e., based on the number of packets already transmitted to a user for every frame). Further, robust data packetization at the encoder and realistic error concealment at the decoder are considered. The proposed utility function enables optimization in terms of the actual quality of the received video. The authors provide an optimal solution for the case where video packets are decoded independently and a simple error concealment approach is used at the decoder; for more complex error concealment, a suitable solution is provided in which a distortion utility is calculated. The performance evaluation shows that the proposed content-aware scheduler outperforms content-independent approaches, in particular for video streaming applications.
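The gradient-based idea can be sketched as follows: at each transmission opportunity, among the packets whose decoding deadline has not expired, pick the one maximizing the product of its per-bit quality contribution (the utility gradient) and the user's achievable rate. The packet fields and the scoring rule are a simplified illustration, not the exact formulation of [4, 43].

```python
def schedule_next(packets, rates, now):
    """packets: list of dicts with 'user', 'utility' (quality gain of the
    packet), 'size_bits', and 'deadline' (seconds). rates: user ->
    achievable rate (bit/s). Returns the packet to transmit next, or None."""
    feasible = [p for p in packets if p["deadline"] > now]  # drop late packets
    if not feasible:
        return None
    # score = per-bit quality contribution (utility gradient) x channel rate
    return max(feasible,
               key=lambda p: (p["utility"] / p["size_bits"]) * rates[p["user"]])

# A hypothetical queue: an important slice for user A, two slices for user B
queue = [
    {"user": "A", "utility": 10.0, "size_bits": 4000, "deadline": 5.0},
    {"user": "B", "utility": 2.0,  "size_bits": 1000, "deadline": 5.0},
    {"user": "B", "utility": 8.0,  "size_bits": 2000, "deadline": 0.5},
]
rates = {"A": 1e6, "B": 2e6}
print(schedule_next(queue, rates, now=1.0))
```

At `now=1.0` the third packet has already missed its deadline and is skipped; between the remaining two, user B's better channel outweighs user A's higher per-packet utility, illustrating how the scheduler trades content importance against channel conditions.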
The parameters used in this scheduler are the achievable rate, the CSI from the User Equipment (UE), a weighting parameter for fairness across users (based on the distortion of a user's decoded video after the previous transmissions), and three features of each packet: the utility gained from its transmission, its decoding deadline, and its size.
A content-aware downlink packet scheduling scheme for multi-user scalable video delivery over wireless networks is proposed in [44]. The scheduler uses a gradient-based scheduling framework, as elaborated earlier in [4], along with Scalable Video Coding (SVC) schemes. The reason for using SVC is to provide multiple high-quality video streams to multiple users over different prevailing channel conditions. The scheduler proposed in [44] outperforms traditional content-independent scheduling approaches. Furthermore, the SVC encoder offers the potential to utilize packet prioritization strategies without compromising system performance. Packet prioritization can be signaled to the MAC layer (i.e., the scheduler), in conjunction with the utility metrics of each packet. A distortion model is also proposed in order to efficiently and accurately predict the distortion of an SVC-encoded video stream; this model is employed to prioritize source packets in the queue based on their estimated impact on the overall video quality. The parameters used are the achievable rate for every user, the loss probability, the user's estimated channel state, and the expected distortion.
A detailed review of content-aware resource allocation schemes for video transmission over wireless networks is given in [45], although this does not include some of the most recent approaches.
The authors of [46, 47] discuss a scheduling and resource allocation strategy for multi-user video streaming over Orthogonal Frequency Division Multiplexing (OFDM) downlink systems. The authors utilize SVC for encoding the video streams, exploiting only temporal and quality scalability (not spatial scalability) in the adaptive resource allocation and scheduling. They propose a gradient-based scheduling and resource allocation strategy, which prioritizes the different users via adaptively adjusted priority weights, computed based on the video content, the deadline requirements, and the transmission history. A delay function is designed to cope with approaching deadlines, reducing the probability of delay violations.
The aim of the work presented in [43, 47] is to maximize the average PSNR of all SVC video users under constrained transmission power, time-varying channel conditions, and variable-rate video content. The obtained results show that the proposed scheduler outperforms content-blind and deadline-blind algorithms, with a gain of up to 6 dB in average PSNR when the network is saturated. The parameters considered by this scheduler are: the average throughput, used to control fairness; a dynamic weight based on the target delay (i.e., the dynamic/desirable bit rate for the unfinished sub-flow of each video stream); the video content (e.g., packet sizes at the temporal and quality layers); the achievable rate; the CSI; the length of the unfinished sub-flow; the playback deadline; a priority weight computed from the corresponding distortion decrease; the bit rate required to deliver the current sub-flow, which accounts for the target delay and the bits remaining in the unfinished sub-flow; and a delay function, tuned to reduce the probability of delay violation.
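A delay function of the kind used in [46, 47] can be sketched as a weight that grows as the head-of-line delay approaches the playback deadline, steering the scheduler toward packets at risk of deadline violation. The exponential form and the `alpha` parameter below are illustrative assumptions, not the function from the cited work:

```python
import math

def delay_weight(waiting_time, deadline, alpha=5.0):
    """Illustrative delay function: the weight grows as the waiting time
    approaches the playback deadline, so near-deadline packets win the
    scheduling decision.

    waiting_time and deadline are in the same time unit; alpha controls
    how sharply the urgency rises near the deadline.
    """
    slack = max(deadline - waiting_time, 0.0) / deadline  # 1 = fresh, 0 = due
    return math.exp(alpha * (1.0 - slack))
```

A freshly arrived packet has weight 1, while a packet close to its deadline is weighted exponentially higher.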
In [48], the authors presented a scheduling algorithm that can be tuned to maximize the throughput of the most significant video packets, while minimizing the capacity penalty due to quality/capacity trade-off. It is shown that the level of content awareness required for optimum performance at the scheduler, and the achieved capacity, are highly sensitive to the delay constraint.
The authors of [49] propose a distortion-aware scheduling approach for video transmission over OFDM-based LTE networks. The main goal of this work is to reduce the end-to-end distortion at the application layer for every user in order to improve the video quality. Hence, parameters from the Physical (PHY), MAC, and Application (APP) layers are taken into consideration. At the APP layer, the video coding rate is extracted; at the MAC layer, Physical Resource Block (PRB) scheduling and channel feedback are exchanged; and at the PHY layer, modulation and coding parameters are used. The parameters used are the frame distortion caused by a lost slice, the waiting time, the transmission time, the latency bound, the video distortion caused by the Block Error Rate (BLER) (a function of the modulation and coding scheme of the PRB and of the Signal-to-Interference-Noise Ratio (SINR) of the wireless channel), the dependency structure of the video content under the transmission delay constraint, and different coding rates. Simulation results show that the proposed gradient-based cross-layer optimization can improve the video quality.
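The way such a distortion-aware scheduler couples layers can be sketched as an expected-distortion estimate: each slice's loss distortion (APP layer) is weighted by the BLER of the PRB it is mapped to (PHY/MAC layers). This is a minimal illustrative model, not the estimator of [49]:

```python
def expected_distortion(slices, bler):
    """Expected added distortion at the APP layer.

    slices -- list of dicts with:
      loss_distortion -- distortion added if the slice is lost
      prb             -- index of the PRB carrying the slice
    bler   -- per-PRB block error rate, itself a function of the
              chosen modulation/coding scheme and the SINR
    """
    return sum(s["loss_distortion"] * bler[s["prb"]] for s in slices)

# Toy example: two slices mapped to PRBs with different error rates
slices = [{"loss_distortion": 10.0, "prb": 0},
          {"loss_distortion": 5.0, "prb": 1}]
d = expected_distortion(slices, bler=[0.1, 0.2])  # 10*0.1 + 5*0.2 = 2.0
```

A scheduler can then compare candidate PRB assignments by this expected distortion and pick the mapping that minimizes it.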
The authors of [50] propose a novel cross-layer scheme for video transmission over LTE-based wireless networks. The scheme takes into consideration the I and P packets from the APP layer, packet scheduling according to packet importance at the MAC layer, and the channel states from the PHY layer. The work in [50] aims to improve the perceived video quality for each user and the overall system performance in terms of spectral efficiency. I packets are assumed to be more important than P packets for each user, since the loss of an I packet may lead to error propagation within the Group of Pictures (GoP). Hence, the packet scheduling algorithm at the MAC layer is adapted to prioritize I packets over P packets for each video sequence. Results show that the proposed cross-layer scheme performs better in terms of system throughput and perceived video quality. The parameters used are the achievable rate, the service rate requirements of the I and P packet queues, the CSI, and the higher importance assigned to I packets, whose loss may lead to error propagation within the GoP.
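The I-over-P prioritization can be sketched as strict-priority queuing at the MAC layer. The two-queue structure below is an illustrative simplification of the idea, not the algorithm of [50]:

```python
from collections import deque

def make_queues(packets):
    """Split packets into an I queue and a P queue; I packets are served
    first because losing an I frame propagates errors through the GoP."""
    i_q, p_q = deque(), deque()
    for pkt in packets:
        (i_q if pkt["type"] == "I" else p_q).append(pkt)
    return i_q, p_q

def next_packet(i_q, p_q):
    """Strict-priority service: drain the I queue before the P queue."""
    if i_q:
        return i_q.popleft()
    if p_q:
        return p_q.popleft()
    return None
```

A real scheduler would combine this priority with the per-user channel state rather than serving queues in pure FIFO order.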
The authors of [51] propose a cross-layer algorithm of packet scheduling and resource allocation for multi-user video transmission over OFDMA-based wireless systems. The goal of this work is to maintain fairness across different users and minimize the received video distortion of every user by adopting a proper packet scheduling and radio resource allocation approach. Similar work is done in [52]. Furthermore, when video streams are requested by the end user, video packets of the corresponding video streams are ordered according to their contribution to the reconstructed video quality which is estimated before transmission. Then, video packets are buffered in order at the base station. Hence, video content information from the APP layer (i.e., indicating the importance of each packet on the reconstructed video quality), queue state information from the MAC layer (i.e., indicating the order of the video packets in the buffer), channel state information from the PHY layer, playback delay, and video packet size are parameters used in the proposed algorithm.
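The pre-transmission ordering step described above can be sketched in a few lines: packets are sorted by their estimated contribution to the reconstructed quality before being buffered at the base station. The field names are illustrative assumptions:

```python
def order_by_contribution(packets):
    """Order buffered video packets by their estimated contribution to
    the reconstructed video quality (largest distortion reduction
    first), as estimated before transmission."""
    return sorted(packets, key=lambda p: p["quality_gain"], reverse=True)

buffer = order_by_contribution([
    {"id": 1, "quality_gain": 0.5},
    {"id": 2, "quality_gain": 2.0},
    {"id": 3, "quality_gain": 1.1},
])
```

The MAC-layer queue state then simply reflects this order, so the scheduler can serve the head of each user's buffer knowing it holds the most valuable remaining packet.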
A quality-aware fair downlink packet scheduling algorithm for scalable video transmission over LTE systems was proposed in [53]: we addressed quality fairness by relying on the Nash bargaining solution. We proposed a downlink scheduling strategy for scalable video transmission to multiple users over OFDMA systems, such as the recent LTE/LTE-A wireless standard. A novel utility metric based on video quality was used in conjunction with our proposed quality-driven scheduler. The proposed metric takes into account the frame dependency in a video sequence. The streams are encoded according to the SVC standard, and hence organized in layers where the upper layers cannot be decoded unless the lower layers are correctly received. The proposed metric is called frame significance throughput. Results showed that our proposed strategy outperforms throughput-based strategies, and in particular it enables the operator of the mobile system to select the level of fairness for different users in a cell based on its business model. The system capacity in terms of satisfied users can be increased by 20% with the proposed quality-based utility, in comparison with advanced state-of-the-art throughput-based strategies.
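The role of the Nash bargaining solution can be illustrated with a toy resource split. Maximizing the Nash product is equivalent to maximizing the sum of log-utilities, which biases the allocation toward fairness. The linear per-block utility model and the exhaustive search below are illustrative assumptions, not the scheduler of [53]:

```python
import math
from itertools import product

def nash_allocate(rates, blocks):
    """Split `blocks` resource blocks among users to maximize the Nash
    product, i.e. the sum of log utilities (disagreement point taken as
    zero). rates[i] = utility per block for user i (toy linear model)."""
    best, best_val = None, -math.inf
    for alloc in product(range(blocks + 1), repeat=len(rates)):
        if sum(alloc) != blocks or 0 in alloc:
            continue  # every user must get a positive utility gain
        val = sum(math.log(a * r) for a, r in zip(alloc, rates))
        if val > best_val:
            best, best_val = alloc, val
    return best
```

With linear utilities the log-sum criterion splits the blocks equally regardless of the per-user rates, which illustrates the fairness bias of the Nash criterion compared with a throughput-maximizing rule (which would give everything to the highest-rate user).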
A channel- and content-aware 3D video downlink scheduling scheme, combined with a prioritized queuing mechanism for OFDMA systems, is proposed in [54]. The idea behind the prioritized queuing mechanism is to prioritize the most important video layers/components with the goal of enhancing the perceived 3D video quality at the receiver. We focused on color plus depth 3D video and considered the different importance of the diverse components with respect to the perceived quality. The 3D video is encoded with an SVC video encoder. The priority values of the encoded 3D video components are signaled from the application layer to the MAC layer via cross-layer signaling. The users feed back their sub-channel gains to the base station, which are then used at the MAC layer for the resource allocation process. Hence, the proposed scheduler is designed to guarantee that the important layers are scheduled at every scheduling epoch over the sub-channels with higher gain. The Packet Loss Ratio (PLR) of the prioritized color/depth layers is reduced at the MAC layer, at the expense of a small increase in the PLR of the perceptually less important video layers. Video layers highly affected by packet losses are discarded, so as not to waste radio resources. The prioritization scheme results in a global quality improvement in the prioritized case. The parameters used are the Head of Line (HoL) packet delay, a weight that controls the throughput fairness among users, the fractional rate based on the video-layer bit rate, the SINR, the Dependency/Temporal/Quality (DTQ) identifiers of the SVC video stream, and the maximum tolerable delay based on the playout time (i.e., frame rate).
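The matching of important layers to high-gain sub-channels can be sketched as a simple sorted assignment (a greedy illustration of the idea, not the allocation algorithm of [54]; all names are hypothetical):

```python
def assign_layers(layers, gains):
    """Assign the most important video layers to the sub-channels with
    the highest reported gain.

    layers -- list of (layer_name, priority); higher priority = more
              important (e.g., base layer, depth map)
    gains  -- list of (subchannel_id, gain) fed back by the user
    """
    by_priority = sorted(layers, key=lambda l: l[1], reverse=True)
    by_gain = sorted(gains, key=lambda g: g[1], reverse=True)
    # Pair the k-th most important layer with the k-th best sub-channel
    return {layer[0]: ch[0] for layer, ch in zip(by_priority, by_gain)}

mapping = assign_layers(
    layers=[("color_base", 3), ("depth_base", 2), ("color_enh", 1)],
    gains=[("ch1", 0.4), ("ch2", 0.9), ("ch3", 0.2)],
)
```

This sorted pairing is what lowers the effective PLR of the prioritized layers: they ride the sub-channels least likely to cause block errors.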
The authors of [55] propose a content-aware scheduler to allocate resources for downlink scalable video transmission over LTE-based systems. The goal of this scheduler is to control the video quality by allocating PRBs to the users based on the available resources, the link quality, and the device capability. The number of available PRBs and the link quality drive the scheduler's decision in choosing the SVC profile level and assigning the required number of PRBs to each user. The parameters used are the SVC profile levels, the Channel Quality Indicator (CQI), and the number of available PRBs, along with a quality-driven scheduler. The video quality is estimated with two methods: one based on a no-reference metric and the other based on a full-reference metric (i.e., dependent on knowledge of the PSNR of the original video).
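The profile-selection step can be sketched as picking the highest SVC profile level whose PRB demand fits the user's budget at the reported CQI. The linear capacity model (per-PRB capacity growing with CQI) and all numbers are illustrative assumptions, not the model of [55]:

```python
def pick_profile(levels, prbs_available, cqi):
    """Pick the highest SVC profile level (given as a bit rate in kbps)
    whose PRB demand fits within the user's available PRBs.

    Toy link model: per-PRB capacity (kbps) grows linearly with CQI.
    Returns None if even the lowest level does not fit.
    """
    capacity_per_prb = 16 * cqi  # illustrative kbps per PRB
    for bitrate in sorted(levels, reverse=True):
        prbs_needed = bitrate / capacity_per_prb
        if prbs_needed <= prbs_available:
            return bitrate
    return None

chosen = pick_profile([500, 1000, 2000], prbs_available=10, cqi=10)
```

In the example, the 2000 kbps level would need 12.5 PRBs at this CQI, so the scheduler falls back to the 1000 kbps profile.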
A packet scheduling approach for wireless video streaming is proposed in [56]. The wireless network is a 3G-based network. The proposed approach involves applying different deadline thresholds to video packets with different importance in order to obtain different packet loss ratios. The importance value of a video packet is determined by its relative position within its GoP and motion texture context.
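The importance-dependent deadline thresholds can be sketched as a drop rule in which more important packets are kept in the queue longer before being discarded. The linear threshold model below is an illustrative assumption, not the rule of [56]:

```python
def should_drop(pkt, now):
    """Importance-dependent deadline threshold: a packet is dropped
    when its queuing delay exceeds a threshold that grows with its
    importance, so important packets get more transmission chances.

    pkt carries:
      arrival    -- arrival time at the queue
      deadline   -- nominal deadline (same time unit as `now`)
      importance -- in [0, 1]; higher = kept longer (e.g., derived from
                    the packet's position in the GoP and motion/texture)
    """
    threshold = pkt["deadline"] * (0.5 + 0.5 * pkt["importance"])
    return now - pkt["arrival"] > threshold
```

With this rule a low-importance packet is dropped at half its nominal deadline, while the most important packets use the full deadline, producing the differentiated packet loss ratios described above.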
The authors of [57, 58] discuss a cross-layer system design between the streaming server and the mobile Worldwide Interoperability for Microwave Access (WiMAX) base station for SVC streaming over mobile WiMAX. The aim of this work is to effectively support the QoS of video streaming services over the WiMAX network. It is worth noting that transmission packets can be classified into multiple levels of importance when the SVC standard is used. The authors of [59] investigate an application-driven cross-layer approach for video transmission over OFDMA-based networks. The proposed schemes, named quality-fair and quality-maximizing, maximize the video quality with and without fairness constraints, respectively. The packet scheduler is responsible for selecting the packets with the largest contribution to the video quality. The assumption in this design is that each video frame is partitioned into one or more slices, each slice header acts as a resynchronization marker allowing independent decoding of the slices, and each slice contains a number of macro-blocks. Hence, due to the variation of content among different video streams, different packets make diverse contributions to the video quality. The parameters used are a quality contribution index for each packet (i.e., the distortion decrease caused by the successful transmission of the packet), the size of the packets, the maximum delay, the real-time requirements of video applications, and the CSI fed back by the mobile station.
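A quality-maximizing selection of this kind can be sketched as a greedy knapsack: transmit the packets with the largest quality contribution per bit until the per-epoch rate budget (set by the CSI) is exhausted. This greedy heuristic is an illustration of the idea, not the optimization in [59]:

```python
def select_packets(packets, budget_bits):
    """Greedy quality-maximizing selection under a rate budget.

    packets -- list of dicts with:
      quality_gain -- distortion decrease if the packet is delivered
      size         -- packet size in bits
    budget_bits -- bits transmittable this epoch (from the CSI)
    """
    ranked = sorted(packets,
                    key=lambda p: p["quality_gain"] / p["size"],
                    reverse=True)
    chosen, used = [], 0
    for p in ranked:
        if used + p["size"] <= budget_bits:
            chosen.append(p)
            used += p["size"]
    return chosen

picked = select_packets(
    [{"id": 1, "quality_gain": 10.0, "size": 100},
     {"id": 2, "quality_gain": 3.0, "size": 10},
     {"id": 3, "quality_gain": 5.0, "size": 100}],
    budget_bits=120,
)
```

A quality-fair variant would additionally weight each packet by its user's current quality deficit before ranking, instead of ranking across all users purely by gain per bit.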
As a final consideration, it is worth noting that different applications and services may have different definitions of quality of experience, as well as different requirements, and different methods may be used for QoE management. As an example, in medical applications the ability to use the received data to perform a diagnosis or a remote medical intervention is often the key QoE requirement [60–67]. In online gaming, interactivity is a main factor for QoE management, while in learning applications different strategies can be adopted to distribute the same content to a large number of users. The detailed study of the application-specific QoE requirements enables the most appropriate strategy for QoE management and control.
This chapter has presented a review of the most widely adopted strategies for QoE monitoring, control, and management, including solutions proposed by the authors. The focus was in particular on QoE provision for HTTP streaming and wireless transmission to multiple users. Designing current transmission systems with the goal of achieving QoE requirements for all involved users will enable better exploitation of resources, with a higher number of satisfied users in the system.
The authors acknowledge support from the European Union's Seventh Framework Program (FP7/2007-2013) under grant agreement no. 288502 (CONCERTO project).
AMC  Adaptive Modulation and Coding
APP  Application
BLER  Block Error Rate
CDMA  Code Division Multiple Access
CLD  Cross-Layer Design
CQI  Channel Quality Indicator
CSI  Channel State Information
DASH  Dynamic Adaptive Streaming over HTTP
DTQ  Dependency/Temporal/Quality
FR  Full Reference
GoP  Group of Pictures
HoL  Head of Line
LTE  Long-Term Evolution
MAC  Medium Access Control
MOS  Mean Opinion Score
MPEG  Motion Pictures Expert Group
MSE  Mean Square Error
NR  No Reference
OFDMA  Orthogonal Frequency Division Multiple Access
PLR  Packet Loss Ratio
PRB  Physical Resource Block
PSNR  Peak Signal-to-Noise Ratio
QoE  Quality of Experience
QoS  Quality of Service
RR  Reduced Reference
RTCP  RTP Control Protocol
SINR  Signal-to-Interference Noise Ratio
STB  Set-Top Box
SVC  Scalable Video Coding
UE  User Equipment
VoIP  Voice over IP
WiMAX  Worldwide Interoperability for Microwave Access