11.4 Summary of Standardization Activities

This section provides an overview of the major standardization efforts that have emerged in recent years and that can be applied generically to implement services and tools contributing to universal multimedia access (UMA). Context-aware content adaptation systems provide the means to implement UMA, and can thus be built on the technologies described. Having carefully analyzed these standards, it seems fair to say that MPEG-21 is the most complete one. This is easily explained by the fact that the scope of MPEG-21 goes well beyond “just” adaptation or “just” context description, as it:

  • Defines a complete infrastructure for packaging, in a compact way, all the elements that contribute to augmented access to and use of multimedia content, such as content and context descriptions, content representation and composition, management of intellectual property rights (IPRs) and digital rights, enforcement of digital rights and coarse control of content consumption, and so on.
  • Provides a comprehensive framework for the use of context information, and accordingly mediates the optimum access to content while also deciding the need for adaptation.
  • Has a scope of adaptation aiming at all types of content, with its tools/descriptions targeting all kinds of adaptation.
  • Makes use of a comprehensive set of descriptions (also through the use of MPEG-7) that are able to characterize virtually any of the entities usually cooperating for the delivery of the adapted multimedia service.
  • Is flexible enough to scale to new requirements as the technology progresses.

MPEG-21 provides the means to bridge the gap between being able to understand and use context and being able to evaluate the quality of the result achieved by the adaptation mechanisms. The AQoS tool, from the set of DIA tools, is a step in this direction. Nevertheless, more work is needed to actually evaluate the results of adaptation decision-taking, and of the adaptation itself, in terms of how far user expectations and the user experience are satisfied. Likewise, further investigation is needed to fully understand context and how to use it optimally, especially concerning the generation of implicit or higher-level context. In this particular aspect, it seems of utmost interest to merge this standardization work with that of the research community engaged in using ontologies to build contextual models, as described in Section 11.2.1.

Nevertheless, it should be emphasized that the W3C specifications are very important, not only due to their flexibility, but also because of their wide acceptance and already extensive usage. Thus, cooperation between the two standardization activities in MPEG and W3C would enable the former to benefit from the latter's larger market penetration, as well as from the use of Web 2.0 technologies, and the latter to benefit from the extensive knowledge gained on multimedia processing.

11.4.1 Integrating Digital Rights Management (DRM) with Adaptation

(Portion reprinted, with permission, from M.T. Andrade, H. Kodikara Arachchi, S. Nasir, S. Dogan, H. Uzuner, A.M. Kondoz, J. Delgado, E. Rodriguez, A. Carreras, T. Masterton and R. Craddock, “Using context to assist the adaptation of protected multimedia content in virtual collaboration applications”, Proc. 3rd IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom'2007), New York, NY, USA, 12–15 November 2007. ©2007 IEEE.)

A wide variety of research work has been conducted on DRM [51–53] and adaptation [54–56] to date. However, such work has essentially been carried out independently, without any significant exchange of information between the different groups addressing each of the two topics. Nonetheless, adaptation is an operation performed upon the content. Accordingly, as long as the content is governed or protected, the content adaptation operation should also be subject to certain rules and rights. Therefore, it is inevitable that these two separate communities will eventually cross borders and start to work together.

11.4.2 Existing DRM Initiatives

We consider DRM to be any of the several technologies used by copyright owners to control access to and usage of digital data and hardware, handling the usage restrictions associated with a specific instance of a piece of digital work. The most significant initiatives trying to standardize open DRM systems (which guarantee interoperability) are MPEG-21 [57], Open Mobile Alliance (OMA) DRM [58], and TV Anytime [59], but there are also many industry solutions, such as Windows Media DRM 10 [60].

They all use rights expression languages (RELs) to define the rights that a user has over a digital work, and the restrictions that have to be applied to its usage. The most relevant RELs are MPEG-21 REL, based on the eXtensible rights Markup Language (XrML) [61] proposed by ContentGuard Inc. [62], and the Open Digital Rights Language (ODRL) [63] proposed by Iannella from IPR Systems [64]. Syntactically, XrML and ODRL are based on XML, while structurally they both conform to the axiomatic principles of rights modeling first laid down by, among others, Dr Mark Stefik of Xerox PARC, the designer of the Digital Property Rights Language (DPRL) [65].

11.4.3 The New “Adaptation Authorization” Concept

As the two aforementioned areas of research developed separately, content adaptations cannot currently be governed, owing to the lack of descriptions of permissible conversions. Only very recently did the research groups working on adaptation and DRM start to cooperate in jointly defining approaches and methodologies to combine their outcomes into a single framework. In this framework, adaptation operations are subjected to restrictions based on the content owner's rights, i.e. content adaptation is governed by the content owner's rights, in addition to the constraints imposed by terminals, networks, natural environments, and users. Thus, this framework brings out a new concept, adaptation authorization, which can be seen as a new form of contextual information.

Not surprisingly, the joint effort between these two research fields has also emerged within the MPEG working community. In fact, MPEG already addressed the two issues separately within the MPEG-21 standard [30,31]. As each area evolved during standardization, it became clear that some kind of integration was crucial.

In multimedia networks where digital rights are governed, providers can protect the distribution and use of their content by means of standardized REL [66] and Rights Data Dictionary (RDD) [67] (Parts 5 and 6 of MPEG-21, respectively). However, as adaptation is becoming more and more important in multimedia systems, we arrive at a point where more detailed descriptions are needed about permissible conversions in order to be able to govern content adaptations.

The first amendment of MPEG-21 DIA [68] provides the description of fine-grained media conversions by means of the conversion operations' names and parameters, which can be used to define rights expressions to govern adaptation in an interoperable way. This amendment is mainly the result of the work done in the context of two European projects: DANAE [40], discussed in more detail in Section 11.2.2.2, and ADMITS (Adaptation in Distributed Multimedia IT Systems) [69]. In Section 11.6.7, we will go into further detail on how adaptation authorization can be achieved, but first we will give a brief overview of how rights can be expressed by means of licenses: the fundamental units of communication in the rights domain.

Figure 11.15 shows a license, which is divided into two main parts: a set of grants and the issuer of these grants. In the example illustrated in this figure, Bob is the issuer of the license (and probably the owner or distributor of the content, named “image.jpg”), and he expresses that he allows Alice (the principal) to play his content during the month of April by means of a grant. The MPEG-21 REL data model for a rights expression includes four basic entities, and the basic relationship among these entities is defined by the MPEG-21 REL assertion grant. Structurally, a grant consists of the following elements:


Figure 11.15 REL data model

  • The principal to whom the grant is issued.
  • The right that the grant specifies.
  • The resource to which the right in the grant applies.
  • The condition that must be met before the right can be exercised.

A grant, by itself, is not a complete rights expression that can be transferred unambiguously from one party to another. A full rights expression is called a license. As mentioned above, a typical license consists of one or more grants and an issuer – the party who issued the license.
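As a rough illustration of this data model (not of the REL XML syntax itself), the license of Figure 11.15 could be modeled as follows. The class and field names mirror the concepts above (grant, principal, right, resource, condition, license, issuer) and are invented for this sketch:

```python
from dataclasses import dataclass
import datetime

# Illustrative sketch of the MPEG-21 REL data model described above.
# Names mirror the standard's concepts, not its XML schema.

@dataclass
class Condition:
    valid_from: datetime.date     # the example condition: a validity period
    valid_until: datetime.date

    def holds(self, on: datetime.date) -> bool:
        return self.valid_from <= on <= self.valid_until

@dataclass
class Grant:
    principal: str        # to whom the grant is issued
    right: str            # the right the grant specifies, e.g. "play"
    resource: str         # the resource to which the right applies
    condition: Condition  # must be met before the right can be exercised

@dataclass
class License:
    issuer: str           # the party who issued the license
    grants: list

    def authorizes(self, principal: str, right: str, resource: str,
                   on: datetime.date) -> bool:
        return any(g.principal == principal and g.right == right
                   and g.resource == resource and g.condition.holds(on)
                   for g in self.grants)

# The example of Figure 11.15: Bob allows Alice to play "image.jpg" in April.
april = Condition(datetime.date(2007, 4, 1), datetime.date(2007, 4, 30))
lic = License(issuer="Bob",
              grants=[Grant("Alice", "play", "image.jpg", april)])

print(lic.authorizes("Alice", "play", "image.jpg", datetime.date(2007, 4, 15)))  # True
print(lic.authorizes("Alice", "play", "image.jpg", datetime.date(2007, 5, 1)))   # False
```

In the standard these entities are expressed in XML; the sketch only captures the relationships among them.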

In order to follow this structure and guarantee interoperability, MPEG-21 DIA conversion permissions have to be integrated within the “Condition” field of each grant. The details of this integration will be discussed in Section 11.6.7.

To date, no real implementation of adaptation authorization has been made, but many of the projects currently working with MPEG-21 DIA (DAIDALOS [70], aceMEDIA [71], etc.) have earmarked this point as possible future work. Other projects, such as AXMEDIS [72] and the second part of Projecte Integrat [73], named “Machine”, are also beginning to conduct research in this area. The most advanced publication on the subject is [74], in which a very interesting use case can be found, illustrating a complex UMA scenario that justifies the need for conversion and permission descriptions, as well as giving some detailed examples of them.

11.4.4 Adaptation Decision

ADEs can be considered the intelligent part of content adaptation systems. Their goal is to make a decision regarding the actions to be performed when contextual information is available. Thus, they provide an implementation of the phase “sensing higher-level context”, as defined in Section 11.2.1.1. Within the scope of MPEG-21, an ADE treats the basic contextual information as constraints imposed by the delivery and consumption environment. Using also descriptions of the service to be provided or the content to be delivered, in terms of technical characteristics such as encoding format, encoding rate, and spatial or temporal resolution, it implements an optimization algorithm to select the set of characteristics that satisfies the constraints. As the ultimate goal of adaptation is to improve the user's quality of experience, this optimization is usually driven by a utility value that represents the user's satisfaction.

Accordingly, an ADE needs to receive not only the low-level contextual information, but also sets of media characteristics that provide the technical parameters with which the content is encoded (corresponding to different flavors of the same content) as well as a utility value that quantifies the degree of satisfaction of the user with each set of the technical parameters. The fact that the ADE receives sets of values of encoding parameters should not be seen as imposing any kind of restriction upon the type of adaptation operation that may be performed as a result of the decision-taking process. In fact, it does not require that the adaptation be performed by transcoding or trans-rating the media resource. For example, if the original DI is composed of video and audio, and among the available sets of media characteristics there is one that has only video encoding parameters and text, then this implies that the adaptation operation needs to be a voice-to-text transformation (and possibly also a rate transformation of the video). Thus, it will be up to the ADE to reason about the low-level contextual information it receives and infer higher-level context. Furthermore, in this example, if the description of the natural environment indicates a rather high surrounding noise level, the ADE might also opt to select the set of parameters that do not include audio.

Current implementations of ADEs do not address this kind of behavior. They only provide the functionality to compare similar sets of parameters, and select the one that maximizes the utility. Generally speaking, the adaptation decision process can be seen as a problem of resource allocation in a generalized rate-distortion or resource-distortion framework. It is mainly an optimization problem that operates in a three-dimensional space: content, resources, and utility, as represented in Figure 11.16. The content space provides the indication of the possible variations of the content that can be offered (or equivalently of the possible content adaptation operations that can be performed). The resources space represents the characteristics and limitations of the current consumption environment. It describes the consumption environment in terms of availability of resources at the terminal and in the networks, of the conditions of the natural surrounding environment, and of the user preferences or requirements. It can thus be seen as the space dictating the initial rules by which to choose among the available content variations, by imposing some kinds of constraints. It allows for elimination of variations that do not meet the described constraints. Finally, the utility space provides values that quantify the degree of satisfaction of the user with each of the available variations of the content. It can thus be used to enable a finer-grain selection among the subset of variations that have initially satisfied the constraints imposed by the resources space.

For a given flavor of the content (a possible adaptation of the content), a set of resources is selected from among those available so as to minimize the distortion introduced, or in other terms maximize the utility. This distortion can be a measure of the degradation of the quality of the adapted content (correspondingly, the utility is the level of quality of the content). It can also be another measure that reflects the degree of satisfaction of the user, or even any other metric, such as the cost that the user will have to pay for the adapted service (any other metric that may reflect some preference of the user or degree of satisfaction). In [24], the problem of defining utility measures is discussed. The author argues that there is no universal solution, due to the complex nature of utility and its dependencies on a number of subjective factors, such as the nature of the content itself and the characteristics of the user (for example, user 1 may consider that both content A and content B with bit rate r1 are of very good and of satisfactory quality respectively, whereas user 2 considers content A as of medium quality only). Accordingly, it is concluded that this is still an open issue, which is currently being studied within the concept of quality of experience (QoE).
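The decision process just described can be sketched in a few lines: constraints from the resources space first eliminate inadmissible content variations, and the utility then selects among the survivors. All variation fields, constraint names, and numbers below are invented for illustration:

```python
# Toy sketch of the three-space adaptation decision: content space
# (candidate variations), resources space (constraints), utility space
# (per-variation satisfaction values). Values are invented.

variations = [
    {"id": "v1", "bitrate_kbps": 2000, "width": 1280, "utility": 0.95},
    {"id": "v2", "bitrate_kbps": 800,  "width": 640,  "utility": 0.80},
    {"id": "v3", "bitrate_kbps": 300,  "width": 320,  "utility": 0.55},
]

constraints = {"max_bitrate_kbps": 1000, "max_width": 720}  # from the environment

def decide(variations, constraints):
    # Step 1: eliminate variations that violate the resource constraints.
    feasible = [v for v in variations
                if v["bitrate_kbps"] <= constraints["max_bitrate_kbps"]
                and v["width"] <= constraints["max_width"]]
    if not feasible:
        return None  # no admissible variation for this context
    # Step 2: finer-grain selection by maximum utility.
    return max(feasible, key=lambda v: v["utility"])

print(decide(variations, constraints)["id"])  # → v2
```

The same skeleton accommodates any other metric in the utility role, such as the cost the user would pay for the adapted service.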


Figure 11.16 The representation of the three different spaces for adaptation decision

An early research work addressing the context-aware adaptation of content [37] presents a framework using an info pyramid, where different variations and modalities of the content are represented at different levels of fidelity. From this pyramid, a customizer selects the best pair of variation and modality so as to meet the constraints of the usage environment. The focus of the work is on adapting Web documents or applications composed of multiple media types to meet different terminals with various capabilities. The system architecture proposed in this work is presented in Figure 11.17.

Although this approach is somewhat rigid, its concepts are quite useful when the objective is to adapt a complete presentation composed of different media types. It could be complemented with the approach based on the three-dimensional space. The info pyramid provides different modalities of a given content in the horizontal axis, where the most demanding modality, in terms of required resources, is placed on the left corner. Along the vertical axis, it provides variations of each of those modalities, starting with the highest available quality variation at the bottom. Figure 11.18 illustrates an example info pyramid.
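The info pyramid selection just described can be sketched with invented modalities and bandwidth costs: the customizer scans modalities from most to least demanding and, within each, variations from highest to lowest fidelity, returning the first pair that fits the terminal's constraint:

```python
# Illustrative info pyramid: modality -> variations, best quality first.
# The required-bandwidth figures (kbps) are invented for the sketch.

pyramid = {
    "video": [2000, 800, 300],
    "image": [200, 80],
    "text":  [5],
}
MODALITY_ORDER = ["video", "image", "text"]  # most demanding modality first

def customize(pyramid, available_kbps):
    for modality in MODALITY_ORDER:
        for cost in pyramid[modality]:     # highest fidelity first
            if cost <= available_kbps:
                return modality, cost      # best modality and fidelity that fit
    return None                            # nothing fits the constraint

print(customize(pyramid, 500))   # → ('video', 300)
print(customize(pyramid, 100))   # → ('image', 80)
```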


Figure 11.17 Internet content adaptation system architecture based on the info pyramid concept [37]

In more recent work [38], a three-dimensional space approach has been presented using the designation “adaptation-resource-utility”. Figure 11.19 illustrates the concept of this three-dimensional space for content adaptation decision. For a given possible adaptation of the content, a set of resources is selected from among those available so as to minimize the distortion introduced or, in equivalent terms, to maximize the utility. In this research work, different case studies are described, for which different utilities are developed to drive the selection of the adaptation operation. The use of the resource space is essentially restricted to resources that are directly related to the technical specificities of the adaptation operation upon a given media content (for example, “network bandwidth available” as the resource and “frame dropping” as the adaptation). This work is focused on establishing relationships between the spaces from the perspective of video adaptation. Another aspect where this work diverges from that discussed in this chapter is that the selection of the adaptation is driven by the minimization of the used resources (such as bandwidth) and the maximization of the utility. In this chapter, on the other hand, the adaptation decision is described to be initially driven by the constraints imposed by the context of usage, and then further refined by the maximization/minimization of the utility. Nonetheless, this model is sufficiently generic to allow for the description of a number of different constraints in the resource space and the establishment of different mapping relations between the spaces.


Figure 11.18 Example of an info pyramid for a video item [37]


Figure 11.19 The “adaptation-resource-utility” space for content adaptation [38]

The MPEG-21 DIA tools, together with MPEG-7, can be used to represent the above-mentioned three spaces (refer to Section 11.2.2 for a succinct description of these standards).

As indicated above, the content space provides structural metadata about the content for each possible variation or flavor. More precisely, the technical parameters with which the content is encoded in order to provide the specified variation are described. Accordingly, this space can be implemented via the MPEG-7 MDS media characteristics (see Section 11.2.2).

The resources space provides information about the characteristics, capabilities, and conditions of the whole delivery and consumption environment, which are used to determine the constraints imposed on the service. The MPEG-21 DIA UED tool is thus adequate for implementing this space together also with the UCD tool, as UCD can be used to express specific limitations or optimization constraints based on the UED-specific characteristics in order to facilitate the adaptation decision.

Finally, the utility space is the vehicle through which the ADE is able to formulate a decision by reasoning about the contextual information present in the content and resources spaces. Basically, it achieves this goal by assigning a utility to each set of technical parameters in order to encode the content. As such, the MPEG-21 DIA AQoS tool is adequate for employment here. In fact, as the AQoS tool provides the mechanism to describe the relations between the content space and the utility space, it also incorporates the content space by making use of the MPEG-7 MDS media characteristics referred to.

Figure 11.20 illustrates the conceptual architecture of an adaptation decision framework, implementing the three-dimensional space approach through the use of the MPEG-21 description tools referred to. It also shows the high-level architecture of an ADE, which can be used as a service by other modules of a context-aware system, as well as small examples of the metadata that it uses. This ADE is quite generic, as it can accept metadata in different formats. The module that wants to use the ADE can completely specify how the provided metadata is to be used in the decision-taking process. In some implementations, this module can be the entity or process responsible for monitoring the quality of the service being offered to the user. In other cases, this functionality can be incorporated within the ADE. The important aspect to highlight here is that this ADE can potentially be used in many different application scenarios and by different external modules, regardless of whether formats and rules are externally supplied or internally generated. The use of XSLT provides the ADE with great flexibility, while decoupling it from other components. For example, it is possible to seamlessly use different Adaptation Decision Taking Engine (ADTE) components, which implement different search strategies, accept different UEDs, and use different forms of transforming the UEDs into UCDs. The output of the ADE is a set of “name–value” pairs selected from among the AQoS descriptors originally provided. This output is used to configure different resources, including the encoding parameters of AEs. In the current implementation, this configuration is done using Web Services and associated Simple Object Access Protocol (SOAP) messages [75].


Figure 11.20 ADE framework based on MPEG-21

This type of ADE was developed under the VISNET I NoE project [76]. It did not take into consideration the information concerning the protection of the content and authorization of adaptation operations. In addition, it was able to provide an adaptation decision for one medium only, not taking into account aspects of adapting a composition of multiple media components. However, given its high versatility, it forms the basis of the work conducted within the VISNET II NoE project.

11.4.5 Context-based Content Adaptation

Content adaptation is the process of converting the media available from the content provider into a format that can be consumed by the user. An AE performs this operation as instructed by the ADE. Content adaptation in multimedia processing research has primarily been realized in the form of video adaptation, since, compared to the speech and audio components of multimedia content, video requires special attention for its coding, processing, and transmission over access networks. Most of the video adaptation techniques discussed in the literature address network constraints. Bit-rate adaptation of MPEG-4 visual-coded bitstreams by frame dropping (FD), AC/DCT coefficient dropping (CD), and their combination (FD–CD) has been discussed in [77]. A utility function (UF) has been used to model video entity, adaptation, resource, utility, and the relations among them. Each video clip is classified into one of several distinctive categories, and local regression is then used to accurately predict the utility value. Techniques reported in [78–81] consider only frame dropping as a means of bit-rate adaptation. A more intelligent frame-skipping technique has been presented in [82]. This technique determines the best set of frames (key frames) to represent the entire sequence, utilizing a neural network model that is capable of predicting the indices of the most appropriate subsequent key frames. In addition, a combined spatial and temporal technique for multidimensional scalable video adaptation has been discussed in [83].
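As a toy illustration of bit-rate adaptation by frame dropping (the FD technique surveyed above), the sketch below discards non-reference (B) frames until a target rate is met. Frame types and sizes are invented, and a real transcoder must also respect prediction dependencies and timing:

```python
# Simplified frame-dropping (FD) bit-rate adaptation: drop B-frames,
# which no other frame depends on, until the stream fits the target.
# Frame sizes (bits, one second of video) are invented for the sketch.

frames = [
    ("I", 40000), ("B", 8000), ("B", 8000), ("P", 20000),
    ("B", 8000), ("B", 8000), ("P", 20000), ("B", 8000),
]

def drop_frames(frames, target_bps):
    kept = list(frames)
    # Walk backwards so pops do not disturb the indices still to visit.
    for i in range(len(kept) - 1, -1, -1):
        if sum(size for _, size in kept) <= target_bps:
            break                      # budget met: stop dropping
        if kept[i][0] == "B":
            kept.pop(i)                # B-frames are safe to discard
    return kept

adapted = drop_frames(frames, 90000)
print(sum(size for _, size in adapted))   # total rate after adaptation
print([ftype for ftype, _ in adapted])    # surviving frame types
```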

A framework for video adaptation based on content recomposition is presented in [84]. The objective of the proposed technique is to provide effective small-size videos which emphasize the important aspects of a scene while faithfully retaining the background context. This is achieved by explicitly separating different video objects based on a generic video attention model that extracts the objects in which a user is interested. Three types of visual attention feature, namely intensity, color, and motion, have been used in the attention model. Subsequently, these objects are integrated with the direct-resized background to optimally match the specific screen sizes under consideration.

However, the aforementioned techniques do not consider user preferences when determining the nature of adaptation. The technique presented in [85] extracts the highlights in sports videos according to user preferences. The system is able to extract the highlights, such as shots at goal, free kicks, and so on for soccer, and start, arrival, and turning moments for the swimming scenes. Objects and events are assigned to different classes of relevance. The user can assign a degree of preference to each class, in order to have the best quality–cost tradeoff for the classes most relevant to what they are interested in, at the price of a lower quality for the least relevant ones. The adaptation module performs content-based video adaptation according to the bandwidth requirements and the weights of the classes of relevance. [86] considers user preferences together with network characteristics for the adaptation of sports videos. Events are detected by audio/video analysis, and annotated by the DSs provided by the MPEG-7 MDSs. Subsequently, user preferences for events and network characteristics are considered in the adaptation of the videos through event selection and frame dropping.

An effective way of performing content adaptation is to utilize adaptation operations in the networks, due to their advantages such as transparency and better network bandwidth resource utilization. Scalable coding technologies help to simplify the function of the network element that carries out the adaptation operation. Adaptation is particularly needed when compressed media streams traverse heterogeneous networks. In such cases, a number of content-specific properties of the coded multimedia information require adaptation to new conditions imposed by the different networks and/or terminals in order to retain an acceptable level of service quality. Network-based adaptation mechanisms can be employed at the edges or other strategic locations of different networks, using a fixed-location content adaptation gateway, node, or proxy as in conventional networking strategies [87–90]. Alternatively, content adaptation through transcoding can be performed dynamically wherever and whenever needed using active networking technologies [91,92].

Not only do the network- and/or user terminal-based characteristics impose adaptation needs on the accessed/delivered content, but the users themselves play a major role in choosing the way the content is distributed. For instance, a user may wish to select a specific area that draws their main attention in visual content. Thus, they may want to access a part of the video scene based on their selection. In addition to this, or in a totally isolated situation, the terminal that the user is using may have a restricted display capability with lower resolution than that of the originally-encoded content. Moreover, the access network that the user is connected to may not be able to support elaborate visual information transfer due to bandwidth limitations and/or other channel-specific characteristics. All of these add up to the profiling of a use case for this particular user, and the different display capabilities, attention area selection preferences, access network-based features, and so on provide the necessary context elements for this use case.

The content adaptation strategies and mechanisms that are being discussed within this chapter aim to implement user-centric content adaptation operations, which ultimately will provide the user with the best possible user experience of the service they have requested. This goal can only be realized if the content access/distribution is effectively decoupled from the service-related limitations, which in turn makes the service delivery transparent to the user. Under such a circumstance, the different factors, all of which can be referred to as context information, collectively affect the adaptation of the content, and can provide guidance on how to perform the best possible adaptation for each and every use case.

A number of content adaptation tools will be described in subsequent sections, with a view to addressing the needs of the application scenario described in Section 11.6. The main focus is placed on the context-based methods for user-centric content adaptation with management of digital rights. While discussing the issues related to the focused objective, a region of interest (ROI) selection by the user is assumed to form a driving context element for developing an ROI-based user-centric content adaptation tool. ROI selection provides a key advantage during content adaptation through transcoding, as it identifies a visually-important area or object in the digital video. The advantage is particularly significant when high-resolution video services are distributed across a wide range of heterogeneous user terminals with diverse display capabilities [93]. Selecting an ROI in video content allows a content adaptation (e.g. through transcoding, etc.) gateway to accurately reformat the resolution of input video while focusing on the main region or object of visual attention, as requested by the user. In this way, the AE is able to reorganize the predefined scene priorities, allowing for unequal video parameter allocation to different parts of a scene based on their perceptual qualities [94,95].

Various methods for determining an area of visual attention have been presented in the literature to date. These methods have been exploited to develop a number of algorithms for ROI selection [96–98]. However, most of these algorithms were employed to select an ROI in the pixel domain during the encoding of a video sequence [98–100]. Therefore, they are not quite adequate for network-based adaptation operations for heterogeneous video access scenarios with quick system responses. Recent research has focused on finding ROI in the coded domain in order to allow for a number of fast applications, such as transcoding systems, object detection, tracking and identification techniques, image and video retrieval/summarization schemes based on MPEG-7 descriptors, event detection, AV content analysis and understanding tools, and so on [51,94,101–105].

The reorganization of the content at a gateway in the network has to be context-driven, depending on either the user preferences or the network conditions. Focusing on the main region or object of attention could be implemented by separating the source stream into substreams and varying the source and channel rates for these streams, so as to provide better error protection to the stream carrying the selected attention area. Other streams can be assigned lower source and channel coding rates. This facility could also be useful in situations where the network is experiencing congestion, or where a user is in an area with weaker signal reception and adequate bandwidth is not available for higher-quality content access. Here, unequal rate allocation to different regions of the stream could provide better quality for the selected region of the video scene. The network gateway could sense network conditions, and resources could be allocated on a priority basis to different regions of the video content. Optimization of the allocation of resources for video applications over mobile networks has served for transmission power control and improved visual quality [106,107].
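The unequal rate allocation idea can be sketched as a simple budget split, in which the substream carrying the selected attention area receives the larger share of the available source rate. The weighting and the floor that keeps the background decodable are invented for illustration:

```python
# Toy unequal source-rate allocation between an ROI substream and the
# background substream. The 70/30 weighting and the minimum background
# rate are invented; a real gateway would derive them from context.

def allocate(total_kbps, roi_weight=0.7, min_background_kbps=64):
    roi = total_kbps * roi_weight
    background = total_kbps - roi
    if background < min_background_kbps:        # keep the background decodable
        background = min(min_background_kbps, total_kbps)
        roi = total_kbps - background
    return roi, background

roi, bg = allocate(500)
print(roi, bg)   # the ROI substream gets the larger share of the budget
```

Channel-coding rates could be skewed the same way, giving the ROI substream stronger error protection under congestion or weak reception.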

Scalable video coding (SVC) has been identified as a feasible video adaptation technique that fits within the MPEG-21 DIA architecture. A number of scalability options have been discussed in the literature, namely spatial, temporal, fidelity, and interactive ROI (IROI) [108]. If the coded video features one or more of the aforementioned scalability options, the adaptation operation is as simple as letting the set of coding units that define the adapted bitstream pass through the AE and discarding the rest. This technology has been available in video coding standards, such as MPEG-4, for many years. Nevertheless, it remains underutilized for several reasons, such as excessive demand for computational resources at the encoder, reduced coding efficiency, and delay. Furthermore, user-generated content cannot be expected always to be scalable, owing to the use of low-cost hardware and software by some content providers during the content production cycle.

IROI adaptation is a vital ingredient in user-centric adaptation. A user may wish to view a selected ROI from a high-resolution video on a low-resolution display, rather than a low-resolution version of the entire frame. This scenario occurs frequently in security and surveillance applications [109], for instance. Spatial cropping of each frame in the video sequence (sequence-level cropping) is necessary to serve such adaptation requests. Leaving the decoder to handle this adaptation is not an ideal choice, since it not only wastes precious network bandwidth but also demands more computational resources at the user terminal. Furthermore, if the access network has bandwidth limitations, the overall concept of decoder-driven scalability or adaptation becomes entirely unfeasible.
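In pixel-domain terms, sequence-level cropping is simply the same spatial crop applied to every decoded frame. The sketch below illustrates the operation on frames represented as lists of pixel rows; a real transcoder would of course operate on, or around, the compressed domain.

```python
def crop_frame(frame, x, y, width, height):
    """Spatially crop one frame (a list of pixel rows) to the ROI
    rectangle whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in frame[y:y + height]]

def crop_sequence(frames, roi):
    """Sequence-level cropping: apply the same ROI crop to every
    frame in the sequence."""
    x, y, w, h = roi
    return [crop_frame(f, x, y, w, h) for f in frames]
```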

The SVC extension of H.264/AVC [110] includes provisions for IROI adaptation, formally identified as IROI scalability [111,112]. This scalability is achieved by coding non-overlapping rectangular regions (tiles) of a frame into independently decodable entities called network abstraction layer (NAL) units (NALUs) using flexible macroblock ordering (FMO). An AE can utilize IROI scalability to extract a substream that provides enhanced visual quality over the ROI [109]. However, a sequence-level cropping operation can be performed at an AE only if cross-tile temporal prediction is restricted, which drastically reduces the overall compression efficiency. A similar IROI scalability technique has been proposed by Lambert et al. [113]; in that work, FMO in H.264/AVC is utilized to code an ROI into NALUs, so the technique suffers from the same limitations as the IROI scalability of the SVC extension of H.264/AVC, as discussed above. Consequently, transcoder-based adaptation is a necessity for serving such scenarios. This chapter discusses a platform for accomplishing such adaptations on both MPEG-4- and H.264/AVC-coded video streams, and presents experimental results on the effectiveness of the described adaptation tools for context-based user-centric content adaptation.
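With tile-based IROI scalability, the AE's first task is to determine which tiles a requested ROI intersects; it then keeps only the NALUs carrying those tiles. The geometry of that step can be sketched as below (a hypothetical helper, assuming a uniform tile grid, not syntax from the standard):

```python
def tiles_covering_roi(roi, tile_w, tile_h):
    """Return the (col, row) indices of the non-overlapping tiles
    that the ROI rectangle (x, y, width, height) intersects, assuming
    a uniform grid of tile_w x tile_h tiles."""
    x, y, w, h = roi
    cols = range(x // tile_w, (x + w - 1) // tile_w + 1)
    rows = range(y // tile_h, (y + h - 1) // tile_h + 1)
    return [(c, r) for r in rows for c in cols]

# A 64x48 ROI at (48, 16) on a 32x32 tile grid touches 3x2 = 6 tiles.
print(tiles_covering_roi((48, 16, 64, 48), 32, 32))
```

Note that this selection alone only yields a valid substream if the tiles are temporally self-contained; as discussed above, restricting cross-tile prediction is exactly what costs compression efficiency.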

The adaptation tools discussed here are utilized as the AE, and form an integral part of the content adaptation block shown in Figure 11.21 [114]. Here:

  1. The user specifies a selected ROI in a feedback message to the service provider.
  2. The service provider consults an ADE.
  3. The ADE determines the type of adaptation needed after processing the available context descriptors. The context descriptors describe the user-defined ROI and other constraints, such as terminal capabilities, access network capabilities, usage environment, DRM, and so on.
  4. The relevant adaptation decision is then passed on to the AE.
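The decision step (step 3) of the workflow above can be sketched as a simple mapping from context descriptors to an adaptation decision message. The descriptor keys and decision vocabulary below are hypothetical placeholders, not MPEG-21 DIA syntax; they merely illustrate how an ADE might weigh the user-defined ROI against terminal constraints.

```python
def decide_adaptation(context):
    """Toy ADE: inspect context descriptors and emit an adaptation
    decision for the AE. All field names are illustrative."""
    roi = context.get("user_roi")               # user-selected ROI, if any
    display = context["terminal"]["display"]    # (width, height) of terminal
    src = context["content"]["resolution"]      # (width, height) of source
    if roi and (src[0] > display[0] or src[1] > display[1]):
        # Content exceeds the display and an ROI was requested:
        # crop every frame to the ROI rather than downscale the whole frame.
        return {"operation": "sequence_level_crop", "roi": roi}
    if src[0] > display[0] or src[1] > display[1]:
        return {"operation": "downscale", "target": display}
    return {"operation": "none"}
```

A full ADE would also consider access-network capabilities, usage environment, and DRM descriptors, as listed in step 3; the sketch keeps only the ROI/terminal trade-off for brevity.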


Figure 11.21 User-centric ROI-based content adaptation architecture

An AE that performs content adaptation based on a method for optimized source and channel rate allocation is presented in Section 11.6.8.1. An AE that carries out the sequence-level cropping recommendations specified in the adaptation decision message, in order to provide ROI-based content adaptation, is then described in Section 11.6.8.2. Finally, adaptation based on scalable video content is reported in Section 11.6.8.3.
