Chapter 11

Opinion Summarization and Visualization

G. Murray (University of the Fraser Valley, Abbotsford, BC, Canada); E. Hoque and G. Carenini (University of British Columbia, Vancouver, BC, Canada)

Abstract

Given a very large amount of social media text, being able to understand the variety of opinions contained in the text often depends on the generation of summaries and visualizations of the dataset so as to make it more manageable. This chapter first surveys approaches used for both extractive and abstractive summarization of opinion-filled social media text, including discussion of summarization evaluation. We then survey approaches for presenting these opinion summaries to users in the form of visualizations, including interactive visualizations. Taken together, these summarization and visualization techniques can allow users to get concise overviews of the opinions expressed in the data, enabling them to draw sound conclusions and make more informed decisions.

Keywords

Opinion summarization; Opinion visualization; Summarization evaluation; Abstractive summarization; Social network; Visual text analytics

1 Introduction

There are many cases where it would be useful to be able to analyze opinions contained in a large amount of social media text. For example, it could help inform the decision making of an organization regarding development and marketing of a product. However, the vastness of textual data in social networks makes this a challenging and time-consuming proposition. For that reason it would be highly desirable to generate natural language summaries of opinions in social media, and present these summaries to the user in an easily understood format.

This chapter presents an overview of a set of computational methods for summarizing opinions from textual data. In particular, it discusses how various approaches for performing opinion detection and sentiment analysis can be used in generating textual summaries. It gives an overview of several extractive and abstractive summarization systems along with their relative trade-offs.

Once summaries have been generated from a large set of opinions, a further challenge is how to effectively present these summaries to the user. Information visualization can play a critical role here by creating visual representations of summaries along with important attributes of the underlying opinions (eg, strength, polarity). Often, visualizations can be interactive (eg, the user may filter and zoom into the actual opinions and get detailed information on demand). In this way, interesting patterns, trends, and outliers among opinions can be easily perceived, thus enhancing the ability of the user to make informed decisions. In the second part of the chapter, an overview of various visualization tools will be provided.

2 Opinion Summarization

The task of automatically summarizing documents has been researched for several decades [1–6]. Most of this work has been on summarizing highly structured documents such as news articles and scientific papers. As a result, summarization researchers have focused a great deal on well-formed, grammatical documents that are mostly devoid of sentiment and opinion. Most of this work has also been extractive, meaning that sentences in the source document(s) are classified as informative or noninformative, and the informative sentences are simply concatenated to form a summary.

Recently, those trends have shifted. Researchers have focused on new domains such as automatic summarization of meeting transcripts [7], written conversations [8], lectures [9, 10], and increasingly social media text. In general, the natural language processing community has become more interested in “noisy” text, which may include ungrammatical sentences, misspellings, fragments, and slang [11]. Documents in these domains are also likelier to contain opinions, sentiment, and disagreements.

Simultaneously, the summarization research community has started moving increasingly toward abstractive summarization approaches. The loosest definition of abstractive is that the summary sentences do not occur in the original source documents. By this definition, sentence compression approaches would qualify as abstractive [12, 13]. In a stricter sense, abstractive approaches should mimic human-authored summarization by following steps roughly corresponding to understanding, synthesis, and creation of summary text. Spärck Jones [3] termed those steps interpretation, transformation, and generation. This type of abstraction is clearly more difficult than extraction, not least because it requires a natural language generation component, but it also offers the hope of more fluent and informative summaries than extractive techniques can offer. Many current abstractive summarization systems in truth fall somewhere between extractive and abstractive [14].

In the following sections we describe the challenges associated with summarizing opinion-filled social media text, and the approaches that researchers have taken.

2.1 Challenges

There are two main challenges associated with summarizing social media text: the documents tend to be noisy and ungrammatical, and they are filled with opinions that may be very diverse and conflicting. Those challenges are addressed in turn.

2.1.1 Challenges of summarizing noisy text

The noisy nature of social media text poses challenges for all summarization systems but the challenges are different for extractive versus abstractive systems.

For an extractive system the output sentences are a subset of the input sentences. This means that the summary will reflect any ungrammaticalities, disfluencies, and slang that are in the input unless care is taken to remove them. This type of postprocessing can include expansion of acronyms, correction of misspellings, appropriate capitalization, and sentence compression. Sentence fragments and ungrammatical sentences are more difficult to deal with; they can be filtered out altogether, at the cost of reducing coverage of the document, or they can be left in the summary as is, at the cost of reducing readability and coherence of the summary.
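
As a sketch of the kind of lightweight postprocessing this involves, the following assumes hand-built lookup tables; the acronym and respelling dictionaries here are invented examples, not from any particular system.

```python
import re

# Illustrative lookup tables; a real system would curate these per domain.
ACRONYMS = {"imo": "in my opinion", "afaik": "as far as I know"}
RESPELLINGS = {"teh": "the", "recieve": "receive"}

def clean_sentence(text):
    """Expand acronyms, fix common misspellings, and restore capitalization."""
    tokens = [RESPELLINGS.get(t, ACRONYMS.get(t, t))
              for t in text.lower().split()]
    sentence = re.sub(r"\s+", " ", " ".join(tokens)).strip()
    return sentence[:1].upper() + sentence[1:]  # capitalize the first letter

print(clean_sentence("imo teh battery is great"))
```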

Abstractive systems do not suffer from the same problem. We can use natural language generation to create well-formed sentences that describe the input documents at a high level. The natural language generation component gives us some control over readability, coherence, conciseness, and vocabulary. However, abstractive systems may be more reliant than extractive systems on syntactic and semantic parsing to represent the meaning of the input sentences at a deeper level. In some domains it is very difficult to get good parsing performance because of the noisy nature of the text. It may be possible to improve parsing by preprocessing the sentences, similar to the extractive postprocessing steps described above. Alternatively, systems could try to incorporate parsers, partial parsers, chunkers, or ontology mappers that have been specifically trained on such noisy data.

In one evaluation and comparison of extractive and abstractive summarizers in the similarly noisy domain of meeting speech [15], user study participants were extremely dissatisfied with extractive summaries and many participants remarked that the extracts did not even constitute summaries. Abstraction seems to have a clear advantage in such domains.

2.1.2 Challenges of summarizing opinion-filled text

The main challenge in summarizing opinion-filled text is to generate a summary that accurately reflects the varied, and potentially conflicting, opinions. If the system input consists of social media posts in which the vast majority of social media users share similar opinions about a person, organization, or product, then simple extractive approaches may suffice: just find some exemplar posts and make those the summary sentences. However, if the social media users disagree about that entity, or have a similar opinion but for different reasons, extraction alone may not be enough. For instance, if the group exhibits binary polarization about an issue or entity, we may be able to identify a mix of positive and negative exemplar texts, but concatenating them has the potential to create a very incoherent summary.

So abstractive approaches seem well suited to the task of summarizing opinion-filled text. The system can generate high-level sentences that describe the differing opinions and any information or evidence that seems to be driving those opinions. Further, a hybrid system can have each abstractive sentence linked to extracts that exemplify the viewpoint being described.

2.2 Evaluation

We briefly describe two types of summarization evaluation that have been used in the literature.

2.2.1 Intrinsic evaluation

Intrinsic evaluation measures are so-called because they attempt to measure intrinsic properties of the summary, such as informativeness, coverage, and readability. These qualities are sometimes rated by human judges (eg, using Likert-scale ratings).

For system development purposes, human ratings are often too expensive and so are used sparingly. As an alternative, automatic intrinsic evaluation can be done by comparison of machine-generated summaries with multiple human-authored reference summaries. Multiple reference summaries are used because even human-authored summaries will often exhibit little overlap with one another. A very popular automatic summarization evaluation suite is ROUGE [16], which measures n-gram overlap between machine summaries and reference summaries. However, in other noisy domains it has been observed that ROUGE scores do not always correlate well with human judgments [17].
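
To make the n-gram overlap idea concrete, here is a minimal sketch of ROUGE-N recall in Python. It pools counts across the reference summaries; the actual ROUGE toolkit [16] adds stemming, stopword options, and several other variants, so this is illustrative rather than a reimplementation.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of n-grams from a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, references, n=2):
    """ROUGE-N recall: clipped candidate n-gram matches divided by the
    total number of n-grams across all reference summaries."""
    cand = ngrams(candidate.lower().split(), n)
    matches = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # Count each reference n-gram at most as often as it appears
        # in the candidate (clipping).
        matches += sum(min(c, cand[g]) for g, c in ref_counts.items())
    return matches / total if total else 0.0

machine = "the camera takes great pictures in low light"
references = ["the camera produces great pictures even in low light",
              "reviewers praise the camera for great pictures in low light"]
print(round(rouge_n_recall(machine, references), 3))
```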

Another intrinsic evaluation technique is the pyramid method [18], which is more informative than ROUGE because it assesses the content similarity between machine and reference summaries in a more fine-grained manner. However, the pyramid method is much more time-consuming and only recently has a partially automatic version been proposed [19].

2.2.2 Extrinsic evaluation

The ideal evaluation is an extrinsic one, where one tests whether the generated summaries are actually effective in assisting users in some realistic task. Like human ratings of intrinsic summary qualities, extrinsic evaluations can be very expensive and so are rarely used during the system development process. Rather, they are employed after a system has been developed and has already been tested with intrinsic measures.

To give the most general example of an extrinsic summary evaluation, one can test whether user study participants are able to find information in the source documents more quickly when they have a generated summary to assist them. The required information may be in the form of a set of questions that the user needs to answer in a limited amount of time. Depending on the types of questions and the relevant documents, in some cases it may be easier and quicker for the user to simply do a keyword search. If the information need is more complex (eg, requiring the user to understand some aspect of group interaction in the source documents), a summary may be very valuable in completing the task.

2.3 Opinion Summarization Approaches

Some of the most widely cited work on opinion summarization is by Hu and Liu [20], who developed a system for automatic generation of summaries for product reviews. They do not take a purely extractive or abstractive approach, instead developing a hybrid system that analyzes product features and creates a structured summary with links to actual review sentences.

The Hu and Liu system has three major steps:

1. identify product features from the customer reviews;

2. identify opinion sentences from the reviews, and determine whether they are positive or negative;

3. generate a concise summary of the results.

For identification of frequent features, they use a type of association mining based on the Apriori algorithm. They also develop a simple procedure for identifying infrequent features that may be of interest to some customers.

For identification of opinion words, they use adjectives and exploit WordNet’s synonym and antonym sets to predict the “semantic orientation” of adjectives. They begin with a seed list of positive adjectives (eg, fantastic, nice) and negative adjectives (eg, boring, dull) and consult WordNet to expand the adjective lists. Identification of opinion sentences is done by analysis of the constituent opinion words of each sentence.
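
As an illustration of this bootstrapping step, here is a minimal sketch using NLTK's WordNet interface; the seed lists are the examples above, and the single expansion pass is a simplification of the iterative procedure Hu and Liu describe.

```python
# Requires: pip install nltk, then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def expand_orientation_lexicon(pos_seeds, neg_seeds):
    """One expansion pass: synonyms keep an adjective's orientation,
    antonyms flip it (a simplified version of Hu and Liu's procedure)."""
    positive, negative = set(pos_seeds), set(neg_seeds)
    for word in list(positive | negative):
        same = positive if word in positive else negative
        other = negative if word in positive else positive
        for synset in wn.synsets(word, pos=wn.ADJ):
            for lemma in synset.lemmas():
                same.add(lemma.name())           # synonym: same orientation
                for ant in lemma.antonyms():     # antonym: opposite orientation
                    other.add(ant.name())
    return positive, negative

pos, neg = expand_orientation_lexicon({"fantastic", "nice"}, {"boring", "dull"})
print(sorted(pos)[:10], sorted(neg)[:10])
```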

The actual summary generation component of the Hu and Liu system works in two steps:

1. For each feature, the relevant opinion sentences are put into positive and negative categories, and a count is displayed for each.

2. Features are ranked according to how often they are mentioned in reviews.

For a particular feature such as the picture feature of a camera, the summary snippet might show that there are 12 positive comments, including the following example: “Overall this is a good camera with a really good picture clarity.”

To continue that example, it might also show that there are two negative comments, including: “The pictures come out hazy if your hands shake even for a moment during the entire process of taking a picture.”
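
A sketch of this structured-summary step appears below, assuming that the features, opinion sentences, and their polarities have already been identified by the earlier steps; the data and function names are illustrative, not from the original system.

```python
from collections import defaultdict

def structured_summary(opinion_sentences):
    """opinion_sentences: (feature, polarity, sentence) triples, as
    produced by the feature and opinion identification steps."""
    by_feature = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, polarity, sentence in opinion_sentences:
        by_feature[feature][polarity].append(sentence)
    # Rank features by how often they are mentioned in the reviews.
    ranked = sorted(by_feature.items(),
                    key=lambda kv: len(kv[1]["positive"]) + len(kv[1]["negative"]),
                    reverse=True)
    for feature, groups in ranked:
        print(f"Feature: {feature}")
        for polarity in ("positive", "negative"):
            sents = groups[polarity]
            print(f"  {polarity.capitalize()}: {len(sents)}")
            for s in sents[:1]:               # show one exemplar per polarity
                print(f"    e.g., {s}")

structured_summary([
    ("picture", "positive", "Overall this is a good camera with really good picture clarity."),
    ("picture", "negative", "The pictures come out hazy if your hands shake."),
    ("battery", "positive", "Battery life is excellent."),
])
```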

Hu and Liu performed three types of intrinsic evaluation corresponding to the various system components described above, and found the coverage and performance to be very satisfactory.

Carenini et al. [21] explored extractive versus abstractive approaches for summarization of product reviews. They began by distinguishing between crude features that can be extracted from product reviews and user-defined features that represent a more abstract taxonomy of product characteristics. They derived a mapping between crude features and user-defined features; for example, the crude features “unresponsiveness” and “lag time” are mapped to the user-defined feature “delay between shots.” Each crude feature cf_i is associated with a set of polarity and strength evaluations ps(cf_i) derived from the product reviews. With the mapping between crude features and user-defined features, they can also derive the polarity and strength evaluations associated with each user-defined feature.

Their extractive approach relies on the MEAD open-source summarization framework [22]. A summarization feature they found particularly useful is defined in Eq. (11.1).

$$\mathrm{CFsum}(s_k) = \sum_{ps_i \in \mathrm{eval}(s_k)} |ps_i|, \qquad (11.1)$$

where s_k is a sentence and eval(s_k) is the set of crude-feature evaluations associated with that sentence. Their extractive system also used reranking to reduce redundancy.
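
As a sketch under these definitions, the feature amounts to summing the absolute polarity/strength values attached to each sentence; the evaluation values below are invented for illustration.

```python
def cf_sum(evaluations):
    """Eq. (11.1): sum of |polarity/strength| values of the crude-feature
    evaluations associated with a sentence (values here range -3..+3)."""
    return sum(abs(ps) for ps in evaluations)

# Toy data: e.g., s1 mentions "lag time" rated -2 and "lens" rated +3.
sentences = {"s1": [-2, 3], "s2": [1], "s3": [-3, -2, 2]}
ranked = sorted(sentences, key=lambda s: cf_sum(sentences[s]), reverse=True)
print(ranked)  # sentences carrying strong opinions rank first
```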

Carenini et al. compared this MEAD-based extractive system with an abstractive system in which they first calculate the direct importance of each user-defined feature udf_i according to Eq. (11.2):

$$\mathrm{dir\_moi}(udf_i) = \sum_{ps_k \in PS_i} |ps_k|^2, \qquad (11.2)$$

where PS_i is the set of polarity and strength evaluations directly associated with feature udf_i. They then use a dynamic greedy selection algorithm to determine the most important features for the product that will need to be described in the summary. They calculate a polarity for each user-defined feature, and also assess the distribution of opinions. If the distribution is bimodal, users are split on whether they like the feature, and both viewpoints will need to be included in the summary. They generated the actual summaries using the GEA natural language generation system [23].
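
The following sketch illustrates Eq. (11.2) together with a flat greedy selection; Carenini et al.'s actual algorithm is dynamic and propagates importance through the feature hierarchy, which this simplified version omits.

```python
def dir_moi(evaluations):
    """Eq. (11.2): sum of squared polarity/strength values, so features
    with many strong opinions dominate features with a few mild ones."""
    return sum(ps ** 2 for ps in evaluations)

def greedy_select(feature_evals, k=2):
    """Pick the k most important user-defined features to describe.
    The real algorithm is dynamic and hierarchy-aware; this is flat."""
    scored = sorted(feature_evals.items(),
                    key=lambda kv: dir_moi(kv[1]), reverse=True)
    return [feature for feature, _ in scored[:k]]

feature_evals = {
    "delay between shots": [-3, -2, -3, 1],
    "image quality": [3, 2, 3, 3, -1],
    "strap": [1],
}
print(greedy_select(feature_evals))
```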

A key finding of their evaluation is that the two approaches performed similarly well but that the extractive summaries sometimes fail to give a thorough overview of the opinions expressed, while the abstractive summaries can be repetitive or “robotic.” The complementary nature of the strengths and weaknesses suggests that hybrid approaches should be feasible.

Carenini and Cheung [24] further investigated such issues and found that abstractive approaches are superior precisely when the products are more controversial. Extraction may not suffice when there are many viewpoints to be represented.

Nishikawa et al. [25] developed an opinion summarization system with a focus on online product reviews and restaurant reviews. They framed the summarization task as an integer linear programming problem, a strategy that was explored previously in other domains as well [26, 27]. In their approach, maximizing the objective function results in a summary that covers the core concepts and has maximum coherence. More specifically, the objective function is given by Eq. (11.3):

$$\max\; \lambda \sum_{e_i \in E} w_i e_i \;+\; (1 - \lambda) \sum_{a_{i,j} \in A} c_{i,j} a_{i,j}. \qquad (11.3)$$

Here e_i represents an opinion, which is a tuple ⟨target, aspect, polarity⟩, where target is an entity and aspect is a feature of the entity. Each opinion has an associated weight w_i, for which they use the frequency of the opinion in the document. The term a_{i,j} represents an arc between sentences i and j, and c_{i,j} represents the coherence of that arc. This is a constrained optimization problem; example constraints include a bound on the length of the summary and consistency constraints that tie opinions to the sentences expressing them.
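
As a hedged sketch of this kind of formulation, the toy program below encodes a coverage-plus-coherence objective with the open-source PuLP solver; the weights, coherence scores, and particular constraints are invented stand-ins for the quantities Nishikawa et al. estimate from data.

```python
# Requires: pip install pulp
import pulp

# Toy data: three opinions with frequencies w_i, and coherence scores
# c_ij for arcs between the sentences expressing them.
w = {0: 3.0, 1: 2.0, 2: 1.0}
c = {(0, 1): 0.8, (1, 2): 0.4, (0, 2): 0.1}
lam = 0.7

prob = pulp.LpProblem("opinion_summary", pulp.LpMaximize)
e = {i: pulp.LpVariable(f"e_{i}", cat="Binary") for i in w}            # opinion chosen
a = {ij: pulp.LpVariable(f"a_{ij[0]}_{ij[1]}", cat="Binary") for ij in c}  # arc used

# Objective in the spirit of Eq. (11.3): coverage term plus coherence term.
prob += lam * pulp.lpSum(w[i] * e[i] for i in w) \
      + (1 - lam) * pulp.lpSum(c[ij] * a[ij] for ij in c)

# An arc can only be used if both its endpoints are in the summary.
for (i, j), var in a.items():
    prob += var <= e[i]
    prob += var <= e[j]
prob += pulp.lpSum(e.values()) <= 2   # stand-in for a summary length constraint

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([i for i in w if e[i].value() == 1])
```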

Nishikawa et al. used two types of intrinsic evaluation: ROUGE and human ratings. They outperformed two state-of-the-art systems according to the ROUGE evaluations, but there were no significant differences according to human readability scores.

Potthast and Becker [28] described an opinion summarization system for web comments that combines elements of summarization and visualization. Related to our earlier point about the risk of extractive summarization on noisy documents, Potthast and Becker state that it is “pointless” to extract sentences from web comments, and they instead focus on extracting words. Given a positive sentiment lexicon V^+ and a negative sentiment lexicon V^-, they calculate the semantic orientation of a word using Eq. (11.4):

$$\mathrm{SO}(w) = \sum_{w^+ \in V^+} \mathrm{assoc}(w, w^+) \;-\; \sum_{w^- \in V^-} \mathrm{assoc}(w, w^-), \qquad (11.4)$$

where assoc() is calculated as pointwise mutual information. If the semantic orientation of a word w exceeds a threshold ε, the word is added to V^+; if it falls below −ε, it is added to V^-. This has the effect of adapting the sentiment lexicon to the particular domain. They can then detect the most common positive and negative words in comments on Flickr images and YouTube videos. Finally, they arrange these words as positive and negative tag clouds.
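
A minimal sketch of this lexicon-adaptation step is shown below, with pointwise mutual information computed from toy co-occurrence counts; the corpus statistics and threshold value are invented for illustration.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information from co-occurrence counts."""
    if count_xy == 0:
        return 0.0
    return math.log2((count_xy * total) / (count_x * count_y))

def semantic_orientation(w, v_pos, v_neg, cooc, counts, total):
    """Eq. (11.4): association with positive seeds minus association
    with negative seeds."""
    return (sum(pmi(cooc.get((w, p), 0), counts[w], counts[p], total) for p in v_pos)
          - sum(pmi(cooc.get((w, n), 0), counts[w], counts[n], total) for n in v_neg))

# Toy counts from a hypothetical comment corpus.
counts = {"crisp": 40, "good": 500, "bad": 450}
cooc = {("crisp", "good"): 25, ("crisp", "bad"): 3}
so = semantic_orientation("crisp", {"good"}, {"bad"}, cooc, counts, total=100_000)
epsilon = 1.0
if so > epsilon:
    print("add 'crisp' to the positive lexicon")  # domain adaptation step
```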

Recently, Gerani et al. [29] have also explored how to leverage the discourse structure of product reviews to generate better abstractive summaries of those reviews. The discourse trees of all the reviews are aggregated in a graph that provides both content and structure for the summary. Two intrinsic crowdsourced evaluations in which users expressed pairwise preferences between summaries show that the proposed approach significantly outperforms extractive and abstractive baselines.

In this section we have focused on opinion-based summarization of web text such as weblogs and product reviews. We conclude the section by noting that there is also research on summarization of web text that does not focus on opinions. For example, Sharifi et al. [30] generate short summaries explaining why Twitter users are tweeting about a particular subject, but they do not incorporate opinion modeling in their work. There has also been some work on opinion summarization on data from nonweb sources (eg, telephone speech [31]).

3 Opinion Visualization

So far in this chapter we have discussed different methods for summarizing opinions from textual data in social media. However, a further challenge is how to support the user in understanding and analyzing the results of such summarization methods.

It is well known that information visualization techniques can be very useful to support the user in exploring a large amount of data [32]. Information visualization techniques take advantage of our visual information processing ability by creating visual representations of the large dataset. As a result, various interesting patterns, trends, and outliers can be much more easily observed.

An information visualization system initially shows an overview of the dataset; however, the user can get more detailed information on demand through interactive techniques. Showing such detailed information on demand is critical because we may lose important information in the summarization process [32]. For instance, if an opinion visualization for customer reviews presents the keyphrase “terrible display,” only after reading texts from original reviews may the user realize that the keyphrase is concerned with the low-resolution display of a smartphone. In such situations, interactive techniques can help the user to read the original text on demand to understand the context of the summary. Another reason for introducing interactivity is that when we deal with large datasets, given the display limitations, a static visualization cannot show all the data at once. In such situations, interactive techniques can deal with large datasets by changing the view from an overview to more detailed data through direct manipulation, filtering, and zooming [33].

3.1 Challenges for Opinion Visualization

A fundamental challenge of designing any information visualization system arises from the human and display limitations. Our perceptual and cognitive abilities are limited, which must be taken into account to design an effective visualization. Moreover, the limited display size often means there are trade-offs regarding what data should be shown and how they should be shown to the user.

Even if we assume that the visualization addresses these limitations successfully, it could be still ineffective if it does not match the specific task that users care about. In other words, a visualization can be comprehensible by humans but not well suited for the intended task. Therefore it is important to understand the user tasks and carefully choose the best possible visualization design by consideration of multiple alternatives.

There are also some design challenges that are very specific to opinion visualization for social media text data. One particular challenge arises from the noisy nature of social media text, as pointed out earlier in this chapter. As a consequence of noisy text, the results of text mining and summarization methods can be inaccurate. If the visualization does not account for such inaccuracy or uncertainty, the user may reach wrong conclusions after analyzing the data, or may lose trust in the system after realizing that the results are unreliable [34].

In this era of big data, another challenge emerges from the fact that social media data are often generated at a volume and velocity that cannot be handled by most of the existing tools. When we need to deal with such a large amount of data, many basic visualization techniques such as bar charts or scatter plots may not be sufficient to display the information. In such cases, a more complicated visualization may be needed to link multiple types of visualizations through interactions.

In the remainder of this section, we discuss a set of opinion visualization techniques for different text genres and how they address the specific design challenges that we have pointed out.

3.2 Text Genres and Tasks for Opinion Visualization

Most of the previous work on opinion visualization in social media can be broadly categorized on the basis of the text genres and subsequently by the task characteristics, as shown in Table 11.1. We now provide an overview of the key text genres and possible tasks for each of these genres.

Table 11.1

Summary of the Work on Opinion Visualization Discussed in This Chapter, Organized by Text Genre (Rows) and Task (Columns)

Text genre: Customer feedback
  Task: Explore opinions of a single entity
    Static data: Gamon et al. [35], Carenini et al. [36], Yatani et al. [37], Wu et al. [38]
    Streaming data: Hao et al. [39]
  Task: Compare opinions between multiple entities
    Static data: Liu et al. [40], Carenini and Rizoli [41]

Text genre: User reactions to large-scale events via microblogs
  Task: Explore opinions for large-scale events
    Static data: Diakopoulos et al. [42]
    Streaming data: Marcus et al. [43]
  Task: Discover opinion diffusion for large-scale events
    Static data: Wu et al. [34]

Text genre: Asynchronous conversations
  Task: Explore opinions in asynchronous conversations
    Static data: Hoque and Carenini [44]

3.2.1 Customer feedback

Early work on opinion visualization was done for customer review datasets with a focus on feature-based (aka aspect-based) sentiment analysis. In feature-based sentiment analysis, it is assumed that given an entity of interest (eg, a smartphone model), opinions are expressed on its features (eg, camera, battery, screen). The goal is to support a potential customer in either exploring what opinions were expressed on different features of an entity [35–37] or comparing multiple entities [40, 41]. In contrast, a business analyst may be interested in performing more complex tasks; for example, finding correlations between opinions and various data dimensions, such as geographical, demographic, and temporal aspects [38, 39].

3.2.2 User reactions to large-scale events

Many news stories or events trigger a huge amount of discussion in social media, especially via microblogs such as Twitter. Analyzing people’s opinions about such events is of great interest to business analysts, market intelligence analysts, and social scientists. The visualizations for large-scale events have mainly focused on supporting the user in analyzing overall sentiment trends; namely, how the sentiment evolves over time [42, 43] and how a particular opinion spreads among users [34].

3.2.3 Online conversations

Online conversations in social media such as Facebook discussions or blog conversations exhibit several unique characteristics: unlike microblogs or messaging [45], they do not have fixed-length comments; furthermore they have a finer conversational structure as participants often reply to a post and/or quote fragments of other comments [46]. These unique characteristics need to be taken into account when one is designing both mining and visualization techniques. For this text genre, users can be categorized into two groups on the basis of their activities: (1) participants who have already contributed to the conversations, and (2) nonparticipants who wish either to join the conversations or to analyze the conversations. A few visualization tools [44] have been developed recently to support the exploration of online conversations.

3.3 Opinion Visualization of Customer Feedback

Opinion visualizations for customer review datasets have mainly focused on two major tasks: (1) exploring opinions on a single entity and (2) comparing opinions across features for multiple entities.

3.3.1 Explore opinions on a single entity

Often it is useful to organize the features of an entity into a hierarchy to provide a more structured view of opinions. Treemapping is a hierarchical visualization technique that has been applied to represent the sentiment associated with different features of a product [35, 36]. Within a treemap, each node is represented as a rectangle, with nested rectangles indicating the descendants of the node. The Pulse system clusters the sentences of reviews into different topics and visualizes these topic clusters and their opinions using a treemap [35]. Carenini et al. [36] extracted a set of features from reviews and automatically mapped these features into a user-defined hierarchy. The resultant hierarchy is visualized using a treemap, where each feature is represented as a rectangle, as shown in Fig. 11.1. The size of the rectangle represents the importance of that feature in the hierarchy, while color represents the polarity (positive in green vs. negative in red) of customer opinions about that feature. In addition, a textual summary based on an abstractive method is presented to provide an overview of all reviews. The user can zoom into any node in the hierarchy by clicking on it, and eventually drill down to individual feature nodes decomposed into squares, one for each evaluation the feature received. The user can then click on an evaluation and see the original review from which the evaluation was extracted. This interface was tested in a user study, in which some participants liked the visualization, but most preferred the text-based interface.

Fig. 11.1 A screenshot showing the opinions expressed about a DVD player [36]. The treemap represents sentiment information for each feature in the user-defined hierarchy of features. The interface additionally shows a textual summary of the reviews (left).

Review Spotlight presents an alternative visualization based on tag clouds [37]. The interface shows a list of the adjective and noun word pairs that appeared most frequently in the reviews. The font size of a noun is proportional to its frequency in the reviews, whereas the font size of an adjective is determined by the frequency of the word pair it forms with the associated noun. The font color indicates the sentiment of each word: the hue represents sentiment polarity (green, red, and blue for positive, negative, and neutral, respectively), and the saturation represents sentiment strength (a darker tone indicates strong sentiment). A laboratory-based study revealed that participants were able to perform a given decision-making task faster with Review Spotlight than with a traditional customer review interface.
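
As a sketch of this encoding (not of the Review Spotlight implementation itself), the following computes the visual attributes such a tag cloud needs for each adjective-noun pair; the sizing constants and input data are invented.

```python
def tag_attributes(pair_counts, sentiment, base=10, scale=2.0):
    """Map (adjective, noun) pair statistics to tag-cloud attributes in
    the spirit of Review Spotlight: noun size ~ noun frequency, adjective
    size ~ pair frequency, hue ~ polarity, saturation ~ strength."""
    hues = {"positive": "green", "negative": "red", "neutral": "blue"}
    noun_freq = {}
    for (adj, noun), count in pair_counts.items():
        noun_freq[noun] = noun_freq.get(noun, 0) + count
    tags = []
    for (adj, noun), count in pair_counts.items():
        polarity, strength = sentiment[(adj, noun)]   # strength in [0, 1]
        tags.append({
            "noun": noun, "noun_size": base + scale * noun_freq[noun],
            "adj": adj, "adj_size": base + scale * count,
            "hue": hues[polarity], "saturation": strength,
        })
    return tags

pairs = {("great", "pizza"): 12, ("slow", "service"): 7, ("thin", "crust"): 4}
senti = {("great", "pizza"): ("positive", 0.9),
         ("slow", "service"): ("negative", 0.7),
         ("thin", "crust"): ("neutral", 0.3)}
for tag in tag_attributes(pairs, senti):
    print(tag)
```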

Both treemap-based and tag cloud–based visualizations have their own advantages and limitations. The tag cloud–based visualization provides an arguably more compact representation than a treemap. However, if the features of an entity can be organized into a hierarchy, a treemap would arguably be more suitable than a tag cloud.

While earlier work mainly focused on visualizing the sentiment of different features of an entity, recent work has focused more on tasks involving finding correlations between opinions and other important dimensions, such as temporal, spatial, and demographic information. OpinionSeer [38] supports the analysis of hotel customers’ feedback data with a combination of tag clouds, scatter plots, and radial visualizations. Here, opinions are represented as points on a scatter plot placed inside a triangle. The position of an opinion is determined by its distance from the three triangle vertices, which represent the most positive, most negative, and most uncertain sentiment. For example, an opinion shown near the lower left of the triangle indicates a highly negative opinion. The opinion rings surrounding the triangle are designed for exploring correlations between the customer opinions and other data dimensions, such as time, location, and demographic information (eg, age range). While OpinionSeer may support the user in performing more complex analytical tasks than earlier approaches, a critical limitation is that presenting a large number of reviews within a scatter plot may lead to overlapping points and visual clutter.

With the proliferation of web-based social media, new challenges have emerged because of the high volume and high velocity at which customer feedback data can be generated, and visualizations need to scale along both dimensions. Considering these challenges, Hao et al. [39] presented a visual analysis system that facilitates exploration and analysis of customer feedback streams based on the geotemporal aspects of opinions. Geotemporal aspects are important to analysts because they can help answer critical questions; for instance, how a product or service is received in different cities or states over time. Hao et al. present sentiment information as colored pixels on a map. The challenge then is how to represent the large number of opinions as points on the map, especially in high-density areas. To address this challenge, they apply a pixel placement algorithm that replaces overlapping points with a circle of points, positioning each point at the nearest free position within the circle. They also display the most significant term in each geographical location, where the color of a key term indicates the average sentiment value of all the sentences containing that term.
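
A simplified sketch of such a collision-resolution step is given below; it searches outward in square rings for the nearest free grid cell, whereas Hao et al.'s algorithm arranges displaced points in circles, so treat this as an illustration of the idea rather than their method.

```python
def place_points(points, max_radius=100):
    """Greedy pixel placement: if a cell is taken, search outward in
    growing rings and take the nearest free cell to the original spot."""
    occupied = set()
    placed = []
    for (x, y) in points:
        for r in range(max_radius):
            ring = [(x + dx, y + dy)
                    for dx in range(-r, r + 1)
                    for dy in range(-r, r + 1)
                    if max(abs(dx), abs(dy)) == r]       # cells at radius r
            free = [c for c in ring if c not in occupied]
            if free:
                best = min(free, key=lambda c: (c[0] - x) ** 2 + (c[1] - y) ** 2)
                occupied.add(best)
                placed.append(best)
                break
    return placed

print(place_points([(5, 5), (5, 5), (5, 5)]))  # overlaps pushed to neighbors
```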

3.3.2 Compare opinions for multiple entities

A common task performed by consumers is making a preferential choice (ie, given a set of alternative products or services, finding the best alternative on the basis of the opinions expressed about them). In such a case it is useful to compare the alternatives across the different features of those products. Such comparisons of quantitative values can be made easier with small multiples of bar charts. For instance, Opinion Observer presents multiple bar graphs, where each bar graph represents the sentiment value of a feature for the different alternatives [40]. Carenini and Rizoli [41] focused on a similar task of comparing opinions using small multiples of bar graphs. However, their visualization addresses two key limitations of Opinion Observer. First, unlike the earlier system, they compare entities across features on the basis of three levels of opinion strength for both positive polarity (+3, +2, +1) and negative polarity (−3, −2, −1), thus providing more accurate comparisons. Another difference is that they organize the features into a hierarchy and then represent the opinions on the features of each entity using stacked bars.
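
A minimal sketch of this kind of stacked-bar comparison, using matplotlib with invented opinion counts for a single feature, is shown below; the six strength levels follow the scale described above, and the color scheme is an assumption.

```python
# Requires: pip install matplotlib numpy
import matplotlib.pyplot as plt
import numpy as np

# Opinion counts for one feature ("battery") across two phones, broken
# down by strength level (values are invented for illustration).
levels = ["-3", "-2", "-1", "+1", "+2", "+3"]
colors = ["#67001f", "#d6604d", "#f4a582", "#d1e5f0", "#4393c3", "#053061"]
counts = {"Phone A": [2, 5, 8, 10, 7, 3],
          "Phone B": [6, 9, 4, 5, 3, 1]}

x = np.arange(len(counts))
bottom = np.zeros(len(counts))
# Stack one segment per strength level on each entity's bar.
for level, color, row in zip(levels, colors,
                             np.array(list(counts.values())).T):
    plt.bar(x, row, bottom=bottom, color=color, label=level)
    bottom += row
plt.xticks(x, counts.keys())
plt.ylabel("Number of opinions (battery)")
plt.legend(title="Strength")
plt.show()
```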

3.4 Opinion Visualization of User Reactions to Large-Scale Events via Microblogs

Recent work on visualizing opinions has focused on supporting the exploration of people’s reactions to large-scale events [42, 43]. Diakopoulos et al. [42] presented Vox Civitas, which supports journalists in analyzing and making sense of the large number of tweets generated around a news event. The visual interface shows a sentiment timeline that aggregates sentiment responses into four categories: positive, negative, controversial, and neutral (see Fig. 11.2). In addition, the system shows the volume of tweets over time, keywords over time, and the detailed Twitter messages.

Fig. 11.2 The Vox Civitas user interface [42], showing the Twitter messages along with the volume of tweets, the trend of overall sentiment, and keywords over time.

A limitation of this visualization is that it is designed to deal with an archived collection of tweets, whereas we often need to analyze a huge number of tweets about an event in real time. To address this challenge, TwitInfo [43] supports the user in browsing a large collection of tweets in real time. The system applies a streaming algorithm that identifies peaks (subevents) within the event timeline and labels these peaks with meaningful text from the related tweets. Users can then zoom into the timeline to discover further peaks and subevents. The system also presents the total proportion of positive and negative tweets during the event or subevent by means of a pie chart.

Another key analytical task in exploring reactions to large-scale events is to investigate how opinions propagate among users. This task can be very important in several situations; for instance, a business analyst may want to detect the diffusion of negative opinions early so that her company can address the issue before it goes viral. OpinionFlow [34] focuses on this task by visualizing the spread of opinions among participants with a combination of a density map and a Sankey diagram. The Sankey diagram shows the flow of users across different topics in an event over time, where each horizontally arranged strip is associated with a topic and each line connecting two topic strips represents the transition of user attention between the corresponding topics. Within the Sankey diagram, density maps are created by application of kernel density estimation with scaled and oriented Gaussian kernels to convey the density and orientation of opinion diffusion among users. Finally, a hierarchical topic structure represented as a stacked tree allows analysts to select topics of interest at different levels of granularity and examine how opinions spread for those topics.

3.5 Visualizing Opinions in Online Conversations

Today it is quite common for people to exchange hundreds of comments in online conversations (eg, blogs, Facebook conversations). However, it can often be very difficult to analyze and gain insights from such long conversations. To address this problem, a number of visualization approaches have been proposed (eg, [44, 47–49]). Among them, ConVis [44] is the most recently proposed system, and it was designed with the unique characteristics of online conversations in mind.

As shown in Fig. 11.3, ConVis consists of an overview (the Thread Overview) of the conversation along with two primary facets, topics and authors, which are presented circularly around this overview. The Thread Overview visually represents the sentiment distribution of each comment of the conversation as a stacked bar. A set of five diverging colors is used to visualize the distribution of sentiment orientation within a comment in a perceptually meaningful order, ranging from purple (highly negative) to orange (highly positive). In addition, three types of metadata are encoded in the stacked bar: the comment length (height), the ordinal position of the comment in the conversation (vertical position), and the depth of the comment within the thread (horizontal position). To indicate the topic-comment-author relationships, the facet elements are connected to their corresponding comments in the Thread Overview via curved links. Finally, the Conversation View shows the actual text of the comments as a scrollable list.

Fig. 11.3 A snapshot of ConVis for exploring blog conversation. The Thread Overview visually represents the whole conversation encoding the thread structure and how the sentiment is expressed for each comment (middle), the Facet Overview presents topics and authors circularly around the Thread Overview, and the Conversation View presents the actual conversation in a scrollable list (right). Here topics and authors are connected to their related comments via curved links.

The user can start exploring the conversation by hovering over topics, which highlights the connecting curved links and the related comments in the Thread Overview. In this way, one can quickly understand how topics relate to different comments and authors. If the user clicks on a topic, a thick vertical outline is drawn next to the corresponding comments in the Thread Overview; these outlines are also mirrored in the Conversation View. Besides exploring by topics and authors, the reader can browse individual comments by hovering over or clicking on them in the Thread Overview. When the mouse hovers over a comment, its topic is highlighted; when the user clicks on a comment, the actual text of that comment is shown in the Conversation View (by scrolling). A user study [50] showed that ConVis outperformed traditional interfaces on several subjective metrics (eg, usefulness, enjoyability).

3.6 Current and Future Trends in Opinion Visualization

Visualizing uncertainty and scaling up for big data are two critical aspects of visualizing opinions that have been underresearched in the past but are currently receiving a lot of attention.

3.6.1 Visualizing uncertainty in opinions

Uncertainty may arise when one is making a prediction with noisy or incomplete data, which is quite common for opinion mining in social media. When the results are presented to the user without the uncertainty being taken into account, she may reach incorrect conclusions. To tackle this problem, it is important to convey the degree of uncertainty to the user as auxiliary information. In this way users can decide how confident they should be in the conclusions they are drawing from the data.

While several techniques for visualizing uncertainty have been proposed in the generic information visualization literature [51–53], they have rarely been applied to opinion visualization. One notable exception is the OpinionSeer visualization system [38], where the degree of uncertainty of an opinion is encoded along with the positive and negative strength of its sentiment by use of the distance to the three vertices of a triangle. By looking at the position of an opinion within the triangle, one can perceive how uncertain that opinion is.

Recall that bar graph–based visualization was used in some early work on opinion visualization [40, 41]. A common approach to encoding uncertainty within such bar graphs is to introduce error bars representing the confidence interval. Unfortunately, recent research shows that error bars have many drawbacks from a perceptual perspective; therefore new techniques such as gradient plots (which use transparency within the bars to encode uncertainty) and violin plots (which use width) should be considered as alternatives to bar charts with error bars [54]. For text visualization, other visual attributes, such as the color hue or saturation of the background, or the border thickness of the box surrounding the text, have been used to encode uncertainty [52], and these could be adopted for opinion visualization involving text data.
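
As a rough sketch of the gradient-plot idea, the following matplotlib snippet draws each bar solid up to the lower bound of its uncertainty interval and then fades it out with transparency; this is a simplified reading of the encoding proposed in [54], with invented sentiment means and standard errors.

```python
# Requires: pip install matplotlib numpy
import matplotlib.pyplot as plt
import numpy as np

# Mean positive sentiment and standard error per feature (invented).
features = ["screen", "battery", "camera"]
means = np.array([0.6, 0.35, 0.5])
sems = np.array([0.05, 0.15, 0.1])

x = np.arange(len(features))
for i in range(len(features)):
    lo = means[i] - 2 * sems[i]
    plt.bar(x[i], lo, width=0.6, color="steelblue")  # confident region
    # Fade the bar out across the uncertain region instead of drawing
    # a hard error bar.
    edges = np.linspace(lo, means[i] + 2 * sems[i], 20)
    for j, (a, b) in enumerate(zip(edges[:-1], edges[1:])):
        plt.bar(x[i], b - a, bottom=a, width=0.6,
                color="steelblue", alpha=1.0 - j / len(edges))
plt.xticks(x, features)
plt.ylabel("Mean positive sentiment")
plt.show()
```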

3.6.2 Scaling up for big data

As social media text data become larger and more complex at an unprecedented rate, new challenges have emerged for the visualization of opinions. In particular, we discuss two key aspects of big data that need to be addressed for opinion visualization:

1. Volume: Opinion visualization needs to deal with the challenges that arise from massive amounts of data. Unfortunately, most of the visualizations discussed in this chapter would be inadequate for handling very large amounts of raw opinion data. To tackle such situations, data reduction methods such as filtering, sampling, and aggregation should be applied before visualization [55]. The reduced data can then be presented along with various interactive techniques that progressively reveal the data from a high-level overview down to low-level details, similarly to what was done in [21].

2. Velocity: The high velocity of big data poses enormous challenges for opinion mining and visualization methods. For instance, immediately after a product is released, a business analyst may want to analyze text streams in social media to identify problems or issues, such as whether customers have started complaining about a feature of the product. In such cases, timely analysis of the streaming text can be critical for the company's reputation. This requires a combination of efficient opinion mining and streaming text visualization, a subject beyond the scope of this chapter; interested readers are referred to [56] for various ways of visualizing streaming text in real time.

4 Conclusion

In this chapter we have discussed how a large set of opinions extracted from social media can be effectively summarized and visualized. The generation of textual summaries in this domain is particularly challenging because the source documents tend to be noisy and ungrammatical, and are often filled with diverse and conflicting opinions. In the first part of the chapter, we described how these challenges can be addressed by the application of both extractive and abstractive summarization methods. In the second part of the chapter, we discussed how large sets of opinions extracted from social media can be effectively visualized. We focused on three key text genres that are typically rich in opinions: customer feedback in the form of reviews, user reactions to large-scale events via microblogs, and online blog conversations.

References

[1] Luhn H.P. The automatic creation of literature abstracts. IBM J. Res. Dev. 1958;2(2):159–165.

[2] Edmundson H.P. New methods in automatic extracting. J. ACM. 1969;16(2):264–285.

[3] Jones K.S. Automatic summarizing: factors and directions. In: Mani I., Maybury M., eds. Advances in Automatic Text Summarization. MITP; 1999:1–12.

[4] Mani I. Automatic Summarization. Amsterdam, The Netherlands: John Benjamins; 2001.

[5] Nenkova A., McKeown K. Automatic summarization. Found. Trends Inform. Retr. 2011;5(2–3):103–233.

[6] Torres-Moreno J.-M. Automatic Text Summarization. London, UK: Wiley; 2015.

[7] Kleinbauer T., Murray G. Summarization. In: Multimodal Signal Processing: Human Interactions in Meetings. Cambridge, UK: Cambridge University Press; 2012:170–192.

[8] Carenini G., Murray G., Ng R. Methods for Mining and Summarizing Text Conversations. San Rafael, CA: Morgan & Claypool Publishers; 2011.

[9] Fujii Y., Kitaoka N., Nakagawa S. Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization. In: Proceedings of Interspeech. 2007:2801–2804.

[10] Zhang J., Chan H., Fung P., Cao L. Comparative study on speech summarization of broadcast news and lecture speech. In: Proceedings of Interspeech. 2007:2781–2784.

[11] Farzindar A., Inkpen D. Natural Language Processing for Social Media. San Rafael, CA: Morgan & Claypool Publishers; 2015.

[12] Zajic D., Dorr B., Lin J. Single-document and multi-document summarization techniques for email threads using sentence compression. Inform. Process. Manage. 2008;44:1600–1610.

[13] Liu F., Liu Y. Towards abstractive speech summarization: exploring unsupervised and supervised approaches for spoken utterance compression. IEEE Trans. Audio Speech Lang. Process. 2013;21(7):1469–1480.

[14] Murray G. Abstractive summarization as a Markov decision process. In: Advances in Artificial Intelligence. New York: Springer; 2015:212–219.

[15] Murray G., Carenini G., Ng R. Generating and validating abstracts of meeting conversations: a user study. In: Proceedings of INLG. 2010:105–113.

[16] Lin C.-Y. Rouge: a package for automatic evaluation of summaries. In: Proceedings of ACL. 2004:74–81.

[17] Liu F., Liu Y. Correlation between rouge and human evaluation of extractive meeting summaries. In: Proceedings of ACL. 2008:201–204.

[18] Nenkova A., Passonneau R., McKeown K. The pyramid method: incorporating human content selection variation in summarization evaluation. ACM Trans. Speech Lang. Process. 2007;4(2):4.

[19] Passonneau R., Chen E., Guo W., Perin D. Automated pyramid scoring of summaries using distributional semantics. In: Proceedings of ACL. 2013:143–147.

[20] Hu M., Liu B. Mining and summarizing customer reviews. In: Proceedings of KDD. 2004:168–177.

[21] Carenini G., Ng R., Pauls A. Multi-document summarization of evaluative text. In: Proceedings of EACL. 2006:3–7.

[22] Radev D., Teufel S., Saggion H., Lam W., Blitzer J., Qi H., Celebi A., Liu D., Drabek E. Evaluation challenges in large-scale document summarization. In: Proceedings of ACL. 2003:375–382.

[23] Carenini G., Moore J. Generating and evaluating evaluative arguments. Artif. Intell. 2006;170(11):925–952.

[24] Carenini G., Cheung J.J. Extractive vs. NLG-based abstractive summarization of evaluative text: the effect of corpus controversiality. In: Proceedings of INLG. 2008:33–41.

[25] Nishikawa H., Hasegawa T., Matsuo Y., Kikui G. Opinion summarization with integer linear programming formulation for sentence extraction and ordering. In: Proceedings of ACL. 2010:910–918.

[26] Gillick D., Riedhammer K., Favre B., Hakkani-Tür D. A global optimization framework for meeting summarization. In: Proc. ICASSP. 2009:4769–4772.

[27] Xie S., Favre B., Hakkani-Tür D., Liu Y. Leveraging sentence weights in a concept-based optimization framework for extractive meeting summarization. In: Proceedings of Interspeech. 2009:1503–1506.

[28] Potthast M., Becker S. Opinion summarization of web comments. In: Advances in Information Retrieval. New York: Springer; 2010:668–669.

[29] Gerani S., Mehdad Y., Carenini G., Ng R., Nejat B. Abstractive summarization of product reviews using discourse structure. In: Proceedings of EMNLP. 2014:1602–1613.

[30] Sharifi B., Hutton M.-A., Kalita J. Summarizing microblogs automatically. In: Proceedings of NAACL. 2010:685–688.

[31] Wang D., Liu Y. A pilot study of opinion summarization in conversations. In: Proceedings of ACL. 2011:331–339.

[32] Munzner T. Visualization Analysis and Design. Boca Raton, FL: CRC Press; 2014.

[33] Shneiderman B. The eyes have it: a task by data type taxonomy for information visualizations. In: Proceedings of IEEE Symposium on Visual Languages. 1996:336–343.

[34] Wu Y., Liu S., Yan K., Liu M., Wu F. OpinionFlow: visual analysis of opinion diffusion on social media. IEEE Trans. Vis. Comput. Graphics. 2014;20(12):1763–1772.

[35] Gamon M., Aue A., Corston-Oliver S., Ringger E. Pulse: mining customer opinions from free text. In: Advances in Intelligent Data Analysis VI. New York: Springer; 2005:121–132.

[36] Carenini G., Ng R., Pauls A. Interactive multimedia summaries of evaluative text. In: Proceedings of IUI. 2006:124–131.

[37] Yatani K., Novati M., Trusty A., Truong K. Review spotlight: a user interface for summarizing user-generated reviews using adjective-noun word pairs. In: Proceedings of SIGCHI Conference on Human Factors in Computing Systems. 2011:1541–1550.

[38] Wu Y., Wei F., Liu S., Au N., Cui W., Zhou H., Qu H. OpinionSeer: interactive visualization of hotel customer feedback. IEEE Trans. Vis. Comput. Graphics. 2010;16(6):1109–1118.

[39] Hao M.C., Rohrdantz C., Janetzko H., Keim D.A., Dayal U., Haug L.-E., Hsu M., Stoffel F. Visual sentiment analysis of customer feedback streams using geo-temporal term associations. Inform. Vis. 2013;12(3–4):273–290.

[40] Liu B., Hu M., Cheng J. Opinion Observer: analyzing and comparing opinions on the web. In: Proceedings of WWW. 2005:342–351.

[41] Carenini G., Rizoli L. A multimedia interface for facilitating comparisons of opinions. In: Proceedings of IUI. ACM; 2009:325–334.

[42] Diakopoulos N., Naaman M., Kivran-Swaine F. Diamonds in the rough: social media visual analytics for journalistic inquiry. In: Proceedings of IEEE VAST. 2010:115–122.

[43] Marcus A., Bernstein M.S., Badar O., Karger D.R., Madden S., Miller R.C. Twitinfo: aggregating and visualizing microblogs for event exploration. In: Proceedings of CHI. 2011:227–236.

[44] Hoque E., Carenini G. ConVis: a visual text analytic system for exploring blog conversations. Comput. Graphics Forum (Proc. EuroVis). 2014;33(3):221–230.

[45] Ritter A., Cherry C., Dolan B. Unsupervised modeling of twitter conversations. In: Proceedings of ACL. 2010:172–180.

[46] Joty S., Carenini G., Ng R.T. Topic segmentation and labeling in asynchronous conversations. J. Artif. Intell. Res. 2013;47:521–573.

[47] Sack W. Conversation map: an interface for very-large-scale conversations. J. Manage. Inform. Syst. 2000;17(3):73–92.

[48] Dave K., Wattenberg M., Muller M. Flash forums and forumReader: navigating a new kind of large-scale online discussion. In: Proc. ACM Conf. CSCW. 2004:232–241.

[49] Kerr B. Thread arcs: an email thread visualization. In: Proceedings of the IEEE Symposium on Information Visualization. 2003:211–218.

[50] Hoque E., Carenini G. ConVisIT: interactive topic modeling for exploring asynchronous online conversations. In: Proceedings of IUI. 2015:169–180.

[51] Pang A.T., Wittenbrink C.M., Lodha S.K. Approaches to uncertainty visualization. Vis. Comput. 1997;13(8):370–390.

[52] Collins C., Carpendale S., Penn G. Visualization of uncertainty in lattices to support decision-making. In: Proceedings of EuroVis. 2007:51–58.

[53] Sanyal J., Zhang S., Bhattacharya G., Amburn P., Moorhead R. A user study to compare four uncertainty visualization methods for 1D and 2D datasets. IEEE Trans. Vis. Comput. Graphics. 2009;15(6):1209–1218.

[54] Correll M., Gleicher M. Error bars considered harmful: exploring alternate encodings for mean and error. IEEE Trans. Vis. Comput. Graphics. 2014;20(12):2142–2151.

[55] Liu Z., Jiang B., Heer J. imMens: real-time visual querying of big data. In: Proceedings of EuroVis. 2013:421–430.

[56] Keim D.A., Krstajic M., Rohrdantz C., Schreck T. Real-time visual analytics for text streams. Computer. 2013;46(7):47–55.

