Chapter 6

Sentiment Analysis in Social Networks

A Machine Learning Perspective

E. Fersini University of Milano-Bicocca, Milan, Italy

Abstract

Social networks represent an emerging and challenging sector where the natural language expressions of people can be easily reported through short but meaningful text messages. Key information that can be grasped from social environments relates to the polarity of text messages (ie, positive, negative, or neutral). In this chapter we present a literature review regarding polarity classification in social networks, distinguishing between supervised, unsupervised, and semisupervised machine learning models. In particular, the most recent advancements of the state of the art are presented, focusing on the real nature of the messages that are actually provided in an informal and networked environment.

Keywords

Sentiment analysis; Microblogs; Machine learning; Social network; Language; Relationships

1 Introduction

The continuous adoption of online social networks has generated several opportunities for capturing interest in multiple aspects, both from an individual user's point of view and from a collective perspective. The interconnections that users create among themselves contribute to the establishment of a virtual discussion environment that allows people to express themselves and to influence others on the net, and provides the possibility of belonging to multiple virtual communities. Online social networks are therefore creating a digital revolution that is affecting the life of every individual on the net. People are free to express their thoughts and to have their virtual space in which to be themselves, with very few constraints and limits. The shift from social networks to online social networks is enabling people to talk about their emotions and opinions and, more importantly, to share and spread their thoughts with other people with no geographical barrier. This revolution has therefore contributed to the rise of novel sentiment analysis tasks from a machine learning and natural language processing point of view. From a methodological perspective, sentiment analysis models need to consider the actual virtual environment (where users interact) to capture and analyze what people think. The definition of complex supervised, unsupervised, and semisupervised models able to take into account the novel nature of expressions (language used on the network) and the novel way of communication (relationships established in the virtual space) becomes a mandatory step.

Considering the new scenario, we present in this chapter the most recent advancements regarding sentiment analysis in online social networks. The chapter is structured as follows. In Section 2 the key elements that characterize the online social networks for sentiment analysis purposes are described. In Section 3 a literature review of sentiment analysis from a machine learning perspective is presented, focusing on the nature of the social networks, which are actually rich in informal languages and relationships among users. According to these main characteristics, we analyze the state of the art by distinguishing between supervised, semisupervised, and unsupervised models. In Section 4, some possible applications of polarity classification are outlined. In Section 5 an overview of some possible future directions for the next generation of sentiment analysis models is presented. Finally, conclusions are drawn in Section 6.

2 Polarity Classification in Online Social Networks: The Key Elements

User-generated content has proved to be of paramount importance both for users and for organizations/institutions: on one hand, people can share their opinions in an unconstrained and unbiased environment, and on the other hand, corporations can extract useful insights for their decision-making processes. To quantify what people think from qualitative raw textual data, a polarity classification task, aimed at detecting positive, negative, or neutral text, is necessary. Although there is an extensive state of the art regarding the analysis of well-formed documents such as newspaper articles, reviews, and official news, there are still several open issues to address to properly tackle the real nature of the messages in the available online social networks. In this novel communication scenario, the sentiment analysis solutions that need to be defined not only inherit a multitude of issues from traditional sentiment analysis and natural language processing but also have to deal with new and complex challenges. Social network messages are one of the most challenging types of text to deal with. This complexity is mainly due to the following characteristics:

• Short messages: Social network messages are usually very short, but rich in embedded semantics. To make explicit the hidden information, several methods have been proposed to bridge the semantic gap between the few words written by the user and the corresponding more complex meaning. Among the most recent investigations able to exploit extra information included in the text messages, we can broadly distinguish between approaches working on additional textual cues, such as hashtags in Twitter [1, 2], and methods based on multimodal analysis, such as text and images [3, 4].

• Noisy content: An additional aspect that should be explicitly modeled when one is dealing with sentiment analysis in social networks relates to badly formed texts, where vocabulary, spelling, and syntax represent a linguistic challenge. Social network messages are characterized by colloquial expressions, abbreviations, emoticons, word lengthening, irregular capitalization, and emphatic expressions, and they usually do not conform to canonical grammatical rules. To deal with this novel type of language, two main paradigms can be followed: adapting and enhancing computational approaches of natural language processing to fit the text (ie, domain adaptation) [5–7] or adapting text to fit the language technologies (ie, normalization) [8, 9].

• Dynamics: Social network content is characterized by a strong temporal dynamic because of the continuous evolution of trending topics and because of their potential to open a debate with content provided by other users. Addressing a polarity classification task, taking into account the temporal dimension on which opinions are provided, is a key challenge to capture the change in user interests and therefore to model the volatility of attitudes toward topics over time [10–13].

• Explicit and implicit information: The users of online social networks not only produce content but they are usually characterized by their own distinctive features (eg, gender, location, age), which can be used to improve a polarity detection task. Understanding the correlation between real-time expressions of individuals and additional exogenous/endogenous variables could help to derive more effective sentiment models [14–16].

• Multilingualism: Thanks to the worldwide diffusion of social networks, the computational models for sentiment analysis should be able to deal with the multitude of languages available: less than 50% of tweets are written in English, and tweets in Japanese, Spanish, Portuguese, and German also feature prominently [17]. Because most of the technologies have focused on English, adaptation to new languages is still an open issue. Very few investigations work on non-English text [18–21], but some recent studies are trying to move the attention of the sentiment analysis community to adaptation techniques for cross-lingual analysis [22–25].

• Relationships: The key aspect of online social networks is that they are rich in both content and relationships, pointing out new challenges and opportunities from the sentiment analysis perspective. Content can contribute to inferring the user sentiment even if the network structure is not informative. Conversely, the relationships established by the users can contribute to reasoning about a user when limited and ambiguous content information is available. Here the concept of homophily [26] plays a fundamental role: a contact among similar people is expected to occur at a higher rate than among dissimilar people, implying that differences in terms of social characteristics can be converted into network distances.

Among the above-mentioned characteristics, the two key elements on which the sentiment analysis community is working are the rich nature of the natural language and the rich nature of the network. Regarding the language used in social networks, the main effort of the sentiment analysis community has been devoted to capturing and modeling the typical expression on the network through text, part-of-speech tags (eg, adverbs and adjectives), and paralinguistic content (eg, emojis, slang, hashtags) to derive more effective prediction models. In this context, most of the literature is related to the traditional statistical learning paradigm, where the content obeys a specific language typical of the social network and is independent of the content of other users (independent and identically distributed assumption).

Concerning the nature of the network, different types of relationships based on different types of homophily have been investigated to derive computational models able to disregard the independent and identically distributed assumption. In [27], two types of homophily are distinguished: status homophily, in which similarity is based on informal, formal, or ascribed status, and value homophily, which is based on values, attitudes, and beliefs. Status homophily includes the major sociodemographic dimensions such as race, ethnicity, gender, or age, and acquired characteristics such as religion, education, occupation, or behavior patterns. Value homophily includes the wide array of internal states presumed to shape our orientation toward future behavior: attitude, belief, and value similarity lead to attraction and interaction. According to these distinctions, two kinds of approaches have been developed in the literature: methods grounded on status homophily relationships (such as friendships in Facebook and following/follower in Twitter) and approaches that exploit relationships based on value homophily (such as retweets in Twitter, +1 in Google+, and “like” in Facebook).

In the following sections the main advances in the machine learning literature will be discussed, focusing on the above-mentioned two main characteristics: natural language and relationships. A graphical representation that summarizes the key elements of machine learning approaches for sentiment analysis purposes is given in Fig. 6.1.

Fig. 6.1 Key elements of social networks.

The discussion of the state of the art regarding sentiment analysis techniques will be focused on the machine learning perspective, distinguishing between (1) supervised learning, where labeled observations are used to train a classifier, (2) unsupervised learning, where observations given to the learner are unlabeled, and (3) semisupervised learning, where both labeled and unlabeled data are used to derive the final sentiment model.

3 Polarity Classification: Natural Language and Relationships

3.1 Leveraging Natural Language

Sentiment classification is usually treated as a traditional text classification problem where, instead of classifying documents of different topics (eg, politics, sciences, and sports), one estimates positive, negative, and neutral classes (or scores on an interval such as [−5, 5] or [0, 100]). According to this perspective, any existing supervised learning method, such as naïve Bayes and support vector machine classifiers, can be easily applied. However, in contrast to well-formed text, the natural language in online social networks has several distinctive features that can be exploited to approach the polarity classification problem:

• Parts of speech. Words belonging to different parts of speech should be treated according to their linguistic role (nouns, verbs, pronouns, etc.). When one is dealing with user-generated content, part-of-speech tags can be inferred if one adopts specific language models that are able to capture the structure of the expressive forms used in online social networks [5, 6]. Among the available parts of speech, it has been shown that some specific elements (eg, adjectives, adverbs, interjection) are important indicators of subjectivity [28], polarity [29], and irony [30].
An additional challenge relates to sentiment shifters; that is, those expressions used to alter the sentiment orientations (ie, from positive to negative or vice versa). Negation words (eg, not, never, cannot) [31], modal auxiliary verbs (eg, would, should, could) [32], and presuppositional words (eg, strongly, smartly) [33] are examples of sentiment shifters that have been shown to be relevant for sentiment analysis purposes.

• Paralinguistic content. Paralinguistic content consists of those pragmatic particles typically used in social networks to convey a given message:

• Emoticons are introduced as expressive, nonverbal components into the written language, mirroring the role played by facial expressions in speech [34]. To take advantage of these sentiment signals, several investigations have been conducted in different social networks [35–38]. Examples of lexicons containing positive and negative emoticons can be found in [39, 40].

• Initialisms for emphatic expressions are an additional paralinguistic element used in nonverbal communication in online social networks. Although they act as a constituent, these emphatic abbreviations have been shown to play a role similar to that of emoticons [41]: expressions such as “ROFL” (meaning “rolling on floor laughing”) can represent positive expressions, while abbreviations such as “BM” (meaning “bad manners”) can denote negative statements.

• Onomatopoeic expressions in online social networks can help to convey emotions [42]: some expressions such as “bleh” and “wow” are clear indicators of negative and positive emotional states and therefore can help to distinguish the polarity of a text message.

• Word lengthening: Word styling (such as bold, italic, and underlining) is not always available in online social network platforms and is often replaced by linguistic conventions. Word lengthening¹ (also known as expressive lengthening or word stretching) is an example of such novel linguistic conventions, which are nowadays extremely popular in online social networks. In [43, 44] it was shown that this commonly observed phenomenon is an indication of emphasis that is strongly associated with subjectivity and sentiment.

• Capital letters. Positive and negative expressions are commonly reported by capitalization of some specific words (eg, “#StarWars was AMAZING!”) to express the intensity of the user sentiments. To take advantage of this indicator, it is possible either to give it more weight than other commonly occurring words when one is creating the text representation of the content [45] or to consider this indication as an additional feature (eg, count of capitalized words) to be exploited by any learning algorithm.

• Hashtags. A large number of posts in online social networks are characterized by a wide range of user-defined hashtags. Some of these tags are defined and used to express one or more specific sentiment associated with the corresponding text. However, the distinction between sentiment hashtags and topic hashtags is a challenge that needs to be properly addressed for polarity classification purposes [46–48].
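Several of the cues listed above (capitalization, word lengthening, hashtags, and negation words) can be turned into simple numeric features for any learning algorithm. The following is a minimal sketch, not a method taken from the cited works: the function name `surface_features`, the tokenization pattern, the "three or more repeated letters" rule for lengthening, and the short negation list are all illustrative assumptions.

```python
import re

# Illustrative negation list (an assumption, not a published lexicon).
NEGATIONS = {"not", "never", "cannot", "no"}

def surface_features(message):
    """Count a few surface cues of sentiment in one social network message."""
    # Tokens are hashtags/words, or runs of punctuation.
    tokens = re.findall(r"#?\w+|[^\w\s]+", message)
    # Plain words only: hashtags and punctuation are excluded.
    words = [t for t in tokens if t[0].isalnum()]
    return {
        # Fully capitalized words of at least two characters, e.g. "AMAZING".
        "all_caps": sum(1 for w in words if len(w) > 1 and w.isupper()),
        # Words with a letter repeated three or more times, e.g. "happyyyy".
        "lengthened": sum(1 for w in words if re.search(r"([A-Za-z])\1{2,}", w)),
        # Tokens of the form "#tag".
        "hashtags": sum(1 for t in tokens if t.startswith("#") and len(t) > 1),
        # Occurrences of negation words, a common kind of sentiment shifter.
        "negations": sum(1 for w in words if w.lower() in NEGATIONS),
    }

features = surface_features("#StarWars was AMAZING!")
```

Such counts only flag the presence of a cue. Deciding whether a hashtag carries sentiment or topic, or how a negation changes polarity, still requires the models discussed in the remainder of this section.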

In the following the state of the art will be distinguished according to three paradigms (ie, supervised, semisupervised, and unsupervised learning).

A graphical representation that summarizes the most recent literature leveraging the natural language used on social networks is presented in Fig. 6.2.

Fig. 6.2 State of the art of approaches based on natural language. PC, paralinguistic content; POS, parts of speech.

3.1.1 Supervised learning

There is extensive literature that leverages the combinations of the characteristics described earlier to train a supervised sentiment classification model [49]. In this research area we can roughly distinguish between the state of the art in two main fields (ie, baseline and ensemble models). Concerning the baseline models, several contributions have been provided in the last decade. Among the most recent approaches are those in [50–55].

Dhande and Patnaik [50] proposed a supervised machine learning approach that is able to combine naïve Bayes and neural network classifiers for sentiment categorization. In their investigation, they tried to overcome the attribute independence assumption underlying the naïve Bayes classifier by using a neural network to explicitly represent the dependencies among attributes (ie, words). The resulting naïve Bayes neural classifiers have been shown to be able to achieve promising accuracy with use of a simple unigram representation of user messages.

Chikersal et al. [51] combine two main baseline approaches. The first one is a rule-based classifier that takes a decision about positive, negative, or unknown sentiment, using rules that are dependent on the occurrences of emoticons and opinion words in tweets (ie, lexicons). The second one, based on support vector machines, is trained on semantic, dependency, and sentiment lexicons to identify positive, negative, and neutral messages. The predictions provided by the two baseline approaches are subsequently combined to refine the support vector machine predictions. A further supervised approach based on sentiment lexicon construction was presented in [55]. The main goal is to deal with the problem of unavailability of labeled data by use of SentiWordNet [56]. The proposed framework, SWIMS (for “semisupervised subjective feature weighting and intelligent model selection”), starts by acquiring SentiWordNet as a labeled corpus to extract adjectives, verbs, adverbs, and nouns that are subjective. Then a new feature-weighting mechanism—based on the pointwise mutual information [57]—is used to finally train a supervised support vector machine for sentiment classification.

Even though the traditional supervised learning approaches have addressed several sentiment analysis tasks, most of the recent literature is moving toward a different perspective of the problem. In particular, the current research direction is related to the definition of novel representation spaces, by means of word embeddings [58], for subsequent training of traditional learning models [52–54]. Severyn and Moschitti [52] proposed a new model for initializing the parameter weights of a convolutional neural network [59] based on word embedding to finally train an accurate soft-max sentiment model. The solution they proposed successfully combines two important constituents for sentiment analysis: word embeddings for a rich language model and supervised learning on available data. A similar architecture was presented in [53]. A further investigation that exploits distributed word representations was proposed in [54], where two types of word embeddings were adopted (ie, the skip-gram model [60] and the sentiment-word model [61]) to subsequently train a support vector machine.

Although the above-mentioned approaches represent an important step toward the definition of robust systems, within the sentiment classification research field, none of the classification algorithms consistently performs better than the others and there is no consensus regarding which method should be adopted for a given problem in a given domain.

To overcome this limitation, the ensemble learning paradigm has been investigated recently [62–67]. The idea behind ensemble mechanisms is to take advantage of several independent classifiers by combining them to achieve better performance than the baseline ones.
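As a rough illustration of combining several classifiers, the sketch below implements plain weighted voting. It is not the composition strategy of any of the cited works: the function name `weighted_vote` and the idea of using one reliability weight per classifier (for example, its validation accuracy) are assumptions made for the example.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine one label per classifier into a single ensemble label.

    predictions: labels predicted by each baseline classifier.
    weights: one non-negative reliability weight per classifier (assumed
             to come from, e.g., accuracy on held-out data).
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # The label with the largest total weight wins; ties go to the
    # label that was seen first.
    return max(scores, key=scores.get)

label = weighted_vote(["positive", "negative", "negative"], [0.9, 0.3, 0.3])
```

With equal weights this reduces to majority voting. With unequal weights, a single reliable classifier can outvote several weaker ones.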

A first approach to ensemble composition was presented in [62, 63]. The main idea is to exploit all the possible baseline models in the hypothesis space by taking into account their marginal prediction capabilities and their reliabilities. The optimal ensemble composition is then derived through a greedy model selection strategy able to find a good bias-variance trade-off among all the baseline models considered. Lin et al. [64] contributed a criterion for sentiment-based ensemble composition. In their work an approximate algorithm, which is able to exploit accuracy and diversity of baseline classifiers, is designed to tackle the combinatorial problem of classifier selection.

A different approach to ensemble learning is related to the feature space instead of the model space. Wang et al. [65] reported an extensive study of the traditional vector representations for inducing state-of-the-art ensemble methods. In particular, they investigated traditional bagging and boosting approaches [68] over different bag-of-words representations (unigram and bigram) and different weighting schemes (Boolean, term frequency, and term frequency–inverse document frequency).
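The representations named above can be sketched in a few lines. This is an illustrative implementation, not the setup used in [65]: the function names, the whitespace tokenization, and the specific inverse document frequency variant log(N / df) are assumptions, and other variants are common in practice.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def weight_documents(docs, n=1, scheme="tfidf"):
    """Build one sparse vector (dict: term -> weight) per document.

    n: 1 for unigrams, 2 for bigrams.
    scheme: "boolean", "tf", or "tfidf" (with idf = log(N / df)).
    """
    bags = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for bag in bags for term in bag)
    vectors = []
    for bag in bags:
        if scheme == "boolean":
            vec = {t: 1.0 for t in bag}
        elif scheme == "tf":
            vec = {t: float(c) for t, c in bag.items()}
        elif scheme == "tfidf":
            vec = {t: c * math.log(len(docs) / doc_freq[t]) for t, c in bag.items()}
        else:
            raise ValueError("unknown scheme: " + scheme)
        vectors.append(vec)
    return vectors

vectors = weight_documents(["good good movie", "bad movie"], n=1, scheme="tfidf")
```

Under this idf variant, a term that occurs in every document (here "movie") receives weight zero, which is why frequent but uninformative words contribute little to the final representation.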

In [66] the hypothesis is that appropriate feature engineering, known as feature hashing [69], can lead to more accurate ensemble models. The original bag-of-words space is reduced by the hashing of the features into a lower-dimensional space, allowing multiple features to be mapped to the same hash key. An approach that tries to exploit the potential of baseline methods through ensemble learning and enriched word representation was presented by Zhang and He [67], who took advantage of two different feature sets: the first one describes the latent topics of the messages, while the second one is related to word embeddings. These feature sets are then used to tune the contribution of different learners in the ensemble model.
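The hashing step can be illustrated as follows. This is a minimal sketch of the general technique rather than the configuration used in [66]: the function name `hash_features`, the choice of MD5 as a stable hash, and the number of buckets are assumptions, and the signed variant of the trick is omitted for brevity.

```python
import hashlib

def hash_features(tokens, n_buckets=8):
    """Map a bag of tokens to a fixed-length count vector by hashing.

    Each token is hashed to a bucket index in [0, n_buckets), so distinct
    tokens may collide and share the same position.
    """
    vec = [0] * n_buckets
    for tok in tokens:
        # A stable hash is used so that indices do not change between runs.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % n_buckets
        vec[idx] += 1
    return vec

vector = hash_features("this movie was really really good".split())
```

The vector length is fixed in advance and no vocabulary needs to be stored. The price is that colliding features can no longer be told apart.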

As a general consideration, although the approaches based on supervised learning can lead to very high sentiment recognition performance, a drawback is the human effort needed to label the data. To overcome this limitation, alternative solutions (grounded on semisupervised and unsupervised learning) have been developed.

3.1.2 Semisupervised learning

Several semisupervised approaches have recently been presented to address the well-known problem related to the acquisition of manually labeled data for sentiment classification. Generally, the semisupervised classification methods can be categorized into two types: approaches based on prior lexical knowledge combined with labeled (and unlabeled) data [23, 70, 71], and bootstrap techniques [72–78].

Concerning the lexical-based approaches, Ju et al. [71] addressed the seed-word selection for semisupervised sentiment classification through a joint lexicon-corpus learning approach. They investigated the (human) costs for annotating words to then measure their informativeness. Both the annotation cost and the informativeness measurement are taken into account to guide the selection strategy for good words for the final sentiment analysis task. Although the pure lexical-based approaches have shown promising results for tackling semisupervised sentiment classification, the most recent literature is focused on sentiment transfer learning. He et al. [23] proposed a transfer learning approach that is able to take advantage of the sentiment knowledge available in a resource-rich (source) language to restore the information lost during the transfer process in the new resource-poor (target) domain. Zhu et al. [70] presented an approach that combines lexicons and labeled and unlabeled data for sentiment transfer across different domains. First, several emotion keywords are used to automatically extract labeled samples and then both the automatically labeled samples from the target domain and the real labeled samples from the source domain are combined to create a new labeled data set. Finally, all the labeled and unlabeled data in the target domain are used to perform cross-domain sentiment classification with a standard label-propagation algorithm.
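To make the final label-propagation step more concrete, the sketch below shows a basic iterative variant on a small similarity graph. It is not the procedure of [70]: the function name `propagate_labels`, the graph format, the use of scores in [-1, 1] (negative to positive), and the fixed number of iterations are assumptions made for illustration.

```python
def propagate_labels(edges, seeds, n_iter=50):
    """Spread polarity scores from labeled to unlabeled nodes.

    edges: dict mapping each node to {neighbor: similarity weight};
           every neighbor must also appear as a key.
    seeds: dict mapping labeled nodes to a score in [-1, 1].
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(n_iter):
        updated = {}
        for node, nbrs in edges.items():
            if node in seeds:
                # Labeled nodes are clamped to their known score.
                updated[node] = seeds[node]
                continue
            total = sum(nbrs.values())
            if total == 0:
                updated[node] = scores[node]
            else:
                # Unlabeled nodes take the weighted mean of their neighbors.
                updated[node] = sum(w * scores[m] for m, w in nbrs.items()) / total
        scores = updated
    return scores

# Toy chain a - b - c - d with one positive and one negative labeled message.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
result = propagate_labels(graph, {"a": 1.0, "d": -1.0})
```

In this toy chain, the unlabeled message closer to the positive seed ends with a positive score and the one closer to the negative seed ends with a negative score.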

Regarding the bootstrap techniques, we can further distinguish between self-training [79] and co-training [80]. The main characteristic of self-training approaches is to adapt a predefined polarity lexicon with use of an unlabeled set of social network messages. Among the different self-training solutions for sentiment analysis, the most recent approaches are focused on the enrichment of existing vocabularies with (unsupervised) sentiment lexical items for a subsequent learning phase [72–74, 78].

An additional strategy for semisupervised bootstrapping is represented by the co-training approaches. The key characteristic of such methods is their ability to derive and take advantage of different “views” of the same opinionated text. Liu et al. [77] exploit both textual features (ie, opinionated words weighted according to WordNet-Affect [81]) and nontextual indications (ie, emoticons, temporal indications, and punctuation) to train two supervised classifiers. A further approach was presented in [76], where a semisupervised deep neural network framework was developed to co-train on the feature representation and pattern classification spaces. Yang et al. [75] recently investigated a combination of lexicon-based learning and corpus-based learning in a unified co-training framework.

After analyzing the current state of the art, we can affirm that although semisupervised methods for sentiment classification are still in their infancy, they are becoming ever more popular thanks to their ability to use both (small sets of) a labeled corpus and (huge sets of) unlabeled data.

3.1.3 Unsupervised models

Unsupervised approaches to sentiment classification can solve the problem of domain dependency and reduce the need for annotated training data. Most of the approaches available for unsupervised sentiment classification can be broadly distinguished into lexicon-based methods [82–85] and generative models [86–89].

Concerning the lexicon-based approaches, the first work was presented by Turney [82], who used two arbitrary seed words to compute the semantic orientation of a sentence (measured by pointwise mutual information [57]). An improvement on this approach was proposed by Zagibalov and Carroll [83], who presented a method for automatic seed word selection. The method requires information only about commonly occurring negations and adverbials to iteratively find sentiment-bearing items.
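The seed-word idea can be sketched with co-occurrence counts. Pointwise mutual information between two items x and y is log2(p(x, y) / (p(x) p(y))), and the semantic orientation of a phrase is taken here as its PMI with a positive seed minus its PMI with a negative seed. The function names and all counts below are hypothetical, and how the counts are collected (a corpus, a search engine, a co-occurrence window) is left open.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) estimated from raw counts.

    All counts must be positive; in practice zero counts need smoothing.
    """
    return math.log2((count_xy * total) / (count_x * count_y))

def semantic_orientation(cooc_pos, cooc_neg, count_pos, count_neg,
                         count_phrase, total):
    """PMI(phrase, positive seed) minus PMI(phrase, negative seed)."""
    return (pmi(cooc_pos, count_phrase, count_pos, total)
            - pmi(cooc_neg, count_phrase, count_neg, total))

# Hypothetical counts: the phrase co-occurs 8 times with the positive seed
# and 2 times with the negative seed; both seeds are equally frequent.
so = semantic_orientation(cooc_pos=8, cooc_neg=2, count_pos=100,
                          count_neg=100, count_phrase=20, total=10000)
```

A positive value suggests that the phrase leans toward the positive seed and a negative value toward the negative seed. The phrase count and the corpus size cancel out in the difference, so only the co-occurrence and seed counts affect the sign.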

More recent methods are based on the automatic construction of lexicons. Lu et al. [85] addressed the problem of deriving a sentiment lexicon that is not only domain specific but also aspect dependent. To accomplish this task, an optimization framework was proposed to combine different sources of information for learning context-dependent sentiment lexicons.

Sheng et al. [84] proposed an automatic construction strategy for a domain-specific sentiment lexicon based on constrained label propagation. In particular, a set of candidate sentiment terms is extracted by use of the chunk dependency information and an a priori generic lexicon. Then a set of pairwise contextual and morphological constraints is extracted from a domain corpus and exploited as prior knowledge to improve the sentiment lexicon construction.

The generative models represent an alternative and more flexible solution to the lexicon-based approaches. The main characteristic of this class of models is that they are able to simultaneously extract aspects and classify sentiments from textual messages. A sentence such as “#iOS7: battery life is a plus, but the security is a big issue!” is an example of different aspects, each having its own polarity, reported on the main target “iOS7.” Several studies that deal with sentiment classification and topic modeling have been proposed in the literature.

A first generative model, known as the topic sentiment mixture model, was presented in [87]; it separates topics and sentiment words by means of an extended probabilistic latent semantic analysis model [90]. Further investigations based on the latent Dirichlet allocation principle [91] can be found in [86, 88], where an aspect and sentiment unification model (ASUM) and a joint sentiment-topic (JST) model were proposed respectively. The JST model is a fully unsupervised model that can capture the topic and sentiment at the same time. The ASUM builds on the JST model with a small adaptation: it introduces the assumption that a sentence in a message can be related to only one specific aspect and sentiment. The main advantage of the JST model with respect to the ASUM comes from its ability to reciprocally reduce the noise of both topic and sentiment generation tasks.

A more recent generative model available in the literature was presented by He et al. [89]. They proposed a dynamic JST model that allows the detection and tracking of topics and sentiment over time: topic and sentiment dynamics are captured by the current time-dependent sentiment-topic word distributions and the word distributions at previous time stamps.

3.2 Leveraging Natural Language and Relationships

A key aspect of social networks is that they are rich both in content and in relationships, providing unprecedented challenges and opportunities from a sentiment analysis point of view. For instance, combining content and relationships could be useful when one is dealing with implicit (or implied) opinions, where textual features do not always provide explicit information about the sentiment orientation. However, as shown in the previous section, most of the state-of-the-art approaches are consistent with the classical statistical inference problem formulation, in which instances (posts/users) are represented as homogeneous, independent, and identically distributed. In other words, they consider textual information only, not taking into account that social networks are actually relational environments. Although relationships in online social networks have been extensively investigated from a sociological and psychological point of view, investigation of their role from a machine learning perspective is still in its infancy: most of the investigations are based on exploiting relationships from a propositional point of view (ie, flattening the social network connections by an aggregation function or including an additional attribute in the text representation to represent sentiment influence), while very few approaches tackle the networked environments in their native forms.

Analogously to the distinction provided in the previous section, in the following a description of approaches that combine content and relationships is provided, distinguishing between supervised, unsupervised, and semisupervised models. Focusing on these three main paradigms, we consider two types of relationships available in social networks; that is, relationships based on status homophily and relationships based on value homophily.

A graphical representation that summarizes the most recent literature based on relationships (and natural language) is reported in Fig. 6.3.

Fig. 6.3 State of the art of approaches based on relationships (and natural language). H, homophily.

3.2.1 Supervised learning

The first and most popular investigations in the state of the art aimed at combining content and relationships are grounded on the status homophily sociological theory. Assuming status homophily as an underlying principle of online relationships between users implies the assumption that users form ties thanks to demographic factors (ie, race/ethnicity, sex/gender, age, religion, education, occupation, and social class). According to this principle the relationships between users have been modeled in online social networks both as directed and as undirected binary connections: friendships in Facebook,² Google+,³ and Weibo,⁴ and following/follower in Twitter.⁵

Concerning the state-of-the-art approaches that assume relationships based on status homophily jointly with text, we highlight the most recent supervised learning methods [92–94]. The first tentative approach exploiting the statistical relational learning paradigm for polarity classification purposes was presented by Rabelo et al. [92], who investigated a relational neighbor classifier to estimate a polarity probability model (based on text and adjoining friends) and a relaxed collective inference approach to determine the sentiment of the users in the network. Hu et al. [93] presented a mathematical programming formulation that is able to capture sentiment consistency (by means of user-content connections) and emotional contagion (by taking advantage of user-user social relations) for networks of reduced size. Recently, an investigation into emotional contagion in large social networks was presented by Coviello et al. [94], who modeled the emotions of the users as dependent not only on endogenous and exogenous factors (eg, always being happy and rainfall effect) but also on contagion from groups of friends.

Although the above-mentioned investigations are characterized by consistent performance gains with respect to those approaches based purely on textual content, they strongly assume that the explicitly available relationships (friendships and following/follower connections) unconditionally represent the sentiment agreement between connected users. However, this assumption does not reflect what happens in the real world, where two structurally connected users can have divergent opinions on a given topic. To better capture the sentiment agreement among users, some recent approaches have been grounded on the value homophily theory. Considering this principle as underlying the online relationships between users implies the assumption that users create bonds thanks to attractions and interactions that contribute to their sharing of attitudes, abilities, beliefs, and aspirations. According to this principle the relationships between users have been modeled in online social networks as weighted directed connections: appreciations in Facebook (ie, “like”) and Google+ (“+1”) and retweets in Twitter (“RT”).
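As a minimal sketch of how such weighted directed connections could be derived, the Python fragment below counts retweets between pairs of users and normalizes each user's outgoing counts so that they sum to 1. The function name build_approval_graph, the input format (a list of retweeter/author pairs), and the normalization scheme are illustrative assumptions and are not taken from any of the cited works.

```python
from collections import Counter, defaultdict


def build_approval_graph(retweets):
    """Build a weighted directed graph from (retweeter, author) pairs.

    The weight of the edge retweeter -> author is the fraction of the
    retweeter's retweets that target that author (an assumed scheme).
    """
    pair_counts = Counter(retweets)           # (retweeter, author) -> count
    out_totals = Counter()                    # retweeter -> total retweets
    for (retweeter, _author), count in pair_counts.items():
        out_totals[retweeter] += count

    graph = defaultdict(dict)
    for (retweeter, author), count in pair_counts.items():
        graph[retweeter][author] = count / out_totals[retweeter]
    return dict(graph)


# Hypothetical toy data: ann retweets bob twice and cat once.
example = build_approval_graph(
    [("ann", "bob"), ("ann", "bob"), ("ann", "cat"), ("bob", "cat")]
)
```

Under this assumed scheme a user who retweets a single author repeatedly ends up with one strong edge, whereas a user who retweets many authors spreads the weight across several weaker edges.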

Regarding those approaches in the literature that assume relationships based on value homophily together with text, we describe a few recent investigations. Jiang et al. [95] derived a probabilistic Bayesian model that considers the text content posted by a user smoothed by the sentiment labels of neighbors directly connected through a structural connection (ie, a retweet in Twitter). Once the probabilistic model has been trained, a graph-based classification algorithm based on relaxation labeling [96] is used to infer the polarity of unobserved users.

Anjaria and Guddeti [97] proposed a supervised system that first applies supervised machine learning algorithms for text-based sentiment classification and subsequently incorporates social network structural features (again retweets in Twitter) to perform sentiment analysis at the user level. Finally, in [98] any kind of social interaction (eg, “like,” comments, and sharing activities) is captured by a sentiment opinion graph. To derive the sentiment orientation of each user of the network, a relaxation labeling process is followed.
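The general idea of combining a text-based estimate with the beliefs of connected users can be illustrated with a simplified, relaxation-labeling-style update. The sketch below is not the algorithm of [95], [96], or [98]: the function name, the mixing parameter alpha, and the use of a single probability of positive polarity per user are all assumptions made for illustration. At each iteration every user's belief is recomputed, synchronously, as a mix of the user's own text-based prior and the weighted average of the neighbors' current beliefs.

```python
def relaxation_labeling(text_prior, edges, alpha=0.5, iterations=20):
    """Simplified collective inference over a weighted undirected graph.

    text_prior: dict user -> P(positive) from a text-only classifier.
    edges: dict (user_a, user_b) -> non-negative connection weight.
    alpha: weight given to the text prior (hypothetical parameter).
    """
    neighbors = {user: [] for user in text_prior}
    for (a, b), weight in edges.items():
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    belief = dict(text_prior)
    for _ in range(iterations):
        updated = {}
        for user, prior in text_prior.items():
            total = sum(weight for _, weight in neighbors[user])
            if total == 0:
                # Isolated users keep their text-based estimate.
                updated[user] = prior
                continue
            neighbor_avg = sum(
                weight * belief[other] for other, weight in neighbors[user]
            ) / total
            updated[user] = alpha * prior + (1 - alpha) * neighbor_avg
        belief = updated  # synchronous update of all users
    return belief


# Hypothetical example: "b" is textually ambiguous but strongly tied to "a".
beliefs = relaxation_labeling(
    {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.3},
    {("a", "b"): 3.0, ("b", "c"): 1.0},
)
```

In this toy network the ambiguous user "b" is pulled toward the positive side because its tie with "a" is three times stronger than its tie with "c", while the unconnected user "d" keeps its prior.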

3.2.2 Semisupervised learning

Dealing with sentiment classification in networked environments usually requires a fully supervised learning paradigm, where the sentiment orientation must be known a priori to derive suitable predictive models. However, this does not always reflect the real setting of social networks, where some partial information can be grasped to derive a relational semisupervised model and therefore reduce the human effort related to the annotation process. Among the semisupervised techniques able to deal with content and relationships, we can again distinguish between models that include status homophily relationships and approaches that take advantage of value homophily relationships.

Among the models based on status homophily relationships, we highlight the most recent approaches presented in the literature [99–102]. Tan et al. [99] proposed a semisupervised approach to predict the user sentiment by introducing explicitly available undirected user-user relationships (“friendships”) into a text-based factor-graph model. Speriosu et al. [100] proposed enriching the content representation by including directed user-user relationships as features additional to the text ones. The same kind of directed user-user social relationships (eg, “following” and “follower” in Twitter) was exploited in [101] to predict the sentiment orientation of users by a collective inference approach based on a partially labeled network. Tang et al. [102] recently presented a semisupervised version of the supervised model presented in [93]. The rationale behind this approach is twofold: (1) two messages from the same author are likelier to have consistent polarity than two randomly selected messages and (2) two messages posted by connected friends are likelier to share the same sentiment than two randomly selected messages.
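Assumptions of this kind are commonly encoded as a smoothness penalty that grows when linked messages receive different sentiment scores. The sketch below illustrates that generic idea only; it is not the formulation of [93] or [102], and the function name, the squared-difference form, and the two weights lam_author and lam_friend are hypothetical.

```python
def consistency_penalty(scores, same_author_pairs, friend_pairs,
                        lam_author=1.0, lam_friend=1.0):
    """Penalty that is small when linked messages have similar scores.

    scores: dict message_id -> real-valued sentiment score.
    same_author_pairs: pairs of messages written by the same user.
    friend_pairs: pairs of messages written by connected users.
    """
    author_term = sum((scores[i] - scores[j]) ** 2
                      for i, j in same_author_pairs)
    friend_term = sum((scores[i] - scores[j]) ** 2
                      for i, j in friend_pairs)
    return lam_author * author_term + lam_friend * friend_term


# Hypothetical scores: m1 and m2 share an author, m2 and m3 come from friends.
disagreeing = consistency_penalty(
    {"m1": 0.9, "m2": 0.8, "m3": -0.7}, [("m1", "m2")], [("m2", "m3")]
)
agreeing = consistency_penalty(
    {"m1": 0.9, "m2": 0.8, "m3": 0.7}, [("m1", "m2")], [("m2", "m3")]
)
```

Adding such a term to a text-based loss would favor score assignments in which linked messages agree, which is how a small set of labeled messages could influence the unlabeled ones connected to them.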

Concerning the value homophily relationships, only two semisupervised approaches have been found in the literature. Pozzi et al. [103] proposed a semisupervised sentiment learning approach that extends the model presented in [99] by introducing a social network representation based on the concept of an “approval network.” Given a small proportion of users already labeled in terms of polarity, the model predicts the sentiments of the remaining unlabeled users by combining directly in the probabilistic model both the textual information and the retweet-based graph representation.

A subsequent contribution built on the work reported in [103] was presented by Nozza et al. [104]. In their investigation a social network is represented as a heterogeneous graph, where a latent representation of the nodes (both users and posts) is learned to infer the corresponding polarity labels.

3.2.3 Unsupervised learning

Similarly to the semisupervised paradigm, the unsupervised approaches in the literature able to leverage both content and relationships are very preliminary. The work in [105, 106] represents the first tentative approaches to address the polarity detection task assuming relationships based on status homophily. The first explorative analysis in [105] proposed an unsupervised label propagation algorithm for dealing with both explicit and implicit opinion targets. The authors consider posts as nodes in the graph with a corresponding polarity label vector initialized by the hashtags reported in the text. Then at each iteration of the label propagation algorithm, the label vector of one node is propagated to the adjoining ones.
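A label propagation scheme of this general kind can be sketched as follows. This is an illustrative simplification and not the algorithm of [105]: the two-component (positive, negative) label vector, the damping factor, the averaging over neighbors, and the function names are all assumptions. Posts carrying a polarity-bearing hashtag start with a nonzero seed vector, and at each iteration every post receives a damped average of its neighbors' vectors on top of its own seed.

```python
def propagate_labels(seeds, adjacency, iterations=10, damping=0.5):
    """Propagate (positive, negative) label vectors over a post graph.

    seeds: dict post -> (pos, neg) vector initialized from hashtags;
           posts without a seed start at (0.0, 0.0).
    adjacency: dict post -> list of neighboring posts (every neighbor
               must also appear as a key).
    """
    no_seed = (0.0, 0.0)
    labels = {post: list(seeds.get(post, no_seed)) for post in adjacency}
    for _ in range(iterations):
        updated = {}
        for post, neighbors in adjacency.items():
            seed = seeds.get(post, no_seed)
            if neighbors:
                average = [
                    sum(labels[n][k] for n in neighbors) / len(neighbors)
                    for k in range(2)
                ]
            else:
                average = [0.0, 0.0]
            updated[post] = [seed[k] + damping * average[k] for k in range(2)]
        labels = updated
    return labels


def polarity(vector):
    """Map a (pos, neg) vector to a polarity label."""
    pos, neg = vector
    if pos == neg:
        return "neutral"
    return "positive" if pos > neg else "negative"


# Hypothetical chain p1 - p2 - p3 - p4 plus an isolated post p5.
graph = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"],
         "p4": ["p3"], "p5": []}
result = propagate_labels({"p1": (1.0, 0.0), "p4": (0.0, 1.0)}, graph)
```

In this toy chain the unseeded posts take the polarity of the nearest seeded post, and the isolated post without hashtags remains neutral.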

Ou et al. [106] proposed a content and link unsupervised sentiment model as a richer framework able to take advantage of four main components: content, same-user link (ie, two posts are provided by the same user), friend link (ie, the users of two messages are connected by a friendship relationship), and behavior link (two users are connected if any repost, reply, or comment activity is performed). These four ingredients were introduced in a unified probabilistic model, for which parameter estimation and inference can be approached by a maximum likelihood method and Gibbs sampling respectively.

Concerning the value homophily relationships, only one unsupervised approach has been found in the literature. Zhu et al. [107] proposed an unsupervised triclustering framework that is able to analyze both user-level and message-level sentiments through co-clustering of a tripartite graph. The most important contribution is concerned with the finding that making use of mutual dependency among aspects, messages, and user relationship can lead to effective unsupervised sentiment classification models.

4 Applications

Polarity classification is well suited to various types of intelligence applications. Indeed, business intelligence seems to be one of the main factors behind corporate interest in the field. One of the most important needs of businesses and organizations in the real world is to find and analyze consumer or public opinions about their products and services (eg, Why are consumers not buying our laptop?). Polarity classification paves the way to several interesting applications, in almost every possible domain. For example, summarizing user reviews is a relevant task of analytics. Moreover, opinions matter a great deal in politics. Some work has focused on understanding what voters are thinking [108]. For instance, the US president Barack Obama used the polarity classification task of sentiment analysis to gauge feelings of core voters during the 2008 presidential election. A further task is the augmentation of recommendation systems, where the system might not recommend items that receive negative feedback several times [49].

Polarity classification has also been applied for more socially oriented purposes. For example, on the basis of observations of Twitter’s role in the civilian response during the 2009 Jakarta and Mumbai terrorist attacks, Cheong and Lee [109] proposed a structured framework to harvest civilian sentiment and response on Twitter during terrorism scenarios. Arunachalam and Sarkar [110] monitored and analyzed several social networks to assess the citizens’ perception of government agencies for several purposes: fine-tuning of policies, identification of positively perceived best practices, and identification of negatively perceived aspects of actions and decisions. Polarity classification has also been applied to the medical field. Cobb et al. [111] examined how exposure to messages about a smoking-cessation drug affects smokers’ decision making regarding its use. In recent years, social networks have emerged as a potential source of information for sentiment analysis in the financial domain. Financial tweets have been investigated to predict short- and long-term stock market evolutions [112–114].

Since sentiment analysis is nowadays accessible to a large audience (researchers, governments, institutions, companies, and corporations), we can expect even more upcoming applications: violence prevention, e-health intervention, monitoring of cyber bullying and cyber pedophilia, transportation optimization, and emergency management.

5 Future Directions

In the previous sections the most recent contributions to the state of the art of sentiment analysis were presented from a machine learning point of view. Concerning the future directions, some conclusions can be drawn. For the sentiment analysis methods focused on natural language, we can highlight the following:

• The supervised models that are able to leverage natural language are strictly focused on explicit opinions. A challenge that remains to be addressed relates to the more difficult task of identifying and properly dealing with implicit opinions (ie, objective statements that express a desirable or undesirable fact through regular or comparative statements). In this direction, not only could syntactic cues help identify the text constituents that characterize implicit opinions, but the semantics of co-occurring patterns in the language could also provide a distinctive advantage.

• Regarding the future work on semisupervised models, a major challenge that remains to be addressed is related to incremental learning. While most of the available techniques are based on statistical learning and therefore assume a given stochastic distribution of the data they observe, an incremental learning model could be applied whenever new observations emerge and could adapt to what has been learned accordingly.

• According to the analysis of the literature on unsupervised models, we can affirm that although they represent a relevant alternative to the supervised and semisupervised ones, they can introduce bias when dealing with short and noisy text. The fact that social network text is composed of a few words poses considerable problems when one is applying traditional topic/sentiment models. These models typically suffer from data sparsity to estimate robust word co-occurrence statistics when they are dealing with short and ill-formed text. We can therefore expect as upcoming contributions several approaches able to adapt the generative process behind topic/sentiment modeling to the social network language.

For the sentiment analysis methods focused on both natural language and relationships, we highlight the following:

• As a future direction of supervised models that are able to leverage both information sources, we can expect several additional extensions of probabilistic learning/inference techniques to deal with complex relational structures (ie, connections based both on status homophily and on value homophily). From a machine learning point of view, we expect an increasing number of investigations that attempt to create a successful marriage between probability theory and several relational representations. In particular, solutions for learning and inference over the relational environment of social networks are expected to retain the relational data structure in its totality (ie, not focusing only on directly connected users, but considering the whole network) and to adapt or enrich learning and/or inference algorithms to consider the real nature of social networks.

• After analyzing the state of the art of this type of semisupervised model, we believe a possible future research direction relates to the uncertainty of the relationships available in social networks. All of the models (based on both status homophily and value homophily) assume that relationships are certain and do not evolve over time. In a more realistic scenario, all of these connections are uncertain: they can be broken, they can vary over time and with topic, and they can be latent (not directly observable). We can therefore expect ever richer models able to tackle the uncertainty over the relational structure to perform more accurate sentiment classification and propagation tasks.

• As a future direction of unsupervised models, we expect an extension of propositional generative models (presented in Section 3.1.3) for dealing with connections among users and relationships among messages. From a machine learning perspective, we expect an increasing number of investigations in the statistical relational learning domain that explicitly incorporate the relational component into generative topic-sentiment models.

6 Conclusion

The growth of sentiment analysis into one of the most active research areas of the last 10 years has several causes. First, sentiment analysis has a wide array of applications, in almost every domain. Second, it offers many challenging research problems that have never been studied before. Third, with the advent of big data technologies, we now have a huge volume of opinionated data recorded and easily accessible in digital form on the web. These reasons have motivated the recent advances in the state of the art presented in this chapter. Most of the work on polarity classification considers text as the only source of information for inferring sentiment, disregarding that social networks are actually networked environments. To take advantage of both natural language and social network relationships, a novel research branch is developing.
