Chapter 5

Mapping and Analyzing Social Networks

Mapping and analyzing social networks can reveal the identity of key people and relationships, and track the spread of ideas. This chapter details the steps to conduct social network analysis to map and analyze social networks primarily exhibited on social media. It introduces key concepts and definitions, reviews and selects appropriate software tools to conduct the analysis, and teaches you how to use the tools to map social networks, identify influencers, and determine the proliferation of and relationship between ideas. Although this chapter focuses only on using data from social media to conduct analysis on standalone social networks, you can use the same processes and tools to conduct social network analysis on data from other sources such as financial records, and compare and combine different networks.

Key Concepts and Definitions

Social network analysis (SNA) seeks to discover the underlying rules governing the behavior of people in a social network through the use of specialized algorithms. It involves studying the relationships between people and how those relationships affect everyone in the network, and even in other networks. SNA is ideal for understanding how people use social media to form and sustain social networks, and influence people in their online networks and beyond. It can also help explain how relationships formed offline can influence relationships formed online and vice versa. Specifically, SNA can help you answer questions such as:

  • Which individuals on social media have the most influence and reach with their message?
  • What clusters of online groups have the most influence over others?
  • If a positive message were injected into the debate, who should help propagate the message?
  • Do the major influential individuals on either side of a problem set appear inactive or unimportant?
  • If you know the network structure, which nodes, when eliminated, would most effectively break down or hinder the network?

SNA comes in many forms and can reveal a variety of insights about social networks and the people in them. This chapter focuses on how SNA can help you:

  • Map social networks by creating a visual representation of social networks sustained on social media. In other words, visually grasp who talks to whom and how much.
  • Identify influencers by using specialized algorithms to identify people and relationships that wield the greatest influence over other people in the network. The algorithms you choose to employ depend on the characteristics of the network.
  • Identify the topics and ideas that the most people are discussing together at a specific moment in time.

Cross-Reference
You, in your role as a private law-abiding citizen or a dissenter against an oppressive regime, may be understandably worried that governments and others are using sensitive information about you available on social media to conduct SNA and other types of analyses without you knowing about it or being comfortable with it. In the wrong hands, such analyses could be potentially harmful. Check out Chapter 14 for tips on how you can protect your privacy and yourself.

Before you can start employing SNA tools to solve problems, you need to understand each of the elements that make up social networks, the role of influence, and how algorithms can help uncover influence.

Elements of Social Networks

Each social network consists of numerous elements, the most important of which are nodes and links.

A node is a person. When constructing social networks on social media, a node can be a Twitter account, a forum user, or a blogger, because in the social media world, social media accounts represent people. In some cases, several people share one account and act as one node. However, we will not worry about such cases because they are rare; for the most part, one social media account represents one person. Technically, anything can be a node. A piece of social media content such as a text message can be a node, with links connecting it to everyone who reads or receives the message. But for simplicity's sake, we will consider only persons as nodes. We will use the words individuals and persons interchangeably with nodes.

A link is the relationship between two nodes. Links are also known as ties or edges. In our case, a link is the indication of a communication channel between two people on social media. A link can exist between two people on Twitter if they follow each other or if one follows the other. It can exist between two people on a blog if one comments on another's article and the blogger comments back. We will use the word relationships interchangeably with links.

Each link can have many different attributes, the most relevant of which is reciprocity, or symmetry. A link is reciprocal, or bidirectional, if the communication channel between two people flows both ways; that is, if both people communicate with each other. If only one person communicates with the other, the link is not reciprocal; it is unidirectional. For example, if Ahmed follows Kim on Twitter, but Kim does not follow Ahmed back, there is a one-way link from Ahmed to Kim. If Kim follows Ahmed back, the link becomes reciprocal because there is a two-way link between them. Another attribute is strength. For example, consider a social network consisting of Felipe, Anne, and Michal. Each person in the network communicates with every other person in the network. Thus, all links are reciprocal. However, the pairs differ in how often they communicate. Felipe and Anne send a text message to each other once a week, Felipe and Michal send a text message to each other three times a week, and Anne and Michal send a text message to each other ten times a week. See Figure 5.1 for a visual representation of this network. Because Anne and Michal communicate with each other more often than Felipe and Anne do, or Felipe and Michal do, the link between Anne and Michal is stronger than the link between Anne and Felipe or the link between Felipe and Michal. Additionally, the link between Felipe and Michal is stronger than the link between Felipe and Anne. For the sake of simplicity, we will not analyze attributes other than reciprocity, such as strength, when conducting SNA.
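To make these attributes concrete, here is a minimal Python sketch (Python is our choice for illustration; none of the tools in this chapter require it). It stores the weekly text-message counts from the example as directed links and derives reciprocity and strength from them. The function names, and the decision to measure strength as the combined message count in both directions, are assumptions made for this sketch.

```python
# Messages sent per week, keyed by (sender, receiver).
# The counts come from the Felipe, Anne, and Michal example.
messages_per_week = {
    ("Felipe", "Anne"): 1, ("Anne", "Felipe"): 1,
    ("Felipe", "Michal"): 3, ("Michal", "Felipe"): 3,
    ("Anne", "Michal"): 10, ("Michal", "Anne"): 10,
}

# Ahmed follows Kim, but Kim does not follow Ahmed back.
follows = {("Ahmed", "Kim"): 1}


def is_reciprocal(links, a, b):
    """A link is reciprocal when communication flows in both directions."""
    return links.get((a, b), 0) > 0 and links.get((b, a), 0) > 0


def strength(links, a, b):
    """Assumption: strength is the number of messages exchanged in both directions."""
    return links.get((a, b), 0) + links.get((b, a), 0)


pairs = [("Felipe", "Anne"), ("Felipe", "Michal"), ("Anne", "Michal")]
strongest_first = sorted(
    pairs, key=lambda pair: strength(messages_per_week, *pair), reverse=True
)

print("Ahmed and Kim reciprocal?", is_reciprocal(follows, "Ahmed", "Kim"))
print("Links from strongest to weakest:", strongest_first)
```

The ordering matches the text: Anne and Michal have the strongest link, and Felipe and Anne the weakest.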

Figure 5.1 Social network with link attributes


You may have noticed that reading written descriptions of social networks can be confusing and tedious. Mapping a social network visually, as in Figure 5.1, makes it easier to comprehend. Figure 5.2 maps a social network with unidirectional and bidirectional links. Arrowheads indicate the direction in which each link flows. In the network in Figure 5.2, Francisco talks to Zane but Zane does not talk back to him, and Zane talks to Emi but Emi does not talk back to him. The links between Francisco and Zane, and between Zane and Emi, are therefore unidirectional. However, Arturo and Zane talk to each other and have a bidirectional link.

Figure 5.2 Unidirectional and bidirectional networks visualized


Another way to represent a social network, and one that will also help you prepare data for SNA, is a matrix, or square array. Figure 5.3 represents the network described earlier as a matrix; refer to it as you read the subsequent sentences. Reading a matrix is simple because it is similar to a spreadsheet. We labeled the columns and rows to ease explanation. The names in Column A represent the nodes that choose whom to communicate with and to what extent. The names in Row V represent the nodes with whom the nodes in Column A choose to communicate. In other words, the nodes in Row V receive the link from the nodes in Column A. Start with Row X, and the first name in Column A, which is Felipe. Then go right, across the other columns, and read whom he communicates with and to what extent. For example, in Column C, under the heading of Anne, you will see the number 1. That is because Felipe sends Anne one text message a week. In Column D, under the heading of Michal, you will see the number 3, because Felipe sends Michal three text messages a week. In Row X, Column B, under the heading of Felipe, you will see a 0. The 0 represents the lack of a link; a link is not applicable here because Felipe cannot choose to communicate with himself. Similarly, pick Row Y, which shows whom Anne has chosen to communicate with. Going across Row Y, under Column B and under the heading of Felipe, you see the number 1 because Anne sends Felipe one text message a week. And so on.
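If it helps to see the same layout outside a spreadsheet, the following Python sketch builds the matrix from the message counts in the example. Rows are the people who send the link and columns are the people who receive it. The variable names are our own, and the printed layout only approximates Figure 5.3.

```python
names = ["Felipe", "Anne", "Michal"]

# Messages sent per week, keyed by (sender, receiver), as in the example.
messages_per_week = {
    ("Felipe", "Anne"): 1, ("Anne", "Felipe"): 1,
    ("Felipe", "Michal"): 3, ("Michal", "Felipe"): 3,
    ("Anne", "Michal"): 10, ("Michal", "Anne"): 10,
}

# One row per sender, one column per receiver. A missing pair, including a
# person paired with themselves, becomes 0.
matrix = [
    [messages_per_week.get((sender, receiver), 0) for receiver in names]
    for sender in names
]

# Print the matrix with a header row and a name at the start of each row.
print("".ljust(8) + "".join(name.ljust(8) for name in names))
for sender, row in zip(names, matrix):
    print(sender.ljust(8) + "".join(str(count).ljust(8) for count in row))
```

Reading across the first row reproduces the walk through Felipe's row in the text: 0 for himself, 1 for Anne, and 3 for Michal.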

Figure 5.3 Social network as a matrix


This matrix communicates not only the presence of links but also their strength, through the number of text messages sent by each person. We, however, will not consider attributes other than reciprocity, so we will ignore strength. Thus, we will represent the links using only a binary code of 1 or 0. In other words, if Felipe texts Anne, you will see a 1 in Row X, Column C. If Felipe never texts Anne, you will see a 0. See Figure 5.4 for this binary representation of the social network. If you are still confused about matrices, do not worry. Later you will learn how to create such matrices to prepare the data for SNA.
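Converting a matrix of message counts to the binary form is a one-line transformation. The short Python sketch below assumes the counts from the Felipe, Anne, and Michal example, with rows and columns in that order.

```python
# Weekly message counts; rows are senders and columns are receivers,
# in the order Felipe, Anne, Michal.
weighted = [
    [0, 1, 3],
    [1, 0, 10],
    [3, 10, 0],
]

# Any count above zero becomes 1 (a link exists); zero stays 0 (no link).
binary = [[1 if count > 0 else 0 for count in row] for row in weighted]

for row in binary:
    print(row)
```

The strength information is discarded on purpose: after the conversion, a pair that texts ten times a week looks the same as a pair that texts once a week.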

Figure 5.4 Social network as a binary matrix


Influence and Memes in Social Networks

Apart from cataloguing who speaks to whom, SNA's true power lies in identifying the nodes and relationships that drive the behavior of all other nodes in the social network. Identifying key influencers in social networks exhibited on social media enables you to focus your research, investigation, and targeting efforts concerning people on social media.

Influence is changing how an individual or group perceives the world and their relationship to the world. A key influencer is a node in the social network whose behavior and content influences the behavior of others in the network. When a key influencer speaks, his followers and even others outside his immediate network listen and eventually respond. For example, the terrorist Anwar Al-Awlaki was a key influencer because others consumed the messages and content he created on social media, and consequently changed their viewpoints and behavior.

Memes are the ideas, concepts, and beliefs present in messages and content that spread from person to person and spur influence. Two competing theories describe exactly who can be a key influencer on a social network, or in other words, who can spread memes most effectively and widely. One theory says that key influencers are prominent individuals such as celebrities who can influence people in a variety of social networks, sometimes regardless of the context, by virtue of their persona. The competing theory says that the identity of the key influencer is fluid and highly dependent on the social network and context. Anyone can become a key influencer, regardless of how famous or prominent they are, by virtue of their message and their position in the network.1 An individual's position in a network is largely a function of how many people they talk to and how many people those people talk to. It is also a function of how easy it is for the individual to talk to numerous people directly and indirectly, and how many constraints they face. A constraint appears when an individual has to go through another individual to speak to the rest of the network.

Real-world evidence suggests that for the issues you will face, the latter theory is correct. A person who appears at the right time with the right message and has the right number of people listening to them can become a key influencer within a social network. As social networks and context change, that person may, after some time, cease to be a key influencer in the network, and another person may take their spot as the key influencer. This theory focuses the definition of the key influencer.

A key influencer is the person who spreads memes to the most people directly and indirectly in a social network with the fewest constraints. SNA provides the ability to identify the key influencer in any network largely by virtue of her position in the network, regardless of the context or whether the individual was an influencer before. As mentioned in Chapter 3, conducting SNA on a network and identifying a key influencer involves picking a specific algorithm and then applying that algorithm to the network data. The algorithm then outputs a list of the most influential nodes. Similarly, algorithms can help you identify the memes that are propagating through networks at a certain moment in time.

Algorithms in SNA

Because SNA is an emerging field, there is no scientific consensus on which algorithm is best at identifying a key influencer in a network. Numerous algorithms exist, and some are specific to niche problem sets and network types. The algorithms usually bear the name of the mathematician or scientist who came up with them. A typical SNA software package (which we describe later) provides the option to apply numerous algorithms to a given data set. We will not list the various algorithms that exist or explain the math behind them because they will only induce headaches and confusion. Instead, we will only conceptually explain the three most relevant algorithms that you will learn to use. Comparing the results of the three algorithms will bolster your analysis and increase confidence in your result. The algorithms approach the problem differently, and it is difficult to say which approach is more correct. Using and comparing the three algorithms ensures that one algorithm does not skew the analysis too much. However, in some cases, one algorithm is significantly better than the others, and we identify those cases. Ideally, you would want to create your own special algorithm for your own special cases, but that is a complicated venture beyond the scope of this book and largely unnecessary. The three algorithms we will concern ourselves with here measure centrality, closeness, and betweenness.2

Algorithm 1: Bonacich's Approach to Degree Centrality

Centrality is a simple concept that says an individual's influence is measured by how many direct links he has with other individuals. The more links an individual has, the more central he is or the more centrality he has. However, more centrality does not always equal more influence. Consider the examples of Connie and Paul, who go to the same school and have five different friends each but do not know each other. They both have the same amount of centrality because they both have five direct links. However, Connie's friends are friends with many other people, whereas Paul's friends have no friends other than Paul. See Figure 5.5 for a visual representation of Connie and Paul's networks.

Figure 5.5 Networks with different Bonacich centralities


Connie's social network is thus much larger than Paul's. Consequently, Connie's friends can talk to plenty of other people but Paul's friends can only talk to Paul. In such a case, Paul could actually turn out to have more influence, because his friends only really have Paul to communicate with. On the other hand, Connie's message can reach a lot more people. In this case, who is more influential—Connie or Paul? The answer depends on the nature of the social network, the individuals in the network, and the environment in which everyone is located. Phillip Bonacich tweaked the standard Degree Centrality algorithm to take this social insight into account. The algorithm enables you to determine whether in a specific scenario, an individual is more influential if he can talk to a lot of people or if he can monopolize communication within a small group. Bonacich's approach enables you to choose which scenario is appropriate by selecting a positive or a negative attenuation factor called the Beta number. If you think an individual is more influential if he can talk to more people, choose a positive Beta number. If you think an individual is more influential if he can monopolize communication within a small group, choose a negative Beta number. Use your judgment to figure out which scenario is more appropriate. Do not worry if you are a little lost. We go over in detail how to use this and other algorithms later in the chapter.
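The effect of the Beta number is easier to see with numbers. The Python sketch below is a rough illustration, not UCINET's implementation: it assumes each of Connie's friends has two further friends (the text does not give a number), uses Beta values of 0.2 and -0.2 chosen only for illustration, and computes an unscaled Bonacich-style score by repeated substitution, where each person's score is the sum over their friends of 1 plus Beta times that friend's score. Real software typically rescales the result, so only the ordering is meaningful here.

```python
def build_friendships():
    """Return an undirected friendship network as {person: set of friends}.

    Connie and Paul each have five friends. As an assumption for this sketch,
    each of Connie's friends has two further friends; Paul's friends have none.
    """
    edges = []
    for i in range(1, 6):
        edges.append(("Connie", f"connie_friend_{i}"))
        edges.append((f"connie_friend_{i}", f"connie_friend_{i}_contact_a"))
        edges.append((f"connie_friend_{i}", f"connie_friend_{i}_contact_b"))
        edges.append(("Paul", f"paul_friend_{i}"))
    friends = {}
    for a, b in edges:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    return friends


def degree_centrality(friends):
    """Count each person's direct links."""
    return {person: len(linked) for person, linked in friends.items()}


def bonacich_power(friends, beta, iterations=200):
    """Approximate score_i = sum over friends j of (1 + beta * score_j).

    Repeated substitution only settles when abs(beta) is smaller than the
    reciprocal of the network's largest eigenvalue; 0.2 is small enough for
    this example network.
    """
    scores = {person: 0.0 for person in friends}
    for _ in range(iterations):
        scores = {
            person: sum(1.0 + beta * scores[other] for other in linked)
            for person, linked in friends.items()
        }
    return scores


friends = build_friendships()
degrees = degree_centrality(friends)
talk_to_many = bonacich_power(friends, beta=0.2)
monopolize = bonacich_power(friends, beta=-0.2)

print("Degree:", degrees["Connie"], degrees["Paul"])
print("Positive Beta:", round(talk_to_many["Connie"], 2), round(talk_to_many["Paul"], 2))
print("Negative Beta:", round(monopolize["Connie"], 2), round(monopolize["Paul"], 2))
```

Under these assumptions, Connie and Paul tie on plain degree centrality, Connie scores higher with the positive Beta, and Paul scores higher with the negative Beta, which mirrors the two scenarios described above.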

Algorithm 2: Eigenvector of Geodesic Distances

Closeness goes one step beyond centrality and takes into account the indirect links that an individual can make to all other individuals in his social network. In other words, closeness takes into account the fact that if Connie says something to her friends, her friends might tell their friends, who might tell their friends, and so on. Connie may then be able to influence someone with whom she has never directly communicated, simply because her friends are so popular and close with others. Thus, Connie has high closeness. Paul, whose friends are largely disconnected from the rest of the school, is not very close to others in the school and has low closeness. The eigenvector approach calculates how close an individual is to everyone else in his network. The closer Connie is to people in her network by virtue of her friends being so popular, the easier it will be for her to influence a greater number of people in the network. This algorithm is more appropriate for larger, more complex networks.
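The Python sketch below illustrates the idea of closeness, with two caveats. First, it does not implement the eigenvector approach named above; it swaps in the simpler, standard closeness score, which divides the number of other people by the sum of the shortest-path (geodesic) distances to them. Second, the network is an assumption: each of Connie's friends has two further contacts, and a single link joins one of Paul's friends to Connie's side of the school so that everyone is reachable.

```python
from collections import deque


def build_school():
    """Return an assumed, connected school network as {person: set of contacts}."""
    # Assumed single bridge between Paul's group and Connie's group.
    edges = [("paul_friend_1", "connie_friend_1_contact_a")]
    for i in range(1, 6):
        edges.append(("Connie", f"connie_friend_{i}"))
        edges.append((f"connie_friend_{i}", f"connie_friend_{i}_contact_a"))
        edges.append((f"connie_friend_{i}", f"connie_friend_{i}_contact_b"))
        edges.append(("Paul", f"paul_friend_{i}"))
    contacts = {}
    for a, b in edges:
        contacts.setdefault(a, set()).add(b)
        contacts.setdefault(b, set()).add(a)
    return contacts


def geodesic_distances(contacts, start):
    """Breadth-first search: shortest number of links from start to each person."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for other in contacts[person]:
            if other not in distance:
                distance[other] = distance[person] + 1
                queue.append(other)
    return distance


def closeness(contacts):
    """Standard closeness: (number of other people) / (sum of distances to them).

    Assumes the network is connected, so every person can reach every other.
    """
    scores = {}
    for person in contacts:
        distance = geodesic_distances(contacts, person)
        scores[person] = (len(contacts) - 1) / sum(distance.values())
    return scores


scores = closeness(build_school())
print("Connie:", round(scores["Connie"], 3))
print("Paul:", round(scores["Paul"], 3))
```

In this assumed network, Connie's score is higher than Paul's because she needs fewer steps, on average, to reach everyone else.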

Algorithm 3: Freeman's Betweenness Centrality

Betweenness considers the social fact that the individual who controls access to the network and decides if the network gets to hear a message wields immense influence over the network. In other words, the more people who depend on an individual to help them communicate and make connections with other people, the more influential the individual is and the more betweenness he has. Such individuals are often known as gatekeepers, because they control the flow of information. People have to go through them if they want to speak to other members of the network. Removing them from the network would cripple the network. In Paul's network, Paul is the gatekeeper because if his friends want to speak to each other, they have to send their message to Paul first, who then relays it to others. Freeman's Betweenness Centrality measures how much betweenness an individual has, by essentially counting how many times the individual falls between two people in a network. The algorithm works only in cases with binary link data, which is when links are coded as only 0 and 1, and not by an attribute.
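Counting how often someone falls between two other people can be sketched in a few lines of Python. The version below is a brute-force illustration for small, undirected, binary networks, not the routine any particular SNA package uses: for every pair of people, it adds up the share of shortest paths between them that pass through each third person. The network is Paul and his five friends, and the friend names are placeholders.

```python
from collections import deque
from itertools import combinations


def shortest_path_info(contacts, source):
    """Return (distance, number of shortest paths) from source to each reachable person."""
    distance = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in contacts[person]:
            if other not in distance:
                distance[other] = distance[person] + 1
                paths[other] = 0
                queue.append(other)
            if distance[other] == distance[person] + 1:
                # Every shortest path to person extends to a shortest path to other.
                paths[other] += paths[person]
    return distance, paths


def betweenness(contacts):
    """Unnormalized betweenness for an undirected network with binary links."""
    info = {person: shortest_path_info(contacts, person) for person in contacts}
    scores = {person: 0.0 for person in contacts}
    for start, end in combinations(sorted(contacts), 2):
        dist_start, paths_start = info[start]
        if end not in dist_start:
            continue  # the two people cannot reach each other
        for middle in contacts:
            if middle in (start, end) or middle not in dist_start:
                continue
            dist_middle, paths_middle = info[middle]
            on_shortest_path = (
                end in dist_middle
                and dist_start[middle] + dist_middle[end] == dist_start[end]
            )
            if on_shortest_path:
                scores[middle] += paths_start[middle] * paths_middle[end] / paths_start[end]
    return scores


# Paul's network: his five friends are linked only to Paul.
paul_friends = [f"paul_friend_{i}" for i in range(1, 6)]
contacts = {"Paul": set(paul_friends)}
for friend in paul_friends:
    contacts[friend] = {"Paul"}

scores = betweenness(contacts)
print("Paul:", scores["Paul"])
print("One of Paul's friends:", scores["paul_friend_1"])
```

Paul falls between all ten pairs of his friends, while none of his friends falls between anyone, which is exactly what makes him the gatekeeper.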

Choosing an SNA Software Program

The final step before conducting SNA is choosing the right software program with which to do it. You can always manually map and apply the algorithms to your data sets. However, for those who are not masochists, we highly recommend choosing an SNA software program. At least a hundred SNA software programs are available and they range in cost, features, algorithms available, and usability. Some software suites bundle two or more software programs together, and many allow for interoperability—you can use the same data set in different software without reformatting it. The following three types of SNA software exist:

  • Visualization—Maps the social network or graph in 2-D and 3-D graphics, and allows you to edit and export it
  • Analytical—Applies various SNA algorithms to data sets
  • Combined Packages—Combine the functionality of visualization and analytical software and enable you to do both with the same data set

Unless you only need to create nice-looking graphics of social networks, do not use strictly visualization software. We highly recommend using software that combines visualization and analytical capabilities. More and more software programs combine both capabilities, so the list of choices is immense. They are available open source and commercially, and range in price. Some are free, some are affordable, some are expensive and worth it, and some are expensive but not worth it. The commercial ones usually have more features and algorithms, are more reliable, and offer some sort of trial period. The open source ones are free, fairly powerful, and rich in features, but they are hard to use. We recommend using a combination of commercially available and free SNA software programs, depending on your needs. In the remaining sections, we will describe how to use one commercially available software program and one free software program.

The commercially available software program we teach you to use is called UCINET. Unless you have a large budget and require hundreds of features that go beyond SNA, we recommend you get UCINET, which comes packaged with a visualization software program called NetDraw. It is relatively cheap (only $150 for a license), is widely used in academic circles, is easy to use, has the algorithms you will use, and has a free 90-day trial period. We use UCINET in our case studies. On its website, UCINET also features free data sets to play around with, user guides, and a free online textbook that goes more in-depth into SNA. UCINET is available as a download at www.analytictech.com/UCINET/.

UCINET does not work on Apple and Linux computers. However, you can run UCINET and other Windows-only programs on Apple and Linux computers by purchasing an easy-to-use and relatively cheap Windows emulation software program called Parallels Desktop, which is available for download at www.parallels.com/products/desktop/. You can also use a free, open source Windows emulation software program known as WineHQ, which is harder to set up but generally works well with UCINET. It is available for free download at www.winehq.org.

If you want free software or do not want to use Windows emulation software, we recommend getting a software program called R, which is available at www.r-project.org. R is more difficult to use and requires some familiarity with coding; however, many user guides and wiki pages detail how to use R in layman's terms. R also enables you to apply many more algorithms, although finding and integrating those algorithms can be tedious. You can also use R for other types of analyses including advanced statistical analyses. Many other free software solutions exist, and we encourage you to try them out. A list of SNA software is available at www.wikipedia.org/wiki/Social_network_analysis_software.

Although not strictly necessary, you should also expect to use a program like Microsoft Excel that enables you to input data into spreadsheets. The added benefit of using Microsoft Excel is that you need it to run the second software program we teach you how to use. The second program is not exactly a software program but instead is a template file for Excel. It is known as NodeXL and is available for free at nodexl.codeplex.com. NodeXL is similar to UCINET in many ways and provides some unique features geared toward users interested in analyzing social media data. Specifically, NodeXL enables you to easily and automatically download social media data, whereas with other SNA software such as UCINET, you will need to download the social media data manually. In other words, NodeXL comes with a free data aggregation and filtering system. NodeXL also enables you to identify the memes in a network. Like UCINET, NodeXL does not work on Apple and Linux computers and requires that you use the Windows operating system. You also need Microsoft Excel 2007 or 2010.

SNA requires completing three steps:

1. Creating and formatting a data set based on downloaded social media data
2. Visually mapping the social network using a visualization tool
3. Applying algorithms to the network data, and then comparing the results of the algorithms to find the key influencer(s)

The remainder of this chapter details the three steps through three real-world examples or walkthroughs. For the first two examples, we will teach you how to use UCINET. For the third example, we will teach you how to use NodeXL. The first two examples will teach you how to identify influencers. The third and final example will teach you how to identify the memes being discussed together by people on Twitter. As you read through the examples, notice the differences between the two software programs and develop an understanding of when to use which type of software. Overall, use UCINET when you need to identify influencers using the specific algorithms mentioned before. Use NodeXL when you want to automatically download and format social media data, and determine what topics people are talking about online. As we describe later, you can also transfer files between the two programs.

First Example–Identify Influencers

The first example is the social network of Twitter users, likely located in Pakistan, who regularly tweet anti-Taliban content. Identifying the key influencers in such a network can help you to, for example:

  • Contact and incentivize influential people to amplify their tweets of anti-Taliban information
  • Understand the most influential anti-Taliban messages, trends, and themes, and configure information operations to mimic them
  • Guide those at risk of recruitment by the Taliban or extremists to connect with the most influential and persuasive anti-Taliban communicators

We then go through the steps again, albeit with less detail, with a second example. We are focusing on Twitter data for both examples because Twitter data is readily available, it translates easily to social networking data, and a lot of people in the government are interested in seeing how they can use it.

Creating the Data Set

Preparing the data set involves creating the matrices we referred to earlier. We will then input the matrices into the UCINET software, so the software can use them to map and analyze the networks. Unfortunately, most contemporary SNA software programs do not automatically download social media data and format it into a data set that you can readily use. For now you will have to manually code the data set using the steps detailed in the subsequent sections. Of course, you can use NodeXL for this example and save yourself some time because NodeXL enables you to automatically download social media data and format it properly. However, manually downloading social media data and mapping networks will help you understand and appreciate the complexity of SNA. Thus, we recommend you use UCINET for now. Have UCINET installed and ready as you follow along.


Cross-Reference
To follow along, download the files anti-taliban.##h and anti-taliban.xls from the website. You can either use the files or create a relevant data set on your own.

Collect the Twitter Data

First, follow these steps to build the list of relevant Twitter users:

1. Formulate the data collection need. We are searching for the Twitter accounts of people on Twitter who are saying anti-Taliban things and are likely in Pakistan.
2. Identify a few popular, appropriate Twitter users. You have two ways to go about this, depending on your knowledge of the topic. If you know some Twitter users that match your criteria, start with them. Otherwise, ping the Twitter Search API with English and Arabic translations of words an anti-Taliban person might use, such as “Taliban,” “evil,” “behead(ings),” “killings,” and “immoral.” Read through some tweets and make a list of a few users with a lot of followers and tweets that seem to match your criteria. Identify users who do not seem to be linked with each other; that is, are not following, followed by, and retweeting each other. Check the users' location status to see if they are from Pakistan. If their location is not available, then read through their tweets to see if they mention participating in activities or being in places that suggest they are in Pakistan. Because anti-Taliban rhetoric originating from Pakistan is a fairly niche topic, the number of users you will find will be limited. We used both ways to make our initial list.
3. Build off the initial list of users. Catalog who the initial users are following, followed by, and retweeting—these are the second-degree users. Keep track of how they are linked to the initial user; that is, if the links are unidirectional or bidirectional. If the second-degree user follows or retweets the initial user but the initial user does not follow or retweet the second-degree user, the link is unidirectional from the second-degree user to the initial user. If the initial user also follows or retweets the second-degree user, the link is bidirectional. Read through the tweets of second-degree users to see if they tweet about issues related to the topic. Discard the ones who do not seem to be at all interested in the topic and are linked to only one initial user. They may be following the person for personal or other reasons. Do not discard users who have only a few tweets, have only a few followers, or are following only a few people. Remember that influence is not necessarily a function of how many followers or tweets a user has. Do ignore the Twitter accounts of major newspapers and international organizations. Thousands of followers can follow a major organization, and they may have nothing to do with your topic. Catalog the list in a rough network drawing. See Figure 5.6 for what our network looks like so far with 15 users.
4. Decide how many data points you need. In SNA, the more data, the better. However, theoretically the network you are cataloging can grow to include everyone on Twitter. Due to our time and resource constraints, we chose to limit the number of users to 32. Continue cataloging users for as long as you see fit.
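Keeping track of link direction by hand is error-prone, so it can help to record each observation as a simple pair and let a short script classify the links. The Python sketch below assumes you have already noted who follows or retweets whom; it does not contact Twitter, and the account names are invented placeholders rather than real users.

```python
# Hypothetical follow/retweet records as (source, target) pairs: the source
# follows or retweets the target. These account names are invented.
observed = [
    ("initial_user", "second_degree_1"),
    ("second_degree_1", "initial_user"),
    ("second_degree_2", "initial_user"),
]


def classify_links(records):
    """Label each pair of users as bidirectional or unidirectional."""
    directed = set(records)
    labels = {}
    for source, target in sorted(directed):
        pair = tuple(sorted((source, target)))
        if (target, source) in directed:
            labels[pair] = "bidirectional"
        else:
            labels[pair] = f"unidirectional: {source} -> {target}"
    return labels


labels = classify_links(observed)
for pair, label in sorted(labels.items()):
    print(pair, label)
```

A list of pairs like this also doubles as the raw material for the rough network drawing and, later, for the matrix.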

Figure 5.6 First example's network map of 15 users


Note
If you are following along by building your own data from scratch, it will probably look different from ours. This is due to the somewhat subjective nature of the data collection, and the time difference between us creating our data set and the publication of this book—some of the Twitter accounts may no longer exist or people may change their online behavior and stop following certain people.

The subjective nature of this type of data collection is a little frustrating. New software solutions are in development that will help standardize this type of data collection and reduce the subjectivity. However, some subjectivity will always remain, because the software solution will make the same assumptions that you are making. Picking a niche topic and focusing your data collection needs will reduce the overall number of users involved, and thus make it less likely that you ignore a big chunk of users.


Note
To protect the identity of Twitter users in the first two examples, we replaced the vowels in their user account names with asterisks.

Convert the Network Data into Matrix Format

You have multiple ways to perform the second step, including using network visualization software, which we describe later. You should go through this somewhat painful process at least once, so you understand what the underlying data looks like. Follow these steps to put the data into the appropriate format for SNA software:

1. Start a spreadsheet program. Open a program like Microsoft Excel. You can use other spreadsheet programs or you can bypass spreadsheet programs altogether and input your data directly into the SNA software. However, it is easier to manipulate data with programs like Excel and export it to different formats later.
2. Write the list of all users. In Excel, navigate to the cell at the first column, second row. Start inputting the names of the users, and work down the column till you exhaust all names. The order you put the names in is irrelevant. Leave out the “@” that is in front of Twitter usernames. You have now created the list of users.
3. Write the list of “link receiving users.” Copy all the names in the first column. Navigate to the cell at the second column, first row. Either right-click with your mouse, or select Edit on the menu bar, and choose Paste Special. A menu should pop up. Select Transpose and click OK. The column of names should now be pasted as a row in the first row. You have now created the list of link receiving users, who are the users that receive the link from the users in the first column.
4. Indicate links. Go through the matrix you have created and insert a 1 for every link that exists between a user and the linked user. To do so, start with the first name in Column 1. In our column, the first name is sh*ms_z and the corresponding row is numbered 2. On Row 2, move to the right through the columns, starting at Column 2, putting a 1 in the cell if sh*ms_z either follows or retweets the usernames heading the column. In our case, Column 2's name is also sh*ms_z. A user cannot have links with himself, so do not put anything in the cell. The user heading Column 3 is m*qb*lb*rh*n. The user sh*ms_z does follow or retweet m*qb*lb*rh*n, so put a 1 in the cell indicating a link between the two users. Continue through all the columns. Then move to Row 3, and so on. See Figure 5.7 for a screenshot of what our spreadsheet looks like.
5. Save the file. After going through all the rows and columns, save the file in CSV or XLS format. Go through the matrix again to ensure you entered everything correctly.
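
The spreadsheet steps above can also be sketched in a few lines of Python, which may help you see what the matrix contains. This is only a minimal sketch: the function names, the usernames, and the links are hypothetical and are not taken from our data set. Unlike the manual process, it writes 0s directly instead of leaving cells blank.

```python
import csv
import io


def build_matrix(users, links):
    """Return spreadsheet-style rows: a header row plus one row per user.

    `links` holds (sender, receiver) pairs, meaning the sender follows or
    retweets the receiver.
    """
    index = {name: i for i, name in enumerate(users)}
    cells = [[0] * len(users) for _ in users]
    for sender, receiver in links:
        if sender != receiver:  # a user cannot have a link with himself
            cells[index[sender]][index[receiver]] = 1
    rows = [[""] + list(users)]  # first row: the link receiving users
    for name, row in zip(users, cells):
        rows.append([name] + row)
    return rows


def to_csv(rows):
    """Render the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical usernames and links, for illustration only.
users = ["user_a", "user_b", "user_c"]
links = [("user_a", "user_b"), ("user_c", "user_a")]
matrix_rows = build_matrix(users, links)
```

Each row describes the links a user sends, and each column describes the links a user receives, which is the same layout as the spreadsheet.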

Figure 5.7 First example's network matrix


Enter the Data into the SNA Software

Third, follow these steps to copy the data into the SNA software:

1. Open UCINET. Install and start UCINET. You may use other software but most of our instructions will be specific to UCINET. SNA software, however, is usually fairly similar so you should be able to follow along with different software. When you start UCINET, you should see the home interface with a menu bar on the top. On the menu bar, select Data. In the menu, select the first choice of Data Editors, and then select Matrix Editor. Once you click Matrix Editor, a spreadsheet-type interface should pop up. You can also enter the Matrix editor by clicking the second graphic button on the home interface, directly underneath the menu bar.
2. Copy data into the Matrix editor. Open the CSV or XLS file you created using Excel. Highlight all the cells, including the names in the first column and first row and all the 1s indicating the links. Copy the cells. Go back to UCINET's Matrix editor. In the editor, make sure your cursor is at the first cell, which is the left-most, top-most cell. Usually when you start the editor, the cursor starts there as default. Otherwise, simply click the first cell to place your cursor there. Then go to Edit ⇒ Paste, or use the standard hot key to paste the data. The data should populate the Matrix editor as it did the Excel spreadsheet. Finally, click Fill in the menu bar, and select Blanks w/ 0s. The number 0 should populate in all the cells that are empty, which should be any cells that do not have a 1. You can also click the graphic button underneath the menu bar that says the word Fill. See Figure 5.8 for a screenshot of the Matrix editor. Notice the options on the right of the editor that say Mode. Make sure that under Mode, you select Normal. In networks where every link is bidirectional, select the option Symmetric. In a social media case, the network is rarely completely bidirectional or symmetric.
3. Save the file. Save the file by going to File ⇒ Save or clicking the floppy disc graphic button. Name the file as something relevant and make sure to save it as a UCINET 4-6 file, and not UCINET 7 type or any other type. The extension of this type of file is .##h. Our file is saved as anti-taliban.##h. Once saved, close the Matrix editor and Excel, and return to the UCINET home interface.
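
To make the Fill and Mode choices concrete, the following minimal sketch reads a saved CSV matrix, treats blank cells as 0s, and checks whether every link is bidirectional. It is an illustration in Python, not a description of what UCINET does internally, and the sample data and function names are hypothetical.

```python
import csv
import io


def load_matrix(csv_text):
    """Parse a labeled CSV matrix, treating blank cells as 0."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    labels = rows[0][1:]  # the header row holds the link receiving users
    matrix = []
    for row in rows[1:]:
        cells = row[1:] + [""] * (len(labels) - len(row) + 1)  # pad short rows
        matrix.append([int(cell) if cell.strip() else 0 for cell in cells])
    return labels, matrix


def is_symmetric(matrix):
    """True only when every link is matched by a link in the other direction."""
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(size) for j in range(i))


# Hypothetical three-user matrix with blanks where no link exists.
sample = ",a,b,c\na,,1,\nb,,,1\nc,1,,\n"
labels, matrix = load_matrix(sample)
symmetric = is_symmetric(matrix)
```

If the check returns False, as it usually will for follow or retweet data, the Normal mode is the appropriate choice.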

Figure 5.8 UCINET's Matrix editor


You have completed preparing the data set and can now move to visually mapping and analyzing the social network.

Visually Map the Network

Visually mapping the network is not necessary for analyzing the network and finding the key influencers. However, it can help you reason about the network and check your work and results. A network that looks extremely broken up or strange can be a clue that you need to double-check your data collection and coding, because most social networks are fairly integrated. Also, you can use the visual representations to explain your analysis to others. The process for visually mapping the network is fairly simple. However, making the network look good by configuring the features can be difficult. The subsequent sections show you the main steps for mapping and configuring the network. Refer to your SNA software user guide for more.
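
One way to check for a broken-up network without relying on the eye alone is to count its separate pieces. The sketch below is an assumption-laden illustration, not a NetDraw feature: it ignores link direction and groups nodes that can reach each other. The function name and the sample matrix are hypothetical.

```python
def weak_components(matrix):
    """Group node indexes into pieces, ignoring the direction of links."""
    size = len(matrix)
    seen = [False] * size
    groups = []
    for start in range(size):
        if seen[start]:
            continue
        seen[start] = True
        stack = [start]
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for other in range(size):
                linked = matrix[node][other] or matrix[other][node]
                if linked and not seen[other]:
                    seen[other] = True
                    stack.append(other)
        groups.append(sorted(group))
    return groups


# Hypothetical four-user network that falls into two separate pieces.
split_network = [[0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0]]
pieces = weak_components(split_network)
```

A result with many small groups suggests revisiting the data collection and coding before trusting any analysis.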

Create the Initial Map

First, follow these steps to create the initial visual representation of the network:

1. Open NetDraw. In UCINET's home interface, click Visualize on the top menu bar, and then select NetDraw. Alternatively, you can click the last graphic button beneath the menu bar. A new window should pop up, showing NetDraw's main interface.
2. Input the data set. On the menu bar at the top of NetDraw's interface, go to File ⇒ Open ⇒ Ucinet Dataset ⇒ Network. In the pop-up, click the small button with an ellipsis next to the box. In the pop-up, select the .##h file you created or downloaded from our website. Click OK and open the file. On the NetDraw interface, you should see your network represented as a network map, as in Figure 5.9.

Figure 5.9 First example's complete network map

3. Save the map. Go to File ⇒ Save Diagram As. In the subsequent menu, select how you want to export the image of the network.

Configure the Map

The extent to which you want to configure your map and make it look better depends on your needs. The following bullet points highlight a few features that may prove useful:

  • Move the nodes around—Click on any node, represented as a colored shape, and drag it to move it around. The lines representing the links will move along with the dragged node. Use this feature to clean up the network map and make it less cluttered. The nodes with the fewest links should generally be away from the center of the map.
  • Resize the map to fit the area—Critical to ensuring your map looks less cluttered, this option resizes your map to fit your screen. On the menu bar, go to Layout and select Resize. Alternatively, on the buttons under the menu bar, click the box that is fifth from the left.
  • Differentiate certain nodes by attributes—You may want to call attention to certain nodes for a variety of reasons, perhaps because they are the key influencers. By assigning attributes to nodes, you can selectively change the look of specific nodes, or add more fidelity to the data set. Select Transform in the menu bar, and then Node Attribute Editor. In the pop-up, click Add Attrib at the top of the new window, and provide a name for the attribute. A new column should appear alongside the names. Put a 1 next to the name under your attribute column for nodes for which you want to assign that attribute.
  • Change the look of nodes—Click Properties on the menu bar and select Nodes. The subsequent menu options enable you to change the color, size, label, and shape of the nodes. You can choose to assign the properties to either all the nodes or by attribute. Alternatively, you can right-click any of the nodes in the NetDraw interface and change the properties of only the selected node. You can also use your mouse to draw a box around the nodes you want to change, and then right-click to change the properties of the selected nodes. The program tells you that a node is selected when it changes the node's look from a colored shape to that of a shape containing an X.
  • Change the look of links—This is similar to changing the properties of nodes. Click Properties on the menu bar and select Lines. You can then change the color, size, label, and arrow head type of all or some links. You can also right-click specific links in the NetDraw interface and change the look of specific links.
  • Play with other options—Notice that the Properties menu also enables you to change the background color and other properties of the map. Go through the Properties menu and other menus to see how else you can change the network map.
  • Edit the data set visually—Right-clicking nodes and links also enables you to delete and add links and nodes. By adding and deleting nodes and links, you are editing the underlying data set. If you want to save the edited data set, go to File ⇒ Save Data As ⇒ Ucinet ⇒ Binary Network. Give it a new filename and save it as a .##h file type. You will have created a whole new data set. This option can prove useful. Instead of cataloging a data set as a matrix or in a spreadsheet, you can build the data set in NetDraw and then save it. Be careful with adding attributes to nodes and then saving them. The algorithm operations we teach you may not work if the nodes have attributes assigned to them.

You have now completed visually mapping and editing the network, and can analyze it to find the key influencers.

Analyze the Network

Analyzing social networks involves running three algorithms through the data set and then comparing their results. Note that you can run many other algorithms also. Explore your SNA software to see what kinds of results you get with the other algorithms. The subsequent steps show you how to run the algorithms on UCINET, output the results in an exportable format, and compare them.

Run the Algorithms

First, follow these steps to apply the three algorithms to the data set:

1. Prepare UCINET. Close NetDraw and the Matrix editor, and return to UCINET's home interface.
2. Run the Bonacich Degree Centrality algorithm. On the menu bar, go to Network ⇒ Centrality and Power, and select Bonacich Power (Beta centrality). In the pop-up, which is shown in Figure 5.10, click the graphic box with the ellipsis next to the box for Input Network Dataset. Select the appropriate .##h file. Then look at the bottom right of the window and find the options for setting the Beta value.

Figure 5.10 UCINET's Bonacich Centrality input pop-up

Our assumption is that, in this network, having your message heard by more people makes you more influential. The goal of the anti-Taliban people is to be heard so they can popularize their effort against the Taliban and make sure a lot of people know about the case against the Taliban. Thus, the Beta number must be positive. Click the Get Beta button to the right. Close the window that popped up. You should notice a number in the dialog box that says Beta Coefficient. Our number says 0.1128126. Unless specified otherwise, keep the default options for all algorithms. Finally, click OK and run the algorithm. A new window should pop up (most likely in the Notepad program), showing a text file with a list of the names and numbers next to them. The text file should say “Bonacich Power / Beta Centrality” near the top. These are the results of the algorithm. See Figure 5.11 for a screenshot. Save the text file with the name “Centrality_name of file.txt.” Close the text file and return to the UCINET interface. We will return to the text file later.

Figure 5.11 UCINET's Bonacich Centrality results text file

3. Run the Eigenvector of Geodesic Distances algorithm. On the menu bar, go to Network ⇒ Centrality and Power, and select Eigenvector. In the pop-up, click the graphic box with the ellipsis next to the box for Input Network Dataset. Select the appropriate .##h file. The default value for Method should say “Slow & super accurate.” If your data set is very large, such as more than 10,000 nodes, and your computer is not very powerful, select Fast. Click OK and run the algorithm. A new window should pop up, showing a text file with a column of numbers next to another column of numbers, and when you scroll down, a column or list of the names and numbers next to them. The text file should say “Eigenvector” near the top. These are the results of the algorithm. Save the text file with the name Closeness_name of file.txt. Close the text file and return to the UCINET interface. We will return to the text file later.
4. Run Freeman's Betweenness Centrality algorithm. On the menu bar, go to Network ⇒ Centrality and Power ⇒ Freeman Betweenness, and select Node Betweenness. In the pop-up, click on the graphic box with the ellipsis next to the box for Input Network Dataset. Select the appropriate .##h file. Click OK and run the algorithm. A new window should pop up, showing a text file with a list of the names and numbers next to them. The text file should say “Freeman Betweenness Centrality” near the top. These are the results of the algorithm. Save the text file with the name Betweenness_name of file.txt. Close the text file and return to the UCINET interface. We will return to the text file later.
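
To give a feel for what the three measures compute, here is a minimal Python sketch of textbook versions of each. It is not UCINET's implementation, and its numbers will not match UCINET's output, which applies its own normalization and options. The function names, the number of rounds, the choice to symmetrize the matrix for the eigenvector measure, and the three-user chain are all assumptions made for illustration.

```python
from collections import deque


def bonacich_power(matrix, beta, rounds=200):
    """Unnormalized Bonacich power: sum over k >= 0 of beta^k * A^(k+1) * 1.

    The sum only settles when |beta| is below 1 / (largest eigenvalue of A).
    """
    size = len(matrix)
    term = [float(sum(row)) for row in matrix]  # A * 1: links each node sends
    total = list(term)
    for _ in range(rounds):
        term = [beta * sum(matrix[i][j] * term[j] for j in range(size))
                for i in range(size)]
        total = [t + x for t, x in zip(total, term)]
    return total


def eigenvector_scores(matrix, rounds=200):
    """Leading eigenvector of the symmetrized matrix by shifted power iteration."""
    size = len(matrix)
    sym = [[1 if matrix[i][j] or matrix[j][i] else 0 for j in range(size)]
           for i in range(size)]
    scores = [1.0] * size
    for _ in range(rounds):
        # Multiply by (I + A); the shift keeps the iteration from oscillating.
        nxt = [scores[i] + sum(sym[i][j] * scores[j] for j in range(size))
               for i in range(size)]
        length = sum(x * x for x in nxt) ** 0.5
        scores = [x / length for x in nxt]
    return scores


def node_betweenness(matrix):
    """Brandes-style betweenness, counted over ordered pairs of other nodes."""
    size = len(matrix)
    between = [0.0] * size
    for source in range(size):
        order = []
        preds = [[] for _ in range(size)]
        paths = [0] * size
        paths[source] = 1
        dist = [-1] * size
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search counts shortest paths
            node = queue.popleft()
            order.append(node)
            for other in range(size):
                if not matrix[node][other]:
                    continue
                if dist[other] < 0:
                    dist[other] = dist[node] + 1
                    queue.append(other)
                if dist[other] == dist[node] + 1:
                    paths[other] += paths[node]
                    preds[other].append(node)
        share = [0.0] * size
        while order:  # walk back from the farthest nodes
            node = order.pop()
            for pred in preds[node]:
                share[pred] += paths[pred] / paths[node] * (1 + share[node])
            if node != source:
                between[node] += share[node]
    return between


# Hypothetical three-user chain in which the middle user links the other two.
chain = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
power = bonacich_power(chain, 0.2)
eigen = eigenvector_scores(chain)
between = node_betweenness(chain)
```

In this chain all three measures put the middle user on top, which is the kind of agreement you look for when comparing the result files.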

Compare the Results

Second, follow these steps to compare the results of the three algorithms to find the key influencers in the network:

1. Open the result text files. Open the three text files you saved before. Look at the top of the Closeness file, which is shown in Figure 5.12, and find the heading called “Eigenvalues,” below which is a list of numbers. Find “Factor 1” and the “Percent” associated with Factor 1. If the percent value is less than 70, you should take the results of the Eigenvector (Closeness) algorithm lightly. It means that the algorithm did not find enough differences in the network to produce an accurate enough result, probably because the network was too small. Our Factor 1's percent value is 15.8, so we will only slightly consider the results. In the same file, scroll down to the part that starts with the header “Bonacich Eigenvector Centralities” and lists the names with numbers next to them. You will notice that the names are listed in the order in which you entered the nodes in the Matrix editor. This ordering is considered unsorted. Look at the numbers next to the names under the heading of “Eigenvec,” short for eigenvector. You need the names sorted by the eigenvec value, with the name with the highest eigenvec on top, because it is the most influential. Use Excel to re-sort the data in this way. Open the Centrality file. Find the list of names, and note that this list is also unsorted. You need it sorted by the “Power” numbers, with the name having the highest Power number on top because it is the most influential. Use Excel to re-sort this data as well. Open the Betweenness file and find the list of names. This list is already sorted by the “Betweenness” numbers, which saves you work. Line up the three sorted name lists next to each other. Table 5.1 shows the sorted names according to each algorithm. Finally, quickly skim through the numbers next to the column of names for each list.
Within each list, if the numbers for some names vary significantly in value from the numbers for the other names, you can trust the output of the algorithm. The results are significant because there is some difference between the nodes. Within each list, if the numbers are all similar in value, you may not have enough data for that algorithm, and you should take those results less seriously.

Figure 5.12 UCINET's Eigenvector Closeness results text file

2. Determine the number of key influencers. The number of influencers you want to find and focus on depends on what you want to do with the names and your resources. A good rule is that the number of influencers equals about 5–10 percent of the total number of nodes in the network. Our network has 32 nodes, so we are looking for three key influencers.
3. Compare the names and their positions. Notice that the Closeness (Eigenvector algorithm) and the Centrality (Bonacich Centrality algorithm) result lists are identical. This will not always be the case, but it is in this network. The Betweenness list, however, is different from the other lists. Some names that appear at the top of the Closeness and Centrality lists do appear at the top of the Betweenness list. The user *kch*sht* tops all three lists, which indicates *kch*sht* is very likely a key influencer. In the case where the lists appear different, look at the top five names of the three lists and see if similar names appear in the top five. If you cannot find similar names in the top five, expand it to the top 10, and so on. In our case, the users h*fs*q and r*z*r*m* appear in the top five for all lists, alongside *kch*sht*. Go back to the result text files, and validate your results by comparing the value of the Power, eigenvec, and Betweenness numbers for each key influencer with the corresponding numbers for other nodes. Within each list, the greater the difference, the more likely your results are correct. Consider the numbers to be guidelines and not an objective measure.
4. Compare the algorithms. In some cases, you may have a bias against or with a certain algorithm. The algorithms define influence differently and you may prefer one definition to another. In such a case, give more weight to the output of one algorithm and less to the others.
5. Finalize your results. The key influencers for this network appear to be *kch*sht*, h*fs*q, and r*z*r*m*. Based on the numbers, *kch*sht* appears to be far more influential.
6. Compare results with the visual map. Look at the network map to see if the nodes you identified do appear to be more influential. Depending on your algorithm and influencer definition preference, the nodes you identify as influential should appear centrally located in the network, close to users, and between users. If you identify a node as influential and it looks isolated and only connected to one other node in a network of 100 nodes, you are likely incorrect. In our network map, the three key influencers appear central, close to other users, and between users. Therefore, our results appear correct. Note that looks can be deceiving. The user *shr*fk*kk*r appears well connected and indeed has high closeness and centrality, but *shr*fk*kk*r's lower position on the Betweenness list indicates that *shr*fk*kk*r may not be as important as he or she visually appears.
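
The sorting and comparison steps above can be sketched as follows. This is a minimal illustration rather than a fixed procedure: the function names are hypothetical, and the starting window of five and the step of five simply mirror the guideline of checking the top five and then the top 10. The sample lists are the first five rows of Table 5.1.

```python
def rank(scores):
    """Sort names by score, highest first."""
    ordered = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ordered]


def shared_top_names(rankings, wanted, start=5, step=5):
    """Widen the top-k window until `wanted` names appear in every ranking."""
    longest = max(len(ranking) for ranking in rankings)
    k = start
    while True:
        shared = set(rankings[0][:k])
        for other in rankings[1:]:
            shared &= set(other[:k])
        if len(shared) >= wanted or k >= longest:
            # Keep the order of the first ranking for readability.
            return k, [name for name in rankings[0][:k] if name in shared]
        k += step


# Top five names per algorithm, copied from Table 5.1.
centrality = ["*kch*sht*", "h*fs*q", "*shr*fk*kk*r", "r*z*r*m*", "cpy*l*"]
closeness = ["*kch*sht*", "h*fs*q", "*shr*fk*kk*r", "r*z*r*m*", "cpy*l*"]
betweenness = ["*kch*sht*", "**m*bb*sr*z*", "sh*ms_z", "h*fs*q", "r*z*r*m*"]

window, influencers = shared_top_names(
    [centrality, closeness, betweenness], wanted=3)
```

Treat the output as a shortlist to validate against the numbers and the network map, not as a final answer.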

Table 5.1 Influence Ranking of Users per Algorithm

Centrality (Bonacich Centrality) | Closeness (Eigenvector of Geodesic Distances) | Betweenness (Freeman's Betweenness)
*kch*sht* | *kch*sht* | *kch*sht*
h*fs*q | h*fs*q | **m*bb*sr*z*
*shr*fk*kk*r | *shr*fk*kk*r | sh*ms_z
r*z*r*m* | r*z*r*m* | h*fs*q
cpy*l* | cpy*l* | r*z*r*m*
*t*fs*jj*d | *t*fs*jj*d | *shr*fk*kk*r
m*rt*z*s*l*ng* | m*rt*z*s*l*ng* | *j*zkh*n
*l**rq*m | *l**rq*m | sh*r*zh*ss*n
k*mr*nsh*f*46 | k*mr*nsh*f*46 | sh*fq*t_m*hm**d
*bb*sn*s*r59 | *bb*sn*s*r59 | m*rt*z*s*l*ng*
*j*zkh*n | *j*zkh*n | *l*d*y*n
m*zd*k* | m*zd*k* | *nj*mk**n*
*l*d*y*n | *l*d*y*n | *v*8x8
t*kh*l*s | t*kh*l*s | *t*fs*jj*d
sh*r*zh*ss*n | sh*r*zh*ss*n | k*mr*nsh*f*46
*z*dp*sht*n | *z*dp*sht*n | *n*p*k*st*n*
myr**m*cd*n*ld | myr**m*cd*n*ld | cpy*l*
sh*h*ds***d | sh*h*ds***d | *l**rq*m
**m*bb*sr*z* | **m*bb*sr*z* | m*zd*k*
*v*8x8 | *v*8x8 | *zr*kh*n_pt*
sh*fq*t_m*hm**d | sh*fq*t_m*hm**d | *hm*dj*ml*t
*zr*kh*n_pt* | *zr*kh*n_pt* | s*d*th*s*nm*nt*
n*v**dpt* | n*v**dpt* | **mr*n*
*n*p*k*st*n* | *n*p*k*st*n* | m*qb*lb*rh*n
*nj*mk**n* | *nj*mk**n* | n*v**dpt*
f*h**mr**n* | f*h**mr**n* | t*kh*l*s
sh*ms_z | sh*ms_z | *z*dp*sht*n
**mr*n* | **mr*n* | *bb*sn*s*r59
s*d*th*s*nm*nt* | s*d*th*s*nm*nt* | m*rt*z*b*
*hm*dj*ml*t | *hm*dj*ml*t | myr**m*cd*n*ld
m*rt*z*b* | m*rt*z*b* | f*h**mr**n*
m*qb*lb*rh*n | m*qb*lb*rh*n | sh*h*ds***d

Second Example–Identify Influencers

The second real-world example is the social network of people likely in Pakistan and on Twitter who regularly tweet pro-Taliban content. Identifying the key influencers in such a network can help you to, for example:

  • Focus investigation resources on the most influential communicators and recruiters
  • Distance the most influential pro-Taliban messengers from the networks to cripple the network
  • Understand the most influential pro-Taliban messages, and configure information operations to counter them
  • Dissuade those at risk of recruitment by the Taliban or jihadists from connecting with the most influential pro-Taliban communicators

The objectives of analyzing the network in this example differ from those of the first network, which slightly affects how we perform SNA and how we configure the algorithms. Because the first example covered every step in detail, this example moves quickly through an abbreviated set of steps. Use the following steps only as guidelines and to check your results as you complete the SNA on your own.


Cross-Reference
To follow along, download the files pro-taliban.##h and pro-taliban.xls from the website. You can either use the files or create a relevant data set on your own.

Create the Data Set

The following steps summarize the process for assembling and coding the data:

1. Collect the Twitter data. Search for Twitter users that are pro-Taliban and likely from Pakistan. We use our existing knowledge to identify a few key users, and ping the Twitter Search API with English and Arabic translations of words such as “Taliban,” “heroes,” “martyr,” and “sacrifice,” to see who uses them regularly. Then build out a few degrees from that initial set of users. Due to our time and resource constraints, we stopped at 91 users.
2. Convert the network data into matrix format. Catalog the names in Excel. Go through the spreadsheet, placing a 1 where links exist between the names in Column 1 and the names in Row 1. Save the file and review the work for corrections.
3. Enter the data into the SNA software. Copy the spreadsheet from Excel into UCINET's Matrix editor. Fill the blank cells with 0s and ensure the Mode type is Normal. Save the file as pro-taliban.##h and close Excel and the Matrix editor.
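
Building out a few degrees from an initial set of users, with a cap on the total, can be sketched as a breadth-first expansion. This is a minimal sketch under stated assumptions: `get_links` is a hypothetical stand-in for however you look up the accounts a user follows or retweets, and the toy link table is invented for illustration.

```python
from collections import deque


def snowball(seeds, get_links, max_users, max_degrees):
    """Expand outward from the seed users, one degree at a time."""
    collected = list(dict.fromkeys(seeds))  # keep order, drop duplicates
    known = set(collected)
    queue = deque((user, 0) for user in collected)
    while queue and len(collected) < max_users:
        user, degree = queue.popleft()
        if degree >= max_degrees:
            continue  # do not expand beyond the chosen number of degrees
        for other in get_links(user):
            if other not in known and len(collected) < max_users:
                known.add(other)
                collected.append(other)
                queue.append((other, degree + 1))
    return collected


# Hypothetical link table standing in for real lookups.
toy_links = {"s": ["a", "b"], "a": ["c"], "b": [], "c": ["d"], "d": []}


def lookup(user):
    return toy_links.get(user, [])


two_degrees = snowball(["s"], lookup, max_users=10, max_degrees=2)
capped = snowball(["s"], lookup, max_users=3, max_degrees=2)
```

The cap plays the same role as stopping at a fixed number of users when time and resources run out.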

Visually Map the Network

The following steps summarize the process for visually representing and configuring the network map:

1. Create the initial map. Open NetDraw, and input the pro-taliban.##h data set.
2. Configure the map. Change the size, shape, color, and other properties of the nodes and links, as desired. Drag the nodes around to make the network map look less cluttered. Resize the network so it fits the screen properly. We decided to make the labels invisible because they were cluttering the map. See Figure 5.13 for our network map.

Figure 5.13 Second example's complete network map


Analyze the Network

The following steps summarize the process for running the algorithms through the data set and comparing their results to identify the top key influencers:

1. Run the algorithms. Input the data set into the three algorithm pop-up windows, select the right options, and run the algorithms. Our assumption is that in this network, because the message is somewhat unpopular on social media, having a small group of people repeatedly listen to and absorb your message makes you more influential. The influencers in the network do not have to compete with other messages and can take time instilling their relatively less popular ideology into potential recruits and true believers. Thus, the Beta number for the Bonacich Degree Centrality algorithm should be negative. Specifically, it is –0.0932322. Also, because the number of nodes is still fairly small, we select the Slow method for the Eigenvector algorithm. Save the output of the results, sort the list of the names by the Power, eigenvec, and Betweenness numbers (Betweenness comes pre-sorted), and create a list as in Table 5.2. To save space, Table 5.2 shows only the top 30 rankings for each algorithm. The complete list is available on the website. Notice that the Closeness and Centrality lists differ somewhat. Also notice that the Factor 1's percent value in the Eigenvector output is only 6.6, so give less credence to the eigenvector results. Otherwise, the numbers for the names vary within each algorithm's result list, which bolsters confidence in the overall results.
2. Compare the results. We estimate that about 5 percent of the network are influencers, resulting in five key influencers. Comparing the results, the key influencers appear to be: *bn_*lk*tt*b, *b*bd*sS*l**m, j*f*rh*ss**n1, *lm*s*f*r*q8, and *hm*dkh*n111. If you prefer one algorithm, for whatever reason, your results may look somewhat different. However, overall, *bn_*lk*tt*b does appear to be the most influential.
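
The effect of the sign of Beta can be illustrated on a toy network. The sketch below uses an unnormalized textbook form of Bonacich power, not UCINET's implementation, and the network is hypothetical: users A and B each have four links, but A's contacts mostly have no other links, while B's contacts are well connected. The Beta values of 0.1 and -0.1 are arbitrary small numbers chosen so the calculation settles.

```python
def bonacich_power(matrix, beta, rounds=200):
    """Unnormalized Bonacich power: sum over k >= 0 of beta^k * A^(k+1) * 1."""
    size = len(matrix)
    term = [float(sum(row)) for row in matrix]
    total = list(term)
    for _ in range(rounds):
        term = [beta * sum(matrix[i][j] * term[j] for j in range(size))
                for i in range(size)]
        total = [t + x for t, x in zip(total, term)]
    return total


# Hypothetical network: L1-L3 link only to A; P, Q, R link to B and each other.
names = ["A", "B", "M", "L1", "L2", "L3", "P", "Q", "R"]
edges = [("A", "L1"), ("A", "L2"), ("A", "L3"), ("A", "M"),
         ("B", "M"), ("B", "P"), ("B", "Q"), ("B", "R"),
         ("P", "Q"), ("P", "R"), ("Q", "R")]
index = {name: i for i, name in enumerate(names)}
matrix = [[0] * len(names) for _ in names]
for left, right in edges:
    matrix[index[left]][index[right]] = 1
    matrix[index[right]][index[left]] = 1  # treat every link as bidirectional

positive = bonacich_power(matrix, 0.1)
negative = bonacich_power(matrix, -0.1)
b_leads_when_positive = positive[index["B"]] > positive[index["A"]]
a_leads_when_negative = negative[index["A"]] > negative[index["B"]]
```

With a positive Beta, B scores higher because B's contacts can pass the message on. With a negative Beta, A scores higher because A's contacts depend on A alone.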

Table 5.2 Top 30 Influence Rankings of Users per Algorithm (Second Example)

Centrality (Bonacich Centrality) | Closeness (Eigenvector of Geodesic Distances) | Betweenness (Freeman's Betweenness)
*bn_*lk*tt*b | *bn_*lk*tt*b | *bn_*lk*tt*b
*lm*s*f*r*q8 | n*rj*sk*ld* | *b**bd*sS*l**m
*b**bd*sS*l**m | J*h*d*l*mm*h | L*ND_D*F*ND*R
j*f*rh*ss**n1 | *B*1433 | *m*mb*nb*z
n*rj*sk*ld* | H***lj*h*d | *lm*s*f*r*q8
*b*lkh* | R*ckst*n*RC | j*f*rh*ss**n1
*B*1433 | r*h*lc*h*d | *hm*dkh*n111
s*lt*n2277 | *l_n*khb* | *lb*t*r2
HSMPr*ss | M*HH | B*nt**m**h
*hm*dkh*n111 | *hm*dkh*n111 | D**n*b*s*x
*b*Kh*d**j*hSP | *ls*m**d | yv*nn*r*dl*y
*lb*t*r2 | L*ND_D*F*ND*R | *lr*sh*d_Kh*l*d
yv*nn*r*dl*y | *lb*t*r2 | *b*lkh*
L*ND_D*F*ND*R | *s_*ns*r | *sl*m*cTh*nk*ng
J*h*d*l*mm*h | M*nb*rT*wh*d | J*h*d*l*mm*h
R*ckst*n*RC | *sm**_j*h*d | H***lj*h*d
Sh*r**lM*j*h*d | *b*lkh* | d*h*nt*d786
85W*l**D | *lv*z**r | st*ntr*d*r00
*b*H*k**mB*l*l | tvjh*d | n*rj*sk*ld*
M*HH | r*lg*h*d | *B*1433
MYC_Pr*ss | HSMPr*ss | G*t*Pr*t*19
sw*bzy | 85W*l**D | *b*Kh*d**j*hSP
*ls*m**d | s***t*lj*h*d | HSMPr*ss
*hm*d*ss*ng | *b**_1986 | M*HH
*b**_1986 | d*h*nt*d786 | 85W*l**D
G*t*Pr*t*19 | *lf*r*qm*d** | *ls*m**d
d*h*nt*d786 | Sh*r**lM*j*h*d | sw*bzy
H***lj*h*d | st*ntr*d*r00 | Tw**t_K*shm*r
*l_n*khb* | *lr*sh*d_Kh*l*d | Dr**s*lS*nn*h
st*ntr*d*r00 | *m*mb*nb*z | *b**_1986

Third Example–Determine Top Memes

The third real-world example entails determining what topics and subjects people are currently discussing. Identifying the key memes in a network can help you to, for example:

  • Understand how topics of discussion in a network change over time
  • Understand which topics tend to be discussed together
  • Identify the type of subjects you must anticipate discussing if you wish to join the network as a member

We will use NodeXL to conduct the analysis. We will determine what topics people are talking about in relation to a specific topic and who those people are. In our example, we will determine what other things most people talk about and mention in their tweets when they tweet about drugs, and we will determine the identity (usernames) of the people talking about drugs. Keep in mind that identifying current memes in a network is only one of NodeXL's abilities. The software can also download and format various types of social media data, draw network maps, conduct many of the same analyses we discussed in the UCINET examples, and do many other things. For ease of understanding, we will focus on only using NodeXL to download social media data and conduct the meme analysis. Check out the NodeXL website for instructions on doing other things with NodeXL. Install and ready Microsoft Excel 2007 or 2010 and NodeXL version 210 or above to follow along.

Create the Data Set

NodeXL makes it much easier to create a data set and download all the necessary data. You can use NodeXL to download all sorts of social media data such as the networks of certain Twitter accounts, YouTube searches, and Facebook profile information. For now, we will only download data about people talking about the topic of drugs on Twitter.


Cross-Reference
To use our dataset, download the file drugs_NodeXL.xls from the website. Note that if you decide to create a new file using the drug topic search, your analysis will look different because you will be analyzing tweets from a different time period. NodeXL only analyzes tweets from a 7-day period from the time of analysis. We are doing our analysis months before you do yours, so we are analyzing a very different set of tweets.

Feel free to download data about any other topic. Follow these instructions to start:

1. Find the “NodeXL Excel Template” file on your computer and open it. You should be able to find it by going to Start, then Programs, and then the NodeXL folder. The Microsoft Excel program will open with the NodeXL template pre-loaded. On older or less powerful computers, loading times may be long.
2. Save the opened file as a new Excel workbook file.
3. Notice the ribbon/Table Tools heading that says “NodeXL” on the top of the Excel window. Click on it to reveal the NodeXL ribbon of options. In the ribbon, on the top left side, click on “Import” and then select “From Twitter Search Network.” A small menu window will appear. Notice the other types of data in the menu that you can download quickly and easily.
4. Near the top of the window, type in the search or topic term you wish to analyze. We are analyzing the term “drugs.”
5. Below the field where you entered the search term, select the options that determine what type of data you collect about the Twitter search term and how it is structured. Feel free to play around with the options. For now, keep the default options as they are. Under “Add an edge for each,” we checked all the options except “Follows relationship (slower).” We limited the search to a hundred people. Make sure to check the box next to “Add a Tweet column to the Edges worksheet” and to “Expand URLs in tweets (slower).” For the “Your Twitter account” option, we selected “I don't have a Twitter account.”
6. After selecting the options, click “Ok.” It may take some time but eventually you will see your Excel workbook fill up with information about your desired Twitter search term. A pop-up may appear asking you if it is ok to shut off text wrapping. In the pop-up, click “Yes” to continue with the data importing.

Analyze the Network

Analyzing the network to determine the top memes in this case involves identifying the terms associated with your search term. Associated terms refer to the words and phrases people mention in the Tweets that contain the desired search term. The number of times people mention a certain word in the same tweet that contains your search term determines that word's ranking. The following steps describe how to conduct the analysis:

1. After the workbook is filled with data, select the analysis you want to conduct. In the ribbon menu, near the top right of the Excel window, click on “Graph Metrics” and then select the similar looking “Graph Metrics” option in the drop-down menu. A new window should pop up showing you a list of analyses you can conduct.
2. In the new window, deselect “Overall graph metrics” and select “Twitter search network top items.” Then click on “Calculate Metrics.”
3. A new worksheet should open showing all the results. Scroll through the worksheet to see the results. You should see the top ten people replied-to on Twitter from the people who tweeted the word “drugs.” You will also see the top ten people mentioned, the top ten URLs, and the top ten hashtags (without the # character in front of the words). The top replied-to and people mentioned tell you about the people in the Twitter network that are tweeting the word “drugs.” The top URLs and hashtags tell you what other things or memes people are talking about when they mention the word “drugs” on Twitter. See Table 5.3 for the list of top ten hashtags that our analysis revealed.
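
The idea behind ranking associated terms can be sketched in a few lines of Python. This is only an illustration of counting hashtags that share a tweet with the search term; it is not NodeXL's implementation, and the function name and sample tweets are hypothetical.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(tweets, search_term, limit=10):
    """Rank hashtags by how many tweets mention them with the search term."""
    counts = Counter()
    for text in tweets:
        if search_term.lower() not in text.lower():
            continue  # only tweets containing the search term count
        # Count each hashtag once per tweet, without the # character.
        counts.update({tag.lower() for tag in HASHTAG.findall(text)})
    return counts.most_common(limit)


# Hypothetical tweets, for illustration only.
sample_tweets = [
    "Say no to drugs #health #drugs",
    "New cold remedy, no drugs needed #health",
    "Unrelated post #health",
]
ranking = top_hashtags(sample_tweets, "drugs")
```

The third tweet is ignored because it does not contain the search term, even though it carries a hashtag that appears elsewhere.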

Table 5.3 Top ten hashtags from analysis on Twitter search term “drugs”

Rank | Hashtag (without the # in front)
1 | myxfactorsobstory
2 | imthepersonwho
3 | drugs
4 | bewhatiwannabe
5 | cough
6 | cold
7 | shrugs
8 | thanksmom
9 | treasurehunting
10 | real

Most of the hashtags found seem irrelevant to security and likely are. You will need to use NodeXL in a much more targeted way to acquire relevant information and analysis. We urge you to play around with NodeXL to see how you can better focus your analysis.

This completes the social network analysis part of the book. Chapter 6 continues with other types of analyses.


Note
You can open the UCINET file in NodeXL by converting the UCINET .##h file into a DL file. First, in the UCINET home screen, select Data ⇒ Export ⇒ DL File, and then select the .##h file you want to convert. Next, in the NodeXL Excel file, select the NodeXL tab, choose Import ⇒ From UCINET Full Matrix DL File, and select the appropriate DL file.

Summary

  • Use social network analysis to map social networks exhibited on social media and other venues, and reveal the identity of key influencers.
  • A social network is made up of links (relationships) between nodes (individuals).
  • Links can be unidirectional, where person A talks to person B but person B does not talk back. Links can also be bidirectional or reciprocal, where person A and person B talk to each other.
  • Social networks can be represented as a matrix, which looks like a spreadsheet, or graphically in a network map.
  • A key influencer spreads memes, ideas, and concepts that spur changes in behavior, to the most people in a network with the least constraints.
  • Three SNA algorithms are effective at finding key influencers in networks:
      • Centrality (Bonacich's approach) assumes an individual's influence is a function of how many people he links to and, depending on the nature of the network, how many people his links link to. Sometimes it is better to have links with people who have no other links, so he can monopolize communication with them. Sometimes it is better to have links with people who have other links, so his message can spread beyond his direct links.
      • Closeness (Eigenvectors) assumes the more links that people in an individual's network have, the more likely it is for the individual's message and influence to spread to people beyond her network, and thereby increase her influence.
      • Betweenness (Freeman's) assumes that an individual has more influence if he is the ultimate middleman for the people he links to and controls what messages those people get.
  • SNA software tools enable you to visualize and analyze networks. They vary widely in price and features. We recommend UCINET, which features many analytical algorithms, and NodeXL, which can easily download and format social media data and determine what topics people are talking about online.
  • Conducting SNA involves creating the data set, visually mapping the network, and analyzing the network.
  • Creating the data set entails:
      1. Collecting Twitter data
      2. Converting network data into matrix format
      3. Entering the data into SNA software
  • Visually mapping the network entails:
      1. Creating the initial map
      2. Configuring the map
  • Analyzing the network entails:
      1. Running the algorithms
      2. Comparing the results
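The matrix representation and the influence measures summarized above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the routines UCINET runs: a plain link count stands in for the simplest case of degree-based centrality (Bonacich's weighting is not implemented), a basic power iteration stands in for the eigenvector measure, and betweenness uses Brandes' shortest-path method for unweighted networks. All function names and the five-person network are hypothetical.

```python
from collections import deque

def adjacency_matrix(nodes, links, reciprocal=True):
    """Return a 0/1 matrix; row i, column j is 1 when node i links to node j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in links:
        matrix[index[src]][index[dst]] = 1
        if reciprocal:  # bidirectional link: also record the reverse direction
            matrix[index[dst]][index[src]] = 1
    return matrix

def degree(matrix):
    """Number of outgoing links per node (the simplest centrality count)."""
    return [sum(row) for row in matrix]

def eigenvector_scores(matrix, rounds=200):
    """Power iteration; assumes reciprocal links and a connected network."""
    n = len(matrix)
    scores = [1.0] * n
    for _ in range(rounds):
        # Adding the node's own score keeps the iteration from oscillating
        new = [scores[i] + sum(matrix[i][j] * scores[j] for j in range(n))
               for i in range(n)]
        top = max(new)
        scores = [value / top for value in new]
    return scores

def betweenness(matrix):
    """Count shortest paths passing through each node (ordered pairs)."""
    n = len(matrix)
    score = [0.0] * n
    for s in range(n):
        order, preds = [], [[] for _ in range(n)]
        paths = [0] * n
        dist = [-1] * n
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in range(n):
                if not matrix[v][w]:
                    continue
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = [0.0] * n
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Made-up network: A-B, B-C, C-D, C-E, all reciprocal
nodes = ["A", "B", "C", "D", "E"]
m = adjacency_matrix(nodes, [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")])
print(degree(m))       # [1, 2, 3, 1, 1]
print(betweenness(m))  # [0.0, 6.0, 10.0, 0.0, 0.0]
print([round(s, 2) for s in eigenvector_scores(m)])
```

In this small example person C ranks first on all three measures, while B outranks A, D, and E on betweenness because B is the only route between A and the rest of the network. Because links are reciprocal, each pair of people is counted in both directions, so the betweenness values are twice what a count over unordered pairs would give.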

 

 

Notes

1. Thompson, C. (2008) “Is the Tipping Point Toast?” Fast Company. Accessed: 29 June 2012. http://www.fastcompany.com/magazine/122/is-the-tipping-point-toast.html

2. Hanneman, R. and Riddle, M. (2005) Introduction to Social Network Methods. Riverside, CA: University of California, Riverside. Accessed: 29 June 2012. http://www.faculty.ucr.edu/~hanneman/nettext/
