176 Handbook of Big Data
Extensions to the stochastic blockmodel include mixed membership models (Airoldi
et al., 2008) and degree-corrected stochastic blockmodels, which induce power-law
distributions on the degrees (Karrer and Newman, 2011).
11.2.3 General Latent Space Model
A final level of complexity is afforded by the general latent space model. Under this model,
the nodes of a network are embedded into a low-dimensional latent space, usually Euclidean,
and the probability of an edge between any two nodes is a function of their latent positions.
For example, in the latent distance model, the probability of a tie increases as the (Euclidean)
distance between the latent positions decreases (Hoff et al., 2002). This captures both
reciprocity and transitivity in the formation of network edges: since distances are symmetric,
if the probability of an edge between i and j is high, then the probability of an edge between
j and i will also be high, and the triangle inequality suggests that if i and j are close and j
and t are close, then i and t are going to be close. Reciprocity and transitivity are properties
that are thought to be important in real-world networks but are impossible to incorporate
into the Erdős–Rényi–Gilbert model or the stochastic blockmodel. The inherent symmetry
of the distance model rules out the possibility that certain nodes have a greater affinity for
ties than others, and to circumvent this limitation, the general latent space model allows for
asymmetric functions of the latent positions as well as for node- and dyad-specific covariates
to affect the probability of tie formation. An example of a latent space model with additive
and multiplicative functions of the latent positions as well as such covariates is described in
detail below.
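Before turning to that example, the latent distance mechanism described above can be
illustrated with a short Python sketch. It assumes a logistic link in which the log-odds of
a tie equal an intercept minus the Euclidean distance between latent positions, in the spirit
of Hoff et al. (2002). The function name latent_distance_probs, the intercept value, and the
standard normal latent positions are illustrative assumptions, not part of the text.

```python
import numpy as np

def latent_distance_probs(z, intercept=1.0):
    """Tie probabilities under a logistic latent distance model (a sketch).

    z         : (n, d) array of latent positions (hypothetical input)
    intercept : log-odds of a tie at zero distance (assumed value)
    """
    # Pairwise Euclidean distances between latent positions.
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Log-odds decrease with distance; the logistic link is an assumption.
    probs = 1.0 / (1.0 + np.exp(-(intercept - dist)))
    np.fill_diagonal(probs, 0.0)  # exclude self-ties
    return probs

rng = np.random.default_rng(0)
z = rng.normal(size=(30, 2))               # 30 nodes in a 2-D latent space
P = latent_distance_probs(z)
A = (rng.random(P.shape) < P).astype(int)  # directed ties drawn independently
```

Because the distance matrix is symmetric, P[i, j] equals P[j, i], which is the source of the
reciprocity discussed above. It is also why this simple form cannot give some nodes a greater
overall affinity for ties than others.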
Consider an n × n asymmetric adjacency matrix A, representing a directed graph, and let
X be an n × n × p array of observed characteristics. Each n × n slice of X is either constant
in the rows (representing a fixed effect that contributes to the propensity to send ties in
the network, or sender effect); constant in the columns (representing a fixed effect that
contributes to the propensity to receive ties in the network, or receiver effect); or neither,
representing dyadic effects. We can model $a_{ij}$ as the indicator $\mathbf{1}\{s_{ij} > 0\}$ of the
event $s_{ij} > 0$, where $s_{ij} = X_{ij\cdot}\theta + \alpha_i + \beta_j + u_i^t v_j + \epsilon_{ij}$,
$X_{ij\cdot}$ is the p-dimensional vector of covariates associated with the relationship between
nodes i and j, $\alpha_i$ is an additive sender effect, $\beta_j$ is an additive receiver effect, and
$u_i^t v_j$ is a multiplicative effect (as it is the projection of $u_i$ in the direction of $v_j$ in
the latent space) that captures similarity between nodes i and j (Hoff, 2005). This
model is a generalization of the social relations model of Warner et al. (1979). Reciprocity
can be introduced into the model by allowing for the error terms $(\epsilon_{ij}, \epsilon_{ji})$ to be
correlated.
Here $X_{ij\cdot}$ might include sender-specific information, receiver-specific information, or
dyadic information. The additive latent effects $\alpha_i$ and $\beta_j$ contain information about
the affinity of nodes i and j to send and receive ties in general, while the multiplicative
effect $u_i^t v_j$ contains the information on the latent similarity of the two nodes. In
particular, if the nodes are close in the latent space ($u_i^t v_j > 0$), then the probability of
a tie is increased and if they are far apart ($u_i^t v_j < 0$), then it is decreased.
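The model above can be simulated directly. The following Python sketch draws a directed
adjacency matrix by thresholding the latent score (covariate term plus sender effect, receiver
effect, multiplicative effect, and error) at zero. Several details are assumptions made only
for illustration: the errors are taken to be standard Gaussian, the within-dyad correlation
rho is induced by mixing a noise matrix with its transpose, and the function name
simulate_latent_space_network and all parameter values are hypothetical.

```python
import numpy as np

def simulate_latent_space_network(X, theta, alpha, beta, U, V, rho=0.0, rng=None):
    """Draw A with a_ij = 1{s_ij > 0} for the additive and multiplicative model.

    X: (n, n, p) covariates; theta: (p,); alpha, beta: (n,); U, V: (n, d).
    rho: assumed correlation between the errors of a dyad (eps_ij, eps_ji).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = alpha.shape[0]
    # Unit-variance Gaussian errors with corr(eps_ij, eps_ji) = rho,
    # built as a * Z + b * Z' with a^2 + b^2 = 1 and 2ab = rho.
    Z = rng.normal(size=(n, n))
    a = 0.5 * (np.sqrt(1 + rho) + np.sqrt(1 - rho))
    b = 0.5 * (np.sqrt(1 + rho) - np.sqrt(1 - rho))
    eps = a * Z + b * Z.T
    # Latent score: covariates + sender + receiver + multiplicative + error.
    S = X @ theta + alpha[:, None] + beta[None, :] + U @ V.T + eps
    A = (S > 0).astype(int)
    np.fill_diagonal(A, 0)  # no self-ties
    return A, S

rng = np.random.default_rng(1)
n, p, d = 40, 2, 2
x_send = rng.normal(size=n)
X = np.empty((n, n, p))
X[:, :, 0] = x_send[:, None]          # sender covariate: one value per row
X[:, :, 1] = rng.normal(size=(n, n))  # dyadic covariate
theta = np.array([0.5, -0.3])
alpha = rng.normal(scale=0.5, size=n)
beta = rng.normal(scale=0.5, size=n)
U = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))
A, S = simulate_latent_space_network(X, theta, alpha, beta, U, V, rho=0.5, rng=rng)
```

Setting rho above zero makes the two directions of a dyad more likely to agree, which is one
way to realize the correlated error terms mentioned in the text.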
The third panel of Figure 11.1 displays a directed network generated from the latent space
model described above (without covariates and with a one-dimensional latent space). The
two sets of nodes are colored according to the sign of $\alpha_i$. The emergence of the two
clusters is due to the multiplicative effect $\alpha_i \alpha_j$: ties are more likely between
individuals for whom the signs of $\alpha_i$ match. This demonstrates the ability of this
model to capture stochastic blockmodel behavior. Each node has its own probability of
sending a tie to another node, which allows for much greater flexibility than the
blockmodel. The yellow nodes send out more ties to the blue nodes than they receive from
the blue nodes, due to the additional additive effect of $\alpha_i$ in the model, as nodes
with $\alpha_i > 0$ have a higher probability of sending out ties.
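The qualitative behavior described for this panel can be checked with a small simulation.
The sketch below is not the code used to produce Figure 11.1. It assumes a latent score made
up of an additive sender effect (the node's own scalar), a multiplicative effect (the product
of the two nodes' scalars), and standard Gaussian noise, with no receiver effect, and then
compares tie rates across groups defined by the sign of the scalar. The exact generative
settings of the figure are not given in the text, so the network size, the distribution of
the scalars, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
alpha = rng.normal(size=n)   # one scalar per node: sender effect and latent position

# Assumed latent score: additive sender effect + multiplicative effect + noise.
S = alpha[:, None] + np.outer(alpha, alpha) + rng.normal(size=(n, n))
A = (S > 0).astype(int)
np.fill_diagonal(A, 0)       # no self-ties

pos = alpha > 0                                 # group label: sign of alpha_i
off_diag = ~np.eye(n, dtype=bool)
same = (pos[:, None] == pos[None, :]) & off_diag
diff = pos[:, None] != pos[None, :]
pos_to_neg = pos[:, None] & ~pos[None, :]       # sender positive, receiver negative
neg_to_pos = ~pos[:, None] & pos[None, :]       # sender negative, receiver positive

rate_same = A[same].mean()
rate_diff = A[diff].mean()
rate_pos_to_neg = A[pos_to_neg].mean()
rate_neg_to_pos = A[neg_to_pos].mean()
print(f"within-group tie rate:  {rate_same:.2f}")
print(f"between-group tie rate: {rate_diff:.2f}")
print(f"positive -> negative:   {rate_pos_to_neg:.2f}")
print(f"negative -> positive:   {rate_neg_to_pos:.2f}")
```

Under these assumptions, ties within a sign group should be more frequent than ties between
groups, and nodes with a positive scalar should send more ties across groups than they
receive, matching the pattern described for the figure.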