Basic concepts of Social Network Analysis
Antonio Zinilli
National Research Council (CNR)
Rome, November 20-21, 2017
Objectives for Today
• Understand what network analysis is
• Introduce basic concepts
• Introduce main network measures
• Overview methodological approaches
What Are Networks?
• Networks are patterns of relationships that connect
individuals, institutions, or objects (or leave them
disconnected).
• EXAMPLES
• Individuals’ co-memberships in organizations
• Relationship between countries
When to Study Networks?
• It is possible applied network analysis tecniques in
more contexts. But the question we want to ask is:
when in the network aspect of phenomenon
particularly pertinent to the social dynamics that
matter to us?
• Network analysis tends to place a strong emphasis
on the relationship (or “the dyad”) as a unit of
analysis.
Graphs
Social networks can be represented as graphs
Graphs are made up of nodes (i.e., actors, cities, organizations, articles etc.)
that are connected by links (i.e., relationships, membership, citations etc.).
Types of Links
Undirected vs. directed links
Undirected links, identified with a simple straight line, are used when
there is a symmetric relationship:
E. g. Facebook has always used a symmetric model, if you add someone
as a friend they have to add you as a friend as well.
Directed links, identified with arrows, are used when there is an
asymmetry in a relationship:
E. g. On Twitter you can “follow” someone else without them
following you back.
Dichotomous vs. Valued Links
Dichotomous vs Valued Links
Dichotomous Links: either a link exists or it doesn’t (e.g. if we are friends
or we are not friends, there is a collaboration or not etc..) :
Valued Links: the links vary on the based of a weight (strength) (e.g. our
friendship may be strong or weak, the number of times that each pair
of countries cooperate):
Main parts of a graph
• Component: all nodes that assemble a connected
subgraph within a network:
main component is the largest component within a
network;
minor component is a component that is smaller than
the main component. Usually there are more minor
components.
• Isolate: a node that has no links to the other
nodes within the network
Main parts of a graph
MINOR ISOLATE
COMPONENT
MAJOR COMPONENT
Main Graph Implementation Strategies
• Edge List
• Adjacency Matrix
10
Edge List
Edge list: an unordered list of all edges in the graph
A A A B B C E E E G
B
A C
B E F G C D F G D D
D
F
E
11
Matrices
The most basic matrix is an adjacency matrix: an nxn matrix where:
• the nondiagonal entry aij is the number of edges joining vertex i and
vertex j (or the weight of the edge joining vertex i and vertex j). 1
indicates the presence of a link, while a 0 indicates the absence of a
link.
• the diagonal entry akk corresponds to the number of loops (self-
connecting edges) at vertex k. Usually loops are not counted.
A B C D E F G B
A 0 1 0 0 1 1 0 A C
B 1 0 1 0 0 0 1
C 0
1 0 1 0 0 0
G
M = D 0 0 1 0 1 0 1
E 1 0 0 1 0 1 1
F 1 0 0 0 1 0 0
D
G 0 1 0 1 1 0 0 F
E
Matrices
Austria France Germany Italy A 1 indicates
Austria 0 0 1 0 the presence of
France 0 0 0 1 a link, while a 0
Germany 1 0 0 0 indicates the
Italy 0 1 0 0 absence of a
link.
Symmetric Matrices
Austria France Germany Italy If matrices are
Austria 0 symmetric, they
may be
France 0 0
represented by
Germany 1 0 0 upper or lower
Italy 0 1 0 0 triangle only.
One-Mode Network
• Network analysis typically involves only one mode.
A mode is a class of nodes in a network.
Two-Mode Networks
Mode 1 Mode 2
People Events
Students Universities
Countries Projects or Programmes
Example: Two-mode data have countries and research programmes.
France and Italy collaborate at the same programme.
Basic Network Statistics
• Path
• Geodesic distance
• Density
• Degree centrality
• Closeness centrality
• Betweenness centrality
• Clustering Coefficient
• Eigenvector centrality
Path
F
A
G
B E
H
C
D
• ABEDHG is a path from A to G. There are multiple paths
from A to G.
• Path length is the number of steps in a path. The path length
of ABEDHG is equal to 5.
Geodesic distance
• Geodesic distance is the shortest path from one node to
another node
• The shortest path is the path that achieves that distance.
• The average network diameter is the average of shortest
path lengths over all pairs of nodes in a network.
Geodesic distance: an example
A F
G
B E
C H
D
ABEG is the geodesic from A to G
Density
• Density is a property of a network.
• The proportion of links in a network relative to the total
number possible.
Density: an example
In this example we have 3 links and 4 vertices
Network Density:
A C
Actual connections
Potential connections
B D
Potential connections:
n(n − 1)
PC =
2
Potential connections = [(4 ( 4-1))/2] = 12/2 = 6
Density: 3/6=0,5
This graph has one half of all possible links.
Degree
• Degree is a property of a node. The degree of a node is
equal to the number of links that it has.
A B
C
• B has a degree of 3. G
D
F
E
The average degree of a graph is given by
g
1
k=
g
∑ k ( ni )
i =1
Degree distribution
• A degree distribution a property of a network.
• A degree distribution is the number of nodes of a network that have
each degree level.
• A degree distribution may be a good way of summarizing the activity of
nodes in a network. This measure provides a first indication on the
importance of a node; important nodes are those that have a greater
influence to the flow of information in a network.
• May be a good way of comparing networks to one another.
Indegree and Outdegree
• Directed networks only
• Indegree: The number of links that a node receives from other
nodes
• Outdegree: The number of links that a node sends to other nodes
• What is B’s indegree? Answer: 4
• What is B’s outdegree? Answer: 2
B
Closeness centrality
Closeness is based on the length of the average shortest path
between a vertex and all vertices in the graph
Closeness Centrality:
−1
N
Cc (i) = ∑ d(i, j)
j=1
Normalized Closeness Centrality
'
C (i ) = (CC (i )) /( N − 1)
C
Closeness centrality: an example
A B
C
G
D
F
E
−1
N
∑ d ( A, j )
d(AB)+d(AC)+d(AD)+d(AE)+d(AF)+d(AG)
−1
1+ 2 + 2 +1+1+ 2
−1
9
−1
Cc ( A) =
' j =1
=
= = 6 = 0.66
N −1 N −1 6
Betweenness centrality
Betweenness centrality indicates the extent to which a vertex lies on paths
between other vertices.
paths between j and k
that pass through i
CB (i) = ∑ g jk (i) /g jk
j<k
all paths between j and k
Betweenness centrality: an example
A B C D E
In this simple example there are no alternate paths.
A lies between no two other vertices
B lies between A and 3 other vertices: C, D, and E - (AC), (AD), (AE)
C lies between 4 pairs of vertices: A, B, D, E - (A,D),(A,E),(B,D),(B,E)
D lies between E and 3 other vertices: A, B, and C - (AE), (BE), (CE)
E lies between no two other vertices
We can conclude that C gets full credit.
29
Clustering coefficient
• Clustering coefficient is a measure of the degree to
which nodes in a graph tend to cluster together.
• We can speak about:
• Local Clustering coefficient
• Global Clustering coefficient
Local Clustering coefficient
The local clustering coefficient is defined as the ratio of the observed connections
between all neighbours, and all possible connections between the neighbours
bi
Ci =
n!
k !( n − k )!
with bi indicates the number of observed links between all neighbours of a specific
node; Ki indicates all possible connections between the neighbours.
31
Example
A B
C
G
D
E
2 2 2 2
Ci = = = = = 0, 2
n! 5! 5! 120
k !( n − k )! 2!(5 − 2)! 2!(5 − 2)! 12
Global Clustering coefficient
The global clustering of a graph is given by the
average of all local clustering coefficients
_ g
1
C=
g
∑C
i =1
i
Eigenvector centrality
Eigenvector: is a measure of centrality that takes into
account the centrality of other nodes to which a
node is connected.
1
Ei = ∑ anni
λ j <k
Other Measures of centrality
There are a large number of other possible measures
of centrality.
K-star, bridge, Transitivity etc.
Usually, these different measures are highly
correlated
More information
• Usually, measures of centrality are used as
independent variable.
• Usually the network ties are used as dependent
variable.
PAJEK
UCINET
GEPHI
Textbooks
• Hanneman & Riddle (2005) Introduction to Social
Network Methods, available at
http://faculty.ucr.edu/~hanneman/nettext/
• Wasserman & Faust (1994): Social Network
Analysis – Methods and Applications, Cambridge:
Cambridge University Press.