Lecture 8:
Graph Data Structures
Graphs
Graphs are one of the unifying themes of computer science.
A graph G = (V, E) is defined by a set of vertices V, and
a set of edges E consisting of ordered or unordered pairs of
vertices from V.
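As a minimal sketch of this definition, a graph can be written down directly as a vertex set and an edge set. The names `V`, `E_directed`, and `E_undirected` are illustrative choices, not part of the lecture.

```python
# A graph G = (V, E) as plain Python sets (illustrative names).
V = {1, 2, 3, 4}

# Directed edges are ordered pairs: (1, 2) and (2, 1) are different edges.
E_directed = {(1, 2), (2, 3), (3, 1)}

# Undirected edges are unordered pairs, modeled here as frozensets.
E_undirected = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}

# Every edge must join vertices drawn from V.
assert all(x in V and y in V for (x, y) in E_directed)
assert all(e <= V for e in E_undirected)

print((2, 1) in E_directed)               # False: direction matters
print(frozenset({2, 1}) in E_undirected)  # True: order is irrelevant
```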
Road Networks
In modeling a road network, the vertices may represent the
cities or junctions, certain pairs of which are connected by
roads/edges.
[Figure: road network graph. Vertices: cities (Stony Brook, Green Port, Orient Point, Riverhead, Shelter Island, Montauk, Islip, Sag Harbor). Edges: roads.]
Electronic Circuits
An electronic circuit can be modeled as a graph, with junctions
as vertices and components as edges.
vertices: junctions
edges: components
Other Graphs/Networks
• Social networks
• The World Wide Web
• Control flow within a computer program
• Pairwise similarities between items
Flavors of Graphs
The first step in any graph problem is determining which
flavor of graph you are dealing with.
Learning to talk the talk is an important part of walking the
walk.
The flavor of graph has a big impact on which algorithms are
appropriate and efficient.
Directed vs. Undirected Graphs
A graph G = (V, E) is undirected if edge (x, y) ∈ E implies
that (y, x) is also in E.
[Figure: an undirected graph (left) and a directed graph (right)]
Road networks between cities are typically undirected.
Street networks within cities are almost always directed
because of one-way streets.
Most graphs of graph-theoretic interest are undirected.
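The definition above translates into a direct check: a set of ordered pairs describes an undirected graph exactly when it is closed under reversal. The function name `is_undirected` and the sample edge lists are hypothetical.

```python
def is_undirected(edges):
    """Return True if every edge (x, y) has its reverse (y, x) in the set."""
    edge_set = set(edges)
    return all((y, x) in edge_set for (x, y) in edge_set)

two_way_roads = [(1, 2), (2, 1), (2, 3), (3, 2)]
one_way_street = [(1, 2), (2, 1), (2, 3)]   # no way back from 3 to 2

print(is_undirected(two_way_roads))   # True
print(is_undirected(one_way_street))  # False
```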
Weighted vs. Unweighted Graphs
In weighted graphs, each edge (or vertex) of G is assigned a
numerical value, or weight.
[Figure: an unweighted graph (left) and a weighted graph with numeric edge labels (right)]
The edges of a road network graph might be weighted with
their length, drive-time or speed limit.
In unweighted graphs, there is no cost distinction between
various edges and vertices.
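One simple way to attach weights, sketched here under the assumption that weights live on edges only, is a dictionary keyed by edge. The road data and the helper `path_cost` are made up for illustration.

```python
# Edge weights stored in a dict keyed by edge (hypothetical road data, in km).
road_length_km = {
    ("A", "B"): 5,
    ("B", "C"): 7,
    ("A", "C"): 12,
}

def path_cost(path, weights):
    """Sum the weights along consecutive vertex pairs of an undirected path."""
    total = 0
    for x, y in zip(path, path[1:]):
        # Look the edge up in either orientation, since roads are two-way here.
        total += weights[(x, y)] if (x, y) in weights else weights[(y, x)]
    return total

print(path_cost(["A", "B", "C"], road_length_km))  # 12
print(path_cost(["C", "A"], road_length_km))       # 12
# In an unweighted graph every edge costs the same, so only the hop count matters.
print(len(["A", "B", "C"]) - 1)                    # 2
```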
Simple vs. Non-simple Graphs
Certain types of edges complicate working with graphs. A
self-loop is an edge (x, x) involving only one vertex.
An edge (x, y) is a multi-edge if it occurs more than once in
the graph.
[Figure: a simple graph (left) and a non-simple graph (right)]
Any graph which avoids these structures is called simple.
In a friendship graph, a self-loop raises the question: are you your own friend?
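Both complications can be detected by counting normalized edges. This is a sketch for undirected edge lists only; the name `is_simple` is an assumption for illustration.

```python
from collections import Counter

def is_simple(edges):
    """Return True if an undirected edge list has no self-loops or multi-edges."""
    # Normalize each undirected edge so (x, y) and (y, x) count as the same edge.
    counts = Counter(tuple(sorted(e)) for e in edges)
    has_self_loop = any(x == y for (x, y) in counts)
    has_multi_edge = any(c > 1 for c in counts.values())
    return not (has_self_loop or has_multi_edge)

print(is_simple([(1, 2), (2, 3)]))   # True
print(is_simple([(1, 2), (2, 1)]))   # False: multi-edge
print(is_simple([(1, 2), (3, 3)]))   # False: self-loop
```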
Sparse vs. Dense Graphs
Graphs are sparse when only a small fraction of the possible
number of vertex pairs actually have edges defined between
them.
[Figure: a sparse graph (left) and a dense graph (right)]
Graphs are usually sparse due to application-specific constraints.
Road networks must be sparse because each road junction joins
only a few roads.
Typically dense graphs have a quadratic number of edges
while sparse graphs are linear in size.
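To make "fraction of the possible vertex pairs" concrete, the sketch below computes the density of a simple undirected graph from its vertex and edge counts. The function name and the sample sizes are illustrative.

```python
def density(n, m):
    """Fraction of the n*(n-1)/2 possible undirected edges that are present."""
    possible = n * (n - 1) // 2
    return m / possible if possible else 0.0

# A complete graph on 5 vertices has all 10 possible edges.
print(density(5, 10))        # 1.0

# A graph with a linear number of edges (here m = 3n) uses a tiny fraction.
print(density(1000, 3000) < 0.01)   # True
```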
Cyclic vs. Acyclic Graphs
An acyclic graph does not contain any cycles. Trees are
connected acyclic undirected graphs.
[Figure: a cyclic graph (left) and an acyclic graph (right)]
Directed acyclic graphs are called DAGs. They arise naturally
in scheduling problems, where a directed edge (x, y) indicates
that x must occur before y.
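A schedule that respects every "x before y" constraint is a topological order of the DAG. The sketch below uses Kahn's algorithm, a standard technique that this section does not itself present; vertices are assumed to be numbered 0..n-1, and the function returns None when the constraints contain a cycle.

```python
from collections import deque

def topological_order(n, edges):
    """Return an order of vertices 0..n-1 respecting every edge (x, y), or None if cyclic."""
    successors = [[] for _ in range(n)]
    indegree = [0] * n
    for x, y in edges:
        successors[x].append(y)
        indegree[y] += 1

    # Vertices with no unmet prerequisites can be scheduled immediately.
    ready = deque(v for v in range(n) if indegree[v] == 0)
    order = []
    while ready:
        x = ready.popleft()
        order.append(x)
        for y in successors[x]:
            indegree[y] -= 1
            if indegree[y] == 0:
                ready.append(y)
    # If some vertex was never released, the graph contains a directed cycle.
    return order if len(order) == n else None

print(topological_order(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))  # [0, 1, 2, 3]
print(topological_order(3, [(0, 1), (1, 2), (2, 0)]))          # None
```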
Data Structures for Graphs: Adjacency Matrix
There are two main data structures used to represent graphs.
We assume the graph G = (V, E) contains n vertices and m
edges.
We can represent G using an n × n matrix M, where element
M[i, j] is, say, 1 if (i, j) is an edge of G, and 0 if it isn’t.
Edge lookups take constant time, but the matrix may use
excessive space for graphs with many vertices and relatively
few edges.
Can we save space if (1) the graph is undirected, or (2) the
graph is sparse?
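A minimal sketch of the matrix representation, assuming vertices are numbered 0..n-1 and using nested Python lists. It also illustrates one answer to question (1): an undirected matrix is symmetric, so only the entries above the diagonal are needed. The function name is hypothetical.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix for vertices numbered 0..n-1."""
    M = [[0] * n for _ in range(n)]
    for i, j in edges:
        M[i][j] = 1
        if not directed:
            M[j][i] = 1   # an undirected edge is stored in both directions
    return M

M = adjacency_matrix(4, [(0, 1), (0, 2), (3, 1)])
print(M[0][1], M[1][0], M[2][3])   # 1 1 0

# An undirected matrix is symmetric, so the entries above the diagonal
# determine it: n*(n-1)/2 cells instead of n*n.
n = len(M)
upper = [M[i][j] for i in range(n) for j in range(i + 1, n)]
print(len(upper))                  # 6
```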
Adjacency Lists
An adjacency list consists of an n × 1 array of pointers, where
the ith element points to a linked list of the edges incident on
vertex i.
[Figure: a graph on vertices 1 to 5 and its adjacency lists]
1: 2, 5
2: 1, 5, 4, 3
3: 2, 4
4: 5, 2, 3
5: 1, 2, 4
To test if edge (i, j) is in the graph, we search the ith list for
j, which takes O(d_i) time, where d_i is the degree of the ith
vertex. Note that d_i can be much less than n when the graph is sparse.
If necessary, the two copies of each edge can be linked by a
pointer to facilitate deletions.
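The sketch below builds adjacency lists for a small undirected graph and tests for an edge by scanning one list. It assumes Python lists stand in for the linked lists described above, with vertices numbered 1..n; the function names are illustrative.

```python
def adjacency_lists(n, edges):
    """Adjacency lists for an undirected graph on vertices 1..n (index 0 unused)."""
    adj = [[] for _ in range(n + 1)]
    for x, y in edges:
        adj[x].append(y)   # y appears in x's list ...
        adj[y].append(x)   # ... and x appears in y's list
    return adj

def has_edge(adj, i, j):
    """Scan the ith list for j; takes time proportional to the degree of i."""
    return j in adj[i]

edges = [(1, 2), (1, 5), (2, 5), (2, 4), (2, 3), (3, 4), (4, 5)]
adj = adjacency_lists(5, edges)
print(adj[2])               # [1, 5, 4, 3]
print(has_edge(adj, 1, 5))  # True
print(has_edge(adj, 1, 3))  # False
```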
Tradeoffs Between Adjacency Lists and Adjacency Matrices
Comparison                          Winner
Faster to test if (x, y) exists?    matrices
Faster to find vertex degree?       lists
Less memory on sparse graphs?       lists, (m + n) vs. (n^2)
Less memory on dense graphs?        matrices (small win)
Edge insertion or deletion?         matrices, O(1)
Faster to traverse the graph?       lists, (m + n) vs. (n^2)
Better for most problems?           lists
Both representations are very useful and have different
properties, although adjacency lists are probably better for
most problems.
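A rough way to see the memory tradeoff is to count storage cells for each representation. This is a back-of-the-envelope sketch: it assumes one cell per matrix entry and one cell per list header or list entry, and ignores the actual size of each cell.

```python
def matrix_cells(n):
    """Cells in an n x n adjacency matrix."""
    return n * n

def list_cells(n, m):
    """n list headers plus two list entries per undirected edge."""
    return n + 2 * m

n = 1000
sparse_m = 3 * n                 # linear number of edges
dense_m = n * (n - 1) // 2       # every possible edge

print(matrix_cells(n), list_cells(n, sparse_m))  # 1000000 7000
print(matrix_cells(n), list_cells(n, dense_m))   # 1000000 1000000
```

On the complete graph the cell counts tie. A matrix cell needs only one bit, while a list entry holds a vertex number and a pointer, which is one plausible source of the matrix's small win on dense graphs.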
Graph Representation
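import networkx as nx

myGraph = nx.Graph()                              # undirected simple graph
myGraph.add_nodes_from(range(1, 5))               # vertices 1, 2, 3, 4
myGraph.add_edges_from([(1, 2), (1, 3), (4, 2)])  # three undirected edges
The degree of a vertex counts the number of meaningful entries
in its adjacency list; in networkx it can be read as myGraph.degree[v].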
An undirected edge (x, y) appears twice in any adjacency-
based graph structure, once as y in x’s list, and once as x in
y’s list.
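To show the double storage and the degree count without relying on a library, here is a small hand-rolled adjacency structure. The class `Graph` and its fields `edges` and `degree` are a hypothetical sketch, not the networkx API and not necessarily the structure the lecture had in mind.

```python
class Graph:
    """Adjacency-list graph with an explicit per-vertex degree count (hypothetical sketch)."""

    def __init__(self, n, directed=False):
        self.directed = directed
        self.edges = {v: [] for v in range(1, n + 1)}   # adjacency lists
        self.degree = {v: 0 for v in range(1, n + 1)}   # entries in each list

    def insert_edge(self, x, y):
        self._insert_one(x, y)
        if not self.directed:
            self._insert_one(y, x)   # second copy of the undirected edge

    def _insert_one(self, x, y):
        self.edges[x].append(y)
        self.degree[x] += 1

g = Graph(4)
for x, y in [(1, 2), (1, 3), (4, 2)]:
    g.insert_edge(x, y)

print(g.edges[2])              # [1, 4]
print(g.degree)                # {1: 2, 2: 2, 3: 1, 4: 1}
print(sum(g.degree.values()))  # 6, twice the number of edges
```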