Complex Network Report
Krzysztof Maciejewski
June 3, 2025
1 Introduction
For my project I chose a weighted directed network representing the neural network of Caenorhabditis
elegans, a species of nematode worm. I decided to use this graph because it has a small number
of nodes (297), which allows for easy analysis without demanding large computing
resources. The whole data set consists of 2345 lines. Each line is composed of the source node, the
end node and the weight of the edge. My project was completed using Python and the NetworkX
library. The dataset can be found here: https://networkrepository.com/celegansneural.php
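As an illustration of the data format described above, the following minimal sketch parses "source target weight" lines into a weighted directed NetworkX graph. The three sample lines and the helper name load_graph are made up for the example and are not taken from the real dataset.

```python
import networkx as nx

# Three made-up lines in the "source target weight" format;
# the real data set has 2345 such lines.
sample_lines = [
    "1 2 1",
    "1 3 2",
    "3 2 1",
]


def load_graph(lines):
    """Parse 'source target weight' lines into a weighted directed graph."""
    return nx.parse_edgelist(
        lines,
        nodetype=int,
        data=(("weight", float),),
        create_using=nx.DiGraph,
    )


G = load_graph(sample_lines)
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

For a file on disk, nx.read_weighted_edgelist with create_using=nx.DiGraph should give the same kind of graph, assuming the file contains only whitespace-separated edge lines.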
2 Description of the network
2.1 Distances
To better understand the structure of the graph, I calculated the shortest path distances between all
node pairs using Dijkstra’s Algorithm, which is well-suited for directed, weighted graphs.
After that I extracted the distances between all reachable node pairs (excluding self-distances and
unreachable combinations). Using the obtained data, I was able to calculate the average shortest path
length, which turned out to be approximately 5.91.
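The procedure above can be sketched as follows on a small made-up graph. The edge list and the helper name reachable_distances are hypothetical, and the sketch assumes the edge weights are used directly as distances.

```python
import networkx as nx

# Hypothetical toy graph standing in for the real network.
G = nx.DiGraph()
G.add_weighted_edges_from([(1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0)])


def reachable_distances(graph):
    """Collect Dijkstra distances for all ordered pairs (s, t), s != t, where t is reachable from s."""
    distances = []
    for source, lengths in nx.all_pairs_dijkstra_path_length(graph, weight="weight"):
        for target, dist in lengths.items():
            if target != source:  # skip self-distances
                distances.append(dist)
    return distances


distances = reachable_distances(G)
average = sum(distances) / len(distances)
print(f"{len(distances)} reachable pairs, average shortest path length {average:.2f}")
```

Unreachable pairs never appear in the dictionaries returned by NetworkX, so they are excluded automatically.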
I also analyzed the distribution of these shortest path lengths. The histogram below shows how
often different path lengths occur in the network. The distribution is right-skewed, indicating that
while most node pairs are relatively close, a significant number of pairs require longer paths, possibly
due to the directed and modular nature of the network.
Figure 1: Distribution of the shortest path lengths between nodes in the graph.
2.2 Degree
Table 1: Degree statistics for the C. elegans neural network
Degree Type     Average   Standard Deviation   Variance
In-degree         7.90          10.34           106.83
Out-degree        7.90           6.81            46.36
Total degree     15.79          13.93           194.14
To analyze the connectivity structure of the C. elegans neural network, I computed the statistics for
the in-degree, out-degree, and total degree for each node.
The results, summarized in Table 1, show that both the average in-degree and out-degree are 7.90.
They must be equal in a directed network, because each edge adds one to the out-degree of its source
and one to the in-degree of its target, so both averages equal the number of edges divided by the
number of nodes (2345/297 ≈ 7.90). The total degree, which sums both in- and out-degrees for
each node, has an average of 15.79.
The standard deviation of the total degree is 13.93, which is close to the mean and indicates a wide
spread in node connectivity. The degree distribution plot (Figure 2) supports this point. Some neurons act as
hubs, with significantly higher connectivity than the average, while many others are sparsely connected.
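A minimal sketch of how such degree statistics can be computed is shown below. The toy graph and the helper name degree_stats are hypothetical, and the sketch assumes population (not sample) standard deviation and variance, which the report does not specify.

```python
import statistics

import networkx as nx

# Hypothetical toy graph standing in for the real network.
G = nx.DiGraph([(1, 2), (1, 3), (2, 3), (3, 1)])


def degree_stats(degree_view):
    """Return (mean, standard deviation, variance) of a NetworkX degree view."""
    values = [degree for _, degree in degree_view]
    return (
        statistics.mean(values),
        statistics.pstdev(values),     # population standard deviation (assumption)
        statistics.pvariance(values),  # population variance (assumption)
    )


in_stats = degree_stats(G.in_degree())
out_stats = degree_stats(G.out_degree())
total_stats = degree_stats(G.degree())  # in-degree + out-degree per node

for name, stats in [("In", in_stats), ("Out", out_stats), ("Total", total_stats)]:
    print(f"{name}-degree: mean={stats[0]:.2f} std={stats[1]:.2f} var={stats[2]:.2f}")
```

The mean in-degree and mean out-degree always coincide, and both equal the edge count divided by the node count.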
Figure 2: Distribution of the total degree of nodes in the graph.
2.3 Power Law Analysis
To investigate whether the degree distribution of the C. elegans neural network follows a power law, I
performed a power law fit on the total node degrees using the Python powerlaw library.
The fitting results are as follows:
• The power law exponent α is approximately 3.29, which is outside the range 2 ≤ α ≤ 3
typically associated with scale-free networks.
• The minimum value xmin from which the power law fit is applied is 18.
2.3.1 Interpretation of the results:
The value α = 3.29 is slightly higher than the upper bound typical for scale-free networks. I think that
this is influenced by the fact that this is a relatively small network which was recorded manually by
humans, which makes the data prone to mistakes and anomalies. The degree distribution therefore
does not perfectly follow a classical scale-free pattern. However, it does have a heavy tail, indicating
the presence of hubs.
The value xmin = 18 means that the power law fit applies only to nodes with a degree of 18 or more
(the tail of the distribution, which contains the hubs), while nodes with smaller degrees may follow a
different distribution.
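To make the roles of α and xmin concrete, the sketch below estimates α for a fixed xmin with the standard approximate maximum likelihood estimator for discrete data, α ≈ 1 + n / Σ ln(x_i / (xmin − 0.5)). This is a simplified stand-in and not the procedure of the powerlaw library used in the report, which also selects xmin itself. The degree values and the function name estimate_alpha are made up for the example.

```python
import math


def estimate_alpha(degrees, xmin):
    """Approximate MLE of the power law exponent for discrete data at or above xmin."""
    tail = [d for d in degrees if d >= xmin]  # only the tail enters the fit
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(d / (xmin - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum


# Made-up degree values; degrees below xmin are ignored by the estimator.
sample_degrees = [3, 7, 12, 18, 20, 25, 30, 50, 100]
alpha = estimate_alpha(sample_degrees, xmin=18)
print(f"alpha = {alpha:.2f}")
```

Because only values at or above xmin are used, a fit with xmin = 18 says nothing about how the lower-degree nodes are distributed.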
2.4 Clustering coefficient
Since my graph is directed and weighted, but clustering is usually computed on undirected graphs, I
converted it first to an undirected one.
1. Local clustering coefficient for each node: This shows how well a node’s neighbors are
connected to each other. For example, node 202 has a relatively high coefficient of 0.4035,
meaning that about 40% of the possible connections among its neighbors actually exist (each
such connection closes a triangle with node 202). In contrast, node 135 has a low value of
0.0714, indicating very sparse connectivity among its neighbors. Sample of local clustering
coefficients:
• Node 135: 0.0714
• Node 1: 0.3091
• Node 202: 0.4035
• Node 2: 0.1675
• Node 8: 0.2157
2. Average clustering coefficient: The average clustering coefficient of the network is 0.2924. So
on average, around 29% of the possible connections among a node’s neighbors are present. This
reflects a moderate level of local clustering in the network.
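3. Global clustering coefficient: The global clustering coefficient is 0.1807, which is lower than
the average local clustering. This is common in real-world networks: the global coefficient is
dominated by high-degree nodes, whose many neighbors are rarely all connected to each other,
whereas the average gives every node equal weight. Local clusters therefore exist, but they do
not form globally dense structures.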
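The three quantities above can be computed as in the following sketch. The toy graph is made up, and the sketch assumes that the "global clustering coefficient" corresponds to NetworkX's transitivity and that edge weights are ignored; the report does not state either choice.

```python
import networkx as nx

# Hypothetical directed toy graph: a triangle 1-2-3 plus a pendant node 4.
D = nx.DiGraph([(1, 2), (2, 3), (3, 1), (3, 4)])

# Clustering is computed on the undirected version, as described above.
U = D.to_undirected()

local = nx.clustering(U)             # dict: node -> local clustering coefficient
average = nx.average_clustering(U)   # mean of the local coefficients
global_cc = nx.transitivity(U)       # 3 * triangles / connected triples

for node, value in sorted(local.items()):
    print(f"Node {node}: {value:.4f}")
print(f"Average: {average:.4f}, global: {global_cc:.4f}")
```

The average weights every node equally, while transitivity weights every connected triple equally, so the two values generally differ.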
2.5 Largest connected component size
• Largest weakly connected component size: 297
• Largest strongly connected component size: 239
The largest weakly connected component includes all 297 nodes, indicating that the whole graph is
connected when edge directions are ignored. It makes sense that in biological structures such
as nervous systems, completely isolated islands of neurons don’t exist. The largest
strongly connected component contains 239 nodes (about 80% of the network), meaning that a
substantial portion of the network remains mutually reachable even when edge directions are
respected. This shows that most of the C. elegans neurons are part of a tightly integrated
signaling system.
In Figure 3, I highlighted the largest strongly connected component in blue.
Figure 3: Graph with Largest Strongly Connected Component Highlighted.
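A minimal sketch of how the two component sizes can be obtained is given below; the toy graph is made up for the example.

```python
import networkx as nx

# Hypothetical toy graph: a directed cycle 1 -> 2 -> 3 -> 1 with two extra nodes.
G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (3, 4), (5, 4)])

# Weak connectivity ignores edge directions; strong connectivity requires
# a directed path in both directions between every pair of nodes.
largest_wcc = max(nx.weakly_connected_components(G), key=len)
largest_scc = max(nx.strongly_connected_components(G), key=len)

print("Largest weakly connected component size:", len(largest_wcc))
print("Largest strongly connected component size:", len(largest_scc))
```

The node set largest_scc can then be used to color the corresponding nodes in a drawing of the graph.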
2.6 Degree correlation
• Degree assortativity coefficient: -0.163
A negative degree assortativity coefficient indicates that my network is disassortative. This means
that nodes with high degree tend to connect to nodes with low degree and vice versa. In the plot
below, where u and v denote the two endpoints of an edge, the horizontal streaks at high v degrees
for low u degrees (and vice versa) are consistent with low-degree nodes often being connected to
high-degree nodes.
Figure 4: Degree-Degree Scatter Plot.
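The coefficient and the degree pairs behind such a scatter plot can be obtained as sketched below. The toy graph is made up, and the sketch assumes NetworkX's default settings for directed graphs (out-degree of the source against in-degree of the target); the report does not state which degree types were used.

```python
import networkx as nx

# Hypothetical toy graph: a hub (node 0) linked in both directions to 4 leaves.
G = nx.star_graph(4).to_directed()

# For directed graphs the default compares the out-degree of the source
# with the in-degree of the target of each edge.
r = nx.degree_assortativity_coefficient(G)

# One (degree of u, degree of v) pair per edge u -> v, using total degree.
degree_pairs = [(G.degree(u), G.degree(v)) for u, v in G.edges()]

print(f"Degree assortativity coefficient: {r:.3f}")
```

A star is the extreme disassortative case, since every edge joins the hub to a leaf, so the coefficient is at its minimum here.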
2.7 Communities
To detect communities, I decided to use the Louvain Algorithm because it has low computational
complexity compared to many other community detection algorithms.
The algorithm identified a total of 7 communities. The number of nodes in each community is
as follows:
• Community 0: 12 nodes
• Community 1: 108 nodes
• Community 2: 53 nodes
• Community 3: 4 nodes
• Community 4: 82 nodes
• Community 5: 23 nodes
• Community 6: 15 nodes
To better understand the structure of the communities, I displayed them on a plot. This helped
me see that some communities found by the Louvain Algorithm are very tightly connected, whereas
others contain a large share of peripheral nodes attached by a single edge. Maximizing modularity
does not necessarily yield the most semantically meaningful partition, but this is the price paid
for the algorithm being very fast.
Figure 5: Louvain Community Detection.
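A minimal sketch of Louvain community detection with NetworkX follows. It assumes the louvain_communities function from networkx.algorithms.community (the report does not say which implementation was used), a made-up undirected toy graph, and an arbitrary seed. Louvain is randomized, so different seeds can give different partitions.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Hypothetical toy graph: two triangles joined by a single bridge edge.
G = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])

# The seed is an arbitrary choice to make the run reproducible.
communities = louvain_communities(G, seed=42)

for index, members in enumerate(communities):
    print(f"Community {index}: {len(members)} nodes")

# Modularity of the partition that the algorithm tried to maximize.
q = modularity(G, communities)
print(f"Modularity: {q:.3f}")
```

Every node belongs to exactly one community, so the community sizes add up to the number of nodes.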
2.8 Centralities
To analyze the importance of individual neurons, I computed several centrality measures:
• Degree centrality: Measures how connected a node is by counting the number of edges
attached to it, normalized by the number of other nodes in the network.
Top 5 nodes by Degree Centrality:
1. Node 45: 0.4527
2. Node 13: 0.2804
3. Node 3: 0.2703
4. Node 173: 0.2027
5. Node 126: 0.1993
These nodes represent the most connected neurons in the network. In particular, Node 45 stands
out with a centrality of 0.4527, meaning that its number of edges equals about 45% of the number
of other nodes in the network (a neighbor linked in both directions counts twice). This suggests
that Node 45 could play a key role in the graph, potentially acting as a hub or control center.
The histogram below shows the distribution of degree centrality across all nodes in the network.
Figure 6: Degree centrality distribution
The distribution is heavily right-skewed, with most nodes having low centrality values and only
a few having high centrality. This is consistent with many real-world complex networks, where
a small number of hubs dominate in terms of connectivity.
• Betweenness centrality: Quantifies how often a node appears on the shortest paths between
other nodes. High betweenness nodes may control communication.
Top 5 nodes by Betweenness Centrality:
1. Node 178: 0.1087
2. Node 143: 0.1069
3. Node 126: 0.0913
4. Node 222: 0.0755
5. Node 166: 0.0569
Node 178 has the highest betweenness centrality in the network, indicating that it likely plays a
key role in connecting different parts of the network. Its removal could disrupt communication
between multiple pairs of nodes, making it a control point. It is interesting that Node 126 appears
in the top 5 for both degree and betweenness centrality, suggesting it is both well-connected and
influential in terms of communication pathways.
Figure 7: Betweenness centrality distribution
The histogram of betweenness centrality is right-skewed, with the majority of nodes having values
close to zero. This indicates that only a small subset of neurons serve as important pathways for
communication in the network.
• Closeness centrality: Reflects how close a node is to all other nodes in the network.
Top 5 nodes by Closeness Centrality:
1. Node 45: 0.5767
2. Node 13: 0.2906
3. Node 3: 0.2898
4. Node 167: 0.2857
5. Node 4: 0.2853
Node 45 is an outlier with a closeness centrality of 0.5767, which is nearly double that of the
next highest node. This implies that Node 45 is extraordinarily central, with very short average
path lengths to the other nodes. It also has the highest degree centrality, so it likely serves as a
core hub of the network.
Figure 8: Closeness centrality distribution
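Most nodes have a closeness centrality in the range of 0.18 to 0.28, showing a fairly narrow
concentration around the mean. A cluster of nodes at 0 corresponds to peripheral nodes that are
cut off from the rest of the network in the direction used for the distances (NetworkX computes
closeness from incoming paths by default, so a value of 0 means that no other node can reach the
node). This is consistent with the previously shown visualization of the network structure.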
The histograms above show the distribution of each centrality metric. All measures reveal a small
number of dominant nodes and a large number of peripheral neurons, consistent with biological
expectations of network structure.
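The three centrality measures discussed above can be computed as in the sketch below. The toy graph and the helper name top_nodes are made up, and the sketch assumes unweighted, default NetworkX settings; the report does not state whether edge weights were used.

```python
import networkx as nx

# Hypothetical toy graph: a directed cycle 1 -> 2 -> 3 -> 1 plus a source node 4.
G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (4, 1)])

degree_c = nx.degree_centrality(G)            # (in + out degree) / (n - 1)
betweenness_c = nx.betweenness_centrality(G)  # share of shortest paths through a node
closeness_c = nx.closeness_centrality(G)      # based on incoming distances by default


def top_nodes(scores, k=5):
    """Return the k (node, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


for name, scores in [
    ("Degree", degree_c),
    ("Betweenness", betweenness_c),
    ("Closeness", closeness_c),
]:
    print(f"Top nodes by {name} Centrality:")
    for rank, (node, score) in enumerate(top_nodes(scores), start=1):
        print(f"  {rank}. Node {node}: {score:.4f}")
```

In this toy graph node 4 has no incoming edges, so its closeness centrality is 0, which mirrors the cluster of nodes at 0 in the closeness histogram.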
2.9 Homophily
Homophily measures whether nodes connect preferentially to other nodes with similar
attributes. Unfortunately, this is impossible to analyze for this network, because the dataset
contains no information about node attributes.
3 Conclusion
The analysis of the C. elegans neural network revealed structural characteristics that align with
the properties of biological networks. Despite its relatively small size, the network has rich
features.
The average shortest path length of approximately 5.91 and the right-skewed path length
distribution suggest that while most neurons are relatively close, a subset requires longer
communication paths, probably because of the directed and modular nature of the network.
Degree analysis showed a high variance, with a few highly connected hub neurons evidenced by
the heavy tail in the degree distribution. Although the degree distribution did not strictly
follow a classical scale-free pattern (α ≈ 3.29), it still indicated the presence of hub-like neurons.
Clustering coefficients highlighted moderate local connectivity (average clustering coefficient of
0.2924). The global clustering coefficient of 0.1807 suggested localized clusters without strong
global interlinking. The largest strongly connected component, composed of 239 nodes, showed
very good reachability for a directed network. This probably reflects the biological need for
robust signal propagation.
The network was found to be disassortative (assortativity coefficient of -0.163), indicating that
high-degree neurons often connect to low-degree ones, which is a common feature of biological and
technological networks. Community detection via the Louvain algorithm identified 7 communities
of different sizes and internal cohesion. This may suggest a modular organization of
functional units.
Finally, centrality analysis identified a few key neurons (for example Node 45) with high degree
centrality, likely serving as critical hubs for communication within the neural network.
Overall, the C. elegans neural network shows typical features of complex networks, including
modularity, hub nodes, small-world properties and heterogeneous connectivity. It was very
interesting to visualize the graph structure of a biological network and analyze it with NetworkX
tools.