8000 Completed some details about Pregel's Label Propagation. by romanatarango · Pull Request #1037 · arangodb/docs · GitHub
[go: up one dir, main page]

Skip to content
This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Completed some details about Pregel's Label Propagation. #1037

Merged
merged 6 commits into from
Jun 23, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
Completed some details about Pregel's Label Propagation.
  • Loading branch information
romanatarango committed Jun 21, 2022
commit 0cea3f1c24bccbd30c7d4d37f0cbf2d4c4432bfc
34 changes: 25 additions & 9 deletions 3.10/graphs-pregel.md
Original file line number Diff line number Diff line change
Expand Up @@ -409,17 +409,33 @@ based on common location, interests, occupation, etc.
#### Label Propagation

*Label Propagation* can be used to implement community detection on large
graphs. The idea is that each vertex should be in the community that most of
his neighbors are in. We iteratively determine this by first assigning random
Community ID's. Then each iteration, a vertex will send it's current community
ID to all its neighbor vertices. Then each vertex adopts the community ID it
received most frequently during the iteration.
graphs. The algorithm assigns a community, more precisely, a Community ID
(a natural number), to every vertex in the graph.
The idea is that each vertex should be in the community that most of
its neighbors are in.

The algorithm, first, assigns unique initial Community ID's to the vertices.
There is no guarantee that a vertex obtains the same initial
ID in two different runs of the algorithm even if the graph does not change
(although, it may often happen). Moreover, there is no guarantee on a particular
distribution of the initial ID's over the vertices.

Then, in each iteration, a vertex sends its current Community
ID to all its neighbor vertices. Then each vertex adopts the Community ID it
received most frequently in the last step. If a vertex obtains more than one
most frequent ID's, it chooses the least one. Moreover, if no ID arrived more
than once and the ID of the vertex from the previous step is less than the
least obtained ID, the old ID is kept.

The algorithm runs until it converges, which likely never really happens on
large graphs. Therefore you need to specify a maximum iteration bound which
suits you. The default bound is 500 iterations, which is likely too large for
your application. It should work best on undirected graphs, results on directed
graphs might vary depending on the density of your graph.
large graphs. Therefore you need to specify a maximum iteration bound.
The default bound is 500 iterations, which is too large for
common applications.

The algorithm should work best on undirected graphs. On directed
graphs, the resulting partition into communities might change if the number
of performed steps changes. How strong the dependence is
may be influenced by the density of the graph.

```js
const pregel = require("@arangodb/pregel");
Expand Down
0