8000 Completed some details about Pregel's Label Propagation. (#1037) · arangodb/docs@b6d6683 · GitHub
[go: up one dir, main page]

Skip to content
This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit b6d6683

Browse files
Completed some details about Pregel's Label Propagation. (#1037)
* Completed some details about Pregel's Label Propagation. * Update graphs-pregel.md * Update graphs-pregel.md * applied changes to 3.9 * Update graphs-pregel.md * applied changes to 3.8 Co-authored-by: ansoboleva <93702078+ansoboleva@users.noreply.github.com>
1 parent a82085b commit b6d6683

File tree

3 files changed

+75
-27
lines changed

3 files changed

+75
-27
lines changed

3.10/graphs-pregel.md

Lines changed: 25 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -409,17 +409,33 @@ based on common location, interests, occupation, etc.
409409
#### Label Propagation
410410

411411
*Label Propagation* can be used to implement community detection on large
412-
graphs. The idea is that each vertex should be in the community that most of
413-
his neighbors are in. We iteratively determine this by first assigning random
414-
Community ID's. Then each iteration, a vertex will send it's current community
415-
ID to all its neighbor vertices. Then each vertex adopts the community ID it
416-
received most frequently during the iteration.
412+
graphs. The algorithm assigns a community, more precisely, a Community ID
413+
(a natural number), to every vertex in the graph.
414+
The idea is that each vertex should be in the community that most of
415+
its neighbors are in.
416+
417+
At first, the algorithm assigns unique initial Community IDs to the vertices.
418+
There is no guarantee that a vertex obtains the same initial
419+
ID in two different runs of the algorithm, even if the graph does not change
420+
(although, it may often happen). Moreover, there is no guarantee on a particular
421+
distribution of the initial IDs over the vertices.
422+
423+
Then, in each iteration, a vertex sends its current Community
424+
ID to all its neighbor vertices. After that each vertex adopts the Community ID it
425+
received most frequently in the last step. If a vertex obtains more than one
426+
most frequent IDs, it chooses the lowest number (as IDs are numbers). If no ID arrived more
427+
than once and the ID of the vertex from the previous step is less than the
428+
lowest obtained ID number, the old ID is kept.
417429

418430
The algorithm runs until it converges, which likely never really happens on
419-
large graphs. Therefore you need to specify a maximum iteration bound which
420-
suits you. The default bound is 500 iterations, which is likely too large for
421-
your application. It should work best on undirected graphs, results on directed
422-
graphs might vary depending on the density of your graph.
431+
large graphs. Therefore you need to specify a maximum iteration bound.
432+
The default bound is 500 iterations, which is too large for
433+
common applications.
434+
435+
The algorithm should work best on undirected graphs. On directed
436+
graphs, the resulting partition into communities might change, if the number
437+
of performed steps changes. How strong the dependence is
438+
may be influenced by the density of the graph.
423439

424440
```js
425441
const pregel = require("@arangodb/pregel");

3.8/graphs-pregel.md

Lines changed: 25 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -409,17 +409,33 @@ based on common location, interests, occupation, etc.
409409
#### Label Propagation
410410

411411
*Label Propagation* can be used to implement community detection on large
412-
graphs. The idea is that each vertex should be in the community that most of
413-
his neighbors are in. We iteratively determine this by first assigning random
414-
Community ID's. Then each iteration, a vertex will send it's current community
415-
ID to all its neighbor vertices. Then each vertex adopts the community ID it
416-
received most frequently during the iteration.
412+
graphs. The algorithm assigns a community, more precisely, a Community ID
413+
(a natural number), to every vertex in the graph.
414+
The idea is that each vertex should be in the community that most of
415+
its neighbors are in.
416+
417+
At first, the algorithm assigns unique initial Community IDs to the vertices.
418+
There is no guarantee that a vertex obtains the same initial
419+
ID in two different runs of the algorithm, even if the graph does not change
420+
(although, it may often happen). Moreover, there is no guarantee on a particular
421+
distribution of the initial IDs over the vertices.
422+
423+
Then, in each iteration, a vertex sends its current Community
424+
ID to all its neighbor vertices. After that each vertex adopts the Community ID it
425+
received most frequently in the last step. If a vertex obtains more than one
426+
most frequent IDs, it chooses the lowest number (as IDs are numbers). If no ID arrived more
427+
than once and the ID of the vertex from the previous step is less than the
428+
lowest obtained ID number, the old ID is kept.
417429

418430
The algorithm runs until it converges, which likely never really happens on
419-
large graphs. Therefore you need to specify a maximum iteration bound which
420-
suits you. The default bound is 500 iterations, which is likely too large for
421-
your application. It should work best on undirected graphs, results on directed
422-
graphs might vary depending on the density of your graph.
431+
large graphs. Therefore you need to specify a maximum iteration bound.
432+
The default bound is 500 iterations, which is too large for
433+
common applications.
434+
435+
The algorithm should work best on undirected graphs. On directed
436+
graphs, the resulting partition into communities might change, if the number
437+
of performed steps changes. How strong the dependence is
438+
may be influenced by the density of the graph.
423439

424440
```js
425441
const pregel = require("@arangodb/pregel");

3.9/graphs-pregel.md

Lines changed: 25 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -409,17 +409,33 @@ based on common location, interests, occupation, etc.
409409
#### Label Propagation
410410

411411
*Label Propagation* can be used to implement community detection on large
412-
graphs. The idea is that each vertex should be in the community that most of
413-
his neighbors are in. We iteratively determine this by first assigning random
414-
Community ID's. Then each iteration, a vertex will send it's current community
415-
ID to all its neighbor vertices. Then each vertex adopts the community ID it
416-
received most frequently during the iteration.
412+
graphs. The algorithm assigns a community, more precisely, a Community ID
413+
(a natural number), to every vertex in the graph.
414+
The idea is that each vertex should be in the community that most of
415+
its neighbors are in.
416+
417+
At first, the algorithm assigns unique initial Community IDs to the vertices.
418+
There is no guarantee that a vertex obtains the same initial
419+
ID in two different runs of the algorithm, even if the graph does not change
420+
(although, it may often happen). Moreover, there is no guarantee on a particular
421+
distribution of the initial IDs over the vertices.
422+
423+
Then, in each iteration, a vertex sends its current Community
424+
ID to all its neighbor vertices. After that each vertex adopts the Community ID it
425+
received most frequently in the last step. If a vertex obtains more than one
426+
most frequent IDs, it chooses the lowest number (as IDs are numbers). If no ID arrived more
427+
than once and the ID of the vertex from the previous step is less than the
428+
lowest obtained ID number, the old ID is kept.
417429

418430
The algorithm runs until it converges, which likely never really happens on
419-
large graphs. Therefore you need to specify a maximum iteration bound which
420-
suits you. The default bound is 500 iterations, which is likely too large for
421-
your application. It should work best on undirected graphs, results on directed
422-
graphs might vary depending on the density of your graph.
431+
large graphs. Therefore you need to specify a maximum iteration bound.
432+
The default bound is 500 iterations, which is too large for
433+
common applications.
434+
435+
The algorithm should work best on undirected graphs. On directed
436+
graphs, the resulting partition into communities might change, if the number
437+
of performed steps changes. How strong the dependence is
438+
may be influenced by the density of the graph.
423439

424440
```js
425441
const pregel = require("@arangodb/pregel");

0 commit comments

Comments
 (0)
0