Does _totalEdgesCount in src/arangod/Pregel/Conductor.cpp represent the total number of edges in the graph?

My Environment

ArangoDB Version: 3.4.8
Storage Engine: RocksDB
Deployment Mode: Cluster of 3 nodes with 3 Agents, 3 DBServers, and 3 Coordinators
Deployment Strategy: ArangoDB Starter in Docker
Configuration: default
Infrastructure: own
Operating System: CentOS
Total RAM in your machine: 128G
Disks in use: SSD

Size of your Dataset on disk:

one vertex collection: 374M
one edge collection: 37G

Dataset:

The dataset contains a single vertex collection called users, with 41,652,230 documents like the following:

  {
    "_key": "12",
    "_id": "users/12",
    "_rev": "_Z4it3Eu--K"
  }

It also contains a single edge collection, follow, which represents the follower relationship, with 1,468,365,182 documents like the following:

  {
    "_key": "6842768634",
    "_id": "follow/6842768634",
    "_from": "users/324",
    "_to": "users/20",
    "_rev": "_Z4FeNU---u",
    "vertex": 324
  }

The shard key is ["vertex"].
I confirmed that there are no invalid edges.

Replication Factor & Number of Shards (Cluster only):

Replication Factor 1
Shards 81

Problem:
When I run a Pregel algorithm, I receive the following status:

The vertexCount is 41,652,230, which matches the vertex collection, but the edgeCount is 16,695,168, far fewer than the edge collection contains (about 1.47 billion edges).
Whichever Pregel algorithm I run, the edgeCount is the same. The logs are as follows:

So does the edgeCount parameter represent the total number of edges in the graph? If so, why is the number of edges in the graph so much smaller than the number in the edge collection? Did I do something wrong?

By the way, how can I get the total number of edges in the graph? I ran the following AQL query, but it timed out because the number of edges is too large.

AQL query (if applicable):

FOR i IN users
  LET ec = (
    FOR v, e, p IN 1..1 OUTBOUND i GRAPH "twitter"
      RETURN DISTINCT e
  )
  RETURN COUNT(ec)

AQL explain (if applicable):

Execution plan:
 Id   NodeType                  Site      Est.   Comment
  1   SingletonNode             DBS          1   * ROOT
  2   EnumerateCollectionNode   DBS   41652230     - FOR i IN users   /* full collection scan, 81 shard(s) */
 14   RemoteNode                COOR  41652230       - REMOTE
 15   GatherNode                COOR  41652230       - GATHER 
  8   SubqueryNode              COOR  41652230       - LET ec = ...   /* subquery */
  3   SingletonNode             COOR         1         * ROOT
 11   CalculationNode           COOR         1           - LET #15 = true   /* json expression */   /* const assignment */
  4   TraversalNode             COOR         9           - FOR v  /* vertex */, e  /* edge */ IN 1..1  /* min..maxPathDepth */ OUTBOUND i /* startnode */  GRAPH 'twitter'
  6   CollectNode               COOR         9             - COLLECT #11 = e   /* distinct */
  7   ReturnNode                COOR         9             - RETURN #15
  9   CalculationNode           COOR  41652230       - LET #13 = COUNT(ec)   /* simple expression */
 10   ReturnNode                COOR  41652230       - RETURN #13

Indexes used:
 By   Type   Collection   Unique   Sparse   Selectivity   Fields        Ranges
  4   edge   follow       false    false            n/a   [ `_from` ]   base OUTBOUND

Functions used:
 Name    Deterministic   Cacheable   Uses V8
 COUNT   true            true        false  

Traversals on graphs:
 Id  Depth  Vertex collections  Edge collections  Options                                  Filter / Prune Conditions
 4   1..1   users               follow            uniqueVertices: none, uniqueEdges: path                           

Optimization rules applied:
 Id   RuleName
  1   remove-unnecessary-calculations
  2   optimize-subqueries
  3   move-calculations-up-2
  4   optimize-traversals
  5   scatter-in-cluster
  6   remove-unnecessary-remote-scatter
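
For comparison with the traversal above, a cheaper way to get an edge total might be to count the edge collection directly instead of expanding every vertex. This is only a sketch and rests on two assumptions: that follow is the only edge collection in the twitter graph (the traversal plan above lists no other), and that every _from and _to refers to an existing users document, so that the collection count and the graph's edge count coincide.

```aql
// Count all documents in the edge collection without traversing the graph.
// Assumes every document in `follow` is a valid edge of the "twitter" graph.
RETURN LENGTH(follow)
```

If those assumptions hold, I would expect this to match the 1,468,365,182 documents reported for the edge collection, which is the figure the Pregel edgeCount should be compared against.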
