DOC-147/added ml adapters to data science (#1058) · arangodb/docs@a5776d8 · GitHub
This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit a5776d8

DOC-147/added ml adapters to data science (#1058)
* added ml adapters to data science documentation * added RDF adapter * changed name ArangoRDF * renamed files * applied review suggestions
1 parent eea916e commit a5776d8

10 files changed

+751
-0
lines changed
Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
---
2+
layout: default
3+
description: >-
4+
ArangoRDF allows you to export graphs from ArangoDB into RDF and vice-versa
5+
---
6+
# ArangoRDF
7+
8+
{{ page.description }}
9+
{:class="lead"}
10+
11+
RDF is a standard model for data interchange on the Web. RDF has features that
12+
facilitate data merging even if the underlying schemas differ, and it
13+
specifically supports the evolution of schemas over time without requiring all
14+
the data consumers to be changed.
15+
16+
RDF extends the linking structure of the Web to use URIs to name the relationship
17+
between things as well as the two ends of the link (this is usually referred to
18+
as a "triple"). Using this simple model, it allows structured and semi-structured
19+
data to be mixed, exposed, and shared across different applications.
20+
21+
This linking structure forms a directed, labeled graph, where the edges represent
22+
the named link between two resources, represented by the graph nodes. This graph
23+
view is the easiest possible mental model for RDF and is often used in
24+
easy-to-understand visual explanations.
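
As a minimal sketch of the model described above, the snippet below stores triples as plain Python tuples and groups them into a directed, labeled graph. The `ex:` names and the `to_adjacency` helper are made up for illustration and are not part of ArangoRDF or RDFLib.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object); the "ex:" names are hypothetical.
triples = [
    ("ex:SFO", "ex:locatedIn", "ex:SanFrancisco"),
    ("ex:SFO", "ex:hasRunway", "ex:Runway28L"),
    ("ex:Runway28L", "ex:partOf", "ex:SFO"),
]


def to_adjacency(triples):
    """Group triples by subject: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_adjacency(triples)
for node, edges in graph.items():
    for label, target in edges:
        print(f"{node} --[{label}]--> {target}")
```

Subjects and objects become nodes, and each predicate becomes the label of a directed edge from subject to object.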
25+
26+
Check the resources below to get started:
27+
28+
- [RDF 1.1 Concepts and Abstract Syntax](https://www.w3.org/TR/rdf11-concepts/){:target="_blank"}
29+
- [RDFLib (Python)](https://pypi.org/project/rdflib/){:target="_blank"}
30+
- [Example for Modeling RDF as ArangoDB Graphs](data-modeling-graphs-from-rdf.html)
31+
32+
## Resources
33+
34+
Watch this
35+
[lunch & learn session](https://www.arangodb.com/resources/lunch-sessions/graph-beyond-lunch-break-2-11-arangordf/){:target="_blank"} to get an
36+
introduction to ArangoRDF, an RDF adapter developed with the community
37+
as a first step toward bringing RDF graphs into ArangoDB.
38+
39+
The [ArangoRDF repository](https://github.com/ArangoDB-Community/ArangoRDF){:target="_blank"}
40+
is available on GitHub. Check it out!
41+
42+
## Installation
43+
44+
To install the latest release of ArangoRDF,
45+
run the following command:
46+
47+
```bash
48+
pip install arango-rdf
49+
```
50+
51+
## Quickstart
52+
53+
The following example shows how to get started with ArangoRDF.
54+
See also the
55+
[interactive tutorial](https://colab.research.google.com/github/ArangoDB-Community/ArangoRDF/blob/main/examples/ArangoRDF.ipynb){:target="_blank"}.
56+
57+
```py
58+
from arango import ArangoClient
59+
from arango_rdf import ArangoRDF
60+
61+
db = ArangoClient(hosts="http://localhost:8529").db(
62+
"rdf", username="root", password="openSesame"
63+
)
64+
65+
# Clean up existing data and collections
66+
if db.has_graph("default_graph"):
67+
db.delete_graph("default_graph", drop_collections=True, ignore_missing=True)
68+
69+
# Initializes default_graph and sets RDF graph identifier (ArangoDB sub_graph)
70+
# Optional: sub_graph (stores graph name as the 'graph' attribute on all edges in Statement collection)
71+
# Optional: default_graph (name of ArangoDB Named Graph, defaults to 'default_graph',
72+
# is root graph that contains all collections/relations)
73+
adb_rdf = ArangoRDF(db, sub_graph="http://data.sfgov.org/ontology")
74+
config = {"normalize_literals": False} # default: False
75+
76+
# RDF Import
77+
adb_rdf.init_rdf_collections(bnode="Blank")
78+
79+
# Start with importing the ontology
80+
adb_graph = adb_rdf.import_rdf("./examples/data/airport-ontology.owl", format="xml", config=config, save_config=True)
81+
82+
# Next, let's import the actual graph data
83+
adb_graph = adb_rdf.import_rdf("./examples/data/sfo-aircraft-partial.ttl", format="ttl", config=config, save_config=True)
84+
85+
86+
# RDF Export
87+
# WARNING:
88+
# Exports ALL collections of the database,
89+
# currently does not account for default_graph or sub_graph
90+
# Results may vary, minifying may occur
91+
rdf_graph = adb_rdf.export_rdf("./examples/data/rdfExport.xml", format="xml")
92+
93+
# Drop graph and ALL documents and collections to test import from exported data
94+
if db.has_graph("default_graph"):
95+
db.delete_graph("default_graph", drop_collections=True, ignore_missing=True)
96+
97+
# Re-initialize our RDF Graph
98+
# Initializes default_graph and sets RDF graph identifier (ArangoDB sub_graph)
99+
adb_rdf = ArangoRDF(db, sub_graph="http://data.sfgov.org/ontology")
100+
101+
adb_rdf.init_rdf_collections(bnode="Blank")
102+
103+
config = adb_rdf.get_config_by_latest() # gets the last config saved
104+
# config = adb_rdf.get_config_by_key_value('graph', 'music')
105+
# config = adb_rdf.get_config_by_key_value('AnyKeySuppliedInConfig', 'SomeValue')
106+
107+
# Re-import Exported data
108+
adb_graph = adb_rdf.import_rdf("./examples/data/rdfExport.xml", format="xml", config=config)
109+
```

3.10/data-science-cugraph-adapter.md

Lines changed: 90 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,90 @@
1+
---
2+
layout: default
3+
description: >-
4+
The ArangoDB-cuGraph Adapter exports graphs from ArangoDB into RAPIDS cuGraph, a library of collective GPU-accelerated graph algorithms, and vice-versa
5+
---
6+
# cuGraph Adapter
7+
8+
{{ page.description }}
9+
{:class="lead"}
10+
11+
While offering an API and a set of graph algorithms similar to NetworkX, the
12+
[RAPIDS cuGraph](https://docs.rapids.ai/api/cugraph/stable/){:target="_blank"}
13+
library is GPU-based. Especially for large graphs, this makes cuGraph
14+
significantly faster than NetworkX.
15+
Note that cuGraph currently does not support storing node attributes.
16+
Running cuGraph requires an NVIDIA GPU with CUDA support.
17+
18+
## Resources
19+
20+
The [ArangoDB-cuGraph Adapter repository](https://github.com/arangoml/cugraph-adapter){:target="_blank"}
21+
is available on GitHub. Check it out!
22+
23+
## Installation
24+
25+
To install the latest release of the ArangoDB-cuGraph Adapter,
26+
run the following command:
27+
28+
```bash
29+
conda install -c arangodb adbcug-adapter
30+
```
31+
32+
## Quickstart
33+
34+
The following examples show how to get started with the ArangoDB-cuGraph Adapter.
35+
See also the
36+
[interactive tutorial](https://colab.research.google.com/github/arangoml/cugraph-adapter/blob/master/examples/ArangoDB_cuGraph_Adapter.ipynb){:target="_blank"}.
37+
38+
```py
39+
import cudf
40+
import cugraph
41+
from arango import ArangoClient # Python-Arango driver
42+
43+
from adbcug_adapter import ADBCUG_Adapter
44+
45+
# Let's assume that the ArangoDB "fraud detection" dataset is imported to this endpoint
46+
db = ArangoClient(hosts="http://localhost:8529").db("_system", username="root", password="")
47+
48+
adbcug_adapter = ADBCUG_Adapter(db)
49+
50+
# Use Case 1.1: ArangoDB to cuGraph via Graph name
51+
cug_fraud_graph = adbcug_adapter.arangodb_graph_to_cugraph("fraud-detection")
52+
53+
# Use Case 1.2: ArangoDB to cuGraph via Collection names
54+
cug_fraud_graph_2 = adbcug_adapter.arangodb_collections_to_cugraph(
55+
"fraud-detection",
56+
{"account", "bank", "branch", "Class", "customer"}, # Vertex collections
57+
{"accountHolder", "Relationship", "transaction"}, # Edge collections
58+
)
59+
60+
# Use Case 2: cuGraph to ArangoDB:
61+
## 1) Create a sample cuGraph
62+
cug_divisibility_graph = cugraph.MultiGraph(directed=True)
63+
cug_divisibility_graph.from_cudf_edgelist(
64+
cudf.DataFrame(
65+
[
66+
(f"numbers/{j}", f"numbers/{i}", j / i)
67+
for i in range(1, 101)
68+
for j in range(1, 101)
69+
if j % i == 0
70+
],
71+
columns=["src", "dst", "weight"],
72+
),
73+
source="src",
74+
destination="dst",
75+
edge_attr="weight",
76+
renumber=False,
77+
)
78+
79+
## 2) Create ArangoDB Edge Definitions
80+
edge_definitions = [
81+
{
82+
"edge_collection": "is_divisible_by",
83+
"from_vertex_collections": ["numbers"],
84+
"to_vertex_collections": ["numbers"],
85+
}
86+
]
87+
88+
## 3) Convert cuGraph to ArangoDB
89+
adb_graph = adbcug_adapter.cugraph_to_arangodb("DivisibilityGraph", cug_divisibility_graph, edge_definitions)
90+
```
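
To see what the sample graph above contains before it is handed to the adapter, the edge list can be rebuilt with plain Python (no GPU needed). This is only a sketch: the `collection_of` helper is hypothetical, and it assumes that node IDs follow the ArangoDB `collection/key` convention used in the example.

```python
# Rebuild the divisibility edge list from the example without cuDF/cuGraph.
edges = [
    (f"numbers/{j}", f"numbers/{i}", j / i)
    for i in range(1, 101)
    for j in range(1, 101)
    if j % i == 0
]

edge_definitions = [
    {
        "edge_collection": "is_divisible_by",
        "from_vertex_collections": ["numbers"],
        "to_vertex_collections": ["numbers"],
    }
]


def collection_of(node_id):
    """Hypothetical helper: return the collection part of 'collection/key'."""
    return node_id.split("/", 1)[0]


definition = edge_definitions[0]
# Every endpoint should belong to a collection named in the edge definition.
consistent = all(
    collection_of(src) in definition["from_vertex_collections"]
    and collection_of(dst) in definition["to_vertex_collections"]
    for src, dst, _ in edges
)
print(len(edges), consistent)
```

A check like this can catch a mismatch between node ID prefixes and the collections named in the edge definitions before any data is written.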

3.10/data-science-dgl-adapter.md

Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
---
2+
layout: default
3+
description: >-
4+
The ArangoDB-DGL Adapter exports graphs from ArangoDB into Deep Graph Library (DGL), a Python package for graph neural networks, and vice-versa
5+
---
6+
# DGL Adapter
7+
8+
{{ page.description }}
9+
{:class="lead"}
10+
11+
The [Deep Graph Library (DGL)](https://www.dgl.ai/){:target="_blank"} is an
12+
easy-to-use, high-performance, and scalable
13+
Python package for deep learning on graphs. DGL is framework-agnostic, meaning
14+
that if a deep graph model is a component of an end-to-end application, the
15+
rest of the logic can be implemented in any major framework, such as PyTorch,
16+
Apache MXNet, or TensorFlow.
17+
18+
## Resources
19+
20+
Watch this
21+
[lunch & learn session](https://www.arangodb.com/resources/lunch-sessions/graph-beyond-lunch-break-2-8-dgl-adapter/){:target="_blank"}
22+
to get an introduction and see how to use the DGL adapter.
23+
24+
The [ArangoDB-DGL Adapter repository](https://github.com/arangoml/dgl-adapter){:target="_blank"}
25+
is available on GitHub. Check it out!
26+
27+
## Installation
28+
29+
To install the latest release of the ArangoDB-DGL Adapter,
30+
run the following command:
31+
32+
```bash
33+
pip install adbdgl-adapter
34+
```
35+
36+
## Quickstart
37+
38+
The following examples show how to get started with the ArangoDB-DGL Adapter.
39+
See also the
40+
[interactive tutorial](https://colab.research.google.com/github/arangoml/dgl-adapter/blob/master/examples/ArangoDB_DGL_Adapter.ipynb){:target="_blank"}.
41+
42+
```py
43+
from arango import ArangoClient # Python-Arango driver
44+
from dgl.data import KarateClubDataset # Sample graph from DGL
45+
from adbdgl_adapter import ADBDGL_Adapter
46+
# Let's assume that the ArangoDB "fraud detection" dataset is imported to this endpoint
47+
db = ArangoClient(hosts="http://localhost:8529").db("_system", username="root", password="")
48+
49+
adbdgl_adapter = ADBDGL_Adapter(db)
50+
51+
# Use Case 1.1: ArangoDB to DGL via Graph name
52+
dgl_fraud_graph = adbdgl_adapter.arangodb_graph_to_dgl("fraud-detection")
53+
54+
# Use Case 1.2: ArangoDB to DGL via Collection names
55+
dgl_fraud_graph_2 = adbdgl_adapter.arangodb_collections_to_dgl(
56+
"fraud-detection",
57+
{"account", "Class", "customer"}, # Vertex collections
58+
{"accountHolder", "Relationship", "transaction"}, # Edge collections
59+
)
60+
61+
# Use Case 1.3: ArangoDB to DGL via Metagraph
62+
metagraph = {
63+
"vertexCollections": {
64+
"account": {"Balance", "account_type", "customer_id", "rank"},
65+
"customer": {"Name", "rank"},
66+
},
67+
"edgeCollections": {
68+
"transaction": {"transaction_amt", "sender_bank_id", "receiver_bank_id"},
69+
"accountHolder": {},
70+
},
71+
}
72+
dgl_fraud_graph_3 = adbdgl_adapter.arangodb_to_dgl("fraud-detection", metagraph)
73+
74+
# Use Case 2: DGL to ArangoDB
75+
dgl_karate_graph = KarateClubDataset()[0]
76+
adb_karate_graph = adbdgl_adapter.dgl_to_arangodb("Karate", dgl_karate_graph)
77+
```
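
The metagraph in Use Case 1.3 is a plain dictionary that appears to map collection names to the attributes to carry over. As a rough illustration of that structure (not part of the adapter API), the hypothetical `summarize_metagraph` helper below lists which attributes are requested per collection.

```python
metagraph = {
    "vertexCollections": {
        "account": {"Balance", "account_type", "customer_id", "rank"},
        "customer": {"Name", "rank"},
    },
    "edgeCollections": {
        "transaction": {"transaction_amt", "sender_bank_id", "receiver_bank_id"},
        "accountHolder": {},
    },
}


def summarize_metagraph(metagraph):
    """Hypothetical helper: map each collection to a sorted attribute list."""
    summary = {}
    for kind in ("vertexCollections", "edgeCollections"):
        for collection, attributes in metagraph.get(kind, {}).items():
            summary[collection] = sorted(attributes)
    return summary


summary = summarize_metagraph(metagraph)
for collection, attributes in summary.items():
    print(f"{collection}: {attributes or 'no attributes requested'}")
```

Collections with an empty attribute container, such as `accountHolder`, show up with an empty list.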

3.10/data-science-networkx-adapter.md

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
---
2+
layout: default
3+
description: >-
4+
The ArangoDB-NetworkX Adapter allows you to export graphs from ArangoDB into NetworkX for graph analysis with Python and vice-versa
5+
---
6+
# NetworkX Adapter
7+
8+
{{ page.description }}
9+
{:class="lead"}
10+
11+
[NetworkX](https://networkx.org/){:target="_blank"} is a commonly used tool for
12+
analysis of network data. If your
13+
analytics use cases require the use of all your graph data, for example,
14+
to summarize graph structure, or answer global path traversal queries,
15+
then using the ArangoDB Pregel API is recommended. If your analysis pertains
16+
to a subgraph, then you may be interested in getting the NetworkX
17+
representation of the subgraph for one of the following reasons:
18+
19+
- An algorithm for your use case is available in NetworkX
20+
- A library that you want to use for your use case works with NetworkX Graphs as input
21+
22+
## Resources
23+
24+
Watch this
25+
[lunch & learn session](https://www.arangodb.com/resources/lunch-sessions/graph-beyond-lunch-break-2-9-introducing-the-arangodb-networkx-adapter/){:target="_blank"}
26+
to see how using this adapter gives you the best of both
27+
graph worlds - the speed and flexibility of ArangoDB combined with the
28+
ubiquity of NetworkX.
29+
30+
The [ArangoDB-NetworkX Adapter repository](https://github.com/arangoml/networkx-adapter){:target="_blank"}
31+
is available on GitHub. Check it out!
32+
33+
## Installation
34+
35+
To install the latest release of the ArangoDB-NetworkX Adapter,
36+
run the following command:
37+
38+
```bash
39+
pip install adbnx-adapter
40+
```
41+
42+
## Quickstart
43+
44+
The following examples show how to get started with the ArangoDB-NetworkX Adapter.
45+
See also the
46+
[interactive tutorial](https://colab.research.google.com/github/arangoml/networkx-adapter/blob/master/examples/ArangoDB_NetworkX_Adapter.ipynb){:target="_blank"}.
47+
48+
```py
49+
from arango import ArangoClient # Python-Arango driver
50+
from networkx import grid_2d_graph # Sample graph from NetworkX
51+
52+
from adbnx_adapter import ADBNX_Adapter
53+
54+
# Let's assume that the ArangoDB "fraud detection" dataset is imported to this endpoint
55+
db = ArangoClient(hosts="http://localhost:8529").db("_system", username="root", password="")
56+
57+
adbnx_adapter = ADBNX_Adapter(db)
58+
59+
# Use Case 1.1: ArangoDB to NetworkX via Graph name
60+
nx_fraud_graph = adbnx_adapter.arangodb_graph_to_networkx("fraud-detection")
61+
62+
# Use Case 1.2: ArangoDB to NetworkX via Collection names
63+
nx_fraud_graph_2 = adbnx_adapter.arangodb_collections_to_networkx(
64+
"fraud-detection",
65+
{"account", "bank", "branch", "Class", "customer"}, # Vertex collections
66+
{"accountHolder", "Relationship", "transaction"} # Edge collections
67+
)
68+
69+
# Use Case 1.3: ArangoDB to NetworkX via Metagraph
70+
metagraph = {
71+
"vertexCollections": {
72+
"account": {"Balance", "account_type", "customer_id", "rank"},
73+
"customer": {"Name", "rank"},
74+
},
75+
"edgeCollections": {
76+
"transaction": {"transaction_amt", "sender_bank_id", "receiver_bank_id"},
77+
"accountHolder": {},
78+
},
79+
}
80+
nx_fraud_graph_3 = adbnx_adapter.arangodb_to_networkx("fraud-detection", metagraph)
81+
82+
# Use Case 2: NetworkX to ArangoDB
83+
nx_grid_graph = grid_2d_graph(5, 5)
84+
adb_grid_edge_definitions = [
85+
{
86+
"edge_collection": "to",
87+
"from_vertex_collections": ["Grid_Node"],
88+
"to_vertex_collections": ["Grid_Node"],
89+
}
90+
]
91+
adb_grid_graph = adbnx_adapter.networkx_to_arangodb("Grid", nx_grid_graph, adb_grid_edge_definitions)
92+
```
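
For a sense of what Use Case 2 writes to ArangoDB, the sketch below rebuilds the 5x5 grid with plain Python. NetworkX's `grid_2d_graph` uses `(row, column)` tuples as nodes; the `node_key` function is a hypothetical way to turn such a tuple into a string key and is not necessarily how the adapter names documents.

```python
ROWS, COLS = 5, 5

# Nodes are (row, column) tuples, as in networkx.grid_2d_graph(5, 5).
nodes = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Connect each node to its right and lower neighbor (undirected edges).
edges = []
for r, c in nodes:
    if c + 1 < COLS:
        edges.append(((r, c), (r, c + 1)))
    if r + 1 < ROWS:
        edges.append(((r, c), (r + 1, c)))


def node_key(node):
    """Hypothetical key scheme: (2, 3) -> 'Grid_Node/2_3'."""
    row, col = node
    return f"Grid_Node/{row}_{col}"


# Documents for the "to" edge collection named in the edge definitions.
edge_documents = [{"_from": node_key(u), "_to": node_key(v)} for u, v in edges]
print(len(nodes), len(edge_documents))
```

Both endpoints of every edge fall in the `Grid_Node` collection, which matches the single edge definition passed to `networkx_to_arangodb`.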
