
Neo4j, Inc. All rights reserved 2021
Welcome!
Getting Started with Neo4j Graph Data Science
Alexander.Katzdobler@neo4j.com
andrew.frei@neo4j.com
Housekeeping
○ Questions raised during the webinar will be answered at the end; feel free to post them in the chat at any time.
○ Information about the webinar will be sent to all participants afterwards.
Neo4j - The Graph Company
The Industry’s Largest Dedicated Investment in Graphs
• Creator of the Property Graph and Cypher language, at the core of the GQL ISO project
• Thousands of Customers World-Wide
• HQ in Silicon Valley, offices include London, Munich, Paris & Malmö
Industry Leaders use Neo4j
• 7/10 Top Life Science Firms
• 20/25 Top Financial Firms
• 7/10 Top Software Vendors
Any Way You Like It
• On-Prem
• DB-as-a-Service
• In the Cloud
Highly Valuable Connected Data Use Cases Drive Enterprise Adoption
• Network & IT Operations
• Fraud Detection
• Identity & Access Management
• Knowledge Graph
• Master Data Management
• Real-Time Recommendations
Graph is the Fastest Growing DBMS Category, Neo4j is the Leading Player
• Fastest growing category
• Most popular with developers
• Strongest community
  • Developers: 220k+
  • LinkedIn Skills: 41k+ members
  • Meetups: 72k+ members globally
Neo4j Invented the Labeled Property Graph Model
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
• Visibility security by user/role
[Figure: example property graph. Two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), connected by MARRIED TO, LIVES WITH, DRIVES and OWNS relationships. A relationship carries the property since: Jan 10, 2011; a location is given as Latitude: 37.5629900°, Longitude: -122.3255300°.]
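To make the model on this slide concrete, here is a minimal Python sketch of labeled nodes, typed and directed relationships, and name/value properties. It is an illustration only, not how Neo4j stores data: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical, and the assignment of each relationship to a pair of nodes is an assumption, since the flattened figure does not show the endpoints.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet, List


@dataclass
class Node:
    """A node with labels (to classify it) and name/value properties."""
    labels: FrozenSet[str]
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection from `start` to `end` with its own properties."""
    start: Node
    rel_type: str
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Property values are taken from the slide's figure.
dan = Node(frozenset({"PERSON"}), {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node(frozenset({"PERSON"}), {"name": "Ann", "born": "Dec 5, 1975"})
car = Node(frozenset({"CAR"}), {"brand": "Volvo", "model": "V70"})

# ASSUMPTION: which nodes each relationship connects is guessed for illustration.
relationships: List[Relationship] = [
    Relationship(dan, "MARRIED_TO", ann),
    Relationship(dan, "LIVES_WITH", ann),
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2011"}),
    Relationship(ann, "OWNS", car),
]


def outgoing(node: Node, rel_type: str) -> List[Node]:
    """Follow relationships of one type in their stored direction."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]


for target in outgoing(dan, "DRIVES"):
    print(dan.properties["name"], "drives a", target.properties["brand"])
```

Relationships are first-class objects here, with their own type, direction and properties, which is the point the slide makes about the labeled property graph model.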
Graph Databases Unlock Data-context for Richer Insights, Better Decisions, and Faster Innovation
Shifting from Data-Driven to Intelligence-Driven
[Figure: three example graphs with relationship types Bought, Viewed, Returned, Knows, Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for]
• People: e.g., Employees, Customers, Suppliers, Partners, Influencers
• Transactions: e.g., Risk management, Supply chain, Payments
• Knowledge: e.g., Enterprise content, Knowledgebase, eCommerce content
Which of the colored nodes would be considered the most ‘important’?
[Figure: network of nodes A to N]
Which of the colored nodes would be considered the most ‘important’?
• Node D has the highest valence (degree). This is the most connected individual in the network. If importance is how well you are personally known, you pick D.
• Node G has the highest closeness centrality (0.52). Information will disperse through the network more quickly through this individual. If you need to get a message out rapidly, choose them.
• Node I has the highest betweenness centrality (0.59). This person is an efficient connector of other people. Risk of disruption is higher if you lose this individual.
[Figure: the same network of nodes A to N, with D, G and I highlighted]
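The three notions of importance above can be sketched in plain Python. The slide's network is not recoverable from the transcript, so this example uses Krackhardt's well-known "kite" network with letters A to J as a stand-in; its numbers will not match the 0.52 and 0.59 quoted on the slide. Closeness is computed here as (n - 1) divided by the sum of hop distances, and betweenness with Brandes' algorithm, unnormalised. Both are common textbook choices, not necessarily the exact variants behind the slide.

```python
from collections import deque

# Stand-in graph (Krackhardt's kite), NOT the network shown on the slide.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "F"), ("B", "D"), ("B", "E"),
         ("B", "G"), ("C", "D"), ("C", "F"), ("D", "E"), ("D", "F"), ("D", "G"),
         ("E", "G"), ("F", "G"), ("F", "H"), ("G", "H"), ("H", "I"), ("I", "J")]


def build_adjacency(edges):
    """Undirected adjacency sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Return visit order, hop distances, shortest-path counts and predecessors."""
    dist, sigma, preds = {source: 0}, {source: 1}, {source: []}
    order, queue = [], deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:          # first time we reach w
                dist[w] = dist[v] + 1
                sigma[w] = 0
                preds[w] = []
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                sigma[w] += sigma[v]
                preds[w].append(v)
    return order, dist, sigma, preds


def degree(adj):
    return {v: len(nbrs) for v, nbrs in adj.items()}


def closeness(adj):
    """(n - 1) / sum of hop distances; assumes the graph is connected."""
    n = len(adj)
    return {v: (n - 1) / sum(bfs(adj, v)[1].values()) for v in adj}


def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalised)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, _, sigma, preds = bfs(adj, s)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):       # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # each pair was counted twice


adj = build_adjacency(EDGES)
deg, clo, btw = degree(adj), closeness(adj), betweenness(adj)
print("highest degree:     ", max(deg, key=deg.get))
print("highest closeness:  ", max(clo, key=clo.get))
print("highest betweenness:", max(btw, key=btw.get))
```

As on the slide, the three measures pick different nodes in this stand-in graph: the best-connected node is not the one closest to everyone, and neither is the main bridge between groups.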
What is Graph Data Science?
Graph Data Science is a science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions.
Data scientists use relationships to answer questions.
Why Graph Data Science?
Relationships and network structures are highly predictive and underutilized, and they are already in your data.
● “Relationships are the strongest predictor of behavior” - James Fowler (“Connected”)
Productionize more accurate and predictive models.
Graphs are a natural way to store and use this predictive information, but different from what you’re doing today.
Neo4j for Graph Data Science
• Neo4j Graph Data Science Library (Scalable Graph Algorithms & Analytics Workspace): Practical way to harness the natural power of relationships and network structures to infer behavior.
• Neo4j Database (Native Graph Creation & Persistence): Automated transformations with an integrated database built to store and protect relationships.
• Neo4j Bloom (Visual Graph Exploration & Prototyping): Explore results visually, quickly prototype and collaborate with different groups.
Graphs & Data Science
• Knowledge Graphs: Find the patterns you’re looking for in connected data.
• Graph Algorithms: Use unsupervised machine learning techniques to identify associations, anomalies, and trends.
• Graph Native Machine Learning: Use embeddings to learn the features in your graph that you don’t even know are important yet. Train in-graph supervised ML models to predict links, labels, and missing data.
When do I need Graph Algorithms?
• Query (e.g. Cypher/Python): Local Patterns. Real-time, local decisioning and pattern matching. You know what you’re looking for and are making a decision.
• Graph Algorithms: Global Computation. Global analysis and iterations. You’re learning the overall structure of a network, updating data, and predicting.
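The distinction between a local pattern and a global computation can be sketched as follows. The `KNOWS` graph and both function names are hypothetical; the first function only touches the neighbourhood of one known start node (the kind of question a query answers), while the second must visit every node to learn the overall structure (the kind of question a graph algorithm answers).

```python
from collections import deque

# Hypothetical undirected KNOWS graph, stored as adjacency sets.
KNOWS = {
    "ann": {"bob", "cara"},
    "bob": {"ann", "dave"},
    "cara": {"ann"},
    "dave": {"bob"},
    "eve": {"finn"},
    "finn": {"eve"},
}


def friends_of_friends(graph, person):
    """Local pattern: look only at the 2-hop neighbourhood of one start node."""
    direct = graph[person]
    two_hops = {fof for friend in direct for fof in graph[friend]}
    return two_hops - direct - {person}


def connected_components(graph):
    """Global computation: every node is visited to find the overall structure."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


print(friends_of_friends(KNOWS, "ann"))
print(connected_components(KNOWS))
```

The cost of the local question depends on the size of one neighbourhood; the cost of the global one grows with the whole graph, which is why it benefits from a dedicated analytics workspace.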
The Neo4j GDS Library
Robust Graph Algorithms
● Compute connectivity metrics and learn the topology of your graph
● Highly parallelized and scale to tens of billions of nodes
Efficient & Flexible Analytics Workspace
● Automatically reshapes transactional graphs into an in-memory analytics graph
● Optimized for analytics with global traversals and aggregation
● Create workflows and layer algorithms
[Figure: Native Graph Store; Mutable In-Memory Workspace; Computational Graph]
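As a rough illustration of what reshaping a transactional graph into an in-memory analytics graph can mean, the sketch below filters stored relationships by type and packs them into compact adjacency arrays with dense integer ids. This is a conceptual sketch under stated assumptions, not the GDS library's actual implementation; the `STORED` data and the `project` and `neighbours` functions are hypothetical.

```python
# Hypothetical stored relationships: (start node id, type, end node id).
STORED = [
    (101, "KNOWS", 205),
    (205, "KNOWS", 309),
    (101, "BOUGHT", 900),
    (309, "KNOWS", 101),
]


def project(stored, rel_type):
    """Keep one relationship type and pack it into compact adjacency arrays."""
    edges = [(s, e) for s, t, e in stored if t == rel_type]
    node_ids = sorted({n for edge in edges for n in edge})
    index = {node_id: i for i, node_id in enumerate(node_ids)}  # dense ids 0..n-1

    # offsets[i]..offsets[i + 1] delimits the targets of node i.
    out_degree = [0] * len(node_ids)
    for s, _ in edges:
        out_degree[index[s]] += 1
    offsets = [0]
    for d in out_degree:
        offsets.append(offsets[-1] + d)

    targets = [0] * len(edges)
    cursor = list(offsets[:-1])  # next free slot per node
    for s, e in edges:
        targets[cursor[index[s]]] = index[e]
        cursor[index[s]] += 1
    return node_ids, offsets, targets


def neighbours(projection, node_id):
    """Read the outgoing neighbours of one node from the packed arrays."""
    node_ids, offsets, targets = projection
    i = node_ids.index(node_id)
    return [node_ids[t] for t in targets[offsets[i]:offsets[i + 1]]]


knows = project(STORED, "KNOWS")
print(knows)
print(neighbours(knows, 101))
```

Flat arrays like these are cheap to scan in parallel, which suits the global traversals and aggregation mentioned above; the stored graph itself is left untouched.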
More, Better, Faster Algorithms
Pathfinding & Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• A* Shortest Path
• Yen’s K Shortest Path
• Minimum Weight Spanning Tree
• K-Spanning Tree (MST)
• Random Walk
• Breadth & Depth First Search
Centrality & Importance
• Degree Centrality
• Closeness Centrality
• Harmonic Centrality
• Betweenness Centrality & Approx.
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Hyperlink Induced Topic Search (HITS)
• Influence Maximization (Greedy, CELF)
Community Detection
• Triangle Count
• Local Clustering Coefficient
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Coloring
• Modularity Optimization
• Speaker Listener Label Propagation
Supervised Machine Learning
• Node Classification
• Link Prediction
Heuristic Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
Similarity
• Node Similarity
• K-Nearest Neighbors (KNN)
• Jaccard Similarity
• Cosine Similarity
• Pearson Similarity
• Euclidean Distance
• Approximate Nearest Neighbors (ANN)
Graph Embeddings
• Node2Vec
• FastRP
• FastRPExtended
• GraphSAGE
… and more!
• Synthetic Graph Generation
• Scale Properties
• Collapse Paths
• One Hot Encoding
• Split Relationships
• Graph Export
• Pregel API (write your own algos)
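Several of the heuristic link prediction and similarity measures listed above have short closed-form definitions over neighbour sets. The sketch below computes them for one pair of nodes in a small hypothetical graph; it uses the standard textbook formulas and is not a statement about how the GDS library implements or scales them.

```python
import math

# Hypothetical undirected graph as adjacency sets.
GRAPH = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}


def common_neighbors(g, x, y):
    return len(g[x] & g[y])


def total_neighbors(g, x, y):
    return len(g[x] | g[y])


def jaccard(g, x, y):
    """Shared neighbours divided by all neighbours of either node."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def preferential_attachment(g, x, y):
    return len(g[x]) * len(g[y])


def resource_allocation(g, x, y):
    """Each common neighbour contributes 1 / its degree."""
    return sum(1 / len(g[u]) for u in g[x] & g[y])


def adamic_adar(g, x, y):
    """Each common neighbour contributes 1 / ln(its degree).

    A common neighbour has at least two neighbours, so the logarithm is positive.
    """
    return sum(1 / math.log(len(g[u])) for u in g[x] & g[y])


for name, fn in [("common neighbors", common_neighbors),
                 ("total neighbors", total_neighbors),
                 ("jaccard", jaccard),
                 ("preferential attachment", preferential_attachment),
                 ("resource allocation", resource_allocation),
                 ("adamic adar", adamic_adar)]:
    print(f"{name:24s}", fn(GRAPH, "a", "b"))
```

All of these score an unconnected pair by how much neighbourhood the two nodes share; a higher score suggests a link between them is more plausible.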
Graph Algorithms for Drug Discovery
Identify drug mechanisms and new targets based on network structure
• PageRank to identify essential regulatory genes or drug targets
• Shortest path to link drug targets to possible outcomes or side effects
• Node Similarity to find structurally similar chemicals
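The shortest-path idea on this slide can be illustrated with a breadth-first search over a small directed network. Everything in `NETWORK` is made up for the example (the node names carry no biological meaning), and an unweighted search is assumed; a weighted variant such as Dijkstra's algorithm would be needed if relationships carried costs.

```python
from collections import deque

# Hypothetical directed network; names are placeholders, not real entities.
NETWORK = {
    "drug_x": ["target_p1", "target_p2"],
    "target_p1": ["pathway_a"],
    "target_p2": ["pathway_a", "pathway_b"],
    "pathway_a": ["outcome_y"],
    "pathway_b": ["pathway_a", "side_effect_z"],
    "outcome_y": [],
    "side_effect_z": [],
}


def shortest_path(graph, source, target):
    """Unweighted shortest path via breadth-first search, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:    # first visit is along a shortest route
                parents[nxt] = node
                queue.append(nxt)
    return None


print(shortest_path(NETWORK, "drug_x", "side_effect_z"))
```

The returned path lists the intermediate nodes, which is what makes the link between a target and an outcome explainable rather than just a score.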
PageRank
What: Finds important nodes based on their relationships
Why: Recommendations, identifying influencers
Features:
- Tolerance
- Damping
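To show what the damping and tolerance settings control, here is a minimal power-iteration sketch of PageRank. It uses the textbook formulation in which scores sum to 1; the GDS library may scale its scores differently, and the parameter names `damping`, `tolerance` and `max_iterations` as well as the `LINKS` graph are assumptions made for this example.

```python
# Hypothetical directed graph: node -> list of nodes it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def pagerank(out_links, damping=0.85, tolerance=1e-7, max_iterations=100):
    """Textbook PageRank by power iteration (scores sum to 1)."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iterations):
        # Rank held by nodes without out-links is spread evenly (one common convention).
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # With probability `damping` the surfer follows a link, otherwise jumps anywhere.
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tolerance:  # scores have stopped moving: converged
            break
    return rank


scores = pagerank(LINKS)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

A higher damping value gives more weight to the link structure and less to the uniform jump; a smaller tolerance makes the loop run longer in exchange for more stable scores.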
Get started with Graph Data Science in Neo4j
• Neo4j GDSL Download
• Neo4j Desktop & Sandbox
• Neo4j Sandbox - GDS Playground
• Neo4j - Graph Algorithms Free Copy
• Neo4j - Graph Data Science for Dummies Free Copy
