Building Graphs

The document provides an overview of a presentation on building graph applications with Neo4j. The presentation will include an introduction to Neo4j and graph databases, demonstrate building a recommendation system using a movie graph as an example, and show how to query the graph to find similar users and provide recommendations. It will also discuss extending the recommendation system by incorporating additional data and machine learning techniques.


Building Graph Applications with Neo4j

Neo4j Manchester Meetup


Wednesday 1st March
Contents
● Introduction
● The Graph
○ Introduction to the demo
○ Neo4j Primer
○ The graph database
○ Recommendations
● Building a front end around Neo4j
○ Motivations
○ Data management
○ Live demo
Metafused
● Use AI to optimise existing and build net new
applications for a broad set of verticals
○ AI is applied to reduce friction, optimise processes, execute
action(s) based on observed trigger(s)
● Data driven
○ Being comfortable with being uncomfortable
○ Using the right tool(s) for the job
○ Build, measure, learn
Our Team
Seven Bridges of Königsberg
‘Devise a walk through the city that would cross each of those bridges once and only once.’

Leonhard Euler (1736) proved that, with some problems, you can’t solve them doing the same thing you did yesterday … and expecting different results.
Thinking Differently
Making the Case for the
Bottom Up Approach

‘You can’t do much carpentry with bare hands. Neither can you do much thinking with a bare brain.’ - Daniel Dennett
Or, if you prefer not to have to read ...

But you will have to watch Brad Pitt. Sorry.
Backend Design
Building Applications with Neo4j
● Presentation of a prototype application based on a well
known problem
○ Demonstrate capability of technology stack
○ Show how Metafused are using Neo4j as a key component of our
architecture
○ Demonstrate how Neo4j can be used as part of an AI system
● Building a live recommendation system
Neo4j Primer
● Nodes
○ Objects representing entities
● Labels
○ Assigned to nodes to specify the type of entity
● Relationships
○ Directional connections between nodes
● Properties
○ Additional information which can be attached to nodes and
relationships
● Cypher
○ Declarative query language for Neo4j
● Patterns
○ Selected combinations of nodes and relationships
● Let’s explore these concepts along with the graph...
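The primer concepts can be sketched outside the database as a tiny in-memory model. This is only an illustration of nodes, labels, relationships, properties and pattern matching, not how Neo4j stores data; the `Node`, `Relationship` and `match` names and the sample film are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with one or more labels and arbitrary properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Illustrative sample data (not taken from the demo graph).
stanton = Node({"Person"}, {"name": "Andrew Stanton"})
wall_e = Node({"Movie"}, {"title": "WALL-E"})
graph = [Relationship(stanton, "DIRECTED", wall_e)]


def match(relationships, start_label, rel_type, end_label):
    """Rough analogue of the pattern (:start_label)-[:rel_type]->(:end_label)."""
    return [
        (r.start, r, r.end)
        for r in relationships
        if start_label in r.start.labels
        and r.type == rel_type
        and end_label in r.end.labels
    ]


titles = [end.properties["title"]
          for _, _, end in match(graph, "Person", "DIRECTED", "Movie")]
```

In Cypher the same pattern is written declaratively, and the database works out how to find the matching nodes and relationships.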
A Known Graph
To demonstrate the technology, start simple. A great use case for Neo4j is the Movie graph
seen in many of the training examples.
A Simple Query (1)
Start with a simple Cypher query: find a node with the label Person and a name property equal to “Andrew Stanton”.

MATCH (p :Person) WHERE p.name = "Andrew Stanton" RETURN p;

A Simple Query (2)
We can look at specific relationships to answer simple questions. For example, which movies has Andrew Stanton had any relationship with, or, more specifically, which has he directed?

MATCH (p :Person)-[r]->(m :Movie) WHERE p.name = "Andrew Stanton" RETURN p, r, m;

MATCH (p :Person)-[d :DIRECTED]->(m :Movie) WHERE p.name = "Andrew Stanton" RETURN p.name, collect(m.title);
The Data
● Data gathered from IMDb using IMDbPY
○ http://imdbpy.sourceforge.net/
○ IMDb not for commercial use, OK for this presentation
● Use IMDbPY to query database
● Either
○ Write script to build csv files for nodes and relationships
○ Directly ingest with py2neo (http://py2neo.org/v3/)
● This demo
○ 325 movies + full cast, directors, producers, writers
○ Easily extend with any further information e.g. full crew, trivia,
keywords etc
○ Initially 10 example users with ratings assigned to 25% of movies
following a defined distribution
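The CSV route above could look roughly like the sketch below. It assumes the header conventions of Neo4j's bulk import tool (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`); the sample rows, identifiers and the `write_csv` helper are hypothetical, and the IMDbPY lookup that would normally supply the data is left out.

```python
import csv
import io

# Hypothetical rows standing in for data pulled from IMDb.
people = [{"id": "p1", "name": "Andrew Stanton"}]
movies = [{"id": "m1", "title": "WALL-E"}]
directed = [("p1", "m1")]  # (person id, movie id)


def write_csv(header, rows):
    """Render a header and rows as CSV text (written to memory to keep the sketch self-contained)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# One file per node label and one per relationship type.
people_csv = write_csv(
    ["personId:ID(Person)", "name", ":LABEL"],
    [(p["id"], p["name"], "Person") for p in people],
)
movies_csv = write_csv(
    ["movieId:ID(Movie)", "title", ":LABEL"],
    [(m["id"], m["title"], "Movie") for m in movies],
)
directed_csv = write_csv(
    [":START_ID(Person)", ":END_ID(Movie)", ":TYPE"],
    [(person_id, movie_id, "DIRECTED") for person_id, movie_id in directed],
)
```

In a real script the three strings would be written to files and handed to the import tool; ingesting directly with py2neo avoids the intermediate files at the cost of slower loading for large graphs.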
User Query (1)
Who are the most active users of our recommendation engine?

MATCH (u :User)-[r :RATED]->(m :Movie)
RETURN u.name AS Name,
       u.username AS Username,
       count(r) AS Reviews,
       avg(r.rating) AS `Average Score`
ORDER BY Reviews DESC
LIMIT 5;
User Query (2)
We can compare two users' reviews of the movies they have both rated.

MATCH (u1 :User)-[r1 :RATED]->(m :Movie)<-[r2 :RATED]-(u2 :User)
WHERE u1.name = "Malcolm Reynolds" AND u2.name = "Jayne Cobb"
WITH m,
     r1.rating AS score_mal,
     r2.rating AS score_jayne
MATCH (:User)-[r :RATED]->(m)
RETURN m.title AS Movie,
       score_mal AS `Mal's Rating`,
       score_jayne AS `Jayne's Rating`,
       count(r.rating) AS `Total Reviews`,
       avg(r.rating) AS `Average Rating`
ORDER BY (score_mal + score_jayne)/2 DESC
LIMIT 3;
Similarity
We can measure how similar two users are based on their reviews of the same movies.
To do this we will be using the Euclidean distance between their ratings. The
smaller the distance, the more similar the two users.
Similarity - Example

Rating 1    Rating 2    Diff    Diff Squared
1           5           -4      16
1           4.5         -3.5    12.25
1           2.5         -1.5    2.25
1           4           -3      9
1           1           0       0
Total                           39.5
Distance                        6.28
Similarity - Example

Rating 1    Rating 2    Diff    Diff Squared
5           5           0       0
4.5         4.5         0       0
2.5         2.5         0       0
4           4           0       0
1           1           0       0
Total                           0
Distance                        0.00
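The two worked examples can be checked with a few lines of Python. This is a minimal sketch of the plain Euclidean distance from the tables; the function name `euclidean_distance` is chosen for this illustration.

```python
import math


def euclidean_distance(ratings_a, ratings_b):
    """Square root of the summed squared differences between paired ratings."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both users must have rated the same movies")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)))


# First example: squared differences sum to 39.5.
first = euclidean_distance([1, 1, 1, 1, 1], [5, 4.5, 2.5, 4, 1])

# Second example: identical ratings give a distance of zero.
second = euclidean_distance([5, 4.5, 2.5, 4, 1], [5, 4.5, 2.5, 4, 1])
```

Rounded to two decimal places, `first` matches the 6.28 in the first table, and `second` is exactly 0.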
Nearest Neighbours
We can calculate similarities in Neo4j and then find a user’s nearest neighbours.

// Update similarities between users
// (Euclidean distance over shared movies, divided by the number of shared movies)
MATCH (u1:User)-[x:RATED]->(m:Movie)<-[y:RATED]-(u2:User)
WHERE u1 <> u2
// Group by user pair only; carrying m through the WITH would aggregate one movie at a time
WITH u1, u2, SQRT(REDUCE(acc = 0.0, dif IN COLLECT(x.rating - y.rating) | acc + dif^2))/count(m) AS sim
MERGE (u1)-[s:SIMILARITY]-(u2)
SET s.similarity = sim;

// Get nearest neighbors (lower the better)
MATCH (u1 :User)-[s :SIMILARITY]-(u2 :User)
WHERE u1.name = "Malcolm Reynolds"
WITH u2.name AS Neighbor,
     s.similarity AS sim
ORDER BY sim
RETURN Neighbor,
       sim AS Similarity
LIMIT 10;
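The same idea can be sketched in Python on a small in-memory rating table. The ratings, the movie keys and the third user are invented for this illustration, and the score follows the query's convention of dividing the Euclidean distance by the number of shared movies.

```python
import math

# Invented ratings: user name -> {movie: rating}.
ratings = {
    "Malcolm Reynolds": {"A": 5.0, "B": 3.0, "C": 4.0},
    "Jayne Cobb": {"A": 1.0, "B": 3.0},
    "Zoe Washburne": {"A": 5.0, "B": 4.0, "C": 4.0},
}


def similarity(ratings_a, ratings_b):
    """Distance over shared movies divided by their count; None if nothing is shared."""
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return None
    distance = math.sqrt(sum((ratings_a[m] - ratings_b[m]) ** 2 for m in shared))
    return distance / len(shared)


def nearest_neighbours(all_ratings, user, k=10):
    """Return up to k (name, score) pairs, lowest score (most similar) first."""
    scored = []
    for other, other_ratings in all_ratings.items():
        if other == user:
            continue
        score = similarity(all_ratings[user], other_ratings)
        if score is not None:
            scored.append((other, score))
    return sorted(scored, key=lambda pair: pair[1])[:k]


neighbours = nearest_neighbours(ratings, "Malcolm Reynolds")
```

With this data Zoe comes first, because she differs from Mal on only one of three shared movies, while Jayne differs strongly on one of two. Storing the score on a SIMILARITY relationship, as the Cypher does, avoids recomputing it on every request.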
Recommendations
We can now design our recommendation engine.

1. User logs in and registers in the database
2. User rates a new movie
3. System
a. Updates database information
b. Finds user’s new nearest neighbors
c. Calculates average reviews of movies by x most similar users
d. Excludes movies already rated by the user
e. Orders movies by average rating
f. Delivers recommendations
4. User filters results, for example by genre

Let's see how that looks...


Recommendation Query
We can now design our recommendation engine.

// Match user a to users who have rated 2 or more of the same films
MATCH (u2 :User)-[:RATED]->(film :Movie)<-[:RATED]-(u1 :User {id: "usr0"})
WITH u1,
     u2,
     COUNT(film) AS film_count
// Reviewed two or more of the same films
WHERE film_count > 1
WITH u1,
     u2
// MATCH Users similarities
MATCH (u2)-[s:SIMILARITY]-(u1)
WITH u1, u2, s.similarity AS similarity
ORDER BY similarity
LIMIT 3
WITH u1, u2, similarity
// Get movies rated by the similar users
MATCH (u2)-[r:RATED]->(m :Movie)
// Only movies user 1 hasn't seen
WHERE NOT((u1)-[:RATED]->(m))
WITH m,
     similarity,
     r.rating AS rating
// Group movies
ORDER BY m.title
WITH m,
     REDUCE(s = 0, i IN COLLECT(rating) | s + i) * 1.0 AS rating_sum,
     SIZE(COLLECT(rating)) AS n_ratings
// Movie must have at least 2 ratings
WHERE n_ratings > 1
// Get the average review
WITH m,
     rating_sum/n_ratings AS reco,
     n_ratings
// Get the genres
MATCH (m)-[:HAS_GENRE]->(g :Genre)
WITH m,
     reco,
     COLLECT(g.name) AS genres,
     n_ratings
// Order by the average recommendation score (not average rating)
// and then n_ratings
ORDER BY reco DESC,
         n_ratings DESC
// Return list
RETURN m.title,
       reco AS score,
       genres
LIMIT 10
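The scoring part of the query can be sketched in Python as follows. It assumes the nearest neighbours have already been found, uses invented users and ratings, and leaves out the genre lookup; `recommend` and its parameters are names chosen for this illustration.

```python
from collections import defaultdict


def recommend(all_ratings, neighbours, user, min_ratings=2, limit=10):
    """Average the neighbours' ratings for movies the user has not rated yet."""
    seen = all_ratings[user].keys()
    collected = defaultdict(list)
    for neighbour in neighbours:
        for movie, rating in all_ratings[neighbour].items():
            if movie not in seen:  # only movies the user hasn't rated
                collected[movie].append(rating)
    scored = [
        (movie, sum(values) / len(values), len(values))
        for movie, values in collected.items()
        if len(values) >= min_ratings  # movie must have at least 2 ratings
    ]
    # Highest average first, then the most-rated movie first.
    scored.sort(key=lambda row: (-row[1], -row[2]))
    return scored[:limit]


# Invented data: usr0 is the active user, usr1..usr3 are the 3 nearest neighbours.
ratings = {
    "usr0": {"A": 4.0},
    "usr1": {"A": 4.0, "B": 5.0, "C": 2.0},
    "usr2": {"A": 5.0, "B": 4.0, "C": 3.0, "D": 5.0},
    "usr3": {"B": 3.0, "C": 4.0},
}
recommendations = recommend(ratings, ["usr1", "usr2", "usr3"], "usr0")
```

Movie D is dropped because only one neighbour rated it, and movie A is dropped because the user has already rated it.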
Extending the Recommender
The current setup demonstrates the power of a graph database
for recommendation. It could be extended in several ways:

● More complex queries incorporating further information, e.g. favourite actors
○ Natural use of Neo4j
○ Easily scales
○ Simple to understand
○ Real time
● Scale with more complex Machine Learning
○ Expand on simple nearest neighbour example
○ Include additional contextual information
Google Cloud Platform
We are using Google Cloud Platform (GCP) to help us build and deploy services/applications.

● Google are working towards a ‘no-ops’ environment
● Incorporate virtual machines, clusters, software and APIs
● Allows for rapid prototyping and focus on areas of expertise
Future Stack
Incorporating Machine Learning into the application is easy with GCP.

[Diagram. Labels: Ingest Data, Process Data, Learn/Optimise, Store, Batch/Stream, Front End, Store]
Moving to an Application
We have built a simple recommendation engine around data
stored in Neo4j.

● How is this delivered to the user?
● How does the front end communicate with Neo4j?
Front End
Introduction
● Working using a build, measure, learn methodology allows
us to work through prototypes quickly
● Front end choices motivated by ability to reuse
components
● Walk through some design choices for Movie Recommendation
Engine
● Discuss future changes and what we have learned
Movie Recommendation Engine Technology Stack
[Diagram. Labels: Axios, User]
Interacting with Neo4j: Key Points
● Currently interacting with Neo4j through its HTTP transaction endpoints
○ Plan to move to a Node.js server connected via the Bolt driver
● Able to send multiple queries with one request
● Query design, balance query speed/complexity with number
of requests and amount of front end processing
○ For example - one query to return movies with associated genre rather
than one query per genre
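Sending several queries in one request could look roughly like this. The sketch only builds the JSON body; it assumes the Neo4j 3.x transactional HTTP endpoint, which accepts a `statements` list at a path such as `/db/data/transaction/commit`, and the two queries and the `build_transaction_payload` helper are hypothetical.

```python
import json


def build_transaction_payload(statements):
    """Bundle (cypher, parameters) pairs into one transactional request body."""
    return json.dumps({
        "statements": [
            {"statement": cypher, "parameters": parameters}
            for cypher, parameters in statements
        ]
    })


payload = build_transaction_payload([
    # One query returning every movie with its genres,
    # rather than one query per genre.
    ("MATCH (m:Movie)-[:HAS_GENRE]->(g:Genre) "
     "RETURN m.title AS title, collect(g.name) AS genres", {}),
    # Parameters keep user input out of the query text.
    ("MATCH (u:User {id: $id}) RETURN u.name AS name", {"id": "usr0"}),
])
```

The front end would POST `payload` with an HTTP client such as Axios and receive one result set per statement, in the same order.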
State Management (1)
● Redux provides a system for state management
○ Immutable
● Data objects stored in a state tree which is updated via
Redux’s actions and reducers
● Demonstrate this for part of our application over the
next few slides
○ Focus on a small part of state tree
State Management (2)
● Application initialises
○ Several data objects
○ Focus on movies for this example
State Management (3)
● Application initialises
● Fetch movie data from Neo4j
State Management (4)
● Application initialises
● Fetch movie data from Neo4j
● Movie data processed, broken down into genres
○ Data now available in state
State Management (5)
● Application initialises
● Fetch movie data from Neo4j
● Movie data processed, broken down into genres
○ Data available in state
● User selects movie
State Management (6)
● Further actions update the state
○ Movie review
○ Select another movie
○ Log out
○ Log in
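The reducer pattern walked through above can be sketched in a few lines. Redux itself is a JavaScript library; this Python version only illustrates the idea of a pure function that returns a new state for each action, and the action types and state keys are hypothetical.

```python
def movies_reducer(state, action):
    """Return a new state for the given action; the input state is never mutated."""
    if action["type"] == "MOVIES_FETCHED":
        # Break the fetched movies down into genres.
        by_genre = {}
        for movie in action["movies"]:
            for genre in movie["genres"]:
                by_genre.setdefault(genre, []).append(movie["title"])
        return {**state, "byGenre": by_genre}
    if action["type"] == "MOVIE_SELECTED":
        return {**state, "selected": action["title"]}
    return state  # unknown actions leave the state unchanged


initial = {"byGenre": {}, "selected": None}

loaded = movies_reducer(initial, {
    "type": "MOVIES_FETCHED",
    "movies": [{"title": "WALL-E", "genres": ["Animation", "Adventure"]}],
})
picked = movies_reducer(loaded, {"type": "MOVIE_SELECTED", "title": "WALL-E"})
```

Because each step produces a fresh state object, `initial` and `loaded` stay untouched after later actions, which is what makes the state tree easy to inspect and replay.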
Future Stack
Joining the back and front end together helps us build out our application.

[Diagram. Labels: Ingest Data, Process Data, Learn/Optimise, Store, Batch/Stream, Store, User]
Extending the User Experience
● Currently have a very basic application
● Neo4j + GCP give us the ability to quickly iterate through new ideas using
a build, measure, learn methodology
○ Add new labels, nodes, relationships and properties to the graph
○ Ingest new data and add more ML
● User could add review comments
○ Sentiment analysis
○ Improve recommendation
● Users could add each other as friends
○ Improve recommendations
○ Push alerts
● Conversational UI/chatbot
● Live voting system
Live Demo
Thanks for listening
Any questions?
