
DEEP WALK ALGORITHM

How Graphs Represent Relationships (Social Networks & Web Pages)


1. Social Networks as Graphs
In a social network (e.g., Facebook, Twitter, LinkedIn), users are nodes, and their
connections (friendships, follows) are edges.
Example: Simple Social Network Graph
Consider four people: Alice, Bob, Charlie, and David
 Alice is friends with Bob and Charlie
 Bob is also friends with David
 Charlie and David are not directly connected
We can represent this as an undirected graph:
Alice ---- Bob
  |          |
Charlie    David
🔹 Nodes = {Alice, Bob, Charlie, David}
🔹 Edges = {(Alice, Bob), (Alice, Charlie), (Bob, David)}
👉 Use Case: Find "friend recommendations" using graph embeddings (DeepWalk).
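As a minimal sketch, this graph can be stored as a plain adjacency list in Python (the dict-of-sets representation and variable name are our own illustrative choices, not part of the original example):

# The social graph above as an adjacency list (dict of neighbor sets).
social_graph = {
    "Alice":   {"Bob", "Charlie"},
    "Bob":     {"Alice", "David"},
    "Charlie": {"Alice"},
    "David":   {"Bob"},
}

# Undirected: every edge appears in both endpoints' neighbor sets.
assert "Alice" in social_graph["Bob"] and "Bob" in social_graph["Alice"]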

2. Web Pages as Graphs (Google’s PageRank Algorithm)


The internet is a directed graph where:
 Web pages (URLs) are nodes
 Hyperlinks from one page to another are directed edges
Example: Small Web Graph
Page A → Page B
  ↓         ↓
Page C → Page D
🔹 Nodes = {A, B, C, D} (Webpages)
🔹 Edges (Hyperlinks) = {(A → B), (A → C), (B → D), (C → D)}
👉 Use Case: Google uses PageRank to rank pages based on incoming links.
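As a sketch of this idea, the small web graph above can be built and ranked with the networkx library (an assumed dependency; treat the exact scores as illustrative, since real PageRank runs on billions of pages):

import networkx as nx  # assumed dependency

# The small web graph from above, as a directed graph.
web = nx.DiGraph([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])

# PageRank scores pages by their incoming links (higher = more "important").
scores = nx.pagerank(web)
print(scores)  # D should rank highest: it receives links from both B and C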

What is a Random Walk?


A random walk on a graph is a path where each step is chosen randomly from the neighbors
of the current node. It's a way of exploring the graph structure by following edges in a
probabilistic manner.
🔹 Example:
 Start at a node.
 Move to a random neighbor.
 Repeat for a fixed number of steps.
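A minimal sketch of such a walk in Python (the adjacency-list representation and function name are our own, for illustration):

import random

def random_walk(graph, start, walk_length):
    """Follow randomly chosen edges from `start` for `walk_length` steps."""
    walk = [start]
    for _ in range(walk_length):
        neighbors = list(graph[walk[-1]])
        if not neighbors:          # dead end: stop the walk early
            break
        walk.append(random.choice(neighbors))
    return walk

# Example on the social graph from earlier:
graph = {"Alice": ["Bob", "Charlie"], "Bob": ["Alice", "David"],
         "Charlie": ["Alice"], "David": ["Bob"]}
print(random_walk(graph, "Alice", 4))  # e.g. ['Alice', 'Bob', 'David', 'Bob', 'Alice']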
Why Do Random Walks Capture Graph Structure?
Random walks encode structural information because they naturally reflect local and
global graph properties.
1. Proximity Information
 If two nodes are frequently visited together, they are likely structurally close.
2. Connectivity Patterns
 The probability of reaching a node depends on the graph topology.
 Hubs (high-degree nodes) are visited more often, capturing influential nodes.
3. Structural Similarity
 Nodes with similar neighborhoods will have similar random walk distributions.
 This helps in clustering and classification.
4. Adaptability
 Random walks can be biased to explore specific structures:
o DFS-like (depth-first) walks explore deep hierarchical structures.
o BFS-like (breadth-first) walks capture local communities.
What Are Embeddings?
Embeddings are dense vector representations of objects (e.g., words, nodes, or images) in a
lower-dimensional space, where similar objects have similar numerical representations.
Why Are Embeddings Useful?
 Dimensionality Reduction: Converts high-dimensional data (graphs, text, etc.) into
compact numerical vectors.
 Similarity Measurement: Nodes with similar roles in a graph have similar
embeddings.
 Machine Learning Input: Helps apply ML techniques like clustering and
classification on graphs.

Example: Word Embeddings vs. Node Embeddings


1. Word Embeddings (Word2Vec)
o "King" → [0.12, -0.34, 0.56, ...]
o "Queen" → [0.15, -0.32, 0.58, ...]
o Since "King" and "Queen" are related, their embeddings are close.
2. Node Embeddings (DeepWalk)
o Graph Example: Social Network (People as nodes, friendships as edges)
o "Alice" → [0.2, 0.4, -0.1, ...]
o "Bob" → [0.22, 0.38, -0.08, ...]
o Since Alice and Bob have similar connections, their embeddings are similar.
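Similarity between embeddings is usually measured with cosine similarity. A quick sketch using the illustrative vectors above, truncated to three dimensions (numpy is an assumed dependency):

import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

alice = np.array([0.2, 0.4, -0.1])
bob   = np.array([0.22, 0.38, -0.08])
print(cosine_similarity(alice, bob))  # close to 1.0, i.e. very similar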

Word2Vec
Word2Vec is a popular algorithm used in Natural Language Processing (NLP) to learn
vector representations (or embeddings) of words in a continuous vector space. The key idea
behind Word2Vec is that words that appear in similar contexts should have similar
representations. There are two primary models in Word2Vec:
1. Skip-Gram
2. Continuous Bag of Words (CBOW).

Skip-Gram Model (Word2Vec)


Goal:
 The Skip-Gram model tries to predict the context words given a target word.
 Example: Given the word "king", the model will try to predict its surrounding words,
like "royal", "queen", "throne", etc., based on the training data.

Example:
Let’s take a sentence:
"The cat sat on the mat."
Step-by-Step Breakdown
 Sentence:
"The cat sat on the mat."
 Context Window:
We define a context window size. For simplicity, let’s use a window size of 1,
meaning we only look at the one word before and one word after the target word.
 Step 1: Generate Training Data
We’ll create pairs of (target word, context word) from the sentence.
For each word in the sentence, we take it as a target word and consider the words within the
context window as context words.
o For the word "cat":
 Target word = cat
 Context words = "The" (left), "sat" (right)
o For the word "sat":
 Target word = sat
 Context words = "cat" (left), "on" (right)
o For the word "on":
 Target word = on
 Context words = "sat" (left), "the" (right)
o For the word "the":
 Target word = the
 Context words = "on" (left), "mat" (right)
Training pairs from this sentence:
(The, cat)
(cat, The), (cat, sat)
(sat, cat), (sat, on)
(on, sat), (on, the)
(the, on), (the, mat)
(mat, the)
Note that boundary words ("The" at the start, "mat" at the end) have only one context word each.
 Step 2: Learn Embeddings (Training the Model)
The Skip-Gram model uses these pairs to train a neural network. The network learns the weights (embeddings) for each word based on how often the words appear together as target-context pairs. The output of this training is a vector representation (embedding) for each word.
What Does the Output Look Like?
 The word embeddings are dense vectors (real-valued numbers). These vectors
capture the semantic meaning of words.
After training, words like "cat" and "mat" (which appear in the same contexts, e.g. right after "the") will
have similar embeddings, meaning they are closer in the vector space. Similarly, words like
"cat" and "dog" will tend to have similar embeddings, since across a large corpus they appear in similar contexts.
Word Embeddings Example:
 cat: [0.1, 0.3, -0.4, 0.8, -0.1]
 sat: [0.2, 0.1, -0.3, 0.7, -0.2]
 mat: [0.15, 0.35, -0.45, 0.85, -0.1]
Here, the values are made-up, illustrative numbers meant to show that "cat" and "mat" end up closer in
the vector space than "cat" and an unrelated word like "apple."
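As a hedged sketch, here is how Skip-Gram could be trained on this toy sentence with the gensim library (an assumed dependency; sg=1 selects Skip-Gram and window=1 matches the context window used above; one sentence is far too little data for meaningful embeddings, so treat the output as illustrative only):

from gensim.models import Word2Vec  # assumed dependency

sentences = [["the", "cat", "sat", "on", "the", "mat"]]

# sg=1 selects Skip-Gram; window=1 matches the context window used above.
model = Word2Vec(sentences, vector_size=5, window=1, sg=1, min_count=1, seed=42)

print(model.wv["cat"])                  # a 5-dimensional embedding for "cat"
print(model.wv.similarity("cat", "mat"))  # cosine similarity of two embeddings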

Key Insights from Word2Vec:


1. Semantic Similarity:
o Words that appear in similar contexts will have similar embeddings. For
instance, "king" and "queen" will be closer in the vector space because they
often appear in similar contexts, like "royalty" or "throne".
2. Analogies:
o Once trained, you can use vector arithmetic to solve analogies. For example:
 "king" - "man" + "woman" ≈ "queen".
 This is because the vector difference between "king" and "man" (a
male king) plus "woman" should yield a vector close to "queen".
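With well-trained vectors, libraries such as gensim expose this arithmetic directly. A sketch, assuming pretrained GloVe vectors downloaded via gensim's data downloader (the model name is one published option; the exact score is illustrative):

import gensim.downloader as api  # assumed dependency; downloads pretrained vectors

# Load small pretrained GloVe word vectors (an assumption: any pretrained set works).
vectors = api.load("glove-wiki-gigaword-50")

# king - man + woman, expressed as a most-similar query.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically returns 'queen' as the closest word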

Summary Table

Feature       | Random Walks on Graphs                           | Word2Vec (NLP)
Nodes         | Entities (e.g., people, items)                   | Words
Edges         | Relationships (e.g., friendships, interactions)  | Contextual relationships (co-occurring words)
Goal          | Learn node embeddings based on graph structure   | Learn word embeddings based on word context
Method        | Random movement through the graph                | Context-based prediction (Skip-Gram/CBOW)
Output        | Node embeddings (vector representations)         | Word embeddings (vector representations)
Applications  | Node classification, link prediction, clustering | Word similarity, text classification, NLP tasks

Conclusion
Both random walks and Word2Vec model relationships by exploring proximity (in graphs
or in text). The underlying concept of learning embeddings based on local structures makes
them highly effective for their respective domains.

Deep Walk Algorithm


DeepWalk is an algorithm for learning latent representations of nodes in a graph. It is based
on random walks and Skip-gram (word2vec) to generate vector embeddings for nodes in a
network. These embeddings help in various machine learning tasks like node classification,
clustering, and link prediction.
Steps of DeepWalk Algorithm
1. Random Walks:
o Perform short random walks from each node in the graph.
o These walks capture the local structure of the graph.
2. Generating Training Data:
o Treat each random walk as a sentence (sequence of nodes).
o Each node is like a word in natural language processing.
3. Skip-gram Model (Word2Vec):
o Use Skip-gram to learn node embeddings.
o The goal is to predict surrounding nodes given a central node.

4. Generating Node Embeddings:


o After training, each node gets a low-dimensional embedding.
o These embeddings can be used in machine learning tasks.
Example
Consider a simple graph:
A -- B -- C
|    |    |
D -- E -- F
Step 1: Random Walks
A random walk simulates how a node is connected to others by following links.
Random walk from 'A' (walk of 4 nodes):
A→B→E→F
Repeat this process multiple times for each node to generate training sequences.
Another walk, from B: B → C → F → E

Step 2: Training Data for Skip-gram


Each random walk is treated like a sentence in NLP.
For example, if we collect multiple walks:
['A', 'B', 'E', 'F']
['B', 'C', 'F', 'E']
['D', 'E', 'B', 'A']
Each node (word) is associated with its neighbors within a window size.
For Skip-gram, we take a central node and predict nearby nodes:

Center Node | Context Nodes (window = 2)
A           | B, E
B           | A, E, C
C           | B, F
E           | B, F, A, D

(The context sets shown are an illustrative subset of all pairs the walks above generate; a sketch of the pair-generation step follows.)
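A small sketch of how such (center, context) pairs can be generated from the walks (the function name and details are our own, for illustration):

def skipgram_pairs(walks, window=2):
    """Yield (center, context) pairs from node walks, as Skip-gram training data."""
    for walk in walks:
        for i, center in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield (center, walk[j])

walks = [['A', 'B', 'E', 'F'], ['B', 'C', 'F', 'E'], ['D', 'E', 'B', 'A']]
print(sorted(set(skipgram_pairs(walks))))  # all distinct (center, context) pairs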

Step 3: Learning Embeddings Using Word2Vec


Skip-gram model (from Word2Vec) is used to learn node embeddings.
Mathematical Formulation
For a node v, the probability of encountering a node u in its random-walk context is modeled with a softmax over all nodes:

P(u | v) = exp(vec(u) · vec(v)) / Σ_{w ∈ V} exp(vec(w) · vec(v))

where:
vec(v) is the embedding of node v.
V is the set of all nodes.
The (v, u) training pairs are drawn from the random walks w1, w2, ... generated in Step 1.

Step 4: Gradient Descent Optimization


The embeddings are updated using Stochastic Gradient Descent (SGD):
 Start with random embeddings.
 Compute prediction error using Skip-gram.
 Update embeddings based on gradient descent.
 Repeat until convergence.

Final Embeddings
After training, each node gets a low-dimensional vector that captures graph structure:
Example output (assuming vector_size=4):
Node A: [0.21, -0.33, 0.44, 0.56]
Node B: [0.18, -0.29, 0.51, 0.61]
Node E: [0.24, -0.35, 0.42, 0.59]
Nodes with similar roles in the graph will have closer embeddings.
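Putting the four steps together, here is a compact end-to-end sketch of DeepWalk on the example graph, assuming the networkx and gensim libraries (hyperparameters such as walk length, walks per node, and vector_size are illustrative choices, not the paper's defaults):

import random
import networkx as nx               # assumed dependency
from gensim.models import Word2Vec  # assumed dependency

# Step 0: the example graph A--B--C over D--E--F, with the vertical edges shown above.
G = nx.Graph([("A", "B"), ("B", "C"), ("A", "D"), ("B", "E"), ("C", "F"),
              ("D", "E"), ("E", "F")])

# Step 1: short random walks from every node.
def random_walk(graph, start, length):
    walk = [start]
    for _ in range(length):
        walk.append(random.choice(list(graph.neighbors(walk[-1]))))
    return walk

walks = [random_walk(G, node, 4) for node in G.nodes() for _ in range(10)]

# Steps 2-4: treat the walks as sentences and train Skip-gram (sg=1) on them.
model = Word2Vec(walks, vector_size=4, window=2, sg=1, min_count=1, epochs=20)

print(model.wv["A"])                       # a 4-dimensional embedding for node A
print(model.wv.most_similar("A", topn=2))  # structurally close nodes rank high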
Applications
1. Social Network Analysis
 Use Case: Identifying communities or groups within social networks based on user
interactions.
 How DeepWalk Helps: By representing users as nodes and interactions as edges,
DeepWalk can embed users in a continuous vector space, capturing their behavioral
patterns and relationships. This is useful for tasks like friend recommendations or
identifying influencers.
2. Link Prediction
 Use Case: Predicting missing edges or connections in a graph.
 How DeepWalk Helps: By learning node representations, DeepWalk can capture the
likelihood of a connection between two nodes. This is useful in social networks,
recommendation systems, or knowledge graphs where new relationships need to be
predicted.
3. Recommendation Systems
 Use Case: Suggesting products, services, or content based on user-item interactions.
 How DeepWalk Helps: In collaborative filtering, DeepWalk can learn embeddings of
users and items from interaction graphs (such as users interacting with products),
allowing it to make personalized recommendations based on learned similarities.
4. Biological Network Analysis
 Use Case: Analyzing protein-protein interaction networks or gene regulatory
networks.
 How DeepWalk Helps: DeepWalk can help identify clusters of proteins or genes with
similar functions by learning latent representations of the nodes, which can be used
for drug discovery or disease prediction.
5. Graph-based Search Engines
 Use Case: Ranking and searching over graph data, such as academic papers, patents,
or websites.
 How DeepWalk Helps: The algorithm can embed both documents and terms into a
shared vector space, facilitating semantic search and better ranking of search results
based on relationships between concepts.

6. Natural Language Processing (NLP)


 Use Case: Word embeddings and document classification.
 How DeepWalk Helps: In text-based applications, DeepWalk can model words and
their relationships as nodes in a graph (e.g., co-occurrence graph of words), and then
it learns embeddings that capture semantic meanings of words or documents.
7. Graph Visualization
 Use Case: Visualizing large-scale graphs and networks.
 How DeepWalk Helps: The learned embeddings can be used for dimensionality
reduction techniques such as t-SNE or PCA to visualize complex graphs in 2D or 3D
space, helping researchers and practitioners understand the structure of large datasets.
8. Knowledge Graph Construction
 Use Case: Building knowledge graphs for structured data representation.
 How DeepWalk Helps: DeepWalk can represent relationships in large knowledge
graphs, helping automate the extraction of facts and relationships from unstructured
data sources.
9. Anomaly Detection
 Use Case: Identifying unusual behavior or anomalies in graph data.
 How DeepWalk Helps: By learning the representations of nodes, DeepWalk can
identify outliers or anomalies by detecting nodes whose embeddings deviate
significantly from the majority of nodes in the graph.
10. Graph Classification
 Use Case: Classifying entire graphs based on their structure.
 How DeepWalk Helps: In applications where the goal is to classify entire graphs
(e.g., chemical compounds or social network communities), DeepWalk can embed
graphs into vector spaces and then use those embeddings for classification tasks.
In all these cases, DeepWalk works by generating random walks over the graph, treating them
as sentences (similar to word2vec), and then learning to represent the nodes in a low-
dimensional vector space. This allows the model to capture the local and global structures of
the graph in its node embeddings.
