NATURAL LANGUAGE PROCESSING
(62241203)
By
Dr. Kirti Raj Bhatele
Assistant Professor
Dept. of CSE
MITS-DU, Gwalior
Unit 3 syllabus
Syntactic and Semantic Analysis: Constituency and
dependency parsing, context-free grammars, probabilistic
context-free grammars, semantic role labeling, word sense
disambiguation, lexical semantics, distributional semantics,
word embeddings (Word2Vec, GloVe, FastText).
Syntactic and Semantic Analysis
What is Syntactic Analysis?
Syntactic analysis, also called parsing, is a crucial step in Natural
Language Processing (NLP) that involves analyzing the
grammatical structure of a sentence.
It determines how words in a sentence relate to one another and
conform to grammatical rules. Here’s a breakdown of its key
aspects in the next slide.
Key Concepts in Syntactic Analysis
• Grammar Rules: It relies on pre-defined grammatical rules, such as
those from Context-Free Grammars (CFGs), to validate sentence
structures.
• Parse Trees: A parse tree is generated to represent the hierarchical
structure of a sentence. Nodes in the tree represent grammatical
components (e.g., noun phrases, verb phrases).
• Types of Parsing:
– Constituency Parsing: Breaks down a sentence into constituents
(phrases or subphrases) according to a grammar.
– Dependency Parsing: Focuses on relationships between words (e.g.,
subject-verb) and generates a dependency tree.
Why is it Important in NLP?
Syntax Validation: Ensures that sentences follow grammatical rules.
Semantics Integration: Provides a foundation for semantic analysis,
where the meaning of a sentence is understood.
Applications: Used in machine translation, question answering
systems, and speech-to-text systems.
For example:
Sentence: "The cat sat on the mat.“
A syntactic analysis would reveal the subject ("The cat"), the verb
("sat"), and the object ("on the mat") along with their grammatical
roles.
What is Constituency Parsing?
Constituency Parsing is a method of syntactic
analysis in Natural Language Processing (NLP) that
focuses on identifying the hierarchical structure of a
sentence.
It breaks sentences into smaller units called
constituents, which can be individual words or
phrases, such as noun phrases (NP) or verb phrases
(VP).
Key points to understand in Constituency Parsing
• Hierarchical Representation: Constituency parsing produces a parse
tree that shows the grammatical structure of a sentence.
• The tree represents how constituents combine according to grammar
rules.
• Rules and Context-Free Grammars (CFGs): Constituency parsing
often uses CFGs, where production rules define how constituents
can be generated (e.g., S → NP VP, meaning a sentence consists of a
noun phrase and a verb phrase).
• Applications: It's used in tasks like machine translation, question
answering, and text summarization.
Key points to understand in Constituency Parsing
For example:
• Sentence: "The cat sat on the mat.”
Parse Tree:
4/7/2025
What is Dependency Parsing?
Dependency parsing is a syntactic analysis technique in NLP that
focuses on the relationships between words in a sentence.
Unlike constituency parsing, which breaks sentences into subphrases
(constituents), dependency parsing aims to identify how words depend
on or modify one another, creating a dependency tree.
Key Features of Dependency Parsing
1. Word Relationships
• Each word in a sentence is related to another word.
• For example: In the sentence "The cat sat on the mat," the word
"sat" is the main verb, and other words (like "cat" and "on") depend
on it grammatically.
2. Head and Dependents:
• Each word (called a dependent) is connected to another word
(called a head) by a directed arc.
• For example, in the phrase "on the mat," "on" is the head: "mat"
attaches to "on" as its dependent, and "the" in turn attaches to "mat."
Key Features of Dependency Parsing
3. Dependency Tree
A tree structure is generated where:
• The root represents the main verb or the overall action.
• Each word in the sentence is a node with arcs connecting
dependents to their heads.
Example:
• For the sentence "The dog barked loudly," a dependency tree might
look like this:

barked (root)
├── dog (subject)
│   └── The (determiner)
└── loudly (adverbial modifier)

Explanation:
"barked" is the root (main verb).
"dog" depends on "barked" as the subject.
"The" depends on "dog" as its determiner.
"loudly" modifies "barked" as an adverb.
Why Dependency Parsing Matters:
Applications
• Machine Translation: Understanding word relationships is crucial
for accurate translation.
• Question-Answering Systems: Helps determine which words or
phrases answer the question.
• Sentiment Analysis: Identifies dependencies to pinpoint sentiment-
bearing terms.
Why Dependency Parsing Matters:
Efficiency:
• Dependency parsing is often preferred in applications like
information extraction and search engines due to its direct focus on
word-level relationships.
• Dependency parsing is a powerful tool that aligns with the way
humans naturally process grammar—by identifying relationships
between words rather than just grouping them into phrases.
What is Semantic Analysis?
Semantic analysis is the process of understanding the
meaning conveyed by a sentence or text.
It focuses on analyzing the meaning of words, phrases, and
sentences to generate a meaningful representation.
This step is crucial in understanding the context, intent, and
relationships within a text.
Key Components of Semantic Analysis
Lexical Semantics: Examines the meaning of words, their
relationships (synonyms, antonyms), and their usage in context.
Example: Identifying that "bank" could mean a financial institution or
a riverbank depending on the sentence.
Compositional Semantics: Constructs the meaning of a sentence by
combining the meanings of individual words based on grammatical
rules.
Example: "John threw the ball" and "The ball threw John" have
different meanings due to word arrangement.
Key Components of Semantic Analysis
Semantic Role Labeling (SRL): Assigns roles to words in a sentence,
such as agent, action, or target.
Example: In the sentence "Alice baked a cake for Bob," the roles are:
– Agent: Alice
– Action: Baked
– Recipient: Bob
– Object: Cake
Key Components of Semantic Analysis
Word Sense Disambiguation (WSD): Resolves the ambiguity of
words with multiple meanings based on context.
Example: In "She went to the bank to deposit money," WSD identifies
that "bank" refers to a financial institution.
Knowledge Representation: Converts syntactic structures into
semantic formats, like logic-based representations or ontologies, to
capture the meaning of a sentence.
Key Components of Semantic Analysis
Example:
Sentence: "The dog chased the cat in the garden."
Semantic Analysis:
• Lexical Semantics: Recognizes "dog," "chased," "cat," and "garden."
• Compositional Semantics: Understands the overall meaning— an action of
chasing took place in a garden.
• Semantic Roles:
– Agent: Dog
– Action: Chased
– Target: Cat
– Location: Garden
Applications
Semantic analysis is used in areas like:
• Machine Translation: Translating text while retaining meaning.
• Question Answering: Extracting answers based on semantics.
• Text Summarization: Generating summaries that reflect the core
meaning.
Context-Free Grammars (CFGs)
Definition: CFGs are a formal system of rules used to describe the
syntactic structure of sentences in a language. They consist of:
– Terminals: Basic symbols (e.g., words in a sentence).
– Non-terminals: Abstract categories or parts of speech (e.g.,
Noun Phrase, Verb Phrase).
– Start Symbol: The initial non-terminal from which parsing
begins.
– Production Rules: Rules that specify how non-terminals can be
replaced with terminals or other non-terminals.
Context-Free Grammars (CFGs)
Here is an example CFG:

S → NP VP
NP → Det N
VP → V PP
PP → P NP
Det → "the" | "The"
N → "cat" | "mat"
V → "sat"
P → "on"
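The grammar above can be tried out with NLTK's chart parser. A minimal sketch, assuming NLTK is installed:

import nltk

# The example CFG written in NLTK's grammar notation.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V PP
PP -> P NP
Det -> 'the' | 'The'
N -> 'cat' | 'mat'
V -> 'sat'
P -> 'on'
""")

# A chart parser enumerates every parse tree the grammar allows.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("The cat sat on the mat".split()):
    tree.pretty_print()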
Probabilistic Context-Free Grammars (PCFGs)
PCFGs extend CFGs by assigning probabilities to production rules.
This helps handle ambiguity by choosing the most likely parse tree for
a sentence.
Mechanism:
Each rule is associated with a probability (e.g., P(N → "cat") = 0.6,
P(N → "dog") = 0.4).
The probability of a parse tree is the product of the probabilities of the
rules used.
Probabilistic Context-Free Grammars (PCFGs)
Example: PCFG for a simple sentence structure. Parsing "The cat chased
a dog" uses the rules:

S → NP VP (1.0)
NP → Det N (0.8)
Det → "The" (0.5)
N → "cat" (0.6)
VP → V NP (0.7)
V → "chased" (0.8)
NP → Det N (0.8)
Det → "a" (0.5)
N → "dog" (0.4)

Total probability: P = 1.0 × 0.8 × 0.5 × 0.6 × 0.7 × 0.8 × 0.8 × 0.5 × 0.4 ≈ 0.0215
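The same computation can be reproduced with NLTK's probabilistic parser. A sketch: since the probabilities for each left-hand side must sum to 1, a few extra low-probability rules (NP -> N, VP -> V, V -> 'sat') are added here as assumptions to complete the grammar:

import nltk

grammar = nltk.PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.8]
NP -> N [0.2]
VP -> V NP [0.7]
VP -> V [0.3]
Det -> 'The' [0.5]
Det -> 'a' [0.5]
N -> 'cat' [0.6]
N -> 'dog' [0.4]
V -> 'chased' [0.8]
V -> 'sat' [0.2]
""")

# The Viterbi parser returns the single most probable parse tree,
# printed together with its probability (about 0.0215 here).
parser = nltk.ViterbiParser(grammar)
for tree in parser.parse("The cat chased a dog".split()):
    print(tree)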
Major Differences between CFGs and PCFGs in NLP
• CFGs: Provide the structure and foundation for parsing algorithms,
used in syntax validation and text processing.
• PCFGs: Handle linguistic ambiguities, enabling more accurate
syntactic and semantic understanding.
• Let's take an example to understand it more clearly.
Major Differences between CFGs and PCFGs in NLP
Ambiguous Sentence
Consider the sentence: "The chicken is ready to eat."
This sentence can have two possible interpretations:
Interpretation 1: The chicken is prepared and ready for someone to
eat.
Interpretation 2: The chicken (animal) itself is ready to eat something.
Major Differences between CFGs and PCFGs in NLP
Grammar Rules with Probabilities
Let's define a PCFG for parsing this sentence, assigning probabilities
to each production rule based on linguistic data.
Major Differences between CFGs and PCFGs in NLP
The first interpretation ("The chicken is prepared and ready for
someone to eat") is more likely because it has a higher probability
(0.42) compared to the second interpretation (0.12).
PCFGs use these probabilities to resolve ambiguities and select the
most likely parse tree, making them valuable for tasks like machine
translation or voice recognition.
Semantic Role Labeling (SRL) in NLP
Semantic Role Labeling (SRL) is a Natural Language Processing
(NLP) technique used to identify and classify the semantic roles of
words or phrases in a sentence.
It answers "who did what to whom, when, where, and how",
providing insight into the meaning of a sentence.
Key Concepts
• Predicate and Arguments:
– Predicate: The main verb in the sentence that describes the
action or state.
– Arguments: The words or phrases associated with the predicate
that participate in the event. Examples include the subject,
object, and other modifiers.
Key Concepts
Semantic Roles: Common semantic roles include:
– Agent: The doer of the action (e.g., "The dog" in "The dog
chased the ball").
– Patient/Theme: The entity affected by the action (e.g., "the
ball" in "The dog chased the ball").
– Instrument: The means by which an action is performed.
– Location: The place where the action occurs.
– Temporal: The time when the action happens.
Key Concepts
Annotation and Labels: SRL involves annotating text with labels that
indicate the semantic roles of words or phrases.
For instance: [Agent: The dog] [Predicate: chased] [Theme: the ball].
How SRL Works
• Step 1: Identify the Predicate: Locate the main verb(s) in the
sentence, which serves as the anchor for semantic roles.
• Step 2: Determine Arguments: Find other elements in the
sentence related to the predicate, such as subjects, objects, and
modifiers.
• Step 3: Assign Semantic Roles: Classify each argument into
predefined semantic roles based on their relationship with the
predicate.
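These three steps can be approximated with a rule-based sketch on top of a dependency parse. This is only a heuristic illustration: the mapping from dependency labels to semantic roles below is an assumption, whereas production SRL systems use trained models.

import spacy

nlp = spacy.load("en_core_web_sm")

# Hypothetical, simplified mapping from dependency labels to roles.
DEP_TO_ROLE = {"nsubj": "Agent", "dobj": "Theme", "dative": "Recipient"}

def toy_srl(sentence):
    doc = nlp(sentence)
    for token in doc:
        if token.pos_ != "VERB":
            continue                            # Step 1: find the predicate
        print(f"Predicate: {token.text}")
        for child in token.children:            # Step 2: find its arguments
            role = DEP_TO_ROLE.get(child.dep_)  # Step 3: assign a role
            if role:
                span = doc[child.left_edge.i : child.right_edge.i + 1]
                print(f"  {role}: {span.text}")

toy_srl("The dog chased the ball.")
# Predicate: chased / Agent: The dog / Theme: the ball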
Applications of Semantic Role Labeling
SRL is applied in various domains, such as:
• Information Retrieval: Extracting structured data from text for
answering questions.
• Machine Translation: Improving translation by capturing
sentence-level semantics.
• Dialogue Systems: Enhancing natural communication between
humans and machines.
Applications of Semantic Role Labeling
Example
Sentence: John baked a cake for his friend's birthday at 5 PM.
• [Agent: John]
• [Predicate: baked]
• [Theme: a cake]
• [Beneficiary: his friend's birthday]
• [Temporal: at 5 PM]
Why SRL Matters in NLP
• Improves meaning extraction from text.
• Helps machines answer “wh-” questions effectively (who, what,
when, etc.).
• Aids in semantic search, chatbots, translation, and
summarization.
Tools for SRL
Popular libraries and frameworks include:
• SpaCy: Dependency parsing, often used as a basis for rule-based SRL.
• AllenNLP: Advanced models for SRL using neural networks.
• PropBank/FrameNet: Datasets for semantic role annotation.
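For example, AllenNLP's pretrained BERT-based SRL model can be used in a few lines. A sketch; the model archive URL below is the one AllenNLP has published for its demo models, but it should be verified against the current AllenNLP documentation:

from allennlp.predictors.predictor import Predictor

# Load the pretrained BERT-based SRL model (a large download on first run).
predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/"
    "structured-prediction-srl-bert.2020.12.15.tar.gz"
)

result = predictor.predict(sentence="The chef cooked dinner for the guests.")
for verb in result["verbs"]:
    # PropBank-style frames, e.g. [ARG0: The chef] [V: cooked] [ARG1: dinner] ...
    print(verb["description"])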
Activity
The chef cooked dinner for the guests in the evening.
Now identify them
• Predicate
• Agent
• Theme
• Recipient
• Time
Activity
“The chef cooked dinner for the guests in the evening”.
Semantic Role Labeling Answer:
Predicate: cooked
Agent: The chef
Theme (Patient): dinner
Recipient: the guests
Time: in the evening
SRL Output: [Agent: The chef] [Predicate: cooked] [Theme: dinner]
[Recipient: for the guests] [Time: in the evening]
Word Sense Disambiguation (WSD) in NLP
• Word Sense Disambiguation (WSD) is the process of identifying
the correct sense or meaning of a word in a given context. In
natural language, many words are polysemous, meaning they have
multiple meanings. For example:
• The word bank can mean a financial institution or the edge of a
river.
• WSD is an essential task in Natural Language Processing (NLP)
because accurate interpretation of word meanings improves
performance in various applications, including machine translation,
information retrieval, and text summarization.
Word Sense Disambiguation (WSD) in NLP
Types of Ambiguities
• Lexical Ambiguity: When a word has multiple meanings.
Example: bat (an animal) vs. bat (used in sports).
• Syntactic Ambiguity: Ambiguity arising due to sentence structure.
While not WSD itself, resolving syntax can help disambiguate
word meanings. Consider the sentence: "I saw the man with the
telescope."
This sentence has multiple interpretations due to its syntactic structure:
Meaning 1: I used a telescope to see the man.
Meaning 2: The man I saw was holding a telescope.
Word Sense Disambiguation (WSD) in NLP
Approaches to WSD
• Knowledge-Based Approaches:
– Use dictionaries, thesauri, and ontologies (e.g., WordNet).
– Algorithms:
• Lesk Algorithm: Disambiguates words by finding overlaps
in their dictionary definitions and the surrounding context.
• Path-Based Methods: Use the semantic relationships (e.g.,
hypernyms, hyponyms) in a lexical database like WordNet to
calculate similarity.
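NLTK ships a simplified Lesk implementation over WordNet. A minimal sketch, assuming the wordnet and punkt resources have been downloaded; note that the simplified algorithm does not always pick the intuitively correct sense:

from nltk.tokenize import word_tokenize
from nltk.wsd import lesk

# Requires: nltk.download("wordnet"); nltk.download("punkt")
context = word_tokenize("She went to the bank to deposit money")
sense = lesk(context, "bank", pos="n")  # picks the sense with most overlap
print(sense, "-", sense.definition())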
Word Sense Disambiguation (WSD) in NLP
Approaches to WSD
• Supervised Machine Learning:
– Requires a labeled dataset where the correct sense of a word is
annotated.
– Common algorithms: Decision Trees, SVMs, and Neural
Networks.
– Limitation: Depends on annotated corpora, which are costly to
create.
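As a toy illustration, supervised WSD can be framed as text classification over the ambiguous word's context. The four training examples below are invented for illustration; real systems train on annotated corpora such as SemCor:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy labeled contexts for the ambiguous word "bank".
contexts = [
    "deposit money in the bank",
    "the bank approved the loan",
    "sat on the river bank fishing",
    "the grassy bank of the stream",
]
senses = ["finance", "finance", "river", "river"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(contexts, senses)
# "deposit" only occurs in finance contexts, so this should predict 'finance'.
print(clf.predict(["deposit cash at the bank"]))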
Word Sense Disambiguation (WSD) in NLP
Approaches to WSD
• Unsupervised Approaches:
– Cluster word instances based on context similarity without
labeled data.
– Methods include clustering algorithms like K-Means.
• Contextualized Embeddings (Modern Approach):
– Leverages deep learning models like BERT or ELMo, where
embeddings of words change based on context.
– Example: "He sat on the bank" vs. "He deposited money in the
bank."
Word Sense Disambiguation (WSD) in NLP
Applications
• Machine Translation: Accurate sense selection leads to better
translations.
• Information Retrieval: Helps in retrieving contextually relevant
documents.
• Text-to-Speech Systems: Determines correct pronunciation based
on word sense.
Word Sense Disambiguation (WSD) in NLP
Challenges
• Scarcity of large labeled datasets for supervised learning.
• Semantic similarity can be subjective and vary across different
contexts.
• Polysemy and homonymy make WSD a computationally
challenging task.
Introduction to Lexical Semantics
• Lexical semantics is the study of word meanings and their
relationships within a language.
• It focuses on understanding how words represent concepts, their
nuances, and connections with other words.
Introduction to Lexical Semantics
Key components include:
• Polysemy: A word with multiple meanings (e.g., light can mean
brightness or not heavy).
• Synonymy: Words with similar meanings (e.g., big and large).
• Antonymy: Words with opposite meanings (e.g., hot and cold).
• Hyponymy/Hypernymy: Hierarchical relationships, where one word is a
subset of another (e.g., sparrow is a hyponym of bird).
• Meronymy: Part-whole relationships (e.g., wheel is a part of car).
• Lexical semantics helps develop ontologies (e.g., WordNet) used in
Natural Language Processing (NLP) tasks like Word Sense
Disambiguation and Information Retrieval.
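Several of these relations can be queried from WordNet through NLTK. A brief sketch, assuming nltk.download("wordnet") has been run:

from nltk.corpus import wordnet as wn

# Hyponymy/hypernymy: what kind of thing is a sparrow?
sparrow = wn.synset("sparrow.n.01")
print(sparrow.hypernyms())       # more general parent synsets

# Meronymy: part-whole relations for "car"
car = wn.synset("car.n.01")
print(car.part_meronyms())       # synsets naming parts of a car

# Antonymy is defined on lemmas rather than synsets
hot = wn.lemma("hot.a.01.hot")
print(hot.antonyms())            # [Lemma('cold.a.01.cold')]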
Introduction to Distributional Semantics
Distributional semantics is based on the idea that the meaning of a
word can be derived from its context in large corpora of text. It
operationalizes the distributional hypothesis, stated by J.R. Firth:
"You shall know a word by the company it keeps."
Key features:
• Words are represented as vectors in a high-dimensional semantic
space, where distances between vectors indicate semantic
similarity.
• Example: In a context, the word dog might appear close to words
like pet, cat, and animal.
Introduction to Distributional Semantics
Applications:
• Synonym detection
• Semantic similarity computation
• Document clustering
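A count-based toy model makes the idea concrete. A sketch with scikit-learn; the three sentences are an invented micro-corpus:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the dog is a loyal pet",
    "the cat is a quiet pet",
    "the car needs new fuel",
]
X = CountVectorizer().fit_transform(docs)  # word-count vectors

# The two pet sentences share more context words, so they sit closer
# together in the vector space than either does to the car sentence.
print(cosine_similarity(X[0], X[1]))  # higher
print(cosine_similarity(X[0], X[2]))  # lower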
Introduction to Word Embeddings
• Word embeddings are a type of representation that converts words into dense,
continuous vectors while capturing semantic and syntactic relationships. They are
a practical implementation of distributional semantics.
• Word2Vec:
– Developed by Google (2013).
– Uses two architectures:
• CBOW (Continuous Bag of Words): Predicts a word based on its
context.
• Skip-Gram: Predicts the context for a given word.
– Captures relationships like:
• king – man + woman ≈ queen
– Advantage: Produces embeddings with semantic richness.
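Word2Vec can be trained with gensim in a few lines. A toy sketch; the micro-corpus is invented, so the resulting vectors are only illustrative:

from gensim.models import Word2Vec

sentences = [
    ["the", "dog", "is", "a", "pet"],
    ["the", "cat", "is", "a", "pet"],
    ["dogs", "and", "cats", "are", "animals"],
]

# sg=1 selects the Skip-Gram architecture; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
print(model.wv.most_similar("dog", topn=3))

# With large pretrained vectors, analogies such as king - man + woman ≈ queen
# can be tested via model.wv.most_similar(positive=["king", "woman"],
# negative=["man"]).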
Introduction to Word Embeddings
GloVe (Global Vectors for Word Representation):
• Developed at Stanford.
• Combines global word co-occurrence statistics with local context
to create embeddings.
• Example: Co-occurrence of ice and cold vs. steam and hot helps
distinguish related word groups.
• Advantage: Efficient and interpretable.
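Pretrained GloVe vectors can be loaded through gensim's downloader. A sketch; "glove-wiki-gigaword-50" is a gensim-data model identifier that is downloaded on first use:

import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")  # 50-dimensional GloVe vectors
print(glove.most_similar("ice", topn=5))    # neighbors such as "snow"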
Introduction to Word Embeddings
FastText:
• Developed by Facebook.
• Extends Word2Vec by considering subword information (e.g.,
character n-grams).
• Example: The model treats play and playing as closely related by
embedding their subwords (e.g., play, -ing).
• Advantage: Handles rare and misspelled words better.
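gensim also provides a FastText implementation. A toy sketch showing the subword advantage; the two-sentence corpus is invented:

from gensim.models import FastText

sentences = [
    ["children", "like", "to", "play"],
    ["she", "is", "playing", "outside"],
]
model = FastText(sentences, vector_size=50, window=3, min_count=1)

# "plays" never appears in the corpus, yet FastText still composes a
# vector for it from shared character n-grams like "pla" and "lay".
print("plays" in model.wv.key_to_index)  # False: out of vocabulary
print(model.wv["plays"][:5])             # a vector is produced anyway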