NLP Study Notes for Computer Engineering
BE, Semester VII, Computer Engineering
1. What is Natural Language Processing?
Figure: Natural Language Processing Stages
Natural Language Processing (NLP) is a subfield of artificial intelligence and computational
linguistics that focuses on enabling computers to understand, interpret, and generate human
language in a valuable way. It involves the interaction between computers and human
language, particularly how to program computers to process and analyze large amounts of
natural language data.
Key aspects of NLP include:
Language Understanding: Comprehending the meaning behind text or speech
Language Generation: Producing human-like text or speech
Translation: Converting text from one language to another
Sentiment Analysis: Determining the emotional tone of text
NLP combines computational linguistics with machine learning and deep learning models to
process and understand language. These technologies enable computers to process human
language in the form of text or voice data and understand its full meaning, complete with the
speaker or writer's intent and sentiment.
2. Discuss various stages involved in NLP process with
suitable example.
Figure: NLP Process Stages
The NLP process involves several stages that transform raw text into structured information.
These stages work sequentially to extract meaning from natural language:
1. Lexical Analysis: Breaking down text into tokens (words, phrases) and identifying their
parts of speech.
Example: "The cat sits on the mat" → ["The", "cat", "sits", "on", "the", "mat"]
2. Syntactic Analysis: Analyzing the grammatical structure of sentences to understand how
words relate to each other.
Example: Creating a parse tree to show subject-verb-object relationships
3. Semantic Analysis: Extracting the meaning of text by understanding word relationships
and context.
Example: Recognizing that "bank" refers to a financial institution in "I deposited money in
the bank"
4. Discourse Integration: Understanding the relationship between sentences and the overall
context.
Example: Connecting pronouns to their antecedents across sentences
5. Pragmatic Analysis: Interpreting the intended meaning based on context and world
knowledge.
Example: Understanding sarcasm or implied meaning in "Great, another meeting!"
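The first stage, lexical analysis, can be sketched in a few lines of Python. The regex-based `simple_tokenize` below is a toy helper written for these notes, not a production tokenizer; real systems use trained tokenizers from libraries such as NLTK or spaCy:

```python
import re

def simple_tokenize(text):
    """Lexical analysis: split raw text into word and punctuation tokens.

    A toy regex tokenizer for illustration only; trained tokenizers
    also handle contractions, numbers, and multi-word expressions.
    """
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)

print(simple_tokenize("The cat sits on the mat."))
# ['The', 'cat', 'sits', 'on', 'the', 'mat', '.']
```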
3. Explain the difference between Natural Language and
Computer Language.
Figure: Natural Language vs Computer Language
Natural Language and Computer Language differ significantly in their structure, purpose, and
characteristics:
Feature         | Natural Language                | Computer Language
----------------|---------------------------------|----------------------------------
Structure       | Flexible, ambiguous, evolving   | Rigid, unambiguous, standardized
Purpose         | Human communication             | Instructing computers
Interpretation  | Context-dependent               | Literal, exact
Vocabulary      | Vast, continuously growing      | Limited, predefined keywords
Error tolerance | High (humans can infer meaning) | Low (syntax errors cause failure)
Natural languages like English or Spanish have evolved over centuries and contain ambiguities,
idioms, and cultural nuances. Computer languages like Python or Java are designed with
precise syntax and semantics to eliminate ambiguity, ensuring computers can execute
instructions exactly as specified.
4. What do you mean by ambiguity in Natural language?
Explain with suitable example. Discuss various ways to
resolve ambiguity in NL.
Figure: Ambiguity in Natural Language
Ambiguity in Natural Language refers to situations where a word, phrase, or sentence can be
interpreted in multiple ways, leading to uncertainty about the intended meaning. This is a
fundamental challenge in NLP as computers struggle with context-dependent interpretation.
Types of ambiguity with examples:
Lexical Ambiguity: Words with multiple meanings.
Example: "I saw her duck" - Did I see her lower her head or see her pet duck?
Syntactic Ambiguity: Sentences with multiple grammatical structures.
Example: "The chicken is ready to eat" - Is the chicken hungry or ready to be eaten?
Semantic Ambiguity: Phrases with multiple interpretations.
Example: "Visiting relatives can be annoying" - Are the relatives who visit annoying or is
visiting them annoying?
Methods to resolve ambiguity:
1. Context Analysis: Examining surrounding words and broader context
2. Statistical Methods: Using probability models to determine most likely meaning
3. Knowledge Bases: Leveraging domain-specific information
4. Machine Learning: Training models on large annotated datasets
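Context analysis can be illustrated with the classic Lesk algorithm, which picks the WordNet sense whose dictionary gloss overlaps most with the surrounding words. Below is a minimal sketch using NLTK's implementation (assuming nltk is installed and the WordNet corpus has been fetched via nltk.download('wordnet'); Lesk is a heuristic and may not always select the intended sense):

```python
from nltk.wsd import lesk

# Disambiguate "bank" in context: Lesk compares each WordNet sense's
# gloss against the context words and returns the best-overlapping sense.
context = "I deposited money in the bank".split()
sense = lesk(context, "bank")
print(sense, "->", sense.definition())
```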
5. Discuss various challenges in processing natural
language.
Figure: Challenges in Natural Language Processing
Natural Language Processing faces several significant challenges that make it a complex field:
Ambiguity Resolution: Words and sentences often have multiple interpretations, requiring
sophisticated context analysis.
Context Understanding: Language meaning heavily depends on context, which can be
cultural, situational, or conversational.
Language Variability: Languages evolve constantly, with new words, slang, and changing
usage patterns.
Resource Intensity: Training effective NLP models requires massive datasets and
computational power.
Cross-lingual Challenges: Different languages have unique structures, making translation
and multilingual processing difficult.
Domain Adaptation: Models trained on general text often perform poorly on specialized
domains like medical or legal texts.
Ethical Concerns: NLP systems can perpetuate biases present in training data, raising
fairness and privacy issues.
These challenges require continuous research and development of more sophisticated
algorithms, larger and more diverse datasets, and better evaluation metrics to create NLP
systems that can truly understand and generate human-like language.
6. What is Morphology? Explain Morphological
Analysis.
Figure: Morphology and Morphological Analysis
Morphology is the study of word formation and structure in linguistics. It examines how
morphemes (the smallest meaningful units of language) combine to create words. In NLP,
morphological analysis is the process of breaking down words into their constituent morphemes
to understand their structure and meaning.
Morphological analysis involves:
1. Segmentation: Breaking words into morphemes.
Example: "unhappiness" → "un" + "happy" + "ness"
2. Classification: Identifying morpheme types (prefixes, suffixes, roots).
Example: "un" (prefix), "happy" (root), "ness" (suffix)
3. Analysis: Determining the grammatical function of each morpheme.
Example: "un" (negation), "ness" (noun formation)
Morphological analysis is crucial for many NLP tasks including:
Stemming and lemmatization
Part-of-speech tagging
Machine translation
Information retrieval
By understanding word structure, NLP systems can better handle inflection, derivation, and
compounding across different languages.
7. Differentiate between Derivational and Inflectional
Morphemes.
Figure: Derivational vs Inflectional Morphemes
Derivational and Inflectional Morphemes are two types of bound morphemes that serve
different functions in word formation:
Feature      | Derivational Morphemes       | Inflectional Morphemes
-------------|------------------------------|-------------------------------
Function     | Create new words             | Modify grammatical properties
Effect       | Changes word meaning/class   | Preserves word meaning/class
Position     | Usually prefixes or suffixes | Usually suffixes
Productivity | Limited application          | Regularly applied
Examples of Derivational Morphemes:
"un-" in "unhappy" (changes meaning to negative)
"-ness" in "happiness" (changes adjective to noun)
"-ize" in "modernize" (changes noun to verb)
Examples of Inflectional Morphemes:
"-s" in "cats" (plural marker)
"-ed" in "walked" (past tense marker)
"-ing" in "walking" (present participle)
Understanding this distinction is crucial for morphological analysis in NLP systems.
8. Define Stemming and Lemmatization. How do they
work?
Figure: Stemming vs Lemmatization
Stemming and Lemmatization are techniques used in NLP to reduce words to their base
forms, but they differ in approach and accuracy:
Stemming:
A crude heuristic process that chops off word endings
Produces stems that may not be actual words
Fast but less accurate
Example: "studies", "studying", "studied" → "studi"
How Stemming Works:
1. Apply predefined rules to remove suffixes
2. Common algorithms: Porter Stemmer, Snowball Stemmer
3. Does not consider word context or part of speech
Lemmatization:
A sophisticated process using vocabulary and morphological analysis
Produces actual dictionary words (lemmas)
Slower but more accurate
Example: "studies", "studying", "studied" → "study"
How Lemmatization Works:
1. Determine the part of speech of a word
2. Apply morphological analysis to find the root form
3. Use dictionaries or ontologies like WordNet
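The two techniques can be compared directly with NLTK (a minimal sketch, assuming nltk is installed and nltk.download('wordnet') has been run for the lemmatizer):

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["studies", "studying", "studied"]:
    print(word,
          "-> stem:", stemmer.stem(word),              # crude suffix chopping
          "| lemma:", lemmatizer.lemmatize(word, "v")) # dictionary lookup, verb POS
# studies  -> stem: studi | lemma: study
# studying -> stem: studi | lemma: study
# studied  -> stem: studi | lemma: study
```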
9. Explain the N-gram model.
Figure: N-gram Model
N-gram Model is a probabilistic language model used in NLP for predicting the next item in a
sequence of text. It's based on the Markov assumption that the probability of a word depends
only on the previous n-1 words.
Types of N-grams:
Unigram (1-gram): Single word
Example: "The", "quick", "brown"
Bigram (2-gram): Two consecutive words
Example: "The quick", "quick brown"
Trigram (3-gram): Three consecutive words
Example: "The quick brown"
How N-gram Models Work:
1. Training: Count frequencies of n-grams in a large corpus
2. Probability Calculation:
P(w_i | w_{i-n+1}, ..., w_{i-1}) = Count(w_{i-n+1}, ..., w_i) / Count(w_{i-n+1}, ..., w_{i-1})
3. Smoothing: Handle unseen n-grams using techniques like Laplace smoothing
4. Prediction: Select the most probable next word given previous context
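Steps 1 and 2 can be made concrete with a bigram model over a toy corpus (a minimal unsmoothed sketch; the one-sentence corpus is made up for illustration):

```python
from collections import Counter

corpus = "the quick brown fox jumps over the lazy dog".split()

bigrams = Counter(zip(corpus, corpus[1:]))   # Count(w_{i-1}, w_i)
unigrams = Counter(corpus)                   # Count(w_{i-1})

def bigram_prob(prev, word):
    """P(word | prev) = Count(prev, word) / Count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

print(bigram_prob("the", "quick"))   # 0.5 - "the" is followed once
print(bigram_prob("the", "lazy"))    # 0.5   each by "quick" and "lazy"
```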
N-gram models are used in spelling correction, speech recognition, machine translation, and
text generation. While simple and effective, they have limitations in capturing long-range
dependencies in language.
10. What is POS tagging? What are open and closed
classes in POS tagging?
Figure: POS Tagging Example
POS (Part-of-Speech) Tagging is the process of marking words in a text as corresponding to a
particular part of speech (noun, verb, adjective, etc.) based on both their definition and context.
It's a fundamental task in NLP that helps in understanding sentence structure and meaning.
Open Classes:
Word categories that readily accept new members
Continuously evolving with language
Examples: Nouns, Verbs, Adjectives, Adverbs
Can be further subclassified (e.g., common nouns, proper nouns)
Closed Classes:
Word categories with relatively fixed membership
Resistant to adding new words
Examples: Pronouns, Prepositions, Conjunctions, Determiners, Auxiliary verbs
Function primarily to express grammatical relationships
POS tagging algorithms use:
Rule-based approaches (hand-crafted grammar rules)
Stochastic approaches (Hidden Markov Models, Maximum Entropy)
Transformation-based approaches (Brill Tagger)
Deep learning approaches (Neural Networks, BERT)
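A minimal tagging sketch with NLTK (assuming nltk is installed and the 'punkt' and 'averaged_perceptron_tagger' resources have been fetched via nltk.download):

```python
import nltk

# Tokenize, then tag each token with its part of speech.
tokens = nltk.word_tokenize("The cat sits on the mat")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sits', 'VBZ'),
#  ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]
```

Note how the output mixes open-class tags (NN noun, VBZ verb) with closed-class tags (DT determiner, IN preposition).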
Accurate POS tagging is essential for syntactic parsing, information extraction, and many other
NLP applications.
11. What are Hidden Markov Models (HMM)?
Figure: Hidden Markov Model
Hidden Markov Models (HMM) are statistical models used to represent systems that are
assumed to be Markov processes with unobservable (hidden) states. In NLP, HMMs are widely
used for sequence labeling tasks like Part-of-Speech tagging, named entity recognition, and
speech recognition.
Key Components of HMM:
States: The hidden variables we want to infer (e.g., POS tags)
Observations: The visible outputs (e.g., words in a sentence)
Transition Probabilities: P(state_j | state_i) - probability of moving from one state to
another
Emission Probabilities: P(observation | state) - probability of an observation given a state
Initial State Probabilities: P(state_i) - probability of starting in each state
Three Fundamental Problems in HMM:
1. Evaluation: Given an HMM and observation sequence, calculate the probability of the
observation sequence (solved using Forward algorithm)
2. Decoding: Given an HMM and observation sequence, find the most likely sequence of
hidden states (solved using Viterbi algorithm)
3. Learning: Given observation sequences, adjust model parameters to best fit the data
(solved using Baum-Welch algorithm)
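The decoding problem can be made concrete with a toy Viterbi implementation. The two-state tagger and every probability below are invented purely for illustration:

```python
states = ["Noun", "Verb"]
start_p = {"Noun": 0.6, "Verb": 0.4}              # initial state probabilities
trans_p = {"Noun": {"Noun": 0.3, "Verb": 0.7},    # transition probabilities
           "Verb": {"Noun": 0.8, "Verb": 0.2}}
emit_p = {"Noun": {"dogs": 0.6, "bark": 0.1},     # emission probabilities
          "Verb": {"dogs": 0.1, "bark": 0.7}}

def viterbi(observations):
    """Return the most likely hidden state sequence for the observations."""
    # V[t][s]: probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    path = {s: [s] for s in states}
    for obs in observations[1:]:
        V.append({})
        new_path = {}
        for s in states:
            # Choose the best previous state to transition from
            prob, prev = max((V[-2][p] * trans_p[p][s] * emit_p[s][obs], p)
                             for p in states)
            V[-1][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(V[-1], key=V[-1].get)
    return path[best]

print(viterbi(["dogs", "bark"]))   # ['Noun', 'Verb']
```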
HMMs are particularly useful in NLP because they can effectively model the sequential nature
of language and handle uncertainty in state assignments.