
Unit 2 Pos Tagger

Parts of Speech (PoS) tagging is a fundamental task in Natural Language Processing (NLP) that assigns grammatical categories to words, enhancing machine understanding of human language. It is crucial for various applications like machine translation and sentiment analysis, involving processes such as tokenization, language model loading, and linguistic analysis. Different methods of PoS tagging exist, including rule-based, transformation-based, and statistical approaches, each with its own advantages and disadvantages.

Uploaded by

Stella Thanis


POS (Parts-Of-Speech) Tagging in NLP
Parts of Speech (PoS) tagging is a core task in NLP.
It assigns each word a grammatical category such as
noun, verb, adjective or adverb. By exposing phrase
structure and giving clues about meaning, it lets
machines analyze human language more accurately.
PoS tagging is essential in many NLP applications
like machine translation, sentiment analysis and
information retrieval. It serves as a link between
language and machine understanding, enabling the
creation of complex language processing systems.
POS tagging illustration

POS (Parts-Of-Speech) Tagging
Parts of Speech tagging is a linguistic task
in Natural Language Processing (NLP) in which each
word in a document is assigned a part of speech
(adverb, adjective, verb etc.) or grammatical
category. By adding a layer of syntactic and
semantic information to the words, this procedure
makes it easier to understand the sentence's
structure and meaning.
In NLP applications, POS tagging is useful
for machine translation, named entity
recognition and information extraction, among other
things. It also helps resolve ambiguity in words
that have several meanings and reveals a
sentence's grammatical structure.
Example of POS Tagging
Consider the sentence: "The quick brown fox jumps
over the lazy dog."
After performing POS Tagging:
• "The" is tagged as determiner (DT)
• "quick" is tagged as adjective (JJ)
• "brown" is tagged as adjective (JJ)
• "fox" is tagged as noun (NN)
• "jumps" is tagged as verb (VBZ)
• "over" is tagged as preposition (IN)
• "the" is tagged as determiner (DT)
• "lazy" is tagged as adjective (JJ)
• "dog" is tagged as noun (NN)

By offering insights into the grammatical structure,
this tagging helps machines understand not
just individual words but also the connections
between them inside a phrase. For many NLP
applications, such as text summarization and
sentiment analysis, this kind of data is essential.
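As a small illustration of how downstream code might use this data, the sketch below groups the tagged words from the example sentence by tag. The helper name group_by_tag is hypothetical, and the tagged list is typed in by hand from the example rather than produced by a tagger.

```python
from collections import defaultdict

# Tagged tokens copied by hand from the example sentence (Penn Treebank tags).
tagged = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
    ("dog", "NN"),
]

def group_by_tag(pairs):
    """Collect words under their tag, preserving sentence order."""
    groups = defaultdict(list)
    for word, tag in pairs:
        groups[tag].append(word)
    return dict(groups)

groups = group_by_tag(tagged)
nouns = groups["NN"]        # words tagged as nouns
adjectives = groups["JJ"]   # words tagged as adjectives
print(nouns)
print(adjectives)
```

A summarizer or keyword extractor could, for instance, keep only the noun and adjective groups and ignore the rest.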
Workflow of POS Tagging in NLP
• Tokenization: The input text is divided into
individual tokens, representing words or
subwords. Tokenization is the foundational step in
most NLP tasks and enables further analysis at
the word level.
• Loading a Language Model: Tools
like NLTK or spaCy require a pre-trained
language model to perform POS tagging. These
models are trained on large datasets and capture
the grammatical rules and structure
of the language.
• Text Preprocessing: The text is then cleaned to
improve accuracy. Common preprocessing steps
include converting text to lowercase, removing
special characters and eliminating irrelevant
content. Lowercasing should be applied with care,
since capitalization helps a tagger recognize
proper nouns.
• Linguistic Analysis: This stage involves parsing
the sentence to understand the grammatical role
of each token. It lays the groundwork for
assigning the appropriate part of speech by
interpreting the sentence’s syntactic structure.
• POS Tagging: Each token is then assigned a
specific part-of-speech label. This is based on its
role in the sentence and contextual clues
provided by surrounding words.
• Result Evaluation: Finally, the POS-tagged
output is reviewed to ensure accuracy. Any
misclassifications or anomalies are identified and
corrected as needed.
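The result evaluation step can be sketched as a token-level comparison against hand-checked reference tags. This is a minimal sketch: the function names tagging_accuracy and mismatches are hypothetical, and the sample tags are made up for illustration, with one deliberate error.

```python
def tagging_accuracy(predicted, gold):
    """Return the fraction of tokens whose predicted tag matches the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

def mismatches(tokens, predicted, gold):
    """List (token, predicted_tag, gold_tag) for every disagreement."""
    return [(t, p, g) for t, p, g in zip(tokens, predicted, gold) if p != g]

# Hypothetical example data: the last predicted tag is deliberately wrong.
tokens = ["The", "cat", "chased", "the", "mouse"]
gold = ["DET", "N", "V", "DET", "N"]
predicted = ["DET", "N", "V", "DET", "V"]

accuracy = tagging_accuracy(predicted, gold)
errors = mismatches(tokens, predicted, gold)
print(f"accuracy: {accuracy:.2f}")
print("errors:", errors)
```

Listing the mismatches alongside the score shows which tokens need correcting, which a single accuracy number does not.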
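Implementation of Parts-of-Speech tagging using NLTK
1. Importing packages and downloading resources

import nltk
from nltk.tokenize import word_tokenize
from nltk import pos_tag

# Tokenizer and tagger data used by word_tokenize and pos_tag
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
# Newer NLTK releases (3.9 and later) look for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead; if a LookupError names
# them, download those resources as well.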
2. Implementation
• The sentence is stored in the variable text.
• The text is tokenized into words using
word_tokenize(text) before applying POS tagging.
• pos_tag(words) assigns grammatical tags (e.g.,
noun, verb) to each word.
• The original sentence is printed for reference.
• A loop prints each word alongside its predicted
part-of-speech tag.

# Sample text
text = "NLTK is a powerful library for natural language processing."

# Tokenize the text into words
words = word_tokenize(text)

# Assign a part-of-speech tag to each token
pos_tags = pos_tag(words)

print("Original Text:")
print(text)

print("\nPoS Tagging Result:")
# Use `tag` as the loop variable so the imported pos_tag function is not shadowed
for word, tag in pos_tags:
    print(f"{word}: {tag}")
Output:
POS using NLTK
Implementation of Parts-of-Speech tagging using spaCy
Installing Packages

!pip install spacy

!python -m spacy download en_core_web_sm
Implementation
• Imports the spaCy library.
• Loads the pre-trained English language model
en_core_web_sm.
• Defines a sample sentence in the variable text.
• Processes the text using nlp(text), which returns
a Doc object containing linguistic annotations.
• Prints the original sentence for reference.
• Iterates through each token in the doc and prints
the word along with its part-of-speech (POS) tag
using token.text and token.pos_.

# Import the library
import spacy

# Load the English language model
nlp = spacy.load("en_core_web_sm")

# Sample text
text = "SpaCy is a popular natural language processing library."

# Process the text with spaCy; the result is a Doc of annotated tokens
doc = nlp(text)

print("Original Text: ", text)

print("PoS Tagging Result:")
# token.pos_ holds the coarse-grained part-of-speech label
for token in doc:
    print(f"{token.text}: {token.pos_}")
Output:
POS using Spacy
Types of POS Tagging in NLP
Assigning grammatical categories to words in a text
is known as Part-of-Speech (PoS) tagging and it is an
essential aspect of Natural Language Processing
(NLP). Different PoS tagging approaches exist, each
with a unique methodology. Here are a few typical
kinds:
1. Rule-Based Tagging
Rule-based POS tagging assigns grammatical tags
to words using a predefined set of rules, as opposed
to machine learning-based methods that require
training on annotated corpora. These rules are
crafted based on morphological features (like word
endings) and syntactic context, making the
approach highly interpretable and transparent.
Example
A rule might specify that words ending in “-tion” or
“-ment” should be tagged as nouns, based on
common suffix patterns found in English.
• Rule: Assign the POS tag "Noun" to words ending
in -tion or -ment (including their plural forms).
• Text: "The presentation highlighted the key
achievements of the project's development."
Tagged Output:
• "The" : Determiner (DET)
• "presentation" : Noun (N)
• "highlighted" : Verb (V)
• "the" : Determiner (DET)
• "key" : Adjective (ADJ)
• "achievements" : Noun (N)
• "of" : Preposition (PREP)
• "the" : Determiner (DET)
• "project's" : Noun (N)
• "development" : Noun (N)
In this case, the rule-based tagger correctly
identifies "presentation," "achievements," and
"development" as nouns by applying the suffix-based
rule; the plural "achievements" matches only if the
rule allows a final "-s". The remaining tags would
come from further rules, such as a list of known
determiners. While simple, this example illustrates
how rule-based systems encode linguistic patterns
as structured, interpretable logic.
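A suffix rule of this kind can be sketched with a few regular expressions. Only the "-tion"/"-ment" rule comes from the text; the word lists for determiners and prepositions, the possessive rule and the "-ed" rule are illustrative assumptions, and the name rule_based_tag is hypothetical.

```python
import re

# Hand-written rules, checked in order; the first match wins.
RULES = [
    (re.compile(r"^(the|a|an)$", re.IGNORECASE), "DET"),     # assumed word list
    (re.compile(r"^(of|in|on|over)$", re.IGNORECASE), "PREP"),  # assumed word list
    (re.compile(r".*(tion|ment)s?$", re.IGNORECASE), "N"),   # suffix rule from the text
    (re.compile(r".*'s$"), "N"),                             # assumed: possessive form
    (re.compile(r".*ed$"), "V"),                             # assumed: crude past-tense guess
]

def rule_based_tag(tokens, default="UNK"):
    """Tag each token with the first matching rule, or `default` if none match."""
    tagged = []
    for token in tokens:
        for pattern, tag in RULES:
            if pattern.match(token):
                tagged.append((token, tag))
                break
        else:
            tagged.append((token, default))
    return tagged

sentence = "The presentation highlighted the key achievements of the project's development"
result = rule_based_tag(sentence.split())
for word, tag in result:
    print(f"{word}: {tag}")
```

With these rules "key" falls through to the default tag, which shows the main cost of the approach: every pattern the tagger should handle needs its own hand-written rule or lexicon entry.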
2. Transformation-Based Tagging
Transformation-Based Tagging (TBT), also known as
Brill tagging, is a method for refining POS tags
through a series of context-based transformations.
Unlike statistical taggers that rely on
probabilities or rule-based taggers that apply
static rules, TBT starts with initial tags and improves
them iteratively by applying transformation rules.
Example
A rule might state: “Change a word’s tag from
Verb to Noun if it follows a determiner like
‘the’.”
• Text: "The cat chased the mouse."
• Initial Tags: "The" – DET, "cat" – N, "chased" – V,
"the" – DET, "mouse" – V (assume the initial
tagger mis-tagged "mouse")
• Transformation Rule Applied: Change
“mouse” from Verb to Noun because it follows
“the”.
• Updated Tags: "mouse" becomes Noun. "chased"
stays Verb because it follows "cat", not a
determiner.
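Applying a single transformation rule can be sketched as one pass over the tagged sentence. The function name apply_transformation is hypothetical, and the initial tags are made up for illustration: "mouse" is deliberately mis-tagged as a verb so that the rule has something to correct. A full transformation-based tagger would also learn which rules to apply from an annotated corpus, which this sketch does not attempt.

```python
def apply_transformation(tagged, from_tag, to_tag, prev_tag):
    """Change from_tag to to_tag wherever the preceding token has prev_tag.

    Context is read from the input list, so all changes are decided
    against the tags as they were before the rule was applied.
    """
    result = list(tagged)
    for i in range(1, len(tagged)):
        word, tag = tagged[i]
        if tag == from_tag and tagged[i - 1][1] == prev_tag:
            result[i] = (word, to_tag)
    return result

# Hypothetical initial tagging with one error ("mouse" tagged as V).
initial = [("The", "DET"), ("cat", "N"), ("chased", "V"),
           ("the", "DET"), ("mouse", "V")]

# Rule: change Verb to Noun when the previous tag is a determiner.
updated = apply_transformation(initial, from_tag="V", to_tag="N", prev_tag="DET")
print(updated)
```

The rule leaves "chased" untouched because the word before it is tagged N, while "mouse" is corrected because it follows a determiner.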
3. Statistical POS Tagging
Statistical POS tagging is a computational linguistics
approach that uses probabilistic models to assign
grammatical categories (e.g., noun, verb, adjective)
to words in a text. Unlike rule-based methods, which
rely on handcrafted rules, statistical tagging learns
patterns from large annotated corpora using
machine learning techniques.
These models estimate the probability of a tag given
a word and its context, enabling them to resolve
linguistic ambiguities and adapt to complex
grammatical structures. Popular models include:
• Hidden Markov Models (HMMs)
• Conditional Random Fields (CRFs)
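To make the HMM idea concrete, here is a minimal sketch of Viterbi decoding over a three-tag toy model. All probabilities are invented by hand for illustration; a real tagger would estimate them from counts in an annotated corpus. The names TAGS, START, TRANS, EMIT, UNSEEN and viterbi are hypothetical.

```python
import math

TAGS = ["DET", "N", "V"]

# Toy probabilities chosen by hand for illustration only.
START = {"DET": 0.8, "N": 0.1, "V": 0.1}          # P(first tag)
TRANS = {                                          # P(next tag | current tag)
    "DET": {"DET": 0.05, "N": 0.9, "V": 0.05},
    "N":   {"DET": 0.1, "N": 0.2, "V": 0.7},
    "V":   {"DET": 0.6, "N": 0.3, "V": 0.1},
}
EMIT = {                                           # P(word | tag)
    "DET": {"the": 0.9},
    "N":   {"cat": 0.4, "mouse": 0.4, "chased": 0.01},
    "V":   {"chased": 0.8, "mouse": 0.05},
}
UNSEEN = 1e-6  # small floor so unseen words do not zero out a path

def viterbi(words):
    """Return the most probable tag sequence for a non-empty word list."""
    def emit(tag, word):
        return math.log(EMIT[tag].get(word, UNSEEN))

    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{t: (math.log(START[t]) + emit(t, words[0]), None) for t in TAGS}]
    for word in words[1:]:
        column = {}
        for t in TAGS:
            score, prev = max(
                (best[-1][p][0] + math.log(TRANS[p][t]), p) for p in TAGS
            )
            column[t] = (score + emit(t, word), prev)
        best.append(column)

    # Trace back from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))

tags = viterbi(["the", "cat", "chased", "the", "mouse"])
print(tags)
```

In this toy model "mouse" can be emitted by either N or V; the transition from a determiner makes the noun reading far more probable, which is how context resolves the ambiguity.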

Advantages of POS Tagging

Advantage                        Description
Text Simplification              Helps deconstruct complex sentences for easier understanding.
Improved Information Retrieval   Enables more accurate indexing and searching based on grammatical categories.
Named Entity Recognition (NER)   Serves as a precursor for identifying names, places and organizations.
Syntactic Parsing                Assists in analyzing sentence structure and word relationships.

Disadvantages of POS Tagging

Disadvantage                     Description
Ambiguity                        Words may have multiple meanings depending on context.
Idiomatic Expressions            Informal or non-standard phrases are hard to tag correctly.
Out-of-Vocabulary Words          Unseen words can lead to incorrect tagging.
Domain Dependence                Models may not generalize well outside their training domain.
