NLP Lab Manual

The document outlines three Python programs that utilize Natural Language Processing (NLP) techniques. The first program performs basic word analysis using tokenization, POS tagging, stemming, and lemmatization with the NLTK library. The second program generates a random word from a given corpus, while the third demonstrates morphological analysis through stemming and lemmatization.

Uploaded by

subhashree6124

1. Program to Perform Basic Word Analysis Using Natural Language Processing (NLP)

Aim

To perform basic word analysis (tokenization, POS tagging, stemming, and lemmatization) in Python
with the NLTK library.

Algorithm

Step 1: Start the program.
Step 2: Import the required modules from the NLTK library:
word_tokenize, pos_tag, PorterStemmer, WordNetLemmatizer.
Step 3: Download the required NLTK datasets:
- punkt (for tokenization)
- averaged_perceptron_tagger (for POS tagging)
- wordnet (for lemmatization)
Step 4: Define a sample text for analysis.
Step 5: Tokenize the text into words.
Step 6: Perform POS tagging on the tokens.
Step 7: Apply stemming to each token.
Step 8: Apply lemmatization to each token, treating each token as a verb (pos='v').
Step 9: Display tokens, POS tags, stems, and lemmas.
Step 10: End the program.

PROGRAM:

import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from nltk.stem import PorterStemmer, WordNetLemmatizer

# Download the NLTK data used below.
# Recent NLTK releases may additionally require 'punkt_tab' and
# 'averaged_perceptron_tagger_eng'.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')

# Sample text for word analysis
text = "The quick brown fox jumps over the lazy dog."

# Tokenization: split the text into word and punctuation tokens
tokens = word_tokenize(text)
print("Tokens:", tokens)

# POS tagging: pair each token with a Penn Treebank tag
pos_tags = pos_tag(tokens)
print("\nPOS Tags:", pos_tags)

# Stemming: strip suffixes by rule (the Porter stemmer also lowercases)
stemmer = PorterStemmer()
stems = [stemmer.stem(word) for word in tokens]
print("\nStems:", stems)

# Lemmatization: pos='v' treats every token as a verb; tokens for which
# WordNet finds no verb lemma are returned unchanged
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(word, pos='v') for word in tokens]
print("\nLemmas:", lemmas)

OUTPUT:

Tokens: ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog', '.']

POS Tags: [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'VBZ'),
('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN'), ('.', '.')]

Stems: ['the', 'quick', 'brown', 'fox', 'jump', 'over', 'the', 'lazi', 'dog', '.']

Lemmas: ['The', 'quick', 'brown', 'fox', 'jump', 'over', 'the', 'lazy', 'dog', '.']

Result:

Thus, the Python program to perform basic word analysis using NLP techniques was executed
and verified successfully.
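
The program above lemmatizes every token as a verb (pos='v'). One possible refinement is to derive the lemmatizer's POS argument from the tag that pos_tag already assigned. The sketch below is only an illustration: penn_to_wordnet and PENN_PREFIX_TO_WORDNET are hypothetical names, not part of NLTK, and the mapping assumes that the first letter of a Penn Treebank tag is enough to choose among the WordNet codes 'a', 'n', 'v', and 'r'. It runs without NLTK because the tagged tokens are copied from the output above.

```python
# Penn Treebank tag prefixes mapped to the single-letter POS codes that
# WordNetLemmatizer.lemmatize accepts:
# 'a' adjective, 'n' noun, 'v' verb, 'r' adverb.
PENN_PREFIX_TO_WORDNET = {"J": "a", "N": "n", "V": "v", "R": "r"}


def penn_to_wordnet(tag, default="n"):
    """Return a WordNet POS code for a Penn Treebank tag (hypothetical helper).

    Tags with no mapping (determiners, prepositions, punctuation) fall back
    to `default`; 'n' mirrors the lemmatizer's own default POS.
    """
    return PENN_PREFIX_TO_WORDNET.get(tag[:1], default)


# Tagged tokens copied from the POS-tagging output above.
pos_tags = [("The", "DT"), ("quick", "JJ"), ("brown", "NN"), ("fox", "NN"),
            ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
            ("dog", "NN"), (".", ".")]

# Pair every token with the POS code that would be passed to the lemmatizer.
wordnet_tags = [(word, penn_to_wordnet(tag)) for word, tag in pos_tags]
print(wordnet_tags)
```

Each pair could then be passed on as lemmatizer.lemmatize(word, pos=code) in place of the fixed pos='v'.
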

2. Program to Generate a Random Word Using a Given Corpus

Aim

To write a Python program that generates a random word of a given length using a predefined set
of characters (corpus).

Algorithm

Step 1: Start the program.
Step 2: Import the random module.
Step 3: Define a sample corpus containing the lowercase English letters.
Step 4: Define a function generate_word(length) to create a random word:
4.1: Use a loop to select random characters from the corpus.
4.2: Join them into a single string.
4.3: Return the generated word.
Step 5: Call the function with a specified length (e.g., 6).
Step 6: Display the generated word.
Step 7: End the program.

Program

import random

# Sample corpus of characters
corpus = "abcdefghijklmnopqrstuvwxyz"

# Function to generate a new word
def generate_word(length):
    # Pick `length` characters at random (with replacement) and join them
    word = "".join(random.choice(corpus) for _ in range(length))
    return word

# Generate a word of length 6
new_word = generate_word(6)
print("Generated Word:", new_word)

Output

Generated Word: tnwaey

Result

Thus, the Python program to generate a random word from a given corpus was executed and
verified successfully.
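
Because the characters are drawn at random, the output of the program above changes on every run, so the sample word shown will not normally reappear. The following sketch shows one way to make runs repeatable while keeping the same idea. The corpus and rng parameters and the input checks are additions for illustration and are not part of the original program.

```python
import random
import string


def generate_word(length, corpus=string.ascii_lowercase, rng=None):
    """Return `length` characters drawn at random, with replacement, from `corpus`."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if not corpus:
        raise ValueError("corpus must not be empty")
    # Fall back to the module-level generator when no seeded one is supplied.
    rng = rng or random
    return "".join(rng.choice(corpus) for _ in range(length))


# Two generators created with the same seed produce the same word.
first = generate_word(6, rng=random.Random(42))
second = generate_word(6, rng=random.Random(42))
print("Generated Word:", first)
print("Repeatable:", first == second)
```

Passing a different corpus string, for example only vowels, restricts the characters the word can contain.
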

3. Program to Perform Morphological Analysis Using Stemming and Lemmatization

AIM

This program aims to demonstrate morphological analysis in Natural Language Processing (NLP).
Specifically, the program performs stemming and lemmatization, two common techniques in
morphology, to analyse the structure of words and reduce them to their base forms.

ALGORITHM

Step 1: Start the program.
Step 2: Import PorterStemmer and WordNetLemmatizer from the nltk.stem module.
Step 3: Download the WordNet dataset for lemmatization.
Step 4: Create a list of words for morphological analysis.
Step 5: Initialise the stemmer and lemmatizer objects.
Step 6: For each word in the list:
6.1: Find the stem using stemmer.stem(word).
6.2: Find the lemma using lemmatizer.lemmatize(word, pos='v').
6.3: Display the original word, stem, and lemma in a tabular format.
Step 7: End the program.

PROGRAM

import nltk
from nltk.stem import PorterStemmer
from nltk.stem import WordNetLemmatizer

# Download necessary NLTK data
nltk.download('wordnet')

# Sample list of words for morphological analysis
words = ["running", "jumps", "easily", "fairly", "happier"]

# Initialize the stemmer and lemmatizer
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Perform stemming and lemmatization, printing one table row per word
print(f"{'Word':<10} {'Stem':<10} {'Lemma':<10}")
for word in words:
    stem = stemmer.stem(word)
    lemma = lemmatizer.lemmatize(word, pos='v')  # 'v' treats the word as a verb
    print(f"{word:<10} {stem:<10} {lemma:<10}")

OUTPUT

Word       Stem       Lemma
running    run        run
jumps      jump       jump
easily     easili     easily
fairly     fairli     fairly
happier    happier    happier

RESULT

Thus, the Python program to perform morphological analysis using stemming and lemmatization
was executed and verified successfully.
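
The output above shows the practical difference between the two techniques: stems such as "easili" need not be dictionary words, while a lemma is returned only when the lookup succeeds for the requested part of speech. The toy sketch below illustrates that contrast in plain Python. It is neither the Porter algorithm nor WordNet: SUFFIX_RULES, LEMMA_TABLE, toy_stem, and toy_lemma are hypothetical names, and the rules and table entries are hand-picked assumptions chosen only to make the idea visible.

```python
# Toy illustration only: a few ordered suffix rules stand in for a stemmer,
# and a tiny hand-written table stands in for a lemma dictionary.
SUFFIX_RULES = [("ies", "i"), ("ing", ""), ("s", ""), ("y", "i")]
LEMMA_TABLE = {("jumps", "v"): "jump", ("happier", "a"): "happy"}


def toy_stem(word):
    """Apply the first matching suffix rule, without checking any dictionary."""
    for suffix, replacement in SUFFIX_RULES:
        # Require at least two characters to remain before the suffix.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[:-len(suffix)] + replacement
    return word


def toy_lemma(word, pos):
    """Look up (word, pos) in the table; unknown pairs are returned unchanged."""
    return LEMMA_TABLE.get((word, pos), word)


# Each sample word is paired with an assumed part of speech.
samples = [("jumps", "v"), ("easily", "r"), ("happier", "a")]
print(f"{'Word':<10} {'Stem':<10} {'Lemma':<10}")
for word, pos in samples:
    print(f"{word:<10} {toy_stem(word):<10} {toy_lemma(word, pos):<10}")
```

The rule-based function rewrites "easily" to a non-word and leaves "happier" alone because no rule matches, whereas the lookup changes only the entries it knows for the given part of speech.
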
