NLP Program 1

The document outlines two experiments in Natural Language Processing (NLP). The first experiment performs word analysis with NLTK: tokenization, part-of-speech tagging, stemming, and lemmatization. The second experiment generates new words character by character, starting from a uniform random baseline. Sample outputs are provided for both experiments.


EXP NO: 1 WORD ANALYSIS

AIM:

The aim of this program is to perform basic word analysis using Natural Language Processing (NLP) techniques: tokenization, part-of-speech (POS) tagging, stemming, and lemmatization with NLTK.

PROGRAM:

import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from nltk.stem import PorterStemmer, WordNetLemmatizer

# Download the NLTK data needed for tokenization, tagging, and lemmatization.
# On newer NLTK releases these resources may also be named 'punkt_tab' and
# 'averaged_perceptron_tagger_eng'.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')

# Sample text for word analysis
text = "The quick brown fox jumps over the lazy dog."

# Step 1: Tokenization - splitting the text into individual words
tokens = word_tokenize(text)
print("Tokens:", tokens)

# Step 2: Part-of-Speech (POS) tagging - assigning a POS tag to each token
pos_tags = pos_tag(tokens)
print("\nPOS Tags:", pos_tags)

# Step 3: Stemming - reducing words to their root form
stemmer = PorterStemmer()
stems = [stemmer.stem(word) for word in tokens]
print("\nStems:", stems)

# Step 4: Lemmatization - reducing words to their base (dictionary) form.
# pos='v' treats every token as a verb: a simplification that correctly
# lemmatizes 'jumps' but leaves non-verbs unchanged.
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(word, pos='v') for word in tokens]
print("\nLemmas:", lemmas)

OUTPUT:

Tokens: ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog', '.']

POS Tags: [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'VBZ'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN'), ('.', '.')]

Stems: ['the', 'quick', 'brown', 'fox', 'jump', 'over', 'the', 'lazi', 'dog', '.']

Lemmas: ['The', 'quick', 'brown', 'fox', 'jump', 'over', 'the', 'lazy', 'dog', '.']

RESULTS:

Running the program on the sample sentence "The quick brown fox jumps over the lazy dog." produces the output shown above. Stemming crudely strips suffixes ('jumps' becomes 'jump', but 'lazy' becomes the non-word 'lazi'), while lemmatization with pos='v' returns valid dictionary forms, changing only the verb 'jumps' to 'jump'.
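As a possible extension (not part of the original experiment), the lemmatizer can reuse the POS tags from Step 2 instead of assuming every token is a verb. The sketch below assumes the lemmatizer and pos_tags variables from the program above; the pos_to_wordnet helper is a hypothetical name for a mapping from Penn Treebank tags to WordNet POS constants.

from nltk.corpus import wordnet

# Hypothetical helper: map a Penn Treebank tag to a WordNet POS constant.
def pos_to_wordnet(tag):
    if tag.startswith('J'):
        return wordnet.ADJ
    if tag.startswith('V'):
        return wordnet.VERB
    if tag.startswith('R'):
        return wordnet.ADV
    return wordnet.NOUN  # default to noun for everything else

# Lemmatize each token with its own tag instead of a blanket pos='v'.
lemmas_tagged = [lemmatizer.lemmatize(word, pos_to_wordnet(tag))
                 for word, tag in pos_tags]
print("POS-aware Lemmas:", lemmas_tagged)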

EXP NO: 2 WORD GENERATION

AIM:

The aim of this program is to generate new words or text sequences from a given character set, starting with a simple uniform-random baseline that can be extended into a character-level n-gram model.

PROGRAM:

import random

# Character inventory to sample from (the lowercase English alphabet)
corpus = "abcdefghijklmnopqrstuvwxyz"

# Generate a word by sampling characters uniformly at random
def generate_word(length):
    return "".join(random.choice(corpus) for _ in range(length))

# Generate a word of length 6
new_word = generate_word(6)
print("Generated Word:", new_word)

OUTPUT:

Generated Word: tnwaey

RESULTS:

This program demonstrates the mechanics of assembling new words character by character. Because every character is drawn uniformly at random, it does not yet predict which characters tend to follow which; the bigram sketch below adds that prediction step.
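To match the stated aim, here is a minimal sketch of a character-level bigram (2-gram) model. The training_words list and the generate_ngram_word function are illustrative additions, not part of the original program; any word list would work as training data.

import random
from collections import defaultdict

# Illustrative training data (hypothetical; any word list works).
training_words = ["banana", "bandana", "cabana", "canal", "band"]

# Count which characters follow which in the training words.
# '^' marks the start of a word and '$' marks the end.
transitions = defaultdict(list)
for w in training_words:
    chars = "^" + w + "$"
    for a, b in zip(chars, chars[1:]):
        transitions[a].append(b)

# Generate a word by repeatedly sampling the next character
# from those observed after the current one.
def generate_ngram_word(max_length=10):
    word, current = "", "^"
    for _ in range(max_length):
        nxt = random.choice(transitions[current])
        if nxt == "$":
            break
        word += nxt
        current = nxt
    return word

print("Generated Word:", generate_ngram_word())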
