
NATURAL LANGUAGE PROCESSING - NOTES

NLP:-
Natural Language Processing, or NLP, is the sub-field of AI that is focused on
enabling computers to understand and process human languages. NLP draws on
Linguistics, Computer Science, Information Engineering and Artificial Intelligence.
In NLP, we can break down the process of understanding English for a model into
a number of small pieces.
A usual interaction between machines and humans using Natural Language
Processing could go as follows:
 Humans talk to the computer
 The computer captures the audio
 There is an audio to text conversion
 Text data is processed
 Data is converted to audio
 The computer plays the audio file and responds to humans

Applications of Natural Language Processing


1. Chatbots
Chatbots are a form of artificial intelligence programmed to interact with
humans in such a way that they sound like humans themselves. Chatbots are
created using Natural Language Processing and Machine Learning, which means
that they understand the complexities of the English language, find the
actual meaning of a sentence, and also learn from their conversations
with humans so that they become better with time.
Chatbots work in two simple steps. First, they identify the meaning of the
question asked and collect from the user all the data that may be required to
answer it. Then they answer the question appropriately.
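As a rough illustration of these two steps, here is a minimal sketch of a script bot in Python (the keywords, replies and function name are invented for this example, not taken from any real chatbot):

# Toy script bot: match a keyword in the question (step 1),
# then return the matching canned answer (step 2).
responses = {
    "hello": "Hello! How can I help you?",
    "ticket": "You can book a ticket from the bookings page.",
    "refund": "Refunds are processed within 5-7 working days.",
}

def reply(question):
    question = question.lower()
    for keyword, answer in responses.items():
        if keyword in question:
            return answer
    return "Sorry, I did not understand that."

print(reply("How do I get a refund?"))   # -> Refunds are processed within 5-7 working days.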
2. Autocomplete in Search Engines
Search engines use NLP to predict and complete a query as the user types it, based on commonly searched phrases.
3. Voice Assistants
These days voice assistants are all the rage! Whether it's Siri, Alexa, or Google
Assistant, almost everyone uses one of these to make calls, place reminders,
schedule meetings, set alarms, surf the internet, etc. These voice assistants have
made life much easier.
They use a complex combination of speech recognition, natural language
understanding, and natural language processing to understand what humans are
saying and then act on it.

4. Language Translator
Want to translate a text from English to Hindi but don't know Hindi? Well,
Google Translate is the tool for you! While it is not exactly 100% accurate, it is
still a great tool to convert text from one language to another.

Google Translate and other translation tools use sequence-to-sequence
modelling, a technique in Natural Language Processing. It allows the
algorithm to convert a sequence of words from one language to another,
which is translation.

5. Sentiment Analysis
This application of NLP is very significant, as it helps business organisations gain
insights into consumers, compare themselves with competitors, and make the
necessary adjustments while developing their business strategy.
Almost all the world is on social media these days! Companies can use
sentiment analysis to understand how a particular type of user feels about a
particular topic, product, etc. They can use natural language processing,
computational linguistics, text analysis, etc. to understand the general sentiment
of users towards their products and services and find out whether the sentiment is
good, bad, or neutral.
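One possible way to try this in Python is with NLTK's built-in VADER sentiment analyser; this is only a sketch and assumes NLTK is installed and the vader_lexicon resource can be downloaded:

import nltk
nltk.download("vader_lexicon")                    # one-time download of the VADER lexicon
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
review = "The product is great but the delivery was very slow."
scores = sia.polarity_scores(review)              # gives neg, neu, pos and compound scores
print(scores)
# A compound score near +1 suggests good sentiment, near -1 bad, near 0 neutral.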

6. Grammar Checkers
Grammar and spelling are very important while writing professional
reports for your superiors and even assignments for your lecturers. After all,
having major errors may get you fired or failed! That is why grammar and spell
checkers are a very important tool for any professional writer.
7. Text Classification
Text classification is defined as classifying unstructured text into groups or
categories. For example, the spam folder in our Gmail accounts; articles can be
organised by topic, chat conversations can be organised by language, and there
are many more uses of text classification.
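This is not how Gmail's spam filter actually works, but as a toy illustration of text classification, a simple keyword-based spam check could look like this:

# Toy text classifier: label a message "spam" if it contains
# words commonly found in spam mails (illustrative only).
spam_words = {"lottery", "winner", "free", "prize", "urgent"}

def classify(message):
    words = set(message.lower().split())
    return "spam" if words & spam_words else "not spam"

print(classify("Congratulations! You are the lucky winner of a free prize"))   # spam
print(classify("The class test is scheduled for Monday"))                      # not spam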


INTRODUCTION TO CHATBOTS
A chatbot is a computer program designed to simulate conversation with human
users, especially over the internet, powered by Artificial Intelligence.
Eg:- 1) Mitsuku Bot
2) AskDISHA 2.0 (Digital Interaction To Seek Help Anytime) is an Artificial
Intelligence and Machine Learning based chatbot that answers queries
pertaining to various services offered by IRCTC and even helps users perform
various transactions, such as end-to-end ticket booking, and more.
Types of Chatbots
1. Simple Chatbot (Script bots)
2. Smart Chatbots (AI based Smart bots)

QUESTIONS

TEXT NORMALIZATION:-
Text Normalisation is a process to reduce the variations in a text's word
forms to a common form when the variations mean the same thing. Text
normalisation simplifies the text for further processing.

In Text Normalisation, we undergo several steps to normalise the text to a
lower level. The textual data from multiple documents taken altogether is known
as a corpus.

Let us take a look at the steps of Text Normalisation:

Sentence Segmentation
Under sentence segmentation, the whole text is divided into individual
sentences. For example, a paragraph containing three sentences becomes a list
of three separate sentences.
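A small sketch of sentence segmentation using NLTK (assumes NLTK is installed and its punkt sentence tokenizer data has been downloaded; newer NLTK versions may name this resource slightly differently):

import nltk
nltk.download("punkt")                   # one-time download of the sentence tokenizer data

text = "AI is fun. NLP is a part of AI. We will learn it in Class 10."
sentences = nltk.sent_tokenize(text)     # split the whole text into individual sentences
print(sentences)
# ['AI is fun.', 'NLP is a part of AI.', 'We will learn it in Class 10.']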

Tokenisation
After segmenting the sentences, each sentence is then further divided
into tokens. Token is a term used for any word, number or special
character occurring in a sentence. Under tokenisation, every word, number
and special character is considered separately, and each of them becomes a
separate token.
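Continuing the same sketch, tokenisation with NLTK splits each sentence into words, numbers and special characters:

import nltk
nltk.download("punkt")

sentence = "She has 2 cats, 1 dog and a parrot!"
tokens = nltk.word_tokenize(sentence)    # every word, number and symbol becomes a token
print(tokens)
# ['She', 'has', '2', 'cats', ',', '1', 'dog', 'and', 'a', 'parrot', '!']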

Stemming
In this step, the remaining words are reduced to their root words. In other
words, stemming is the process in which the affixes of words are removed
and the words are converted to their base form.
Stemming does not take into account whether the stemmed word is meaningful
or not.
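A short sketch of stemming with NLTK's PorterStemmer; note that some of the stems it produces are not real words:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["studies", "playing", "caring", "healed"]:
    print(word, "->", stemmer.stem(word))
# studies -> studi   (not a meaningful word)
# playing -> play
# caring  -> care
# healed  -> heal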

Lemmatization
Stemming and lemmatisation are alternative processes to each other, as both
play the same role: removal of affixes. The difference between them is that in
lemmatisation, the word we get after affix removal (also known as the lemma)
is a meaningful one.
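A matching sketch of lemmatisation with NLTK's WordNetLemmatizer (assumes the WordNet data has been downloaded); unlike stemming, the lemma it returns is a meaningful word:

import nltk
nltk.download("wordnet")                          # one-time download of the WordNet data
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("studies"))            # study  (a real word, unlike the stem 'studi')
print(lemmatizer.lemmatize("caring", pos="v"))    # care   (pos="v" treats the word as a verb)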

TECHNIQUES OF NLP:-
There are many techniques used in NLP for extracting information but the three
given below are most commonly used:
1. Bag Of Words
2. Term Frequency and Inverse Document Frequency (TFIDF)
3. NLTK
1. Bag of Words
After the process of text normalisation, the corpus is converted into a normalised
corpus, which is just a collection of meaningful words with no sequence.
Bag of Words is a Natural Language Processing model which helps in extracting
features out of the text, which can be helpful in machine learning algorithms. In
bag of words, we get the occurrences of each word and construct the
vocabulary for the corpus.

Thus, we can say that the bag of words gives us two things:
• A vocabulary of words for the corpus
• The frequency of these words (the number of times each has occurred in the
whole corpus)
Here is the step-by-step approach to implement the bag of words algorithm:
1. Text Normalisation: Collect the data and pre-process it.
2. Create Dictionary: Make a list of all the unique words occurring in the
corpus (the vocabulary).
3. Create Document Vectors: For each document in the corpus, find out how
many times each word from the unique list of words has occurred.
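A minimal Python sketch of these three steps on a tiny made-up corpus (text normalisation is reduced here to simply lowercasing and splitting on spaces):

# Step 1: a tiny, already-normalised corpus of three documents
documents = [
    "aman and anil are stressed",
    "aman went to a therapist",
    "anil went to download a health chatbot",
]

# Step 2: create the dictionary (vocabulary) of unique words in the corpus
vocabulary = sorted({word for doc in documents for word in doc.split()})
print(vocabulary)

# Step 3: create a document vector for each document
for doc in documents:
    words = doc.split()
    vector = [words.count(term) for term in vocabulary]
    print(vector)                        # frequency of each vocabulary word in this document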

2. Term Frequency and Inverse Document Frequency (TFIDF)

TFIDF stands for Term Frequency and Inverse Document Frequency. TFIDF
helps us in identifying the value of each word.
TFIDF was introduced as a statistical measure of the important words in a
document.
Term Frequency:- Term Frequency is the frequency of a word in one
document. Term frequency can easily be found from the document vector
table.
Document Frequency:- Document Frequency is the number of documents in
which the word occurs, irrespective of how many times it has occurred in
those documents.
Inverse Document Frequency:- Inverse Document Frequency is obtained
when the document frequency is in the denominator and the total number of
documents is in the numerator.
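As a sketch, one common way to put these together is TFIDF(W) = TF(W) * log( total number of documents / DF(W) ); the exact formula (for example, the base of the logarithm) varies between textbooks:

import math

# The same three normalised documents used in the bag of words sketch above
documents = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]
N = len(documents)                                # total number of documents

def tfidf(word, doc):
    tf = doc.count(word)                          # term frequency in this one document
    df = sum(1 for d in documents if word in d)   # document frequency across the corpus
    return tf * math.log10(N / df)                # N / df is the inverse document frequency

print(round(tfidf("aman", documents[0]), 3))      # common word (2 of 3 docs) -> lower value
print(round(tfidf("stressed", documents[0]), 3))  # rare word (1 of 3 docs)  -> higher value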
3. NLTK
NLTK is a leading platform for building Python programs to work with
human language data.

NLTK has been called "a wonderful tool for teaching, and working in,
computational linguistics using Python," and "an amazing library to play
with natural language."

NLTK is suitable for linguists, engineers, students, educators, researchers,
and industry users alike. NLTK is available for Windows, Mac OS X, and
Linux. Best of all, NLTK is a free, open source, community-driven project.

Installing NLTK:- Will be done in class
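For reference, a typical installation needs only pip, plus a one-time download of the data used in the sketches above (exact commands may differ on your system):

# In a terminal / command prompt:
#   pip install nltk
# Then, inside Python, download the datasets used earlier:
import nltk
nltk.download("punkt")            # tokenizer data
nltk.download("wordnet")          # lemmatizer data
nltk.download("vader_lexicon")    # sentiment analyser data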
