Natural Language Processing Notes Class 10
NLP-
Natural Language Processing, or NLP, is the sub-field of AI that is focused on
enabling computers to understand and process human languages. NLP draws on
Linguistics, Computer Science, Information Engineering and Artificial
Intelligence.
In NLP, we can break down the process of understanding English for a model into
a number of small pieces.
A usual interaction between machines and humans using Natural Language
Processing could go as follows:
Humans talk to the computer
The computer captures the audio
There is an audio to text conversion
Text data is processed
Data is converted to audio
The computer plays the audio file and responds to humans
4.Language Translator
Want to translate a text from English to Hindi but don’t know Hindi? Well,
Google Translate is the tool for you! While it’s not exactly 100% accurate, it is
still a great tool to convert text from one language to another.
Google Translate and other translation tools use sequence-to-sequence
modeling, which is a technique in Natural Language Processing. It allows
the algorithm to convert a sequence of words in one language into a sequence
of words in another, which is translation.
5.Sentiment Analysis
This application of NLP is very significant as it helps business organizations
gain insights into consumers, compare themselves with competitors and make
necessary adjustments to their business strategy.
Almost all the world is on social media these days! Companies can use
sentiment analysis to understand how a particular type of user feels about a
particular topic, product, etc. They can use natural language processing,
computational linguistics, text analysis, etc. to understand the general sentiment
of the users for their products and services and find out if the sentiment is good,
bad, or neutral.
6.Grammar Checkers
Grammar and spelling are very important factors while writing professional
reports for your superiors and even assignments for your lecturers. After all,
having major errors may get you fired or failed! That’s why grammar and spell
checkers are very important tools for any professional writer.
7.Text Classification
Text classification is defined as classifying unstructured text into groups or
categories. For example, the spam folder in our Google Mail accounts relies on
text classification. Articles can be organized by topic, chat conversations can
be organized by language, and there are many more uses of text classification.
INTRODUCTION TO CHATBOTS
A chatbot is a computer program, powered by Artificial Intelligence, that is
designed to simulate conversation with human users, especially over the internet.
Eg:-1) Mitsuku Bot
2)AskDISHA 2.0 (Digital Interaction To Seek Help Anytime) is an Artificial
Intelligence and Machine Learning based chatbot that answers queries
pertaining to various services offered by IRCTC and even helps users perform
various transactions like end-to-end ticket booking, and more.
Types of Chatbots
1. Simple Chatbot (Script bots)
2. Smart Chatbots (AI based Smart bots)
QUESTIONS
TEXT NORMALIZATION:-
Text Normalisation is the process of reducing the different forms of a word
in a text to a common form when those variations mean the same thing. Text
normalisation simplifies the text for further processing.
Tokenisation
After segmenting the text into sentences, each sentence is further divided
into tokens. Token is the term used for any word, number or special
character occurring in a sentence. Under tokenisation, every word, number
and special character is considered separately and each of them becomes a
separate token.
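A minimal sketch of tokenisation in Python, assuming a simple regular
expression is good enough to separate words, numbers and special characters.
The function name tokenise and the sample sentence are made up for
illustration; real tokenisers handle many more cases (contractions,
abbreviations, decimal numbers and so on).

```python
import re

def tokenise(sentence):
    # \w+ matches a run of letters or digits (a word or a number);
    # [^\w\s] matches one special character that is not whitespace.
    return re.findall(r"\w+|[^\w\s]", sentence)

# Hypothetical sample sentence
tokens = tokenise("Raj scored 95 marks, and he is happy!")
print(tokens)
# ['Raj', 'scored', '95', 'marks', ',', 'and', 'he', 'is', 'happy', '!']
```

Here the comma and the exclamation mark each become a token of their own,
just like the words and the number.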
Stemming
In this step, the remaining words are reduced to their root words. In other
words, stemming is the process in which the affixes of words are removed
and the words are converted to their base form.
Stemming does not take into account whether the stemmed word is meaningful
or not.
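A minimal sketch of stemming as affix removal, assuming a tiny hand-picked
suffix list. The names SUFFIXES and naive_stem are made up for illustration;
real stemmers, such as the Porter stemmer available in NLTK, apply more
elaborate rules.

```python
# Hypothetical suffix list, checked in this order
SUFFIXES = ("ing", "ies", "ed", "es", "s")

def naive_stem(word):
    word = word.lower()
    for suffix in SUFFIXES:
        # Strip the first matching suffix, but keep at least 3 letters
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["healed", "studies", "caring"]:
    print(w, "->", naive_stem(w))
# healed -> heal
# studies -> stud
# caring -> car
```

Note that "stud" and "car" are not the intended base words ("study" and
"care"): the stemmer only chops off affixes and does not check the result.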
Lemmatization
Stemming and lemmatization are alternatives to each other, as the role of
both processes is the same: removal of affixes. The difference between
them is that in lemmatization, the word we get after affix removal (also
known as the lemma) is a meaningful one.
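A minimal sketch of the idea behind lemmatization, assuming a small lookup
table that maps word forms to their lemmas. The table LEMMA_TABLE and the
function naive_lemmatise are made up for illustration; a real lemmatizer
relies on a full dictionary of the language rather than a handful of entries.

```python
# Hypothetical lookup table: word form -> lemma
LEMMA_TABLE = {
    "studies": "study",
    "caring": "care",
    "healed": "heal",
}

def naive_lemmatise(word):
    word = word.lower()
    # Words missing from the table are returned unchanged
    return LEMMA_TABLE.get(word, word)

for w in ["healed", "studies", "caring"]:
    print(w, "->", naive_lemmatise(w))
# healed -> heal
# studies -> study
# caring -> care
```

Every output here is a meaningful word, which is what sets a lemma apart
from a stem.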
TECHNIQUES OF NLP:-
There are many techniques used in NLP for extracting information but the three
given below are most commonly used:
1. Bag of Words
2. Term Frequency and Inverse Document Frequency (TF-IDF)
3. NLTK (Natural Language Toolkit)
1.Bag of Words
After the process of text normalisation, the corpus is converted into a
normalised corpus, which bag of words treats as just a collection of
meaningful words with no sequence.
Bag of Words is a Natural Language Processing model which helps in extracting
features out of the text which can be helpful in machine learning algorithms. In
bag of words, we get the occurrences of each word and construct the
vocabulary for the corpus.
Thus, we can say that the bag of words gives us two things:
• A vocabulary of words for the corpus
• The frequency of these words (number of times each word has occurred in the
whole corpus)
Here is the step-by-step approach to implement the bag of words algorithm:
1. Text Normalisation: Collect data and pre-process it
2. Create Dictionary: Make a list of all the unique words occurring in the
corpus. (Vocabulary)
3. Create document vectors: For each document in the corpus, find out how
many times each word from the unique list of words has occurred in it.
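The three steps above can be sketched in Python as follows. This is a
minimal illustration, assuming normalisation is just lowercasing and
splitting on spaces; the two-document corpus and the function names
(normalise, build_vocabulary, document_vector) are made up for the example.

```python
from collections import Counter

# Hypothetical two-document corpus
corpus = ["The cat sat", "The dog sat on the mat"]

def normalise(document):
    # Step 1: simplified text normalisation (lowercase, split on spaces)
    return document.lower().split()

def build_vocabulary(tokenised_docs):
    # Step 2: unique words, in order of first appearance
    vocabulary = []
    for tokens in tokenised_docs:
        for token in tokens:
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary

def document_vector(tokens, vocabulary):
    # Step 3: count how often each vocabulary word occurs in one document
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

tokenised = [normalise(doc) for doc in corpus]
vocabulary = build_vocabulary(tokenised)
vectors = [document_vector(tokens, vocabulary) for tokens in tokenised]

print(vocabulary)  # ['the', 'cat', 'sat', 'dog', 'on', 'mat']
print(vectors)     # [[1, 1, 1, 0, 0, 0], [2, 0, 1, 1, 1, 1]]
```

Each document becomes a list of numbers of the same length as the
vocabulary, which is the form machine learning algorithms can work with.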
NLTK has been called “a wonderful tool for teaching, and working in,
computational linguistics using Python,” and “an amazing library to play
with natural language.”