NLP Tech-Names

The document outlines various Python techniques used for natural language processing (NLP), including sentiment analysis, optical character recognition, text categorization, and speech recognition. It provides brief descriptions and small code examples for each technique, utilizing libraries such as TextBlob, scikit-learn, and TensorFlow. Additionally, it covers text preprocessing methods like tokenization and lemmatization, as well as model creation using TF-IDF values.

Uploaded by Aryan Singh

Name | Brief | Python Technique Used | Small Code Example

Sentiment Analysis
Brief: Analyzing the sentiment (positive/negative) of text.
Python Technique Used: TextBlob, VADER
Small Code Example:

    from textblob import TextBlob

    text = "I love this product!"
    blob = TextBlob(text)
    sentiment = blob.sentiment.polarity

Optical Character Recognition (OCR)
Brief: Extracting text from images.
Python Technique Used: Tesseract OCR, Pytesseract
Small Code Example:

    import pytesseract
    from PIL import Image

    image = Image.open('image.png')
    text = pytesseract.image_to_string(image)

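Text Categorization
Brief: Categorizing text into predefined categories.
Python Technique Used: scikit-learn
Small Code Example:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Placeholder data; real training needs many labeled examples per category.
    X_train = ['text data']
    Y_train = [1]

    # Convert the text to word counts, then fit the classifier on them.
    vectorizer = CountVectorizer()
    model = MultinomialNB().fit(vectorizer.fit_transform(X_train), Y_train)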

Word Prediction
Brief: Predicting the next word or sentence.
Python Technique Used: Keras, TensorFlow
Small Code Example:

    from keras.preprocessing.text import Tokenizer

    text = ["hello world"]
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(text)

Speech Recognition
Brief: Converting speech to text.
Python Technique Used: SpeechRecognition
Small Code Example:

    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    text = recognizer.recognize_google(audio)

Machine Translation
Brief: Translating text from one language to another.
Python Technique Used: Googletrans
Small Code Example:

    from googletrans import Translator

    translator = Translator()
    result = translator.translate('Hello', src='en', dest='es')

Text Preprocessing
Brief: Cleaning and preparing text data for further analysis.
Python Technique Used: re, nltk
Small Code Example:

    import re

    text = "This is a sample text!"
    cleaned_text = re.sub(r'[^\w\s]', '', text)

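Tokenization
Brief: Splitting text into smaller chunks like words or sentences.
Python Technique Used: nltk.word_tokenize, spaCy
Small Code Example:

    import nltk

    # Tokenizer data; newer NLTK releases may need nltk.download('punkt_tab') instead.
    nltk.download('punkt')
    tokens = nltk.word_tokenize("This is a sample sentence.")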

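Lemmatization and Stemming
Brief: Reducing words to their base or root form.
Python Technique Used: nltk, spaCy
Small Code Example:

    import nltk
    from nltk.stem import WordNetLemmatizer

    nltk.download('wordnet')  # WordNet data is required by the lemmatizer
    lemmatizer = WordNetLemmatizer()
    # pos="v" treats the word as a verb; the default (noun) leaves "running" unchanged.
    lemma = lemmatizer.lemmatize("running", pos="v")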

Feature Extraction
Brief: Extracting meaningful features from text data.
Python Technique Used: CountVectorizer, TfidfVectorizer
Small Code Example: No code example needed.
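
Although this entry lists no example, a small sketch may help show what count-based feature extraction produces. The code below is a simplified pure-Python illustration, not scikit-learn's CountVectorizer; the `tokenize` and `bag_of_words` helpers, the tokenization rule, and the sample sentences are all assumptions made for this sketch.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of word characters (an assumed, simplistic rule).
    return re.findall(r"\w+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One column per vocabulary term; absent terms count as 0.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


docs = ["the cat sat", "the cat saw the dog"]  # made-up sample documents
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)     # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each document becomes a fixed-length numeric vector, which is the form most classifiers expect as input.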

NLP Terminology
Brief: Understanding terms used in NLP.
Python Technique Used: nltk, spaCy
Small Code Example: No code example needed.

Components of NLP
Brief: Key components like parsing, part-of-speech tagging, named entity recognition.
Python Technique Used: spaCy, nltk
Small Code Example: No code example needed.

Term Frequency (TF)
Brief: Frequency of a term in a document.
Python Technique Used: scikit-learn
Small Code Example:

    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(['sample text'])

Inverse Document Frequency (IDF)
Brief: Measures how rare, and therefore how informative, a term is across the corpus.
Python Technique Used: scikit-learn
Small Code Example: No code example needed.
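
A short worked sketch can make the IDF idea concrete. The code below uses the textbook form log(N / df), where N is the number of documents and df is the number of documents containing the term. This is an assumption chosen for clarity: scikit-learn's TfidfVectorizer documents a smoothed variant as its default, so its numbers will differ. The function name and the three tiny pre-tokenized documents are invented for illustration.

```python
import math


def inverse_document_frequency(tokenized_docs):
    """Map each term to log(N / df), with df = number of documents containing it."""
    n_docs = len(tokenized_docs)
    document_frequency = {}
    for tokens in tokenized_docs:
        for term in set(tokens):  # count each term once per document
            document_frequency[term] = document_frequency.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in document_frequency.items()}


docs = [  # made-up, pre-tokenized sample documents
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
idf = inverse_document_frequency(docs)
print(idf["the"])               # 0.0, because "the" appears in every document
print(idf["cat"] < idf["dog"])  # True: "cat" is in two documents, "dog" in one
```

Terms that occur everywhere get a weight near zero, while terms confined to a few documents are weighted more heavily.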

Modeling using TF-IDF
Brief: Creating models based on TF-IDF values for text analysis.
Python Technique Used: scikit-learn
Small Code Example: No code example needed.
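
One way to picture a model built on TF-IDF values is a nearest-neighbour classifier: each training text becomes a TF-IDF vector, and a new text takes the label of the most similar training vector by cosine similarity. The sketch below is pure Python rather than scikit-learn, and everything in it (helper names, the smoothed IDF formula, whitespace tokenization, the four training sentences and their labels) is an assumption made for illustration only.

```python
import math
from collections import Counter


def fit_idf(tokenized_docs):
    n_docs = len(tokenized_docs)
    df = Counter(term for tokens in tokenized_docs for term in set(tokens))
    # Smoothed IDF so terms present in every document keep a non-zero weight.
    return {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    counts = Counter(tokens)
    # Terms never seen while fitting the IDF table are ignored.
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(text, train_vectors, train_labels, idf):
    query = tfidf_vector(text.lower().split(), idf)
    scores = [cosine(query, vec) for vec in train_vectors]
    # Return the label of the most similar training document.
    return train_labels[scores.index(max(scores))]


train_texts = [  # made-up training data
    "the team won the match",
    "the striker scored a goal",
    "the election results were announced",
    "the senate passed the bill",
]
train_labels = ["sports", "sports", "politics", "politics"]

tokenized = [text.lower().split() for text in train_texts]
idf = fit_idf(tokenized)
train_vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]

print(predict("who scored the goal", train_vectors, train_labels, idf))  # sports
print(predict("the election bill", train_vectors, train_labels, idf))    # politics
```

In practice the same idea is usually expressed with a vectorizer feeding a standard classifier; the nearest-neighbour rule here just keeps the sketch short.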

Spam Filtering
Brief: Classifying text as spam or non-spam.
Python Technique Used: Naive Bayes, scikit-learn
Small Code Example: No code example needed.
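
To show how Naive Bayes separates spam from non-spam, here is a minimal pure-Python sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not scikit-learn's MultinomialNB; the class name `NaiveBayesSpamFilter`, the whitespace tokenization, and the four training messages are hypothetical and chosen only to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # skip words never seen in training
                score += math.log((counts[word] + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


texts = [  # made-up training messages
    "win a free prize now",
    "free money click now",
    "meeting moved to monday",
    "lunch with the team on monday",
]
labels = ["spam", "spam", "not spam", "not spam"]

spam_filter = NaiveBayesSpamFilter().fit(texts, labels)
print(spam_filter.predict("free prize"))           # spam
print(spam_filter.predict("team meeting monday"))  # not spam
```

Working in log space avoids multiplying many small probabilities together, and the add-one smoothing keeps a word that is missing from one class from zeroing out that class's score.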
