
Course Introduction-I-1

The document introduces a course on Natural Language Processing (NLP), covering its definition, applications, and challenges. It outlines key components such as text processing, representation, and modeling, along with various NLP tasks like sentiment analysis, machine translation, and chatbots. The document also highlights the evolution of NLP from rule-based systems to deep learning advancements.


Natural Language Processing (CSC3348)
Course Introduction
Asmaa Mourhir
Agenda
• What is NLP?
• Applications of NLP
• Challenges of NLP
• The cornerstones of NLP

This course is new
• How can you help?
• Give feedback
• Type some of the notes
• Correct typos
• Participate

Text data

62B pages 500M tweets/day 360M user pages 13M articles

What is Natural Language Processing (NLP)?

(Diagram: NLP at the intersection of Computer Science and Linguistics)

NLP
• Natural Language Processing (NLP) is the study of how to get
computers to “understand”, process, and leverage human language
data.

• Speech Audio (signal processing work is a cousin community and often done by EE folks)
• Written Text (Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B, Luo et al., 2019)
• Sign Language (Including Signed Languages in Natural Language Processing, Yin et al., 2021)

What can you do with NLP?
• Text classification: spam detection, intent detection
• Sentiment analysis
• Chatbots
• Search engines (information retrieval)
• Information extraction
• Automatic speech recognition
• Speech synthesis
• Named entity recognition
• Machine translation
• Text summarization, e.g. “Overall, xxxxx COVID-19 vaccine is very safe and one of the most effective vaccines ever produced”
• Topic modeling, e.g. sports, news, politics, fashion
• Video captioning

NLP nowadays
• Generative AI and prompt engineering
• DALL-E
• AlphaFold (example prediction Q8W3K0: a potential plant disease resistance protein; mean pLDDT 82.24)

This course, broadly speaking
• Text processing
• Text representation
• Modelling

The cornerstones of NLP
• Text processing is the practice of cleaning and preparing
text data
• Noise removal
• Segmentation
• Tokenization
• Normalization (stop word removal,
stemming, lemmatization)
• Part-of-Speech Tagging
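
The first four steps in the list above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's reference pipeline: the function names, the regular expressions, and the tiny `STOP_WORDS` set are all hypothetical choices made for this example. Stemming, lemmatization, and part-of-speech tagging are left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in"}

def remove_noise(text):
    """Noise removal: strip HTML-like tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def segment(text):
    """Segmentation: naive split into sentences after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def tokenize(sentence):
    """Tokenization: keep runs of letters, digits, or apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)

def normalize(tokens):
    """Normalization: lowercase and drop stop words (no stemming here)."""
    return [t.lower() for t in tokens if t.lower() not in STOP_WORDS]

def preprocess(text):
    """Run the steps in order; returns one token list per sentence."""
    return [normalize(tokenize(s)) for s in segment(remove_noise(text))]

result = preprocess("<p>The waiter ignored us.</p>  Best roast chicken in San Francisco!")
print(result)
```

Each step is a separate function so that one of them (for example the tokenizer) can be swapped for a library implementation without touching the others.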

The cornerstones of NLP

• Representation: how do we transform symbolic meaning (e.g., words, signs, braille, speech audio) into something the computer can use?
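
One simple way to turn words into something a computer can use is a bag-of-words count vector. The sketch below is only one possible representation, chosen here for illustration; the helper names `build_vocabulary` and `bag_of_words` are hypothetical, and the documents are assumed to be tokenized already.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(doc, vocab):
    """Count vector for one tokenized document; tokens outside vocab are ignored."""
    counts = Counter(doc)
    # vocab preserves insertion order, so columns follow the sorted token order.
    return [counts[tok] for tok in vocab]

docs = [["best", "roast", "chicken"], ["roast", "chicken", "roast"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

Word order is lost in this representation, which is one reason richer representations exist.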

The cornerstones of NLP

• Modelling: given these represented symbols, how do we use them to model the task at hand?

Modelling – how do NLP systems work?

Data (e.g., text, speech) → f(X) → Useful output

What is in the box?
• Our computational model could be anything:
• Rule-based system
• Conditional Random Fields
• Hidden Markov Models
• Probabilistic Graphical Model
• Neural Network/deep learning

What is in the box?
• Rule-based models:
• Example: a heuristic stemming rule
• Porter stemmer (implemented in the NLTK library):
• ATIONAL becomes ATE (e.g., relational becomes relate)

• Prone to error; not covered in this course
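
The single rule above can be written as a few lines of Python. This is a sketch of that one rule only, not the full Porter stemmer and not NLTK's implementation. The function names are hypothetical, the input is assumed to be lowercase, and `measure` is a simplification of Porter's measure in which "y" is always treated as a consonant.

```python
import re

def measure(stem):
    """Rough Porter 'measure': number of vowel-run + consonant-run pairs.
    Simplification (assumption): 'y' is always treated as a consonant."""
    return len(re.findall(r"[aeiou]+[^aeiou]+", stem))

def apply_ational_rule(word):
    """Rewrite the suffix ATIONAL to ATE when the remaining stem has measure > 0."""
    suffix = "ational"
    if word.endswith(suffix):
        stem = word[: -len(suffix)]
        if measure(stem) > 0:
            return stem + "ate"
    return word

for w in ["relational", "rational", "processing"]:
    print(w, "->", apply_ational_rule(w))
```

The measure condition keeps short words such as "rational" from being rewritten, which shows why such hand-written rules need many special cases.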

What is in the box?
• Statistical
• Probabilistic models built from language data:
P(“maison” → “house”) high
P(“L’avocat général” → “the general avocado”) low
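
Probabilities like these can be estimated by counting how often each translation occurs in aligned data. The sketch below uses relative frequency, which is only one simple estimator. The function name is hypothetical and the toy pairs are invented for illustration; they are not real corpus counts.

```python
from collections import Counter

def translation_probabilities(aligned_pairs):
    """Estimate P(target | source) by relative frequency over aligned pairs."""
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(src for src, _ in aligned_pairs)
    return {(s, t): c / source_counts[s] for (s, t), c in pair_counts.items()}

# Hypothetical toy data, invented for illustration only.
pairs = (
    [("maison", "house")] * 9
    + [("maison", "home")] * 1
    + [("avocat", "lawyer")] * 8
    + [("avocat", "avocado")] * 2
)
probs = translation_probabilities(pairs)
print(probs[("maison", "house")])
print(probs[("avocat", "avocado")])
```

With counts like these, the likelier translation of each word is preferred, but the estimate knows nothing about context; choosing between "lawyer" and "avocado" for a given sentence needs more than word-level counts.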

What is in the box?
• Machine learning
• Models that use mathematical and statistical techniques to discover patterns in data:
• They infer meaning from patterns between words and the wider context of the sentence and paragraph they sit within
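
As a small illustration of learning patterns from data, here is a sketch of a multinomial Naive Bayes text classifier with add-one smoothing. Naive Bayes is swapped in here as one simple example of a learned model; the slides do not name it. The function names and the four toy training examples are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (tokens, label). Returns the counts needed for prediction."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(tokens, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set, invented for illustration only.
training = [
    (["buy", "cheap", "pills"], "spam"),
    (["cheap", "pills", "now"], "spam"),
    (["lets", "go", "to", "agra"], "ham"),
    (["see", "you", "at", "lunch"], "ham"),
]
model = train(training)
print(predict(["buy", "pills"], model))
print(predict(["go", "to", "lunch"], model))
```

Unlike the hand-written stemmer rule, nothing here is specific to spam: the same code learns a different classifier when given different labeled examples.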

Brief history of NLP
• 1960s: pattern-matching and rules (highly limiting)

• 1970s – 1980s: linguistically rich, logic-driven systems; labor-intensive successes on a few, very specific tasks

• 1990s – 2000s: statistical modelling takeover! ML becomes a central component; some systems are deployed for practical use (e.g., speech to text)

• 2010s – 2020s: Deep Learning (neural nets) yields astronomical progress on nearly
every NLP task; systems become fairly useful for consumers

• 2020s – 2030s?: you can help drive the change


NLP Progress

Mostly solved:
• Spam detection: “Let’s go to Agra!” ✓, “Buy V1AGRA …” ✗
• Part-of-speech (POS) tagging: “Colorless green ideas sleep furiously.” tagged ADJ ADJ NOUN VERB ADV
• Named entity recognition (NER): “Einstein met with UN officials in Princeton” tagged PERSON, ORG, LOC

Making good progress:
• Sentiment analysis: “Best roast chicken in San Francisco!”, “The waiter ignored us for 20 minutes.”
• Coreference resolution: “Carter told Mubarak he shouldn’t run again.”
• Word sense disambiguation (WSD): “I need new batteries for my mouse.”
• Parsing: “I can see Alcatraz from the window!”
• Machine translation (MT): “第13届上海国际电影节开幕…” → “The 13th Shanghai International Film Festival…”
• Information extraction (IE): “You’re invited to our dinner party, Friday May 27 at 8:30” → Party, May 27, add

Still really hard:
• Question answering (QA): “Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?”
• Paraphrase: “XYZ acquired ABC yesterday” / “ABC has been taken over by XYZ”
• Summarization: “The Dow Jones is up”, “The S&P500 jumped”, “Housing prices rose” → “Economy is good”
• Dialog: “Where is Citizen Kane playing in SF?” “Castro Theatre at 7:30. Do you want a ticket?”