Course Introduction
Tanmoy Chakraborty
Associate Professor, IIT Delhi
https://tanmoychak.com/
Introduction to Large Language Models
Instructors:
• Tanmoy Chakraborty (IIT Delhi)
• Soumen Chakrabarti (IIT Bombay)
Teaching Assistants:
• Anwoy Chatterjee (PhD student, IIT Delhi)
• Poulami Ghosh (PhD student, IIT Bombay)
Course Content
• This is an introductory graduate course, and we will be teaching the fundamental concepts underlying large language models.
• This course will start with a short introduction to NLP and Deep Learning, and then move on to the architectural intricacies of Transformers, followed by the recent advances in LLM research.
Basics
• Introduction
• Intro to NLP
• Intro to Deep Learning
• Intro to Language Models (LMs)
• Word Embeddings (Word2Vec, GloVe)
• Neural LMs (CNN, RNN, Seq2Seq, Attention)

Architecture
• Intro to Transformer
• Positional encoding
• Tokenization strategies
• Decoder-only LM, Prefix LM, Decoding strategies
• Encoder-only LM, Encoder-decoder LM

Learnability
• Instruction fine-tuning
• In-context learning
• Advanced prompting (Chain of Thoughts, Graph of Thoughts, Prompt Chaining, etc.)
• Alignment
• PEFT

Knowledge & Retrieval
• Knowledge graphs
• Open-book question answering
• Retrieval augmentation techniques

Ethics and Misc.
• Overview of recently popular models
• Bias, toxicity and hallucination
Pre-Requisites
• Excitement about language!
• Willingness to learn

Mandatory:
• Data Structures & Algorithms
• Machine Learning
• Python programming

Desirable:
• NLP
• Deep Learning

This course will NOT cover:
• Details of NLP, Machine Learning and Deep Learning
• Generative models for modalities other than text
Reading and Reference Materials
• Books (optional reading)
• Speech and Language Processing, Dan Jurafsky and James H. Martin
https://web.stanford.edu/~jurafsky/slp3/
• Foundations of Statistical Natural Language Processing, Chris Manning and Hinrich Schütze
• Natural Language Processing, Jacob Eisenstein
https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
• A Primer on Neural Network Models for Natural Language Processing, Yoav Goldberg
http://u.cs.biu.ac.il/~yogo/nnlp.pdf
• Journals
• Computational Linguistics, Natural Language Engineering, TACL, JMLR, TMLR, etc.
• Conferences
• ACL, EMNLP, NAACL, COLING, ICML, NeurIPS, ICLR, AAAI, WWW, KDD, SIGIR, etc.
Research Papers Repositories
• https://aclanthology.org/
• https://arxiv.org/list/cs.CL/recent
Acknowledgements (Non-exhaustive List)
• Advanced NLP, Graham Neubig http://www.phontron.com/class/anlp2022/
• Advanced NLP, Mohit Iyyer https://people.cs.umass.edu/~miyyer/cs685/
• NLP with Deep Learning, Chris Manning, http://web.stanford.edu/class/cs224n/
• Understanding Large Language Models, Danqi Chen https://www.cs.princeton.edu/courses/archive/fall22/cos597G/
• Natural Language Processing, Greg Durrett https://www.cs.utexas.edu/~gdurrett/courses/online-course/materials.html
• Large Language Models: https://stanford-cs324.github.io/winter2022/
• Natural Language Processing at UMBC, https://laramartin.net/NLP-class/
• Computational Ethics in NLP, https://demo.clab.cs.cmu.edu/ethical_nlp/
• Self-supervised Models, CS 601.471/671, Johns Hopkins University
• WING.NUS Large Language Models, https://wing-nus.github.io/cs6101/
• And many more…
What is a Language Model (LM)?
A language model gives the probability distribution over a sequence of tokens.

Vocabulary: V = {arrived, delhi, have, is, monsoon, rains, the}

Example probabilities assigned by the language model:
• P(the monsoon rains have arrived) = 0.2
• P(monsoon the have rains arrived) = 0.001
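As a hands-on illustration (not part of the slides), here is a minimal Python sketch of the definition above: it scores a token sequence with a small pretrained auto-regressive LM via Hugging Face Transformers. The model choice (GPT-2) and the example sentences are illustrative assumptions.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sequence_log_prob(text: str) -> float:
    """Return log P(text) = sum_i log P(x_i | x_<i) under the LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                      # (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    # The token at position i is predicted from the logits at position i-1.
    token_lp = log_probs[0, :-1].gather(1, ids[0, 1:, None]).squeeze(-1)
    return token_lp.sum().item()

print(sequence_log_prob("the monsoon rains have arrived"))
print(sequence_log_prob("monsoon the have rains arrived"))   # expect a much lower score

A well-trained LM should assign the grammatical sentence a much higher (log-)probability than the scrambled one, mirroring the 0.2 vs. 0.001 example above.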
LMs can ‘Generate’ Text!

Vocabulary: V = {arrived, delhi, have, is, monsoon, rains, the}

Given the input ‘the monsoon rains have’, the LM can calculate
P(xi | the monsoon rains have), ∀ xi ∈ V

• For generation, the next token is sampled from this probability distribution.
• Auto-regressive LMs calculate this distribution efficiently, e.g., using ‘deep’ neural networks.
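To make the sampling step concrete, here is a toy Python sketch (the conditional probabilities are invented for illustration; a real LM would produce them with a neural network, as the slide notes):

import random

V = ["arrived", "delhi", "have", "is", "monsoon", "rains", "the"]
# Hypothetical distribution P(x | "the monsoon rains have") an LM might assign:
p_next = {"arrived": 0.85, "delhi": 0.02, "have": 0.01, "is": 0.05,
          "monsoon": 0.02, "rains": 0.02, "the": 0.03}

# Sample the next token from the distribution (rather than always taking the argmax).
next_token = random.choices(V, weights=[p_next[w] for w in V], k=1)[0]
print("the monsoon rains have", next_token)

Repeating this step, each time appending the sampled token to the context, is exactly how auto-regressive generation proceeds.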
‘Large’ Language Models
The ‘Large’ refers to both the model's size (# parameters) and the massive size of the training dataset.

Model sizes have increased by a factor of roughly 5000x over just the last 4 years!

Other recent models: PaLM (540B), OPT (175B), BLOOM (176B), Gemini-Ultra (1.56T), GPT-4 (1.76T)
Disclaimer: For API-based models like GPT-4/Gemini-Ultra, the number of parameters is not announced officially; these are rumored numbers from the web.

Image source: https://hellofuture.orange.com/en/the-gpt-3-language-model-revolution-or-evolution/
LLMs in AI Landscape
Image source: https://www.manning.com/books/build-a-large-language-model-from-scratch
Evolution of (L)LMs
Image source: https://synthedia.substack.com/p/a-timeline-of-large-language-model
Post-Transformers Era: The LLM Race

Google Designed Transformers: But Could It Take Advantage?
• BERT marked the beginning of the use of the Transformer as a language representation model.
• BERT achieved SOTA on 11 NLP tasks.
• Compact variants followed: DistilBERT, TinyBERT, MobileBERT.
However, someone was waiting for the right opportunity!!
Guess who?
OpenAI Started Pushing the Frontier
• Use of a decoder-only architecture
• The idea of generative pre-training over a large corpus
The Beginning of Scale
• GPT-1 (117M) → GPT-2 (1.5B): a 13x increase in # parameters
• Minimal changes (some LayerNorms added, modified weight initialization)
• Increase in context length: GPT-1 (512 tokens) → GPT-2 (1024 tokens)
• Performance boosts across tasks
What Was Google Developing in Parallel?
• Similar broader goal of converting all text-based language problems into a text-to-text format
• Used an encoder-decoder architecture
• Pre-training strategy differs from GPT's; it is more similar to BERT's
Was It Only Google vs OpenAI? Where Did Meta Stand?

RoBERTa:
• A replication study of BERT pretraining
• Measured the impact of many key hyperparameters and training data size
• Found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it

XLM:
• Proposed methods to learn cross-lingual language models (XLMs)
• Obtained SOTA on:
  • cross-lingual classification
  • unsupervised and supervised machine translation
OpenAI Continues to Scale
• GPT-3: 175B parameters!
• OpenAI stops open-sourcing!!
Google Starts Scaling Too (But Is It Late)!
• PaLM: 540B parameters!
• Google follows OpenAI in stopping open-sourcing!
• It’s now the “LLM Race”
2021-2022: A Flurry of LLMs
• Megatron-Turing NLG
• Codex
Meta Promotes Open-Sourcing!
• OPT: a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters
• Open-sourced!!!
The ChatGPT Moment
November 30, 2022
2023: The Year of Rapid Pace
• Feb 2023: Google releases Bard
• Feb 2023: Meta releases its LLaMA family of open-source models
• March 2023: Anthropic, a start-up founded in 2021 by ex-OpenAI researchers, releases Claude
• March 2023: OpenAI releases GPT-4, which is multimodal
• June 2023: Microsoft releases Phi-1, a 1.3B LLM for code
• Sept 2023: Mistral AI releases the Mistral-7B model
• Nov 2023: xAI releases Grok
• Dec 2023: Google releases Gemini
And now, 2024 is seeing even more rapid advancements!

Why Does This Course Exist?
Why do we need a separate course on LLMs? What changes with the scale of LMs?

Emergence
Although the technical machinery is almost the same, ‘just scaling up’ these models results in new emergent behaviors, which lead to significantly different capabilities and societal impacts.
Content credits: https://stanford-cs324.github.io/winter2022/
Why Does This Course Exist?
LLMs show emergent capabilities not observed previously in ‘small’ LMs.
• In-context learning: A pre-trained language model can be guided with only prompts to perform different tasks (without separate task-specific fine-tuning); a small prompt sketch follows this slide.
• In-context learning is an example of emergent behavior.

LLMs are widely adopted in the real world.
• Research: LLMs have transformed the NLP research world, achieving state-of-the-art performance across a wide range of tasks such as sentiment classification, question answering, summarization, and machine translation.
• Industry: Here is a very incomplete list of some high-profile large language models being used in production systems:
  • Google Search (BERT)
  • Facebook content moderation (XLM)
  • Microsoft’s Azure OpenAI Service (GPT-3/3.5/4)
Content credits: https://stanford-cs324.github.io/winter2022/
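A minimal in-context learning sketch (the prompt, the labels, and the choice of GPT-2 are illustrative assumptions, not from the slides): the task is specified entirely through the prompt, with no task-specific fine-tuning.

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Review: The movie was fantastic. Sentiment: positive\n"
    "Review: I wasted two hours of my life. Sentiment: negative\n"
    "Review: The monsoon rains ruined the concert. Sentiment:"
)
out = generator(prompt, max_new_tokens=2, do_sample=False)
print(out[0]["generated_text"])

Note that a 124M-parameter GPT-2 will often get this wrong; reliable in-context learning is precisely the kind of emergent ability that appears only at much larger scale.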
Why Does This Course Exist?
With tremendous capabilities, LLMs’ usage also carries various risks.
• Reliability & Disinformation: LLMs often hallucinate, i.e., generate responses that seem correct but are not factually correct.
  • A significant challenge for high-stakes applications like healthcare
• Social bias: Most LLMs show performance disparities across demographic groups, and their predictions can enforce stereotypes.
  • P(He is a doctor) > P(She is a doctor) (see the probing sketch after this slide)
  • Training data contains inherent bias
• Toxicity: LLMs can generate toxic/hateful content.
  • Trained on a huge amount of Internet data (e.g., Reddit), which inevitably contains offensive content
  • A challenge for applications such as writing assistants or chatbots
• Security: LLMs are trained on a scrape of the public Internet, and anyone can put up a website that can enter the training data.
  • An attacker can perform a data poisoning attack.
Content credits: https://stanford-cs324.github.io/winter2022/
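One crude way to probe the bias claim above (a sketch; the model choice and sentences are illustrative, and results vary by model) is to compare the log-probabilities a pretrained auto-regressive LM assigns to the two sentences:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def log_prob(text):
    # Sum of log P(x_i | x_<i) over all next-token predictions.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        lp = torch.log_softmax(model(ids).logits, dim=-1)
    return lp[0, :-1].gather(1, ids[0, 1:, None]).sum().item()

print(log_prob("He is a doctor."), log_prob("She is a doctor."))
# If the first score is higher, the model prefers the stereotyped completion,
# one symptom of bias inherited from the training data.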
We Will Cover Almost All of These in 5 Modules

Module-1: Basics
• A refresher on the basics of NLP required to understand and appreciate LLMs.
• A brief introduction to the basics of Deep Learning.
• The basics of Statistical Language Modelling.
• How did we end up in Neural NLP? We will discuss the transition and the foundations of Neural NLP.
• Initial Neural LMs
Topics: Intro to NLP, Intro to Deep Learning, Intro to Language Models (LMs), Word Embeddings (Word2Vec, GloVe), Neural LMs (CNN, RNN, Seq2Seq, Attention)
Module-2: Architecture
• Workings of the vanilla Transformer
  • Positional encoding and tokenization strategies
• Different Transformer variants
  • How do their training strategies differ? How are masked LMs (like BERT) different from auto-regressive LMs (like GPT)?
  • Response generation (decoding) strategies
Topics: Intro to Transformer, Positional encoding, Tokenization strategies, Decoder-only LM, Prefix LM, Decoding strategies, Encoder-only LM, Encoder-decoder LM
Module-3: Learnability
• What makes modern LLMs so good at following user instructions?
• What is in-context learning? What are its various facets?
• What kinds of prompting techniques are required to elicit reasoning in LLMs?
• How are LLMs made to generate responses preferred by humans? Does it remove toxicity in responses?
• Efficiency is crucial in production systems. How are LLMs efficiently fine-tuned?
Topics: Instruction fine-tuning, In-context learning, Advanced prompting, Alignment, PEFT
Module-4: Knowledge and Retrieval
• Knowledge graphs (KGs)
  • Representation, completion
  • Tasks: alignment and isomorphism
  • Distinction between graph neural networks and neural KG inference
• Open-book question answering: retrieving from structured and unstructured sources
• Retrieval augmentation techniques
  • Key-value memory networks in QA for simple paths in KGs
  • Early HotPotQA solvers, pointer networks, reading comprehension
  • REALM, RAG, FiD, Unlimiformer
  • KGQA (e.g., EmbedKGQA, GrailQA)
Module-5: Ethics and Miscellaneous
• A discussion of ethical issues and risks of LLM usage (bias, toxicity and hallucination)
• An overview of recent popular LLMs, such as GPT-4, Llama 3, Claude 3, Mistral, and Gemini
Suggestions (For Effective Learning)
• To understand the concepts clearly, experiment with the models (Hugging Face makes life easier).
  • Smaller models (like GPT-2) can be run on Google Colab / Kaggle.
  • Even 7B models can be run with proper quantization (see the sketch below).

Always get your hands dirty!
LLM research is all about implementing and experimenting with your ideas.

Rule of thumb: Never believe in any hypothesis until your experiments verify it!
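As a hands-on starting point, here is a minimal sketch (the model name, 4-bit settings, and generation call are illustrative assumptions) of loading a ~7B-parameter model with 4-bit quantization so it fits in the memory of a free Colab GPU; it requires the transformers, accelerate, and bitsandbytes packages.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-v0.1"   # any ~7B causal LM works similarly
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4-bit precision
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",                     # place layers on the available GPU/CPU
)

inputs = tokenizer("The monsoon rains have", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))

Smaller models such as GPT-2 need none of this and load directly with AutoModelForCausalLM.from_pretrained("gpt2").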