Community Session IndexingChaining

The document discusses LLMs as builders and the concepts of chaining and indexing. It introduces LangChain as an interface for chaining LLMs, vector databases, and documents. Chaining involves connecting different LLMs and data sources to perform useful tasks. Indexing involves splitting documents into chunks, creating embeddings, and storing them in a vector database for retrieval. The document provides examples of chaining applications and companies working on vector databases and chaining tools.

Uploaded by

Sani Kamal

LLM Indexing and Chaining

Community Session
Outline
Introductory Talk - Thinking about LLMs as Builders
● Chaining
● Indexing
● LLMs, Vector DBs, and LLM Ops

Community Breakout Discussion Activities

Live Interactive Build Demo!
● Document querying with LangChain

© 2023 FourthBrain
How most people use LLMs

What most people ask about LLMs

Thinking about LLMs as Builders
● Primary Chains
○ Prompt Chain
○ Tools Chain
○ Data Indexing Chain

● LangChain (also LlamaIndex, Haystack, and others) provides a standard interface for Chains

● Chains = main abstraction innovation
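
To make the chain abstraction concrete, here is a minimal sketch of a Prompt Chain in plain Python. It does not use LangChain's API; `fake_llm`, `prompt_template`, and `chain` are hypothetical names introduced only for illustration, and `fake_llm` stands in for a real model call.

```python
from typing import Callable, List

# A "link" takes text in and returns text out.
Link = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only wraps the prompt."""
    return f"<answer to: {prompt}>"


def prompt_template(template: str) -> Link:
    """Build a link that fills the single {input} slot of a prompt template."""
    return lambda text: template.format(input=text)


def chain(links: List[Link]) -> Link:
    """Compose links so that each link's output becomes the next link's input."""
    def run(text: str) -> str:
        for link in links:
            text = link(text)
        return text
    return run


# Two prompt + LLM steps chained together: summarize, then translate.
summarize_then_translate = chain([
    prompt_template("Summarize in one sentence: {input}"),
    fake_llm,
    prompt_template("Translate to French: {input}"),
    fake_llm,
])

result = summarize_then_translate("LLMs can be combined with tools and data.")
print(result)
```

The point of the sketch is only the composition: a Tools Chain or a Data Indexing Chain would swap in links that call an external tool or a vector store instead of a prompt template.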

“LangChain provides a generic interface for interacting with LLMs”

“LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.” ~ Harrison Chase, Creator of LangChain
Creating an Index (with a Data-Indexing Chain)
1. Splitting doc into chunks
2. Creating embeddings for each chunk
3. Storing documents and embeddings in a vectorstore
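
The three steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not LangChain's implementation: `split_into_chunks`, `toy_embedding`, and `build_index` are hypothetical helpers, the "embedding" is a simple hashed bag of words rather than a learned model, and the "vectorstore" is just a list of records.

```python
import hashlib
import math
from typing import Dict, List


def split_into_chunks(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Step 1: split text into word-based chunks that overlap slightly."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embedding(text: str, dims: int = 16) -> List[float]:
    """Step 2 (toy version): hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(text: str) -> List[Dict]:
    """Step 3: store each chunk next to its embedding."""
    return [{"chunk": c, "embedding": toy_embedding(c)} for c in split_into_chunks(text)]


document = (
    "A vector store keeps document chunks next to their embeddings "
    "so that the most relevant chunks can be fetched for a query"
)
index = build_index(document)
for record in index:
    print(record["chunk"])
```

In a real application the embedding function would be a model call and the list would be replaced by a vector database, but the shape of the pipeline stays the same.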

A Simplified LangChain Application

Same chains…
● Prompt Chain
● Tools Chain
● Data Indexing Chain

Same primary components…
● LLM
● Vector Database
● Document(s)

Def: Vector Store (a.k.a. Vector Database)
● optimized for storing documents and their embeddings

● fetching the most relevant documents for a particular query

○ → those whose embeddings are most similar to the embedding of the query
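
As a rough illustration of "most similar", the sketch below ranks stored documents by cosine similarity to a query embedding. It assumes cosine similarity as the measure (real vector databases offer several and use approximate search for speed); `TinyVectorStore` is a hypothetical class, and the three-dimensional embeddings are made-up values chosen by hand.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (document, embedding) pairs."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, document: str, embedding: Sequence[float]) -> None:
        self._items.append((document, list(embedding)))

    def most_similar(self, query_embedding: Sequence[float], k: int = 2) -> List[Tuple[str, float]]:
        """Return the k documents whose embeddings are closest to the query."""
        scored = [(doc, cosine_similarity(emb, query_embedding)) for doc, emb in self._items]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


store = TinyVectorStore()
store.add("doc about hotels", [0.9, 0.1, 0.0])
store.add("doc about databases", [0.0, 0.2, 0.9])
store.add("doc about travel", [0.7, 0.3, 0.1])

# A query embedding that points mostly along the first dimension.
top = store.most_similar([1.0, 0.0, 0.0], k=2)
print([doc for doc, _ in top])
```

This brute-force scan is linear in the number of stored documents, which is fine for a demo and is the part that dedicated vector databases optimize.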

LLMs, Chaining, Data Indexing, Vector DBs, Documents…

How does all of this fit together?


LLM Ops
● Definition: How we store, index, and retrieve knowledge that we need to perform useful LLM tasks

https://every.to/chain-of-thought/a-few-things-i-believe-about-ai

Players to Watch (Chaining)
● LangChain ~ $10M seed funding

● LlamaIndex ~ Open Source

● Haystack ~ $9.2M seed funding/debt financing
○ Extractive QA

● AgentGPT ~ Open Source Project by Level AI ($20M Series B, 2022)
○ Call centers

The more mature infrastructure layer is…
Vector Store DB Companies (LangChain Support)
● Chroma ~ $18M seed round
● FAISS (Facebook)
● Elasticsearch (established company)
● Milvus ~ $60M Series B (extension), $43M in ’21
● Pinecone ~ $100M Series B
● Qdrant ~ $7.5M seed round
● Weaviate ~ $50M Series B

Ex Project Ideas from “Building with LLMs” Course
Simple (1-step)
● Natural Language Website Search: Scrape all text from {hotel}.com webpages
and index it in a vector store so that any information can be searched with an
LLM

Mild (2-step)
● Technical Q&A (“AI Tech Support”): Create a fine-tuned LLM to answer FAQs
about technical documentation; if it has no answer, use a non-fine-tuned LLM
to search all relevant documentation to find an answer.
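
The 2-step idea can be sketched as a fallback: try the FAQ step first and only search the documentation when it has no answer. Everything here is a hypothetical stand-in: `faq_step` replaces the fine-tuned LLM with a lookup table, `search_step` replaces LLM-driven search with word overlap, and the FAQ and documentation strings are invented sample data.

```python
from typing import List, Optional

# Invented sample data standing in for a fine-tuned FAQ model's knowledge.
FAQ_ANSWERS = {
    "how do i reset my password?": "Use the 'Forgot password' link on the login page.",
}

# Invented sample documentation snippets.
DOCS = [
    "To export a report open the Reports tab and click Export.",
    "Billing runs on the first day of each month.",
]


def faq_step(question: str) -> Optional[str]:
    """Step 1 (stand-in for the fine-tuned LLM): answer only known FAQs."""
    return FAQ_ANSWERS.get(question.strip().lower())


def search_step(question: str, docs: List[str]) -> str:
    """Step 2 (stand-in for LLM search): pick the doc sharing the most words."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))


def answer(question: str, docs: List[str]) -> str:
    """Run step 1, and fall back to step 2 only when step 1 has no answer."""
    faq = faq_step(question)
    return faq if faq is not None else search_step(question, docs)


print(answer("How do I reset my password?", DOCS))
print(answer("How do I export a report?", DOCS))
```

The ordering is the design choice worth noting: the cheaper, narrower step runs first, and the broader search runs only on a miss.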

Ex Project Ideas from “Building with LLMs” Course
Medium (2+step)
● Qualitative + Quantitative Q&A (“The AI VP”): Create a fine-tuned LLM to
generate SQL queries for your database structure using common queries useful to
your product/sales/etc. team, then perform the SQL query and return the
quantitative result. Compare the result against the question asked, and combine
into a holistic response.
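
A minimal sketch of this flow, using Python's built-in `sqlite3` module: generate SQL from the question, run it, then combine the number with the question into a response. The two `fake_*` functions are hypothetical stand-ins for LLM calls, and the `orders` table and its rows are invented for the example.

```python
import sqlite3


def fake_sql_llm(question: str) -> str:
    """Hypothetical stand-in for a fine-tuned LLM that maps a question to SQL."""
    known_queries = {
        "How many orders did we receive?": "SELECT COUNT(*) FROM orders",
    }
    return known_queries[question]


def fake_writer_llm(question: str, value: object) -> str:
    """Hypothetical stand-in for the LLM that writes the final response."""
    return f"Q: {question} A: {value}"


def answer(conn: sqlite3.Connection, question: str) -> str:
    sql = fake_sql_llm(question)             # step 1: question -> SQL
    value = conn.execute(sql).fetchone()[0]  # step 2: run the query
    return fake_writer_llm(question, value)  # step 3: combine into a response


# Invented in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (25.5,), (4.5,)])

response = answer(conn, "How many orders did we receive?")
print(response)
```

With a real model generating the SQL, it would be prudent to run queries against a read-only connection, since generated statements are not guaranteed to be safe.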

Breakouts!
What useful LLM tasks (projects) are you interested in building solutions for?
(20 minutes)
10 per room
Assign ONE person from your room to take notes and share!
Indexing, Chaining, and LLM Ops

This week’s build - Questioning Your Document
● Model: OpenAI’s gpt-3.5-turbo

● Dataset: Hitchhiker’s Guide to the Galaxy

● Chaining Tool: LangChain

● Vector DB Tool: ChromaDB

Let’s Check it Out!
