Lecture 5:
Further Enhancements to Scoring, Results Assembly, and Evaluation
Today’s Lecture
• Speeding up vector space ranking
• Putting together a complete search system
• This will require learning about a number of miscellaneous topics and heuristics
• Evaluation of the IR system
Efficient Cosine Ranking
• Find the K docs in the collection “nearest” to the query, i.e., the K largest query-doc cosines.
• Efficient ranking:
– Computing a single cosine efficiently.
– Choosing the K largest cosine values efficiently.
• Can we do this without computing all N cosines?
Computing Cosine Scores
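The slide’s pseudocode is not reproduced here; the following is a minimal Python sketch of term-at-a-time cosine scoring with one accumulator per document. The data structures (a `postings` map of (doc_id, weight) pairs and a `doc_length` map of precomputed vector lengths) are assumptions for illustration, not a prescribed implementation.

```python
from collections import defaultdict

def cosine_scores(query_weights, postings, doc_length):
    """Term-at-a-time cosine scoring with one accumulator per doc.

    query_weights: dict term -> query weight (e.g. tf-idf)
    postings:      dict term -> list of (doc_id, doc weight) pairs
    doc_length:    dict doc_id -> Euclidean length of the doc vector
    """
    scores = defaultdict(float)
    for term, w_q in query_weights.items():
        for doc_id, w_d in postings.get(term, []):
            scores[doc_id] += w_q * w_d        # accumulate dot products
    for doc_id in scores:
        scores[doc_id] /= doc_length[doc_id]   # length-normalize
    return scores                              # top-K selection comes next
```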
Computing the K Largest Cosines: Selection vs. Sorting
• Typically we want to retrieve the top K docs (in the cosine ranking for the query)
– not to totally order all docs in the collection
• Can we pick off the docs with the K highest cosines?
• Let J = number of docs with nonzero cosines
– We seek the K best of these J
Use a Heap for Selecting the Top K
• A binary tree in which each node’s value > the values of its children (max-heap)
• Takes 2J operations to construct; then each of the K “winners” is read off in 2 log J steps.
• For J = 1M and K = 100, this is about 10% of the cost of sorting
[Figure: example max-heap of cosine scores (root 1; children .9, .3; leaves .3, .8, .1, .1)]
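As a sketch, the top-K selection can be done with Python’s heapq. Since heapq is a min-heap, scores are negated to mimic the max-heap described above; the `scores` dict is assumed to come from a cosine-scoring pass like the earlier sketch.

```python
import heapq

def top_k(scores, k):
    """Select the K largest scores without fully sorting all J of them."""
    heap = [(-score, doc_id) for doc_id, score in scores.items()]
    heapq.heapify(heap)                  # O(J) heap construction
    winners = []
    for _ in range(min(k, len(heap))):   # each pop costs O(log J)
        neg_score, doc_id = heapq.heappop(heap)
        winners.append((doc_id, -neg_score))
    return winners
```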
Bottlenecks
• Primary computational bottleneck in scoring: cosine computation
• Can we avoid all this computation?
• Yes, but we may sometimes get it wrong
– a doc not in the top K may creep into the list of K output docs
– Is this such a bad thing?
Cosine Similarity Is Only a Proxy
• User has a task and a query formulation
• Cosine matches docs to the query
• Thus cosine is in any case only a proxy for user happiness
• If we get a list of K docs “close” to the top K by the cosine measure, that should be OK
Generic Approach
• Find a set A of contenders, with K < |A| << N
– A does not necessarily contain the top K, but has many docs from among the top K
– Return the top K docs in A
• Think of A as pruning non-contenders
• The same approach is also used for other (non-cosine) scoring functions
• We will look at several schemes following this approach
Pruning Non-Contenders
• Index elimination
• Champion lists
• Static quality scores
• Impact ordering
• Cluster pruning
Index Elimination
• Basic cosine computation algorithm only considers docs
containing at least one query term
• Take this further:
– Only consider high-idf query terms
– Only consider docs containing many query terms
High-idf Query Terms Only
• For a query such as catcher in the rye
• Only accumulate scores from catcher and rye
• Intuition: in and the contribute little to the scores and so don’t alter the rank-ordering much
• Benefit:
– Postings of low-idf terms have many docs, so these (many) docs get eliminated from the set A of contenders
Docs Containing Many Query Terms
• Any doc with at least one query term is a candidate for the top K output list
• For multi-term queries, only compute scores for docs containing several of the query terms
– Say, at least 3 out of 4
– Imposes a “soft conjunction” on queries, as seen on web search engines (early Google)
• Easy to implement in postings traversal
3 of 4 Query Terms
Antony: 3 → 4 → 8 → 16 → 32 → 64 → 128
Brutus: 2 → 4 → 8 → 16 → 32 → 64 → 128
Caesar: 1 → 2 → 3 → 5 → 8 → 13 → 21 → 34
Calpurnia: 13 → 16 → 32
Scores are only computed for docs 8, 16, and 32 (see the sketch below).
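A minimal sketch of the “at least 3 of 4 query terms” filter, run on the slide’s postings (doc IDs only); the function and variable names are illustrative.

```python
from collections import Counter

def candidates_with_min_terms(postings, query_terms, min_match=3):
    """Keep only docs that contain at least `min_match` of the query terms."""
    counts = Counter()
    for term in query_terms:
        for doc_id in postings.get(term, []):
            counts[doc_id] += 1
    return {doc_id for doc_id, n in counts.items() if n >= min_match}

# The slide's example postings (doc IDs only):
postings = {
    "Antony":    [3, 4, 8, 16, 32, 64, 128],
    "Brutus":    [2, 4, 8, 16, 32, 64, 128],
    "Caesar":    [1, 2, 3, 5, 8, 13, 21, 34],
    "Calpurnia": [13, 16, 32],
}
print(candidates_with_min_terms(postings, postings.keys()))  # {8, 16, 32}
```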
Champion Lists
• Precompute for each dictionary term t the r docs of highest weight in t’s postings
– Call this the champion list for t
– (aka fancy list or top docs for t)
• At query time, only compute scores for docs in the champion list of some query term
– Pick the K top-scoring docs from amongst these
• Note that r has to be chosen at index build time
– Thus, it’s possible that r < K
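A sketch of champion-list construction at index build time, assuming postings store (doc_id, weight) pairs; r is fixed here when the lists are built.

```python
import heapq

def build_champion_lists(postings, r):
    """For each term, keep only the r postings with highest weight.

    postings: dict term -> list of (doc_id, weight) pairs
    Returns a dict term -> champion list of length <= r.
    """
    return {
        term: heapq.nlargest(r, plist, key=lambda p: p[1])
        for term, plist in postings.items()
    }
```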
Static Quality Scores
• We want top-ranking documents to be both relevant and authoritative
• Relevance is being modeled by cosine scores
• Authority is typically a query-independent property of a document
• Examples of authority signals:
– Wikipedia among websites
– Articles in certain newspapers
– A research paper with many citations
– Many bitly links, diggs, or other social bookmarks
– (PageRank)
Modeling Authority
• Assign to each document d a query-independent quality score in [0,1]
– Denote this by g(d)
• Thus, a quantity like the number of citations is scaled into [0,1]
Net Score
• Consider a simple total score combining cosine relevance and authority
• net-score(q,d) = g(d) + cosine(q,d)
– Can use some other linear combination
– Indeed, any function of the two “signals” of user happiness – more later
• Now we seek the top K docs by net score
Top K by Net Score – Fast Methods
• First idea: order all postings by g(d)
• Key: this is a common ordering for all postings
• Thus, we can concurrently traverse query terms’ postings for
– Postings intersection
– Cosine score computation
Why Order Postings by g(d)?
• Under g(d)-ordering, top-scoring docs likely to appear early in
postings traversal
• In time-bound applications (say, we have to return whatever
search results we can in 50 ms), this allows us to stop postings
traversal early
– Short of computing scores for all docs in postings
Champion Lists in g(d)-Ordering
• Can combine champion lists with g(d)-ordering
• Maintain for each term a champion list of the r docs with highest g(d) + tf-idf(t,d)
• Seek top-K results from only the docs in these champion lists
High and low lists
• For each term,we maintain two postings lists called high and low
– Think of high as the champion list
• When traversing postings on a query,only traverse high lists first
– If we get more than K docs, select the top K and stop
– Else proceed to get docs from the low lists
• Can be used even for simple cosine scores, without global quality
g(d)
• A means for segmenting index into two tiers
• Document-at-a-time scoring
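A sketch of the high/low traversal, under the assumption that `high` and `low` map each term to doc IDs and that a separate (hypothetical) `score_docs` helper ranks a candidate set.

```python
def search_high_low(query_terms, high, low, score_docs, k):
    """Traverse the 'high' (champion) lists first; fall back to the
    'low' lists only if they yield fewer than K candidate docs.

    high, low:  dict term -> list of doc_ids
    score_docs: hypothetical helper that scores a candidate set and
                returns (doc_id, score) pairs, best first
    """
    candidates = {d for t in query_terms for d in high.get(t, [])}
    if len(candidates) < k:              # not enough from the high tier
        candidates |= {d for t in query_terms for d in low.get(t, [])}
    return score_docs(candidates)[:k]
```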
Cluster Pruning
Preprocessing:
• Pick √N docs at random: call these leaders
• For every other doc, pre-compute its nearest leader
– Docs attached to a leader: its followers
– Likely: each leader has ~√N followers
Cluster Pruning
Query processing:
• Process a query as follows:
– Given query Q, find its nearest leader L.
– Seek the K nearest docs from among L’s followers.
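A sketch of both phases of cluster pruning, under the assumptions that documents are held as vectors in a dict and that `similarity` is a user-supplied cosine similarity function; ~√N leaders are sampled at random.

```python
import math
import random

def preprocess(docs, similarity):
    """Pick ~sqrt(N) random leaders and attach every other doc
    to its nearest leader (its 'followers')."""
    n = len(docs)
    leaders = random.sample(list(docs), int(math.sqrt(n)))
    followers = {leader: [] for leader in leaders}
    for doc_id, vec in docs.items():
        if doc_id in leaders:
            continue
        nearest = max(leaders, key=lambda lead: similarity(vec, docs[lead]))
        followers[nearest].append(doc_id)
    return leaders, followers

def query(q_vec, docs, leaders, followers, similarity, k):
    """Find the nearest leader, then rank only that leader's followers."""
    L = max(leaders, key=lambda lead: similarity(q_vec, docs[lead]))
    candidates = [L] + followers[L]
    return sorted(candidates,
                  key=lambda d: similarity(q_vec, docs[d]),
                  reverse=True)[:k]
```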
Visualization
[Figure: leaders and their followers in the cluster pruning scheme – omitted]
Parametric and Zone Indexes
• Thus far,a doc has been a sequence of terms
• In fact documents have multiple parts,some with special semantics:
– Author
– Title
– Date of publication
– Language
– Format
– etc.
• These constitute the metadata about a document
Fields
• We sometimes wish to search by this metadata
– E.g., find docs authored by William Shakespeare in the year 1601, containing alas poor Yorick
• Year = 1601 is an example of a field
• Also, author last name = shakespeare, etc.
• Field or parametric index: postings for each field value
– Sometimes build range trees (e.g., for dates)
• Field query typically treated as a conjunction
– (doc must be authored by shakespeare)
Zones
• A zone is a region of the doc that can contain an arbitrary amount of text, e.g.,
– Title
– Abstract
– References …
• Build inverted indexes on zones as well to permit querying
• E.g., “find docs with merchant in the title zone and matching the query gentle rain”
Example Zone Indexes
Encode zones in the dictionary vs. in the postings (see the sketch below).
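The figure contrasting the two encodings is not reproduced; below is an illustrative sketch (the doc IDs are made up) of zones kept in the dictionary versus zones kept in the postings.

```python
# Zones encoded in the dictionary: one postings list per (term, zone) pair.
dictionary_zone_index = {
    ("merchant", "title"): [2, 4],
    ("merchant", "body"):  [1, 2],
}

# Zones encoded in the postings: one list per term, with each posting
# carrying the set of zones in which the term occurs in that document.
postings_zone_index = {
    "merchant": [(1, {"body"}), (2, {"title", "body"}), (4, {"title"})],
}
```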
Tiered indexes
• Break postings up into a hierarchy of lists
– Most important
–…
– Least important
• Can be done by g(d) or another measure
• Inverted index thus broken up into tiers of decreasing
importance
• At query time use top tier unless it fails to yield K docs
– If so drop to lower tiers
Example Tiered Index
Query Term Proximity
• Free text queries: just a set of terms typed into the query box – common on the web
• Users prefer docs in which query terms occur within close proximity of each other
• Let w be the smallest window in a doc containing all query terms, e.g.,
– For the query strained mercy, what is the smallest window (in words) in the doc The quality of mercy is not strained?
• Would like the scoring function to take this into account – how? (A window-finding sketch follows.)
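A simple sketch computing the smallest window (in words) containing all query terms; run on the slide’s example it reports a window of 4 words. The quadratic sweep is only for illustration, not an efficient algorithm.

```python
def smallest_window(doc_tokens, query_terms):
    """Length of the smallest window of doc_tokens containing every
    query term at least once (float('inf') if some term is missing)."""
    query_terms = set(query_terms)
    best = float("inf")
    for start, tok in enumerate(doc_tokens):
        if tok not in query_terms:
            continue
        seen = set()
        for end in range(start, len(doc_tokens)):
            if doc_tokens[end] in query_terms:
                seen.add(doc_tokens[end])
            if seen == query_terms:
                best = min(best, end - start + 1)
                break
    return best

doc = "the quality of mercy is not strained".split()
print(smallest_window(doc, {"strained", "mercy"}))  # 4
```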
Impact-Ordered Postings
• We only want to compute scores for docs for which wf(t,d) (the weighted term frequency) is high enough
• Term-at-a-time scoring
• We sort each postings list by wf(t,d)
• Now: not all postings are in a common order!
• How do we compute scores in order to pick off the top K?
– Two ideas follow
1. Early Termination
• When traversing t’s postings, stop early after either
– a fixed number r of docs, or
– wf(t,d) drops below some threshold
• Take the union of the resulting sets of docs
– One from the postings of each query term
• Compute scores only for docs in this union (see the sketch below)
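A sketch of early termination over impact-ordered postings, assuming each postings list is already sorted by decreasing wf(t,d); both stopping rules from the slide are shown.

```python
def early_terminated_candidates(postings, query_terms, r=None, wf_threshold=None):
    """postings: dict term -> list of (doc_id, wf) sorted by decreasing wf.
    Walk each term's postings, stop after r docs or when wf falls below
    the threshold, and return the union of surviving doc IDs."""
    union = set()
    for term in query_terms:
        for i, (doc_id, wf) in enumerate(postings.get(term, [])):
            if r is not None and i >= r:
                break                      # fixed number of docs reached
            if wf_threshold is not None and wf < wf_threshold:
                break                      # weight dropped below threshold
            union.add(doc_id)
    return union
```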
2.idf-ordered terms
• When considering the postings of query terms
• Look at them in order of decreasing idf
– High idf terms likely to contribute most to score
• As we update score contribution from each query term
– Stop if doc scores relatively unchanged
• Can apply to cosine or some other net scores
Query Parsers
• A free text query from the user may in fact spawn one or more queries to the indexes, e.g., for the query rising interest rates:
– Run the query as a phrase query
– If <K docs contain the phrase rising interest rates, run the two 2-word phrase queries rising interest and interest rates
– If we still have <K docs, run the vector space query rising interest rates
– Rank matching docs by vector space scoring
• This sequence is issued by a query parser (see the sketch below)
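A sketch of the cascade a query parser might issue for rising interest rates; `phrase_search` and `vector_search` are hypothetical interfaces to the underlying indexes, each returning a ranked list of doc IDs.

```python
def parse_and_run(query, phrase_search, vector_search, k):
    """Cascade: full phrase, then the 2-word phrases, then vector space,
    stopping once K docs have been collected."""
    terms = query.split()                        # e.g. ["rising", "interest", "rates"]
    results = list(phrase_search(terms))         # the full phrase query
    if len(results) < k:
        for i in range(len(terms) - 1):          # e.g. "rising interest", "interest rates"
            results += phrase_search(terms[i:i + 2])
    if len(results) < k:
        results += vector_search(terms)          # fall back to vector space scoring
    return list(dict.fromkeys(results))[:k]      # de-duplicate, keep order, cut to K
```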
Aggregate scores
• We’ve seen that score functions can combine cosine, static quality, proximity, etc.
• How do we know the best combination?
• Some applications – expert-tuned
• Increasingly common: machine-learned
Putting it all together
Evaluation of Information Retrieval Systems
• How do we evaluate?
• Try a few queries on the engine and say “Not bad”?
• Is that enough?
• How fast does it index?
– Number of documents/hour
– Incremental indexing
• How fast does it search?
– Latency and CPU needs
• Expressiveness of the query language
• Does it recommend related products/links?
Measures for a search engine
• All of the preceding criteria are measurable
◦ We can make expressiveness precise
• The key measurement:User happiness
◦ What is this?
◦ Speed of response/size of index are factors
◦ But blindingly fast, useless answers won’t make a user happy
◦ Need a way of quantifying user happiness
Measuring user happiness
• Issue: who is the user we are trying to make happy?
– Depends on the setting
• Web engine:
– User finds what he/she wants and returns to the engine
• eCommerce site:
– User finds what he/she wants and makes a purchase
Happiness: Elusive to Measure
• Most common proxy: relevance of search results
• But how do you measure relevance?
• Three elements:
1. A benchmark document collection
– e.g., Cranfield, Reuters, TREC
2. A benchmark suite of queries
3. An assessment of either Relevant or Nonrelevant for each query and each document
• Relevance is assessed relative to an information need, not a query
Evaluating an IR System
• Note: the user need is translated into a query
• Relevance is assessed relative to the user need, not the query
• E.g., information need: My swimming pool bottom is becoming black and needs to be cleaned
• Query: pool cleaner
• Assess whether the doc addresses the underlying need, not whether it has these words
Binary Assessments
F Measure: A single measure that trades off precision versus recall
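For reference, the standard definitions behind these binary assessments (the slide’s contingency table is not reproduced): precision P is the fraction of retrieved docs that are relevant, recall R is the fraction of relevant docs that are retrieved, and the F measure combines them.

```latex
P = \frac{\#(\text{relevant items retrieved})}{\#(\text{retrieved items})}
\qquad
R = \frac{\#(\text{relevant items retrieved})}{\#(\text{relevant items})}
\qquad
F_{\beta} = \frac{(\beta^{2}+1)\,P\,R}{\beta^{2}P + R},
\quad
F_{1} = \frac{2PR}{P+R}
```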
Precision-Recall Curve
[Figure: precision (y-axis, 0.0 to 1.0) plotted against recall (x-axis, 0.0 to 1.0)]
Rank-Based Measures
Binary relevance:
– Precision@K (P@K)
– Mean Average Precision (MAP)
Precision@K (P@K)
• Set a rank threshold K
• Compute the % of relevant docs in the top K
• Ignores documents ranked lower than K
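A short sketch of P@K; `ranking` is a ranked list of doc IDs and `relevant` a set of judged-relevant IDs (the names are illustrative).

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-K ranked docs that are relevant."""
    return sum(1 for doc_id in ranking[:k] if doc_id in relevant) / k
```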
Mean Average Precision
• Consider the rank position of each relevant doc (K1, K2, …, Kr)
• Compute Precision@K for each of K1, K2, …, Kr
• Average Precision = the average of these P@K values
• MAP is the average of Average Precision across multiple queries/rankings
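A sketch of Average Precision and MAP following the slide’s recipe; by the usual convention (an assumption here), relevant docs that are never retrieved contribute a precision of 0.

```python
def average_precision(ranking, relevant):
    """Average of P@K taken at the rank K of each retrieved relevant doc.
    Relevant docs missing from the ranking contribute 0."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)       # P@K at this relevant doc
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevant_sets):
    """MAP: the mean of Average Precision over a set of queries."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevant_sets)]
    return sum(aps) / len(aps)
```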