PitchBook Data, Inc.

John Gabbert, Founder, CEO
Nizar Tarhuni, Vice President, Institutional Research and Editorial
Daniel Cook, CFA, Head of Quantitative Research
pbinstitutionalresearch@pitchbook.com

PitchBook is a Morningstar company providing the most comprehensive, most accurate, and hard-to-find data for professionals doing business in the private markets.

PitchBook VC Exit Predictor
Contents

Introduction
Model performance evaluation
Historical data
Inputs
Modeling
Scoring

Introduction

The PitchBook VC Exit Predictor leverages machine learning and our vast database of information about VC-backed companies, financing rounds, and investors to objectively assess a startup’s prospect of a successful exit. The primary component underpinning the score is a classification model that predicts the probability that a VC-backed startup will ultimately be acquired, go public, or not exit due to either failure or becoming self-sustaining. These probabilities are then used to calculate a naïve expected return of an investment in the startup’s next financing round using historical returns by series derived from capitalization table data. Finally, these expected returns are normalized across the VC universe by percentile ranking. The final score for each currently VC-backed company is a number from zero to 100,1 wherein a score of 100 represents the most attractive and zero the least attractive.

This document provides methodological details of the VC Exit Predictor, including performance evaluation, the data and inputs used to train the model, the validation process, and how companies are ultimately scored.2
1: A company must be VC-backed, have at least two VC financing rounds, have experienced a financing event in the past six years, and have not
undergone an exit event to be eligible for scoring in the PitchBook Platform.
2: Data and performance metrics were generated on January 5, 2023.
Model performance evaluation
The model achieved an accuracy rate of 67.8% on the test data. When the merger
and public listing classes were combined into a single “success” category to create
a binary classification problem, accuracy improved to 73.6%. While accuracy is
easy to interpret, it can be misleading and should be viewed in the context of the outcome distribution. A good way to do this is to examine the confusion matrix, which in this case is a 3x3 matrix whose rows represent the predicted outcome and whose columns represent the true outcome. The values are normalized by the total sample size.
Normalized confusion matrix
(rows: predicted outcome; columns: true outcome)

                  No exit    Merger    Public listing    Total
No exit           24.2%      10.9%     0.6%              35.6%
Merger            14.4%      40.6%     4.0%              59.0%
Public listing    0.5%       1.8%      3.0%              5.4%
Total             39.1%      53.3%     7.6%              100.0%

Source: PitchBook | Geography: Global
Entries along the diagonal are correct predictions, whereas off-diagonal entries are
different types of errors. For example, the entry in the first row and second column (10.9%) is the percentage of observations wherein the model predicted no exit but the actual outcome was a merger. Two summary metrics related to the confusion
matrix that provide additional perspective are precision and recall. Precision is the
accuracy given the model predicted a certain outcome, and recall is the percentage
of observations of a specific outcome that the model correctly identified.
Precision and recall by outcome

                  Precision    Recall    Class %
No exit           67.8%        61.8%     39.1%
Merger            68.9%        76.2%     53.3%
Public listing    56.4%        40.0%     7.6%

Source: PitchBook | Geography: Global
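For readers who want to reproduce these figures, a minimal sketch in Python that derives precision and recall directly from the normalized confusion matrix above (small differences from the table are due to rounding of the matrix entries):

```python
import numpy as np

# Rows = predicted outcome, columns = true outcome (values in % of all test observations).
classes = ["No exit", "Merger", "Public listing"]
cm = np.array([
    [24.2, 10.9, 0.6],   # predicted: no exit
    [14.4, 40.6, 4.0],   # predicted: merger
    [0.5,  1.8,  3.0],   # predicted: public listing
])

for i, name in enumerate(classes):
    precision = cm[i, i] / cm[i, :].sum()   # correct / everything predicted as this class
    recall = cm[i, i] / cm[:, i].sum()      # correct / all true members of this class
    print(f"{name:15s} precision={precision:.1%} recall={recall:.1%}")
```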
Precision and recall offer insights into the model’s strengths and weaknesses.
Similar precision metrics across the three classes indicate that the model is equally
good irrespective of the predicted class. The recall metrics show more variation.
Merger is the easiest class to identify, while public listing is the hardest. This is
often the case for classes with low representation, and public listing recall should be
viewed in the context that less than 8% of the observations are public listings.
Due to differences in outcome distributions, evaluating the model by VC deal number also reveals interesting insights into its performance. From a three-class perspective, the model has similar accuracy across companies of different maturities. Binary accuracy improves as VC deal number increases, but this is driven by a change in the underlying distribution of outcomes. When viewed relative to the percentage of successful exits, binary accuracy is strongest at earlier VC deal numbers, where the unconditional probability of success is closer to a 50/50 proposition.
Model performance by VC deal number

VC deal number            2        3        4        5        6+
Data count                5,972    3,154    1,733    960      1,175
Accuracy (three-class)    68.4%    67.7%    68.0%    67.5%    65.2%
Accuracy (binary)         71.5%    73.4%    75.5%    77.5%    79.1%
Successful exit %         52.1%    63.5%    68.5%    73.4%    77.5%

Source: PitchBook | Geography: Global
Precision and recall by class and VC deal number

[Chart: precision and recall by VC deal number (2, 3, 4, 5, and 6+), shown separately for the no exit, merger, and public listing classes]

Source: PitchBook | Geography: Global
The charts above provide further detail on performance by showing how precision
and recall change for each class as VC deal number increases. Two main conclusions
can be drawn: First, performance on the no exit class declines as VC deal number increases; and second, performance on the public listing class improves. This is an unsurprising
result—at early stages, it is difficult to determine if a company will go public many
years into the future, while at later stages, it becomes rare for a company to fail after
it has received significant VC investment.
A potential drawback of the model evaluation discussed thus far is that the data was not separated by time. While we excluded any forward-looking information from the features for an individual company, the training data contained observations that, from the perspective of some observations in the test data, had not yet occurred. Therefore, the predictions made for the test data are not a true backtest—that is, the model output could not have been replicated on the prediction date.
Setting up a backtest for this analysis is challenging because it requires balancing having enough data to train the model against having enough to evaluate it. We need to go back far enough that the companies for which predictions were made have had a chance to mature and exit. However, if the backtest date is too early, there will not be a large enough sample of VC-backed companies with a known outcome to adequately train the model. This is especially challenging due to the exponential growth in VC activity—most observations have come within the last three and a half years. With this trade-off in mind, we selected December 31, 2018, as the backtest date, which led
to approximately 32,000 observations to train the model and 13,000 observations
to evaluate its performance. The model had a three-class accuracy of 72.6% and
a binary accuracy of 76.6%. The normalized confusion matrix summarizing the
performance is shown below.
Normalized confusion matrix for model backtest
(rows: predicted outcome; columns: true outcome)

                  No exit    Merger    Public listing    Total
No exit           41.6%      6.6%      0.2%              48.3%
Merger            15.2%      27.8%     1.0%              43.9%
Public listing    1.4%       3.1%      3.2%              7.7%
Total             58.2%      37.5%     4.3%              100.0%

Source: PitchBook | Geography: Global
The model performed particularly well on the no exit class, with precision and recall
of 86.1% and 71.4%, respectively. Relative to the prior results, the model performed
worse on the merger and public listing classes in terms of precision, but better
in terms of recall. The backtest performance is not without its caveats, however. These caveats arise because not all outcomes have been realized: only a fraction of the companies for which the model made predictions have a known exit at the time of this writing. Of the more than 30,000 companies that were eligible for
prediction on the date of the backtest, around 40% have a known exit. Because the
set of companies with a known exit inherently depends on time, it is not a random
sample and is thus subject to bias. We found that companies that were predicted to
fail had a higher likelihood of having a known outcome.
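A minimal sketch of the time-based split described above, assuming a hypothetical pandas DataFrame with one row per observation and illustrative column names (prediction_date, and outcome_known_date marking when the outcome was realized):

```python
import pandas as pd

def backtest_split(obs: pd.DataFrame, backtest_date: str = "2018-12-31"):
    """Split observations for a point-in-time backtest on the given date."""
    cutoff = pd.Timestamp(backtest_date)
    # Train only on outcomes that were already realized on the backtest date,
    # so nothing forward-looking leaks into model training.
    train = obs[obs["outcome_known_date"] <= cutoff]
    # Evaluate on observations that were still unresolved on the backtest date
    # but whose outcome has been realized since then.
    test = obs[(obs["prediction_date"] <= cutoff) & (obs["outcome_known_date"] > cutoff)]
    return train, test
```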
Historical data
Individual data observations used to train and evaluate the model are associated with
VC financing rounds, while inclusion is established at the company level. We included
companies that had raised at least two rounds of VC financing (including angel and
seed rounds) and are no longer VC-backed, which means they have undergone a merger or public listing, filed for bankruptcy, ceased business operations, or become self-sustaining. Because many startup failures are undisclosed, a company that has not received a VC financing round in more than six years was deemed to have failed or become self-sustaining; this six-year threshold was determined by analyzing the empirical distribution of time between VC rounds. The inclusion criteria resulted in over 64,000
observations from 31,000 distinct companies in the final dataset. The table and plot
below provide additional detail on the data in terms of the outcome distribution.
Data distribution by outcome

                  Data count    Overall %
No exit           25,523        39.5%
Merger            33,987        52.6%
Public listing    5,131         7.9%
Total             64,641        100.0%

Source: PitchBook | Geography: Global
Data distribution (thousands) by VC deal number and outcome

[Chart: number of observations (thousands) by VC deal number (2, 3, 4, 5, and 6+), broken out by outcome: no exit, merger, and public listing]

Source: PitchBook | Geography: Global
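A minimal sketch of the outcome-labeling rule described above; the column names are illustrative rather than PitchBook's actual schema, and the six-year threshold is the one stated in the text:

```python
import pandas as pd

SELF_SUSTAINING_YEARS = 6  # inactivity threshold described above

def label_outcome(company: pd.Series, as_of: pd.Timestamp):
    """Return the outcome label for a company, or None if it is still VC-backed."""
    if pd.notna(company["public_listing_date"]):
        return "public_listing"
    if pd.notna(company["merger_date"]):
        return "merger"
    years_inactive = (as_of - company["last_vc_round_date"]).days / 365.25
    if company["out_of_business"] or years_inactive > SELF_SUSTAINING_YEARS:
        return "no_exit"  # failed, or deemed self-sustaining after six quiet years
    return None  # still VC-backed, so excluded from the training data
```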
Theoretically, model inputs could be generated daily for a company between
its second VC financing date and exit date. Not only is this unreasonable from
a computational perspective, but it would also result in highly correlated
observations, given that many of the features would not change from one day to the
next. Significant feature updates mainly occur after a financing event. Therefore,
we generated one observation per VC financing round for each qualifying company.3
The prediction date for each observation was determined by randomly sampling from
a uniform distribution in the interval from the close date of the current round to the
close date of the next event (subsequent VC round or exit). The prediction date for
each observation dictates what information is included in the input—only data that
was known at the time of the prediction is allowed in order to avoid look-ahead bias.
Randomly sampling the prediction date, as opposed to using the close date of the
current round, enables the model to learn how time affects outcomes. For example, a company that raised its last round one year ago has a better chance of successfully exiting than one that has not raised a round in four years, all else equal. In addition, this matches the structure of the data that the model will be used on for inference (currently eligible VC-backed companies), wherein the time from the last close date will differ across companies.
3: The data frequency of observations used for model training and evaluation differs from that used for model inference. The outcome probabilities and scores shown on the PitchBook Platform are updated daily.
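A minimal sketch of the prediction-date sampling described above (the function and its arguments are illustrative):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

def sample_prediction_date(round_close: pd.Timestamp, next_event_close: pd.Timestamp) -> pd.Timestamp:
    """Draw a prediction date uniformly between the close of the current round
    and the close of the next event (subsequent VC round or exit)."""
    span_days = (next_event_close - round_close).days
    offset = rng.uniform(0, span_days)
    return round_close + pd.Timedelta(days=float(offset))

# Example: a round closed 2020-03-01 and the next round closed 2021-06-15.
print(sample_prediction_date(pd.Timestamp("2020-03-01"), pd.Timestamp("2021-06-15")))
```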
Inputs
The inputs, or features, to the machine learning classification model were compiled
from the extensive amount of information on each startup’s PitchBook profile. In total,
each observation has 34 features, which can be categorized into three main groups:
company, financing, and investors.
Company-level inputs can be further broken down into static and point-in-time
information. Static features are basic, unchanging descriptive data points about a
company, including industry/vertical, geographic location, and number of founders.
Point-in-time features, on the other hand, are company attributes whose value
depends on when a prediction is made. This is a broad category, ranging from stage of business (for example, product development, generating revenue, or profitable) to patents. Additional point-in-time inputs include number
of employees, company age, acquisitions, and related news articles. Inputs in the
financing category comprise data from current and past financing rounds with a focus
on VC deals. Key variables consist of VC round number, stock type, close date, and
deal size.4
Engineering features from investor data, particularly data related to individual investor entities, presents a challenge due to its high dimensionality. This analysis contains nearly 10,000 distinct investors, and only a small fraction invest in each company.
Rather than treat investors as a sparse and high-dimensional categorical feature, we
developed a method to rank investors based on their importance and experience
within the VC universe. This ranking method relies on the well-known hypothesis that
influential VC investors frequently work together by investing in the same companies;
this is often compared to an exclusive social club. To capture this dynamic, we
model VC investors as a social network wherein two entities are connected if they
have invested in the same company. The connections, or edges, are then weighted
by the number of distinct co-invested companies between pairs of investors. To
quantify the idea that investors should be highly ranked if they have both VC investing
experience and are connected with other experienced VC investors, we calculated
the eigenvector centrality of the investor network.5 In addition to the investor ranking,
other inputs in this category include average capital invested per distinct investor,
investor counts, counts by type other than VC (such as CVC), follow-on counts,
frequency of a lead investor, and geographic location.
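A minimal sketch of this investor ranking using the networkx library; the co-investment data here is made up for illustration, but the edge weighting and eigenvector centrality follow the description above:

```python
from collections import Counter
from itertools import combinations

import networkx as nx

# Illustrative data: company -> set of investors that participated in its rounds.
company_investors = {
    "co_a": {"fund_1", "fund_2", "fund_3"},
    "co_b": {"fund_1", "fund_2"},
    "co_c": {"fund_2", "fund_4"},
}

# Edge weight = number of distinct companies a pair of investors has co-invested in.
edge_weights = Counter()
for investors in company_investors.values():
    for pair in combinations(sorted(investors), 2):
        edge_weights[pair] += 1

G = nx.Graph()
for (u, v), w in edge_weights.items():
    G.add_edge(u, v, weight=w)

# Eigenvector centrality rewards being connected to other well-connected investors.
ranking = nx.eigenvector_centrality(G, weight="weight")
print(sorted(ranking.items(), key=lambda kv: -kv[1]))
```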
4: Due to better data coverage, deal size is used as a proxy for valuation.
5: The concept of eigenvector centrality was famously used in Google’s PageRank algorithm. For more information on network centrality, see: “Network Centrality: An Introduction,” arXiv, Francisco Aparecido Rodrigues, January 22, 2019.

Modeling

The first step in the modeling process is to split the data into training and test sets so that the model is trained and evaluated on mutually exclusive samples. Extra care
needs to be given to this step because there can be multiple observations with
the same outcome for the same company, which can lead to information leakage
between the training and test data.6 To avoid this pitfall, we partitioned the data
at the company level such that all of a company’s observations were either in the
training or test set. Therefore, when the model made predictions on the test set, it
had no prior information on the outcomes of the companies. We performed stratified
random sampling to assign each company to the training or test set with an 80/20
split, resulting in around 48,000 and 12,000 observations in the training and test
sets, respectively.
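A minimal sketch of a company-level, stratified 80/20 split along these lines, assuming a DataFrame with company_id and outcome columns (one possible implementation, not necessarily the exact procedure used):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def company_level_split(obs: pd.DataFrame, test_size: float = 0.2, seed: int = 0):
    """Assign whole companies to train or test, stratified by company outcome."""
    # One row per company; every observation of a company shares the same outcome.
    companies = obs.groupby("company_id")["outcome"].first()
    train_ids, test_ids = train_test_split(
        companies.index,
        test_size=test_size,
        stratify=companies.values,
        random_state=seed,
    )
    return obs[obs["company_id"].isin(train_ids)], obs[obs["company_id"].isin(test_ids)]
```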
The presence of unbalanced outcomes is another aspect of the modeling process that
deserves attention. Unbalanced class distributions in supervised machine learning
classification can cause the model to overemphasize the majority class during
training, thus potentially leading to biased predictions in favor of the majority class.
The class distribution in this case is imbalanced in favor of mergers and failures,
while public listings make up less than 10% of the data. Startups that go public are
often the most lucrative for investors and, therefore, are important for the model
to perform well on. To mitigate the impact of unbalanced classes, we implemented
an oversampling method known as the Synthetic Minority Oversampling Technique
(SMOTE),7 which creates synthetic observations of the minority class(es). Synthetic
observations are created by randomly sampling along the segments connecting each minority-class observation with its k nearest neighbors, wherein k is a hyperparameter. The synthetic observations are therefore logical
perturbations from the original data. These observations are strictly used during
model training and are not considered during evaluation. The oversampling process
effectively gives each class equal weight in the loss function during training.
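A minimal sketch of the oversampling step using the imbalanced-learn implementation of SMOTE, applied to training data only; the feature matrix here is synthetic placeholder data:

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Placeholder imbalanced training data standing in for the real feature matrix.
X_train, y_train = make_classification(
    n_samples=2000, n_classes=3, n_informative=6,
    weights=[0.39, 0.53, 0.08], random_state=0,
)

# k_neighbors is the SMOTE hyperparameter k described above.
smote = SMOTE(k_neighbors=5, random_state=0)
X_res, y_res = smote.fit_resample(X_train, y_train)
print(Counter(y_train), Counter(y_res))  # classes are balanced after resampling
```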
The specific algorithm we employed is known as XGBoost,8 a gradient-boosted
classification tree model. Since its introduction, this algorithm has produced state-
of-the-art performance on many traditional machine learning tasks with two-
dimensional feature inputs. For this task, it outperformed a multinomial linear model,
a multilayer perceptron (MLP) neural network, and a recurrent neural network with
long short-term memory (LSTM) layers wherein financing rounds were treated as
sequences. In addition, we chose XGBoost due to its flexibility in handling outliers and
missing data, ability to represent complex nonlinear functions, robustness to data
preprocessing, and fast training times.
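A minimal sketch of fitting such a model with the xgboost package; the hyperparameter values are illustrative placeholders, not the tuned values used in production:

```python
from xgboost import XGBClassifier

def fit_exit_classifier(X_train, y_train):
    """Fit a three-class gradient-boosted tree model with illustrative hyperparameters."""
    model = XGBClassifier(
        objective="multi:softprob",   # emit a probability for each outcome class
        n_estimators=300,
        max_depth=6,
        learning_rate=0.05,
        eval_metric="mlogloss",
    )
    model.fit(X_train, y_train)
    return model

# Usage with the resampled data from the SMOTE sketch above:
# model = fit_exit_classifier(X_res, y_res)
# probabilities = model.predict_proba(X_test)  # columns: no exit, merger, public listing
```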
The model’s hyperparameters were tuned using five-fold cross-validation with both
grid search and Bayesian optimization.9 Cross-validation is a data-splitting process
used to select the “best” set of hyperparameters wherein the training data is split
multiple times—in this case, five—to create additional validation sets. Each fold is
used as a validation set once, while all other folds are combined to train the model.
Just like the training and test sets, observations were assigned to each fold at the
company level to avoid information leakage.
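A minimal sketch of company-level cross-validation using scikit-learn's GroupKFold, here combined with a simple grid search (the parameter grid is a placeholder, and the actual tuning also used Bayesian optimization):

```python
from sklearn.model_selection import GridSearchCV, GroupKFold
from xgboost import XGBClassifier

# Passing company identifiers as `groups` ensures a company never appears in both
# a training fold and its corresponding validation fold.
cv = GroupKFold(n_splits=5)
param_grid = {"max_depth": [4, 6, 8], "learning_rate": [0.03, 0.1]}  # illustrative grid

search = GridSearchCV(
    XGBClassifier(objective="multi:softprob", eval_metric="mlogloss"),
    param_grid,
    cv=cv,
    scoring="accuracy",
)
# search.fit(X_train, y_train, groups=company_ids)  # groups are routed to GroupKFold
# best_params = search.best_params_
```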
6: Information leakage occurs when the test set contains information from the training set that can cause overfitting and optimistic performance
evaluation on the test set. This particular form of information leakage is known as “identity confounding” because the model learns identities (that is,
companies) as well as features.
7: “SMOTE: Synthetic Minority Over-Sampling Technique,” Journal of Artificial Intelligence Research, N. V. Chawla, et al., June 1, 2002.
8: “XGBoost: A Scalable Tree Boosting System,” arXiv, Tianqi Chen and Carlos Guestrin, June 10, 2016.
9: Hyperparameters are components of the model that must be specified before training and cannot be learned directly.
Scoring
The scoring process maps the outcome probabilities from the classification model
to a naïve expected return from the perspective of an investor in a company’s next
VC financing round based on historical returns by series derived from capitalization
table information. Scoring provides two main benefits: First, it creates a single
value for each company, which is necessary for the final rankings; and second, it
quantifies the benefit of investing early in successful startups and/or exiting them
via the public markets.
The expected return for an individual startup is a weighted geometric average of
the historical returns based on the upcoming series of its next VC financing round.
For example, if a startup had last raised a Series B, the relevant returns would be for
Series C investments. The weights are taken as the probability of each exit outcome
from the classification model. The tables below show average annualized startup
returns by series and type as well as the average holding period.10 For simplicity, we
assume that a failure results in a total loss at all stages.
Average return by series

            Merger    Public listing
Series A    36.7%     47.8%
Series B    31.0%     37.9%
Series C    28.0%     34.4%
Series D+   20.0%     30.0%

Source: PitchBook | Geography: Global

Average holding periods (years)

            Merger    Public listing
Series A    5.34      5.89
Series B    4.67      4.34
Series C    4.40      3.74
Series D+   3.69      3.04

Source: PitchBook | Geography: Global
For example, consider a startup that recently raised a Series B with no exit, merger,
and public listing exit probabilities of 50%, 30%, and 20%, respectively. The
annualized geometric expected return would be calculated as follows:
r = (1.28^4.40 × 0.3 + 1.344^3.74 × 0.2)^(1 / (4.40 × 0.6 + 3.74 × 0.4)) − 1 = 10.0%

The exponent applies the holding period averaged over the success outcomes, with weights 0.3 / 0.5 = 0.6 for merger and 0.2 / 0.5 = 0.4 for public listing; failure is assumed to be a total loss and contributes nothing to the expected multiple.
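The same worked example in Python, using the Series C figures from the tables above (the output differs from 10.0% only by rounding of the inputs):

```python
# Outcome probabilities from the classification model.
p_no_exit, p_merger, p_ipo = 0.5, 0.3, 0.2
# Series C historical annualized returns and holding periods from the tables above.
r_merger, r_ipo = 0.28, 0.344
t_merger, t_ipo = 4.40, 3.74

# Expected terminal multiple: a failure is assumed to be a total loss (multiple of zero),
# while successful exits compound at their historical return over their holding period.
expected_multiple = (
    0.0 * p_no_exit
    + (1 + r_merger) ** t_merger * p_merger
    + (1 + r_ipo) ** t_ipo * p_ipo
)

# Holding period averaged over the success scenarios only (0.3/0.5 = 0.6 and 0.2/0.5 = 0.4).
avg_holding = t_merger * p_merger / (p_merger + p_ipo) + t_ipo * p_ipo / (p_merger + p_ipo)

annualized = expected_multiple ** (1 / avg_holding) - 1
print(f"{annualized:.1%}")  # prints 10.2%, i.e. roughly the 10.0% shown above
```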
Finally, the return figures are normalized as a percentile ranking across all eligible
VC-backed companies. A percentile ranking of 100 represents the most attractive
company, while a ranking of 0 represents the least attractive.
10: Holding periods of less than one year are not annualized. In addition, outlier returns of more than 350% are excluded from the average calculations.
COPYRIGHT © 2023 by PitchBook Data, Inc. All rights reserved. No part of this publication may be reproduced in
any form or by any means—graphic, electronic, or mechanical, including photocopying, recording, taping, and
information storage and retrieval systems—without the express written permission of PitchBook Data, Inc. Contents
are based on information from sources believed to be reliable, but accuracy and completeness cannot be guaranteed.
Nothing herein should be construed as investment advice, a past, current or future recommendation to buy or sell
any security or an offer to sell, or a solicitation of an offer to buy any security. This material does not purport to
contain all of the information that a prospective investor may wish to consider and is not to be relied upon as such or
used in substitution for the exercise of independent judgment.