Credit Risk - Predictive Modelling

This document provides an overview of a seminar on predictive modelling in credit risk. The 3-day seminar covers topics such as credit risk, market risk, climate risk, predictive modelling, scorecard development, and model assessment. Participants will complete a case study assignment to build a probability of default scorecard using logistic regression on a dataset of 50,000 US mortgage loans. The goal is to create a model with sufficient discriminatory power following good modelling practices. Resources and support are provided to help participants complete the assignment and presentation.

Uploaded by

Aldo Leal

Credit Risk – Predictive Modelling
4EK614

09 Feb 2022
With You Today

Jan Nusko
Senior Consultant in Credit Risk Team
Jan.Nusko@cz.ey.com

Credit Risk Team

Our services
Risk Parameters, Impairment Loss, AQR, Regulatory

Our projects
Model Development Data Analysis Stress Testing
Model Validation Data Mining Impairment

Methodological Reviews LIC(R), EPIC, CFE Regulatory Reporting

Asset Quality Reviews Advanced Analytics Business Intelligence

Page 2 Credit Risk – Predictive Modelling

About This Seminar

Course Structure
Day 1: Credit Risk, Market Risk, Climate Risk
Day 2: Predictive Modelling in Credit Risk
Day 3: Assignment + Seminar Data Walkthrough
Time: 9:15 – 15:15

Study materials
1. PowerPoint slides, provided after the course

Prerequisites
1. Basic understanding of statistical and mathematical concepts
2. Elementary knowledge of programming (Python, R, …)

Course Assessment

1. Case study – Credit Risk - Preparation of PD scorecard:

a) Prepare development sample from portfolio of mortgage loans
b) Model scorecard using logistic regression (or any technique you want!) and include assessment

2. Outputs – PPT presentation or PDF, summarizing the abovementioned outputs, and scripts used.

3. Output presentation – Short (10-15 minute) presentation about results of this assessment.


Coursework

Goal

• Your task is to build a PD scorecard using the provided data. The goal is to create a model that will predict a probability of default for each mortgage.
• The presentation contains an overview of a proposed modelling process and some points to consider when developing and assessing the model.
• You will be assessed on the “good modelling practice” you employ. Remember, the best model is not necessarily the one with the highest performance metric. Your goal should be to build a scorecard with enough discriminatory power, but the steps taken during the modelling process are most important.

Resources

• Mortgage_sample.csv: Modelling dataset with data about 50000 US mortgages
• Mortagage_metadata.xlsx: Data dictionary
• Package suggestions:
   • Python – scorecardpy
   • R – scorecard
• jan.nusko@cz.ey.com is available for a 30 min consultation, please send him an email if interested.


Agenda Day 1

1. Intro & Admin 09:15-09:45
2. Banking & Credit Risk 09:45-10:30
3. Underwriting 10:40-11:15
4. Scoring 11:15-11:50
5. Lunch 11:50-13:00
6. Market Risk 13:00-14:30
7. ESG & Taxonomy 14:35-15:30

Operative
Don’t hesitate to ask or comment at any point
We recommend teams for case study
Menti.com – 4563 7579


Agenda Day 2

1. Predictive Modelling 09:15-10:00
2. Scorecard Development 10:10-11:00
4. Model Assessment 11:10-12:00
6. Q&A 12:00-12:30

Operative
Don’t hesitate to ask or comment at any point
We recommend teams for case study
Menti.com – 4563 7579


Agenda Day 3

1. Assignment Intro 09:15-10:00
2. Data Walkthrough 10:10-10:50
3. Assignment Walkthrough 11:00-12:00
5. Lunch 12:00-13:00

Operative
Don’t hesitate to ask or comment at any point
We recommend teams for case study


Banks


Balance sheet and off-balance sheet of a bank

Balance sheet

Assets                      Liabilities
Cash                        Deposits from customers
Deposits at central bank    Loans from other banks
Loans to customers          Securities
Loans to other banks        Hybrid instruments
Securities                  Other liabilities
Other assets                Equity

Off-balance sheet

Assets                            Liabilities
Undrawn limits of credit lines    Undrawn limits of credit lines
Loan commitments                  Guarantees received
Guarantees given                  Derivatives
Derivatives


What is credit risk?

• The risk that a counterparty fails to meet a contractual obligation

Banking book
▪ Retail: mortgages, credit cards
▪ Corporate: investment property financing, project financing, large corporate lending
▪ Wholesale: lending to banks & sovereigns

Trading book
▪ Counterparty credit risk (CCR): whenever a trade is settled in the future and/or is not “delivery versus payment” (DvP), a firm takes on credit risk

Insurance
▪ Reinsurer default
▪ Corporate bond / ABS default / CDS
▪ Derivative counterparties

Other
▪ Intermediary: default on commissions receivable
▪ Accounts receivable: non-payment of invoice


Components of credit risk

▪ PD – Probability of Default: The likelihood that the borrower will default on its obligation, either over the life of the obligation or over a specified horizon (typically one year).

▪ LGD – Loss Given Default: The loss that the lender would incur in the event of the borrower’s default. It is the exposure that cannot be recovered through bankruptcy proceedings, collateral recovery or some other form of settlement. Usually expressed as a percentage of exposure at default.

▪ EAD – Exposure at Default: The exposure that the borrower would have at default. Takes into account both on-balance sheet (capital) and off-balance sheet (unused lines, derivatives or repo transactions) exposures and the payment schedule.

Expected Credit Loss (ECL) = PD x LGD x EAD
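
The formula above can be illustrated with a minimal sketch. The function name and the loan figures are hypothetical and chosen only for illustration; PD and LGD are assumed to be expressed as fractions between 0 and 1.

```python
def expected_credit_loss(prob_default, loss_given_default, exposure_at_default):
    """ECL = PD x LGD x EAD, with PD and LGD given as fractions in [0, 1]."""
    if not (0.0 <= prob_default <= 1.0 and 0.0 <= loss_given_default <= 1.0):
        raise ValueError("PD and LGD must lie between 0 and 1")
    return prob_default * loss_given_default * exposure_at_default


# Hypothetical mortgage: PD of 2%, LGD of 45%, exposure of 100,000
ecl = expected_credit_loss(0.02, 0.45, 100_000)
print(round(ecl, 2))
```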


Credit risk agenda

Governance
► Risk management function reshaping roadmap
► Credit risk strategy and linkage to business strategy
► Risk appetite framework and statements
► Credit risk processes and segregation of duties
► Model governance framework (model request, design, implementation, validation)
► Stress testing framework

Collection services
► Diagnostics on the effectiveness & efficiency of the collections process
► Development of a collections strategy, strategic and tactical (cost-benefit) analysis of available outsourcing options
► Design of a collections framework
► Support with collections technology requirements analysis, selection and implementation of an appropriate solution

Application process, Performing portfolio, Non-performing portfolio

Application scoring
► Business model request specification
► Application scorecard design and validation
► Design and review of the application processes
► Support with application workflow technology

Rating models
► Model design / validation / internal audit reviews
► Regulatory compliance
► PD estimation
► Model usage for business purposes
► Proprietary IT tools

Provisioning
► Design of impairment methodology in line with IFRS
► Effective interest rate and recognition of fees and commissions
► Back-testing analyses
► Collateral valuation scenarios

LGD models
► LGD estimates design and validation
► LGD (scoring) models design and validation
► LGD data warehouse specification


Underwriting process

• The underwriting (UW) process is the processing of a credit application and the decision on its final approval or decline.

• Generally, the UW process can end in several different states: approval, decline, cancellation from the client side, or non-eligibility (for example, the applicant does not meet the minimum age criterion).

Process steps: Product parameters selection → Personal/financial information → Client identification verification → Scoring and overdebtedness → Decision / Contract signature


Underwriting process

• Client segment is a crucial parameter for the UW process and scoring

Private individuals
• Usually automated process
• Scoring applications in order to assess riskiness of newly issued loans/credits
• Scoring client behavior on monthly basis on credit and deposit products
• Large data sets → statistical approach
• Need to verify income and over-indebtedness
• Credit registers (BRKI, NRKI, Solus)

Entrepreneurs, Freelancers
• Usually automated process with possibly manual inputs
• Scoring applications in order to assess riskiness of newly issued loans/credits
• Scoring client behavior on monthly basis
• Large data sets → statistical approach
• No need to verify income and over-indebtedness
• Credit registers

Small business
• Partially automated process, but mostly manual assessment
• Scoring applications for automated products
• Process for manual yearly rating (typically financial scoring, qualitative scoring and behavioral scoring)
• Sufficient data sets for statistical approach
• Credit registers (CRÚ, Cribis, Bisnode, etc.)

Corporates
• Typically manual assessment on yearly basis (rating process using financial, qualitative and behavioral scoring)
• Sometimes not sufficient data to use statistical approach – especially in case of project financing
• Industry dependent and seasonal
• Credit registers (CRÚ, Cribis, Bisnode, etc.)


Underwriting process

• The underwriting process differs significantly between products

Mortgage
• Financing housing needs
• Subject to consumer protection
• Requires real estate collateral and insurance
• Large financed amount
• Typically longer maturity
• More thorough and detailed UW process
• Partially manual assessment
• Loan to value condition
• Lower interest rates
• Fixation periods
• Co-applicants possible

Consumer loan
• Purpose or non-purpose
• Subject to consumer protection
• Can have collateral or guarantors, but usually does not
• Automated, easy and fast UW process
• Higher interest rates
• Co-applicants possible, but not as frequent as for mortgages
• Medium financed amount
• Medium maturity
• Medium risk

Credit card, Overdraft and Revolving
• Credit limit that can be utilized, but does not have to be
• Client can flexibly utilize whatever part of the limit he needs
• Grace period
• High interest rates
• Typically no collateral
• Lower financed amount
• Maturity is not specified (contract terminates on request when fully repaid)
• High risk
• Credit cards come with plastic card

Investment loan
• Typical financing for corporate and small business segments, but also for entrepreneurs
• Processed manually
• Very high financed amount
• Based on business and financial plan
• Usually with collateral and guarantees


Underwriting process

• First step in the process is the assessment of client general eligibility

• Is the client over 18 years old?
• Is the client eligible to sign contracts?
• Is the client on the international sanction list?
• Is the client a politically exposed person?
• Does the client have a tax domicile in the same country?
• Does the client agree with all the legally required actions (credit bureau request, information protection principles,
general terms and conditions, pre-contractual information, etc.)?

• Second step is the assessment of client eligibility for the given product and channel
• Is the client below prescribed age when applying for a long term product such as mortgage?
• Does the client have eligible income for the particular product and process?
• Does the client have all prescribed documents (valid ID card and valid second ID document)?
• Is the collateral for the issued loan eligible and sufficient (LTV threshold)?


Underwriting process

• There are several laws and directives that affect the underwriting process:
   • Law on consumer loan
   • Consumer protection
   • Mortgage credit directive (MCD)
   • Consumer credit directive (CCD)
   • EBA guidelines
   • Basel Capital Accord
   • Anti-money laundering (AML)

• What they aim to protect:
   • The consumer needs to be protected from dishonest and malicious practices, including intentional as well as non-intentional over-indebting; the responsibility for not over-indebting the client is now on the lender
   • The market and the economy need to be protected against adverse economic impacts originating in the financial system
   • Society needs to be protected against criminal acts and terrorism
   • The consumer needs to be protected against losing money deposited in a bank through the bank's irresponsible lending to its clients


Underwriting process

• Client authentication
   • Expiry date check – ID not expired
   • Check on validity in MPSV database
• Anti-fraud module
   • Issue date consistency check (based on linear regression)
   • Check on issue date – not weekend or public holiday
   • Check on address at MěÚ or OÚ
   • Control on ID manipulation (color histogram, fonts)
   • Check on consistency of bar-code and ID number
   • Consistency of sex and birth number (third digit)
   • Birth number divisible by 11 (for persons born after 1953)
   • Overall control number check
   • Expiry date control number check
   • Birth date control number check
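
Two of the checks listed above, the sex consistency check and the divisibility by 11, can be sketched as follows. This is a simplified illustration that assumes a ten-digit Czech birth number in the form YYMMDD/XXXX, where the month is increased by 50 for women. It ignores older nine-digit numbers and other special cases, and the function name and sample numbers are synthetic.

```python
def check_birth_number(birth_number, declared_sex):
    """Simplified consistency check of a ten-digit birth number.

    Assumptions: the whole number must be divisible by 11, and a month
    field above 50 (third digit 5 or higher) encodes a female applicant.
    """
    digits = birth_number.replace("/", "")
    if len(digits) != 10 or not digits.isdigit():
        return False
    if int(digits) % 11 != 0:
        return False
    month_field = int(digits[2:4])
    encoded_sex = "F" if month_field > 50 else "M"
    return encoded_sex == declared_sex


# Synthetic numbers constructed only to satisfy the divisibility rule
print(check_birth_number("850101/0001", "M"))
print(check_birth_number("855101/0006", "F"))
```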


Underwriting process

• Internal blacklists on phone numbers, ID cards, IČO of employers, ready-made companies

• Frequency checks in the on-line underwriting process (applications are tracked with respect to different identifiers and their combinations):
• Device fingerprint (publicly available libraries)
   • Hardware: CPU architecture & device memory, GPU canvas, audio stack
   • Software: user agent, OS version
   • Storage: local storage, session storage
   • Display: color depth, screen size
   • Browser customizations: fonts, plug-ins, codecs, mime types, time zone, user language
   • Miscellaneous: floating point calculations, callbacks / objects to DOM
• Phone number
• Account number
• ID card number
• E-mail address
• Birth number
• IP address

• Geolocation (via IP address and Google API) – can be used for anti-fraud as well as for scoring

• Checks on discrepancy between past applications with the same identifiers


Underwriting process

• Individuals / Entrepreneurs:
   • BRKI – Banking Register of Client Information
      • Information about applications and loan contracts shared among the banks operating in the Czech Republic. Generally only banks can access it.
      • Information is stored in BRKI for the duration of the credit relationship and for 4 years after it terminates. If no contract with the bank was signed, the information is stored in BRKI for one year.
   • NRKI – Non-Banking Register of Client Information
      • Information about applications and loan contracts shared among non-bank credit providers. Generally only those that participate in the sharing can access it.
   • SOLUS
      • Information about applications and loan contracts shared among participating credit providers and some other companies. Generally only those that participate in the sharing can access it. It contains both a register of negative and a register of positive information.
      • Telco companies and utility providers also participate in SOLUS.

• Companies / Entrepreneurs:
   • CRÚ – Centrální registr úvěrů (Central Credit Register)
      • Information about loan contracts of entrepreneurs and companies – compulsory register operated by the Czech National Bank.


Scoring – Discrimination?

2017 2021


Underwriting process

• Scoring is one of the tools to measure the creditworthiness of a business or person. Different characteristics are given different weights, and the procedure results in a credit score.

DATA used for scoring:
• Physical location, work location, household location
• Transactional profile, behavior on deposits
• Account statements, application data, psychoscoring
• Invoice payment history, services, location track
• Text analytics, friends, posts, activity, job history
• Credit registers, social security, health insurance, government
• Relatives, transactional networks (suppliers, cost structure)
• Device price, age and attractivity, level of user experience


Scoring - Goal

• Problem to be solved: for each client i (i = 1, 2, 3, 4, …) with known predictors (e.g. single, own house, university degree, 29 years), predict the target.

• Find a function f such that

$$T\Big(f(age_i, status_i, education_i, housing_i, \dots) - I[\,i \text{ defaulting in 1 year from snapshot}\,]\Big)$$

is in some sense minimized.
   • The predictors can be numeric, ordinal, nominal or even missing
   • The indicator I[.] attains only values 0 or 1

• The value f(.) we call probability of default (PD)


Predictive modelling
Predictive Modelling - Model life-cycle

Model lifecycle steps:

1. Model request
   • Business line
   • Management
2. Model development
   • Management
   • Risk
3. Model implementation & usage
   • IT & data
   • Loan approval
   • Capital requirement calculation
4. Model monitoring & review
   • Monitoring
   • Independent review
   • Internal audit


Components of credit risk

▪ PD – Probability of Default: The likelihood that the borrower will default on its obligation, either over the life of the obligation or over a specified horizon (typically one year).

Expected Credit Loss (ECL) = PD x LGD x EAD


Loss given default

[Figure: LGD timeline. Predictors' values as at DD-12M, DD-9M, DD-6M and DD-3M are used for the prediction of the LGD. After the default date (DD), recoveries (cash flows CF 1 to CF 4) and the collateral realization are discounted at rate i and compared with the exposure at default. Time axis: DD-12M to DD+21M, ending at an arbitrarily chosen end of the recovery process.]

LGD models

• “U-shape”
• It does not make sense to use average LGD = 45% for these clients
• Real LGD is lower than 10% for the best 1/3 of the clients and higher than 90% for the worst 1/3 of the clients

Predictive Modelling - Goal

• Problem to be solved: for each client i (i = 1, 2, 3, 4, …) with known predictors (e.g. single, own house, university degree, 29 years), predict the target.

• Find a function f such that

$$T\Big(f(age_i, status_i, education_i, housing_i, \dots) - I[\,i \text{ defaulting in 1 year from snapshot}\,]\Big)$$

is in some sense minimized.
   • The predictors can be numeric, ordinal, nominal or even missing
   • The indicator I[.] attains only values 0 or 1

• The value f(.) we call probability of default (PD)


Predictive Modelling - Workflow

• 1) Data exclusions
• 2) Missing values analysis
• 3) Outlier treatment
• 4) Variable transformation (feature engineering)
• 5) Univariate analysis
• 6) Correlation analysis
• 7) Modelling
• Selection of shortlist of variables
• Estimation of coefficients based


Predictive Modelling – Sample definition

• Since we will be using a regression approach, we need to keep in mind that we cannot have dependent observations.
• To avoid this, a cohort approach is used:
   • Flexible cohort – fixed number of snapshots after first observation
   • Fixed cohort – fixed snapshot date (e.g. from every September)
• For our target, we define a “performance” window – usually 12 months
• No balancing needed ;)
   • Unless we’re talking about LDP (low-default) portfolios
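
A fixed-cohort sample with a 12-month performance window could be built roughly as follows. This is a sketch under stated assumptions: time is tracked as a month index, each loan carries a first-observation month and an optional default month, and all names and the toy `loans` data are hypothetical.

```python
PERFORMANCE_WINDOW = 12  # months after the snapshot


def month_index(year, month):
    """Map a calendar month to a running integer index."""
    return year * 12 + (month - 1)


def fixed_cohort_sample(loans, snapshot_month, years):
    """One row per loan and yearly snapshot: (loan_id, snapshot_year, target)."""
    rows = []
    for year in years:
        snapshot = month_index(year, snapshot_month)
        for loan_id, (first_observed, default_at) in loans.items():
            if first_observed > snapshot:
                continue  # loan is not on the book yet
            if default_at is not None and default_at <= snapshot:
                continue  # already in default at the snapshot
            target = int(
                default_at is not None
                and default_at <= snapshot + PERFORMANCE_WINDOW
            )
            rows.append((loan_id, year, target))
    return rows


# Toy portfolio: loan_id -> (first observed month, default month or None)
loans = {
    "A": (month_index(2018, 1), None),
    "B": (month_index(2018, 6), month_index(2019, 3)),
    "C": (month_index(2019, 1), month_index(2021, 2)),
}
sample = fixed_cohort_sample(loans, snapshot_month=9, years=[2018, 2019])
print(sample)
```

Because the snapshots are one year apart and the window is 12 months, the performance windows of the same loan do not overlap.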


Predictive Modelling - Workflow

• 1) Data exclusions
• 2) Missing values analysis (> 50%?)
• 3) Outlier treatment (< 5th Q/> 95th Q?)
• 4) Variable transformation (feature engineering) (Binning)
• 5) Univariate analysis (GINI below .2?)
• 6) Correlation analysis (Spearman >.5?)
• 7) Modelling
• Selection of shortlist of variables
• Estimation of coefficients based
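
Steps 2 and 3 can be sketched in a few lines. The 50% missing-share cut-off and the 5th/95th percentile caps are the candidate thresholds from the list above, not fixed rules. The helper names and toy data are hypothetical, and the percentile is taken with a simple index-based rule for brevity.

```python
def missing_share(values):
    """Share of missing (None) entries in a variable."""
    return sum(v is None for v in values) / len(values)


def cap_outliers(values, lower_pct=5, upper_pct=95):
    """Cap observed values at the given lower and upper percentiles."""
    observed = sorted(v for v in values if v is not None)
    last = len(observed) - 1
    lower = observed[last * lower_pct // 100]
    upper = observed[last * upper_pct // 100]
    return [None if v is None else min(max(v, lower), upper) for v in values]


# Toy variables
income = list(range(1, 22))          # 1, 2, ..., 21
ltv = [None, 0.8, None, 0.6, None]   # 60% missing

capped_income = cap_outliers(income)
keep_ltv = missing_share(ltv) <= 0.5
print(capped_income[0], capped_income[-1], keep_ltv)
```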


Predictive Modelling – Linear regression

• Problem to be solved: single own house

university degree
29 years

Historical data with already known target value

Name    Age   Status    Education     Housing        Target
Adam    29    single    high school   rent           0
Annie   27    single    elementary    with parents   1
Jane    31    single    high school   own house      0
John    30    married   university    mortgage       0

We choose linear function

$$f(\vec{x}) := \alpha + \sum_{j=1}^{k} \beta_j x_j$$

[Figure: fitted linear function with the observations i = 1, 2, 3, 4, … (Adam, Annie, Jane, John)]


Predictive Modelling – Logistic regression

• Problem to be solved: single own house

university degree
29 years

Historical data with already known target value

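We choose logistic function

$$f(\vec{x}) := \frac{1}{1 + e^{-\alpha - \sum_{j=1}^{k} \beta_j x_j}}$$

[Figure: fitted logistic function with the observations Adam, John and Jane]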


Predictive Modelling - Logit

• We can choose other functions, but market standard is to use the logit link function

• Using linear function is not proper as it can give estimates above 1 or below 0, which is not
convenient for estimating probability of default

• Any link function can be selected as long as it keeps the output between 0 and 1

• The reason for choosing the logit function instead of others is mainly interpretational – the log-odds ratio defined below is a linear combination of the predictors

$$\text{Log-odds ratio} := \ln\frac{PD}{1 - PD} = \alpha + \sum_{j=1}^{k} \beta_j x_j$$

• By central limit theorem under very general conditions the log-odds ratio distribution
converges in distribution to a normal distribution
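
A short sketch of the points above: a linear score can fall outside the interval from 0 to 1, the logistic function maps it back into that interval, and the log-odds of the resulting PD recover the linear score. The intercept, coefficient and predictor value are hypothetical.

```python
import math


def linear_score(x, alpha, betas):
    """alpha + sum_j beta_j * x_j"""
    return alpha + sum(b * xj for b, xj in zip(betas, x))


def logistic_pd(x, alpha, betas):
    """Logistic function applied to the linear score."""
    return 1.0 / (1.0 + math.exp(-linear_score(x, alpha, betas)))


def log_odds(pd_value):
    """ln(PD / (1 - PD))"""
    return math.log(pd_value / (1.0 - pd_value))


# Hypothetical model with a single predictor
alpha, betas, x = -2.0, [0.05], [100.0]

score = linear_score(x, alpha, betas)    # not usable as a probability
pd_value = logistic_pd(x, alpha, betas)  # always strictly between 0 and 1
print(score, round(pd_value, 4), round(log_odds(pd_value), 4))
```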


Predictive Modelling – Prediction

• Let’s say we have processed our data (deduplication, formatting, primary keys, consistency
checks…)
• We could take advantage of models with some sort of elimination
• E.g. – Lasso regression
• Least absolute shrinkage and selection operator
• Performs both variable selection and regularization

• Is this a good model?


Comparison of different modelling techniques

Logistic Regression
► Logistic regression with variables grouping, WOE transformation
► Quasi-maximum likelihood method for not independent observations (especially autocorrelation on client level)
► Multinomial logistic regression for multinomial target
PROS
► Fully under control
► Robust
► Easy to interpret
CONS
► Many assumptions (predictors uncorrelated)
► Variable selection process

Decision Trees
► Decision trees
► Regression trees for real target
► Classification trees for multinomial target
► Can use a combination of these two
PROS
► No assumptions (dependent predictors)
► Easy to interpret
CONS
► Lower predictive power
► Lack of sensitivity

Gradient boosting*
► Gradient boosting
► Regression as well as tree version
► Based on iterative algorithm boosting the performance power by fitting the residuals
PROS
► Higher prediction power than trees
CONS
► Not easy to interpret
► Can be overfitted
► Implementation, running time

Association Analysis
► Association analysis is a data mining technique
► Searches for small parts of the predictors space and finds irregularities in terms of certain target (default rate, approval rate, etc.)
PROS
► Can run in parallel to classical scoring
► Fraud detection ability
CONS
► Running time
► Can affect scoring

SVM* and NN*
► Are powerful, but can be easily overfitted and can have high impact on reject inference (should be used as challengers)
► Support vector machines (SVM)
► Neural networks
PROS
► Higher prediction power than other methods
CONS
► Overfitting
► Not interpretable
► Not deterministic optimization

Bagging Ensemble Methods*
► Are based on developing many models on random subsamples or with different predictors and putting them together by ensemble rule (random forests, etc.)
PROS
► Higher prediction power than standard linear methods
► Sometimes higher stability
CONS
► Overfitting
► Not interpretable
► Not sufficient track record

* issue of selection of hyperparameters

Comparison of different modelling techniques

• We found that the predictive power of the logistic regression model and more advanced approaches is in the same league

Modelling approach              Predictive power (GINI)   Variable selection
Logistic Regression Model (L)   55,08 %                   Stepwise regression
Random Forest (R)               59,11 %                   Default settings in h2o
Gradient Boosting Machine (G)   59,22 %                   Default settings in h2o

A potential increase in predictive power with Random Forests is highly subject to information in the data (nonlinearity etc.)


Comparison of different modelling techniques

► All methods were applied after data cleansing, feature extraction and the categorization of all features in the short list.
► As only a few categories were allowed for each feature, nonlinear characteristics in the data may have been reduced by the loss of information from categorization.
► Hence, the advantage of Random Forests to cover also nonlinearity in the model is only of minor importance.
► The features for the logistic model were selected by stepwise regression.
► We further used a Random Forest implementation in VBA in order to validate the result which we obtained using the H2O algorithm in R.

[Figure: cumulative % of defaults against cumulative % of all contracts for Log. regression, Random Forest, Perfect and Random models. Gini: 55,08 %, 95 % conf. int.: [49,29 %, 60,87 %]]


Predictive Modelling - Binning

• Another commonly used technique is binning of predictors and WoE transformation.

• Weight of evidence for the i-th bin: $WoE_i := \ln\frac{GOODS_i / BADS_i}{GOODS / BADS}$

    BIN         GOODS   BADS   DR         WoE
1   [-inf,33)   69      52     0,429752   -0,42156
2   [33,37)     63      45     0,416667   -0,36795
3   [37,40)     72      47     0,394958   -0,27790
4   [40,46)     172     89     0,340996   -0,04556
5   [46,48)     59      25     0,297619   0,15424
6   [48,51)     99      41     0,292857   0,17712
7   [51,58)     157     62     0,283105   0,22469
8   [58,inf)    93      25     0,211864   0,60930
9   MISSING     19      11     0,366667   -0,15787

• Why binning? It solves leverage points, solves informative missings, solves non-numerical (either ordinal or multinomial) variables, and assesses robustness.
• Why WoE transformation? It normalizes predictors' values and enables easy interpretation (under reasonable conditions it always attains both negative and positive values; a zero value represents the portfolio default rate).
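
The default rate and WoE columns of the table above can be reproduced with a few lines. This is only a sketch using the bin counts from the table; the variable and function names are illustrative.

```python
import math

# (bin label, goods, bads) taken from the table above
bins = [
    ("[-inf,33)", 69, 52),
    ("[33,37)", 63, 45),
    ("[37,40)", 72, 47),
    ("[40,46)", 172, 89),
    ("[46,48)", 59, 25),
    ("[48,51)", 99, 41),
    ("[51,58)", 157, 62),
    ("[58,inf)", 93, 25),
    ("MISSING", 19, 11),
]

total_goods = sum(goods for _, goods, _ in bins)
total_bads = sum(bads for _, _, bads in bins)


def weight_of_evidence(goods, bads):
    """ln of the bin's good/bad odds relative to the portfolio good/bad odds."""
    return math.log((goods / bads) / (total_goods / total_bads))


for label, goods, bads in bins:
    default_rate = bads / (goods + bads)
    print(f"{label:<10} DR={default_rate:.6f} WoE={weight_of_evidence(goods, bads):.5f}")
```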


Log – odds


Predictive Modelling - WoE

• WoE is the new value of binned predictor

• Coefficient is the estimated parameter from
logistic regression corresponding to the
variable or to the absolute term (intercept)
• If the number of goods or bads in some bin is zero, we need to compensate:

$$WoE_i = \ln\frac{(GOODS_i + 0.5) / (BADS_i + 0.5)}{GOODS / BADS}$$

• Missing category can be treated as a separate bin

• Scorecard points serve as a standardized
linear transformation of log-odds so that
certain criteria are met – it is motivated
mainly by interpretation
• Coefficients should be negative when using
WoE
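
The zero-count compensation can be sketched as below. The sketch assumes the goods-over-bads orientation of the weight of evidence, which is the orientation under which the coefficients come out negative, and it adds 0.5 to both bin counts. The bin and portfolio counts are hypothetical.

```python
import math


def adjusted_woe(goods_i, bads_i, goods_total, bads_total):
    """WoE with 0.5 added to both bin counts so that empty cells stay finite."""
    bin_odds = (goods_i + 0.5) / (bads_i + 0.5)
    return math.log(bin_odds / (goods_total / bads_total))


# Hypothetical bin with no bads at all: the unadjusted ratio would divide by zero
woe_empty_bads = adjusted_woe(goods_i=40, bads_i=0, goods_total=803, bads_total=397)
print(round(woe_empty_bads, 4))
```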


Predictive Modelling - Scorecard

• Scorecard points (score)

• The motivation is to derive a scale such that:

• It is a linear function of the log-odds ratio
• More score points means lower PD
• Double odds ratio corresponds to a prescribed number of score points 𝐴 :

• 𝐵 score points corresponds to a prescribed PD value 𝑥 :
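
One common way to meet these three criteria is the scaling score = offset + factor × ln(odds), with odds taken as good to bad, i.e. (1 − PD)/PD. The sketch below assumes this form. The values A = 20, B = 600 and x = 2% are hypothetical, and the function names are illustrative.

```python
import math


def scaling_parameters(points_to_double_odds, anchor_points, anchor_pd):
    """Factor and offset such that the anchor PD maps to the anchor points
    and doubling the good:bad odds adds points_to_double_odds."""
    factor = points_to_double_odds / math.log(2)
    anchor_odds = (1.0 - anchor_pd) / anchor_pd
    offset = anchor_points - factor * math.log(anchor_odds)
    return factor, offset


def score_from_pd(pd_value, factor, offset):
    """Linear transformation of the good:bad log-odds; higher score, lower PD."""
    return offset + factor * math.log((1.0 - pd_value) / pd_value)


# Hypothetical calibration: A = 20 points double the odds, B = 600 points at PD = 2%
factor, offset = scaling_parameters(20, 600, 0.02)
score_anchor = score_from_pd(0.02, factor, offset)     # odds 49:1
score_doubled = score_from_pd(1 / 99, factor, offset)  # odds 98:1
print(round(score_anchor, 2), round(score_doubled, 2))
```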


Model performance - GINI

• ROC (Receiver Operating Characteristic) curve, GINI

• Measuring discriminatory power – only ordering matters, not the actual score values
• True positive rate: if we sort the clients increasingly by the score value, the true positive rate for the i-th observation is the number of observations with target=1 and index lower than or equal to i, divided by the total number of observations with target=1
• False positive rate: if we sort the clients increasingly by the score value, the false positive rate for the i-th observation is the number of observations with target=0 and index lower than or equal to i, divided by the total number of observations with target=0

Client   Event   Score
Annie    1       325
Paul     0       398
Lisa     1       415
Jane     0       463
Jack     1       499
Adam     0       520
John     0       611
Mary     0       672

AUC (Area Under Curve) = 1/3*1/5 + 2/3*1/5 + 3/3*3/5 = 4/5 = 0.8
GINI = 2*(AUC-0.5) = 0.6

• GINI attains values between -1 and 1, but only values between 0 and 1 are relevant
• GINI=0 stands for a theoretical random model (no predictive power)
• GINI=1 stands for a perfectly discriminating model
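
The example above can be checked with a short sketch. Instead of integrating the ROC curve, it uses the equivalent pairwise formulation: AUC is the share of (event, non-event) pairs in which the event has the lower score, with ties counted as one half. The function name is illustrative.

```python
clients = [
    ("Annie", 1, 325), ("Paul", 0, 398), ("Lisa", 1, 415), ("Jane", 0, 463),
    ("Jack", 1, 499), ("Adam", 0, 520), ("John", 0, 611), ("Mary", 0, 672),
]


def area_under_curve(records):
    """Share of (event, non-event) pairs where the event has the lower score."""
    event_scores = [score for _, event, score in records if event == 1]
    non_event_scores = [score for _, event, score in records if event == 0]
    concordant = 0.0
    for event_score in event_scores:
        for non_event_score in non_event_scores:
            if event_score < non_event_score:
                concordant += 1.0
            elif event_score == non_event_score:
                concordant += 0.5
    return concordant / (len(event_scores) * len(non_event_scores))


auc = area_under_curve(clients)
gini = 2 * (auc - 0.5)
print(round(auc, 4), round(gini, 4))
```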


Correlation

• Degree of statistical association between two random variables
• Pearson correlation coefficient
   • Sensitive to linear relationships
• Spearman correlation coefficient
   • More robust; based on ranks, so it also captures nonlinear (monotonic) relationships
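
The difference between the two coefficients shows up on a monotonic but nonlinear relationship. The sketch below uses only the standard library and computes Spearman as the Pearson correlation of average ranks. The data are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```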


Representativeness/Stability - PSI

• PSI (Population Stability Index) is a measure of the difference between two discrete distributions
• It is typically used to assess representativeness, i.e. whether the distribution of a binned variable differs between two data samples, which typically come from two different time periods (a threshold of 0.2 is frequently used)

$$PSI = \sum_{i=1}^{n} \left(Actual\%_i - Expected\%_i\right) \cdot \ln\frac{Actual\%_i}{Expected\%_i}$$

where n is the number of bins

Score bands   Actual %   Expected %   Ac-Ex   ln(Ac/Ex)   Index
< 251         5%         8%           -3%     -0,470      0,014
251–290       6%         9%           -3%     -0,410      0,012
291–320       6%         10%          -4%     -0,510      0,020
321–350       8%         13%          -5%     -0,490      0,024
351–380       10%        12%          -2%     -0,180      0,004
381–410       12%        11%          1%      0,090       0,001
411–440       14%        10%          4%      0,340       0,013
441–470       14%        9%           5%      0,440       0,022
471–520       13%        9%           4%      0,370       0,015
> 520         9%         8%           1%      0,120       0,001

Population Stability Index (PSI) = 0,1269
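
The PSI of the table above can be recomputed directly from the Actual % and Expected % columns. A minimal sketch; the function and variable names are illustrative, and the shares are the rounded percentages shown in the table.

```python
import math

# Shares per score band, from lowest to highest band
actual = [0.05, 0.06, 0.06, 0.08, 0.10, 0.12, 0.14, 0.14, 0.13, 0.09]
expected = [0.08, 0.09, 0.10, 0.13, 0.12, 0.11, 0.10, 0.09, 0.09, 0.08]


def population_stability_index(actual_shares, expected_shares):
    """Sum over bins of (actual - expected) * ln(actual / expected)."""
    return sum(
        (a - e) * math.log(a / e)
        for a, e in zip(actual_shares, expected_shares)
    )


psi = population_stability_index(actual, expected)
print(round(psi, 4), "below the 0.2 threshold" if psi < 0.2 else "above the 0.2 threshold")
```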

35 pages
Counterparty Credit Risk Benchmark Model 1732196434
100% (1)
Counterparty Credit Risk Benchmark Model 1732196434
76 pages
Dynamic ALM
No ratings yet
Dynamic ALM
5 pages
Back Testing
No ratings yet
Back Testing
33 pages
Application of AI in Credit Risk Scoring For Small Business Loans: A Case Study On How AI-based Random Forest Model Improves A Delphi Model Outcome in The Case of Azerbaijani SMEs.
No ratings yet
Application of AI in Credit Risk Scoring For Small Business Loans: A Case Study On How AI-based Random Forest Model Improves A Delphi Model Outcome in The Case of Azerbaijani SMEs.
23 pages
1.4 Credit Risk Transfer Mechanisms-1607079978449
No ratings yet
1.4 Credit Risk Transfer Mechanisms-1607079978449
22 pages
Eb16 D 07 Pavel Charamza Practical Aspects of Forward Looking Components in LGD
No ratings yet
Eb16 D 07 Pavel Charamza Practical Aspects of Forward Looking Components in LGD
11 pages
A Quantitative Liquidity Model For Banks, Introductuion
0% (1)
A Quantitative Liquidity Model For Banks, Introductuion
18 pages
What Is Behavioral Modeling?
No ratings yet
What Is Behavioral Modeling?
2 pages
PD Estimation Approaches-ABm
No ratings yet
PD Estimation Approaches-ABm
83 pages
FASB's Current Expected Credit Loss Model For Credit Loss Accounting (CECL) : Background and FAQ 'S For Bankers June 2016
No ratings yet
FASB's Current Expected Credit Loss Model For Credit Loss Accounting (CECL) : Background and FAQ 'S For Bankers June 2016
23 pages
Basel II
No ratings yet
Basel II
5 pages
Credit Concentration Risk
No ratings yet
Credit Concentration Risk
15 pages
Risk Management AND Financial Institutions: Second Edition
No ratings yet
Risk Management AND Financial Institutions: Second Edition
8 pages
A Hybrid Bankruptcy Prediction Model With Dynamic Loadings On Acct-Ratio-Based and Market-Based Info
No ratings yet
A Hybrid Bankruptcy Prediction Model With Dynamic Loadings On Acct-Ratio-Based and Market-Based Info
16 pages
Econometric Methods With Applications in Business
No ratings yet
Econometric Methods With Applications in Business
9 pages
Development and Validation of Credit-Scoring Models
No ratings yet
Development and Validation of Credit-Scoring Models
70 pages
Introduction To Data Visualization With Matplotlib Chapter2
No ratings yet
Introduction To Data Visualization With Matplotlib Chapter2
27 pages
IFRS 17 FAQs
100% (1)
IFRS 17 FAQs
53 pages
Mastering IDEAScript: The Definitive Guide
From Everand
Mastering IDEAScript: The Definitive Guide
IDEA
No ratings yet
Credit Risk Predictive Modelling - by EY
0% (1)
Credit Risk Predictive Modelling - by EY
37 pages
ABSTRACT
No ratings yet
ABSTRACT
2 pages
Credit Risk Modeling
No ratings yet
Credit Risk Modeling
8 pages
Credit Risk Prediction Presentation
No ratings yet
Credit Risk Prediction Presentation
11 pages
Chapter 1 3 Format Final - .Docx Google Docs
No ratings yet
Chapter 1 3 Format Final - .Docx Google Docs
38 pages
Hard Times Charles Dickens
No ratings yet
Hard Times Charles Dickens
239 pages
Jason Ryan Perez - Honors Research Senior Thesis Final
No ratings yet
Jason Ryan Perez - Honors Research Senior Thesis Final
12 pages
College Adjustment of First Year Students: The Role of Social Anxiety
No ratings yet
College Adjustment of First Year Students: The Role of Social Anxiety
10 pages
Jowett, Training
100% (2)
Jowett, Training
20 pages
Qaballah and Tarot: A Basic Course in Nine Lessons. Lesson II
No ratings yet
Qaballah and Tarot: A Basic Course in Nine Lessons. Lesson II
10 pages
(John Tabak) Natural Gas and Hydrogen (Energy and (Book4You)
100% (1)
(John Tabak) Natural Gas and Hydrogen (Energy and (Book4You)
225 pages
Hella HINC Electrics Catalog
No ratings yet
Hella HINC Electrics Catalog
100 pages
Amravati Dips 12-13
No ratings yet
Amravati Dips 12-13
32 pages
Split Valuation SAP
No ratings yet
Split Valuation SAP
7 pages
DC Motor Control Using A Single Switch: Circuit Ideas
No ratings yet
DC Motor Control Using A Single Switch: Circuit Ideas
2 pages
Rumah Cerdas Bahasa Inggris Belajar Bahasa Inggris Dari Nol 4 Minggu Langsung Bisa
No ratings yet
Rumah Cerdas Bahasa Inggris Belajar Bahasa Inggris Dari Nol 4 Minggu Langsung Bisa
3 pages
Lokesh CV MTech VLSI DIAT PDF
No ratings yet
Lokesh CV MTech VLSI DIAT PDF
1 page
Canonical Correlation
No ratings yet
Canonical Correlation
7 pages
Chapter 22: Hydrocarbon Compounds: Lesson 22.1: Hydrocarbons
No ratings yet
Chapter 22: Hydrocarbon Compounds: Lesson 22.1: Hydrocarbons
7 pages
The Revenge of Gaia
No ratings yet
The Revenge of Gaia
2 pages
Implementing Microsoft Windows Server 2022 Using HPE ProLiant Servers, Storage, and Networking Options-A50003760enw
No ratings yet
Implementing Microsoft Windows Server 2022 Using HPE ProLiant Servers, Storage, and Networking Options-A50003760enw
18 pages
Chapter 4
No ratings yet
Chapter 4
22 pages
Perio
No ratings yet
Perio
63 pages
ISTQB Certified Tester - Foundation Level Syllabus v4.0-pg2
No ratings yet
ISTQB Certified Tester - Foundation Level Syllabus v4.0-pg2
1 page
How To Pass SAP MM Certification - SAP Study Material PDF
33% (3)
How To Pass SAP MM Certification - SAP Study Material PDF
13 pages
CHAINWAY C2000 Handheld Terminal API Instructions
No ratings yet
CHAINWAY C2000 Handheld Terminal API Instructions
66 pages
Certificate
No ratings yet
Certificate
4 pages
Conducting Action Research For Business and Management Students 9781529716566 152971656x Compress
No ratings yet
Conducting Action Research For Business and Management Students 9781529716566 152971656x Compress
118 pages
Annual Question Paper Montfort Class 9
No ratings yet
Annual Question Paper Montfort Class 9
2 pages
Strategy in Theory Strategy in Practice: Journal of Strategic Studies
No ratings yet
Strategy in Theory Strategy in Practice: Journal of Strategic Studies
21 pages
Business Management and Behavioural Studies: Certificate in Accounting and Finance Stage Examination
No ratings yet
Business Management and Behavioural Studies: Certificate in Accounting and Finance Stage Examination
4 pages
The Hebrew Scriptures and The Theology
No ratings yet
The Hebrew Scriptures and The Theology
18 pages
微调方法 ROSA - ACCURATE PARAMETER-EFFICIENT FINE-TUNING VIA ROBUST ADAPTATION
No ratings yet
微调方法 ROSA - ACCURATE PARAMETER-EFFICIENT FINE-TUNING VIA ROBUST ADAPTATION
16 pages