Term Project Problem Statement: Predict the Outcomes of ICC T20 World Cup 2016

The document describes a term project to predict the outcomes of the 2016 ICC T20 World Cup using a logit model. Data was collected on factors that could influence the winner such as the toss, home field advantage, team ratings, and innings. Records from 2008-2016 were used to build the model. The analysis showed that only the innings and team ratings played a significant role in the predictions. The model was able to predict results with 57% accuracy.

Uploaded by

Sumit Kumar

Term Project

Problem Statement: Predict the outcomes of the ICC T20 World Cup 2016.
Data Collection
Factors on which a team's winning probability depends:
a) Toss: The team that wins the toss decides whether to chase or to
bat first. It can make that decision in its own favour, depending on
the pitch, the weather conditions and its strengths.
b) Ground: A team playing on its home ground has a higher chance of
winning the game than the opposing team.
c) Ratings of the team: Team ratings are calculated from the
performance of teams in T20 matches over the last 3-4 years. A higher
rating signifies better performance.
d) Innings: The team that bats in the second innings (chasing) has a
higher probability of winning the match.
e) Average Score: The average score made by the team in T20 matches
over the last 3-4 years.
Data: Records of all the matches played since 2008 by the 10 teams
participating in the T20 World Cup 2016 were used to build the
model.
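As a rough illustration of how these factors could be laid out as one row per match, the sketch below builds a small data frame with the column names that appear in the project's code. Every value is made up, and the encodings of Toss and Ground are assumptions, since the source does not show the actual CSV contents.

```r
# Illustrative rows only: every value below is made up, and the
# "Won"/"Lost" and "Home"/"Away"/"Neutral" encodings are assumptions.
example_matches <- data.frame(
  Result             = c("Won", "Lost", "Won"),
  Toss               = c("Won", "Won", "Lost"),
  Inns               = c(2, 1, 2),
  Ground             = c("Home", "Away", "Neutral"),
  Average_Team       = c(152, 148, 160),
  Average_Opposition = c(149, 155, 151),
  Rating_Team        = c(118, 112, 120),
  Rating_Opposition  = c(110, 121, 115)
)

# Innings and ground are categories, not quantities, so store them as factors.
example_matches$Inns   <- factor(example_matches$Inns)
example_matches$Ground <- factor(example_matches$Ground)

str(example_matches)
```

Storing Inns as a factor keeps the model from treating the second innings as "twice" the first.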
Model: A logit model was fitted to the dataset and the result was used
to predict the T20 World Cup 2016 results. The analysis of the model
shows that only innings and team ratings play a significant role.
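A minimal sketch of fitting a logit model in R is given here on simulated data, not on the project's records. The data frame `matches`, the coefficients used to simulate outcomes, and the sample size are all assumptions made for illustration.

```r
# Simulated stand-in for the match records; all values are made up.
set.seed(1)
n <- 200
matches <- data.frame(
  Inns        = factor(sample(1:2, n, replace = TRUE)),
  Rating_Team = round(rnorm(n, mean = 110, sd = 10))
)

# Assumed data-generating process: chasing and a higher rating
# both raise the log-odds of winning.
log_odds <- -5.5 + 0.4 * (matches$Inns == "2") + 0.05 * matches$Rating_Team
matches$Result <- factor(ifelse(runif(n) < plogis(log_odds), "Won", "Lost"),
                         levels = c("Lost", "Won"))

# With a factor response, glm() treats the first level ("Lost") as failure,
# so the fitted probabilities are P(Result == "Won").
fit <- glm(Result ~ Inns + Rating_Team,
           family = binomial(link = "logit"), data = matches)

summary(fit)$coefficients   # estimates, standard errors, z values, p-values

# Predicted win probabilities for the first five simulated matches.
p_win <- predict(fit, newdata = matches[1:5, ], type = "response")
p_win
```

The p-values in the coefficient table are one way to judge which predictors play a significant role.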

Output

ANOVA test

Innings, the average score of the team and the rating of the team are
the most significant factors in predicting the winner of the game.
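For readers unfamiliar with this test, the sketch below runs `anova(..., test = "Chisq")` on a logit model fitted to simulated data. The data frame `d`, its columns and the simulated effect sizes are assumptions, and Toss is deliberately simulated with no effect.

```r
# Simulated data; all values are made up.
set.seed(2)
n <- 300
d <- data.frame(
  Toss        = factor(sample(c("Lost", "Won"), n, replace = TRUE)),
  Inns        = factor(sample(1:2, n, replace = TRUE)),
  Rating_Team = rnorm(n, mean = 110, sd = 10)
)

# Assumed process: Toss has no effect; Inns and Rating_Team do.
d$Won <- rbinom(n, 1, plogis(-5.5 + 0.5 * (d$Inns == "2") + 0.05 * d$Rating_Team))

fit <- glm(Won ~ Toss + Inns + Rating_Team, family = binomial, data = d)

# Analysis of deviance: terms are added one at a time, in formula order,
# and each row tests the drop in deviance against a chi-square distribution.
tab <- anova(fit, test = "Chisq")
print(tab)
```

Because the terms are added sequentially, the p-value of a term can change if the order of terms in the formula changes.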

Performance

The model predicted the results with 57% accuracy.
Performance measure A (accuracy of prediction) = 57%
Performance measure B (1 + log(2p(1 - p)/2)) = -3.7959
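The two measures can be sketched as follows on a handful of made-up probabilities. The vectors `p` and `actual` are hypothetical, and the expression for measure B is copied from the project's code. The source does not say how the per-match values were combined into the single reported figure, so no aggregation is shown.

```r
# Hypothetical predicted win probabilities and actual outcomes (made up).
p      <- c(0.80, 0.35, 0.60, 0.55, 0.20)
actual <- c("Won", "Lost", "Lost", "Won", "Lost")

# Measure A: share of matches where the 0.5-threshold call was right.
predicted <- ifelse(p > 0.5, "Won", "Lost")
accuracy  <- mean(predicted == actual)
accuracy

# Measure B for each prediction, written as in the project's code.
# Note that 2 * p * (1 - p) / 2 simplifies to p * (1 - p).
measure_b <- 1 + log((2 * p) * (1 - p) / 2)
measure_b
```

Since p(1 - p) is never larger than 0.25, each per-match value of this expression is at most 1 + log(0.25), which is negative.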

Code used in the model


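# Load the training records and the World Cup 2016 fixtures to be predicted.
setwd('D:/R_AMSM2')
data <- read.csv('Train_Data_3.csv', header = TRUE)
testdata_wc <- read.csv('test_2.csv', header = TRUE)
raw_test <- testdata_wc   # untouched copy for the output file
summary(data)
str(testdata_wc)

# Treat innings and ground as categorical predictors.
data$Inns <- factor(data$Inns)
data$Ground <- factor(data$Ground)
testdata_wc$Inns <- factor(testdata_wc$Inns)
testdata_wc$Ground <- factor(testdata_wc$Ground)
# glm() needs a factor (or 0/1) response; R >= 4.0 reads strings as character.
data$Result <- factor(data$Result)

# Train/test split. TrainvsTest = 1 uses every record for training,
# so testdata is empty.
M <- nrow(data)
N <- ncol(data)
TrainvsTest <- 1
Train_idx <- ceiling(TrainvsTest * M)
traindata <- data[seq_len(Train_idx), ]
# seq_len() avoids the descending sequence that (Train_idx + 1):M
# produces when Train_idx == M.
testdata <- data[Train_idx + seq_len(M - Train_idx), ]

# Full logit model on all candidate predictors.
logit <- glm(Result ~ Toss + Inns + Ground + Average_Team +
               Average_Opposition + Rating_Team + Rating_Opposition,
             family = binomial(link = 'logit'), data = traindata)
summary(logit)
anova(logit, test = "Chisq")   # sequential chi-square (deviance) test
plot(logit)                    # diagnostic plots
step(logit)                    # stepwise selection by AIC

# Reduced model, presumably with the terms kept by step(). It is fitted
# here, but the predictions below still come from the full model `logit`.
logit_new <- glm(formula = Result ~ Inns + Average_Team +
                   Rating_Opposition,
                 family = binomial(link = "logit"), data = traindata)

# Predicted win probabilities for the World Cup fixtures.
fitted.results <- predict(logit, newdata = testdata_wc, type = 'response')
fitted.results.str <- ifelse(fitted.results > 0.5, "Won", "Lost")
misClasificError <- mean(fitted.results.str != testdata_wc$Result)
print(paste('Accuracy', 1 - misClasificError))

# Performance measure B for each prediction.
p <- fitted.results
col2 <- log((2 * p) * (1 - p) / 2) + 1

Test_logit <- read.csv('test_class.csv', header = TRUE)   # read but not used below
logit_result <- cbind(raw_test, p, 1 - p, col2, fitted.results.str)
logit_result
write.csv(logit_result, file = 'logit.csv')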
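With `TrainvsTest` set to 1, every historical record goes into training and nothing is held out. If a hold-out check on the historical matches were wanted, a split could be sketched as below. The data frame `records`, the 0.8 fraction and the random shuffle are assumptions, not part of the original project.

```r
# Made-up stand-in for the historical records.
set.seed(42)
records <- data.frame(
  id     = 1:50,
  Result = sample(c("Won", "Lost"), 50, replace = TRUE)
)

train_fraction <- 0.8               # assumed; the project uses 1
M        <- nrow(records)
shuffled <- records[sample(M), ]    # random row order
n_train  <- ceiling(train_fraction * M)

train_set <- shuffled[seq_len(n_train), ]
# n_train + seq_len(M - n_train) is empty when n_train == M, unlike
# (n_train + 1):M, which would count downwards and pick invalid rows.
test_set  <- shuffled[n_train + seq_len(M - n_train), ]

nrow(train_set)
nrow(test_set)
```

A random shuffle ignores the time order of matches; splitting by date would be an alternative if later matches should never inform predictions for earlier ones.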
