T20 Batting Analysis
2023-24
INFORMATICS PRACTICES
PROJECT REPORT ON
Submitted By:
Name:
Class: Div.
Register Number:
Date of Examination:
Center of Examination:
CERTIFICATE
Teacher-in-charge
External Examiner:
Signature:
Date:
PROJECT DESCRIPTION
MODULES USED
SOFTWARE SPECIFICATION
CODING
BIBLIOGRAPHY
PROJECT DESCRIPTION
Every sports event generates a lot of data which we can use to analyze the performance of
players, teams, and many highlights of the game. Cricket is one of the most popular team
games in the world. Cricket is the second most watched sport in the world after soccer, and
battle of planning and execution. This spectator interest is due to the game being played at a
higher pace, exciting scoring shots, high run rate and the limited time required to complete
the game. T20 is the shortest and fastest version of cricket with each team facing twenty
(20) overs, consisting of six (6) balls each, with fielding restrictions being applied in the
The aim of the project is to identify how batting performance variables such as the total
runs scored by a team, maximum individual runs scored, total fours, total sixes, wickets lost
in the match and wickets lost in the power play correlate with winning and losing teams in
the IPLT20. The results will also assist performance personnel of the respective IPLT20
teams to better understand the effect of performance indicators regarding the success of
teams and assist in the possible forecasting of future performance. Analyzing the statistics
of cricket aids in understanding which factors greatly affect performance. This knowledge
leads to more specific training programs and a more educated selection of players for the
team.
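As a rough illustration of the kind of analysis described above, the sketch below correlates a few team-level batting variables with the match result. The column names (total_runs, fours, sixes, wickets_lost, won) and all six rows of data are invented for this example; they are not taken from the project's CSV file, so the numbers it prints say nothing about real matches.

```python
import pandas as pd

# Hypothetical team-innings data, invented purely for illustration.
matches = pd.DataFrame({
    "total_runs":   [185, 142, 201, 128, 167, 150],
    "fours":        [18, 11, 20, 9, 15, 12],
    "sixes":        [9, 4, 12, 3, 7, 5],
    "wickets_lost": [4, 8, 3, 9, 6, 7],
    "won":          [1, 0, 1, 0, 1, 0],   # 1 = win, 0 = loss
})

# Pearson correlation of every batting variable with the result column.
correlations = matches.corr()["won"].drop("won")

# Variables listed from most positively to most negatively related to winning.
print(correlations.sort_values(ascending=False))
```

A positive value suggests the variable tends to be higher in wins, a negative value that it tends to be higher in losses. With so few rows this only shows the mechanics; it is not evidence of any real effect.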
SYSTEM SPECIFICATION
Operating System: Windows 7
Processor: Intel Core i3-4150 or higher
RAM: 4 GB or higher
Platform: Python IDLE 3.7
Languages:
important and necessary information by the user. Usually, the data entered by the user and
the generated output are displayed but not stored, since all program execution takes place
in RAM, which is temporary memory. As soon as we close the form, its contents (form input
and generated output) are erased. They cannot be retrieved because they are not saved on a
hard disk (or any other secondary storage device). Thus, when the application is executed
a second time, it requires a new set of inputs from the user. This limitation can be overcome
by saving the input fetched from the user and the output generated in a database created at
the back end of the application. The input is fetched from the user through a Python
interface. This is termed the front-end interface of the application.
While working with an application, it is required to save data permanently on some secondary
storage device, which is usually the hard disk, so that the data can be retrieved for future
A comma-separated values file is a delimited text file that uses a comma to separate values.
Each line of the file is a data record. Each record consists of one or more fields, separated by
commas. The use of the comma as a field separator is the source of the name for this file
format.
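The following minimal sketch shows how such a file maps onto a pandas DataFrame and back. The three records are placeholders written for this example. Only the column names are taken from the program in the CODING section. An in-memory io.StringIO object stands in for a file on disk so that the example runs without t20wc.csv being present.

```python
import io
import pandas as pd

# A small hypothetical extract; the real project reads t20wc.csv instead.
csv_text = """name,team,against,city,runs,ballsFaced
Player A,Team X,Team Y,City 1,45,30
Player B,Team X,Team Y,City 1,12,10
Player C,Team Y,Team X,City 1,60,41
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)    # (number of records, number of fields)
print(df.dtypes)   # data type pandas inferred for each field

# Writing back out: index=False keeps the row labels out of the file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
saved_text = buffer.getvalue()
print(saved_text)
```

Passing a file name such as "t20wc.csv" to read_csv and to_csv instead of the StringIO objects stores the data on disk, which is what makes it available the next time the program runs.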
CODING
import pandas as pd
import matplotlib.pyplot as plt

print("T20 World Cup Batting Analysis")
print("==================================")
while True:
    print("MAIN MENU")
    print("1.Dataframe Attributes")
    print("2.Record Analysis")
    print("3.Data Visualization as per records")
    print("4.Customized Data Visualization")
    print("5.Exit")
    ch=int(input("Enter Your Choice:"))
    if(ch==1):
        # Option 1: inspect basic attributes of the dataframe
        df=pd.read_csv("/content/t20wc.csv")
        print("Dataframe Attributes:")
        print("1.Display the transpose")
        print("2.Display column names")
        print("3.Display indexes")
        print("4.Display the shape")
        print("5.Display the dimension")
        print("6.Display the data types of all columns")
        print("7.Display the size")
        print("8.Back")
        ch1=int(input("Enter Your Choice:"))
        if ch1==1:
            print("Displaying the transpose")
            print(" ")
            print(df.T)
            input("Press Enter to continue...")
        elif ch1==2:
            print("The column names are:")
            print(" ")
            print(df.columns)
            input("Press Enter to continue...")
        elif ch1==3:
            print(df.index)
            input("Press Enter to continue...")
        elif ch1==4:
            print("The shape of the dataframe is:")
            print(" ")
            print(df.shape)
            input("Press Enter to continue...")
        elif ch1==5:
            print("The dimension of the dataframe is:")
            print(" ")
            print(df.ndim)
            input("Press Enter to continue...")
        elif ch1==6:
            print("The data type of each column is:")
            print(" ")
            print(df.dtypes)
            input("Press Enter to continue...")
        elif ch1==7:
            print("The size of the dataframe is:")
            print(" ")
            print(df.size)
            input("Press Enter to continue...")
        elif ch1==8:
            pass
    elif ch==2:
        # Option 2: record analysis on innings and aggregated runs
        df=pd.read_csv("/content/t20wc.csv")
        print("RECORD ANALYSIS MENU")
        print("1.Highest Score (Inning - Top 10)")
        print("2.Lowest Score (Inning - Bottom 10)")
        print("3.Specific Number of Records From Top")
        print("4.Specific Number of Records From Bottom")
        print("5.Details of record for a Team")
        print("6.Details of record for a Batsman")
        print("7.Most Runs (Top Ten)")
        print("8.Least Runs (Bottom Ten)")
        print("0.Back")
        ch2=int(input("Enter Your Choice:"))
        if ch2==1:
            df1=df.loc[:,['city','name','runs','ballsFaced']]
            df1=df1.sort_values(by='runs',ascending=False)
            print(df1.head(10))
            input("Press Enter to continue...")
        elif ch2==2:
            # Sorted in descending order, so the last ten rows are the lowest scores
            df1=df.loc[:,['city','name','runs','ballsFaced']]
            df1=df1.sort_values(by='runs',ascending=False)
            print(df1.tail(10))
            input("Press Enter to continue...")
        elif ch2==3:
            # The count must be stored in n, the name used by head() below
            n=int(input("How Many Number of Records You Want To Be Printed From The Top:"))
            df1=df.loc[:,['city','name','runs','ballsFaced']]
            print(df1.head(n))
            input("Press enter to continue...")
        elif ch2==4:
            n=int(input("How Many Number of Records You Want To Be Printed From Bottom:"))
            df1=df.loc[:,['city','name','runs','ballsFaced']]
            print(df1.tail(n))
            input("Press enter to continue...")
        elif ch2==5:
            team=input("Enter The Team Name For Which You Want The data To Be Displayed:")
            df1=df.loc[df['team']==team]
            print(df1.loc[:,['city','name','against','runs','ballsFaced']])
            input('Press enter to continue...')
        elif ch2==6:
            print("Ensure the name should match with CSV records:")
            b=input("Enter The Player Name For Which You Want The data To Be Displayed:")
            # copy() so that adding the Total row does not modify a view of df
            df1=df.loc[df['name']==b].copy()
            print(df1.loc[:,['city','name','against','runs','ballsFaced']])
            print(' ')
            df1.at['Total','runs']=df1['runs'].sum()
            print(df1)
            input('Press enter to continue...')
        elif ch2==7:
            print(" Most Runs (Top Ten)")
            print(" ")
            df1=df[['name','runs']].groupby('name').sum()
            df1=df1.sort_values(by='runs',ascending=False)
            print(df1.head(10))
            input("Press enter to continue...")
        elif ch2==8:
            print(" Least Runs (Bottom Ten)")
            print(" ")
            df1=df[['name','runs']].groupby('name').sum()
            df1=df1.sort_values('runs')
            print(df1.head(10))
            input("Press enter to continue...")
        elif ch2==0:
            pass
        else:
            print("Invalid Choice")
    elif(ch==3):
        # Option 3: plot the first n records of the table
        df=pd.read_csv("/content/t20wc.csv")
        print("Data Visualization Menu - According to no. of rows")
        print("1.Line Plot")
        print("2.Vertical Bar Plot")
        print("3.Horizontal Bar Plot")
        print("4.Histogram")
        print("5.Exit The Data Visualization Menu")
        ch3=int(input("Enter Choice:"))
        df1=pd.DataFrame()
        if ch3==1:
            n=int(input("How many records from the top of table you want to plot:"))
            df1=df.head(n)
            df1.plot(linestyle="-.",linewidth=2,label="WORLD CUP RECORD")
            plt.legend()
            plt.show()
        elif ch3==2:
            n=int(input("How many records from the top of table you want to plot:"))
            df1=df.head(n)
            df1.plot(kind="bar",color="pink",width=.8)
            plt.show()
        elif ch3==3:
            n=int(input("How many records from the top of table you want to plot:"))
            df1=df.head(n)
            df1.plot(kind="barh",color="cyan",width=.8)
            plt.show()
        elif ch3==4:
            df.hist(color="yellow",edgecolor="pink")
            plt.show()
        elif ch3==5:
            pass
    elif(ch==4):
        # Option 4: plots filtered by player or by team
        df=pd.read_csv("/content/t20wc.csv")
        print("Customized Data Visualization Menu")
        print("1.By Player")
        print("2.By Team")
        print("3.Back")
        ch4=int(input("Enter Choice:"))
        df1=pd.DataFrame()
        if ch4==1:
            print("Ensure the name should match with CSV records:")
            player=input("Enter player name you want to plot:")
            print('''
1. Line Chart
2. Bar Chart
3. Horizontal Bar Chart
4. Histogram
5. Back
''')
            ch4_1=int(input("Enter your choice:"))
            if ch4_1==1:
                df1=df.loc[df['name']==player]
                df1=df1.loc[:,['against','runs']]
                df1.plot(x='against',y='runs',kind='line',linestyle="-.",linewidth=2,color='r')
                plt.show()
            elif ch4_1==2:
                df1=df.loc[df['name']==player]
                df1=df1.loc[:,['against','runs']]
                df1.plot(x='against',y='runs',kind='bar',color='r')
                plt.show()
            elif ch4_1==3:
                df1=df.loc[df['name']==player]
                df1=df1.loc[:,['against','runs']]
                df1.plot(x='against',y='runs',kind='barh',color='r')
                plt.show()
            elif ch4_1==4:
                df1=df.loc[df['name']==player]
                df1=df1.loc[:,['against','runs']]
                df1.plot(x='against',y='runs',kind='hist',bins=25,cumulative=True)
                plt.show()
            elif ch4_1==5:
                pass
        elif ch4==2:
            print("Ensure the name should match with CSV records:")
            team=input("Enter team name you want to plot:")
            print('''
1. Line Chart
2. Bar Chart
3. Horizontal Bar Chart
4. Histogram
5. Back
''')
            ch4_2=int(input("Enter your choice:"))
            if ch4_2==1:
                df1=df.loc[df['team']==team]
                df1=df1.loc[:,['name','runs']]
                df1.plot(x='name',y='runs',kind='line',linestyle="-.",linewidth=2,color='r')
                plt.show()
            elif ch4_2==2:
                df1=df.loc[df['team']==team]
                df1=df1.loc[:,['name','runs']]
                df1.plot(x='name',y='runs',kind='bar',color='r')
                plt.show()
            elif ch4_2==3:
                df1=df.loc[df['team']==team]
                df1=df1.loc[:,['name','runs']]
                df1.plot(x='name',y='runs',kind='barh',color='r')
                plt.show()
            elif ch4_2==4:
                df1=df.loc[df['team']==team]
                df1=df1.loc[:,['name','runs']]
                df1.plot(x='name',y='runs',kind='hist',bins=25,cumulative=True)
                plt.show()
            elif ch4_2==5:
                pass
    elif ch==5:
        # Option 5: leave the main loop (previously reported as an invalid choice)
        break
    else:
        print("* *INVALID CHOICE* *")
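The program above reports runs and balls faced but does not compute the batting strike rate mentioned in the conclusion. The sketch below shows one possible way to derive it, assuming the usual definition of runs scored per 100 balls faced. The rows are invented for the example, and the strikeRate column name is an assumption, not a field of the project's CSV file.

```python
import pandas as pd

# Hypothetical rows using the same column names as the program above.
df = pd.DataFrame({
    "name":       ["Player A", "Player A", "Player B", "Player B"],
    "runs":       [40, 20, 30, 0],
    "ballsFaced": [25, 15, 30, 0],
})

# Aggregate per batsman first, so the rate reflects all innings together.
totals = df.groupby("name")[["runs", "ballsFaced"]].sum()

# Strike rate = runs per 100 balls; drop batsmen who faced no balls
# to avoid dividing by zero.
totals = totals[totals["ballsFaced"] > 0]
totals["strikeRate"] = (totals["runs"] / totals["ballsFaced"] * 100).round(2)

print(totals.sort_values(by="strikeRate", ascending=False))
```

Summing runs and balls before dividing is a deliberate choice here: averaging the per-innings strike rates instead would give a short innings the same weight as a long one.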
CSV FILE
OUTPUT SCREENSHOTS
CONCLUSION
In this study, data were analyzed in order to identify how batting performance variables
such as total runs scored, maximum individual runs scored, number of balls faced, batting
strike rate etc. related to winning a T20 cricket match. The total number of runs scored by
the team batting first is significant. Thus, a high total of runs scored per match by teams
batting first is an essential performance predictor of success. Along with a high total of runs
scored, the maximum individual runs scored by a batsman is also considered a predictor of
success. Individual batting performance is crucial to increasing the total score of a team.
The number of fours scored by IPLT20 teams batting first is significant and correlates with
winning a cricket match. The results of this study conclude that there are various
performance variables such as the influence of the higher total runs scored in T20 cricket are
BIBLIOGRAPHY
https://scholar.ufs.ac.za/bitstream/handle/SloaneS.pdf/
https://www.tutorialaicsip.com/xii-projects-ip/