IPL Dataset Write-Up

The document outlines the steps to download and work with the IPL dataset from Kaggle, including signing in, finding, and downloading the dataset. It details how to set up a directory in Cloudera VM to store the dataset and provides commands for loading the matches and deliveries datasets. Additionally, it lists various queries to analyze the dataset, such as counting matches won by teams and player statistics.

Uploaded by

Meenu vatta

WORKING WITH IPL DATASET

Step 1: Download Data from Kaggle

Sign in to Kaggle:

• Visit Kaggle and log in to your account.

Find the Dataset:

• Navigate to the dataset you want to download. For example, search for "IPL dataset 2008-2024".

Download the Dataset:

• On the dataset page, click the "Download" button to get the dataset as a compressed .zip or .tar file.
• The file will be saved to your Windows machine, typically in the "Downloads" folder.
• Create a new folder named "ipl" for the extracted files. There are two files, deliveries.csv and matches.csv, and we will be loading both of them.

Step 2: Start your Cloudera VM and open

• Create a directory named ipls to store the dataset.


• Mount the dataset in Linux using the given command, then use the ls command to check that the folder was mounted successfully.

Step 3: Loading the Dataset

• Load the matches dataset using the command given below:

• Output for DUMP:

• Load the deliveries dataset using the command given below:

• Output for DUMP:
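
The loading step can be illustrated with a minimal Python sketch. It is not the command run in the VM. The sample rows and team names are made up, the path mentioned in the comment is hypothetical, and the column names are assumed from the Kaggle files and may differ in your download.

```python
import csv
import io

# Made-up sample standing in for matches.csv. The column names are an
# assumption based on the Kaggle IPL files; check the header of your copy.
SAMPLE_MATCHES = """id,season,city,winner
1,2008,City One,Team A
2,2008,City Two,Team B
"""


def load_rows(handle):
    """Read an open CSV file object into a list of dicts keyed by header."""
    return list(csv.DictReader(handle))


# For the real file, pass an opened file instead of the sample, for example
# open("ipls/matches.csv", newline="") (hypothetical path).
matches = load_rows(io.StringIO(SAMPLE_MATCHES))

# Rough equivalent of dumping the loaded data: print every row.
for row in matches:
    print(row["id"], row["season"], row["city"], row["winner"])
```

The same function can be reused for deliveries.csv, since it only relies on the header row.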


FOLLOWING ARE THE QUERIES FOR IPL DATASET:

1) NUMBER OF MATCHES WON BY EACH TEAM.

• Output:

• Storing the output of c in the result file:
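
As a hedged illustration of query 1, the sketch below groups matches by winner, counts them, and writes the counts out. It is a Python stand-in, not the original command. The `winner` column name is assumed from the Kaggle file, the sample rows are made up, and the in-memory buffer stands in for the result file.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; match 4 has no winner (for example, no result).
SAMPLE = """id,winner
1,Team A
2,Team B
3,Team A
4,
"""


def wins_per_team(rows):
    """Count matches per non-empty value of the assumed 'winner' column."""
    return Counter(r["winner"] for r in rows if r["winner"])


def store(counts, handle):
    """Write (team, wins) pairs as CSV lines, most wins first."""
    writer = csv.writer(handle)
    for team, wins in counts.most_common():
        writer.writerow([team, wins])


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = wins_per_team(rows)

# An in-memory buffer stands in for the result file here.
out = io.StringIO()
store(counts, out)
print(out.getvalue())
```

Query 2 has the same shape: swapping `winner` for the column that holds the player of the match (assumed to be `player_of_match`) would give the award counts per player.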


2) NUMBER OF TIMES A PLAYER HAS BEEN AWARDED THE TITLE OF “PLAYER OF THE MATCH”

• Output:

• Storing the output of c in the result file:


3) HOW MANY TEAMS HAVE WON BOTH THE TOSS AND THE MATCH
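
One way to read query 3 is as a filter followed by a count: keep the matches where the toss winner is also the match winner, then count them overall and per team. The Python sketch below shows that reading under assumptions. The `toss_winner` and `winner` column names are assumed from the Kaggle file and the sample rows are made up.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only.
SAMPLE = """id,toss_winner,winner
1,Team A,Team A
2,Team B,Team A
3,Team B,Team B
4,Team A,Team A
"""


def toss_and_match_wins(rows):
    """Count, per team, the matches where that team won the toss and the match."""
    return Counter(
        r["winner"]
        for r in rows
        if r["winner"] and r["toss_winner"] == r["winner"]
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
both = toss_and_match_wins(rows)

total = sum(both.values())   # matches where the toss winner also won the match
print(total)
print(both.most_common())    # the same figure broken down by team
```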

4) WHICH CITY HAS HOSTED HOW MANY MATCHES

• Output:

5) WHICH BATSMAN HAS PLAYED HOW MANY DELIVERIES

• Output:

6) WHICH BOWLER HAS BOWLED HOW MANY DELIVERIES

• Output:

7) NUMBER OF RUNS SCORED BY EACH BATSMAN

• Output:

9) COUNT THE NUMBER OF NO-BALLS BOWLED BY EACH BOWLER IN DESCENDING ORDER

• Output:

10) COUNT THE NUMBER OF WIDES BOWLED BY EACH BOWLER IN DESCENDING ORDER

• Output:
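
Queries 9 and 10 share one pattern: filter the deliveries by extra type, count per bowler, and sort in descending order. A minimal Python sketch of that pattern follows. It assumes the deliveries file has `bowler` and `extras_type` columns with values such as `wides` and `noballs`; some versions of the dataset use separate columns per extra type instead. The sample rows are made up.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; an empty extras_type means a legal delivery.
SAMPLE = """bowler,extras_type
Bowler X,wides
Bowler Y,
Bowler X,wides
Bowler Y,wides
Bowler X,noballs
"""


def count_extras(rows, kind):
    """Return (bowler, count) pairs for one extra type, highest count first."""
    counts = Counter(r["bowler"] for r in rows if r["extras_type"] == kind)
    # Sort by descending count, then by name so that ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
wides = count_extras(rows, "wides")
no_balls = count_extras(rows, "noballs")
print(wides)
print(no_balls)
```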
11) FINDING TOTAL EXTRAS GIVEN BY EACH PLAYER:

• Output:

12) NUMBER OF BALLS FACED BY EACH BATSMAN

• Output:

13) NUMBER OF OVERS BOWLED BY EACH PLAYER:

• Output:
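
Counting overs differs from counting deliveries, because each over appears once per ball in the deliveries file. One possible approach, sketched below in Python, is to count the distinct (match, inning, over) combinations per bowler. The column names `match_id`, `inning`, `over` and `bowler` are assumed from the Kaggle file, and the sample rows are made up. Partially bowled overs count as one over in this sketch.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows: one line per delivery.
SAMPLE = """match_id,inning,over,ball,bowler
1,1,0,1,Bowler X
1,1,0,2,Bowler X
1,1,1,1,Bowler Y
1,2,0,1,Bowler Y
2,1,0,1,Bowler X
2,1,2,1,Bowler X
"""


def overs_per_bowler(rows):
    """Count distinct (match, inning, over) combinations for each bowler."""
    overs = defaultdict(set)
    for r in rows:
        overs[r["bowler"]].add((r["match_id"], r["inning"], r["over"]))
    return {bowler: len(seen) for bowler, seen in overs.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
overs = overs_per_bowler(rows)
print(overs)
```

Using a set per bowler avoids counting the same over once for every ball bowled in it.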
14) NUMBER OF DELIVERIES FOR EACH DISMISSAL TYPE.

• Output:

15) NUMBER OF FOURS FOR EACH BATSMAN.
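
A hedged Python sketch of query 15: keep the deliveries on which the batter scored exactly four runs, then count them per batter. The `batter` and `batsman_runs` column names are assumptions (older versions of the file name the first column `batsman`, which is why it is a parameter here), and the sample rows are made up.

```python
import csv
import io
from collections import Counter

# Made-up sample rows: one line per delivery.
SAMPLE = """batter,batsman_runs
Batter P,4
Batter Q,1
Batter P,4
Batter Q,4
Batter P,6
"""


def fours_per_batter(rows, name_col="batter"):
    """Count deliveries per batter on which exactly four runs were scored."""
    return Counter(r[name_col] for r in rows if int(r["batsman_runs"]) == 4)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fours = fours_per_batter(rows)
print(fours.most_common())
```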
