The document outlines an exploratory data analysis (EDA) process using a Cars dataset from Kaggle to predict car prices based on various features. It includes steps such as importing libraries, loading the dataset, checking and cleaning data by dropping irrelevant columns, renaming features, removing duplicates, handling missing values, and identifying outliers. The goal is to prepare the dataset for accurate model training by ensuring the data is clean and relevant.
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0 ratings0% found this document useful (0 votes)
26 views19 pages
EDA Assignment Day 14.ipynb
The document outlines an exploratory data analysis (EDA) process using a Cars dataset from Kaggle to predict car prices based on various features. It includes steps such as importing libraries, loading the dataset, checking and cleaning data by dropping irrelevant columns, renaming features, removing duplicates, handling missing values, and identifying outliers. The goal is to prepare the dataset for accurate model training by ensuring the data is clean and relevant.
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 19
{
"cells": [ { "cell_type": "markdown", "metadata": { "id": "DgE0o3YHBw-n" }, "source": [ "<center> <h1 style=\"background-color:orange; color:white\"><br>Exploratory Data Analysis<br></h1></center>" ] }, { "cell_type": "markdown", "metadata": { "id": "w6lzj4kjDJWu" }, "source": [ "# `Problem Statement:`\n", "We have used Cars dataset from kaggle with features including make, model, year, engine, and other properties of the car used to predict its price." ] }, { "cell_type": "markdown", "metadata": { "id": "JpZPe8JBBw-y" }, "source": [ "## `Importing the necessary libraries`\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "dl9ocdwHBw-2" }, "outputs": [], "source": [ "# import pandas as pd\n", "# import numpy as np\n", "# import seaborn as sns #visualisation\n", "# import matplotlib.pyplot as plt #visualisation\n", "# %matplotlib inline \n", "# sns.set(color_codes=True)\n", "# from scipy import stats\n", "# import warnings\n", "# warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": { "id": "K5JcLAN2Bw-7" }, "source": [ "## `Load the dataset into dataframe`" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "id": "Yc-ChymZBw_A" }, "outputs": [], "source": [ "## load the csv file \n", "# df = " ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "ZUd5Fl7jBw_C", "outputId": "79c6280b-0909-4245-a805-9607cb59effa" }, "outputs": [], "source": [ "## print the head of the dataframe\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Gi3_9poxrSjE" }, "source": [ "Now we observe the each features present in the dataset.<br>\n", "\n", " `Make:` The Make feature is the company name of the Car.<br>\n", "`Model:` The Model feature is the model or different version of Car models.<br>\n", "`Year:` The year describes the model has been launched.<br>\n", "`Engine Fuel Type:` It defines the Fuel type of the car model.<br>\n", "`Engine HP:` It's say the Horsepower that refers to the power an engine produces.<br>\n", "`Engine Cylinders:` It define the nos of cylinders in present in the engine.<br>\n", "`Transmission Type:` It is the type of feature that describe about the car transmission type i.e Mannual or automatic.<br>\n", "`Driven_Wheels:` The type of wheel drive.<br>\n", "`No of doors:` It defined nos of doors present in the car.<br>\n", "`Market Category:` This features tells about the type of car or which category the car belongs. <br>\n", "`Vehicle Size:` It's say about the about car size.<br>\n", "`Vehicle Style:` The feature is all about the style that belongs to car.<br>\ n", "`highway MPG:` The average a car will get while driving on an open stretch of road without stopping or starting, typically at a higher speed.<br>\n", "`city mpg:` City MPG refers to driving with occasional stopping and braking.<br>\n", "`Popularity:` It can refered to rating of that car or popularity of car.<br>\ n", "`MSRP:` The price of that car.\n", "\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "VQ9qn4PaBw_i" }, "source": [ "## `Check the datatypes`" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "OPozGraJBw_l", "outputId": "b72042d2-5913-43d8-c78a-2101feea6294" }, "outputs": [], "source": [ "# Get the datatypes of each columns number of records in each column.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "gFyzAJLIBw_n" }, "source": [ "## `Dropping irrevalent columns`" ] }, { "cell_type": "markdown", "metadata": { "id": "ZZ863Z4jBw_p" }, "source": [ "If we consider all columns present in the dataset then unneccessary columns will impact on the model's accuracy.<br>\n", "Not all the columns are important to us in the given dataframe, and hence we would drop the columns that are irrevalent to us. It would reflect our model's accucary so we need to drop them. Otherwise it will affect our model.\n", "\n", "\n", "The list cols_to_drop contains the names of the cols that are irrevalent, drop all these cols from the dataframe.\n", "\n", "\n", "`cols_to_drop = [\"Engine Fuel Type\", \"Market Category\", \"Vehicle Style\", \"Popularity\", \"Number of Doors\", \"Vehicle Size\"]`\n", "\n", "These features are not neccessary to obtain the model's accucary. It does not contain any relevant information in the dataset. " ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "oW5t3xE-Bw_p" }, "outputs": [], "source": [ "# initialise cols_to_drop\n", "\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "RJvrJS9-Bw_r", "outputId": "69709257-f66a-41b3-f3e8-0cced7dbb28b" }, "outputs": [], "source": [ "# drop the irrevalent cols and print the head of the dataframe\n", "# df = \n", "\n", "# print df head\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Jg4y0BS7Bw_s" }, "source": [ "## `Renaming the columns`" ] }, { "cell_type": "markdown", "metadata": { "id": "aDciVmlRBw_t" }, "source": [ "Now, Its time for renaming the feature to useful feature name. It will help to use them in model training purpose.<br>\n", "\n", "We have already dropped the unneccesary columns, and now we are left with useful columns. One extra thing that we would do is to rename the columns such that the name clearly represents the essence of the column.\n", "\n", "The given dict represents (in key value pair) the previous name, and the new name for the dataframe columns" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "LPr2b3NPBw_u" }, "outputs": [], "source": [ "# rename cols \n", "# rename_cols = \n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "YpY0qGvIBw_v" }, "outputs": [], "source": [ "# use a pandas function to rename the current columns - \n", "# df = \n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "id": "3N1i99nYBw_v", "outputId": "d4c5d762-55ef-4566-c6d3-374cc8f9160e" }, "outputs": [], "source": [ "# Print the head of the dataframe\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "UgNExPnZBw_w" }, "source": [ "## `Dropping the duplicate rows`" ] }, { "cell_type": "markdown", "metadata": { "id": "ozWzkdrSBw_x" }, "source": [ "There are many rows in the dataframe which are duplicate, and hence they are just repeating the information. Its better if we remove these rows as they don't add any value to the dataframe. \n", "\n", "For given data, we would like to see how many rows were duplicates. For this, we will count the number of rows, remove the dublicated rows, and again count the number of rows." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "drvQvYs2Bw_x", "outputId": "a7e6f707-fab9-47f8-86c4-9cbd9f1b110f" }, "outputs": [], "source": [ "# number of rows before removing duplicated rows\n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "id": "LvwZZUruBw_x", "outputId": "617daeb0-f1e8-46dd-9623-34dd5b4d3bdf" }, "outputs": [], "source": [ "# drop the duplicated rows\n", "# df = \n", "\n", "# print head of df\n", "\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "id": "Gg4hjGakBw_y", "outputId": "a0f3f48c-7f23-4f2b-911b-57529b32663b" }, "outputs": [], "source": [ "# Count Number of rows after deleting duplicated rows\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Q06o1NwrBw_z" }, "source": [ "## `Dropping the null or missing values`" ] }, { "cell_type": "markdown", "metadata": { "id": "ddf1mIspBw_z" }, "source": [ "Missing values are usually represented in the form of Nan or null or None in the dataset.\n", "\n", "Finding whether we have null values in the data is by using the isnull() function.\n", "\n", "There are many values which are missing, in pandas dataframe these values are reffered to as np.nan. We want to deal with these values beause we can't use nan values to train models. Either we can remove them to apply some strategy to replace them with other values.\n", "\n", "To keep things simple we will be dropping nan values" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "id": "s0MtVaYABw_z", "outputId": "61fbc5cc-d21a-453c-8bf5-8ba42a7f553e" }, "outputs": [], "source": [ "# check for nan values in each columns\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "58N8lvWRlIVT" }, "source": [ "As we can see that the HP and Cylinders have null values of 69 and 30. As these null values will impact on models' accuracy. So to avoid the impact we will drop the these values. As these values are small camparing with dataset that will not impact any major affect on model accuracy so we will drop the values." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "id": "TObFlN7xBw_0" }, "outputs": [], "source": [ "# drop missing values\n", "# df = \n", " " ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "id": "q3tsOjvcBw_0", "outputId": "067469f3-04d9-4894-f1e2-7ee4132a1d79" }, "outputs": [], "source": [ "# Make sure that missing values are removed\n", "# check number of nan values in each col again\n", "\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "id": "N0Ge8_yfBw_1", "outputId": "88459604-4bba-434c-d5fb-6e81910b4b50" }, "outputs": [], "source": [ "#Describe statistics of df\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "qBk8SZ29Bw_1" }, "source": [ "## `Removing outliers`" ] }, { "cell_type": "markdown", "metadata": { "id": "tn5lLccGBw_2" }, "source": [ "Sometimes a dataset can contain extreme values that are outside the range of what is expected and unlike the other data. These are called outliers and often machine learning modeling and model skill in general can be improved by understanding and even removing these outlier values." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "id": "2QnFqFbyBw_3", "outputId": "b0a85d54-e5d7-4943-aec5-854695406cac" }, "outputs": [], "source": [ "## Plot a boxplot for 'Price' column in dataset. \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "qCpI41VqBci9" }, "source": [ "### **`Observation:`**<br>\n", "\n", "Here as you see that we got some values near to 1.5 and 2.0 . So these values are called outliers. Because there are away from the normal values.\n", "Now we have detect the outliers of the feature of Price. Similarly we will checking of anothers features." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "id": "lvDBhe4jBw_3", "outputId": "6acf12e7-757f-4cbc-9020-d1d6a6e40564" }, "outputs": [], "source": [ "## PLot a boxplot for 'HP' columns in dataset\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "-YWNqTn7GI-4" }, "source": [ "### **`Observation:`**<br>\n", "Here boxplots show the proper distribution of of 25 percentile and 75 percentile of the feature of HP." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "S9tucB8ABw_4" }, "source": [ "print all the columns which are of int or float datatype in df. \n", "\n", "Hint: Use loc with condition" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "id": "4uEumv0uBw_4", "outputId": "c0c5515e-96dc-4e40-ca4b-e83c76ce7fad" }, "outputs": [], "source": [ "# print all the columns which are of int or float datatype in df.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "pQOOqmvEBw_5" }, "source": [ "### `Save the column names of the above output in variable list named 'l'`\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "id": "PgJz8dtQBw_5" }, "outputs": [], "source": [ "# save column names of the above output in variable list\n", "# l=\n" ] }, { "cell_type": "markdown", "metadata": { "id": "3iAhdSFPBw_5" }, "source": [ "## **`Outliers removal techniques - IQR Method`**\n", " " ] }, { "cell_type": "markdown", "metadata": { "id": "4u67f7AzBw_6" }, "source": [ "**Here comes cool Fact for you!**\n", "\n", "IQR is the first quartile subtracted from the third quartile; these quartiles can be clearly seen on a box plot on the data." ] }, { "cell_type": "markdown", "metadata": { "id": "eMW1PTL_Bw_6" }, "source": [ "- Calculate IQR and give a suitable threshold to remove the outliers and save this new dataframe into df2.\n", "\n", "Let us help you to decide threshold: Outliers in this case are defined as the observations that are below (Q1 − 1.5x IQR) or above (Q3 + 1.5x IQR)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "id": "G5EHp8JxBw_6" }, "outputs": [], "source": [ "## define Q1 and Q2\n", "# Q1 = \n", "# Q3 = \n", "\n", "# # define IQR (interquantile range) \n", "# IQR = \n", "\n", "# # define df2 after removing outliers\n", "# df2 = \n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# find the shape of df & df2\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "id": "Ok1cLuSEBxAB", "outputId": "40c55ded-4804-4ecb-b6ab-9795033207dd" }, "outputs": [], "source": [ "# find unique values and there counts in each column in df using value counts function.\n", "\n", "# for i in df.columns:\n", "# print (\"--------------- %s ----------------\" % i)\n", "# # code here" ] }, { "cell_type": "markdown", "metadata": { "id": "zQ0GaJ_kBxAB" }, "source": [ "## `Visualising Univariate Distributions`" ] }, { "cell_type": "markdown", "metadata": { "id": "H0PQlhWEBxAC" }, "source": [ "We will use seaborn library to visualize eye catchy univariate plots. \n", "\n", "Do you know? you have just now already explored one univariate plot. guess which one? Yeah its box plot.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "SnzpC8JABxAC" }, "source": [ "### `Histogram & Density Plots`\n", "\n", "Histograms and density plots show the frequency of a numeric variable along the y-axis, and the value along the x-axis. The ```sns.distplot()``` function plots a density curve. Notice that this is aesthetically better than vanilla ```matplotlib```." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "id": "-uqWiICoBxAC", "outputId": "47e45800-1103-40e0-e407-93977635ea53" }, "outputs": [], "source": [ "#ploting distplot for variable HP\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "1GSaLnCxiWHc" }, "source": [ "### **`Observation:`**\n", "We plot the Histogram of feature HP with help of distplot in seaborn.<br> \n", "In this graph we can see that there is max values near at 200. similary we have also the 2nd highest value near 400 and so on. <br>\n", "It represents the overall distribution of continuous data variables.<br>" ] }, { "cell_type": "markdown", "metadata": { "id": "-P7Xup3vBxAD" }, "source": [ "Since seaborn uses matplotlib behind the scenes, the usual matplotlib functions work well with seaborn. For example, you can use subplots to plot multiple univariate distributions.\n", "- Hint: use matplotlib subplot function" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "CdlvvfvfBxAD", "outputId": "23484911-5553-41bd-cdf6-8bd38a526ce7" }, "outputs": [], "source": [ "# plot all the columns present in list l together using subplot of dimention (2,3).\n", "\n", "\n", "# c=0\n", "# plt.figure(figsize=(15,10))\n", "# for i in l:\n", "# # code here\n", "# plt.show()\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ziOcNh-sBxAD" }, "source": [ "## `Bar Chart Plots`\n" ] }, { "cell_type": "markdown", "metadata": { "id": "lF54VPLRBxAE" }, "source": [ "Plot a histogram depicting the make in X axis and number of cars in y axis. <br>" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "d1gpl5LxBxAE", "outputId": "726eae7f-c413-456a-e989-960d43a9c89b" }, "outputs": [], "source": [ "# plt.figure(figsize = (12,8))\n", "\n", "# use nlargest and then .plot to get bar plot like below output\n", "# Plot Title, X & Y label\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "N-8CXMKVkn-I" }, "source": [ "### **`Observation:`**\n", "In this plot we can see that we have plot the bar plot with the cars model and nos. of cars." ] }, { "cell_type": "markdown", "metadata": { "id": "Xk2s0-9UBxAE" }, "source": [ "### `Count Plot`\n", "A count plot can be thought of as a histogram across a categorical, instead of quantitative, variable.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "OmT9X5aBBxAF" }, "source": [ " Plot a countplot for a variable Transmission vertically with hue as Drive mode" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "id": "UyYYXn36BxAF", "outputId": "24b59852-4612-4065-cf6e-29b02c259565" }, "outputs": [], "source": [ "# plt.figure(figsize=(15,5))\n", "\n", "# plot countplot on transmission and drive mode\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "9I0XvhdTla4h" }, "source": [ "### **`Observation:`**\n", "In this count plot, We have plot the feature of Transmission with help of hue.<br>\n", "We can see that the the nos of count and the transmission type and automated manual is plotted. Drive mode as been given with help of hue.<br>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "zDHMfUpNBxAF" }, "source": [ "# `Visualising Bivariate Distributions`\n", "\n", "\n", "Bivariate distributions are simply two univariate distributions plotted on x and y axes respectively. They help you observe the relationship between the two variables.\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "DQxcdTZsBxAG" }, "source": [ "## `Scatter Plots`\n", "Scatterplots are used to find the correlation between two continuos variables.\n", "\n", "Using scatterplot find the correlation between 'HP' and 'Price' column of the data. \n", "\n" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "id": "L5zvuQD8BxAG", "outputId": "6cc2ef16-7039-4eaa-df3f-7bdd6b4e5c80" }, "outputs": [], "source": [ "## Your code here - \n", "# fig, ax = plt.subplots(figsize=(10,6))\n", "\n", "# plot scatterplot on hp and price\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "kPLqA4B6o92w" }, "source": [ "### **`Observation:`**<br>\n", "It is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data.<br>\n", "We have plot the scatter plot with x axis as HP and y axis as Price.<br>\n", "The data points between the features should be same either wise it give errors.<br>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "HEUOARh5BxAN" }, "source": [ "## `Plotting Aggregated Values across Categories`\n", "\n", "\n", "### `Bar Plots - Mean, Median and Count Plots`\n", "\n", "\n", "\n", "Bar plots are used to **display aggregated values** of a variable, rather than entire distributions. This is especially useful when you have a lot of data which is difficult to visualise in a single figure. \n", "\n", "For example, say you want to visualise and *compare the Price across Cylinders*. The ```sns.barplot()``` function can be used to do that.\n" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "id": "dTSOpY5jBxAN", "outputId": "13ca613f-edab-42d8-819d-84cc5b566ee2" }, "outputs": [], "source": [ "# bar plot with default statistic=mean between Cylinder and Price\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "rFd9QisOBxAO" }, "source": [ "### **`Observation:`**<br>\n", "By default, seaborn plots the mean value across categories, though you can plot the count, median, sum etc.<br>\n", "Also, barplot computes and shows the confidence interval of the mean as well.\ n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "od8Fuqm_BxAO" }, "source": [ "## `When you want to visualise having a large number of categories, it is helpful to plot the categories across the y-axis.`\n", "\n", "### `Let's now drill down into Transmission sub categories.`" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "id": "lJnPU4KtBxAP", "outputId": "2dfa446f-874f-435f-dba0-a17f30f34718" }, "outputs": [], "source": [ "# Plotting categorical variable Transmission across the y-axis\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Q5Y7xg3ZBxAQ" }, "source": [ "These plots looks beutiful isn't it? In Data Analyst life such charts are there unavoidable friend.:)" ] }, { "cell_type": "markdown", "metadata": { "id": "QX2szH0MBxAQ" }, "source": [ "# `Multivariate Plots`\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "_wiepyZEBxAT" }, "source": [ "## `Heatmaps`\n", "\n", "\n", "A heat map is a two-dimensional representation of information with the help of colors. Heat maps can help the user visualize simple or complex information" ] }, { "cell_type": "markdown", "metadata": { "id": "VslkQJNWBxAU" }, "source": [ "Using heatmaps plot the correlation between the features present in the dataset." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "id": "DWpcsVJCBxAU", "outputId": "dae92aaa-5a7f-4acf-8082-03555340ee16" }, "outputs": [], "source": [ "#find the correlation of features of the data \n", "# corr = \n", "\n", "# print corr\n" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "id": "rDqYeuI1BxAW", "outputId": "e20f0d9a-e76f-4f59-8ebb-11047156049d" }, "outputs": [], "source": [ "# Using the correlated df, plot the heatmap \n", "# set cmap = 'BrBG', annot = True - to get the same graph as shown below \n", "# set size of graph = (12,8)\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "-uMl7P-DBxAX" }, "source": [ "### **`Observation:`**<br>\n", "A heatmap contains values representing various shades of the same colour for each value to be plotted. Usually the darker shades of the chart represent higher values than the lighter shade. For a very different value a completely different colour can also be used.\n", "\n", "\n", "The above heatmap plot shows correlation between various variables in the colored scale of -1 to 1. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "colab": { "collapsed_sections": [], "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.0" } }, "nbformat": 4, "nbformat_minor": 1 }