Importing a CSV file into the DataFrame
• The pandas read_csv() function reads a CSV file into a DataFrame.
• Commonly used options:
• filepath_or_buffer: the file name or file path
• pd.read_csv('file_name.csv') # relative path
• pd.read_csv('C:\\Users\\abc\\Desktop\\file_name.csv') # absolute path
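A minimal sketch of reading from a path. The temporary directory and the sample rows are assumptions made so the example runs on its own; in practice you would point read_csv() at your own file.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Write a small CSV into a temporary directory (illustrative data only).
tmp_dir = Path(tempfile.mkdtemp())
csv_path = tmp_dir / "file_name.csv"
csv_path.write_text("Name,Age\nAnn,30\nBob,25\n")

# read_csv() accepts a pathlib.Path object or a plain path string.
df = pd.read_csv(csv_path)
df_from_str = pd.read_csv(str(csv_path))

# On Windows, a raw string such as r'C:\Users\abc\Desktop\file_name.csv'
# avoids having to double every backslash.
print(df)
```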
• header: specifies which row is used as the column names of the
DataFrame. Expects an int or a list of ints; use None if the file has no header row.
• pd.read_csv('file_name.csv', header=0) # default: first row is the header
• pd.read_csv('file_name.csv', header=None) # no header row
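The difference between the two header settings can be sketched as follows. The in-memory text csv_text stands in for a file and is an assumption of this example.

```python
import io

import pandas as pd

# Illustrative CSV content; io.StringIO lets read_csv() treat it like a file.
csv_text = "Name,Age\nAnn,30\nBob,25\n"

# header=0: the first line becomes the column names.
df_default = pd.read_csv(io.StringIO(csv_text), header=0)

# header=None: every line is data and columns are numbered 0, 1, ...
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(df_default)
print(df_no_header)
```

With header=None the line "Name,Age" is kept as the first data row, so the DataFrame has three rows instead of two.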
• sep: specifies the delimiter used in the CSV input. The default is a
comma.
• pd.read_csv('file_name.csv', sep='\t') # tab-separated file
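A short sketch of reading tab-separated and semicolon-separated input. Both sample strings are made up for illustration.

```python
import io

import pandas as pd

# The same illustrative table written with two different delimiters.
tsv_text = "Name\tAge\nAnn\t30\nBob\t25\n"
semicolon_text = "Name;Age\nAnn;30\nBob;25\n"

df_tab = pd.read_csv(io.StringIO(tsv_text), sep="\t")
df_semicolon = pd.read_csv(io.StringIO(semicolon_text), sep=";")

print(df_tab)
```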
• names: a list of column names to use for the DataFrame. If the file
already has a header row, also pass header=0 so that row is replaced rather than read as data.
• pd.read_csv('file_name.csv', names=['A', 'B', 'C']) # set new column names
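The two common cases for names can be sketched like this: a file without a header row, and a file whose existing header should be replaced. The sample data is an assumption.

```python
import io

import pandas as pd

# Case 1: the file has no header row, so names supplies the column labels.
no_header_text = "1,2,3\n4,5,6\n"
df_named = pd.read_csv(io.StringIO(no_header_text), names=["A", "B", "C"])

# Case 2: the file has a header row (x,y,z) that we want to replace.
# header=0 tells pandas to discard that row instead of reading it as data.
with_header_text = "x,y,z\n1,2,3\n"
df_replaced = pd.read_csv(
    io.StringIO(with_header_text), names=["A", "B", "C"], header=0
)

print(df_named)
print(df_replaced)
```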
• index_col: specifies which column (or columns) to use as the index of
the DataFrame. The default is None, in which case pandas creates a
default integer index starting at 0.
• pd.read_csv('file_name.csv', index_col='Name') # use the 'Name' column as the index
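A minimal sketch comparing the default index with index_col='Name'. The sample rows are illustrative.

```python
import io

import pandas as pd

csv_text = "Name,Age\nAnn,30\nBob,25\n"

# Default: pandas builds an integer index 0, 1, ...
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col='Name': the 'Name' column becomes the row labels.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# Rows can now be looked up by name.
bob_age = df_indexed.loc["Bob", "Age"]
print(bob_age)
```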
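• skiprows: skips lines of the CSV file while reading. An int skips that
many lines from the start of the file; a list skips the lines at those 0-based positions.
• pd.read_csv('file_name.csv', skiprows=5) # skip the first 5 lines
• pd.read_csv('file_name.csv', skiprows=[1,3,5]) # skip lines 1, 3 and 5 (line 0 is the header)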
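The two forms of skiprows behave differently, which the sketch below tries to show. The comment lines and the sample rows are assumptions for illustration.

```python
import io

import pandas as pd

# Two lines of preamble come before the real header.
preamble_text = "# comment line 1\n# comment line 2\nName,Age\nAnn,30\nBob,25\n"

# An int skips that many lines from the top, so line 3 becomes the header.
df_int = pd.read_csv(io.StringIO(preamble_text), skiprows=2)

# A list skips specific 0-based line positions. Line 0 is the header here,
# so positions 1 and 3 are the rows for Ann and Cid.
plain_text = "Name,Age\nAnn,30\nBob,25\nCid,41\nDee,19\n"
df_list = pd.read_csv(io.StringIO(plain_text), skiprows=[1, 3])

print(df_int)
print(df_list)
```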
• nrows: reads only the first n data rows of the file (the header row is
not counted). Expects an int.
• pd.read_csv('file_name.csv', nrows=5) # read the first 5 rows only
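A brief sketch of nrows on made-up data. This is handy for previewing a large file without loading all of it.

```python
import io

import pandas as pd

csv_text = "Name,Age\nAnn,30\nBob,25\nCid,41\nDee,19\nEve,33\n"

# Read the header plus the first two data rows only.
df_preview = pd.read_csv(io.StringIO(csv_text), nrows=2)

print(df_preview)
```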
• usecols: specifies which columns to import into the DataFrame. It can be a
list of int positions (0-based) or a list of column names.
• pd.read_csv('file_name.csv', usecols=[1,2,3]) # reads only the columns at positions 1, 2 and 3; column 0 is ignored
• pd.read_csv('file_name.csv', usecols=['Name']) # reads only the 'Name' column; other columns are ignored
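Selecting columns by position and by name can be sketched as follows. The four-column sample table is an assumption of this example.

```python
import io

import pandas as pd

csv_text = "ID,Name,Age,City\n1,Ann,30,Oslo\n2,Bob,25,Rome\n"

# By position: columns 1 and 2 are 'Name' and 'Age' (position 0 is 'ID').
df_by_position = pd.read_csv(io.StringIO(csv_text), usecols=[1, 2])

# By name: only the 'Name' column is loaded.
df_by_name = pd.read_csv(io.StringIO(csv_text), usecols=["Name"])

print(df_by_position)
print(df_by_name)
```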
• na_values: by default, missing values (for example empty fields) are read
as NaN. Use this option if you want additional strings to be treated as NaN.
The expected input is typically a list of strings.
• pd.read_csv('file_name.csv', na_values=['a','b']) # the values 'a' and 'b' are treated as NaN after importing
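A minimal sketch of na_values. The placeholder strings 'missing' and '?' are assumptions chosen for illustration; substitute whatever markers your data uses.

```python
import io

import pandas as pd

csv_text = "Name,Score\nAnn,90\nBob,missing\nCid,?\n"

# Without na_values, the placeholders are kept as ordinary strings.
df_raw = pd.read_csv(io.StringIO(csv_text))

# With na_values, both placeholders are read as NaN, so the column
# can be parsed as numbers.
df_clean = pd.read_csv(io.StringIO(csv_text), na_values=["missing", "?"])

print(df_clean)
print(df_clean["Score"].isna().sum())
```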
Writing a DataFrame into CSV file
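• The DataFrame to_csv() method writes a DataFrame to a CSV file.
• Options:
• path_or_buf: the file name or file path to write to
• index: whether to write the row index (default True)
• header: whether to write the column names (default True)
• na_rep: the string used to represent missing values (default is an empty string)
• sep: the field delimiter (default is a comma)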
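The options above can be sketched together in one call. The DataFrame contents are made up, and the example relies on to_csv() returning the CSV text as a string when no path is given, which keeps it runnable without touching the disk.

```python
import pandas as pd

# Illustrative data with one missing score.
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Score": [90.0, None]})

# No path given, so to_csv() returns the CSV text instead of writing a file.
# index=False drops the row labels, na_rep fills in missing values,
# and sep switches the delimiter to a semicolon.
csv_text = df.to_csv(index=False, na_rep="NA", sep=";")

# header=False also leaves out the column names.
csv_no_header = df.to_csv(index=False, header=False)

print(csv_text)
```

To write to disk instead, pass a file name as the first argument, for example df.to_csv('file_name.csv', index=False).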