EXCEL BASICS
Grasping the Excel user interface
Ribbons
Different tabs in Ribbon:
- File
- Home
- Insert
- Draw
- Page Layout
- Formulas
- Data
- Review
- View
- Help
Just hover over the features to understand what operation it does.
File Tab: This is where you can access commands related to file management such as opening,
saving, printing, and sharing your workbook. You can also set options for Excel itself from
here.
Home Tab: The Home tab contains the most frequently used commands for formatting your
spreadsheet data. It includes options for font formatting, alignment, cell styles, and editing
tools like copy, paste, and find/replace.
Insert Tab: This tab is used for inserting various elements into your spreadsheet such as tables,
charts, shapes, pictures, hyperlinks, and more. It's where you go to add new content to your
workbook.
Draw Tab: The Draw tab provides tools for freehand drawing and annotation. You can use
it to sketch or highlight elements directly on your worksheet, which can be especially useful
for presentations or collaborative editing.
Page Layout Tab: This tab deals with the layout and printing options of your spreadsheet.
Here you can set page orientation, margins, page breaks, and other settings to optimize
your document for printing or viewing.
Formulas Tab: The Formulas tab is where you can access a wide range of built-in functions
and formulas to perform calculations and data analysis in your workbook. It provides
tools for inserting, auditing, and managing formulas.
Data Tab: This tab contains commands for managing and analyzing your data. You can sort
and filter your data, import external data sources, create data connections, and perform various
data analysis tasks like pivot tables and data validation.
Review Tab: The Review tab is used for proofreading and collaborating on your spreadsheet.
It includes tools for spell check, comments, track changes, and protecting your workbook.
View Tab: This tab offers different views and settings for how you can visualize your
spreadsheet. You can switch between different workbook views like Normal, Page Layout,
and Page Break Preview, as well as adjust zoom levels and freeze panes.
Help Tab: The Help tab provides access to Excel's built-in help system where you can find
assistance on using Excel features, troubleshooting issues, and accessing online resources and
community forums for additional support.
The "Show Tabs Only" option in Excel simplifies your workspace by hiding the ribbon, leaving
only the worksheet tabs visible at the top. It offers a cleaner interface for focusing solely on
your content without distractions. You can toggle this option on and off to customize your Excel
experience based on your needs.
Types of data format:
Structured data refers to organized information with a clear format, often stored in databases
or spreadsheets in the form of tables. It follows a predefined schema and can be easily queried
and analysed. Common file extensions include .csv (Comma-Separated Values) and
.xls/.xlsx.
Unstructured data lacks a specific format and organization, making it more challenging to
analyze. Examples include text documents, emails, images, and videos. File extensions vary
widely depending on the type of data, such as .txt for text files, .jpg/.png for images, and .mp4
for videos.
Semi-structured data falls between structured and unstructured data. While it does not adhere
to a rigid schema like structured data, it contains some organizational elements, making it
more accessible than unstructured data. Examples include XML (Extensible Markup
Language) and JSON (JavaScript Object Notation) files.
Dataset taken for practice: “[Link]”
The Hospital Charges dataset comprises over 150,000 entries detailing treatments for diverse
ailments across numerous US hospitals. Key points to note from the dataset:
- The dataset has a record of 100 ailments It has a list of 3000+ healthcare providers
(hospitals, medical centres, etc.).
- It has data from 50 states of the USA and approximately 2000 cities across the USA.
- It describes the cost of treatment for approximately 7 million patients.
- Each record (or row) is a unique combination of state, city, ailment & provider.
- It also lists the average cost of treatment for a particular ailment.
Basic Excel understanding:
- Firstly, the table is organized in the form of rows and columns.
- Number of rows: 163066, Number of columns: 12
- Ctrl + Down arrow – To reach the last value of current column
- Ctrl + Right arrow – To reach the last value of current row
- Ctrl + Up arrow – To reach the first value of current column
- Ctrl + Left arrow – To reach the first value of current row
- Row numbers: 1,2,3,4…., Column Names: A,B,C,D,E,F,….
- For an empty excel sheet, there are 1048576 rows and 16384 columns (till XFD).
- First row typically has the header of the dataset.
- If you hover the mouse just above the vertical line that splits the column headers, a
plus sign appears. If you click and drag the plus sign, the width of the cell can be
changed. If you hover the mouse just above the horizontal line that splits the row
numbers, a plus sign appears. If you click and drag the plus sign, the height of the cell
can be changed.
- If you double click on the plus sign on the vertical line, the width of the column adjusts
accordingly to the size of the contents in that specific column. If you double click on
the plus sign on the horizontal line, the height of the row adjusts accordingly to the
size of the content in that specific row.
- Default value for a cell width is: 8.43 characters, 80 pixels; row height: 15 points, 24
pixels (If you hover over the vertical line that splits the cells, it shows the default width
of the cell and if you hover over the horizontal line that splits the cells, it shows the
default height of the cell).
- Now, the full dataset cannot be viewed in a single stretch without scrolling.
- We can Zoom-in or Zoom-out the sheet according to our needs.
- Selection of cells can encompass individual cells, complete rows, columns, or any
assortment of cells, irrespective of their arrangement. This enables to perform some
common operations on the selected cells.
- To choose multiple cells dispersed across the sheet, hold down the Ctrl key while
selecting each cell.
- To select the entire column, hover over the column name, the cursor changes to down
arrow, select the entire column.
- To select the entire row, hover over the row number, the cursor changes to right arrow,
select the entire row.
- To select the whole sheet, click on the triangle symbol on the left top corner of the sheet.
- For selecting continuous rows, just select the row numbers together. Similarly, to select
continuous columns, just select the column names together.
- To select multiple rows dispersed across the sheet, hold down the Ctrl key while
selecting each row. To select multiple columns dispersed across the sheet, hold down
the Ctrl key while selecting each column.
- To adjust the width of multiple columns at a time based on their content size, select
the column names and double click the plus sign. To adjust the height of multiple
rows at a time based on their content size, select the row numbers and double click the
plus sign.
- Upon selecting a cell, its contents will appear in the formula bar.
From the “hospital charges” dataset, our objective now is to locate the cheapest healthcare
provider in the city based on the disease.
Data Search:
- To locate specific content within an Excel file, utilize the "Find" feature (Ctrl + F) or
select the "Find & Select" option from the Ribbon.
- Input the desired text into the "Find What" text box.
- If the content exists, the occurrence nearest to the current cursor position will be
highlighted, allowing you to proceed with the "Find Next" option for subsequent
instances or "Find All" to display all occurrences.
- To search for a specific text within a column exclusively, select the column and click
the "Find & Select" option from the ribbon.
Filtering & Sorting:
Filters:
Filters allow you to selectively display specific data based on set criteria.
- Select any cell in column "A" and then choose "Sort & Filter -> Filter” option. You'll
notice a drop-down list near all the column headers. Short-cut for filtering is
Ctrl+Shift+L.
- Under the search box, type the text value you want to filter (“chest”). Select “313 –
CHEST PAIN”. Filters are case-insensitive.
- All the rows corresponding to “313 – CHEST PAIN” are displayed along with the
appropriate row numbers.
- The drop-down list symbol for the filtered column looks different from those of other
columns.
- To remove filter, select the drop-down and click “Clear Filter from DRG Definition”.
- The filter is removed, and all values are displayed as before.
- We can also select only the specific filter values based on the “Text Filters ->
Contains” option. Give the appropriate conditions and click OK.
- Now the values with the search text “chest” or “cardiac” are displayed.
- We can also select specific values under “Filter” apart from these conditions. To remove
filters, simply deselect the checkboxes.
- Now, we'll examine the number of hospitals treating "Chest Pain" at "California".
- From the above picture, it is evident that 201 hospitals in California treat Chest Pain.
- To add another state to the current selection, just click the filter drop-down near
“Provider State”, and then type “DC” in the search box. Click the check box “Add
current selection to filter”.
- Next, we need to identify the hospitals in California that charge higher and lower amounts
for treating chest pain.
Sorting:
- Sorting involves organizing data, often within a column, in ascending, descending, or
alphabetical order.
- “Average Covered Charges” column contains the cost values in $. So, if you sort the
column without removing $ symbol, the sorted order obtained will be in the alphabetical
order rather than the numerical order.
- To remove $ symbol, select the column, click “Find & Select” option. Click “Replace”
and then type “$” in “Find What” text box and leave it blank in the “Replace with”
text box.
- Click “Replace All” to replace all occurrence of “$” with blank space.
- Similarly, remove “$” symbol from “Average Total Payments” and “Average
Medicare Payments” columns.
- Select the filters that are beside the column names to observe how sorting is displayed
for categorical and numerical columns.
- We can also select “Sort & Filter” option from the ribbon and select “Sort Largest to
Smallest”. For “Average Covered Charges”, sorting is done in descending order.
- The analysis reveals that "DOCTORS MEDICAL CENTER" charges the highest
($79146.59) for chest pain treatment in California, while "MADERA COMMUNITY
HOSPITAL" charges ($8396.86) the least.
Multi – level Sorting:
You might need to alphabetically sort all provider cities, and within each city, sort hospitals
by cost from highest to lowest or alphabetically. However, applying the sort operation twice
won't achieve this, as the second operation overrides the first.
To accomplish this, Multi-Level Sorting is used.
- To perform multi-level sorting, begin by sorting the data within the hierarchy, followed
by sorting the outer components.
- For example, if we aim to determine the average covered charges from highest to lowest
for each provider city, we first sort the average covered charges column in descending
order. Then, we sort the provider city alphabetically to obtain the descending order of
average charges per city.
- This can also be done using “Custom Sort” for multiple levels.
- Click “Sort & Filter” option, select “Custom Sort”.
- In custom sorting, we have the option to include multiple sorting levels. Priority is
assigned to each level in the order they are specified. We specify the column names for
sorting, the criteria for sorting, and the preferred sorting order - be it ascending,
descending, or alphabetical.
- We can also specify whether the sorting must be done as Case sensitive (Capital A-Z
followed by a-z) or Case insensitive.