Data Analytics
Data analytics is the process of exploring and analyzing large datasets to make
predictions and support data-driven decision-making. It involves collecting,
cleaning, and transforming data to derive meaningful insights, and it helps to answer
questions, test hypotheses, or disprove theories.
Data analytics is used in most business sectors. Here are some primary areas
where data analytics is applied:
6. Data analytics can be used for city planning, to build smart cities.
The main types of data analytics covered here are:
1. Descriptive Analytics
It tells you what has happened. It can be done using exploratory data analysis.
Example: Studying the total units of chairs sold and the profit that was made in the
past.
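As a minimal sketch of the chair example, the snippet below summarizes past sales
using only Python's standard library. The monthly figures and the `summarize` helper
are made up for illustration and are not taken from any real dataset.

```python
from statistics import mean

# Hypothetical monthly records: (month, units_sold, profit)
sales = [
    ("Jan", 120, 2400.0),
    ("Feb", 95, 1900.0),
    ("Mar", 140, 2800.0),
    ("Apr", 110, 2200.0),
]


def summarize(records):
    """Describe what has already happened: totals, averages, best month."""
    units = [r[1] for r in records]
    profit = [r[2] for r in records]
    return {
        "total_units": sum(units),
        "mean_units": mean(units),
        "total_profit": sum(profit),
        # Month with the highest number of units sold
        "best_month": max(records, key=lambda r: r[1])[0],
    }


summary = summarize(sales)
print(summary)
```

Nothing here predicts anything; it only describes the recorded past, which is the
point of descriptive analytics.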
2. Predictive Analytics
It tells you what is likely to happen. It can be achieved by building predictive models.
Example: Predicting the total units of chairs that would sell and the profit we can
expect in the future.
3. Prescriptive Analytics
It tells you how to make something happen. It can be done by deriving key insights
and hidden patterns from the data.
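As a rough illustration, prescriptive analytics can be thought of as choosing the
action that a model says will give the best outcome. The sketch below assumes a
hypothetical linear demand curve and unit cost for chairs (both invented for this
example) and picks the price with the highest expected profit.

```python
UNIT_COST = 40.0  # assumed cost of producing one chair


def expected_units(price):
    """Hypothetical demand model: fewer chairs sell as the price rises."""
    return max(0.0, 500.0 - 5.0 * price)


def expected_profit(price):
    """Profit at a given price under the assumed demand model."""
    return (price - UNIT_COST) * expected_units(price)


# Candidate prices to evaluate, in steps of 5
candidate_prices = range(40, 101, 5)

# Recommend the price that maximizes expected profit
best_price = max(candidate_prices, key=expected_profit)
print(best_price, expected_profit(best_price))
```

In practice the demand model would itself come from the data, for example from a
predictive model fitted to past sales.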
There are primarily five steps involved in the data analytics process, which include:
2. Data Preparation: The next step in the process is to prepare the data. It
involves cleaning the data to remove unwanted and redundant values,
converting it into the right format, and making it ready for analysis. It also
requires data wrangling.
3. Data Exploration: After the data is ready, data exploration is done using
various data visualization techniques to uncover hidden trends in the data.
4. Data Modeling: The next step is to build your predictive models using
machine learning algorithms to make future predictions.
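The preparation and exploration steps above can be sketched with the standard
library alone. The rows, field names, and helper functions below are hypothetical,
and a simple aggregation stands in for the plots that would normally be used during
exploration.

```python
from collections import defaultdict

# Hypothetical raw records with typical quality problems
raw_rows = [
    {"product": "chair", "units": "12", "region": "north"},
    {"product": "chair", "units": "12", "region": "north"},  # exact duplicate
    {"product": "table", "units": "", "region": "south"},    # missing value
    {"product": "Chair ", "units": "7", "region": "south"},  # inconsistent text
    {"product": "table", "units": "3", "region": "north"},
]


def prepare(rows):
    """Drop rows with missing units, normalize text, convert types, dedupe."""
    cleaned, seen = [], set()
    for row in rows:
        if not row["units"].strip():
            continue  # skip rows where the unit count is missing
        record = (
            row["product"].strip().lower(),
            int(row["units"]),
            row["region"].strip().lower(),
        )
        if record in seen:
            continue  # assumption: identical rows are redundant copies
        seen.add(record)
        cleaned.append(record)
    return cleaned


def units_by_product(records):
    """Explore the cleaned data by totalling units per product."""
    totals = defaultdict(int)
    for product, units, _region in records:
        totals[product] += units
    return dict(totals)


clean_rows = prepare(raw_rows)
totals = units_by_product(clean_rows)
print(clean_rows)
print(totals)
```

Whether identical rows really are duplicates depends on the dataset; here that is
simply assumed.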
There are many programming languages available, but Python is widely used by
statisticians, engineers, and scientists to perform data analytics.
Here are some of the reasons why data analytics using Python has become popular:
4. Python provides libraries for graphics and data visualization to build plots.
One of the main reasons why Python has become a preferred choice for data
analytics is that it provides a wide range of libraries.
Matplotlib: The Matplotlib library is commonly used for plotting data points and
creating interactive visualizations of the data.
SciPy: The SciPy library is used for scientific computing. It contains modules for
optimization, linear algebra, integration, interpolation, special functions, and
signal and image processing.
Scikit-Learn: The Scikit-Learn library has features that allow you to build
regression, classification, and clustering models.
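The short sketch below touches each of the three libraries once. It assumes that
NumPy, SciPy, scikit-learn, and Matplotlib are installed, and it uses made-up data;
the variable names are illustrative only.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np
from scipy import integrate
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on the line y = 3x + 2
x = np.arange(1, 11, dtype=float)
y = 3.0 * x + 2.0

# SciPy: numerically integrate f(t) = t**2 over [0, 3]
area, _abs_error = integrate.quad(lambda t: t ** 2, 0.0, 3.0)

# scikit-learn: fit a regression model and predict an unseen point
model = LinearRegression()
model.fit(x.reshape(-1, 1), y)
predicted = model.predict(np.array([[11.0]]))[0]

# Matplotlib: plot the data points and the fitted line
fig, ax = plt.subplots()
ax.scatter(x, y, label="observed")
ax.plot(x, model.predict(x.reshape(-1, 1)), label="fitted line")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render to an in-memory PNG instead of opening a window
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(area, predicted)
```

In an interactive session, `plt.show()` with the default backend would display the
figure instead of writing it to a buffer.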