test (1)


1. The definition of data analytics. (Lesson 2)

- Data analytics is the process of examining and analyzing raw data sets to:

+ Draw conclusions
+ Derive more information
+ Improve businesses, products, and services

In addition to making business decisions, it is also used by data scientists and researchers
to verify scientific models and theories.

2. Present data analytics life cycle. (Lesson 2)


(1) Discovery: learn about the business domain and assess available resources.
This phase involves understanding the business problem you’re trying to solve and
what data you have available to help solve it. You’ll also need to identify any
constraints, such as budget or time.
(2) Data preparation: execute ELT (extract, load, and transform).
This phase involves collecting the data from various sources, cleaning it to remove
errors and inconsistencies, and then structuring it for analysis. This may involve
using data wrangling or ETL (extract, transform, and load) tools.
(3) Model planning: identify techniques and data to understand variable
relationships.
This phase involves determining how you’ll analyze the data. You’ll need to
choose the right statistical or machine learning techniques and identify the
variables you want to focus on. You may also want to create a hypothesis at this
stage.
(4) Model building: develop data sets for testing, training, and production.
This phase involves creating and testing your models using the prepared data.
You’ll need to split your data into training, testing, and production sets. You’ll
also need to evaluate your models to ensure they’re accurate and generalizable.
(5) Communicate results: identify key findings and business value, and develop
narratives for stakeholders.
This phase involves interpreting the results of your models and drawing
conclusions. You’ll need to identify the key findings that are relevant to the
business problem and communicate them in a way that stakeholders can
understand.
(6) Operationalize: deliver final reports, briefs, code, and technical documents.
This phase involves putting your insights into action. You may need to create
reports, dashboards, or other deliverables to communicate your findings to
stakeholders. You may also need to develop code or other technical documentation
to operationalize your models.
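
The data split described in phase (4) can be sketched as follows. This is a minimal illustration using only Python's standard library; the function name `split_data`, the 80/20 ratio, the seed, and the sample records are assumptions made for the example, not part of the lesson.

```python
import random


def split_data(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into training and testing lists."""
    shuffled = list(rows)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)      # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)  # index where the testing part starts
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for prepared records.
records = list(range(10))
train_set, test_set = split_data(records)
print(len(train_set), len(test_set))
```

Shuffling before cutting keeps the two sets from inheriting any ordering in the source data. A production set would normally be new data that arrives after the model is built, so it is not carved out here.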

3. Describe stages of data analytics. (Lesson 2)


The four stages of data analytics are descriptive analytics, diagnostic analytics, predictive
analytics, and prescriptive analytics.

1. Descriptive analytics
● Descriptive analytics is designed to access information about the past.
● It is the conventional form of analytics.
● It focuses on a summarized view of facts.
● Its purpose is to summarize the findings.
● Techniques of descriptive analytics are data aggregation and data mining.
● Data aggregation is the process of gathering and expressing information in a
summarized form.
● Tools used for data aggregation include MS Excel, MATLAB, SPSS, and STATA.
● A company report is an example of descriptive analytics.
2. Diagnostic analytics
● Diagnostic analytics helps you identify why something happened in the past.
● It takes a deeper look at data to understand the root cause of events.
● It has a limited ability to provide actionable insights.
● It provides an understanding of causal relationships and sequences.
● Diagnostic analytics techniques: drill-down, data discovery, data mining, and
correlation.
● These techniques can point to possible causal relationships between two or more
data sets.
● Diagnostic analytics is helpful for those concerned with day-to-day operations.
● For example, it helps identify why a sales representative has sold fewer items than
usual.
3. Predictive analytics
● Predicts future outcomes in terms of the probability that an event will occur.
● Analyzes sentiment: opinions posted on social media are collected to predict a
person’s sentiment.
● Identifies the target audience for a promotional campaign.
● Supports weather forecasting, plan-failure prediction, and travel product
recommender systems.
● Predictive analytics tools:
● Machine learning algorithms such as random forests and SVM, along with
statistical methods.
● Popular tools for predictive analytics include Python, R, and RapidMiner.
● Trained data scientists and machine learning experts build these models.
4. Prescriptive analytics
● It provides a solution for a predicted future outcome.
● It creates and updates the relationships between action and outcome using a
feedback system.
● It helps in making optimal recommendations during the decision-making process.
● It helps in mitigating possible risks based on the available predictive analytics.
● It has the power to suggest favorable solutions and ease the decision-making
process.
● It is the final frontier of advanced analytics.
● It is used by recommendation engines in companies.
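
Data aggregation, named above as a descriptive analytics technique, can be sketched in a few lines. This is only an illustration using Python's standard library; the `sales` records, the region names, and the `aggregate` function are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records: (region, sale amount).
sales = [("North", 120), ("South", 80), ("North", 100), ("South", 60)]


def aggregate(records):
    """Group amounts by region and summarize each group."""
    groups = defaultdict(list)
    for region, amount in records:
        groups[region].append(amount)
    return {
        region: {"total": sum(amounts), "average": mean(amounts)}
        for region, amounts in groups.items()
    }


summary = aggregate(sales)
for region, stats in summary.items():
    print(region, stats["total"], stats["average"])
```

The output is a summarized view of what already happened, which is what a company report presents. It says nothing about why the numbers differ (diagnostic) or what they will be next (predictive).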

4. List the terminologies used in data analytics. (Lesson 3)

- Observation: a single row or a record of data from the database. Any data can be
assumed as a set of observations. Besides that, observation is the unit of analytics on
which the measurements are taken. It is also known as a case, record, or row.

- Data Sampling: a statistical analysis technique used to select, manipulate, and analyze a
representative subset of data points. Data sampling identifies patterns and trends in the
larger data set. It is cost effective because only the representative sample is surveyed.
It enables data scientists, predictive modelers, and data analysts to produce
accurate findings.

- Data Set: a collection of data or the total data captured about a particular use case. It can
hold information such as medical, insurance, and loan approval records. It’s not limited to
numbers and texts and may include collections of images or videos.

- Prediction: The goal of prediction is to move from what has happened to providing the
best assessment of what will happen.
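
The idea behind data sampling can be sketched with simple random sampling, one of several sampling methods. This is a minimal example using Python's standard library; the population of 1,000 values, the sample size of 100, and the seed are assumptions chosen for illustration.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 observations.
population = list(range(1, 1001))

# Draw 100 observations without replacement; seeded so the draw is repeatable.
rng = random.Random(0)
sample = rng.sample(population, k=100)

# Compare a statistic of the sample with the same statistic of the population.
print("population mean:", mean(population))
print("sample mean:", mean(sample))
```

The sample mean will usually land near the population mean without matching it exactly. The gap tends to shrink as the sample grows, which is the trade-off between cost and accuracy that sampling manages.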

5. Describe the types of data. (Lesson 3)

Structured Data: It is the data that is processed, stored, and retrieved in a fixed format.

Example: employee details, job positions, and salaries.

Unstructured Data: It is data that lacks any specific form or structure. It is typically
text-heavy but can also contain dates, numbers, and facts. By a commonly cited estimate,
about 80% of business data is unstructured.

Example: Email

Semi-Structured Data: It is the data type containing both structured and unstructured
data.

Example: CSV and JSON documents
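
The difference between the three types can be sketched in Python. All names and values below (the employee records, the JSON text, and the email body) are hypothetical and exist only to illustrate the definitions above.

```python
import json

# Structured: every record has the same fixed set of fields.
employees = [
    {"name": "An", "position": "Analyst", "salary": 1000},
    {"name": "Binh", "position": "Engineer", "salary": 1200},
]

# Semi-structured: values are tagged with keys, but fields can vary or nest.
json_text = """
[
  {"name": "An", "skills": ["SQL", "Python"]},
  {"name": "Binh", "manager": {"name": "Chi"}}
]
"""
records = json.loads(json_text)

# Unstructured: free text with no predefined fields.
email_body = "Hi team, the quarterly report is due next Friday. Thanks, An"

# Collect the distinct field layouts found in each collection.
structured_fields = {tuple(sorted(row)) for row in employees}
semi_fields = {tuple(sorted(rec)) for rec in records}
print(len(structured_fields), len(semi_fields))
```

The structured rows share one layout, while the JSON records each carry a different set of keys. The email has no fields at all until some extra processing extracts them.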

6. Explain the levels of measurement. (Lesson 3)

Levels of measurement are a classification that describes the nature of information within
the values assigned to variables. The four levels are nominal, ordinal, interval, and ratio.

At the nominal level of measurement, the values of the variable serve only as labels that
classify data. Words, letters, and alphanumeric symbols can be used as well as numbers.
Example: People in the female gender category are classified as F and those in the male
gender category are classified as M.

The ordinal level of measurement depicts an ordered relationship among the variable’s
observations. It indicates the order of the measurements. Example: A student with a 100%
score is assigned the first rank, another student with a 95% score is assigned the
second rank, and so on.

The interval level of measurement classifies and orders the measurements. It also
specifies that the distances between each interval on the scale are equivalent. Example:
Temperature in centigrade where the distance between 80 degrees and 100 degrees is the
same as the distance between 1000 degrees and 1020 degrees.

In the ratio level of measurement, the scale has a true zero point: a value of zero
indicates the absence of the quantity being measured. Its other properties are similar to
the interval level of measurement, but the true zero makes ratios between values
meaningful, which distinguishes it from the other levels.
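
The difference between the interval and ratio levels can be checked with a short calculation. This sketch compares the Celsius scale (interval) with the Kelvin scale (ratio, because 0 K is a true zero); the two sample temperatures are arbitrary values chosen for illustration.

```python
import math


def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


low_c, high_c = 10.0, 20.0
low_k, high_k = celsius_to_kelvin(low_c), celsius_to_kelvin(high_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = high_c - low_c
diff_k = high_k - low_k
print(math.isclose(diff_c, diff_k))

# Ratios are only meaningful on the ratio scale.
ratio_c = high_c / low_c  # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = high_k / low_k  # roughly 1.035, the physically meaningful ratio
print(ratio_c, round(ratio_k, 3))
```

Because the zero of the Celsius scale is arbitrary, dividing Celsius values gives a number with no physical meaning. On the Kelvin scale the same division is valid.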

7. Describe the importance of data visualization. (Lesson 4)

- Data visualization tools provide access to trends, outliers, and patterns in data.

- Data in user-friendly charts helps businesses gain insights to make the right decisions.

- They help organize and present important findings from the data.

- Data analytics tools allow a user to present massive data sets intuitively.

- Decision makers see patterns, trends, and correlations in the data being analyzed.

- It helps decision makers in cutting costs or improving operational processes.

8. Describe types of Data Science. (Lesson 5)

The three types of data science are data analytics, machine learning, and data mining.

- Data Analytics: is the process of examining and analyzing raw data sets to:

● Draw conclusions

● Derive information

● Derive insights from raw data sources

- Machine Learning:

● Learns from patterns in the past using a set of algorithms


● Predicts outcomes accurately

- Data Mining:

● Data mining is the process of analyzing data from different perspectives.

● It summarizes data into useful information

● It helps increase revenue and cut costs.

9. Describe the various stages of data science methodology. (Lesson 6)

- Business Understanding: the first stage of the data science methodology, which
lays the foundation for a successful end result.

● This stage identifies key business sponsors, steering committee, and internal
sponsors.

● It helps understand business and customer needs and identify who needs the
analytical solution.

● It includes defining the problem, project objectives, and solution requirements
from a business perspective.

- Analytic Approach

● The analytic approach determines business requirements as well as data
requirements.

● It identifies the analytic methods, hardware and software, data content, formats,
and representations to be used.

- Data Requirements

● The requirement stage is specific to identifying necessary data with its initial
source and appropriate format.

● This stage has multiple sub-stages including data acquisition, data wrangling, data
analysis and data modeling.

- Data Collection

● In the collection stage, data scientists identify and gather the available relevant data,
because good-quality input data is required for good output.
● Data scientists evaluate the volume and properties of the data and understand the
distribution of each attribute.

● High-performance platforms and in-database analytic functionality enable data
scientists to use large data sets.

- Data Understanding

Data scientists use descriptive statistics and visualization techniques to:

● Understand data content

● Assess data quality

● Discover initial insights about the data

- Data Preparation

● The data preparation stage includes activities to construct a data set for data
modeling.

● This stage includes cleaning of data, eliminating duplicates, formatting data from
multiple sources, and transforming data into more useful variables.

● Data scientists are capable of creating explanatory variables through a
combination of domain knowledge and existing structured variables.
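
The cleaning, formatting, and duplicate-removal activities listed above can be sketched as follows. This is a minimal standard-library example; the `prepare` function, the field names, and the sample records are hypothetical, and treating the email address as the duplicate key is an assumption made for illustration.

```python
def prepare(records):
    """Normalize name and email fields and drop records with a repeated email."""
    seen_emails = set()
    cleaned = []
    for record in records:
        name = record["name"].strip().title()    # consistent spacing and capitalization
        email = record["email"].strip().lower()  # consistent case for comparison
        if email in seen_emails:                 # duplicate of an earlier record
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical records merged from two sources with inconsistent formatting.
raw = [
    {"name": "  an nguyen ", "email": "AN@example.com "},
    {"name": "An Nguyen", "email": "an@example.com"},
    {"name": "binh tran", "email": "binh@example.com"},
]
cleaned = prepare(raw)
print(cleaned)
```

Normalizing before comparing matters here: the first two records only match as duplicates once case and whitespace have been made consistent.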

- Modeling

● The modeling stage applies a predictive model to historical data to obtain the
outcome.

● This stage helps organizations gain intermediate insights and future trends, leading
to strategic improvements.

● Using exploratory data analytics, data scientists attempt multiple algorithms to
find the best model for the available data set.

- Evaluation

● Once the model is developed, data scientists evaluate the model to understand its
quality and ensure that it addresses the business problem.

● In model evaluation, diagnostic measures are computed and outputs such as tables
and graphs are evaluated.
● During the evaluation phase, the data mining results are evaluated for novelty and
usefulness.
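
Two common diagnostic measures for a binary classifier, accuracy and a confusion table, can be computed as in this sketch. The function names and the `actual` and `predicted` labels are hypothetical; they illustrate the kind of measure the evaluation stage refers to, not a prescribed method.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


# Hypothetical labels from a held-out test set.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]

print(accuracy(actual, predicted))          # 0.6
print(confusion_counts(actual, predicted))  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 1}
```

Accuracy alone can hide which kind of error the model makes. The confusion counts show whether mistakes are mostly missed positives or false alarms, which helps judge whether the model addresses the business problem.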

- Deployment

● In the deployment stage, a satisfactory model should be deployed into the
production environment.

● It involves multiple groups, skills, and technologies.

● It requires planning on how knowledge can be propagated to users.

- Maintenance: in this phase, identify:

● What could change in the environment?

● How will the accuracy be monitored?

● When should the data mining model not be used?

● Will business objectives change over time?

● What kind of report is required?

● Were initial data mining goals met?

● Who will be target groups for reports?

- Feedback

In this last stage of feedback, review the whole framework by:

● Interviewing people involved in the project

● Interviewing end users and identifying improvement areas

● Summarizing the feedback and documenting the experience
