Data Analytics Notes

The document outlines key characteristics of data in analytics, including volume, variety, velocity, and veracity, which impact analytical outcomes. It details the steps in the discovery and data preparation phases of the analytics lifecycle, emphasizing problem definition, data collection, and cleaning. Additionally, it explains linear regression as a statistical method for modeling relationships and describes a Digital Analytics Sandbox for testing data in a secure environment.

Uploaded by

tejasshelar198l
1. What are the characteristics of data in data analytics?

Data is the foundation of data analytics, and its characteristics play a crucial role in determining the
accuracy, reliability, and effectiveness of analytical outcomes. Below are the key characteristics:

- **Volume**: Large amounts of data require efficient storage and processing techniques.
- **Variety**: Includes structured, semi-structured, and unstructured data.
- **Velocity**: Speed at which data is generated and processed.
- **Veracity**: Accuracy and reliability of data.
- **Value**: Usefulness of data for decision-making.
- **Variability**: Changes in data patterns over time.
- **Completeness**: Extent to which no critical data is missing.
- **Timeliness**: How up to date the data is.
- **Accessibility**: Ease of data retrieval and sharing.
- **Granularity**: Level of detail in the data.
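
Some of these characteristics can be measured directly. The sketch below is a minimal illustration in plain Python, assuming a small hypothetical list of order records (the field names `order_id`, `amount`, and `updated` are invented for the example). It treats completeness as the share of non-missing values in a field and timeliness as the number of days since the most recent update; other definitions are possible.

```python
from datetime import date

# Hypothetical records; None marks a missing value.
records = [
    {"order_id": 1, "amount": 120.0, "updated": date(2024, 1, 10)},
    {"order_id": 2, "amount": None, "updated": date(2024, 1, 12)},
    {"order_id": 3, "amount": 75.5, "updated": date(2024, 1, 15)},
    {"order_id": 4, "amount": 60.0, "updated": None},
]


def completeness(rows, field):
    """Fraction of rows in which `field` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def staleness_days(rows, field, today):
    """Days between `today` and the most recent non-missing date in `field`."""
    dates = [row[field] for row in rows if row.get(field) is not None]
    if not dates:
        return None
    return (today - max(dates)).days


for name in ("order_id", "amount", "updated"):
    print(name, completeness(records, name))

print("days since last update:",
      staleness_days(records, "updated", date(2024, 1, 20)))
```

Here `amount` and `updated` are each 75% complete, and the newest record is 5 days old relative to the assumed reference date.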

---

2. What are the steps in the discovery phase of data analytics?

- **Define the Business Problem**: Identify challenges and goals.
- **Identify Key Metrics**: Determine relevant KPIs.
- **Understand Data Requirements**: Identify necessary data sources.
- **Data Collection and Exploration**: Gather and examine data.
- **Assess Data Quality**: Check the data for completeness, accuracy, and consistency.
- **Identify Analytical Approaches**: Choose statistical or machine learning methods.
- **Develop Hypotheses**: Formulate and test assumptions.
- **Prepare a Project Plan**: Define timelines and responsibilities.
- **Communicate Findings**: Present initial insights.

---

3. What are the steps in the data preparation phase of data analytics lifecycle?

- **Data Collection**: Gather data from multiple sources.
- **Data Integration**: Merge data from different sources and formats into a single dataset.
- **Data Cleaning**: Handle missing values, remove duplicates, and resolve inconsistencies.
- **Data Transformation**: Normalize and encode variables.
- **Feature Engineering**: Create new relevant features.
- **Data Reduction**: Reduce dataset size while maintaining accuracy.
- **Data Validation**: Ensure correctness and consistency.
- **Data Storage**: Store processed data securely.
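
Several of the steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two sources, the `customer_id` join key, and every field name are hypothetical, and filling missing values with the mean is just one of many possible cleaning choices. Data reduction and storage are left out.

```python
# Two hypothetical sources sharing a customer_id key.
customers = [
    {"customer_id": 1, "region": "north"},
    {"customer_id": 2, "region": "south"},
    {"customer_id": 3, "region": "north"},
]
orders = [
    {"customer_id": 1, "spend": 200.0, "visits": 4},
    {"customer_id": 2, "spend": None, "visits": 2},  # missing value
    {"customer_id": 2, "spend": None, "visits": 2},  # duplicate row
    {"customer_id": 3, "spend": 100.0, "visits": 5},
]


def prepare(customers, orders):
    # Integration: attach each customer's region to their order rows.
    region_by_id = {c["customer_id"]: c["region"] for c in customers}
    rows = [{**o, "region": region_by_id[o["customer_id"]]} for o in orders]

    # Cleaning: drop exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Cleaning: fill missing spend with the mean of the known values.
    known = [r["spend"] for r in unique if r["spend"] is not None]
    mean_spend = sum(known) / len(known)
    for r in unique:
        if r["spend"] is None:
            r["spend"] = mean_spend

    # Transformation: min-max scale spend to [0, 1], one-hot encode region.
    lo = min(r["spend"] for r in unique)
    hi = max(r["spend"] for r in unique)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    regions = sorted({r["region"] for r in unique})
    for r in unique:
        r["spend_scaled"] = (r["spend"] - lo) / span
        for name in regions:
            r[f"region_{name}"] = 1 if r["region"] == name else 0
        # Feature engineering: spend per visit (assumes visits > 0).
        r["spend_per_visit"] = r["spend"] / r["visits"]

    # Validation: no missing values should remain.
    assert all(v is not None for r in unique for v in r.values())
    return unique


prepared = prepare(customers, orders)
for row in prepared:
    print(row)
```

The duplicate order row is dropped, the missing `spend` is filled with 150.0 (the mean of 200.0 and 100.0), and each remaining row gains a scaled value, two region indicator columns, and a derived `spend_per_visit` feature.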

---

4. What is Linear Regression?

Linear Regression is a statistical method used to model the relationship between a dependent
variable (Y) and one or more independent variables (X).

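- **Simple Linear Regression**: Y = mX + c, where m is the slope and c is the intercept (one independent variable)
- **Multiple Linear Regression**: Y = b0 + b1X1 + b2X2 + ... + bnXn, where b0 is the intercept and b1, ..., bn are the coefficients of X1, ..., Xn (multiple independent variables)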

**Example**: Predicting sales based on advertising spend.

**Applications**: Used in business forecasting, finance, healthcare, and engineering.
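
As a rough illustration of the sales example, the sketch below fits Y = mX + c by ordinary least squares in plain Python. The function name and the advertising and sales figures are made up for the example, and the data was chosen to lie exactly on a line so the result is easy to check by hand; real data would be noisy.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (m, c) that minimize the squared error of y = m*x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope
    c = mean_y - m * mean_x    # intercept
    return m, c


# Made-up figures: advertising spend and sales, both in thousands.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

m, c = fit_simple_linear_regression(ad_spend, sales)
predicted = m * 6.0 + c
print(f"m={m:.2f}, c={c:.2f}, predicted sales at spend 6.0: {predicted:.2f}")
# prints: m=2.00, c=10.00, predicted sales at spend 6.0: 22.00
```

The slope m is read as the expected change in sales for each extra unit of advertising spend, and c as the expected sales when spend is zero.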

---

5. What is a Digital Analytics Sandbox?

A Digital Analytics Sandbox is a secure, isolated environment where data and analytics tools are tested before they are
applied in real-world scenarios.

**Purpose**:
- Experimenting with analytics tools.
- Ensuring data accuracy before deployment.
- Enhancing data privacy and security.

**Example**:
A retail company tests a customer tracking system before launching it on its website.

**Applications**:
- Website & App Analytics
- E-commerce Optimization
- Marketing Campaign Analysis
- Big Data Processing
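
One way a sandbox might be populated, shown here only as a hypothetical sketch, is to copy a sample of production records and replace direct identifiers before anyone experiments on them. The function names, the `email` and `page_views` fields, and the salt value are all assumptions made for the example. A salted hash gives pseudonymization rather than full anonymization, so a real deployment would need a proper privacy review.

```python
import hashlib
import random


def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest (truncated)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def build_sandbox_copy(rows, sample_size, salt, seed=0):
    """Return a random sample of rows with the email field pseudonymized.

    New dictionaries are built, so the input rows are left unchanged.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return [{**row, "email": pseudonymize(row["email"], salt)} for row in sample]


# Hypothetical production records.
production = [
    {"email": "a@example.com", "page_views": 12},
    {"email": "b@example.com", "page_views": 3},
    {"email": "c@example.com", "page_views": 7},
]

sandbox = build_sandbox_copy(production, sample_size=2, salt="demo-salt")
for row in sandbox:
    print(row)
```

Experiments then run against `sandbox`, while `production` keeps its original values.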
