Analytics Tools:
● Common tools: Excel, Tableau, Power BI, Python, R, SAS, Google Analytics, and SQL.
● Purpose: Used for data collection, analysis, reporting, and visualization to gain insights,
make predictions, or optimize business decisions.
Data Types:
● Quantitative: Numerical data (e.g., sales figures, temperatures).
● Qualitative: Descriptive data (e.g., customer feedback).
● Structured: Data organized under a predefined schema (e.g., database tables).
● Unstructured: Data with no predefined schema (e.g., emails, social media posts).
Correlation:
● Definition: Measures the strength and direction of the (typically linear) relationship
between two variables, usually as a coefficient between -1 and +1.
● Types:
○ Positive correlation: Variables move in the same direction.
○ Negative correlation: Variables move in opposite directions.
○ No correlation: No discernible relationship.
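As a rough illustration of the positive and negative cases above, the Pearson correlation coefficient can be computed by hand. This is a minimal sketch: the function name pearson_r and all data values are hypothetical, chosen only to show the sign of the result.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

ad_spend = [1, 2, 3, 4, 5]     # hypothetical values
sales = [2, 4, 6, 8, 10]       # rises as ad_spend rises
returns = [10, 8, 6, 4, 2]     # falls as ad_spend rises

r_pos = pearson_r(ad_spend, sales)    # close to +1: positive correlation
r_neg = pearson_r(ad_spend, returns)  # close to -1: negative correlation
```

A value near 0 would indicate no linear relationship. Correlation alone does not show that one variable causes the other.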
Primary Key vs Foreign Key:
● Primary Key: A unique identifier for each record in a database table.
● Foreign Key: A field in one table that links to the primary key in another table, creating
a relationship between the two tables.
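The relationship between the two key types can be sketched with Python's built-in sqlite3 module. The table and column names (customers, orders, customer_id) are hypothetical examples, not part of any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,   -- primary key: unique per record
        name        TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),  -- foreign key
        amount      REAL
    )
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# The join follows the foreign key back to the matching primary key.
rows = conn.execute("""
    SELECT o.order_id, c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
""").fetchall()

# An order that points at a nonexistent customer violates the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (101, 999, 10.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```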
Big Data:
● Definition: Extremely large and complex datasets that are difficult to process with
traditional methods.
● Characteristics: Often described by the 5 V’s—Volume, Velocity, Variety, Veracity,
and Value.
● Tools: Hadoop, Spark, and cloud services such as AWS and Google BigQuery.
Regression:
● Definition: A statistical method for modeling the relationship between a dependent
variable and one or more independent variables, often used for prediction.
● Types:
○ Linear regression: Predicts a continuous outcome based on linear relationships.
○ Logistic regression: Predicts the probability of a binary outcome (e.g., Yes/No).
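For the linear case, a one-variable least-squares fit can be sketched in plain Python. The helper fit_line and the data are hypothetical; the data lie exactly on a line so the result is easy to check. Logistic regression is not shown here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

months = [1, 2, 3, 4]     # independent variable (hypothetical)
revenue = [3, 5, 7, 9]    # dependent variable, exactly 2 * month + 1

slope, intercept = fit_line(months, revenue)
forecast = slope * 5 + intercept   # predicted revenue for month 5
```

With real data the points will not sit exactly on the line, and the fitted slope and intercept are the values that minimize the squared prediction errors.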
Outliers/Anomalies:
● Outlier: A data point that is significantly different from other data points in a dataset.
● Anomalies: Irregular or unexpected patterns that do not conform to expected behavior.
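One common way to flag outliers is the interquartile range (IQR) rule, sketched below. The rule and the 1.5 multiplier are a widely used convention rather than something prescribed above; the function name and data are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50% of the data."""
    q1, _, q3 = quantiles(values, n=4)   # first quartile, median, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

daily_orders = [10, 12, 11, 13, 12, 11, 10, 95]   # hypothetical data
flagged = iqr_outliers(daily_orders)              # flags only 95
```

A flagged value is a candidate for review, not automatically an error: it may be a data-entry mistake or a genuine but unusual event.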
Data Visualization:
● Definition: The graphical representation of data to reveal patterns, trends, and insights.
● Tools: Tableau, Power BI, Google Data Studio (now Looker Studio), D3.js.
● Common charts: Bar charts, pie charts, histograms, line graphs, scatter plots, heat
maps.
Tableau Functions:
● Basic functions: SUM(), AVG(), MIN(), MAX(), COUNT(), IIF() (or IF ... THEN ... END), ZN().
● Advanced: Calculated fields, LOD (Level of Detail) expressions, RANK(),
WINDOW_SUM(), DATEPARSE(), and filtering and aggregation tools.
Analytics Types:
1. Descriptive Analytics: Summarizes past data (e.g., historical sales).
2. Diagnostic Analytics: Explains why something happened (e.g., root cause analysis).
3. Predictive Analytics: Uses data models to predict future outcomes (e.g., demand
forecasting).
4. Prescriptive Analytics: Provides recommendations for action based on data (e.g.,
optimizing processes).
Hypothesis Testing:
● Definition: A statistical method for testing an assumption about a population parameter.
● Process:
1. Formulate null (H0) and alternative (H1) hypotheses.
2. Set significance level (e.g., 0.05).
3. Calculate the test statistic and P-value.
4. Compare P-value to significance level to reject or fail to reject H0.
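The four steps above can be sketched as a two-sided one-sample z-test. This assumes the population standard deviation is known, which is a simplification (a t-test is the usual choice when it is not). The function name, sample values, and parameters mu0 and sigma are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test of H0: population mean == mu0."""
    # Step 3: test statistic and P-value.
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: reject H0 only if the P-value is below the significance level.
    return z, p_value, p_value < alpha

# Step 1: H0 says the mean is 50; H1 says it is not.
# Step 2: significance level alpha = 0.05 (the default above).
sample = [52, 55, 53, 56, 54, 55, 53, 54, 56]   # hypothetical measurements
z, p_value, reject_h0 = z_test(sample, mu0=50, sigma=3)
```

Here the sample mean is well above 50 relative to its standard error, so H0 is rejected. When the P-value is not below alpha, the conclusion is "fail to reject H0", which is not the same as proving H0 true.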
Benford’s Law:
● Definition: Predicts the distribution of the first digits in many real-life datasets: smaller
digits lead more often, with 1 as the leading digit about 30% of the time and 9 under 5%.
The expected share of leading digit d is log10(1 + 1/d). Used in fraud detection.
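A minimal sketch of a Benford check compares observed leading-digit shares with the expected share log10(1 + 1/d) for each digit d. The function names are hypothetical, the input is assumed to be positive integers, and the powers of 2 are only a stand-in for real transaction amounts.

```python
from collections import Counter
from math import log10

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit_shares(amounts):
    """Observed share of each leading digit; assumes positive integers."""
    counts = Counter(int(str(a)[0]) for a in amounts)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

expected = benford_expected()
amounts = [2 ** n for n in range(30)]     # stand-in data, not real transactions
observed = leading_digit_shares(amounts)

# Large gaps between observed and expected shares may warrant a closer look.
gaps = {d: observed[d] - expected[d] for d in range(1, 10)}
```

A deviation from the expected shares is a prompt for further investigation, not proof of fraud, and the law does not apply to every dataset (for example, values constrained to a narrow range).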
Goal Seek:
● Definition: An Excel tool used to find the input value needed to achieve a specific goal
or result (e.g., finding the interest rate needed for a target future value).
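The idea behind Goal Seek can be imitated in Python with a simple bisection search. This is a stand-in under stated assumptions, not Excel's own algorithm: goal_seek and future_value are hypothetical names, and the search requires that the target lies between the function's values at the two starting bounds.

```python
def goal_seek(func, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] where func(x) is approximately target, by bisection."""
    f_low = func(low) - target
    f_high = func(high) - target
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = func(mid) - target
        if abs(f_mid) < tol or (high - low) / 2 < tol:
            return mid
        # Keep the half-interval whose endpoints still bracket the target.
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2

def future_value(rate, present_value=1000.0, years=10):
    """Future value with annual compounding (hypothetical example inputs)."""
    return present_value * (1 + rate) ** years

# Which annual rate turns 1,000 into 2,000 over 10 years?
rate = goal_seek(future_value, target=2000.0, low=0.0, high=1.0)
```

The result is roughly 7.2% per year, which matches the closed-form answer 2 ** (1 / 10) - 1.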
Vertical Analysis:
● Definition: A method of financial statement analysis where each line item is listed as a
percentage of a base figure (e.g., each income statement item as a percentage of
revenue).
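As a small worked example, each income statement line can be divided by revenue to get its common-size percentage. The function name and all figures are hypothetical.

```python
def vertical_analysis(statement, base_item):
    """Express every line item as a percentage of the chosen base figure."""
    base = statement[base_item]
    return {item: round(100 * amount / base, 1) for item, amount in statement.items()}

income_statement = {          # hypothetical figures
    "Revenue": 200_000,
    "Cost of goods sold": 120_000,
    "Operating expenses": 50_000,
    "Net income": 30_000,
}

common_size = vertical_analysis(income_statement, "Revenue")
# Revenue 100.0, Cost of goods sold 60.0, Operating expenses 25.0, Net income 15.0
```

For a balance sheet, the same function could be called with total assets as the base item.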
P-value:
● Definition: The probability of obtaining a result at least as extreme as the one observed,
assuming the null hypothesis is true. In hypothesis testing, if the P-value is less than the
significance level (e.g., 0.05), the null hypothesis is rejected.
Du Pont Analysis:
● Definition: A method for analyzing a company's return on equity (ROE) by breaking it
down into three components whose product equals ROE:
1. Profit margin (Net Income / Sales)
2. Asset turnover (Sales / Total Assets)
3. Financial leverage (Total Assets / Equity)
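The three ratios multiply to give ROE, because Sales and Total Assets cancel and leave Net Income / Equity. A minimal sketch with hypothetical figures:

```python
def dupont(net_income, sales, total_assets, equity):
    """Break ROE into profit margin, asset turnover, and financial leverage."""
    profit_margin = net_income / sales
    asset_turnover = sales / total_assets
    financial_leverage = total_assets / equity
    return {
        "profit_margin": profit_margin,
        "asset_turnover": asset_turnover,
        "financial_leverage": financial_leverage,
        "roe": profit_margin * asset_turnover * financial_leverage,
    }

# Hypothetical company figures.
result = dupont(net_income=30_000, sales=200_000, total_assets=400_000, equity=250_000)
# Margin 0.15, turnover 0.5, leverage 1.6, so ROE is about 0.12 (12%).
```

The breakdown shows whether ROE is driven by profitability, efficient use of assets, or debt-funded leverage.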