Business Analytics Notes
Introduction
The word analytics has come to the foreground in the last decade or so. The growth of the
internet and information technology has made analytics highly relevant in the current age.
Analytics is a field that combines data, information technology, statistical analysis,
quantitative methods, and computer-based models.
These are combined to give decision makers the possible scenarios they need to make a
well-considered, well-researched decision. The computer-based model lets decision makers
see how a decision performs under various scenarios.
Meaning
Business analytics (BA) is a set of disciplines and technologies for solving business problems
using data analysis, statistical models and other quantitative methods. It involves an iterative,
methodical exploration of an organization's data, with an emphasis on statistical analysis, to
drive decision-making.
• forecasting future business needs, performance, and industry trends with predictive
modelling; and
Definition
Business analytics (BA) refers to the skills, technologies, and practices for continuous
iterative exploration and investigation of past business performance to gain insight and
drive business planning. Business analytics focuses on developing new insights and
understanding of business performance based on data and statistical methods.
Business Analytics is the process of transforming data into insights to improve
business decisions. Data management, data visualization, predictive modelling, data
mining, forecasting simulation, and optimization are some of the tools used to create
insights from data.
Business analytics has existed for a long time and has evolved with the
availability of newer and better technologies. It has its roots in operations research,
which was used extensively during World War II.
Analytics has been used in business since management exercises were put into
place by Frederick Winslow Taylor in the late 19th century.
Henry Ford measured the time of each component in his newly established assembly
line. But analytics began to command more attention in the late 1960s, when computers
were used in decision support systems.
Since then, analytics has changed and matured with the development of enterprise
resource planning (ERP) systems, data warehouses, and a large number of other
software tools and processes.
In later years, business analytics expanded rapidly with the introduction of computers. This
change has brought analytics to a whole new level and opened up many new possibilities.
Given how far analytics has come and what the field is today, many people would never
think that analytics started in the early 1900s with Mr. Ford himself.
As the economies started developing and companies became more and more competitive,
management science evolved into business intelligence, decision support systems and into PC
software.
Business analytics has a wide range of applications and uses. It can be used for descriptive
analysis, in which data is used to understand the past and present situation. This kind of
descriptive analysis is used to assess the current market position of the company and the
effectiveness of previous business decisions.
It is used for predictive analysis, which typically draws on previous business performance to
forecast future outcomes.
Business analytics is also used for prescriptive analysis, which applies optimization
techniques to achieve stronger business performance.
For starters, business analytics is the tool your company needs to make accurate decisions.
These decisions are likely to impact your entire organization as they help you to improve
profitability, increase market share, and provide a greater return to potential shareholders.
While some companies are unsure what to do with large amounts of data, business analytics
works to turn this data into actionable insights that improve the decisions you make as a
company.
Essentially, the four main ways business analytics is important, no matter the industry, are:
Improves performance by giving your business a clear picture of what is and isn’t working
Provides faster and more accurate decisions
Minimizes risks as it helps a business make the right choices regarding consumer
behaviour, trends, and performance
Inspires change and innovation by answering questions about the consumer.
• Business analytics uses data from three sources for construction of the business model.
It uses business data such as annual reports, financial ratios, marketing research, etc. It
uses the database which contains various computer files and information coming from
data analysis.
Apart from having applications in various arenas, following are the benefits of Business
Analytics and its impact on business –
Moreover, any technology is subject to its own set of problems and challenges. Following are
the challenges in implementing business analytics in an organization.
Business analytics is possible only with a large volume of data. It is sometimes difficult
to obtain a large volume of data without questioning its integrity.
Business analytics depends on sufficient volumes of high-quality data.
The difficulty in ensuring data quality is integrating and reconciling data across
different systems, and then deciding what subsets of data to make available.
Business analytics is now becoming a tool that can influence the outcome of customer
interactions. When a specific customer type is considering a purchase, an
analytics-enabled enterprise can modify the sales pitch to appeal to that consumer. This
means the storage for all that data must respond extremely fast to provide the
necessary data in real time.
Application
Business analytics has a wide range of applications, from customer relationship management,
financial management, marketing, supply-chain management, and human-resource
management to pricing and even sports, through team game strategies.
In healthcare, business analysis can be used to operate and manage clinical information
systems. It can transform medical data from a bewildering array of analytical methods into
useful information. Data analysis can also be used to generate contemporary reporting systems
which include the patient's latest key indicators, historical trends and reference values.
• Decision analytics: supports human decisions with visual analytics that the user models
to reflect reasoning.
• Descriptive analytics: gains insight from historical data with reporting, scorecards,
clustering etc.
• Predictive analytics: employs predictive modelling using statistical and machine
learning techniques
• Prescriptive analytics: recommends decisions using optimization, simulation, etc.
• Behavioural analytics
• Cohort analysis
• Competitor analysis
• Cyber analytics
• Enterprise optimization
• Financial services analytics
• Fraud analytics
• Health care analytics
• Key Performance Indicators (KPIs)
• Marketing analytics
• Pricing analytics
• Retail sales analytics
• Risk & Credit analytics
• Supply chain analytics
• Talent analytics
• Telecommunications
• Transportation analytics
• Customer Journey Analytics
• Market Basket Analysis
The aim of business analytics is data and reporting—examining past business performance and
forecasting future business performance. On the other hand, the business analysis focuses on
functions and processes—determining business requirements and suggesting solutions.
Business analysis is the practice of assisting firms in resolving their technical difficulties by
understanding, defining, and solving those issues.
The activities that are carried out while performing Business Analysis:
• Company analysis: Business analysis aims at figuring out the requirements of a firm
in general and its strategic direction and determining the initiatives that will enable the
business to address those strategic goals.
• Requirements planning and management: It focuses on planning the requirements
of the development process, identifying what the top priority is for execution, and
managing the changes.
• Requirements elicitation: It outlines techniques for collecting needs from relevant
members of the project team.
• Requirements analysis and documentation: It explains how to establish and define
the needs in detail to allow them to be effectively carried out by the team.
• Requirements communication: Business analysis explains methods to help stakeholders
have a shared understanding of the needs and how they will be carried out.
• Solution assessment and validation: It also explains how a business analyst can execute
a suggested solution, how to support the execution of a solution, and how to evaluate
possible flaws in the implementation.
Business analytics is also known as data analytics. It is a process of collecting, evaluating, and
drawing valuable outcomes from the enormous amount of data available. Business analytics is
widely used in the following applications:
• Finance
• Marketing
• HR
• CRM
• Manufacturing
• Banking and Credit Cards
Most people believe that business analysis and analytics are the same, but they are not! The
primary differences between business analysis and business analytics:
business needs.
• It is employed to figure out the organizational needs and possible problems to have
productive outcomes.
• Here, the tasks are carried out by Functional Analysts, Systems Analysts, and Business
Analysts.
• Business, functional, and domain skills are needed to perform business analysis.
• The architectural domains for business analysis include enterprise architecture, process
architecture, technology architecture, and organization architecture.
Business Analytics
Business analysis and business analytics have some commonalities. They both:
Business analysis is a practice of identifying business requirements and figuring out solutions
to specific business problems. This has a heavy overlap with the analysis of business needs to
function normally and to enhance how they function. Sometimes, the solutions include a
system’s development feature. It can also incorporate business change, process enhancement
or strategic planning, and policy improvement.
On the contrary, business analytics is all about the group of tools, techniques, and skills that
help the investigation of previous business performance. It also aids to gain insights into future
performance. In general, business analytics aims mostly at data and statistical analysis.
Categorization of Analytical Models
1. Descriptive Analytics
It summarizes an organisation’s existing data to understand what has happened in the past or
is happening currently. Descriptive Analytics is the simplest form of analytics as it employs
data aggregation and mining techniques. It makes data more accessible to members of an
organisation such as the investors, shareholders, marketing executives, and sales managers.
It can help identify strengths and weaknesses and provides an insight into customer behaviour
too. This helps in forming strategies that can be developed in the area of targeted marketing.
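As a minimal sketch of the data aggregation described above, the following Python snippet summarizes past sales per region. The records, the field layout, and the function name summarize_by_region are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records: (region, month, revenue). Illustrative values only.
sales = [
    ("North", "Jan", 120.0),
    ("North", "Feb", 150.0),
    ("South", "Jan", 90.0),
    ("South", "Feb", 60.0),
]


def summarize_by_region(records):
    """Aggregate revenue per region into total, average, and count."""
    grouped = defaultdict(list)
    for region, _month, revenue in records:
        grouped[region].append(revenue)
    return {
        region: {
            "total": sum(values),
            "average": mean(values),
            "count": len(values),
        }
        for region, values in grouped.items()
    }


summary = summarize_by_region(sales)
for region, stats in sorted(summary.items()):
    print(region, stats)
```

A summary like this answers "what happened" only; it says nothing about why the regions differ or what will happen next.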
2. Diagnostic Analytics
This type of Analytics helps shift focus from past performance to the current events and
determine which factors are influencing trends. To uncover the root cause of events, techniques
such as data discovery, data mining and drill-down are employed. Diagnostic analytics makes
use of probabilities, and likelihoods to understand why events may occur. Techniques such as
sensitivity analysis and training algorithms are employed for classification and regression.
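The drill-down technique mentioned above can be sketched roughly as follows: an overall change between two periods is broken down by product to see which segment drives it. The revenue figures, the period labels, and the function name drill_down are assumptions made for this example.

```python
# Hypothetical revenue by (quarter, product). Illustrative values only.
revenue = {
    ("Q1", "Laptops"): 500.0,
    ("Q1", "Phones"): 300.0,
    ("Q2", "Laptops"): 510.0,
    ("Q2", "Phones"): 210.0,
}


def drill_down(data, before, after):
    """Return the per-product change between two periods, largest drop first."""
    products = {product for (_period, product) in data}
    changes = {
        product: data.get((after, product), 0.0) - data.get((before, product), 0.0)
        for product in products
    }
    # Sort ascending so the most negative change (the likely root cause) comes first.
    return sorted(changes.items(), key=lambda item: item[1])


changes = drill_down(revenue, "Q1", "Q2")
total_change = sum(delta for _product, delta in changes)
print("Total change:", total_change)
for product, delta in changes:
    print(product, delta)
```

Here the overall decline is traced to a single product line, which is the kind of question diagnostic analytics tries to answer before any forecasting is attempted.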
3. Predictive Analytics
This type of Analytics is used to forecast the possibility of a future event with the help of
statistical models and ML techniques. It builds on the result of descriptive analytics to devise
models to extrapolate the likelihood of items. To run predictive analysis, Machine Learning
experts are employed. They can achieve a higher level of accuracy than by business intelligence
alone.
One of the most common applications is sentiment analysis. Here, existing data collected from
social media is used to build a comprehensive picture of a user's opinion. This data is
analysed to predict the user's sentiment (positive, neutral, or negative).
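To illustrate forecasting from historical data with a statistical model, here is a minimal sketch that fits a straight-line trend by ordinary least squares and extrapolates one period ahead. It is a deliberately simple stand-in for the statistical and ML techniques named above; the monthly figures and the function name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical units sold in months 1 to 5. Illustrative values only.
months = [1, 2, 3, 4, 5]
units_sold = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, units_sold)
forecast_month_6 = slope * 6 + intercept
print("Forecast for month 6:", forecast_month_6)
```

Real predictive models would usually include more variables and a check of the fit on held-out data; a single trend line only shows the basic idea of extrapolating from the past.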
4. Prescriptive Analytics
Going a step beyond predictive analytics, it provides recommendations for the next best action
to be taken. It suggests all favourable outcomes according to a specific course of action and
also recommends the specific actions needed to deliver the most desired result. It mainly relies
on two things, a strong feedback system and a constant iterative analysis. It learns the relation
between actions and their outcomes. One common use of this type of analytics is to create
recommendation systems.
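The step from prediction to recommendation can be sketched as a small optimization: each candidate action is scored with a predictive model, and the action with the best expected outcome is recommended. The linear demand model, its parameters, and the candidate prices below are all assumptions made for illustration.

```python
def expected_profit(price, unit_cost=6.0, base_demand=100.0, sensitivity=8.0):
    """Assumed linear demand model: demand falls as the price rises."""
    demand = max(base_demand - sensitivity * price, 0.0)
    return (price - unit_cost) * demand


def recommend_price(candidates):
    """Recommend the candidate price with the highest expected profit."""
    return max(candidates, key=expected_profit)


# Hypothetical candidate actions: prices the business is willing to consider.
candidate_prices = [7.0, 8.0, 9.0, 10.0, 11.0]

best_price = recommend_price(candidate_prices)
print("Recommended price:", best_price)
print("Expected profit:", expected_profit(best_price))
```

In practice the feedback loop mentioned above would update the demand model as actual sales come in, so the recommendation changes as the relation between actions and outcomes is learned.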
Business Analytics Tools
Business Analytics tools help analysts to perform the tasks at hand and generate reports which
may be easy for a layman to understand. These tools can be obtained from open source
platforms, and enable business analysts to manage their insights in a comprehensive manner.
They tend to be flexible and user-friendly. Common business analytics tools include:
• Python is very flexible and can also be used for web scripting. It is mainly applied
when the analyzed data needs to be integrated with a web application or the
statistics are to be used in a production database. The IPython Notebook makes it
easy to work with Python and data. One can share notebooks with other people
without necessarily requiring them to install anything, which reduces
code-organizing overhead.
• SAS: The tool has a user-friendly GUI and can churn through terabytes of data with
ease. It comes with an extensive documentation and tutorial base which can help
early learners get started seamlessly.
• R is open-source software and is completely free to use, making it easier for
individual professionals or students starting out to learn. Graphical capability, or
data visualization, is the strongest forte of R, which has access to packages like
ggplot2, RGIS, lattice, and ggvis, among others, that provide superior graphical
competency.
• Tableau is one of the most popular and advanced data visualization tools in the
market. Storytelling and presenting data insights in a comprehensive way has become
one of the trademarks of a competent business analyst. Tableau is a great platform
for developing customized visualizations quickly, thanks to its drag-and-drop features.
Python, R, SAS, Excel, and Tableau have all got their unique places when it comes to usage.
Data Scientist vs. Data Engineer vs. Data Analyst
1. Data scientists use their advanced statistical skills to help improve the models the data
engineers implement and to put proper statistical rigour on the data discovery and analysis the
customer is asking for.
Companies extract data to analyze and gain insights about various trends and practices.
In order to do so, they employ specialized data scientists who possess knowledge of
statistical tools and programming skills. Moreover, a data scientist possesses
knowledge of machine learning algorithms.
However, data science is not a singular field. It is a quantitative field that shares its
background with math, statistics, and computer programming. With the help of data
science, industries are able to make careful data-driven decisions.
Machine learning algorithms are responsible for predicting future events. Therefore, data
science can be thought of as an ocean that includes all the data operations, such as data
extraction, data processing, data analysis, and data prediction, to gain necessary insights.
To become a Data Scientist, you must have the following key skills –
• Should be proficient with Math and Statistics.
• Should be able to handle structured & unstructured information.
• In-depth knowledge of tools like R, Python and SAS.
• Well versed in various machine learning algorithms.
• Have knowledge of SQL (Structured Query Language) and NoSQL (commonly read as
"not only SQL" or "non-SQL").
• Must be familiar with Big Data tools.
2. A Data Engineer is a person who specializes in preparing data for analytical usage. Data
Engineering also involves the development of platforms and architectures for data processing.
In other words, a data engineer develops the foundation for various data operations. A
Data Engineer is responsible for designing the format for data scientists and analysts to
work on.
Data Engineers have to work with both structured and unstructured data. Therefore,
they need expertise in SQL and NoSQL databases both. Data Engineers allow data
scientists to carry out their data operations.
Data Engineers have to deal with Big Data where they engage in numerous operations
like data cleaning, management, transformation, data deduplication etc.
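As a rough illustration of the cleaning and deduplication work described above, the following sketch normalizes customer records and drops duplicates. The record layout, the choice of email as the deduplication key, and the function name clean_records are assumptions made for this example.

```python
def clean_records(records):
    """Normalize fields and drop duplicate customers, keeping the first occurrence."""
    seen_emails = set()
    cleaned = []
    for record in records:
        name = record["name"].strip().title()
        email = record["email"].strip().lower()
        # Skip rows with no usable key and rows already seen (assumed key: email).
        if not email or email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical raw input with inconsistent casing, stray spaces, and a duplicate.
raw = [
    {"name": " alice smith ", "email": "Alice@Example.com"},
    {"name": "Alice Smith", "email": "alice@example.com "},
    {"name": "bob jones", "email": "bob@example.com"},
    {"name": "no email", "email": "  "},
]

cleaned = clean_records(raw)
for row in cleaned:
    print(row)
```

At Big Data scale the same logic would typically run inside a distributed processing framework rather than a single Python loop, but the transformation steps are the same in spirit.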
A Data Engineer is more experienced with core programming concepts and algorithms.
The role of a data engineer also follows closely to that of a software engineer. This is
because a data engineer is assigned to develop platforms and architecture that utilize
guidelines of software development.
For example, developing a cloud infrastructure to facilitate real-time analysis of data requires
various development principles. Therefore, building an interface API is one of the job
responsibilities of a data engineer.
Tools used by Data Engineers
• A data analyst does not directly participate in the decision-making process; rather, he
helps indirectly through providing static insights about company performance. A data
engineer is not responsible for decision making. And, a data scientist participates in
the active decision-making process that affects the course of the company.
• A data analyst uses static modelling techniques that summarize the data through
descriptive analysis. On the other hand, a data engineer is responsible for the
development and maintenance of data pipelines. A data scientist uses dynamic
techniques like Machine learning to gain insights about the future.
• Knowledge of machine learning is not important for data analysts. However, it is
mandatory for data scientists. A data engineer does not need knowledge of
machine learning but is required to know core computing concepts
like programming and algorithms to build robust data systems.
• A data analyst only has to deal with structured data. However, both data scientists
and data engineers deal with unstructured data as well.
• Data analysts and data scientists are both required to be proficient in data visualization.
However, this is not required in the case of a data engineer.
• Neither data scientists nor data analysts need knowledge of application
development or the workings of APIs. However, this is the most essential requirement
for a data engineer.
In order to become a Data Analyst, you must possess the following skills –
• Should possess strong mathematical aptitude.
• Should be well versed with Excel, Oracle, and SQL.
• Possession of problem-solving attitude.
• Proficient in the communication of results to the team.
• Should have a strong suite of analytical skills.
• Talend: Talend is one of the most powerful data analytics tools available in the
market and is developed in the Eclipse graphical development environment. ...
• Qlik Sense. ...
• Apache Spark. ...
• Power BI. ...
• ThoughtSpot. ...
• RapidMiner. ...
• Tableau
Business Analyst
Business analysts use data to form business insights and recommend changes in businesses and
other organizations. Business analysts can identify issues in virtually any part of an
organization, including IT processes, organizational structures, or staff development.
As businesses seek to increase efficiency and reduce costs, business analytics has become an
important component of their operations. Let’s take a closer look at what business analysts do
and what it takes to get a job in business analysis.
Business analysts identify business areas that can be improved to increase efficiency and
strengthen business processes. They often work closely with others throughout the business
hierarchy to communicate their findings and help implement changes.
Tasks and duties can include:
• Identifying and prioritizing the organization's functional and technical needs and
requirements
• Using SQL and Excel to analyze large data sets
• Compiling charts, tables, and other elements of data visualization
• Creating financial models to support business decisions
• Understanding business strategies, goals, and requirements
• Planning enterprise architecture (the structure of a business)
• Forecasting, budgeting, and performing both variance analysis and financial analysis
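As a small illustration of the variance analysis mentioned in the list above, the following sketch compares budgeted and actual spending per line item. The figures, the item names, and the function name variance_report are hypothetical.

```python
def variance_report(budget, actual):
    """Compare actual spending with the budget, per line item."""
    report = {}
    for item, planned in budget.items():
        spent = actual.get(item, 0.0)
        variance = spent - planned  # positive means over budget
        percent = (variance * 100.0 / planned) if planned else None
        report[item] = {"variance": variance, "percent": percent}
    return report


# Hypothetical budget and actual figures. Illustrative values only.
budget = {"Marketing": 1000.0, "Travel": 400.0}
actual = {"Marketing": 1100.0, "Travel": 300.0}

report = variance_report(budget, actual)
for item, row in report.items():
    print(item, row)
```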
Business analyst skills
• Technical skills: These skills include stakeholder management, data modeling and
knowledge of IT.
• Analytical skills: Business analysts have to analyze large amounts of data and other
business processes to form ideas and fix problems.
• Communication: These professionals must communicate their ideas in an expressive
way that is easy for the receiver to understand.
• Problem-solving: It is a business analyst’s primary responsibility to come up with
solutions to an organization’s problems.
• Research skills: Thorough research must be conducted about new processes and
software to present results that are effective.
Business analyst responsibilities
• Analyzing and evaluating the current business processes a company has and identifying
areas of improvement
• Researching and reviewing up-to-date business processes and new IT advancements to
make systems more modern
• Presenting ideas and findings in meetings
• Training and coaching staff members
• Creating initiatives depending on the business’s requirements and needs
• Developing projects and monitoring project performance
• Collaborating with users and stakeholders
• Working closely with senior management, partners, clients and technicians
Types of Data
Qualitative vs. Quantitative Data
1. Quantitative data
• Quantitative data seems to be the easiest to explain. It answers key questions such as
“how many”, “how much” and “how often”.
• Quantitative data can be expressed as a number or can be quantified. Simply put, it can
be measured by numerical variables.
• Quantitative data are easily amenable to statistical manipulation and can be represented
by a wide variety of statistical graphs and charts, such as line graphs, bar graphs, and
scatter plots.
Examples of quantitative data:
• Scores on tests and exams, e.g. 85, 67, 90, etc.
• The weight of a person or a subject.
• Your shoe size.
• The temperature in a room.
2. Qualitative data
• Qualitative data can’t be expressed as a number and can’t be measured.
Qualitative data consist of words, pictures, and symbols, not numbers.
• Qualitative data is also called categorical data because the information can be sorted by
category, not by number.
• Qualitative data can answer questions such as “how this has happened” and “why
this has happened”.
Examples of qualitative data:
• Colors e.g. the color of the sea
• Your favorite holiday destination, such as Hawaii, New Zealand, etc.
• Names such as John, Patricia, etc.
• Ethnicity such as American Indian, Asian, etc.
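The practical difference between the two data types can be sketched in code: quantitative columns are summarized with numeric statistics, while qualitative columns are summarized by counting categories. The sample values reuse the examples above; the function name describe_column is hypothetical.

```python
from collections import Counter
from statistics import mean


def describe_column(values):
    """Summarize a column according to whether it is quantitative or qualitative."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return {
            "kind": "quantitative",
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
    # Categories cannot be averaged, so count how often each one occurs.
    return {"kind": "qualitative", "counts": dict(Counter(values))}


test_scores = [85, 67, 90]
destinations = ["Hawaii", "New Zealand", "Hawaii"]

scores_summary = describe_column(test_scores)
destinations_summary = describe_column(destinations)
print(scores_summary)
print(destinations_summary)
```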
BA LEVELS
A business analyst in an organization works at four levels:
• Strategic management: This is the analysis level, where a business analyst evaluates
and calculates the strategic position of a company. This is one of the most critical
levels because unless the evaluation is done accurately, none of the further steps can
work appropriately.
• Analysis of business model: This level has to do with evaluating policies that are
currently being employed by the company. This not only enables us to implement
what’s new but also helps in checking the previous ones.
• Designing the process: Just as an artist gives form to imagination, business analysts do
the same with their skills. This step involves designing and modelling the business
processes.
• Analysis of technology: Technical systems need a thorough analysis too. This is
something that, if not taken care of, leads to severe consequences.
The key business analyst roles and responsibilities:
What a business needs: A business analyst's key responsibility is to
understand what stakeholders need and pass these requirements to the developers, and
also to pass on the developers' expectations to the stakeholders. The skill a business
analyst needs for this responsibility is communication that works for everyone involved.
When transferring the information, the analyst must put it into words that make a
difference. This responsibility is no doubt time-consuming because the analyst needs to
listen and execute, which might seem easy, but only a skilled professional can handle
all this.
Conducting meetings with the development team and stakeholders: Business analysts are
supposed to coordinate with both stakeholders and the development team whenever a
new feature or update is added to a project. This may vary from project to project. This
facilitates the collection of client feedback and the resolution of issues encountered by
the development team when implementing new features. The business analyst role is
to understand and explain the new feature updates to clients and take feedback for
further development. Based on client feedback, Business Analyst instructs the
development team to make amendments or continue as is. At times, the client requests
an additional feature be added to a project, and the BA must determine whether or not
it is feasible, and then assign resources if necessary to implement it.
System possibilities: A business analyst might be considered one among those working
in the software team, but their key responsibility is not what the team does. He has to
ensure that he figures out what a project needs. He is the one who leads the path to the
goals. He might be the one who dreams of targets, but he is also the one who knows
how to make those dreams a reality. Looking for the opportunities and grabbing them
before they go is what a business analyst is good at.
Present the company: He can be called the face of a business. A business analyst is
responsible for putting a business’s thoughts and goals in front of the stakeholders. In
short, he is the one who needs to impress the stakeholders with his presentation skills
and the skill to present what the person on the other side is looking for and not what the
company has in store for them.
Present the details: A project brings with itself hundreds of minute details that might
be left unseen. A business analyst is the one who is responsible for elaborating the
project with the tiniest of the loopholes or hidden secrets. This is considered the most
crucial role of a business analyst because unless the details are put across the
stakeholders, they won’t take an interest, and unless they show the part, the project is
likely to take a pause.
Implementation of the project: After going through all the steps mentioned above,
the next and the most important role of a business analyst in agile is to implement
whatever has been planned. Execution is not easy unless the previous steps have been
taken care of in a systemized fashion.
Functional and non-functional requirements of a business: For an organization, the
main goal is to receive an end product that is productive and serves the company for a
long time. The role of a business analyst in an IT company is to take care of the
business’s functional aspect, which includes the steps and ways to ensure the working of
the project. Alongside this, the analyst is also supposed to take care of the
non-functional aspects, which cover how a project or a business is supposed to work.
Testing: The role of a business analyst extends further than might be expected. Once the
product is prepared, the next step is to test it among the users to assess its working
capacity and quality. The Business Analyst tests the prototype/interface by involving some
clients and recording their experiences with the model that has been developed, according
to the role description. Based on their feedback, the Business Analyst makes changes to the
model to improve it. They conduct a UAT (user acceptance test) to determine whether or not
the prototype meets the requirements of the project under consideration.
Decision making and problem-solving: The responsibilities of a business analyst
range from developing the required documents to making decisions in the most
stringent circumstances. Moreover, a business analyst is expected to handle matters
calmly and should be good at problem-solving, whether the problem involves
stakeholders, employees, or clients.
Maintenance: As the saying goes, care is as essential as building something new. No
matter how much human resources, energy, or funds you spend on a project, if the
maintenance part is not taken care of properly or is neglected, it tends to spoil the entire
effort. What is the role of a business analyst here? It is not limited to the maintenance
of clients or sales; the analyst also has to ensure that the quality and the
promised products are maintained throughout.
Building a team: Everyone is born with varied skills. The business analyst’s
responsibility is to build a team of people possessing the different
skills required for the project. Not only hiring them but retaining them is essential. A
well-united and skilled team can do wonders. A great team requires coordination,
structure, and skills. A good team tends to take the
company to the heights of success.
Presentation and Documentation of the Final Project: After the business project is
completed, the Business Analyst must document the details of the project and share the
project’s findings with the client. In most cases, BA roles and responsibilities include
preparing reports and presenting the results of a project to key stakeholders and clients.
While building the project, they must also record all of the lessons learned and
challenges they encountered in a concise form. This step aids the business analyst in
making better decisions in the future.
CONCLUSION
A business analyst might be just another position in an organization, but its roles and
responsibilities play a vital role in an organization’s success. While he needs to be a good
orator, he should also possess the quality of bringing people closer, both within his team
and across teams. His roles are not limited to a specific step in project management. He is
required at every step until the end. From the initial stages of evaluation to maintenance,
a company needs a business analyst’s skill.
Dealing with Data and Data Science
Data
Data Collection
Data collection is the process of acquiring, collecting, extracting, and storing the voluminous
amount of data which may be in the structured or unstructured form like text, video, audio,
XML files, records, or other image files used in later stages of data analysis. In the process of
big data analysis, “Data collection” is the initial step before starting to analyze the patterns or
useful information in data. The data which is to be analyzed must be collected from different
valid sources.
The actual data is then further divided mainly into two types known as:
1. Primary data
2. Secondary data
1. Primary data:
Data that is raw, original, and extracted directly from official sources is known as
primary data. This type of data is collected directly through techniques such as
questionnaires, interviews, and surveys. The data collected must match the demands
and requirements of the target audience on which the analysis is performed; otherwise it
becomes a burden in data processing.
A few methods of collecting primary data:
Interview method:
In this method, data is collected by interviewing the target audience. The person asking
the questions is called the interviewer, and the person who answers is known as the
interviewee. Some basic business or product-related questions are asked and recorded in the
form of notes, audio, or video, and this data is stored for processing. Interviews can be
structured or unstructured, such as personal interviews or formal interviews conducted by
telephone, face to face, email, etc.
Survey method:
The survey method is a research process in which a list of relevant questions is asked and the
answers are recorded as text, audio, or video. Surveys can be conducted both online and
offline, for example through website forms and email. The survey answers are then stored for
analysis. Examples are online surveys and surveys through social media polls.
Observation method:
The observation method is a method of data collection in which the researcher keenly observes
the behaviour and practices of the target audience using some data collection tool and stores
the observed data as text, audio, video, or any raw format. In this method, data may also be
collected directly by posing a few questions to the participants. For example, a researcher
may observe a group of customers and their behaviour towards the products. The data obtained
is then sent for processing.
Projective Technique
Projective data gathering is an indirect interview, used when potential respondents know why
they're being asked questions and hesitate to answer. For instance, someone may be reluctant
to answer questions about their phone service if a cell phone carrier representative poses the
questions. With projective data gathering, the interviewees get an incomplete question, and
they must fill in the rest, using their opinions, feelings, and attitudes.
Delphi Technique.
The Oracle at Delphi, according to Greek mythology, was the high priestess of Apollo’s temple,
who gave advice, prophecies, and counsel. In the realm of data collection, researchers use the
Delphi technique by gathering information from a panel of experts. Each expert answers
questions in their field of specialty, and the replies are consolidated into a single opinion.
Focus Groups.
Focus groups, like interviews, are a commonly used technique. The group consists of anywhere
from a half-dozen to a dozen people, led by a moderator, brought together to discuss the issue.
Questionnaires.
Questionnaires are a simple, straightforward data collection method. Respondents get a series
of questions, either open or close-ended, related to the matter at hand.
Experimental method:
The experimental method is the process of collecting data through performing experiments,
research, and investigation. The most frequently used experimental designs are CRD, RBD,
LSD, and FD.
• CRD - Completely Randomized Design is a simple experimental design used in data
analytics which is based on randomization and replication. It is mostly used for comparing
treatments.
• RBD - Randomized Block Design is an experimental design in which the experimental units
are divided into small groups called blocks. Treatments are assigned at random within each
block, and results are drawn using a technique known as analysis of variance (ANOVA).
RBD originated in the agriculture sector.
• LSD - Latin Square Design is an experimental design that is similar to CRD and RBD
but arranges the blocks in rows and columns. It is an N x N square with an equal
number of rows and columns, in which each letter occurs only once in each row and each
column. Hence the differences can be found with fewer errors in the experiment. A completed
Sudoku puzzle is an example of a Latin square.
• FD - Factorial Design is an experimental design in which each experiment has two or more
factors, each with a set of possible values (levels), and trials are performed on the
combinations of those levels.
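The designs above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the function names, the treatment labels "A", "B", "C" and the example factors (price, ad) are assumptions made for the example, and RBD is left out because it additionally needs a blocking variable.

```python
import itertools
import random


def completely_randomized(units, treatments, seed=0):
    """CRD: shuffle the units, then deal the treatments out in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}


def latin_square(symbols):
    """LSD: cyclic N x N square, each symbol once per row and once per column."""
    n = len(symbols)
    return [[symbols[(row + col) % n] for col in range(n)] for row in range(n)]


def full_factorial(factors):
    """FD: one run for every combination of the levels of every factor."""
    names = list(factors)
    levels = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*levels)]


plan = completely_randomized(range(1, 7), ["A", "B", "C"])
square = latin_square(["A", "B", "C"])
runs = full_factorial({"price": ["low", "high"], "ad": ["email", "sms", "none"]})

for row in square:
    print(" ".join(row))
print(len(runs), "factorial runs")
```

With six units and three treatments the CRD sketch gives each treatment two replications, and the factorial example produces 2 x 3 = 6 runs.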
2. Secondary data:
Secondary data is data which has already been collected and is reused for some valid
purpose. This type of data is derived from previously recorded primary data, and it has two
types of sources: internal and external.
i. Internal source:
This type of data can easily be found within the organization, such as market records, sales
records, transactions, customer data, accounting resources, etc. Obtaining data from internal
sources costs less and takes less time.
• Financial Statements
• Sales Reports
• Retailer/Distributor/Dealer Feedback
• Customer Personal Information (e.g., name, address, age, contact info)
ii. External source:
• Business Journals
• Government Records (e.g., census, tax records, Social Security info)
• Trade/Business Magazines
• The internet
1. Word Association.
The researcher gives the respondent a set of words and asks them what comes to mind when
they hear each word.
2. Sentence Completion.
Researchers use sentence completion to understand what kind of ideas the respondent has. This
tool involves giving an incomplete sentence and seeing how the interviewee finishes it.
3. Role-Playing.
Respondents are presented with an imaginary situation and asked how they would act or react
if it were real.
4. In-Person Surveys.
The researcher asks questions in person.
5. Online/Web Surveys.
These surveys are easy to accomplish, but some users may be unwilling to answer truthfully,
if at all.
6. Mobile Surveys.
These surveys take advantage of the increasing proliferation of mobile technology. Mobile
collection surveys rely on mobile devices like tablets or smartphones to conduct surveys via
SMS or mobile apps.
7. Phone Surveys.
No researcher can call thousands of people at once, so they need a third party to handle the
chore. However, many people have call screening and won’t answer.
8. Observation.
Sometimes, the simplest method is the best. Researchers who make direct observations collect
data quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in
small-scale situations.
Data Management
Data management works symbiotically with process management, ensuring that the actions
teams take are informed by the cleanest, most current data available — which in today’s world
means tracking changes and trends in real-time. Below is a deeper look at the practice, its
benefits and challenges, and best practices for helping your organization get the most out of its
business intelligence.
2. Data stewardship: A data steward does not develop information management policies but
rather deploys and enforces them across the enterprise. As the name implies, a data steward
stands watch over enterprise data collection and movement policies, ensuring practices are
implemented and rules are enforced.
3. Data quality management: If a data steward is a kind of digital sheriff, a data quality
manager might be thought of as his court clerk. Quality management is responsible for combing
through collected data for underlying problems like duplicate records, inconsistent versions,
and more. Data quality managers support the defined data management system.
4. Data security: One of the most important aspects of data management today is security.
Though emergent practices like DevSecOps incorporate security considerations at every level
of application development and data exchange, security specialists are still tasked with
encryption management, preventing unauthorized access, guarding against accidental
movement or deletion, and other frontline concerns.
5. Data governance: Data governance sets the law for an enterprise’s state of information.
A data governance framework is like a constitution that clearly outlines policies for the intake,
flow, and protection of institutional information. Data governors oversee their network of
stewards, quality management professionals, security teams, and other people and data
management processes in pursuit of a governance policy that serves a master data management
approach.
6. Big data management: Big data is the catch-all term used to describe gathering,
analyzing, and using massive amounts of digital information to improve operations. In broad
terms, this area of data management specializes in intake, integrity, and storage of the tide of
raw data that other management teams use to improve operations and security or inform
business intelligence.
7. Data warehousing: Information is the building block of modern business. The sheer
volume of information presents an obvious challenge: What do we do with all these blocks?
Data warehouse management provides and oversees the physical and/or cloud-based
infrastructure used to aggregate raw data and analyze it in-depth to produce business insights.
The unique needs of any organization practicing data management may require a blend of
some or all of these approaches. Familiarity with management areas provides data managers
with the background they need to build solutions customized for their environments.
Once data is under management, it can be mined for informational gold: business intelligence.
This helps business users across the organization in a variety of ways, including the following:
• Smart advertising that targets customers according to their interests and interactions
• Holistic security that safeguards critical information
• Alignment with relevant compliance standards, saving time and money
• Machine learning that grows more environmentally aware over time, powering automatic
and continuous improvement
• Reduced operating expenses by restricting use to only the necessary storage and compute
power required for optimal performance
• The amount of data can be (at least temporarily) overwhelming. It’s hard to overstate
the volume of data that must come under management in a modern business, so, when
developing systems and processes, be ready to think big. Really big. Specialized third-party
services and apps for integrating big data or providing it as a platform are crucial allies.
• Many organizations silo data. The development team may work from one data set, the
sales team from another, operations from another, and so on. A modern data management
system relies on access to all this information to develop modern business intelligence.
Real-time data platform services help stream and share clean information between teams
from a single, trusted source.
• The journey from unstructured data to structured data can be steep. Data often pours
into organizations in an unstructured way. Before it can be used to generate business
intelligence, data preparation has to happen: Data must be organized, deduplicated, and
otherwise cleaned. Data managers often rely on third-party partnerships to assist with these
processes, using tools designed for on-premises, cloud, or hybrid environments.
• Managing the culture is essential to managing data. All of the processes and systems in
the world won’t do you much good if people don’t know how — and perhaps just as
importantly, why — to use them. By making team members aware of the benefits of data
management (and the potential pitfalls of ignoring it) and fostering the skills of using data
correctly, managers engage team members as essential pieces of the information process.
These and other challenges stand between the old way of doing business and initiatives that
harness the power of data for business intelligence. But with proper planning, practices, and
partners, technologies like accelerated machine learning can turn pinch points into gateways
for deeper business insights and better customer experience.
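As a small illustration of the data preparation challenge listed above (organizing, deduplicating, and cleaning incoming data), the following sketch normalizes a few records and removes duplicates. The records, the field names and the choice of email as the matching key are all assumptions made for the example; real pipelines typically need richer matching rules.

```python
raw = [
    {"name": "  Ana Diaz ", "email": "ANA@EXAMPLE.COM"},
    {"name": "Ana Diaz", "email": "ana@example.com "},
    {"name": "Raj Patel", "email": "raj@example.com"},
]


def clean(record):
    """Collapse stray whitespace in the name and normalize the email address."""
    return {
        "name": " ".join(record["name"].split()),
        "email": record["email"].strip().lower(),
    }


def deduplicate(records, key="email"):
    """Keep the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


prepared = deduplicate([clean(r) for r in raw])
print(prepared)
```

Cleaning runs before deduplication on purpose: the two "Ana Diaz" rows only become recognizable as the same customer once case and whitespace have been normalized.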
1. Make a plan
• Develop and write a data management plan (DMP). This document charts estimated
data usage, accessibility guidelines, archiving approaches, ownership, and more. A
DMP serves as both a reference and a living record and will be revised as circumstances
change.
• Additionally, DMPs present the organization’s overarching strategy for data
management to investors, auditors, and other involved parties — which is an important
insight into a company’s preparedness for the rigors of the modern market. The best
DMPs define granular details, including:
• Preferred file formats
• Naming conventions
• Access parameters for various stakeholders
• Backup and archiving processes
• Defined partners and the terms and services they provide
• Thorough documentation
• There are online services that can help create DMPs by providing step-by-step guidance
to creating plans from templates.
2. Store your data
• Among the granular details mentioned above, a solid data storage approach is central
to good data management. It begins by determining if your storage needs best suit a
data warehouse or a data lake (or both), and whether the company’s data belongs
on-premises or in the cloud.
• Then outline a consistent, and consistently enforced, agreement for naming files,
folders, directories, users, and more. This is a foundational piece of data management,
as these parameters will determine how to store all future data, and inconsistencies will
result in errors and incomplete intelligence.
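One way to keep a naming agreement "consistently enforced" is to check file names automatically. The sketch below assumes a purely hypothetical convention, <team>_<dataset>_<YYYYMMDD>.<ext> in lowercase; the pattern, the allowed extensions and the sample file names are illustrative and would be replaced by whatever the organization's DMP actually specifies.

```python
import re

# Hypothetical convention: <team>_<dataset>_<YYYYMMDD>.<ext>, all lowercase,
# for example sales_orders_20240131.csv
NAME_RE = re.compile(r"[a-z]+_[a-z0-9]+_\d{8}\.(csv|parquet|json)")


def nonconforming(filenames):
    """Return the file names that do not follow the agreed convention."""
    return [name for name in filenames if not NAME_RE.fullmatch(name)]


files = [
    "sales_orders_20240131.csv",
    "Final Report (2).xlsx",
    "ops_tickets_20240201.json",
]
bad = nonconforming(files)
print(bad)
```

A check like this could run whenever new files are stored, so that inconsistencies are caught before they spread into later analysis.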
1. Security and backups. Insecure data is dangerous, so security must be considered at
every layer. Some organizations come under special regulatory burdens like HIPAA,
CIPA, GDPR, and others, which add additional security requirements like periodic
audits. When security fails, the backup plan can be the difference between business life
and death. Traditional models called for three copies of all important data: the original,
the locally stored copy, and a remote copy. But emerging cloud models include
decentralized data duplication, with even more backup options available at an
increasingly affordable cost for storage and transfer.
2. Documentation is key. If it’s important, document it. If the entire team splits the lottery
and runs off to Jamaica, thorough, readable documentation outlining security and
backup procedures will give the next team a fighting chance to pick up where they left
off. Without it, knowledge resides exclusively with holders who may or may not be part
of a long-term data management approach.
Data storage needs to be able to change as fast as the technology demands, so any approach
should be flexible and have a reasonable archiving approach to keep costs manageable.
In line with classical definitions of the concept, big data is generally associated
with three core characteristics:
1 Volume: This is the sheer amount of data that is generated and stored.
2 Variety: This is the range of data types and formats involved, from
structured to unstructured.
3 Velocity: This is the speed at which data is generated and must be
processed.
Beyond “the Three Vs,” current descriptions of big data management also include two
other characteristics, namely:
4 Veracity: This is the degree of reliability and truth that big data has
to offer in terms of its relevance, cleanliness, and accuracy.
5 Value: Since the primary aim of big data gathering and analysis is to
discover insights that can inform decision-making and other processes,
this characteristic explores the benefit or otherwise that information and
analytics can ultimately produce.
Organization/Sources of Data
There are different ways to collect big data from users. These are
the most popular ones.
1. Asking for it
The majority of firms prefer asking users directly to share their personal
information. Users provide this data when creating website accounts or
buying online. The minimum information collected usually includes a
username and an email address, but some profiles require more details.
2. Cookies and Web Beacons
Cookies and web beacons are two widely used methods of gathering
data on users, namely, which web pages they visit and when. They
provide basic statistics about how a website is used. Used in this way,
cookies and web beacons serve mainly to personalize your experience
with a given web source, although they do record browsing behaviour.
3. Email tracking
Email trackers are meant to give more information on user actions in the
mailbox.
In particular, an email tracker can detect when an email was opened. Both
Google and Yahoo use this method to learn their users’ behavioural
patterns and provide personalized advertising.
Quality data is key to making accurate, informed decisions. And while all
data has some level of “quality,” a variety of characteristics and factors
determines the degree of data quality (high-quality versus low-quality).
Furthermore, different data quality characteristics will likely be more
important to various stakeholders across the organization. Popular
data quality characteristics and dimensions include:
1. Completeness: Completeness measures how much of the expected data is present,
usually tracked as the percentage of data that is missing within a dataset.
2. Timeliness: Timeliness measures how up-to-date or antiquated the data is at
any given moment.
3. Validity: Validity measures whether information follows specific company
formats, rules, or processes. Information that fails to follow them is invalid.
4. Integrity: Integrity of data refers to the level at which the information is reliable
and trustworthy.
5. Uniqueness: Uniqueness means that each record appears only once, with no
duplicates. It is a data quality characteristic most often associated with
customer profiles.
6. Consistency: It ensures that the source of the information collection is capturing
the correct data based on the unique objectives of the department or company.
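Some of these dimensions can be turned into simple numbers. The sketch below is one possible way to score completeness, validity and uniqueness on a handful of made-up records. The field names, the deliberately simple email rule and the scoring formulas are assumptions made for illustration, not standard definitions. Here completeness is reported as the share of values present, so the percentage missing is one minus that figure.

```python
import re

records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": None, "age": 41},
    {"id": 3, "email": "not-an-email", "age": 29},
    {"id": 1, "email": "ana@example.com", "age": 34},  # duplicate of the first record
]

# Deliberately simple format rule, standing in for a company-specific one
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def completeness(rows, field):
    """Share of rows in which `field` is present (not None)."""
    return sum(row[field] is not None for row in rows) / len(rows)


def validity(rows, field, pattern):
    """Share of the non-missing values of `field` that match the format rule."""
    values = [row[field] for row in rows if row[field] is not None]
    return sum(bool(pattern.fullmatch(value)) for value in values) / len(values)


def uniqueness(rows, key):
    """Share of rows that carry a distinct value of `key`."""
    return len({row[key] for row in rows}) / len(rows)


print("completeness(email):", completeness(records, "email"))
print("validity(email):", validity(records, "email", EMAIL_RE))
print("uniqueness(id):", uniqueness(records, "id"))
```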
The concept of missing data is implied in the name: it is data that is not
captured for a variable for the observation in question. Missing data reduces
the statistical power of the analysis, which can distort the validity of the
results.
Fortunately, there are proven techniques to deal with missing data.
When dealing with missing data, data scientists can use two primary methods
to solve the error: imputation or the removal of data.
The imputation method develops reasonable guesses for missing data. It’s
most useful when the percentage of missing data is low. If the portion of
missing data is too high, the imputed results lack the natural variation
needed to build an effective model.
The other option is to remove data. When dealing with data that is missing
at random, related data can be deleted to reduce bias. Removing data may
not be the best option if there are not enough observations to result in a
reliable analysis. In some situations, observation of specific events or
factors may be required.
Before deciding which approach to employ, data scientists must understand why the
data is missing.
Missing at Random (MAR)
Missing at Random means that whether a value is missing depends on the
observed data, not on the specific missing values. The data is not missing across
all observations but only within sub-samples of the data. It is not known if
the data should be there; instead, it is missing given the observed data. The
missing data can be predicted based on the complete observed data.
Missing Completely at Random (MCAR)
Data may be missing due to test design, failure in the observations or failure
in recording observations. This type of data is seen as MCAR because the
reasons for its absence are external and not related to the value of the
observation.
It is typically safe to remove MCAR data because the results will be unbiased.
The test may not be as powerful, but the results will be reliable.
Missing Not at Random (MNAR)
The MNAR category applies when the missing data has a structure to it. In
other words, there appear to be reasons the data is missing. In a survey,
perhaps a specific group of people – say women ages 45 to 55 – did not
answer a question. Unlike MAR, the data cannot be determined from the
observed data, because the missing information is unknown. Data scientists
must model the missing data to develop an unbiased estimate. Simply
removing observations with missing data could result in a model with bias.
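The practical difference between these categories can be seen in a small simulation. The sketch below uses made-up numbers throughout (a normal distribution with mean 50, a 30% random loss for MCAR, and an 80% loss of above-median values for MNAR) and simply compares the mean of what remains with the mean of the full data.

```python
import random
import statistics

rng = random.Random(42)
scores = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(scores)

# MCAR: every value has the same 30% chance of being lost,
# regardless of how large or small it is.
mcar = [x for x in scores if rng.random() > 0.30]

# MNAR: the chance of being lost depends on the value itself.
# Here values above the median are lost 80% of the time.
cutoff = statistics.median(scores)
mnar = [x for x in scores if x <= cutoff or rng.random() > 0.80]

mcar_mean = statistics.mean(mcar)
mnar_mean = statistics.mean(mnar)

print(f"full data mean: {true_mean:.2f}")
print(f"mean after MCAR loss: {mcar_mean:.2f}")
print(f"mean after MNAR loss: {mnar_mean:.2f}")
```

Under MCAR the remaining values still average close to the full-data mean, so deleting the incomplete observations only costs sample size. Under MNAR the remaining values are systematically too low, which is why simply removing observations gives a biased result.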
Deletion
There are two primary methods for deleting data when dealing with missing
data: listwise deletion and dropping variables.
Listwise
In this method, all data for an observation that has one or more missing
values are deleted. The analysis is run only on observations that have a
complete set of data. If the data set is small, it may be the most efficient
method to eliminate those cases from the analysis. However, in most cases,
the data are not missing completely at random (MCAR). Deleting the
instances with missing observations can result in biased parameters and
estimates and reduce the statistical power of the analysis.
Pair wise
Dropping Variables
If data is missing for more than 60% of the observations of a variable, it may be
wise to discard that variable, provided it is insignificant.
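Both deletion approaches can be sketched with plain Python. The rows and column names below are made up for illustration, and the 0.60 threshold follows the 60% rule of thumb above. Note that the sketch does not judge whether a variable is insignificant; that decision is left to the analyst.

```python
rows = [
    {"age": 34, "income": 52000, "fax": None},
    {"age": None, "income": 48000, "fax": None},
    {"age": 29, "income": None, "fax": None},
    {"age": 45, "income": 61000, "fax": "555-0100"},
    {"age": 38, "income": 57000, "fax": None},
]


def missing_share(rows, column):
    """Fraction of observations in which `column` is missing."""
    return sum(row[column] is None for row in rows) / len(rows)


def drop_sparse_columns(rows, threshold=0.60):
    """Dropping variables: discard columns missing in more than `threshold` of rows."""
    keep = [col for col in rows[0] if missing_share(rows, col) <= threshold]
    return [{col: row[col] for col in keep} for row in rows]


def listwise_delete(rows):
    """Listwise deletion: keep only observations with no missing value at all."""
    return [row for row in rows if all(value is not None for value in row.values())]


trimmed = drop_sparse_columns(rows)  # "fax" is missing in 4 of 5 rows, so it is dropped
complete = listwise_delete(trimmed)  # rows with a missing age or income are removed

print(len(listwise_delete(rows)), "complete rows before dropping the sparse column")
print(len(complete), "complete rows after dropping it")
```

Dropping the sparse column first leaves three complete observations, whereas listwise deletion on the original rows leaves only one. The order of the two steps therefore matters when the data set is small.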
Imputation
When data is missing, it may make sense to delete data, as mentioned above.
However, that may not be the most effective option. For example, if too
much information is discarded, it may not be possible to complete a reliable
analysis. Or there may be insufficient data to generate a reliable prediction
for observations that have missing data.
Imputing the mean or median is one of the most common methods of imputing
values when dealing with missing data. In cases where there are a small number
of missing observations, data scientists can calculate the mean or median of the
existing observations. However, when there are many missing values,
mean or median imputation can result in a loss of variation in the data. This
method does not use time-series characteristics or depend on the
relationship between the variables.
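A minimal sketch of mean and median imputation, using a made-up sales series in which None marks a missing observation (the function name and the data are assumptions for the example):

```python
import statistics


def impute(values, strategy="mean"):
    """Replace each missing value (None) with the mean or median of the observed ones."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


sales = [10, 12, None, 14, None, 40]
observed = [v for v in sales if v is not None]

mean_filled = impute(sales, "mean")
median_filled = impute(sales, "median")

print(mean_filled)
print(median_filled)
print(statistics.pvariance(observed), statistics.pvariance(mean_filled))
```

The observed values have a mean of 19 and a median of 13, so the single large value (40) pulls the mean-based fill upward while the median-based fill is unaffected. The variance of the mean-filled series is smaller than that of the observed values, which is the loss of variation described above.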