Data Analysis: Navigating Limitations

Data has limitations, and incomplete or dirty data can lead to poor decision-making. Data analysts need to understand the limitations of the data they are using. This includes checking whether the data is incomplete, whether different business rules were used in data collection, and whether the data needs cleaning to fix errors. Proper data analysis requires establishing standards, checking for statistical significance, using large enough samples, and being transparent about data limitations.

Uploaded by

Muhammad Zubair

Limitations of data

Data is powerful, but it has its limitations. Has someone’s personal opinion found its way into the
numbers? Is your data telling the whole story? Part of being a great data analyst is knowing the limits of
data and planning for them. This reading explores how you can do that.

If you have incomplete or nonexistent data, you might realize during an analysis that you don't have
enough data to reach a conclusion. Or, you might even be solving a different problem altogether! For
example, suppose you are looking for employees who earned a particular certificate but discover that
certification records go back only two years at your company. You can still use the data, but you will
need to make the limits of your analysis clear. You might be able to find an alternate source of the data
by contacting the company that led the training. But to be safe, you should be up front about the
incomplete dataset until that data becomes available.
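
One way to spot this kind of gap early is to compare the dates your records actually cover with the dates your question needs. The sketch below is only an illustration: the function name, the sample dates, and the idea of using the earliest record as a stand-in for "where coverage begins" are all assumptions, not part of this reading.

```python
from datetime import date

def coverage_report(record_dates, needed_start):
    """Compare when the records begin with when the analysis needs them to begin."""
    if not record_dates:
        # No records at all: nothing is covered.
        return {"records_start": None, "days_missing_at_start": None,
                "covers_window_start": False}
    records_start = min(record_dates)
    # Days between the start of the needed window and the first record (never negative).
    days_missing = max((records_start - needed_start).days, 0)
    return {"records_start": records_start,
            "days_missing_at_start": days_missing,
            "covers_window_start": days_missing == 0}

# Hypothetical certification records that only go back to 2022.
certificate_dates = [date(2022, 3, 1), date(2022, 9, 15), date(2023, 11, 30)]
report = coverage_report(certificate_dates, needed_start=date(2019, 1, 1))
print(report)
```

If `covers_window_start` comes back `False`, that is your cue to state the limit clearly in your findings and to look for another source for the missing period.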

If you're collecting data from other teams and using existing spreadsheets, it is good to keep in mind
that people use different business rules. So one team might define and measure things in a completely
different way than another. For example, if a metric is the total number of trainees in a certificate
program, you could have one team that counts every person who registered for the training, and
another team that counts only the people who completed the program. In cases like these, establishing
how to measure things early on standardizes the data across the board for greater reliability and
accuracy. This will make sure comparisons between teams are meaningful and insightful.
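
To see how much a business rule can change a number, here is a small sketch that counts the same records under both rules from the example above. The records, field names, and the `total_trainees` function are hypothetical and only meant to illustrate the idea.

```python
# Hypothetical enrollment records for one certificate program.
enrollments = [
    {"trainee": "T1", "registered": True, "completed": True},
    {"trainee": "T2", "registered": True, "completed": False},
    {"trainee": "T3", "registered": True, "completed": True},
    {"trainee": "T4", "registered": True, "completed": False},
    {"trainee": "T5", "registered": True, "completed": False},
]

def total_trainees(rows, rule):
    """Count trainees under an explicitly named business rule."""
    if rule == "registered":
        return sum(1 for row in rows if row["registered"])
    if rule == "completed":
        return sum(1 for row in rows if row["completed"])
    raise ValueError(f"Unknown business rule: {rule}")

# Same data, two different answers to "how many trainees?"
print(total_trainees(enrollments, "registered"))
print(total_trainees(enrollments, "completed"))
```

Making the rule an explicit argument forces each team to say which definition it is using, which is the kind of standard worth agreeing on before any comparison.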

Dirty data refers to data that contains errors. Dirty data can lead to productivity loss, unnecessary
spending, and unwise decision-making. A good data cleaning effort can help you avoid this. As a quick
reminder, data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted,
duplicate, or incomplete data within a dataset. When you find and fix the errors - while tracking the
changes you made - you can avoid a data disaster. You will learn how to clean data later in the training.
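
As a small preview, the sketch below fixes formatting, removes duplicates and incomplete rows, and keeps a log of every change. It is a minimal illustration under assumed field names (`name`, `email`) and made-up records, not the cleaning process taught later in the training.

```python
def clean_records(rows):
    """Return (cleaned_rows, change_log) so every fix or removal is tracked."""
    cleaned, change_log, seen_emails = [], [], set()
    for index, row in enumerate(rows):
        name = (row.get("name") or "").strip()
        email = (row.get("email") or "").strip().lower()
        if not name or not email:
            change_log.append((index, "removed: incomplete"))
            continue
        if email in seen_emails:
            change_log.append((index, "removed: duplicate"))
            continue
        seen_emails.add(email)
        if name != row.get("name") or email != row.get("email"):
            change_log.append((index, "fixed: whitespace or letter case"))
        cleaned.append({"name": name, "email": email})
    return cleaned, change_log

# Hypothetical raw records with typical problems.
raw_records = [
    {"name": " Ana Silva ", "email": "Ana@Example.com"},  # badly formatted
    {"name": "Ana Silva", "email": "ana@example.com"},    # duplicate
    {"name": "Ben Okoro", "email": ""},                   # incomplete
    {"name": "Chen Li", "email": "chen@example.com"},     # already clean
]
cleaned, change_log = clean_records(raw_records)
for entry in change_log:
    print(entry)
```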

Avinash Kaushik, a Digital Marketing Evangelist for Google, has lots of great tips for data analysts in his
blog, Occam's Razor. Below are some of the best practices he recommends for good data storytelling:

• Compare the same types of data: Data can get mixed up when you chart it for visualization. Be
sure to compare the same types of data and double check that any segments in your chart
definitely display different metrics.

• Visualize with care: A 0.01% drop in a score can look huge if you zoom in close enough. To make
sure your audience sees the full story clearly, it is a good idea to start your Y-axis at 0.

• Leave out needless graphs: If a table can show your story at a glance, stick with the table
instead of a pie chart or a graph. Your busy audience will appreciate the clarity.

• Test for statistical significance: Sometimes two datasets will look different, but you will need a
way to test whether the difference is real and important. So remember to run statistical tests to
see how much confidence you can place in that difference.

• Pay attention to sample size: Gather lots of data. If a sample size is small, a few unusual
responses can skew the results. If you find that you have too little data, be careful about using it
to form judgments. Look for opportunities to collect more data, then chart those trends over
longer periods.
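
The last two tips work together, and a short sketch can show why. The example below uses a two-proportion z-test, which is just one common choice of statistical test and is not named in this reading. The function name and the completion counts are made up for illustration, and the test relies on a normal approximation that is itself rough for small samples.

```python
import math

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two proportions (z-test)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("Sample sizes must be positive")
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under the assumption that both groups share one true rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        raise ValueError("No variation in the data; the test is undefined")
    z = (rate_a - rate_b) / std_error
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the normal curve.
    return math.erfc(abs(z) / math.sqrt(2))

# Same completion rates (60% vs. 45%), very different sample sizes.
p_small = two_proportion_p_value(12, 20, 9, 20)
p_large = two_proportion_p_value(1200, 2000, 900, 2000)
print(f"small samples: p = {p_small:.3f}")
print(f"large samples: p = {p_large:.3g}")
```

With 20 people per group, the p-value is well above the conventional 0.05 threshold, so the 15-point gap could easily be chance. With 2,000 people per group, the same gap is very unlikely to be chance. The rates did not change; only the amount of data did.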

In any organization, a big part of a data analyst’s role is making sound judgments. When you know the
limitations of your data, you can make judgment calls that help people make better decisions supported
by the data. Data is an extremely powerful tool for decision-making, but if it is incomplete, misaligned,
or hasn’t been cleaned, then it can be misleading. Take the necessary steps to make sure that your data
is complete and consistent. Clean the data before you begin your analysis to save yourself and possibly
others a great amount of time and effort.
