Data Blending
Data blending is a process whereby big data from multiple sources[1] are merged into a single data
warehouse or data set.[2] It concerns not merely the merging of different file formats or disparate sources of
data but also different varieties of data.[3] Data blending helps business analysts cope with the
growing volume of data they need in order to make critical business decisions based on good-quality
business intelligence.[4]
Data blending has been described as distinct from data integration because data analysts need to merge
sources very quickly, too quickly for any practical intervention by data scientists.[5]
Reflecting the increased demand for analysts to combine data sources, multiple software companies have
seen large growth and raised millions of dollars,[6] with some early entrants into the market now public
companies.[7] Examples include AWS, Alteryx, Microsoft Power Query,[8] and Incorta,[9] which enable
combining data from many different data sources, such as text files, databases, XML, JSON, and
many other forms of structured and semi-structured data.[10][11][12][13]
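As an illustration of combining sources of different formats, the following is a minimal sketch in Python that blends a CSV text source with a JSON source into a single in-memory data set. The sample data, the field names and the function blend are hypothetical and are not taken from any of the products named above.

```python
import csv
import io
import json

# Hypothetical inputs: a CSV export of orders and a JSON feed of customers.
ORDERS_CSV = """order_id,customer_id,amount
1,c1,20.0
2,c2,35.5
3,c1,4.5
"""

CUSTOMERS_JSON = '[{"id": "c1", "region": "EMEA"}, {"id": "c2", "region": "APAC"}]'


def blend(orders_csv, customers_json):
    """Return one combined record per order, enriched with the customer's region."""
    # Index the JSON source by customer id so each order can look up its region.
    regions = {c["id"]: c["region"] for c in json.loads(customers_json)}
    blended = []
    for row in csv.DictReader(io.StringIO(orders_csv)):
        blended.append({
            "order_id": int(row["order_id"]),
            "customer_id": row["customer_id"],
            "amount": float(row["amount"]),
            # Left-join behaviour: keep the order even if no customer matches.
            "region": regions.get(row["customer_id"]),
        })
    return blended


blended_rows = blend(ORDERS_CSV, CUSTOMERS_JSON)
```

Each record in blended_rows carries fields from both sources. An order whose customer is missing from the JSON source is kept, with its region set to None.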
Data blending is similar to ETL (extract, transform, load) in many ways. Both take data from various
sources and combine them. However, ETL is used to merge and structure data into a target database,[14]
often a data warehouse. Data blending differs in that it joins data for a specific use case at a specific
time.[15] With some software, the blended data is not written into a database at all, unlike in ETL. For
example, with Google Data Studio[16] and Tableau,[17] the data blend occurs on the reporting layer; the
result is not written anywhere, only displayed.
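The contrast between persisting combined data and combining it only for display can be sketched as follows. This is an illustrative example using Python's built-in sqlite3 module, not a description of how any particular ETL or reporting tool works. The data, the table name sales_vs_target and both functions are assumptions made for the example.

```python
import sqlite3

# Hypothetical sources: sales figures and targets, both keyed by region.
SALES = [("EMEA", 120.0), ("APAC", 80.0)]
TARGETS = {"EMEA": 100.0, "APAC": 90.0}


def etl_load(conn):
    """ETL style: the combined rows are written to a table in a target database."""
    conn.execute(
        "CREATE TABLE sales_vs_target (region TEXT, sales REAL, target REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_vs_target VALUES (?, ?, ?)",
        [(region, sales, TARGETS.get(region)) for region, sales in SALES],
    )
    conn.commit()


def blend_for_report():
    """Blend style: rows are combined on demand and yielded; nothing is stored."""
    for region, sales in SALES:
        yield {"region": region, "sales": sales, "target": TARGETS.get(region)}


conn = sqlite3.connect(":memory:")
etl_load(conn)
persisted = conn.execute("SELECT COUNT(*) FROM sales_vs_target").fetchone()[0]

# The blended rows exist only for as long as the report holds on to them.
report_rows = list(blend_for_report())
```

After etl_load runs, the combined rows can be queried again later from the target table. The output of blend_for_report has to be recomputed from the sources each time it is needed.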
The other key differentiator is the granularity of the data join. Generally, combining data into a single
data set uses a database join, which usually joins at the most granular level, using an ID
field where possible.[18] A data blend in Tableau, by contrast, should happen at the least granular level.[19]
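The effect of join granularity can be illustrated with a small sketch. It is a simplified model of the general idea, not Tableau's implementation, and all data, field names and functions are hypothetical. Two sources share only a coarse region field. Joining them row by row on that field repeats each order once per matching quota row, whereas aggregating each source to the shared field first and then combining the results keeps the totals correct.

```python
from collections import defaultdict

# Hypothetical sources that share only the coarse "region" field.
ORDERS = [
    {"order_id": 1, "region": "EMEA", "amount": 20.0},
    {"order_id": 2, "region": "EMEA", "amount": 30.0},
    {"order_id": 3, "region": "APAC", "amount": 10.0},
]
QUOTAS = [
    {"rep": "a", "region": "EMEA", "quota": 40.0},
    {"rep": "b", "region": "EMEA", "quota": 20.0},
    {"rep": "c", "region": "APAC", "quota": 15.0},
]


def total_by(rows, key, value):
    """Sum the given value field per distinct value of the key field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def blend_by_region(orders, quotas):
    """Aggregate each source to the shared field first, then combine the totals."""
    amount = total_by(orders, "region", "amount")
    quota = total_by(quotas, "region", "quota")
    return {r: {"amount": amount[r], "quota": quota.get(r)} for r in amount}


def row_level_join(orders, quotas):
    """Join individual rows on region; every order pairs with every quota row."""
    return [{**o, **q} for o in orders for q in quotas if o["region"] == q["region"]]


blended = blend_by_region(ORDERS, QUOTAS)
joined = row_level_join(ORDERS, QUOTAS)
joined_emea_amount = sum(r["amount"] for r in joined if r["region"] == "EMEA")
```

In this example the blended EMEA order total is 50.0, while summing the row-level join gives 100.0, because each of the two EMEA orders is paired with both EMEA quota rows.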
See also
Data preparation
Data fusion
Data wrangling
Data cleansing
Data editing
Data scraping
Data curation
Data pre-processing
References
1. Alteryx Analytics Brings Power of Predictive and Big Data to Market
(https://blog.ventanaresearch.com/2014/05/30/alteryx-analytics-brings-power-of-predictive-and-big-data-to-market)
2. Data blending is the process of combining data from multiple sources into a functioning data
set (http://www.datawatch.com/what-is-data-blending/)
3. The Definitive Guide to Data Blending
(http://pages.alteryx.com/rs/alteryx/images/ALT_WPDefGuideDataBlending-WithGraphics38.pdf)
4. "Data Blending" (https://www.trifacta.com/data-blending/). Trifacta.com. August 24, 2017.
5. What Is Data Blending, and Which Tools Make It Easier?
(http://www.softwareadvice.com/resources/what-is-data-blending-tool/)
6. "Incorta raises $30M Series C for ETL-free data processing solution"
(https://social.techcrunch.com/2019/08/15/incorta-raises-30m-series-c-for-etl-free-data-processing-solution/).
TechCrunch. Retrieved 2021-02-27.
7. "Alteryx Announces Pricing of Initial Public Offering"
(https://www.alteryx.com/press-releases/2017-03-23-alteryx-announces-pricing-initial-public-offering).
Alteryx. Retrieved 2021-02-27.
8. Corporation, Microsoft. "Microsoft Power Query" (https://powerquery.microsoft.com/en-us/).
powerquery.microsoft.com. Retrieved 2021-02-27.
9. "Direct Data Analytics Software | Incorta" (https://www.incorta.com/). www.incorta.com.
Retrieved 2021-02-27.
10. "Data Sources" (https://docs.incorta.com/4.4/data-sources/). docs.incorta.com. Retrieved
2021-02-27.
11. davidiseminger. "Shape and combine data from multiple sources using Power Query"
(https://docs.microsoft.com/en-us/power-query/power-query-tutorial-shape-combine).
docs.microsoft.com. Retrieved 2021-02-27.
12. "Supported Data Sources - Amazon QuickSight"
(https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html).
docs.aws.amazon.com. Retrieved 2021-02-27.
13. "Data Sources | Alteryx Help" (https://help.alteryx.com/current/designer/data-sources).
help.alteryx.com. Retrieved 2021-02-27.
14. "How ETL Works" (https://databricks.com/de/glossary/extract-transform-load). Databricks (in
German). Retrieved 2021-02-27.
15. "What Is Data Blending, and Which Tools Make It Easier?"
(https://www.softwareadvice.com/resources/what-is-data-blending-tool/). Software Advice. 2016-08-25.
Retrieved 2021-02-27.
16. "Google Data Studio Overview" (https://datastudio.google.com/overview).
datastudio.google.com. Retrieved 2021-02-27.
17. "Blend Your Data" (https://help.tableau.com/current/pro/desktop/en-us/multiple_connections.htm).
help.tableau.com. Retrieved 2021-02-27.
18. "SQL Joins Explained" (http://www.sql-join.com/). SQL Joins Explained. Retrieved
2021-02-27.
19. TAR Solutions (2021-01-20). "Data Blending in Tableau"
(https://tarsolutions.co.uk/blog/data-blending-in-tableau/). TAR Solutions. Retrieved 2021-02-27.
20. "About data blending - Data Studio Help" (https://support.google.com/datastudio/answer/9061420).
support.google.com. Retrieved 2021-02-27.
21. Heer, Jeffrey; Hellerstein, Joseph; Kandel, Sean; Rattenbury, Tye (July 2017). Principles of
Data Wrangling (http://shop.oreilly.com/product/0636920045113.do). O'Reilly Media.
22. "Data Mashups for Analytics" (http://www.pentaho.com/data-mashups-for-analytics).
Pentaho.