The difference between structured, unstructured and semi-structured data
The type of data that we can exploit has progressively moved from structured to semi-structured to unstructured data.
[Figure: Value of Insight plotted against Average Cost per Iteration and Cost Per Output (markers A, B, C). Data sources shown between the Structured/Static and Unstructured/Real-Time ends: Database, Website, Audio, Video, Sensors, Social Media.]
This creates tremendous efficiency benefits and greatly reduces costs.
Structured data reside in a fixed format within a record or a file. This includes data in relational databases or spreadsheets.
Popular platforms for structured data include:
• Oracle
• MS SQL Server
• MS Access, etc.
Example application: Traditional financial data investigation.
Big Data can be associated with structured data sources (but not exclusively).
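
To make "fixed format within a record" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name (transactions), its columns (account, amount) and the sample rows are hypothetical, chosen only to echo the financial-data example above. They are not taken from any real system.

```python
import sqlite3

def total_by_account(rows):
    """Load rows into an in-memory table with a fixed schema and sum amounts per account."""
    conn = sqlite3.connect(":memory:")
    # The schema fixes the fields in advance: every record has exactly these two columns.
    conn.execute(
        "CREATE TABLE transactions (account TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO transactions (account, amount) VALUES (?, ?)", rows
    )
    result = conn.execute(
        "SELECT account, SUM(amount) FROM transactions "
        "GROUP BY account ORDER BY account"
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data for illustration only.
sample_rows = [("A-100", 250.0), ("B-200", 75.5), ("A-100", 50.0)]
totals = total_by_account(sample_rows)
```

Because every record has the same fields, a single declarative query can aggregate the data without any parsing or interpretation step.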
Structured Data
• Data that reside in fixed fields.
• Examples: relational databases, spreadsheets.
Semi-structured Data
• Data that contain tags and other markers to separate data elements.
• Examples: XML or HTML tagged text.
Unstructured Data
• Data that do not reside in fixed fields.
• Examples: books, e-mails, untagged audio, image and video data.
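
The three categories can be illustrated by reading the same fact, an order amount, from each kind of source. This is a sketch under stated assumptions: the order record, its field names and the sentence are invented for illustration, and the regular expression is just one hypothetical way to pull a number out of free text.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: values sit in fixed fields, so a column name is enough to find them.
structured = "order_id,customer,amount\n1001,Alice,42.50\n"
row = next(csv.DictReader(io.StringIO(structured)))
amount_structured = float(row["amount"])

# Semi-structured: tags mark the data elements, but elements may vary per record.
semi_structured = (
    "<order id='1001'><customer>Alice</customer>"
    "<amount>42.50</amount><note>gift wrap</note></order>"
)
root = ET.fromstring(semi_structured)
amount_semi = float(root.findtext("amount"))

# Unstructured: no fields or tags, so the value has to be inferred from the text.
unstructured = (
    "Hi, this is Alice. I was charged 42.50 for order 1001 "
    "and would like a receipt."
)
match = re.search(r"charged (\d+\.\d+)", unstructured)
amount_unstructured = float(match.group(1)) if match else None
```

The extraction step gets more fragile at each stage: a column lookup, then a tag lookup, then a pattern that only works if the author happened to phrase the sentence in the expected way.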
Business Perspective
• How do we understand the semantic meaning of the data in the context of a problem?
• How do we develop an indexing scheme so that information can easily be found and utilized?
Technology Perspective
• Bandwidth and computing power required to move and analyze large sets of unstructured data
• Shortfall of analytical talent
• Insufficiency of current IT systems to support volume/structure
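
One common answer to the indexing question is an inverted index, which maps each word to the documents that contain it. The sketch below is a minimal illustration, not a method prescribed by these slides. The function names (build_inverted_index, search), the document IDs and the sample texts are all hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(documents):
    """Map each token to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return the IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical unstructured documents for illustration only.
docs = {
    "email-1": "Customer complaint about late delivery",
    "email-2": "Delivery confirmed, customer satisfied",
    "memo-1": "Quarterly risk review",
}
index = build_inverted_index(docs)
hits = search(index, "customer delivery")
```

This only matches exact words. It does not address the semantic-meaning question in the first bullet, which typically needs more than keyword lookup.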
Semi-Structured Data
• Does not conform to a structural format like relational or other standard formats.
• E.g. XML, email, etc.
Unstructured Data
• Contains files of various formats, sizes, structure, etc.
• E.g. document collections (text), social interactions, images, video, audio, etc.
Company data from various sources:
• Files
• Emails
• Audio
• Videos
• Pictures
• Social Media
What are some business applications of unstructured data?
Value from Unstructured Data
• Corporate Forensics: E-Discovery, Risk Management, Compliance
• Consumer Facing: Brand Monitoring, Product Development, Pandora
• Call Centers: Open Complaints, Emotion Detection