Data Mapping and Exchange
CHERYL JARANILLA
INSTRUCTOR
Data Mapping
The process of taking one set of data (known as the
“source”) and assigning, or “mapping”, it to its destination
(known as the “target”).
Each platform speaks its own language when it comes to data.
Data mapping acts as a translator that bridges the gap, so
your data can be migrated, integrated, or transformed
accurately from its source to its destination.
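As a minimal sketch of the idea, a mapping can be written as a lookup table from source field names to target field names. The field names and the sample record below are hypothetical, chosen only for illustration.

```python
# Hypothetical mapping: source field name -> target field name.
FIELD_MAP = {
    "cust_name": "full_name",
    "cust_email": "email",
    "dob": "birth_date",
}


def map_record(source: dict, field_map: dict) -> dict:
    """Copy each mapped source field into its target field.

    Source fields that are missing from the record are simply skipped.
    """
    return {
        target: source[src]
        for src, target in field_map.items()
        if src in source
    }


# Made-up source record used only to show the mapping in action.
source_record = {
    "cust_name": "Ana Cruz",
    "cust_email": "ana@example.com",
    "dob": "1990-05-17",
}
target_record = map_record(source_record, FIELD_MAP)
print(target_record)
```

The same values arrive in the target, but under the names the target system expects.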
Use Cases of Data Mapping
Data Integration – bring all your data to a centralized
location and normalize two different sets of data into a
single stream (take both data sets, remove duplicate info,
and apply a common format).
Data Migration – move data from one location to a similar
but structurally different location.
Data Transformation – translate data from one format to
another.
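The integration use case (combine two data sets, remove duplicates, apply one format) might look like the sketch below. The two sources, their field names, and the choice of email as the duplicate key are all assumptions made for the example.

```python
# Two hypothetical sources describing customers in slightly different shapes.
crm_rows = [
    {"Email": "ANA@EXAMPLE.COM ", "Name": "Ana Cruz"},
    {"Email": "ben@example.com", "Name": "Ben Reyes"},
]
shop_rows = [
    {"email": "ana@example.com", "name": "Ana Cruz"},
    {"email": "carla@example.com", "name": "Carla Santos"},
]


def normalize(row: dict) -> dict:
    """Lower-case the keys and tidy the values so both sources share one format."""
    clean = {key.lower(): value.strip() for key, value in row.items()}
    clean["email"] = clean["email"].lower()
    return clean


def integrate(*sources: list) -> list:
    """Merge all sources into one list, keeping the first record seen per email."""
    merged = {}
    for rows in sources:
        for row in rows:
            record = normalize(row)
            # setdefault keeps the existing record, so later duplicates are dropped.
            merged.setdefault(record["email"], record)
    return list(merged.values())


customers = integrate(crm_rows, shop_rows)
for customer in customers:
    print(customer)
```

Ana appears in both sources but only once in the integrated result, because her email is normalized before the duplicate check.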
Techniques in Data Mapping
Automated – requires specialized software that will take
new data and match it to your existing structure/schema.
Semi-automated – also known as “schema mapping”. You work
with software that creates the connections between the
different sources and targets. Once the process has been
mapped, the team manually checks the result and makes any
necessary changes.
Manual – requires a developer who can code the rules that
transfer or inject data from each source field to its target field.
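For the manual technique, the hand-coded rules could be kept in a table that pairs each target field with a source field and a conversion function. This is one possible layout, not a standard one; every field name and date format below is hypothetical.

```python
from datetime import datetime


def to_iso_date(text: str) -> str:
    """Convert a hypothetical MM/DD/YYYY source date to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(text, "%m/%d/%Y").date().isoformat()


# Each rule: target field -> (source field, conversion function).
RULES = {
    "full_name": ("name", str.title),
    "age": ("age", int),
    "signup_date": ("joined", to_iso_date),
}


def apply_rules(source: dict, rules: dict) -> dict:
    """Build the target record by running every rule against the source record."""
    target = {}
    for target_field, (source_field, convert) in rules.items():
        target[target_field] = convert(source[source_field])
    return target


# Made-up source row: everything arrives as text and needs converting.
row = {"name": "ana cruz", "age": "34", "joined": "03/09/2021"}
mapped = apply_rules(row, RULES)
print(mapped)
```

Keeping the rules in data rather than scattered through the code makes them easier for a reviewer to check field by field.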
Metadata
Information that describes and explains data.
Provides context with details such as source, type, owner,
and relationships to other data sets.
Data vs Metadata
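A small sketch of the distinction: the data is the set of records themselves, while the metadata describes them (source, type, owner, relationships). The records, file name, owner, and related data set names are invented for illustration.

```python
# The data: the actual records.
data = [
    {"student_id": 1, "name": "Ana Cruz", "grade": 92.5},
    {"student_id": 2, "name": "Ben Reyes", "grade": 88.0},
]

# The metadata: information about those records (all values hypothetical).
metadata = {
    "source": "registrar_export.csv",
    "owner": "Registrar's Office",
    "row_count": len(data),
    # Column name -> type name, read from the first record.
    "columns": {name: type(value).__name__ for name, value in data[0].items()},
    "related_datasets": ["enrollment", "courses"],
}

print(metadata["row_count"])
print(metadata["columns"])
```

Note that the metadata says nothing about any individual student; it only gives the context needed to interpret the data.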
Metadata types
Technical – structure of the data (row and column counts, data types, etc.)
Governance – governance terms, ownership info, etc.
Operational – flow of data (dependencies, code, runtime)
Collaboration – data-related comments, discussions, and issues
Quality – quality metrics and measures (dataset status, test runs, etc.)
Usage – how much a dataset is used (view count, popularity, top users, etc.)
XML
Extensible Markup Language
Defines a set of rules for encoding documents in a format
that is both human and machine-readable.
Designed to store and transport data
Self-descriptive
Design goals focus on simplicity, generality, and
usability across the Internet.
Syntax Rules
XML Prolog – optional; if present, it must be the first line of the document
Root – parent of all elements
Case sensitive
Proper Nesting
Escape reserved characters with the predefined entity references – &lt; &gt; &amp; &apos; &quot; (for < > & ' ")
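The syntax rules above can be checked with a parser. The sketch below uses Python's standard `xml.etree.ElementTree` and `xml.sax.saxutils.escape`; the sample document and the helper name `is_well_formed` are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Optional prolog first, then a single root element that contains everything else.
document = """<?xml version="1.0"?>
<catalog>
  <book id="b1">
    <title>Tom &amp; Jerry</title>
  </book>
</catalog>"""

root = ET.fromstring(document)
print(root.tag)                       # the root element
print(root.find("book/title").text)   # &amp; is read back as &


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<a><b></b></a>"))   # properly nested
print(is_well_formed("<a><b></a></b>"))   # improperly nested
print(is_well_formed("<Note></note>"))    # tag names differ in case
print(is_well_formed("<a>1 < 2</a>"))     # raw < inside text

# escape() replaces &, < and > with their entity references.
print(is_well_formed(f"<a>{escape('1 < 2')}</a>"))
```

Only the first and last checks succeed; the other three each break one of the rules (nesting, case sensitivity, unescaped reserved character).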
XSLT
Extensible Stylesheet Language Transformations
Allows a stylesheet author to transform a primary XML
document in two significant ways: manipulating and sorting
the content, including a wholesale reordering of it if so
desired, and transforming the content into a different
format.
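The two kinds of transformation (reordering the content and changing its format) are illustrated below. This is not XSLT itself: Python's standard library has no XSLT processor, so the sketch performs the equivalent steps by hand with `xml.etree.ElementTree`, roughly what a stylesheet using `xsl:sort` and an HTML output template would do. The student data is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document, deliberately out of alphabetical order.
source = ET.fromstring(
    "<students>"
    "<student><name>Carla</name><grade>85</grade></student>"
    "<student><name>Ana</name><grade>92</grade></student>"
    "<student><name>Ben</name><grade>88</grade></student>"
    "</students>"
)

# Step 1, reorder the content: sort the students by name.
ordered = sorted(source.findall("student"), key=lambda s: s.findtext("name"))

# Step 2, change the format: emit an HTML list instead of the original XML vocabulary.
html_list = ET.Element("ul")
for student in ordered:
    item = ET.SubElement(html_list, "li")
    item.text = f"{student.findtext('name')}: {student.findtext('grade')}"

result = ET.tostring(html_list, encoding="unicode")
print(result)
```

With a real XSLT processor the same logic would live in a separate stylesheet, so the transformation could be changed without touching program code.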
DTD
Document Type Definition
Used to define document structure with a list of legal
elements and attributes.
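To show what a DTD looks like, the sketch below embeds one inside a hypothetical note document. One caveat: `xml.etree.ElementTree` reads the document but does not validate it against the DTD, so the check at the end restates the DTD's rules by hand. A validating parser would be needed to enforce the DTD directly.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE block lists the legal elements and attributes for a note.
document = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST note priority CDATA #IMPLIED>
]>
<note priority="high">
  <to>Ana</to>
  <from>Ben</from>
  <body>Class starts at 9.</body>
</note>"""

# Hand-written copy of the DTD's rules (ElementTree does not validate).
LEGAL_CHILDREN = ["to", "from", "body"]
LEGAL_ATTRIBUTES = {"priority"}

note = ET.fromstring(document)
children_ok = [child.tag for child in note] == LEGAL_CHILDREN
attributes_ok = set(note.attrib) <= LEGAL_ATTRIBUTES
print(children_ok, attributes_ok)
```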
JSON
JavaScript Object Notation
Format for structuring data
Supports data structures like arrays and objects; JSON
documents can be parsed quickly on the server.
Language-independent format that is derived from
JavaScript
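Although JSON is derived from JavaScript, other languages can read and write it. The sketch below uses Python's standard `json` module to turn a made-up student record (an object containing an array and a nested object) into JSON text and back.

```python
import json

# Hypothetical record combining the JSON value types:
# string, boolean, numbers, array, object, and null.
student = {
    "name": "Ana Cruz",
    "enrolled": True,
    "grades": [92.5, 88, 79],
    "address": {"city": "Manila", "zip": "1000"},
    "adviser": None,
}

# Serialize: Python values -> JSON text.
text = json.dumps(student, indent=2)
print(text)

# Parse: JSON text -> Python values.
restored = json.loads(text)
print(restored == student)
```

In the printed text, Python's `True` and `None` appear as JSON's `true` and `null`, which shows that the format has its own notation independent of the language producing it.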
Features of JSON
Easy to understand: easy to read and write
Format: text-based interchange format. An array can store
values of any JSON type
Support: light-weight and supported by almost every
language and OS
Dependency: language-independent, and generally faster to
process than other text-based structured data formats