Guidance for Customer Data Platform on AWS

This Guidance shows how to build a customer data platform with a full, 360-degree profile view of customer data. It explores each stage of building the platform, including data ingestion, identity resolution, segmentation, analysis, and activation.

[Architecture diagram. Panels: First and Third-Party Data Sources; Data Ingestion; Near Real-Time Data Stream Processing; Batch Processing and Identity Resolution; Storage; Segmentation; Data Activation; Data Collaboration; Data Consumption; Data Analytics; Destinations and use cases; Data Governance.]

1. Data sources for building a customer 360 profile include website and mobile application events, advertising events, social media events, and transactional data from multiple systems of record and third-party data sets. This data is available for consumption in multiple formats and protocols, for example software as a service (SaaS) applications, batch files, cloud data shares, databases, and data marketplaces.

2. Near real-time data ingestion is achieved through Amazon Kinesis, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and Amazon API Gateway. Batch data ingestion uses AWS Transfer Family, AWS Database Migration Service (AWS DMS), and Amazon AppFlow. The Amazon AppFlow Custom Connector Software Development Kit (SDK) is used to build custom connectors that pull data from system-of-record APIs. AWS Data Exchange subscriptions provide access to third-party data in multiple modes.

3. In near real-time data stream processing, the ingestion services collect data, apply near real-time data transformations using AWS Lambda, and store the data in Amazon DynamoDB. A DynamoDB stream is used to propagate data downstream in near real time using Lambda.
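
The stream-processing step can be illustrated with a minimal sketch of a Lambda function that receives a batch of Kinesis records, applies a small transformation, and hands each item to a DynamoDB table. The event field names (`customerId`, `eventType`, `timestamp`), the item layout, and the optional `table` parameter are assumptions made for illustration; they are not part of the Guidance. In a deployed function, `table` would typically be a boto3 DynamoDB `Table` resource.

```python
import base64
import json


def transform(payload):
    """Normalize one raw event into a flat item (field names are hypothetical)."""
    return {
        "customer_id": str(payload["customerId"]).strip(),
        "event_type": payload.get("eventType", "unknown").lower(),
        "event_time": payload["timestamp"],
    }


def handler(event, context=None, table=None):
    """Lambda entry point for a Kinesis event source mapping.

    Kinesis delivers each record body base64-encoded under
    record["kinesis"]["data"]. `table` is an assumption: any object with a
    put_item(Item=...) method, such as a boto3 DynamoDB Table resource.
    """
    items = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        item = transform(json.loads(raw))
        if table is not None:
            table.put_item(Item=item)  # persist the transformed item
        items.append(item)
    return {"processed": len(items), "items": items}
```

Keeping `transform` separate from the handler makes the transformation testable without any AWS resources.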

4. In batch data processing, the ingestion services collect and store raw data in Amazon Simple Storage Service (Amazon S3).

5. AWS Step Functions orchestrates AWS Glue data pipeline jobs to clean and validate data. The cleansed data is passed to an identity resolution workflow. This workflow is built using AWS Entity Resolution.
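
As a rough sketch of the orchestration described in this step, the following builds an Amazon States Language definition that runs two AWS Glue jobs in sequence and then triggers identity resolution. The job names, state names, and the Lambda function that starts the AWS Entity Resolution matching job are hypothetical; the Guidance does not specify how the Entity Resolution workflow is invoked, so routing it through a Lambda task is an assumption.

```python
import json


def build_definition(cleanse_job, validate_job, matching_function):
    """Return a state machine definition as a dict (all names are hypothetical)."""

    def glue_task(job_name, next_state):
        # The ".sync" suffix makes Step Functions wait for the Glue job to finish.
        return {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": job_name},
            "Next": next_state,
        }

    return {
        "Comment": "Cleanse and validate data, then resolve identities",
        "StartAt": "CleanseData",
        "States": {
            "CleanseData": glue_task(cleanse_job, "ValidateData"),
            "ValidateData": glue_task(validate_job, "ResolveIdentities"),
            # Assumption: a Lambda function starts the Entity Resolution
            # matching workflow; the Guidance does not say how it is started.
            "ResolveIdentities": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": matching_function, "Payload.$": "$"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_definition("cdp-cleanse", "cdp-validate", "cdp-start-matching")
    print(json.dumps(definition, indent=2))
```

The printed JSON could be supplied as the definition when creating a state machine.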

6. Data processing and the transient data storage for the identity resolution workflow use the clean zone Amazon S3 bucket. The Amazon S3 curated zone bucket stores the final output of data processing for consumption.

7. The unified customer profile is stored in Amazon S3 and used for segmentation. Artificial intelligence and machine learning (AI/ML) models for segmentation are developed and deployed using Amazon SageMaker. The unified view of customer profiles for contact center applications is stored in Amazon Connect Customer Profiles. Next Best Item recommendations for cross-sell or upsell are created from the unified customer view using Amazon Personalize.

Reviewed for technical accuracy August 15, 2023
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Reference Architecture

8. Amazon Pinpoint uses the unified customer profile to conduct multi-channel outbound marketing. Amazon Connect uses the unified customer profile to enhance the customer’s experience in call centers. Audience upload to advertising platforms is done using Amazon AppFlow integrations.

9. AWS Clean Rooms is used for privacy-enhanced data collaborations to support media planning, audience activation, and measurement use cases. The customer 360 profile is made available for API-based consumption using DynamoDB, Lambda, and API Gateway.
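
The API-based consumption path can be sketched as a Lambda function behind an API Gateway proxy integration that looks up one profile in DynamoDB. The path parameter name `customerId`, the key attribute `customer_id`, and the `make_handler` factory are assumptions for illustration; `table` stands in for a boto3 DynamoDB `Table` resource, or any object with a compatible `get_item(Key=...)` method.

```python
import json


def _response(status, body):
    """Build a response in the shape expected by a Lambda proxy integration."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        # default=str covers values json cannot encode natively, such as the
        # Decimal numbers returned by the boto3 DynamoDB resource.
        "body": json.dumps(body, default=str),
    }


def make_handler(table):
    """Bind the handler to a table object (hypothetical helper)."""

    def handler(event, context=None):
        # pathParameters is None when the request carries no path parameters.
        params = event.get("pathParameters") or {}
        customer_id = params.get("customerId")
        if not customer_id:
            return _response(400, {"error": "customerId is required"})

        result = table.get_item(Key={"customer_id": customer_id})
        item = result.get("Item")
        if item is None:
            return _response(404, {"error": "profile not found"})
        return _response(200, item)

    return handler
```

Passing the table in through a factory keeps the lookup logic independent of AWS credentials, so it can be exercised locally with a stub.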

10. Amazon Redshift stores clean, modeled data for fast and repeated queries. Amazon QuickSight provides large-scale data analysis and visualization. Amazon Athena enables data exploration and querying.

11. Customer 360 profile data is uploaded to paid media ad platforms such as Amazon Marketing Cloud and Amazon DSP for online media targeting. Marketing platforms and other SaaS solutions use the customer 360 profile data for marketing and data monetization use cases. Media platforms use customer 360 profiles for website and mobile app personalization.

12. AWS Lake Formation defines access controls on AWS Glue catalog tables, columns, and rows in the data lake. AWS Identity and Access Management (IAM) securely manages identities and access to AWS services and resources.
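
To make column-level access control concrete, the sketch below assembles the request for a Lake Formation `SELECT` grant on specific columns of a Glue catalog table. The helper name, principal ARN, database, table, and column names are all hypothetical; only the request shape follows the Lake Formation `GrantPermissions` API.

```python
def column_select_grant(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant.

    The result is intended for the boto3 Lake Formation client, for example:
        boto3.client("lakeformation").grant_permissions(**request)
    """
    if not columns:
        raise ValueError("at least one column is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


if __name__ == "__main__":
    # Hypothetical values: an analyst role that may read only two columns.
    request = column_select_grant(
        "arn:aws:iam::111122223333:role/cdp-analyst",
        "cdp_curated",
        "unified_profile",
        ["customer_id", "segment"],
    )
    print(request)
```

Restricting the grant to named columns keeps sensitive attributes out of reach of the principal while IAM continues to govern access to the services themselves.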