08 Using and Migrating Database Services
08 Using and Migrating Database Services
1
Migration to Huawei Cloud
Module 8: Using and Migrating Database Services
Objectives
⚫ Upon completion of this course, you will:
Understand the main steps and outputs of database migration.
Understand how to migrate existing complex databases to Huawei
Cloud.
Acquire some hands-on experience.
3
Contents
1. Introduction to Key Milestones
4
Key Milestones of a Database Migration Project
Day T+60
Day T+40 1. Complete data Day T+150 Day T+160
Day T Make service changes synchronization.
Day T+20 Complete Complete
Initiate a to match the new 2. Finalize the
Provide the service project
project. database migration
migration design. migration. acceptance.
architecture. runbook.
▪ Optimize the survey ▪ Plan the migration based ▪ Develop a step-by-step ▪ Add the database
template and survey the on the source database solution based on the migration procedures to
source database. The details and the design requirements of the migration runbook
template and the survey customer's migration the database solution. and use scripts to
will be used to design requirements, including monitor the source and
the specifications of the technologies used, destination during the
the destination database the migration principles migration phase to
and plan the migration. applied, and the user ensure data consistency.
▪ Export the source permissions required.
database details. ▪ After the migration
solution is confirmed,
generate a list of all
source and destination
databases.
6
Information Survey
IP Address
Number of Total Number of Memory Memory
PM/VM/Cloud OS Version CPU Usage
CPUs CPU Cores (GB) Usage Internal Network Internet
Storage Resource
Local Storage
Whether to Use the Local Disk Whether the Maximum Space Used
Disk Type (SATA/SAS/SSD) Disk Capacity (TB) Disk Utilization
on the Cloud by an Instance Exceeds 4 TB
7
User Permission Planning
No Usage Service User Database Permission
root (MySQL) Only for the customer's O&M department and not used as independent
4 Administrator
rwuser (MongoDB) service users or read-only users
8
Cyber Security Planning
9
Migration Precautions
No Database Type Migration Principle Precaution
To synchronize a large amount of data, you need to change the
binlog retention period of destination databases to 0, and change
Migrate to destination databases with the same
the values of innodb_flush_log_at_trx_commit and sync_binlog to 2
version as the source database, for example, you
1 MySQL and 500, respectively, to improve the migration efficiency. During
can migrate from 5.6 to 5.6, 5.7 to 5.7, or 8 to
the incremental synchronization period, change the values of the
8.
two parameters to 1 and change the binlog retention period back to
18 hours.
Select the primary database rather than the
The binlog of standby databases of some vendors is non-standard,
2 MySQL standby database as the migration source in a DRS
which may result in lost data.
task.
You are advised to migrate MySQL databases with a • You are advised to design sharding for DDM. Perform DDM
3 DDM large amount of data to Distributed Database operations on DDM instances, rather than RDS instances.
Middleware (DDM). • Perform DDM operations on the console.
Create a test database instance for each major
version. Databases with different names on the MongoDB does not support the mapping of database names and table
4 MySQL and MongoDB
source can be migrated to an instance of the same names.
major version.
The migration from a later version to an earlier version fails to
Migrate to destination databases with the same
pass some pre-check items. If advanced functions exclusive for the
5 MongoDB major version as the source database, for example,
later version are not used, ask DRS maintenance personnel to skip
you can migrate from 3 to 3.4 or 4.2 to 4.0.
the check items.
• In MongoDB 4.0, each service user must have the read permission
on the admin database.
Create service users on the destination for the
6 MongoDB • In MongoDB 3.0, each service user must have the read and
admin database for unified management.
readAnyDatabase permissions on the admin database to read the
10 database locally.
Migration Planning
Current Number of Number of Recommended Huawei Cloud Database Full Migration
Version Cluster Architecture Migration Method Migration Tool
Database Instances Databases Service Time
Cloud service in RDS for MySQL 5.6 in
5.5 1
primary/standby mode primary/standby mode
Cloud service in
RDS for MySQL 5.6 in
5.6 primary/standby or single- 9
primary/standby or single-node mode
node mode
Cloud service in
RDS for MySQL 5.7 in
MySQL 5.7 primary/standby or single- 61 121 Online migration
primary/standby or single-node mode
node mode
Three-Node enterprise RDS for MySQL 5.7 in
5.7 1
edition primary/standby mode
Cloud service in
RDS for MySQL 8.0 in
8 primary/standby or single- 34
primary/standby or single-node mode
node mode
Cloud service in RDS for PostgreSQL 9.5 in Online migration
9.4 1 DRS
primary/standby mode primary/standby mode (full)
Cloud service in RDS for PostgreSQL 10 in Online migration
PostgreSQL 10 1 15
primary/standby mode primary/standby mode (full+incremental)
Cloud service in RDS for PostgreSQL 12 in Online migration
12 7
primary/standby mode primary/standby mode (full+incremental)
3.4 Single-Node 1 DDS 3.4 in single-node mode
4 Single-Node 2 DDS 4.0 in single-node mode
3.2 Three-Node 6 DDS 3.2 in three-node mode
3.4 Three-Node 10 DDS 3.4 in three-node mode
Online migration
MongoDB 4 Three-Node 30 125 DDS 4.0 in three-node mode
(full+incremental)
4.2 Three-Node 25 DDS 4.0 in three-node mode
3.4 Five-Node 2 DDS 3.4 in five-node mode
4 Five-Node 3 DDS 4.0 in five-node mode
4.2 Five-Node 1 DDS 4.0 in five-node mode
1.7 HA 1 GaussDB(for Influx) 1.7 in two-node Non-commercial
InfluxDB 1 Full
1.8 Basic 1 mode tool
11
Migration Implementation and Efficiency
Improvement
⚫ Using a script to batch-create migration users with the minimum
permissions needed
⚫ Creating service users and granting permissions in batches
⚫ Data migration implementation solution
12
Migration Implementation
⚫ The primary node of a DDS instance must be connected. If the secondary
node of a DDS instance is connected, data cannot be updated.
13
Migration Implementation
⚫ The DDS password must contain uppercase letters, lowercase letters,
digits, and special characters. The recommended special characters are
asterisks (*) and exclamation marks (!). Other special characters may
be truncated on your client and an insufficient permissions error will
be reported.
14
Migration Implementation
⚫ For DDS 3.4, the read and readAnyDatabase permissions on the admin
database need to be added for local login. For DDS 4.0 and later
versions, only the read permission on the admin database needs to be
added.
15
Migration Drill
⚫ Due to complex service associations, you can revoke write permissions and grant read
permissions to reduce how long your database services are unavailable.
⚫ Scripts can be used to:
Monitor write operations on source MySQL databases.
Revoke permissions on the migration source and kill sessions.
Grant write permissions on Huawei Cloud databases.
Monitor the write status on Huawei Cloud databases.
Monitor source
databases and Grant write Check whether
Revoke the Stop applications Release
ensure that no permissions on data can be
write on the migration applications on
write operations Huawei Cloud written to Huawei
permissions. source. Huawei Cloud.
are performed on databases. Cloud databases.
them.
16
Contents
1. Introduction to Key Milestones
17
MySQL Architecture Evolution — Single-Node
Architecture
Characteristics
Application Applications directly read (SELECT) from and write (INSERT, DELETE, and UPDATE) to a single node.
Advantages
Read Write The architecture is simple and the O&M is easy.
Disadvantages
There may be single points of failure (SPOFs): As workloads grow and data volume increases,
available storage may be insufficient and read and write performance may be affected (low CPU
CPU and memory
usage, memory usage, and storage IOPS).
Scale Up: vertical scaling. You can use higher specifications (CPU and memory) or high-end storage,
or scale the storage capacity.
Scale Out: horizontal scaling
✓ Grouping: primary/standby architecture and read/write splitting. All nodes in the same group use
Solutions
primary/standby replication to ensure data consistency. This architecture is used to enhance the
database read capability.
✓ Sharding: distributed architecture and sharding. Table data is horizontally split into several
copies. Each copy of data can be treated as a shard. The sum of all shards forms full data.
Local disk Different shards may reside on different MySQL instances. This architecture is used to enhance
the database write capability.
18
MySQL Architecture Evolution — Primary/Standby
Architecture+Read/Write Splitting
Characteristic
Application
✓ Data changes on the primary node are synchronized to the standby node in real time.
✓ Data stored on each data node is consistent.
s
✓ Applications write data to the primary node, but read data from the standby node or
from both the primary and standby nodes.
Write
Read Read
Advantages
Primary node Standby node
(CPU and (CPU and The database read (SELECT) capability in the monolithic architecture is enhanced.
memory) memory)
19
MySQL Architecture Evolution — Distributed
Architecture+Sharding
Characteristi
✓ Table data is horizontally split into several copies. Each copy of data can
be treated as a shard.
cs
Application ✓ The sum of all shards forms full data.
Advantages
The shared-nothing architecture provides higher write performance (INSERT,
Write
DELETE, and UPDATE) than that in a monolithic architecture. In a shared-
nothing architecture, different data is routed to different data nodes, so
Write Read the maximum IOPS is much higher than with a single storage node.
20
Database Sharding — Vertical Sharding
⚫ If there is too much data in a database table and database performance
is affected, you can split data using vertical sharding or horizontal
sharding.
⚫ Vertical sharding is classified as either vertical table sharding or
vertical database sharding. Post
Publishe Post ID Title Publisher Posted Content
Post ID Title Posted Content ID
r
1 1 1
2 2 2
3 3 3
4 4 4
21
Database Sharding — Vertical Sharding
⚫ The basic idea of vertical database sharding is to divide databases by service module
and place data tables with highly coupled database transactions in the same database.
⚫ For table sharding and vertical database sharding, the most important concern is the
customer's services.
Unified
User center Policy center Settlement center Organization
authentication
database database database center database
center database
Service mid-end Payment center Insurance type Product center Risk control Message center
database database center database database center database database
Electronic
Engine center E-credential Financial center Report center
archive center
database center database database database
database
22
Database Sharding — Horizontal Sharding
⚫ Horizontal sharding consists of horizontal table sharding and horizontal database
sharding.
Basic Personnel Basic Personnel Info
Info Basic Personnel Info
- RID
Basic Personnel Info
Basic information - RID Basic information - RID
- Name
center database - Name center database - RID
...- Name
... ...- Name
...
Example of horizontal
table sharding
23
Database Sharding — Horizontal Sharding
⚫ After horizontal database sharding, you do not need to perform sharding again. Each database
shard has only one table and there is no child table.
Basic Personnel
Basic Personnel Basic Personnel
Info Basic Basic
Info_0 Info_0
- RID information ... information
- RID - RID
Basic information - Name center center
database_0
- Name database_1
- Name
center database ...
25
Distributed Databases — Key Concepts
⚫ Sharding keys
Database sharding keys
Table sharding keys
Rule for selecting sharding keys
⚫ Sharding algorithms
The value of a sharding key determines the database shard and table shard that a record should be routed to.
General sharding algorithms include hash algorithms and time-based hash algorithms.
Publishe
Post ID Title Posted Content
r
Publishe 1
Post ID Title Posted Content
r 3
1
2
3
4 mod(Post ID,2) Post ID Title Publisher Posted Content
2
4
26
Distributed Databases — Key Concepts
⚫ Relationships between database shards and MySQL DB instances
A MySQL DB instance can contain one or more database shards. This means that there can be
multiple databases on a MySQL DB instance. The number of database shards on a MySQL DB
instance depends on the storage IOPS and cannot be increased without limit. If the TPS/QPS
of an application is high, that means there are too many database shards in a single MySQL
DB instance. In this case, data needs to be sharded and distributed on multiple MySQL DB
instances.
27
Distributed Databases — Logical Architecture
⚫ Logically, a distributed database consists of the storage layer, compute layer, and metadata
layer.
Compute layer: It is the SQL layer in a single-node database. It is used to perform operations such as permission
check, route access, distributed transaction control, and calculation result summary and sorting on data access.
Metadata layer: It records the number of storage nodes in a distributed database cluster and metadata such as IP
addresses and ports.
Storage layer: It is used to store actual service data. DB
Distributed Distributed DB
[compute layer] [metadata layer]
28
Distributed Databases — Logical Architecture
✓ Schemas and logical tables ✓ Physical databases and physical tables
The compute layer stores schemas and logical tables and provides The storage layer stores physical databases and physical tables, which
unified access for external applications. store actual physical data.
Applications access schemas and logical tables as if they were Generally, one schema corresponds to multiple physical databases, and
accessing database tables in a single-node database. one logical table corresponds to multiple physical tables.
Schemas and logical tables shield the implementation details of The mappings between schemas and physical databases and between
underlying physical databases and physical tables from logical tables and physical tables are stored at the metadata layer.
applications.
Schemas and logical tables do not store actual physical data.
Distributed DB Distributed DB
[compute layer] [metadata layer]
Schema
db t Metadata
Logical
table
Physical
Distributed DB Distributed DB Distributed DB Distributed DB database
[storage layer] [storage layer] [storage layer] [storage layer]
t_0 t_0 t_0 t_0 Physical
t_1 t_1 t_1 t_1 table
t_2 t_2 t_2 t_2
29
Distributed Databases — Logical Architecture
⚫ Multiple schemas can be created in a distributed Distributed DB Distributed DB
database (at the compute layer). A schema can [compute layer] [metadata layer]
contain multiple logical tables. A MySQL
db1 x db2 y Metadata
instance (storage layer) can store database
shards and table shards corresponding to
multiple schemas.
31
DRS Technical Principles
⚫ In accordance with the full+incremental principle of DRS, when data is migrated from other distributed
databases to DDM, the source needs to be connected to the storage layer of the distributed database
because the storage layer has full physical data and binary logs that record changes in real time. The
destination needs to be connected to DDM, the compute layer of the distributed database, because SQL
statement playback needs to pass through the compute layer, and then related metadata is saved to the
metadata layer. This ensures that subsequent data query requests entering DDM are routed correctly.
Logs
DRS Data
captured replayed
Data
DRS flushed to
Log file disks
⚫ What is synchronization?
Synchronization means that the source and destination databases are retained at the same time. The source and destination databases
usually use different service systems. Data is shared and synchronized between the two databases in the unit of some tables. Data
continuously flows from the source system to the destination system for a long time.
34
Survey of the Source
⚫ Survey the following aspects of the source:
1 2 3 4
Source Sharding scheme Data volume of
Migration scope
construction of the source the source
5 6 7 8
Source database Future service
KPIs Other
resource status growth
35
Survey of the Source
⚫ Source construction
Learn whether the source is an IDC database, database built on the cloud, or cloud database
service.
This information will let you know if the source has any special restrictions. For example,
the values of some key parameters of cloud databases provided by different cloud vendors
may be different, and migration prerequisites may be different.
⚫ Migration scope
Environment: development environment, test environment, and production environment
Migration level: instance level, database level, and table level
⚫ The information determines the scope of subsequent survey and is used as the input
for selecting the scope when creating a DRS migration or synchronization task.
36
Sharding scheme of the source
⚫ Whether to use distributed middleware
For example, whether to use the application code logic, or to invoke a sharding framework such as Sharding-JDBC or ShardingSphere in an
application program, or to use a third-party independent distributed database middleware
⚫ Sharding rules
Determine the total number of schemas and the number of logical tables corresponding to each schema.
Determine the sharding mode of each table, for example, database and table sharding, database sharding without table sharding, and table
sharding without database sharding.
Determine the naming rules of database shards and table shards.
Determine the mapping between database shards and MySQL instances. For example, determine how many database shards are stored in a MySQL
instance and what the database shard names are.
39
Source Database Monitoring
⚫ MySQL key monitoring information
Collect key monitoring information about the source MySQL database, such as the CPU usage, memory
usage, average IOPS, and disk usage. You can use these details to evaluate the resources of the
destination.
Collect information about workload fluctuations, for example, whether there are peak hours in the
morning, at noon, or in the evening and the corresponding periods, whether there are batch tasks and
when they are processed, whether there are specific service peak hours (such as during weekly or
monthly settlement) per week or month, and whether there are traffic peaks caused by marketing
activities or promotions.
You can evaluate the resources on the destination based on the collected information and the database
resource monitoring information in the corresponding periods. For example, you can evaluate whether
higher performance specifications are required. Alternatively, evaluate whether the application
architecture needs to be optimized. For example, you can introduce caches to reduce read pressure, or
introduce message queues to reduce write pressure. You can also determine whether to control the
traffic flow during the creation of a DRS task, as well as set the traffic limiting period, and
40
determine whether to start the task immediately upon creation.
More Information Such as Data Volume and Growth
❖ Future service growth
▪ Data increments in the next N years
The customer needs to evaluate the data increments in the next N years based on current service volume. This
information helps you evaluate the destination database resources.
▪ Termination plan in the future
This information provides the reference input for the migration solution.
❖ KPIs
▪ TPS /QPS
This information helps you evaluate the resources of the destination.
▪ IOPS
This information helps you optimize the performance of the destination.
❖ Other
▪ Backup and restoration
The surveys of the backup mode, backup period, and backup frequency of the source provide inputs for configuring the
backup policy of the destination.
▪ Interconnection with related systems
For example, check whether the service system is interconnected with a data warehouse, whether other systems need to
periodically extract data from the source database, and whether the DDM read-only group capability is recommended.
41
Contents
1. Introduction to Key Milestones
42
RDS TPS/QPS
To evaluate RDS resources, factors such as the number of shards, TPS/QPS, storage, and HA must be considered.
Evaluate the TPS/QPS based on live-network workloads, and then evaluate the number of RDS DB instances needed based on the
performance baseline.
Note 1: You need to log in to the console to check whether there are dedicated MySQL instances of different versions in each
region.
Note 2: The performance baselines are obtained based on the sysbench stress tests.
MySQL Version Instance Type Performance Baseline
General-purpose https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_06.html
MySQL 5.6
Dedicated https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_07.html
General-purpose https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_09.html
MySQL 5.7
Dedicated https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_10.html
General-purpose https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_11.html
MySQL 8.0
Dedicated https://support.huaweicloud.com/intl/en-us/pwp-rds/rds_swp_mysql_12.html
The following uses the reference performance baseline of MySQL 5.7 general-purpose DB instances as an example.
CPU Cores Memory Connections (Stress TPS QPS CPU Cores Memory Connections (Stress TPS QPS
Test Value) Test Value)
2 4 1,500 334 6,673 2 8 2,500 552 11,039
4 8 2,500 756 15,122 4 16 5,000 1,062 21,249
8 16 5,000 1,338 26,756 8 32 10,000 2,117 42,335
16 32 10,000 2,630 52,000 16 64 18,000 3,016 60,336
If the application QPS is 50,000 and you use RDS DB instances with 8 CPU cores and 16 GB memory, two RDS DB instances are required.
43
RDS Sharding Design
✓ Database and table sharding recommendations
If there are less than 50 million data records, you are advised to use a single table.
If there are more than 50 million data records, sharding is recommended. It is recommended that the number of records in a
single table be no more than 10 million.
For example, if there are 1 billion data records and a single table contains 10 million records, a total of 100 table
shards are required.
✓ RDS DB instance sharding recommendations
For RDS DB instances with medium performance specifications (8 CPU cores and 16 or 32 GB memory), you are advised to use
16 shards for a single instance.
For RDS DB instances with high performance specifications (16 CPU cores and 32 or 64 GB memory), you are advised to use 32
shards for a single instance.
Note 1: Calculate the total number of shards and the number of RDS DB instances based on the table with the maximum number
of records in a service database.
Note 2: The number of RDS DB instances is more important than the performance specifications of the instances. If resources
are insufficient, adding more RDS DB instances is more effective than upgrading specifications. RDS DB instances with low
specifications can be scaled up in minutes if resources are sufficient. If high-specification RDS DB instances are used,
your only choice is to scale out. Scale-out involves data redistribution and takes hours. It is a slow, risky process.
For example, if there are 1 billion data records and a single table contains 10 million records, a total of 100 database
shards are required. If RDS DB instances with 8 CPU cores and 32 GB memory are used and 16 database shards are used for an
RDS DB instance, you are advised to plan eight RDS DB instances.
44
RDS Storage Space and HA
❑ Evaluate storage capacity from the following factors:
• The total storage capacity needed depends on the existing data volume, on how fast the data is growing (for example,
data growth over the next N years, avoiding scale-out upon rollout), and on how long the data will need to be
retained.
• It is recommended that the disk usage be kept around 60% to 70% of the total disk capacity.
• The storage capacity of a single RDS DB instance on Huawei Cloud is limited to 4 TB.
In summary: Total RDS storage capacity = Data capacity/(60% or 70%)
For example, if the actual data capacity is 5 TB, you are advised to apply for 7.2 TB to 8.4 TB of RDS storage capacity
(5 TB/(60% or 70%) = 7.2 TB to 8.4 TB). Based on the storage capacity limit of a single RDS DB instance, three RDS DB
instances are required.
❑ HA
To avoid single points of failure (SPOFs), select at least primary/standby instances in the production environment.
In scenarios where the service database needs to be provided for other applications, such as big data extraction, you can
apply for at least one read replica to reduce the impact of non-core services on system services.
Comprehensive evaluation
Based on these four factors, determine the performance specifications you will need for your RDS DB instances, how many
database shards will be needed for a single instance, the instance deployment mode, and how many RDS DB instances you will
need.
45
Production Environment Resource Evaluation — DDM
Factors such as TPS/QPS and HA must be considered during DDM resource evaluation.
❑ TPS/QPS
Evaluate the TPS/QPS based on live-network workloads, and then evaluate the number of DDM instance nodes based on the following
performance baselines.
Note: The performance baselines are obtained based on the sysbench stress tests.
❑ HA
1. The total number of CPU cores of all DDM instance nodes is half of that of RDS DB instances.
2. If the first requirement is met, you are advised to use at least two nodes to prevent SPOFs of DDM instances.
Case 1: If the estimated QPS is 40,000, one DDM instance with 8 CPU cores and 16 GB memory can meet the performance baseline
requirements. In this case, to prevent SPOFs, at least two DDM instances with 8 CPU cores and 16 GB memory must be used.
Case 2: A customer's system uses distributed databases. According to the RDS resource evaluation, eight RDS DB instances with 8 CPU
cores and 32 GB memory are required. In this case, 64 CPU cores (8 x 8 cores) are required for RDS instances. Therefore, DDM instance
nodes require a total of 32 cores (0.5 x 64 cores). In this case, you can use four DDM instances with 8 CPU cores and 16 GB memory or
two DDM instances with 16 CPU cores and 32 GB memory.
❑ Comprehensive evaluation
Evaluate the number of required DDM nodes and node performance specifications based on TPS/QPS and HA.
46
Contents
1. Introduction to Key Milestones
47
Migration Solution Design — Service Overview
O&M Client side Applications HTML5 pages APIs
DevOps platform
48
MySQL Migration and Rollback Solution —
Migration
⚫ During the migration, MySQL instance C and RDS for MySQL instance B need to be
retained for a period of time.
Applications
Applications
End
⑤ ⑥
④ ③
⑧
DRS task RDS for MySQL
Self-managed MySQL ① instance B
instance A
DRS task
MySQL instance C
②
Direct ⑦
Connect
49
Distributed Reconstruction Scenario 1 —
Migration Link Design of Shard Migration
Migration source Huawei Cloud
Self-managed ④
MySQL instance A DRS task
①
RDS for MySQL
Self-managed instance D
MySQL instance B ②
Self-managed
MySQL instance C ③
Direct
Connect
Migration scenarios Migration solution
• The source databases and tables are sharded in applications and ① ② ③ Create a DRS real-time synchronization task to
data is also routed in the applications. No third-party synchronize database-level incremental data from
middleware is used. MySQL instances A, B, and C to RDS for MySQL instance
• The total amount of data in all table shards is small. If there ④ D.
is only limited potential for data growth, you can combine Start applications on Huawei Cloud and verify
instances to reduce costs. services.
⑤ ⑥ Stop services on the migration source, compare data,
• Applications do not need to be reconstructed. You just need to
modify the mapping between the domain names and IP addresses of and ensure that the data is correct. Then, stop the
the MySQL instances where shards are located. Combining synchronization task.
databases and tables is not required.
50
Distributed Reconstruction Scenario 1 —
Migration Topology Structure of Shard Migration
Huawei Cloud
RDS 001
shard_002 data_0 ... data_63
MySQL A1 shard_001 data_0 ... data_63
51
Distributed Reconstruction Scenario 2 -
Migration Link Design for Data Aggregation and
Redistribution
Direct Connect
52
Distributed Reconstruction Scenario 2 — Database
Sharding and Table Sharding for Data Aggregation and
Redistribution
Migration source
Huawei Cloud
MySQL A1 DB 1 index_0
DDM
... RDS for MySQL
MySQL A3
DB 3 index_0
✓ To migrate data from a self-managed MySQL database to an RDS for MySQL database,
... create a DRS real-time synchronization (full+incremental) task, map database names and
table names, and aggregate data from multiple tables in multiple shards on the source
to a single table on the destination.
Index_63
✓ To migrate data from Huawei Cloud RDS for MySQL to DDM, create a DRS online migration
(full+incremental) task.
53
Distributed Reconstruction Scenario 2 — Database Sharding
Without Table Sharding for Data Aggregation and
Redistribution
⚫ To migrate data from a self-managed MySQL database to an RDS for MySQL database, create a DRS real-time
synchronization (full+incremental) task, map database names, and aggregate data from multiple shards on the
source to a single table on the destination.
⚫ To migrate data from Huawei Cloud RDS for MySQL to DDM, create a DRS online migration (full+incremental)
task.
DB 1 user_info
Migration
MySQL A1
ugc_category
source Huawei Cloud
RDS for MySQL DDM
DB 2 user_info
DRS real-time
MySQL A2 synchronization DB DRS online DB
ugc_category migration
user_info user_info ugc_category
ugc_category
DB 3 user_info
MySQL A3 RDS RDS RDS
Intermediate
ugc_category database
54
Distributed Reconstruction Scenario 2 — Table Sharding
Without Database Sharding for Data Aggregation and
Redistribution
Migration source Huawei Cloud
user_1
DB DRS online
migration DB
MySQL A1
user user
..
user_127
Intermediate RDS RDS RDS
database
✓ To migrate data from a self-managed MySQL database to an RDS for MySQL database, create a DRS real-time synchronization
(full+incremental) task, map database names and table names, and aggregate data from multiple tables on the source to a single
table on the destination.
✓ To migrate data from Huawei Cloud RDS for MySQL to DDM, create a DRS online migration (full+incremental) task.
55
Advantages and Disadvantages of Solutions in
Different Scenarios
Whether Code
Low Solution Scenario Method Advantage Reconstruction Is
Involved
57
Reflections on Distributed Database Migration
▪ Use DRS real-time ▪ After the data of table shards in ▪ Change the database table ▪ In a distributed database, the
synchronization tasks to each database shard on the source structure on the source and add primary key value of a table
aggregate data of table shards is aggregated to RDS DB instances the uid column. must be globally unique.
in each database shard on the (intermediate databases) on ▪ Reconstruct applications on the Otherwise, data migration or
source to intermediate databases Huawei Cloud using DRS, the data source, so that a uid value is server merging (for example, in
on the destination. comparison shows that the total added to the data before it is the gaming industry) is
▪ Use DRS online migration tasks number of data records on the written into the database. difficult and the reconstruction
to synchronize data from destination is about 100,000 less ▪ Use a script to copy the value of is expensive.
intermediate databases to DDM than that on the source. the uid column in the primary ▪ If distributed middleware is
instances. ▪ After analysis, the primary key table to the uid column in child used, the middleware can be used
values on the source are not tables based on the associated to ensure that the primary key
globally unique and are unique field content_id. value is globally unique.
within each database shard or ▪ Change the database table ▪ If distributed middleware is not
table shard. After data is structure of the intermediate or used, for example, when data
aggregated to a single table in destination databases. Use splitting and routing are
the intermediate or destination content_id and uid as composite performed in an application, a
databases, data with duplicate primary keys. primary key generator can be
key values is deleted. ▪ Bring the new code online. Perform used to ensure global uniqueness
the addition, deletion, of a primary key value.
modification, and query operations
based on composite primary keys.
Summary and
Migration links Migration problems Solutions
reflections
58
Thank You.
Copyright © 2023 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including,
without limitation, statements regarding the future financial and operating
results, future product portfolio, new technology, etc. There are a number of
factors that could cause actual results and developments to differ materially
from those expressed or implied in the predictive statements. Therefore, such
information is provided for reference purpose only and constitutes neither an
offer nor an acceptance. Huawei may change the information at any time
without notice.