CN114510464A - A management method and management system for a highly available database

Info

Publication number: CN114510464A
Application number: CN202210146171.XA
Authority: CN
Inventors: 杨世利; 陈雪; 宋阳; 刘娟; 洪晓霞; 熊炜; 宋鹏; 裴劼; 王仁菊; 杨颖�; 李佳; 江欣祝; 吴云松; 何健
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-02-17
Filing date: 2022-02-17
Publication date: 2022-05-17

Abstract

The invention discloses a management method and management system for a highly available database, belonging to the technical field of database management. The management method includes: creating or modifying description information; creating, modifying or deleting custom resources according to the description information; monitoring the custom resources and generating configuration resources; and managing the database cluster according to the configuration resources. The description information describes the configuration information and deployment information of the database in a unified manner, which facilitates unified management and deployment of the custom resources. Managing the database through the custom resources prevents erroneous operations caused by configuring a large number of internal resources.

Description

A management method and management system for a highly available database

Technical field

The invention relates to the technical field of database management, and in particular to a management method and management system for a highly available database.

Background

A custom resource (Custom Resource) is an extension of the Kubernetes API. Custom resources can appear in or disappear from a running cluster through dynamic registration, and cluster administrators can update them independently of the cluster itself. Once a custom resource is installed, users can create and access its objects with kubectl, just as they do for built-in resources such as pods.

Kubernetes is currently the mainstream cloud-native container platform and supports running workloads such as general stateless applications and stateful applications. Databases such as MYSQL can run on it as stateful applications. An Operator is a software extension to Kubernetes that automates repetitive tasks on Kubernetes; the Operator pattern encapsulates the task automation code that has been written. The Operator pattern makes it possible to extend the functionality of a cluster without modifying the code of Kubernetes itself, by using custom controllers to manage one or more custom resources.

As business volume grows, changes to multiple internal resources become more and more frequent. With a custom resource, the configuration properties that users need can be gathered into a single resource for unified management, which avoids database crashes caused by erroneous changes.

Summary of the invention

In view of the above technical problems in the prior art, the present invention provides a management method and management system for a highly available database. Description information uniformly describes the deployment or management of the database, and the database is managed according to that description information, which simplifies database management and prevents erroneous operations.

The invention discloses a management method for a highly available database. The management method includes: creating or modifying description information; creating, modifying or deleting custom resources according to the description information; monitoring the custom resources and generating configuration resources; and managing the database cluster according to the configuration resources.

Preferably, the description information includes any one of the following items or a combination thereof:

database configuration information, an instance creation strategy, an upgrade strategy, a master-slave strategy, a master election strategy, and a replication strategy.

Preferably, the method for database failure recovery includes:

monitoring the instances of the database through the Operator;

determining whether the master instance has failed;

if so, selecting a normal instance as the master instance according to the master election strategy;

modifying the replication strategy so that data replication points to the master instance.

Preferably, the method for selecting the master instance includes:

accessing the surviving instances and obtaining the delay state of each instance;

selecting the master instance from the surviving instances by minimizing the delay.

Preferably, the replication strategy includes the connection address, user name, password, and replication mode of the master instance;

the read-only policy of the master instance is set to off, and the read-only policy of the slave instances is set to on;

the method for reconnecting a failed instance includes:

detecting the failed instance;

determining whether the failed instance has recovered;

if so, setting the recovered instance as a slave instance and synchronizing the replication strategy to it.

Preferably, the configuration resources include any one of the following resources or a combination thereof:

configuration files, stateful applications, resource objects for storage volumes, and service discovery objects;

the method for instance creation includes:

creating a stateful application according to the creation strategy;

creating container resources according to the creation event of the stateful application;

configuring the container resources according to the configuration file to obtain a database instance.

Preferably, the method for upgrading includes:

creating an upgrade strategy, the upgrade description information of which includes: database configuration updates, container cluster deployment configuration updates, and database version updates;

generating or modifying custom resources according to the upgrade description information;

monitoring the custom resources through the Operator and updating the configuration file;

upgrading the database, container or cluster according to the configuration file.

Preferably, the method for instance uninstallation includes:

deleting the custom resource corresponding to the instance;

monitoring the deletion event of the custom resource through the Operator and deleting the corresponding configuration resources;

deleting the container group of the instance according to the deletion event of the configuration resources.

The present invention also provides a management system for implementing the above management method, comprising:

a description information management module, a custom resource management module, a monitoring module and an execution module;

the description information management module is used to create or modify description information;

the custom resource management module is used to create, modify or delete custom resources according to the description information;

the monitoring module is used to monitor the custom resources and generate configuration resources;

the execution module is used to manage the database cluster according to the configuration resources.

Preferably, the monitoring module is also used for:

monitoring the instances of the database;

Compared with the prior art, the beneficial effects of the present invention are as follows: the description information describes the configuration information and deployment information of the database in a unified manner, which facilitates unified management and deployment of the custom resources; and managing the database through the custom resources prevents erroneous operations caused by a large number of APIs.

Brief description of the drawings

Fig. 1 is a flow chart of the management method for a highly available database of the present invention;

Fig. 2 is a logical block diagram of the management system of the present invention.

Detailed description

To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative work fall within the protection scope of the present invention.

The present invention is described in further detail below with reference to the accompanying drawings:

A management method for a highly available database, as shown in Fig. 1, includes:

Step 101: Create or modify description information. The description information can be saved as a file, such as a yaml or json file. It describes the attributes of database management, for example the configuration inside the database (root password, sizes of the various buffers, the mechanism for flushing data to disk, and so on) and the configuration of the database within the cluster (number of instances, hardware resource quotas, network policies, scheduling policies, and so on).
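As an illustration of step 101, the following is a minimal sketch of description information held as a dictionary and stored as json. The patent does not define a schema, so every key name (`rootPassword`, `bufferPoolSize`, `instances`, and so on) and the validation rule are assumptions made only for this example.

```python
import json

# Hypothetical description information; every key below is an illustrative assumption.
DESCRIPTION = {
    "database": {            # configuration inside the database
        "rootPassword": "change-me",
        "bufferPoolSize": "1G",
        "syncBinlog": 1,
    },
    "cluster": {             # configuration of the database within the cluster
        "instances": 3,
        "resources": {"cpu": "2", "memory": "4Gi"},
        "networkPolicy": "internal-only",
        "scheduling": {"spreadAcrossNodes": True},
    },
}

REQUIRED_SECTIONS = ("database", "cluster")


def validate_description(desc):
    """Return a list of problems; an empty list means the description is usable."""
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in desc]
    instances = desc.get("cluster", {}).get("instances")
    if not isinstance(instances, int) or instances < 1:
        problems.append("cluster.instances must be a positive integer")
    return problems


def to_json(desc):
    """Serialize the description so it can be stored as a .json file."""
    return json.dumps(desc, indent=2, sort_keys=True)


def from_json(text):
    """Parse a stored description back into a dictionary."""
    return json.loads(text)
```

Validating the description before it is turned into a custom resource is one way to catch a malformed file early; the patent itself does not prescribe such a check.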

Step 102: Create, modify or delete a custom resource according to the description information.

Step 103: Monitor the custom resource and generate configuration resources.

The configuration resources include any one of the following resources or a combination thereof: a configuration file (Configmap), a stateful application (Statefulset), a resource object for a storage volume (PVC), and a service discovery object (Service). In a specific embodiment, an Operator is used to monitor the custom resources.
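The mapping from one custom resource to its configuration resources in step 103 can be sketched as a pure function that returns Kubernetes manifests as dictionaries. The custom resource fields (`replicas`, `image`, `storage`, `mysqlConfig`) and the mount paths are assumptions for illustration; the manifest fields themselves (`volumeClaimTemplates`, `serviceName`, `clusterIP: None`) are standard Kubernetes fields. Here the PVC is requested through the StatefulSet's `volumeClaimTemplates` rather than created as a separate object.

```python
def build_config_resources(custom_resource):
    """Derive ConfigMap, Service and StatefulSet manifests from one custom resource."""
    name = custom_resource["metadata"]["name"]
    spec = custom_resource["spec"]  # hypothetical schema, see lead-in
    labels = {"app": name}

    # Render the database settings into a my.cnf-style file held by a ConfigMap.
    cnf_lines = [f"{key}={value}" for key, value in sorted(spec["mysqlConfig"].items())]
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{name}-config"},
        "data": {"my.cnf": "[mysqld]\n" + "\n".join(cnf_lines) + "\n"},
    }

    # Headless Service that gives each StatefulSet pod a stable DNS name.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"clusterIP": "None", "selector": labels, "ports": [{"port": 3306}]},
    }

    # The StatefulSet requests one PVC per instance through volumeClaimTemplates.
    container = {
        "name": "mysql",
        "image": spec["image"],
        "volumeMounts": [
            {"name": "data", "mountPath": "/var/lib/mysql"},      # assumed data path
            {"name": "config", "mountPath": "/etc/mysql/conf.d"},  # assumed config path
        ],
    }
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": spec["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container],
                    "volumes": [{"name": "config", "configMap": {"name": f"{name}-config"}}],
                },
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": spec["storage"]}},
                },
            }],
        },
    }
    return [config_map, service, stateful_set]
```

An Operator would submit these manifests to the Kubernetes API each time it observes a change to the custom resource; that submission step is left out of the sketch.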

Step 104: Manage the containerized database according to the configuration resources. The database may be, but is not limited to, a MYSQL database. A containerized database can have multiple instances, which together form a database cluster.

The description information describes the configuration information and deployment information of the database in a unified manner, which facilitates unified management and deployment of the custom resources. Managing the database through the custom resources prevents erroneous operations caused by a large number of APIs and improves the availability of the database.

In step 101, the description information includes any one of the following items or a combination thereof:

database configuration information, an instance creation strategy, an upgrade strategy, a master-slave strategy, a master election strategy, and a replication strategy. The custom resource corresponds to the above description information.

Database management includes database failure recovery, database upgrade, instance creation, and instance uninstallation.

Embodiment 1

The method for database failure recovery includes:

Step 201: Monitor the instances of the database through the Operator. The Operator can use a scheduled task to try to connect to each database instance at regular intervals and execute an SQL statement as a liveness probe.

Step 202: Determine whether the master instance has failed. The Operator program periodically probes each MYSQL instance, that is, it connects to the MYSQL instances distributed on different nodes using the configured MYSQL connection address, user name, password, port and other information. If the connection times out, the instance has failed. After a successful connection, the Operator executes the "SELECT 1" command; if no normal result is returned, the instance has failed, otherwise it is a normal instance. Normal instances and abnormal instances are marked accordingly.

If so, go to step 203: select a normal instance as the master instance according to the master election strategy.

The method for selecting the master instance includes: Step 211: Access the surviving instances and obtain the delay state of each instance. Step 212: Select the master instance from the surviving instances by minimizing the delay.
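Steps 211 and 212 amount to taking the minimum over the delays reported by the surviving instances. A minimal sketch, assuming the delay is available as a number of seconds per instance and is `None` when it cannot be determined (the patent does not say how the delay is measured, and the tie-breaking rule by name is an assumption):

```python
def select_master(delays):
    """Pick the surviving instance with the smallest replication delay.

    delays maps instance name -> delay in seconds, or None when unknown.
    Returns None when no instance reports a usable delay.
    """
    candidates = {name: delay for name, delay in delays.items() if delay is not None}
    if not candidates:
        return None
    # Iterating in name order makes ties deterministic: min() keeps the first minimum.
    return min(sorted(candidates), key=candidates.get)
```

Choosing the instance with the least delay means the promoted master is the one that has replayed the most of the old master's changes.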

If not, go to step 206: monitor the instances continuously or periodically.

Step 205: Modify the replication strategy, the load-balancing policy and so on, so that data replication on the slave instances points to the master instance; set the read-only policy of the master instance to off and the read-only policy of the slave instances to on.

The replication strategy includes the connection address, user name, password, and replication mode of the master instance. Specifically, the MYSQL instance with serial number 0 is the master database by default and the other database instances are slave databases; the replication configuration makes data flow from the master database to the slave databases. Once replication has been started as the final step, each slave database pulls the Binlog that the master database keeps for its writes and replays it locally. This achieves data replication and thereby ensures data consistency.
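To make step 205 concrete, the sketch below builds the SQL statements an Operator might send to each instance. The patent does not list the statements, so this is an assumption: it uses the classic MySQL replication syntax (`CHANGE MASTER TO`, `START SLAVE`, `read_only`), which newer MySQL 8.0 releases deprecate in favor of the `REPLICA` / `CHANGE REPLICATION SOURCE TO` forms, and it assumes GTID-based replication through `MASTER_AUTO_POSITION=1`. The function names are hypothetical.

```python
def quote(value):
    """Quote a string literal for MySQL by escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def slave_statements(master_host, port, user, password):
    """Statements that point a slave at the master and make it read-only."""
    change_master = (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote(master_host)}, "
        f"MASTER_PORT={int(port)}, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        "MASTER_AUTO_POSITION=1"  # assumes GTID mode is enabled on all instances
    )
    return ["STOP SLAVE", change_master, "START SLAVE", "SET GLOBAL read_only = ON"]


def master_statements():
    """Statements that stop replication on the new master and make it writable."""
    return ["STOP SLAVE", "RESET SLAVE ALL", "SET GLOBAL read_only = OFF"]
```

The same `slave_statements` output could be reused when a recovered instance rejoins the cluster as a slave.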

Step 206: Reconnect the failed instance. The method for reconnecting a failed instance includes: detecting the failed instance; determining whether the failed instance has recovered; and if so, setting the recovered instance as a slave instance and synchronizing the replication strategy to it. The Operator can automatically repair specific failure scenarios, so recovery completes in a short time without the involvement of operation and maintenance personnel.

The Operator program records the metadata of the master-slave relationship as it performs these operations. It also creates two service discovery objects with load-balancing capability, which serve as the entry points for writing data and for reading data. Clients achieve read-write splitting by accessing the domain names of these two service discovery objects.

It should be noted that instance unavailability falls into three cases: the master database has failed while slave databases are alive; a slave database has failed while the master database is alive; and both the master database and the slave databases have failed. When the master database has failed and slave databases are alive, a master-slave switchover is required so that one of the slave databases becomes the master database. When a slave database has failed and the master database is alive, no switchover is needed; the load-balancing mechanism in Kubernetes automatically removes the unreachable slave database. When both the master database and the slave databases have failed, the system as a whole has failed and cannot be repaired.
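The three failure cases above, plus the healthy case, can be expressed as a small decision function. The `Action` names are hypothetical labels chosen for this sketch.

```python
from enum import Enum


class Action(Enum):
    NONE = "all instances healthy"
    FAILOVER = "promote a surviving slave to master"
    KEEP_MASTER = "no switchover; load balancing drops the failed slaves"
    UNRECOVERABLE = "master and all slaves failed"


def decide_action(master_alive, slaves_alive):
    """Map probe results to the recovery action described in the text.

    slaves_alive is a list of booleans, one per slave instance.
    """
    if master_alive:
        return Action.NONE if all(slaves_alive) else Action.KEEP_MASTER
    return Action.FAILOVER if any(slaves_alive) else Action.UNRECOVERABLE
```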

The MYSQL cluster as a whole can adopt a one-master, multiple-slave architecture in which the instances are independent of one another. One instance acts as the master instance and provides the data-writing capability, while the other instances act as slave instances and provide the data-reading capability. In this embodiment, the Operator implements the master-slave architecture of the MYSQL cluster and automated failure repair, and thereby maintains the high availability of the MYSQL cluster. For example, when an instance goes down, a master-slave switchover keeps the read and write functions of the database working.

After the databases have started, the Operator can access each MYSQL database instance in turn and configure the replication strategy. Specifically, it configures each slave database with the master database's connection address, user name, password, replication mode and other information, and data is then replicated incrementally from the master database to the slave databases, which ensures data redundancy. At the same time, the Kubernetes platform provides load-balanced entry points with read-write splitting: a service entry for writing data and a service entry for reading data, backed by the master database and the slave databases respectively. The Operator maintains this mapping.
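One common way to realize the two entry points is to give each pod a role label and let two Services select on it; the patent does not say how the Operator maintains the mapping, so the `role` label, the `-write` / `-read` name suffixes and the function names below are assumptions.

```python
def build_rw_services(cluster_name, port=3306):
    """Build one Service for writes (master) and one for reads (slaves)."""
    def service(suffix, role):
        return {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": f"{cluster_name}-{suffix}"},
            "spec": {
                # Only pods carrying the matching role label receive traffic.
                "selector": {"app": cluster_name, "role": role},
                "ports": [{"port": port}],
            },
        }
    return [service("write", "master"), service("read", "slave")]


def role_labels(pods, master):
    """Compute the role label each pod should carry after an election."""
    return {pod: {"role": "master" if pod == master else "slave"} for pod in pods}
```

After a switchover, relabeling the pods with the output of `role_labels` would be enough to redirect the write entry to the new master, since the Services themselves do not change.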

Embodiment 2

The method for database instance creation includes:

Step 301: Create a stateful application according to the creation strategy.

Step 302: Create container resources according to the creation event of the stateful application. The container resources can be created through the Kubernetes platform.

Step 303: Configure the container resources according to the configuration file to obtain a database instance.

Embodiment 3

The method for database upgrade includes:

Step 401: Create an upgrade strategy, the upgrade description information of which includes: database configuration updates, container cluster deployment configuration updates, and database version updates. Examples are upgrading MYSQL to a specified version, or scaling the database cluster out horizontally, that is, creating additional instances.

Step 402: Generate or modify a custom resource according to the upgrade description information.

Step 403: Monitor the custom resource through the Operator and update the configuration file.

Step 404: Upgrade the database, container or cluster according to the configuration file.
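When the Operator observes a modified custom resource in step 403, it needs to work out which kind of upgrade is being requested. A minimal sketch that compares the old and new specifications; the field names and their mapping to the three upgrade kinds are assumptions for illustration.

```python
# Hypothetical mapping from custom resource fields to the upgrade kinds of step 401.
UPGRADE_KINDS = {
    "mysqlConfig": "database configuration update",
    "replicas": "container cluster deployment configuration update",
    "resources": "container cluster deployment configuration update",
    "image": "database version update",
}


def plan_upgrade(old_spec, new_spec):
    """List the upgrade kinds implied by the differences between two specs."""
    kinds = []
    for key in sorted(set(old_spec) | set(new_spec)):
        if old_spec.get(key) != new_spec.get(key):
            kind = UPGRADE_KINDS.get(key, "unclassified change")
            if kind not in kinds:  # report each kind once, in first-seen order
                kinds.append(kind)
    return kinds
```

For instance, changing `image` from `mysql:5.7` to `mysql:8.0` while raising `replicas` from 3 to 5 yields a database version update followed by a deployment configuration update (horizontal scale-out).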

Embodiment 4

The method for instance uninstallation includes:

Step 501: Delete the custom resource corresponding to the instance.

Step 502: Monitor the deletion event of the custom resource through the Operator and delete the corresponding configuration resources.

Step 503: According to the deletion event of the configuration resources, delete the container resources of the instance, such as the container group (Pod).

The resources are related to one another, with the database custom resource at the top level, and the controller of the cloud-native platform automatically deletes all associated lower-level resources according to the cascade relationship. Deleting the custom resource automatically deletes built-in resources such as the Statefulset, Service, and Configmap. Because the Statefulset and its Pods are also in a cascade relationship, the Pods are automatically deleted once the Statefulset is deleted. This process is driven by the controller of the cloud-native platform. The data volumes used by the database are retained or automatically deleted, depending on how they are defined.

Embodiment 5

This embodiment provides a management system for implementing the above management method for a highly available database. As shown in Fig. 2, it includes a description information management module 1, a custom resource management module 2, a monitoring module 3, and an execution module 4;

the description information management module 1 is used to create or modify description information 11;

the custom resource management module 2 is used to create, modify or delete custom resources 12 according to the description information 11;

the monitoring module 3 is used to monitor the custom resources 12 and generate configuration resources 13 according to the custom resources;

the execution module 4 is used to manage a containerized database cluster according to the configuration resources 13, and the database cluster includes a plurality of instances 15. Specifically, the execution module 4 can manage the database cluster through the Kubernetes platform.

The monitoring module 3 is also used to execute the database failure recovery method of Embodiment 1:

monitoring the instances of the database;

The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims

1. A management method for a highly available database, wherein the management method comprises:

creating or modifying description information;

creating, modifying or deleting custom resources according to the description information;

monitoring the custom resources and generating configuration resources;

managing the database cluster according to the configuration resources.

2. The management method according to claim 1, wherein the description information comprises any one of the following information or a combination thereof:

Database configuration information, instance creation strategy, upgrade strategy, master-slave strategy, master election strategy, and replication strategy.

3. The management method according to claim 2, wherein the method for database failure recovery comprises:

monitoring the instances of the database through the Operator;

determining whether the master instance has failed;

if so, selecting a normal instance as the master instance according to the master election strategy;

modifying the replication strategy so that data replication points to the master instance.

4. The management method according to claim 3, wherein the method for selecting the master instance comprises:

accessing the surviving instances and obtaining the delay state of each instance;

selecting the master instance from the surviving instances by minimizing the delay.

5. The management method according to claim 4, wherein the replication strategy comprises the connection address, user name, password, and replication mode of the master instance;

the read-only policy of the master instance is set to off, and the read-only policy of the slave instances is set to on;

the method for reconnecting a failed instance comprises:

detecting the failed instance;

determining whether the failed instance has recovered;

if so, setting the recovered instance as a slave instance and synchronizing the replication strategy to it.

6. The management method according to claim 2, wherein the configuration resources comprise any one of the following resources or a combination thereof:

configuration files, stateful applications, resource objects for storage volumes, and service discovery objects;

the method for instance creation comprises:

creating a stateful application according to the creation strategy;

creating container resources according to the creation event of the stateful application;

configuring the container resources according to the configuration file to obtain a database instance.

7. The management method according to claim 6, wherein the method for upgrading comprises:

creating an upgrade strategy, the upgrade description information of which comprises: a database configuration update, a container cluster deployment configuration update, or a database version update;

generating or modifying custom resources according to the upgrade description information;

monitoring the custom resources through the Operator and updating the configuration file;

upgrading the database, container or cluster according to the configuration file.

8. The management method according to claim 6, wherein the method for instance uninstallation comprises:

deleting the custom resource corresponding to the instance;

monitoring the deletion event of the custom resource through the Operator and deleting the corresponding configuration resources;

deleting the container resources of the instance according to the deletion event of the configuration resources.

9. A management system for implementing the management method according to any one of claims 1-8, comprising: a description information management module, a custom resource management module, a monitoring module and an execution module;

the description information management module is used to create or modify description information;

the custom resource management module is used to create, modify or delete custom resources according to the description information;

the monitoring module is used to monitor the custom resources and generate configuration resources;

the execution module is used to manage the database cluster according to the configuration resources.

10. The management system according to claim 9, wherein the monitoring module is further used for:

monitoring the instances of the database;

determining whether the master instance has failed;