CN101079896B - A method for constructing a multi-availability mechanism coexistence architecture of a parallel storage system - Google Patents
- Publication number: CN101079896B (also listed as CN200710018108A)
- Authority: CN (China)
- Prior art keywords: data, framework, availability, state, availability mechanism
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Computer And Data Communications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a coexistence architecture for multiple availability mechanisms in a parallel storage system. The architecture comprises seven parts: a state detection and control framework, a data service framework, a metadata service framework, a data synchronization framework, a client framework, a system management framework, and high-availability mechanism modules. It supports loading and unloading modules online. Each high-availability mechanism module implements, as a plug-in, all the interface functions that the frameworks call. According to the type of high-availability mechanism used by a piece of logical data, a framework calls the interface implemented in the corresponding availability module to carry out the specific function. With this architecture, users can choose, from the high-availability mechanisms the system provides, the one best suited to the availability requirements of the logical data, its read/write characteristics, and the quality of service the user requires, so as to ensure the reliability of the logical data and the availability of the data service. This avoids the unnecessary performance loss and disk redundancy caused by using a single high-availability mechanism.
Description
Technical Field
The invention relates to the field of computer application technology. It is a method for constructing an architecture in which multiple availability mechanisms coexist in a parallel storage system, and in particular a multi-availability mechanism coexistence framework for distributed storage systems built on parallel file systems and distributed file systems.
Background
A highly available system is one in which a software or hardware failure does not cause the system to stop providing service; the system is allowed to keep running with the fault present. In parallel storage systems, existing techniques mostly achieve this through data redundancy: if some data becomes unavailable, its backup copy serves requests in its place. A highly available system usually consists of two or more nodes connected to clients through an interconnection network, and each node has its own local storage space.
Most existing highly available systems provide only a single high-availability mechanism, and all logical data relies on that one mechanism for data safety. Because different logical data has different high-availability requirements, using a single mechanism inevitably causes performance loss and wasted storage space. Some highly available systems can configure their high-availability mechanism dynamically as needed, but they cannot decide dynamically, from the requirements of the logical data, which mechanism should be used.
When making a parallel storage system highly available, it is therefore necessary to provide a multi-availability mechanism coexistence architecture that supports several high-availability mechanisms and lets different logical data use different mechanisms. The architecture proposed in this patent is intended to meet that need. With it, users can decide, according to the availability requirements of the logical data, its read/write characteristics, and the quality of service they require for it, which of the mechanisms provided by the system to use to ensure the reliability of the logical data and the availability of the data service.
Summary of the Invention
The purpose of the invention is to overcome the above deficiencies of the prior art by providing a multi-availability mechanism coexistence architecture for a parallel storage system, so that users can select the high-availability mechanism for their logical data, reducing unnecessary performance loss and disk redundancy.
The technical solution of the invention is realized as follows. The architecture consists of seven parts: a state detection and control framework, a data service framework, a metadata service framework, a data synchronization framework, a client framework, a system management framework, and high-availability mechanism modules. The state detection and control framework detects and controls the state of all entities on its node. The data service framework creates the concrete data service threads, distributes requests to them, and completes the data redundancy and service takeover functions required by a specific high-availability mechanism. The metadata service framework calls different functions to complete metadata operations depending on the high-availability mechanism of the logical data. The data synchronization framework supports the coexistence of data synchronization threads for multiple high-availability mechanisms and completes data synchronization between mutually redundant data. The client framework provides a complete set of functions for users to access the parallel storage system, supports multiple high-availability mechanism modules, and calls the high-availability mechanism function that corresponds to the mechanism type of the request. The system management framework provides an interface for system configuration, system monitoring, and system control. The high-availability mechanism modules act as plug-ins that implement the functional interfaces of the other six parts.
The workflow of the whole architecture is as follows:
a. When a user initiates a read or write access to logical data, a function in the client framework first sends a request to the metadata service framework to obtain the metadata of that logical data; the metadata indicates the type of high-availability mechanism used by that piece of logical data;
b. The client framework then calls, according to the high-availability mechanism type of the logical data, the client-framework interface function implemented in the corresponding high-availability mechanism module; this function completes the read or write by sending an access request to the data service framework on a data node;
c. According to the high-availability mechanism type carried in the access request, the data service framework distributes the data request to the corresponding data service thread, which processes the request and returns a response to the client framework, completing the response to the data request;
d. If the data service framework or a data service thread on a data node cannot be accessed, the client framework sends a state confirmation request to the state detection and control framework of that data node to confirm the current state of the data service thread. On receiving the request, the state detection and control framework calls the corresponding state query function to confirm the current state of the queried entity; at the same time it calls the relevant state query functions to communicate with the state detection and control frameworks on the nodes hosting the entity's related entities and obtain their current states. With all the states obtained, it looks up the state transition table in the configuration information of the high-availability mechanism. When the antecedent of an entry in the table matches all the current states, the state of the local entity is set to the state given in the consequent of that entry; if no entry matches, nothing is done. The resulting entity state is then returned to the client;
e. When a data node fails while the data service threads on the other data nodes related to that node's data service threads are running normally, the data service frameworks of those related data nodes record a modification log of the redundant data associated with the failed node's data;
f. After a failure, the system administrator learns through the system management framework that a data service thread has failed. After manual intervention, the administrator starts the data synchronization process through the system management framework. The data synchronization framework on the formerly failed node loads the synchronization function from the high-availability mechanism module and starts a data synchronization thread, which accesses the data synchronization frameworks of the other related nodes and synchronizes the failed node's data according to the modification log. When synchronization is complete, the state detection and control framework on that data node is notified and adjusts the state of the data service thread.
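The state transition table lookup in step d can be illustrated with a minimal Python sketch. The table layout, the entity names `local` and `peer`, and the example entries for a mirrored pair are all assumptions made for illustration; the patent does not specify a concrete table format.

```python
from typing import Dict, List, NamedTuple, Optional


class Transition(NamedTuple):
    # Antecedent: required state per entity name. Consequent: new local state.
    antecedent: Dict[str, str]
    consequent: str


def resolve_state(observed: Dict[str, str], table: List[Transition]) -> Optional[str]:
    """Return the consequent of the first entry whose antecedent matches
    all observed states, or None when no entry matches."""
    for entry in table:
        if all(observed.get(name) == state
               for name, state in entry.antecedent.items()):
            return entry.consequent
    return None


# Hypothetical table for a mirrored pair: "local" is the queried entity,
# "peer" is its related entity on another node.
MIRROR_TABLE = [
    Transition({"local": "backup", "peer": "stopped"}, "active"),
    Transition({"local": "stopped", "peer": "active"}, "synchronizing"),
]


def confirm_state(local_state: str, peer_state: str,
                  table: List[Transition] = MIRROR_TABLE) -> str:
    """Apply the table; when nothing matches, the local state is unchanged."""
    new_state = resolve_state({"local": local_state, "peer": peer_state}, table)
    return new_state if new_state is not None else local_state
```

Keeping the table in per-mechanism configuration, as the text describes, means a new mechanism can define its own takeover rules without changes to the state detection and control framework itself.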
The parallel storage system consists of multiple nodes interconnected through a network. By function, the nodes fall into four types: metadata nodes, data nodes, client nodes, and system management nodes. Metadata nodes store the descriptive information of logical data and respond to access requests for that information. The file data itself is divided into logical data blocks and stored across multiple data nodes. Client nodes are the users of logical data. The management node provides system configuration and management functions to the system administrator.
Each data node and metadata node of the parallel storage system has a state detection and control framework, which detects and controls the state of all entities on its own node. If the state of an entity on another node is needed, the framework communicates with the state detection and control framework on that node and obtains the state through it. Entities are all the service programs related to high-availability mechanisms, including the data service threads and data synchronization threads of each mechanism. The parallel storage system needs to monitor how these entities are running in real time and obtain the state of each one. The set of states differs by mechanism type, but it always includes at least two states indicating whether the entity is active or stopped.
At startup, the data service framework creates data service threads and loads the concrete data service functions. Then, according to the high-availability mechanism type of the logical data requested by a client, it distributes the access request for that logical data to the corresponding data service thread, which responds to the request and returns the data the client needs. The data service thread also completes the data redundancy and service takeover functions required by its high-availability mechanism. The data service framework supports the coexistence of multiple high-availability mechanism modules, which can be loaded and unloaded dynamically.
Each high-availability mechanism has one or more corresponding data service threads. To the state detection and control framework, each data service thread is an entity. Its possible states include active, backup, synchronizing, and stopped, each representing a different running condition. The data service thread and the data service framework provide an access interface to the entity state that may be used only by the state detection and control framework on the same node; an entity can also change its own state autonomously.
The metadata service framework supports multiple high-availability mechanisms and calls different interface functions to complete metadata operations depending on the mechanism of the logical data. The metadata operations for each mechanism are implemented in that mechanism's module, which can be loaded and unloaded dynamically.
The data synchronization framework supports the coexistence of data synchronization threads for multiple high-availability mechanisms and provides, for fault recovery under a high-availability mechanism, data synchronization between mutually redundant data. Data synchronization threads are of two types: data synchronization server threads, which supply the data to be synchronized, and data synchronization client threads, which request it. According to the high-availability mechanism type under which synchronization occurs and the type of the requesting thread, the framework distributes each data synchronization request to the corresponding data synchronization server thread.
The client framework provides a complete set of function interfaces through which users access the data in the parallel storage system. It also supports the coexistence of multiple high-availability mechanisms, provides operations to load and unload mechanism modules, and calls the function that corresponds to the mechanism type of the logical data being accessed to complete the data access. Each high-availability mechanism provides a set of client access functions that correspond to the function interfaces of the client framework. With these functions, the client completes the data redundancy, failover, and failure recovery operations specific to each mechanism.
The system management framework provides a user-friendly interface for managing the parallel storage system, covering system configuration, system monitoring, and system control. System configuration includes the configuration of the parallel storage system itself (system scale, node types, and node information) as well as the configuration of the high-availability mechanism modules. System monitoring covers the overall information of the system and the running information of each node, together with the entity information and entity states of all high-availability mechanism modules, giving the administrator a complete view of the system. System control provides real-time control of all nodes in the system and state control of the entities of all high-availability mechanisms; using these functions, the administrator can manually start and stop data services and data synchronization operations in the system.
The high-availability mechanism modules are implemented as separate plug-ins. Each plug-in implements the functional interfaces of the other six frameworks. According to the type of high-availability mechanism used by the logical data, these frameworks select and call the interface functions implemented in the corresponding availability module to complete data access, fault recovery, and system management.
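One way to picture the plug-in arrangement is an abstract interface plus a registry keyed by mechanism type. The sketch below is illustrative only: the class names, the two hook methods, and the `mirror` mechanism are assumptions, not interfaces defined in the patent, and plain dictionaries stand in for data nodes.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class HAMechanismModule(ABC):
    """Hypothetical plug-in contract: one method per framework hook."""

    mechanism_type: str  # key the frameworks use to select this module

    @abstractmethod
    def client_write(self, block_id: str, data: bytes, nodes: List[dict]) -> int:
        """Client-framework hook: write a block, return the copies written."""

    @abstractmethod
    def initial_state(self) -> str:
        """State-framework hook: state of a newly started entity."""


class MirrorModule(HAMechanismModule):
    """Example mechanism (assumed): every block is written to every node."""

    mechanism_type = "mirror"

    def client_write(self, block_id, data, nodes):
        for node in nodes:  # each dict stands in for one data node
            node[block_id] = data
        return len(nodes)

    def initial_state(self):
        return "active"


class ModuleRegistry:
    """Holds the loaded modules; supports loading and unloading at run time."""

    def __init__(self) -> None:
        self._modules: Dict[str, HAMechanismModule] = {}

    def load(self, module: HAMechanismModule) -> None:
        self._modules[module.mechanism_type] = module

    def unload(self, mechanism_type: str) -> None:
        self._modules.pop(mechanism_type, None)

    def lookup(self, mechanism_type: str) -> HAMechanismModule:
        if mechanism_type not in self._modules:
            raise LookupError(f"no module loaded for {mechanism_type!r}")
        return self._modules[mechanism_type]
```

A framework would call `registry.lookup(mechanism_type)` with the type recorded in the logical data's metadata and then invoke the hook it needs, so adding a mechanism requires no change to the framework code.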
Because it adopts the above technical solution, the invention can dynamically accommodate multiple coexisting high-availability mechanisms and, according to the high-availability requirements of the logical data, call the appropriate mechanism threads provided by the system to handle it. Using the most suitable mechanism for each type of logical data avoids the unnecessary performance loss and wasted storage space caused by a single mechanism, and quality of service is maintained while the user's data availability requirements are met.
Description of the Drawings
Fig. 1 is a structural diagram of a parallel storage system to which the multi-availability mechanism coexistence architecture is applied.
Fig. 2 is a flow chart of file access in the parallel file system when there is no fault.
Fig. 3 is a diagram of the software composition of a data node.
Fig. 4 is a diagram of the software composition of a metadata node.
Fig. 5 is a diagram of the software composition of a client node.
Fig. 6 is a flow chart of data synchronization during fault recovery.
Fig. 7 is a flow chart of client access when a data node fails.
The invention is described in further detail below with reference to the drawings.
Detailed Description
Referring to Fig. 1, a parallel storage system applying this architecture consists of multiple nodes interconnected through a network. By function, the nodes fall into four types: metadata nodes, data nodes, client nodes, and system management nodes. Metadata nodes store the descriptive information of logical data and respond to access requests for that information. Logical data is divided into data blocks and stored across multiple data nodes. Client nodes are the users of logical data. The management node provides system configuration and management functions to the system administrator.
Referring to Fig. 2, file access in the parallel storage system when there is no fault proceeds as follows. First, a function in the client framework accesses the metadata service module to obtain the metadata of the logical data. According to the high-availability mechanism type, the client framework then calls the function of the corresponding mechanism to access the data service framework on a data node. The request reaches the data service framework over the network; from the logical data metadata carried in the request, the framework determines the mechanism type of the logical data and distributes the data request to the data service thread of that mechanism. The data service thread processes the request and returns a response to the client, completing the data request.
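The three hops of the fault-free path (metadata lookup, client-side dispatch, data-node dispatch) can be traced with a small in-process Python sketch. Everything here is a stand-in chosen for illustration: the class names, the dictionary-shaped request, and the `mirror` mechanism are assumptions, and plain function calls replace network communication.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Metadata:
    mechanism: str         # HA mechanism type recorded for this logical data
    data_nodes: List[str]  # names of the data nodes holding its blocks


class MetadataService:
    """Stand-in for the metadata service framework."""

    def __init__(self) -> None:
        self._table: Dict[str, Metadata] = {}

    def register(self, name: str, meta: Metadata) -> None:
        self._table[name] = meta

    def lookup(self, name: str) -> Metadata:
        return self._table[name]


@dataclass
class DataNode:
    """Stand-in for a data node; handlers play the data service threads."""

    name: str
    handlers: Dict[str, Callable] = field(default_factory=dict)
    blocks: Dict[str, bytes] = field(default_factory=dict)

    def handle(self, request: dict) -> bytes:
        # Dispatch on the mechanism type carried in the request.
        return self.handlers[request["mechanism"]](self, request)


def mirror_read_handler(node: DataNode, request: dict) -> bytes:
    return node.blocks[request["name"]]


def mirror_client_read(name: str, meta: Metadata, nodes: Dict[str, DataNode]) -> bytes:
    # Assumed mirror policy: read from the first node listed in the metadata.
    target = nodes[meta.data_nodes[0]]
    return target.handle({"mechanism": meta.mechanism, "op": "read", "name": name})


# Client-framework dispatch table: mechanism type -> module-provided function.
CLIENT_FUNCS: Dict[str, Callable] = {"mirror": mirror_client_read}


def client_read(name: str, mds: MetadataService, nodes: Dict[str, DataNode]) -> bytes:
    meta = mds.lookup(name)              # 1. fetch metadata (mechanism type)
    func = CLIENT_FUNCS[meta.mechanism]  # 2. pick the mechanism's client function
    return func(name, meta, nodes)       # 3. data node dispatches to its handler
```

The mechanism type travels with the request, so the data node needs no metadata lookup of its own to choose a handler.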
Referring to Fig. 3, the software on a data service node comprises the state detection and control framework, the data service framework, the various high-availability mechanism modules, the management agent, and the data synchronization framework. The state detection and control framework loads the corresponding detection and control function interfaces from the mechanism modules and is responsible for detecting and setting the state of the various entities. The data service framework loads the corresponding data service functions and is responsible for data service on the node. The management agent receives requests from the management node and performs the operations they require. The data synchronization framework loads the corresponding data synchronization function interfaces and is responsible for bringing data back into balance during fault recovery.
Referring to Fig. 4, the software on a metadata node comprises the state detection and control framework, the metadata service framework, the various high-availability mechanism modules, and the management agent. The state detection and control framework obtains and sets the state of the entities related to a high-availability mechanism by calling specific functions of that mechanism's module. The metadata framework likewise calls functions in the specific mechanism module to complete the metadata operations related to that mechanism.
Referring to Fig. 5, the software on a client node comprises various lib functions, the client framework, the various high-availability mechanism modules, and the management agent. When an operation calls a lib function, the lib function goes through the client framework to call the matching function in the corresponding high-availability mechanism module to complete the operation. Every high-availability mechanism module should implement the interfaces the client framework requires.
Referring to Fig. 6, data synchronization during fault recovery proceeds as follows. The system management framework sends a request to start data synchronization to the state detection and control framework of the failed data node, and the request is forwarded to that node's data synchronization framework. According to the high-availability mechanism type, the data synchronization framework asks the state detection and control framework to set the service state of the current mechanism to blocked, and starts all the relevant data synchronization client threads. The state detection and control framework then sends requests to the other related data nodes to open their data synchronization server threads; completing such a request also blocks the data service threads of the corresponding mechanism on the node that receives it. Once a data synchronization server thread is open, it analyzes the recorded modification log, reorganizes it, generates a synchronization list, and waits for access from the data synchronization client threads. After a data synchronization client thread starts, it requests synchronization items directly from the data synchronization server threads of the related nodes until synchronization is complete. Finally, the log information is deleted, and the client and server sides of the data synchronization threads each notify their own state detection and control framework that synchronization is complete. The state detection and control framework then sets the state of the data service program back to normal and terminates the data synchronization threads.
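The server-side step of turning the modification log into a synchronization list might look like the following Python sketch. The log entry layout (sequence number, block identifier, data) and the choice to keep only the latest write per block are assumptions; the patent says only that the log is analyzed and reorganized.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed log entry layout: (sequence number, block id, data written).
LogEntry = Tuple[int, str, bytes]


def build_sync_list(log: Iterable[LogEntry]) -> List[Tuple[str, bytes]]:
    """Collapse the modification log to one item per block.

    Entries are replayed in sequence order, so a later write to the same
    block replaces the earlier one; the result is sorted by block id.
    """
    latest: Dict[str, bytes] = {}
    for _seq, block_id, data in sorted(log, key=lambda entry: entry[0]):
        latest[block_id] = data
    return sorted(latest.items())


def apply_sync_list(sync_list: List[Tuple[str, bytes]],
                    store: Dict[str, bytes]) -> int:
    """Client side: copy each synchronization item into the recovering
    node's store and return the number of blocks updated."""
    for block_id, data in sync_list:
        store[block_id] = data
    return len(sync_list)
```

Collapsing the log first means a block rewritten many times during the outage is transferred only once.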
Referring to Fig. 7, a client node contains the user process, the client framework, and the high-availability mechanism modules; a data service node contains the state detection and control module, the data service framework, the data synchronization framework, the management agent, and the high-availability mechanism modules. When a data node fails, client access proceeds as follows. The client's attempt to connect to the data service framework fails, so it queries the state detection and control module for the state of the current mechanism's thread. The state detection and control framework calls the state acquisition function in the current high-availability mechanism module to obtain the state of that entity and its related entities, and returns it to the client. The management agent reports the state change to the system management module.
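A client-side reading of this flow is sketched below in Python. The patent describes the failed connection and the state query; what the client does with the returned states is mechanism-specific, so the redirect to an `active` related entity shown here is an assumption, as are the callback signatures `read_from` and `query_states`.

```python
from typing import Callable, Dict


def read_with_state_check(
    block_id: str,
    primary: str,
    read_from: Callable[[str, str], bytes],
    query_states: Callable[[str], Dict[str, str]],
) -> bytes:
    """Try the primary node; on failure, query entity states and retry.

    read_from(node, block_id) returns the data or raises ConnectionError.
    query_states(node) returns {node name: state} for the queried entity
    and its related entities (hypothetical shape of the state reply).
    """
    try:
        return read_from(primary, block_id)
    except ConnectionError:
        states = query_states(primary)
    # Assumed failover policy: use the first related entity that is active.
    for node, state in states.items():
        if node != primary and state == "active":
            return read_from(node, block_id)
    raise ConnectionError(f"no active replica for {block_id}")
```

In the architecture described, this logic would live in the mechanism module's client functions rather than in the client framework, since each mechanism defines its own takeover behaviour.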
Building a high-availability mechanism coexistence architecture for a parallel storage system means building its seven parts: the state detection and control framework, the data service framework, the metadata service framework, the client framework, the data synchronization framework, the system management module, and the high-availability mechanism modules.
The state detection and control framework acts as the agent for state requests on the entities of a node. It detects and controls the state of all entities on its node in order to serve the entity state requests made by the failover and fault recovery parts of the high-availability mechanisms. In the implementation it is a resident program present on every data node and metadata node of the system, and it loads all high-availability mechanism modules at startup. Its work consists of receiving external requests concerning entity state, sending state requests to the monitored entities, controlling their state as needed, and returning responses. The state detection and control framework can load and unload high-availability mechanism modules dynamically.
The data service framework is also a resident program, present on every data node. At startup it loads all high-availability mechanism modules and creates at least one data service thread for each module. According to the high-availability mechanism type of the logical data requested by a client, it distributes the access request for that logical data to the corresponding data service thread, which responds to the request and returns the data the client needs; the data service thread also completes functions such as the data redundancy and service takeover required by its mechanism. The data service framework supports the coexistence of multiple high-availability mechanism modules, can dynamically load and unload the data service threads of each module, distributes client requests correctly according to their content, and handles operations common to all mechanisms, such as restarting the service.
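The thread-per-module arrangement can be sketched in Python with one worker thread and one request queue per mechanism. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, each module gets exactly one thread, and a plain callable stands in for the module's data service function.

```python
import queue
import threading
from typing import Any, Callable, Dict


class DataServiceFramework:
    """Runs one data service thread per loaded mechanism (assumed design)."""

    def __init__(self) -> None:
        self._queues: Dict[str, queue.Queue] = {}
        self._threads: Dict[str, threading.Thread] = {}

    def load_module(self, mechanism: str, handler: Callable[[Any], Any]) -> None:
        """Create the request queue and start the service thread."""
        requests: queue.Queue = queue.Queue()
        worker = threading.Thread(
            target=self._serve, args=(requests, handler), daemon=True
        )
        self._queues[mechanism] = requests
        self._threads[mechanism] = worker
        worker.start()

    def unload_module(self, mechanism: str) -> None:
        """Stop the service thread with a sentinel and wait for it to exit."""
        self._queues.pop(mechanism).put(None)
        self._threads.pop(mechanism).join()

    @staticmethod
    def _serve(requests: queue.Queue, handler: Callable[[Any], Any]) -> None:
        while True:
            item = requests.get()
            if item is None:  # sentinel: module is being unloaded
                break
            request, reply = item
            try:
                reply.put(handler(request))
            except Exception as exc:  # hand the error back to the caller
                reply.put(exc)

    def submit(self, mechanism: str, request: Any) -> Any:
        """Dispatch a request to the thread for its mechanism type."""
        reply: queue.Queue = queue.Queue(maxsize=1)
        self._queues[mechanism].put((request, reply))
        result = reply.get(timeout=5)
        if isinstance(result, Exception):
            raise result
        return result
```

Because each mechanism has its own queue and thread, unloading one module does not interrupt requests being served for the others.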
元数据框架部署在元数据节点上,它也是一个常驻程序。它在启动时载入所有的高可用机制模块并获得各个与高可用机制相关联的处理函数的指针,向外提供元数据服务。它接收其他节点对元数据的请求,并根据高可用机制的不同调用不同的函数来完成与机制相关的操作,返回应答。The metadata framework is deployed on the metadata node, which is also a resident program. It loads all high-availability mechanism modules at startup and obtains pointers to processing functions associated with the high-availability mechanism, and provides metadata services to the outside. It receives requests from other nodes for metadata, calls different functions according to different high-availability mechanisms to complete mechanism-related operations, and returns responses.
The client provides a complete set of function interfaces through which users access the parallel storage system, either as a function library or through the VFS interface; users call these interfaces to access the data stored in the system. The concrete access interface differs from one parallel storage system to another. To support the coexistence of multiple high-availability mechanisms, the client framework provides operations for loading and unloading high-availability mechanism modules, and it calls the functions that match the mechanism type of the logical data being accessed. In practice, the user specifies the required high-availability mechanism for a piece of logical data; the access function interface then loads that mechanism's module according to the configuration information, obtains pointers to the mechanism-specific functions, and calls them to carry out the operation. Each high-availability mechanism supplies its own set of client access functions, matching the interface functions defined by the client framework. With these functions the client performs the data redundancy, failover, failure recovery, and related operations specific to each mechanism.
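A minimal sketch of the client side, assuming a dictionary of callables stands in for the table of function pointers obtained when a module is loaded. The names `ClientFramework`, `metadata_lookup`, and the `"read"` key are hypothetical; the source does not name its client functions.

```python
class ClientFramework:
    """Caches metadata and calls the access functions of the matching mechanism."""

    def __init__(self, metadata_lookup):
        self._metadata_lookup = metadata_lookup  # callable: name -> metadata dict
        self._cache = {}       # logical data name -> cached metadata
        self._functions = {}   # mechanism type -> table of access functions

    def load_module(self, mechanism, function_table):
        """Register the client access functions supplied by one mechanism."""
        self._functions[mechanism] = dict(function_table)

    def unload_module(self, mechanism):
        """Drop the access functions of a mechanism, if loaded."""
        self._functions.pop(mechanism, None)

    def _metadata(self, name):
        # The metadata service is contacted only on the first access.
        if name not in self._cache:
            self._cache[name] = self._metadata_lookup(name)
        return self._cache[name]

    def read(self, name, offset, length):
        """Read through the function that belongs to the data's mechanism type."""
        meta = self._metadata(name)
        table = self._functions[meta["mechanism"]]
        return table["read"](meta, offset, length)
```

The application always calls `read`; which mechanism-specific function runs is decided by the metadata of the logical data, so adding a mechanism does not change application code.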
The data synchronization framework performs the synchronization of mutually redundant data during the failure-recovery phase of a high-availability mechanism, and it allows the data synchronization threads of multiple mechanisms to coexist. It is implemented as a resident program that runs on every data node. When the program starts, it loads all modules according to the configuration information but creates no threads; a thread is created only when a synchronization request arrives. There are two types of data synchronization thread: the server-side thread, which supplies the data to be synchronized, and the client-side thread, which requests it. Based on the high-availability mechanism type and the thread type involved, the framework dispatches the requests of a data synchronization client thread to the corresponding data synchronization server thread. A data synchronization thread is a monitored entity of the state detection and control framework and has several running states. External control of its state goes through the state detection and control framework on the node where the thread runs.
The system management module provides a user-friendly interface for managing the parallel storage system, covering system configuration, system monitoring, and system control. System configuration includes the configuration of the parallel storage system itself (system scale, node types, and node information) and the configuration of the high-availability mechanism modules: the number of supported mechanisms, the basic information of each mechanism module, and the configuration information specific to each mechanism. System monitoring covers the overall state of the system and the running information of every node, as well as the entity information and entity states of all high-availability mechanism modules, giving the system administrator a complete view of the system. System control provides real-time control of all nodes in the system and state control of the entities belonging to every high-availability mechanism. With these controls the administrator can manually start and stop the data services and data synchronization operations in the system. In the implementation, every service node, whether a metadata service node or a data service node, runs a management agent module, and the system management module communicates with these agents to obtain the information it needs from each node.
Each high-availability mechanism module is implemented as a separate plug-in, and each plug-in implements the functional interfaces of the other six parts. The other frameworks in the architecture call the interfaces implemented in the availability module that matches the high-availability mechanism type of the logical data, and they cooperate to complete data access, failure recovery, and system management.
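One way to picture the plug-in contract is an abstract interface plus a registry that supports loading and unloading at run time. The sketch below is an assumption-laden illustration: the three method names and the `ModuleRegistry` class are invented for the example, and the real interface would cover all six frameworks.

```python
from abc import ABC, abstractmethod


class AvailabilityModule(ABC):
    """Interface a mechanism plug-in would implement (hypothetical method names)."""

    mechanism = ""  # mechanism type string, set by each plug-in

    @abstractmethod
    def serve_request(self, request):
        """Called by the data service framework for one client request."""

    @abstractmethod
    def query_state(self, entity_id):
        """Called by the state detection and control framework."""

    @abstractmethod
    def synchronize(self, entity_id):
        """Called by the data synchronization framework during recovery."""


class ModuleRegistry:
    """Keeps the loaded plug-ins, keyed by mechanism type."""

    def __init__(self):
        self._modules = {}

    def load(self, module_cls):
        # Instantiation raises TypeError if an interface function is missing.
        module = module_cls()
        self._modules[module.mechanism] = module
        return module

    def unload(self, mechanism):
        """Return True if a module was loaded and has been removed."""
        return self._modules.pop(mechanism, None) is not None

    def for_data(self, metadata):
        """Select the plug-in that matches the logical data's mechanism type."""
        return self._modules[metadata["mechanism"]]
```

Rejecting incomplete plug-ins at load time mirrors the requirement that a module implement every interface function the frameworks call.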
To access a piece of logical data, the user first calls a function in the client framework that contacts the metadata service framework and obtains the metadata of that logical data, including the type of high-availability mechanism it uses. When logical data is being created, the create operation must specify the mechanism type the data will use. The metadata is then cached in the client program. On subsequent accesses, the client framework calls the function of the matching high-availability mechanism to reach the data service framework on a data node. The data service framework receives the request over the network, reads the mechanism type from the metadata attached to the request, and dispatches the request to the data service thread of that mechanism. The thread processes the request and returns a response to the client, which completes the data request.
If a failure occurs while the request is addressed to the data service framework of a data node, so that the framework or the data service thread on that node cannot be reached normally, the client function sends a state confirmation request to the state detection and control framework of that node to confirm the current state of the data service thread. On receiving the request, the state detection and control framework uses the mechanism type and identifier of the requested entity to call that entity's state query function and confirm its current state. For every entity related to it, the framework also calls the related entity's state query function, which communicates with the state detection and control frameworks on other nodes to obtain the related entity's current state. With all states collected, the framework looks up the state transition table of the mechanism: when the antecedent of an entry matches all current states, the framework sets the state of the local entity to the state named in the consequent of that entry. If no entry matches, it does nothing. The resulting entity state is then returned to the client. From the latest state of the data service thread the client decides whether the entity can be accessed. If it cannot, the client selects another node according to the high-availability mechanism in use and repeats the process until it receives a correct response or returns an error. Entity state requests are distinguished by their initiator: when state detection and control frameworks exchange states with each other, the state before the transition is returned; for all other initiators, the state after the transition is returned.
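The state transition lookup can be sketched as a table keyed by the local state together with the states of the related entities. The state names and the two table entries below are invented for illustration; the source does not list concrete states.

```python
def confirm_state(local_state, related_states, table, from_peer=False):
    """Apply the transition table once.

    Returns (state_to_report, new_local_state). Peer frameworks are told the
    state before the transition; every other requester gets the state after it.
    """
    antecedent = (local_state, tuple(related_states))
    # No matching entry means the local state stays unchanged.
    new_state = table.get(antecedent, local_state)
    reported = local_state if from_peer else new_state
    return reported, new_state


# Hypothetical table for a mechanism with one partner entity.
TRANSITIONS = {
    ("standby", ("failed",)): "serving",            # partner failed: take over
    ("serving", ("recovering",)): "serving_logged", # partner is resynchronizing
}
```

Reporting the pre-transition state to peers presumably keeps two frameworks that query each other from each reacting to a state the other has only just entered.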
When data is stored, a typical high-availability mechanism creates some redundant data so that the storage system stays available when a node fails. The operation that creates the redundancy can be implemented either in the data service thread or in the client functions. While the data service threads run normally, redundant data is written to the storage media of the data nodes as usual. If a node fails, data destined for it cannot be written to its storage medium, so the other related data nodes must record the changes to the redundant data that correspond to the failed node's data. For this purpose the data service framework maintains modification logs for logical data: each data service thread keeps a modification log file that records the changes made to the related data while another node is down. The log is the basis for data synchronization when the failed node recovers.
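A minimal sketch of the logging rule, assuming an in-memory list in place of the per-thread log file and a log entry of `(data_id, offset, length)`. The names `ModificationLog` and `write_redundant` are hypothetical.

```python
class ModificationLog:
    """Records changes that could not be written to a failed node."""

    def __init__(self):
        self._entries = []  # (failed_node, data_id, offset, length)

    def record(self, failed_node, data_id, offset, length):
        self._entries.append((failed_node, data_id, offset, length))

    def entries_for(self, node):
        """Return the logged (data_id, offset, length) changes for one node."""
        return [entry[1:] for entry in self._entries if entry[0] == node]


def write_redundant(nodes_up, target_node, log, store, data_id, offset, payload):
    """Write the redundant copy if the target is up, otherwise log the change."""
    if target_node in nodes_up:
        store.setdefault(target_node, {})[(data_id, offset)] = payload
        return "written"
    log.record(target_node, data_id, offset, len(payload))
    return "logged"
```

Only the location of each change is logged here, on the assumption that the current contents can be read from the surviving copy at synchronization time.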
When a failed node exists, the storage system automatically enters degraded operation and records the modification log automatically. The system administrator learns of the node failure through the system management module. Suppose that, after manual intervention, the node returns to normal, the data it held before the failure has not been lost, and it rejoins the system in its initial state. The administrator then uses the system control function of the system management module to send a start-synchronization request to the state detection and control framework of the failed node. The request is forwarded to the data synchronization framework on that node, which opens all relevant data synchronization client threads according to the mechanism type. The state detection and control framework then asks the other related nodes to open their data synchronization server threads. Once a server thread is open, it analyzes the recorded modification log, reorganizes it into a synchronization list, and waits for requests from the data synchronization client thread.
After a data synchronization client thread starts, it requests synchronization items directly from the server threads on the related nodes and processes them one by one. When a server thread finds that the number of items still to be synchronized in the current log has dropped below a threshold, it asks the state detection and control framework to block the data service thread. After the request is confirmed, the data service thread stops serving, and within a short time the data synchronization client can finish synchronizing all remaining data. When the synchronization list in the server thread is empty, the client and server sides of the data synchronization thread each notify their own state detection and control framework that synchronization is complete. The state detection and control framework then sets the state of the data service program back to normal and terminates the data synchronization threads. Once every high-availability mechanism on the failed node has completed these steps, the data synchronization of the failed node is complete.
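The two phases of synchronization, copying while the service still runs and then a short blocked phase to drain the remainder, can be sketched as below. The merge rule in `build_sync_list` (keep the longest length per location) and the `new_writes` parameter that simulates writes arriving during synchronization are assumptions made for the example.

```python
from collections import OrderedDict


def build_sync_list(log_entries):
    """Collapse repeated changes to the same (data_id, offset) into one item."""
    merged = OrderedDict()
    for data_id, offset, length in log_entries:
        key = (data_id, offset)
        merged[key] = max(length, merged.get(key, 0))
    return [(d, o, n) for (d, o), n in merged.items()]


def run_sync(sync_list, threshold, copy_item, new_writes=()):
    """Drain the list; block the data service once fewer than threshold items remain."""
    pending = list(sync_list)
    incoming = list(new_writes)  # changes logged while the service still runs
    blocked = False
    events = []
    while pending:
        if not blocked and len(pending) < threshold:
            blocked = True       # from here on no new changes can arrive
            events.append("service_blocked")
        copy_item(pending.pop(0))
        if not blocked and incoming:
            pending.append(incoming.pop(0))
    events.append("sync_complete")
    return events
```

Blocking only near the end keeps the service available for most of the recovery while still guaranteeing that the list eventually becomes empty.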
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200710018108A CN101079896B (en) | 2007-06-22 | 2007-06-22 | A method for constructing a multi-availability mechanism coexistence architecture of a parallel storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101079896A CN101079896A (en) | 2007-11-28 |
CN101079896B true CN101079896B (en) | 2010-05-19 |
Family
ID=38907123
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200710018108A Expired - Fee Related CN101079896B (en) | 2007-06-22 | 2007-06-22 | A method for constructing a multi-availability mechanism coexistence architecture of a parallel storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101079896B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101470735B (en) * | 2007-12-27 | 2011-05-04 | 财团法人工业技术研究院 | Virtual file management system and its system configuration establishment and file access method |
CN102291449B (en) * | 2011-08-08 | 2014-04-02 | 浪潮电子信息产业股份有限公司 | Method for testing and adjusting cluster storage system performance based on synchronous strategy |
WO2013074774A1 (en) * | 2011-11-15 | 2013-05-23 | Ab Initio Technology Llc | Data clustering based on variant token networks |
CN103235753A (en) * | 2013-04-09 | 2013-08-07 | 国家电网公司 | Method and device for monitoring information server |
CN104123300B (en) * | 2013-04-26 | 2017-10-13 | 上海云人信息科技有限公司 | Data distribution formula storage system and method |
EP3230885B1 (en) | 2014-12-08 | 2024-04-17 | Umbra Technologies Ltd. | Method for content retrieval from remote network regions |
EP3243314A4 (en) | 2015-01-06 | 2018-09-05 | Umbra Technologies Ltd. | System and method for neutral application programming interface |
US10630505B2 (en) | 2015-01-28 | 2020-04-21 | Umbra Technologies Ltd. | System and method for a global virtual network |
CN114079669B (en) | 2015-04-07 | 2025-01-07 | 安博科技有限公司 | System for providing a global virtual network (GVN) |
CN105550094B (en) * | 2015-12-10 | 2018-02-06 | 国网四川省电力公司信息通信公司 | A kind of high-availability system state automatic monitoring method |
WO2017098326A1 (en) | 2015-12-11 | 2017-06-15 | Umbra Technologies Ltd. | System and method for information slingshot over a network tapestry and granularity of a tick |
CN107710165B (en) * | 2015-12-15 | 2020-01-03 | 华为技术有限公司 | Method and device for storage node synchronization service request |
CN113810483B (en) | 2016-04-26 | 2024-12-20 | 安博科技有限公司 | Catapulted through the tapestry slingshot network |
CN106357646B (en) * | 2016-09-21 | 2019-12-31 | 苏州浪潮智能科技有限公司 | Agent control system for storage management software |
CN109120691B (en) * | 2018-08-15 | 2021-05-14 | 恒生电子股份有限公司 | Method, system, device and computer readable medium for detecting state of service system |
CN113395358B (en) * | 2021-08-16 | 2021-11-05 | 贝壳找房(北京)科技有限公司 | Network request execution method and execution system |
CN115694748B (en) * | 2022-10-24 | 2025-04-25 | 南京国电南自轨道交通工程有限公司 | A redundant framework design method based on real-time data synchronization of hierarchical systems |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1645389A (en) * | 2004-01-20 | 2005-07-27 | 国际商业机器公司 | Remote enterprise management system and method of high availability systems |
US7149918B2 (en) * | 2003-03-19 | 2006-12-12 | Lucent Technologies Inc. | Method and apparatus for high availability distributed processing across independent networked computer fault groups |
Non-Patent Citations (4)
Title |
---|
庞丽萍 等.并行文件系统集中式元数据管理高可用系统设计.计算机工程与科学26 11.2004,26(11),87-88. * |
李胜利 等.高可用并行文件系统的分布式元数据管理.应用科学学报23 3.2005,23(3),297-299. * |
Also Published As
Publication number | Publication date |
---|---|
CN101079896A (en) | 2007-11-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20100519 Termination date: 20130622 |