Disclosure of Invention
The embodiments of the present application provide a master-slave database switching method and related equipment, in which the first database and the second database are each configured with their own disk group resources. The dual-computer switching process involves the switching of the service plane floating IP and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources do not need to be taken offline on the master database and then brought online on the slave database. This avoids the problem of incompatibility between a disk and the database during dual-computer switching, and also avoids the situation in which user requests cannot be answered because a shared disk has failed.
In a first aspect, an embodiment of the present application provides a method for switching between a master database and a slave database, where the method is applied to a database system, and the database system includes a first host and a second host, where a first database runs on the first host, a second database runs on the second host, and disk group resources of the first database and disk group resources of the second database are respectively configured in the first database and the second database, and the method includes: the first host can monitor the operation state of the first database, and can release the binding between the first database and the service plane floating Internet Protocol (IP) address under the condition that the first host detects that the first database operates abnormally and determines that the first database meets the dual-computer switching condition, and if a service plane listener exists in the first database, the first host can also close the service plane listener, wherein the service plane floating IP address is used for bearing data streams of the service plane, the service plane is a network plane for receiving and responding to a service request, and the service plane listener is used for monitoring the legality of a connection request received through the service plane; the first host can switch the first database from the master to the slave, and send indication information to the second host, where the indication information is used to indicate that the second database is switched from the slave to the master, and start a service plane floating IP address on the second database.
In the present application, when the first database meets the dual-computer switching condition, the first host releases the service plane floating IP address bound to the first database and switches the first database from master use to slave use. After the first database completes the role switching operation, the first host sends indication information to the second host, where the indication information is used to indicate that the second database is to be switched from slave use to master use and that the service plane floating IP address is to be started on the second database. The master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time. This avoids the problem of incompatibility between the disks and the databases during dual-computer switching, avoids the situation in which user requests cannot be answered because a shared disk has failed, and improves data security. The dual-computer switching process involves the switching of the service plane floating IP and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources no longer need to be taken offline on the master database and brought online on the slave database, which helps shorten the duration of the dual-computer switching and therefore the duration of user service interruption. In addition, because the service plane floating IP address that carries the data streams of the service plane does not change before and after the switching, the IP address accessed by the client does not change, which reduces the impact of the dual-computer switching on the client.
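The order of operations described above can be illustrated with a minimal sketch. It is a simulation only and is not the implementation of the embodiments: the Database class, the SERVICE_IP placeholder, the message fields "promote" and "start_ip", and the function names are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

SERVICE_IP = "service-plane-floating-ip"  # placeholder label, not a real address


@dataclass
class Database:
    name: str
    role: str  # "master" or "slave"
    floating_ips: set = field(default_factory=set)
    listeners: set = field(default_factory=set)


def first_host_switch(first, send_indication):
    """Run the first-host side of the switch and return the ordered steps."""
    steps = []
    # 1. Release the binding between the first database and the floating IP.
    first.floating_ips.discard(SERVICE_IP)
    steps.append("unbind_service_ip")
    # 2. Close the service plane listener only if one exists.
    if "service" in first.listeners:
        first.listeners.remove("service")
        steps.append("close_service_listener")
    # 3. Switch the first database from master use to slave use.
    first.role = "slave"
    steps.append("demote_first")
    # 4. Send the indication information to the second host.
    send_indication({"promote": True, "start_ip": SERVICE_IP})
    steps.append("send_indication")
    return steps


def second_host_handle(second, indication):
    """Second-host side: promote first, then start the floating IP."""
    if indication.get("promote"):
        second.role = "master"
        second.floating_ips.add(indication["start_ip"])


first = Database("db1", "master", {SERVICE_IP}, {"service"})
second = Database("db2", "slave")
steps = first_host_switch(first, lambda msg: second_host_handle(second, msg))
print(steps, first.role, second.role)
```

In this sketch no disk group resource is moved between the two databases; only the floating IP binding and the two roles change, which is the point made in the paragraph above.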
In one possible implementation manner of the first aspect of the present application, the method further includes: the first host can send the redo log of the first database to the second host through the replication plane, wherein the replication plane is a network plane for implementing the replication of the redo log in the master database by the slave database, and the redo log is used for controlling the second database to execute the transaction executed by the first database according to the redo log by the second host, so that the data in the disk group resources of the second database is consistent with the data in the disk group resources of the first database.
In the present application, the second host can acquire the redo log of the first database in real time through the replication plane, so that the data in the disk group resources of the second database remains consistent in real time with the data in the disk group resources of the first database. When the dual-computer switching operation is started, the second database can therefore respond promptly to the service requests of the client, using the data in its own disk group resources, once the switching is completed. This further shortens the service interruption time of the user and reduces the impact of the dual-computer switching on the client.
In a possible implementation manner of the first aspect of the present application, the sending, by the first host, the redo log of the first database to the second host may specifically include: when the first host determines that the first database is in a normal operating state, the first host can send the redo log of the first database to the second host in real time through the replication plane; when the first host determines that the first database operates abnormally, the first host forcibly sends the redo log of the first database to the second host.
In the present application, regardless of whether the first database is in a normal operating state or operates abnormally, the first host sends the redo log of the first database to the second host, so that the data in the disk group resources of the second database can remain consistent in real time with the data in the disk group resources of the first database, which improves the security and stability of the data.
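The two sending cases can be sketched as follows. This is a simplified simulation under stated assumptions: the classes ReplicationLink and FirstHost, the buffering of entries while the database is abnormal, and the representation of a redo entry as an (SCN, (key, value)) pair are hypothetical and are not taken from the embodiments.

```python
class ReplicationLink:
    """Stands in for the replication plane; records the shipped redo entries."""

    def __init__(self):
        self.received = []

    def ship(self, entries):
        self.received.extend(entries)


class FirstHost:
    def __init__(self, link):
        self.link = link
        self.pending = []   # redo entries not yet sent to the second host
        self.healthy = True

    def commit(self, scn, change):
        self.pending.append((scn, change))
        if self.healthy:
            self.force_send()  # normal state: send in real time

    def force_send(self):
        """Send everything still pending (also used on abnormal operation)."""
        self.link.ship(self.pending)
        self.pending = []


def apply_redo(entries):
    """Second-host side: replay the redo entries in SCN order."""
    data = {}
    for _scn, (key, value) in sorted(entries, key=lambda e: e[0]):
        data[key] = value
    return data


link = ReplicationLink()
host = FirstHost(link)
host.commit(1, ("x", 1))
host.commit(2, ("y", 2))
host.healthy = False        # the first database starts operating abnormally
host.commit(3, ("x", 5))    # assumed to stay buffered until the forced send
host.force_send()
print(apply_redo(link.received))
```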
In a possible implementation manner of the first aspect of the present application, switching the first database from master use to slave use includes: after determining that the first database meets the dual-computer switching condition, the first host may determine whether the first database meets the requirements of the switchover mode, which is used for performing a planned dual-computer switching operation; when the first database meets the requirements of the switchover mode, the first host switches the first database from master use to slave use in the switchover mode, where the switchover mode is a mode in which the dual-computer switching operation is performed while the dual-computer switching function of the first database is in a normal operating state.
In the present application, when the first database meets the requirements of the switchover mode, the first host switches the first database from master use to slave use in the switchover mode, which improves the realizability of the scheme.
In a possible implementation manner of the first aspect of the present application, switching, by the first host, the first database from master use to slave use may specifically include: after determining that the first database meets the dual-computer switching condition, the first host can determine whether the first database meets the requirements of the switchover mode; when the first database does not meet the requirements of the switchover mode, the first host obtains the system change number (SCN) recorded after the second database is switched to master use and performs a flashback operation according to the SCN; the first host can then switch the first database from master use to slave use in the failover mode, which is used for performing the dual-computer switching operation when the database is inactive. The SCN is an increasing sequence of numbers that is automatically maintained by the database system and is used to distinguish the order of transactions, and the failover mode is a mode in which the dual-computer switching operation is forcibly performed while the dual-computer switching function of the first database is in a fault state.
In the method, under the condition that the first database does not conform to the switchover mode, the first database is switched to be used as a slave through the failover mode, and therefore the realizability of the scheme is guaranteed; in addition, when the first host needs to switch the first database to the slave database in the failover mode, the first host may determine that the first database fails during transaction execution, and even if the redo log of the first database is forcibly sent to the second host due to the failure of the first database, the first host may not determine whether all the redo log is sent to the second host, and by executing the flashback operation according to the SCN after the second database is switched to the master, consistency between the data of the first database and the data of the second database after the execution of the dual-computer operation may be ensured.
In a possible implementation manner of the first aspect of the present application, switching, by the first host, the first database from master use to slave use includes: after determining that the first database meets the dual-computer switching condition, the first host can determine whether the first database meets the requirements of the switchover mode; when the first database meets the requirements of the switchover mode, the first host can control the first database to be switched from master use to slave use in the switchover mode and check whether the first database is switched successfully; when the first database fails to be switched in the switchover mode, that is, when the first database has not been switched to slave use in the switchover mode, the first host can acquire the SCN recorded after the second database is switched to master use and execute a flashback operation according to the SCN, and the first host can then switch the first database from master use to slave use in the failover mode.
In the method, under the condition that the first database fails to be switched under the condition of meeting the switchover mode, the first database is switched into the slave database through the failover mode, and therefore the realizability of the scheme is guaranteed; in addition, when the first host needs to switch the first database to the slave database in the failover mode, the first host may determine that the first database fails during transaction execution, and even if the redo log of the first database is forcibly sent to the second host due to the failure of the first database, the first host may not determine whether all the redo log is sent to the second host, and by executing the flashback operation according to the SCN after the second database is switched to the master, consistency between the data of the first database and the data of the second database after the execution of the dual-computer operation may be ensured.
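The decision flow on the first host, covering both the case in which the switchover mode is not applicable and the case in which it is attempted but fails, can be sketched as follows. The function demote_first_database and its callback parameters are hypothetical names; the callbacks stand in for whatever database operations an actual deployment would use, and the SCN value 1200 is an arbitrary example.

```python
def demote_first_database(meets_switchover, try_switchover,
                          get_promoted_scn, flashback, failover):
    """Return the mode that finally switched the first database to slave use."""
    # Planned path: only attempted when the switchover requirements are met.
    if meets_switchover() and try_switchover():
        return "switchover"
    # Fallback path: flash back to the SCN recorded after the second
    # database became the master, then demote in the failover mode.
    scn = get_promoted_scn()
    flashback(scn)
    failover()
    return "failover"


def run(meets, succeeds, scn=1200):
    """Drive the flow with stub callbacks and record the calls made."""
    calls = []

    def try_switchover():
        calls.append("switchover")
        return succeeds

    mode = demote_first_database(
        meets_switchover=lambda: meets,
        try_switchover=try_switchover,
        get_promoted_scn=lambda: scn,
        flashback=lambda s: calls.append(f"flashback:{s}"),
        failover=lambda: calls.append("failover"),
    )
    return mode, calls


print(run(True, True))
print(run(True, False))
print(run(False, True))
```

The flashback is placed before the failover step because the first host cannot know whether every redo entry reached the second host; rewinding to the recorded SCN discards anything the new master never saw.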
In one possible implementation manner of the first aspect of the present application, the method further includes: after releasing the binding between the first database and the service plane floating IP address, the first host can also release the binding between the first database and the management plane floating IP address, and if a management plane listener exists in the first database, the first host can also close the management plane listener of the first database, where after the dual-computer switching operation is performed, the management plane floating IP is bound to the second database, the management plane floating IP is used to carry the data streams of the management plane, and the management plane is a network plane used for internal network monitoring.
In the application, since the second host and the first host also share the same management plane floating IP address on the management plane, the management plane floating IP address used for bearing the data stream of the management plane does not change before and after the switching is performed, and the IP address accessed by the gateway used for monitoring the internal network condition does not change before and after the dual-host switching, thereby reducing the influence of the dual-host switching on the gateway.
In a possible implementation manner of the first aspect of the present application, the determining, by the first host, that the first database meets the dual-computer switching condition includes at least one of:
the first host judges whether the disk group resources of the first database are online, performs a pull-up operation on the disk group resources of the first database when they are not online, and repeats the pull-up operation if the pull-up fails; when the number of consecutive pull-up operations performed by the first host on the disk group resources of the first database exceeds a first threshold, the first host determines that the first database meets the dual-computer switching condition; or
the first host judges whether the database resources of the first database are online, performs a pull-up operation on the database resources of the first database when they are not online, and repeats the pull-up operation if the pull-up fails; when the number of consecutive pull-up operations performed by the first host on the database resources of the first database exceeds a second threshold, the first host determines that the first database meets the dual-computer switching condition; or
the first host judges whether the service plane floating IP address of the first database is online, performs a pull-up operation on the service plane floating IP address of the first database when it is not online, and repeats the pull-up operation if the pull-up fails; when the number of consecutive pull-up operations performed by the first host on the service plane floating IP address of the first database exceeds a third threshold, the first host determines that the first database meets the dual-computer switching condition.
In the present application, when the first host detects that any one of the disk group resources, the database resources, the service plane listener, or the service plane floating IP address is not online and repeated pull-up operations fail, the dual-computer switching operation can be triggered promptly.
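The threshold check can be sketched as a small retry loop. The Resource class and the function needs_switch are hypothetical; the sketch assumes that "exceeds the threshold" means strictly more pull-up operations than the threshold value, and the threshold value 3 is an arbitrary example.

```python
class Resource:
    """Simulated resource that may come online after some pull-up attempts."""

    def __init__(self, recovers_after=None):
        self.online = False
        self.pull_ups = 0
        self.recovers_after = recovers_after  # None means it never recovers

    def is_online(self):
        return self.online

    def pull_up(self):
        self.pull_ups += 1
        if self.recovers_after is not None and self.pull_ups >= self.recovers_after:
            self.online = True


def needs_switch(is_online, pull_up, threshold):
    """Return True once consecutive failed pull-ups exceed `threshold`."""
    attempts = 0
    while not is_online():
        if attempts > threshold:
            return True   # the dual-computer switching condition is met
        pull_up()
        attempts += 1
    return False          # the resource is online; no switch is needed


dead = Resource()                    # never recovers
flaky = Resource(recovers_after=2)   # recovers on the second pull-up
print(needs_switch(dead.is_online, dead.pull_up, 3), dead.pull_ups)
print(needs_switch(flaky.is_online, flaky.pull_up, 3), flaky.pull_ups)
```

The same function can be applied with a separate threshold to each of the disk group resources, the database resources, and the service plane floating IP address, and the switching condition is met if any one of the checks returns True.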
In one possible implementation manner of the first aspect of the present application, before the first host releases the binding between the first database and the service plane floating IP address, the method further includes: the first host starts the first database to a mounted state; the first host enables the flashback mode of the first database; and the first host starts the first database to a fully open state, where the mounted state and the open state are two states of the database during startup, the mounted state is the state of the database when it is not yet fully opened, and the open state is the state of the database when it is fully opened.
In the method and the device, the flashback mode of the first database is started, so that flashback operation can be realized after the first database is switched to be used by the first host in the failover mode, and the realizability and the fluency of the dual-computer switching process are improved.
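The startup order can be sketched as a small state machine. The class DatabaseInstance and its method names are hypothetical, and the rule that the flashback mode may only be enabled while the database is mounted is an assumption made in this sketch to enforce the order given above.

```python
class DatabaseInstance:
    """Tracks the startup state: shutdown -> mounted -> open."""

    def __init__(self):
        self.state = "shutdown"
        self.flashback_enabled = False

    def start_mounted(self):
        if self.state != "shutdown":
            raise RuntimeError("can only mount from the shutdown state")
        self.state = "mounted"

    def enable_flashback(self):
        # Assumption of this sketch: flashback is enabled while mounted.
        if self.state != "mounted":
            raise RuntimeError("flashback must be enabled in the mounted state")
        self.flashback_enabled = True

    def open(self):
        if self.state != "mounted":
            raise RuntimeError("can only open from the mounted state")
        self.state = "open"


db = DatabaseInstance()
db.start_mounted()
db.enable_flashback()
db.open()
print(db.state, db.flashback_enabled)
```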
In a second aspect, an embodiment of the present application provides a method for switching between a master database and a slave database, where the method is applied to a database system, and the database system includes a first host and a second host, where a first database runs on the first host, a second database runs on the second host, and disk group resources of the first database and disk group resources of the second database are respectively configured in the first database and the second database, and the method includes: after receiving the indication information sent by the first host, the second host may switch the second database from slave use to master use and then start a service plane floating IP address on the second database, where the indication information is used to indicate that the dual-computer switching operation of the second database is to be started, the service plane floating IP address is used to carry data streams of the service plane, the service plane is a network plane that receives and responds to service requests, and before the dual-computer switching operation is performed, the service plane floating IP address is bound to the first database.
In the present application, after receiving the indication information, the second host switches the second database from slave use to master use based on the indication information and starts the service plane floating IP address on the second database. The master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time. This avoids the problem of incompatibility between the disks and the databases during dual-computer switching, avoids the situation in which user requests cannot be answered because a shared disk has failed, and improves data security. The dual-computer switching process involves the switching of the service plane floating IP and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources no longer need to be taken offline on the master database and brought online on the slave database, which helps shorten the duration of the dual-computer switching and therefore the duration of user service interruption. In addition, because the service plane floating IP address that carries the data streams of the service plane does not change before and after the switching, the IP address accessed by the client does not change, which reduces the impact of the dual-computer switching on the client.
In one possible implementation manner of the second aspect of the present application, the method further includes: the second host can obtain the redo log of the first database through the replication plane, and control the second database to execute the transaction executed by the first database according to the redo log, so that the data in the disk group resources of the second database is consistent with the data in the disk group resources of the first database, wherein the replication plane is a network plane for implementing replication of the redo log in the main database from the secondary database.
In the present application, the second host can acquire the redo log of the first database in real time through the replication plane, so that the data in the disk group resources of the second database remains consistent in real time with the data in the disk group resources of the first database. When the dual-computer switching operation is started, the second database can therefore respond promptly to the service requests of the client, using the data in its own disk group resources, once the switching is completed. This further shortens the service interruption time of the user and reduces the impact of the dual-computer switching on the client.
In a possible implementation manner of the second aspect of the present application, the switching, by the second host, the second database from slave use to master use includes: after receiving the indication information, the second host may determine whether the second database meets the requirements of the switchover mode, which is used for performing a planned dual-computer switching operation; when the second database meets the requirements of the switchover mode, the second host may switch the second database from slave use to master use in the switchover mode, where the switchover mode is a mode in which the dual-computer switching operation is performed while the dual-computer switching function of the first database is in a normal operating state.
In the present application, when the second database meets the requirements of the switchover mode, the second host switches the second database from slave use to master use in the switchover mode, which improves the realizability of the scheme.
In a possible implementation manner of the second aspect of the present application, the switching, by the second host, the second database from slave use to master use includes: after receiving the indication information, the second host may determine whether the second database meets the requirements of the switchover mode; when the second database does not meet the requirements of the switchover mode, the second host switches the second database from slave use to master use in the failover mode, which is used for performing the dual-computer switching operation when the database is inactive, where the failover mode is a mode in which the dual-computer switching operation is forcibly performed while the dual-computer switching function of the first database is in a fault state.
In the application, under the condition that the second database does not conform to the switchover mode, the second database is forcibly switched to the primary database under the failover mode, so that the second database can be ensured to be changed from the secondary database to the primary database under various conditions, and the realizability of the scheme is improved.
In a possible implementation manner of the second aspect of the present application, the switching, by the second host, the second database from slave use to master use includes: after receiving the indication information, the second host may determine whether the second database meets the requirements of the switchover mode; when the second database meets the requirements of the switchover mode, the second host may control the second database to be switched from slave use to master use in the switchover mode and check whether the second database is switched successfully; when the second database fails to be switched in the switchover mode, the second host switches the second database from slave use to master use in the failover mode.
In the present application, when the second database fails to be switched in the switchover mode, the second database is forcibly switched to master use in the failover mode, which widens the scenarios in which the scheme can be applied and improves the realizability and flexibility of the scheme.
In a possible implementation manner of the second aspect of the present application, switching the second database from slave use to master use in the failover mode includes: when the second host determines that the second database cannot be switched to the master in the switchover mode, the second host may wait for a period of time, so that the second database may continue to execute the transaction executed by the first database according to the redo log of the first database, and obtain an execution state of the transaction executed by the second database according to the redo log of the first database, until the second database executes the transaction executed by the first database, the second host stops the replication process of the second database, and forcibly switches the slave to the master in the failover mode.
In the present application, when the second host cannot switch the second database from slave use to master use in the switchover mode, the second host forcibly switches the second database to master use in the failover mode only after determining that the second database has finished executing the transactions executed by the first database. This avoids forcibly interrupting the database replication process and preserves the continuity of the process in which the second database replicates the first database, so that the data of the second database becomes consistent with the data of the first database as soon as possible and the service requests of the client can be answered as soon as possible.
In one possible implementation manner of the second aspect of the present application, switching the second database from slave use to master use in the failover mode includes: the second host may preset a waiting duration; when the second host cannot switch the second database to master use in the switchover mode, the second host may wait for a period of time so that the second database can continue to execute, according to the redo log of the first database, the transactions executed by the first database; and when the time for which the second database has been executing those transactions reaches the preset duration, the second host stops the replication process of the second database and forcibly switches the second database from slave use to master use in the failover mode.
In the application, the second host may preset a waiting time, and if all transactions are not executed in the second database within the preset time, it means that a replication process of the second database may fail, and when the waiting time reaches the preset time, the second database is forced to be switched to the primary database in the failover mode, thereby avoiding an excessively long time for service interruption.
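The two stopping conditions for the wait, namely that the second database has caught up or that the preset duration has been reached, can be combined in one loop, as in the following sketch. The function name wait_then_failover is hypothetical, and the waiting duration is counted in abstract ticks (one applied transaction per tick) rather than in wall-clock time so that the example is deterministic.

```python
def wait_then_failover(pending, apply_one, max_ticks):
    """Apply pending redo until caught up or timed out, then fail over.

    Returns the ordered actions taken by the second host.
    """
    actions = []
    ticks = 0
    # Keep replaying redo while work remains and the wait has not expired.
    while pending and ticks < max_ticks:
        apply_one(pending.pop(0))
        ticks += 1
    actions.append("caught_up" if not pending else "timeout")
    # In both cases replication is stopped before the forced switch.
    actions.append("stop_replication")
    actions.append("failover_to_master")
    return actions


applied = []
print(wait_then_failover(["t1", "t2", "t3"], applied.append, max_ticks=10))

applied_slow = []
backlog = ["t1", "t2", "t3", "t4", "t5"]
print(wait_then_failover(backlog, applied_slow.append, max_ticks=2))
print(backlog)  # transactions that were not replayed before the timeout
```

Bounding the wait trades completeness of replay for a shorter service interruption, which is the motivation stated in the paragraph above.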
In a possible implementation manner of the second aspect of the present application, after the second host switches the second database from slave use to master use, the method further includes: the second host can restart the second database and record the system change number (SCN) at the time the second database is restarted, that is, the SCN after the second database is switched to master use, where the SCN is used by the first host to control the first database to execute a flashback operation according to the SCN, and the SCN is an increasing sequence of numbers that is automatically maintained by the database management system and is used to distinguish the order of transactions.
In the application, when the first database fails, even if the redo log of the first database is forcibly sent to the second host, the first host cannot determine whether all the redo logs are sent to the second host, and the second host records the SCN after the second database is switched to be active, so that the first host can control the first database to execute the flashback operation according to the SCN, and consistency between the data of the first database and the data of the second database after the dual-computer operation is executed can be ensured.
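The effect of the flashback operation can be illustrated with a simplified model in which each database is a list of (SCN, change) pairs. The function flashback and the sample values are hypothetical; a real database performs this rewind internally, and the sketch only shows why rewinding the first database to the recorded SCN restores consistency.

```python
def flashback(log, scn):
    """Keep only the changes whose SCN is at or below `scn`."""
    return [(s, change) for s, change in log if s <= scn]


# Changes committed on the first database before it failed.
first_log = [(100, "a"), (101, "b"), (102, "c")]

# Only the entries up to SCN 101 reached the second database in this example.
second_log = [(100, "a"), (101, "b")]

# SCN recorded by the second host after the second database became the master.
promoted_scn = max(s for s, _ in second_log)

# The first host rewinds the first database to that SCN before it rejoins
# as the slave, discarding the change the new master never received.
first_log = flashback(first_log, promoted_scn)
print(promoted_scn, first_log == second_log)
```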
In one possible implementation of the second aspect of the present application, the method further comprises at least one of:
the second host judges whether the disk group resources of the second database are online, executes the pull-up operation on the disk group resources of the second database under the condition that the disk group resources of the second database are not online, and repeatedly executes the pull-up operation if the pull-up fails; or
The second host judges whether the database resource of the second database is online, executes the pull-up operation on the database resource of the second database under the condition that the database resource of the second database is not online, and repeatedly executes the pull-up operation if the pull-up fails; or
when a listener function exists in the second database, the second host judges whether the replication plane listener of the second database is online; when the replication plane listener of the second database is not online and the state of the replication process is normal, the second host performs a pull-up operation on the replication plane listener of the second database and repeats the pull-up operation if the pull-up fails; when the replication plane listener of the second database is not online and the state of the replication process is abnormal, the second host restarts the replication process and then performs the pull-up operation on the replication plane listener of the second database.
In the present application, in each monitoring period, the second host can check whether the disk group resources, the database resources, and the replication plane listener of the second database are online. If any of them is not online, the pull-up operation can be performed repeatedly, and when the repeated pull-up operations fail, an alarm can be raised to prompt operation and maintenance personnel to intervene manually. This keeps the slave database in a normal operating state, avoids the situation in which the switch cannot be completed because the slave database is operating abnormally when the dual-computer switching operation is performed, and improves the stability of the master-slave database switching process.
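One monitoring period on the second host can be sketched as follows. The function monitor_cycle, the status keys, and the max_attempts limit after which an alarm is raised are hypothetical; the text above only states that the pull-up is repeated and that an alarm can be raised when the repeated pull-up fails.

```python
def monitor_cycle(status, pull_up, restart_replication, max_attempts=3):
    """Check each slave-side resource once and return the actions taken."""
    actions = []
    for resource in ("disk_group", "database", "replication_listener"):
        if status[resource]:
            continue  # the resource is online; nothing to do
        # The listener case first repairs an abnormal replication process.
        if resource == "replication_listener" and not status["replication_process_ok"]:
            restart_replication()
            status["replication_process_ok"] = True
            actions.append("restart_replication")
        for _ in range(max_attempts):
            actions.append(f"pull_up:{resource}")
            if pull_up(resource):
                status[resource] = True
                break
        else:
            # Every attempt failed: ask operations staff to intervene.
            actions.append(f"alarm:{resource}")
    return actions


status = {"disk_group": True, "database": True,
          "replication_listener": False, "replication_process_ok": False}
print(monitor_cycle(status, pull_up=lambda r: True,
                    restart_replication=lambda: None))
```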
In a possible implementation manner of the second aspect of the present application, after the second host switches the second database from the slave use to the master use, the method further includes: after the second host starts the service plane floating IP address on the second database, the second host may also start a management plane floating IP address on the second database, and if a management plane listener exists in the second database, the second host may also start the management plane listener of the second database, where before the dual-host switching operation is performed, the management plane floating IP is bound to the first database, the management plane floating IP is used to carry a data stream of the management plane, and the management plane is a network plane used to implement internal network monitoring.
In the application, since the second host and the first host also share the same management plane floating IP address on the management plane, the management plane floating IP address used for bearing the data stream of the management plane does not change before and after the switching is performed, and the IP address accessed by the gateway used for monitoring the internal network condition does not change before and after the dual-host switching, thereby reducing the influence of the dual-host switching on the gateway.
In a third aspect, an embodiment of the present application provides a host, where the host is a first host, the first host is included in a database system, the database system further includes a second host, a first database is run on the first host, a second database is run on the second host, and disk group resources of the first database and the second database are respectively configured in the first database and the second database, and the host includes:
a releasing unit, configured to release the binding between the first database and a service plane floating IP address when the first database meets the dual-computer switching condition, where the service plane floating IP address is used to carry the data streams of a service plane, and the service plane is a network plane that receives and responds to service requests;
a switching unit, configured to switch the first database from master use to slave use; and
a sending unit, configured to send indication information to the second host, where the indication information is used to indicate that the second database is to be switched from slave use to master use and that a service plane floating IP address is to be started on the second database.
In the application, when the first database meets the dual-computer switching condition, the removing unit releases the binding between the first database and the service plane floating IP address, and the switching unit switches the first database from master use to slave use. After the first database completes the role switching operation, the sending unit sends indication information to the second host, where the indication information is used to indicate that the second database is to be switched from slave use to master use and that the service plane floating IP address is to be started on the second database. The master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time, so that mismatch between the disks and the databases during dual-computer switching is avoided, the situation in which a user request cannot be answered because a shared disk fails is also avoided, and data security is improved. The dual-computer switching process involves the switching of the service plane floating IP address and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources no longer need to be taken offline on the master database and then brought online on the slave database, which helps shorten the duration of dual-computer switching and the duration of user service interruption. In addition, since the service plane floating IP address that carries the data stream of the service plane does not change before and after the switching, the IP address accessed by the client does not change before and after the dual-computer switching, which reduces the influence of the dual-computer switching on the client.
In the third aspect of the present application, the constituent modules of the first host may further perform the steps described in the foregoing first aspect and in various possible implementations, for details, see the foregoing description of the first aspect and the various possible implementations.
In a fourth aspect, an embodiment of the present application provides a host, where the host is a second host, the second host is included in a database system, the database system further includes a first host, where a first database runs on the first host, a second database runs on the second host, and disk group resources of the first database and the second database are respectively configured in the first database and the second database, and the host includes:
the receiving unit is used for receiving indication information sent by the first host, wherein the indication information is used for indicating the starting of the dual-computer switching operation of the second database;
the switching unit is used for switching the second database from slave use to master use;
and the starting unit is used for starting a service plane floating Internet Protocol (IP) address on the second database, where the service plane floating IP address is used for bearing a data stream of the service plane, the service plane is a network plane for receiving and responding to service requests, and the service plane floating IP address is bound to the first database before the dual-computer switching operation is executed.
In this application, after the receiving unit receives the indication information, the switching unit switches the second database from slave use to master use based on the indication information, and the starting unit starts the service plane floating IP address on the second database. The master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time, so that mismatch between the disks and the databases during dual-computer switching is avoided, the situation in which a user request cannot be answered because a shared disk fails is also avoided, and data security is improved. The dual-computer switching process involves the switching of the service plane floating IP address and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources no longer need to be taken offline on the master database and then brought online on the slave database, which helps shorten the duration of dual-computer switching and the duration of user service interruption. In addition, since the service plane floating IP address that carries the data stream of the service plane does not change before and after the switching, the IP address accessed by the client does not change before and after the dual-computer switching, which reduces the influence of the dual-computer switching on the client.
In a fourth aspect of the present application, the constituent modules of the second host may further perform the steps described in the foregoing second aspect and in various possible implementations, for details, see the foregoing description of the second aspect and in various possible implementations.
In a fifth aspect, an embodiment of the present application provides a host, which is a first host and includes a memory and a processor;
the memory for storing computer program code;
the processor is configured to execute the memory stored code to cause the first host to perform the method as described in the first aspect above.
In a sixth aspect, an embodiment of the present application provides a host, which is a second host, and includes a memory and a processor;
the memory for storing computer program code;
the processor is configured to execute the code stored in the memory to cause the second host to perform the method as described in the second aspect above.
In a seventh aspect, the present application provides a computer-readable storage medium, which stores computer program code, and when the computer program code runs on a computer, the computer is caused to execute the method according to the first aspect.
In an eighth aspect, embodiments of the present application provide a computer-readable storage medium having stored therein computer program code, which, when run on a computer, causes the computer to perform the method as described in the second aspect above.
In a ninth aspect, embodiments of the present application provide a chip system, which includes a processor, and is configured to enable a network device to implement the functions recited in the foregoing aspects, for example, to transmit or process data and/or information recited in the foregoing methods. In one possible design, the system-on-chip further includes a memory for storing program instructions and data necessary for the network device. The chip system may be formed by a chip, or may include a chip and other discrete devices.
For the advantageous effects of the fifth to ninth aspects of the present application, reference may be made to the first aspect.
Detailed Description
The embodiment of the application provides a master-slave database switching method and related equipment, in which a first database and a second database are each configured with their own disk group resources. The dual-computer switching process involves the switching of the service plane floating IP address and the switching of the roles of the master database and the slave database, and the disk group resources and the database resources do not need to be taken offline on the master database and then brought online on the slave database, so that mismatch between a disk and a database during dual-computer switching is avoided, and the situation in which a user request cannot be answered because a shared disk fails is also avoided.
Embodiments of the present application are described below with reference to the accompanying drawings.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
In the embodiment of the present application, referring to fig. 1, the present application is applied to a database system 10, where the database system 10 includes a first host 100 and a second host 200, and the first host 100 and the second host 200 are communicatively connected. The first database 110 is run on the first host 100, and the second database 210 is run on the second host 200. The first database 110 and the second database 210 are each configured with respective disk group resources.
In this embodiment, the first host 100 and the second host 200 may be respectively installed with dual-computer management software, and respectively manage the first database 110 and the second database 210 through the dual-computer management software. The first database 110 accesses and manages data in the disk group resources through the operating system of the first host 100; the second database 210 accesses and manages data in the disk group resources through the operating system of the second host 200.
In the embodiment of the present application, each of the first host 100 and the second host 200 may be a computer device, a server, a personal computer, a notebook computer, or other types of communication devices.
In the embodiment of the present application, the first database 110 is the master database (also called the active database), and the second database 210 is the slave database (also called the standby database). It should be understood that the first database 110 and the second database 210 may also be given other names, as long as the names reflect the relationship between the first database 110 and the second database 210.
In this embodiment, the first database 110 and the second database 210 may be an Oracle database management system, a MySQL database management system, or other types of database management systems, and details thereof are not repeated herein.
In this embodiment, the first database 110 and the second database 210 are each configured with at least two redo log groups for storing redo logs. For example, with a first redo log group and a second redo log group, when the first redo log group is full, a log group switch occurs and redo logs continue to be written into the second redo log group; when the second redo log group is full, a log group switch occurs again and redo logs continue to be written into the first redo log group.
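The circular use of redo log groups described above can be illustrated with a minimal sketch. The class name `RedoLogGroups`, the record-count capacity, and the in-memory lists are assumptions made for illustration; the sketch also ignores any archiving that a real database management system may perform before a group is reused.

```python
class RedoLogGroups:
    """Toy model of circular redo log groups (capacity counts records, not bytes)."""

    def __init__(self, group_count=2, capacity=3):
        if group_count < 2:
            raise ValueError("at least two redo log groups are required")
        self.groups = [[] for _ in range(group_count)]
        self.capacity = capacity
        self.current = 0    # index of the group currently being written
        self.switches = 0   # number of log group switches so far

    def write(self, record):
        if len(self.groups[self.current]) >= self.capacity:
            # The current group is full: switch to the next group and reuse it.
            self.current = (self.current + 1) % len(self.groups)
            self.groups[self.current].clear()
            self.switches += 1
        self.groups[self.current].append(record)


logs = RedoLogGroups(group_count=2, capacity=3)
for i in range(7):
    logs.write(f"redo-{i}")
```

With two groups of three records each, the seventh write wraps around and lands in the first group again.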
It should be understood that the embodiment of the present application is described in detail by taking, as an example, the case in which the first database is the master database, the second database is the slave database, and the first database and the second database are homogeneous Oracle database management systems.
The embodiment of the present application provides a method for switching between master and slave databases, which is applied to the database system 10. In a case that the first database 110 satisfies a dual-computer switching condition, the first host releases the service plane floating IP address bound to the first database, switches the first database from master use to slave use, and sends indication information to the second host; after receiving the indication information, the second host switches the second database from slave use to master use and starts the service plane floating IP address on the second database, thereby implementing dual-computer switching. However, a database may be switched from master use to slave use, or from slave use to master use, in one of two modes: a planned dual-computer switching (switchover) mode, and a dual-computer switching mode used when the database has failed (failover). The switching flows executed by the first host and the second host differ between the modes.
In this embodiment of the application, when the dual-computer switching operation is triggered, some resources or components of the master database are abnormal, but the dual-computer switching function of the master database may still be in a normal operating state. The switchover mode is a mode in which the dual-computer switching operation is executed while the dual-computer switching function of the database is in a normal operating state, and the failover mode is a mode in which the dual-computer switching operation is forcibly executed while the dual-computer switching function of the database is in a fault state. When the first database is switched to slave use in the switchover mode, the second database may be switched to master use in either the switchover mode or the failover mode; when the first database is switched to slave use in the failover mode, the second database can only be switched to master use in the failover mode. The switching manner in each of these three cases is described in detail below.
First, the first host and the second host both switch between master use and slave use in the switchover mode
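The three permitted mode combinations can be enumerated with a short sketch. The names `Mode`, `choose_mode`, and `allowed_second_modes` are hypothetical and serve only to restate the rule given above.

```python
from enum import Enum


class Mode(Enum):
    SWITCHOVER = "switchover"
    FAILOVER = "failover"


def choose_mode(switch_function_ok):
    """Switchover if the dual-computer switching function works, else failover."""
    return Mode.SWITCHOVER if switch_function_ok else Mode.FAILOVER


def allowed_second_modes(first_mode):
    """Modes the second database may use, given the mode used by the first."""
    if first_mode is Mode.SWITCHOVER:
        return [Mode.SWITCHOVER, Mode.FAILOVER]
    return [Mode.FAILOVER]


# Every (first database mode, second database mode) pair the text permits.
combinations = [(f, s) for f in Mode for s in allowed_second_modes(f)]
```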
In this embodiment of the present application, referring to fig. 2 specifically, an embodiment of a master-slave database switching method provided in this embodiment of the present application may include:
201. The first host starts the first database.
In some embodiments of the present application, the first host may start the first database to the mounted state, start the replication plane listener, set the first database as the master database, and then start the first database to the fully started (open) state. The first host then starts the management plane floating IP address and the service plane floating IP address on the first database, and starts the management plane listener and the service plane listener.
In the embodiment of the present application, the mounted state and the open state are two states of the database during startup: the mounted state is the state in which the database has not yet been completely opened, and the open state is the state in which the database has been completely opened. The management operations that the host is permitted to execute on the database differ between the two states.
In this embodiment, a floating Internet Protocol (IP) address is an IP address that can drift between a first database and a second database in a dual-computer networking, and is always bound to a primary database. The management plane floating IP is used for bearing the data flow of the management plane, and the service plane floating IP address is used for bearing the data flow of the service plane.
In the embodiment of the application, in order to realize network isolation, a plurality of network planes are introduced in the communication process. The replication plane is a network plane planned for the first host and the second host and used for replicating the redo logs of the master database to the slave database, that is, a network plane used for communication between the first host and the second host; the management plane is a network plane for realizing internal network monitoring; the service plane is the network plane that receives and responds to service requests.
In the embodiment of the application, the listener is a network service in an Oracle database management system and can be used for checking the validity of a received connection request; if the connection request is valid, the connection is established, otherwise the connection is rejected. The replication plane listener is used for monitoring connection requests on the replication plane, the management plane listener is used for monitoring connection requests on the management plane, and the service plane listener is used for monitoring connection requests on the service plane. It should be appreciated that although a listener exists in an Oracle database management system, some types of database management systems have no listener and accordingly have no step of starting a listener.
202. The second host initiates a second database.
In this embodiment of the present application, the second host may start the second database to the mounted state, start the replication plane listener, set the second database as the slave database, start the copy process of the second database, enable the flashback mode of the second database, and then start the second database to the open state.
203. The first host sends the redo log of the first database to the second host.
In this embodiment of the application, after the first host starts the first database, the first database may receive service requests sent by the client, generate redo logs of the first database according to the received service requests, and write them into a redo log group; the first host may then send the redo logs in the redo log group to the second host through the IP address of the replication plane.
Because the first host and the second host communicate with each other through the replication plane, the second host can acquire the redo logs of the first database in real time.
204. The second host controls the second database to execute the transaction executed by the first database.
In this embodiment of the application, after the redo log of the first database is acquired, the second host may control the second database to execute the transaction executed by the first database according to the redo log, so that the data in the disk group resource of the second database is consistent with the data in the disk group resource of the first database.
In this embodiment of the application, the second host can obtain the redo logs of the first database in real time through the IP address of the replication plane, so that the data in the disk group resources of the second database can be kept consistent in real time with the data in the disk group resources of the first database. When the dual-computer switching operation is started, the second database can therefore respond to the service requests of the client in time according to the data in its disk group resources once it has completed the switching, which further shortens the duration of user service interruption and reduces the influence of the dual-computer switching on the client.
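Steps 203 and 204 can be pictured with a minimal in-memory sketch in which the slave side replays the redo records of the master side. The record layout `(sequence, key, value)`, the dictionaries standing in for disk group data, and all function names are assumptions for illustration and do not reflect the redo format of any real database management system.

```python
primary_data, standby_data, redo_log = {}, {}, []


def execute_on_primary(key, value):
    """Apply a change on the master side and record it in the redo log."""
    seq = len(redo_log) + 1
    primary_data[key] = value
    redo_log.append((seq, key, value))


def apply_redo(disk_data, redo_records, applied_upto=0):
    """Replay redo records whose sequence number is above applied_upto."""
    for seq, key, value in sorted(redo_records):
        if seq <= applied_upto:
            continue  # already replayed in an earlier round
        disk_data[key] = value
        applied_upto = seq
    return applied_upto


execute_on_primary("order:1", "created")
execute_on_primary("order:1", "paid")
execute_on_primary("order:2", "created")

# The slave side replays the shipped redo records (step 204).
last_applied = apply_redo(standby_data, redo_log)
```

Tracking the last applied sequence number makes the replay idempotent, so re-sending records that were already applied leaves the slave data unchanged.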
205. The first host monitors the operation of the first database.
In this embodiment of the application, the first host may periodically monitor the operating condition of the first database, so that the dual-computer switching operation can be triggered in time when the first database becomes abnormal. Referring to fig. 3, the first host monitors the operating condition of the first database in the following manner:
The first host determines whether the disk group resources of the first database are online; if they are not online, the first host performs a pull-up operation on the disk group resources of the first database and repeats the pull-up operation if it fails; when the number of consecutive pull-up operations performed on the disk group resources of the first database exceeds a first threshold, the first host determines that the first database meets the dual-computer switching condition; or
The first host determines whether the database resources of the first database are online; if they are not online, the first host performs a pull-up operation on the database resources of the first database and repeats the pull-up operation if it fails; when the number of consecutive pull-up operations performed on the database resources of the first database exceeds a second threshold, the first host determines that the first database meets the dual-computer switching condition; or
The first host determines whether the service plane listener of the first database is online; if it is not online, the first host performs a pull-up operation on the service plane listener of the first database and repeats the pull-up operation if it fails; when the number of consecutive pull-up operations performed on the service plane listener of the first database exceeds a fourth threshold, the first host determines that the first database meets the dual-computer switching condition; or
The first host determines whether the management plane listener of the first database is online; if it is not online, the first host performs a pull-up operation on the management plane listener of the first database and repeats the pull-up operation if it fails; or
The first host determines whether the replication plane listener of the first database is online; if it is not online, the first host performs a pull-up operation on the replication plane listener of the first database and repeats the pull-up operation if it fails; or
The first host determines whether the service plane floating IP address of the first database is online; if it is not online, the first host performs a pull-up operation on the service plane floating IP address of the first database and repeats the pull-up operation if it fails; when the number of consecutive pull-up operations performed on the service plane floating IP address of the first database exceeds a third threshold, the first host determines that the first database meets the dual-computer switching condition; or
The first host determines whether the management plane floating IP address of the first database is online; if it is not online, the first host performs a pull-up operation on the management plane floating IP address of the first database and repeats the pull-up operation if it fails.
In the embodiment of the present application, the database resource consists of an instance and a database structure. The instance includes background processes and memory structures, and the memory structure may be a System Global Area (SGA), a Process Global Area (PGA), a User Global Area (UGA), or another type of memory structure. The database structure consists of a physical structure and a logical structure, where the physical structure consists of operating system files such as database data files, redo log files, control files, parameter files, or archive log files, and the logical structure consists of tablespaces, data blocks, or index terminals.
In the embodiment of the application, when the first host detects that any one of the disk group resources, the database resources, the service plane listener, or the service plane floating IP address is offline and repeated pull-up operations fail to bring it online, the dual-computer switching operation can be triggered in time.
When any one of the management plane listener, the replication plane listener, or the management plane floating IP address is offline, the pull-up operation can be executed repeatedly, and when the repeated pull-up operations fail, a warning can be issued to prompt operation and maintenance personnel to intervene manually, which improves the operating stability of the master database. The dual-computer switching operation is not triggered in this case, because an abnormality in any of these resources does not directly prevent the database from responding to the service requests of the client; not triggering the switching avoids an interruption of the user service.
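The distinction between resources whose failure triggers switching and resources whose failure only raises a warning can be sketched as follows. The resource names, the return values `"ok"`, `"switch"` and `"alert"`, and the use of a retry limit for the warning-only resources are assumptions for illustration; the text fixes thresholds only for the four switching-relevant resources.

```python
# Resources whose persistent failure triggers dual-computer switching.
CRITICAL = {"disk_group", "database", "service_listener", "service_floating_ip"}


def monitor_resource(name, is_online, pull_up, threshold):
    """Check one resource; return 'ok', 'switch' or 'alert'."""
    if is_online():
        return "ok"
    attempts = 0
    while True:
        attempts += 1
        if pull_up():
            return "ok"
        if attempts > threshold:
            # Consecutive pull-up count exceeded the threshold.
            return "switch" if name in CRITICAL else "alert"


calls = {"n": 0}


def failing_pull_up():
    calls["n"] += 1
    return False  # simulated pull-up that never succeeds


critical_result = monitor_resource("disk_group", lambda: False, failing_pull_up, threshold=3)
minor_result = monitor_resource("management_listener", lambda: False, lambda: False, threshold=3)
```

In this sketch a threshold of 3 means the fourth consecutive failed pull-up is the one that exceeds it.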
It should be understood that since there is a listener in the Oracle database management system, the first host needs to determine whether the listener of the first database is online, but there is no corresponding step of determining whether the listener is online if there is no listener in some types of database management systems.
In addition, in some embodiments of the present application, the order in which the first host checks the various resources or components of the first database for abnormality is not limited. For example, the first host may first determine whether the database resources are online and then determine whether the disk group resources are online; the specific order should be set flexibly according to the actual situation.
206. The second host monitors the operating conditions of the second database.
In this embodiment of the application, the second host may periodically monitor the operating condition of the second database. Referring to fig. 4, the second host monitors the operating condition of the second database in the following manner:
The second host determines whether the disk group resources of the second database are online; if they are not online, the second host performs a pull-up operation on the disk group resources of the second database and repeats the pull-up operation if it fails; or
The second host determines whether the database resources of the second database are online; if they are not online, the second host performs a pull-up operation on the database resources of the second database and repeats the pull-up operation if it fails; or
The second host determines whether the replication plane listener of the second database is online; if the replication plane listener of the second database is offline and the state of the copy process is normal, the second host performs a pull-up operation on the replication plane listener of the second database and repeats the pull-up operation if it fails; if the replication plane listener of the second database is offline and the state of the copy process is abnormal, the second host restarts the copy process and then performs the pull-up operation on the replication plane listener of the second database.
In the embodiment of the application, in each monitoring period, the second host may check whether the disk group resources, the database resources, and the replication plane listener of the second database are online. If any of them is not online, the second host may repeatedly perform the pull-up operation, and when the repeated pull-up operations fail, a warning may be issued to prompt operation and maintenance personnel to intervene manually. This keeps the slave database in a normal operating state, avoids the situation in which the slave database cannot be switched because it is operating abnormally when the dual-computer switching operation is performed, and improves the stability of the master-slave database switching process.
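The handling of the replication plane listener on the second host depends on the state of the copy process, which can be summarized in a small sketch. The function name and the action strings are hypothetical labels, not commands of any real dual-computer management software.

```python
def standby_recovery_actions(listener_online, copy_process_normal):
    """Ordered recovery actions for the replication plane listener in one period."""
    if listener_online:
        return []
    actions = []
    if not copy_process_normal:
        # An abnormal copy process is restarted before the listener is pulled up.
        actions.append("restart_copy_process")
    actions.append("pull_up_replication_listener")
    return actions
```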
It should be understood that, in some embodiments of the present application, the execution order of step 205 and step 206 is not limited, and step 205 may be executed first, and then step 206 may be executed; step 206 may be executed first, and then step 205 may be executed; step 205 and step 206 may also be performed simultaneously.
207. The first host unbinds the first database from the service plane floating IP address.
In this embodiment, when the first host determines through step 205 that the first database satisfies the dual-computer switching condition, the first host stops using the service plane floating IP address on the first database, that is, unbinds the first database from the service plane floating IP address, and closes the service plane listener.
208. The first host unbinds the first database from the management plane floating IP address.
In this embodiment, when the first host determines through step 205 that the first database satisfies the dual-computer switching condition, the first host stops using the management plane floating IP address on the first database, that is, unbinds the first database from the management plane floating IP address, and closes the management plane listener.
It should be appreciated that the steps of shutting down the listener in steps 207 and 208 are optional steps, and that in a database management system where no listener is present, the step of shutting down the listener is not present.
It should be understood that, in some embodiments of the present application, the execution order of step 207 and step 208 is not limited, and step 207 may be executed first, and then step 208 may be executed; step 208 may be performed first, and then step 207 may be performed; step 207 and step 208 may also be performed simultaneously.
209. The first host switches the first database from master use to slave use in the planned dual-computer switching (switchover) mode.
In some embodiments of the present application, after the first host unbinds the first database from the floating IP address, the first host may determine whether the first database meets the requirement of the switchover mode and, if it does, switch the first database from master use to slave use in the switchover mode.
In this embodiment of the application, the first host and the second host may each determine whether a database meets the requirement of the switchover mode. Specifically, a host may do so by determining whether the switchover function of the database is in a normal operating state; for example, a Structured Query Language (SQL) command may be used to query whether the switchover_status (the state of the switchover mode) of the database is TO STANDBY or SESSIONS ACTIVE.
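The status check performed before switching the first database to slave use can be sketched as follows. The query text assumes the Oracle `V$DATABASE` view and its `SWITCHOVER_STATUS` column, which the description does not name explicitly, and the spelling `SESSIONS ACTIVE` follows Oracle's naming; the function name is hypothetical, and executing the query against a database is deliberately left out.

```python
# Assumed query; the description only says that an SQL command is used.
SWITCHOVER_STATUS_QUERY = "SELECT switchover_status FROM v$database"

# Status values the description names as acceptable for switching to slave use.
READY_FOR_STANDBY = {"TO STANDBY", "SESSIONS ACTIVE"}


def meets_switchover_requirement(status):
    """True if the queried status allows switching to slave use in switchover mode."""
    return status.strip().upper() in READY_FOR_STANDBY
```

If the status is anything else, the host would fall back to the failover flow described for the other cases.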
210. The first host sends indication information to the second host.
In some embodiments of the present application, after the first host switches the first database to the slave use in the switch over mode, the first host may send indication information to the second host, where the indication information is used to indicate that the second database is switched from the slave use to the master use, and start a traffic plane floating IP address on the second database.
211. The second host switches the second database from slave to master in switchover mode, that is, the mode for a planned dual-computer switching operation.
In some embodiments of the present application, after receiving the indication information of the first host, the second host may determine whether the second database meets the requirement of the switchover mode, and switch the second database from slave to master in the switchover mode when the second database meets the requirement.
212. The second host starts the service plane floating IP address on the second database.
213. The second host starts the management plane floating IP address on the second database.
In this embodiment of the present application, the second host may store the address information of the service plane floating IP address and the management plane floating IP address in advance. After the second host switches the second database to master, the second host may start the second database to the mounted state, start the replication plane listener, set the second database as the master database, and then start the second database to the open state. The second host then starts the management plane floating IP address and the service plane floating IP address on the second database and starts the management plane listener and the service plane listener, which completes the switch of the second database from slave to master.
In this embodiment of the present application, because the second host and the first host also share the same management plane floating IP address, the management plane floating IP address that carries the data stream of the management plane does not change before and after the switch. The IP address accessed by the gateway that monitors the internal network condition therefore does not change across the dual-computer switch, which reduces the influence of the dual-computer switch on the gateway.
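The order in which the second host brings the second database into service can be sketched as an ordered list of steps that stops at the first failure. This is only an illustrative sketch: the step labels and the run_step hook are hypothetical and do not correspond to commands of any particular database product.

```python
from typing import Callable, List

# Ordered steps as described in the text; the labels are hypothetical.
PROMOTION_STEPS: List[str] = [
    "start_database_mounted",
    "start_replication_plane_listener",
    "set_role_master",
    "open_database",
    "start_management_plane_floating_ip",
    "start_service_plane_floating_ip",
    "start_management_plane_listener",
    "start_service_plane_listener",
]


def promote_to_master(run_step: Callable[[str], bool]) -> List[str]:
    """Run each step in order and stop at the first step that fails.

    run_step is a caller-supplied hook (an assumption of this sketch) that
    performs one step and returns True on success.
    Returns the list of steps that completed.
    """
    completed: List[str] = []
    for step in PROMOTION_STEPS:
        if not run_step(step):
            break
        completed.append(step)
    return completed
```

Keeping the floating IP addresses and the service-facing listeners at the end of the list reflects the described order, in which the database is opened as master before it becomes reachable by clients.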
Second, the first host switches in the switchover mode and the second host switches in the failover mode
With specific reference to fig. 5, another embodiment of the master-slave database switching method provided in the embodiment of the present application may include:
501. The first host starts the first database.
502. The second host initiates a second database.
503. The first host sends the redo log of the first database to the second host.
504. The second host controls the second database to execute the transaction executed by the first database.
505. The first host monitors the operation of the first database.
506. The second host monitors the operating conditions of the second database.
507. The first host unbinds the first database from the service plane floating IP address.
508. The first host unbinds the first database from the management plane floating IP address.
509. The first host switches the first database from master to slave in switchover mode, that is, the mode for a planned dual-computer switching operation.
510. The first host sends indication information to the second host.
In the embodiment of the present application, steps 501 to 510 are similar to steps 201 to 210 in the embodiment shown in fig. 2, and are not described herein again.
511. The second host switches the second database from slave to master in failover mode, that is, the mode for a dual-computer switching operation performed when the database is inactive.
In some embodiments of the present application, after receiving the indication information of the first host, the second host may determine whether the second database meets the requirement of the switchover mode. If the second database does not meet the requirement, or meets it but fails to switch in the switchover mode, the second host cannot switch the second database from slave to master in the switchover mode and may instead switch it from slave to master in the failover mode.
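The fallback from the switchover mode to the failover mode can be sketched as below. The function and parameter names are hypothetical; the two callables stand in for whatever mechanism a host actually uses to perform each kind of role change.

```python
from typing import Callable


def switch_to_master(
    meets_switchover: bool,
    try_switchover: Callable[[], bool],
    do_failover: Callable[[], None],
) -> str:
    """Return the mode that actually completed the role change.

    meets_switchover: result of the switchover requirement check.
    try_switchover:   hypothetical hook, returns True if the planned
                      switch succeeded.
    do_failover:      hypothetical hook that forces the role change.
    """
    # The planned switch is attempted only when the requirement is met.
    if meets_switchover and try_switchover():
        return "switchover"
    # Requirement not met, or the planned switch failed: force the change.
    do_failover()
    return "failover"
```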
In one implementation, when the second host determines that the second database cannot be switched to master in the switchover mode, the second host may wait for a period of time so that the second database can continue to execute, according to the redo log of the first database, the transactions executed by the first database. The second host obtains the execution state of these transactions and, once the second database has finished executing them, stops the replication process of the second database and forcibly switches the second database from slave to master in the failover mode.
In some embodiments of the application, when the second host cannot switch the second database to master in the switchover mode, the second host forces the second database to master in the failover mode only after determining that the second database has finished executing the transactions executed by the first database. This avoids forcibly interrupting the database replication process and preserves the continuity of replication from the first database to the second database, so that the data of the two databases becomes consistent as soon as possible and client service requests can be answered as soon as possible.
In another implementation, the second host may obtain a preset duration, and when the time that the second database has spent executing the transactions executed by the first database reaches the preset duration, the second host switches the second database from slave to master in the failover mode.
In some embodiments of the present application, the second host may preset a waiting duration. If the second host cannot switch the second database to master in the switchover mode, the second host may wait so that the second database can continue to execute, according to the redo log of the first database, the transactions executed by the first database. When the time spent executing these transactions reaches the preset duration, the second host stops the replication process of the second database and forcibly switches the second database from slave to master in the failover mode.
In the embodiment of the present application, the preset time period may be 60 seconds, 45 seconds, 75 seconds, or other preset time periods, which is not limited herein.
In this embodiment of the present application, the second host may preset a waiting duration. If the second database has still not executed all transactions when the preset duration elapses, the replication process of the second database may have failed, so the second host forces the second database to master in the failover mode at that point to avoid an excessively long service interruption.
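The two waiting strategies described for step 511 (wait until replay is complete, or wait until a preset duration elapses) can be combined in one polling loop. The following is a sketch under stated assumptions: the function name, the replay_complete hook and the polling interval are all hypothetical, and the clock and sleep functions are injectable only so that the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


def wait_before_failover(
    replay_complete: Callable[[], bool],
    preset_seconds: Optional[float] = 60.0,
    poll_seconds: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until redo replay finishes or the preset duration elapses.

    Returns "replay_complete" or "timeout". In both cases the caller is
    expected to stop the replication process and force the failover.
    Passing preset_seconds=None waits for replay completion with no limit.
    """
    start = clock()
    while True:
        # First strategy: the slave has executed all shipped transactions.
        if replay_complete():
            return "replay_complete"
        # Second strategy: the preset waiting duration has been reached.
        if preset_seconds is not None and clock() - start >= preset_seconds:
            return "timeout"
        sleep(poll_seconds)
```

The default of 60 seconds matches one of the example durations given in the text; any other preset value can be passed instead.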
512. The second host starts the service plane floating IP address on the second database.
513. The second host starts the management plane floating IP address on the second database.
In the embodiment of the present application, steps 512 and 513 are similar to steps 212 and 213 in the embodiment shown in fig. 2, and are not described herein again.
Third, the first host switches in the failover mode and the second host switches in the failover mode
With specific reference to fig. 6, another embodiment of the master-slave database switching method provided in the embodiment of the present application may include:
601. The first host starts the first database to the mounted state.
602. The first host starts the flashback mode of the first database.
603. The first host starts the first database to the fully open state.
In some embodiments of the present application, after starting the first database to the mounted state, the first host may start the replication plane listener, set the first database as the master database, start the archive mode of the redo log group of the first database, start the flashback mode of the first database, and then start the first database to the open state. The first host then starts the management plane floating IP address and the service plane floating IP address on the first database and starts the management plane listener and the service plane listener.
604. The second host initiates a second database.
605. The first host sends the redo log of the first database to the second host.
606. The second host controls the second database to execute the transaction executed by the first database.
607. The first host monitors the operation of the first database.
608. The second host monitors the operating conditions of the second database.
609. The first host unbinds the first database from the service plane floating IP address.
610. The first host unbinds the first database from the management plane floating IP address.
In some embodiments of the present application, steps 604 to 610 are similar to steps 202 to 208 in the embodiment shown in fig. 2, and are not repeated here.
611. The first host sends indication information to the second host.
In some embodiments of the present application, after determining that the first database cannot complete the switch from master to slave in the switchover mode, the first host may check whether all redo logs of the first database have been sent to the second host. If not, the first host forces the remaining redo logs to be sent. After determining that all redo logs have been sent to the second host, the first host sends indication information to the second host, where the indication information instructs the second host to switch the second database from slave to master and to start the service plane floating IP address on the second database.
If the first database cannot complete the dual-computer switch in the switchover mode, the first host can determine that the first database has failed, in which case the sending of redo logs from the first host to the second host may also have failed. For this reason the first host checks, and if necessary forces, the sending of the redo logs before it sends the indication information.
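The ordering constraint in step 611 (every redo log reaches the second host before the indication information is sent) can be sketched as follows. Redo logs are modelled as integer identifiers, and force_send and send_indication are hypothetical hooks; none of these names come from the embodiment.

```python
from typing import Callable, Iterable, List, Set


def flush_then_indicate(
    all_log_ids: Iterable[int],
    sent_log_ids: Set[int],
    force_send: Callable[[int], None],
    send_indication: Callable[[], None],
) -> List[int]:
    """Force-send every redo log not yet shipped, then send the indication.

    all_log_ids:  identifiers of all redo logs on the first host.
    sent_log_ids: identifiers already received by the second host; updated
                  in place as logs are forced out.
    Returns the identifiers that had to be force-sent.
    """
    missing = [log_id for log_id in all_log_ids if log_id not in sent_log_ids]
    for log_id in missing:
        force_send(log_id)
        sent_log_ids.add(log_id)
    # The indication is sent only after all redo logs have been shipped.
    send_indication()
    return missing
```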
612. The second host switches the second database from slave to master in failover mode, that is, the mode for a dual-computer switching operation performed when the database is inactive.
In the embodiment of the present application, step 612 is similar to step 511 in the embodiment shown in fig. 5, and is not described here again.
613. The second host restarts the second database.
614. The second host records the system change number (SCN) at the time the second database is restarted.
In some embodiments of the present application, after the second host switches the second database to master, the second host may restart the second database and record the system change number (SCN) at the time of the restart, that is, the SCN after the second database has been switched to master. The SCN is a monotonically increasing number that the database management system maintains automatically and that is used to distinguish the order of transactions.
615. The second host starts the service plane floating IP address on the second database.
616. The second host starts the management plane floating IP address on the second database.
In the embodiment of the present application, steps 615 and 616 are similar to steps 212 and 213 in the embodiment shown in fig. 2, and are not described herein again.
617. The first host acquires the system change number after the second database is switched to master.
In some embodiments of the present application, when the first host determines through step 607 that the first database satisfies the dual-computer switching condition, the first host may determine whether the first database meets the requirement of the switchover mode. If the first database does not meet the requirement, or meets it but fails to switch in the switchover mode, the first database needs to be switched from master to slave in the failover mode. The first host may then obtain the SCN after the second database is switched from slave to master, that is, the SCN recorded when the second database is restarted in step 614.
618. The first host executes the flashback operation according to the system change number.
In some embodiments of the present application, because the first host has already started the flashback mode of the first database in step 602, the first host may execute a flashback operation according to the SCN once it has acquired the SCN after the second database is switched to master.
It should be understood that step 602 is an optional step. When step 602 is not executed, the first host may, after acquiring through step 617 the SCN after the second database is switched to master, restart the first database and start the archive mode of the redo log group and the flashback mode of the first database.
619. The first host switches the first database from master to slave in failover mode, that is, the mode for a dual-computer switching operation performed when the database is inactive.
In some embodiments of the present application, after performing the flashback operation according to the SCN after the second database is switched to master, the first host may switch the first database from master to slave in the failover mode. The second host then controls the first database to continue executing transactions according to the SCN and the redo log.
In the embodiment of the application, when the first database and the second database both complete the dual-computer switch in the failover mode, the second database first completes the switch from slave to master in the failover mode, and the SCN of the second database is recorded when the second database is restarted. The first host acquires this SCN and executes a flashback operation according to it. When the first database and the second database are both in a fault state, aligning them on the same SCN ensures the consistency of their data and improves the accuracy of the data in the databases.
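The role of the SCN in steps 612 to 619 can be illustrated with a toy model. This sketch assumes, purely for illustration, that a database is a list of (SCN, change) pairs and that a flashback to a given SCN discards every change with a larger SCN; the class and function names are hypothetical and the model ignores everything else a real database does.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyDatabase:
    """Toy model: a database is a list of (scn, change) pairs in SCN order."""
    role: str
    changes: List[Tuple[int, str]] = field(default_factory=list)

    def current_scn(self) -> int:
        # SCN of the most recent change, or 0 for an empty database.
        return self.changes[-1][0] if self.changes else 0

    def flashback_to(self, scn: int) -> None:
        # Assumed flashback semantics: drop every change after the target SCN.
        self.changes = [(s, c) for s, c in self.changes if s <= scn]


def failover_both(first: ToyDatabase, second: ToyDatabase) -> int:
    """Steps 612 to 619 in miniature; returns the SCN recorded by the second host."""
    second.role = "master"                # step 612: forced promotion
    recorded_scn = second.current_scn()   # step 614: SCN recorded at restart
    first.flashback_to(recorded_scn)      # steps 617 and 618
    first.role = "slave"                  # step 619
    return recorded_scn
```

In this model, any change that the first database made but never shipped to the second database is rolled back, so both databases hold the same changes once the roles have been exchanged.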
In order to better implement the above-mentioned aspects of the embodiments of the present application, the following also provides related apparatuses for implementing the above-mentioned aspects.
Specifically, fig. 7 is a schematic structural diagram of a host provided in this embodiment of the present application. The host is a first host included in a database system that further includes a second host. A first database runs on the first host, a second database runs on the second host, and the first database and the second database are each configured with their own disk group resources. The host includes a release unit 701, a switching unit 702, and a sending unit 703, wherein:
a release unit 701, configured to release, when the first database meets a dual-computer switching condition, the binding between the first database and a service plane floating IP address, where the service plane floating IP address is used to carry a data stream of a service plane, and the service plane is a network plane that receives and responds to service requests;
a switching unit 702, configured to switch the first database from master to slave;
a sending unit 703, configured to send indication information to the second host, where the indication information instructs the second host to switch the second database from slave to master and to start the service plane floating IP address on the second database.
In this embodiment, when the first database satisfies the dual-computer switching condition, the release unit 701 releases the binding between the first database and the service plane floating IP address, and the switching unit 702 switches the first database from master to slave. After the first database completes the role switch, the sending unit 703 sends indication information to the second host, where the indication information instructs the second host to switch the second database from slave to master and to start the service plane floating IP address on the second database. Because the master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time, mismatches between disks and databases during the dual-computer switch are avoided, as is the situation in which user requests cannot be answered because a shared disk has failed, and data security is improved. The dual-computer switch involves the switching of the service plane floating IP address and of the roles of the master and slave databases; disk group resources and database resources no longer need to be taken offline on the master database and then brought online on the slave database, which helps shorten both the dual-computer switch and the interruption of user services. In addition, because the service plane floating IP address that carries the data stream of the service plane does not change before and after the switch, the IP address accessed by the client does not change, which reduces the influence of the dual-computer switch on the client.
In this embodiment of the application, the sending unit 703 is further configured to send, to the second host, a redo log of the first database, where the redo log is used to instruct the second host to control the second database to execute, according to the redo log, the transaction executed by the first database, so that data in the disk group resource of the second database is consistent with data in the disk group resource of the first database.
In this embodiment of the application, the sending unit 703 is specifically configured to:
when the first host determines that the first database is in a normal operation state, send the redo log of the first database to the second host in real time through the IP address of the replication plane; and when the first host determines that the first database operates abnormally, forcibly send the redo log of the first database to the second host.
In some embodiments of the present application, the switching unit 702 is specifically configured to:
when the first database does not meet the requirement of the switchover mode (the mode for a planned dual-computer switching operation), acquire the system change number (SCN) when the second database is started, execute a flashback operation according to the SCN, and switch the first database from master to slave in the failover mode (the mode for a dual-computer switching operation performed when the database is inactive).
In some embodiments of the present application, the switching unit 702 is specifically configured to:
when the first database meets the requirement of the switchover mode, switch the first database from master to slave in the switchover mode.
In some embodiments of the present application, the switching unit 702 is specifically configured to:
when the role switching of the second database fails in the switchover mode, acquire the SCN when the second database is started, execute a flashback operation according to the SCN, and switch the first database from master to slave in the failover mode.
In this embodiment of the application, the release unit 701 is further configured to release the binding between the first database and a management plane floating IP address, where after the dual-computer switching operation is performed, the management plane floating IP address is bound to the second database, the management plane floating IP address is used to carry a data stream of a management plane, and the management plane is a network plane used to implement internal network monitoring.
In this embodiment of the application, the first host further includes a pull-up unit 704, and the release unit 701 determines that the first database satisfies the dual-computer switching condition in at least one of the following ways:
when the disk group resources of the first database are not online, the pull-up unit 704 performs a pull-up operation on the disk group resources of the first database, and when the number of consecutive pull-up operations that the pull-up unit 704 performs on the disk group resources of the first database exceeds a first threshold, the release unit 701 determines that the first database satisfies the dual-computer switching condition; or
when the database resource of the first database is not online, the pull-up unit 704 performs a pull-up operation on the database resource of the first database, and when the number of consecutive pull-up operations that the pull-up unit 704 performs on the database resource of the first database exceeds a second threshold, the release unit 701 determines that the first database satisfies the dual-computer switching condition; or
when the service plane floating IP address resource of the first database is not online, the pull-up unit 704 performs a pull-up operation on the service plane floating IP address resource of the first database, and when the number of consecutive pull-up operations that the pull-up unit 704 performs on the service plane floating IP address resource of the first database exceeds a third threshold, the release unit 701 determines that the first database satisfies the dual-computer switching condition.
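The three conditions above share one pattern: count consecutive pull-up attempts per resource and report that the switching condition is met once the count exceeds that resource's threshold. A minimal sketch of this pattern follows; the class name, resource labels and threshold values are hypothetical, since the text names the thresholds without giving numbers.

```python
from typing import Callable, Dict

# Hypothetical values standing in for the first, second and third threshold.
THRESHOLDS: Dict[str, int] = {
    "disk_group": 3,
    "database": 3,
    "service_plane_floating_ip": 3,
}


class PullUpMonitor:
    """Tracks consecutive pull-up attempts for each monitored resource."""

    def __init__(self, thresholds: Dict[str, int]) -> None:
        self.thresholds = thresholds
        self.consecutive = {name: 0 for name in thresholds}

    def check(self, resource: str, is_online: bool, pull_up: Callable[[], None]) -> bool:
        """Handle one monitoring round; return True if switching is required."""
        if is_online:
            # The resource is online again, so the streak is reset.
            self.consecutive[resource] = 0
            return False
        # The resource is offline: try to pull it up and count the attempt.
        pull_up()
        self.consecutive[resource] += 1
        return self.consecutive[resource] > self.thresholds[resource]
```

With a threshold of 2, for example, the third consecutive pull-up attempt on the same resource is the first one that reports the switching condition as met.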
In this embodiment, the host further includes:
a starting unit 705, configured to start the first database to the mounted state, start the flashback mode of the first database, and start the first database to the fully open state.
Specifically, fig. 8 is another schematic structural diagram of a host provided in this embodiment of the present application. The host is a second host included in a database system that further includes a first host. A first database runs on the first host, a second database runs on the second host, and the first database and the second database are each configured with their own disk group resources. The host includes a receiving unit 801, a switching unit 802, and a starting unit 803, wherein:
a receiving unit 801, configured to receive indication information sent by the first host, where the indication information is used to indicate to start a dual-computer switching operation of the second database;
a switching unit 802, configured to switch the second database from slave to master;
a starting unit 803, configured to start a service plane floating IP address on the second database, where the service plane floating IP address is used to carry a data stream of a service plane, the service plane is a network plane that receives and responds to service requests, and the service plane floating IP address is bound to the first database before the dual-computer switching operation is performed.
In this embodiment of the application, after the receiving unit 801 receives the indication information, the switching unit 802 switches the second database from slave to master based on the indication information, and the starting unit 803 starts the service plane floating IP address on the second database. Because the master database and the slave database are each configured with their own disk group resources, and the disk group resources and the database resources are active on the master database and the slave database at the same time, mismatches between disks and databases during the dual-computer switch are avoided, as is the situation in which user requests cannot be answered because a shared disk has failed, and data security is improved. The dual-computer switch involves the switching of the service plane floating IP address and of the roles of the master and slave databases; disk group resources and database resources no longer need to be taken offline on the master database and then brought online on the slave database, which helps shorten both the dual-computer switch and the interruption of user services. In addition, because the service plane floating IP address that carries the data stream of the service plane does not change before and after the switch, the IP address accessed by the client does not change, which reduces the influence of the dual-computer switch on the client.
In this embodiment, the host further includes an obtaining unit 804 and a control unit 805, wherein:
an obtaining unit 804, configured to obtain a redo log of the first database;
a control unit 805, configured to control the second database to execute, according to the redo log, the transactions executed by the first database, so that the data in the disk group resources of the second database is consistent with the data in the disk group resources of the first database.
In some embodiments of the present application, the switching unit 802 is specifically configured to:
when the second database does not meet the requirement of the switchover mode (the mode for a planned dual-computer switching operation), switch the second database from slave to master in the failover mode (the mode for a dual-computer switching operation performed when the database is inactive).
In some embodiments of the present application, the switching unit 802 is specifically configured to:
when the second database meets the requirement of the switchover mode, switch the second database from slave to master in the switchover mode.
In some embodiments of the present application, the switching unit 802 is specifically configured to:
and under the condition that the role switching of the second database in the switchover mode fails, switching the second database from a slave database to a master database in the failover mode.
In some embodiments of the present application, the switching unit 802 is specifically configured to:
acquire a preset duration; and when the time that the second database has spent executing the transactions executed by the first database reaches the preset duration, switch the second database from slave to master in the failover mode.
In some embodiments of the present application, the switching unit 802 is specifically configured to:
and when the second database finishes executing the transaction executed by the first database, switching the second database from slave use to master use in the failover mode.
In this embodiment, the host further includes: a recording unit 806, configured to record a system change number SCN after the second database is switched to be active, where the SCN is used by the first host to control the first database to execute a flashback operation according to the SCN.
In this embodiment of the application, the host further includes a pull-up unit 807, where the pull-up unit 807 is specifically configured to:
perform a pull-up operation on the disk group resources of the second database when the disk group resources of the second database are not online; or
perform a pull-up operation on the database resource of the second database when the database resource of the second database is not online.
The starting unit 803 is further configured to restart the replication process when the state of the replication process is abnormal, where the replication process executes, according to the redo log of the first database, the transactions executed by the first database.
In this embodiment of the application, the starting unit 803 is further configured to start a management plane floating IP address on the second database, where before the dual-computer handover operation is performed, the management plane floating IP is bound to the first database, the management plane floating IP is used to carry a data stream of a management plane, and the management plane is a network plane used to implement internal network monitoring.
It should be noted that, because the contents of information interaction, execution process, and the like between the modules/units of the apparatus are based on the same concept as the method embodiment of the present application, the technical effect brought by the contents is the same as the method embodiment of the present application, and specific contents may refer to the description in the foregoing method embodiment of the present application, and are not described herein again.
Referring to fig. 9, which is a schematic structural diagram of a first host provided in the embodiment of the present application, a first host 900 includes:
a receiver 901, a transmitter 902, a processor 903 and a memory 904 (the number of processors 903 in the first host 900 may be one or more, and one processor is taken as an example in fig. 9). In some embodiments of the present application, the receiver 901, the transmitter 902, the processor 903 and the memory 904 may be connected by a bus or in another manner, and connection by a bus is taken as an example in fig. 9.
The memory 904 may include both read-only memory and random-access memory, and provides instructions and data to the processor 903. A portion of memory 904 may also include non-volatile random access memory (NVRAM). The memory 904 stores an operating system and operating instructions, executable modules or data structures, or a subset or an expanded set thereof, wherein the operating instructions may include various operating instructions for performing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 903 controls the operation of the first host, and the processor 903 may also be referred to as a Central Processing Unit (CPU). In a specific application, the components of the first host are coupled together by a bus system, which may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus. For clarity of illustration, the various buses are all referred to as the bus system in the figures.
The method disclosed in the embodiments of the present application may be applied to the processor 903 or implemented by the processor 903. The processor 903 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 903 or by instructions in the form of software. The processor 903 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 904, and the processor 903 reads information in the memory 904 and performs the steps of the above method in combination with its hardware.
The receiver 901 may be used to receive input numeric or character information and to generate signal inputs related to the settings and function control of the first host.
The transmitter 902 may be configured to output numeric or character information, such as a redo log of the first database, to the second host via the first interface; the transmitter 902 may also be configured to send instructions to the disk group via the second interface to modify data in the disk group; the transmitter 902 may also include a display device such as a display screen.
In this embodiment, the processor 903 is configured to execute the foregoing master-slave database switching method executed by the first host.
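As an illustration only, the sequence of steps that the first host performs in the method of the first aspect can be sketched as follows. This is a minimal in-memory sketch, not the implementation of the embodiments: the class names, the `SWITCH_TO_MASTER` message, and the simplified dual-computer switching condition (the peer database is a healthy slave) are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Database:
    """Hypothetical stand-in for a database instance and its service plane state."""
    name: str
    role: str                        # "master" or "slave"
    healthy: bool = True
    floating_ip_bound: bool = False  # service plane floating IP bound to this database
    listener_running: bool = False   # service plane listener, if one exists


@dataclass
class SecondHost:
    database: Database
    received: List[str] = field(default_factory=list)

    def on_indication(self, message: str) -> None:
        # On the indication, switch slave -> master and start the floating IP.
        self.received.append(message)
        if message == "SWITCH_TO_MASTER":
            self.database.role = "master"
            self.database.floating_ip_bound = True


@dataclass
class FirstHost:
    database: Database
    peer: SecondHost
    steps: List[str] = field(default_factory=list)

    def meets_switch_condition(self) -> bool:
        # Assumption: the switching condition is reduced to "peer is a healthy slave".
        return self.peer.database.healthy and self.peer.database.role == "slave"

    def check_and_switch(self) -> bool:
        """Run one monitoring pass; return True if a switch was performed."""
        if self.database.healthy or not self.meets_switch_condition():
            return False
        # 1. Release the binding between the first database and the floating IP.
        self.database.floating_ip_bound = False
        self.steps.append("release_floating_ip")
        # 2. Close the service plane listener if one exists.
        if self.database.listener_running:
            self.database.listener_running = False
            self.steps.append("close_listener")
        # 3. Switch the first database from master to slave.
        self.database.role = "slave"
        self.steps.append("demote_to_slave")
        # 4. Send the indication information to the second host.
        self.peer.on_indication("SWITCH_TO_MASTER")
        self.steps.append("notify_peer")
        return True
```

In this sketch no disk group resource is taken offline or brought online during the switch, which reflects the point made in the disclosure that each database is configured with its own disk group resources.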
Another host provided in the embodiments of the present application is described below. Referring to fig. 10, a second host 1000 includes:
a receiver 1001, a transmitter 1002, a processor 1003 and a memory 1004 (wherein the number of processors 1003 in the second host 1000 may be one or more, and one processor is taken as an example in fig. 10). In some embodiments of the present application, the receiver 1001, the transmitter 1002, the processor 1003 and the memory 1004 may be connected by a bus or other means, wherein the connection by the bus is taken as an example in fig. 10.
The memory 1004 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1003. A portion of the memory 1004 may also include NVRAM. The memory 1004 stores an operating system and operating instructions, executable modules or data structures, or a subset or an expanded set thereof, wherein the operating instructions may include various operating instructions for performing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 1003 controls the operation of the second host, and the processor 1003 may also be referred to as a CPU. In a particular application, the various components of the second host are coupled together by a bus system that may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The method disclosed in the embodiments of the present application may be applied to the processor 1003 or implemented by the processor 1003. The processor 1003 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 1003 or by instructions in the form of software. The processor 1003 may be a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 1004, and the processor 1003 reads the information in the memory 1004 and completes the steps of the method in combination with its hardware.
The receiver 1001 may be used to receive numeric or character information, for example a redo log of the first database sent by the first host, and may also be used to generate signal inputs related to the settings and function control of the second host.
The transmitter 1002 may be configured to output numeric or character information through the first interface, for example to send information including a system change number (SCN) to the first host; the transmitter 1002 may also be configured to send instructions to the disk group via the second interface to modify data in the disk group; the transmitter 1002 may also include a display device such as a display screen.
In this embodiment, the processor 1003 is configured to execute the foregoing master-slave database switching method executed by the second host.
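As an illustration only, the exchange described above, in which the second host receives a redo log of the first database and returns information including an SCN, can be sketched as follows. The record layout, the class and method names, and the choice to report the highest applied SCN are assumptions made for this example; the embodiments do not specify them here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo log entry: a change tagged with its system change number."""
    scn: int
    key: str
    value: str


@dataclass
class StandbyHost:
    """Hypothetical second host that applies redo records and reports an SCN."""
    data: Dict[str, str] = field(default_factory=dict)
    applied_scn: int = 0

    def receive_redo(self, records: List[RedoRecord]) -> int:
        # Apply records in SCN order so the replayed state matches the source order.
        for record in sorted(records, key=lambda r: r.scn):
            if record.scn <= self.applied_scn:
                continue  # skip records that were already applied
            self.data[record.key] = record.value
            self.applied_scn = record.scn
        # Assumption: the SCN sent back to the first host is the highest applied one.
        return self.applied_scn
```

Reporting an SCN back lets the sender judge how far the receiver has caught up, which is one plausible use of the SCN information mentioned above.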
Also provided in the embodiments of the present application is a computer program product including a master-slave database switching instruction, which when executed on a computer, causes the computer to perform the steps performed by the first host in the method described in the embodiments of fig. 2 to 6.
Also provided in the embodiments of the present application is a computer program product including a master-slave database switching instruction, which when run on a computer, causes the computer to perform the steps performed by the second host in the method described in the embodiments of fig. 2 to 6.
Also provided in the embodiments of the present application is a computer-readable storage medium, which stores therein instructions for switching between master and slave databases, and when the instructions are executed on a computer, the computer is caused to perform the steps performed by the first host in the method described in the foregoing embodiments shown in fig. 2 to 6.
Also provided in the embodiments of the present application is a computer-readable storage medium, which stores therein instructions for switching between master and slave databases, and when the instructions are executed on a computer, the computer is caused to perform the steps performed by the second host in the method described in the foregoing embodiments shown in fig. 2 to 6.
Wherein any of the aforementioned processors may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits configured to control the execution of the programs of the method of the first aspect.
Embodiments of the present application also provide a chip system, which includes a processor and is configured to enable a network device to implement the functions referred to in the foregoing aspects, for example, to transmit or process the data and/or information referred to in the foregoing methods. In one possible design, the chip system further includes a memory for storing program instructions and data necessary for the network device. The chip system may consist of a chip, or may include a chip and other discrete devices.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and can certainly also be implemented by special-purpose hardware including special-purpose integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and the specific hardware structures for implementing the same function may vary, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, implementation by a software program is preferable in most cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk of a computer, and includes instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.