CN110119308B - A system for managing large-scale container applications

Info

Publication number: CN110119308B
Application number: CN201810121519.3A
Authority: CN
Inventors: 许勇
Original assignee: Beijing Zero Research Technology Co ltd
Current assignee: Beijing Zero Research Technology Co ltd
Priority date: 2018-02-07
Filing date: 2018-02-07
Publication date: 2021-06-04
Anticipated expiration: 2038-02-07
Also published as: CN110119308A

Abstract

The present invention provides a system for managing large-scale container applications. The large-scale container includes at least one container group, and at least one container group forms a node. The system includes a master control node, in which an interface module receives an operation request for a target container group sent by a requesting node; a management module monitors the operation request through the interface module and, if the large-scale container includes the target container group, sends the operation request through the interface module to the target node where the target container group is located; the target node manages the target container group to perform the corresponding operation; and the requesting node monitors the operation execution state of the target container group in the target node through the interface module. The invention manages large-scale containers in a cluster mode: the master control node distributes large-scale access pressure to container groups in different target nodes as needed, so that each container in each container group is highly available, and a corresponding number of container groups can be started according to demand.

Description

System for managing large-scale container applications

Technical Field

The invention relates to the technical field of cloud computing in computer science, in particular to a system for managing large-scale container application.

Background

At present, container technology has matured, and containerization and microservices have become hot topics in the software development and release industries. The move toward containerized application deployment has prompted the birth of excellent application container engines such as docker. Application developers no longer need to worry about the various inexplicable online errors caused by environmental factors when building a development environment, developing an application, testing it, deploying it, and operating and maintaining it online. Containerization makes development and deployment more lightweight, because it allows an application and its dependencies to be packaged into one portable container that can be published to any popular linux machine, and it can also be used for virtualization. Containerized applications use a complete sandbox mechanism and have no interfaces with one another, so the applications are isolated from each other. However, when an application faces large-scale user access, a single independent container cannot support the large-scale access pressure, and a method for managing containers is needed so that the application becomes highly available. Therefore, a new system for managing large-scale container applications is needed.

Disclosure of Invention

The invention aims to solve the technical problem that the existing containerized application management mode cannot bear large-scale access pressure, and further provides a system for managing large-scale container applications.

In order to solve the above technical problem, the present invention provides a system for managing large-scale container applications, wherein the large-scale container comprises at least one container group, and each container group comprises at least one container; at least one of said groups of containers forming a node;

the system comprises a main control node, wherein the main control node comprises an interface module and a management module;

the interface module receives an operation request aiming at a target container group sent by a request node; the management module monitors the operation request through the interface module, and if the large-scale container comprises the target container group, the management module sends the operation request to a target node where the target container group is located through the interface module; the target node manages the target container group to execute corresponding operation; and the requesting node monitors the operation execution state of the target container group in the target node through the interface module.

Preferably, in the system for managing a large-scale container application, after the management module monitors the operation request through the interface module, if the large-scale container does not include the target container group, a new container group is created as the target container group according to the information of the target container group recorded in the operation request;

the master control node also comprises a scheduling module; and the scheduling module selects an existing node as the target node according to a preset condition and schedules the new container group to the selected existing node.

Preferably, in the system for managing a large-scale container application, the master node further includes a storage module, and the interface module sends the operation request, the operation execution state of the target container group, the creation record of the new container group, the information of the selected existing node, and the scheduling record of the new container group scheduled to the selected existing node to the storage module for storage.

Preferably, in the system for managing large-scale container applications, each container is loaded with at least one application process;

each node also comprises an agent module and a node management module; the node sends an operation request to the interface module through the node management module; the node receives an operation request from the interface module through the node management module; and the agent module is in communication connection with the interface module and is used for controlling different application processes in the node according to the operation request and summarizing the state data of each application process to obtain the operation execution state and feeding the operation execution state back to the interface module.

Preferably, in the system for managing large-scale container applications, application processes loaded in containers of different container groups belonging to the same node have a coupling relationship.

Preferably, in the system for managing a large-scale container application, the agent module is further configured to distribute the operation request to different container groups and/or different containers according to a load balancing standard, based on the load amounts of the different container groups and/or different containers in the node.

Preferably, in the system for managing a large-scale container application, the master node further includes a replication controller, and the replication controller is configured to define a container group replica; the management module creates the new container group through the replication controller.

Preferably, in the system for managing a large-scale container application, the replication controller is further configured to obtain a container group quantity requirement of the operation request, and compare the container group quantity requirement with the number of container groups currently in a running state; and if the container group quantity requirement is less than the number of container groups currently in a running state, the management module controls some of the running container groups to stop through the interface module.

Preferably, the system for managing large-scale container applications further includes a storage volume;

the container group performs read-write operations on the storage volume during running: it reads the data required for executing operation requests from the storage volume, and saves the data generated while executing operation requests or running the application program into the storage volume.

Preferably, in the system for managing a large-scale container application, the storage volume is of an EmptyDir or HostPath type.

Compared with the prior art, the technical scheme provided by the invention at least has the following beneficial effects:

(1) the system for managing the large-scale container application, provided by the invention, has the advantages that the large-scale container comprises at least one container group, a plurality of container groups are distributed into one node for management, and a main control node is simultaneously configured for allocating and controlling the container groups in different nodes, so that the large-scale container is managed in a cluster mode, even if large-scale access pressure exists, the access pressure can be distributed to the container groups in different destination nodes according to the requirement through the main control node, each container in each container group has high availability, and the corresponding number of container groups can be started according to the requirement.

(2) According to the system for managing the large-scale container application, after the operation request sent by a certain request node is obtained, if it is determined after analysis that no corresponding container group exists in the currently applied node to provide service for the request node, the main control node can establish a new container group for the request node so as to ensure that normal service is provided for the request node.

(3) In the system for managing the application of the large-scale container, the application processes loaded in the corresponding container have close coupling in the container group under each node, so that the system provided by the invention can regard one group of container groups as a cluster to provide services, and one service can be regarded as an external access interface of one group of container groups providing the same service, thereby realizing the scheduling control of the large-scale container groups.

Drawings

FIG. 1 is a diagram illustrating relationships among application processes, containers, container groups, and nodes according to an embodiment of the present invention;

FIG. 2 is a functional block diagram of a system for managing large-scale container applications according to one embodiment of the present invention;

FIG. 3 is a functional block diagram of a system for managing large-scale container applications according to another embodiment of the present invention.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description of the present invention, and do not indicate or imply that the device or assembly referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Wherein the terms "first position" and "second position" are two different positions. In addition, unless expressly stated or limited otherwise, the terms "mounted," "connected," and "connected" are intended to be construed broadly, as if the terms are fixed or movable relative to each other, as if they were connected together in any other manner; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, and the two components can be communicated with each other. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.

Example 1

This embodiment provides a system for managing large-scale container applications, wherein the large-scale container comprises at least one container group, and each container group comprises at least one container; at least one container group forms a node, and the node is a working host, which may be either a physical machine or a virtual machine. FIG. 1 shows the relationship between application processes, containers, container groups, and nodes. Each container can be loaded with a plurality of application processes, a plurality of containers can form a container group, and each node comprises a plurality of container groups; that is, container groups are created, started, and destroyed on nodes. Docker may be used in a container group to package, instantiate, and run applications. As shown in FIG. 2, the system includes a main control node 200, which includes an interface module 201 and a management module 202, where the management module 202 is communicatively connected to the interface module 201. The large-scale container 100 is communicatively connected to the interface module 201, and the management module 202 manages the different nodes in the large-scale container 100 through the interface module 201.

In some embodiments of the present invention, the interface module may be implemented as an API server that provides services in a REST manner. Its APIs are interfaces for creating, deleting, updating, and querying container groups and for monitoring changes in container groups or services, for example creating a container group or creating a replication set controller. The API server therefore provides the sole operation entry for resource objects: all other components, such as the nodes and the management module, must operate on resource data through the APIs provided by the API server, and these components can complete their service functions in real time through full queries and change monitoring of the relevant resource data. In the figure, node 101, node 102, node 103, ..., and node N each have the structure shown in FIG. 1 and include a plurality of container groups. Any node can request an operation on another node, which then provides services for it; the node that sends the request serves as the requesting node, and the node that provides the service serves as the target node. In this process, the interface module 201 serves as the sole operation entry and is responsible for data transmission among the requesting node, the target node, and the management module.
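The role of the interface module as the only entry point for resource objects, combined with change monitoring, can be sketched as a small in-memory registry. This is an illustration only, not the patented implementation: the class name `InterfaceModule`, its methods (`put`, `get`, `list_kind`, `delete`, `watch`), and the event names `ADDED`, `MODIFIED`, and `DELETED` are hypothetical choices made for this sketch, and a real API server would expose these operations over REST rather than as direct method calls.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical watcher signature: (event_type, kind, name, obj)
Watcher = Callable[[str, str, str, Any], None]


class InterfaceModule:
    """Illustrative stand-in for the interface module (API server).

    Every component reads and writes resource objects only through this
    class and can register a watcher to be told about changes.
    """

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], Any] = {}
        self._watchers: Dict[str, List[Watcher]] = defaultdict(list)

    def watch(self, kind: str, callback: Watcher) -> None:
        """Register a callback for all changes to resources of one kind."""
        self._watchers[kind].append(callback)

    def put(self, kind: str, name: str, obj: Any) -> None:
        """Create or update a resource object and notify watchers."""
        event = "MODIFIED" if (kind, name) in self._objects else "ADDED"
        self._objects[(kind, name)] = obj
        self._notify(event, kind, name, obj)

    def get(self, kind: str, name: str) -> Optional[Any]:
        """Query a single resource object, or None if it does not exist."""
        return self._objects.get((kind, name))

    def list_kind(self, kind: str) -> Dict[str, Any]:
        """Full query: return every object of the given kind by name."""
        return {n: o for (k, n), o in self._objects.items() if k == kind}

    def delete(self, kind: str, name: str) -> None:
        """Delete a resource object and notify watchers if it existed."""
        obj = self._objects.pop((kind, name), None)
        if obj is not None:
            self._notify("DELETED", kind, name, obj)

    def _notify(self, event: str, kind: str, name: str, obj: Any) -> None:
        for callback in list(self._watchers[kind]):
            callback(event, kind, name, obj)
```

In this sketch a management module or node would call `watch("container_group", ...)` once and then react to events, which corresponds to the "full query and change monitoring" pattern described above.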

The interface module 201 receives an operation request for a target container group sent by a requesting node; the management module 202 monitors the operation request through the interface module 201, and if the large-scale container includes the target container group, sends the operation request to a target node where the target container group is located through the interface module 201; the target node manages the target container group to execute corresponding operation; the requesting node listens to the operation execution status of the target container group in the target node through the interface module 201.

In the above scheme, the operation request may include information related to the target container group, such as a capacity requirement for the target container and a container quantity requirement. In containerized application management, each container group may be assigned an individual IP address, and the requesting node may also send the IP address of the target container group to the interface module 201. The operation request may be a request to create a service, adjust a current service, cancel a service, or the like. After the management module 202 monitors the operation request through the interface module 201, if the large-scale container does not include the target container group, the management module 202 creates a new container group as the target container group according to the information of the target container group recorded in the operation request. Correspondingly, as shown in FIG. 3, the master node 200 further includes a scheduling module 203; the scheduling module 203 selects an existing node as the target node according to a preset condition and schedules the new container group to the selected existing node. The preset condition may follow a load balancing principle: for example, the load conditions of all current nodes may be obtained to determine which node carries a small load (or serves a small number of nodes), and the new container group is then added to that node.
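The routing decision described above (forward to the node that already hosts the target container group, otherwise create the group and schedule it onto a lightly loaded node) can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Node`, `find_node`, `select_least_loaded`, and `route_request` are hypothetical, and load is approximated by the number of container groups on a node, which is only one possible reading of the "preset condition".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical model of a node and the container groups it hosts."""
    name: str
    container_groups: List[str] = field(default_factory=list)

    @property
    def load(self) -> int:
        # Assumption: load is approximated by the container group count.
        return len(self.container_groups)


def find_node(nodes: List[Node], group: str) -> Optional[Node]:
    """Return the node that hosts the given container group, if any."""
    for node in nodes:
        if group in node.container_groups:
            return node
    return None


def select_least_loaded(nodes: List[Node]) -> Node:
    """Scheduling sketch: pick the existing node with the smallest load."""
    if not nodes:
        raise ValueError("no existing node to schedule onto")
    return min(nodes, key=lambda n: n.load)


def route_request(nodes: List[Node], target_group: str) -> Node:
    """Return the target node for a request aimed at target_group.

    If no node hosts the group, a new group is recorded on the least
    loaded node, mirroring the create-then-schedule step in the text.
    """
    node = find_node(nodes, target_group)
    if node is None:
        node = select_least_loaded(nodes)
        node.container_groups.append(target_group)
    return node
```

For example, with `node-101` hosting two container groups and `node-102` hosting one, a request for an unknown group would be scheduled onto `node-102` in this sketch.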

In the above solution, the management module can also be used to implement failure detection and recovery of the container groups in each node, as well as replication and removal of container groups, so as to ensure that the number of container groups in the cluster matches the number defined in the replica set.

In the above solution of this embodiment, a plurality of container groups are allocated to one node for management, and a master control node is configured to allocate and control container groups in different nodes, so that a large-scale container is managed in a cluster manner, and even if there is a large-scale access pressure, the access pressure can be allocated to container groups in different destination nodes as required by the master control node, so that each container in each container group has a high availability, and a corresponding number of container groups can be started as required.

In addition, as shown in FIG. 3, in the above system for managing a large-scale container application, the main control node 200 further includes a storage module 204, and the interface module 201 sends the operation request, the operation execution state of the target container group, the creation record of the new container group, the information of the selected existing node, and the scheduling record of the new container group scheduled to the selected existing node to the storage module 204 for storage. That is, the storage module 204 monitors and records the state change of each node in the large-scale container. In this embodiment, the storage module 204 is a highly available store that persists all resource objects in the large-scale container, such as nodes, services, and container groups.

Further preferably, as shown in FIG. 3, in the system for managing a large-scale container application, each node further includes an agent module and a node management module. For example, node 101 includes an agent module A1 and a node management module B1, and the agent module A1 can be used to implement proxying for the services executed by container group 11 through container group 1m and software-mode load balancing. Node N includes an agent module AN and a node management module BN, and the agent module AN can be used to implement proxying for the services executed by container group N1 through container group Nm and software-mode load balancing. That is to say, the agent module distributes the operation request to different container groups and/or different containers according to a load balancing standard, based on the load amounts of the different container groups and/or different containers in the node.
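The load-based distribution performed by the agent module inside a node can be illustrated with a least-load dispatcher. This is a sketch only: the patent does not specify the load balancing standard, so the choice of "fewest in-flight requests" as the load amount, the class name `AgentModule`, and its methods `dispatch`, `complete`, and `loads` are assumptions made for illustration.

```python
from typing import Dict, Iterable


class AgentModule:
    """Hypothetical agent that spreads requests over container groups."""

    def __init__(self, group_names: Iterable[str]) -> None:
        # Assumption: the load amount is the number of in-flight requests.
        self._load: Dict[str, int] = {name: 0 for name in group_names}

    def dispatch(self, request: str) -> str:
        """Assign the request to the least loaded container group.

        Ties are broken by container group name so the result is
        deterministic. Returns the name of the chosen container group.
        """
        if not self._load:
            raise RuntimeError("node has no container groups")
        target = min(self._load, key=lambda name: (self._load[name], name))
        self._load[target] += 1
        return target

    def complete(self, group_name: str) -> None:
        """Record that one request on the given container group finished."""
        if self._load[group_name] > 0:
            self._load[group_name] -= 1

    def loads(self) -> Dict[str, int]:
        """Return a copy of the current load per container group."""
        return dict(self._load)
```

The same structure would apply one level down, distributing requests over the containers within a container group, by keying the load table on container names instead.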

The node management module can be implemented by a nodelet service, and the nodelet is responsible for managing the creation, modification, monitoring, deletion and the like of a container group on the node, and reporting the state information of the node to the interface module 201 in real time; as a requesting node, the node sends an operation request to the interface module 201 through the nodelet, and as the target node, the node receives an operation request from the interface module 201 through the nodelet; the agent module in each node is in communication connection with the interface module 201, and the operation execution state obtained by summarizing the state data of each application process is fed back to the interface module 201. In addition, service processes that may also run in each node include node-proxy, docker daemon, and the like.

In the above scheme, application processes loaded in containers of different container groups belonging to the same node have a coupling relationship. Therefore, the system provided in the above scheme can regard a group of container groups as a cluster to provide services, and a service can regard as an external access interface of a group of container groups providing the same service, thereby being capable of realizing scheduling control of large-scale container groups.

Preferably, in the above scheme, the master node 200 further includes a replication controller, where the replication controller is configured to define a container group replica; the management module 202 creates the new container group through the replication controller. The replication controller is further configured to obtain a container group quantity requirement of the operation request, and compare the container group quantity requirement with the number of container groups currently in a running state; and if the container group quantity requirement is less than the number of container groups currently in a running state, the management module controls some of the running container groups to stop through the interface module.

That is, the replication controller is used in the cluster to define the number of container group replicas, and on the master node, the management module completes the creation, starting, stopping, monitoring, and similar operations of container groups according to the definition of the replication controller. The definition of the replication controller guarantees that the user-specified number of container group replicas is running at any time. If too many container group replicas are running, the system stops some of them; if too few are running, the system starts additional ones. In this way, the replication controller ensures that the number of container group replicas expected by the user is running in the cluster.
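The comparison between the required and the currently running number of container groups amounts to a reconciliation step, which can be sketched as a pure function. The function name `reconcile`, the naming scheme for new replicas, and the policy of stopping the last-listed replicas first are assumptions of this sketch; the patent only states that some running container groups are stopped or that additional ones are started.

```python
from typing import List, Tuple


def reconcile(desired: int, running: List[str],
              prefix: str = "cg") -> Tuple[List[str], List[str]]:
    """Compare the desired replica count with the running replicas.

    Returns a pair (to_start, to_stop) of container group names.
    """
    if desired < 0:
        raise ValueError("desired replica count must be non-negative")

    surplus = len(running) - desired
    if surplus > 0:
        # Too many replicas: stop the surplus (here, the last ones listed).
        return [], running[-surplus:]

    # Too few replicas (or exactly enough): generate unused names to start.
    existing = set(running)
    to_start: List[str] = []
    counter = 0
    while len(to_start) < -surplus:
        name = f"{prefix}-{counter}"
        if name not in existing:
            to_start.append(name)
        counter += 1
    return to_start, []
```

In a running system this step would be repeated whenever the replica definition or the set of running container groups changes, so that the cluster converges on the user-specified count.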

In the above scheme, the system further includes a storage volume. The container group performs read-write operations on the storage volume during operation: it reads the data required for executing an operation request from the storage volume, and saves the data generated while executing the operation request or running the application program into the storage volume. A storage volume is a shared directory in a container group that can be accessed by multiple containers; the storage volume has the same lifecycle as the container group but is independent of the lifecycle of the individual containers, so data in the storage volume is not lost when a container is terminated or restarted. The system supports multiple types of storage volumes, and one container group can use any number of storage volumes simultaneously. The storage volume in this embodiment can be implemented in two ways:

(1) when the container group is distributed to the nodes, an EmptyDir type storage volume is created, the initial content of the EmptyDir type storage volume is empty, and all containers in the same container group can read and write the same file in the EmptyDir. When a container group is deleted from a node, the data in EmptyDir is also permanently deleted.

(2) A HostPath type storage volume mounts a file or directory of the host into the container group. It is typically used when log files generated by an application program need to be stored permanently, in which case the high-speed file system of the host can be used for storage, or when a containerized application needs to access internal data structures of the docker engine on the host, in which case the HostPath can be defined as the docker directory of the host so that the application inside the container can directly access the docker file system.
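The lifecycle rule for an EmptyDir type storage volume (created empty with the container group, shared by its containers, unaffected by container restarts, and deleted with the container group) can be sketched with a temporary directory. The classes `EmptyDirVolume` and `ContainerGroup` and their methods are hypothetical names for this illustration; containers are modeled only as status strings, not as real processes.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Iterable


class EmptyDirVolume:
    """Hypothetical EmptyDir volume backed by a temporary directory."""

    def __init__(self) -> None:
        # The directory is created empty when the volume is created.
        self.path = Path(tempfile.mkdtemp(prefix="emptydir-"))

    def write(self, relative_name: str, data: str) -> None:
        (self.path / relative_name).write_text(data)

    def read(self, relative_name: str) -> str:
        return (self.path / relative_name).read_text()

    def delete(self) -> None:
        # Permanently remove the directory and everything in it.
        shutil.rmtree(self.path, ignore_errors=True)


class ContainerGroup:
    """Hypothetical container group that owns one shared EmptyDir volume."""

    def __init__(self, name: str, container_names: Iterable[str]) -> None:
        self.name = name
        self.volume = EmptyDirVolume()  # created together with the group
        self.containers: Dict[str, str] = {
            c: "running" for c in container_names
        }

    def restart_container(self, container_name: str) -> None:
        # Restarting a container does not touch the shared volume.
        self.containers[container_name] = "running"

    def delete(self) -> None:
        # Deleting the group also deletes the data in its volume.
        self.containers.clear()
        self.volume.delete()
```

A HostPath type volume would differ in this sketch only in that `path` points at an existing host directory and `delete` leaves that directory in place.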

The above technical solution provided by the embodiment solves the problems of high availability and management of container applications.

Example 2

This embodiment describes the complete flow of creating a replication set controller and its associated service, to further illustrate the workflow of the system.

First, a requesting node submits, through its node management module, a request to create a replication set controller for a target container group, and sends the related parameters of the target container group to the interface module; the interface module writes the request into the storage module. The management module then detects the replication set event and the related information of the target container group through the interface module's interface for monitoring resource changes. After analysis, it finds that no instance of the required target container group exists among the container groups in the current nodes, so it generates a new container group object according to the template definition of the target container group in the replication set and writes the new container group object into the storage module through the interface module. The scheduling module detects this event, immediately executes a complex scheduling process, selects a node on which the new container group will run according to the preset conditions, and writes the result into the storage module through the interface module. The node management module running on the target node then detects the newly generated container group through the interface module, starts the container group according to its definition, and is responsible for its operation.
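The chain of events in this flow (replication set written, container group objects generated, node selected, container group started) can be traced with a small event-driven simulation. Everything here is a simplified assumption: `Store` merges the interface module and storage module into one hypothetical class, callbacks run synchronously instead of over a network, the phase names `Pending`, `Scheduled`, and `Running` are invented for the sketch, and scheduling simply picks the node with the fewest container groups.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, dict], None]


class Store:
    """Hypothetical stand-in for the interface module plus storage module."""

    def __init__(self) -> None:
        self.objects: Dict[str, Dict[str, dict]] = defaultdict(dict)
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)

    def watch(self, kind: str, callback: Callback) -> None:
        self._watchers[kind].append(callback)

    def write(self, kind: str, name: str, obj: dict) -> None:
        # Persist the object, then notify every watcher of that kind.
        self.objects[kind][name] = obj
        for callback in list(self._watchers[kind]):
            callback(name, obj)


def wire_up(store: Store, node_loads: Dict[str, int]) -> None:
    """Register the three cooperating components as watchers."""

    def management_module(name: str, replica_set: dict) -> None:
        # Generate a container group object for each missing replica.
        for i in range(replica_set["replicas"]):
            group_name = f"{name}-{i}"
            if group_name not in store.objects["container_group"]:
                store.write("container_group", group_name, {
                    "template": replica_set["template"],
                    "node": None,
                    "phase": "Pending",
                })

    def scheduling_module(name: str, group: dict) -> None:
        # Bind an unscheduled container group to the least loaded node.
        if group["node"] is None:
            node = min(node_loads, key=lambda n: (node_loads[n], n))
            node_loads[node] += 1
            store.write("container_group", name,
                        {**group, "node": node, "phase": "Scheduled"})

    def node_management_module(name: str, group: dict) -> None:
        # Start a container group once it has been bound to a node.
        if group["phase"] == "Scheduled":
            store.write("container_group", name,
                        {**group, "phase": "Running"})

    store.watch("replica_set", management_module)
    store.watch("container_group", scheduling_module)
    store.watch("container_group", node_management_module)
```

Writing a single `replica_set` object into the store is then enough to drive each container group through all three steps, since every component reacts only to changes it observes through the shared entry point.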

Then, the requesting node submits a service creation request mapped to the newly-created target container group through its internal node management module, which queries the existing instance of the associated target container group, then generates the information of the service and writes it into the storage module through the interface module. Then, the agent module on the target node inquires and monitors the service object and the corresponding information thereof through the interface module, and establishes a load balancer in a software mode to realize the flow forwarding function of the service access to the target container group in the target node, thereby realizing the high availability of the container group.
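The software-mode load balancer that the agent module builds for a service can be illustrated with a round-robin forwarder over the container groups behind that service. The patent does not name the forwarding policy, so round-robin is an assumption, as are the class name `ServiceProxy`, its `forward` method, and the example IP addresses used to identify container groups.

```python
from itertools import cycle
from typing import List


class ServiceProxy:
    """Hypothetical software load balancer for one service."""

    def __init__(self, service_name: str, endpoints: List[str]) -> None:
        if not endpoints:
            raise ValueError("a service needs at least one container group")
        self.service_name = service_name
        self._endpoints = list(endpoints)
        # Assumption: requests are forwarded in simple round-robin order.
        self._next = cycle(self._endpoints)

    def forward(self, request: str) -> str:
        """Return the endpoint (container group IP) chosen for the request."""
        return next(self._next)
```

Because each container group may be assigned its own IP address, the service name acts as the single external access point while the proxy decides which container group actually handles each request.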

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A system for managing large-scale container applications, characterized by:

the large scale container comprises at least one container group, each container group comprises at least one container; at least one of said groups of containers forming a node;

the interface module receives an operation request aiming at a target container group sent by a request node; the management module monitors the operation request through the interface module, and if the large-scale container comprises the target container group, the management module sends the operation request to a target node where the target container group is located through the interface module; the target node manages the target container group to execute corresponding operation; the requesting node monitors the operation execution state of the target container group in the target node through the interface module;

after the management module monitors the operation request through the interface module, if the large-scale container does not include the target container group, a new container group is created as the target container group according to the information of the target container group recorded in the operation request;

the master control node also comprises a scheduling module; the scheduling module selects an existing node as the target node according to a preset condition and schedules the new container group to the selected existing node;

the interface module sends the operation request, the operation execution state of the target container group, the creation record of the new container group, the information of the selected existing node, and the scheduling record of the new container group scheduled to the selected existing node to the storage module for storage;

wherein each container is loaded with at least one application process;

2. The system for managing large-scale container applications of claim 1, wherein:

and the application processes loaded in the containers have a coupling relation among different container groups belonging to the same node.

3. The system for managing large-scale container applications of claim 1 or 2, wherein:

the agent module is further configured to distribute the operation request to different container groups and/or different containers according to load balancing standards according to load amounts of different container groups and/or different containers in the node.

4. The system for managing large-scale container applications of claim 1 or 2, wherein:

the main control node also comprises a replication controller, and the replication controller is used for defining a container group replica; the management module creates the new container group through the replication controller.

5. The system for managing large-scale container applications of claim 4, wherein:

the replication controller is further configured to obtain a container group quantity requirement of the operation request, and compare the container group quantity requirement with the number of container groups currently in a running state; and if the container group quantity requirement is less than the number of container groups currently in a running state, the management module controls some of the running container groups to stop through the interface module.

6. The system for managing large-scale container applications of claim 1 or 2, further comprising:

a storage volume;

7. The system for managing large-scale container applications of claim 6, wherein:

the storage volume is of an EmptyDir or a HostPath type.