CN114625478A - Application program management method and device, electronic equipment and computer readable storage medium - Google Patents

Application program management method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN114625478A
Authority
CN
China
Prior art keywords
application
container
group
application program
running state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210126119.8A
Other languages
Chinese (zh)
Other versions
CN114625478B (en)
Inventor
施凯
张振
王思宇
常耀伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN202210126119.8A
Publication of CN114625478A
Application granted
Publication of CN114625478B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45591 Monitoring or debugging support

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses an application management method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: acquiring a running state detection result for each application in at least one group of applications; acquiring the container states of the group of containers in which the applications of the group respectively run; determining, according to the running state detection result of each application and the container state of the container in which it runs, the applications in each group that have a predetermined running state; and performing a predetermined operation on the applications having the predetermined running state. The method and apparatus make management decisions by considering all applications that share the same application identifier as a whole. This avoids the poor service stability that results when management is performed only on the basis of a single application's running state detection result, without global control.

Description

Application management method and apparatus, electronic device, and computer-readable storage medium

Technical Field

The present application relates to the field of cloud computing, and in particular to an application management method and apparatus, an electronic device, and a computer-readable storage medium.

Background

With the development of cloud computing, more and more applications rely on widely available cloud computing resources to serve users. In recent years, cloud-native technology has emerged: applications are designed from the outset for the cloud computing environment, so that they run well on cloud resources and take full advantage of the distributed and elastic nature of cloud platforms. Containers are one of the basic elements of a cloud-native system. Containerization provides the foundation for cloud-native microservices, and containers can be managed by a container orchestration system such as K8S (Kubernetes).

In a cloud-native system, an application or service can be containerized and offered to users as a microservice. An application can therefore be replicated, with the replicas running in multiple containers, to improve operating efficiency, reduce the load on any single container, and improve disaster tolerance when a failure occurs. When an application running in a container becomes unhealthy, for example because its process hangs or its service behaves abnormally, the container can no longer provide service as an independent microservice unit. The usual approach is to configure an alarm for such unhealthy containers in advance and have maintenance personnel restart the container manually to restore the application to normal operation. Such repetitive manual maintenance depends on someone noticing the alarm and acting promptly. It is inefficient, and when staff are short, one or more containers may not be restarted in time, which affects normal use by users.

Summary of the Invention

Embodiments of the present application provide an application management method and apparatus, an electronic device, and a computer-readable storage medium, to address the complex configuration and low efficiency of single-machine detection in the prior art.

To achieve the above purpose, an embodiment of the present application provides an application management method, the method comprising:

acquiring a running state detection result for each application in at least one group of applications;

acquiring the container states of the group of containers in which the applications of the group respectively run;

determining, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state;

performing a predetermined operation on the applications having the predetermined running state.
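
The four steps above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the `Replica` record, the state strings, and the rule "probe failed while the container is still running" are all assumptions chosen for the example, and the predetermined operation is passed in as a callback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Replica:
    """One application replica (hypothetical record, not from the source)."""
    app_id: str           # application identifier shared by the whole group
    container: str        # name of the container the replica runs in
    probe_result: str     # running state detection result: "Success", "Failure", "Unknown"
    container_state: str  # container state, e.g. "Running" or "Terminated"


def select_targets(replicas: List[Replica]) -> Dict[str, List[Replica]]:
    """Steps 1-3: combine both inputs and pick, per group, the replicas
    that are in the predetermined running state."""
    targets: Dict[str, List[Replica]] = {}
    for r in replicas:
        # Assumed rule: the probe failed although the container itself is running.
        if r.probe_result == "Failure" and r.container_state == "Running":
            targets.setdefault(r.app_id, []).append(r)
    return targets


def manage(replicas: List[Replica], operation: Callable[[Replica], None]) -> List[str]:
    """Step 4: perform the predetermined operation on every selected replica
    and return the names of the containers that were handled."""
    handled: List[str] = []
    for group in select_targets(replicas).values():
        for r in group:
            operation(r)
            handled.append(r.container)
    return handled
```

Passing a restart function as `operation` would give the restart behaviour discussed in the description; any other operation (alerting, draining traffic) fits the same shape.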

An embodiment of the present application further provides an application management apparatus, comprising:

a first state acquisition module, configured to acquire a running state detection result for each application in at least one group of applications;

a second state acquisition module, configured to acquire the container states of the group of containers in which the applications of the group respectively run;

a determination module, configured to determine, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state;

an execution module, configured to perform a predetermined operation on the applications having the predetermined running state.

An embodiment of the present application further provides an electronic device, comprising:

a memory, configured to store a program;

a processor, configured to run the program stored in the memory, the program executing, when run, the application management method provided by the embodiments of the present application.

An embodiment of the present application further provides a computer-readable storage medium storing a computer program executable by a processor, wherein the program, when executed by the processor, implements the application management method provided by the embodiments of the present application.

The application management method and apparatus, electronic device, and computer-readable storage medium provided by the embodiments of the present application acquire the running state detection results of a group of applications having the same application identifier, and acquire the container states of the containers corresponding to that group. A combined judgment is then made on the basis of each application's running state detection result and the container state of the container in which it runs, to determine which applications in the group have a predetermined running state, and an operation is performed on the applications so determined. Management decisions can therefore take into account all applications that share the same application identifier. This avoids the poor service stability that results when management is performed only on the basis of a single application's running state detection result, without global control.

The above is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the description, and so that the above and other objects, features, and advantages of the present application are more apparent, specific embodiments of the present application are set out below.

Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the application. Throughout the drawings, the same reference numerals denote the same components. In the drawings:

FIG. 1 is a schematic diagram of an application scenario of the application management solution provided by an embodiment of the present application;

FIG. 2 is a flowchart of one embodiment of the application management method provided by the present application;

FIG. 3 is a flowchart of another embodiment of the application management method provided by the present application;

FIG. 4 is a schematic structural diagram of an embodiment of the application management apparatus provided by the present application;

FIG. 5 is a schematic structural diagram of an embodiment of the electronic device provided by the present application.

Detailed Description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

Embodiment 1

The solution provided by the embodiments of the present application can be applied to any system with cloud application management capability, for example a cloud service system that includes a cloud application management module. FIG. 1 is a schematic diagram of an application scenario of the application management solution provided by an embodiment of the present application; the scenario shown in FIG. 1 is only one example of the principle of the technical solution of the present application.

The prior art has proposed setting a container probe in the container management system, so that the probe periodically enters a specified container, according to the user's configuration, to check whether the container's service is normal. For example, the probe may be a command script that is executed in the specified container, and whether the application in the container is running normally is judged from the result of executing the script. A user can configure the command script for a group of containerized applications so that its execution reflects the running state of the application in the container. Once configured, the probe command script is injected into and executed in the containers of that group of applications, that is, of all application replicas that provide the service externally. When the execution succeeds and produces the predetermined exit code, for example, the probe is deemed successful, meaning that the application in the container is running normally.

Conversely, if executing the command script produces a non-zero exit code, the script execution has failed. In the prior art, the application in the container is then judged to be running abnormally on the basis of that result alone, and a restart of the container is triggered directly to restore the application in the container to normal operation.
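
The prior-art behaviour described above can be sketched in a few lines. This is an illustrative approximation only: it runs the probe command as a local subprocess rather than inside a container, and the function names `probe_is_healthy` and `prior_art_should_restart` are hypothetical. The point is that the restart decision depends on nothing but the script's exit code.

```python
import subprocess
from typing import Sequence


def probe_is_healthy(command: Sequence[str], timeout_s: float = 5.0) -> bool:
    """Run a probe command and report True only when it exits with code 0.

    A command that cannot be started or that times out is treated as a failed
    probe here, which is an assumption made for this sketch.
    """
    try:
        completed = subprocess.run(
            list(command),
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return completed.returncode == 0


def prior_art_should_restart(command: Sequence[str]) -> bool:
    """Prior-art rule: any failed probe directly triggers a container restart."""
    return not probe_is_healthy(command)
```

Because the rule looks only at the exit code, a probe script that is itself broken is indistinguishable from an unhealthy application.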

In this prior-art solution, however, the running state of the application in the container is reflected only indirectly, through the execution result of the injected probe command script. If the command script is configured unreasonably, the accuracy of the running state detection suffers and wrong judgments may result. In the prior art, therefore, the maintenance personnel of the application usually need considerable experience and professional knowledge to perform this configuration.

Moreover, particularly in a cloud-native system, multiple replicas are created from one application, each running in its own container, and together they provide a service externally. In effect, the group of application replicas forms a whole, with each replica providing a part of the service. In this situation, if an unreasonable probe command script causes a single container to be restarted unreasonably or even by mistake, part of the service is directly affected. If the application running in that container is comparatively important to the whole, restarting the container directly may seriously affect the normal operation of the entire service and may even cause losses for users.

In the prior art, therefore, managing a group of applications on the basis of single-container detection results carries hidden risks in practice. In a cloud-native system, the effect that the running state of the application replicas in the multiple containers providing a microservice has on the overall service must be assessed as a whole. Configuring single-machine probe detection for each container thus requires maintenance personnel with a deep understanding of the application and with professional knowledge. In addition, because the relationships between the containers providing microservices in a cloud-native system are complex, such per-machine configuration is tedious and intricate, and maintenance personnel can easily get it wrong. Once maintenance personnel put such a misconfiguration into effect, the containers providing the microservice are liable to be restarted by mistake or at the wrong time, so that the externally provided service is affected or cannot be provided at all.

In particular, in the probe detection currently applied to single containers in the prior art, all applications in a group are configured uniformly and without distinction, so a configuration takes effect in every container of the group as soon as it is applied. In other words, if maintenance personnel misconfigure the probe detection method by mistake, the applications running in all of the containers may all be restarted, leaving none of them able to provide service externally and seriously affecting the stability and high availability of the service.

For example, FIG. 1 shows an application scenario to which the application management method of the present application can be applied. In the scenario shown in FIG. 1, three containers 1-3 can be created on a cloud server for three replicas of one application, each container running one replica, so that the three application replicas 1-3 running in containers 1-3 respectively provide a service externally as a whole, in the form of microservices. To ensure the stability of the service, a probe command script can be configured for applications 1-3 and executed in each of containers 1-3, and in the prior art the running state of applications 1-3 is judged from the results of executing that script in containers 1-3. In particular, in the embodiments of the present application, maintenance personnel usually configure the script from experience alone, and once configured, the script takes effect for all three applications 1-3. When, for example, executing the script in container 1 produces an exit code of zero or a successful execution result identifier, the application in that container is considered to be running normally. Conversely, when executing the script in a container produces a non-zero result or returns a failed execution result, the prior art judges that the application in that container is running abnormally and directly issues a restart instruction to the container to restore the application. As described above, however, if the script is misconfigured and fails to run in container 1, the script returns, for example, a non-zero exit code even though application 1 has been running normally in container 1 all along, and the container is restarted. Moreover, the same script continues to be used for running state detection after the restart. The script therefore inevitably returns a failed result every time it runs in a container, so containers 1-3 of applications 1-3 are restarted repeatedly, and the service provided externally by applications 1-3 eventually becomes abnormal.

To address this, in the embodiments of the present application, the application is not managed directly on the basis of the script's execution result once that result has been obtained. Instead, the current container state of the container is also obtained, and the running state of the application in the container is judged from the container state and the script's execution result together. In particular, when judging the running state of an application, the embodiments obtain the script execution results of all applications in that application's group, so that applications having the predetermined state can be managed according to a predetermined policy for the group while keeping the service normal as far as possible.
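
One way to picture such a group-level policy is a restart budget. The sketch below is an assumption made for illustration, since the source does not specify the policy: `max_unavailable` is a hypothetical parameter giving how many replicas of a group may be out of service at once, and the tuple layout and state strings are likewise invented for the example.

```python
from typing import List, Tuple

# (container name, probe result, container state); layout assumed for this sketch
ReplicaStatus = Tuple[str, str, str]


def pick_restarts(group: List[ReplicaStatus], max_unavailable: int) -> List[str]:
    """Choose which containers of one application group may be restarted now.

    A container is a candidate only if its probe failed while the container
    itself is running. Containers that are already not running consume the
    budget, so the group never drops below the allowed level of availability.
    """
    unavailable = sum(1 for _, _, state in group if state != "Running")
    candidates = [
        name
        for name, probe, state in group
        if probe == "Failure" and state == "Running"
    ]
    budget = max(0, max_unavailable - unavailable)
    return candidates[:budget]
```

With three replicas and `max_unavailable=1`, two simultaneous probe failures would lead to one restart now and the other deferred to a later evaluation, so at least two replicas keep serving.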

Specifically, the technical solution of the embodiments of the present application can use the Kubernetes framework as the container management solution. In this solution, the embodiments can be configured non-invasively. Building on the livenessProbe capability known in the field, with in-depth functional extension and technical evolution, an enhanced livenessProbe controller is proposed. The controller can use the standard Kubernetes detection module and can be based on the ControllerRuntime architecture. In the embodiments of the present application, the cloud service system as a whole is therefore implemented as a centralized deployment whose scope of influence is one Kubernetes cluster. As described above, the container state of each single container on each server in the cluster is reported periodically, and the enhanced livenessProbe control module can combine it with the probe result to make a combined judgment.

Under a microservice architecture, applications are deployed in containerized form, and in a broad sense an application is regarded as running inside a container. The probe script component can therefore be arranged to perform detection on the single machine and report the state at the container level, after which a management module that applies the application management method of the embodiments of the present application, for example in a cloud service system, makes the overall decision. When the enhanced livenessProbe controller according to the embodiments senses a change in container state, it performs its internal logic and decides whether the container needs to be restarted. When the application is finally judged to be running abnormally, the enhanced livenessProbe controller issues an instruction and triggers a restart of the corresponding container. In other words, after the user has edited and configured the probe script, the single-machine container side only performs detection and, for example, reports the detection result. A management module such as a detection controller then evaluates, for example, the script's execution result together with the container state, and can further combine the judgments for the whole group of application replicas to determine which containers need to be restarted. Applications can thus be managed and controlled with a view to ensuring the stability and reliability of the service.

For example, in the embodiments of the present application, the following three methods, namely ExecAction, HTTPGetAction, and TCPSocketAction, may be used to obtain the running state detection result of an application. A detection result of Success indicates that the check passed, Failure indicates that the check failed, and Unknown indicates that the check did not run normally. Specifically, with ExecAction, a specified script command is executed in the container, and the probe succeeds if the command exits with code 0. With HTTPGetAction, an HTTP GET request is sent to the container's IP address, port, and path, and the container is considered healthy if the response status code is greater than or equal to 200 and less than 400. With TCPSocketAction, a TCP check is performed against the container's IP address and port, and the container is considered healthy if a TCP connection can be established.
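The three success criteria above can be written as small classification functions. The following Python sketch is illustrative only: the names `ProbeResult`, `classify_exec`, `classify_http`, and `classify_tcp` are hypothetical, and mapping a missing observation (`None`) to Unknown is an assumption about what "the check did not run normally" means in practice.

```python
from enum import Enum
from typing import Optional


class ProbeResult(Enum):
    SUCCESS = "Success"   # the check passed
    FAILURE = "Failure"   # the check failed
    UNKNOWN = "Unknown"   # the check did not run normally


def classify_exec(exit_code: Optional[int]) -> ProbeResult:
    """ExecAction: an exit code of 0 counts as success."""
    if exit_code is None:  # assumption: the command could not be run at all
        return ProbeResult.UNKNOWN
    return ProbeResult.SUCCESS if exit_code == 0 else ProbeResult.FAILURE


def classify_http(status_code: Optional[int]) -> ProbeResult:
    """HTTPGetAction: a status code in [200, 400) counts as healthy."""
    if status_code is None:  # assumption: no HTTP response was obtained
        return ProbeResult.UNKNOWN
    return ProbeResult.SUCCESS if 200 <= status_code < 400 else ProbeResult.FAILURE


def classify_tcp(connected: Optional[bool]) -> ProbeResult:
    """TCPSocketAction: an established connection counts as healthy."""
    if connected is None:  # assumption: the check itself was not attempted
        return ProbeResult.UNKNOWN
    return ProbeResult.SUCCESS if connected else ProbeResult.FAILURE
```

Keeping the classification separate from the I/O that runs the command or opens the socket makes each criterion easy to check in isolation.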

Once the user finishes configuring a probe, the configuration can take effect immediately on all containers, which in prior-art solutions is a potential risk to application reliability. For example, as mentioned above, if the probe is misconfigured, the community logic restarts every instance of the application, which is fatal for stability and high availability. In the solution of the embodiments of the present application, however, the container side reports the probe execution result to the container control module. The container control module can therefore take a centralized, global view of all applications, and in particular of a group of applications as a whole, when considering the reliability of the service that the application provides as a whole.

In addition, because the overall reliability of the service provided by all application replicas can be considered, a threshold on the number of containers restarted at one time can be introduced to prevent a configuration error from restarting all containers. For example, a maximum unavailable ratio, maxUnAvailable, can be set and combined with the application's actual container count to provide a last-resort safeguard.

In addition, because applications that provide different services have different numbers of replicas, the current replica count can also be taken into account when setting the maximum unavailable ratio, so that the ratio is adjusted dynamically. For example, with 10 container replicas and maxUnAvailable=20%, at most 2 containers may be restarted at the same time. In practice, maintenance personnel can set this threshold flexibly according to actual requirements, for example maxUnAvailable=1 when the replica count is below 10 and maxUnAvailable=20% when the replica count is 10 or more. The maxUnAvailable value can also be maintained dynamically in real time by the pdbproducer controller in the open-source OpenKruise project, according to the capacity of the service and the rule policy.
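The example rule above (a fixed limit of 1 below 10 replicas, 20% otherwise) can be sketched as follows. The function name `max_unavailable` and its parameters are hypothetical, and rounding the percentage down is an assumption, since the text only gives the case of 10 replicas.

```python
def max_unavailable(replicas: int, small_group_size: int = 10, percent: int = 20) -> int:
    """Return how many containers may be restarted at the same time.

    Groups smaller than `small_group_size` get a fixed limit of 1;
    larger groups get `percent` percent of the replica count.
    """
    if replicas <= 0:
        return 0
    if replicas < small_group_size:
        return 1
    # Integer arithmetic avoids floating-point surprises; rounding down is an assumption.
    return max(1, replicas * percent // 100)
```

With the defaults, `max_unavailable(10)` evaluates to 2, matching the example in the text, and any group of fewer than 10 replicas is limited to a single restart at a time.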

In addition, as mentioned above, if the probe is misconfigured, for example because the command script is wrong, the script returns a failure every time it runs in the container even though the container and the application are both running normally. The number of failed detection results for an application within a predetermined time period can therefore be counted. If the number of failures within that period reaches a predetermined threshold, a configuration error of this kind may have occurred, and an alert can be sent to the user.
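One way to count failures within a predetermined period is a sliding time window. The sketch below is a minimal illustration, not the method of the embodiments: the class `FailureWindow` and its parameters are hypothetical, and it assumes that results are recorded with non-decreasing timestamps.

```python
from collections import deque


class FailureWindow:
    """Counts failed probe results inside a sliding time window."""

    def __init__(self, window_seconds: float, threshold: int) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._failures = deque()  # timestamps of failed results, oldest first

    def record(self, timestamp: float, failed: bool) -> bool:
        """Record one result; return True if the user should be alerted."""
        if failed:
            self._failures.append(timestamp)
        # Drop failures that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()
        return len(self._failures) >= self.threshold
```

For example, with a 60-second window and a threshold of 3, three failures at 0 s, 10 s, and 20 s trigger an alert on the third call, while a single failure at 100 s does not, because the earlier failures have expired by then.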

In addition, because both the detection result and the container state are considered when judging the actual running state of an application, different priorities can be assigned to different combinations of detection result and container state to make the judgment more precise. For example, the detection result may be execution success or execution failure, and the container state may be container available or container unavailable. The first combination, execution failure with the container unavailable, can then be given a higher priority than the second combination, execution failure with the container available.

The application management solution provided by the embodiments of the present application obtains the running state detection results of a group of applications that share the same application identifier, and obtains the container states of the containers corresponding to that group. It then makes a comprehensive judgment based on each application's running state detection result and the state of the container in which it runs, determines which applications in the group have a predetermined running state, and performs an operation on those applications. Management decisions can thus take into account the whole set of applications with the same application identifier. This avoids the problem that managing applications solely on the basis of a single application's detection result lacks global control and degrades the stability of the service provided by the group.

The foregoing embodiment describes the technical principles and an exemplary application framework of the embodiments of the present application. The specific technical solutions are described in further detail below through several embodiments.

Embodiment 2

FIG. 2 is a flowchart of one embodiment of the application management method provided by the present application. The method may be executed by various terminal or server devices with cloud application management capabilities, or by an apparatus or chip integrated in such devices. As shown in FIG. 2, the application management method includes the following steps:

S201: Acquire a running state detection result for each application in at least one group of applications.

In step S201, the running state detection result of each application in at least one group of applications is obtained. The result obtained by executing a preconfigured probe scheme in the container that runs each application may be used as the running state detection result in step S201. In particular, the detection results may be obtained for a group of applications that share the same application identifier. In a microservice system, the detection status can therefore be obtained for a group of microservice application replicas that together provide a service.

S202: Acquire the container states of the group of containers in which the applications of the group respectively run.

In step S202, the container state is obtained for each container that runs an application whose detection result was obtained in step S201. For example, in a cloud-native system, each application can run in a container, and the applications provide different services. In addition to obtaining a detection result such as the execution result of a probe, step S202 therefore also obtains the state of the container in which the application runs, so that a comprehensive judgment can be made.

S203: Determine the applications in each group that have a predetermined running state, according to each application's running state detection result and the container state of the container in which it runs.

In step S203, the actual running state of each application is judged from the detection result obtained in step S201 together with the container state obtained in step S202. In practice, the user configures a probe so that it runs in the application's container to detect the application's running state. The probe result therefore reflects the application's running state only indirectly, and, as described above, its accuracy depends on how the maintenance personnel configured the probe. If the configuration is unreasonable or simply wrong, the probe fails to execute in the container. If, as in the prior art, that execution result is the sole basis for judging the application's running state, the application will be handled incorrectly, for example restarted while it is actually running normally. In step S203, the judgment is instead based on the detection result combined with the actual state of the container, which eliminates erroneous detection results caused by probe configuration problems on the single-machine side.

S204: Perform a predetermined operation on the applications that have the predetermined running state.

In step S204, a predetermined operation is performed on the applications determined in step S203 to have the predetermined running state. For example, if step S203 determines that an application's running state is abnormal or failed, the application can be restarted in step S204 to restore normal operation.

The application management method provided by the embodiments of the present application obtains the running state detection results of a group of applications that share the same application identifier, and obtains the container states of the containers corresponding to that group. It then makes a comprehensive judgment based on each application's running state detection result and the state of the container in which it runs, determines which applications in the group have a predetermined running state, and performs an operation on those applications. Management decisions can thus take into account the whole set of applications with the same application identifier. This avoids the problem that managing applications solely on the basis of a single application's detection result lacks global control and degrades the stability of the service provided by the group.

Embodiment 3

FIG. 3 is a flowchart of another embodiment of the application management method provided by the present application. The method may be executed by various terminal or server devices with cloud-native application management capabilities, or by an apparatus or chip integrated in such devices. As shown in FIG. 3, the application management method includes the following steps:

S301: Execute a predetermined command script in the container of each application.

In step S301, a predetermined command script is executed in the container that runs each application to detect the application's running state. For example, in the prior art, maintenance personnel can configure a command script and have it executed in the container as a means of checking the application running there.

S302: Determine the running state detection result of each application according to the execution result of the command script.

In step S302, the running state detection result of the application in a container is determined from the execution result of the command script executed in step S301. For example, as shown in FIG. 1, the user can preconfigure a probe script for the applications running in containers 1-3, and the script is executed in each of containers 1-3 in step S301. If step S302 finds that the script in container 1 produced an exit code of zero, or a successful execution result indicator, the application in that container can be considered to be running normally. If step S302 finds that the script in a container produced a non-zero exit code, or returned a failed execution result, that failure is used as the application's running state detection result.

S303: Acquire the container states of the group of containers in which the applications of the group respectively run.

In step S303, the container state is obtained for each container that runs an application whose detection result was determined in step S302. For example, in a cloud-native system, each application can run in a container, and the applications provide different services.

For example, when step S302 finds that the command script executed in step S301 returned a non-zero exit code or a failed execution result, the prior art judges that the application in that container is running abnormally and directly issues a restart instruction to the container to restore the application. As described above, however, if the script is misconfigured, it fails to run in container 1 and returns, for example, a non-zero exit code even though application 1 has been running normally in container 1 all along, and container 1 is restarted as a result. Moreover, the same script continues to be used to check the application after the restart. The script therefore inevitably returns a failure every time it runs, so containers 1-3 of applications 1-3 restart repeatedly, and the service that applications 1-3 provide externally eventually becomes abnormal.

Therefore, in addition to obtaining a detection result such as the execution result of a probe, the embodiments of the present application also obtain, in step S303, the state of the container in which the application runs, so that a comprehensive judgment can be made.

S304: Determine the applications that have the predetermined running state according to the priority of the combination of each application's running state detection result and container state.

In step S304, the running state of each application is judged from the combination of the detection result determined in step S302 and the container state obtained in step S303. In particular, so that both the detection result and the container state are taken into account, different priorities can be assigned to different combinations of the two to make the judgment more precise. For example, the detection result may be execution success or execution failure, and the container state may be container available or container unavailable. The first combination, execution failure with the container unavailable, can then be given a higher priority than the second combination, execution failure with the container available. In the scenario shown in FIG. 1, suppose step S302 determines that the command script failed in both container 1 and container 2, and step S303 finds that container 1 is available while container 2 is unavailable. Because the combination of script result and container state for container 2 has a higher priority than that for container 1, step S304 can decide to restart container 2 first, to restore application 2 running in container 2.
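The priority rule in step S304 can be illustrated with a small ordering function. This is a sketch under stated assumptions: `ReplicaStatus`, `restart_priority`, and `order_restart_candidates` are hypothetical names, the numeric priority values are arbitrary, and treating a replica whose probe succeeded as not being a restart candidate is an assumption, since the text only ranks the two failure combinations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicaStatus:
    name: str
    probe_failed: bool          # detection result: execution failure
    container_available: bool   # container state


def restart_priority(status: ReplicaStatus) -> int:
    """Higher value means restart sooner; 0 means not a restart candidate."""
    if not status.probe_failed:
        return 0  # assumption: a successful probe never triggers a restart
    # Failure + container unavailable outranks failure + container available.
    return 2 if not status.container_available else 1


def order_restart_candidates(statuses: List[ReplicaStatus]) -> List[ReplicaStatus]:
    """Return restart candidates, highest priority first."""
    candidates = [s for s in statuses if restart_priority(s) > 0]
    return sorted(candidates, key=restart_priority, reverse=True)
```

Applied to the FIG. 1 scenario, where the probe fails in containers 1 and 2 and only container 2 is unavailable, the ordering places container 2 ahead of container 1.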

S305: Determine an operation threshold according to the total number of applications in a group of applications.

S306: Manage each group of applications according to the proportion of applications in the group that have the predetermined running state.

In step S305, the operation threshold is determined from the total number of applications with the same identifier that were checked in step S301. Once the user finishes configuring a probe, the configuration can take effect immediately on all containers, which in prior-art solutions is a potential risk to application reliability. For example, as mentioned above, if the probe is misconfigured, the community logic restarts every instance of the application, which is fatal for stability and high availability. For this reason, the container side reports the probe execution result to the container control module in step S302, and the container state is additionally obtained in step S303. The container control module can then work centrally from the detection results and container states of a group of applications, taking a global view of all applications, and in particular of the group as a whole, when considering the reliability of the service that the application provides as a whole.

In step S305, the overall reliability of the service provided by all application replicas is therefore considered in order to determine a threshold on the number of applications in a group that may be operated on at one time, which prevents a configuration error from restarting all containers.

For example, in step S305 a maximum unavailable ratio can be set in combination with the application's actual container count, as a last-resort safeguard for the application as a whole. Specifically, because applications that provide different services have different numbers of instances, the current total number of applications can be taken into account when the maximum unavailable ratio is set in step S305, so that the ratio is adjusted dynamically. For example, with 10 container replicas, the maximum unavailable ratio can be set to 20% in step S305, so that at most 2 containers may be restarted at the same time in step S306. In practice, maintenance personnel can set this threshold flexibly according to actual requirements. For example, when a group contains fewer than 10 applications, the maximum unavailable number can be fixed at 1, so that only 1 container may be restarted at the same time in step S306. When a group contains 10 or more applications, the maximum unavailable ratio can be set to 20%, so that at most 20% of the applications in the group may be restarted at the same time in step S306. The threshold can also be maintained dynamically in real time by the pdbproducer controller in the open-source OpenKruise project, according to the capacity of the service and the rule policy.
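Applying the threshold in step S306 amounts to capping how many restart candidates are acted on at once. The following sketch is illustrative: `select_restart_batch` is a hypothetical helper, and subtracting the containers that are already restarting from the limit is an assumption about how "at the same time" would be enforced.

```python
from typing import List


def select_restart_batch(candidates: List[str], limit: int, restarting_now: int = 0) -> List[str]:
    """Pick the candidates that may be restarted right now.

    `candidates` is assumed to be ordered by priority already, `limit` is the
    operation threshold from step S305, and `restarting_now` is the number of
    containers in the group that are currently being restarted.
    """
    budget = max(0, limit - restarting_now)
    return candidates[:budget]
```

If a misconfigured probe marks all 10 replicas of a group as failed and the limit is 2, only the first 2 candidates are restarted and the remaining 8 keep serving traffic.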

In addition, if the user makes an error when configuring the probe scheme, the script executed in the container in step S301 necessarily returns a failure in step S302 even though the container and the application are both running normally. Step S306 can therefore also count the number of failed detection results for an application within a predetermined time period. If the number of failures within that period reaches a predetermined threshold, a configuration error of this kind may have occurred, and in that case an alert can be sent to the user instead of restarting the container.

The application management method provided by the embodiments of the present application obtains the running state detection results of a group of applications that share the same application identifier, and obtains the container states of the containers corresponding to that group. It then makes a comprehensive judgment based on each application's running state detection result and the state of the container in which it runs, determines which applications in the group have a predetermined running state, and performs an operation on those applications. Management decisions can thus take into account the whole set of applications with the same application identifier. This avoids the problem that managing applications solely on the basis of a single application's detection result lacks global control and degrades the stability of the service provided by the group.

Embodiment 4

FIG. 4 is a schematic structural diagram of an embodiment of the application management apparatus provided by the present application, which can be used to perform the method steps shown in FIG. 2 and FIG. 3. As shown in FIG. 4, the application management apparatus may include a first state acquisition module 41, a second state acquisition module 42, a determination module 43, and an execution module 44.

The first state acquisition module 41 may be configured to obtain a running state detection result for each application in at least one group of applications.

The first state acquisition module 41 may obtain the running state detection result of each application in at least one group of applications. It may use, as the running state detection result, the result obtained by executing a preconfigured probe scheme in the container that runs each application. In particular, the detection results may be obtained for a group of applications that share the same application identifier. In a microservice system, the detection status can therefore be obtained for a group of microservice application replicas that together provide a service.

Specifically, the first state acquisition module 41 may execute a predetermined command script in the container that runs each application to detect the application's running state, and determine each application's running state detection result from the execution result of the command script. For example, in the prior art, maintenance personnel can configure a command script and have it executed in the container as a means of checking the application running there. The first state acquisition module 41 can therefore determine the running state detection result of the application in a container from the execution result of the command script. For example, as shown in FIG. 1, the user can preconfigure a probe script for the applications running in containers 1-3, and the script is executed in each of containers 1-3. If the script in container 1 produced an exit code of zero, or a successful execution result indicator, the first state acquisition module 41 can consider the application in that container to be running normally. If the first state acquisition module 41 finds that the script in a container produced a non-zero exit code, or returned a failed execution result, it can use that failure as the application's running state detection result.

The second state acquisition module 42 may be used to acquire the container states of the group of containers in which the group of applications respectively run.

The second state acquisition module 42 may acquire the container state of the container running the application to which the detection result acquired by the first state acquisition module 41 relates. For example, in a cloud-native system, each application can run in a container, each providing different service content. Therefore, in this embodiment of the present application, in addition to obtaining, for example, the probe's execution result as the detection result, the second state acquisition module 42 also acquires the state of the container in which the application runs, so that a comprehensive judgment can be made.

The determination module 43 may be used to determine, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state.

The determination module 43 can comprehensively judge the actual running state of an application from the detection result acquired by the first state acquisition module 41 and the container state acquired by the second state acquisition module 42. For example, in practice a user configures a probe so that it runs in the application's container and checks the application's running state. The probe's result therefore reflects the application's running state only indirectly, and, as noted above, its accuracy also depends on how maintenance personnel configured the probe. If the configuration is unreasonable or simply wrong, the probe's execution in the container will fail. In the prior art, if such an execution result is used as the sole basis for judging the application's running state, the application will be handled incorrectly, for example restarted even though it is actually running normally. The determination module 43, by contrast, can make a comprehensive judgment based on the detection result while also taking the actual state of the container into account, which eliminates erroneous detection results caused by a misconfigured detection scheme on the single-machine side.
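
One way to picture this combined judgment is the sketch below. It follows the ordering given in the claims, where a failed probe together with an unavailable container ranks above a failed probe together with an available container. The function names and the numeric priority values are hypothetical, and the case of a successful probe in a container that is not ready is not specified by the source, so the sketch simply takes no action there.

```python
from typing import List, Optional, Tuple


def restart_priority(probe_succeeded: bool, container_ready: bool) -> Optional[int]:
    """Return a priority for handling an application (0 is highest), or None."""
    if probe_succeeded:
        # Not covered by the source text when the container is not ready;
        # this sketch takes no action whenever the probe succeeded.
        return None
    if not container_ready:
        return 0  # probe failed and container unavailable: first combination
    return 1      # probe failed but container available: second combination


def order_candidates(states: List[Tuple[str, bool, bool]]) -> List[str]:
    """Order application names by priority, dropping those that need no action."""
    ranked = []
    for name, probe_succeeded, container_ready in states:
        priority = restart_priority(probe_succeeded, container_ready)
        if priority is not None:
            ranked.append((priority, name))
    ranked.sort(key=lambda item: item[0])  # stable sort keeps input order on ties
    return [name for _, name in ranked]
```

With this ordering, an application whose probe fails while its container is still reported as available is handled last, which leaves room to treat it as a possible probe misconfiguration rather than a real fault.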

Specifically, the determination module 43 may determine the operation threshold according to the total number of applications with the same identifier that are the acquisition targets of the first state acquisition module 41. In particular, in the solution of this embodiment, once a user finishes configuring a probe, the configuration can take effect immediately on all containers, which in prior-art solutions is a potential risk to application reliability. For example, as noted above, if the probe is misconfigured, the community logic causes every instance of the application to be restarted, which is fatal for stability and high service availability. For this reason, in this embodiment of the present application, the first state acquisition module 41 can report the probe execution results from the container side to the determination module 43, and the second state acquisition module 42 additionally acquires the container states. The determination module 43 can then take a centralized approach: based on the detection results and container states of a whole group of applications, it considers the high reliability of the service the application provides as a whole from a global perspective, and in particular from the perspective of the group as a whole.

Therefore, the determination module 43 can take into account the overall reliability of the service provided by all application replicas when determining the threshold on how many applications in a group may be operated on at once, so as to prevent a configuration error from restarting all containers.

For example, the determination module 43 may set a maximum unavailable ratio based on the application's current number of containers, as a last-resort safeguard for the application as a whole. Specifically, because the number of application instances differs from one service to another, the determination module 43 can take the current total number of instances into account when setting the maximum unavailable ratio and adjust it dynamically. For example, with 10 container replicas, the maximum unavailable ratio can be set to 20%, so that the execution module 44 allows only 2 containers to be restarted at the same time. In practice, the threshold can be set flexibly by maintenance personnel according to actual requirements. For example, when a group contains fewer than 10 applications, the maximum unavailable count can be fixed at 1, that is, only 1 container may be restarted at a time. When a group contains 10 or more applications, the maximum unavailable ratio can be set to 20%, that is, only 20% of the group may be restarted at a time. In this embodiment of the present application, the threshold can also be maintained dynamically in real time by the pdbproducer controller in the open-source OpenKruise project, according to the service's capacity and rule policies.
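
The threshold rule in this example can be written down directly. The sketch below uses the figures from the text (a fixed limit of 1 for groups smaller than 10, and 20% otherwise); the function name, the parameter names, and the choice to round the percentage down to a whole container are assumptions made for illustration.

```python
def max_unavailable(
    total: int,
    small_group_limit: int = 10,
    fixed_count: int = 1,
    ratio_percent: int = 20,
) -> int:
    """Return how many containers of a group may be restarted at the same time."""
    if total <= 0:
        return 0
    if total < small_group_limit:
        # Small groups: a fixed number of containers may restart at once.
        return fixed_count
    # Larger groups: a percentage of the group, rounded down (assumption).
    return total * ratio_percent // 100


if __name__ == "__main__":
    for replicas in (3, 9, 10, 50):
        print(replicas, max_unavailable(replicas))
```

Integer arithmetic is used for the percentage so that the result does not depend on floating-point rounding.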

The execution module 44 may be used to perform a predetermined operation on the applications having the predetermined running state.

Accordingly, the execution module 44 can perform the predetermined operation on the applications that the determination module 43 has determined to have the predetermined running state. For example, when the determination module 43 determines that an application's running state is abnormal or failed, the execution module 44 may restart that application to restore its normal operation.
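
A restart step that respects a per-group limit on simultaneous restarts might look like the sketch below. It is an illustration only: `pick_restarts`, its parameters, and the idea of tracking how many containers are already restarting are hypothetical and are not taken from the source.

```python
from typing import List


def pick_restarts(abnormal: List[str], restarting_now: int, limit: int) -> List[str]:
    """Choose which abnormal applications may be restarted right now.

    `abnormal` is assumed to be ordered with the most urgent application first,
    `restarting_now` is the number of containers already being restarted, and
    `limit` is the maximum number allowed to restart at the same time.
    """
    budget = max(limit - restarting_now, 0)
    # Applications beyond the budget wait for a later round.
    return abnormal[:budget]
```

For a group of 10 with a limit of 2, a probe misconfiguration that marks all 10 replicas as failed would therefore restart at most 2 of them in one round, leaving the rest serving traffic.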

If the user makes a mistake when configuring the detection scheme, the first state acquisition module 41 will inevitably obtain a failure result when the script is executed in the container, even though both the container and the application are in fact running normally. The execution module 44 can therefore also count how many times an application's detection result is a failure within a predetermined time period. If the number of failures within that period reaches a predetermined threshold, the configuration error described above may have occurred, so in that case an alert can be issued to the user instead of restarting the container.
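
The failure-counting rule can be sketched with a sliding time window. The class name `FailureWindow`, the window length, and the threshold are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow; a real controller would likely read a clock instead.

```python
from collections import deque
from typing import Deque


class FailureWindow:
    """Count probe failures in a sliding window and choose restart or alert."""

    def __init__(self, window_s: float, threshold: int) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self._failures: Deque[float] = deque()

    def on_failure(self, now: float) -> str:
        """Record a failure at time `now` and return "alert" or "restart"."""
        self._failures.append(now)
        # Drop failures that fell out of the window.
        while now - self._failures[0] > self.window_s:
            self._failures.popleft()
        # Many failures in a short period suggest a misconfigured probe,
        # so alert the user rather than restarting again.
        if len(self._failures) >= self.threshold:
            return "alert"
        return "restart"


if __name__ == "__main__":
    window = FailureWindow(window_s=60.0, threshold=3)
    for timestamp in (0.0, 10.0, 20.0, 200.0):
        print(timestamp, window.on_failure(timestamp))
```

In the usage example, the third failure within 60 seconds triggers an alert, while the failure at 200 seconds falls into a fresh window and is handled by a restart again.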

The application management apparatus provided by this embodiment of the present application acquires the running state detection results of a group of applications that share the same application identifier, and acquires the container states of the containers corresponding to that group. It then makes a comprehensive judgment based on both the running state detection results and the states of the containers in which the applications run, determines which applications in the group have a predetermined running state, and performs an operation on the applications so determined. Management decisions can therefore take the whole set of applications with the same identifier into account. This avoids managing applications solely on the basis of a single application's running state detection result, which lacks global control and leads to poor stability of the service the group provides.

Embodiment 5

The internal functions and structure of the application management apparatus are described above. The apparatus can be implemented as an electronic device. FIG. 5 is a schematic structural diagram of an embodiment of the electronic device provided by the present application. As shown in FIG. 5, the electronic device includes a memory 51 and a processor 52.

The memory 51 is used to store programs. In addition to the programs mentioned above, the memory 51 may also be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and so on.

The memory 51 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.

The processor 52 is not limited to a central processing unit (CPU); it may also be a processing chip such as a graphics processing unit (GPU), a field-programmable gate array (FPGA), an embedded neural-network processing unit (NPU), or an artificial intelligence (AI) chip. The processor 52 is coupled to the memory 51 and executes the program stored in the memory 51; when run, the program performs the application management method of Embodiment 2 or Embodiment 3 above.

Further, as shown in FIG. 5, the electronic device may also include other components such as a communication component 53, a power supply component 54, an audio component 55, and a display 56. FIG. 5 shows only some of the components schematically, which does not mean that the electronic device includes only the components shown in FIG. 5.

The communication component 53 is configured to facilitate wired or wireless communication between the electronic device and other devices. The electronic device can access a wireless network based on a communication standard, such as WiFi, 3G, 4G, or 5G, or a combination thereof. In one exemplary embodiment, the communication component 53 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 53 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

The power supply component 54 provides power to the various components of the electronic device. The power supply component 54 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device.

The audio component 55 is configured to output and/or input audio signals. For example, the audio component 55 includes a microphone (MIC) that is configured to receive external audio signals when the electronic device is in an operating mode such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 51 or transmitted via the communication component 53. In some embodiments, the audio component 55 also includes a speaker for outputting audio signals.

The display 56 includes a screen, which may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, and some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. An application management method, wherein each application runs in a container, the method comprising: acquiring a running state detection result for each application in at least one group of applications; acquiring the container states of a group of containers in which the group of applications respectively run; determining, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state; and performing a predetermined operation on the applications having the predetermined running state.

2. The application management method according to claim 1, wherein acquiring a running state detection result for each application in at least one group of applications comprises: executing a predetermined command script in the container of each application; and determining the running state detection result of each application according to the execution result of the command script.

3. The application management method according to claim 1, wherein the applications in each group have the same application identifier, and performing a predetermined operation on the applications having the predetermined running state comprises: determining an operation threshold according to the total number of applications in the group, wherein the operation threshold is the maximum number of containers corresponding to the group that are allowed to be restarted at the same time; and managing the group of applications according to the proportion of applications in the group that have the predetermined running state.

4. The application management method according to claim 3, wherein managing the group of applications according to the proportion of applications in the group that have the predetermined running state comprises: when the proportion is smaller than the operation threshold, performing a restart operation on the applications having the predetermined running state.

5. The application management method according to claim 3, wherein determining an operation threshold according to the total number of applications in the group comprises: when the total number of applications in the group is less than a preset threshold, determining a first preset value as the operation threshold; and when the total number of applications in the group is less than a preset threshold, determining the product of a second preset value and the total number of applications as the operation threshold.

6. The application management method according to claim 2, wherein determining, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state comprises: determining the applications having the predetermined running state according to the priority of the combination of the application's running state detection result and the container state.

7. The application management method according to claim 6, wherein the running state detection result of the application comprises: the command script executed successfully, and the command script failed to execute; the container state comprises: container ready, and container unavailable; the combinations comprise: a first combination in which the command script failed to execute and the container is unavailable, and a second combination in which the command script failed to execute and the container is available; and the first combination has a higher priority than the second combination.

8. The application management method according to claim 1, wherein performing a predetermined operation on the applications having the predetermined running state comprises: counting the number of times an application is determined to have the predetermined running state within a predetermined time period; and performing the predetermined operation on the application having the predetermined running state according to that number.

9. An application management apparatus, wherein each application runs in a container, the apparatus comprising: a first state acquisition module, configured to acquire a running state detection result for each application in at least one group of applications; a second state acquisition module, configured to acquire the container states of a group of containers in which the group of applications respectively run; a determination module, configured to determine, according to the running state detection result of each application and the container state of the container in which that application runs, the applications in each group that have a predetermined running state; and an execution module, configured to perform a predetermined operation on the applications having the predetermined running state.

10. An electronic device, comprising: a memory, configured to store a program; and a processor, configured to run the program stored in the memory so as to perform the application management method according to any one of claims 1 to 8.

11. A computer-readable storage medium storing a computer program executable by a processor, wherein the program, when executed by the processor, implements the application management method according to any one of claims 1 to 8.

CN202210126119.8A 2022-02-10 2022-02-10 Application management method and device, electronic device and computer-readable storage medium Active CN114625478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210126119.8A CN114625478B (en) 2022-02-10 2022-02-10 Application management method and device, electronic device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210126119.8A CN114625478B (en) 2022-02-10 2022-02-10 Application management method and device, electronic device and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN114625478A true CN114625478A (en) 2022-06-14
CN114625478B CN114625478B (en) 2025-06-24

Family

ID=81898692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210126119.8A Active CN114625478B (en) 2022-02-10 2022-02-10 Application management method and device, electronic device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN114625478B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183508A (en) * 2015-08-26 2015-12-23 北京元心科技有限公司 Method for monitoring application in container system and intelligent terminal
US20160182315A1 (en) * 2014-12-22 2016-06-23 Rovio Entertainment Ltd. Container manager
US20160371127A1 (en) * 2015-06-19 2016-12-22 Vmware, Inc. Resource management for containers in a virtualized environment
CN109710492A (en) * 2018-12-29 2019-05-03 北方工业大学 Application program operation monitoring method, medium and electronic equipment
CN110647470A (en) * 2019-09-24 2020-01-03 网易(杭州)网络有限公司 Test method and manufacturing method, device, medium and electronic equipment
CN110704166A (en) * 2019-09-30 2020-01-17 北京金山云网络技术有限公司 Service running method, device and server
CN110730135A (en) * 2019-09-06 2020-01-24 平安普惠企业管理有限公司 Method and device for improving performance of server, storage medium and server
CN112346926A (en) * 2020-10-16 2021-02-09 北京金山云网络技术有限公司 Resource state monitoring method and device and electronic equipment
CN112445574A (en) * 2020-11-27 2021-03-05 中国工商银行股份有限公司 Application container multi-cluster migration method and device
CN112925565A (en) * 2019-12-06 2021-06-08 中兴通讯股份有限公司 Application management method, system and server in hybrid cloud environment
CN113971054A (en) * 2021-10-29 2022-01-25 北京金山云网络技术有限公司 Application copy processing method and device and server

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李战: "基于Docker的容器集群调度机制的设计与实现", 中国优秀硕士学位论文全文数据库 (信息科技辑), no. 10, 15 October 2018 (2018-10-15) *
翁湦元;单杏花;阎志远;王雪峰;: "基于Kubernetes的容器云平台设计与实践", 铁路计算机应用, no. 12, 25 December 2019 (2019-12-25) *

Also Published As

Publication number Publication date
CN114625478B (en) 2025-06-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant