CN118838665A - Configuration management method and device for batch server operating system parameters - Google Patents

Configuration management method and device for batch server operating system parameters

Info

Publication number
CN118838665A
Authority
CN
China
Prior art keywords
server
configuration
resource
controller
system configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411311052.0A
Other languages
Chinese (zh)
Inventor
潘峰
肖雪
唐晓东
高传集
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202411311052.0A
Publication of CN118838665A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4406 Loading of operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention relates to the technical field of cloud computing server configuration management, and specifically provides a configuration management method and device for batch server operating system parameters. The method comprises the following steps: S1, registering server node information and reporting it periodically; S2, creating operating system configuration resources and labeling the server nodes by usage category; S3, configuring the resources; S4, the controller detects the declared system configuration resources and selects the server device nodes (DeviceNode) that meet the conditions; S5, the controller sends a Request for the configuration resource to the executor; S6, after receiving the request, the executor checks the dependency conditions for resource creation; S7, the controller synchronizes the result to the status display of the system configuration resource instance; S8, the controller detects changes in the server nodes. Compared with the prior art, the present invention improves the efficiency of configuring server system parameters in batches, reduces operational difficulty, and makes the configuration easy for all personnel to review.

Description

A configuration management method and device for batch server operating system parameters

Technical Field

The present invention relates to the technical field of cloud computing server configuration management, and specifically provides a configuration management method and device for batch server operating system parameters.

Background Art

A cloud center manages large numbers of servers of mixed models. During construction, operation, and maintenance of the cloud platform, environment configuration takes a long time, the configuration is complex, and the models are not unified, which has become a bottleneck restricting the development of the cloud center. The current mainstream approach to managing the configuration of large numbers of servers is to install the operating system and then use an operation and maintenance tool such as Ansible or Salt to modify the operating system configuration in batches. The compatibility of these tools in Xinchuang (domestic information technology innovation) environments is relatively poor. Although this approach achieves a degree of automation, shortcomings remain:

(1) When the operating system is installed on a large number of servers with differing configuration requirements, the system must be installed first, and an operator must then log in to each server individually to customize its configuration, which takes a long time;

(2) The commands and configuration files of the operation and maintenance tools differ across operating system versions, which complicates operation and raises operation and maintenance costs.

Summary of the Invention

In view of the above deficiencies of the prior art, the present invention provides a highly practical configuration management method for batch server operating system parameters.

A further technical task of the present invention is to provide a reasonably designed, safe, and applicable configuration management device for batch server operating system parameters.

The technical solution adopted by the present invention to solve its technical problem is:

A configuration management method for batch server operating system parameters, comprising the following steps:

S1. An executor is deployed on the server when the operating system is installed. After the executor starts, it reports information about the server on which it runs to the controller, registers the server node information, maintains a heartbeat, and reports periodically;

S2. Operating system configuration resources are created, and the server nodes are labeled by usage category;

S3. System configuration resources are declared in batches through cloud-native Kubernetes;

S4. The controller detects the declared system configuration resources and selects the server device nodes (DeviceNode) that meet the conditions;

S5. The controller sends a Request for the configuration resource to the executor;

S6. After receiving the request, the executor checks the dependency conditions for resource creation;

S7. The controller synchronizes the result to the status display of the system configuration resource instance;

S8. The controller detects changes in the server nodes.
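
Step S1 can be pictured with a small in-memory sketch of the controller-side bookkeeping. It is not taken from the patent: the class names, the field names, and the 30-second heartbeat timeout are all assumptions chosen for illustration, and a real controller would receive these reports over the network rather than through direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceNode:
    """Server node record kept by the controller (field names are assumptions)."""
    name: str
    address: str
    os_model: str
    labels: dict = field(default_factory=dict)
    last_heartbeat: float = 0.0


class NodeRegistry:
    """Controller-side registry fed by executor registration and heartbeats."""

    def __init__(self, heartbeat_timeout: float = 30.0):
        # The timeout is an assumed value; the source does not specify one.
        self.heartbeat_timeout = heartbeat_timeout
        self.nodes = {}

    def register(self, node: DeviceNode, now: float) -> None:
        # S1: the executor reports its server's information once it starts.
        node.last_heartbeat = now
        self.nodes[node.name] = node

    def heartbeat(self, name: str, now: float) -> bool:
        # Periodic report; an unknown node has to register first.
        node = self.nodes.get(name)
        if node is None:
            return False
        node.last_heartbeat = now
        return True

    def alive(self, now: float) -> list:
        # Names of nodes whose last heartbeat is within the timeout window.
        return sorted(
            n.name for n in self.nodes.values()
            if now - n.last_heartbeat <= self.heartbeat_timeout
        )


registry = NodeRegistry(heartbeat_timeout=30.0)
registry.register(DeviceNode("node-a", "10.0.0.1", "os-x"), now=0.0)
registry.register(DeviceNode("node-b", "10.0.0.2", "os-y"), now=0.0)
registry.heartbeat("node-a", now=25.0)
print(registry.alive(now=40.0))  # node-b has missed its heartbeat window
```

Timestamps are passed in explicitly so the behavior is deterministic; a deployment would more likely read a monotonic clock.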

Further, the controller checks the preconditions of each resource. If they are not met, it sets the resource status to failed and retries after waiting for a period of time. Once matching completes, it periodically issues tasks to check the system configuration and automatically repairs any inconsistency.
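
The check-and-repair behavior described here can be sketched as a single reconcile pass. This is an illustration under assumptions, not the patent's implementation: the function name, the "Ready" and "Failed" status strings, and the use of plain dictionaries for desired and actual settings are all hypothetical, and the sysctl keys are only sample values.

```python
def reconcile_once(desired: dict, actual: dict, preconditions_met: bool):
    """Run one pass and return (status, repaired_keys).

    `actual` is updated in place; the assignment stands in for whatever
    command the executor would really run on the server.
    """
    if not preconditions_met:
        # The caller is expected to retry after a delay.
        return "Failed", []
    # Keys whose current value differs from the declared value.
    drift = sorted(k for k, v in desired.items() if actual.get(k) != v)
    for key in drift:
        actual[key] = desired[key]
    return "Ready", drift


desired = {"vm.swappiness": "10", "net.ipv4.ip_forward": "1"}
actual = {"vm.swappiness": "60", "net.ipv4.ip_forward": "1"}

print(reconcile_once(desired, actual, preconditions_met=False))
print(reconcile_once(desired, actual, preconditions_met=True))

# Someone changes a value behind the controller's back ...
actual["net.ipv4.ip_forward"] = "0"
# ... and the next periodic pass detects and repairs it.
print(reconcile_once(desired, actual, preconditions_met=True))
print(reconcile_once(desired, actual, preconditions_met=True))
```

Because each pass only touches keys that differ, running it repeatedly on a consistent system changes nothing, which is what makes periodic checking safe.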

Further, the executor is deployed and managed on each server as a Linux service. Each executor is an independent deployment and runtime unit. A plug-in integration scheme continuously extends the server compatibility list, and the executor encapsulates a unified server system configuration API.

Further, in step S4, the controller detects the declared system configuration resources and selects the qualifying server device nodes (DeviceNode) according to the DeviceNodeName or DeviceNodeSelector in the resource;

DeviceNodeName selects a single DeviceNode by the Name of the server device node;

DeviceNodeSelector defines the labels that the resource selects on; from the DeviceNode nodes created in step S2, it filters out the group of nodes that match these labels.
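
The two selection modes can be sketched as follows. The dataclasses and snake_case field names are hypothetical stand-ins for DeviceNode, DeviceNodeName, and DeviceNodeSelector, the `role` label is a made-up example, and the rule that an empty selector matches nothing is an assumption the source does not address.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DeviceNode:
    name: str
    labels: dict = field(default_factory=dict)


@dataclass
class ConfigResource:
    name: str
    device_node_name: Optional[str] = None
    device_node_selector: dict = field(default_factory=dict)


def select_nodes(resource: ConfigResource, nodes: list) -> list:
    """Return the names of the DeviceNodes that a resource targets."""
    if resource.device_node_name:
        # DeviceNodeName: at most one node, matched by name.
        return [n.name for n in nodes if n.name == resource.device_node_name]
    selector = resource.device_node_selector
    if not selector:
        return []  # assumption: an empty selector selects nothing
    # DeviceNodeSelector: every selector label must be present and equal.
    return [n.name for n in nodes
            if all(n.labels.get(k) == v for k, v in selector.items())]


nodes = [
    DeviceNode("node-1", {"role": "compute"}),
    DeviceNode("node-2", {"role": "storage"}),
    DeviceNode("node-3", {"role": "compute", "rack": "r2"}),
]
by_name = ConfigResource("grub-one", device_node_name="node-2")
by_label = ConfigResource("sysctl-compute", device_node_selector={"role": "compute"})

print(select_nodes(by_name, nodes))
print(select_nodes(by_label, nodes))
```

A node may carry extra labels (such as `rack` above) and still match, since only the labels named in the selector are compared.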

Further, in step S5, the controller assembles a Request for the system configuration resource from the system configuration resource instance and the address information of the DeviceNode, and sends it to the executor.
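
A minimal sketch of the request assembly in step S5 is shown below. The field names, the JSON wire format, and the sample address are assumptions; the create, update, and delete operations and the SystemSysctl resource kind are the ones named in the embodiment.

```python
import json
from dataclasses import asdict, dataclass

OPERATIONS = {"create", "update", "delete"}


@dataclass
class ConfigRequest:
    operation: str
    kind: str
    resource_name: str
    spec: dict
    node_name: str
    node_address: str


def build_request(operation: str, instance: dict, node: dict) -> ConfigRequest:
    """Combine a resource instance with the target node's address details."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return ConfigRequest(
        operation=operation,
        kind=instance["kind"],
        resource_name=instance["name"],
        spec=instance["spec"],
        node_name=node["name"],
        node_address=node["address"],
    )


instance = {"kind": "SystemSysctl", "name": "common-sysctl",
            "spec": {"vm.swappiness": "10"}}
node = {"name": "node-1", "address": "10.0.0.11:9000"}

request = build_request("create", instance, node)
payload = json.dumps(asdict(request), sort_keys=True)  # assumed wire format
print(payload)
```

Keeping the node address inside the request means the sending code needs nothing beyond the request itself to know where to deliver it.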

Further, in step S6, the dependency conditions for resource creation are checked, and if they are not met, failure is returned directly;

The processing flow that matches the model of the current system is executed, and the execution result is returned;

Specifically:

(1) A request to configure kernel parameters in SystemGrub is received;

(2) The corresponding server node DeviceNode is checked. If the node has an unfinished configuration task, failure is returned with a message that the node is busy;

(3) If there is currently no configuration task, the executor checks whether the kernel parameters are already configured. If they are, success is returned directly;

Otherwise, the corresponding command is executed to configure the kernel parameters on the server device node, a cache file records that the node must be restarted for the change to take effect, and the server is restarted. After the system starts, the current kernel parameters are checked, a success result is returned, and the cache record is cleared.
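
The SystemGrub flow above can be traced with an in-memory simulation. Nothing here touches a real boot loader: the class, the `pending` attribute standing in for the cache file, the intermediate "pending" state returned before the restart, and the sample kernel arguments are all assumptions made so the sequence can be run and inspected.

```python
class GrubExecutor:
    """In-memory stand-in for an executor handling SystemGrub requests."""

    def __init__(self, running_args):
        self.running_args = set(running_args)  # what the booted kernel uses
        self.grub_args = set(running_args)     # what the boot config requests
        self.pending = None                    # stand-in for the cache file

    def handle(self, wanted):
        wanted = set(wanted)
        if self.pending is not None:
            # Step (2): an earlier task has not finished yet.
            return ("failure", "node busy: unfinished configuration task")
        if wanted <= self.running_args:
            # Step (3): already configured, so the request is a no-op.
            return ("success", "kernel parameters already configured")
        self.grub_args |= wanted               # stand-in for the real command
        self.pending = sorted(wanted)          # record that a reboot is needed
        return ("pending", "reboot required")

    def reboot(self):
        """Simulate the restart and the check that runs once the system is up."""
        self.running_args = set(self.grub_args)
        if self.pending is None:
            return ("success", "nothing to verify")
        ok = set(self.pending) <= self.running_args
        self.pending = None                    # clear the cache record
        if ok:
            return ("success", "verified after reboot")
        return ("failure", "parameters missing after reboot")


ex = GrubExecutor(running_args={"quiet"})
print(ex.handle({"intel_iommu=on"}))  # configured, waiting for reboot
print(ex.handle({"hugepages=16"}))    # rejected while the first task is open
print(ex.reboot())                    # verification after restart
print(ex.handle({"intel_iommu=on"}))  # repeat request is a no-op
```

The early return for already-configured parameters is what keeps repeated requests from triggering repeated restarts.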

Further, in step S7, the controller synchronizes the result to the status display of the system configuration resource instance:

If the request succeeded, the controller updates the resource status, and the processing flow ends;

If the request failed, the controller sets the resource status to failed and, after a set interval of n seconds, enters the retry flow.
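
The status handling in step S7 amounts to recording the outcome and, on failure, scheduling the next attempt. The sketch below assumes a status object with `phase`, `message`, and `retry_at` fields and the phase names "Pending", "Succeeded", and "Failed"; none of these names appear in the source, and the 10-second interval in the demo is just one possible value of n.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstanceStatus:
    phase: str = "Pending"
    message: str = ""
    retry_at: Optional[float] = None  # when the next attempt becomes due


def sync_status(status: InstanceStatus, success: bool, message: str,
                now: float, retry_after: float) -> InstanceStatus:
    """Copy an executor result into the instance status."""
    status.message = message
    if success:
        status.phase = "Succeeded"
        status.retry_at = None
    else:
        status.phase = "Failed"
        status.retry_at = now + retry_after  # retry n seconds from now
    return status


def due_for_retry(status: InstanceStatus, now: float) -> bool:
    return status.retry_at is not None and now >= status.retry_at


status = InstanceStatus()
sync_status(status, False, "node busy", now=100.0, retry_after=10.0)
print(status, due_for_retry(status, now=105.0), due_for_retry(status, now=110.0))

sync_status(status, True, "configured", now=110.0, retry_after=10.0)
print(status, due_for_retry(status, now=200.0))
```

Storing the retry time in the status, rather than sleeping, lets one controller loop serve many resource instances without blocking on any of them.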

Further, the controller monitors server node changes as follows:

(1) When a new server node DeviceNode joins, every system configuration resource re-triggers its reconciliation logic and checks whether the scheduling conditions of DeviceNodeSelector or DeviceNodeName are met. If they are, a new system configuration resource instance is created;

(2) When a server node DeviceNode is deleted, every system configuration resource re-triggers its reconciliation logic and checks whether a resource instance exists for that DeviceNode. If one exists, that system configuration resource instance is deleted, achieving scale-in.

A configuration management device for batch server operating system parameters, comprising: at least one memory and at least one processor;

The at least one memory is used to store a machine-readable program;

The at least one processor is used to call the machine-readable program to execute a configuration management method for batch server operating system parameters.

Compared with the prior art, the configuration management method and device for batch server operating system parameters of the present invention have the following outstanding beneficial effects:

The present invention improves the efficiency of configuring server system parameters in batches, reduces operational difficulty, and makes the configuration easy for all personnel to review. After server system configuration resources are created in batches, the controller has the servers carry out the automated system parameter configuration concurrently.

Because resources are managed declaratively, the execution order does not need to be controlled manually. System parameters are checked and corrected periodically, which guards against the risk of out-of-band modification, and configuration management becomes standardized and visible.

The failure retry mechanism reduces human-computer interaction time and improves efficiency. Abstracting the business model that operators care about reduces configuration difficulty and makes it easier to get started.

The configuration file is simple and contains only the business-related information of the system parameters, which helps system experts review it. The method can also manage operation and maintenance scenarios in which multiple operating systems coexist on mixed-model servers, and can complete the system parameter configuration of large-scale server nodes in a short time; the time complexity of the algorithm is stated to be zero.

Brief Description of the Drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of a configuration management method for batch server operating system parameters;

FIG. 2 is an architecture diagram of a configuration management method for batch server operating system parameters.

Detailed Description

In order to enable those skilled in the art to better understand the solution of the present invention, the present invention is described in further detail below with reference to specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

A preferred embodiment is given below:

As shown in FIG. 2, the controller in this embodiment manages the creation, update, and deletion flows of system configuration resources and supports a fault-tolerance mechanism. Based on the node labels of the server devices, it schedules which nodes each system configuration resource is configured on.

The controller checks the preconditions of each resource. If they are not met, it sets the resource status to failed and retries after 10 seconds until the configuration succeeds. After the configuration is complete, it periodically issues tasks to check the system configuration and automatically repairs any inconsistency. With this method, in large-scale mixed-model server configuration scenarios, a unified model simplifies the server configuration information and automates the configuration management of server system parameters.

The controller runs as a workload in Kubernetes and, through model-matching scheduling of plug-in executors, supports the multiple operating system versions found on mixed-model servers.

The executor is deployed and managed on each server as a Linux service. Each executor is an independent deployment and runtime unit, and a plug-in integration scheme continuously extends the server compatibility list. The executor encapsulates a unified server system configuration API, which lets the controller abstract a unified resource model for operators to use. Differences in system configuration between operating system models are adapted only inside the executor (Agent). An executor can support configuration management for one or more models of server operating system.

As shown in FIG. 1, the configuration management method for batch server operating system parameters in this embodiment has the following steps:

S1. An executor is deployed on the server when the operating system is installed. After the executor starts, it reports information about the server on which it runs to the controller, registers the server node information, maintains a heartbeat, and reports periodically;

S2. Operating system configuration resources are created, and the server nodes are labeled by usage category; one server can carry multiple labels;

S3. System configuration resources are declared (created, updated, deleted) in batches through cloud-native Kubernetes;

S4. The controller detects the declared system configuration resources and selects the qualifying server device nodes (DeviceNode) according to the DeviceNodeName or DeviceNodeSelector in the resource;

Here, DeviceNodeName selects a single DeviceNode by the Name of the server device node;

DeviceNodeSelector defines the labels that the resource selects on; from the DeviceNode nodes created in step S2, it filters out the group of nodes that match these labels;

S5. The controller assembles a Request (create, update, delete) for the system configuration resource from the system configuration resource instance and the address information of the DeviceNode, and sends it to the executor;

S6. After receiving the request, the executor checks the dependency conditions for resource creation;

If the dependency conditions for resource creation are not met, failure is returned directly;

The processing flow that matches the model of the current system is executed, and the execution result (success, failure, descriptive message) is returned.

An example scenario is as follows:

(1) A request to configure kernel parameters in SystemGrub is received;

(2) The corresponding server node DeviceNode is checked. If the node has an unfinished configuration task, failure is returned with a message that the node is busy;

(3) If there is currently no configuration task, the executor checks whether the kernel parameters are already configured. If they are, success is returned directly. Otherwise, the corresponding command is executed to configure the kernel parameters on the server device node, a cache file records that the node must be restarted for the change to take effect, and the server is restarted. After the system starts, the current kernel parameters are checked, a success result is returned, and the cache record is cleared.

S7. The controller synchronizes the result to the status display of the system configuration resource instance;

If the request succeeded, the controller updates the resource status, and the processing flow ends;

If the request failed, the controller sets the resource status to failed and, after a set interval of n seconds, enters the retry flow.

S8. The controller detects changes in the server nodes;

(1) When a new server node DeviceNode joins, every system configuration resource re-triggers its reconciliation logic and checks whether the scheduling conditions of DeviceNodeSelector or DeviceNodeName are met. If they are, a new system configuration resource instance is created, achieving scale-out;

An example scenario is as follows:

a. Initially, three nodes are registered with the label common-sysctl: enabled, and the server system configuration resource SystemSysctl is created;

b. Three new DeviceNode nodes then join, also labeled common-sysctl: enabled. The controller automatically applies the server system configuration resource to these three DeviceNode nodes, with no manual handling.

(2) When a server node DeviceNode is deleted, every system configuration resource re-triggers its reconciliation logic and checks whether a resource instance exists for that DeviceNode. If one exists, that system configuration resource instance is deleted, achieving scale-in.
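
The scale-out and scale-in behavior of step S8 can be replayed with the common-sysctl: enabled scenario. The sketch is an assumption about the reconciliation logic, not the patent's code: nodes are a plain dictionary of name to labels, instances are tracked as a set of node names, and the node names and function names are invented.

```python
def matches(labels: dict, selector: dict) -> bool:
    """True when every selector label is present on the node with that value."""
    return all(labels.get(k) == v for k, v in selector.items())


def reconcile(selector: dict, nodes: dict, instances: set):
    """Bring the instance set in line with the current nodes.

    Returns (created, deleted) node names and updates `instances` in place.
    """
    wanted = {name for name, labels in nodes.items() if matches(labels, selector)}
    created = sorted(wanted - instances)
    deleted = sorted(instances - wanted)
    instances |= wanted
    instances -= set(deleted)
    return created, deleted


selector = {"common-sysctl": "enabled"}  # selector of the SystemSysctl resource
nodes = {f"node-{i}": {"common-sysctl": "enabled"} for i in range(1, 4)}
instances = set()
print(reconcile(selector, nodes, instances))  # three initial instances

# Three more labeled nodes register: scale-out with no manual handling.
nodes.update({f"node-{i}": {"common-sysctl": "enabled"} for i in range(4, 7)})
print(reconcile(selector, nodes, instances))

# One node is deleted: its instance is removed (scale-in).
del nodes["node-2"]
print(reconcile(selector, nodes, instances))
```

Because the function compares the wanted set with the existing set on every run, it does not matter whether it is triggered by a node joining, a node leaving, or a periodic timer.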

Based on the above method, the configuration management device for batch server operating system parameters in this embodiment comprises: at least one memory and at least one processor;

The at least one memory is used to store a machine-readable program;

The at least one processor is used to call the machine-readable program to execute a configuration management method for batch server operating system parameters.

The above specific embodiments are only particular cases of the present invention. The scope of patent protection of the present invention includes but is not limited to these embodiments. Any technical solution consistent with the claims of the present invention, and any appropriate change or substitution made to it by a person of ordinary skill in the art, shall fall within the scope of patent protection of the present invention.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principles and spirit of the present invention, and that the scope of the present invention is defined by the appended claims and their equivalents.

Claims (9)

1. A configuration management method for batch server operating system parameters is characterized by comprising the following steps:
S1, after a deployment executor is started when a system is installed by a server, reporting relevant information of the server where the executor is located to the controller, registering server node information, maintaining heartbeat and reporting periodically;
S2, creating an operating system configuration resource to label the server node according to the purpose classification;
s3, the cloud native kubernetes batch declaration system configures resources;
S4, the controller monitors the declared system configuration resources and selects server equipment nodes DeviceNode which meet the conditions;
s5, a Request of the controller for configuring resources is sent to an executor;
S6, checking the dependency condition of resource creation after the executor receives the request;
s7, the controller synchronizes the result to the state display of the system configuration resource instance;
S8, the controller detects the change of the server node.
2. The method of claim 1, wherein the controller checks the preconditions of each resource, if they are not satisfied, sets the resource status as failed, and retries after waiting for a period of time until the matching is completed, and if there is no match, automatically repairs the system configuration by periodically issuing a task to check the system configuration.
3. The method for managing the configuration of the operating system parameters of the batch server according to claim 2, wherein the executors are deployed and managed on each server by a Linux service mode, each executor is an independent deployment running unit, and compatible lists of the servers are continuously expanded by a plug-in integration scheme, and unified server system configuration interface APIs are packaged.
4.A method of configuration management of batch server operating system parameters according to claim 3, wherein in step S4, the controller monitors declared system configuration resources and selects eligible server appliance nodes DeviceNode based on DeviceNodeName or DeviceNodeSelector in the resources;
DeviceNodeName is that a single DeviceNode is selected according to the Name of the server equipment node;
and DeviceNodeSelector, defining a label to be selected by the resource, and screening a group of nodes matched with the label from DeviceNode nodes created in the step S2.
5. The method according to claim 4, wherein in step S5, the controller assembles a Request for system configuration resources according to the system configuration resource instance and DeviceNode address-related information to the executor.
6. The method according to claim 5, wherein in step S6, the dependency condition of resource creation is checked, and if not, failure is returned directly;
Executing a corresponding processing flow according to the model matching of the current system, and returning an execution result;
The method comprises the following steps:
(1) Receiving a request for configuring kernel parameters at SystemGrub;
(2) Checking the corresponding server node DeviceNode, finding that the current node has an incomplete configuration task, returning to failure, and displaying that the current node is busy;
(3) If no configuration task exists at present, checking whether the kernel parameters are configured, if so, returning success directly;
otherwise, executing the corresponding command, completing configuration of kernel parameters on the server equipment node, recording that the current node needs to be restarted to take effect through the cache file, restarting the server, checking the current kernel parameters after the system is started, returning a success result, and cleaning the cache record.
7. The method according to claim 6, wherein in step S7, the controller synchronizes the result to the status display of the system configuration resource instance:
If successful, the controller updates the resource state, and the processing flow is ended;
If the resource state fails, the controller sets the resource state as failed, and after n seconds are set, the retry flow is entered.
8. The method according to claim 7, wherein the controller monitors changes of the server nodes as follows:
(1) When a server node DeviceNode is added, each system configuration resource re-triggers the coordination logic and checks whether the scheduling condition of DeviceNodeSelector or DeviceNodeName is met, and if so, a new system configuration resource instance is created;
(2) When a server node DeviceNode is deleted, each system configuration resource re-triggers the coordination logic and checks whether a resource instance for that DeviceNode exists, and if so, the system configuration resource instance is deleted, thereby scaling in.
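The node-change handling in (1) and (2) can be sketched as one function that recomputes the desired set of instances. The dictionary layout, the key spellings `deviceNodeName` and `deviceNodeSelector`, label-equality matching, and the names `matches` and `reconcile` are assumptions for this example; only the two scheduling fields come from the claim:

```python
def matches(resource: dict, node: dict) -> bool:
    """Check the scheduling condition: an explicit node name, else a label selector."""
    if resource.get("deviceNodeName"):
        return resource["deviceNodeName"] == node["name"]
    selector = resource.get("deviceNodeSelector") or {}
    labels = node.get("labels", {})
    # An empty selector matches nothing in this sketch (an assumption).
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def reconcile(resource: dict, nodes: list, instances: dict) -> dict:
    """Return the instances (keyed by node name) that should exist after a node change."""
    desired = {n["name"] for n in nodes if matches(resource, n)}
    # Scale in: drop instances whose node was deleted or no longer matches.
    result = {name: inst for name, inst in instances.items() if name in desired}
    # Scale out: create an instance for every newly matching node.
    for name in desired - set(result):
        result[name] = {"resource": resource["name"], "node": name, "phase": "Pending"}
    return result


resource = {"name": "grub-hugepages", "deviceNodeSelector": {"role": "db"}}
nodes = [{"name": "n1", "labels": {"role": "db"}},
         {"name": "n2", "labels": {"role": "web"}}]
instances = reconcile(resource, nodes, {})
```

Calling `reconcile` again after appending or removing a node yields the updated instance set, and existing instances for unchanged nodes are carried over untouched.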
9. A configuration management apparatus for batch server operating system parameters, comprising: at least one memory and at least one processor;
the at least one memory being configured to store a machine-readable program;
the at least one processor being configured to invoke the machine-readable program to perform the method of any of claims 1 to 8.
CN202411311052.0A 2024-09-20 2024-09-20 Configuration management method and device for batch server operating system parameters Pending CN118838665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411311052.0A CN118838665A (en) 2024-09-20 2024-09-20 Configuration management method and device for batch server operating system parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411311052.0A CN118838665A (en) 2024-09-20 2024-09-20 Configuration management method and device for batch server operating system parameters

Publications (1)

Publication Number Publication Date
CN118838665A true CN118838665A (en) 2024-10-25

Family

ID=93149641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411311052.0A Pending CN118838665A (en) 2024-09-20 2024-09-20 Configuration management method and device for batch server operating system parameters

Country Status (1)

Country Link
CN (1) CN118838665A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105141466A (en) * 2015-09-25 2015-12-09 浪潮(北京)电子信息产业有限公司 Operating system automation deployment method based on cloud platform and system thereof
US20160094625A1 (en) * 2014-01-21 2016-03-31 Oracle International Corporation System and method for dynamic clustered jms in an application server environment
CN110825392A (en) * 2019-10-31 2020-02-21 北京深之度科技有限公司 Customization method, batch deployment method and batch deployment system of operating system
CN111371589A (en) * 2020-02-16 2020-07-03 苏州浪潮智能科技有限公司 Batch deployment method and system for server operating systems
US20210365290A1 (en) * 2020-04-16 2021-11-25 Nanjing University Of Posts And Telecommunications Multidimensional resource scheduling method in kubernetes cluster architecture system
CN115225475A (en) * 2022-07-04 2022-10-21 浪潮云信息技术股份公司 Automatic configuration management method, system and device for server network
CN118075105A (en) * 2024-01-31 2024-05-24 深圳前海环融联易信息科技服务有限公司 Method, device, computer and storage medium for configuring server node

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈英达;温柏坚;林强;唐亮亮;李凯;: "基于Cobbler中央调度控制技术的微机服务器自动化远程批量部署研究与应用", 自动化与仪器仪表, no. 04, 25 April 2018 (2018-04-25) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination