CN111290854A - Task management method, device and system, computer storage medium and electronic equipment - Google Patents

Info

Publication number
CN111290854A
Authority
CN
China
Prior art keywords
task
node
executed
service
execution
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010064754.9A
Other languages
Chinese (zh)
Other versions
CN111290854B (en)
Inventor
曹智颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010064754.9A
Publication of CN111290854A
Application granted
Publication of CN111290854B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a task management method, a task management apparatus, and a distributed task management system. The method is applied to a distributed task management system that includes a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server. The method includes: each task scheduling service monitors the parent node of the nodes; when a deleted first node is detected among the nodes, a target task scheduling service creates a second node corresponding to the first node, the target task scheduling service corresponding to the second node being different from the task scheduling service corresponding to the first node; and the target task scheduling service obtains the task information corresponding to the first node from a database and adds the task information to the task queue corresponding to the target task scheduling service. In the present disclosure, a normally working task scheduling service takes over the tasks of a failed task scheduling service, which ensures that tasks are not lost and improves the availability of the task management system.

Figure 202010064754

Description

Task management method, apparatus, and system, computer storage medium, and electronic device

Technical Field

The present disclosure relates to the technical field of cloud computing, and in particular to a task management method, a task management apparatus, a distributed task management system, a computer storage medium, and an electronic device.

Background

Task management systems are widely used for computing tasks that take a long time to execute, such as offline speech recognition, big data statistics, and machine learning model training. A task management system manages these long-running computing tasks: it issues tasks, monitors task progress, stores task status, and so on.

A traditional task management system consists mainly of a console, a task management service, and a task execution service. The user issues a task to the task management service through the console, and the task management service forwards the task to the task execution service. On receiving the task, the task execution service starts executing it and updates the execution progress in a database. The task management service can also display the task execution status on the console in real time for the user to query. The traditional architecture is simple and suits only atomic tasks, that is, tasks that involve a single task execution service. When a task consists of multiple subtasks with a defined order, and those subtasks must be executed by several different task execution services, a traditional task management system inevitably reduces task execution efficiency, and the availability of the task management service is low.

It should be noted that the information disclosed in the Background section above is intended only to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art already known to a person of ordinary skill in the art.

Summary of the Invention

Embodiments of the present disclosure provide a task management method, a task management apparatus, a task management system, a computer storage medium, and an electronic device that can, at least to some extent, ensure that tasks are not lost and achieve high availability of the task management system.

Other features and advantages of the present disclosure will become apparent from the following detailed description, or may be learned in part through practice of the present disclosure.

According to one aspect of the embodiments of the present disclosure, a task management method is provided. The method is applied to a distributed task management system that includes a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server, where the task scheduling services manage and schedule users' tasks. The method includes: each task scheduling service monitors the parent node of the nodes; when a deleted first node is detected among the nodes, a target task scheduling service creates a second node corresponding to the first node, the target task scheduling service corresponding to the second node being different from the task scheduling service corresponding to the first node; and the target task scheduling service obtains the task information corresponding to the first node from a database and adds the task information to the task queue corresponding to the target task scheduling service.
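The takeover step described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: the coordination service is modelled as a plain dictionary of child nodes, and the names `SchedulingService`, `take_over`, the record fields, and the `-takeover` node naming are all assumptions made for illustration.

```python
from collections import deque


class SchedulingService:
    """A task scheduling service with its own in-memory task queue."""

    def __init__(self, service_id):
        self.service_id = service_id
        self.queue = deque()


def take_over(children, deleted_node, target, database):
    """React to the deletion of `deleted_node` under the watched parent node.

    children: dict of node name -> owning service id (the parent's child nodes)
    database: list of task records, each a dict with "task_id" and "node"
    Returns the name of the newly created second node.
    """
    failed_service_id = children.pop(deleted_node)
    if target.service_id == failed_service_id:
        raise ValueError("the target service must differ from the failed one")

    # Create the second node that corresponds to the deleted first node.
    second_node = f"{deleted_node}-takeover"  # hypothetical naming scheme
    children[second_node] = target.service_id

    # Load the first node's task information and queue it on the target.
    for record in database:
        if record["node"] == deleted_node:
            target.queue.append(record["task_id"])
    return second_node
```

In a real deployment the deletion would be reported by a watch on the parent node in the coordination service rather than passed in as an argument.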

According to one aspect of the embodiments of the present disclosure, a task management apparatus is provided. The apparatus is applied to a distributed task management system that includes a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server, where the task scheduling services manage and schedule users' tasks. The apparatus includes: a monitoring module, by which each task scheduling service monitors the parent node of the nodes; a node creation module, configured to create, through a target task scheduling service, a second node corresponding to a first node when the first node is detected to have been deleted from the nodes, the target task scheduling service corresponding to the second node being different from the task scheduling service corresponding to the first node; and a task takeover module, by which the target task scheduling service obtains the task information corresponding to the first node from a database and adds the task information to the task queue corresponding to the target task scheduling service.

In some embodiments of the present disclosure, based on the foregoing solution, the task management apparatus further includes a registration module, configured to register each task scheduling service in a distributed coordination service system so as to obtain, for each task scheduling service, a corresponding node with a distinct node number.
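As a rough illustration of registration with distinct node numbers, the sketch below uses an in-memory stand-in for the coordination service. The class `CoordinationService`, the `/schedulers` parent path, and the zero-padded numbering are assumptions; the text does not specify them.

```python
import itertools


class CoordinationService:
    """In-memory stand-in for a distributed coordination service (hypothetical)."""

    def __init__(self, parent="/schedulers"):
        self.parent = parent
        self.children = {}                  # node path -> service id
        self._counter = itertools.count(1)  # source of unique node numbers

    def register(self, service_id):
        """Create a child node for the service and return its path."""
        # Each registration receives the next unused node number.
        path = f"{self.parent}/node-{next(self._counter):04d}"
        self.children[path] = service_id
        return path
```

Because every service ends up as a child of the same parent node, a single watch on that parent is enough for each service to notice when a sibling node disappears.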

In some embodiments of the present disclosure, based on the foregoing solution, the task management apparatus further includes an identifier modification module, configured to change the task scheduling service identifier associated with the task information to the identifier of the target task scheduling service.
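The identifier change can be pictured as a small update over the stored task records. This is only a sketch: the record layout and the field name `scheduler_id` are assumed, and a real system would issue the equivalent update against its database.

```python
def reassign_tasks(records, failed_service_id, target_service_id):
    """Point every task owned by the failed service at the target service.

    records: list of dicts with "task_id" and "scheduler_id" (assumed fields)
    Returns the ids of the tasks that changed owner.
    """
    moved = []
    for record in records:
        if record["scheduler_id"] == failed_service_id:
            record["scheduler_id"] = target_service_id
            moved.append(record["task_id"])
    return moved
```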

In some embodiments of the present disclosure, the distributed task management system further includes a task management server and a task execution server. Based on the foregoing solution, the task management apparatus further includes: a request generation module, configured to receive, through the task scheduling server, a task to be executed sent by the task management server, and to form a task execution request from that task; and a task execution module, configured to send the task execution request to the task execution server so that a task execution service in the task execution server executes the task.

In some embodiments of the present disclosure, based on the foregoing solution, the request generation module includes: a task adding unit, configured to add the task to be executed to the task queue corresponding to the task scheduling service and to determine whether the thread pool of the task scheduling service contains an idle thread; a task pulling unit, configured to pull the task to be executed from the task queue corresponding to the task scheduling service when an idle thread exists, and to obtain the subtasks of that task; and a request generating unit, configured to form the task execution request from the subtask and its attribute information.

In some embodiments of the present disclosure, based on the foregoing solution, the request generating unit is configured to: determine whether the subtask is a subtask to be executed; and, when it is, form the task execution request from that subtask.
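The queue, idle-thread check, and request formation described above might look roughly like the following sketch. The idle-thread count is passed in as a number rather than read from a real thread pool, and the task layout (`subtasks`, `status`, `type`) and the `"pending"` status value are assumptions.

```python
from collections import deque


def build_requests(task_queue, idle_threads):
    """Pull one queued task per idle thread and form execution requests.

    task_queue:   deque of tasks, each a dict with "task_id" and an ordered
                  list of "subtasks" (dicts with "name", "type", "status")
    idle_threads: number of idle threads currently available in the pool
    """
    requests = []
    while idle_threads > 0 and task_queue:
        task = task_queue.popleft()
        idle_threads -= 1
        # Subtasks are ordered, so take the first one still waiting to run.
        current = next(
            (s for s in task["subtasks"] if s["status"] == "pending"), None
        )
        if current is not None:
            requests.append({
                "task_id": task["task_id"],
                "subtask": current["name"],
                "type": current["type"],
            })
    return requests
```

Tasks beyond the number of idle threads stay in the queue until a thread frees up.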

In some embodiments of the present disclosure, there are multiple task execution services, each with a different task type label. Based on the foregoing solution, the task execution module is configured to: match the task type of the task to be executed in the task execution request against the task type labels to determine a target task execution service; and send the task execution request to the target task execution service so that the target task execution service executes the task.
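Label matching amounts to a lookup from task type to execution service. In this sketch the services are plain callables in a dictionary keyed by label; that representation, and the function names, are assumptions for illustration only.

```python
def select_execution_service(request, services_by_label):
    """Return the execution service whose label matches the request's task type."""
    service = services_by_label.get(request["type"])
    if service is None:
        raise LookupError(
            f"no task execution service labelled {request['type']!r}"
        )
    return service


def dispatch(request, services_by_label):
    """Send the request to the matching service and return its result."""
    return select_execution_service(request, services_by_label)(request)
```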

In some embodiments of the present disclosure, based on the foregoing solution, the task management apparatus is further configured to: query whether the subtask to be executed has completed; if it has completed, update the task progress information corresponding to the task in the database; and if it has not completed, add the task back to the task queue corresponding to the task scheduling service.

In some embodiments of the present disclosure, based on the foregoing solution, the task management apparatus is further configured to: determine whether the task to be executed contains any unexecuted subtask; if so, execute each unexecuted subtask once it becomes the subtask to be executed, until no unexecuted subtask remains in the task; and if not, update the task status information corresponding to the task in the database.
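One way to picture the completion check and the follow-up bookkeeping is the sketch below. It assumes the caller has already polled the execution service and passes the result in as `subtask_finished`; re-queuing a task to run its remaining subtasks, the `progress` fraction, and the status strings are illustrative choices, not details given in the text.

```python
from collections import deque


def handle_poll(task, subtask_finished, task_queue, database):
    """Apply the result of polling the subtask that was last dispatched.

    Assumes the task still has at least one subtask not marked "done".
    database: dict of task_id -> record holding progress and status
    """
    if not subtask_finished:
        task_queue.append(task)          # not done yet: check again later
        return "requeued"

    current = next(s for s in task["subtasks"] if s["status"] != "done")
    current["status"] = "done"
    done = sum(s["status"] == "done" for s in task["subtasks"])
    total = len(task["subtasks"])
    record = database.setdefault(task["task_id"], {})
    record["progress"] = done / total    # task progress information

    if done < total:
        task_queue.append(task)          # unexecuted subtasks remain
        return "in_progress"
    record["status"] = "finished"        # task status information
    return "finished"
```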

In some embodiments of the present disclosure, based on the foregoing solution, the task management apparatus is further configured such that: each task scheduling service updates the task count stored in its node according to the number of tasks in its task queue; and the task management server reads the task count from each node, determines task allocation weights from the task counts, and allocates tasks based on those weights.
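The text does not give the formula that turns task counts into allocation weights. As one plausible sketch, the code below assumes a weight inversely proportional to the queue length plus one, normalised so the weights sum to 1, and then picks the node with the largest weight. Both function names are hypothetical.

```python
def allocation_weights(task_counts):
    """Map node -> weight, giving lightly loaded nodes a larger share.

    task_counts: dict of node name -> number of queued tasks
    The inverse-load formula is an assumption, not taken from the source.
    """
    raw = {node: 1.0 / (count + 1) for node, count in task_counts.items()}
    total = sum(raw.values())
    return {node: value / total for node, value in raw.items()}


def pick_scheduler(task_counts):
    """Choose the node with the highest allocation weight."""
    weights = allocation_weights(task_counts)
    return max(weights, key=weights.get)
```

A weighted random choice over the same weights would spread bursts of new tasks more smoothly than always picking the maximum.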

In some embodiments of the present disclosure, the distributed task management system further includes a console. Based on the foregoing solution, the task management apparatus is further configured such that: the console responds to a first trigger operation by the user by entering the task to be executed; or the console responds to a second trigger operation by the user by generating a task progress query request and sending it to the task management server to obtain task progress information.

According to one aspect of the embodiments of the present disclosure, a distributed task management system is provided, comprising: a console, configured to generate a task issuing instruction or a task progress query request in response to a trigger operation by the user; a task management server, connected to the console and configured to issue a task in response to the task issuing instruction, or to return task progress information to the console in response to the task progress query request; a task scheduling server, connected to the task management server and containing a plurality of task scheduling services, each of which registers in a distributed coordination service system to form a node, the task scheduling server being configured to receive the tasks issued by the task management server and to schedule and manage them through the task scheduling services; a task execution server, connected to the task scheduling server and containing a plurality of task execution services, configured to execute tasks in response to task execution requests sent by the task scheduling services; and a data storage device, connected to the task management server and the task scheduling server and configured to store information related to the tasks.

According to one aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the task management method described in the foregoing embodiments.

According to one aspect of the embodiments of the present disclosure, an electronic device is provided, including one or more processors and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, they cause the one or more processors to perform the task management method described in the foregoing embodiments.

In the technical solutions provided by the embodiments of the present disclosure, the multiple task scheduling services in the task management system register in a distributed coordination service framework, forming a node for each task scheduling service; in other words, the task management system is a multi-node distributed task management system. Each task scheduling service monitors the parent node of the nodes. When a first node is deleted, a target task scheduling service creates a second node corresponding to the first node, obtains the task information corresponding to the first node from the database, and adds that task information to its own task queue. In this way, when a task scheduling service fails, another normally working task scheduling service takes over the failed service's tasks, which ensures that tasks are not lost and improves the availability of the task management system.

It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its principles. The drawings described below are merely some embodiments of the present disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:

FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure can be applied;

FIG. 2 schematically shows the architecture of a traditional task management system in the related art;

FIG. 3 schematically shows a flowchart of a task management method according to an embodiment of the present disclosure;

FIG. 4 schematically shows the architecture of a distributed task management system according to an embodiment of the present disclosure;

FIG. 5 schematically shows the distribution architecture of the task scheduling services in a distributed task processing system according to an embodiment of the present disclosure;

FIG. 6 schematically shows the structure of the task information stored in a data storage device according to an embodiment of the present disclosure;

FIG. 7 schematically shows the relationship between task scheduling services when a task scheduling service fails, according to an embodiment of the present disclosure;

FIG. 8 schematically shows the relationship between task scheduling services after task recovery, according to an embodiment of the present disclosure;

FIG. 9 schematically shows the structure of the recovered task information stored in a data storage device according to an embodiment of the present disclosure;

FIG. 10 schematically shows the execution flow of a task scheduling service according to an embodiment of the present disclosure;

FIG. 11 schematically shows a block diagram of a task management apparatus according to an embodiment of the present disclosure;

FIG. 12 shows a schematic structural diagram of a computer system suitable for implementing an electronic device of an embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. Those skilled in the art will appreciate, however, that the technical solutions of the present disclosure may be practiced without one or more of these specific details, or with other methods, components, devices, steps, and so on. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the present disclosure.

The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks, processor devices, and/or microcontroller devices.

The flowcharts shown in the drawings are only exemplary illustrations; they need not include all contents and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps can be split, and others can be combined or partially combined, so the actual order of execution may change according to the actual situation.

FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure may be applied.

As shown in FIG. 1, the system architecture 100 may include a task management front end 101, a network 102, and a task management back end 103. The network 102 is the medium that provides the communication link between the task management front end 101 and the task management back end 103. The network 102 may include various connection types, such as wired communication links and wireless communication links.

It should be understood that the numbers of task management front ends, networks, and task management back ends in FIG. 1 are merely illustrative; there can be any number of each, according to actual needs. For example, the task management back end 103 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The task management front end 101 may be a terminal device such as a notebook computer, a tablet computer, or a desktop computer, but is not limited to these.

In an embodiment of the present disclosure, a user may enter task information in the task management front end 101, or send a task progress query request through it; the task information and the task progress query request are sent to the task management back end 103 through the network 102. The task management back end 103 includes a task management server, a task scheduling server, and a task execution server. The task scheduling server contains multiple task scheduling services for managing and scheduling the user's tasks; the task execution server contains multiple task execution services for executing the received tasks. After receiving a task issued by the task management front end 101, the task management server delivers the task to the task scheduling server, where the task scheduling services schedule the tasks and invoke the corresponding task execution services to execute them. If a task scheduling service fails while scheduling tasks, a task scheduling service that has not failed can, based on the distributed coordination service system, take over the tasks scheduled by the failed service, so that no task is lost.

In addition, when the task management server delivers tasks to the task scheduling server, it can dynamically adjust the delivery weights according to the number of tasks in the task queue of each task scheduling service, so that tasks are distributed evenly across the task scheduling services. The technical solutions of the embodiments of the present disclosure set up multiple task scheduling services, and nodes corresponding to those services, in the task management system to form a distributed task management system. Multiple tasks can be executed simultaneously in this system, which improves task execution efficiency. The system also features service self-registration and fault self-healing, which ensures that tasks are not lost, and weighted task delivery keeps the system running smoothly, further improving the availability of the task management system.
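
The weighted delivery described above can be sketched as follows. The disclosure does not give a weighting formula, so this sketch assumes, purely for illustration, that the weight of each task scheduling service is 1 / (1 + queue length), normalized to sum to 1. The names `dispatch_weights`, `pick_service`, and the broker names are hypothetical.

```python
import random

def dispatch_weights(queue_lengths):
    """Map each scheduling service to a delivery weight.

    Assumption (not from the disclosure): weight = 1 / (1 + queue length),
    normalized to sum to 1, so a shorter queue receives a larger share.
    """
    raw = {name: 1.0 / (1 + length) for name, length in queue_lengths.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

def pick_service(queue_lengths, rng=random):
    """Choose a scheduling service at random according to the weights."""
    weights = dispatch_weights(queue_lengths)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical queue sizes for three scheduling services.
queues = {"Broker1": 0, "Broker2": 4, "Broker3": 9}
weights = dispatch_weights(queues)
chosen = pick_service(queues, random.Random(0))
```

Recomputing the weights each time a task is delivered is what makes the adjustment dynamic: as a queue grows, its share of new tasks shrinks.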

It should be noted that the task management method provided by the embodiments of the present disclosure is generally executed by a server, and accordingly the task management apparatus is generally provided in the server. However, in other embodiments of the present disclosure, the task management method may also be executed by a terminal device.

In the related art, FIG. 2 shows an architecture diagram of a traditional task management system. As shown in FIG. 2, the traditional task management system includes a console 201, a task management server 202, a task execution server 203, and a database 204. The task execution flow is as follows: the user issues a task to the task management server 202 through the console 201; the task management server 202 stores the task information in the database 204 and delivers the task to the task execution server 203; after receiving the task, the task execution server 203 starts executing it and periodically updates the execution progress in the database 204; the user sends a task progress query request to the task management server 202 through the console 201, and the task management server 202 responds by displaying the task execution status from the database 204 on the console 201 in real time for the user to view.

However, the traditional task management system has a simple architecture and is suited to atomic tasks. If a task consists of multiple subtasks that must run in sequence, and these subtasks are executed by several different task execution services, then to make the task management system highly available the task management service must be deployed as multiple nodes, with tasks managed by different nodes. This in turn raises problems such as how to coordinate task allocation among the nodes and how to handle a node's tasks when that node fails.

In view of the problems in the related art, an embodiment of the present disclosure provides a task management method based on cloud technology. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and so on that are applied under the cloud computing business model; these can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The back-end services of technical network systems, such as video websites, picture websites, and portal websites, require large amounts of computing and storage resources. As the Internet industry develops and its applications spread, every item may in the future carry its own identifier that must be transmitted to a back-end system for logical processing; data of different levels will be processed separately, and industry data of all kinds will need strong system support, which can only be achieved through cloud computing.

Cloud computing is a computing model that distributes computing tasks across a resource pool composed of a large number of computers, so that application systems can obtain computing power, storage space, and information services as needed. The network that provides the resources is called the "cloud". To users, the resources in the "cloud" appear infinitely expandable: they can be obtained at any time, used on demand, expanded at any time, and paid for according to use.

A provider of basic cloud computing capabilities establishes a cloud computing resource pool (a cloud platform for short, generally called an IaaS (Infrastructure as a Service) platform) and deploys multiple types of virtual resources in the pool for external customers to choose from. The cloud computing resource pool mainly includes computing devices (virtualized machines, including operating systems), storage devices, and network devices.

Divided by logical function, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed on the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is the platform on which software runs, such as databases and web containers. SaaS is business software of all kinds, such as web portals and bulk SMS senders. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.

Cloud processing is commonly applied in the field of big data. Big data refers to data sets that cannot be captured, managed, and processed by conventional software tools within a given time frame; it is a massive, fast-growing, and diverse information asset that requires new processing models to deliver stronger decision-making, insight discovery, and process optimization. With the advent of the cloud era, big data has attracted more and more attention. Big data requires special techniques to process large volumes of data effectively within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.

An embodiment of the present disclosure first provides a task management method. FIG. 3 schematically shows a flowchart of the task management method according to an embodiment of the present disclosure. The method may be executed by a task management front end and a task management back end, specifically by a terminal device and a server. The method is applied to a distributed task management system that includes a task scheduling server and nodes corresponding to the multiple task scheduling services in the task scheduling server, where the task scheduling services manage and schedule the user's tasks. Referring to FIG. 3, the task management method includes at least steps S310 to S330, described in detail as follows:

In step S310, each task scheduling service monitors the parent node corresponding to its node.

In an embodiment of the present disclosure, the task management system is a distributed task management system that includes a task scheduling server. The task scheduling server contains multiple task scheduling services, which together form a task scheduling cluster, and each task scheduling service registers in the distributed coordination service system to form a node. FIG. 4 shows a schematic architecture diagram of the distributed task management system. As shown in FIG. 4, the distributed task management system 400 includes a console 401, a task management server 402, a task scheduling server 403, a task execution server 404, and a data storage device 405.

The console 401 may specifically be a Web console that generates a task issuing instruction or a task progress query request in response to a user's trigger operation; that is, through the console the user can issue tasks, monitor task status, and query task execution results. The task management server 402, also called the task management background, is the back-end service of the console 401 and is connected to it; it issues tasks in response to task issuing instructions sent by the console 401, or feeds task progress information back to the console 401 in response to task progress query requests. The task scheduling server 403 is connected to the task management server 402 and contains a task scheduling service cluster made up of multiple task scheduling services. Each task scheduling service registers in the distributed coordination service framework to form a node, receives tasks issued by the task management server 402, and is responsible for scheduling each task through its entire life cycle; the task scheduling server 403 can also update the task execution progress in the data storage device 405 in real time. The task execution server 404 is connected to the task scheduling server 403 and contains a task execution service cluster made up of multiple task execution services. Each task execution service is an atomic computing-task execution service that receives the subtask execution requests submitted by a task scheduling service and executes the corresponding computing task. The data storage device 405 may be a hard disk or another medium capable of storing data, and stores the data submitted by the task management server 402 and the task scheduling server 403.

In the embodiments of the present disclosure, each task execution service has a different task type label, and the user marks the type of each task when entering it through the console 401. When the task execution server 404 receives a task, it matches the type of the task against the task type labels of the task execution services to determine the matching task execution service, which then executes the task.
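
The label matching described above can be illustrated with a minimal sketch. It assumes an exact string comparison between the task type and the service label; the class `ExecutionService`, the function `match_execution_service`, and the label values are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExecutionService:
    name: str
    type_label: str  # the task type this service is able to execute

def match_execution_service(task_type, services):
    """Return the first execution service whose label equals the task type."""
    for service in services:
        if service.type_label == task_type:
            return service
    raise LookupError(f"no execution service for task type {task_type!r}")

# Hypothetical services and labels, for illustration only.
services = [
    ExecutionService("exec-a", "transcode"),
    ExecutionService("exec-b", "backup"),
]
chosen = match_execution_service("backup", services)
```

Raising an error for an unmatched type is one possible design choice; the disclosure does not say how an unmatched task is handled.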

In an embodiment of the present disclosure, the task management server 402, the task scheduling server 403, and the task execution server 404 may be the same server or different servers, which is not specifically limited in the embodiments of the present disclosure.

In an embodiment of the present disclosure, the distributed task management system is built on a distributed coordination service system. Each task scheduling service registers in the distributed coordination service system to form a corresponding node, and each node has a different node number. In the embodiments of the present disclosure, the distributed coordination service system may specifically be ZooKeeper. ZooKeeper originated as a sub-project of Apache Hadoop and is mainly used to solve data management problems often encountered in distributed applications, such as unified naming services, state synchronization services, cluster management, and management of distributed application configuration items. ZooKeeper provides event monitoring: it allows users to register watchers on specified nodes, and when certain events are triggered, the ZooKeeper server notifies the interested clients of the event.

The ZooKeeper data model can be viewed as a tree in which each node is called a Znode. FIG. 5 shows the distribution architecture of the task scheduling services in the distributed task management system. As shown in FIG. 5, the system contains N task scheduling services Broker1, Broker2, ..., BrokerN. Each registers in ZooKeeper to form a corresponding node: /brokers/1, /brokers/2, ..., /brokers/N. These nodes are all child nodes of the node /brokers; that is, /brokers is the parent node of the nodes corresponding to the task scheduling services. Once the tree structure is formed, each task scheduling service can register a watcher on its node to monitor the parent node /brokers; when information in the parent node or its child nodes changes, ZooKeeper notifies each task scheduling service of the change.
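
The parent/child node structure and the change notification described above can be sketched with a small in-memory stand-in. This is not the ZooKeeper client API: `CoordinationTree` and its methods are hypothetical, and the callbacks here stay registered after firing, whereas real ZooKeeper watches are one-time triggers that must be re-registered.

```python
class CoordinationTree:
    """In-memory stand-in for a coordination service (not the ZooKeeper API)."""

    def __init__(self):
        self.children = {}   # parent path -> set of child names
        self.watchers = {}   # parent path -> list of callbacks

    def watch_children(self, parent, callback):
        """Register a callback fired whenever the children of `parent` change."""
        self.watchers.setdefault(parent, []).append(callback)

    def create(self, parent, name):
        """Create child `name` under `parent`; return False if it exists."""
        kids = self.children.setdefault(parent, set())
        if name in kids:
            return False
        kids.add(name)
        self._notify(parent, "created", name)
        return True

    def delete(self, parent, name):
        """Delete a child node, as happens when its service goes down."""
        kids = self.children.get(parent, set())
        if name in kids:
            kids.remove(name)
            self._notify(parent, "deleted", name)

    def _notify(self, parent, event, name):
        for callback in list(self.watchers.get(parent, [])):
            callback(event, f"{parent}/{name}")

tree = CoordinationTree()
events = []
for broker_id in ("1", "2", "3"):
    tree.create("/brokers", broker_id)   # each service registers its node
tree.watch_children("/brokers", lambda event, path: events.append((event, path)))
tree.delete("/brokers", "2")             # simulate the failure of Broker2
```

After the deletion, `events` holds a single notification for `/brokers/2`, which is the signal the surviving task scheduling services react to.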

In an embodiment of the present disclosure, information about the task scheduling services and their corresponding nodes can be stored in a database, namely the data storage device 405, so that the task management server 402 can obtain it. When a task scheduling service updates task progress, it can also write its own node number to the data storage device 405. FIG. 6 shows a schematic structure of the task information stored in the data storage device. As shown in FIG. 6, there are multiple tasks, and each task has a task ID, a task status, the identifier of its task scheduling service, and other information. For example, the task with task ID 1 has task status Step1 and task scheduling service identifier 1; the task with task ID 2 has task status Step2 and task scheduling service identifier 2; the task with task ID 3 has task status Complete and task scheduling service identifier N. The task ID, task status, and task scheduling service identifier may also take other forms, including but not limited to those in the above example, which is not specifically limited in the embodiments of the present disclosure.
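
The task information of FIG. 6 can be pictured as a table of records. The sketch below mirrors the three example rows given above; the field names `task_id`, `status`, and `broker_id` and the helper `tasks_of` are hypothetical, and the "other information" column is omitted.

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:
    task_id: int
    status: str      # e.g. "Step1", "Step2", "Complete"
    broker_id: str   # identifier of the scheduling service that owns the task

# Rows mirroring the example values described for FIG. 6.
task_table = [
    TaskRecord(1, "Step1", "1"),
    TaskRecord(2, "Step2", "2"),
    TaskRecord(3, "Complete", "N"),
]

def tasks_of(broker_id, table):
    """Return the records currently owned by the given scheduling service."""
    return [row for row in table if row.broker_id == broker_id]

owned_by_2 = tasks_of("2", task_table)
```

Storing the owning service's identifier with each task is what later allows another service to find and take over the tasks of a failed one.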

In an embodiment of the present disclosure, when a task scheduling service fails and goes down, its node is automatically deleted, and its tasks may no longer be executed smoothly; this causes task execution to fail and degrades the user experience.

In step S320, when it is detected that a first node among the nodes has been deleted, a second node corresponding to the first node is created through a target task scheduling service, where the target task scheduling service corresponding to the second node is different from the task scheduling service corresponding to the first node.

In an embodiment of the present disclosure, to ensure that all tasks can be executed smoothly, when a task scheduling service fails and goes down, a normally working task scheduling service can take over its tasks and execute all tasks in the task queue in order. The takeover proceeds as follows: when it is detected that a first node among the nodes has been deleted, the task scheduling services other than the failed one compete to determine the target task scheduling service; the target task scheduling service then creates, in ZooKeeper, a second node corresponding to the first node. Uniquely determining the target task scheduling service from multiple competing task scheduling services is a feature of ZooKeeper, and the details of the competition are not repeated here. FIG. 7 shows the relationship between the task scheduling services when one of them fails. As shown in FIG. 7, the task scheduling service corresponding to the node numbered /brokers/2 has failed and gone down, so the node numbered /brokers/2 is deleted. A new node can then be registered through the target task scheduling service to produce a new node corresponding to the node numbered /brokers/2.

In an embodiment of the present disclosure, the number of the second node need not be identical to the number of the first node. For example, when the first node is numbered /brokers/N, the second node may be numbered /brokers/N-r: if the task scheduling service corresponding to the node numbered /brokers/2 fails and the node is deleted, the target task scheduling service can take over that service's tasks and register a node corresponding to /brokers/2 in ZooKeeper, and the newly registered node may be numbered /brokers/2-r. The number of the second node may also be derived from the number of the first node in other ways, or the two numbers may take completely different forms, which is not specifically limited in the embodiments of the present disclosure.
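
The competition to create the second node can be sketched as an atomic "create if absent" operation in which only the first caller succeeds. The disclosure leaves the competition mechanism to ZooKeeper, so the lock-based `NodeRegistry` below is only an in-memory stand-in, and all names in it are hypothetical.

```python
import threading

class NodeRegistry:
    """In-memory stand-in for atomic node creation (not the ZooKeeper API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}  # node path -> name of the service that created it

    def try_create(self, path, owner):
        """Create `path` for `owner`; only the first caller succeeds."""
        with self._lock:
            if path in self._owners:
                return False
            self._owners[path] = owner
            return True

    def owner_of(self, path):
        return self._owners.get(path)

def replacement_path(deleted_path):
    """Derive the second node's number, following the /brokers/2-r example."""
    return deleted_path + "-r"

registry = NodeRegistry()
path = replacement_path("/brokers/2")
results = {}

def compete(service_name):
    results[service_name] = registry.try_create(path, service_name)

# Three surviving services race to create the replacement node.
threads = [threading.Thread(target=compete, args=(name,))
           for name in ("Broker1", "Broker3", "Broker4")]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [name for name, won in results.items() if won]
```

Exactly one service ends up in `winners`; which one depends on thread timing, just as the target task scheduling service depends on which competitor creates the node first.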

In step S330, the target task scheduling service obtains the task information corresponding to the first node from the database and adds the task information to the task queue corresponding to the target task scheduling service.

In an embodiment of the present disclosure, after the second node corresponding to the first node has been created, the target task scheduling service can obtain the task information corresponding to the first node from the data storage device 405 and add it to the task queue corresponding to the target task scheduling service. In the embodiments of the present disclosure, each task scheduling service has its own task queue; after receiving a task issued by the task management server 402, a task scheduling service can add the task to its queue. This improves task processing efficiency and helps ensure the high availability of the task processing system.

Further, after the task information is added to the task queue corresponding to the target task scheduling service, the task scheduling service identifier associated with the task information in the data storage device 405 can be changed to the identifier of the target task scheduling service.

Continuing the example from step S320: when the other task scheduling services detect that the node numbered /brokers/2 has been deleted, they compete to create a new node corresponding to it. Once the node numbered /brokers/2-r has been created, the target task scheduling service can obtain all of Broker2's task information from the database, put it into its own task queue, and change the task scheduling service identifier associated with that task information in the data storage device to its own identifier. FIG. 8 shows the relationship between the task scheduling services after task recovery. As shown in FIG. 8, the task scheduling service Broker1 is the target task scheduling service determined by the competition; it registers in ZooKeeper to form a new node numbered /brokers/2-r, corresponding to the deleted node numbered /brokers/2. FIG. 9 shows a schematic structure of the recovered task information stored in the data storage device. As shown in FIG. 9, the tasks whose original task IDs are 1 and 3 now have task status Running and their task scheduling service identifiers are unchanged, while the task whose original task ID is 2 now has task status Running and its task scheduling service identifier has changed from 2 to 1.
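
The takeover in this example can be sketched as follows: the target service reads the failed service's tasks from the stored task information, appends them to its own queue, and rewrites the owner identifier. The list of dicts stands in for the database, and the function `take_over` and the field names are hypothetical.

```python
from collections import deque

def take_over(failed_id, target_id, task_table, target_queue):
    """Move every task owned by the failed service to the target service.

    `task_table` is a list of dicts standing in for the database rows.
    """
    moved = []
    for row in task_table:
        if row["broker_id"] == failed_id:
            target_queue.append(row["task_id"])  # enqueue for rescheduling
            row["broker_id"] = target_id         # record the new owner
            moved.append(row["task_id"])
    return moved

# Rows mirroring the FIG. 6 example before Broker2 fails.
task_table = [
    {"task_id": 1, "status": "Step1", "broker_id": "1"},
    {"task_id": 2, "status": "Step2", "broker_id": "2"},
    {"task_id": 3, "status": "Complete", "broker_id": "N"},
]
broker1_queue = deque([1])   # Broker1 already holds its own task
moved = take_over("2", "1", task_table, broker1_queue)
```

Only task 2 changes owner, matching the description of FIG. 9, in which the identifiers of tasks 1 and 3 are unchanged. Status updates are left out of this sketch.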

In an embodiment of the present disclosure, information such as the task status and the task scheduling service identifier of each task is written to the data storage device at regular intervals. In addition, a new task scheduling service can be created and registered in ZooKeeper to form a corresponding node; the identifier and node information of the newly created task scheduling service are also written to the data storage device so that the task management server can obtain them.

In an embodiment of the present disclosure, based on the system architecture of the distributed task management system shown in FIG. 4, the task management server 402 can deliver the tasks to be executed that are sent by the user to the task scheduling services in the task scheduling server 403. After receiving a task to be executed from the task management server 402, a task scheduling service can form a task execution request from it and send the request to the corresponding task execution service in the task execution server 404, so that the task execution service executes the task.

In an embodiment of the present disclosure, every task scheduling service follows the same execution flow. To make the technical solution of the present disclosure clearer, the execution flow is described below using one task scheduling service as an example.

FIG. 10 shows a schematic diagram of the execution flow of the task scheduling service. As shown in FIG. 10, the flow is as follows:
In step S1001, the task to be executed is added to the task queue corresponding to the task scheduling service, and it is determined whether the thread pool of the task scheduling service has an idle thread.
In step S1002, when an idle thread exists, a task to be executed is pulled from the task queue corresponding to the task scheduling service, and a subtask of that task is obtained.
In step S1003, it is determined whether the subtask is a subtask to be executed.
In step S1004, when the subtask is determined to be a subtask to be executed, a task execution request is formed from it.
In step S1005, the task type of the task in the task execution request is matched against the task type labels to determine the target task execution service.
In step S1006, the task execution request is sent to the target task execution service so that the target task execution service executes the task.
In step S1007, it is queried whether the subtask to be executed has finished executing.
In step S1008, if the subtask has finished executing, the task progress information corresponding to the task to be executed is updated in the database.
In step S1009, if the subtask has not finished executing, the task to be executed is added back to the task queue corresponding to the task scheduling service.
In step S1010, it is determined whether the task to be executed still contains an unexecuted subtask.
In step S1011, if so, the unexecuted subtask is executed when it is a subtask to be executed, until the task to be executed contains no unexecuted subtask.
In step S1012, if not, the task status information corresponding to the task to be executed is updated in the data storage device.

Before step S1001, the task scheduling service places each to-be-executed task into its corresponding task queue as soon as the task is received. Tasks in the queue may be ordered by reception time, with earlier tasks at the front of the queue and later tasks at the back; they may also be ordered by other rules, such as task priority, which is not specifically limited in the embodiments of the present disclosure. In step S1003, after an idle thread in the task scheduling service obtains a scheduled to-be-executed task, the task scheduling service needs to obtain the attribute information of the subtasks, because the task contains multiple subtasks, and then forms a task execution request from the subtask and its attribute information. Obtaining the attribute information specifically includes determining how many subtasks the pulled task contains and whether each subtask is a to-be-executed subtask. For example, suppose a to-be-executed task contains five sequentially arranged subtasks, numbered 1 to 5. When the task scheduling service pulls subtask 1, it obtains that subtask's execution state and uses it to decide whether the subtask is a to-be-executed subtask. Suppose subtasks 1 and 2 have been executed before, and the task was put back into the task queue because a query on subtask 2 returned "execution not completed". The execution state obtained for subtask 1 is then "executed", so subtask 1 is not a to-be-executed subtask; the execution states of the remaining subtasks are then checked in turn to determine whether they are to-be-executed subtasks. In step S1007, the execution state of the to-be-executed subtask may be queried at preset time points, for example immediately after the task execution service receives and executes the subtask, or at the 3rd and 5th seconds after execution; if the state queried at a preset time point is "execution not completed", the to-be-executed task is put back at the tail of the task queue. In step S1008, the task progress information is the execution progress of the to-be-executed subtask; for example, Step1, Step2 and Running in FIG. 5 and FIG. 9 are execution progress values that indicate how far the subtask has progressed. In step S1012, the task status information is the execution result of the to-be-executed task; for example, Complete in FIG. 5 indicates that the task has finished executing. Other identifiers may of course be used, such as "successfully executed" or "waiting for re-execution".
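As a non-limiting illustration, the queue handling of steps S1001 to S1012 can be sketched in Python as follows. This is a minimal single-threaded sketch, not the disclosed implementation: the names `Task`, `Subtask`, `poll_finished` and `run_scheduler` are hypothetical, the `polls_needed` counter merely stands in for the real status query sent to the task execution service, and the thread pool is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    polls_needed: int = 1   # hypothetical: status queries needed before "completed"
    started: bool = False
    done: bool = False


@dataclass
class Task:
    task_id: str
    subtasks: list
    progress: str = "Pending"   # task progress information (step S1008)
    status: str = "Running"     # task status information (step S1012)


def poll_finished(sub):
    """Stand-in for querying the task execution service (step S1007)."""
    sub.polls_needed -= 1
    return sub.polls_needed <= 0


def run_scheduler(queue):
    """Drain the queue and return the order in which subtasks were dispatched."""
    dispatched = []
    while queue:
        task = queue.popleft()
        requeued = False
        for sub in task.subtasks:
            if sub.done:
                # Finished in an earlier pass: not a to-be-executed subtask.
                continue
            if not sub.started:
                sub.started = True
                dispatched.append((task.task_id, sub.name))
            if poll_finished(sub):
                sub.done = True
                task.progress = sub.name
            else:
                # Execution not completed: put the task back at the tail.
                queue.append(task)
                requeued = True
                break
        if not requeued:
            task.status = "Complete"
    return dispatched
```

In this sketch a task whose subtask is still running yields its place in the queue to other tasks, and on its next turn the already completed subtasks are skipped rather than dispatched again.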

In an embodiment of the present disclosure, after the entire to-be-executed task has finished executing, the task execution result, that is, the task status information, is updated in the data storage device 405, and steps S1001 to S1012 are then repeated to execute the other to-be-executed tasks in the task queue.

In an embodiment of the present disclosure, the task management server delivers tasks to the task scheduling services in the task scheduling server. To keep the assignment balanced, so that no task scheduling service is overloaded while another is nearly idle, the number of tasks delivered to each task scheduling service can be adjusted dynamically according to the task volume of that service. Each task scheduling service in the task scheduling server has its own task queue, and the number of tasks in that queue reflects the load of the service. To make the task volume of each service available, each task scheduling service periodically writes the number of tasks in its own queue into the node corresponding to that service; the task management server can then read the task volumes of all nodes from ZooKeeper, determine task assignment weights from them, and assign tasks based on those weights. For example, suppose the task scheduling service cluster includes three task scheduling services A, B and C, and the nodes corresponding to A, B and C store task volumes of 10, 5 and 15 respectively. Service B then receives the largest task assignment weight, service A the next largest, and service C the smallest; the weights for B, A and C are 5/12, 1/3 and 1/4 respectively. If the task management backend needs to deliver 60 to-be-executed tasks, it can assign 25 to service B, 20 to service A and 15 to service C. Each of the services A, B and C then holds 30 tasks, so the assignment is balanced.
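The disclosure states only that the weights are determined from the task volumes; it does not give a formula. One rule that reproduces the figures in the example above is to make each weight proportional to how far the service's queue falls short of the common target length, (current total + new tasks) / number of services. The Python sketch below assumes that rule; the function names `allocation_weights` and `allocate` are hypothetical.

```python
from fractions import Fraction


def allocation_weights(loads, new_tasks):
    """Return {service: weight}; the weights sum to 1.

    Assumed rule (not stated in the disclosure): a weight is proportional to
    the gap between the service's queue length and the common target length.
    """
    target = Fraction(sum(loads.values()) + new_tasks, len(loads))
    deficits = {name: max(Fraction(0), target - load) for name, load in loads.items()}
    total = sum(deficits.values())
    if total == 0:
        # Nothing to level out: fall back to equal weights.
        return {name: Fraction(1, len(loads)) for name in loads}
    return {name: d / total for name, d in deficits.items()}


def allocate(loads, new_tasks):
    """Split new_tasks into whole-number shares according to the weights."""
    weights = allocation_weights(loads, new_tasks)
    shares = {name: int(w * new_tasks) for name, w in weights.items()}  # round down
    leftover = new_tasks - sum(shares.values())
    # Hand any remaining tasks to the largest fractional remainders.
    by_remainder = sorted(
        weights, key=lambda n: weights[n] * new_tasks - shares[n], reverse=True
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares
```

With loads of 10, 5 and 15 for A, B and C and 60 new tasks, the target length is 30, the gaps are 20, 25 and 15, and the weights are 1/3, 5/12 and 1/4, matching the example. When a service already holds more than the target, its weight is zero under this rule and the remaining queues are only approximately levelled.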

In an embodiment of the present disclosure, the console 401 is the only component the user interacts with; the internal details of the distributed task management system 400 are invisible to the user. The user therefore delivers tasks, monitors task status and queries task execution results solely through different trigger operations on the console. Specifically, the console 401 may enter a to-be-executed task in response to a first trigger operation of the user; or the console 401 may generate a task progress query request in response to a second trigger operation of the user and send the request to the task management server 402. After receiving the request, the task management server 402 obtains from the data storage device 405 the task status of the task corresponding to the task ID in the request, and returns that status for display on the console's display interface, where the user can view it.

The task management method in the embodiments of the present disclosure can be applied in many scenarios, in particular to computing tasks with long execution times, such as offline speech recognition, big data statistics tasks and machine learning model training.

Take a big data statistics task as an example, and suppose the task is to push advertisements to out-of-town tourists in a scenic spot. First, the crowd located in the scenic spot is obtained: the composition of the crowd, that is, the tourists it contains, can be determined from the geographic coordinate data submitted by the positioning devices the tourists carry and stored in a database. Next, information such as the pages each tourist recently clicked and browsed, chat content and shared location information is used to determine which tourists are from out of town. Then, for each tourist identified as an out-of-town tourist, a user portrait is built from the tourist's user information, browsing information and the like. Finally, relevant advertisements are pushed to each tourist based on that tourist's user portrait. For example, a user who frequently browses train and air ticket booking platforms can be sent train and air ticket advertisements, and a user who frequently browses food recommendation platforms can be sent food advertisements. The advertisement push involves three subtasks of different types: the first filters the crowd data, the second builds user portraits of the out-of-town tourists, and the third places advertisements based on the user portraits. With the task management method of the embodiments of the present disclosure, the user enters the task and its subtasks in the console; the task management server delivers them to the task scheduling server; and the task scheduling server schedules the task and sends task execution requests to the task execution services, which execute the subtasks to complete the task. When the task is managed by the task management system of the embodiments of the present disclosure, even if some of the three subtasks have completed and others have not, only the uncompleted subtasks are executed when the task is run again; completed subtasks are not repeated, which improves task execution efficiency.

In an embodiment of the present disclosure, the task management system is a distributed task management system that schedules multiple tasks through the task scheduling server. Each task scheduling service can schedule the entire life cycle of multiple tasks, which ensures the high availability of the task management system.

The task management method in the embodiments of the present disclosure combines a task management system with a distributed coordination service system. Multiple task scheduling services are provided in the task management system, and each registers in the distributed coordination service system to form a node, yielding a multi-node distributed task management system. Each task scheduling service in this system can schedule the entire life cycle of multiple tasks. When an individual task scheduling service fails and goes down, the task scheduling services that are still working compete to take over the tasks of the failed service. This achieves service self-registration and fault self-healing, ensures that tasks are not lost, and keeps the whole distributed task management system in normal service. In addition, the distributed task management system contains multiple task scheduling services and multiple task execution services, and different task execution services can execute different tasks, which improves task execution efficiency. Weighted task delivery also balances the load across the task scheduling services and prevents failures caused by an individual service being overloaded, further ensuring the normal service of the distributed task management system.

The following describes apparatus embodiments of the present disclosure, which can be used to perform the task management method of the foregoing embodiments. For details not disclosed in the apparatus embodiments, refer to the task management method described above.

FIG. 11 schematically shows a block diagram of a task management apparatus according to an embodiment of the present disclosure.

Referring to FIG. 11, a task management apparatus 1100 according to an embodiment of the present disclosure is applied to a distributed task management system. The distributed task management system includes a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server, where the task scheduling services are used to manage and schedule users' tasks. The task management apparatus 1100 includes a monitoring module 1101, a node creation module 1102 and a task takeover module 1103.

The monitoring module 1101 is configured for each task scheduling service to monitor the parent node corresponding to the nodes. The node creation module 1102 is configured to create, through a target task scheduling service, a second node corresponding to a first node when it is detected that the first node has been deleted from among the nodes, where the target task scheduling service corresponding to the second node is different from the task scheduling service corresponding to the first node. The task takeover module 1103 is configured for the target task scheduling service to obtain the task information corresponding to the first node from the database and add that task information to the task queue corresponding to the target task scheduling service.
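As a non-limiting illustration, the monitor, create and take-over sequence can be sketched with an in-memory stand-in for the distributed coordination service. Everything here is an assumption made for illustration: the `Coordinator` class and the `fail_over` function are hypothetical, no real ZooKeeper client API is used, and the competition between services is modelled as "the first service that manages to recreate the deleted node wins".

```python
class Coordinator:
    """In-memory stand-in for the coordination service (hypothetical)."""

    def __init__(self):
        self.children = {}   # node name under the parent node -> owning service
        self._seq = 0

    def register(self, service_id):
        """Create a node with a fresh node number for a scheduling service."""
        node = f"node-{self._seq:04d}"
        self._seq += 1
        self.children[node] = service_id
        return node

    def remove(self, node):
        """Simulate the node disappearing when its service goes down."""
        del self.children[node]

    def try_take_over(self, deleted_node, service_id):
        """Recreate the deleted node; only the first caller succeeds."""
        if deleted_node in self.children:
            return False
        self.children[deleted_node] = service_id
        return True


def fail_over(coordinator, database, queues, deleted_node, failed_service, candidates):
    """Let the live services compete; the winner inherits the failed service's tasks.

    candidates lists the live services in the order their watches fire.
    """
    for service_id in candidates:
        if coordinator.try_take_over(deleted_node, service_id):
            for task_id, record in database.items():
                if record["owner"] == failed_service:
                    queues[service_id].append(task_id)
                    # Re-label the task with the new scheduling service identifier.
                    record["owner"] = service_id
            return service_id
    return None
```

Because the task records are read back from the database rather than from the failed service's memory, the tasks survive the failure, and updating the owner field keeps a later failure of the winning service recoverable in the same way.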

In an embodiment of the present disclosure, the task management apparatus 1100 further includes a registration module, configured to register each task scheduling service in the distributed coordination service system, so as to obtain nodes with different node numbers corresponding to the respective task scheduling services.

In an embodiment of the present disclosure, the task management apparatus 1100 further includes an identifier modification module, configured to change the task scheduling service identifier corresponding to the task information to the task scheduling service identifier corresponding to the target task scheduling service.

In an embodiment of the present disclosure, the distributed task management system further includes a task management server and a task execution server, and the task management apparatus 1100 further includes: a request generation module, configured to receive, through the task scheduling server, a to-be-executed task sent by the task management server, and to form a task execution request according to the to-be-executed task; and a task execution module, configured to send the task execution request to the task execution server, so that a task execution service in the task execution server executes the task.

In an embodiment of the present disclosure, the request generation module includes: a task adding unit, configured to add the to-be-executed task to the task queue corresponding to the task scheduling service and to determine whether an idle thread exists in the thread pool of the task scheduling service; a task pulling unit, configured to pull the to-be-executed task from the task queue corresponding to the task scheduling service when an idle thread exists, and to obtain the subtasks in the to-be-executed task; and a request generation unit, configured to form the task execution request according to the subtask and the attribute information of the subtask.

In an embodiment of the present disclosure, the request generation unit is configured to: determine whether the subtask is a to-be-executed subtask; and, when the subtask is determined to be a to-be-executed subtask, form the task execution request according to the to-be-executed subtask.

In an embodiment of the present disclosure, there are a plurality of task execution services, each having a different task type label. The task execution module is configured to: match the task type of the to-be-executed task in the task execution request against the task type labels to determine a target task execution service; and send the task execution request to the target task execution service, so that the target task execution service executes the task.
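As a non-limiting illustration, matching a request's task type against the task type labels can be as simple as a dictionary lookup. The sketch below assumes an exact-match rule and uses hypothetical names (`pick_execution_service`, `dispatch`, the `task_type` and `subtask` keys, and the label strings); the disclosure does not specify the matching rule or the request format.

```python
def pick_execution_service(task_type, services_by_label):
    """Return the execution service whose type label equals task_type."""
    try:
        return services_by_label[task_type]
    except KeyError:
        raise LookupError(f"no task execution service labelled {task_type!r}") from None


def dispatch(request, services_by_label):
    """Route one task execution request; return (service, subtask) for inspection."""
    service = pick_execution_service(request["task_type"], services_by_label)
    # A real system would send the request over the network here.
    return service, request["subtask"]
```

Keeping one label per execution service means that subtasks of different types, such as the three subtasks of the advertisement example, can be served by different execution services.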

In an embodiment of the present disclosure, the task management apparatus 1100 is further configured to: query whether the to-be-executed subtask has finished executing; if it has, update the task progress information corresponding to the to-be-executed task in the database; and if it has not, re-add the to-be-executed task to the task queue corresponding to the task scheduling service.

In an embodiment of the present disclosure, the task management apparatus 1100 is further configured to: determine whether the to-be-executed task contains an unexecuted subtask; if it does, execute the unexecuted subtask when it is a to-be-executed subtask, until the to-be-executed task contains no unexecuted subtask; and if it does not, update the task status information corresponding to the to-be-executed task in the database.

In an embodiment of the present disclosure, the task management apparatus 1100 is further configured such that: each task scheduling service updates the task volume in its node according to the number of tasks in its corresponding task queue; and the task management server obtains the task volume from each node, determines task assignment weights according to the task volumes, and performs task assignment based on the task assignment weights.

In an embodiment of the present disclosure, the distributed task management system further includes a console, and the task management apparatus 1100 is further configured such that: the console enters the to-be-executed task in response to a first trigger operation of the user; or the console generates a task progress query request in response to a second trigger operation of the user and sends the task progress query request to the task management server to obtain task progress information.

FIG. 12 shows a schematic structural diagram of a computer system of an electronic device suitable for implementing the embodiments of the present disclosure.

It should be noted that the computer system 1200 of the electronic device shown in FIG. 12 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in FIG. 12, the computer system 1200 includes a central processing unit (CPU) 1201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage section 1208 into a random access memory (RAM) 1203, thereby implementing the task management method described in the foregoing embodiments. The RAM 1203 also stores various programs and data required for system operation. The CPU 1201, the ROM 1202 and the RAM 1203 are connected to one another through a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.

The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse and the like; an output section 1207 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a local area network (LAN) card or a modem. The communication section 1209 performs communication processing via a network such as the Internet. A drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 1210 as needed, so that a computer program read from it can be installed into the storage section 1208 as needed.

In particular, according to the embodiments of the present disclosure, the processes described with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. When the computer program is executed by the central processing unit (CPU) 1201, the various functions defined in the system of the present disclosure are performed.

It should be noted that the computer-readable medium in the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transport a program for use by or in connection with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless or wired media, or any suitable combination of the above.

The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams or flowcharts, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not limit the units themselves.

In another aspect, the present disclosure further provides a computer-readable medium. The computer-readable medium may be included in the task management apparatus described in the foregoing embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the foregoing embodiments.

It should be noted that although the detailed description above mentions several modules or units of the device for performing actions, this division is not mandatory. According to the embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of a single module or unit described above may be further divided among multiple modules or units.

From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. The technical solutions according to the embodiments of the present disclosure may therefore be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a touch terminal or a network device) to perform the method according to the embodiments of the present disclosure.

Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses or adaptations that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed herein.

It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

1. A task management method, characterized in that the method is applied to a distributed task management system, the distributed task management system comprising a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server, wherein the task scheduling services are used to manage and schedule tasks of users; the method comprises:
each task scheduling service monitoring a parent node corresponding to the nodes;
when a deleted first node exists among the nodes, creating, by a target task scheduling service, a second node corresponding to the first node, wherein the target task scheduling service corresponding to the second node is different from the task scheduling service corresponding to the first node; and
the target task scheduling service acquiring task information corresponding to the first node from a database and adding the task information to a task queue corresponding to the target task scheduling service.
2. The task management method according to claim 1, further comprising:
registering each task scheduling service in a distributed coordination service system to obtain nodes with different node numbers, each node corresponding to one of the task scheduling services.
3. The task management method according to claim 1, wherein after adding the task information to the task queue corresponding to the target task scheduling service, the method further comprises:
changing the task scheduling service identifier corresponding to the task information to the task scheduling service identifier of the target task scheduling service.
4. The task management method according to claim 1, wherein the distributed task management system further comprises a task management server and a task execution server; the method further comprises:
the task scheduling server receiving a task to be executed sent by the task management server and forming a task execution request according to the task to be executed; and
sending the task execution request to the task execution server, so that a task execution service in the task execution server executes the task.
5. The task management method according to claim 4, wherein the task scheduling server receiving the task to be executed sent by the task management server and forming the task execution request according to the task to be executed comprises:
adding the task to be executed to a task queue corresponding to the task scheduling service, and determining whether an idle thread exists in a thread pool of the task scheduling service;
when an idle thread exists, pulling the task to be executed from the task queue corresponding to the task scheduling service, and acquiring a subtask of the task to be executed; and
forming the task execution request according to the subtask and attribute information of the subtask.
6. The task management method according to claim 5, wherein forming the task execution request according to the subtask and the attribute information of the subtask comprises:
determining whether the subtask is a subtask to be executed; and
when the subtask is determined to be a subtask to be executed, forming the task execution request according to the subtask to be executed.
7. The task management method according to claim 4, wherein there are a plurality of task execution services, and each task execution service has a different task type tag;
sending the task execution request to the task execution server so that the task execution service in the task execution server executes the task comprises:
matching the task type of the task to be executed in the task execution request against the task type tags to determine a target task execution service; and
sending the task execution request to the target task execution service, so that the target task execution service executes the task.
8. The task management method according to claim 4, wherein after sending the task execution request to the task execution server, the method further comprises:
querying whether execution of the subtask to be executed is complete;
if execution of the subtask to be executed is complete, updating task progress information corresponding to the task to be executed in the database; and
if execution of the subtask to be executed is not complete, adding the task to be executed to the task queue corresponding to the task scheduling service again.
9. The task management method according to claim 8, wherein after updating the task progress information corresponding to the task to be executed in the database, the method further comprises:
determining whether the task to be executed has any unexecuted subtask;
if so, executing the unexecuted subtask when it is a subtask to be executed, until no unexecuted subtask remains in the task to be executed; and
if not, updating task state information corresponding to the task to be executed in the database.
10. The task management method according to claim 1, further comprising:
each task scheduling service updating the task quantity recorded in its node according to the task quantity in the task queue corresponding to that task scheduling service; and
the task management server acquiring the task quantity from each node, determining task distribution weights according to the task quantities, and distributing tasks based on the task distribution weights.
11. The task management method according to claim 4, wherein the distributed task management system further comprises a console; the method further comprises:
the console entering the task to be executed in response to a first trigger operation of a user; or,
the console generating a task progress query request in response to a second trigger operation of the user, and sending the task progress query request to the task management server to acquire task progress information.
12. A task management device, characterized in that the device is applied to a distributed task management system, the distributed task management system comprising a task scheduling server and nodes corresponding to a plurality of task scheduling services in the task scheduling server, wherein the task scheduling services are used to manage and schedule tasks of users; the device comprises:
a monitoring module, configured to cause each task scheduling service to monitor a parent node corresponding to the nodes;
a node creating module, configured to create, when it is detected that a deleted first node exists among the nodes, a second node corresponding to the first node through a target task scheduling service, wherein the target task scheduling service corresponding to the second node is different from the task scheduling service corresponding to the first node; and
a task taking module, configured to cause the target task scheduling service to acquire task information corresponding to the first node from a database and add the task information to a task queue corresponding to the target task scheduling service.
13. A distributed task management system, comprising:
a console, configured to generate a task issuing instruction or a task progress query request in response to a trigger operation of a user;
a task management server, connected to the console and configured to issue a task in response to the task issuing instruction, or to feed task progress information back to the console in response to the task progress query request;
a task scheduling server, connected to the task management server and comprising a plurality of task scheduling services, wherein each task scheduling service is registered in a distributed coordination service system to form a node, and the task scheduling server is configured to receive the tasks issued by the task management server and to schedule and manage the tasks through the task scheduling services;
a task execution server, connected to the task scheduling server, comprising a plurality of task execution services, and configured to execute a task in response to a task execution request sent by a task scheduling service; and
a data storage device, connected to the task management server and the task scheduling server and configured to store task-related information.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the task management method according to any one of claims 1 to 11.
15. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the task management method according to any one of claims 1 to 11.
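
For illustration only, the node takeover recited in claims 1 and 3 and the weight-based distribution of claim 10 might be sketched in Python as follows. This is a minimal in-memory sketch and not the patented implementation: `Coordinator` is a hypothetical stand-in for the distributed coordination service, the database is a plain list of records, reusing the deleted node's number for the second node is an assumption, and the inverse-queue-length weighting is an assumed rule, since the claims do not specify how weights are derived from task quantities.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """One task scheduling service with its own in-memory task queue."""
    service_id: str
    queue: deque = field(default_factory=deque)


class Coordinator:
    """Hypothetical stand-in for the distributed coordination service.

    Tracks the child nodes under one parent node as a mapping from
    node number to the identifier of the owning scheduling service.
    """

    def __init__(self):
        self.nodes = {}

    def register(self, node_no, service_id):
        self.nodes[node_no] = service_id

    def delete(self, node_no):
        # Returns the identifier of the service that owned the node.
        return self.nodes.pop(node_no)


def fail_over(coordinator, database, node_no, failed_id, target):
    """Let the target scheduling service take over a deleted node's tasks."""
    # Create the second node, owned by a different (target) service.
    # Reusing the same node number is an assumption of this sketch.
    coordinator.register(node_no, target.service_id)
    for record in database:
        if record["scheduler_id"] == failed_id:
            # Pull the failed service's task into the target's queue.
            target.queue.append(record["task_id"])
            # Re-label the task with the target's identifier (claim 3).
            record["scheduler_id"] = target.service_id


def distribution_weights(queue_lengths):
    """Assumed rule: weight is inversely proportional to queue length + 1."""
    inverse = {sid: 1.0 / (n + 1) for sid, n in queue_lengths.items()}
    total = sum(inverse.values())
    return {sid: w / total for sid, w in inverse.items()}


# Usage: two services register, then the node of "sched-1" is deleted.
coordinator = Coordinator()
s1, s2 = Scheduler("sched-1"), Scheduler("sched-2")
coordinator.register("node-0001", s1.service_id)
coordinator.register("node-0002", s2.service_id)
database = [
    {"task_id": "t1", "scheduler_id": "sched-1"},
    {"task_id": "t2", "scheduler_id": "sched-2"},
    {"task_id": "t3", "scheduler_id": "sched-1"},
]
s2.queue.append("t2")

failed_id = coordinator.delete("node-0001")
fail_over(coordinator, database, "node-0001", failed_id, s2)
print(list(s2.queue))
print(distribution_weights({"sched-2": len(s2.queue)}))
```

In a real deployment the deletion would be detected through a watch on the parent node rather than an explicit call, and the database update would need to be transactional so that two surviving services cannot both claim the same tasks.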
CN202010064754.9A 2020-01-20 2020-01-20 Task management methods, devices, systems, computer storage media and electronic equipment Active CN111290854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010064754.9A CN111290854B (en) 2020-01-20 2020-01-20 Task management methods, devices, systems, computer storage media and electronic equipment

Publications (2)

Publication Number Publication Date
CN111290854A true CN111290854A (en) 2020-06-16
CN111290854B CN111290854B (en) 2024-03-15

Family

ID=71023320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010064754.9A Active CN111290854B (en) 2020-01-20 2020-01-20 Task management methods, devices, systems, computer storage media and electronic equipment

Country Status (1)

Country Link
CN (1) CN111290854B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102209100A (en) * 2011-03-15 2011-10-05 厦门亿力吉奥信息科技有限公司 Task scheduling cloud processing system and method
CN102932210A (en) * 2012-11-23 2013-02-13 北京搜狐新媒体信息技术有限公司 Method and system for monitoring node in PaaS cloud platform
CN104462370A (en) * 2014-12-09 2015-03-25 北京百度网讯科技有限公司 Distributed task scheduling system and method
CN105447097A (en) * 2015-11-10 2016-03-30 北京北信源软件股份有限公司 Data acquisition method and system
CN106909451A (en) * 2017-02-28 2017-06-30 郑州云海信息技术有限公司 A kind of distributed task dispatching system and method
CN108958920A (en) * 2018-07-13 2018-12-07 众安在线财产保险股份有限公司 A kind of distributed task dispatching method and system

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611066A (en) * 2020-06-30 2020-09-01 平安银行股份有限公司 Task execution method, task execution server and storage medium
CN113760485A (en) * 2020-07-16 2021-12-07 北京沃东天骏信息技术有限公司 Scheduling method, device and equipment of timing task and storage medium
WO2022048329A1 (en) * 2020-09-01 2022-03-10 北京达佳互联信息技术有限公司 Menu display method and apparatus
CN112948068A (en) * 2020-09-16 2021-06-11 深圳市明源云科技有限公司 Task scheduling method and device and electronic equipment
CN112948068B (en) * 2020-09-16 2024-03-12 深圳市明源云科技有限公司 Task scheduling method and device and electronic equipment
CN112463312A (en) * 2020-11-02 2021-03-09 北京健康之家科技有限公司 Dynamic maintenance system and method for timing task, medium and computing equipment
CN112463312B (en) * 2020-11-02 2024-07-23 北京水滴科技集团有限公司 Dynamic maintenance system and method for timing tasks, medium and computing device
CN112380183A (en) * 2020-11-13 2021-02-19 深圳市和讯华谷信息技术有限公司 Distributed file synchronization method and device, computer equipment and storage medium
CN112463561A (en) * 2020-11-20 2021-03-09 中国建设银行股份有限公司 Fault positioning method, device, equipment and storage medium
CN112804093A (en) * 2020-12-31 2021-05-14 杭州东方通信软件技术有限公司 Centralized scheduling support method and system based on fault capability center
CN113065779A (en) * 2021-04-07 2021-07-02 网易(杭州)网络有限公司 Data processing method and device and electronic equipment
CN113065779B (en) * 2021-04-07 2023-08-11 网易(杭州)网络有限公司 Data processing method and device and electronic equipment
CN113064744A (en) * 2021-05-06 2021-07-02 腾讯科技(深圳)有限公司 Task processing method and device, computer readable medium and electronic equipment
CN113190341A (en) * 2021-05-31 2021-07-30 内蒙古豆蔻网络科技有限公司 Server resource scheduling method and system
CN113553126A (en) * 2021-07-06 2021-10-26 网易(杭州)网络有限公司 Data processing method and device
CN113553126B (en) * 2021-07-06 2024-03-22 网易(杭州)网络有限公司 Data processing method and device
CN113656157A (en) * 2021-08-10 2021-11-16 北京锐安科技有限公司 Distributed task scheduling method and device, storage medium and electronic equipment
CN113656157B (en) * 2021-08-10 2024-04-23 北京锐安科技有限公司 Distributed task scheduling method and device, storage medium and electronic equipment
CN113794755A (en) * 2021-08-25 2021-12-14 上海华兴数字科技有限公司 Shared service pushing method and system based on micro-service architecture
CN113741872A (en) * 2021-09-03 2021-12-03 上海新炬网络信息技术股份有限公司 Software application automatic publishing method based on job scheduling
CN113741872B (en) * 2021-09-03 2024-04-23 上海新炬网络信息技术股份有限公司 Automatic software application publishing method based on job scheduling
CN113760513A (en) * 2021-09-16 2021-12-07 康键信息技术(深圳)有限公司 Distributed task scheduling method, device, equipment and medium
CN114567625A (en) * 2022-03-01 2022-05-31 上海创远仪器技术股份有限公司 Android Http service-based radio monitoring equipment control processing system, method, device, processor and storage medium thereof
CN115185673B (en) * 2022-05-17 2023-10-31 贝壳找房(北京)科技有限公司 Distributed timing task scheduling method, system, storage medium and program product
CN115185673A (en) * 2022-05-17 2022-10-14 贝壳找房(北京)科技有限公司 Distributed timed task scheduling method, system, storage medium and program product
CN116680064B (en) * 2023-08-03 2023-10-10 中航信移动科技有限公司 Task node management method, electronic equipment and storage medium
CN116680064A (en) * 2023-08-03 2023-09-01 中航信移动科技有限公司 Task node management method, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111290854B (en) 2024-03-15

Similar Documents

Publication Publication Date Title
CN111290854B (en) Task management methods, devices, systems, computer storage media and electronic equipment
CN104092767B (en) A kind of publish/subscribe system and its method of work for increasing message queue model
WO2020258290A1 (en) Log data collection method, log data collection apparatus, storage medium and log data collection system
CN103092698B (en) Cloud computing application automatic deployment system and method
Bhattacharjee et al. IBM deep learning service
CN112035213B (en) Multi-tenant network car booking system and dynamic isolation method
US11461119B2 (en) Virtual containers configured to support multiple machine learning models
CN108365971A (en) Daily record analytic method, equipment and computer-readable medium
WO2017106110A1 (en) Publish-subscribe message transformation
CN109117252B (en) Method and system for task processing based on container and container cluster management system
US20160366246A1 (en) Computing resource deployment system
CN102760074A (en) High-load business process scalability
Kijsipongse et al. A hybrid GPU cluster and volunteer computing platform for scalable deep learning
CN103491024A (en) Job scheduling method and device for streaming data
US20230196182A1 (en) Database resource management using predictive models
US20140006576A1 (en) Managing service specifications and the discovery of associated services
US20160366232A1 (en) Computing resource management system
CN114610504A (en) Message processing method and device, electronic equipment and storage medium
CN112148458A (en) Task scheduling method and device
WO2013039796A2 (en) Scale-out system to acquire event data
KR20210065817A (en) Apparatus for Layer switching of Deep Learning Private Cloud Service
CN113472638B (en) Edge gateway control method, system, device, electronic equipment and storage medium
CN110913018A (en) Distributed regulation and control service system
CN108205470A (en) A kind of distribution ad data calculating task management system and method
CN115361382B (en) Data processing method, device, equipment and storage medium based on data group

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40024407; Country of ref document: HK)
TA01 Transfer of patent application right (Effective date of registration: 20230922; Address after: 101, 4th Floor, Building 9, West District, No. 10 Courtyard, Northwest Wangdong Road, Haidian District, Beijing, 100080; Applicant after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.; Address before: 518000 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 Floors; Applicant before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.)
GR01 Patent grant
TG01 Patent term adjustment