CN113722057A - Big data cluster processing method and system, electronic device and storage medium - Google Patents

Info

Publication number
CN113722057A
Authority
CN
China
Prior art keywords
task
big data
data cluster
script
execution plan
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110270156.1A
Other languages
Chinese (zh)
Inventor
陈金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JD Digital Technology Holdings Co Ltd
Original Assignee
JD Digital Technology Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JD Digital Technology Holdings Co Ltd
Priority to CN202110270156.1A
Publication of CN113722057A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4887 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues involving deadlines, e.g. rate based, periodic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention provides a big data cluster processing method and system, an electronic device, and a storage medium. The big data cluster processing method includes: performing script definition for the hosts in a big data cluster to generate a first timed task script; creating a first task execution plan under the big data cluster and adding the first timed task script to the corresponding first task execution plan; and, based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule. By defining a timed task script, creating a task execution plan under a specified big data cluster, adding the script to that plan, and pushing the script on schedule to the hosts in the cluster to run according to the plan, the invention greatly improves the efficiency of big data cluster operation and maintenance work and reduces operation and maintenance cost.

Figure 202110270156

Description

Big data cluster processing method and system, electronic device and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a big data cluster processing method and system, an electronic device, and a storage medium.
Background
A big data cluster is a cluster that performs data acquisition, data storage, and data analysis for big data. Currently, different big data clusters are managed by their own independent management systems. When managing a big data cluster, operation and maintenance personnel often need to configure timed tasks on the hosts in the cluster to perform work such as process state collection and data statistics. Each time a timed task is created, modified, or deleted, a task script must be written and a crontab entry configured on every host.
In the prior art, when the big data cluster is large, the time cost of managing scheduled tasks rises sharply, the task execution state is hard to view, and the task results are hard to collect.
Disclosure of Invention
The invention provides a big data cluster processing method and system, an electronic device, and a storage medium to address the above technical defects in the prior art.
The invention provides a big data cluster processing method, which comprises the following steps:
performing script definition on a host in a big data cluster to generate a first timed task script;
creating a first task execution plan under a big data cluster, and adding the first timed task script into the corresponding first task execution plan;
and, based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule.
According to the big data cluster processing method provided by the invention, the method further comprises any one or the combination of the following steps:
modifying the first timed task script and the first task execution plan respectively, and, based on the modified first task execution plan, pushing the modified first timed task script to the hosts in the big data cluster on a schedule;
updating the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan, and, based on the second task execution plan, pushing the second timed task script to the hosts in the big data cluster on a schedule;
and deleting the first timed task script and the first task execution plan, and pushing the content resulting from the deletion to the hosts in the big data cluster.
According to the big data cluster processing method provided by the invention, after the first timed task script, or the modified first timed task script, is pushed to the hosts in the big data cluster on a schedule, the method further includes:
monitoring the state of the first task execution plan, and reporting that state to the server side corresponding to the big data cluster;
and after the second timed task script is pushed to the hosts in the big data cluster on a schedule, the method further includes:
monitoring the state of the second task execution plan, and reporting that state to the server side corresponding to the big data cluster.
According to the big data cluster processing method provided by the invention, after the first timed task script is pushed to the hosts in the big data cluster on a schedule based on the first task execution plan, the method further includes:
collecting the task running results fed back by the hosts in the big data cluster, and reporting them to the server side corresponding to the big data cluster.
The invention also provides a big data cluster processing system, which comprises:
the script definition module is used for carrying out script definition on a host in the big data cluster and generating a first timed task script;
the script distribution module is used for creating a first task execution plan under the big data cluster and adding the first timed task script into the corresponding first task execution plan;
and a timed pushing module, configured to push the first timed task script to the hosts in the big data cluster on a schedule based on the first task execution plan.
According to the big data cluster processing system provided by the invention, the big data cluster processing system comprises any one or the combination of the following components:
a modification module, configured to modify the first timed task script and the first task execution plan respectively and, based on the modified first task execution plan, push the modified first timed task script to the hosts in the big data cluster on a schedule;
an updating module, configured to update the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan and, based on the second task execution plan, push the second timed task script to the hosts in the big data cluster on a schedule;
and a deleting module, configured to delete the first timed task script and the first task execution plan and push the content resulting from the deletion to the hosts in the big data cluster.
According to a big data cluster processing system provided by the present invention, the big data cluster processing system comprises:
and the state monitoring module is used for monitoring the state of the first task execution plan or the second task execution plan and reporting the state of the first task execution plan or the second task execution plan to a server side corresponding to the big data cluster.
According to a big data cluster processing system provided by the present invention, the big data cluster processing system comprises:
and the operation result collection module is used for collecting the task operation result fed back by the host under the big data cluster and reporting the task operation result to the server side corresponding to the big data cluster.
The present invention also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of any of the big data cluster processing methods described above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the big data cluster processing method as described in any of the above.
In the invention, a timed task script is defined, a task execution plan is created under a specified big data cluster, the script is added to that plan, the script is pushed on schedule to the hosts in the cluster to run according to the plan, and the task execution state and results are reported to the system server side. Because the first timed task script is pushed to the hosts in the big data cluster on a schedule based on the first task execution plan, operation and maintenance personnel subsequently need to modify the first timed task script only once for the change to take effect on all hosts in the specified big data cluster. This greatly improves the efficiency of big data cluster operation and maintenance and reduces its cost.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a first schematic flowchart of a big data cluster processing method according to an embodiment of the present invention;
Fig. 2 is a second schematic flowchart of a big data cluster processing method according to an embodiment of the present invention;
Fig. 3 is a third schematic flowchart of a big data cluster processing method according to an embodiment of the present invention;
Fig. 4 is a fourth schematic flowchart of a big data cluster processing method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of a big data cluster processing system provided by the present invention;
Fig. 6 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention discloses a big data cluster processing method whose execution subject is a computer system. A script is uploaded to the computer system, a timed task is created on the computer system based on that script, the hosts the task applies to are selected, and the user clicks to confirm. Referring to Fig. 1, the big data cluster processing method includes the following steps:
S1: performing script definition for the hosts in a big data cluster to generate a first timed task script;
Specifically, the task operations that need to be executed on a schedule on the big data cluster or on a host are determined and written as a first timed task script in Shell or Python. The script upload interface of the task center is then called to upload the first timed task script to the task center. A script is an executable file written in a specific descriptive language according to a given format; here, the first timed task script is generated by script definition for the hosts in the big data cluster.
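As a minimal sketch of what such a first timed task script might look like, the following Python script gathers one data statistic (disk usage) on the host where it runs and prints it as JSON. The function name `collect_disk_stats`, the output fields, and the choice of disk usage as the statistic are illustrative assumptions; the source only states that the script is written in Shell or Python and performs operations such as process state collection or data statistics.

```python
#!/usr/bin/env python3
"""Hypothetical timed task script: report disk usage of this host as JSON."""
import json
import shutil
import socket
import time


def collect_disk_stats(path: str = ".") -> dict:
    """Return a small usage record for the file system that contains `path`."""
    usage = shutil.disk_usage(path)
    used_ratio = usage.used / usage.total if usage.total else 0.0
    return {
        "host": socket.gethostname(),      # which host produced the record
        "path": path,
        "used_ratio": round(used_ratio, 4),
        "collected_at": int(time.time()),  # seconds since the epoch
    }


if __name__ == "__main__":
    # Assumption: the task center captures the script's standard output.
    print(json.dumps(collect_disk_stats()))
```

Printing a single JSON line keeps the result easy to collect on the server side, although the source does not prescribe any output format.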
S2: creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
Specifically, the timed task creation interface of the task center is called, and parameters such as the first timed task script to be executed, the target big data cluster or target host, and the period at which the task is triggered are set, which completes creation of the timed task. The first task execution plan is created automatically under the big data cluster.
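The parameters listed above can be pictured as a small record. The sketch below is an in-memory stand-in for the task center's upload and creation interfaces; the class names `TaskExecutionPlan` and `TaskCenter`, all field names, and the rule that a plan targets every host of the cluster when no hosts are named are assumptions made for illustration, not the patent's actual interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TaskExecutionPlan:
    """Hypothetical record for one timed task (field names are assumptions)."""
    plan_id: str
    cluster: str
    script_name: str
    interval_seconds: int
    target_hosts: List[str]


class TaskCenter:
    """In-memory stand-in for the script upload and timed task creation interfaces."""

    def __init__(self, cluster_hosts: Dict[str, List[str]]) -> None:
        self.cluster_hosts = cluster_hosts            # cluster name -> its hosts
        self.scripts: Dict[str, str] = {}             # script name -> script body
        self.plans: Dict[str, TaskExecutionPlan] = {}

    def upload_script(self, name: str, body: str) -> None:
        self.scripts[name] = body

    def create_plan(self, plan_id: str, cluster: str, script_name: str,
                    interval_seconds: int,
                    hosts: Optional[List[str]] = None) -> TaskExecutionPlan:
        if script_name not in self.scripts:
            raise KeyError(f"unknown script: {script_name}")
        if cluster not in self.cluster_hosts:
            raise KeyError(f"unknown cluster: {cluster}")
        if interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")
        # Assumed default: target every host in the cluster.
        targets = list(hosts) if hosts is not None else list(self.cluster_hosts[cluster])
        plan = TaskExecutionPlan(plan_id, cluster, script_name, interval_seconds, targets)
        self.plans[plan_id] = plan
        return plan


center = TaskCenter({"cluster-a": ["host-1", "host-2"]})
center.upload_script("collect.sh", "#!/bin/sh\nuptime\n")
plan = center.create_plan("plan-1", "cluster-a", "collect.sh", interval_seconds=300)
```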
S3: based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule.
Specifically, the task center places the timed task in a task pool and triggers it periodically. Each time, the first timed task script is automatically pushed to the target big data cluster or target host and executed there, which completes the timed task operation. The timed triggering can be implemented with an existing timer.
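One way to picture the task pool and its periodic trigger is a priority queue ordered by next run time. The sketch below is an assumption about how such a pool could work, not the patent's implementation: the `TaskPool` class, its method names, and the injected `push` callable (standing in for the actual transfer and remote execution) are all hypothetical. Time is passed in explicitly so the behaviour is deterministic.

```python
import heapq
from typing import Callable, List, Sequence, Tuple


class TaskPool:
    """Hypothetical task pool: triggers each timed task once per interval."""

    def __init__(self, push: Callable[[str, str], None]) -> None:
        self._push = push  # push(host, script_name): stand-in for the real transfer
        self._heap: List[Tuple[float, str, float, Tuple[str, ...]]] = []

    def add(self, script_name: str, interval_seconds: float,
            hosts: Sequence[str], first_run: float) -> None:
        if interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")
        heapq.heappush(self._heap,
                       (first_run, script_name, interval_seconds, tuple(hosts)))

    def trigger_due(self, now: float) -> int:
        """Push every task whose run time has arrived; return the number of pushes."""
        pushed = 0
        while self._heap and self._heap[0][0] <= now:
            _, script_name, interval, hosts = heapq.heappop(self._heap)
            for host in hosts:
                self._push(host, script_name)
                pushed += 1
            # Reschedule relative to `now`, so a late trigger does not cause a burst.
            heapq.heappush(self._heap, (now + interval, script_name, interval, hosts))
        return pushed
```

In a running service, `trigger_due` would be called from a timer loop with the current time; that loop is left out here.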
With the big data cluster processing method provided by the invention, operation and maintenance personnel subsequently need to modify the first timed task script only once for the change to take effect on all hosts in the specified big data cluster.
The big data cluster processing method provided by the invention further includes any one or a combination of the following steps:
modifying the first timed task script and the first task execution plan respectively, and, based on the modified first task execution plan, pushing the modified first timed task script to the hosts in the big data cluster on a schedule; operation and maintenance personnel thus need to modify the timed task execution plan and the timed task script only once for the change to take effect on all hosts in the specified big data cluster;
updating the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan, and, based on the second task execution plan, pushing the second timed task script to the hosts in the big data cluster on a schedule;
and deleting the first timed task script and the first task execution plan, and pushing the content resulting from the deletion to the hosts in the big data cluster. The invention thus provides a complete timed task management flow: unified script definition, unified script distribution, timed task execution, task state monitoring, and task result collection are performed for the hosts in the big data cluster, which improves the efficiency of big data cluster operation and maintenance.
Preferably, after the first timed task script, or the modified first timed task script, is pushed to the hosts in the big data cluster on a schedule, the method includes:
monitoring the state of the first task execution plan, and reporting that state to the server side corresponding to the big data cluster;
and after the second timed task script is pushed to the hosts in the big data cluster on a schedule, the method includes:
monitoring the state of the second task execution plan, and reporting that state to the server side corresponding to the big data cluster.
Further, after the first timed task script is pushed to the hosts in the big data cluster on a schedule based on the first task execution plan, the method further includes:
collecting the task running results fed back by the hosts in the big data cluster, and reporting them to the server side corresponding to the big data cluster. This allows the task execution state to be viewed and the task results to be collected. That is, the execution results are returned to the system; the system stores each received task execution result in the log library and updates the task's execution state and next execution time in the task library.
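The bookkeeping described here, appending the result to the log library and then updating the task's state and next execution time in the task library, can be sketched as follows. The record layout, the status strings, and the convention that exit code 0 means success are assumptions; the source names only the two libraries and the two fields being updated.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskState:
    """Hypothetical task library entry."""
    interval_seconds: int
    status: str = "PENDING"
    next_run: float = 0.0


def report_result(task_library: Dict[str, TaskState], log_library: List[dict],
                  task_id: str, host: str, exit_code: int, finished_at: float) -> None:
    """Store one host's result, then refresh the task's state and next run time."""
    log_library.append({
        "task_id": task_id,
        "host": host,
        "exit_code": exit_code,
        "finished_at": finished_at,
    })
    task = task_library[task_id]
    # Assumed convention: exit code 0 means the script ran successfully.
    task.status = "SUCCESS" if exit_code == 0 else "FAILED"
    task.next_run = finished_at + task.interval_seconds
```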
To further explain the method of this embodiment, in a specific example shown in Fig. 2, the big data cluster processing method includes:
Step 2011, performing script definition for the hosts in the big data cluster to generate a first timed task script;
First, the task operations that need to be executed on a schedule on the big data cluster or on a host are determined and written as a first timed task script in Shell or Python. The script upload interface of the task center is then called to upload the first timed task script to the task center.
Step 2012, creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
The scripts uploaded by the user are stored in a script library, the scheduled tasks created by the user are stored in a task library, and the results of task execution are stored in a log library; the task list in the task library is scanned in real time. When a task in the task list reaches its preset execution time, the system parses the task, obtains the script and target host required for its execution, extracts the script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
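The scan-and-parse part of this step can be sketched as a pure function that turns due tasks into per-host dispatch jobs. The `Task` and `Dispatch` types and the function name are hypothetical, and the SFTP transfer and remote execution are deliberately left out: each returned job is simply what a transport layer would need in order to send the script to one host.

```python
from typing import Dict, List, NamedTuple


class Task(NamedTuple):
    """Hypothetical task library entry."""
    script_name: str
    target_hosts: List[str]
    run_at: float


class Dispatch(NamedTuple):
    """One script delivery that the transport layer (e.g. SFTP) would carry out."""
    host: str
    script_name: str
    script_body: str


def resolve_due_tasks(task_library: Dict[str, Task],
                      script_library: Dict[str, str],
                      now: float) -> List[Dispatch]:
    """Scan the task library and expand every due task into per-host jobs."""
    jobs: List[Dispatch] = []
    for task_id in sorted(task_library):      # deterministic scan order
        task = task_library[task_id]
        if task.run_at > now:
            continue                          # not yet at its execution time
        body = script_library[task.script_name]
        for host in task.target_hosts:
            jobs.append(Dispatch(host, task.script_name, body))
    return jobs
```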
Step 2013, based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule.
The task center places the timed task in a task pool and triggers it periodically; each time, the first timed task script is automatically pushed to the target big data cluster or target host and executed there, which completes the timed task operation.
Step 2014, collecting the task running results fed back by the hosts in the big data cluster, and reporting them to the server side corresponding to the big data cluster.
Preferably, after the first timed task script has been executed and the timed task operation is complete, the task running results fed back by the hosts in the big data cluster are collected. Each result indicates either a successful or an unsuccessful run; if the run was unsuccessful, the task is run again in the next step. The task running results are reported to the server side corresponding to the big data cluster so that the server side can obtain them.
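The success-or-rerun behaviour can be sketched as a loop that re-runs only the hosts that reported failure. The function name, the `run_on_host` callable, and in particular the `max_rounds` limit are assumptions added for illustration; the source says only that an unsuccessful run is run again.

```python
from typing import Callable, Dict, List


def run_with_retry(run_on_host: Callable[[str], bool], hosts: List[str],
                   max_rounds: int = 2) -> Dict[str, bool]:
    """Run the task on each host, re-running failed hosts for up to `max_rounds` rounds."""
    results = {host: False for host in hosts}
    pending = list(hosts)
    for _ in range(max_rounds):
        if not pending:
            break
        still_failing = []
        for host in pending:
            ok = run_on_host(host)        # True means the host reported success
            results[host] = ok
            if not ok:
                still_failing.append(host)
        pending = still_failing           # only failed hosts are run again
    return results
```

The returned mapping is what would be reported to the server side, one success flag per host.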
Step 2015, modifying the first timed task script and the first task execution plan respectively, and, based on the modified first task execution plan, pushing the modified first timed task script to the hosts in the big data cluster on a schedule;
That is, the first timed task script and the first task execution plan are each modified, and the modified first timed task script is pushed to the hosts in the big data cluster on a schedule according to the modified plan. Operation and maintenance personnel therefore need to modify the timed task execution plan and the timed task script only once for the change to take effect on all hosts in the specified big data cluster.
Step 2016, monitoring the state of the first task execution plan, and reporting that state to the server side corresponding to the big data cluster.
The state of the modified first task execution plan is monitored and reported to the server side corresponding to the big data cluster, which enables real-time monitoring and management.
That is, the scripts uploaded by the user are stored in the script library, the scheduled tasks created by the user are stored in the task library, the results of task execution are stored in the log library, and the task list in the task library is scanned in real time. When a task in the task list reaches its preset execution time, this embodiment parses the first task execution plan, obtains the first timed task script and the target host that the plan requires, extracts the first timed task script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
To further explain the method of this embodiment, in a specific example shown in Fig. 3, the big data cluster processing method includes:
Step 3011, performing script definition for the hosts in the big data cluster to generate a first timed task script;
First, the task operations that need to be executed on a schedule on the big data cluster or on a host are determined and written as a first timed task script in Shell or Python. The script upload interface of the task center is then called to upload the first timed task script to the task center.
Step 3012, creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
The scripts uploaded by the user are stored in a script library, the scheduled tasks created by the user are stored in a task library, and the results of task execution are stored in a log library; the task list in the task library is scanned in real time. When a task in the task list reaches its preset execution time, the system parses the task, obtains the script and target host required for its execution, extracts the script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
Step 3013, based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule.
The task center places the timed task in a task pool and triggers it periodically; each time, the first timed task script is automatically pushed to the target big data cluster or target host and executed there, which completes the timed task operation.
Step 3014, collecting the task running results fed back by the hosts in the big data cluster, and reporting them to the server side corresponding to the big data cluster.
Preferably, after the first timed task script has been executed and the timed task operation is complete, the task running results fed back by the hosts in the big data cluster are collected. Each result indicates either a successful or an unsuccessful run; if the run was unsuccessful, the task is run again in the next step. The task running results are reported to the server side corresponding to the big data cluster so that the server side can obtain them.
Step 3015, updating the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan, and, based on the second task execution plan, pushing the second timed task script to the hosts in the big data cluster on a schedule;
When an update is needed, only the first timed task script and the first task execution plan have to be updated; the second timed task script and the second task execution plan are generated, and the second timed task script is pushed to the hosts in the big data cluster on a schedule, so the update is carried out efficiently.
This embodiment parses the second task execution plan, obtains the second timed task script and the target host that the plan requires, extracts the second timed task script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
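Deriving a second script and a second plan from the first can be sketched with immutable records, so that the first plan remains available for comparison. The `Plan` type, the `update_task` function, and the idea of storing the second plan alongside the first are assumptions; the source does not say whether the first plan is kept after the update.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical immutable task execution plan."""
    plan_id: str
    script_name: str
    interval_seconds: int
    target_hosts: Tuple[str, ...]


def update_task(script_library: Dict[str, str], plan_library: Dict[str, Plan],
                old_plan_id: str, new_plan_id: str,
                new_script_name: str, new_body: str, **plan_changes) -> Plan:
    """Store the second script and derive the second plan from the first."""
    script_library[new_script_name] = new_body
    # `replace` copies the first plan and overrides only the named fields.
    new_plan = replace(plan_library[old_plan_id], plan_id=new_plan_id,
                       script_name=new_script_name, **plan_changes)
    plan_library[new_plan_id] = new_plan
    return new_plan
```

Only the fields passed in `plan_changes` (for example a new `interval_seconds`) differ between the two plans; the target hosts carry over unchanged.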
Step 3016, monitoring the state of the second task execution plan, and reporting that state to the server side corresponding to the big data cluster.
To further explain the method of this embodiment, in a specific example shown in Fig. 4, the big data cluster processing method includes:
Step 4011, performing script definition for the hosts in the big data cluster to generate a first timed task script;
First, the task operations that need to be executed on a schedule on the big data cluster or on a host are determined and written as a first timed task script in Shell or Python. The script upload interface of the task center is then called to upload the first timed task script to the task center.
Step 4012, creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
The scripts uploaded by the user are stored in a script library, the scheduled tasks created by the user are stored in a task library, and the results of task execution are stored in a log library; the task list in the task library is scanned in real time. When a task in the task list reaches its preset execution time, the system parses the task, obtains the script and target host required for its execution, extracts the script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
Step 4013, based on the first task execution plan, pushing the first timed task script to the hosts in the big data cluster on a schedule.
The task center places the timed task in a task pool and triggers it periodically; each time, the first timed task script is automatically pushed to the target big data cluster or target host and executed there, which completes the timed task operation.
Step 4014, collecting the task running results fed back by the hosts in the big data cluster, and reporting them to the server side corresponding to the big data cluster.
Preferably, after the first timed task script has been executed and the timed task operation is complete, the task running results fed back by the hosts in the big data cluster are collected. Each result indicates either a successful or an unsuccessful run; if the run was unsuccessful, the task is run again in the next step. The task running results are reported to the server side corresponding to the big data cluster so that the server side can obtain them.
Step 4015, deleting the first timed task script and the first task execution plan, and pushing the content resulting from the deletion to the hosts in the big data cluster.
If part of the content needs to be deleted, the first timed task script and the first task execution plan corresponding to that content can be deleted directly, and the content resulting from the deletion is pushed to the hosts in the big data cluster, which makes the operation convenient.
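The deletion flow can be sketched in two parts: removing the plan and script on the server side, then propagating the removal to each target host. Everything named here is hypothetical, including the dictionary layout, the rule that a script is kept while another plan still uses it, and the `host_files` mapping that simulates which scripts each host currently holds.

```python
from typing import Dict, List, Set, Tuple


def delete_task(script_library: Dict[str, str], plan_library: Dict[str, dict],
                plan_id: str) -> List[Tuple[str, str]]:
    """Delete a plan and its script; return (host, script_name) removals to push."""
    plan = plan_library.pop(plan_id)
    script_name = plan["script_name"]
    # Assumed rule: keep the script while any remaining plan still refers to it.
    if not any(p["script_name"] == script_name for p in plan_library.values()):
        script_library.pop(script_name, None)
    return [(host, script_name) for host in plan["target_hosts"]]


def apply_removals(host_files: Dict[str, Set[str]],
                   removals: List[Tuple[str, str]]) -> None:
    """Simulate pushing the deletion: drop the script from each listed host."""
    for host, script_name in removals:
        host_files.get(host, set()).discard(script_name)
```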
The following describes the big data cluster processing system provided by the present invention, and the big data cluster processing system described below and the big data cluster processing method described above may be referred to correspondingly.
An embodiment of the invention discloses a big data cluster processing system which, as shown in Fig. 5, includes:
a script definition module 10, configured to perform script definition for the hosts in a big data cluster and generate a first timed task script;
Specifically, the task operations that need to be executed on a schedule on the big data cluster or on a host are determined and written as a first timed task script in Shell or Python. The script upload interface of the task center is then called to upload the first timed task script to the task center.
a script distribution module 20, configured to create a first task execution plan under the big data cluster and add the first timed task script to the corresponding first task execution plan;
Specifically, the timed task creation interface of the task center is called, and parameters such as the first timed task script to be executed, the target big data cluster or target host, and the period at which the task is triggered are set, which completes creation of the timed task.
That is, the scripts uploaded by the user are stored in the script library, the scheduled tasks created by the user are stored in the task library, the results of task execution are stored in the log library, and the task list in the task library is scanned in real time. When a task in the task list reaches its preset execution time, the system parses the task, obtains the script and target host required for its execution, extracts the script from the script library, sends it to the target host over the SFTP protocol, and executes it on the target host.
and the timing pushing module 30 is configured to push the first timing task script to the host in the big data cluster at regular time based on the first task execution plan.
Specifically, the task center puts the timing task into a task pool, periodically triggers the timing task, automatically pushes the first timing task script to a target big data cluster or a target host, and executes the first timing task script to complete the timing task operation.
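The scan-and-push behaviour of the three modules above can be sketched roughly as follows. This is an assumption-laden illustration rather than the described implementation: `PlannedTask`, `scan_and_dispatch`, and the `push` callback are hypothetical names, and the SFTP transfer is replaced by the callback so that the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PlannedTask:
    name: str
    script_name: str
    hosts: List[str]
    period_s: int      # trigger period in seconds
    next_run: float    # epoch seconds of the next trigger


def scan_and_dispatch(tasks: List[PlannedTask],
                      script_library: Dict[str, str],
                      now: float,
                      push: Callable[[str, str, str], None]) -> List[str]:
    """Push the script of every due task to its hosts; return the triggered task names."""
    triggered = []
    for task in tasks:
        if task.next_run <= now:
            script = script_library[task.script_name]
            for host in task.hosts:
                # In the described system this transfer would go over SFTP.
                push(host, task.script_name, script)
            # Simplification: the next trigger time is advanced at dispatch time.
            task.next_run = now + task.period_s
            triggered.append(task.name)
    return triggered
```

Calling `scan_and_dispatch` repeatedly with the current time approximates the real-time scan of the task list; a production scheduler would also need persistence and error handling, which are omitted here.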
The big data cluster processing system provided by the invention defines a timed task script, creates a task execution plan under a specified big data cluster, adds the script to the task execution plan, pushes the script at the scheduled times to the hosts under the cluster for execution according to the plan, and reports the task execution state and result to the system server side. Because the first timed task script is pushed to the hosts under the big data cluster at regular times based on the first task execution plan, operation and maintenance personnel of the big data cluster need to modify the script only once for the change to take effect on all hosts under the specified big data cluster.
The big data cluster processing system provided by the invention comprises any one or the combination of the following components:
a modification module, configured to modify the first timed task script and the first task execution plan respectively, and to push the modified first timed task script to the hosts under the big data cluster at regular times based on the modified first task execution plan, so that operation and maintenance personnel of the big data cluster need to modify the timed task execution plan and the timed task script only once for the change to take effect on all hosts under the specified big data cluster;
an updating module, configured to update the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan, and to push the second timed task script to the hosts under the big data cluster at regular times based on the second task execution plan;
and a deleting module, configured to delete the first timed task script and the first task execution plan and to push the deleted content to the hosts under the big data cluster. The invention thus provides a complete timed task management flow: unified script definition, unified script distribution, timed task execution, task state monitoring, task result collection and similar operations are performed on the hosts in the big data cluster, which improves the efficiency of big data cluster operation and maintenance work.
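The original text does not spell out how modification differs from updating. The sketch below assumes one possible reading: modification edits the first script and plan in place, while updating derives a second script and plan and leaves the first untouched. The `Plan` class, its fields, and the version counter are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class Plan:
    script_body: str
    hosts: List[str]
    period_s: int
    version: int = 1


def modify(plan: Plan, script_body: Optional[str] = None,
           period_s: Optional[int] = None) -> None:
    """Edit the first script and plan in place; the version stays the same."""
    if script_body is not None:
        plan.script_body = script_body
    if period_s is not None:
        plan.period_s = period_s


def update(plan: Plan, script_body: str, period_s: int) -> Plan:
    """Derive a second script and plan; the first plan is left unchanged."""
    return replace(plan, script_body=script_body, period_s=period_s,
                   hosts=list(plan.hosts), version=plan.version + 1)
```

Under this reading a single call to `modify` or `update` changes what is pushed to every host listed in the plan, which matches the stated goal of making one change take effect cluster-wide.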
The big data cluster processing system provided by the invention comprises:
a state monitoring module, configured to monitor the state of the first task execution plan or the second task execution plan and to report that state to the server side corresponding to the big data cluster.
The big data cluster processing system provided by the invention comprises:
an operation result collection module, configured to collect the task running results fed back by the hosts under the big data cluster and to report them to the server side corresponding to the big data cluster. In this way the task execution state is checked and the task results are collected. That is, the execution results are returned to the system; the system stores each received task execution result in the log library and updates the execution state and the next execution time of the task in the task library.
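The bookkeeping described above, storing a result in the log library and then updating the task's state and next execution time in the task library, might look like the following sketch. `TaskState`, `record_result`, and the rule that the next execution time equals the finish time plus the period are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskState:
    period_s: int            # trigger period in seconds
    status: str = "pending"
    next_run: float = 0.0    # epoch seconds of the next execution


def record_result(log_library: List[dict], task_library: Dict[str, TaskState],
                  task_name: str, host: str, ok: bool, finished_at: float) -> None:
    """Store one execution result and refresh the task's state and next run time."""
    log_library.append({"task": task_name, "host": host, "ok": ok, "at": finished_at})
    state = task_library[task_name]
    state.status = "success" if ok else "failure"
    # Assumed rule: schedule the next execution one period after this one finished.
    state.next_run = finished_at + state.period_s
```

A list and a dictionary stand in for the log library and task library here; in practice both would presumably be backed by persistent storage.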
Fig. 6 illustrates the physical structure of an electronic device, which may include a processor 610, a communication interface 620, a memory 630 and a communication bus 640, wherein the processor 610, the communication interface 620 and the memory 630 communicate with one another via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a big data cluster processing method comprising:
S1: performing script definition for the hosts in a big data cluster to generate a first timed task script;
S2: creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
S3: based on the first task execution plan, pushing the first timed task script to the hosts under the big data cluster at regular times.
In addition, the logic instructions in the memory 630 may be implemented as software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a big data cluster processing method, the method comprising:
S1: performing script definition for the hosts in a big data cluster to generate a first timed task script;
S2: creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
S3: based on the first task execution plan, pushing the first timed task script to the hosts under the big data cluster at regular times.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, performs a big data cluster processing method, the method comprising:
S1: performing script definition for the hosts in a big data cluster to generate a first timed task script;
S2: creating a first task execution plan under the big data cluster, and adding the first timed task script to the corresponding first task execution plan;
S3: based on the first task execution plan, pushing the first timed task script to the hosts under the big data cluster at regular times.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A big data cluster processing method is characterized by comprising the following steps:
performing script definition on a host in a big data cluster to generate a first timed task script;
creating a first task execution plan under a big data cluster, and adding the first timed task script into the corresponding first task execution plan;
and based on the first task execution plan, the first timed task script is pushed to a host computer under a big data cluster in a timed mode.
2. The big data cluster processing method according to claim 1, further comprising any one or a combination of:
respectively modifying the first timed task script and the first task execution plan, and pushing the modified first timed task script to a host under a big data cluster at regular time based on the modified first task execution plan;
updating the first timed task script and the first task execution plan respectively to generate a second timed task script and a second task execution plan; based on the second task execution plan, pushing the second timed task script to a host under a big data cluster at regular time;
and deleting the first timed task script and the first task execution plan, and pushing the deleted contents to a host under the big data cluster.
3. The big data cluster processing method according to claim 2, wherein after the timed pushing of the first timed task script to the hosts under the big data cluster or the timed pushing of the modified first timed task script to the hosts under the big data cluster, the method further comprises:
monitoring the state of the first task execution plan, and reporting the state of the first task execution plan to a server side corresponding to the big data cluster;
and after the timed pushing of the second timed task script to the hosts under the big data cluster, the method comprises:
monitoring the state of the second task execution plan, and reporting the state of the second task execution plan to a server side corresponding to the big data cluster.
4. The big data cluster processing method according to claim 1, wherein after said pushing the first timed task script into the host under the big data cluster at regular time based on the first task execution plan, the method further comprises:
and collecting a task operation result fed back by the host under the big data cluster, and reporting the task operation result to a server side corresponding to the big data cluster.
5. A big data cluster processing system, comprising:
the script definition module is used for carrying out script definition on a host in the big data cluster and generating a first timed task script;
the script distribution module is used for creating a first task execution plan under the big data cluster and adding the first timed task script into the corresponding first task execution plan;
and the timing pushing module is used for pushing the first timing task script to a host under a big data cluster at regular time based on the first task execution plan.
6. The big data cluster processing system of claim 5, wherein the big data cluster processing system comprises any one or a combination of:
the modification module is used for respectively modifying the first timed task script and the first task execution plan and pushing the modified first timed task script to a host under a big data cluster at regular time based on the modified first task execution plan;
the updating module is used for respectively updating the first timing task script and the first task execution plan and generating a second timing task script and a second task execution plan; based on the second task execution plan, pushing the second timed task script to a host under a big data cluster at regular time;
and the deleting module is used for deleting the first timed task script and the first task execution plan and pushing the deleted content to the host computer under the big data cluster.
7. The big data cluster processing system of claim 6, wherein the big data cluster processing system comprises:
and the state monitoring module is used for monitoring the state of the first task execution plan or the second task execution plan and reporting the state of the first task execution plan or the second task execution plan to a server side corresponding to the big data cluster.
8. The big data cluster processing system of claim 6, wherein the big data cluster processing system comprises:
and the operation result collection module is used for collecting the task operation result fed back by the host under the big data cluster and reporting the task operation result to the server side corresponding to the big data cluster.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the big data cluster processing method according to any of claims 1 to 4 when executing the program.
10. A non-transitory computer readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the big data cluster processing method according to any of claims 1 to 4.
CN202110270156.1A 2021-03-12 2021-03-12 Big data cluster processing method and system, electronic device and storage medium Pending CN113722057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110270156.1A CN113722057A (en) 2021-03-12 2021-03-12 Big data cluster processing method and system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN113722057A true CN113722057A (en) 2021-11-30

Family

ID=78672561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110270156.1A Pending CN113722057A (en) 2021-03-12 2021-03-12 Big data cluster processing method and system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113722057A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6047260A (en) * 1997-06-05 2000-04-04 Attention Control Systems, Inc. Intelligent planning and calendaring system with cueing feature and floating tasks
WO2016127756A1 (en) * 2015-02-15 2016-08-18 北京京东尚科信息技术有限公司 Flexible deployment method for cluster and management system
US9460159B1 (en) * 2013-08-14 2016-10-04 Google Inc. Detecting visibility of a content item using tasks triggered by a timer
US20170139680A1 (en) * 2015-11-18 2017-05-18 Mastercard International Incorporated Systems, methods, and media for graphical task creation
CN108762911A (en) * 2018-06-13 2018-11-06 平安科技(深圳)有限公司 Timing task management method, apparatus, computer equipment and storage medium
CN109254765A (en) * 2018-08-22 2019-01-22 平安科技(深圳)有限公司 Timing task management method, apparatus, computer equipment and storage medium
CN109800080A (en) * 2018-12-14 2019-05-24 深圳壹账通智能科技有限公司 A kind of method for scheduling task based on Quartz frame, system and terminal device

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information
Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176
Applicant after: Jingdong Technology Holding Co.,Ltd.
Address before: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176
Applicant before: Jingdong Digital Technology Holding Co.,Ltd.
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20211130