CN107172149A - Big data instant scheduling method - Google Patents
- Publication number: CN107172149A
- Application number: CN201710344085.9A
- Authority
- CN
- China
- Prior art keywords
- task
- load
- node
- host node
- burst
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H04L67/1001 — Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L41/06 — Management of faults, events, alarms or notifications
- H04L41/0663 — Performing the actions predefined by failover planning, e.g. switching to standby network elements
- H04L41/0668 — Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
- H04L67/1008 — Server selection for load balancing based on parameters of servers, e.g. available memory or workload
- H04L67/61 — Scheduling or organising the servicing of application requests taking into account QoS or priority requirements
- H04L67/63 — Routing a service request depending on the request content or context
Abstract
The invention provides a big data instant scheduling method, comprising: a master node divides a job into independent tasks, and load nodes apply to the master node for tasks and execute them; the master node performs first-level sharding on the job and transmits to the load node the execution information of all second-level shards within a first-level shard; the load node memory-maps the shard, performs secondary sharding, and places the shards into a thread pool; each load-node thread performs a reduce after completing a second-level task shard and sends the result to the master node, and the master node performs a reduce operation on the results sent by the load nodes; a master process, allocated to satisfy the host list file, distributes tasks to load nodes through map processes and obtains the corresponding task results from the load nodes through reduce processes; the intermediate results are merged to obtain the final result, which the master node feeds back to the corresponding task submitter over a socket connection. The invention thus proposes a big data instant scheduling method that improves the efficiency of cloud computing and meets the needs of high-performance cloud computing.
Description
Technical field
The present invention relates to big data, and more particularly to a big data instant scheduling method.
Background art
Cloud computing distributes huge data storage and computation over the internet to the computers of a cluster system and provides corresponding application services. When a user submits an access request for a resource, the system automatically routes the request to the computers and storage systems that actually hold the resource. Cloud computing platforms built on virtualization technology have achieved gratifying results in mass data processing. However, cloud computing distributes mass data over large-scale clusters for parallel processing, and because the bottom layer of current mainstream cloud platforms uses virtualization, all software and applications run on virtual hardware; this strategy inevitably reduces performance to some extent. Moreover, the internal mechanism of MapReduce is to store data first and read it back for processing; when the volume and number of intermediate data grow, this pattern inevitably causes a large number of useless disk I/O operations. If the data is remote, the network load increases; if the data is local, execution is limited by the I/O bottleneck. Either way, task execution efficiency is reduced.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a big data instant scheduling method, comprising:
The master node divides a job into independent tasks and maintains a task state array. After a load node starts, it applies to the master node for a task; having obtained one, it executes it according to a predetermined scheme. When execution completes, the load node feeds the task result back to the master node and applies for the next task. If the task information obtained by a load node is invalid, this parallel computation job has completed successfully and the load node terminates. The job of highest priority is taken from the task queue. The master node shards the job to obtain first-level shards and places them in the load unit. The master node queries the state table for the information of a first-level shard, then retrieves the execution information of all second-level shards within it and transmits it to the load node. On receiving the task, the load node memory-maps the shard and performs secondary sharding, detects the total core count, creates a thread pool, and places the shards into the pool; threads obtain second-level task shards through the load-internal state table until all entries in the table have been executed. Each load-node thread performs a reduce after completing a second-level task shard and sends the reduce result and the internal state table to the master node; the master node decides whether to send another shard to this load node, and if a new task needs to be sent, it continues to take tasks and send them. Once all entries in the state table have been executed, the master node sends an exit instruction to each load node; the load node triggers an event signal to notify monitoring that it is waiting to exit, and all load nodes and the master node exit together. The master node performs a reduce operation on the results sent by the load nodes. The master node analyzes the incoming command-line parameters to obtain the information of this parallel-interface job; according to the host list file, it dynamically allocates multiple load nodes and forms a communication domain, then waits for load nodes to apply for tasks or submit task results. Then, based on the dynamic process creation and management model of the parallel computation interface, the master process — allocated to satisfy the host list file — distributes tasks to load nodes through map processes and obtains the corresponding task results from the load nodes through reduce processes. These intermediate results are merged to obtain the final result of the parallel computation job, which is finally fed back to the task submitter. Once the master node detects that a load node is abnormal, it produces a new host list file and allocates a new load node to take over the abnormal node's task. If the master node finds that all entries of a task's state array are complete, the master node feeds the final job result back to the corresponding task submitter over a socket connection.
Compared with the prior art, the present invention has the following advantage: it proposes a big data instant scheduling method that improves the efficiency of cloud computing and meets the needs of high-performance cloud computing.
Brief description of the drawings
Fig. 1 is a flowchart of the big data instant scheduling method according to an embodiment of the present invention.
Detailed description
A detailed description of one or more embodiments of the invention is provided below, together with the accompanying drawing illustrating the principles of the invention. The invention is described in connection with such embodiments, but it is not limited to any embodiment; its scope is limited only by the claims, and it covers many alternatives, modifications, and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for exemplary purposes, and the invention may be practiced according to the claims without some or all of them.
One aspect of the present invention provides a big data instant scheduling method. Fig. 1 is a flowchart of the big data instant scheduling method according to an embodiment of the present invention.
The high-performance cloud computing platform designed by the present invention builds the platform bottom layer directly from heterogeneous computing nodes, without virtualization. MapReduce is rewritten using a parallel computation interface technique with added multi-level fault tolerance and multithreading, so that a large number of useless I/O operations are avoided during computation, thereby improving the efficiency of cloud computing to meet the needs of high-performance cloud computing. In a heterogeneous node environment, based on HADOOP, the cloud infrastructure service layer is built from heterogeneous hardware; job rescheduling, job/task rollback, and dynamic migration are implemented, and the MapReduce framework is built on the multi-level fault-tolerant parallel computation interface. Intermediate results are processed directly, reducing unnecessary I/O operations.
The HADOOP-based cloud computing platform architecture of the present invention comprises heterogeneous computing nodes, a fault-tolerant unit based on the parallel computation interface, a monitoring module, a job management module, a task management module, a distributed database, and the MapReduce computing framework. The job management module preserves the job queue, manages job scheduling, provides fault tolerance for the execution of monitored jobs, and supports remote job submission and result return. The monitoring module manages the list of available hosts, detects abnormal nodes, and sorts nodes by load, so that the least-loaded node is selected first to execute tasks. The task management module handles task division and allocation scheduling, task execution, and the collection and return of results.
The job management module contains a job communication submodule that, according to user interaction, implements job submission and result feedback; the job management submodule handles the management and scheduling of jobs, maintains the job queue, and schedules jobs according to priority. The monitoring module monitors the running state and load level of all nodes, and accordingly provides the host list file required by the task execution module.
The task management module is the concrete execution module for a job: it shards the job, schedules the shards rationally, and finally collects and feeds back the computation results. After obtaining a job distributed by the job management and scheduling submodule, the task management module divides it into multiple tasks according to a set policy, and interacts with the monitoring module to generate the host list file required to schedule and execute the parallel computation task. Then, based on the dynamic process creation and management model of the parallel computation interface, the master process — allocated to satisfy the host list file — distributes tasks to load nodes through map processes and obtains the corresponding task results from the load nodes through reduce processes. Merging these intermediate results yields the final result of the parallel computation job, which is finally fed back to the task submitter. Once the master node detects that a load node is abnormal, it interacts with the monitoring module to produce a new host list file, and allocates a new load node to take over the abnormal node's task.
After the job communication submodule is initialized, it binds, according to user settings, the local socket server address and the remote socket server address of the job management submodule, and creates two worker threads. The wait thread cyclically waits to receive the job result fed back from the task management module. The send thread, once the user has submitted a new job, transmits the user's job by socket to the server address of the job management submodule.
The job management submodule waits for the job communication submodule to submit jobs to it. In its worker thread, two further worker threads are created. The parse thread maintains the local socket server, obtains the jobs submitted remotely by the job communication submodule, parses the information, and confirms whether it satisfies the rules; a qualified job is bound to the submitter's address and placed into the multi-priority job queue to await scheduling. The scan thread cyclically scans the multi-priority job queue to determine whether a higher-priority job is present; if so, it takes the job out, constructs a command line from it, and starts the processes of multiple task execution modules to complete the parallel computation job; it then cyclically waits for the job to finish executing. If the job did not complete normally, an exception is judged to have occurred, and the job is rescheduled for execution.
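The multi-priority job queue the scan thread walks can be modeled with a heap. This is a sketch under the assumption that higher numeric priority wins and that jobs of equal priority are served first-in first-out; the class name is illustrative.

```python
import heapq
import itertools

class MultiPriorityJobQueue:
    """Jobs ordered by priority (higher first), FIFO within a priority."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserving insertion order

    def put(self, priority: int, job: str) -> None:
        # Negate priority so the largest priority pops first from the min-heap.
        heapq.heappush(self._heap, (-priority, next(self._seq), job))

    def take_highest(self):
        """What the scan thread does: remove and return the top job, if any."""
        if not self._heap:
            return None
        _, _, job = heapq.heappop(self._heap)
        return job

q = MultiPriorityJobQueue()
q.put(1, "low-priority analytics job")
q.put(5, "urgent wordcount job")
q.put(5, "second urgent job")
```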
The master node analyzes the incoming command-line parameters to obtain the information of this parallel-interface job. According to the host list file generated by the monitoring module, it dynamically allocates multiple load nodes and forms a communication domain, then waits for load nodes to apply for tasks or submit task results.
The master node divides the job into independent tasks and maintains a task state array whose entries are initialized to 0, to record task execution status. When a load node applies for a task, the master node first searches the state array for an item that is 0, obtains the corresponding task, assigns it to the applicant, and sets the corresponding item to 1. When a task completes, the corresponding item is set to 2. If the process executing a task becomes abnormal, the corresponding item is reset to 0, so that the task can be reassigned to another node. If an item marked 1 in the state array receives no feedback result within a prescribed time, the process executing that task is judged abnormal, and the corresponding item is reset to 0, so that the task can be reassigned to another node.
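The 0/1/2 state array and its timeout reset can be sketched directly. This is a minimal single-threaded model of the bookkeeping described above; names and the timeout value are illustrative, and the real master would guard the array against concurrent access.

```python
import time

UNASSIGNED, RUNNING, DONE = 0, 1, 2

class TaskStateArray:
    """Master-side task state: 0=unassigned, 1=running, 2=done."""
    def __init__(self, n_tasks: int, timeout: float = 60.0):
        self.state = [UNASSIGNED] * n_tasks
        self.started = [0.0] * n_tasks
        self.timeout = timeout

    def assign(self):
        """Hand the first unassigned task to an applying load node."""
        for i, s in enumerate(self.state):
            if s == UNASSIGNED:
                self.state[i] = RUNNING
                self.started[i] = time.monotonic()
                return i
        return None  # nothing left: the applicant gets invalid task info

    def complete(self, i: int) -> None:
        self.state[i] = DONE

    def fail(self, i: int) -> None:
        """Executing process reported abnormal: make the task reassignable."""
        self.state[i] = UNASSIGNED

    def reap_timeouts(self) -> None:
        """Reset running tasks that gave no feedback within the prescribed time."""
        now = time.monotonic()
        for i, s in enumerate(self.state):
            if s == RUNNING and now - self.started[i] > self.timeout:
                self.state[i] = UNASSIGNED

    def job_done(self) -> bool:
        return all(s == DONE for s in self.state)

arr = TaskStateArray(3)
first, second = arr.assign(), arr.assign()
arr.complete(first)   # task 0 finished normally -> 2
arr.fail(second)      # task 1's process died   -> back to 0
```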
If the master node finds that all items of a task's state array are 2, the parallel computation job has completed, and the master node feeds the final job result back to the corresponding task submitter over a socket connection.
After a load node starts, it checks whether its parent process exists; if not, it refuses to run; if it does, it judges that it was started by the master node. The load node applies to the master node for a task and, having obtained one, executes it according to the predetermined scheme; when execution completes, it feeds the result back to the master node and applies for the next task. If the task information obtained by the load node is invalid, the parallel computation job has completed successfully, and the load node terminates.
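The load-node lifecycle above — refuse without a parent, then apply/execute/feed back until the task info is invalid — fits in a short loop. This is a sketch with in-process stand-ins for the master node; all callables and the squaring "predetermined scheme" are invented for the demo.

```python
def load_node_loop(apply_task, execute, feedback, parent_alive):
    """Load-node lifecycle: refuse to run without a parent (master),
    then apply/execute/feed back until the task info comes back invalid."""
    if not parent_alive():
        return "refused"          # not started by the master node
    completed = 0
    while True:
        task = apply_task()
        if task is None:          # invalid task info: job finished successfully
            return completed
        feedback(task, execute(task))
        completed += 1

# Demo with in-process stand-ins for the master node.
pending = [3, 1, 2]
results = {}
count = load_node_loop(
    apply_task=lambda: pending.pop(0) if pending else None,
    execute=lambda t: t * t,       # placeholder "predetermined scheme"
    feedback=lambda t, r: results.__setitem__(t, r),
    parent_alive=lambda: True,
)
```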
The high-performance cloud computing platform proposed by the present invention includes a three-level fault-tolerance strategy. The first level is job rescheduling: when the system detects that a node in the cluster is abnormal while executing a task, it immediately reschedules the task for execution. The second level is job/task rollback: the system rolls the job/task back to the most recent checkpoint and resumes execution there. If the master node is abnormal, the parallel computation job fails; the system reschedules it, rolls its execution state back to the most recent checkpoint, and continues executing the job. If a load node is abnormal, the system attempts to restart the abnormal node, rolls its task execution state back to the most recent checkpoint, and continues executing the task. The third level is dynamic migration: when an abnormal load node cannot be recovered by rollback in a short time — that is, when the second-level fault tolerance fails — the system actively migrates the tasks of the abnormal working node to other working nodes for execution. To guarantee the stability of the parallel-interface cluster's computing capability, the computing platform dynamically allocates new load processes to substitute for abnormal ones.
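The escalation between the three levels can be written as a small decision function. This is only an illustration of the ordering; the callables are a hypothetical API, not the patent's interfaces.

```python
def handle_node_failure(node, reschedule, rollback, migrate):
    """Escalate through the three fault-tolerance levels in order.
    Each argument is a callable returning True on success (hypothetical API)."""
    if reschedule(node):      # level 1: immediately reschedule the task
        return "rescheduled"
    if rollback(node):        # level 2: roll back to the most recent checkpoint
        return "rolled-back"
    migrate(node)             # level 3: migrate tasks to other working nodes
    return "migrated"

# Demo: reschedule and rollback fail for this node, so its work is migrated.
outcome = handle_node_failure(
    "node-7",
    reschedule=lambda n: False,
    rollback=lambda n: False,
    migrate=lambda n: True,
)
```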
In the present system the monitoring module generates the list of available hosts: the number of processes currently executing tasks is compared with the number of CPU cores; if the process count is less than the core count, the host name is added to the host list; if the process count is not less than the core count, the host name is removed from the host list. The node hosting process 0 of the monitoring program is the monitoring server; nodes hosting non-zero processes are task nodes. The concrete steps of process-model monitoring are:
1) In each non-zero process, a kernel object — an event signal — is created. The process opens this event signal when executing a task and triggers it when the computation completes. This signal is used to learn whether a process is waiting to exit.
2) The host name is obtained, the number M of processes currently executing tasks and the number K of processes that have finished executing and are waiting to exit are computed, and the available-host list file is built.
3) If the number M of processes executing tasks equals the number K of processes that have finished executing and are waiting to exit, the host name is written to the host list file.
There are two strategies for updating the host list: timed updates and updates before task scheduling. For timed updates, the monitoring program keeps running continuously, and the scheduler and monitor do not interact. For before-task updates, the monitoring program runs before tasks are divided: it starts automatically before task execution, updates the host list, and then exits; the master process then starts a group of processes according to the available host list to execute the tasks. If a process task fails midway, the monitoring program is automatically restarted to update the list, and the master process starts a corresponding number of processes according to the available host list to complete the failed process's tasks.
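The host-list test above reduces to comparing each host's M (processes executing tasks) against its core count, with the M == K case from step 3 also qualifying the host. A minimal sketch, assuming per-host statistics are already collected; all names are illustrative.

```python
def available_hosts(stats, core_counts):
    """Build the available-host list.

    stats: {host: (m, k)} where m = processes executing tasks and
           k = processes finished and waiting to exit.
    A host qualifies when m is below its core count, or when m == k
    (every running process has already finished, per step 3 above).
    """
    hosts = []
    for host, (m, k) in stats.items():
        if m < core_counts[host] or m == k:
            hosts.append(host)
    return sorted(hosts)

stats = {"host-a": (2, 0), "host-b": (8, 8), "host-c": (8, 3)}
cores = {"host-a": 8, "host-b": 8, "host-c": 8}
```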
The task scheduling of the present invention is based on job sharding. After a job is taken from the job queue, first-level sharding is performed on it; first-level shards are assigned to nodes, secondary sharding is then performed on the node, and second-level shards are assigned to threads. Tasks are distributed using the load-unit-plus-thread-pool method. For map operations, first-level shards are distributed according to the idle-load-node application principle: nodes with strong computing capability apply for more shards, and nodes with weak computing capability complete fewer shards. After a load node has executed its map tasks, it directly performs the first reduce with the map results as the reduce input, then sends the result to the master node for the second reduce, which obtains the final result.
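The node-side half of this — secondary sharding, a thread pool sized to the cores, map over the pieces, then the first reduce locally — can be sketched with `concurrent.futures`. The function names and the word-count demo are illustrative, not from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def run_first_level_shard(shard, map_fn, reduce_fn, workers=None):
    """Load-node sketch: split a first-level shard into second-level
    shards, map each on a thread pool, then do the first reduce locally
    before the partial result goes to the master for the second reduce."""
    workers = workers or os.cpu_count() or 1
    # Secondary sharding: cut the shard into roughly worker-sized pieces.
    size = max(1, len(shard) // workers)
    pieces = [shard[i:i + size] for i in range(0, len(shard), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, pieces))
    return reduce_fn(mapped)    # first reduce, on the load node

# Word-count flavoured demo: map counts words per piece, reduce sums them.
shard = ["a b a", "b b", "a"]
partial = run_first_level_shard(
    shard,
    map_fn=lambda piece: sum(len(line.split()) for line in piece),
    reduce_fn=sum,
    workers=2,
)
```

The master node would then apply `reduce_fn` once more over the partials received from all load nodes — the second reduce described above.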
A load node applies to the load unit for a task; the master node queries the first-level shard table, and if it finds a shard whose state is not-yet-executed, it sends this shard's information to the load process. When the master node executes the fault-tolerance strategy of dynamic task migration, it sends the positions of the start and end points of the shard task, and transmits the execution-information table of all second-level shards within the first-level shard to be migrated to the load node together. If the job is being scheduled for the first time, the master node places all first-level shards into the load unit for scheduling; if not, the master node selects for scheduling the shards whose state is unfinished. During scheduling, the master node continuously updates the contents of the execution state table.
According to the load-internal state table sent by a load node, if the master node detects that the shard currently on the load node has reached a certain progress but is not complete, it sends it another shard task. If the master node detects that all the contents of the execution state table are complete, the whole task is finished; it sends the wait-to-exit signal to the load nodes, and the job is done. On the load-node side, after a first-level shard is sent to the load node, it first memory-maps the first-level shard, then detects the total core count of all processors in the node, opens a corresponding number of threads, and starts the thread pool to execute tasks. Within the load node, the first-level shard is represented by the load-internal state table. The thread pool likewise takes not-yet-executed shards from the load-internal state table and executes map tasks, then performs reduce tasks with the map results as the reduce inputs. During execution, each time a second-level shard completes, the content of the load-internal state table is updated, and the execution result and the modification information of the load-internal state table are sent together to the master node; the master node then updates the execution state table of all shards according to the load-internal state table. When all execution states show completion, all shards have finished executing.
The flow of a job from submission to obtaining the result is as follows. The job communication submodule submits a job with a priority; the job management submodule takes the job of highest priority from the task queue. The master node shards the job to obtain first-level shards and places them in the load unit. The master node obtains the information of a first-level shard by querying the state table, then retrieves the execution information of all second-level shards within it and transmits it to the load node. On receiving the task, the load node memory-maps the shard, performs secondary sharding, detects the total core count, creates a thread pool, and places the shards into it; threads obtain second-level task shards through the load-internal state table until everything in the table has been executed, whereupon the threads exit and the thread pool is destroyed. Each load-node thread performs a reduce after completing a second-level task shard and sends the reduce result and the internal state table to the master node; the master node decides whether to send this load node another shard, and if a new task needs to be sent, it continues taking tasks and sending them. Once all the content of the state table has been executed, the master node sends an exit instruction to each load node; the load node triggers an event signal to notify monitoring that it is waiting to exit, and all load nodes and the master node exit together. The master node performs a reduce operation on the results sent by the load nodes and sends the result to the submitter's job communication submodule.
In the node data caching policy, load nodes deployed in the cloud platform cache the jobs of the node subnet within the platform. A local index server is deployed in each subnet; it stores the jobs shared in its cloud platform, and the node list corresponding to each job. Load nodes register the jobs they cache with the local index server in the identity of a resource provider. The local index server organizes online nodes into the cloud platform subnet, helping subnet users to find load nodes conveniently. The management server periodically collects job request information from the local index server of each cloud platform; after each run of the cache algorithm, the management server generates two cache-object lists for each cloud platform and sends the two job lists respectively to the two load nodes deployed in that platform. Each node also periodically reports the current state of the jobs it stores to the management server. A load node updates its own process space according to the job list received from the management server; once a new cache object has been synchronized, the load node computes its content hash value and registers the job with the local index server of its cloud platform.
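The content hash a load node computes before registering a cache object can be sketched as follows. The patent only says "hash value"; SHA-256 is an assumption here, and the function name is illustrative.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Content hash of a newly synchronized cache object, computed before
    the load node registers it with the local index server.
    (SHA-256 is an assumption; the patent does not name the algorithm.)"""
    return hashlib.sha256(data).hexdigest()

job_bytes = b"cached job payload"
digest = content_hash(job_bytes)
```

The digest doubles as the "unique hash identifier" that the local index server later reports to the management server for each active job.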
When a user of some cloud platform starts a job, the node joins both the global node subnet and the cloud platform subnet, and reports the running status of each job to the local index server. The local index server monitors the cloud platform subnet and sends the relevant information of each active job in the platform to the management server, including the job's unique hash identifier, the job size, the number of accessing users, the number of copies currently available in the platform, and the number of subnet users who most recently requested the job. According to the information collected from the local index servers, the management server periodically executes a cache algorithm to derive the jobs that the load nodes in each cloud platform need to update or delete, and immediately sends the resulting job schedule to each load node. After a job has been synchronized, the load node immediately registers it with the local index server of the cloud platform; the load node then uploads it to provide data for users who request the job locally.
Further, job slices to be distributed are actively pushed to nodes with free bandwidth by the following procedure. The management server's decision unit determines, from the collected information, which jobs need to be sent, slices them, sends the large job slices to auxiliary nodes, and stores the jobs in the memory allocated to users. For each chosen target node a job is added that runs in the background; it executes independently alongside the user's own small jobs, enters a resource-carrying state, and shares the upload bandwidth. When the user leaves the system, the background job ends. The management server selects the target nodes and notifies each selected node to store one slice of the target job; a node that receives a send request fetches the specified job slice from the cloud platform and the node subnet. Once the sent job slice has been stored, the target node provides uploads to other nodes as a storage node, acting as an auxiliary node. When the distribution procedure ends, the management server sends a message to all nodes that received the job, notifying their memory managers to remove the previously sent job slices from the user's process space.
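The push-based distribution step above can be sketched as follows. The selection of target nodes by free bandwidth is assumed to have happened already; the round-robin assignment, slice size, and all names are illustrative assumptions.

```python
def distribute(job_bytes: bytes, targets, slice_size=4):
    """Slice a job and assign slices to the selected target nodes.
    Round-robin assignment is an assumed placement; the patent only
    requires one slice per notified node."""
    slices = [job_bytes[i:i + slice_size]
              for i in range(0, len(job_bytes), slice_size)]
    plan = {}
    for i, s in enumerate(slices):
        node = targets[i % len(targets)]  # rotate over free-bandwidth nodes
        plan.setdefault(node, []).append(s)
    return plan

def cleanup_messages(plan):
    # When distribution ends, the management server tells every node that
    # received the job to remove the previously sent slices.
    return [(node, "remove_slices", len(s)) for node, s in plan.items()]

plan = distribute(b"abcdefgh", ["n1", "n2"])
msgs = cleanup_messages(plan)
```

The cleanup pass mirrors the final notification in the text: once a node has served its purpose as an auxiliary store, the slices are evicted from the user's process space.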
In summary, the present invention proposes a big data instant scheduling method that improves the efficiency of cloud computing to meet the needs of high-performance cloud computing.
Obviously, those skilled in the art should appreciate that each module or step of the present invention described above can be implemented with a general-purpose computing system; they can be concentrated in a single computing system or distributed over a network formed by multiple computing systems. Alternatively, they can be implemented with program code executable by a computing system, so that they can be stored in a storage system and executed by the computing system. Thus, the present invention is not restricted to any specific combination of hardware and software.
It should be understood that the above embodiments of the present invention are intended only to exemplify or explain the principles of the present invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent substitution, improvement, and the like made without departing from the spirit and scope of the present invention shall be included within the scope of protection. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundary of the claims, or the equivalents of such scope and boundary.
Claims (1)
1. a kind of big data instant scheduling method, it is characterised in that including:
Operation is divided into independent task, and maintenance task state array by host node;After load node starts, to host node application
Task, is obtained after task by predetermined scheme execution task;After the completion of tasks carrying, the result of task is fed back into host node simultaneously
Apply for next task;If the mission bit stream that load node is obtained is invalid, this parallel computation Mission Success is completed, load
Node winding-up;The operation of limit priority is taken out from task queue;Host node carries out burst to operation, obtains a fraction
Piece, is put into load cell;Host node inquiry state table obtains the information of one-level burst, then gets all two fractions in this burst
The execution information of piece is transmitted to load;Load-receipt does internal memory mapping and secondary burst to this piece, detected always to task
Core number, sets up thread pool, and burst then is put into thread pool, and thread obtains second task point by loading internal state table
Piece, completion is performed until content is all in table;It is laggard that each thread of load node performs one second task burst of completion
Reduce of row, host node is sent to by reduce results and internal state table, and host node determines whether to send to this load
Another burst, if desired sends new task, then continues to take task to be sent to load;Until content is all in state table
After the completion of execution, host node sends exit instruction to each load, and load triggers event signal notifies to be retired, the institutes such as monitoring
There are load and host node to wait to exit together;Host node will load the result progress reduce operations sent;Host node is analyzed
Incoming command line parameter is to obtain the relevant information of this parallel interface operation;According to Host List file, dynamically distributes are more
Individual load node simultaneously constitutes a communication domain, then waits load node application task or submits task result;It is then based on simultaneously
The dynamic process that row calculates interface is created and administrative model, and the load process host node that distribution meets Host List documentation requirements will
Task is distributed to load node by map processes, and obtains corresponding task result from load node by reduce processes;Return
And these intermediate results, the final result of this parallel computation operation is obtained, finally this result distinct feed-back is carried to task
Friendship person;Once host node, which detects some load node, occurs exception, then new Host List file is produced, and distribute newly
Load node performs task to take over abnormal load node;If host node finds all in the corresponding state array of task
Item is all to have completed, then host node submits final job result by socket communication distinct feed-back to corresponding task
Person.
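As an illustrative aid (not part of the claim), the two-level slicing and reduce flow described above can be sketched in a single process. The summing task, slice sizes, and function names are assumptions; a real deployment would place the host and load roles on separate nodes communicating over sockets.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def run_job(data, first_slice=8, second_slice=2):
    """Sketch of the claimed flow: the host slices the job into first-level
    slices and tracks them in a state array; a load sub-slices each into
    second-level slices, runs them in a thread pool sized by core count,
    reduces per slice, and the host reduces the partial results."""
    # Host node: first-level slicing, one state-array entry per slice.
    first = [data[i:i + first_slice] for i in range(0, len(data), first_slice)]
    state = ["pending"] * len(first)

    def load_node(slice_):
        # Load node: second-level slicing, executed by a thread pool.
        seconds = [slice_[i:i + second_slice]
                   for i in range(0, len(slice_), second_slice)]
        with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
            partials = list(pool.map(sum, seconds))
        return sum(partials)  # per-load reduce

    results = []
    for i, s in enumerate(first):
        results.append(load_node(s))  # load feeds its result back to the host
        state[i] = "done"

    assert all(st == "done" for st in state)  # host checks the state array
    return sum(results)  # host-side final reduce

total = run_job(list(range(10)))
```

The state array is what makes the claimed fault handling possible: if a load node fails, its slice's entry stays `pending` and can be reassigned to a replacement node.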
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710344085.9A CN107172149A (en) | 2017-05-16 | 2017-05-16 | Big data instant scheduling method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107172149A true CN107172149A (en) | 2017-09-15 |
Family
ID=59816776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710344085.9A Pending CN107172149A (en) | 2017-05-16 | 2017-05-16 | Big data instant scheduling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107172149A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080086442A1 (en) * | 2006-10-05 | 2008-04-10 | Yahoo! Inc. | Mapreduce for distributed database processing |
CN101706757A (en) * | 2009-09-21 | 2010-05-12 | 中国科学院计算技术研究所 | I/O system and working method facing multi-core platform and distributed virtualization environment |
CN101996079A (en) * | 2010-11-24 | 2011-03-30 | 南京财经大学 | MapReduce programming framework operation method based on pipeline communication |
CN105279241A (en) * | 2015-09-29 | 2016-01-27 | 成都四象联创科技有限公司 | Cloud computing based big data processing method |
Non-Patent Citations (1)
Title |
---|
Guo Yucheng: "Research on Key Technologies of an MPI High-Performance Cloud Computing Platform", China Doctoral Dissertations Full-text Database (Electronic Journal), Information Science and Technology series * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109144693A (en) * | 2018-08-06 | 2019-01-04 | 上海海洋大学 | A kind of power adaptive method for scheduling task and system |
CN109144693B (en) * | 2018-08-06 | 2020-06-23 | 上海海洋大学 | Power self-adaptive task scheduling method and system |
CN111427670A (en) * | 2019-01-09 | 2020-07-17 | 北京京东尚科信息技术有限公司 | Task scheduling method and system |
CN111857538A (en) * | 2019-04-25 | 2020-10-30 | 北京沃东天骏信息技术有限公司 | Data processing method, device and storage medium |
CN111708812A (en) * | 2020-05-29 | 2020-09-25 | 北京赛博云睿智能科技有限公司 | Distributed data processing method |
CN112637067A (en) * | 2020-12-28 | 2021-04-09 | 北京明略软件系统有限公司 | Graph parallel computing system and method based on analog network broadcast |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10635664B2 (en) | Map-reduce job virtualization | |
CN107168799A (en) | Data-optimized processing method based on cloud computing framework | |
EP2810164B1 (en) | Managing partitions in a scalable environment | |
US20160275123A1 (en) | Pipeline execution of multiple map-reduce jobs | |
US7185096B2 (en) | System and method for cluster-sensitive sticky load balancing | |
CN107172149A (en) | Big data instant scheduling method | |
US8671134B2 (en) | Method and system for data distribution in high performance computing cluster | |
JP5902716B2 (en) | Large-scale storage system | |
US20050132379A1 (en) | Method, system and software for allocating information handling system resources in response to high availability cluster fail-over events | |
US8332862B2 (en) | Scheduling ready tasks by generating network flow graph using information receive from root task having affinities between ready task and computers for execution | |
US7680848B2 (en) | Reliable and scalable multi-tenant asynchronous processing | |
US10505791B2 (en) | System and method to handle events using historical data in serverless systems | |
US8959222B2 (en) | Load balancing system for workload groups | |
US20080229320A1 (en) | Method, an apparatus and a system for controlling of parallel execution of services | |
Balasangameshwara et al. | Performance-driven load balancing with a primary-backup approach for computational grids with low communication cost and replication cost | |
US8082344B2 (en) | Transaction manager virtualization | |
CN108536532A (en) | A kind of batch tasks processing method and system | |
EP4068725B1 (en) | Topology-based load balancing for task allocation | |
CN101715001A (en) | Method for controlling execution of grid task | |
Han et al. | EdgeTuner: Fast scheduling algorithm tuning for dynamic edge-cloud workloads and resources | |
CN109614227A (en) | Task resource allocation method, apparatus, electronic device, and computer-readable medium | |
CN111767145A (en) | Container scheduling system, method, device and equipment | |
US20230333880A1 (en) | Method and system for dynamic selection of policy priorities for provisioning an application in a distributed multi-tiered computing environment | |
US12236267B2 (en) | Method and system for performing domain level scheduling of an application in a distributed multi-tiered computing environment using reinforcement learning | |
CN107169099A (en) | Data processing method based on HADOOP |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170915 |
RJ01 | Rejection of invention patent application after publication |