Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The implementation of the present invention will be described in detail below with reference to specific embodiments.
Referring to FIG. 1, a preferred embodiment of the present invention is provided.
In a first aspect, the present invention provides a task allocation optimization method for a multi-core heterogeneous ASIC computing motherboard, including:
S1, acquiring calculation core setting information of a calculation main board, and carrying out digital feedback processing of a calculation core on the calculation main board according to the calculation core setting information so as to obtain a main board calculation performance model;
S2, acquiring an executable task set of the computing main board, and analyzing task computing requirements of each executable task in the executable task set to obtain a computing requirement set corresponding to the executable task set;
S3, performing performance scheduling simulation analysis on the calculation demand set according to the main board calculation performance model through an optimization algorithm to obtain a task allocation plan of the main board calculation performance model corresponding to the calculation demand set;
S4, acquiring a current execution task of the computing main board, carrying out matching identification on the current execution task according to the executable task set, and generating task execution characteristics of the current execution task according to a matching identification result;
And S5, performing task allocation on the task execution characteristics according to the task allocation plan to obtain a task allocation mode of the current execution task, and performing computing resource scheduling on the computing main board according to the task allocation mode to perform task allocation processing on the current execution task.
Specifically, in step S1 of the embodiment provided by the present invention, the computing core setting information of the computing motherboard is obtained through a hardware management tool or a system command, where the computing core setting information includes the type, model, and performance index of each computing core disposed on the multi-core heterogeneous ASIC computing motherboard.
It can be appreciated that by obtaining detailed information of hardware resources, the computing power and topology of the system can be known, which provides important basis for subsequent task scheduling, resource allocation, and performance optimization, and understanding the performance characteristics of each computing core helps to evaluate how to allocate tasks according to load characteristics (computationally intensive, I/O intensive, etc.).
More specifically, a series of benchmark tests (such as Linpack or stress-ng) or specific tasks can further be run under an actual computing load, and the performance of the system under different loads and core configurations is recorded, so as to monitor parameters of the computing cores including CPU utilization rate, memory occupation, load balancing, response time and the like; experiments are performed on different task scheduling strategies, and indexes such as task execution time, throughput and delay are collected; according to the test results, the execution feedback of the system (such as resource consumption and task execution time) is associated with the computing core setting information (such as core number, thread number and frequency), so that performance bottlenecks and advantages are extracted.
More specifically, a digital twin technology and an algorithm model technology are used to construct a computing performance model based on the performance data obtained from the digital feedback processing, so as to express the relation between resource utilization rate and task load; in this way, the execution time of a task under different core-number, thread-number and frequency settings can be predicted according to the resource requirements of the task and the hardware performance. For a computing task supporting multiple cores, the model needs to consider how to reasonably distribute the task among the cores, so that resource waste is avoided and parallelism is improved; based on real-time monitoring feedback, the model can also dynamically adjust the task scheduling strategy to realize load balancing.
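As a hedged illustration of such a prediction step, the following sketch uses an Amdahl-style formula; the one-operation-per-cycle assumption, parameter names and all numbers are assumptions for the sketch, not the embodiment's actual model:

```python
def predict_execution_time(base_ops, freq_ghz, cores, parallel_fraction):
    """Estimate task execution time (seconds) for a given core configuration.

    Assumes each core retires one operation per cycle; the serial part of
    the task runs on a single core (Amdahl's law).
    """
    ops_per_second = freq_ghz * 1e9  # single-core throughput
    serial_time = (1.0 - parallel_fraction) * base_ops / ops_per_second
    parallel_time = parallel_fraction * base_ops / (ops_per_second * cores)
    return serial_time + parallel_time

# Adding cores only speeds up the parallel fraction of the task:
t_1core = predict_execution_time(1e10, 2.0, 1, 0.9)
t_4core = predict_execution_time(1e10, 2.0, 4, 0.9)
```

A real motherboard model would replace the analytic formula with curves fitted to the digital-feedback measurements, but the interface (configuration in, predicted time out) is the same.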
Specifically, in step S2 of the embodiment provided by the present invention, all task sets executable on the computing motherboard are collected, where the executable tasks are tasks that the computing motherboard can run, that is, the computing motherboard is subjected to simulation of the executable tasks in advance, so as to obtain performance requirements of the executable tasks, and an optimal task allocation scheme is obtained according to the performance requirements, and then in actual work, analysis and task allocation of the executable tasks are not required in real time, and allocation modes of the computing motherboard corresponding to the executable tasks are obtained directly through pre-allocated plans.
It can be understood that the executable task set is collected from representative executable tasks, which can represent the task types that can be performed on the computing main board. When a task that does not completely correspond to any task in the executable task set appears in subsequent actual work, pattern matching can be performed against the executable task set to obtain a reference executable task, and the task allocation scheme of the new task can then be rapidly obtained by referring to and adjusting the task allocation mode of that reference executable task.
More specifically, the executable task set includes computation-intensive tasks such as scientific computation, image processing, machine learning training and the like, memory-intensive tasks such as large-scale data analysis, database operation, memory caching processing and the like, and I/O-intensive tasks such as file reading/writing, network transmission, database query and the like, and hybrid tasks such as tasks including various demands of computation, memory, I/O and the like.
More specifically, each task is analyzed in depth to identify its specific computing resource needs. Specific analysis items include CPU requirements, how many CPU cores (single core, dual core, multi-core) are needed for each task, whether parallel computing is supported, whether thread scheduling and parallelization can be performed, CPU usage frequency (e.g., for high performance computing, it may be required to run on a high-frequency CPU), memory requirements, memory size (RAM) for each task, whether there is a memory bottleneck or a large amount of cache is required, I/O requirements, file read-write speed requirements (for data processing tasks, database operations, etc.), network bandwidth requirements (for distributed computing or network transport intensive tasks), storage requirements, temporary storage (e.g., cache, log files, etc.), and long-term storage requirements (e.g., database files, data backups, etc.).
More specifically, the results of each task analysis are sorted to form a comprehensive task computing requirement set, where the set can comprise the following items: the resource requirement of each task, such as the required number of CPU cores, memory size and I/O bandwidth; the priority of each task, set according to the nature and real-time requirement of the task; an execution time estimate for each task, based on its resource requirements and task scale; and resource competition information, used to mark possible resource conflicts when a plurality of tasks compete for the same resource, especially shared resources such as CPU, memory and I/O. The computing requirement set is output in the form of structured data (such as JSON, XML or a database table) to facilitate use by the subsequent scheduling system and optimization algorithm.
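A minimal sketch of such a structured requirement set follows; the field names, the two example tasks and the I/O threshold are illustrative assumptions, not values from the embodiment:

```python
import json

# Per-task resource requirements, priorities and time estimates.
requirement_set = [
    {"task": "matrix_multiply", "cpu_cores": 4, "memory_mb": 2048,
     "io_bandwidth_mbps": 50, "priority": 1, "est_time_s": 12.5},
    {"task": "log_archive", "cpu_cores": 1, "memory_mb": 256,
     "io_bandwidth_mbps": 400, "priority": 3, "est_time_s": 30.0},
]

# Mark potential contention on the shared I/O channel (toy threshold).
IO_LIMIT_MBPS = 400
io_conflict = sum(t["io_bandwidth_mbps"] for t in requirement_set) > IO_LIMIT_MBPS

# Serialize for the downstream scheduling system.
payload = json.dumps({"tasks": requirement_set, "io_conflict": io_conflict})
```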
It can be understood that by performing in-depth analysis on the computing requirements of each task, the type and degree of the requirements of various tasks on the computing main board resources can be clarified, a computing requirement set is generated by classifying the tasks and analyzing the resource requirements, comprehensive understanding on the task and the resource relationship is formed, data support is provided for subsequent performance scheduling and optimization, a more reasonable resource scheduling strategy can be designed by analyzing the computing requirements of each task, the performance of the computing main board is maximized, and resource conflicts and bottlenecks are avoided.
Specifically, in step S3 of the embodiment provided by the present invention, an optimization objective is first defined according to the set of computing requirements. Generally, the objective is to maximize system performance or minimize resource waste, which may include: minimizing task execution time, i.e., minimizing the time required for completing the overall task; maximizing resource utilization, i.e., reasonably utilizing resources such as CPU, memory and I/O while avoiding resource idling or excessive use; load balancing, i.e., ensuring balanced allocation of tasks among different computing cores, memory, storage and the like while avoiding overload or resource bottlenecks; and reducing task latency, i.e., optimizing the queuing latency of tasks and reducing the delay of the system.
More specifically, according to the motherboard computing performance model, specific constraints (such as CPU core number, memory requirement, I/O bandwidth and the like) in a resource requirement set of each task are considered, and constraint conditions are established, for example, computing resource constraints such as the CPU core number cannot exceed the available core number, the memory usage cannot exceed the total memory of the system, task priority constraints such as high priority tasks should be scheduled preferentially, task mutual exclusion constraints such as that certain tasks cannot be executed simultaneously due to resource conflict, parallel execution constraints such as whether specific tasks support parallel execution or not, and whether the execution can be started after the completion of other tasks is needed.
More specifically, a mathematical model of the task scheduling optimization problem is constructed, an objective function and constraint conditions are combined to form a linear or nonlinear optimization problem, and a proper optimization algorithm is selected according to the complexity of task scheduling and the characteristics of the problem.
More specifically, common optimization algorithms comprise the Genetic Algorithm (GA), Simulated Annealing (SA), Particle Swarm Optimization (PSO), which simulates bird-swarm foraging behavior and is suitable for dynamic resource scheduling problems, Reinforcement Learning (RL), which learns an optimal strategy through interaction with the environment and is suitable for situations requiring real-time adjustment in task scheduling, and heuristic algorithms (such as A*, best-first search, etc.). Among these, the Genetic Algorithm (GA) is suitable for complex task scheduling problems, can perform a global search and can process complex constraints, so it is suitable for large-scale optimization problems; Simulated Annealing (SA) can avoid sinking into a local optimal solution and explores the global optimal solution through gradual temperature reduction, so it is suitable for scheduling problems with strong constraints.
More specifically, the selected optimization algorithm is applied to the task scheduling model, an optimal or near-optimal task scheduling scheme is found through repeated iteration and updating, in each iteration, an objective function is evaluated according to the current task scheduling scheme, indexes such as time of task execution, resource utilization rate and the like are calculated, a task allocation scheme is continuously adjusted according to the searching process of the algorithm so as to meet the optimization requirement of the objective function, and when the optimization algorithm reaches a certain convergence standard (for example, the objective function change is smaller than a certain threshold value or the maximum iteration number is reached), the optimization process is ended.
It can be appreciated that by using a suitable optimization algorithm, an optimal or near-optimal task scheduling scheme can be found, so that the use of computing resources is more efficient, and the optimization algorithm can process complex task scheduling constraints, such as task dependency relationships, resource sharing limitations, and the like, and avoid unreasonable scheduling strategies.
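The iterate-evaluate-accept loop described above can be sketched with Simulated Annealing on a toy makespan-minimization problem; the task costs, cooling schedule and acceptance rule below are generic SA choices, not the embodiment's actual algorithm:

```python
import math
import random

def makespan(assignment, task_costs, n_cores):
    """Objective function: the load of the busiest core."""
    loads = [0.0] * n_cores
    for task, core in enumerate(assignment):
        loads[core] += task_costs[task]
    return max(loads)

def anneal(task_costs, n_cores, iters=2000, t0=10.0, seed=0):
    """Search for a task-to-core assignment minimizing the makespan."""
    rng = random.Random(seed)
    current = [rng.randrange(n_cores) for _ in task_costs]
    best = list(current)
    for i in range(iters):
        temp = t0 * (1 - i / iters) + 1e-9  # linear cooling
        cand = list(current)
        cand[rng.randrange(len(cand))] = rng.randrange(n_cores)  # move one task
        delta = (makespan(cand, task_costs, n_cores)
                 - makespan(current, task_costs, n_cores))
        # Accept improvements always; accept regressions with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if (makespan(current, task_costs, n_cores)
                    < makespan(best, task_costs, n_cores)):
                best = list(current)
    return best

costs = [5, 3, 8, 2, 7, 4]      # hypothetical per-task execution costs
plan = anneal(costs, n_cores=3)  # convergence here = iteration budget exhausted
```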
More specifically, scheduling simulation execution is performed according to the optimized task allocation plan, and the effect of the scheduling plan in practical application is verified in a simulation manner: the execution of task scheduling is simulated in a virtual environment, and the execution time, resource consumption, load condition and the like of the tasks are recorded; the execution effect of the scheduling plan is then tested in a physical system, and indexes such as resource utilization rate, task completion time and system response time are monitored; the optimization effect is evaluated by comparing the task execution time before and after scheduling. After the task scheduling is analyzed, the use condition of system resources (such as CPU, memory and I/O) is examined to ensure full utilization of the resources, and whether the tasks are uniformly distributed to each computing core is checked, so as to avoid overload of some cores while other cores are idle. According to the simulation analysis results, the task allocation plan is further adjusted and the resource scheduling strategy is optimized; for example, some tasks may not be effectively parallelized, or some resources (such as I/O bandwidth) may become bottlenecks, causing performance degradation.
More specifically, the optimized task scheduling scheme is arranged into a complete task allocation scheme, the complete task allocation scheme comprises information such as resource allocation, scheduling sequence, execution nodes and the like of each task, CPU core, memory size, I/O bandwidth and other resources used by each task are defined, task execution sequence is determined, priority execution of high-priority tasks is ensured, resource conflict is avoided, if parallelization is supported by the tasks, parallel task execution arrangement and allocation strategies are marked, the generated task allocation scheme is output in a structured format (such as JSON, XML, tables and the like), and a specific scheme is provided for resource scheduling and task execution of a system.
It can be understood that the task allocation schemes of the computing main board corresponding to the executable tasks are obtained in advance through the optimization algorithm, and in the actual work of the subsequent computing main board, the task allocation schemes do not need to be obtained through the optimization algorithm again, but the corresponding schemes can be directly scheduled, so that rapid and efficient task allocation and resource scheduling are realized.
Specifically, in step S4 of the embodiment provided by the present invention, the task currently being executed by the computing motherboard is obtained through the task management module of the operating system, in the Linux system, the current process list may be obtained through commands such as ps, top, htop, in the Windows system, the running process information may be obtained through the task manager, tasklist commands or PowerShell script, and in the real-time system or the embedded system, the task execution information may be obtained through an API provided by the task scheduler or the real-time operating system (RTOS).
More specifically, the relevant information of each execution task, such as process ID, CPU utilization rate, memory utilization condition, I/O operation and execution time, is collected, all executing tasks on the computing main board are obtained in real time, basic data are provided for subsequent analysis and scheduling, multidimensional information of the tasks is collected through a task management tool, and the comprehensiveness of task feature data is ensured.
More specifically, the current task is identified and classified according to a predefined set of executable tasks (including known task types and their computing requirements), and the resource usage (e.g., CPU, memory, I/O) of the current task is compared to the task computing requirements in the set of executable tasks.
More specifically, according to the execution time, resource consumption, priority and the like of the task, the type most consistent with the current task is matched, that is, the executable task closest to the current execution task is selected from the executable task set and is used as the task execution characteristic of the current execution task.
More specifically, if there is a large deviation between the current execution task and any one of the executable tasks in the executable task set, a deeper analysis is required for the current execution task to obtain the relevance between the current execution task and each of the executable tasks in the executable task set, and a task execution feature of the current execution task is generated according to the relevance.
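One simple way to realize this matching step is a nearest-profile comparison; the reference profiles, feature names and the Euclidean distance metric below are assumptions for illustration, not the embodiment's matching rule:

```python
# Reference resource-usage profiles for the pre-analyzed executable tasks
# (values are normalized utilization fractions; all numbers hypothetical).
REFERENCE_TASKS = {
    "compute_intensive": {"cpu": 0.9, "mem": 0.3, "io": 0.1},
    "memory_intensive":  {"cpu": 0.4, "mem": 0.9, "io": 0.2},
    "io_intensive":      {"cpu": 0.2, "mem": 0.2, "io": 0.9},
}

def match_task(observed, references=REFERENCE_TASKS):
    """Return (best_name, distance): the reference profile closest to the
    observed usage profile, plus the deviation used to judge match quality."""
    def dist(a, b):
        return sum((a[k] - b[k]) ** 2 for k in a) ** 0.5
    name = min(references, key=lambda n: dist(observed, references[n]))
    return name, dist(observed, references[name])

# A running task observed at 85% CPU, 25% memory, 15% I/O:
label, deviation = match_task({"cpu": 0.85, "mem": 0.25, "io": 0.15})
```

A large `deviation` would trigger the deeper relevance analysis described above instead of a direct match.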
It can be understood that the task execution feature is used for feeding back the association relation, in terms of specific execution, between the current execution task and each executable task in the executable task set; because the corresponding optimal task allocation scheme is obtained by analyzing the executable task set in advance, the current execution task can be rapidly allocated according to the optimal task allocation scheme corresponding to the pre-analyzed executable task set.
Specifically, in step S5 of the embodiment provided by the present invention, the task is matched and allocated according to the task allocation plan (generally, the predetermined task allocation plan calculated by the optimization algorithm): the task is disassembled and allocated to the corresponding computing cores on the computing motherboard; reasonable load balancing is performed according to the resource requirements of the current task and the available resources, so as to avoid overload or resource idling; the limitations of the hardware resources of the motherboard (such as CPU cores and memory size) are respected so that the resource upper limit is not exceeded; and the task is allocated to the appropriate computing resources according to the task characteristics and the policy in the plan.
It can be understood that, in a multi-core heterogeneous ASIC computing motherboard, there are generally multiple types of computing cores, including a CPU, a GPU and the like, where different computing cores are adapted to different types of work, so the optimal modes of resource scheduling and task allocation across the multiple computing cores also differ for each executable task in the executable task set. By analyzing a representative executable task set in advance to obtain the task allocation plan, when the current execution task is actually executed, the association between the current execution task and each executable task is analyzed to obtain the relationship between their execution characteristics, and this relationship can then be fed back into the task allocation plan corresponding to the executable task, so as to obtain the optimal task allocation mode corresponding to the current execution task.
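Because the plan is precomputed, runtime allocation reduces to a lookup; the plan entries, core names and default fallback below are illustrative assumptions:

```python
# Precomputed allocation entries per matched task type (all values hypothetical).
ALLOCATION_PLAN = {
    "compute_intensive": {"cores": ["asic0", "asic1"], "mem_mb": 1024},
    "io_intensive":      {"cores": ["cpu0"], "mem_mb": 256},
}

DEFAULT_ALLOCATION = {"cores": ["cpu0"], "mem_mb": 512}

def allocate(task_type, plan=ALLOCATION_PLAN):
    """Look up the precomputed allocation for a matched task type;
    fall back to a conservative default when the type was not
    analyzed in advance (the large-deviation case of step S4)."""
    return plan.get(task_type, DEFAULT_ALLOCATION)

entry = allocate("compute_intensive")
```

The point of the design is visible here: no optimization runs at dispatch time, so allocation latency is a dictionary lookup rather than an algorithm execution.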
The invention provides a task allocation optimization method for a multi-core heterogeneous ASIC computing main board, which has the following beneficial effects:
The invention collects the computing core setting information of the computing main board and generates a main board computing performance model through digital feedback processing; acquires an executable task set, analyzes the computing requirement of each task, and forms the computing requirement set; uses an optimization algorithm to carry out simulation analysis on the computing requirement set based on the performance model and generates a task allocation plan; identifies and generates the task execution characteristics of the current execution task; and distributes the task execution characteristics according to the task allocation plan and schedules computing resources to process the current task. The performance model accurately simulates resource performance, the optimization algorithm ensures the efficiency of resource allocation and improves system performance, task changes are responded to in real time to ensure system flexibility and stability, and the utilization of heterogeneous computing resources is maximized, improving overall computing efficiency and solving the problem of execution delay when a multi-core heterogeneous ASIC computing main board carries out real-time task allocation in the prior art.
Preferably, the step of obtaining the setting information of the computing core of the computing motherboard and performing digital feedback processing of the computing core on the computing motherboard according to the setting information of the computing core to obtain the motherboard computing performance model includes:
S11, acquiring calculation core setting information of a calculation main board, wherein the calculation core setting information comprises performance indexes of a plurality of calculation cores arranged on the calculation main board and layout association relations among the calculation cores;
S12, respectively performing digital simulation on each computing core on the computing main board according to the performance index of each computing core in the computing core setting information to obtain a unit performance model corresponding to each computing core;
S13, carrying out resource parallel scheduling analysis on each computing core on the computing main board according to the layout association relation of each computing core in the computing core setting information so as to obtain parallel working characteristics corresponding to each unit performance model;
and S14, carrying out association combination on each unit performance model according to the parallel working characteristics to obtain a main board calculation performance model.
Specifically, the performance index of each computing core is obtained, where the performance index comprises: clock frequency, i.e., the working frequency of each computing core; computing capacity, such as the number of instructions executed per second and the number of floating point operations per second; core load, i.e., the current load or resource use condition of each computing core; power consumption characteristics of each computing core, which are critical to heat management and power supply optimization; and cache size, i.e., the cache capacity of each computing core, which influences the efficiency of processing data.
More specifically, the layout association relationship between the computing cores is obtained, i.e., the physical layout and the logical layout of the computing cores on the motherboard, comprising: the physical position, i.e., the position of each computing core on the physical layer, such as a specific area on a chip; the data transmission path, i.e., the data exchange path between the computing cores, which may comprise data transmission through a shared memory, a network bus or a cache; and the connection topology, i.e., the connection topology structure between different computing cores, comprising a ring structure, a grid structure or a star structure, which influences task scheduling and data access efficiency.
It can be understood that by collecting the performance index and layout association relation of the computing cores, task scheduling and computing resource allocation can be ensured to comprehensively reflect the actual running state of the computing main board, and by accurately acquiring core information, subsequent simulation analysis and performance modeling are supported.
More specifically, each computing core is digitally simulated based on its performance index (e.g., clock frequency, computing power, power consumption, etc.), the computing power of the computing core is modeled using, for example, a simulator (e.g., a CPU simulator) or a computing model (e.g., a floating point operation, integer operation, etc.), the processing power, response time, power consumption, etc. of each computing core when running alone are evaluated by the numerical model, the computing core performance under different load conditions is simulated, and the impact of different tasks (e.g., computationally intensive tasks or I/O intensive tasks) on the core performance is tested.
More specifically, according to the layout association relationship between the computing cores, how the computing cores coordinate in the parallel processing process is analyzed, the overall performance of the system is improved, the data transmission delay between different computing cores, especially the time and bandwidth of data cross-core transmission, is analyzed, the synchronization and consistency problems during sharing of the cache and the memory between the computing cores are analyzed, the performance degradation caused by data inconsistency is avoided, the scheduling condition of parallel tasks on a plurality of computing cores is simulated, the task can be distributed to each core efficiently, and resource conflict and uneven load are avoided.
More specifically, based on the parallel scheduling result, the cooperative working characteristic of each computing core is evaluated, such as bandwidth utilization rate, namely, the data transmission bandwidth utilization condition of tasks among a plurality of computing cores is analyzed, parallelism benefit, namely, the improvement of the computing efficiency by parallel processing is analyzed, the change of task execution time under different parallelism is calculated, the advantage of multi-core processing can be exerted to the greatest extent by analyzing the parallel scheduling among the cores, resource conflict in task scheduling is reduced, and the parallel scheduling analysis can effectively improve the resource utilization rate in a multi-core system and reduce waiting time and data transmission delay.
Specifically, weighted combination is carried out on each unit model according to the performance, load and parallel working characteristics of the computing cores to obtain a comprehensive main board computing performance model; the single model of each computing core is combined with the synergistic effect among the multiple cores to evaluate the overall performance of the main board and identify bottlenecks in the computing performance of the main board, such as limitations of computing capacity and bottlenecks of data transmission; a global computing performance model is thereby generated, and the computing strategy can be dynamically adjusted according to the load, task demands, resource states and the like.
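The association-combination of step S14 can be sketched as a throughput aggregation discounted by a parallel-working factor; the GFLOPS figures and the single scalar efficiency are simplifying assumptions (a full model would keep per-link transfer costs):

```python
def combine_unit_models(core_gflops, parallel_efficiency):
    """Aggregate the peak throughput of heterogeneous cores, discounted by a
    measured parallel-working efficiency in (0, 1] that accounts for
    cross-core transfer delay and cache-coherence overhead."""
    if not 0.0 < parallel_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return sum(core_gflops) * parallel_efficiency

# Two hypothetical ASIC cores plus one control CPU, 85% measured efficiency:
board_gflops = combine_unit_models([500.0, 500.0, 50.0], 0.85)
```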
It can be understood that the computing performance model of the main board can predict the overall computing performance of the main board under different loads and configurations, provide decision basis for task scheduling, resource allocation and system optimization, and dynamically adjust the allocation mode of computing resources by associating and combining the performance models of each unit so as to realize more efficient resource utilization and task scheduling.
Preferably, the step of obtaining an executable task set of the computing motherboard, and analyzing task computing requirements of each executable task in the executable task set to obtain a computing requirement set corresponding to the executable task set includes:
S21, performing predictive analysis on the executable tasks of the computing main board to obtain an executable task set of the computing main board;
S22, analyzing task computing requirements of each executable task in the executable task set to obtain task computing requirements corresponding to each executable task;
S23, combining task computing requirements of all the executable tasks to obtain a computing requirement set corresponding to the executable task set.
Specifically, the computing motherboard is subjected to predictive analysis, a task set which is likely to be executed on the motherboard in the future is identified and determined, and tasks which are likely to be executed can be identified and predicted through methods such as historical data analysis, machine learning model prediction, task scheduling strategies and the like, so that a set containing all the tasks which are likely to be executed is obtained and is called as an executable task set.
More specifically, each task in the executable task set is analyzed separately to define its specific computing requirements, its computing requirements (e.g., CPU, memory, storage, etc.) are analyzed based on the task's code and configuration file, and by simulating or actually running the task, the resource usage is monitored and counted, and a task computing requirement description is generated for each task, including the required computing resources and the number thereof.
More specifically, the computing requirements of all tasks are combined together to obtain a comprehensive computing requirement of the whole executable task set, and a suitable algorithm or strategy (such as superposition or parallel computing requirement analysis) is adopted to combine the computing requirements of each task, where the comprehensive computing requirement set represents the total computing resources required by the computing main board when executing the tasks.
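A minimal sketch of such a superposition strategy follows; the choice of summing accumulating resources while taking the peak of bandwidth-like ones, and all field names, are assumptions for illustration:

```python
def combine_requirements(task_reqs):
    """Superpose per-task requirements into one set-level requirement:
    sum resources that accumulate (cores, memory) and take the peak for
    bandwidth shared instant-by-instant."""
    total = {"cpu_cores": 0, "memory_mb": 0, "io_bandwidth_mbps": 0}
    for req in task_reqs:
        total["cpu_cores"] += req["cpu_cores"]
        total["memory_mb"] += req["memory_mb"]
        total["io_bandwidth_mbps"] = max(total["io_bandwidth_mbps"],
                                         req["io_bandwidth_mbps"])
    return total

combined = combine_requirements([
    {"cpu_cores": 2, "memory_mb": 512, "io_bandwidth_mbps": 100},
    {"cpu_cores": 4, "memory_mb": 1024, "io_bandwidth_mbps": 300},
])
```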
It can be understood that resource planning can be effectively performed by predicting and analyzing task demands on a computing main board in advance, resource waste or deficiency is avoided, specific computing demands of tasks are known, a dispatching system can be helped to better distribute resources, priority processing of key tasks is ensured, overall system performance is improved, bottlenecks in the task execution process are reduced, task execution efficiency is improved, possible resource conflicts and bottlenecks are identified in advance through comprehensive computing demand analysis, and preventive measures are taken to improve reliability and stability of the system.
Preferably, the step of analyzing task computing requirements of each executable task in the executable task set to obtain task computing requirements corresponding to each executable task includes:
S221, analyzing the calculated amount and the memory requirement of the executable task to obtain the performance resource requirement of the executable task;
S222, analyzing the computational parallelism and the delay sensitivity of the executable task to obtain the execution standard requirement of the executable task;
S223, analyzing the data transmission dependency of the executable task to obtain the transmission performance requirement of the executable task;
And S224, combining the performance resource requirement, the execution standard requirement and the transmission performance requirement of the executable task to obtain the task calculation requirement of the executable task.
Specifically, computing resources and memory resources required by each task are determined, the computing amount and memory requirements are estimated by analyzing the code structure, algorithm complexity, memory allocation and the like of the task, the actual CPU utilization rate and memory occupation condition are monitored by running the example of the task, more accurate data are obtained, and the performance resource requirements of each task including CPU period, memory size and the like are generated.
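One hedged way to obtain the "more accurate data" from an actual run is to profile a single execution with the standard library. `profile_task` and its toy workload are illustrative stand-ins for real task instances, not part of the method itself:

```python
# Sketch: profile one run of a task to refine its estimated performance
# resource requirement (CPU time and peak memory). Standard library only;
# a production profiler would sample a representative workload instead.
import time
import tracemalloc

def profile_task(task_fn, *args):
    """Run task_fn once and report CPU seconds and peak traced memory."""
    tracemalloc.start()
    cpu_start = time.process_time()
    task_fn(*args)
    cpu_seconds = time.process_time() - cpu_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"cpu_seconds": cpu_seconds, "peak_memory_bytes": peak_bytes}

# Example workload: a list of squares stands in for a real executable task.
stats = profile_task(lambda n: sum([i * i for i in range(n)]), 100_000)
print(stats)
```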
More specifically, the parallel execution capacity and the sensitivity to delay of each task are evaluated, whether the task has parallel execution potential or not is analyzed, such as multithreading, multiprocessing or distributed computing, the parallelism of the task is evaluated, the sensitivity of the task to the execution delay is determined by analyzing the performance of the task under different delay conditions, and the execution standard requirements of each task including the parallelism and the delay sensitivity are generated.
More specifically, the dependency of each task on data transmission in the execution process is analyzed, the data quantity, the data source and the data transmission path which the task needs to access are determined, the bandwidth requirement, the transmission delay and the data integrity requirement of the data transmission are evaluated, and the transmission performance requirement of each task is generated, wherein the transmission performance requirement comprises the data transmission bandwidth, the transmission delay and the like.
More specifically, the requirements of the three aspects are integrated to form an overall calculation requirement of each task, performance resource requirements, execution standard requirements and transmission performance requirements are integrated, and relationships and comprehensive influences among the requirements are considered to form a complete task calculation requirement description, so that a set containing the complete calculation requirement of each task is obtained, and the detailed description of the calculation resource, parallelism, delay sensitivity, transmission performance and the like is included.
It can be understood that by decomposing and analyzing the demands of each aspect of the task in detail, a comprehensive calculation demand description is formed, the accuracy and the effectiveness of resource allocation are ensured, the calculation amount, the memory demand, the parallelism, the delay sensitivity and the data transmission demand of the task are accurately known, and the system is supported to perform more accurate resource allocation and scheduling, so that the resource waste or shortage is avoided.
Preferably, the step of performing performance scheduling simulation analysis on the computing demand set according to the motherboard computing performance model by using an optimization algorithm to obtain a task allocation plan of the motherboard computing performance model corresponding to the computing demand set includes:
S31, defining a performance scheduling simulation target aiming at the motherboard computing performance model, wherein the performance scheduling simulation target comprises execution time efficiency, resource utilization efficiency and computing energy consumption efficiency;
S32, taking the performance scheduling simulation target as an optimization constraint condition of an optimization algorithm, and respectively performing performance scheduling scheme simulation on each task calculation requirement in the calculation requirement set according to the main board calculation performance model by the optimization algorithm with the optimization constraint condition, to obtain a task allocation plan of the main board calculation performance model corresponding to each task calculation requirement, wherein the optimization algorithm comprises a genetic algorithm, an ant colony algorithm and a particle swarm algorithm;
S33, the task allocation plans of the main board computing performance model corresponding to the respective task computing demands jointly form the task allocation plan of the main board computing performance model corresponding to the computing demand set.
Specifically, the targets of the performance scheduling simulation are defined and used as the optimization constraint conditions of the optimization algorithm. The targets comprise: execution time efficiency, which concerns the total execution time of the tasks and the speed of task completion; resource utilization efficiency, which concerns the utilization rate of computing resources on the main board (such as CPU, memory and I/O) and the avoidance of resource waste; and computing energy consumption efficiency, which concerns minimizing the energy consumed in the computing process and improving the energy efficiency ratio of the system.
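The three simulation targets can be folded into a single scalar objective for the optimizer to minimize. The weights and sign conventions below are illustrative assumptions, not values fixed by the method:

```python
# Sketch: scalarize the three performance scheduling simulation targets
# (execution time, resource utilization, energy consumption) into one
# objective value. Lower is better: time and energy are penalized,
# utilization is rewarded. Weights are illustrative assumptions.

def schedule_objective(exec_time, utilization, energy,
                       w_time=0.5, w_util=0.3, w_energy=0.2):
    return w_time * exec_time - w_util * utilization + w_energy * energy

# A schedule that finishes faster, uses resources better, and consumes
# less energy should score lower (better) than a worse one.
fast = schedule_objective(exec_time=10.0, utilization=0.9, energy=5.0)
slow = schedule_objective(exec_time=20.0, utilization=0.5, energy=8.0)
print(fast, slow)
```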
The algorithm selection comprises the genetic algorithm, which simulates natural selection and genetic mechanisms and is suitable for complex multimodal optimization problems; the ant colony algorithm, which simulates the foraging behavior of ants and targets path optimization problems, making it particularly suitable for searching optimal paths in multitask scheduling; and the particle swarm algorithm, which simulates the foraging of bird flocks and is characterized by continuous-space optimization and rapid convergence.
More specifically, execution time, resource utilization, and energy consumption efficiency are taken as objective functions and constraints of the optimization algorithm.
More specifically, performance scheduling simulation is performed on each task in a main board calculation performance model and a calculation demand set through an optimization algorithm, a task allocation plan is generated, parameters and initial solutions of the optimization algorithm are initialized according to the main board calculation performance model and the task demand set, in each iteration, the task allocation plan is adjusted according to optimization constraint conditions, execution time, resource utilization and energy consumption efficiency are gradually optimized, whether optimization is completed or not is judged according to preset convergence conditions (such as fixed iteration times and objective function change amplitude), a specific allocation plan of each task on the main board is generated, and the task allocation plan is formed.
More specifically, the allocation plans of the tasks are combined to form a complete task allocation plan of the main board calculation performance model corresponding to the calculation demand set, the allocation plans of each task are integrated to form an integral task scheduling plan, the optimal allocation of all the tasks under the constraint condition is ensured, and the final task allocation plan comprises the detailed allocation situation of all the tasks.
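The iterative optimization described above can be sketched as a minimal genetic-style search over task-to-core assignments. The core speeds, task costs, and algorithm parameters below are illustrative assumptions rather than values prescribed by the method:

```python
# Sketch: genetic-style search assigning tasks to heterogeneous cores so
# that the busiest core's finish time (makespan) is minimized. An
# individual is a list mapping each task index to a core index.
import random

CORE_SPEEDS = [1.0, 2.0, 4.0]      # relative throughput of each core (assumed)
TASK_COSTS = [8, 6, 4, 4, 2, 2]    # per-task computation amounts (assumed)

def makespan(assignment):
    """Finish time of the busiest core under a task-to-core assignment."""
    loads = [0.0] * len(CORE_SPEEDS)
    for task, core in enumerate(assignment):
        loads[core] += TASK_COSTS[task] / CORE_SPEEDS[core]
    return max(loads)

def evolve(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(CORE_SPEEDS)) for _ in TASK_COSTS]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)
        parents = pop[: pop_size // 2]            # selection: keep best half
        children = []
        for p in parents:
            child = p[:]
            child[rng.randrange(len(child))] = rng.randrange(len(CORE_SPEEDS))
            children.append(child)                # single-point mutation
        pop = parents + children
    return min(pop, key=makespan)

best = evolve()
print(best, makespan(best))
```

A real implementation would use the scalarized multi-objective score (time, utilization, energy) as fitness and a convergence test instead of a fixed generation count.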
It can be understood that by applying the optimization algorithm, an optimal task allocation scheme can be found under the multi-objective constraint, the task execution speed and the overall performance of the system are improved, the optimization algorithm can effectively allocate computing resources on a main board, the resource utilization rate is maximized, idle or overload of resources caused by improper resource allocation is avoided, the energy consumption efficiency is optimized, the power consumption of the system is reduced, the energy efficiency ratio is improved, and the system is suitable for application scenes sensitive to energy consumption, such as mobile equipment and green computing.
Preferably, the step of using the performance scheduling simulation target as an optimization constraint condition of an optimization algorithm, and performing, by the optimization algorithm with the optimization constraint condition, performance scheduling scheme simulation on each task computing requirement in the computing requirement set according to the motherboard computing performance model, so as to obtain a task allocation plan of the motherboard computing performance model corresponding to each task computing requirement includes:
S321, generating an initialization scheme for the main board calculation performance model through a selected optimization algorithm to obtain a plurality of initial scheduling schemes of the main board calculation performance model corresponding to the task calculation requirements;
S322, carrying out scheme evaluation on each initial scheduling scheme according to the performance scheduling simulation target to obtain scheme evaluation characteristics of each initial scheduling scheme corresponding to the performance scheduling simulation target;
S323, constraining the optimization direction of the optimization algorithm based on the scheme evaluation characteristics of each initial scheduling scheme corresponding to the performance scheduling simulation target, so as to obtain an execution framework of the optimization algorithm;
S324, performing optimization processing of an initial scheduling scheme on the main board computing performance model through an execution framework of the optimization algorithm to obtain a plurality of optimal scheduling schemes;
and S325, evaluating each optimized scheduling scheme according to a preset standard, taking each optimized scheduling scheme as an initial scheduling scheme if the optimized scheduling scheme meeting the preset standard does not exist, returning to the step of evaluating each initial scheduling scheme according to the performance scheduling simulation target to obtain scheme evaluation characteristics of each initial scheduling scheme corresponding to the performance scheduling simulation target, and taking the optimized scheduling scheme as a task allocation plan of task calculation requirements corresponding to the main board calculation performance model if the optimized scheduling scheme meeting the preset standard exists.
Specifically, an optimization algorithm (such as a genetic algorithm, an ant colony algorithm, a particle swarm algorithm, etc.) is selected and initialized, and a plurality of initial scheduling schemes are generated, randomly or based on heuristic rules, according to the motherboard computing performance model and the computing demand set.
More specifically, evaluation indexes (such as execution time, resource utilization rate and energy consumption) are determined, each initial scheduling scheme is evaluated, the performance of each initial scheduling scheme on each evaluation index is calculated, scheme evaluation characteristics are formed, an optimization direction (such as the need of reducing the execution time and improving the resource utilization rate) is determined according to the evaluation characteristics, and parameters and operations of an optimization algorithm are adjusted to search and optimize towards the optimization direction.
More specifically, under the execution framework of the optimization algorithm, iterative optimization is performed on the initial scheduling scheme, a new scheduling scheme is generated, a plurality of optimized scheduling schemes are generated, each optimized scheduling scheme is evaluated according to a preset standard, a preset evaluation standard (such as a threshold value reaching a certain performance index) is determined, and whether the optimized scheduling scheme meets the preset standard is evaluated.
More specifically, if no optimized scheduling scheme meets the preset standard, each optimized scheduling scheme is used as an initial scheduling scheme and the process returns to the initial scheduling scheme evaluation step; if an optimized scheduling scheme meeting the preset standard exists, that optimized scheduling scheme is used as the task allocation plan of the task computing requirement corresponding to the main board computing performance model.
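The evaluate-and-retry loop of steps S321 to S325 can be outlined as follows; `optimize_step`, the scoring function, and the threshold are illustrative placeholders for the real optimizer and preset standard:

```python
# Sketch of the S321-S325 loop: evaluate candidate scheduling schemes
# against a preset standard; if none qualifies, optimize every scheme and
# re-evaluate, until a scheme qualifies or a round limit is reached.

def search_until_standard(initial_schemes, score, optimize_step,
                          threshold, max_rounds=100):
    schemes = initial_schemes
    for _ in range(max_rounds):
        passing = [s for s in schemes if score(s) <= threshold]  # S322/S325
        if passing:
            return min(passing, key=score)       # standard met: adopt plan
        schemes = [optimize_step(s) for s in schemes]            # S324
    return min(schemes, key=score)               # best effort after limit

# Toy usage: drive a scalar "scheme" toward zero cost by halving it.
result = search_until_standard([8.0, 5.0], score=abs,
                               optimize_step=lambda x: x / 2,
                               threshold=1.0)
print(result)
```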
It can be understood that the optimal or suboptimal task allocation scheme can be gradually converged through multiple iterations and evaluations of the optimization algorithm, so that the scheduling efficiency and accuracy are improved, the resource utilization rate is fully considered in the evaluation and optimization process, the resource waste is avoided, the resource utilization efficiency of the system is improved, the overall energy consumption of the system can be effectively reduced through energy consumption evaluation and optimization, the energy efficiency ratio is improved, the method is particularly suitable for application scenes needing energy conservation, the flexibility of the optimization algorithm is suitable for various complex task requirements and computing environments, and a task allocation scheme meeting multi-objective constraint is formed.
Preferably, the step of obtaining a current execution task of the computing motherboard, performing matching recognition on the current execution task according to the executable task set, and generating task execution characteristics of the current execution task according to a result of the matching recognition includes:
S41, performing identification authentication on a current execution task of the computing main board through a process monitoring system of the computing main board to obtain the current execution task of the computing main board;
S42, carrying out matching recognition on the current execution task according to the executable task set to obtain the recognition matching degree of the current execution task with each executable task in the executable task set;
S43, if the identification matching degree of the current execution task and one executable task in the executable task set accords with a preset standard, taking the executable task with the identification matching degree meeting the preset standard as an identification object of the current execution task, and generating task execution characteristics of the current execution task according to the identification object;
S44, if the identification matching degree of the current execution task with every executable task in the executable task set fails to meet the preset standard, carrying out a threshold evaluation suitable for summarization on the identification matching degrees between the current execution task and the executable tasks in the executable task set, so as to obtain a plurality of identification matching degrees meeting the threshold evaluation;
And S45, marking the executable tasks corresponding to the plurality of identification matching degrees meeting the threshold evaluation as inductive objects, and inductively summarizing the identification matching degrees corresponding to the inductive objects to obtain the task execution characteristics of the current execution task.
Specifically, a process monitoring system on the computing main board is used to identify and authenticate all executing tasks; this step ensures that all currently executing tasks are acquired and that a unique identifier is allocated to each task.
More specifically, the identified and authenticated current execution task is matched with a predefined executable task set, and the identification matching degree of the current execution task and each executable task is calculated.
It should be noted that the task computing requirements of each executable task in the executable task set include shallow requirement features and deep requirement features, where the deep requirement features are used to analyze the optimal task allocation scheme corresponding to the executable task. When identifying and matching the current execution task, the deep requirement features do not need to be identified and matched; instead, the shallow requirements of the current execution task are directly matched against the executable task set. In other words, the matching of the current execution task with the executable task set is based on the obvious features of the current execution task and the corresponding parts of the executable task set, where these features include the calculated amount, the memory requirement, and the like.
More specifically, if the matching degree of a certain task reaches or exceeds the preset standard, the task is marked as the identification object of the currently executed task and the task execution characteristics are generated according to the identification object; if no task's matching degree reaches the preset standard, the next step is performed.
More specifically, threshold evaluation is performed on the matching degrees that do not reach the preset standard, a plurality of matching degrees meeting the threshold are selected, the tasks meeting the threshold evaluation are marked as inductive objects, the identification matching degrees corresponding to the inductive objects are inductively summarized, and the characteristics of the currently executed task are extracted therefrom.
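A hedged sketch of the matching-degree computation over shallow features only (calculated amount and memory requirement, per the note above); the similarity formula and field names are illustrative assumptions:

```python
# Sketch: score how well a currently executing task matches each entry of
# the executable task set using shallow features only. The degree is 1.0
# for identical features and approaches 0 as they diverge; feature values
# are assumed positive.

def matching_degree(current, candidate):
    score = 0.0
    for key in ("compute", "memory_mb"):
        a, b = current[key], candidate[key]
        score += 1.0 - abs(a - b) / max(a, b)   # per-feature similarity
    return score / 2                            # average over features

executable_set = {
    "task_a": {"compute": 100, "memory_mb": 512},
    "task_b": {"compute": 400, "memory_mb": 2048},
}
current = {"compute": 110, "memory_mb": 512}
degrees = {name: matching_degree(current, spec)
           for name, spec in executable_set.items()}
best = max(degrees, key=degrees.get)
print(best, round(degrees[best], 3))
```

A matching degree at or above the preset standard would mark the candidate as the identification object; otherwise the threshold evaluation of S44 would select several candidates as inductive objects.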
It can be understood that the currently executed task can be accurately identified through the process monitoring system and the matching identification process, missed detection or false detection is avoided, the system can dynamically adapt to the change of different tasks through matching degree evaluation and threshold value evaluation, the flexibility and reliability of identification are improved, the generated task execution characteristics can better describe the behavior and attribute of the currently executed task, support is provided for subsequent task management and optimization, the threshold value evaluation and summary process can effectively process the condition of low identification degree, key characteristics are extracted through summary, the overall efficiency of the system is improved, the system can better manage and schedule the tasks through accurate identification and characteristic extraction, the resource use is optimized, and the overall performance of a computing main board is improved.
Preferably, the step of inductively summarizing the identification matching degrees corresponding to the inductive objects to obtain the task execution characteristics of the current execution task includes:
S451, performing deviation conversion processing on the identification matching degree of each inductive object to obtain the execution deviation features of each inductive object relative to the current execution task, and performing execution mode analysis on the execution deviation features to obtain the execution modes corresponding to the execution deviation features;
S452, calling the execution mode corresponding to the task computing requirement of each inductive object, and carrying out similarity evaluation between these execution modes and the execution deviation features, so as to obtain the task computing requirement whose execution mode has the highest similarity to the execution deviation features;
S453, the task computing requirement is used as a reference execution feature, and the reference execution feature and the execution deviation feature are combined to obtain the task execution feature of the current execution task.
Specifically, the identification matching degree of each inductive object is converted, and the execution deviation features between each inductive object and the current execution task are calculated, wherein the execution deviation features represent the differences between the inductive objects and the current execution task.
More specifically, the execution deviation features are analyzed to identify corresponding execution modes, the execution modes reflect the behaviors and trends of the execution deviation features, the execution modes corresponding to the task computing requirements of each inductive object are called, the execution modes are subjected to similarity evaluation with the execution deviation features of the current execution task, and the execution mode which is most similar to the execution deviation features is found out through the similarity evaluation.
More specifically, the task computing requirements of the execution mode with the highest similarity are taken as reference execution characteristics, the reference execution characteristics and the execution deviation characteristics of the current execution task are combined, and finally the task execution characteristics of the current execution task are generated.
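The final combination step of S453 can be sketched as merging the reference execution feature with the measured deviation; the field names and the additive merge rule are illustrative assumptions:

```python
# Sketch of S453: merge the reference execution feature (from the most
# similar execution mode) with the execution deviation feature to form
# the task execution feature of the current execution task.

def build_execution_feature(reference, deviation):
    feature = dict(reference)
    for key, delta in deviation.items():
        feature[key] = feature.get(key, 0) + delta  # apply measured offset
    return feature

reference = {"compute": 100, "memory_mb": 512}   # from best-matching mode
deviation = {"compute": 15, "memory_mb": -64}    # measured difference
print(build_execution_feature(reference, deviation))
# {'compute': 115, 'memory_mb': 448}
```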
It can be understood that by performing bias conversion and pattern analysis, the features of the currently executed task can be accurately extracted, even in the case of low initial matching degree, accurate feature descriptions can be obtained through induction and analysis, the system can dynamically adjust according to the execution bias and pattern, the adaptability to different tasks is improved, the flexibility and response capability of the system are enhanced, the accurate task execution features contribute to optimizing resource allocation and task scheduling, the overall performance and efficiency of the system are improved, through similarity evaluation and pattern combination, recognition accuracy can be improved through further analysis and combination even in the case of unsatisfactory initial recognition, robustness to various tasks is enhanced through multi-step processing and analysis, and efficient and accurate recognition and feature extraction can be maintained even in the face of complex and dynamically-changed task environments.
Preferably, the step of performing task allocation on the task execution feature according to the task allocation plan to obtain a task allocation mode of the currently executed task, and performing computing resource scheduling on the computing motherboard according to the task allocation mode to perform task allocation processing on the currently executed task includes:
S51, analyzing the task execution characteristics to obtain mapping relations between the task execution characteristics and each executable task in the executable task set corresponding to the task allocation plan;
S52, carrying out plan specific selection on the task allocation plan according to the mapping relation, and carrying out task allocation processing of resource scheduling through the specifically selected plan so as to obtain a task allocation mode of the current execution task;
And S53, carrying out calculation resource scheduling on the calculation main board according to the task allocation mode so as to carry out task allocation processing on the currently executed task.
Specifically, the task execution characteristics are analyzed in detail, the mapping relation between the task execution characteristics and the executable task set in the task allocation plan is determined, and the mapping relation determines the association between each task execution characteristic and a specific task in the plan.
More specifically, a task allocation plan is specifically selected based on the mapping relation, a plan which is most suitable for the current task execution characteristics is selected, an optimal scheme of resource scheduling and task allocation is ensured, and task allocation processing of resource scheduling is performed by using the specifically selected plan.
More specifically, according to the generated task allocation mode, the computing main board is subjected to computing resource scheduling, so that the computing resources are ensured to be allocated and managed according to the task allocation mode, and the requirement of executing the task at present is met.
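Steps S51 to S53 can be outlined as a lookup-and-dispatch pipeline; the plan table, field names, and dispatcher shape are illustrative assumptions standing in for the real scheduler:

```python
# Sketch of S51-S53: resolve a task execution feature to its executable
# task via the mapping relation, select that task's allocation plan, and
# schedule the planned cores for the current execution task.

def dispatch(execution_feature, allocation_plans):
    task_id = execution_feature["matched_task"]       # S51: mapping relation
    plan = allocation_plans[task_id]                  # S52: plan selection
    return {"task": task_id, "cores": plan["cores"]}  # S53: schedule cores

plans = {"task_a": {"cores": [0, 2]}, "task_b": {"cores": [1]}}
feature = {"matched_task": "task_a", "compute": 115}
print(dispatch(feature, plans))
# {'task': 'task_a', 'cores': [0, 2]}
```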
It can be understood that by analyzing the task execution characteristics and mapping with the task allocation plan, the maximization and optimization of resource utilization are ensured, the accurate task allocation mode ensures that the task can be executed on the most suitable resource, the execution time is reduced, the task completion efficiency is improved, the deviation in the task execution process is reduced, the accuracy and reliability of task execution are improved through detailed analysis and accurate mapping relation, and the system can more stably process high-load and complex task environments through optimizing the resource scheduling and task allocation.
In a second aspect, the present invention provides a task allocation optimization device for a multi-core heterogeneous ASIC computing motherboard, which is configured to implement the task allocation optimization method for a multi-core heterogeneous ASIC computing motherboard according to any one of the first aspect.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.