CN111507523A - An optimization method for cable production scheduling based on reinforcement learning - Google Patents

Info

Publication number
CN111507523A
Authority
CN
China
Prior art keywords
cable
order
production scheduling
action
cable production
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010299221.9A
Other languages
Chinese (zh)
Other versions
CN111507523B (en)
Inventor
林剑
宋洪波
王周敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Finance and Economics
Original Assignee
Zhejiang University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Finance and Economics
Priority to CN202010299221.9A
Publication of CN111507523A
Application granted
Publication of CN111507523B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0631: Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/04: Manufacturing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Educational Administration (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Manufacturing & Machinery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a reinforcement-learning-based optimization method for cable production scheduling. First, an optimization model of cable production scheduling is established under multiple production lines and complex resource constraints, with the objective of minimizing the penalty cost for missed delivery deadlines. On this basis, within a hyper-heuristic framework, a reinforcement learning mechanism serves as the high-level heuristic (HLH) strategy, and simple heuristic rules tailored to the cable production scheduling problem are designed to build the set of low-level heuristic (LLH) methods, so that the scheduling problem can be solved by optimization. The method has low complexity and can effectively improve production and management efficiency in the traditional cable industry; it is significant for improving quality and efficiency and for the transformation and upgrading of traditional industries.

Figure 202010299221

Description

An optimization method for cable production scheduling based on reinforcement learning

Technical Field

The invention relates to an optimization method, and in particular to an optimization method for cable production scheduling based on reinforcement learning.

Background

As industry has scaled up and the economy has developed, cable products have been ever more widely used in important industrial fields such as construction, transportation, automobiles, communications, and energy. According to statistics, as early as 2012 the total output value of China's wire and cable industry exceeded one trillion, making China the world's largest wire and cable producer. At the same time, market competition in the industry is growing fiercer. Enterprises need to cut production costs and improve production, management, and service efficiency by reducing inventory, raising equipment utilization, and allocating human resources rationally. Scheduling optimization is key to these efficiency gains: a sound production schedule not only shortens the manufacturing cycle but also improves staff efficiency and equipment utilization and reduces energy and material losses, thereby saving energy, reducing emissions, lowering costs, and improving economic returns. In particular, with the rise of agile manufacturing and the continued rollout of enterprise agility programs, emphasizing just-in-time production and allocating resources flexibly and efficiently to meet production and customer-service needs has become the core idea of production scheduling.

Because cable products come in many types and models and the production process is complex, both modeling and solving the cable production scheduling problem are very challenging. Cable manufacturers still rely mainly on manual experience for production scheduling, and the literature on cable production scheduling is very sparse. The invention patent with application number 201810526733.7, titled "An Optimal Scheduling Method for Multi-Type Cable Processing", discloses an optimized scheduling method for processing multiple types of cable, used for cable production scheduling. However, that invention considers only the case in which all orders share the same process flow, which differs markedly from actual production at cable enterprises.

In addition, a hyper-heuristic algorithm, as a cross-domain problem-solving paradigm, uses a high-level heuristic (HLH) strategy to manage and manipulate a set of low-level heuristic (LLH) methods and dynamically generates the best heuristic for each problem, which offers a new route to solving complex and diverse problems. However, hyper-heuristics suffer from high computational complexity. A main reason is that the HLH strategy itself spends considerable time searching for the best heuristic, so reducing the complexity of the HLH strategy also matters greatly for the algorithm's overall performance.

Summary of the Invention

The technical problem to be solved by the invention is to provide a reinforcement-learning-based optimization method for cable production scheduling that is simple and practical, has low complexity, and can effectively improve production and management efficiency in the traditional cable industry.

The invention first establishes an optimization model of cable production scheduling under multiple production lines and complex resource constraints, with the objective of minimizing the penalty cost for missed delivery deadlines. On this basis, within a hyper-heuristic framework, a reinforcement learning mechanism serves as the HLH strategy, and simple heuristic rules tailored to the cable production scheduling problem are designed to build the set of LLH methods, so that the scheduling problem can be solved by optimization.

The invention is achieved through the following technical solution:

1. An optimization method for cable production scheduling based on reinforcement learning, comprising the following steps:

Step 1. Establish a constrained optimization model of the cable production scheduling problem.

Cable production turns the raw material, copper or aluminum rod, into wire and cable through process stages including wire drawing and annealing, bunching/stranding, extrusion, cabling, sheath extrusion, and armoring. Annealing applies mainly to copper rod and increases the flexibility of the drawn wire. The equipment at each stage needs matching molds to produce a particular cable model; on a given machine at a given stage, producing a different model requires changing the mold, and the changeover takes time. A cable product may be output after any of the stages wire drawing and annealing, bunching/stranding, extrusion, cabling, and sheath extrusion.

Suppose the cable production line has m machines and N orders {J_1, J_2, …, J_N} awaiting production. Each order J_i (i = 1, 2, …, N) corresponds, according to the production process of its cable model, to a set of n operations O_i = {O_i1, O_i2, …, O_in}. An order contains only one cable product specification. Let M_g be the set of machines used at process stage g (g = 1, 2, …, 6), G_gh the h-th production specification at stage g, G_ig the production specification of order J_i at stage g, and G′_gh the number of mold sets available for producing specification G_gh. On machine M_k (k = 1, 2, …, m), if production switches from order J_i to another order J_i′ and the two orders have different cable specifications, the mold changeover time is S_ii′k. Let B_ij and C_ij be the start and completion times of operation O_ij (i = 1, 2, …, N; j = 1, 2, …, n), and let B_ik and C_ik be the start and completion times of order J_i on machine k. The optimization goal is to minimize the penalty cost for missed deadlines by suitably assigning machines to, and sequencing, the operations of the different jobs. The objective function of the cable production scheduling problem is:

[Formula (1): equation image not reproduced in this text]

where D_i is the delivery deadline of order J_i, C_i is the completion time of order J_i, and w_i is the urgency weight of each order with respect to its deadline.
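Because formula (1) survives here only as an image placeholder, the following is a minimal sketch of one common reading of "minimizing the deadline delay penalty", namely weighted tardiness: the sum over orders of w_i · max(0, C_i − D_i). That functional form and the name `weighted_tardiness` are assumptions, not the patent's confirmed formula.

```python
from typing import Sequence


def weighted_tardiness(completion: Sequence[float],
                       due: Sequence[float],
                       weight: Sequence[float]) -> float:
    """Assumed form of objective (1): sum_i w_i * max(0, C_i - D_i).

    completion[i] is C_i, due[i] is D_i, weight[i] is w_i.
    Orders finished on or before their deadline contribute nothing.
    """
    return sum(w * max(0.0, c - d)
               for c, d, w in zip(completion, due, weight))


# Order 1 is 2 time units late with weight 2.0; order 2 is early.
penalty = weighted_tardiness([10.0, 5.0], [8.0, 6.0], [2.0, 1.0])
```

Under this reading, an order that finishes early earns no credit, so the schedule is driven only by late orders and their urgency weights.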

The constraints are as follows:

[Formula (2): equation image not reproduced in this text]

[Formula (3): equation image not reproduced in this text]

[Formula (4): equation image not reproduced in this text]

[Formula (5): equation image not reproduced in this text]

[Formula (6): equation image not reproduced in this text]

Constraint (2) requires that, within the same order J_i, an operation cannot start until the preceding operation has finished. Constraint (3) requires that, on machine k, the immediately succeeding operation cannot start until the preceding one has finished, with the mold changeover time taken into account. Constraint (5) limits the number of molds available at a given stage of cable production. The scheduling model established in this step simultaneously accounts for multiple cable models, changeovers between molds of different models, and mold resource limits, and therefore matches actual cable production more closely.
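The constraint formulas themselves are not available in this text, so the sketch below checks only the two constraints whose meaning the prose states: operation precedence within an order (constraint (2)) and machine sequencing with changeover time (constraint (3)). The tuple layout, the function names, and the `setup` dictionary keyed by (previous order, next order, machine) are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# (order i, operation index j, machine k, start B_ij, finish C_ij); assumed layout
Op = Tuple[int, int, int, float, float]


def violates_precedence(ops: List[Op]) -> bool:
    """Constraint (2) as described: O_ij may not start before O_i(j-1) finishes."""
    finish = {(i, j): c for i, j, _, _, c in ops}
    return any((i, j - 1) in finish and b < finish[(i, j - 1)]
               for i, j, _, b, _ in ops)


def violates_machine_sequence(ops: List[Op],
                              setup: Dict[Tuple[int, int, int], float]) -> bool:
    """Constraint (3) as described: consecutive jobs on machine k must be
    separated by at least the changeover time S_ii'k (0 if not listed)."""
    by_machine: Dict[int, List[Tuple[float, float, int]]] = {}
    for i, _, k, b, c in ops:
        by_machine.setdefault(k, []).append((b, c, i))
    for k, jobs in by_machine.items():
        jobs.sort()  # order by start time
        for (_, c1, i1), (b2, _, i2) in zip(jobs, jobs[1:]):
            if b2 < c1 + setup.get((i1, i2, k), 0.0):
                return True
    return False
```

The mold-count limit of constraint (5) is deliberately left out, since its exact form cannot be recovered from the prose alone.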

Step 2. Initialize the optimization algorithm and the reinforcement learning parameters.

2.1. Initialize the algorithm parameters: the current iteration count t, the maximum number of iterations maxT, and the number of iterations per period T.

2.2. Initialize the reinforcement learning action set: construct a global search operator set Λ = {a_1, a_2, …, a_λ} and a neighborhood search operator set Γ = {a′_1, a′_2, …, a′_γ}, and take A = Λ ∪ Γ as the action set. Operators in Λ are based on crossover operations; operators in Γ are based on swap operations.

2.3. Generate the initial solution: randomly generate an initial solution composed of the operations of the N orders, that is, X_t = Ruffled{O_1, O_2, …, O_N}, where Ruffled(·) randomly shuffles the order of its arguments.
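As an illustration of steps 2.2 and 2.3, here is a small sketch of a shuffled initial solution and one swap-based neighborhood operator. The solution is encoded as a permutation of order indices, which is an assumed encoding; the patent does not spell out its operators, so `swap_operator` is only one plausible member of Γ.

```python
import random
from typing import List


def ruffled(orders: List[int], rng: random.Random) -> List[int]:
    """Return a randomly shuffled copy of the order list (the Ruffled(.) step)."""
    shuffled = list(orders)
    rng.shuffle(shuffled)
    return shuffled


def swap_operator(solution: List[int], rng: random.Random) -> List[int]:
    """Hypothetical swap operator: exchange two distinct positions."""
    neighbor = list(solution)
    a, b = rng.sample(range(len(neighbor)), 2)
    neighbor[a], neighbor[b] = neighbor[b], neighbor[a]
    return neighbor


rng = random.Random(0)           # fixed seed only to make the sketch repeatable
x0 = ruffled(list(range(1, 8)), rng)   # 7 orders, as in the worked example
x1 = swap_operator(x0, rng)
```

Both functions return new lists, so the current solution is kept intact until a candidate is accepted.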

Step 3. Randomly select an initial state s_t and an action χ_t (χ_t ∈ A) for that state.

Step 4. Apply χ_t to X_t as the search operator and run it T times in succession. On each run, generate a schedule using the earliest-completion-time-first rule, as follows:

4.1. Go through all machines and check whether operation O_ij can be processed on each. If it can, compute the completion time of O_ij on that machine, subject to the constraints given by formulas (2)-(6).

4.2. Assign O_ij to the machine with the smallest completion time.

4.3. Generate the production schedule of the orders on the machines and compute the objective value F(·) using formula (1).
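Steps 4.1 to 4.3 can be sketched as a greedy decoder that turns an order sequence into completion times. This is a simplified illustration under stated assumptions: `proc_time[order]` lists, per operation, the eligible machines and durations; changeover times are keyed by (previous order, next order, machine); the mold-count limit of constraint (5) is not modeled. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decode(sequence: List[int],
           proc_time: Dict[int, List[Dict[int, float]]],
           setup: Optional[Dict[Tuple[int, int, int], float]] = None
           ) -> Dict[int, float]:
    """Assign each operation to the eligible machine that finishes it earliest.

    Returns the completion time C_i of every order in `sequence`.
    """
    setup = setup or {}
    machine_free: Dict[int, float] = {}   # time at which each machine is idle
    machine_last: Dict[int, int] = {}     # last order processed on each machine
    completion: Dict[int, float] = {}
    for order in sequence:
        ready = 0.0                        # finish time of the previous operation
        for options in proc_time[order]:
            best: Optional[Tuple[float, int]] = None
            for machine, duration in options.items():
                prev = machine_last.get(machine)
                change = 0.0 if prev is None else setup.get((prev, order, machine), 0.0)
                start = max(ready, machine_free.get(machine, 0.0) + change)
                finish = start + duration
                if best is None or finish < best[0]:
                    best = (finish, machine)
            if best is None:
                raise ValueError("operation has no eligible machine")
            ready, chosen = best
            machine_free[chosen] = ready
            machine_last[chosen] = order
        completion[order] = ready
    return completion
```

The returned completion times are what an objective such as formula (1) would then be evaluated on.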

If the new solution is better, it replaces the current solution. After the T runs, compute the value λ according to formula (7):

[Formula (7): equation image not reproduced in this text]

Step 5. Select the state s_t according to the value of λ, that is, λ ∈ {s | s = θ_1, θ_2, θ_3}, where θ_1 = [0.9, 1], θ_2 = [0.5, 0.9), and θ_3 = [0, 0.5) are the interval thresholds of the state space.
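The interval lookup in step 5 is simple enough to show directly. The sketch below maps a λ value to a state index using the three intervals given in the text; the integer encoding of the states (0 for θ_1, 1 for θ_2, 2 for θ_3) and the function name are assumptions.

```python
def state_from_lambda(lam: float) -> int:
    """Map lambda to a state index using the intervals from step 5.

    0 -> theta_1 = [0.9, 1], 1 -> theta_2 = [0.5, 0.9), 2 -> theta_3 = [0, 0.5).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda is expected to lie in [0, 1]")
    if lam >= 0.9:
        return 0
    if lam >= 0.5:
        return 1
    return 2
```

With only three states, the Q table stays tiny: one row per state and one column per operator in A.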

Step 6. Generate a random number r (r ∈ [0, 1]) and obtain the next action χ_t from the reinforcement probability ε computed by formula (8): when r < ε, select the action with the highest Q value for state s_t; otherwise, select an action for state s_t at random.

[Formula (8): equation image not reproduced in this text]

In formula (8), maxT is the preset maximum number of iterations.
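The selection rule in step 6 can be sketched as follows. Note the convention in the text: ε is the probability of exploiting the best-known action, not of exploring. Since formula (8) is not available here, ε is taken as an input rather than computed; `select_action` and the list-based Q row are illustrative assumptions.

```python
import random
from typing import List


def select_action(q_row: List[float], epsilon: float, rng: random.Random) -> int:
    """Pick an action index for the current state.

    With probability epsilon (r < epsilon) take the action with the highest
    Q value; otherwise pick uniformly at random among all actions.
    """
    r = rng.random()  # r in [0, 1)
    if r < epsilon:
        return max(range(len(q_row)), key=q_row.__getitem__)
    return rng.randrange(len(q_row))


rng = random.Random(1)
greedy_choice = select_action([0.1, 0.7, 0.3], 1.0, rng)   # always exploits
random_choice = select_action([0.1, 0.7, 0.3], 0.0, rng)   # always explores
```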

Step 7. Evaluate the utility of the current action χ_t from the result of executing it, in order to guide the search direction of the hyper-heuristic. The utility function r_t of executing action χ_t is defined as:

[Formula (9): equation image not reproduced in this text]

Update the Q values of all actions χ′_t in the action set to which χ_t belongs according to the learning function in formula (10), and determine the next state by the state representation mechanism:

[Formula (10): equation image not reproduced in this text]

In formula (10), Q_t(s_t, χ_t) is the Q value of action χ_t in state s_t at iteration t, α is the learning rate, and γ is the discount factor, with γ = 0.8; α is adjusted adaptively as in formula (11):

[Formula (11): equation image not reproduced in this text]
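Formulas (9) to (11) are not recoverable from this text, so the sketch below substitutes the textbook Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s′, ·) − Q(s, a)], with γ = 0.8 as stated in step 7. Whether formula (10) has exactly this form is an assumption, and the adaptive learning rate of formula (11) is replaced by a caller-supplied `alpha`. The patent also updates every action in the set that χ_t belongs to; this sketch updates a single entry for brevity.

```python
from typing import List

GAMMA = 0.8  # discount factor stated in the text


def q_update(q: List[List[float]], state: int, action: int, reward: float,
             next_state: int, alpha: float, gamma: float = GAMMA) -> None:
    """Standard Q-learning update, applied in place to the table q[state][action]."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


q_table = [[0.0, 0.0], [1.0, 2.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, alpha=0.5)
```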

Step 8. Check whether t ≤ maxT holds. If so, return to step 4 and continue; otherwise, output the best schedule and its Gantt chart.

The beneficial effects of the invention are as follows. Based on the actual production conditions of cable enterprises, a cable production scheduling model under multiple production lines and complex resource constraints is established, with minimization of the penalty cost for missed deadlines as the optimization goal. On this basis, a hyper-heuristic scheduling optimization method based on reinforcement learning is proposed. Within the hyper-heuristic framework, a set of LLH methods with both global and local search capability is designed; under the reinforcement learning mechanism, this LLH set serves as the action set, and the appropriate LLH method is selected dynamically for single-solution iterative optimization. The method uses a single-row encoding and a single-solution iteration scheme, is simple and practical, and has low algorithmic complexity. It can effectively improve production and management efficiency in the traditional cable industry and is significant for improving quality and efficiency and for the transformation and upgrading of traditional industries.

Description of the Drawings

For ease of explanation, the invention is described in detail through the following specific embodiment and the accompanying drawings.

Figure 1 is a schematic diagram of the cable production process.

Figure 2 is a flowchart of the hyper-heuristic scheduling optimization algorithm based on reinforcement learning.

Figure 3 is a Gantt chart of the scheduling solution.

Detailed Description

A preferred embodiment of the invention is described in detail below with reference to the drawings, so that those skilled in the art can more readily understand its advantages and features and so that its scope of protection is defined more clearly.

Figure 1 shows the production flow of a cable enterprise. The raw material, copper or aluminum rod, is turned into wire and cable through process stages including wire drawing and annealing, bunching/stranding, extrusion, cabling, sheath extrusion, and armoring. Annealing applies mainly to copper rod and increases the flexibility of the drawn wire. The equipment at each stage needs matching molds to produce a particular cable model; on a given machine at a given stage, producing a different model requires changing the mold, and the changeover takes time. A cable product may be output after any of the stages wire drawing and annealing, bunching/stranding, extrusion, cabling, and sheath extrusion. In the cable industry, customer orders usually specify a delivery deadline, and late delivery increases breach-of-contract costs. For these reasons, the embodiment is described using minimization of the penalty cost for missed deadlines as the example.

Step 1. Suppose the cable production line has m machines available for the process stages above and N orders {J_1, J_2, …, J_N} awaiting production. Each order J_i (i = 1, 2, …, N) corresponds, according to the production process of its product model, to a set of n operations O_i = {O_i1, O_i2, …, O_in}. An order contains only one cable product specification. Let M_g be the set of machines used at process stage g (g = 1, 2, …, 6), G_gh the h-th production specification at stage g, G_ig the production specification of order J_i at stage g, and G′_gh the number of mold sets available for producing specification G_gh. On machine M_k (k = 1, 2, …, m), if production switches from order J_i to another order J_i′ and the two orders have different cable specifications, the mold changeover time is S_ii′k. In addition, let B_ij and C_ij be the start and completion times of operation O_ij (i = 1, 2, …, N; j = 1, 2, …, n), and let B′_ik and C″_ik be the start and completion times of order J_i on machine k. The optimization goal is to minimize the penalty cost for missed deadlines by suitably assigning machines to, and sequencing, the operations of the different jobs.

The objective function is:

[Formula (1): equation image not reproduced in this text]

where D_i is the delivery deadline of order J_i, C_i is the completion time of order J_i, and w_i is the urgency weight of each order with respect to its deadline.

The constraints are as follows:

[Formula (2): equation image not reproduced in this text]

[Formula (3): equation image not reproduced in this text]

[Formula (4): equation image not reproduced in this text]

[Formula (5): equation image not reproduced in this text]

[Formula (6): equation image not reproduced in this text]

Constraint (2) requires that, within the same order J_i, an operation cannot start until the preceding operation has finished. Constraint (3) requires that, on machine k, the immediately succeeding operation cannot start until the preceding one has finished, with the mold changeover time taken into account. Constraint (5) limits the number of molds available at a given stage of cable production.

A specific application of the reinforcement-learning-based hyper-heuristic scheduling optimization algorithm to the cable production scheduling problem is as follows.

Table 1 gives an instance of the cable production scheduling problem. The instance contains 7 orders, 34 operations, and 10 machines. Each order has a delivery deadline, and each operation has a production specification, a mold-count limit, a production time, and a set of usable machines. Table 2 gives the mold changeover times between specifications.

Table 1. Instance of the cable production scheduling problem

[Table 1: table images not reproduced in this text]

Table 2. Mold changeover times between specifications ("-" marks pairs with no entry)

       G11  G12  G21  G22  G31  G32  G41  G42  G51  G52  G61  G62
G11      0    3    -    -    -    -    -    -    -    -    -    -
G12      1    0    -    -    -    -    -    -    -    -    -    -
G21      -    -    0    4    -    -    -    -    -    -    -    -
G22      -    -    2    0    -    -    -    -    -    -    -    -
G31      -    -    -    -    0    1    -    -    -    -    -    -
G32      -    -    -    -    2    0    -    -    -    -    -    -
G41      -    -    -    -    -    -    0    3    -    -    -    -
G42      -    -    -    -    -    -    3    0    -    -    -    -
G51      -    -    -    -    -    -    -    -    0    1    -    -
G52      -    -    -    -    -    -    -    -    3    0    -    -
G61      -    -    -    -    -    -    -    -    -    -    0    3
G62      -    -    -    -    -    -    -    -    -    -    6    0
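For use in code, the changeover table can be held as a sparse lookup, since entries exist only between the two specifications of the same process stage. The sketch below assumes that rows are the specification being switched from and columns the specification being switched to; the source does not state the reading direction, and the time unit is likewise unspecified.

```python
from typing import Dict, Tuple

# (from_spec, to_spec) -> changeover time; row = from, column = to (assumed)
CHANGEOVER: Dict[Tuple[str, str], int] = {
    ("G11", "G12"): 3, ("G12", "G11"): 1,
    ("G21", "G22"): 4, ("G22", "G21"): 2,
    ("G31", "G32"): 1, ("G32", "G31"): 2,
    ("G41", "G42"): 3, ("G42", "G41"): 3,
    ("G51", "G52"): 1, ("G52", "G51"): 3,
    ("G61", "G62"): 3, ("G62", "G61"): 6,
}


def changeover_time(from_spec: str, to_spec: str) -> int:
    """Return the mold changeover time, 0 when the specification is unchanged."""
    if from_spec == to_spec:
        return 0
    try:
        return CHANGEOVER[(from_spec, to_spec)]
    except KeyError:
        raise ValueError(f"no changeover defined from {from_spec} to {to_spec}")
```

Several pairs are asymmetric (for example G61 to G62 versus G62 to G61), so the order of the two arguments matters.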

Thus N = 7 and m = 10. The reinforcement-learning-based hyper-heuristic scheduling optimization algorithm solves the cable production scheduling problem in the following steps:

Step 2. Initialize the optimization algorithm and the reinforcement learning parameters.

2.1. Initialize the algorithm parameters: current iteration count t = 1, maximum number of iterations maxT = 300, iterations per period T = 3, and all entries of the Q-value table set to 0.

2.2. Initialize the reinforcement learning action set: construct a global search operator set Λ = {a_1, a_2, …, a_λ} and a neighborhood search operator set Γ = {a′_1, a′_2, …, a′_γ}, and take A = Λ ∪ Γ as the action set. Operators in Λ are mainly based on crossover operations; operators in Γ are mainly based on swap operations.

2.3. Generate the initial solution: randomly generate an initial solution composed of the operations of the 7 orders, that is, X_t = Ruffled{O_1, O_2, …, O_7}, where Ruffled(·) randomly shuffles the order of its arguments.

Step 3. Randomly select the initial state s<sub>t</sub> and one of the actions χ<sub>t</sub> (χ<sub>t</sub> ∈ A) available in s<sub>t</sub>.

Step 4. Apply χ<sub>t</sub> as the search operator to X<sub>t</sub> and run it T times consecutively. In each run, if the new solution is better, it replaces the current solution. After the T runs, compute the value of λ by formula (7):

[Formula (7): equation image BDA0002453344550000101]
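The accept-if-better loop of Step 4 can be sketched as below. The swap operator stands in for one operator of Γ, and `toy_objective` is a hypothetical placeholder for the real schedule objective; both are assumptions made for illustration. Minimization is assumed, and the computation of λ is left out because formula (7) is only available as an image.

```python
import random


def swap_operator(solution, rng):
    """Return a copy of the solution with two random positions exchanged."""
    i, j = rng.sample(range(len(solution)), 2)
    neighbor = list(solution)
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    return neighbor


def run_period(solution, operator, objective, runs, rng):
    """Apply the operator `runs` times, keeping a candidate only if it improves."""
    best, best_cost = list(solution), objective(solution)
    for _ in range(runs):
        candidate = operator(best, rng)
        cost = objective(candidate)
        if cost < best_cost:  # smaller objective value is better
            best, best_cost = candidate, cost
    return best, best_cost


def toy_objective(solution):
    """Hypothetical cost: total displacement of each order from its sorted slot."""
    return sum(abs(order - (pos + 1)) for pos, order in enumerate(solution))


rng = random.Random(1)
start = [3, 1, 2, 5, 4, 7, 6]
improved, cost = run_period(start, swap_operator, toy_objective, runs=3, rng=rng)
```

Because a candidate is kept only when it is strictly better, the returned cost can never exceed the cost of the starting solution.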

Step 5. Select the state s<sub>t</sub> that corresponds to the value of λ, i.e. λ ∈ {s | s = θ<sub>1</sub>, θ<sub>2</sub>, θ<sub>3</sub>}, where θ<sub>1</sub> = [0.9, 1], θ<sub>2</sub> = [0.5, 0.9) and θ<sub>3</sub> = [0, 0.5) are the intervals that partition the state space.
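The interval lookup in Step 5 amounts to a threshold test. A minimal sketch, assuming λ has already been computed and lies in [0, 1]; the function name and the string state labels are hypothetical:

```python
def state_from_lambda(lam):
    """Map a lambda value in [0, 1] to the state whose interval contains it."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    if lam >= 0.9:
        return "theta1"  # [0.9, 1]
    if lam >= 0.5:
        return "theta2"  # [0.5, 0.9)
    return "theta3"      # [0, 0.5)
```

The boundary values 0.9 and 0.5 fall into θ<sub>1</sub> and θ<sub>2</sub> respectively, matching the closed lower bounds of those intervals.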

Step 6. Generate a random number r (r ∈ [0, 1]) and determine the next action χ<sub>t</sub> from the reinforcement probability ε computed by formula (8). When r < ε, select the action with the highest Q value for state s<sub>t</sub>; otherwise, select one of the actions available in state s<sub>t</sub> at random.

[Formula (8): equation image BDA0002453344550000102]

In formula (8), maxT is the preset maximum number of iterations.
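The selection rule of Step 6 can be sketched as follows. Since formula (8) is only available as an image, ε is passed in as an argument rather than computed. The function name, the dictionary layout of the Q row, and the random tie-breaking among equally good actions are assumptions of this sketch.

```python
import random


def select_action(q_row, epsilon, rng):
    """Pick an action for one state.

    q_row maps each action to its Q value in the current state.
    With probability epsilon the highest-valued action is taken;
    otherwise an action is drawn uniformly at random.
    """
    r = rng.random()  # note: Python draws from [0, 1), not [0, 1]
    if r < epsilon:
        best_q = max(q_row.values())
        best_actions = [a for a, q in q_row.items() if q == best_q]
        return rng.choice(best_actions)  # assumed tie-breaking rule
    return rng.choice(list(q_row))


greedy = select_action({"swap": 0.1, "crossover": 0.9}, 1.0, random.Random(0))
explore = select_action({"swap": 0.1, "crossover": 0.9}, 0.0, random.Random(0))
```

Note that ε here is the probability of exploiting the Q table, which is the reverse of the convention in many ε-greedy descriptions where ε is the exploration probability.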

Step 7. Evaluate the utility of the result of executing the current action χ<sub>t</sub> in order to guide the search direction of the hyper-heuristic algorithm. The present invention defines the utility value function r<sub>t</sub> of executing action χ<sub>t</sub> as:

[Formula (9): equation image BDA0002453344550000103]

On this basis, the Q values of all actions χ′<sub>t</sub> in the action set to which χ<sub>t</sub> belongs are updated by the learning function in formula (10), and the next state is determined by the state representation mechanism.

[Formula (10): equation image BDA0002453344550000104]

In formula (10), Q<sub>t</sub>(s<sub>t</sub>, χ<sub>t</sub>) is the Q value of action χ<sub>t</sub> in state s<sub>t</sub> at iteration t, α is the learning rate, and γ is the discount factor, with γ = 0.8. α is adjusted adaptively as shown in formula (11).

[Formula (11): equation image BDA0002453344550000111]
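Formulas (10) and (11) are only available as images, so the sketch below uses the textbook Q-learning update, Q(s, a) ← Q(s, a) + α[r + γ·max Q(s′, ·) − Q(s, a)], as an assumed stand-in; the patent's exact learning function may differ. Only γ = 0.8 comes from the text. The learning rate α is passed in as a plain argument instead of being adapted, and the function name and table layout are hypothetical.

```python
GAMMA = 0.8  # discount factor given in the description


def q_update(q_table, state, action, reward, next_state, alpha, gamma=GAMMA):
    """Apply one textbook Q-learning update and return the new Q value.

    This is an assumed form, not the patent's formula (10).
    """
    best_next = max(q_table[next_state].values())
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q_table[state][action]


q = {
    "theta1": {"swap": 0.0, "crossover": 0.0},
    "theta2": {"swap": 1.0, "crossover": 0.0},
}
new_value = q_update(q, "theta1", "swap", reward=1.0, next_state="theta2", alpha=0.5)
```

In this example the new value is 0 + 0.5 × (1.0 + 0.8 × 1.0 − 0) = 0.9. To follow the description, which updates every action in the set that χ<sub>t</sub> belongs to, the function would be called once per action of that set.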

Step 8. Check whether t ≤ maxT holds. If it does, return to Step 4 and continue; otherwise, output the best schedule X<sub>best</sub>. The objective function value obtained in this embodiment is 39. The corresponding Gantt chart is shown in Figure 3, where the intervals marked A are mold changeover times.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or substitution that can be conceived without creative effort shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope defined in the claims.

Claims (5)

1. A cable production scheduling optimization method based on reinforcement learning, characterized in that the method comprises the following steps:
Step 1. Establish a constrained optimization mathematical model of the cable production scheduling problem;
The cable production line has m machines in total and N orders {J<sub>1</sub>, J<sub>2</sub>, …, J<sub>N</sub>} to be produced. According to the production process requirements of its cable product model, each order J<sub>i</sub> (i = 1, 2, …, N) corresponds to a set of n operations O<sub>i</sub> = {O<sub>i1</sub>, O<sub>i2</sub>, …, O<sub>in</sub>}. An order contains only one cable product specification. The set of machines used for production at process stage g (g = 1, 2, …, 6) is M<sub>g</sub>; G<sub>gh</sub> denotes the h-th production specification at process stage g; [symbol image FDA0002453344540000011] is the production specification of order J<sub>i</sub> at process stage g; G′<sub>gh</sub> is the number of mold sets available for producing cable specification G<sub>gh</sub>. For production on machine M<sub>k</sub> (k = 1, 2, …, m), if production switches from order J<sub>i</sub> to another order J<sub>i′</sub> and the two orders have different cable specifications, the time required to change the mold is S<sub>ii′k</sub>. The start time and completion time of operation O<sub>ij</sub> (i = 1, 2, …, N; j = 1, 2, …, n) are B<sub>ij</sub> and C<sub>ij</sub> respectively. The start time and completion time of order J<sub>i</sub> on machine k are B′<sub>ik</sub> and C′<sub>i′k</sub> respectively. With the minimization of the deadline tardiness penalty cost as the optimization objective, the processing machines and the sequence of the operations of the different jobs are arranged; the objective function of the cable production scheduling problem is:
[Formula (1): equation image FDA0002453344540000012]
where D<sub>i</sub> is the delivery deadline of order J<sub>i</sub>, C<sub>i</sub> is the completion time of order J<sub>i</sub>, and w<sub>i</sub> is the urgency weight factor of each order with respect to its deadline. The constraints are as follows:
[Formula (2): equation image FDA0002453344540000013]
[Formula (3): equation image FDA0002453344540000014]
[Formula (4): equation image FDA0002453344540000015]
[Formula (5): equation image FDA0002453344540000016]
[Formula (6): equation image FDA0002453344540000017]
where constraint (2) specifies that, within the same order J<sub>i</sub>, an operation can start only after the preceding operation has finished; constraint (3) specifies that, on machine k, the immediately following operation can start only after the preceding operation has finished;
Step 2. Initialize the optimization algorithm and the reinforcement learning parameters;
2.1. Initialize the algorithm parameters: current iteration t, maximum number of iterations maxT, number of runs per period T;
2.2. Generate the initial solution: randomly generate an initial solution composed of the operation sets of the N orders, i.e. X<sub>t</sub> = Ruffled{O<sub>1</sub>, O<sub>2</sub>, …, O<sub>N</sub>}, where Ruffled(·) is a random shuffling operation;
Step 3. Randomly select the initial state s<sub>t</sub> and one of the actions χ<sub>t</sub> (χ<sub>t</sub> ∈ A) available in s<sub>t</sub>;
Step 4. Apply χ<sub>t</sub> as the search operator to X<sub>t</sub> and run it T times consecutively. In each run, generate a schedule using the minimum-completion-time-first rule; if the new solution is better, it replaces the current solution. After the T runs, compute the value of λ by formula (7);
[Formula (7): equation image FDA0002453344540000021]
Step 5. Select the state s<sub>t</sub> that corresponds to the value of λ, i.e. λ ∈ {s | s = θ<sub>1</sub>, θ<sub>2</sub>, θ<sub>3</sub>}, where θ<sub>1</sub> = [0.9, 1], θ<sub>2</sub> = [0.5, 0.9) and θ<sub>3</sub> = [0, 0.5) are the intervals that partition the state space;
Step 6. Generate a random number r (r ∈ [0, 1]) and determine the next action χ<sub>t</sub> from the reinforcement probability ε computed by formula (8); when r < ε, select the action with the highest Q value for state s<sub>t</sub>; otherwise, select one of the actions available in state s<sub>t</sub> at random;
[Formula (8): equation image FDA0002453344540000022]
In formula (8), maxT is the preset maximum number of iterations;
Step 7. Evaluate the utility of the result of executing the current action χ<sub>t</sub> in order to guide the search direction of the hyper-heuristic algorithm, and define the utility value function r<sub>t</sub> of executing action χ<sub>t</sub> as:
[Formula (9): equation image FDA0002453344540000023]
Update the Q values of all actions χ′<sub>t</sub> in the action set to which χ<sub>t</sub> belongs by the learning function in formula (10), and determine the next state by the state representation mechanism;
[Formula (10): equation image FDA0002453344540000024]
In formula (10), Q<sub>t</sub>(s<sub>t</sub>, χ<sub>t</sub>) is the Q value of action χ<sub>t</sub> in state s<sub>t</sub> at iteration t, α is the learning rate, and γ is the discount factor, with γ = 0.8; α is adjusted adaptively as shown in formula (11);
[Formula (11): equation image FDA0002453344540000025]
Step 8. Check whether t ≤ maxT holds. If it does, return to Step 4 and continue; otherwise, output the optimal scheduling scheme and its corresponding Gantt chart.
2. The cable production scheduling optimization method according to claim 1, characterized in that a step is added after step 2.1 and before step 2.2, the step being to initialize the reinforcement learning action set: construct the global search operator set Λ = {a<sub>1</sub>, a<sub>2</sub>, …, a<sub>λ</sub>} and the neighborhood search operator set Γ = {a′<sub>1</sub>, a′<sub>2</sub>, …, a′<sub>γ</sub>}, and take A = Λ ∪ Γ as the action set, where the operators in Λ are based on crossover operations and the operators in Γ are based on swap operations.
3. The cable production scheduling optimization method according to claim 1, characterized in that the scheduling scheme in step 4 is generated by the following specific steps:
4.1. Traverse all machines and determine whether operation O<sub>ij</sub> can be processed on each machine; if it can, compute the completion time of operation O<sub>ij</sub> on that machine, subject to the constraints given by formulas (2)-(6);
4.2. Select the machine with the smallest completion time as the machine assigned to process O<sub>ij</sub>;
4.3. Generate the production schedule of the orders on the machines, and compute the objective function value F(·) by formula (1).
4. The cable production scheduling optimization method according to claim 3, characterized in that, in step 4.2, if several machines have the same minimum completion time, the assigned machine is selected at random among them.
5. The cable production scheduling optimization method according to claim 1, characterized in that, in step 1, constraint (3) takes the mold changeover time into account, and constraint (5) specifies the limit on the number of molds for a given operation in cable production.
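The machine assignment rule of claims 3 and 4 and the evaluation of a schedule can be sketched in Python as follows. This is a simplified illustration under stated assumptions: mold changeover times and mold-count limits are omitted, the objective is assumed to be a weighted tardiness sum of the form Σ w<sub>i</sub>·max(0, C<sub>i</sub> − D<sub>i</sub>) (formula (1) itself is only available as an image), and all function names and data layouts are hypothetical.

```python
import random


def assign_machine(ready_time, machine_free, proc_time, rng):
    """Choose the eligible machine on which the operation finishes earliest.

    ready_time   -- completion time of the order's preceding operation
    machine_free -- machine -> time at which the machine becomes free
    proc_time    -- machine -> processing time (eligible machines only)
    Ties are broken at random, as in claim 4.
    """
    completion = {
        k: max(ready_time, machine_free[k]) + p for k, p in proc_time.items()
    }
    earliest = min(completion.values())
    ties = [k for k, c in completion.items() if c == earliest]
    machine = rng.choice(ties)
    return machine, completion[machine]


def weighted_tardiness(completion, due, weight):
    """Assumed objective: sum of w_i * max(0, C_i - D_i) over all orders."""
    return sum(weight[i] * max(0, completion[i] - due[i]) for i in completion)


rng = random.Random(0)
machine, finish = assign_machine(
    ready_time=3,
    machine_free={"M1": 4, "M2": 0, "M3": 2},
    proc_time={"M1": 2, "M2": 7},  # M3 cannot process this operation
    rng=rng,
)
penalty = weighted_tardiness({1: 10, 2: 5}, {1: 8, 2: 6}, {1: 2, 2: 1})
```

In the example, M1 finishes at max(3, 4) + 2 = 6 and M2 at max(3, 0) + 7 = 10, so M1 is chosen. Order 1 is two time units late with weight 2 and order 2 is on time, which gives a penalty of 4.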
CN202010299221.9A 2020-04-16 2020-04-16 Cable production scheduling optimization method based on reinforcement learning Active CN111507523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010299221.9A CN111507523B (en) 2020-04-16 2020-04-16 Cable production scheduling optimization method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010299221.9A CN111507523B (en) 2020-04-16 2020-04-16 Cable production scheduling optimization method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN111507523A (en) 2020-08-07
CN111507523B (en) 2023-04-18

Family

ID=71864129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010299221.9A Active CN111507523B (en) 2020-04-16 2020-04-16 Cable production scheduling optimization method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN111507523B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103390195A (en) * 2013-05-28 2013-11-13 重庆大学 Machine workshop task scheduling energy-saving optimization system based on reinforcement learning
CN105809344A (en) * 2016-03-07 2016-07-27 浙江财经大学 Hyper-heuristic algorithm based ZDT flow shop job scheduling method
US20180121766A1 (en) * 2016-09-18 2018-05-03 Newvoicemedia, Ltd. Enhanced human/machine workforce management using reinforcement learning
US20180080949A1 (en) * 2016-09-21 2018-03-22 Roche Diagnostics Operations, Inc. Automated scheduler for laboratory equipment
CN107168267A (en) * 2017-06-29 2017-09-15 山东万腾电子科技有限公司 Based on the production scheduling method and system for improving population and heuristic strategies
CN108694502A (en) * 2018-05-10 2018-10-23 清华大学 An Adaptive Scheduling Method for Robotic Manufacturing Cells Based on XGBoost Algorithm
CN109270904A (en) * 2018-10-22 2019-01-25 中车青岛四方机车车辆股份有限公司 A kind of flexible job shop batch dynamic dispatching optimization method
CN110517002A (en) * 2019-08-29 2019-11-29 烟台大学 Production Control Method Based on Reinforcement Learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
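伊雅丽: "研发型企业多项目人力资源调度研究――基于蚁群优化的超启发式算法" (Research on multi-project human resource scheduling in R&D enterprises: a hyper-heuristic algorithm based on ant colony optimization) *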

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112150088A (en) * 2020-11-26 2020-12-29 深圳市万邑通信息科技有限公司 Huff-puff flexible intelligent assembly logistics path planning method and system
CN112418549A (en) * 2020-12-03 2021-02-26 华能秦煤瑞金发电有限责任公司 Cable in-out management method
CN112613643A (en) * 2020-12-08 2021-04-06 北京电子工程总体研究所 Maintenance support resource combined inventory configuration method based on hyper-heuristic algorithm
CN112613643B (en) * 2020-12-08 2024-06-25 北京电子工程总体研究所 Maintenance support resource joint inventory configuration method based on hyper-heuristic algorithm
CN112598255A (en) * 2020-12-17 2021-04-02 上海交通大学 Automatic wharf outlet box position allocation optimization method based on hyper-heuristic algorithm
CN113378343A (en) * 2021-07-09 2021-09-10 浙江盘盘科技有限公司 Cable production scheduling method based on discrete Jaya algorithm
CN117391423A (en) * 2023-12-11 2024-01-12 东北大学 A multi-constraint automated scheduling method for chip high multi-layer ceramic packaging substrate production lines
CN117391423B (en) * 2023-12-11 2024-03-22 东北大学 A multi-constraint automated scheduling method for chip high multi-layer ceramic packaging substrate production lines
CN117575581A (en) * 2024-01-16 2024-02-20 江苏中凯金属科技有限公司 Aluminum bar production method for recycling waste aluminum materials
CN117575581B (en) * 2024-01-16 2024-04-26 江苏中凯金属科技有限公司 Aluminum bar production method for recycling waste aluminum materials

Also Published As

Publication number Publication date
CN111507523B (en) 2023-04-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant