Abstract
Deep reinforcement learning (DRL) has been preliminarily applied to run-to-run (RtR) control. However, existing works have mainly addressed shift and drift disturbances in the chemical mechanical polishing (CMP) process and have not taken non-stationary time-series disturbances fully into account. Inspired by the powerful self-learning mechanism of DRL, this work designs a new distributional reinforcement learning controller, the quantile option architecture deep deterministic policy gradient (QUOTA-DDPG), which generates control policies without a precise numerical process model. Specifically, the recipe-adjustment procedure is formulated as a Markov decision process, and the state, action, and reward are designed accordingly. In QUOTA-DDPG, an option is first selected by the option policy, and the action is then determined by the corresponding intra-option policy at each time step. Moreover, target networks and an experience replay mechanism are employed to enhance stability and trainability. Simulations demonstrate that the presented approach outperforms existing methods in disturbance compensation and target tracking. The QUOTA-DDPG controller thus advances the development of smart semiconductor manufacturing.
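To make the decision loop above concrete, the following is a minimal PyTorch sketch of one QUOTA-DDPG-style control step: a distributional critic estimates quantiles of the return, an option policy picks which quantile group to maximize, and that option's intra-option actor emits the action. This is an illustration under stated assumptions, not the authors' implementation; the network sizes, the grouping of quantiles into options, and all names here (QuantileCritic, Actor, select_action) are hypothetical.

```python
import random
from collections import deque

import torch
import torch.nn as nn

N_QUANTILES = 20   # quantiles of the return distribution estimated by the critic
N_OPTIONS = 4      # each option greedily targets one contiguous group of quantiles
STATE_DIM, ACTION_DIM = 3, 1


class QuantileCritic(nn.Module):
    """Distributional critic: estimates N quantiles of Z(s, a)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_QUANTILES),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))   # shape: (batch, N_QUANTILES)


class Actor(nn.Module):
    """Deterministic intra-option policy; one actor per option."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),
        )

    def forward(self, s):
        return self.net(s)


critic = QuantileCritic()
critic_target = QuantileCritic()                 # target network for stable TD targets
critic_target.load_state_dict(critic.state_dict())
actors = [Actor() for _ in range(N_OPTIONS)]
replay = deque(maxlen=100_000)                   # experience replay buffer


def select_action(state, epsilon=0.1):
    """One decision step: choose an option, then act via its intra-option policy."""
    s = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        if random.random() < epsilon:            # exploratory option choice
            option = random.randrange(N_OPTIONS)
        else:
            # Value of option j = mean of its quantile group, evaluated at the
            # action proposed by its own actor.
            option_values = torch.stack([
                critic(s, actor(s)).view(N_OPTIONS, -1)[j].mean()
                for j, actor in enumerate(actors)
            ])
            option = int(option_values.argmax())
        action = actors[option](s).squeeze(0).numpy()
    return option, action


# Illustrative single decision step on a dummy state; critic/actor training with
# quantile regression and soft target updates is omitted for brevity.
option, action = select_action([0.0, 0.5, -0.2])
```

The target network and replay buffer are instantiated here only to mirror the stabilization mechanisms named in the abstract; the quantile-regression critic update and per-option actor updates would consume batches drawn from `replay`.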
Data availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 62273002 and 61873113).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, Z., Pan, T. Distributional reinforcement learning for run-to-run control in semiconductor manufacturing processes. Neural Comput & Applic 35, 19337–19350 (2023). https://doi.org/10.1007/s00521-023-08760-1