Distributional reinforcement learning for run-to-run control in semiconductor manufacturing processes

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Deep reinforcement learning (DRL) has been preliminarily applied to run-to-run (RtR) control. However, existing works have mainly addressed shift and drift disturbances in the chemical mechanical polishing (CMP) process and have not taken non-stationary time-series disturbances fully into account. Inspired by the powerful self-learning mechanism of DRL, a new distributional reinforcement learning controller, the quantile option structure deep deterministic policy gradient (QUOTA-DDPG), is designed in this work to generate control policies without a precise numerical model. Specifically, the procedure for adjusting the recipe is formulated as a Markov decision process, and the state, action and reward are designed accordingly. In QUOTA-DDPG, an option is first selected by the option policy, and the action is then decided by the corresponding intra-option policy at each time step. Moreover, target networks and an experience replay mechanism are used to enhance stability and trainability. Simulations demonstrate that the presented approach outperforms existing methods in disturbance compensation and target tracking. The QUOTA-DDPG controller enriches the development of smart semiconductor manufacturing.
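To make the option/intra-option control loop concrete, the sketch below illustrates in PyTorch how one QUOTA-style decision step for RtR recipe adjustment could look. It is a minimal illustration under stated assumptions, not the authors' implementation: the network sizes, the state layout, the number of quantiles, and the simplified epsilon-greedy option policy (QUOTA proper learns a separate option-value head) are all assumptions introduced here.

```python
# Minimal sketch (assumed names and sizes; not the authors' code) of one
# QUOTA-style decision step for run-to-run recipe adjustment.
import torch
import torch.nn as nn

N_QUANTILES = 5           # critic outputs N quantile estimates of the return
N_OPTIONS = N_QUANTILES   # one option per quantile, following the QUOTA idea


class QuantileCritic(nn.Module):
    """Maps (state, action) to N quantile estimates of the return distribution."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, N_QUANTILES),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


class Actor(nn.Module):
    """Deterministic policy: maps the state to a bounded recipe adjustment."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


def select_option(critic, actor, state, epsilon=0.1):
    """High-level option policy, simplified here to epsilon-greedy over the
    quantile values at the actor's proposed action (QUOTA proper learns a
    separate option-value head for this choice)."""
    if torch.rand(1).item() < epsilon:
        return int(torch.randint(N_OPTIONS, (1,)).item())
    with torch.no_grad():
        quantiles = critic(state, actor(state))
    return int(quantiles.argmax(dim=-1).item())


def select_action(actor, critic, state, option, noise_std=0.1):
    """Intra-option step: the actor proposes a recipe change, exploration
    noise is added, and the active option's quantile scores the pair."""
    with torch.no_grad():
        action = actor(state)
        action = action + noise_std * torch.randn_like(action)
        q = critic(state, action)[..., option]
    return action, q


# Hypothetical usage for one run; assumed state layout:
# [tracking error, last recipe, drift estimate]
state = torch.zeros(1, 3)
actor, critic = Actor(3, 1), QuantileCritic(3, 1)
option = select_option(critic, actor, state)
recipe_delta, _ = select_action(actor, critic, state, option)
```

Training would then update the critic by quantile regression against target networks and the actor along the deterministic policy gradient, with transitions drawn from a replay buffer, consistent with the mechanisms named in the abstract.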




Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 62273002 and 61873113).

Author information

Corresponding author

Correspondence to Tianhong Pan.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ma, Z., Pan, T. Distributional reinforcement learning for run-to-run control in semiconductor manufacturing processes. Neural Comput & Applic 35, 19337–19350 (2023). https://doi.org/10.1007/s00521-023-08760-1

