
Article

Machine Learning-Assisted Hartree–Fock Approach for Energy Level Calculations in the Neutral Ytterbium Atom

1 National Key Laboratory of Particle Transport and Separation Technology, Tianjin 300180, China
2 Research Institute of Physical and Chemical Engineering of Nuclear Industry, Tianjin 300180, China
3 Institute of Atomic and Molecular Physics, Sichuan University, Chengdu 610065, China
4 Key Laboratory of High Energy Density Physics and Technology, Ministry of Education, Chengdu 610065, China
* Authors to whom correspondence should be addressed.
Entropy 2024, 26(11), 962; https://doi.org/10.3390/e26110962
Submission received: 8 October 2024 / Revised: 29 October 2024 / Accepted: 7 November 2024 / Published: 8 November 2024
(This article belongs to the Section Multidisciplinary Applications)

Abstract

Data-driven machine learning approaches with precise predictive capabilities are proposed to address the long-standing challenges in the calculation of complex many-electron atomic systems, including high computational costs and limited accuracy. In this work, we develop a general workflow for machine learning-assisted atomic structure calculations based on the Cowan code’s Hartree–Fock with relativistic corrections (HFR) theory. The workflow incorporates enhanced ElasticNet and XGBoost algorithms, refined using entropy weight methodology to optimize performance. This semi-empirical framework is applied to calculate and analyze the excited state energy levels of the 4f closed-shell Yb I atom, providing insights into the applicability of different algorithms under various conditions. The reliability and advantages of this innovative approach are demonstrated through comprehensive comparisons with ab initio calculations, experimental data, and other theoretical results.

1. Introduction

Since the mid-20th century, driven by advancements in research methodologies and computational capabilities, atomic calculations have evolved from fundamental theoretical research into an increasingly popular and powerful tool spanning multiple fields, including plasma physics, astrophysics, spectroscopy, materials science, and molecular biology. Their significance is manifested not only in their crucial role in interpreting spectral data, plasma diagnostics [1], and the study of atomic clocks [2], but also in the fact that atomic calculations form the foundation for exploring the interactions of matter with various light fields and particles. In contrast, the atomic structure data currently known and available are relatively limited. The primary atomic structure databases are confined to a few key resources, such as the National Institute of Standards and Technology (NIST) Atomic Spectra Database [3] and the CHIANTI atomic database [4].
The expansion of atomic data depends heavily on either advances in atomic theory or our ability and methods to predict the data accurately. The Hartree–Fock (HF) method serves as the starting point for the majority of atomic and molecular electronic structure calculations. A series of post-Hartree–Fock methods, including multi-configuration Hartree–Fock (MCHF) [5] and multi-configuration Dirac–Hartree–Fock (MCDHF) [6], have been developed to address the challenges in complex many-electron atomic systems, particularly electron correlation effects and relativistic effects [7]. Alongside these theoretical advancements, several atomic calculation programs have emerged over the past few decades, including the Cowan code [8], developed in 1968 by R. D. Cowan based on the HFR method; CIV3 [9], proposed by A. Hibbert in 1975, based on configuration interaction; ATSP [10], developed by C. F. Fischer et al. in 1996, based on the MCHF method; GRASP [11,12], based on the MCDHF method, first released in 1989 and subsequently improved many times; RATIP [13,14], a relativistic calculation method developed on the basis of GRASP; and FAC [15], a relativistic configuration interaction method that appeared in 2003.
Ab initio calculations present the optimal choice for calculating atomic structure data with minimal experimental input. However, these calculations are associated with high computational costs. Using a limited set of experimental values to guide machine learning algorithms in optimizing Cowan code’s calculations offers an alternative approach that maintains computational efficiency while improving accuracy, compared to implementing more sophisticated ab initio methods that require substantial computational resources and extremely complex atomic structure programs. Machine learning methodologies demonstrate particular efficacy for this task, owing to their capacity to be trained on sparse datasets, and be iteratively refined through the judicious incorporation of experimental values into the training set. Notably, they possess the potential to capture intrinsic correlation effects in complex atomic systems, particularly in addressing electron correlation effects and relativistic corrections where traditional approaches face computational challenges, thereby complementing existing theoretical frameworks. Various machine learning approaches have demonstrated breakthrough applications in multiple computationally intensive and accuracy-demanding fields of physics, such as characterizing physical models [16], creating interatomic potentials [17,18], and solving the Schrödinger equation [19,20].
A serious problem in machine learning-assisted atomic structure calculations is the selection of an appropriate machine learning algorithm. There is no established method for determining the best-suited algorithm for a specific problem a priori, necessitating repeated testing. Thus, a robust and scalable framework is essential.
In this study, we chose the Cowan code, based on HFR theory, as our ab initio computational platform. The Cowan code inherently incorporates least-squares fitting (LSF), resulting in the data structures generated during its computational process being highly conducive to regression-based machine learning algorithms. Furthermore, its relatively streamlined program architecture and robust performance in calculating the spectra of transition elements, including lanthanides and actinides, facilitates the implementation of flexible data interfaces [21,22,23]. Based on this, we developed a machine learning-assisted atomic structure calculation workflow with four improved machine learning algorithms embedded.
The energy level structure of ytterbium (Yb) is characterized by distinct features, including large inter-configuration energy gaps and densely populated energy level clusters. Coupled with the relative abundance of experimental data, these properties render Yb an ideal system for investigating the atomic structure of lanthanide elements.
This article is organized as follows: In Section 2, we detail the machine learning-assisted atomic structure calculation workflow, along with the four newly designed machine learning algorithms. In Section 3, we apply this workflow to compute selected excited state energy levels of Yb I, followed by a comparative analysis to identify the appropriate application scenarios for each algorithm. Finally, Section 4 concludes the work.

2. Calculation Methods

This study proposes a novel machine learning-assisted workflow for atomic structure calculations. We develop a versatile and scalable framework capable of both multi-atomic system calculations and flexible integration of diverse machine learning algorithms, tailored to specific research requirements. Figure 1 presents the steps and components of the workflow. We elaborate on each step of the workflow, demonstrating how traditional computational methods collaboratively operate with machine learning algorithms, as well as how newly designed algorithms can be seamlessly integrated into the framework.

2.1. Machine Learning-Assisted Atomic Structure Calculation Workflow

  • Step 1: Initial calculation. Ab initio calculations based on HFR theory are performed by the Cowan code.
In HFR theory, the ab initio single-electron radial wave functions $R(r)$ for all subshells of the considered configurations are computed first. From these, several important energy parameters, called Slater parameters, are obtained for each configuration: the center-of-gravity configuration energy $E_{av}$, the Slater direct integral $F^k$, the Slater exchange integral $G^k$, and the radial integral related to the spin–orbit interaction, $\zeta_j$. In terms of the radial portion of a single-electron wave function $R(r)$, these can be defined as:
$$F^k(l_i, l_j) = \int_0^\infty \int_0^\infty \frac{2 r_<^k}{r_>^{k+1}} \, R_i(r_1) R_j(r_2) R_i(r_1) R_j(r_2) \, r_1^2 r_2^2 \, dr_1 \, dr_2,$$
$$G^k(l_i, l_j) = \int_0^\infty \int_0^\infty \frac{2 r_<^k}{r_>^{k+1}} \, R_i(r_1) R_j(r_2) R_i(r_2) R_j(r_1) \, r_1^2 r_2^2 \, dr_1 \, dr_2,$$
and:
$$\zeta_j = \int_0^\infty \frac{1}{r} \frac{dV}{dr} \, |R_j(r)|^2 \, dr.$$
These electrostatic and spin–orbit interaction parameters are crucial for energy level calculations and for constructing the atomic Hamiltonian matrix elements. According to the Slater–Condon theory [24] utilized in the HFR method, these elements can be characterized as follows:
$$H_{ab} = \delta_{ab} E_{av} + \sum_{j=1}^{q} \left[ \sum_{k>0} f_k(l_j, l_j)^{ab} F^k(l_j, l_j) + d_j^{ab} \zeta_j \right] + \sum_{i=1}^{q-1} \sum_{j=i+1}^{q} \left[ \sum_{k>0} f_k(l_i, l_j)^{ab} F^k(l_i, l_j) + \sum_{k} g_k(l_i, l_j)^{ab} G^k(l_i, l_j) \right], \quad |l_i - l_j| \le k \le l_i + l_j,$$
where $f_k$, $g_k$, and $d_j$ are the angular coefficients of the Slater parameters $F^k$, $G^k$, and $\zeta_j$, respectively. Next, for each possible value of the total angular momentum $J$, energy matrices are set up and diagonalized to obtain the eigenvalues and eigenvectors.
  • Step 2: Data extraction. We implemented customized modifications to the Cowan code. Based on HFR theory, we established data output interfaces and preliminary data formatting processes at key nodes of the program. This approach enables real-time capture and export of key parameters during the ab initio calculation process, including the aforementioned electrostatic and spin–orbit interaction Slater parameters $E_{av}$, $F^k$, $G^k$, and $\zeta_j$, as well as the radial integral correlation matrix that connects these parameters to the energy levels. This method ensures a direct correlation between the extracted data and the computational process, minimizing intermediate processing steps and thereby enhancing the accuracy and traceability of the data.
  • Step 3: Data preparation. This phase integrates multi-source data, including experimental energy levels from the NIST database, ab initio calculated energy levels, and the electrostatic and spin–orbit interaction Slater parameters along with their corresponding correlation matrices. In this step, we also perform comprehensive preprocessing on these datasets prior to machine learning training. This involves constructing a pipeline for data cleaning and standardization, thereby transforming the data into a correlation matrix encompassing both feature variables and target values, suitable for machine learning algorithms.
  • Step 4: Machine learning fitting calculations. The machine learning algorithm performs the fitting calculations in this step using the correlation matrix containing all the preprocessed information, where the radial integral correlation matrix forms the feature variables and the NIST [3] experimental energy levels serve as the target values. We designed this step as a modular "drawer" into which different machine learning algorithms can be freely integrated. To facilitate computation, we employ both linear (ElasticNet) and tree-based (XGBoost) models. These algorithms are tailored and optimized for the specific characteristics of the atomic structure data, as elaborated in Section 2.2.
  • Steps 5 and 6: Results evaluation and parameter refinement. We evaluate the performance of the machine learning algorithms using the root mean square error and the mean absolute error to measure the accuracy of the fitted energy levels. Five-fold cross-validation and grid search are used to update the hyperparameters, optimizing the generalization ability of the different machine learning algorithms and ultimately attaining the best computational results. Furthermore, our enhanced ElasticNet model refines the Slater parameters, facilitating an iterative optimization process when they are reintroduced at Step 2.
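As an illustrative sketch of Steps 4–6 (fitting, evaluation, and hyperparameter refinement), the snippet below tunes a plain ElasticNet baseline with five-fold cross-validation and grid search, as described above. The feature matrix, target values, and hyperparameter grid are synthetic stand-ins, not the actual Cowan-code data or the authors' settings.

```python
# Minimal sketch of the fit/evaluate/refine loop with scikit-learn.
# X plays the role of the radial-integral correlation matrix; y plays
# the role of NIST experimental energy levels. Both are synthetic here.
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))                       # 40 levels x 6 Slater-parameter features
y = X @ np.array([1.0, 0.5, -0.3, 0.8, 0.0, 0.2]) + rng.normal(scale=0.01, size=40)

# Grid search over regularization strength and L1/L2 mixing,
# scored by cross-validated RMSE (five folds, as in the paper).
grid = {"alpha": [1e-3, 1e-2, 1e-1], "l1_ratio": [0.2, 0.5, 0.8]}
search = GridSearchCV(ElasticNet(max_iter=10000), grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
best_rmse = -search.best_score_                    # mean cross-validated RMSE
```

The refined coefficients (`search.best_estimator_.coef_`) correspond to updated Slater parameters and could be fed back to Step 2 of the workflow.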

2.2. Machine Learning Algorithms

Two categories of machine learning algorithms were employed: a linear model, Residuals Adaptive ElasticNet (RAEN), and a suite of ensemble tree models comprising XGBoost-Residuals (XGB-R), XGBoost-Base margin (XGB-B), and XGBoost-Custom (XGB-C). Considering the intrinsically linear nature of the Slater-parameter correlation matrix (the feature matrix) derived from the Cowan code, which itself employs LSF as its fitting method, we additionally designed the linear model for comparative analysis. Similar studies have been conducted using ridge regression models [25].
Ensemble tree models, renowned for their high predictive accuracy, have been established as reliable tools for addressing computational challenges in complex microscopic systems, such as in the field of materials science [26]. Given that ab initio calculations can provide initial predictions as target values—a rare advantage in machine learning tasks—the XGBoost model, with its high predictive accuracy, emerges as an ideal choice. This is particularly due to its key feature of residual learning capability, which focuses on learning the discrepancies between initial predictions and actual values. Consequently, we custom-designed XGB-R, XGB-B, and XGB-C to address various data characteristics and application scenarios.

2.2.1. Linear Model

In the formulation of our model, we define the initial parameter vector $\beta_0$ as follows:
$$\beta_0 = [E_{av}, G^k, F^k, \zeta_j]^T,$$
and the general form of a linear model can be expressed as:
$$y = [1, x_1, x_2, \ldots, x_n]^T \beta.$$
The Residuals Adaptive ElasticNet (hereafter referred to as RAEN) employed in this study maintains the sparsity and regularization advantages of the standard ElasticNet [27] while incorporating two key enhancements: a dynamic sample weighting mechanism to balance the importance of different samples, and a residual penalty term to focus particularly on samples that are difficult to fit. This structural design underpins the nomenclature RAEN. In comparison to the loss function of the standard ElasticNet, which is expressed as:
$$L(\beta) = \frac{1}{2n} \sum_i \left( y_i - x_i^T \beta \right)^2 + \alpha \left[ \rho \|\beta\|_1 + \frac{1-\rho}{2} \|\beta\|_2^2 \right],$$
the loss function of RAEN is:
$$L(\beta) = \frac{1}{2n} \sum_i w_i \left( y_i - x_i^T \beta \right)^2 + \alpha \left[ \rho \|\beta\|_1 + \frac{1-\rho}{2} \|\beta\|_2^2 \right],$$
where w i is defined as:
$$w_i = \frac{1}{M + P + R_i},$$
composed of the mean squared error (MSE) $M$, the parameter control term $P$, and the residual constraint term $R_i$:
$$M = \frac{1}{n} \sum_i \left( y_i - \hat{y}_i \right)^2,$$
$$P = \frac{1}{p} \sum_j \left| \beta_j - \beta_{0j} \right|,$$
$$R_i = \left| y_i - \hat{y}_i \right|.$$
Here, $x_i^T \beta$ and $y_i$ are the fitted (predicted) and experimental energy levels, $\hat{y}_i$ is the ab initio value, and $\alpha$ and $\rho$ are the regularization coefficients controlling the $L_1$ and $L_2$ penalty terms, respectively. In the computational process, we directly treat $\beta$ as the adjustable model parameters (coefficients of the matrix), using $\beta_0$ as the initial point for updates. The optimal energy values and parameter vectors are obtained through an iterative process that minimizes the loss function.
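The structure of the RAEN loss can be sketched in NumPy as below. This is a minimal illustration of how the dynamic sample weights $w_i$ are assembled from $M$, $P$, and $R_i$ per the equations above; the function name and all data are hypothetical, and the actual minimization of $\beta$ (handled iteratively in the workflow) is omitted.

```python
# Illustrative sketch of the RAEN loss and dynamic sample weights.
# beta0 is the ab initio Slater-parameter vector, y_ab the ab initio
# energy levels, y the experimental levels. All values are synthetic.
import numpy as np

def raen_loss(beta, beta0, X, y, y_ab, alpha=0.1, rho=0.5):
    pred = X @ beta
    M = np.mean((y - y_ab) ** 2)            # MSE between experiment and ab initio
    P = np.mean(np.abs(beta - beta0))       # parameter control term
    R = np.abs(y - y_ab)                    # per-sample residual constraint R_i
    w = 1.0 / (M + P + R)                   # dynamic sample weights w_i
    data_term = np.sum(w * (y - pred) ** 2) / (2 * len(y))
    penalty = alpha * (rho * np.sum(np.abs(beta))
                       + 0.5 * (1 - rho) * np.sum(beta ** 2))
    return data_term + penalty, w
```

Samples whose ab initio values deviate strongly from experiment receive smaller weights, which is the "residuals adaptive" mechanism described above.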

2.2.2. Tree-Based Model

Tree-based models, particularly ensemble tree methods, do not possess a concept analogous to the coefficients found in linear models. Instead, they rely on the aggregation of multiple tree learners, a process known as boosting.
Such models perform calculations through recursive node partitioning, which directly yields energy level predictions. Consequently, they do not require Slater parameters: calculations are based solely on the initial (ab initio) predictions and the radial integral correlation matrix, and the final prediction of the model is the cumulative sum of the predictions of the individual trees. The standard XGBoost algorithm [28], an efficient and robust gradient boosting decision tree algorithm, can be represented as:
$$\hat{y}_i = \varphi(x_i) = \sum_{k=1}^{M} f_k(x_i).$$
The loss function is:
$$L(\varphi) = \sum_{i=1}^{N} l\left( y_i, \hat{y}_i \right) + \sum_{k=1}^{M} \Omega(f_k).$$
Here, $l$ is a differentiable convex loss function that measures the difference between the prediction $\hat{y}_i$ and the target $y_i$. The term $\Omega$ penalizes the complexity of the model.
In this study, we propose a modified version of the standard XGBoost model, termed XGBoost-Residuals (XGB-R). This variant alters the prediction direction of the model as follows:
$$\varphi(x_i) = y_i^0 - y_i,$$
where $y_i^0$ is the initial prediction. The model's fitting objective is thus transformed from the final values to the residuals of the previous iteration, and the final prediction is obtained through cumulative summation of the results of each fitting iteration. By learning the residuals directly, the model significantly enhances prediction accuracy.
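The residual-learning idea behind XGB-R can be sketched as below. scikit-learn's `GradientBoostingRegressor` stands in for XGBoost here, and all data are synthetic placeholders: the "ab initio" values carry a deliberate systematic error that the boosted trees learn as a residual.

```python
# Sketch of XGB-R: fit a boosted-tree model to the residual y0 - y
# (ab initio minus experiment), then recover corrected levels as
# y0 - predicted_residual. Data and model settings are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))                    # radial-integral features (synthetic)
y = X @ np.array([2.0, -1.0, 0.5, 0.3])         # "experimental" levels (synthetic)
y0 = y + 0.3 * X[:, 0] + 0.05                   # "ab initio" values with a systematic error

model = GradientBoostingRegressor(n_estimators=200, max_depth=2, random_state=0)
model.fit(X, y0 - y)                            # learn the residual phi(x) = y0 - y
y_corrected = y0 - model.predict(X)             # final prediction

rmse_before = float(np.sqrt(np.mean((y0 - y) ** 2)))
rmse_after = float(np.sqrt(np.mean((y_corrected - y) ** 2)))
```

Because the trees only have to model the (often smooth and small) discrepancy rather than the full energy scale, the corrected levels lie much closer to the reference values than the raw baseline.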
XGBoost-Base margin (XGB-B) incorporates the initial prediction (ab initio) into the base learner, denoted as $f_j(x_i, y_i^0)$. This approach uses the initial prediction as a baseline, with the model's fitting results representing adjustments to this baseline.
XGBoost-Custom (XGB-C) modifies the objective function, defined by the equation:
$$L_{\mathrm{Custom}}(\varphi) = \sum_{i=1}^{N} \left[ w_1 \, l(y_i, \hat{y}_i) + w_2 \, \lambda \left( y_i^0 - y_i \right)^2 \right] + \sum_{k=1}^{M} \Omega(f_k),$$
where l ( y i , y i ^ ) is the prediction loss, λ ( y i 0 y i ) 2 is the residual constraint term, and w 1 , w 2 are entropy-based weights determined by:
$$w_j = \frac{1 - H_j}{\sum_{k=1}^{2} (1 - H_k)}, \quad j = 1, 2,$$
$$H_j = -\sum_{i=1}^{N} p_{ji} \ln p_{ji}, \quad j = 1, 2.$$
Here, $H_j$ represents the information entropy [29] of the $j$-th term ($j = 1$ for the prediction loss, $j = 2$ for the residual constraint), and $p_{ji}$ are the normalized values of the prediction loss and residual constraint terms, respectively.
By incorporating both the custom objective function and evaluation metrics, XGB-C aims to achieve a more optimal balance between residual constraints and the influence of initial predictions. The introduction of the entropy weight method enhances the model’s adaptability, enabling dynamic adjustment of the relative importance of prediction accuracy and consistency with initial predictions based on data distribution.
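The entropy-weight computation can be sketched as follows. Note one assumption in this sketch: the entropy is scaled by $\ln N$ so that $H_j \in [0, 1]$, a common convention in the entropy weight method that keeps $1 - H_j$ non-negative; the function name and inputs are illustrative.

```python
# Minimal sketch of the entropy weights w_1, w_2 used in XGB-C:
# each term's per-sample values are normalized to a distribution p_ji,
# its entropy H_j is computed, and the lower-entropy (more informative)
# term receives the larger weight. Assumes at least one non-uniform term.
import numpy as np

def entropy_weights(loss_term, residual_term):
    terms = [np.asarray(loss_term, dtype=float),
             np.asarray(residual_term, dtype=float)]
    H = []
    for t in terms:
        p = t / t.sum()                      # normalized values p_ji
        p = np.clip(p, 1e-12, None)          # guard against log(0)
        H.append(-np.sum(p * np.log(p)) / np.log(len(t)))  # scaled to [0, 1]
    H = np.array(H)
    return (1.0 - H) / np.sum(1.0 - H)       # w_j = (1 - H_j) / sum_k (1 - H_k)
```

A term whose per-sample values are nearly uniform carries little discriminating information (entropy near 1) and is down-weighted, which is how XGB-C adapts the balance between prediction loss and residual constraint to the data distribution.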

3. Results and Discussion

To elucidate the distinctive characteristics of four computational methods—RAEN, XGB-R, XGB-B, and XGB-C—in calculating complex energy level structures, we selected the Yb I atom for analysis. This element was chosen due to its large inter-configuration energy gaps and densely populated energy level clusters.
We calculated the energy levels of multiple excited states of Yb I, categorized by odd and even parity, relative to the ground state configuration [Xe]4f¹⁴6s². The calculated relative energies are systematically compared with the experimental data provided by the NIST Atomic Spectra Database [3].

3.1. Calculations of Yb I Even-Parity Excited States

We selected multiple outer-shell excited states, specifically those maintaining an invariant 4f¹⁴ closed-shell core and involving only s, p, and d orbitals. This enabled us to analyze the characteristics of different computational methods when fitting large datasets of energy levels.
Utilizing the Cowan code, we conducted ab initio calculations of Yb I even-parity excited state energy levels based on HFR theory. We then integrated high-precision experimental energy levels to perform fitting calculations using the various regression methods. The results for all possible terms are listed in Table 1.
The ab initio calculation results are compared with the experimental values provided by NIST to compute $E$ (the absolute error) and $R$ (the root mean square error, RMSE), $R = \sqrt{\sum_i^N (y_i - \hat{y}_i)^2 / N}$, where $y_i$ and $\hat{y}_i$ are the fitted energy level and the experimental energy level from NIST, respectively, and $N$ is the number of energy levels. These metrics are used to evaluate the accuracy of the new computational methods.
We report $E$ for each term and $R$ for all energy levels incorporated in our calculations. A lower $R$-value signifies higher computational accuracy, with this metric being particularly sensitive to extreme deviations and outliers. Consequently, regression model predictions that substantially diverge from experimental values are deemed unacceptable. Such discrepancies often indicate either overfitting of the machine learning model or, more critically, a failure to capture the underlying patterns in the Slater parameter features, rendering the model incapable of accurate predictions.
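The two metrics defined above are straightforward to compute; the sketch below does so in NumPy with placeholder values (the function name is illustrative).

```python
# E: absolute error per level; R: RMSE over all levels, as defined above.
import numpy as np

def level_errors(y_calc, y_exp):
    y_calc = np.asarray(y_calc, dtype=float)
    y_exp = np.asarray(y_exp, dtype=float)
    E = np.abs(y_calc - y_exp)                        # per-level absolute error
    R = float(np.sqrt(np.mean((y_calc - y_exp) ** 2)))  # RMSE over all levels
    return E, R
```

Because the squared deviations are averaged before the square root, a single badly predicted level inflates $R$ strongly, which is exactly the outlier sensitivity discussed above.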
It must be acknowledged that the ab initio calculations using the Cowan code have largely elucidated the energy level positions and structures, albeit with limited accuracy ($R$-value of 966.1). This, however, represents the limit of the HFR method's capabilities. The reduction in $R$ ($R$-values of 439.2, 24.7, 75.0, and 247.8) demonstrates that the four novel methods, through the incorporation of selected high-purity experimental energy levels for fitting, have substantially enhanced computational precision. Figure 2 provides a more intuitive illustration of this improvement from the perspective of error.
Ab initio calculations (denoted by triangles) exhibit substantially larger deviations compared to the other four computational methods (represented by squares). This pronounced disparity is further evidenced in the marginal histograms, where the ab initio method displays a distinctly prominent bar. Our RAEN model also demonstrates deviations, notably in computing the 4f¹⁴6p² configuration, where its performance is surpassed even by ab initio calculations.
This phenomenon is intrinsically linked to the substantial energy level spacing between distinct spectral terms of the 6p² configuration, with the ³P state demonstrating pronounced exchange interaction among parallel-spin electrons. Moreover, single linear models often exhibit diminished efficacy when concurrently processing extensive datasets. A comprehensive analysis of linear model behavior on reduced sample sizes is presented in Section 3.2.
Tree-based methods demonstrated excellent performance with this dataset. These XGBoost models, trained on experimental energy levels, accurately capture the relationship between Slater parameter features and energy level values in the HFR method. This numerical approach augments the HFR theory by incorporating certain electron-electron interactions previously neglected in the original formulation.
The lower $R$-values serve as a clear validation of this approach. Moreover, when processing large-interval energy level sequences across different energy clusters, the computational results of all three XGBoost models remain stable. This was particularly evident in the XGB-R and XGB-B models, where the maximum absolute error is constrained within 150 cm⁻¹ and both models achieve $R$-values below 100 cm⁻¹.
The XGB-C model demonstrates mediocre accuracy for this dataset, an outcome anticipated in the initial design phase. This underscores an inherent challenge in the fitting process: overfitting. As the number of experimental energy levels utilized for training expands, the overfitting phenomenon becomes more pronounced in complex tree-based models.
Upon augmenting the volume of experimental energy levels in the training set, the XGB-R model's calculations exhibit an ostensibly "perfect" fit to the majority of experimental values, as illustrated in Figure 3. However, this apparent precision belies severe overfitting: the $R$-value increases substantially to 624.8, accompanied by significant deviations from experimental values for several energy levels: ³D₃ of 4f¹⁴6s6d, ³P₁ of 4f¹⁴6p², ³S₁ of 4f¹⁴6s9s, and ³D₂ of 4f¹⁴6s8d. Enhanced residual learning enables XGB-R to capture the characteristics of the experimental data effectively, but overly stringent residual constraints compromise its generalization capability. The XGB-B model, utilizing ab initio calculations as its initial prediction baseline, encounters fitting disruptions when the training energy levels deviate significantly from the ab initio results. The XGB-C model establishes an equilibrium between the two approaches by employing the entropy weight method to dynamically and adaptively adjust the influence of the ab initio calculations and the residual constraints. This dynamic adjustment based on the data distribution, particularly the distribution of energy levels across neighboring configurations, potentially leads to enhanced performance in capturing complex configuration interactions.

3.2. Small-Sample Analysis of Linear Models for Yb I Excited States

For the linear models, we used Cowan fit and RAEN to calculate a set of even-parity excited state energy levels with a relatively small energy span. As the dataset size decreased, the advantages of linear models became increasingly apparent, as shown in Table 2.
The Cowan code includes an optional fitting module that employs the LSF method to optimize calculations against experimental values. We denote the results from this approach as “Cowan fit”. Given that both LSF and RAEN are linear methods, we include this approach in our comparative analysis.
The $R$-value of the RAEN model (114.1) showed a significant improvement compared to its performance on larger datasets, and was substantially better than the fitting result of the built-in calculation of the Cowan code ($R$-value of 1040.8).
Cowan fit utilizes the LSF method, which is susceptible to overfitting in high-dimensional or multicollinear datasets, even with small sample sizes. RAEN, constrained by the combined $L_1$ and $L_2$ regularization terms, effectively mitigates this issue. Complex models such as XGB are similarly prone to overfitting with limited samples, as demonstrated in Figure 4.
Similar studies were conducted using ridge regression models [25]. In that approach, experimental values were incorporated discretely for each configuration, effectively creating smaller sample subsets for the fitting process. This exemplifies the segmented fitting paradigm prevalent in linear models. We applied an analogous segmented fitting technique to the RAEN model when computing odd-parity excited state energy levels of Yb I, which yielded a substantial enhancement in computational precision.
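The segmented fitting strategy can be sketched as below: levels are grouped by leading configuration and a separate linear model is fitted per group. A plain ElasticNet stands in for RAEN, and the configuration labels and data are synthetic placeholders.

```python
# Sketch of segmented fitting: one linear model per leading configuration.
import numpy as np
from sklearn.linear_model import ElasticNet

def segmented_fit(X, y, config_labels, **enet_kwargs):
    """Fit an independent ElasticNet for each configuration label."""
    models = {}
    for cfg in np.unique(config_labels):
        mask = config_labels == cfg
        models[cfg] = ElasticNet(**enet_kwargs).fit(X[mask], y[mask])
    return models
```

Fitting each configuration separately keeps every subset small and homogeneous, which is precisely the regime where the regularized linear model performs best.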
Another advantage of linear models is that they directly yield updated Slater parameters ($E_{av}$, $F^k$, $G^k$, and $\zeta_j$) after the fitting calculation. In linear models, these parameters are treated as directly adjustable variables (feature coefficients). In contrast, within the context of HFR theoretical calculations, these parameters assume a critical role as fundamental components in constructing the Hamiltonian. The RAEN model updated partial parameters, as shown in Table 3, and through a feedback loop, it corrected the Hamiltonian calculations in the Cowan code.
Tree-based models perform calculations through node partitioning, thus they can only directly fit energy level values and do not incorporate the concept of these types of parameters. However, our analysis of XGB model calculation results indicates that as the number of built-in parameters in tree models increases and complexity rises, computational accuracy indeed improves.
Moreover, since the XGB model employs a second-order Taylor expansion to approximate the loss function, we put forward a tentative but reasonable conjecture: if, within HFR theory, the Hamiltonian were expanded into more orthogonal radial integral terms, the accuracy of the ab initio calculations might also improve. Verifying this, of course, requires further work.
Additionally, we attempted to calculate the energy levels using the support vector regression (SVR) method. However, the results were unsatisfactory, suggesting that kernel methods may not be well suited to these types of data. We also applied our methods to calculate odd-parity excited state energy levels of Yb I in Table 4. The results showed improvement over the ab initio method. Notably, the XGB-R model achieved an $R$-value of 67.2, with the relative error controlled within 0.5%.
Furthermore, it is important to note that in the atomic and molecular field, the predictability of theoretical models is intrinsically linked to their capability to provide specific data without any experimental assistance, which is crucial for theoretical calculations. Our current work represents a refinement of HFR calculations guided by experimental reference data, thus possessing limited predictive power. To achieve full predictive capabilities, more sophisticated machine learning models with enhanced learning and generalization abilities, such as artificial neural networks [33], would be required. This aspect represents an important direction for future investigation.

4. Conclusions

We constructed a comprehensive, machine learning-assisted general workflow for atomic structure calculations. Within this framework, we developed and applied four machine learning methods combined with HFR theory for atomic structure calculations: RAEN, XGB-R, XGB-B, and XGB-C. These methods were used to calculate 56 energy levels of Yb I, including eight even-parity configurations and six odd-parity configurations of the 4f¹⁴ closed shell. Under optimal conditions, the best-performing model demonstrates excellent predictive capabilities. When compared to NIST data, the model achieves an average absolute error of less than 50 cm⁻¹ for even-parity configurations and less than 100 cm⁻¹ for odd-parity configurations. Compared to HFR ab initio calculations, our method demonstrates significantly improved accuracy, while the lower $R$-value indicates enhanced computational stability. To achieve comparable accuracy, traditional ab initio methods would require more sophisticated theoretical models incorporating various correlation effects through complex perturbation terms, necessitating more elaborate atomic structure calculation programs and substantially increased computational resources. Through our data-driven approach, we successfully enhanced calculation accuracy while maintaining the computational efficiency inherent to the Cowan code framework.
We analyzed the conditions where each method is most applicable. As additive models, XGB-like methods are highly accurate and generalizable for a large number of energy level calculations across clusters of energy levels, and can be used to determine the steps of changes in energy level structure, but the energy level data used for training fits need to be carefully selected to avoid overfitting. Our sample selection prioritizes energy level cluster boundary states to capture the overall characteristics of clusters, incorporates inter-configuration transition states to reflect inter-configuration interactions, and implements uniform sampling within clusters to select representative states. All selected energy levels are required to maintain high purity, where the leading configuration accounts for more than 80%. For closely spaced energy levels within a cluster where the number of calculations is relatively small, linear models such as RAEN can be employed. Additionally, these models allow for the optimization of HFR theory by updating the Slater parameters. Consequently, different methods of calculation can be used selectively for different atomic system characteristics and the availability of experimental energy level data. Furthermore, the flexible and extensible computational workflow enables the potential integration of a wider range of machine learning algorithms to assist in atomic structure calculations.

Author Contributions

Conceptualization, J.Z., C.Y. and J.C.; methodology, C.Y., G.J. and J.Z.; software, K.M. and C.Y.; validation, K.M.; data curation, K.M.; writing—original draft preparation, K.M.; writing—review and editing, C.Y. and J.Z.; visualization, K.M.; supervision, Y.L., G.J. and J.C.; project administration, J.Z., Y.L. and J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the open fund project of National Key Laboratory of Particle Transport and Separation Technology and the Liao Yuan Project of China Nuclear Energy Industry Corporation.

Data Availability Statement

Datasets generated during the current study are available from the corresponding authors on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Godbert, L.; Calisti, A.; Stamm, R.; Talin, B.; Lee, R.; Klein, L. Plasma Diagnostics with Spectral Profile Calculations. Phys. Rev. E 1994, 49, 5644–5651. [Google Scholar] [CrossRef] [PubMed]
  2. Ludlow, A.D.; Boyd, M.M.; Ye, J.; Peik, E.; Schmidt, P.O. Optical Atomic Clocks. Rev. Mod. Phys. 2015, 87, 637–701. [Google Scholar] [CrossRef]
  3. Kramida, A.; Ralchenko, Y. NIST Atomic Spectra Database; NIST Standard Reference Database 78; 1999. Available online: https://data.nist.gov/pdr/lps/EBC9DB05EDEC5B0EE043065706812DF83 (accessed on 7 October 2024).
  4. Dere, K.P.; Landi, E.; Mason, H.E.; Monsignori Fossi, B.C.; Young, P.R. CHIANTI—An Atomic Database for Emission Lines: I. Wavelengths Greater than 50 Å. Astron. Astrophys. Suppl. Ser. 1997, 125, 149–173. [Google Scholar] [CrossRef]
  5. Fischer, C.F. Hartree—Fock Method for Atoms: A Numerical Approach; John Wiley & Sons: Hoboken, NJ, USA, 1977. [Google Scholar]
  6. Grant, I.P.; Quiney, H.M. Foundations of the Relativistic Theory of Atomic and Molecular Structure. In Advances in Atomic and Molecular Physics; Elsevier: Amsterdam, The Netherlands, 1988; Volume 23, pp. 37–86. ISBN 978-0-12-003823-7. [Google Scholar]
  7. Lindgren, I.; Morrison, J. Atomic Many-Body Theory; Springer Series in Chemical Physics; Springer: Berlin/Heidelberg, Germany, 1982; Volume 13, ISBN 978-3-642-96616-3. [Google Scholar]
  8. Cowan, R.D. Theoretical Calculation of Atomic Spectra Using Digital Computers. J. Opt. Soc. Am. 1968, 58, 808. [Google Scholar] [CrossRef]
  9. Hibbert, A. CIV3—A General Program to Calculate Configuration Interaction Wave Functions and Electric-Dipole Oscillator Strengths. Comput. Phys. Commun. 1975, 9, 141–172. [Google Scholar] [CrossRef]
  10. Froese Fischer, C.; Tachiev, G.; Gaigalas, G.; Godefroid, M.R. An MCHF Atomic-Structure Package for Large-Scale Calculations. Comput. Phys. Commun. 2007, 176, 559–579. [Google Scholar] [CrossRef]
  11. Grant, I.P.; McKenzie, B.J.; Norrington, P.H.; Mayers, D.F.; Pyper, N.C. An Atomic Multiconfigurational Dirac-Fock Package. Comput. Phys. Commun. 1980, 21, 207–231. [Google Scholar] [CrossRef]
  12. Jönsson, P.; Gaigalas, G.; Bieroń, J.; Fischer, C.F.; Grant, I.P. New Version: Grasp2K Relativistic Atomic Structure Package. Comput. Phys. Commun. 2013, 184, 2197–2203. [Google Scholar] [CrossRef]
  13. Fritzsche, S. The Ratip Program for Relativistic Calculations of Atomic Transition, Ionization and Recombination Properties. Comput. Phys. Commun. 2012, 183, 1525–1559. [Google Scholar] [CrossRef]
  14. Fritzsche, S. Ratip—A Toolbox for Studying the Properties of Open-Shell Atoms and Ions. J. Electron Spectrosc. Relat. Phenom. 2001, 114–116, 1155–1164. [Google Scholar] [CrossRef]
  15. Gu, M.F. The Flexible Atomic Code. Can. J. Phys. 2008, 86, 675–689. [Google Scholar] [CrossRef]
  16. Ghiringhelli, L.M.; Vybiral, J.; Levchenko, S.V.; Draxl, C.; Scheffler, M. Big Data of Materials Science: Critical Role of the Descriptor. Phys. Rev. Lett. 2015, 114, 105503. [Google Scholar] [CrossRef] [PubMed]
  17. Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B.P.; Ramprasad, R.; Gubernatis, J.E.; Lookman, T. Machine Learning Bandgaps of Double Perovskites. Sci. Rep. 2016, 6, 19375. [Google Scholar] [CrossRef] [PubMed]
  18. Bartók, A.P.; Csányi, G. Gaussian Approximation Potentials: A Brief Tutorial Introduction. Int. J. Quantum Chem. 2015, 115, 1051–1057. [Google Scholar] [CrossRef]
  19. Pfau, D.; Spencer, J.S.; Matthews, A.G.D.G.; Foulkes, W.M.C. Ab Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks. Phys. Rev. Res. 2020, 2, 033429. [Google Scholar] [CrossRef]
  20. Hermann, J.; Schätzle, Z.; Noé, F. Deep-Neural-Network Solution of the Electronic Schrödinger Equation. Nat. Chem. 2020, 12, 891–897. [Google Scholar] [CrossRef]
  21. Liggins, F.S.; Pickering, J.C.; Nave, G.; Kramida, A.; Gamrath, S.; Quinet, P. New Ritz Wavelengths and Transition Probabilities of Parity-Forbidden [Mn II] Lines of Astrophysical Interest. Astrophys. J. 2021, 907, 69. [Google Scholar] [CrossRef]
  22. Kramida, A. Cowan Code: 50 Years of Growing Impact on Atomic Physics. Atoms 2019, 7, 64. [Google Scholar] [CrossRef]
  23. Chikh, A.; Deghiche, D.; Meftah, A.; Tchang-Brillet, W.-Ü.L.; Wyart, J.-F.; Balança, C.; Champion, N.; Blaess, C. Extended Analysis of the Free Ion Spectrum of Er3+ (Er IV). J. Quant. Spectrosc. Radiat. Transf. 2021, 272, 107796. [Google Scholar] [CrossRef]
  24. Cowan, R.D. The Theory of Atomic Structure and Spectra; University of California Press: New York, NY, USA, 1981; ISBN 978-0-520-03821-9. [Google Scholar]
  25. Yu, Y.; Yang, C.; Jiang, G. Ridge Regression Energy Levels Calculation of Neutral Ytterbium (Z = 70). Chin. Phys. B 2023, 32, 033101. [Google Scholar] [CrossRef]
  26. Carrete, J.; Li, W.; Mingo, N.; Wang, S.; Curtarolo, S. Finding Unprecedentedly Low-Thermal-Conductivity Half-Heusler Semiconductors via High-Throughput Materials Modeling. Phys. Rev. X 2014, 4, 011019. [Google Scholar] [CrossRef]
  27. Zou, H.; Hastie, T. Regularization and Variable Selection Via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
  28. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  29. Shannon, C.E. A Mathematical Theory of Communication. SIGMOBILE Mob. Comput. Commun. Rev. 2001, 5, 3–55. [Google Scholar] [CrossRef]
  30. Porsev, S.G.; Safronova, M.S.; Derevianko, A.; Clark, C.W. Long-Range Interaction Coefficients for Ytterbium Dimers. Phys. Rev. A 2014, 89, 012711. [Google Scholar] [CrossRef]
  31. Baumann, M.; Braun, M.; Maier, J. Configuration Interaction in the 6snd 1D2 and 6sns 1S0 States of Yb Probed by Lifetime Measurements. Z. Phys. D At. Mol. Clust. 1987, 6, 275–278. [Google Scholar] [CrossRef]
  32. Bowers, C.J.; Budker, D.; Commins, E.D.; DeMille, D.; Freedman, S.J.; Nguyen, A.-T.; Shang, S.-Q.; Zolotorev, M. Experimental Investigation of Excited-State Lifetimes in Atomic Ytterbium. Phys. Rev. A 1996, 53, 3103–3109. [Google Scholar] [CrossRef]
  33. Hopfield, J.J. Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. Natl. Acad. Sci. USA 1982, 79, 2554–2558. [Google Scholar] [CrossRef]
Figure 1. Step-by-step diagram of the machine learning-assisted atomic structure calculation workflow. Step 1: Initial calculation; Step 2: data extraction; Step 3: data preparation; Step 4: machine learning fitting calculations; Step 5: parameter refinement; Step 6: results evaluation.
Figure 2. Errors of even-parity excited state energy levels for Yb I calculated using ab initio method, RAEN, XGB-R, XGB-B, and XGB-C. Top marginal histograms show error distributions for each method. The vertical axis is broken (indicated by //) to accommodate the large range of errors.
Figure 3. Errors in even-parity excited state energy levels of Yb I calculated by XGB-R, XGB-B, and XGB-C models with an expanded training set. Top marginal histograms show error distributions for each method. The vertical axis is broken (indicated by //) to accommodate the large range of errors.
Figure 4. Errors in small-sample even-parity excited state energy levels of Yb I calculated using Cowan fit, RAEN, XGB-R, XGB-B, and XGB-C methods. Top marginal histograms show error distributions for each method. The vertical axis is broken (indicated by //) to accommodate the large range of errors.
Table 1. Yb I even-parity excited state energy levels (in cm−1) from experiment (NIST), ab initio calculations, and fitting calculations with RAEN, XGB-R, XGB-B, and XGB-C, where E (in cm−1) and R denote the absolute error and root-mean-square error between experimental and theoretical energies, respectively.
Config. | Term | J | Exp. [3] | Ab Initio | RAEN (Level, E) | XGB-R (Level, E) | XGB-B (Level, E) | XGB-C (Level, E) | Other Work
4f146s21S00.00.00.00.00.00.00.00.00.00.0
4f145d6s3D124,489.122,944.724,492.73.624,435.753.40.00.024,132.8356.325,108 a
3D224,751.925,366.724,757.75.824,765.213.324,364.8124.324,893.1141.225,368 a
3D325,270.925,983.525,339.468.425,324.653.724,838.686.725,438.4167.425,891 a
1D227,677.626,630.827,676.01.627,661.416.225,352.481.527,436.3241.328,353 a
4f146s7s3S132,694.731,957.032,697.42.732,700.55.827,626.351.332,524.2170.533,092 a
1S034,350.632,225.733,773.0577.634,324.326.332,689.84.933,853.5497.134,755 a
4f146s6d3D139,808.739,365.839,810.61.939,817.18.434,269.081.639,707.0101.7
3D239,838.039,387.140,091.6253.639,866.628.639,858.349.639,734.5103.5
3D339,966.139,571.439,967.81.739,956.39.839,951.0113.039,875.490.7
1D240,061.539,608.540,397.6336.140,058.33.239,919.746.439,956.6104.9
4f146s8s3S141,615.040,730.641,616.31.341,612.42.640,068.36.841,410.7204.3
1S041,939.940,773.841,791.3148.641,903.036.941,572.842.241,665.6274.3
4f146p23P042,436.943,448.444,134.31697.342,491.454.542,557.9121.042,671.4234.5
3P143,805.444,199.743,882.877.443,820.214.843,900.495.043,897.291.8
4f146s7d3D144,311.443,384.244,303.48.044,302.78.744,259.951.544,098.0213.4
3D244,313.143,391.244,261.851.344,303.59.644,255.557.644,100.8212.3
3D344,357.643,459.144,092.7264.944,360.83.244,347.79.944,150.6207.0
1D244,380.843,474.344,670.6289.844,380.80.044,410.029.244,171.6209.2
4f146p23P244,760.445,030.044,770.09.644,360.83.244,869.5109.144,823.362.9
4f146s9s3S145,121.344,097.244,775.4345.945,115.85.540,068.36.841,410.7204.3
1S0 44,123.745,487.9 45,353.6 41,572.842.241,665.6274.3
4f146s8d3D146,445.045,388.846,449.74.746,448.03.045,124.63.344,885.7235.6
3D246,467.745,387.646,458.49.346,441.426.345,259.0 45,075.0
3D346,480.745,425.946,195.7285.046,468.412.346,423.221.846,201.4243.6
1D2 45,424.846,727.1 47,551.1 46,382.085.746,218.0249.746,405.45 b
4f146p21D247,821.848,840.947,700.8121.047,862.540.746,438.542.246,237.8242.9
1S045,121.349,525.350,963.7 50,631.6 47,492.1 47,078.8
R * | 966.1 (Ab Initio) | 439.2 (RAEN) | 24.7 (XGB-R) | 75.0 (XGB-B) | 247.8 (XGB-C)
* R is the root-mean-square error (RMSE). a Ref. [30]; b Ref. [31].
Table 2. Small-sample even-parity excited state energy levels of Yb I calculated using the Cowan fit, RAEN, XGB-R, XGB-B, and XGB-C methods, where E (in cm−1) and R denote the absolute error and root-mean-square error between experimental and theoretical energies, respectively.
Config. | Term | J | Exp. [3] | Ab Initio | Cowan Fit | Other Work | RAEN (Level, E) | XGB-R (Level, E) | XGB-B (Level, E) | XGB-C (Level, E)
4f146p23P042,436.943,448.443,443.7 42,708.5271.642,436.90.042,443.97.042,671.6234.7
3P143,805.444,199.744,199.7 43,960.9155.543,805.40.043,809.54.143,897.792.3
4f146s7d3D144,311.443,384.243,340.4 44,293.917.542,366.31945.042,375.71935.743,463.2848.2
3D244,313.143,391.243,373.744,314.1 c44,294.218.944,313.10.044,301.211.944,101.3211.8
3D344,357.643,459.143,411.944,345.3 c44,343.514.144,357.60.044,353.34.344,150.5207.1
1D244,380.843,474.343,152.744,401.1 c44,366.314.544,380.80.044,378.82.044,172.8208.0
4f146p23P244,760.445,030.044,466 44,879.8119.446,034.71274.346,027.71267.344,836.175.7
4f146s9s3S145,121.344,097.244,070.6 45,074.247.145,121.30.045,115.55.844,886.0235.3
1S0 44,123.744,092.7 45,909.5 45,361.0 45,346.9 45,076.2
4f146s8d3D146,445.045,388.845,376.4 46,364.780.346,445.00.046,433.411.646,202.1242.9
3D246,467.745,387.645,392.7 46,377.790.044,984.31483.444,990.41477.344,840.91626.8
3D346,480.745,425.945,412.6 46,398.082.746,480.70.046,470.510.246,238.2242.5
1D2 45,424.845,292.8 47,550.9 47,574.0 47,552.6 47,062.2
4f146p21D247,821.848,840.946,097.7 47,961.3139.547,822.81.047,832.410.648,059.5237.6
R * 1040.8 114.1 737.6 734.3 604.4
* R is the root-mean-square error (RMSE). c Ref. [32].
Table 3. Partial Slater parameters (in cm−1) from the Cowan fit and RAEN fitting calculations of small-sample even-parity excited state energy levels of Yb I.
Linear Model | Cowan Fit | RAEN
Eav (4f146s7d) | 43,385.3 | 42,878.3
ζf | 28.6 | 28.7
G2(sd) | 28.6 | 62.6
Eav (4f146p2) | 44,875.8 | 45,269.1
F2(pp) | 1752 | 1395.8
ζd | 931.8 | 241.4
Eav (4f146s8d) | 45,399.2 | 45,770.5
ζf | 14.5 | 13.5
G2(sd) | 13.7 | 16.9
Eav (4f146s9s) | 44,076.9 | 44,424.7
Table 4. Fitted Yb I odd-parity excited state energy levels (in cm−1) calculated using the RAEN, XGB-R, XGB-B, and XGB-C methods, where E (in cm−1) and R denote the absolute error and root-mean-square error between experimental and theoretical energies, respectively.
Config. | Term | J | Exp. [3] | Ab Initio | RAEN (Level, E) | XGB-R (Level, E) | XGB-B (Level, E) | XGB-C (Level, E) | Other Work
4f146s6p3P117,992.017,891.717,808.4183.617,979.112.917,913.178.917,968.923.118,450.0 a
3P219,710.418,854.419,403.7306.719,621.289.219,610.2100.219,512.4198.020,251.0 a
1P125,068.229,170.426,212.41144.225,167.198.925,182.8114.626,015.2947.025,967.0 a
4f146s7p3P038,090.739,090.338,305.8215.138,135.344.638,199.4108.738,322.0231.3
3P138,174.239,183.538,173.90.338,200.626.438,232.358.138,407.2233.0
3P238,552.039,400.838,804.1252.138,558.06.038,573.621.638,747.8195.8
1P140,564.040,914.440,561.92.140,540.923.140,517.746.440,644.880.8
4f146s5f3F3 44,148.243,259.5 43,273.0 43,286.2 43,466.8 43,297.51 c
3F4 44,148.541,711.7 43,379.1 43,367.3 43,560.9
3F243,433.944,148.043,853.2419.343,452.118.243,509.976.043,598.7164.8
1F3 44,209.343,519.2 43,460.8 43,441.2 43,673.4 43,254.8 c
4f146s8p3P043,614.344,595.743,556.857.543,628.314.043,635.220.943,840.4226.1
3P143,659.444,632.743,597.062.443,670.310.943,708.949.543,884.1224.7
3P243,805.744,719.543,879.073.343,886.781.043,908.8103.144,017.0211.3
1P144,017.645,308.443,918.499.244,098.781.144,131.7114.144,315.8298.2
4f146s6f3F245,956.346,777.345,821.5134.845,942.413.945,957.61.346,145.1188.8
3F3 46,777.546,421.1 46,018.1 45,991.2 46,219.9
4f146s9p3P146,078.947,040.546,317.6238.746,185.5106.646,203.7124.846,301.8222.9
3P046,082.247,021.346,082.60.446,134.352.146,180.398.146,299.0216.8
4f146s6f3F4 46,777.645,858.3 46,031.8 46,018.6 46,271.9
3F3 46,818.446,042.6 46,051.5 46,009.9 46,319.7
4f146s9p3P246,184.247,084.746,184.30.146,211.527.346,243.759.546,391.9207.7
1P146,370.347,435.346,842.4472.146,545.1174.846,541.8171.546,617.0246.7
4f146s6f3F247,326.748,186.247,326.70.147,313.013.747,345.218.547,525.3198.7
3F3 48,186.347,525.1 47,419.4 47,377.8 47,637.2
3F4 48,186.447,764.0 47,776.1 47,762.6 47,938.9
1F3 48,212.848,062.6 48,169.5 48,155.7 48,247.1
R * | 1302.1 (Ab Initio) | 338.8 (RAEN) | 67.2 (XGB-R) | 88.8 (XGB-B) | 297.2 (XGB-C)
* R is the root-mean-square error (RMSE). a Ref. [30]; c Ref. [32].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ma, K.; Yang, C.; Zhang, J.; Li, Y.; Jiang, G.; Chai, J. Machine Learning-Assisted Hartree–Fock Approach for Energy Level Calculations in the Neutral Ytterbium Atom. Entropy 2024, 26, 962. https://doi.org/10.3390/e26110962


