Abstract
High dimensional model representation (HDMR) is a general set of quantitative model assessment and analysis tools for capturing high dimensional input-output system behavior. In practice, the HDMR component functions are each approximated by an appropriate basis function expansion, a procedure that often requires many input-output samples and can thereby restrict the treatment of high dimensional systems. To address this problem, we introduce SVR-based HDMR, which efficiently and effectively constructs the HDMR expansion of a function \(f(\mathbf{x})\) by support vector regression (SVR). In this paper, results for independent variables sampled over known probability distributions are reported. The theoretical foundation of the new approach relies on the kernel used in SVR itself being an HDMR expansion (referred to as an HDMR kernel), i.e., an ANOVA kernel whose component kernels are mutually orthogonal and whose non-constant component kernels all have zero expectation. Several HDMR kernels are constructed as illustrations. While preserving the characteristic properties of HDMR, the SVR-based HDMR method enables efficient construction of high dimensional models with satisfactory prediction accuracy from a modest number of samples, which also permits accurate computation of the sensitivity indices. A genetic algorithm is employed to optimally determine all the parameters of the component HDMR kernels and in SVR. SVR-based HDMR introduces a new route to advance HDMR algorithms. Two examples are used to illustrate the capability of the method.







Notes
When the paper was published, the authors changed to \(n=4\) with \(\mathbf{a} = (0, 1, 4.5, 99)\).
References
R. Fisher, Statistical Methods for Research Workers (Oliver and Boyd, Edinburgh, 1925)
W. Hoeffding, Ann. Math. Stat. 19, 293–325 (1948)
I.M. Sobol, Matematicheskoe Modelirovanie 2, 112–118 (1990) (in Russian) [Transl.: Math. Model. Comput. Exp. 1, 407–414 (1993)]
H. Rabitz, O.F. Alis, J. Math. Chem. 25, 197–233 (1999)
G. Li, H. Rabitz, J. Math. Chem. 50, 99–130 (2012)
G. Li, H. Rabitz, J. Math. Chem. 52, 2052–2073 (2014)
G. Hooker, J. Comput. Graph. Stat. 16(3), 709–732 (2007)
O.F. Alis, H. Rabitz, J. Math. Chem. 29, 127–142 (2001)
Z. Huang, H. Qiu, M. Zhao, X. Cai, L. Gao, AASRI Proc. 3, 95–100 (2012)
Z. Huang, H. Qiu, M. Zhao, X. Cai, L. Gao, Eng. Comput. 32(3), 643–667 (2015)
S.R. Gunn, M. Brown, K.M. Bossley, Network performance assessment for neurofuzzy data modeling, in Intelligent Data Analysis, Lecture Notes in Computer Science, vol. 1208, ed. by X. Liu, P. Cohen (1997), pp. 313–323
V. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 1995). ISBN:0-387-94559-8
V. Vapnik, S. Golowich, A. Smola, Support vector method for function approximation, regression estimation, and signal processing, in Advances in Neural Information Processing Systems, vol. 9, ed. by M. Mozer, M. Jordan, T. Petsche (MIT Press, Cambridge, 1997), pp. 281–287
A. Smola, B. Schölkopf, Stat. Comput. 14, 199–222 (2004)
O.L. Mangasarian, Nonlinear Programming (McGraw-Hill, New York, 1983)
G.P. McCormick, Nonlinear Programming: Theory, Algorithm, and Applications (Wiley, New York, 1983)
G.E. Fasshauer, M. McCourt, SIAM J. Sci. Comput. 34(2), A737–A762 (2012)
A. Shashua, Introduction to Machine Learning (School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, 2008)
M.O. Stitson, A. Gammerman, V. Vapnik, V. Vovk, C. Watkins, J. Weston, Support vector regression with ANOVA decomposition kernels. Technical report CSD-TR-97-22, Nov. 27, Department of Computer Science, Royal Holloway, University of London, Egham (1997)
D. Duvenaud, H. Nickisch, C.E. Rasmussen, Additive Gaussian processes, in Advances in Neural Information Processing Systems, vol. 24 (Granada, 2011)
N. Durrande, D. Ginsbourger, O. Roustant, L. Carraro, J. Multivariate Anal. 115, 57–67 (2013)
D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning (Addison-Wesley, Reading, 1989)
C.W. Hsu, C.C. Chang, C.J. Lin, A Practical Guide to Support Vector Classification. (2010). www.csie.ntu.edu.tw/~cjlin/talks/freiburg
A. Jouyban, H.K. Chan, N.Y.K. Chew, M. Khoubnasabjafari, W.E. Acree Jr., Chem. Pharm. Bull. 54(4), 428–431 (2006)
A. Saltelli, I.M. Sobol, Reliab. Eng. Syst. Saf. 50(3), 225–239 (1995)
G. Li, H. Rabitz, J. Math. Chem. 48, 1010–1035 (2010)
G. Chastaing, F. Gamboa, C. Prieur, J. Stat. Comput. Simul. 85(7), 1306–1333 (2015)
C. Dunkl, Y. Xu, Orthogonal Polynomials of Several Variables (Cambridge University Press, Cambridge, 2001)
Acknowledgments
Support of this work for G. Li was provided by the NSF (Grant No. CHE-1464569), and H. Rabitz was supported by the Templeton Foundation (Grant No. 52265).
Appendices
The material below summarizes the features of the HDMR kernels employed in this paper.
Appendix 1: \(K_0(x_{i_s},x_i)\) is a kernel
The commonly used kernels \(K(x_{i_s},x_i)\) in SVR, such as the Gaussian rbf, polynomial kernels, etc., do not satisfy the zero expectation condition with respect to the measure \(\mu \) required for defining HDMR component functions:
$$\begin{aligned} \int K(x_{i_s},t)\,{\mathrm d}\mu (t) = 0, \qquad \int K(s,x_i)\,{\mathrm d}\mu (s) = 0, \end{aligned}$$(59)
or using \({\mathrm d}\mu (s)=p_i(s){\mathrm d}s\) and \({\mathrm d}\mu (t)=p_i(t){\mathrm d}t\)
$$\begin{aligned} \int K(x_{i_s},t)\,p_i(t)\,{\mathrm d}t = 0, \qquad \int K(s,x_i)\,p_i(s)\,{\mathrm d}s = 0, \end{aligned}$$(60)
where \(p_i(x_i)\) is the marginal pdf of \(x_i\). To solve this problem, define
$$\begin{aligned} K_0(x_{i_s},x_i) = K(x_{i_s},x_i) - \frac{\displaystyle \int K(x_{i_s},t)\,p_i(t)\,{\mathrm d}t \int K(s,x_i)\,p_i(s)\,{\mathrm d}s}{\displaystyle \int \!\!\int K(s,t)\,p_i(s)\,p_i(t)\,{\mathrm d}s\,{\mathrm d}t}. \end{aligned}$$(61)
For a uniform distribution of \(x_i\) on \([a,b]\), Eq. 61 reduces to
$$\begin{aligned} K_0(x_{i_s},x_i) = K(x_{i_s},x_i) - \frac{\displaystyle \int _a^b K(x_{i_s},t)\,{\mathrm d}t \int _a^b K(s,x_i)\,{\mathrm d}s}{\displaystyle \int _a^b\!\!\int _a^b K(s,t)\,{\mathrm d}s\,{\mathrm d}t}. \end{aligned}$$(62)
From Eq. 61, it is easy to prove that \(K_0(x_{i_s},x_i)\) satisfies Eq. 60 by direct integration.
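When the integrals in Eq. 61 have no convenient closed form, the construction can also be carried out numerically. Below is a minimal Python sketch (our illustration, not code from the paper; the helper name make_hdmr_kernel is ours) for a uniform distribution on \([-1,1]\), together with a quadrature check of the zero expectation condition of Eq. 60:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Numerical version of Eq. 61 for x uniform on [-1, 1]:
# K0(x, x') = K(x, x') - g(x) g(x') / C, with g(x) = E_t[K(x, t)] and
# C = E_s E_t[K(s, t)], both evaluated by Gauss-Legendre quadrature.
def make_hdmr_kernel(K, n_quad=64):
    t, w = leggauss(n_quad)
    w = w / 2.0                      # weights for the uniform pdf p(t) = 1/2

    def g(x):                        # g(x) = int K(x, t) p(t) dt
        return np.sum(K(np.asarray(x)[..., None], t) * w, axis=-1)

    C = np.sum(g(t) * w)             # C = int int K(s, t) p(s) p(t) ds dt

    def K0(x, xp):
        return K(x, xp) - g(x) * g(xp) / C

    return K0

# Example base kernel: Gaussian rbf with length scale l = 0.5
l = 0.5
K = lambda x, xp: np.exp(-(x - xp) ** 2 / (2 * l ** 2))
K0 = make_hdmr_kernel(K)

# Check the zero expectation condition (Eq. 60) at a few points
t, w = leggauss(64)
for x in (-0.7, 0.0, 0.3):
    print(x, np.sum(K0(x, t) * w / 2.0))   # all values should be ~0
```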
If \(K(x_i,x'_i)\) is a kernel, it can be proved that \(K_0(x_i,x'_i)\) is also a kernel satisfying Mercer's condition [17]
$$\begin{aligned} \int \!\!\int K_0(x_i,x'_i)\,f(x_i)\,f(x'_i)\,{\mathrm d}x_i\,{\mathrm d}x'_i \ge 0, \qquad \forall f \in L^2. \end{aligned}$$(63)
The proof is the following:
$$\begin{aligned} \int \!\!\int K_0(x_i,x'_i)f(x_i)f(x'_i)\,{\mathrm d}x_i\,{\mathrm d}x'_i = \langle u,u \rangle - \frac{\langle u,m \rangle ^2}{\langle m,m \rangle } \ge 0, \end{aligned}$$(64)
where
$$\begin{aligned} u(\cdot ) = \int K(x_i,\cdot )\,f(x_i)\,{\mathrm d}x_i, \qquad m(\cdot ) = \int K(x_i,\cdot )\,p_i(x_i)\,{\mathrm d}x_i. \end{aligned}$$(65)
Here, the inner product \(\langle *,* \rangle \) is defined with respect to the kernel \(K(x_i,x'_i)\)
$$\begin{aligned} \langle K(x_i,\cdot ),\,K(x'_i,\cdot ) \rangle = K(x_i,x'_i), \end{aligned}$$(66)
and the symmetric property
$$\begin{aligned} K(x_i,x'_i) = K(x'_i,x_i), \end{aligned}$$(67)
along with the Cauchy-Schwarz inequality
$$\begin{aligned} \langle u,m \rangle ^2 \le \langle u,u \rangle \,\langle m,m \rangle \end{aligned}$$(68)
were used.
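Mercer's condition can also be spot-checked numerically: the Gram matrix of \(K_0\) on an arbitrary point set should have no eigenvalue below round-off level. A minimal sketch, again for the uniform case on \([-1,1]\) with a Gaussian rbf base kernel (our illustration, not the paper's code):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Build K0 (Eq. 61) by quadrature, then check positive semidefiniteness
# of its Gram matrix on random points.
l = 0.5
K = lambda x, xp: np.exp(-(x - xp) ** 2 / (2 * l ** 2))

t, w = leggauss(64)
w = w / 2.0                                      # uniform pdf p(t) = 1/2
g = lambda x: np.sum(K(np.asarray(x)[..., None], t) * w, axis=-1)
C = np.sum(g(t) * w)
K0 = lambda x, xp: K(x, xp) - g(x) * g(xp) / C   # Eq. 61 by quadrature

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
G = K0(x[:, None], x[None, :])                   # 200 x 200 Gram matrix
print(np.linalg.eigvalsh(0.5 * (G + G.T)).min()) # ~0, never clearly negative
```

The quadrature-based \(K_0\) inherits the exact \(K-\langle \cdot ,\tilde{m} \rangle \langle \tilde{m},\cdot \rangle /\langle \tilde{m},\tilde{m} \rangle \) structure with the discrete mean element \(\tilde{m}=\sum _j w_j K(t_j,\cdot )\), so the Cauchy-Schwarz argument above applies and the test can only fail by round-off.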
Appendix 2: Formulas of some commonly used HDMR kernels
Using Eq. 61, some commonly used HDMR kernels are given below. To save space, the detailed derivations of these formulas are omitted; a numerical cross-check of one of the closed forms is sketched after the list.
1a. Gaussian rbf HDMR kernel with a uniform distribution in \([-1,1]\)
$$\begin{aligned} K_0(x_i,x'_i)= & {} \exp {\left[ -\frac{(x_i-x'_i)^2}{2l_i^2}\right] } \nonumber \\&\quad -\frac{\left[ \mathrm{erf}\left( \frac{1+x_i}{\sqrt{2}l_i}\right) +\mathrm{erf}\left( \frac{1-x_i}{\sqrt{2}l_i}\right) \right] \left[ \mathrm{erf}\left( \frac{1+x'_i}{\sqrt{2}l_i}\right) +\mathrm{erf}\left( \frac{1-x'_i}{\sqrt{2}l_i}\right) \right] }{\frac{4}{\sqrt{\pi }}\frac{\sqrt{2}}{l_i}\, \mathrm{erf}\left( \frac{\sqrt{2}}{l_i}\right) +\frac{4}{\pi }\left( e^{-\frac{2}{l_i^2}}-1\right) }. \end{aligned}$$(69)
1b. Gaussian rbf HDMR kernel with a standard normal distribution on \((-\infty ,\infty )\)
$$\begin{aligned} K_0(x_i,x'_i) =\exp {\left[ -\frac{(x_i-x'_i)^2}{2l_i^2}\right] } -\frac{l_i\sqrt{l_i^2+2}}{l_i^2+1} \exp {\left[ -\frac{x_i^2+(x'_i)^2}{2(l_i^2+1)}\right] }. \end{aligned}$$(70)
2a. Exponential HDMR kernel with a uniform distribution in \([-1,1]\)
$$\begin{aligned} K_0(x_i,x'_i)= & {} \exp {\left[ -\frac{|x_i-x'_i|}{2l_i^2}\right] } \nonumber \\&\quad -\frac{\left( 2-\exp {\left[ -\frac{1+x_i}{2l_i^2}\right] }-\exp {\left[ -\frac{1-x_i}{2l_i^2}\right] }\right) \left( 2-\exp {\left[ -\frac{1+x'_i}{2l_i^2}\right] }-\exp {\left[ -\frac{1-x'_i}{2l_i^2}\right] }\right) }{2\left[ \frac{1}{l_i^2}-\left( 1-e^{-\frac{1}{l_i^2}}\right) \right] }. \end{aligned}$$(71)
2b. Exponential HDMR kernel with a standard normal distribution on \((-\infty ,\infty )\)
$$\begin{aligned}&K_0(x_i,x'_i)=\exp {\left[ -\frac{|x_i-x'_i|}{2l_i^2}\right] } -\frac{1}{4\,\mathrm{erfc}(\frac{1}{2l_i^2})} \nonumber \\&\quad \times \left[ \exp \left( -\frac{x_i}{2l_i^2}\right) \mathrm{erfc} \left( \frac{1}{2\sqrt{2}l_i^2}-\frac{x_i}{\sqrt{2}}\right) +\exp \left( \frac{x_i}{2l_i^2}\right) \mathrm{erfc} \left( \frac{1}{2\sqrt{2}l_i^2}+\frac{x_i}{\sqrt{2}}\right) \right] \nonumber \\&\quad \times \left[ \exp \left( -\frac{x'_i}{2l_i^2}\right) \mathrm{erfc} \left( \frac{1}{2\sqrt{2}l_i^2}-\frac{x'_i}{\sqrt{2}}\right) +\exp \left( \frac{x'_i}{2l_i^2}\right) \mathrm{erfc} \left( \frac{1}{2\sqrt{2}l_i^2}+\frac{x'_i}{\sqrt{2}}\right) \right] . \end{aligned}$$(72)
3a. Polynomial HDMR kernel with a uniform distribution in \([0,1]\)
$$\begin{aligned} K_0(x_i,x'_i) = (x_ix'_i+1)^k - \frac{\left[ \displaystyle {\sum _{j=1}^{k+1}} C_{k+1}^j x_i^{j-1}\right] \left[ \displaystyle {\sum _{j=1}^{k+1}}C_{k+1}^j (x_i')^{j-1}\right] }{(k+1)\displaystyle {\sum _{j=1}^{k+1}}\frac{2^j-1}{j}}, \end{aligned}$$(73)
where
$$\begin{aligned} C_{k+1}^j=\frac{(k+1)!}{j!(k+1-j)!} \end{aligned}$$(74)
denotes the number of combinations of \(k+1\) things taken j at a time.
3b. Polynomial HDMR kernel with a uniform distribution in \([-1,1]\)
$$\begin{aligned} K_0(x_i,x'_i)= & {} \,(x_ix'_i+1)^k - \frac{1}{(k+1)\displaystyle {\sum _{j=1}^{\mathrm{Int}((k+1)/2)}} C_{k+1}^{2j-1}/(2j-1)} \nonumber \\&\quad \times \left[ \sum _{j=1}^{\mathrm{Int}((k+1)/2)} C_{k+1}^{2j-1} x_i^{2(j-1)} \right] \cdot \left[ \sum _{j=1}^{\mathrm{Int}((k+1)/2)} C_{k+1}^{2j-1}(x'_i)^{2(j-1)} \right] . \end{aligned}$$(75)
4. Fourier HDMR kernel with a uniform distribution in \([-\pi ,\pi ]\)
$$\begin{aligned} K_0(x_i,x'_i)= & {} \frac{1-q^2}{2(1-2q\cos (x_i-x'_i)+q^2)} - \frac{1}{2}\nonumber \\= & {} \frac{q\cos (x_i-x'_i)-q^2}{1-2q\cos (x_i-x'_i)+q^2}. \end{aligned}$$(76)
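As a sanity check on these closed forms, the sketch below (our illustration in Python; the variable names are ours) compares Eq. 70 against the generic construction of Eq. 61, with the expectations under the standard normal distribution evaluated by Gauss-Hermite quadrature:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Cross-check of Eq. 70: closed-form Gaussian rbf HDMR kernel for a standard
# normal variable versus Eq. 61 evaluated numerically.
l = 0.8
K = lambda x, xp: np.exp(-(x - xp) ** 2 / (2 * l ** 2))

def K0_closed(x, xp):                     # Eq. 70
    c = l * np.sqrt(l ** 2 + 2) / (l ** 2 + 1)
    return K(x, xp) - c * np.exp(-(x ** 2 + xp ** 2) / (2 * (l ** 2 + 1)))

# Probabilists' Gauss-Hermite nodes/weights for expectations under N(0, 1)
t, w = hermegauss(80)
w = w / np.sqrt(2 * np.pi)

def g(x):                                 # g(x) = E_t[K(x, t)]
    return np.sum(K(np.asarray(x)[..., None], t) * w, axis=-1)

C = np.sum(g(t) * w)                      # C = E_s E_t[K(s, t)]

def K0_quad(x, xp):                       # Eq. 61 by quadrature
    return K(x, xp) - g(x) * g(xp) / C

for x, xp in [(-1.2, 0.4), (0.0, 2.0), (0.7, 0.7)]:
    print(K0_closed(x, xp), K0_quad(x, xp))   # pairs should agree closely
```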
Appendix 3: Complex and hybrid HDMR kernels
If \(K_{01}(x_i,x'_i)\) and \(K_{02}(x_i,x'_i)\) are two HDMR kernels, then
$$\begin{aligned} K_{03}(x_i,x'_i) = a\,K_{01}(x_i,x'_i) + b\,K_{02}(x_i,x'_i), \qquad a, b \ge 0, \end{aligned}$$(77)
is another HDMR kernel. Using Eq. 31, we know that \(K_{03}(x_i,x'_i)\) is a kernel. Moreover, it can also be proved that \(K_{03}(x_i,x'_i)\) has zero expectation
$$\begin{aligned} \int K_{03}(x_i,t)\,p_i(t)\,{\mathrm d}t = a\int K_{01}(x_i,t)\,p_i(t)\,{\mathrm d}t + b\int K_{02}(x_i,t)\,p_i(t)\,{\mathrm d}t = 0. \end{aligned}$$(78)
Therefore \(K_{03}(x_i,x'_i)\) is an HDMR kernel. Using Eq. 77, various complex and hybrid HDMR kernels can be constructed.
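A minimal numerical sketch of Eq. 77 (our illustration; the choice of component kernels and the weights a, b are ours): two Gaussian rbf HDMR kernels of different length scales for a standard normal variable are combined, and Gauss-Hermite quadrature confirms that the combination retains zero expectation.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Eq. 77: a non-negative combination of two HDMR kernels is again an HDMR
# kernel.  Both components here are Gaussian rbf HDMR kernels (Eq. 70).
def K0_gauss(l):
    c = l * np.sqrt(l ** 2 + 2) / (l ** 2 + 1)
    return lambda x, xp: (np.exp(-(x - xp) ** 2 / (2 * l ** 2))
                          - c * np.exp(-(x ** 2 + xp ** 2) / (2 * (l ** 2 + 1))))

K01, K02 = K0_gauss(0.5), K0_gauss(1.5)
a, b = 0.7, 0.3
K03 = lambda x, xp: a * K01(x, xp) + b * K02(x, xp)

# Zero expectation of K03 under N(0, 1), checked by Gauss-Hermite quadrature
t, w = hermegauss(80)
w = w / np.sqrt(2 * np.pi)
for x in (-1.0, 0.0, 2.5):
    print(x, np.sum(K03(x, t) * w))      # all values should be ~0
```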