An integrated framework for diagnosing process faults with incomplete features

Roozbeh Razavi-Far¹,
Mehrdad Saif²,
Vasile Palade³ (ORCID: orcid.org/0000-0002-6768-8394) &
Shiladitya Chakrabarti²

Abstract

Handling missing values and high-dimensional features is a crucial requirement for data-driven fault diagnosis systems. However, most intelligent data-driven diagnostic systems cannot handle missing data, and high-dimensional feature sets further complicate fault diagnosis. This paper devises a missing data imputation unit and a dimensionality reduction unit for the pre-processing module of the diagnostic system, and proposes a novel pooling strategy for missing data imputation (PSMI). This strategy can simplify complex patterns of missingness and incrementally update the pool. The pre-processing module receives incomplete observations, PSMI estimates the missing values, and the dimensionality reduction unit then transforms the completed observations onto a lower-dimensional feature space. The transformed observations are fed as inputs to the fault classification module for decision making and diagnosis. The diagnostic scheme makes use of various state-of-the-art missing data imputation, dimensionality reduction and classification algorithms, which enables a comprehensive comparison and makes it possible to identify the best techniques for diagnosing faults in the Tennessee Eastman process. The obtained results show the effectiveness of the proposed pooling strategy and indicate that principal component analysis imputation and heteroscedastic discriminant analysis outperform the other imputation and dimensionality reduction techniques in this diagnostic application.
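
The three-stage flow described in the abstract (impute, reduce, classify) can be illustrated with a minimal sketch. It is not the authors' implementation: the pooling strategy (PSMI) is not reproduced, plain PCA stands in for heteroscedastic discriminant analysis, and a nearest-centroid rule stands in for the fault classifiers. All function names, parameter values, and the synthetic two-class data are assumptions made for illustration only.

```python
import numpy as np

def pca_impute(X, n_components=1, n_iter=50, tol=1e-6):
    """Fill NaNs by iterating a low-rank PCA reconstruction (illustrative)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from the column means of the observed entries.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        recon = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        # Only missing cells are overwritten; observed cells keep their values.
        updated = np.where(missing, recon, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled

def pca_fit(X, n_components):
    """Return the mean and the leading principal directions of X."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project observations onto the lower-dimensional feature space."""
    return (X - mean) @ components.T

def fit_centroids(Z, y):
    """Nearest-centroid classifier: one mean vector per class label."""
    labels = np.unique(y)
    return labels, np.stack([Z[y == c].mean(axis=0) for c in labels])

def predict(Z, labels, centroids):
    """Assign each row of Z to the class with the closest centroid."""
    dist = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(dist, axis=1)]

# Synthetic stand-in for process data: two well-separated classes, 4 features.
rng = np.random.default_rng(0)
X_true = np.vstack([rng.normal(0.0, 0.5, (20, 4)),
                    rng.normal(5.0, 0.5, (20, 4))])
y = np.repeat([0, 1], 20)

# Remove roughly 10% of the entries completely at random.
X_missing = X_true.copy()
X_missing[rng.random(X_true.shape) < 0.1] = np.nan

completed = pca_impute(X_missing)                  # stage 1: imputation
mean, components = pca_fit(completed, 2)           # stage 2: reduction
Z = pca_transform(completed, mean, components)
labels, centroids = fit_centroids(Z, y)            # stage 3: classification
accuracy = np.mean(predict(Z, labels, centroids) == y)  # training accuracy only
```

The accuracy here is measured on the same observations used to fit the centroids, so it only checks that the stages connect; a real comparison of techniques would need held-out data such as the Tennessee Eastman test sets.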

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering and School of Computer Science, University of Windsor, Windsor, ON, N9B 3P4, Canada
Roozbeh Razavi-Far
Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON, N9B 3P4, Canada
Mehrdad Saif & Shiladitya Chakrabarti
Center for Data Science, Coventry University, Coventry, CV1 5FB, UK
Vasile Palade

Corresponding author

Correspondence to Roozbeh Razavi-Far.

About this article

Cite this article

Razavi-Far, R., Saif, M., Palade, V. et al. An integrated framework for diagnosing process faults with incomplete features. Knowl Inf Syst 64, 75–93 (2022). https://doi.org/10.1007/s10115-021-01625-w

Received: 26 March 2020
Revised: 31 October 2021
Accepted: 07 November 2021
Published: 26 November 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10115-021-01625-w
