TCF-Trans: Temporal Context Fusion Transformer for Anomaly Detection in Time Series
Figure 2. Demonstration example of the three-layer decoder block diagram of the original Informer network.
Figure 3. Architecture of the vanilla Transformer.
Figure 4. Block diagram of the overall structure of TCF-Trans.
Figure 5. Demonstration example of the three-layer feature fusion decoder of TCF-Trans. ⊕ denotes feature fusion operations.
Figure 6. Demonstration example of the temporal context fusion module of TCF-Trans. ⊕ denotes feature fusion operations.
Figure 7. Visualisation examples on the gesture dataset. (a) Raw data slice from the test set; (b) visualisation example of a detection result slice by the proposed method; (c) visualisation example of a detection result slice by LUNAR.
Figure 8. Visualisation examples on the real-world transportation dataset (days). (a) Raw data slice from the test set, where each line represents traffic from a different raw data source; (b) visualisation example of a detection result slice achieved using the proposed method.
Abstract
1. Introduction
- We introduce a novel framework, the Temporal Context Fusion Transformer (TCF-Trans), for unsupervised anomaly detection in time series based on temporal context fusion.
- We replace the straight-through feature-transmitting structure in the decoder layers of Informer with the proposed feature fusion decoder, which fully utilises the features extracted by both shallow and deep decoder layers. This strategy prevents the decoder from missing subtle anomaly details while remaining robust to noise in the data.
- We propose the temporal context fusion module, which adaptively fuses the auxiliary predictions generated by the decoder. This strategy alleviates the noise or distortion introduced by relying on a single auxiliary prediction and fully exploits the temporal context information of the data.
- Extensive experiments on public datasets and a collected real-world transportation dataset validate that the proposed framework is effective for time-series anomaly detection tasks, such as those in transportation. In addition, a series of sensitivity experiments and an ablation study show that the proposed method maintains high performance under various experimental settings.
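The paper's own implementation is not included in this excerpt; the sketch below is only a minimal NumPy illustration of the two fusion ideas described above: combining shallow and deep decoder features, and adaptively fusing several auxiliary predictions of the same time step. The function names, shapes, and uniform fusion weights are illustrative assumptions; in the actual model the fusion weights would be learned end to end.

```python
import numpy as np

def feature_fusion(layer_outputs, weights=None):
    """Fuse feature maps from shallow and deep decoder layers.

    layer_outputs: list of (seq_len, d_model) arrays, one per decoder layer.
    weights: per-layer fusion weights; learned in the real model,
    uniform here as a stand-in.
    """
    if weights is None:
        weights = np.ones(len(layer_outputs)) / len(layer_outputs)
    return sum(w * f for w, f in zip(weights, layer_outputs))

def temporal_context_fusion(aux_predictions, alphas=None):
    """Fuse several auxiliary predictions of one time step made
    from different temporal contexts.

    aux_predictions: (n_contexts, d) array of per-context predictions.
    alphas: fusion weights; learned adaptively in the paper,
    uniform here as a stand-in.
    """
    preds = np.asarray(aux_predictions)
    if alphas is None:
        alphas = np.ones(preds.shape[0]) / preds.shape[0]
    return alphas @ preds  # weighted combination over contexts
```

Averaging over contexts already shows why the fusion helps: a distortion in any single auxiliary prediction is damped by the others, which is the robustness argument made in the third bullet above.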
2. Related Work
2.1. Problem Statement
2.2. Transformer
2.3. Preliminary on Informer
3. TCF-Trans: Temporal Context Fusion Transformer
3.1. Overall Structure
3.2. Auxiliary Prediction Generator
3.3. Temporal Context Fusion Module and Anomaly Detection
Algorithm 1: Anomaly detection via TCF-Trans
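Algorithm 1 itself is not reproduced in this excerpt. As a hedged illustration of the general scheme such prediction-based detectors follow, the sketch below scores each time step by its prediction error and flags steps whose error exceeds a threshold. The fixed-quantile threshold is a simplification of my own; the paper's reference list includes streaming extreme-value thresholding (Siffer et al.), which is the more principled choice.

```python
import numpy as np

def detect_anomalies(observed, predicted, quantile=0.99):
    """Flag time steps whose prediction error is unusually large.

    observed, predicted: (T, d) arrays; `predicted` stands in for the
    fused model output. The anomaly score is the Euclidean error per
    step; the fixed error quantile used as a threshold is an
    illustrative simplification, not the paper's method.
    """
    errors = np.linalg.norm(observed - predicted, axis=-1)
    threshold = np.quantile(errors, quantile)
    return errors > threshold, errors
```

In this scheme the detector is unsupervised: the model is trained only to predict normal behaviour, so anomalies surface as large residuals at test time.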
4. Experiments
4.1. Setup
4.2. Evaluation on Public Datasets
4.3. Evaluation of the Real-World Transportation Dataset
4.4. Ablation Study
4.5. Parameter Sensitivity Experiments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. 2009, 41, 1–58. [Google Scholar] [CrossRef]
- Cherdo, Y.; Miramond, B.; Pegatoquet, A.; Vallauri, A. Unsupervised Anomaly Detection for Cars CAN Sensors Time Series Using Small Recurrent and Convolutional Neural Networks. Sensors 2023, 23, 5013. [Google Scholar] [CrossRef] [PubMed]
- Xu, Z.; Yang, Y.; Gao, X.; Hu, M. DCFF-MTAD: A Multivariate Time-Series Anomaly Detection Model Based on Dual-Channel Feature Fusion. Sensors 2023, 23, 3910. [Google Scholar] [CrossRef] [PubMed]
- Pang, G.; Shen, C.; Cao, L.; Hengel, A.V.D. Deep learning for anomaly detection: A review. ACM Comput. Surv. 2021, 54, 1–38. [Google Scholar] [CrossRef]
- El Sayed, A.; Ruiz, M.; Harb, H.; Velasco, L. Deep Learning-Based Adaptive Compression and Anomaly Detection for Smart B5G Use Cases Operation. Sensors 2023, 23, 1043. [Google Scholar] [CrossRef] [PubMed]
- Kim, B.; Alawami, M.A.; Kim, E.; Oh, S.; Park, J.; Kim, H. A comparative study of time series anomaly detection models for industrial control systems. Sensors 2023, 23, 1310. [Google Scholar] [CrossRef]
- Lan, D.T.; Yoon, S. Trajectory Clustering-Based Anomaly Detection in Indoor Human Movement. Sensors 2023, 23, 3318. [Google Scholar] [CrossRef]
- Fisher, W.D.; Camp, T.K.; Krzhizhanovskaya, V.V. Anomaly detection in earth dam and levee passive seismic data using support vector machines and automatic feature selection. J. Comput. Sci. 2017, 20, 143–153. [Google Scholar] [CrossRef]
- Tian, Y.; Mirzabagheri, M.; Bamakan, S.M.H.; Wang, H.; Qu, Q. Ramp loss one-class support vector machine; A robust and effective approach to anomaly detection problems. Neurocomputing 2018, 310, 223–235. [Google Scholar] [CrossRef]
- Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data TKDD 2012, 6, 1–39. [Google Scholar] [CrossRef]
- Mishra, S.; Chawla, M. A comparative study of local outlier factor algorithms for outliers detection in data streams. In Emerging Technologies in Data Mining and Information Security; Springer: Singapore, 2019; pp. 347–356. [Google Scholar]
- Pevnỳ, T. Loda: Lightweight on-line detector of anomalies. Mach. Learn. 2016, 102, 275–304. [Google Scholar] [CrossRef]
- Zhao, Y.; Nasrullah, Z.; Hryniewicki, M.K.; Li, Z. LSCP: Locally selective combination in parallel outlier ensembles. In Proceedings of the 2019 SIAM International Conference on Data Mining, SIAM, Santa Barbara, CA, USA, 2–4 May 2019; pp. 585–593. [Google Scholar]
- Choi, K.; Yi, J.; Park, C.; Yoon, S. Deep learning for anomaly detection in time-series data: Review, analysis, and guidelines. IEEE Access 2021, 9, 120043–120065. [Google Scholar] [CrossRef]
- Ruff, L.; Vandermeulen, R.; Goernitz, N.; Deecke, L.; Siddiqui, S.A.; Binder, A.; Müller, E.; Kloft, M. Deep one-class classification. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 4393–4402. [Google Scholar]
- Trinh, H.D.; Giupponi, L.; Dini, P. Urban anomaly detection by processing mobile traffic traces with LSTM neural networks. In Proceedings of the 2019 16th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), Boston, MA, USA, 10–13 June 2019; pp. 1–8. [Google Scholar]
- Munir, M.; Siddiqui, S.A.; Dengel, A.; Ahmed, S. DeepAnT: A deep learning approach for unsupervised anomaly detection in time series. IEEE Access 2018, 7, 1991–2005. [Google Scholar] [CrossRef]
- Zong, B.; Song, Q.; Min, M.R.; Cheng, W.; Lumezanu, C.; Cho, D.; Chen, H. Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Liu, Y.; Li, Z.; Zhou, C.; Jiang, Y.; Sun, J.; Wang, M.; He, X. Generative adversarial active learning for unsupervised outlier detection. IEEE Trans. Knowl. Data Eng. 2019, 32, 1517–1528. [Google Scholar] [CrossRef]
- Deng, A.; Hooi, B. Graph Neural Network-Based Anomaly Detection in Multivariate Time Series. Proc. AAAI Conf. Artif. Intell. 2021, 35, 4027–4035. [Google Scholar] [CrossRef]
- Goodge, A.; Hooi, B.; Ng, S.K.; Ng, W.S. LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks. Proc. AAAI Conf. Artif. Intell. 2022, 36, 6737–6745. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.U.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
- Chen, L.; You, Z.; Zhang, N.; Xi, J.; Le, X. UTRAD: Anomaly detection and localization with U-Transformer. Neural Netw. 2022, 147, 53–62. [Google Scholar] [CrossRef]
- Wang, X.; Pi, D.; Zhang, X.; Liu, H.; Guo, C. Variational transformer-based anomaly detection approach for multivariate time series. Measurement 2022, 191, 110791. [Google Scholar] [CrossRef]
- Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11106–11115. [Google Scholar] [CrossRef]
- Li, H.; Peng, X.; Zhuang, H.; Lin, Z. Multiple Temporal Context Embedding Networks for Unsupervised time Series Anomaly Detection. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 3438–3442. [Google Scholar]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
- Parmar, N.; Vaswani, A.; Uszkoreit, J.; Kaiser, L.; Shazeer, N.; Ku, A.; Tran, D. Image transformer. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 4055–4064. [Google Scholar]
- Chen, H.; Wang, Z.; Tian, H.; Yuan, L.; Wang, X.; Leng, P. A Robust Visual Tracking Method Based on Reconstruction Patch Transformer Tracking. Sensors 2022, 22, 6558. [Google Scholar] [CrossRef] [PubMed]
- Xian, T.; Li, Z.; Zhang, C.; Ma, H. Dual Global Enhanced Transformer for image captioning. Neural Netw. 2022, 148, 129–141. [Google Scholar] [CrossRef] [PubMed]
- Li, S.; Jin, X.; Xuan, Y.; Zhou, X.; Chen, W.; Wang, Y.X.; Yan, X. Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, USA, 10–12 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’ Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32. [Google Scholar]
- Liu, M.; Ren, S.; Ma, S.; Jiao, J.; Chen, Y.; Wang, Z.; Song, W. Gated Transformer Networks for Multivariate Time Series Classification. arXiv 2021, arXiv:2103.14438. [Google Scholar]
- Wang, C.; Xing, S.; Gao, R.; Yan, L.; Xiong, N.; Wang, R. Disentangled Dynamic Deviation Transformer Networks for Multivariate Time Series Anomaly Detection. Sensors 2023, 23, 1104. [Google Scholar] [CrossRef]
- Wen, Q.; Zhou, T.; Zhang, C.; Chen, W.; Ma, Z.; Yan, J.; Sun, L. Transformers in Time Series: A Survey. arXiv 2023, arXiv:2202.07125. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer Normalization. arXiv 2016, arXiv:1607.06450. [Google Scholar]
- Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI Open 2022, 3, 111–132. [Google Scholar] [CrossRef]
- Wang, P.; Zheng, W.; Chen, T.; Wang, Z. Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice. arXiv 2022, arXiv:2203.05962. [Google Scholar]
- Xue, F.; Chen, J.; Sun, A.; Ren, X.; Zheng, Z.; He, X.; Chen, Y.; Jiang, X.; You, Y. A Study on Transformer Configuration and Training Objective. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023. [Google Scholar]
- Siffer, A.; Fouque, P.A.; Termier, A.; Largouet, C. Anomaly detection in streams with extreme value theory. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1067–1075. [Google Scholar]
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8026–8037. [Google Scholar]
- Zhao, Y.; Nasrullah, Z.; Li, Z. PyOD: A Python Toolbox for Scalable Outlier Detection. J. Mach. Learn. Res. 2019, 20, 1–7. [Google Scholar]
- Keogh, E.; Lin, J.; Fu, A. HOT SAX: Efficiently finding the most unusual time series subsequence. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM’05), Houston, TX, USA, 27–30 November 2005; p. 8. [Google Scholar] [CrossRef]
- Ahmad, S.; Lavin, A.; Purdy, S.; Agha, Z. Unsupervised real-time anomaly detection for streaming data. Neurocomputing 2017, 262, 134–147. [Google Scholar] [CrossRef]
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
LUNAR [21] | 50.33 | 38.57 | 72.40 |
DeepAnT [17] | 40.08 | 25.07 | 99.86 |
DeepSVDD [15] | 50.00 | 43.46 | 58.86 |
TCF-Trans | 69.69 | 61.67 | 80.11 |
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
LSCP [13] | 60.53 | 62.30 | 58.86 |
LUNAR [21] | 45.59 | 88.66 | 30.69 |
SO-GAAL [19] | 59.76 | 57.61 | 62.08 |
DeepSVDD [15] | 54.51 | 81.62 | 40.92 |
TCF-Trans | 62.55 | 68.76 | 57.36 |
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
LSCP [13] | 41.24 | 48.87 | 35.67 |
LUNAR [21] | 44.33 | 35.13 | 60.06 |
SO-GAAL [19] | 40.36 | 40.50 | 40.22 |
DeepSVDD [15] | 32.19 | 20.41 | 76.03 |
TCF-Trans | 60.55 | 50.18 | 76.31 |
Dataset | Data Dimension | Training Size | Testing Size |
---|---|---|---|
real-world transportation dataset (days) | 12 | 104 | 166 |
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
LODA [12] | 83.05 | 81.67 | 84.48 |
LSCP [13] | 90.91 | 96.15 | 86.21 |
LUNAR [21] | 88.89 | 96.00 | 82.76 |
SO-GAAL [19] | 82.64 | 79.37 | 86.21 |
DeepSVDD [15] | 87.39 | 85.25 | 89.66 |
TCF-Trans | 93.81 | 96.36 | 91.38 |
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
Informer baseline † | 94.02 | 93.22 | 94.83 |
TCF-Trans w/o temporal context fusion † | 85.22 | 85.96 | 84.48 |
TCF-Trans w/o feature fusion † | 94.74 | 96.43 | 93.10 |
TCF-Trans † | 96.55 | 96.55 | 96.55 |
Optimiser | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
SGD † | 88.50 | 90.91 | 86.21 |
Adam † * | 96.55 | 96.55 | 96.55 |
Ratio of Training Data (%) | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
80 † | 92.86 | 96.30 | 89.66 |
90 † | 94.74 | 96.43 | 93.10 |
100 † * | 96.55 | 96.55 | 96.55 |
Input and Reference Length | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
[3 & 1] † | 93.91 | 94.74 | 93.10 |
[5 & 1] † | 94.02 | 93.22 | 94.83 |
[5 & 2] † * | 96.55 | 96.55 | 96.55 |
[7 & 2] † | 93.81 | 96.36 | 91.38 |
[9 & 3] † | 91.67 | 88.71 | 94.83 |
Method | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
4-layer Informer baseline † | 85.22 | 85.96 | 84.48 |
4-layer TCF-Trans † | 93.91 | 94.74 | 93.10 |
TCF-Trans † | 96.55 | 96.55 | 96.55 |
Loss Function | F1 (%) | Precision (%) | Recall (%) |
---|---|---|---|
MAE † | 94.02 | 93.22 | 94.83 |
SmoothL1 † | 95.73 | 94.92 | 96.55 |
MSE † * | 96.55 | 96.55 | 96.55 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Peng, X.; Li, H.; Lin, Y.; Chen, Y.; Fan, P.; Lin, Z. TCF-Trans: Temporal Context Fusion Transformer for Anomaly Detection in Time Series. Sensors 2023, 23, 8508. https://doi.org/10.3390/s23208508