Spatio-Temporal Denoising Graph Autoencoders with Data Augmentation for Photovoltaic Data Imputation

Published: 30 May 2023

Abstract

The integration of the global photovoltaic (PV) market with real-time data loggers has enabled large-scale PV data analytics pipelines for power forecasting and reliability assessment of PV fleets. Nevertheless, the performance of PV data analysis depends on the quality of PV time-series data. We propose a novel Spatio-Temporal Denoising Graph Autoencoder (STD-GAE) framework to impute missing PV power data. STD-GAE exploits temporal correlation, spatial coherence, and value dependencies from domain knowledge to recover missing data. It is built on two modules. (1) To cope with sparse yet varied missing-data scenarios, STD-GAE incorporates a domain-knowledge-aware data augmentation module that creates plausible variations of missing-data patterns. This lets STD-GAE generalize to robust imputation across different seasons and environments. (2) STD-GAE nontrivially integrates spatiotemporal graph convolution layers with a denoising autoencoder to improve imputation accuracy at the PV fleet level. Experimental results on two PV datasets show that, compared with state-of-the-art data imputation methods, STD-GAE can achieve a gain of 43.14% in imputation accuracy and remains less sensitive to missing rate, season, and missing scenario.
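The two ideas named in the abstract, augmenting missing-data patterns and recovering hidden values from spatially related plants, can be illustrated with a minimal sketch. This is not the STD-GAE model: the graph autoencoder is swapped for plain neighbor averaging, the data are synthetic sine-shaped power curves, and the helper names `augment_missing` and `neighbor_impute`, the fully connected adjacency matrix, and all parameter values are assumptions made only for illustration.

```python
import numpy as np


def augment_missing(shape, rate, block_len, rng):
    """Build a boolean mask (True = observed) mixing two gap patterns."""
    n_nodes, n_steps = shape
    # Pattern 1: isolated gaps, each entry hidden with probability rate / 2.
    mask = rng.random(shape) >= rate / 2
    # Pattern 2: contiguous outages of block_len steps on random plants.
    n_blocks = max(1, round(rate / 2 * n_nodes * n_steps / block_len))
    for _ in range(n_blocks):
        node = rng.integers(n_nodes)
        start = rng.integers(0, n_steps - block_len + 1)
        mask[node, start:start + block_len] = False
    return mask


def neighbor_impute(x, mask, adj):
    """Fill hidden entries with the mean of observed neighbors at that step."""
    obs = np.where(mask, x, 0.0)       # zero out hidden values
    total = adj @ obs                  # sum of observed neighbor values
    count = adj @ mask.astype(float)   # number of observed neighbors
    est = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return np.where(mask, x, est)      # observed entries pass through unchanged


# Synthetic fleet (assumption): 4 plants, 48 time steps, scaled sine curves.
rng = np.random.default_rng(0)
t = np.linspace(0.0, np.pi, 48)
scale = np.array([1.0, 0.9, 1.1, 0.95])
x = scale[:, None] * np.sin(t)[None, :]
adj = np.ones((4, 4)) - np.eye(4)      # every plant neighbors every other plant

mask = augment_missing(x.shape, rate=0.3, block_len=6, rng=rng)
filled = neighbor_impute(x, mask, adj)
mae = np.abs(filled - x)[~mask].mean()  # error measured only on hidden entries
print(f"hidden entries: {(~mask).sum()}, MAE on hidden entries: {mae:.4f}")
```

Hiding values that are actually known and scoring the reconstruction only on those hidden entries is the same supervision signal a denoising autoencoder trains on; a learned model would replace `neighbor_impute` while the masking step could stay unchanged.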

Supplemental Material

MP4 File
Presentation video for paper "Spatio-temporal Denoising Graph Autoencoder with Data Augmentation for Photovoltaic Timeseries Data Imputation" accepted by SIGMOD 2023.

    Published In

    Proceedings of the ACM on Management of Data  Volume 1, Issue 1
    PACMMOD
    May 2023
    2807 pages
    EISSN:2836-6573
    DOI:10.1145/3603164
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data augmentation
    2. imputation
    3. spatio-temporal graph neural networks

    Qualifiers

    • Research-article

    Funding Sources

    • Department of Energy

    Article Metrics

    • Downloads (Last 12 months): 261
    • Downloads (Last 6 weeks): 51
    Reflects downloads up to 13 Nov 2024

    Cited By

    • (2024) Weighted Average Ensemble-Based PV Forecasting in a Limited Environment with Missing Data of PV Power. Sustainability 16:10 (4069). DOI: 10.3390/su16104069. Online publication date: 13-May-2024
    • (2024) Efficient Mixture of Experts based on Large Language Models for Low-Resource Data Preprocessing. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3690-3701). DOI: 10.1145/3637528.3671873. Online publication date: 25-Aug-2024
    • (2024) Mining of Switching Sparse Networks for Missing Value Imputation in Multivariate Time Series. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2296-2306). DOI: 10.1145/3637528.3671760. Online publication date: 25-Aug-2024
    • (2024) Parallel-friendly Spatio-Temporal Graph Learning for Photovoltaic Degradation Analysis at Scale. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (4470-4478). DOI: 10.1145/3627673.3680026. Online publication date: 21-Oct-2024
    • (2024) Discovering Denial Constraints Based on Deep Reinforcement Learning. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (120-129). DOI: 10.1145/3627673.3679714. Online publication date: 21-Oct-2024
    • (2023) Time-Series Imputation Using Graph Neural Networks and Denoising Autoencoders. 2023 IEEE 50th Photovoltaic Specialists Conference (PVSC) (1-4). DOI: 10.1109/PVSC48320.2023.10359805. Online publication date: 11-Jun-2023
    • (2023) Accelerating Time to Science using CRADLE: A Framework for Materials Data Science. 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC) (234-245). DOI: 10.1109/HiPC58850.2023.00041. Online publication date: 18-Dec-2023
