
Newsletter of North-Caucasus Federal University


Impact of missing data recovery methods on the accuracy of electricity consumption forecasting by gradient boosting algorithm

https://doi.org/10.37493/2307-907X.2025.6.5

Abstract

Introduction. Smart electricity metering systems (ISMS) are being actively deployed in low-voltage networks, and their data can be used to forecast consumption. However, missing values in the data increase the forecast error.

Goal. To compare how different methods for recovering missing values in ISMS data affect the accuracy of electricity consumption forecasting.

Materials and methods. The study used a real dataset of hourly active energy values from 132 single-phase household consumers in one region of the North Caucasus over a 25-month period. Four methods were used to fill in missing data: the mean, the median, interpolation, and the median for each hour of the day. An XGBoost machine learning model was used to forecast hourly electricity consumption for the next month, week, and day, and forecast quality was evaluated with the root mean square error (RMSE).

Results and discussion. Analysis of the average RMSE values shows that the choice of imputation method has some effect at the monthly forecast horizon, where the median methods yield an RMSE that is 0.066 kWh (12.6 %) lower than filling with the mean and 0.053 kWh (10.2 %) lower than filling by interpolation. At the weekly horizon, the hourly median gives the best result, with an RMSE advantage over the other methods of 0.013-0.021 kWh; at the daily horizon, the median method is the most effective, with an RMSE advantage of 0.012-0.023 kWh.

Conclusion. At the monthly forecast horizon, it is advisable to fill in missing data with the median methods. When forecasting a week or a day ahead, all of the methods discussed are almost equivalent.
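The four gap-filling strategies and the RMSE metric named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy readings, and the handling of gaps at the edges of the series are assumptions made for illustration, and the XGBoost forecasting step is left out entirely.

```python
import math
from statistics import mean, median


def fill_constant(values, stat):
    """Replace each None with one statistic (mean or median) of the observed values."""
    observed = [v for v in values if v is not None]
    fill = stat(observed)
    return [fill if v is None else v for v in values]


def fill_linear(values):
    """Linearly interpolate gaps between the nearest observed neighbours.

    Assumption: gaps at the start or end copy the nearest observed value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = values[left] + weight * (values[right] - values[left])
    return out


def fill_hourly_median(values, hours):
    """Replace each None with the median of observed values at the same hour of day.

    Raises KeyError if an hour has no observed value at all.
    """
    by_hour = {}
    for h, v in zip(hours, values):
        if v is not None:
            by_hour.setdefault(h, []).append(v)
    return [median(by_hour[h]) if v is None else v for h, v in zip(hours, values)]


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


# Hypothetical readings in kWh: three "days" of three hourly slots, two gaps.
hours = [0, 1, 2, 0, 1, 2, 0, 1, 2]
readings = [0.2, 0.5, None, 0.3, None, 0.9, 0.1, 0.7, 1.1]

filled_mean = fill_constant(readings, mean)
filled_median = fill_constant(readings, median)
filled_linear = fill_linear(readings)
filled_hourly = fill_hourly_median(readings, hours)
```

In a comparison like the one described, each filled series would be used to train the same forecasting model, and the resulting forecasts would be scored with `rmse` against held-out measurements.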

About the Authors

A. G. Shidov
North-Caucasus Federal University
Russia

Arsen G. Shidov – Postgraduate Student of the Department of AESiE, Faculty of Oil and Gas Engineering



Yu. G. Kononov
North-Caucasus Federal University
Russia

Yuri G. Kononov – Dr. Sci. (Techn.), Professor, Head of the Department of Automated Electric Power Systems and Power Supply



D. A. Kostyukov
North-Caucasus Federal University
Russia

Dmitry A. Kostyukov – Cand. Sci. (Techn.), Associate Professor of the Department of Automated Electric Power Systems and Power Supply



M. R. Kurshev
North-Caucasus Federal University
Russia

Murat R. Kurshev – Postgraduate Student of the Department of AESiE, Faculty of Oil and Gas Engineering



B. G. Shidov
North-Caucasus Federal University
Russia

Beslan G. Shidov – Master's Student of the Department of AESiE, Faculty of Oil and Gas Engineering





For citations:


Shidov A.G., Kononov Yu.G., Kostyukov D.A., Kurshev M.R., Shidov B.G. Impact of missing data recovery methods on the accuracy of electricity consumption forecasting by gradient boosting algorithm. Newsletter of North-Caucasus Federal University. 2025;(6):46-55. (In Russ.) https://doi.org/10.37493/2307-907X.2025.6.5




This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2307-907X (Print)