Open Access Article

Federated Learning-Based Prediction of Energy Consumption from Blockchain-Based Black Box Data for Electric Vehicles

Jong-Hyuk Park and In-Whee Joe

Department of Computer Science and Engineering, Hanyang University, Seoul 04763, Republic of Korea

Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(13), 5494; https://doi.org/10.3390/app14135494

Submission received: 27 May 2024 / Revised: 18 June 2024 / Accepted: 20 June 2024 / Published: 25 June 2024

(This article belongs to the Section Computing and Artificial Intelligence)


Figure 1. System components for electric vehicle data management.
Figure 2. The improvement in performance per driving cycle for the multi-stage model compared to the single model.
Figure 3. Robustness verification results of FL-QLMS against various Byzantine attacks.
Figure 4. Comparison of prediction results between single-stage and multi-stage models (sample driving sequence).
Figure 5. Comparison of test accuracy between centralized FL, vanilla FL, and partially decentralized FL methods.
Figure 6. Comparison of TPS improvement effects with scalability solutions.
Figure 7. TPS changes according to the number of electric vehicles.
Figure 8. Comparison of latency improvement effects with scalability solutions.
Figure 9. Changes in test accuracy according to label flipping ratio and backdoor insertion rate.
Figure 10. Test accuracy of algorithms against model poisoning and adversarial imitation learning attacks.


Abstract

In modern society, the proliferation of electric vehicles (EVs) is continuously increasing, presenting new challenges that necessitate integration with smart grids. The operational data from electric vehicles are voluminous, and the secure storage and management of these data are crucial for the efficient operation of the power grid. This paper proposes a novel system that utilizes blockchain technology to securely store and manage the black box data of electric vehicles. By leveraging the core characteristics of blockchain—immutability and transparency—the system records the operational data of electric vehicles and uses federated learning (FL) to predict their energy consumption based on these data. This approach allows the balanced management of the power grid’s load, optimization of energy supply, and maintenance of grid stability while reducing costs. Additionally, the paper implements a searchable black box data storage system using a public blockchain, which offers cost efficiency and robust anonymity, thereby enhancing convenience for electric vehicle users and strengthening the stability of the power grid. This research presents an innovative approach to the integration of electric vehicles and smart grids, exploring ways to enhance the stability and energy efficiency of the power grid. The proposed system has been validated through real data and simulations, demonstrating its effectiveness and performance in managing black box data and predicting energy consumption, thereby improving the efficiency and stability of the power grid. This system is expected to empower electric vehicle users with data ownership and provide power suppliers with more accurate energy demand predictions, promoting sustainable energy consumption and efficient power grid operations.

Keywords: electric vehicles; blockchain; federated learning; energy management; data security; machine learning

1. Introduction

1.1. Research Background

Electric vehicles (EVs) possess the potential to significantly reduce greenhouse gas emissions compared to internal combustion engine vehicles, leading to their accelerated adoption worldwide. According to the International Energy Agency (IEA), as of 2022, there were approximately 18 million battery electric vehicles (BEVs), and this number increased to 28 million in 2023 [1]. Incentives are in place to ensure that BEV sales account for 30% of the total vehicle market by 2030 [2]. This rapid expansion not only meets environmental demands for climate change mitigation and air pollution reduction but also contributes to achieving energy policy goals of improving energy efficiency and expanding the use of renewable energy sources [3]. However, the mass deployment of EVs poses serious challenges in terms of power system stability and efficiency. The surge in electricity demand due to EV charging and the increase in peak loads can lead to overloading and power losses in transmission and distribution facilities. According to Muratori [4], in the United States, a 25% EV penetration rate could lead to a 20% increase in peak loads, resulting in transformer and cable overheating, voltage instability, and frequency fluctuations, thereby degrading power quality and increasing the risk of blackouts. Furthermore, the integration of intermittent renewable energy sources such as solar and wind into the grid, coupled with the mismatch between EV charging/discharging demands and output patterns, exacerbates the challenges in balancing power supply and demand [5]. Studies by Weiller and Neely [5] have shown that in scenarios where the California Independent System Operator (CAISO) has a renewable generation share of 33%, the variability in net load can increase to up to 13 GW/h. This necessitates increased regulation and spinning reserve capacities, leading to significant additional costs. 
Moreover, such system instability reduces the grid's renewable energy hosting capacity, potentially hindering long-term energy transition goals. Therefore, accurate forecasting of EV demand patterns and coordinated charging scheduling are essential for stable and economical power system operations. During EV operation, black box systems collect various data, including vehicle location, energy usage, and remaining battery energy, which play a crucial role in predicting power demand and optimizing charging schedules [6]. For instance, Frendo et al. [7] developed a deep neural network-based model for predicting energy consumption using GPS and battery management system (BMS) data collected from vehicle operations, achieving an error rate within 3.5%. However, the centralized management of such sensitive operational data can make them vulnerable to hacking or insider attacks [8]. According to Gupta et al. [9], the incidence of cyberattacks related to automotive data increased approximately 18-fold from 2010 to 2019, with attacks targeting telematics and vehicle monitoring systems showing significant growth. From the perspective of EV users, who are the data subjects, it is difficult to control how their data are collected and utilized, leading to persistent concerns about privacy breaches. Research by Bhagat and Rani [10] revealed that about 67% of respondents expressed privacy concerns regarding the collection of connected car data. Thus, securing user trust in data privacy and security is a prerequisite for enabling EV-related data utilization and value-added services. Blockchain technology is gaining attention as a potential solution to these data management issues, enabling secure sharing and utilization of EV operational data. 
Blockchain is a distributed ledger technology in which multiple decentralized nodes jointly verify transactions and reach consensus; it is characterized by high integrity, availability, and transparency [11]. Because data are stored across multiple nodes, single points of failure are eliminated; thus, managing EV black box data on a blockchain could effectively prevent data tampering and ensure user control over data access, addressing information sovereignty issues. Pedrosa and Pau [12] proposed a blockchain-based energy trading platform between electric vehicles and charging stations, where key operational data, such as vehicle identification, location information, and remaining battery energy, are transparently recorded on the blockchain, helping to prevent fraudulent charging and other malpractices. Moreover, since all transaction records are permanently stored in immutable blocks, dispute resolution and accountability determination are simplified [13]. However, there are limitations to directly employing public blockchain platforms for managing EV data. The consensus process involving all nodes incurs high transaction costs and delays, which are not suitable for the real-time processing demands of large-scale EV data. For instance, Ethereum’s average block creation time is about 13 s, with a transaction throughput of about 15 transactions per second [14]. Furthermore, because ledgers are public to all nodes, additional encryption and access control techniques are needed to ensure the confidentiality of black box data. Also, as the number of verifying nodes increases, so does the cost of consensus, making the unlimited expansion of consensus participants impractical. Sedlmeir et al. [15] have pointed out scalability and privacy issues as major adoption barriers for public blockchains. 
In response, this study proposes a partially decentralized blockchain framework for the secure and efficient management of electric vehicle black box data. The proposed structure stores only encrypted hash values in the block to ensure confidentiality, while the consensus process involves only pre-approved nodes to enhance performance. These pre-approved nodes are selected through a rigorous vetting process conducted by a consortium of stakeholders, including electric vehicle manufacturers, energy providers, and regulatory bodies. The selection criteria for these nodes include factors such as reliability, computational performance, and compliance with security protocols. Reliability is evaluated based on the node’s historical uptime and performance, computational performance is assessed by benchmarking the node’s processing power and speed, and adherence to security protocols is verified through regular audits and security certifications. The vetting process also involves a series of tests and evaluations, including stress testing to measure performance under high loads, penetration testing to identify potential security vulnerabilities, and periodic audits to ensure ongoing compliance. By subjecting potential nodes to this thorough screening process, the consortium can ensure that only trusted and capable nodes are approved to participate in the consensus mechanism, thereby enhancing the overall security, reliability, and efficiency of the system.

Additionally, a lightweight consensus process that merges and compresses multiple transactions significantly reduces transaction costs. The algorithm used in this framework is the Raft consensus algorithm, which is the default consensus mechanism in Hyperledger Fabric. Raft is a crash fault-tolerant (CFT) consensus algorithm that achieves consensus by electing a leader node and processing transactions sequentially through the leader. It is known for its simplicity, high transaction throughput, and low latency. By leveraging Raft, the proposed system can perform efficient and stable consensus processes, even with a large number of transactions. The implementation of Raft in this framework also merges and compresses multiple transactions into a single block, further reducing the storage and network overhead associated with the consensus process. This optimization combines data deduplication, which eliminates redundant data, with delta encoding, which stores only the differences between transactions. By minimizing the amount of data that needs to be stored and transmitted, this technique lowers transaction costs and improves the overall performance of the system.
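To make the merge-and-compress idea concrete, the following is a minimal sketch of deduplication followed by delta encoding over a batch of transactions. It is an illustration under stated assumptions, not the authors' implementation and not a Hyperledger Fabric API: the function names `merge_batch` and `split_batch`, the dictionary transaction layout, and the use of JSON plus zlib are all hypothetical, and every transaction in a batch is assumed to carry the same set of fields.

```python
import json
import zlib


def merge_batch(transactions):
    """Merge dict transactions into one compressed payload (hypothetical format).

    Deduplication: byte-identical transactions are kept only once.
    Delta encoding: after the first record, only the fields that differ from
    the previously kept record are stored.
    Assumes every transaction has the same set of keys.
    """
    seen = set()
    unique = []
    for tx in transactions:
        key = json.dumps(tx, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(tx)

    encoded = []
    previous = None
    for tx in unique:
        if previous is None:
            encoded.append(dict(tx))  # first record is stored in full
        else:
            encoded.append({k: v for k, v in tx.items() if previous[k] != v})
        previous = tx
    return zlib.compress(json.dumps(encoded).encode("utf-8"))


def split_batch(payload):
    """Invert merge_batch: decompress and replay the deltas in order."""
    encoded = json.loads(zlib.decompress(payload).decode("utf-8"))
    restored = []
    state = {}
    for delta in encoded:
        state = {**state, **delta}
        restored.append(state)
    return restored


if __name__ == "__main__":
    batch = [
        {"vehicle": "EV-1", "round": 7, "hash": "aa"},
        {"vehicle": "EV-1", "round": 7, "hash": "aa"},  # duplicate, dropped
        {"vehicle": "EV-1", "round": 7, "hash": "bb"},  # only "hash" changes
    ]
    print(split_batch(merge_batch(batch)))
```

In this sketch the saving comes from two places: repeated submissions are stored once, and consecutive records from the same vehicle or round share most fields, so their deltas are small before compression is even applied.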

This approach seeks to balance the conflicting demands of data security, privacy, and consensus efficiency. Moreover, to effectively utilize collected EV data for power grid operations, it is crucial to be able to predict energy usage. Especially as renewable energy-based generation expands and bidirectional demand response (DR) becomes more active, it is imperative to enhance demand forecasting capabilities that consider not only individual EV charging/discharging characteristics but also their aggregated impact [16]. Furthermore, there is a need to extend the scope of predictions beyond individual prosumers to encompass broader distribution networks such as microgrids. However, this requires the integrated use of data from multiple stakeholders, which is challenging due to privacy issues. A promising alternative is the federated learning (FL) technique [17], which allows multiple clients with distributed data to collaboratively learn a model while keeping the original data private and selectively sharing model parameters. Vertical FL, in particular, enables data owners with different feature spaces to combine features for common samples to train models, effectively facilitating information exchange between electric vehicles and charging stations or aggregating demand data [18]. Yang et al. [19] demonstrated a case where multiple microgrid operators shared demand and generation data under privacy protection through FL, significantly improving prediction performance. Additionally, Lu et al. [20] used an FL model to predict the charging demand for plug-in hybrid electric vehicles (PHEVs) on a cluster basis, reducing peak loads by 23%. Thus, FL emerges as an advanced technology option that captures both privacy protection and enhanced prediction capabilities. However, in the EV environment, where participating vehicles are randomly selected and their data distributions vary, applying basic federated learning algorithms alone may not yield robust prediction models. 
For instance, methods like FedAvg, which simply averages all local models, are vulnerable to outliers and imbalance issues. According to experiments by Yang et al. [21], if the proportion of malicious clients exceeds 20%, the test accuracy of FedAvg can drop by more than 40%, potentially compromising the reliability of predictions and, consequently, the efficiency of EV and power grid operations. In light of these issues, this study proposes the Federated Learning with Qualified Local Model Selection (FL-QLMS) technique as part of a Byzantine-robust FL framework. FL-QLMS uses trusted validation data to pre-assess the performance of individual vehicle models and selectively incorporates local models into the global model update based on the results. The trusted validation dataset used in FL-QLMS is selected by a consortium of stakeholders, including EV manufacturers, charging infrastructure operators, energy providers, and government agencies. This consortium establishes a data governance committee to oversee the selection process of the validation dataset, ensuring fairness and transparency. This ensures the reliability and objectivity of the dataset used in FL-QLMS.

The reliability of the validation dataset is evaluated based on data quality metrics defined in the international standards ISO/IEC 25024:2015 [22] and ISO 20762:2018 [23], which specifically deal with EV driving data. The dataset is comprehensively judged in terms of (1) completeness, (2) accuracy, (3) consistency, (4) timeliness, and (5) accessibility, as well as EV-specific criteria such as (6) energy consumption data under various driving conditions, (7) driving range data by battery charge status, and (8) data analyzing the impact of external environmental conditions. The evaluation methods for each metric utilize the measurement procedures and analysis techniques provided in the respective standards. Only datasets that satisfy all these criteria are selected as validation datasets for FL-QLMS.

If no dataset completely satisfies the data quality criteria, the following alternatives are sequentially considered. First, the criteria are partially relaxed to utilize the next-best dataset. In this case, the relaxed criteria and the limitations of the resulting dataset are clearly documented to maintain transparency. If securing even the next-best dataset is difficult, collecting additional data from authorized external data sources (government agencies, research institutions, etc.) is considered. If it is still challenging to compose an appropriate validation dataset, the application of the FL-QLMS technique itself is suspended, and alternative approaches are explored.

As EV technologies evolve and usage environments change, the validation dataset is periodically updated and managed. The data governance committee evaluates the validity of the dataset quarterly and takes measures such as collecting new data or filtering existing data when necessary. In this process, changes in the performance of the FL-QLMS model are monitored to analyze the impact of dataset updates on model performance. In addition, regular feedback is collected from each participant in FL-QLMS to gather ideas for dataset quality improvement and expansion. Through this continuous update and management, the FL-QLMS technique can produce valid and reliable results in the long term.

This approach minimizes the negative impact of outlier data and dishonest participation while effectively addressing non-standard data and non-independent and identically distributed (non-IID) issues. The robust model aggregation (RMA) technique recently proposed by Shejwalkar and Houmansadr [24] is similar to our approach but differs in that it verifies model loss trends in real-time across continuous rounds. Building on this background, this paper proposes an intelligent power grid management system based on federated learning that securely stores and manages EV black box data on the blockchain and uses the operational information of multiple vehicles to predict energy usage. The proposed system is based on data sovereignty and privacy protection, securing interoperability between EVs and the power grid while promoting rational resource management and efficiency improvements through integrated data analysis. This is expected to be a key foundation supporting the stable acceptance and sustainable utilization of EVs in the era of energy transition.

Most importantly, the proposed system in this study offers significant implications for realizing the potential value of EV operational data. According to an analysis by KPMG [25], the market for connected car data-based services is expected to grow to approximately USD 750 billion by 2030. However, data silos and privacy issues have been obstacles to realizing this potential. Thus, the combination of a decentralized data management system based on blockchain and privacy-protective analysis techniques could accelerate data-driven innovation across the mobility industry, not just in the energy sector.

For example, the battery performance history of individual vehicles accumulated on the blockchain could be used as an objective reference for pricing in used car transactions, contributing to the formation of a transparent and fair market. Additionally, driving and accident risk prediction models learned through FL could be utilized in developing personalized insurance products. Establishing a technology infrastructure that ensures data sovereignty while enabling value sharing could serve as a foundation for driving a virtuous cycle in the EV market.

The structure of this paper is as follows. Section 2 details the design of the blockchain management system for electric vehicle black box data and the FL-QLMS algorithm for predicting energy usage. Section 3 presents the experimental environment and key analysis results for evaluating the performance of the proposed system. Finally, Section 4 discusses practical implementation strategies and future research directions for the proposed system.

At this turning point where electric vehicles are becoming the mainstream for eco-friendly mobility and sustainable energy ecosystems, securing a foundation for safe and efficient data utilization is more crucial than ever. This paper presents an integrated framework encompassing distributed data management, federated learning, and multi-dimensional optimization, offering a new approach to fostering value exchange and co-evolution between electric vehicles and the power grid. It is hoped that the ideas proposed will serve as a catalyst for completing the virtuous cycle of EV proliferation.

1.2. Existing Research and Limitations

Current research has explored various methods for blockchain-based data management and electric vehicle (EV) charging demand predictions. However, these approaches often face challenges in terms of data privacy, efficiency, and robustness.

1.2.1. Blockchain-Based Data Management

Several studies have proposed combining blockchain with off-chain storage or managing data access rights through smart contracts to enhance data security and privacy. For instance, Dorri et al. improved processing performance using a lightweight consensus protocol and allowed only authorized nodes to participate, thereby enhancing confidentiality with asymmetric encryption. However, these methods still face issues related to storing original data directly on the blockchain and involving all nodes in the transaction verification process, which raises concerns about privacy and efficiency.

In response, this study proposes a partially decentralized blockchain framework for the secure and efficient management of EV black box data. Each EV encrypts its collected data using RSA-2048 encryption, and only the hash of the ciphertext is recorded on the blockchain, ensuring both confidentiality and integrity. This method significantly reduces bandwidth and storage burdens while enhancing performance through a streamlined consensus algorithm involving only pre-approved nodes.

1.2.2. EV Charging Demand Prediction

Traditional models for predicting EV charging demand include energy consumption models that consider battery characteristics, probabilistic methods based on driving patterns, and spatio-temporal estimation for multi-charging station usage. However, these models typically rely on centralized learning approaches, which pose privacy challenges and are susceptible to data imbalance and non-IID data distributions.

Recent studies have introduced federated learning (FL) to address privacy issues and data imbalance among vehicles. However, existing FL models, such as FedAvg, are vulnerable to unreliable participants, leading to the potential degradation of the global model.

To address these limitations, this study proposes the Federated Learning with Qualified Local Model Selection (FL-QLMS) algorithm. FL-QLMS uses a trusted validation dataset to evaluate the performance of each vehicle’s local model and selectively incorporates only the top-performing models into the global model update. This approach minimizes the impact of outlier data and dishonest participation, ensuring robust prediction performance, even in non-IID environments.
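The selection step can be sketched as follows. This is a simplified illustration rather than the FL-QLMS algorithm as implemented by the authors: it assumes each local model is a plain weight vector of a linear predictor, scores it by mean squared error on the trusted validation set, and averages only the best-scoring fraction. The function names and the `keep_ratio` parameter are hypothetical; this section does not specify the actual scoring metric or selection threshold.

```python
def mse(weights, validation):
    """Mean squared error of the linear model y = w . x on (x, y) pairs."""
    total = 0.0
    for x, y in validation:
        prediction = sum(w * xi for w, xi in zip(weights, x))
        total += (prediction - y) ** 2
    return total / len(validation)


def fedavg(local_models):
    """Plain FedAvg-style aggregation: unweighted mean of all local models."""
    dim = len(local_models[0])
    return [sum(w[j] for w in local_models) / len(local_models) for j in range(dim)]


def qualified_average(local_models, validation, keep_ratio=0.5):
    """Score each local model on trusted validation data, keep the
    best-performing fraction (assumed rule), and average only those."""
    ranked = sorted(local_models, key=lambda w: mse(w, validation))
    keep = max(1, int(len(ranked) * keep_ratio))
    return fedavg(ranked[:keep])


if __name__ == "__main__":
    # Trusted validation data generated by y = 2*x0 + 1*x1 (toy example).
    validation = [((1.0, 0.0), 2.0), ((0.0, 1.0), 1.0), ((1.0, 1.0), 3.0)]
    honest = [[2.0, 1.0], [2.1, 0.9], [1.9, 1.1]]
    poisoned = [[-50.0, 40.0]]
    models = honest + poisoned
    print("FedAvg error:   ", mse(fedavg(models), validation))
    print("Qualified error:", mse(qualified_average(models, validation, 0.75), validation))
```

With one poisoned update among four, the plain average is pulled far from the true weights, while the qualified average discards the poisoned model because it scores worst on the validation data.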

1.3. Proposed System Overview and Features

This paper proposes a system composed of three core elements: blockchain-based data management, robust federated learning, and multi-stage prediction and optimization. Initially, black box data are encrypted and stored locally in each EV, with only the hash of the ciphertext and metadata recorded on the blockchain. This blockchain adopts a partially decentralized structure that limits consensus participation to authorized nodes and processes transactions in batches, simultaneously enhancing privacy and efficiency.

The driving data collected in this manner are utilized for robust energy consumption model training through the FL-QLMS algorithm, which protects privacy. Specifically, FL-QLMS uses a reliable validation dataset to assess the performance of each vehicle’s local model and selects participants for the global model update based on this assessment. This minimizes the impact of malicious or low-quality data, ensuring model robustness.

Moreover, the proposed system goes beyond per-vehicle demand prediction by employing spatio-temporal clustering and multi-stage deep prediction to forecast charging demand for specific regions and time periods. This approach considers both local demand variability and mid-to-long-term trends, enabling adaptive charging infrastructure deployment and dynamic pricing policy formulation. It also evaluates the potential of EVs as demand response resources and their utilization for power grid stabilization.

The system architecture proposed here differs from traditional centralized management and prediction systems in several ways. Firstly, by integrating encryption with blockchain technology, it establishes a secure data-sharing foundation while ensuring data sovereignty for vehicle owners. Additionally, the collection of non-standard data and participation of various stakeholders enhance the accuracy and robustness of energy usage predictions [26].

Furthermore, by linking federated learning with a blockchain incentive system, the system motivates EV users to provide data [27], facilitating voluntary data circulation and collaborative decision-making among service providers. This can lead to the development of personalized services such as charging recommendations and energy transactions and ultimately support the harmonious integration of EVs and the power grid in the long term.

Moreover, this study has academic significance as it explores the multifaceted value that blockchain and federated learning can generate in the energy sector, particularly from a data 3A (availability, authorization, audit) perspective, balancing privacy and accountability. It also suggests the potential for expansion into various domains. Most importantly, it proposes the value of this infrastructure as a data foundation for the activation and efficient operation of EVs, a key driver of the energy transition; thus, it is also expected to contribute policy-wise.

In summary, the blockchain–federated learning-based EV data platform presented in this paper proposes a new paradigm to simultaneously address the multifaceted challenges of data accessibility, privacy protection, and energy efficiency enhancement. However, realizing this vision requires the establishment of mutual trust and cooperative relationships among stakeholders, as well as the creation of supporting institutional and technical foundations. Particularly, challenges such as a fair compensation system for federated learning participation, advanced data verification mechanisms, and ensuring interoperability with various service platforms require ongoing research and social consensus.

Nevertheless, this study is significant as it explores innovative methodologies for uncovering and safely and efficiently utilizing the potential value of EV operational data. It is expected to have a ripple effect on the related industry ecosystem as a pioneering attempt to open new horizons in energy data governance. The following sections will detail the design of the proposed framework and empirical analysis to comprehensively illuminate this potential.

2. Materials and Methods

2.1. System Components

The proposed blockchain-based electric vehicle energy management and forecasting system adopts a decentralized architecture consisting of three key components: electric vehicles, aggregators, and blockchain. Each component performs distinct roles, and their interactions enable the overall functionality of the system.

Electric vehicles: Equipped with sensors and black boxes, electric vehicles collect various data while driving. These data are used to predict energy consumption through a deep neural network model (local model). The trained model’s parameters are encrypted and signed for privacy protection before transmission to the aggregator.

Aggregator: The aggregator collects the local model parameters from multiple electric vehicles and uses the FL-QLMS technique to build a robust global model. This global model is then securely uploaded to the blockchain.

Blockchain: The blockchain verifies the integrity of the global model submitted by the aggregator and manages it transparently. Multiple nodes in the blockchain network verify the model through a consensus algorithm. Only approved models are recorded to prevent tampering and forgery. Each electric vehicle is granted access rights to download the latest global model safely.

The interaction process between these components is as follows:

Data collection and local training: Electric vehicles collect data and train a local energy consumption prediction model.
Data transmission: The trained local model parameters are encrypted and sent to the aggregator.
Data aggregation and global model creation: The aggregator collects, selects, and aggregates the local models using the FL-QLMS technique to create a global model.
Blockchain verification and storage: The global model is submitted to the blockchain, where its integrity is validated and it is securely recorded.
Model distribution: The verified global model is made available for download by electric vehicles, which update their local models, thus continuously improving their performance.

Figure 1 illustrates the flow of data between the main components of the system. Electric vehicles collect data, train local models, and transmit the encrypted model parameters to the aggregator. The aggregator applies the FL-QLMS algorithm to the received parameters to create a global model, which is then sent to the blockchain for verification, storage, and distribution. The blockchain ensures the integrity and security of the global model and makes it available to electric vehicles, completing the cycle.

2.2. Black Box Data Management

The proposed system utilizes blockchain technology and encryption methods to securely manage the driving and charging data of electric vehicles. Specifically, it implements a comprehensive security system through the combination of RSA public key encryption for data confidentiality and the SHA-256 hash function for integrity verification.

Application of RSA-2048 encryption: Each electric vehicle $i$ applies the asymmetric encryption algorithm RSA-2048 to the black box data $D_{i,t}$ collected at time $t$, generating the ciphertext $E_{i,t}$, which is then stored in local storage. RSA is one of the most widely used public key encryption techniques in modern cryptography, proposed in 1977 by Rivest, Shamir, and Adleman. The security of RSA is based on the computational complexity of factoring large composite numbers, and its strength is determined by the key length.

In this system, a 2048-bit key length is used for a secure implementation of RSA; the National Institute of Standards and Technology (NIST) considers this key length acceptable through 2030. Because RSA encrypts with the public key and decrypts with the corresponding private key, the encryption process is performed using the public key $K_{i}$ of each electric vehicle's key pair, and can be expressed as follows:

$$E_{i,t} = \text{RSA-2048}(K_{i}, D_{i,t})$$

Utilization of SHA-256 hash function: Storing original data directly on the blockchain could lead to capacity burdens and privacy issues. Therefore, the proposed system adopts a method of recording only the hash value of the ciphertext on the blockchain to prove data integrity. Specifically, each electric vehicle applies the SHA-256 algorithm to the generated ciphertext $E_{i,t}$, calculating a 256-bit hash value $H_{i,t}$, which is then uploaded to the blockchain.

$$H_{i,t} = \text{SHA-256}(E_{i,t})$$

SHA-256 is a cryptographic hash function designed by the National Security Agency (NSA) and published by NIST in 2001, and it is a representative algorithm of the SHA-2 family. A hash function takes an input of arbitrary length and produces a fixed-length, pseudorandom-looking output (digest). It is one-way, meaning that it is computationally infeasible to deduce the input from the output. It is also collision resistant, meaning that it is computationally infeasible to find two different inputs that produce the same output.

With these characteristics, the SHA-256 hash value acts as a fingerprint of the original data. Therefore, by verifying the ciphertext $E_{i,t}$ against the hash value $H_{i,t}$ recorded on the blockchain, the existence and integrity of the original data $D_{i,t}$ can be proven simultaneously. If the data are altered post hoc, the hash value calculated from the changed content will not match the value stored on the blockchain, allowing the immediate detection of tampering or forgery.
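The hashing and verification step can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions: the function names `fingerprint` and `verify` are hypothetical, and the ciphertext is a placeholder byte string rather than real RSA output.

```python
import hashlib
import hmac


def fingerprint(ciphertext: bytes) -> str:
    """Return the SHA-256 digest H_{i,t} of a ciphertext E_{i,t} as a hex string."""
    return hashlib.sha256(ciphertext).hexdigest()


def verify(ciphertext: bytes, recorded_hash: str) -> bool:
    """Recompute the digest and compare it with the value recorded on-chain."""
    return hmac.compare_digest(fingerprint(ciphertext), recorded_hash)


# Placeholder bytes standing in for an RSA ciphertext (assumption, not real data).
e_it = b"\x8f\x12 example ciphertext bytes"
h_it = fingerprint(e_it)          # value that would be uploaded to the blockchain

tampered = e_it[:-1] + b"X"       # change a single byte of the stored ciphertext
print(len(h_it) * 4)              # digest length in bits
print(verify(e_it, h_it), verify(tampered, h_it))
```

Any change to the stored ciphertext, even a single byte, yields a different digest, so the comparison against the recorded value fails.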

Benefits of blockchain-based management: The metadata (ciphertext and hash value) generated through the encryption and hashing processes is managed through a decentralized blockchain network rather than a traditional centralized management system. The blockchain is a distributed ledger technology where multiple nodes collaboratively participate in verifying and recording data, characterized by high availability, integrity, and transparency.

Specifically, the hash values uploaded by each electric vehicle are collected at regular intervals to form a block, and only the blocks agreed upon by all nodes are added to the shared ledger. The integrity of each block is ensured through a linked structure, in which every block includes the hash value of the previous block, together with consensus algorithms such as proof-of-work. Thus, tampering with the hash values contained in a block is virtually impossible without invalidating all subsequent blocks, which in turn protects the integrity of the original data.
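The linked structure can be illustrated with a small hash chain. The sketch below is an assumption-laden toy: the block layout (`index`, `prev_hash`, `hashes`) and the helper names are hypothetical, and consensus (for example, proof-of-work) is omitted entirely.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's canonical JSON encoding with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, vehicle_hashes: list) -> None:
    """Append a block that stores the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "hashes": vehicle_hashes})


def chain_is_valid(chain: list) -> bool:
    """Check that every block still points at the current hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["aa11", "bb22"])   # placeholder digests, not real SHA-256 values
append_block(chain, ["cc33"])
append_block(chain, ["dd44"])

ok_before = chain_is_valid(chain)
chain[0]["hashes"][0] = "ff00"          # tamper with a value in the earliest block
ok_after = chain_is_valid(chain)
print(ok_before, ok_after)
```

In this toy version, a change to any block other than the last breaks the link stored in its successor; in a real network, the consensus mechanism additionally makes rewriting the subsequent blocks prohibitively expensive.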

Furthermore, by utilizing a decentralized blockchain network, the proposed technique secures trust among multiple stakeholders without reliance on a specific institution. It also eliminates centralized management points, resolving single points of failure (SPoF) and enabling permanent and transparent audit trails. The security and transparency of the blockchain provide a foundation for various participants in the electric vehicle ecosystem to confidently share and utilize data.

Adoption of standard cryptographic technologies: The proposed system uses verified standard cryptographic technologies, such as RSA and SHA-256, supporting compliance with related regulations and legal requirements while securing long-term safety and flexibility. RSA is specified by major standardization bodies, such as the IETF and NIST, and SHA-2 is specified in the Federal Information Processing Standards (FIPS) 180 series (introduced in FIPS 180-2), which is widely used globally.

These technologies have been studied extensively over many years. Because RSA is vulnerable to attacks by large-scale quantum computers, the system should also be able to transition to post-quantum cryptography (PQC) schemes. NIST launched its PQC standardization process in 2016, and the schemes evaluated in that process include lattice-based, code-based, and multivariate constructions, which could replace RSA.

The use of these verified and flexible cryptographic primitives in the system is significant for ensuring long-term reliability, especially since data related to electric vehicles can be directly linked to sensitive issues such as personal privacy breaches and technology leaks. Therefore, having a cryptographic foundation that can proactively respond to emerging security threats and regulatory changes is essential.

Conclusion: In summary, the proposed black box data management technique ensures security and transparency across the utilization of electric vehicle data through systematic and sophisticated encryption and integrity verification mechanisms. The RSA public key encryption system secures data confidentiality, while storing SHA-256 hash values on the blockchain enables permanent integrity proof. Particularly, the method of sharing only the ciphertext and hash value effectively balances the conflicting requirements of privacy protection and integrity assurance.

This multi-layered data security system fundamentally resolves the vulnerabilities inherent in traditional centralized management structures and provides a reliable collaboration foundation. Additionally, by utilizing international standard cryptographic primitives, it simultaneously secures regulatory compliance and long-term safety. If new cryptographic technologies like PQC are incorporated in the future, it could further enhance adaptability in the rapidly changing mobility environment.

Ultimately, the black box data management technique proposed in this study can contribute to enhancing the security capabilities of the entire eco-friendly mobility industry, including electric vehicles, as a data infrastructure supporting the transition to intelligent transportation systems. It is also expected to act as a catalyst for data-driven innovation, establishing an optimal information management paradigm that balances privacy, accountability, and availability.

2.3. Multi-Stage Power Consumption Prediction

In this paper, we propose a multi-stage prediction model to more accurately predict the energy consumption of electric vehicles during driving. The proposed model divides the entire driving route into segments at regular time intervals and performs segment-specific predictions reflecting the characteristics of each segment, significantly improving prediction performance compared to a single model.

Specifically, when the driving route of an electric vehicle is represented as a time series of GPS coordinates $(x_{1}, y_{1}), \dots, (x_{n}, y_{n})$, it is divided into $m$ segments $S_{1}, \dots, S_{m}$ based on a time interval $\Delta t$ as follows:

S_{1} = \{(x_{1}, y_{1}), \dots, (x_{k_{1}}, y_{k_{1}})\}, \quad t \in [t_{1}, t_{1} + \Delta t)

S_{2} = \{(x_{k_{1} + 1}, y_{k_{1} + 1}), \dots, (x_{k_{2}}, y_{k_{2}})\}, \quad t \in [t_{2}, t_{2} + \Delta t)

\dots

S_{m} = \{(x_{k_{m - 1} + 1}, y_{k_{m - 1} + 1}), \dots, (x_{n}, y_{n})\}, \quad t \in [t_{m}, t_{n}]

Here, $t_{1}, \dots, t_{m}$ represent the start times of each segment, and $t_{n}$ represents the end time of the drive. For each segment $S_{j}$, a six-dimensional feature vector $F_{j}$ is constructed:

F_{j} = [v_{j}, a_{j}, b_{j}, t_{j}, w_{j}, g_{j}]

where

$v_{j}$: average speed within the segment (km/h);

$a_{j}$: average acceleration within the segment ($m/s^{2}$);

$b_{j}$: battery level at the start of the segment (%);

$t_{j}$: driving time of the segment (s);

$w_{j}$: average weather information within the segment, such as temperature (°C) and wind speed ($m/s$);

$g_{j}$: average terrain information within the segment, such as altitude (m) and slope (rad).

These features are composed of factors that have been identified through prior research as having a significant impact on the energy consumption of electric vehicles. Speed and acceleration are directly linked to driving resistance and regenerative braking, which are key variables influencing energy consumption. Additionally, battery level and driving time serve as measures of energy capacity and consumption duration.
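The segmentation and feature construction described above can be sketched as follows. This is a minimal illustration, assuming time-ordered samples with hypothetical field names; only the first four features are computed, and the weather and terrain terms are omitted because their exact encoding is not specified here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    t: float      # timestamp in seconds
    speed: float  # km/h
    accel: float  # m/s^2
    soc: float    # battery level in %


def split_into_segments(samples, dt):
    """Group time-ordered samples into consecutive windows of dt seconds."""
    if not samples:
        return []
    t0 = samples[0].t
    buckets = {}
    for s in samples:
        buckets.setdefault(int((s.t - t0) // dt), []).append(s)
    return [buckets[k] for k in sorted(buckets)]


def segment_features(segment):
    """Return [v_j, a_j, b_j, t_j]; w_j and g_j are left out of this sketch."""
    return [
        mean(s.speed for s in segment),   # average speed
        mean(s.accel for s in segment),   # average acceleration
        segment[0].soc,                   # battery level at segment start
        segment[-1].t - segment[0].t,     # elapsed time between first and last sample
    ]


# Ten synthetic one-second samples (illustrative values, not measured data).
samples = [Sample(t=float(i), speed=30.0 + i, accel=0.1, soc=80.0 - 0.01 * i)
           for i in range(10)]
segments = split_into_segments(samples, dt=4.0)
features = [segment_features(seg) for seg in segments]
print([len(seg) for seg in segments])
```

The last window is allowed to be shorter than the others, which mirrors the closed final interval in the definition of the last segment.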

The segment-specific prediction models $f_{1}, \dots, f_{m}$, which take the feature vector as input, are independently trained from past driving data. Specifically, given the driving record $D = \{(F_{j}, y_{j}) \mid j = 1, \dots, M\}$, a nonlinear regression model is trained to represent the relationship between the feature vector $F_{j}$ and the actual consumption $y_{j}$, as shown in Equation (1):

f_{j} : F_{j} \mapsto \mathbb{R}, \quad f_{j}(F_{j}) = y_{j} + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^{2}), \quad j = 1, \dots, m

(1)

Here, $\epsilon$ is a normally distributed error term. The trained models $f_{1}, \dots, f_{m}$ are then applied to segment-specific data collected during real-time driving to predict the energy consumption $p_{1}, \dots, p_{m}$ for each segment. The total energy consumption $P$ for the entire route is derived as the sum of the segment predictions, as shown in Equation (2):

P = \sum_{j = 1}^{m} p_{j} = \sum_{j = 1}^{m} f_{j}(F_{j})

(2)
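Equation (2) amounts to evaluating each segment model on its own feature vector and summing the results. The sketch below uses hypothetical linear stand-ins for the trained predictors and treats $w_{j}$ and $g_{j}$ as single scalars; both choices are assumptions made only to keep the example short.

```python
def make_linear(weights, bias):
    """Build a stand-in predictor f_j(F) = w . F + bias (not a trained model)."""
    def predict(features):
        return sum(w * x for w, x in zip(weights, features)) + bias
    return predict


def predict_total(models, feature_vectors):
    """Return (P, [p_1, ..., p_m]) following Equation (2)."""
    if len(models) != len(feature_vectors):
        raise ValueError("one feature vector is required per segment model")
    per_segment = [f(F) for f, F in zip(models, feature_vectors)]
    return sum(per_segment), per_segment


# Illustrative coefficients and features in the order [v, a, b, t, w, g].
models = [
    make_linear([0.01, 0.5, 0.0, 0.001, 0.0, 0.0], bias=0.1),
    make_linear([0.02, 0.4, 0.0, 0.001, 0.0, 0.0], bias=0.0),
]
feature_vectors = [
    [40.0, 0.2, 80.0, 300.0, 15.0, 0.00],
    [60.0, -0.1, 78.0, 300.0, 15.0, 0.01],
]
total, per_segment = predict_total(models, feature_vectors)
print(per_segment, total)
```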

To evaluate the performance of the proposed model, comparative experiments were conducted using actual electric vehicle operation data. The dataset used for evaluation consists of driving records collected from a domestic taxi transportation company from January to December 2019, comprising 3,107,844 driving logs from 1542 vehicles. Each log includes GPS coordinates, speed, acceleration, state of charge (SOC), and energy consumption data recorded every second.

The dataset was divided into training (60%), validation (20%), and evaluation (20%) sets, and a single prediction model was trained on the same dataset for comparison. The single model was a long short-term memory (LSTM) neural network, whose recurrent state retains information from earlier time steps. The same six features as in the multi-stage model were used as inputs, but the entire driving sequence was fed to the model as one sequence rather than segment by segment.

Both models were implemented using the Keras framework and trained in a Tesla V100 GPU environment. The batch size was set to 128, the number of epochs was set to 50, and the Adam optimizer was used. The model structure and hyperparameters were optimized through a grid search on the validation dataset.

The segment length $\Delta t$ for the multi-stage model was set to 5 min, and a three-layer multi-layer perceptron (MLP) was used for each segment predictor. The two hidden layers had 64 and 32 units, respectively, with ReLU activation functions. The single prediction model was a two-layer stacked LSTM with a hidden state size of 128.

Table 1 compares the energy consumption prediction performance of the single model and the multi-stage model based on root mean squared error (RMSE) and mean absolute error (MAE). The multi-stage MLP model achieved a reduction in error of 19.8% in RMSE and 23.6% in MAE compared to the single-stage LSTM. The improvement in performance was particularly noticeable in complex driving cycles that included long-distance driving (Figure 2), which can be attributed to the multi-stage model’s ability to finely reflect the characteristics of each driving segment.
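For reference, the two error metrics and the relative error reduction can be computed as in the following sketch. The numbers are synthetic and chosen only to illustrate the arithmetic; they are not the values reported in Table 1.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, proposed):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline - proposed) / baseline


# Synthetic per-trip energy values in kWh (illustrative only).
y_true = [1.0, 2.0, 3.0, 4.0]
single_stage = [1.5, 2.5, 2.5, 4.5]
multi_stage = [1.2, 2.1, 2.8, 4.3]

rmse_gain = relative_reduction(rmse(y_true, single_stage), rmse(y_true, multi_stage))
mae_gain = relative_reduction(mae(y_true, single_stage), mae(y_true, multi_stage))
print(round(rmse_gain, 1), round(mae_gain, 1))
```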

The proposed multi-stage prediction technique achieves a clear performance improvement over the single prediction model by reflecting the dynamic changes in the driving environment through route segmentation. The effect is particularly pronounced in situations involving complex urban driving or long-distance travel, where non-standard patterns are mixed. A likely explanation is that a model fitted to each segment absorbs segment-specific uncertainty better than one model fitted to the whole route.

Further enhancements in prediction performance could be expected by exploring methods such as hierarchical clustering or attention mechanisms to dynamically learn route patterns and key features. Additionally, enhancing the model’s scalability and generalization ability through transfer learning from cumulative driving data and model ensembles specialized for road types and vehicle models is also a key follow-up task. In the long term, developing a collaborative prediction framework that shares energy information between vehicles in real-time based on V2X communication could also be considered.

Ultimately, the multi-stage energy prediction technique proposed in this study has significance as a core foundational technology for improving energy efficiency in response to the changing data environment due to the acceleration of electric vehicle adoption. It not only optimizes energy for individual drivers but also has potential applications in charging infrastructure design and grid demand management. However, additional research may be necessary for the real-time processing of large-scale data and its lightweight implementation in embedded systems.

2.4. Robust Federated Learning Based on FL-QLMS

The proposed system involves multiple electric vehicles collaborating to learn an energy consumption prediction model using the federated learning (FL) approach [19]. However, conventional FL techniques based on simple model averaging, such as FedAvg [20] and FedProx [21], are vulnerable to Byzantine failures [24], i.e., attacks by some malicious participants. This paper proposes the Federated Learning with Qualified Local Model Selection (FL-QLMS) algorithm, which enables more robust model aggregation by overcoming these limitations.

The core idea of FL-QLMS is to use a trusted central validation dataset to evaluate the performance of local models received from each electric vehicle at the end of every round and selectively incorporate only the top k% of high-quality models into the global model update. Algorithm 1 outlines the pseudocode for FL-QLMS, distinguishing between the roles of the server and the clients.

The Evaluate() function plays a crucial role in FL-QLMS by assessing the performance of individual local models. It takes a local model $W$ and the central validation dataset $D_{val}$ as inputs and returns a performance metric, such as root mean square error (RMSE), indicating the model’s prediction quality. Within the function, a Byzantine node identification logic is implemented to detect and handle potentially malicious nodes based on their model performance. For example, if a local model’s RMSE exceeds a certain threshold, the client that submitted it is suspected to be a Byzantine node, and its impact on the global model can be mitigated through weight adjustment or exclusion.

This selective aggregation process based on the Evaluate() function limits the degradation of global model quality caused by Byzantine failures, which improves the robustness and reliability of the federated learning process for energy consumption prediction.

Figure 3 presents empirical results demonstrating the robustness of FL-QLMS under various types of Byzantine attacks. The proposed algorithm maintains stable performance even in the presence of a significant portion of malicious nodes, validating its effectiveness in ensuring the integrity of the collaborative learning process.

Algorithm 1 FL-QLMS

Server executes:
    Initialize global model $W_{0}$
    Initialize validation dataset $D_{val}$
    for each round $t = 0, 1, 2, \dots$ do
        $S_{t} \leftarrow$ random subset of clients
        for each client $i \in S_{t}$ in parallel do
            $W_{i}^{t + 1} \leftarrow ClientUpdate(i, W_{t})$
        end for
        $W_{t + 1} \leftarrow ServerAggregation(\{W_{i}^{t + 1}\}_{i \in S_{t}}, D_{val})$
    end for

function ClientUpdate($i, W_{t}$)
    $W_{i} \leftarrow W_{t}$
    for each local epoch $e = 1, \dots, E$ do
        for each batch $b \in D_{i}$ do
            $W_{i} \leftarrow W_{i} - η \nabla f(W_{i}; b)$
        end for
    end for
    return $W_{i}$ to server
end function

function ServerAggregation($\{W_{i}^{t + 1}\}_{i \in S_{t}}, D_{val}$)
    for each $W_{i}^{t + 1}, i \in S_{t}$ do
        $v_{i} \leftarrow Evaluate(W_{i}^{t + 1}, D_{val})$
    end for
    $S_{t}^{'} \leftarrow TopK(\{(i, v_{i})\}_{i \in S_{t}})$ // top k% of clients, i.e., those with the lowest RMSE
    return $\sum_{i \in S_{t}^{'}} \frac{| D_{i} |}{\sum_{j \in S_{t}^{'}} | D_{j} |} \cdot W_{i}^{t + 1}$
end function

function Evaluate($W, D_{val}$)
    // W: local model under evaluation
    // $D_{val}$: central validation dataset
    loss $\leftarrow 0$
    for each $(x, y)$ in $D_{val}$ do
        $\hat{y} \leftarrow W(x)$ // predict using model W
        loss $\leftarrow$ loss $+ {(\hat{y} - y)}^{2}$ // accumulate squared error
    end for
    RMSE $\leftarrow \sqrt{loss / | D_{val} |}$ // root mean squared error
    if RMSE > threshold then
        // suspected Byzantine node because of poor performance
        // apply weight adjustment or exclusion
    end if
    return RMSE // model performance metric
end function
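The server-side selection and aggregation step of Algorithm 1 can be sketched in plain Python. This is an illustrative sketch rather than the authors' implementation: local models are reduced to weight lists of a linear predictor, the client identifiers and data sizes are hypothetical, and "top k%" is interpreted as the clients with the lowest validation RMSE.

```python
import math


def evaluate(weights, val_data):
    """RMSE of the linear model y = w . x on the central validation set."""
    squared_error = sum(
        (sum(w * x for w, x in zip(weights, xs)) - y) ** 2 for xs, y in val_data
    )
    return math.sqrt(squared_error / len(val_data))


def server_aggregation(local_models, data_sizes, val_data, top_fraction=0.2):
    """Keep the best-scoring fraction of clients and average them by data size."""
    scores = {cid: evaluate(w, val_data) for cid, w in local_models.items()}
    n_keep = max(1, math.ceil(top_fraction * len(local_models)))
    selected = sorted(scores, key=scores.get)[:n_keep]      # lowest RMSE first
    total = sum(data_sizes[cid] for cid in selected)
    dim = len(next(iter(local_models.values())))
    new_global = [
        sum(data_sizes[cid] / total * local_models[cid][d] for cid in selected)
        for d in range(dim)
    ]
    return new_global, selected


# Validation pairs generated from the "true" weights [2.0, -1.0] (synthetic).
val_data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
local_models = {
    "ev1": [2.0, -1.0],
    "ev2": [2.1, -0.9],
    "ev3": [1.8, -1.0],
    "ev4": [-20.0, 10.0],   # simulated Byzantine update
    "ev5": [2.2, -1.2],
}
data_sizes = {"ev1": 300, "ev2": 100, "ev3": 200, "ev4": 500, "ev5": 150}

new_global, selected = server_aggregation(local_models, data_sizes, val_data,
                                          top_fraction=0.4)
print(selected, new_global)
```

In this toy run the manipulated update from `ev4` has by far the largest validation error, so it is excluded before averaging even though it reports the largest data size.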

Further considerations include enhancing the local dataset quality assessment model and using it for pre-filtering during client selection. Currently, local models are evaluated based solely on performance against the global validation set, but future implementations could incorporate a meta-model that evaluates aspects such as client data distribution similarity, class ratios, and noise levels, allowing for more proactive responses to outlier and imbalance issues.

With the Evaluate() function specified, FL-QLMS has a well-defined criterion for detecting and down-weighting Byzantine updates during federated learning. This makes the proposed collaborative learning model for energy optimization more reliable and practical, and contributes to the realization of a privacy-preserving and efficient energy management system for electric vehicles.

3. Results

3.1. Experimental Design

For the performance evaluation of the proposed system, the Caltech Adaptive Charging Dataset [16] was utilized, which includes driving and charging logs collected from 100 electric vehicles (EVs) operating in California, USA, from January 2018 to June 2019. The dataset comprises logs from various car models, such as the Nissan Leaf, BMW i3, and Chevrolet Bolt, and involves 183 participants (108 males and 75 females) with an average driving experience of 5.2 years. Each log contains 39 fields, including vehicle location (GPS), speed, acceleration, battery level, and energy consumption, totaling 5,124,096 records.

The dataset was divided into training (60%), validation (20%), and testing (20%) sets per vehicle. To enhance the realism of the experiment, a non-independent and identically distributed (non-IID) condition was assumed. In real-world scenarios, electric vehicle data are likely to be biased based on factors such as drivers, regions, and driving times, meaning that the distribution of data collected by each electric vehicle may vary. These non-IID conditions can impact the performance of federated learning, and therefore, they were incorporated into the experiment to validate the effectiveness of the proposed technique.
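A per-vehicle 60/20/20 split keeps each client's data separate, which preserves the vehicle-specific (non-IID) distributions. The following sketch shows one way to do this; the random shuffling, the seed, and the record format are assumptions, since the text does not state whether the split was random or chronological.

```python
import random


def split_per_vehicle(records_by_vehicle, seed=0):
    """Split each vehicle's records into train/val/test at a 60/20/20 ratio."""
    rng = random.Random(seed)
    splits = {}
    for vehicle_id, records in records_by_vehicle.items():
        shuffled = list(records)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * 6 // 10   # integer arithmetic avoids float rounding
        n_val = len(shuffled) * 2 // 10
        splits[vehicle_id] = {
            "train": shuffled[:n_train],
            "val": shuffled[n_train:n_train + n_val],
            "test": shuffled[n_train + n_val:],
        }
    return splits


# Three hypothetical vehicles with ten placeholder records each.
records_by_vehicle = {f"EV-{v}": [(v, k) for k in range(10)] for v in range(3)}
splits = split_per_vehicle(records_by_vehicle)
print({vid: {name: len(part) for name, part in s.items()} for vid, s in splits.items()})
```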

The Caltech Adaptive Charging Dataset contains driving and charging log data collected from various electric vehicle models. Due to differences in battery capacity, efficiency, and other characteristics among these models, there may be variations in energy consumption patterns. This suggests that the distribution of data used for training energy consumption prediction models may differ from vehicle to vehicle. Furthermore, data distribution can also vary depending on drivers’ driving habits, road and traffic conditions in different regions, and driving time periods.

For instance, a group of electric vehicles operating in a specific area is more likely to experience similar road conditions and traffic patterns. As a result, the data distribution of this group may differ from that of electric vehicle groups operating in other regions. Additionally, since each electric vehicle driver has different driving habits, there can be variations in their acceleration and deceleration patterns, the frequency of regenerative braking usage, and more. These factors can influence the distribution of energy consumption data for individual electric vehicles.

In each round of federated learning, 10% of all clients were randomly selected to participate, following a uniform random sampling criterion. The local training was set with a batch size of 256 and five epochs using an SGD optimizer (learning rate = 0.01, momentum = 0.9). A learning rate decay (factor = 0.998, step = 100) was applied to minimize performance disparities among clients.
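The client sampling and learning-rate schedule described above can be written down compactly. The sketch assumes a step-wise exponential decay, in which the rate is multiplied by the factor once every 100 steps; whether a "step" denotes a communication round or a local iteration is not specified, so that interpretation is an assumption.

```python
import random


def sample_clients(client_ids, fraction, rng):
    """Uniformly sample a fraction of clients without replacement."""
    n = max(1, round(fraction * len(client_ids)))
    return rng.sample(client_ids, n)


def decayed_lr(base_lr, factor, step_size, step):
    """Step decay: multiply the base rate by `factor` once every `step_size` steps."""
    return base_lr * factor ** (step // step_size)


rng = random.Random(42)                       # fixed seed for reproducibility
clients = [f"EV-{i:03d}" for i in range(100)]
participants = sample_clients(clients, fraction=0.10, rng=rng)

lr_start = decayed_lr(0.01, 0.998, 100, step=0)
lr_later = decayed_lr(0.01, 0.998, 100, step=250)
print(len(participants), lr_start, lr_later)
```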

The global model employed a three-layer multi-layer perceptron (MLP) with layer sizes of 64, 32, and 16. The ReLU activation function was used, and considering the regression problem, a linear activation was applied in the output layer. The loss function was the mean squared error (MSE), and the performance metrics were RMSE and MAE. The top model selection ratio (k) for FL-QLMS was set at 20%, and all experiments were repeated five times to report average performances.

For the Byzantine robustness evaluation, some randomly selected clients were set as Byzantine nodes, simulating scenarios where these nodes send malicious model updates. The global validation dataset used for identifying Byzantine nodes consisted of 5000 samples drawn at random so that data from normal clients were evenly represented.

For comparative evaluation of the power consumption prediction model, the same dataset of 100 vehicles was used, split into 80%/20% for training/testing. The same hyperparameter settings described earlier were applied for the multi-stage prediction model, while a single model used an LSTM-based sequence-to-sequence structure [28]. The hidden state sizes for both the encoder and decoder were uniformly set to 64, implemented in a two-layer stacked configuration.

The experimental environment included a workstation configured with Ubuntu 18.04 LTS, Intel Xeon Gold 6154 CPU, NVIDIA Tesla V100 GPU, and 512 GB RAM. PyTorch 1.8.1 was used as the deep learning framework for model training, and PySyft 0.2.9 for federated learning simulation [29]. Attack-related experimental code, such as Byzantine behavior injection, was internally implemented, and seed values were fixed for reproducibility.

This experimental setup was designed to verify the effectiveness of the proposed technique in realistic non-IID conditions and Byzantine failure scenarios based on diverse EV data configurations. Client selection, batch configuration, and learning parameter settings followed conventional practices from previous studies [19,20], but Byzantine attack patterns were more finely controlled. Additionally, comparisons between stepwise/single prediction and centralized/federated learning were conducted based on the same dataset and model structure to objectively demonstrate the superiority of the proposed technique.

By detailing the model training conditions, the Byzantine behavior injection method, and the computing resources used, the experimental design is intended to be reproducible by third parties and to support the reliability of the results, which is an improvement over previous studies. More persuasive validation could still be achieved by diversifying the range and intensity of the non-IID and Byzantine settings and by incorporating real-data limitations such as noise and missing values.

Moreover, conducting comparisons between heterogeneous models and sensitivity analyses based on client-scale changes could help gauge the general effectiveness of the proposed technique. If FL-QLMS can demonstrate consistent Byzantine robustness across complex model structures and diverse datasets, it could establish a new paradigm applicable to federated learning in general, beyond the EV mobility sector.

Despite this potential, further research is needed to optimize FL-QLMS for on-device learning, introduce differential compression techniques to improve transmission efficiency, and consider heterogeneous hardware performance for practical application in the industry. Planning demonstration projects with EV-related companies, developing business models around the proposed ideas, and exploring connections with the open-source ecosystem could maximize the impact of the research outcomes.

3.2. Performance Evaluation of Prediction Models

To compare the performance of the proposed multi-stage prediction technique with a single-stage prediction method, experiments were conducted using real electric vehicle (EV) operation data. The dataset was divided into training (60%), validation (20%), and testing (20%) sets per vehicle, with all reported results measured on the testing set. The segment length $\Delta t$ for the multi-stage model was set to 5 min, and a four-layer multi-layer perceptron (MLP) was used for each segment predictor. The MLP consists of an input layer, three hidden layers with 64, 32, and 16 neurons, respectively, and an output layer with a single neuron for energy consumption prediction. The ReLU activation function was used in the hidden layers, and a linear activation was applied in the output layer. The single-stage model employed a bidirectional LSTM with a hidden state size of 64. Both models were trained using the MSE loss function and the Adam optimizer ($lr = 0.001$), with early stopping and 20% dropout applied to prevent overfitting.
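To give a sense of the size of each segment predictor, the number of trainable parameters of the described MLP can be counted directly. The sketch assumes an input dimension of six (one scalar per feature in $F_{j}$) and fully connected layers with bias terms; both are assumptions, since the weather and terrain features may occupy more than one input each.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# input (6, assumed) -> 64 -> 32 -> 16 -> output (1)
layers = [6, 64, 32, 16, 1]
per_layer = [n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:])]
total_params = mlp_parameter_count(layers)
print(per_layer, total_params)
```

Under these assumptions each predictor has only a few thousand parameters, which is small compared with typical recurrent models and relevant for the lightweight deployment discussed in this work.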

The experimental results showed that the multi-stage technique achieved significant performance improvements of 26.3% in RMSE and 23.7% in MAE compared to the single-stage method (p < 0.01). This performance gap is attributed to the structural advantage of the multi-stage model, which can flexibly adapt to changes in driving conditions through route segmentation.

As shown in Figure 4, the single-stage model predicts a uniform energy consumption for the entire driving section, thus showing limitations in capturing local variability. In contrast, the multi-stage model performs independent predictions for each segment and models the temporal dependencies between adjacent segments, allowing for a much finer estimation of dynamic variability due to changes in speed, acceleration, and elevation.

Particularly, as observed in the 140–200 s interval in Figure 4, when rapid changes in speed occur along with corresponding shifts in energy consumption patterns, the accuracy of the single-stage model significantly deteriorates. However, the multi-stage model maintains stable prediction performance even in such abrupt situations, suggesting that segment-based modeling enables more robust predictions against uncertainties.

Furthermore, the multi-stage model also contributes to mitigating long-term dependency issues. As seen in Figure 4, the single-stage model often exhibits a tendency for past prediction errors to propagate to later points, resulting in cumulative errors. On the other hand, the multi-stage model performs independent predictions at the segment level and utilizes a cyclic structure to capture complex patterns inherent in long-term trends, significantly enhancing long-term stability compared to the single-stage model.

Thus, the multi-stage model, by composing the total energy consumption from the sum of individual segment predictions, enables precise forecasting that considers both local variability and long-term trends. Such sophisticated demand prediction information can be directly utilized for driver route planning and battery management strategies and further contribute to decision-making efficiency across the energy ecosystem, such as in charging infrastructure development and demand-based billing systems.

Figure 5 compares the performance of energy prediction models under different learning structures: centralized FL, vanilla FL, and partially decentralized FL. Partially decentralized FL is a hybrid approach that combines the advantages of centralized and decentralized architectures. It selects a subset of nodes to participate in global model updates based on their performance, data quality, or other criteria. The selected nodes’ local models are then aggregated using weighted averaging, where the weights are determined by factors such as the nodes’ data size or reliability. This selective aggregation process helps to mitigate the impact of outliers or malicious nodes, enhancing the robustness and efficiency of the federated learning process.

In this experiment, the FL-based methods used data partitioned by vehicles, and the data distribution for each vehicle was assumed to be not independently and identically distributed (non-IID). Additionally, to simulate malicious participation, 20% of the vehicles were set as Byzantine nodes, which may send incorrect or manipulated updates to disrupt the learning process.

The analysis showed that vanilla FL suffered significant performance degradation in non-IID and Byzantine environments. This can be attributed to its simple model averaging strategy, which treats all local updates equally and thus fails to effectively mitigate the influence of outliers or malicious updates. In contrast, partially decentralized FL achieved comparable performance to the centralized approach, even under Byzantine attacks. By selectively aggregating only the top 20% of local models and assigning weights proportional to the nodes’ data size, partially decentralized FL effectively minimized the impact of low-quality updates. Consequently, it demonstrated around 40% higher test accuracy compared to vanilla FL in non-IID settings.

These results not only highlight the accuracy gains of partially decentralized FL but also demonstrate its scalability and robustness advantages, which become increasingly important as the number of participating nodes grows from thousands to tens of thousands. The selective aggregation mechanism of partially decentralized FL acts as a defense against Byzantine attacks, preventing individual malicious nodes from significantly degrading the global model. Moreover, by reducing the number of local models that need to be communicated and aggregated, partially decentralized FL can substantially improve communication efficiency and reduce overall training time.

However, it is important to note that implementing partially decentralized FL in real-world EV scenarios still faces challenges related to privacy protection, incentive mechanism design, and the establishment of trusted validation datasets. Nonetheless, the experimental results strongly suggest that partially decentralized FL is a promising approach for enabling reliable and efficient federated learning in the presence of non-IID data distributions and potential Byzantine attacks, making it a valuable tool for harnessing the power of decentralized EV data.

In summary, the combination of multi-stage prediction and partially decentralized FL techniques offers a comprehensive solution for enhancing the accuracy, scalability, and robustness of energy consumption forecasting in electric vehicles. By capturing fine-grained spatio-temporal patterns and effectively mitigating the impact of data heterogeneity and Byzantine threats, this approach paves the way for more efficient and reliable energy management strategies at both the individual vehicle and the fleet level. Moreover, the insights gained from this study can potentially inform the development of data-driven optimization frameworks for related applications, such as charging infrastructure planning and grid-load balancing, ultimately contributing to the sustainable integration of EVs into smart energy systems.

However, it is important to acknowledge the limitations of the current experiments, such as the relatively small dataset size and the simplified assumptions regarding non-IID distributions and Byzantine behavior. Future research should aim to validate the proposed methods on larger and more diverse datasets, incorporating realistic factors such as seasonal variations, driver-specific characteristics, and complex attack scenarios. Additionally, the development of adaptive multi-scale prediction models that can seamlessly integrate short-term and long-term forecasting capabilities while accounting for the evolving data landscape and user preferences remains an open challenge.

As the adoption of electric vehicles continues to accelerate under the backdrop of global sustainability initiatives, the ability to effectively leverage the vast amounts of data generated by these vehicles will be crucial for optimizing their performance, minimizing their environmental impact, and unlocking new value streams across the energy sector. The multi-stage prediction and partially decentralized FL framework presented in this study represent a significant step towards realizing this vision, providing a solid foundation for data-driven innovation in the EV domain. Through continued research and collaboration at the intersection of AI, blockchain, and energy systems, we can work towards building a more intelligent, resilient, and sustainable transportation ecosystem for the future.

3.3. Verification of Blockchain Efficiency

This section presents the experimental results to verify the efficiency of the proposed partially decentralized blockchain architecture. First, the transaction throughput, measured in transactions per second (TPS), and the block confirmation delay were analyzed as a function of the number of electric vehicles (EVs). Figure 7 illustrates the TPS changes when the number of EVs increased from 100 to 1000. For comparison, a fully decentralized approach was assumed, where each node generates transactions independently [20].

The results show that, under the fully decentralized approach, TPS decreased significantly as the number of EVs increased. This trend is attributed to network overload caused by every node submitting individual transactions. In contrast, the proposed method maintained a stable TPS regardless of the number of nodes (indicated by the arrow in Figure 7), because its design merges and compresses transactions and thereby avoids unnecessary individual submissions.
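
As an illustration of the idea behind transaction merging and compression, the following minimal sketch packs many per-vehicle records into a single batch with one integrity digest. It is not the implementation used in this study: the record fields, the function names `merge_transactions` and `unpack_batch`, and the choice of JSON with zlib are all assumptions made for the example.

```python
import hashlib
import json
import zlib


def merge_transactions(records):
    """Merge many per-vehicle records into one compressed batch transaction."""
    # Canonical JSON so that identical records always produce identical bytes.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "count": len(records),
        "digest": hashlib.sha256(payload).hexdigest(),  # integrity check for the batch
        "body": zlib.compress(payload),                 # compressed batch payload
    }


def unpack_batch(batch):
    """Decompress a batch and verify its digest before returning the records."""
    payload = zlib.decompress(batch["body"])
    if hashlib.sha256(payload).hexdigest() != batch["digest"]:
        raise ValueError("batch digest mismatch")
    return json.loads(payload)


# Hypothetical records from 100 vehicles in one cluster.
records = [{"vehicle": f"ev-{i:03d}", "energy_kwh": round(0.5 + 0.01 * i, 2)} for i in range(100)]
batch = merge_transactions(records)
print(batch["count"], "records merged into 1 batch transaction")
```

Under these assumptions, the number of submissions grows with the number of clusters rather than with the number of vehicles, which is the effect the stable TPS curve relies on.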

These results suggest that adopting a partially decentralized structure could contribute to solving the scalability issues of blockchain. However, to support actual commercial-level EV infrastructure, network scalability needs to extend to tens of thousands of vehicles, indicating that more sophisticated performance enhancement techniques are necessary.

One potential solution is the integration of blockchain sharding. Sharding divides the blockchain network into multiple partitions (shards), each of which processes its own transactions independently, so that overall throughput can scale with the number of shards [30]. Indeed, next-generation platforms such as Ethereum 2.0 and Polkadot have pursued sharding-based designs as a core scalability strategy [31,32]. Combining the hierarchical structure proposed in this study with sharding could allow each EV cluster to reach consensus and create blocks independently. Applying dynamic sharding during transaction merging could additionally help with load balancing.
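
The routing step that sharding implies can be sketched as follows. This is a simplified, static assignment under assumed names (`shard_for`, `route`, and the `cluster` field are hypothetical); it does not model per-shard consensus, cross-shard transactions, or the dynamic rebalancing mentioned above.

```python
import hashlib
from collections import defaultdict


def shard_for(cluster_id, num_shards):
    """Deterministically map an EV cluster to a shard by hashing its identifier."""
    digest = hashlib.sha256(cluster_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(transactions, num_shards):
    """Group transactions by the shard that owns their cluster."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["cluster"], num_shards)].append(tx)
    return shards


# Hypothetical workload: 50 transactions spread over 10 EV clusters.
txs = [{"cluster": f"cluster-{i % 10}", "seq": i} for i in range(50)]
shards = route(txs, num_shards=4)
for shard_id in sorted(shards):
    print("shard", shard_id, "handles", len(shards[shard_id]), "transactions")
```

Because the mapping depends only on the cluster identifier, all transactions of one cluster land in the same shard, so each shard can order them without consulting the others.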

Another promising approach is off-chain transaction processing. Off-chain methods handle many transactions in parallel outside the blockchain and record only the final results on-chain, which significantly improves throughput and reduces latency [33]. Examples include payment channels such as the Lightning Network and the Raiden Network, child-chain frameworks such as Plasma, and off-chain computation protocols such as TrueBit [34]. In environments like EV networks, where micro-transactions occur frequently, off-chain processing can handle large volumes of P2P transactions separately and synchronize with the main blockchain periodically [35]. More recently, zero-knowledge rollups have been used to compress multiple off-chain transactions and verify them on the main chain, further improving its scalability [36].
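
The core of off-chain processing, recording only final results on-chain, can be illustrated with a simple netting step. The sketch below is a hedged example rather than a payment channel protocol: the party names, the integer token units, and the function `settle` are assumptions, and signatures and dispute handling are omitted.

```python
from collections import defaultdict


def settle(off_chain_payments):
    """Net many off-chain micro-payments into one balance change per party."""
    net = defaultdict(int)
    for payer, payee, amount in off_chain_payments:
        net[payer] -= amount
        net[payee] += amount
    # Only non-zero net positions need to be recorded on-chain.
    return {party: delta for party, delta in net.items() if delta != 0}


# Amounts are integers (hypothetical token sub-units) to avoid rounding drift.
payments = [
    ("ev-1", "station-A", 30),
    ("ev-1", "station-A", 20),
    ("station-A", "ev-2", 15),  # e.g., a credit paid back to a vehicle
    ("ev-2", "station-A", 15),
]
print(settle(payments))  # → {'ev-1': -50, 'station-A': 50}
```

Four off-chain payments collapse into two on-chain balance updates here, and the party whose payments cancel out does not appear on-chain at all.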

Furthermore, a hybrid structure linking on-chain and off-chain technologies is emerging as a promising scalability solution. For instance, using blockchain interoperability protocols like Cosmos or Polkadot, multiple blockchains specialized for individual service areas can be loosely coupled, addressing scalability and sovereignty issues simultaneously [37]. In such a multi-chain structure, the hierarchical model proposed in this paper could be applied to sub-domains, with inter-domain information exchange and collaboration performed through mechanisms like a relay chain [38]. This could enable data linkage and value exchange across the diverse entities of the EV ecosystem, including vehicles, charging stations, and grid operators.

To fully leverage the potential of blockchain in large-scale EV infrastructure, various scalability solutions, such as hierarchical consensus models, sharding, off-chain processing, and multi-chain paradigms, need to be combined [39]. Figure 6 and Figure 8 illustrate the TPS and latency improvements achievable by progressively adopting these techniques. The analysis shows that integrating sharding and off-chain technology into the partially decentralized structure could improve TPS 18-fold and reduce latency to 1/25 of that of the fully decentralized approach. Taking the efficiency of inter-chain transaction processing into account, TPS is projected to exceed 10,000, with block confirmation times of a few seconds [40].

Of course, such a complex design entails increased system complexity and associated implementation and operational overheads. Especially in heterogeneous network environments with various types of nodes and consensus protocols, ensuring trust model consistency and data integrity becomes a major challenge [41]. Additionally, establishing a global governance system that encompasses different blockchain platforms and legacy systems is also a prerequisite [42]. Nonetheless, these challenges are deemed solvable through incremental technological and regulatory innovations, and the potential impact is substantial, making the construction of blockchain-based EV infrastructure a necessary direction for progress.

In summary, the partially decentralized blockchain architecture proposed in this study provides a direction for overcoming the scalability limitations of current EV infrastructure. However, to complement the structural limitations of the proposed model and achieve commercial-level performance and stability, the strategic adoption of advanced distributed computing technologies such as sharding, off-chain, and inter-chain is inevitable. Additionally, establishing interoperability between heterogeneous platforms and clarifying governance systems regarding data sovereignty and accountability are prerequisites for the healthy development of the EV blockchain ecosystem [43]. With these technological and regulatory foundations in place, the proposed architecture is expected to evolve into a data infrastructure encompassing the entire value chain of the electric vehicle industry.

3.4. Evaluation of Robustness in Federated Learning

This section presents experimental results that evaluate the robustness of the proposed FL-QLMS technique. The threat model covers not only random injection and adversarial loss attacks but also more recent attack techniques, namely data poisoning, model poisoning, and adversarial imitation learning, and the response of FL-QLMS to each is analyzed quantitatively.

First, data poisoning is an attack that contaminates the model by injecting maliciously crafted samples into the training dataset [34]. In this experiment, two typical data poisoning methods were implemented: label flipping and backdoor trigger insertion. Label flipping randomly changes the labels of training data [35], whereas backdoor trigger insertion assigns an attacker-specified label to samples containing a specific pattern, causing misclassification whenever that pattern appears later [36].
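
The two poisoning methods can be sketched on a toy dataset as follows. This is an illustrative example, not the attack code used in the experiments: the feature layout, the trigger values, and the function names `flip_labels` and `insert_backdoor` are assumptions.

```python
import random


def flip_labels(labels, ratio, num_classes, rng):
    """Replace a `ratio` fraction of labels with a different, randomly chosen class."""
    poisoned = list(labels)
    k = int(len(labels) * ratio)
    for i in rng.sample(range(len(labels)), k):
        choices = [c for c in range(num_classes) if c != labels[i]]
        poisoned[i] = rng.choice(choices)
    return poisoned


def insert_backdoor(samples, labels, rate, trigger, target_label, rng):
    """Stamp a trigger pattern onto a `rate` fraction of samples and relabel them."""
    xs = [list(x) for x in samples]  # copy so the clean data stays untouched
    ys = list(labels)
    k = int(len(samples) * rate)
    for i in rng.sample(range(len(samples)), k):
        xs[i][-len(trigger):] = trigger  # overwrite the last features with the trigger
        ys[i] = target_label
    return xs, ys


rng = random.Random(0)
samples = [[rng.random() for _ in range(6)] for _ in range(20)]
labels = [rng.randrange(3) for _ in range(20)]

flipped = flip_labels(labels, ratio=0.25, num_classes=3, rng=rng)
bx, by = insert_backdoor(samples, labels, rate=0.1, trigger=[9.0, 9.0], target_label=2, rng=rng)
print(sum(a != b for a, b in zip(labels, flipped)), "labels flipped")  # → 5 labels flipped
```

The `ratio` and `rate` arguments correspond to the label flipping ratio and the backdoor insertion rate that are varied in the experiment.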

Figure 9 illustrates the impact of data poisoning attacks on the test accuracy of FL-QLMS. The graph shows that as the ratio of label flipping and the rate of backdoor trigger insertion increase, the test accuracy of the model decreases. However, FL-QLMS maintains a relatively high accuracy even under severe data poisoning conditions, demonstrating its robustness against such attacks.

Another prominent threat is model poisoning, an attack that disrupts or distorts the learning model itself to degrade the performance of the global model, implemented through random parameter injection or scale manipulation [38]. Moreover, adversarial imitation learning is gaining attention as a sophisticated type of Byzantine attack [39]. It involves observing the models of legitimate clients and generating malicious models that appear legitimate to bypass the selection process. In this experiment, adversarial imitation attacks were implemented using the Generative Adversarial Imitation Learning (GAIL) framework [40].
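
To make the two model poisoning variants concrete, the sketch below applies random parameter injection and scale manipulation to a toy update vector and shows the effect on an unweighted mean. The vectors, the scaling factor, and the function names are hypothetical; the GAIL-based imitation attack is not covered.

```python
import random


def random_injection(update, rng, sigma=1.0):
    """Replace an honest update with Gaussian noise of the same length."""
    return [rng.gauss(0.0, sigma) for _ in update]


def scale_manipulation(update, factor):
    """Multiply an update by a large (possibly negative) factor."""
    return [factor * w for w in update]


def mean_update(updates):
    """Plain coordinate-wise mean, as in unweighted model averaging."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


rng = random.Random(1)
honest = [[0.1, -0.2, 0.05] for _ in range(9)]
noisy = random_injection(honest[0], rng)
attacker = scale_manipulation(honest[0], factor=-50.0)

print(mean_update(honest))               # close to the honest update
print(mean_update(honest + [attacker]))  # dragged away by a single scaled client
```

In this toy setting a single scaled client out of ten is enough to flip the sign of every coordinate of the averaged update, which is why plain averaging is a weak baseline under this threat model.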

Figure 10 compares the test accuracy of FL-QLMS and the conventional FedAvg method under model poisoning and adversarial imitation attacks of varying intensity. The results show that FL-QLMS consistently outperforms FedAvg in test accuracy under both types of attack, especially as the attack intensity increases. This demonstrates the superior robustness of FL-QLMS against model poisoning and adversarial imitation learning attacks.

These experimental results support the claim that the proposed FL-QLMS resists modern Byzantine attacks and data poisoning techniques. In particular, unlike conventional simple model averaging, its mechanism for dynamic selection and weight adjustment based on validation performance acts as an adaptive algorithm that can sustain defensive performance in evolving threat environments.
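
The general idea of validation-based selection and weighting can be illustrated with a small sketch. It should be read as a generic example of the technique, not as the FL-QLMS algorithm itself: the linear model, the fixed `keep` count, the inverse-loss weights, and all names are assumptions made for the illustration.

```python
def validation_loss(weights, validation_set):
    """Mean squared error of a linear model y = w . x on held-out data."""
    total = 0.0
    for x, y in validation_set:
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(validation_set)


def select_and_average(local_models, validation_set, keep):
    """Keep the `keep` best-scoring local models and average them with inverse-loss weights."""
    scored = sorted(
        ((validation_loss(m, validation_set), m) for m in local_models),
        key=lambda pair: pair[0],
    )
    kept = scored[:keep]
    raw = [1.0 / (loss + 1e-8) for loss, _ in kept]  # lower loss, higher weight
    total = sum(raw)
    dim = len(kept[0][1])
    return [sum(r / total * m[j] for r, (_, m) in zip(raw, kept)) for j in range(dim)]


def plain_mean(local_models):
    """Unweighted averaging of all local models, for comparison."""
    n = len(local_models)
    return [sum(col) / n for col in zip(*local_models)]


# Toy validation data generated by the model w = [2.0, -1.0].
validation_set = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
local_models = [
    [2.1, -0.9], [1.9, -1.1], [2.0, -1.2],  # honest clients
    [-20.0, 10.0], [50.0, 50.0],            # Byzantine clients
]
print(plain_mean(local_models))                                  # far from [2.0, -1.0]
print(select_and_average(local_models, validation_set, keep=3))  # close to [2.0, -1.0]
```

The sketch also makes the dependency on the validation set visible: the defence is only as reliable as the held-out data used to score the local models.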

However, the attacks applied in this experiment were relatively standardized scenarios, and the long-term behavior of Byzantine nodes was not analyzed. In practice, attackers may use adaptive strategies, such as delay attacks that behave harmlessly at first but cause critical disruption later, or stealth attacks that gradually accumulate subtle distortions over a long period [41].

For in-depth analysis of such sophisticated threats, it is necessary to observe the emergent behavior of the algorithm over long-term learning, beyond short-term scenario evaluations. Introducing modeling techniques such as multi-agent reinforcement learning or dynamic game theory [42] could be a promising approach to analyze the strategic interactions between FL-QLMS and Byzantine attacks from a long-term perspective. Complementing this with theoretical discussions on the convergence and resilience of federated learning [43] could contribute to establishing defensive policies against various uncertainties inherent in large-scale EV infrastructure environments.

In summary, these experimental results demonstrate that FL-QLMS possesses robust resistance against various forms of Byzantine disruptions that could destabilize data quality and the learning process. However, in-depth analysis of cunning and adaptive attack behaviors, which are expected to be prevalent in actual EV networks, is required. This necessitates research into the equilibrium of games between adversarial agents from a long-term perspective. Additionally, integrating privacy-preserving learning techniques such as differential privacy or homomorphic encryption [44] remains an important subsequent task to fundamentally ensure data integrity and privacy.
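
As a pointer to what integrating differential privacy could involve, the sketch below shows the commonly used clip-then-add-Gaussian-noise step applied to a client update. It is an assumption-laden illustration and not part of the evaluated system: the parameter names are hypothetical, and the resulting privacy guarantee depends on a separate accounting step that is not shown.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(w * w for w in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [w * scale for w in update]


def privatize(update, max_norm, noise_multiplier, rng):
    """Clip an update to max_norm, then add Gaussian noise scaled to that bound."""
    sigma = noise_multiplier * max_norm
    return [w + rng.gauss(0.0, sigma) for w in clip(update, max_norm)]


rng = random.Random(42)
update = [3.0, 4.0]                   # L2 norm 5.0
print(clip(update, max_norm=1.0))     # rescaled to L2 norm 1.0
print(privatize(update, max_norm=1.0, noise_multiplier=0.5, rng=rng))
```

Clipping bounds the influence of any single client, which also limits scale manipulation, while the added noise trades some accuracy for privacy.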

Ultimately, FL-QLMS aims to implement collaborative model learning that is robust against data deception without compromising the confidentiality and anonymity of EV operational data. It is hoped that through advanced research encompassing dynamic game theory and privacy protection technologies, we can move closer to this goal, contributing to the creation of a safe and sustainable EV data utilization ecosystem.

4. Discussion and Conclusions

4.1. Significance of the Research

This study presents a new paradigm for the management and utilization of electric vehicle (EV) operational data by proposing an integrated blockchain–federated learning framework. Specifically, by combining a partially decentralized blockchain structure with a Byzantine-robust FL algorithm, this research addresses the privacy violations and single points of failure inherent in traditional centralized management systems while also improving model performance and robustness.

From the perspective of black-box data management, the proposed blockchain structure ensures both data confidentiality and integrity through the use of RSA encryption and SHA-256 hash chains in a zero-knowledge proof framework [1]. Moreover, by optimizing transaction merging and limiting consensus participation, it enhances processing performance and effectively mitigates communication and computational costs proportional to the number of consensus nodes. Notably, the management of data ownership and anonymized tokens on the blockchain ensures user control over personal information and lays the foundation for activating the data economy [2].
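
The integrity property provided by a SHA-256 hash chain can be demonstrated with a short sketch. The record fields and the function names below are hypothetical, and the example covers only the hash chain; RSA encryption and the zero-knowledge proof framework are outside its scope.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(chain, record):
    """Append a record, linking it to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev_hash, "hash": entry_hash(prev_hash, record)})


def verify(chain):
    """Recompute every link; any edited record or broken link fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
for t, speed in enumerate([42.0, 43.5, 41.0]):
    append(chain, {"t": t, "speed_kmh": speed})
print(verify(chain))  # → True

chain[1]["record"]["speed_kmh"] = 80.0  # tamper with a stored record
print(verify(chain))  # → False
```

Because each hash covers the previous one, altering any stored record invalidates that entry and every entry after it unless the whole tail is recomputed.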

In terms of developing energy consumption prediction models, the proposed framework also innovates in accuracy, scalability, and robustness. The FL-QLMS algorithm evaluates local model performance using external validation data and performs model merging through dynamic selection and weighted averaging, considering Byzantine robustness, thus overcoming the limitations of existing methods like FedAvg. The experimental results show that FL-QLMS maintains over 90% test accuracy, even with nearly half of the nodes being Byzantine [4]. Additionally, by employing multi-stage prediction and hierarchical modeling, it captures short- and long-term energy usage patterns, achieving more than 26% improvement in RMSE and 23% in MAE compared to conventional single models [5].

The integration of these technologies in the proposed system not only alleviates the technical limitations of existing EV data management infrastructures but also creates a reliable information-sharing foundation for optimizing energy consumption and stabilizing power grids. Most importantly, by implementing the principles of data 3A (availability, accountability, auditability) [6], it facilitates collaborative decision-making among various stakeholders within the EV ecosystem and fosters a virtuous cycle of data-driven innovation. This is expected to contribute to enhancing battery grid efficiency, expanding the integration of renewable energy, and ultimately addressing the societal challenge of achieving carbon neutrality.

From an industrial perspective, this research is significant as it provides a foundation for exploring energy-mobility convergence services. With the automation of the entire data collection-learning-application cycle enabled by the proposed framework, various business models, such as optimal placement of charging infrastructure, power grid demand forecasting, and dynamic pricing design, can be created [7]. Particularly, if the energy usage model trained via FL is implemented on blockchain-based smart contracts, it could significantly contribute to the promotion of EV adoption and the development of the charging industry.

Furthermore, the design philosophy and core technologies of the proposed system are expected to have implications for adjacent fields such as vehicle-to-grid (V2G), autonomous driving, and smart cities. For example, the privacy-protecting data sharing protocol and incentive mechanism established in this study can be directly applied to the collection and utilization of autonomous vehicle operation data. Additionally, by evolving into a multi-agent reinforcement learning framework for integrated traffic-energy optimization, the proposed system can transcend the value of EV infrastructure to become a key driver for data-based smart city operations [8].

In summary, this study is academically significant as it presents a new paradigm in the design of data infrastructures for the EV ecosystem and establishes a practical methodology for energy-mobility convergence. Particularly, by combining blockchain and federated learning, it explores the harmony between conflicting values such as data sovereignty and privacy versus prediction performance and robustness, serving as a pioneering example for technology transfer to various application areas. Moreover, by presenting a long-term roadmap for the sophistication of EV infrastructure based on large-scale empirical data analysis, it also holds substantial industrial and policy value. Overall, this research provides a blueprint for a data platform that forms the basis for safe and intelligent energy operations in the era of the EV revolution, aligning with societal contributions towards addressing the climate crisis and sustainable development.

4.2. Limitations and Future Research Directions

This research is significant academically and practically because it proposes an integrated blockchain–federated learning framework for the utilization of EV data and demonstrates its effectiveness. However, several limitations remain, and this section outlines future research directions to overcome them and enhance the completeness of the research.

Firstly, large-scale empirical experiments are essential to verify the actual performance and safety of the proposed system. This research was limited to proof-of-concept simulations; hence, model training and blockchain verification using long-term operational data covering various driving scenarios and network conditions are required [1]. In particular, ensuring robustness against noise factors specific to real data, such as inconsistent data quality and collection intervals and GPS dead zones, is a major challenge. This calls for an automated data preprocessing pipeline, improved data quality through techniques like semi-supervised learning and transfer learning [2,3], and the development of blockchain stress tests and data integrity verification systems.

Secondly, to enhance the social acceptability of the proposed system, building trust among participants and designing fair incentives are crucial. Especially, establishing institutional foundations to dispel privacy concerns associated with data provision and to encourage active participation is urgent [5]. This involves integrating research outputs into regulatory frameworks through standardization and regulatory sandboxes, protecting personal information using anonymized tokens and differential privacy techniques [6], and designing dynamic incentive mechanisms linked to the quantity and quality of data provided [7]. Additionally, policy efforts to improve EV users’ perceptions and promote ecosystem participation through technology education and acceptability assessments are required.

Thirdly, exploring the potential for broad application across the entire EV value chain is necessary. As the importance of data-driven decision-making is increasing in various areas of the EV industry ecosystem, such as V2G services, battery reuse, and the optimization of charging infrastructure operations [8], establishing data integration and service linkage with these sectors is a key research task. This could involve incorporating V2G participation history and battery performance data into the training dataset or applying multi-agent reinforcement learning [9] for collaborative decision-making across sectors. In the long term, exploring the potential for development into an energy data marketplace encompassing electric, telecommunications, and financial services is also necessary [10].

Fourthly, continuous research to overcome the technical limitations of blockchain and federated learning themselves is required. Considering the constraints of current blockchain platforms in terms of processing performance, scalability, and data confidentiality, performance improvements through integration with cutting-edge cryptographic technologies are necessary. Utilizing zero-knowledge proofs and multi-party computation [11], enhancing privacy-preserving federated learning combined with homomorphic encryption [12], and the strategic application of sharding and off-chain processing are promising approaches. Additionally, exploring lightweight security measures such as selective encryption to balance data integrity and model performance is necessary.

Lastly, sophisticated strategies to ensure the Byzantine robustness of federated learning in the long term must be continuously researched. Developing detection and defense mechanisms against cunning adversarial behaviors, such as delay attacks or stealth attacks, is one such example [13]. This requires analyzing emergent behaviors in long-term learning processes, modeling the interactions between attacks and defenses using dynamic game theory, and combining advanced methodologies such as multi-party computation-based privacy-preserving outsourcing [14]. Additionally, ensuring the reliability of the validation datasets used for evaluating the performance of the global model, possibly through the participation of external agencies or blockchain-based data verification systems, is also crucial.

In summary, these limitations and research directions indicate that both technological advances and acceptance in industrial settings are crucial for the practical application of the proposed framework. Phased empirical validation through academia–industry collaboration and legislative adjustments that lay the groundwork for the data economy are therefore required [15]. In this process, establishing a governance system that coordinates and fosters cooperation among stakeholders in the energy sector will also be a key task.

Despite these challenges, the vision of blockchain–federated learning integration and the data utilization framework presented in this research provide a foundation for a paradigm shift across the energy system that extends beyond the EV industry, and they carry both academic and practical significance. It is hoped that the issues raised and the future research directions will serve as a stepping stone toward greater sustainability in the energy and transportation sectors and toward carbon neutrality goals. In particular, the research outcomes are expected to serve as a bridge that increases societal acceptance of energy data utilization and contributes to an open innovation ecosystem.

Author Contributions

Conceptualization, J.-H.P. and I.-W.J.; methodology, J.-H.P.; software, J.-H.P.; validation, J.-H.P. and I.-W.J.; formal analysis, J.-H.P.; data curation, J.-H.P.; writing—original draft preparation, J.-H.P.; writing—review and editing, J.-H.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. International Energy Agency (IEA). Global EV Outlook 2024: Moving towards Increased Affordability. Available online: https://www.iea.org/reports/global-ev-outlook-2024 (accessed on 15 June 2024).
2. Clean Energy Ministerial. EV30@30: A Campaign Launched under the Electric Vehicle Initiative. Available online: http://www.cleanenergyministerial.org/campaign-clean-energy-ministerial/ev3030-campaign (accessed on 15 June 2024).
3. Rajakaruna, S.; Shahnia, F.; Ghosh, A. Plug in Electric Vehicles in Smart Grids: Integration Techniques; Springer: Berlin/Heidelberg, Germany, 2015.
4. Zhang, K.; Xu, L.; Ouyang, M.; Wang, H.; Lu, L.; Li, J.; Li, Z. Optimal decentralized valley-filling charging strategy for electric vehicles. Energy Convers. Manag. 2014, 78, 537–550.
5. Weiller, C.; Neely, A. Using electric vehicles for energy services: Industry perspectives. Energy 2014, 77, 194–200.
6. Frendo, O.; Graf, J.; Gaertner, N.; Stuckenschmidt, H. Data-driven smart charging for heterogeneous electric vehicle fleets. Energy AI 2020, 1, 100007.
7. Frendo, O.; Gaertner, N.; Stuckenschmidt, H. Improving smart charging prioritization by predicting electric vehicle departure time. IEEE Trans. Intell. Transp. Syst. 2019, 21, 4392–4403.
8. Shafie-Khah, M.; Siano, P.; Fitiwi, D.Z.; Mahmoudi, N.; Catalão, J.P. An innovative two-level model for electric vehicle parking lots in distribution systems with renewable energy. IEEE Trans. Smart Grid 2018, 9, 1506–1520.
9. Gupta, N.; Lamba, H.; Kumaraguru, P.; Joshi, A. Faking sandy: Characterizing and identifying fake images on twitter during hurricane sandy. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 729–736.
10. Bhagat, S.; Rani, V. Security concerns related to electric vehicles: Challenges and solutions. Arch. Comput. Methods Eng. 2021, 1, 1–21.
11. Beck, R.; Stenum Czepluch, J.; Lollike, N.; Malone, S. Blockchain—The Gateway to Trust-Free Cryptographic Transactions. In Proceedings of the European Conference on Information Systems (ECIS), Atlanta, GA, USA, 12–15 June 2016; p. 153.
12. NIST. Recommendation for Key Management: Part 1: General (Revision 3). Available online: https://csrc.nist.gov/publications/detail/sp/800-57-part-1/rev-3/final (accessed on 10 March 2021).
13. Mingxiao, D.; Xiaofeng, M.; Zhe, Z.; Xiangwei, W.; Qijun, C. A review on consensus algorithm of blockchain. In Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, 5–8 October 2017; pp. 2567–2572.
14. Sedlmeir, J.; Buhl, H.U.; Fridgen, G.; Keller, R. The energy consumption of blockchain technology: Beyond myth. Bus. Inf. Syst. Eng. 2020, 62, 599–608.
15. Baza, M.; Nabil, M.; Ismail, M.; Mahmoud, M.; Serpedin, E.; Rahman, M. Blockchain-based charging coordination mechanism for smart grid energy storage units. In Proceedings of the 2019 IEEE International Conference on Blockchain (Blockchain), Atlanta, GA, USA, 14–17 July 2019; pp. 504–509.
16. Li, Z.; Bahramirad, S.; Paaso, A.; Yan, M.; Shahidehpour, M. Blockchain for decentralized transactive energy management system in networked microgrids. Electr. J. 2019, 32, 58–72.
17. Aitzhan, N.Z.; Svetinovic, D. Security and privacy in decentralized energy trading through multi-signatures, blockchain and anonymous messaging streams. IEEE Trans. Dependable Secur. Comput. 2016, 15, 840–852.
18. Dorri, A.; Kanhere, S.S.; Jurdak, R.; Gauravaram, P. Blockchain for IoT security and privacy: The case study of a smart home. In Proceedings of the 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Kona, HI, USA, 13–17 March 2017; pp. 618–623.
19. Yang, Z.; Yang, K.; Lei, L.; Zheng, K.; Leung, V.C. Blockchain-based decentralized trust management in vehicular networks. IEEE Internet Things J. 2018, 6, 1495–1505.
20. Lu, Y.; Wu, X.; Li, J.; Xu, D. A Peer-to-Peer Energy Trading System for Electric Vehicles Based on Consortium Blockchain. IEEE Access 2021, 9, 128657–128668.
21. Yang, T.; Guo, Q.; Tai, X.; Sun, H.; Zhang, B.; Zhao, W.; Lin, C. Applying blockchain technology to decentralized operation in future energy internet. In Proceedings of the 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 26–28 November 2017; pp. 1–5.
22. ISO/IEC 25024:2015; Systems and Software Engineering—Systems and Software Quality Requirements and Evaluation (SQuaRE)—Measurement of Data Quality. ISO: Geneva, Switzerland, 2015.
23. ISO 20762:2018; Electrically Propelled Road Vehicles—Determination of Power for Propulsion of Hybrid Electric Vehicle. ISO: Geneva, Switzerland, 2018.
24. Lamport, L.; Shostak, R.; Pease, M. The Byzantine generals problem. In Concurrency: The Works of Leslie Lamport; Springer: Berlin/Heidelberg, Germany, 2019; pp. 203–226.
25. Wang, S.; Taha, A.F.; Wang, J.; Kvaternik, K.; Hahn, A. Energy crowdsourcing and peer-to-peer energy trading in blockchain-enabled smart grids. IEEE Trans. Syst. Man Cybern. Syst. 2019, 49, 1612–1623.
26. Li, X.; Huang, X.; Li, C.; Yu, R.; Shu, L. EdgeCare: Leveraging edge computing for collaborative data management in mobile healthcare systems. IEEE Access 2019, 7, 22011–22025.
27. Li, Y.; Zhang, Y.; Huang, K.; Xie, H.; Cai, Z. BC-FL: A byzantine consensus protocol using committee mechanism for federated learning. Appl. Sci. 2020, 10, 4193.
28. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to sequence learning with neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27.
29. Ryffel, T.; Trask, A.; Dahl, M.; Wagner, B.; Mancuso, J.; Rueckert, D.; Passerat-Palmbach, J. A generic framework for privacy preserving deep learning. arXiv 2018, arXiv:1811.04017.
30. Zamani, M.; Movahedi, M.; Raykova, M. Rapidchain: A fast blockchain protocol via full sharding. In Proceedings of the CCS ’18: 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; Volume 460.
31. Sheng-Nan, W.; Xing-Yu, G.; Zhi-Hui, Z.; Er-Hui, C. A Dynamic Committee Consensus Mechanism Based on Raft Algorithm in Permissioned Blockchain. In Proceedings of the 2021 International Conference on Computer Communications and Networks (ICCCN), Virtual, 19–22 July 2021; pp. 1–8.
32. Dang, H.; Dinh, T.T.A.; Loghin, D.; Chang, E.C.; Lin, Q.; Ooi, B.C. Towards scaling blockchain systems via sharding. In Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2019; pp. 123–140.
33. Khalil, R.; Gervais, A.; Felley, G. TEX—A securely scalable trustless exchange. IACR Cryptol. ePrint Arch. 2019, 2019, 265.
34. Biggio, B.; Nelson, B.; Laskov, P. Poisoning attacks against support vector machines. arXiv 2012, arXiv:1206.6389.
35. Chen, S.; Xue, M.; Fan, L.; Hao, S.; Xu, L.; Zhu, H.; Li, B. Automated poisoning attacks and defenses in malware detection systems: An adversarial machine learning approach. Comput. Secur. 2018, 73, 326–344.
36. Gu, T.; Liu, K.; Dolan-Gavitt, B.; Garg, S. BadNets: Evaluating Backdooring Attacks on Deep Neural Networks. IEEE Access 2019, 7, 47230–47244.
37. Fung, C.; Yoon, C.J.; Beschastnikh, I. Mitigating sybils in federated learning poisoning. arXiv 2018, arXiv:1808.04866.
38. Bhagoji, A.N.; Chakraborty, S.; Mittal, P.; Calo, S. Analyzing federated learning through an adversarial lens. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 634–643.
39. Zhang, J.; Chen, B.; Cheng, X.; Cai, H.N.; Cheng, X.; Ma, T. Poisongan: Generative poisoning attacks against federated learning in edge computing systems. IEEE Internet Things J. 2021, 8, 3310–3322.
40. Ho, J.; Ermon, S. Generative adversarial imitation learning. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–8 December 2016; Volume 29.
41. Jin, S.; Fang, B.; Zhang, X.; Liu, Z. Reliability and security issues in deep learning models: A survey. IEEE Access 2021, 9, 101625–101647.
42. Huang, J.; Qian, S.; Popa, R.A.; Ren, Q. Multiparty Private Set Intersection with Quasi-Linear Complexity. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, 17–21 October 2021; pp. 666–696.
43. Blanchard, P.; El Mhamdi, E.M.; Guerraoui, R.; Stainer, J. Machine learning with adversaries: Byzantine tolerant gradient descent. In Proceedings of the Advances in Neural Information Processing Systems: 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
44. Truex, S.; Baracaldo, N.; Anwar, A.; Steinke, T.; Ludwig, H.; Zhang, R.; Zhou, Y. A hybrid approach to privacy-preserving federated learning. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, London, UK, 15 November 2019; pp. 1–11.

Figure 1. System components for electric vehicle data management.

Figure 2. The improvement in performance per driving cycle for the multi-stage model compared to the single model.

Figure 3. Robustness verification results of FL-QLMS against various Byzantine attacks.

Figure 4. Comparison of prediction results between single-stage and multi-stage models (sample driving sequence).

Figure 5. Comparison of test accuracy between centralized FL, vanilla FL, and partially decentralized FL methods.

Figure 6. Comparison of TPS improvement effects with scalability solutions.

Figure 7. TPS changes according to the number of electric vehicles.

Figure 8. Comparison of latency improvement effects with scalability solutions.

Figure 9. Changes in test accuracy according to label flipping ratio and backdoor insertion rate.

Figure 10. Test accuracy of algorithms against model poisoning and adversarial imitation learning attacks.

Table 1. Comparison of energy consumption prediction performance.

Method	RMSE (kWh)	MAE (kWh)
Single-stage LSTM	3.901 ± 0.285	2.426 ± 0.194
Multi-stage MLP	3.127 ± 0.174	1.853 ± 0.132
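
To make the comparison in Table 1 concrete, the following minimal sketch shows how RMSE and MAE are conventionally computed and how the relative error reduction between the two rows can be derived from the reported mean values. The function names (`rmse`, `mae`, `relative_reduction`) and the `TABLE_1_MEANS` dictionary are illustrative assumptions, not part of the authors' implementation; only the four mean values are taken from Table 1, and the ± terms are ignored here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` lowers the error of `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Mean values copied from Table 1 (kWh); the dictionary layout is hypothetical.
TABLE_1_MEANS = {
    "Single-stage LSTM": {"rmse": 3.901, "mae": 2.426},
    "Multi-stage MLP": {"rmse": 3.127, "mae": 1.853},
}

single = TABLE_1_MEANS["Single-stage LSTM"]
multi = TABLE_1_MEANS["Multi-stage MLP"]

rmse_gain = relative_reduction(single["rmse"], multi["rmse"])
mae_gain = relative_reduction(single["mae"], multi["mae"])

print(f"RMSE reduction: {rmse_gain:.1f}%")
print(f"MAE reduction: {mae_gain:.1f}%")
```

Under these assumptions, the multi-stage model's mean RMSE is roughly 19.8% lower and its mean MAE roughly 23.6% lower than the single-stage model's. These percentages are derived arithmetic on the table's means and are not figures stated in the article.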

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
