CN111967667B - Rail transit distributed operation and maintenance method and system

Info

Publication number: CN111967667B
Application number: CN202010827008.0A
Authority: CN
Inventors: 付哲; 肖骁; 刘超
Original assignee: Traffic Control Technology TCT Co Ltd
Current assignee: Traffic Control Technology TCT Co Ltd
Priority date: 2020-08-17
Filing date: 2020-08-17
Publication date: 2024-03-01
Anticipated expiration: 2040-08-17
Also published as: CN111967667A

Abstract

An embodiment of the invention provides a distributed operation and maintenance method and system for rail transit. The method comprises: receiving original equipment data by a plurality of data schedulers, obtaining a processed data set, and performing data drift detection and data distribution on the processed data set to obtain corrected equipment data; dividing a rail transit line into a plurality of centralized station processors, each covering at least 1 and at most 4 actual stations, the centralized station processors receiving the corrected equipment data and running a stand-alone algorithm and model tuning to obtain optimized model training parameters; and collecting and integrating the optimized model training parameters by a model aggregator, which sends the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation. The embodiment thereby constructs a distributed machine learning architecture for the rail transit system: the centralized stations split the line data set, which reduces the workload of the model aggregator and avoids single points of failure, and data drift detection is used for algorithm optimization and system upgrading.

Description

Rail transit distributed operation and maintenance method and system

Technical Field

The invention relates to the technical field of rail transit operation and maintenance, in particular to a rail transit distributed operation and maintenance method and system.

Background

In intelligent operation and maintenance scenarios, for example in a rail transit system, the following problems are commonly encountered: (1) a system with a centralized structure requires a very large amount of transmission cabling, the availability of the communication system suffers, and a single point of failure can make the whole system unavailable; (2) when the line is very long, the total data volume to be processed is huge, the central processing unit is heavily loaded, and the computation takes too long.

Therefore, a new intelligent operation and maintenance method is needed to solve the above problems.

Disclosure of Invention

The embodiment of the invention provides a distributed operation and maintenance method and system for rail transit, which are used for solving the defects in the prior art.

In a first aspect, an embodiment of the present invention provides a distributed operation and maintenance method for rail transit, including:

receiving original equipment data in a preset centralized jurisdiction area by a plurality of data schedulers, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data;

dividing a rail transit line into a plurality of centralized station processors, wherein each centralized station processor comprises a plurality of actual stations, each centralized station processor at least comprises 1 actual station and at most comprises 4 actual stations, the plurality of centralized station processors receive the correction equipment data, and a single machine algorithm and model tuning are operated based on the correction equipment data to obtain optimized model training parameters;

and collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

Further, the obtaining the processed data set based on the original equipment data, and performing data drift detection and data distribution on the processed data set to obtain corrected equipment data specifically includes:

preprocessing the original equipment data to obtain preprocessed original equipment data;

performing feature analysis on the preprocessed original equipment data to obtain the processed data set;

splitting the processed data set into a training set and a testing set according to a preset splitting proportion;

and detecting the data drift of the processed data set to obtain a detection result, and executing preset allocation operation on the original equipment data based on the detection result to obtain the correction equipment data.
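The preprocessing and splitting steps listed above can be sketched as follows. This is only an illustration under stated assumptions: mean imputation and min-max normalization stand in for the preprocessing (denoising is omitted), the split is a simple ordered 8:2 cut, and the function names `fill_missing`, `min_max_normalize` and `split_dataset` are hypothetical rather than taken from the patent.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if observed else 0.0
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_dataset(samples, train_ratio=0.8):
    """Split samples, in order, into (training set, test set)."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Hypothetical raw sensor readings with two missing values.
raw = [2.0, None, 4.0, 6.0, None, 8.0, 10.0, 4.0, 6.0, 0.0]
clean = min_max_normalize(fill_missing(raw))
train, test = split_dataset(clean, train_ratio=0.8)
```

A different imputation or normalization scheme would slot into the same pipeline without changing the split step.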

Further, the detecting the data drift of the processed data set to obtain a detection result, and executing a preset allocation operation on the original equipment data based on the detection result to obtain the corrected equipment data, specifically includes:

inputting the training set into a trained model to obtain a prediction result;

comparing the prediction result with a true label to obtain model prediction accuracy;

based on the model prediction accuracy and a preset judgment threshold, if it is judged that model drift has occurred, issuing a drift warning, retraining a new model to replace the original model, the data schedulers in which model drift has occurred exchanging data with the remaining data schedulers in which no model drift has occurred, and sending the corrected equipment data to the plurality of centralized station processors;

and if it is judged that no model drift has occurred, continuing to use the original model for prediction, and sending the original equipment data to the plurality of centralized station processors.

Further, the data schedulers in which model drift has occurred exchanging data with the remaining data schedulers in which no model drift has occurred, and sending the corrected equipment data to the plurality of centralized station processors, specifically includes:

acquiring the time point at which the model drift occurred, the data schedulers in which model drift has occurred discarding the original data from before the time point;

the data schedulers in which model drift has occurred initiating data exchange requests to the remaining data schedulers in which no model drift has occurred;

each data scheduler in which no model drift has occurred sending 1/N of its data from after the time point to the data schedulers in which model drift has occurred, wherein N is the number of the plurality of centralized station processors.

Further, the data exchange and the sending of the corrected equipment data to the plurality of centralized station processors further includes:

restarting model training after the plurality of centralized station processors have received data sent by a preset proportion of the data schedulers in which no model drift has occurred.

Further, the plurality of centralized station processors receiving the corrected equipment data, and running a stand-alone algorithm and model tuning based on the corrected equipment data to obtain optimized model training parameters, specifically includes:

performing stand-alone learning optimization based on the corrected equipment data by using the Frank-Wolfe algorithm to obtain the optimized model training parameters.

Further, the global model aggregation specifically includes:

after receiving the model training parameters uploaded by a preset aggregation proportion of the plurality of centralized station processors, performing model aggregation by using a preset optimization algorithm so that the global model is consistent.

In a second aspect, an embodiment of the present invention further provides a rail transit distributed operation and maintenance system, including:

the scheduling module is used for receiving, by a plurality of data schedulers, original equipment data in a preset centralized jurisdiction area, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data;

the centralized module is used for dividing the rail transit line into a plurality of centralized station processors, each centralized station processor comprising at least 1 and at most 4 actual stations, the plurality of centralized station processors receiving the corrected equipment data and running a stand-alone algorithm and model tuning based on the corrected equipment data to obtain optimized model training parameters;

and the aggregation module is used for collecting and integrating the optimized model training parameters by the model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements the steps of any one of the above-described distributed operation and maintenance methods for rail transit when the processor executes the program.

In a fourth aspect, embodiments of the present invention also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a rail transit distributed operation and maintenance method as described in any of the above.

According to the distributed operation and maintenance method and system for rail transit provided by the embodiments of the invention, a distributed machine learning architecture for the rail transit system is constructed; setting up centralized stations splits the line data set, which reduces the workload of the model aggregator and avoids single points of failure, and data drift detection is used to carry out algorithm optimization and system upgrading.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic flow chart of a distributed operation and maintenance method for rail transit provided by an embodiment of the invention;

Fig. 2 is a block diagram of a distributed operation and maintenance system according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a data drift application provided in an embodiment of the present invention;

Fig. 4 is a communication flow diagram in a model aggregation process provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a distributed operation and maintenance system for rail transit according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Fig. 1 is a schematic flow chart of a distributed operation and maintenance method for rail transit, which is provided in an embodiment of the present invention, and as shown in fig. 1, the method includes:

S1, receiving original equipment data in a preset centralized jurisdiction area by a plurality of data schedulers, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data;

S2, dividing a rail transit line into a plurality of centralized station processors, wherein each centralized station processor comprises a plurality of actual stations, each centralized station processor at least comprises 1 actual station and at most comprises 4 actual stations, the plurality of centralized station processors receive the correction equipment data, and a single machine algorithm and model tuning are operated based on the correction equipment data to obtain optimized model training parameters;

and S3, collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

Specifically, the application of the embodiment of the invention is premised on: the rail transit line is divided into a plurality of centralized stations, each centralized station comprises a plurality of actual stations, and each centralized station at least comprises one actual station and at most comprises 4 actual stations.
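The premise above can be illustrated with a small grouping routine. The patent does not state how actual stations are assigned to centralized stations, so the sketch below simply assumes that consecutive stations along the line are grouped together; `group_stations` and the station names are hypothetical.

```python
def group_stations(stations, max_per_group=4):
    """Split an ordered list of stations into consecutive groups of 1..max_per_group."""
    if max_per_group < 1:
        raise ValueError("max_per_group must be at least 1")
    return [stations[i:i + max_per_group]
            for i in range(0, len(stations), max_per_group)]


# A hypothetical line with ten actual stations yields groups of 4, 4 and 2.
line = [f"station_{i}" for i in range(1, 11)]
groups = group_stations(line)
```

Each resulting group would then be served by one centralized station processor and one data scheduler.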

The overall system consists of three parts, namely the data schedulers, the centralized station processors and the model aggregator; the framework flow is shown in Fig. 2:

the data scheduler is mainly responsible for data collection, data drift detection, data preprocessing and similar functions within its jurisdiction; the centralized station processor is responsible for running and tuning the stand-alone algorithm; the model aggregator is responsible for aggregating model parameters from the different working nodes to obtain a complete global model.

According to the embodiment of the invention, the distributed machine learning architecture for the rail transit system is constructed, the central station is set to split the line data set, the workload of the model aggregator is reduced, single-point faults are avoided, and the algorithm optimization and the system upgrading are carried out by adopting data drift detection.

Based on the above embodiment, the method step S1 specifically includes:

Detecting the data drift of the processed data set to obtain a detection result, and executing a preset allocation operation on the original equipment data based on the detection result to obtain the corrected equipment data, specifically includes:

inputting the training set into a trained model to obtain a prediction result;

The data schedulers in which model drift has occurred exchanging data with the remaining data schedulers in which no model drift has occurred, and sending the corrected equipment data to the plurality of centralized station processors, specifically includes:

The data exchange and the sending of the corrected equipment data to the plurality of centralized station processors further includes the following steps:

Specifically, the data schedulers correspond one-to-one to the centralized station processors; that is, the number and the numbering of the data schedulers match those of the centralized stations and of the centralized station processors. Each data scheduler implements the following functions:

(1) Receiving equipment data in a preset concentrated jurisdiction, and preprocessing the data, wherein the preprocessing comprises missing value filling, denoising, normalization and the like;

(2) Performing feature analysis operations including, but not limited to, statistical feature analysis, depth feature analysis, and the like;

(3) Splitting the data set, that is, splitting the preprocessed data set into a training set and a test set, where the preset splitting ratio is set manually, for example 8:2 or 7:3;

(4) Data drift detection:

Data drift, also known as model drift or concept drift, refers mainly to a decline in model performance caused by a change in the distribution of the data. When this happens, the system needs to correct the data set used for algorithm training in time to counter the possible performance degradation. The model update framework with concept drift detection is shown in Fig. 3: first, a batch of input samples is fed into a trained model to obtain the corresponding prediction results; then the prediction results are compared with the true labels and the accuracy of the model is calculated; finally, whether concept drift has occurred is judged from the accuracy of the model. If it has, a warning is issued declaring that concept drift exists, and a model is retrained to replace the original one; if it has not, the original model continues to be used for prediction. Once the initial training of the system has started, each centralized station processor can determine the number of training iterations and the start time of each round according to the actual change in data distribution within its own jurisdiction.

(5) Data distribution: under this function, the data scheduler is responsible for executing the preset allocation operation towards the centralized station processor. Data distribution falls into the following two cases:

1) When no data drift has occurred, the data scheduler sends the original data within its jurisdiction to the centralized station processor;

2) When data drift has occurred, the data scheduler in which the drift exists records the time point at which the data drift occurred and discards the original data from before that time point. At the same time, it initiates data exchange requests to the other data schedulers on the whole line in which no data drift has occurred, to make up for the shortfall in data volume. On receiving the data exchange request, each of those data schedulers sends 1/N of its data from after the time point to the requester, where N is the number of centralized station processors.
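The second case can be sketched as below. This is a minimal illustration, not the patented implementation: the `DataScheduler` class, its fields and the choice of which 1/N of the records a peer hands over (here simply the earliest ones at or after the time point) are all assumptions, and N is taken to equal the number of schedulers because of the one-to-one correspondence described above.

```python
from dataclasses import dataclass, field


@dataclass
class DataScheduler:
    name: str
    records: list = field(default_factory=list)  # (timestamp, value) pairs
    drifted: bool = False

    def share_after(self, time_point, n_processors):
        """Return 1/N of this scheduler's records from time_point onwards."""
        recent = [r for r in self.records if r[0] >= time_point]
        return recent[: len(recent) // n_processors]


def exchange_after_drift(requester, peers, time_point):
    """Drop the requester's pre-drift records, then top up from non-drifted peers."""
    n = len(peers) + 1  # assumes one scheduler per centralized station processor
    requester.records = [r for r in requester.records if r[0] >= time_point]
    for peer in peers:
        if not peer.drifted:
            requester.records.extend(peer.share_after(time_point, n))
    return requester.records


# Scheduler A drifts at t = 6; B and C are unaffected.
a = DataScheduler("A", [(t, 0.0) for t in range(10)], drifted=True)
b = DataScheduler("B", [(t, 1.0) for t in range(12)])
c = DataScheduler("C", [(t, 2.0) for t in range(12)])
merged = exchange_after_drift(a, [b, c], time_point=6)
```

With three schedulers, A keeps its four post-drift records and receives two records from each peer, so the retraining set holds eight records.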

Here, the data distribution process of the data scheduler may also use a mode of random sampling of the whole line, which has the advantages of simple method, easier implementation, and avoiding the over-fitting problem possibly existing in the single machine optimization process, and has the disadvantage of larger communication burden among the distributed nodes.
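The whole-line random sampling alternative might look like the following sketch, which assumes uniform sampling without replacement from the pooled records of every jurisdiction; `sample_whole_line` and the record layout are hypothetical.

```python
import random


def sample_whole_line(datasets, sample_size, seed=None):
    """Draw sample_size records uniformly, without replacement, from all jurisdictions."""
    pool = [record for records in datasets.values() for record in records]
    rng = random.Random(seed)
    return rng.sample(pool, min(sample_size, len(pool)))


# Two hypothetical jurisdictions with five records each.
datasets = {"A": [("A", i) for i in range(5)], "B": [("B", i) for i in range(5)]}
batch = sample_whole_line(datasets, sample_size=4, seed=0)
```

Because every processor draws from the pooled data, each draw requires traffic from all schedulers, which is the communication cost noted above.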

Correspondingly, the centralized station processor corresponding to the requesting data scheduler can start the retraining process once it has received data from a preset proportion, for example more than two thirds, of the other data schedulers in which no data drift has occurred.
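The retraining condition reduces to a simple proportion check. The sketch below assumes the two-thirds example and a strict "more than" comparison; `ready_to_retrain` is a hypothetical helper, and exact fractions are used to avoid floating-point edge cases at the boundary.

```python
from fractions import Fraction


def ready_to_retrain(responded, non_drifted_total, required=Fraction(2, 3)):
    """True once more than `required` of the non-drifted schedulers have sent data."""
    if non_drifted_total == 0:
        return False
    return Fraction(responded, non_drifted_total) > required


# With six non-drifted schedulers, four responses are exactly two thirds (not enough),
# while five responses exceed the threshold.
four_of_six = ready_to_retrain(4, 6)
five_of_six = ready_to_retrain(5, 6)
```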

In the embodiment of the invention, the data scheduler is responsible for data collection, data drift detection, data preprocessing and similar functions within its jurisdiction; based on the data drift detection function, the data scheduler can determine the moment at which the algorithm starts retraining, thereby achieving algorithm optimization and system upgrading.
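The accuracy-based drift check described for Fig. 3 can be sketched as follows. The threshold value of 0.8, the toy threshold classifier and the function names are assumptions made for illustration; the patent only specifies that accuracy is compared against a preset judgment threshold.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def detect_drift(model, samples, labels, threshold=0.8):
    """Return (drifted, accuracy) for one batch; drift is declared below the threshold."""
    predictions = [model(x) for x in samples]
    acc = accuracy(predictions, labels)
    return acc < threshold, acc


# A toy "trained model": classify a reading as faulty (1) when it exceeds 0.5.
model = lambda x: int(x > 0.5)

# Batch drawn from the distribution the model was trained on.
stable = detect_drift(model, [0.1, 0.9, 0.7, 0.2], [0, 1, 1, 0])

# Batch after the data distribution has shifted: faults now start near 0.8.
shifted = detect_drift(model, [0.6, 0.7, 0.9, 0.2, 0.55], [0, 0, 1, 0, 0])
```

A `True` flag would correspond to the warning step, after which the scheduler discards its pre-drift data and retraining is triggered.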

Based on any of the above embodiments, step S2 in the method specifically includes:

Specifically, each centralized station processor running the current model on the data allocated to it by the data scheduler is a conventional stand-alone machine learning task. In the embodiment of the invention, the centralized station processor uses the Frank-Wolfe algorithm for the optimization process of stand-alone learning, the purpose being to obtain optimized model parameters from the stand-alone learning.
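For readers unfamiliar with the Frank-Wolfe algorithm, the sketch below shows its basic iteration on a toy problem. The patent names only the algorithm, so the objective (a squared distance to a target vector), the constraint set (the probability simplex) and the standard step size 2/(k+2) are assumptions chosen for illustration.

```python
def frank_wolfe_simplex(grad, dim, iterations=2000):
    """Minimise a smooth convex function over the probability simplex."""
    w = [1.0 / dim] * dim  # feasible starting point
    for k in range(iterations):
        g = grad(w)
        # Linear minimisation oracle: over the simplex, the minimiser of <g, s>
        # is the vertex at the smallest gradient component.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)  # standard diminishing step size
        # Convex combination of the current point and the chosen vertex.
        w = [(1.0 - gamma) * wj for wj in w]
        w[i] += gamma
    return w


# Toy objective f(w) = 0.5 * ||w - target||^2, whose gradient is w - target.
target = [0.2, 0.5, 0.3]
grad = lambda w: [wj - tj for wj, tj in zip(w, target)]
w = frank_wolfe_simplex(grad, dim=3)
```

Each iterate is a convex combination of feasible points, so the constraint is satisfied without any projection step, which is the usual reason for choosing this method.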

According to the embodiment of the invention, the data set on the whole line is segmented through the arrangement of the centralized station, the workload of the model aggregator is reduced, and meanwhile, the situation that the whole system is unavailable due to the equipment failure of a single station is avoided.

Based on any of the foregoing embodiments, the global model aggregation specifically includes:

Specifically, the purpose of the model aggregator is to: model training parameters of all the central station processors are collected and integrated, and the integrated model parameters are issued to the central station processors so as to achieve global model consistency.

At the same time, the model aggregator is also the initiator of each start training of the system.

In order to simplify the calculation process of the embodiment of the invention, a preset optimization algorithm, such as a model averaging method, is adopted in the model aggregation process, and the calculation formula is as follows:

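$$\omega_t = \frac{1}{K}\sum_{k=1}^{K}\omega_t^{(k)}$$

where $\omega_t$ is the current parameter and $\omega_t^{(k)}$ is the value uploaded by the $k$-th centralized station processor; the model aggregator receives K values in total, that is, K centralized station processors have uploaded their current parameter values.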
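Model averaging, as named in the preceding paragraph, amounts to an element-wise mean of the K uploaded parameter vectors. A minimal sketch, assuming each upload is a flat list of floats of equal length (`average_parameters` is a hypothetical name):

```python
def average_parameters(uploads):
    """Element-wise mean of K parameter vectors of equal length."""
    if not uploads:
        raise ValueError("at least one parameter vector is required")
    k = len(uploads)
    dim = len(uploads[0])
    if any(len(u) != dim for u in uploads):
        raise ValueError("all parameter vectors must have the same length")
    return [sum(u[j] for u in uploads) / k for j in range(dim)]


# Three centralized station processors upload two-dimensional parameters.
uploads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
global_params = average_parameters(uploads)  # [3.0, 4.0]
```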

Here, the model aggregation process may also use methods such as ADMM, SSGD, etc., and what kind of method is specifically adopted is determined according to specific data conditions.

While receiving the model parameters uploaded by the centralized station processors, the model aggregator can start the model aggregation process once it has received parameters from a preset aggregation proportion, for example 80%, of the centralized station processors; the specific flow is shown in Fig. 4.
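The proportion-triggered aggregation can be sketched as a small buffer that fires once enough uploads have arrived. The `ModelAggregator` class, its method names and the use of plain averaging as the aggregation step are assumptions for illustration; uploads that arrive after a round has fired are simply counted towards the next round here.

```python
class ModelAggregator:
    """Buffers parameter uploads and aggregates once a preset share has arrived."""

    def __init__(self, n_processors, aggregation_percent=80):
        self.n_processors = n_processors
        # Smallest number of uploads that reaches the preset percentage (integer ceiling).
        self.required = (n_processors * aggregation_percent + 99) // 100
        self.uploads = {}

    def receive(self, processor_id, params):
        """Store one upload; return averaged parameters once the quota is met, else None."""
        self.uploads[processor_id] = params
        if len(self.uploads) < self.required:
            return None
        vectors = list(self.uploads.values())
        dim = len(vectors[0])
        merged = [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]
        self.uploads.clear()  # start a fresh round
        return merged


# Five processors at 80% means aggregation starts after the fourth upload.
aggregator = ModelAggregator(n_processors=5)
results = [aggregator.receive(pid, [value])
           for pid, value in [(1, 1.0), (2, 2.0), (3, 3.0), (4, 6.0)]]
```

Not waiting for the slowest processors keeps one delayed or failed station from stalling the whole round.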

The model aggregator of the embodiment of the invention receives only the model parameters sent by the centralized station processors; compared with prior schemes that send the original data, this greatly reduces the communication burden on the central part of the system.

The rail transit distributed operation and maintenance system provided by the embodiment of the invention is described below; the system described below and the rail transit distributed operation and maintenance method described above may be referred to in correspondence with each other.

Fig. 5 is a schematic structural diagram of a distributed operation and maintenance system for rail transit according to an embodiment of the present invention, as shown in fig. 5, including: a scheduling module 51, a centralizing module 52 and an aggregation module 53; wherein:

the scheduling module 51 is configured to receive, by a plurality of data schedulers, original equipment data in a preset centralized jurisdiction, obtain a processed data set based on the original equipment data, and perform data drift detection and data distribution on the processed data set to obtain corrected equipment data; the centralized module 52 is configured to divide the rail transit line into a plurality of centralized station processors, each centralized station processor including at least 1 and at most 4 actual stations, the centralized station processors receiving the corrected equipment data and running a stand-alone algorithm and model tuning based on the corrected equipment data to obtain optimized model training parameters; the aggregation module 53 is configured to collect and integrate the optimized model training parameters by a model aggregator, and send the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

Fig. 6 illustrates a physical schematic diagram of an electronic device, as shown in fig. 6, which may include: a processor (processor) 610, a communication interface (communication interface) 620, a memory (memory) 630, and a communication bus (bus) 640, wherein the processor 610, the communication interface 620, and the memory 630 communicate with each other via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a rail transit distributed operation and maintenance method comprising: receiving original equipment data in a preset centralized jurisdiction area by a plurality of data schedulers, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data; dividing a rail transit line into a plurality of centralized station processors, wherein each centralized station processor comprises a plurality of actual stations, each centralized station processor at least comprises 1 actual station and at most comprises 4 actual stations, the plurality of centralized station processors receive the correction equipment data, and a single machine algorithm and model tuning are operated based on the correction equipment data to obtain optimized model training parameters; and collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

In another aspect, embodiments of the present invention further provide a computer program product, including a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions which, when executed by a computer, enable the computer to perform the rail transit distributed operation and maintenance method provided by the above method embodiments, the method including: receiving original equipment data in a preset centralized jurisdiction area by a plurality of data schedulers, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data; dividing a rail transit line into a plurality of centralized station processors, wherein each centralized station processor comprises at least 1 and at most 4 actual stations, the plurality of centralized station processors receive the corrected equipment data, and a stand-alone algorithm and model tuning are run based on the corrected equipment data to obtain optimized model training parameters; and collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

In yet another aspect, embodiments of the present invention further provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the rail transit distributed operation and maintenance method provided in the above embodiments, the method including: receiving original equipment data in a preset centralized jurisdiction area by a plurality of data schedulers, acquiring a processed data set based on the original equipment data, and carrying out data drift detection and data distribution on the processed data set to acquire corrected equipment data; dividing a rail transit line into a plurality of centralized station processors, wherein each centralized station processor comprises at least 1 and at most 4 actual stations, the plurality of centralized station processors receive the corrected equipment data, and a stand-alone algorithm and model tuning are run based on the corrected equipment data to obtain optimized model training parameters; and collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation.

The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A distributed operation and maintenance method for rail transit, comprising:

collecting and integrating the optimized model training parameters by a model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation;

the obtaining the processed data set based on the original equipment data, and the performing data drift detection and data distribution on the processed data set to obtain corrected equipment data specifically includes:

detecting the data drift of the processed data set to obtain a detection result, and executing preset allocation operation on the original equipment data based on the detection result to obtain the correction equipment data;

the step of detecting the data drift of the processed data set to obtain a detection result, and executing a preset allocation operation on the original equipment data based on the detection result to obtain the correction equipment data specifically includes:

inputting the training set into a trained model to obtain a prediction result;

if judging that the model drift does not occur, predicting by using an original model, and transmitting the original equipment data to the plurality of centralized station processors;

the data schedulers in which model drift has occurred exchanging data with the remaining data schedulers in which no model drift has occurred, and sending the correction equipment data to the plurality of centralized station processors, specifically comprises:

each data scheduler in which no model drift has occurred sending 1/N of its data from after the time point to the data schedulers in which model drift has occurred, wherein N is the number of the plurality of centralized station processors;

the data schedulers with the model drift exchange data with the rest data schedulers without the model drift, and the correction equipment data are transmitted to the centralized station processors, and the method further comprises the following steps:

after the plurality of centralized station processors receive data sent by a plurality of data schedulers with preset proportions and without model drift, restarting model training;

the method comprises the steps that the plurality of central station processors receive the correction equipment data, and a single machine algorithm and model tuning are operated based on the correction equipment data to obtain optimized model training parameters, and specifically comprises the following steps:

performing stand-alone learning optimization based on the correction equipment data by adopting a Frank-Wolfe algorithm to obtain the optimized model training parameters;

the global model aggregation specifically comprises the following steps:

2. A rail transit distributed operation and maintenance system, comprising:

the aggregation module is used for collecting and integrating the optimized model training parameters by the model aggregator, and transmitting the integrated model training parameters to the plurality of centralized station processors to complete global model aggregation;

inputting the training set into a trained model to obtain a prediction result;

the global model aggregation specifically comprises the following steps:

3. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor performs the steps of the rail transit distributed operation and maintenance method of claim 1.

4. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the rail transit distributed operation and maintenance method of claim 1.