CN119621855A - Industrial equipment time series data storage and preprocessing method - Google Patents
Industrial equipment time series data storage and preprocessing method
- Publication number
- CN119621855A (application number CN202510156909.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- time
- industrial equipment
- standard
- partition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2123/00—Data types
- G06F2123/02—Data types in the time domain, e.g. time-series data
Abstract
The invention provides a time-series data storage and preprocessing method for industrial equipment, belonging to the field of data storage and data mining. The method comprises: step 1, configuring a sensor to collect raw data of the industrial equipment in real time and sending the raw data to a message queue; step 2, setting up a data import module, parsing the topic of the message queue with a data synchronization tool to obtain first data, and storing the first data in a table A; step 3, designing a table B corresponding to table A, and preprocessing and normalizing the first data in table A to obtain standard data; step 4, extracting time features from the standard data to obtain time domain features, and partitioning the standard data according to the time domain features to obtain a partition data list; step 5, obtaining the dimension attributes corresponding to the unit data of each partition data list, and storing the partition data list in a table C in a storage logic order. The method realizes the acquisition and storage of industrial data and improves data processing efficiency and manageability.
Description
Technical Field
The invention relates to the field of data storage and data mining, and in particular to a time-series data storage and preprocessing method for industrial equipment.
Background
At present, in the fields of intelligent manufacturing and the industrial Internet, log data or measurement data generated by the operation of industrial equipment has time-series attributes. One example is vibration amplitude data sampled from a fan at a fixed frequency over a given period. Frequency domain characteristic values such as peak values, mean values, variances and waveforms are obtained from such data through data mining and can be used for vibration signal classification, fault diagnosis and fault prediction, so that the service life of the equipment can be predicted and the equipment can be maintained periodically. The traditional approach stores the raw data files on a file server or in cloud object storage, or collects them into a message queue, or stores them in a time-series database, and then performs data mining with Python or Spark. This approach occupies a large amount of storage space, and a large amount of memory when the data is read by Python code. Alternatively, the SQL capability of a time-series database such as TDengine is used to analyze the time-series data through preset functions.
Therefore, the invention provides a time-series data storage and preprocessing method for industrial equipment.
Disclosure of Invention
The invention provides a time-series data storage and preprocessing method for industrial equipment. Industrial equipment data is collected in real time and sent to a message queue, and a data synchronization tool stores the data in a table A of HBase. Standard data is then obtained through preprocessing and normalization, and a corresponding table B is designed. On the basis of the standard data, time features are extracted and the data is partitioned to generate a partition data list. Finally, the dimension attributes are obtained and the partition data is stored in a table C in the storage logic order. This optimizes data processing and storage efficiency and supports efficient management and analysis of the data.
In one aspect, the present invention provides a method for storing and preprocessing time-series data of industrial equipment, comprising:
Step 1, configuring a sensor to collect raw data of the industrial equipment in real time, and sending the raw data to a message queue;
Step 2, setting up a data import module, parsing the topic of the message queue with a data synchronization tool, obtaining first data, and storing the first data in a table A of an HBase database;
Step 3, designing a table B corresponding to table A, and preprocessing and normalizing the first data of table A to obtain standard data;
Step 4, extracting time features from the standard data to obtain time domain features, and partitioning the standard data according to the time domain features to obtain a partition data list;
Step 5, obtaining the dimension attributes corresponding to the unit data of each partition data list, and storing the partition data list in a table C in a storage logic order.
In another aspect, configuring a sensor to collect raw data of the industrial equipment in real time comprises:
Obtaining the working environment and monitoring requirements of the industrial equipment, selecting the sensor type, and assigning a unique first number to the sensor;
Determining the installation position of the sensor according to the original design drawing of the industrial equipment and the surrounding environment, and assigning a unique second number to the installation position;
Configuring and installing the sensor according to the correspondence between the first number and the second number, and initializing and starting the sensor at a preset time-series sampling frequency, wherein the sensor collects the raw data of the industrial equipment in real time.
In another aspect, sending the raw data to the message queue comprises:
Creating a message queue, serializing the raw data into a byte stream, and inserting the byte stream into the message queue iteratively, byte by byte;
Stopping the iteration once the byte stream of the raw data has been completely inserted into the message queue.
In another aspect, setting up a data import module and parsing the topic of the message queue with a data synchronization tool comprises:
Constructing the data import module, configuring and installing the data synchronization tool, and generating a group of row key-value pairs for the raw data based on the queue identifier of the message queue parsed by the data synchronization tool;
Creating and registering a consumer in the data synchronization tool, and creating a consumption record data table;
The consumer consuming the topic of the message queue and inserting the data into the raw data table A, generating a row key value.
In another aspect, obtaining the raw data and storing it in table A of the HBase database comprises:
Constructing, in the HBase database, a table named table A for storing raw time-series data according to standard preset time-series fields;
Obtaining and parsing first raw data according to the result of the consumption;
If the first raw data are measured values at the same time point, splicing them into a single value by string concatenation, with a special symbol separating the individual measured values;
If the first raw data are measured values at different time points, storing the data of different measurement times in different rows.
In another aspect, designing a table B corresponding to table A, and preprocessing and normalizing the first data of table A to obtain standard data, comprises:
Obtaining the first data of table A, and converting all of it into second data in a preset format;
Selecting a preset neighbor number K, and calculating the KNN distance between any two measured values in the second data as follows:
[Formula not reproduced in the source text]
Wherein the left-hand side of the formula represents the distance between the i-th measured value and the j-th measured value in the second data; n represents the total number of measured values in the second data; ln() represents the natural logarithm function; the variance term represents the variance of all measured values in the second data; and min and max represent the minimum value and the maximum value, respectively. One further symbol relating to the second data is defined in the original but is not legible in the source text;
Selecting any measured value as an intermediate value based on the preset neighbor number K, screening the K measured values nearest to the intermediate value to form a sample group, and obtaining the average distance of the sample group; if the distance between a measured value in the sample group and the intermediate value is greater than the average distance, judging that measured value to be an abnormal value, and otherwise judging it to be normal;
Removing the abnormal values from the second data to obtain third data, and normalizing the third data to obtain standard data.
In another aspect, extracting time features from the standard data to obtain time domain features, and partitioning the standard data according to the time domain features to obtain a partition data list, comprises:
Obtaining the standard data of table B, and obtaining the preset time-series sampling frequency of the standard data according to the time-series field of the standard data;
Extracting time features from the standard data by converting the timestamp information of the standard data into specific time features, and taking the specific time features as the time domain features;
Defining a time interval based on the time domain features, specifically:
[Formula not reproduced in the source text]
Wherein the symbols of the formula represent, respectively, the time interval, the starting time point, the time-start mapping coefficient, the time-end mapping coefficient, the time domain features, the mean time interval of the standard data, and the maximum time interval of the standard data; T() represents an event handling function;
Partitioning the time portion of the standard data according to the time interval, wherein each time partition has a defined size [symbol not reproduced in the source text], and wherein each time partition and its corresponding measured values constitute the unit data of the partition data list.
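The partitioning step above can be sketched as follows. This is a minimal illustration, not the patented formula: because the time interval formula is not reproduced in the text, the sketch assumes a fixed interval width supplied by the caller. The function name `partition_by_time` and the parameters `start` and `width` are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, measured value)


def partition_by_time(samples: List[Sample], start: int, width: int) -> List[dict]:
    """Cut samples into consecutive windows [lo, lo + width).

    Assumption: a fixed window width stands in for the patent's
    time interval, whose formula is not available.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    parts: Dict[Tuple[int, int], List[float]] = {}
    for ts, value in sorted(samples):
        if ts < start:
            continue  # samples before the starting time point are ignored
        idx = (ts - start) // width
        lo = start + idx * width
        parts.setdefault((lo, lo + width), []).append(value)
    # Each entry pairs a time partition with its measured values (unit data).
    return [{"start": lo, "end": hi, "values": vals}
            for (lo, hi), vals in parts.items()]


partition_list = partition_by_time(
    [(125, 4.0), (0, 1.0), (60, 3.0), (30, 2.0)], start=0, width=60
)
```

Sorting the samples first means the partition data list comes out in chronological order, which is one plausible reading of a storage logic order.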
In another aspect, obtaining the dimension attributes corresponding to the unit data of each partition data list, and storing the partition data list in table C in the storage logic order, comprises:
Traversing the partition data list to obtain the time range of each unit data;
Obtaining the dimension attributes corresponding to all fields according to the fields of the standard data corresponding to any unit data in any time interval and a field name-dimension attribute mapping table, these being the dimension attributes of the unit data;
Creating a record for each unit data, and storing its time interval, unit data values and dimension attributes in table C in the storage logic order.
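As a rough illustration of the steps above, the following sketch looks up dimension attributes through a field name-dimension attribute mapping table and builds one record per unit data. The mapping contents, the record layout, the row key format and the choice of chronological order as the storage logic order are all assumptions made for the example; `to_table_c` is a hypothetical name.

```python
from typing import Any, Dict, List

# Hypothetical field name -> dimension attribute mapping table.
FIELD_TO_DIMENSION = {
    "device_id": "equipment",
    "sensor_type": "measurement_kind",
}


def to_table_c(units: List[Dict[str, Any]],
               field_to_dimension: Dict[str, str]) -> List[Dict[str, Any]]:
    """Build one table C record per unit data, ordered by start time."""
    records = []
    for unit in sorted(units, key=lambda u: u["start"]):
        # Keep only fields that appear in the mapping table.
        dims = {field_to_dimension[name]: value
                for name, value in unit["fields"].items()
                if name in field_to_dimension}
        records.append({
            "row_key": f'{unit["start"]}-{unit["end"]}',  # time interval
            "values": list(unit["values"]),               # unit data values
            "dimensions": dims,                           # dimension attributes
        })
    return records


units = [
    {"start": 60, "end": 120, "values": [3.0],
     "fields": {"device_id": "fan01", "sensor_type": "vibration", "raw": "x"}},
    {"start": 0, "end": 60, "values": [1.0, 2.0],
     "fields": {"device_id": "fan01", "sensor_type": "vibration"}},
]
table_c = to_table_c(units, FIELD_TO_DIMENSION)
```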
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a time-series data storage and preprocessing method for industrial equipment. Industrial equipment data is collected in real time and sent to a message queue, and a data synchronization tool stores the data in a table A of HBase. Standard data is then obtained through preprocessing and normalization, and a corresponding table B is designed. On the basis of the standard data, time features are extracted and the data is partitioned to generate a partition data list. Finally, the dimension attributes are obtained and the partition data is stored in a table C in the storage logic order. This optimizes data processing and storage efficiency and supports efficient management and analysis of the data.
Drawings
In order to illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for storing and preprocessing time-series data of industrial equipment according to an embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the invention clearer, the technical solutions of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
Example 1:
As shown in fig. 1, the method for storing and preprocessing time-series data of industrial equipment provided by the embodiment of the invention includes:
Step 1, configuring a sensor to collect raw data of the industrial equipment in real time, and sending the raw data to a message queue;
Step 2, setting up a data import module, parsing the topic of the message queue with a data synchronization tool, obtaining first data, and storing the first data in a table A of an HBase database;
Step 3, designing a table B corresponding to table A, and preprocessing and normalizing the first data of table A to obtain standard data;
Step 4, extracting time features from the standard data to obtain time domain features, and partitioning the standard data according to the time domain features to obtain a partition data list;
Step 5, obtaining the dimension attributes corresponding to the unit data of each partition data list, and storing the partition data list in a table C in a storage logic order.
In this embodiment, the sensor is a device for monitoring and collecting industrial equipment status or environmental data in real time, including types of temperature, pressure, humidity, vibration, and the like.
In this embodiment, industrial equipment refers to machinery, instruments, tools, and other equipment used in industrial processes for production, processing, inspection, or control.
In this embodiment, raw data refers to the original information collected in real time by sensors, meters and similar devices during the operation of the industrial equipment, before any processing or analysis.
In this embodiment, a message queue is a mechanism that lets different services communicate by sending and receiving messages without requiring a direct synchronous connection.
In this embodiment, the data import module refers to a component that obtains data from the message queue and stores it in a database (e.g., HBase).
In this embodiment, the data synchronization tool is a software tool used primarily to synchronize data between different data sources or systems, such as Kafka.
In this embodiment, a topic refers to a category of messages or a data stream in the message queue.
In this embodiment, the HBase database is an open-source, distributed, columnar storage database, which is part of the Apache Hadoop ecosystem, and is designed to handle large-scale, distributed data storage requirements.
In this embodiment, table a is an HBase table that stores raw data retrieved from a message queue.
In this embodiment, table B is an HBase table for storing the standard data obtained after preprocessing, normalization, and partitioning, and includes columns of dimension attributes and a column of preprocessed time-series data.
In this embodiment, the coprocessor function of HBase places the data preprocessing logic on the server side, so that large amounts of data need not be pulled to the client for processing, where they would occupy excessive memory. The distributed storage and computing capability of HBase yields higher preprocessing efficiency, and data storage and preprocessing are kept synchronized. Subsequent data mining only needs to query the preprocessed table and does not need to repeat processing steps such as data cleaning, deduplication and normalization.
In this embodiment, the data processing of table A and table B is performed synchronously: the HBase coprocessor is mounted on the raw data table A, and each time a new row of data is inserted into table A, the coprocessor is triggered to run and writes the preprocessed data into table B.
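The trigger behavior described above can be sketched in plain Python without HBase. This is an in-memory simulation, not an HBase coprocessor: the class `InMemoryTable`, its `on_put` hook, the column names and the `preprocess` function are hypothetical names introduced for illustration, and the min-max scaling shown is only a placeholder for the actual preprocessing logic.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


class InMemoryTable:
    """Hypothetical stand-in for an HBase table: row key -> columns."""

    def __init__(self, on_put: Optional[Callable[[str, Row], None]] = None):
        self.rows: Dict[str, Row] = {}
        self.on_put = on_put  # plays the role of a post-insert hook

    def put(self, row_key: str, columns: Row) -> None:
        self.rows[row_key] = dict(columns)
        if self.on_put is not None:
            self.on_put(row_key, columns)  # fires on every inserted row


def preprocess(values: List[float]) -> List[float]:
    """Placeholder preprocessing (assumption): min-max scale to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


table_b = InMemoryTable()


def write_to_table_b(row_key: str, columns: Row) -> None:
    # Only the time-series column is preprocessed; dimensions are copied.
    raw = [float(x) for x in columns["ts_values"].split("|")]
    scaled = preprocess(raw)
    table_b.put(row_key, {
        "device": columns["device"],
        "ts_values": "|".join(f"{v:.3f}" for v in scaled),
    })


table_a = InMemoryTable(on_put=write_to_table_b)
table_a.put("fan01#1700000000", {"device": "fan01", "ts_values": "2.0|4.0|6.0"})
```

After the single `put` on `table_a`, the same row key is present in `table_b` with the scaled values, so a reader of table B never needs to pull the raw rows to the client.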
In this embodiment, preprocessing and normalization are used to convert the raw data into a standard form suitable for subsequent analysis.
In this embodiment, the standard data refers to data after preprocessing and standardization, and has a uniform format and structure.
In this embodiment, time domain features refer to time-related data features that are extracted from timestamps or time fields and reflect properties such as the temporal nature, periodicity, and trend of the data.
In this embodiment, partitioning refers to dividing data into different blocks according to certain specific characteristics during data storage, querying, and processing.
In this embodiment, the partition data list refers to a series of data units obtained by partitioning standard data according to the extracted time domain features (such as year, month, day, hour, etc.) in step 4.
In this embodiment, unit data refers to the smallest data unit obtained after the data in table A has been preprocessed, normalized, and partitioned by time features.
In this embodiment, dimension attributes refer to various feature fields that can be used to describe or classify data in data processing and storage.
In this embodiment, the storage logic order refers to a manner of storing the data in the partition data list into the target table (table C) according to a certain rule.
In this embodiment, table C is a table for storing dimension attribute data corresponding to unit data of each partition data list.
The working principle and beneficial effects of this technical solution are as follows: industrial equipment data is collected and processed in real time and stored in HBase by means of the message queue and the data synchronization tool, and preprocessing, feature extraction and partitioned storage improve data storage efficiency and the accuracy of subsequent analysis, thereby supporting efficient data processing and management.
Example 2:
On the basis of embodiment 1 above, configuring the sensor to collect raw data of the industrial equipment in real time comprises:
Obtaining the working environment and monitoring requirements of the industrial equipment, selecting the sensor type, and assigning a unique first number to the sensor;
Determining the installation position of the sensor according to the original design drawing of the industrial equipment and the surrounding environment, and assigning a unique second number to the installation position;
Configuring and installing the sensor according to the correspondence between the first number and the second number, and initializing and starting the sensor at a preset time-series sampling frequency, wherein the sensor collects the raw data of the industrial equipment in real time.
In this embodiment, the working environment refers to the physical and environmental conditions in which the device is in actual operation, including temperature, humidity, pressure, vibration, gas composition, electromagnetic interference, and many other factors.
In this embodiment, the monitoring requirements refer to the requirements for real-time monitoring and data acquisition of the operating state, environmental conditions and equipment performance of the industrial equipment.
In this embodiment, the first number is a unique identifier for identifying each sensor.
In this embodiment, the original design drawing refers to a detailed technical drawing drawn by an engineer during the design and construction stages of an industrial plant or system.
In this embodiment, the mounting location refers to a specific location or area where the sensor is actually placed in the device or work environment.
In this embodiment, the second number is a unique identifier for identifying each sensor location.
In this embodiment, the preset time sequence sampling frequency refers to the frequency of data acquisition of the industrial equipment by the sensor in a specified time interval.
The beneficial effects of this technical solution are as follows: the sensor type is selected and a unique number is assigned in light of the equipment's working environment and monitoring requirements, and the installation position is determined and numbered, so that sensors are installed accurately and data is collected in real time, which improves equipment monitoring efficiency and the accuracy and reliability of data acquisition.
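The correspondence between the first number (sensor) and the second number (installation position) can be pictured as a small registry. The sketch below is illustrative only: `SensorConfig`, `SensorRegistry` and the identifier formats are hypothetical, and it assumes a one-to-one relation between sensors and installation positions, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SensorConfig:
    sensor_id: str      # "first number": unique per sensor
    location_id: str    # "second number": unique per installation position
    sensor_type: str    # e.g. temperature, pressure, vibration
    sampling_hz: float  # preset time-series sampling frequency


class SensorRegistry:
    """Keeps the sensor <-> installation position correspondence."""

    def __init__(self) -> None:
        self._by_sensor: Dict[str, SensorConfig] = {}
        self._by_location: Dict[str, str] = {}

    def register(self, cfg: SensorConfig) -> None:
        if cfg.sensor_id in self._by_sensor:
            raise ValueError(f"duplicate sensor id: {cfg.sensor_id}")
        if cfg.location_id in self._by_location:
            raise ValueError(f"location already occupied: {cfg.location_id}")
        if cfg.sampling_hz <= 0:
            raise ValueError("sampling frequency must be positive")
        self._by_sensor[cfg.sensor_id] = cfg
        self._by_location[cfg.location_id] = cfg.sensor_id

    def sensor_at(self, location_id: str) -> SensorConfig:
        return self._by_sensor[self._by_location[location_id]]


registry = SensorRegistry()
registry.register(SensorConfig("S-001", "P-001", "vibration", 1000.0))
```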
Example 3:
On the basis of embodiment 2 above, sending the raw data to the message queue comprises:
Creating a message queue, serializing the raw data into a byte stream, and inserting the byte stream into the message queue iteratively, byte by byte;
Stopping the iteration once the byte stream of the raw data has been completely inserted into the message queue.
In this embodiment, serialization refers to the process of converting the state of a data structure into a format that can be stored or transmitted.
In this embodiment, a byte stream refers to a way of processing and transferring data in a computer system in units of bytes, in binary representation.
In this embodiment, iterative insertion refers to the process of inserting bytes sequentially into the message queue until all bytes have been inserted.
The beneficial effects of this technical solution are as follows: the raw data is serialized and iteratively inserted into the message queue, and the insertion process is controlled through the queue identifier and a pointer value to preserve the insertion order, which avoids data collisions and duplication and improves the reliability and efficiency of data transmission.
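A minimal sketch of the serialize-then-insert loop follows, using Python's standard `queue.Queue` as a stand-in for the message queue and JSON as the serialization format. Both choices, and the helper names `enqueue_bytes` and `drain`, are assumptions made for illustration; the text does not name a serialization format.

```python
import json
import queue
from typing import Any, Dict


def enqueue_bytes(record: Dict[str, Any], q: "queue.Queue[int]") -> int:
    """Serialize a record and insert it into the queue byte by byte.

    Returns the number of bytes inserted.
    """
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    for b in payload:  # iterating over bytes yields ints in 0..255
        q.put(b)       # FIFO order preserves the byte sequence
    return len(payload)  # iteration stops once every byte is inserted


def drain(q: "queue.Queue[int]", n: int) -> bytes:
    """Read n bytes back from the queue in insertion order."""
    return bytes(q.get() for _ in range(n))


mq: "queue.Queue[int]" = queue.Queue()
raw = {"sensor_id": "S-001", "ts": 1700000000, "value": 0.42}
count = enqueue_bytes(raw, mq)
restored = json.loads(drain(mq, count).decode("utf-8"))
```

Brokers such as Kafka normally transfer whole messages rather than single bytes, so the per-byte loop here mirrors the wording of the embodiment rather than typical broker usage.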
Example 4:
On the basis of embodiment 3 above, setting up a data import module and parsing the topic of the message queue with a data synchronization tool comprises:
Constructing the data import module, configuring and installing the data synchronization tool, and generating a group of row key-value pairs for the raw data based on the queue identifier of the message queue parsed by the data synchronization tool;
Creating and registering a consumer in the data synchronization tool, and creating a consumption record data table;
The consumer consuming the topic of the message queue and inserting the data into the raw data table A, generating a row key value.
In this embodiment, parsing is the process of converting a byte stream back into the raw data.
In this embodiment, each row key (Rowkey) in the group of row key-value pairs corresponds to a plurality of columns; each column stores either the value of a dimension attribute or the spliced value of the time-series data, and preprocessing is applied only to the time-series data columns.
In this embodiment, a consumer refers to a component that processes or consumes data.
In this embodiment, the consumption record data table is a table storing the time-series data after consumption (i.e., after data processing).
In this embodiment, traversing refers to analyzing each data record one by one as the raw time-series data is processed.
The beneficial effects of this technical solution are as follows: the data synchronization tool parses the message queue and generates the group of row key-value pairs, and the consumer traverses them and checks whether each row key value has already been consumed, which ensures that each piece of data is consumed only once; recording the consumption state improves the accuracy and efficiency of data processing and avoids repeated consumption.
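The duplicate check against the consumption record can be sketched as follows. Plain Python containers stand in for the message queue topic, the consumption record data table and table A; the function name `consume` and the message layout are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Message = Tuple[str, str]  # (row key, value)


def consume(messages: Iterable[Message],
            consumed: Set[str],
            table_a: Dict[str, str]) -> List[str]:
    """Insert unseen messages into table A and record their row keys.

    `consumed` plays the role of the consumption record data table.
    Returns the row keys inserted during this call.
    """
    inserted = []
    for row_key, value in messages:
        if row_key in consumed:
            continue  # already consumed: skip to avoid repeated consumption
        table_a[row_key] = value
        consumed.add(row_key)  # record the consumption state
        inserted.append(row_key)
    return inserted


consumption_record: Set[str] = set()
table_a: Dict[str, str] = {}
first = consume([("fan01#100", "1.5|2.5"), ("fan01#101", "3.0")],
                consumption_record, table_a)
# A redelivered message with an already-recorded row key is ignored.
second = consume([("fan01#100", "9.9"), ("fan01#102", "4.0")],
                 consumption_record, table_a)
```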
Example 5:
On the basis of embodiment 4 above, obtaining the raw data and storing it in table A of the HBase database comprises:
Constructing, in the HBase database, a table named table A for storing raw time-series data according to standard preset time-series fields;
Obtaining and parsing first raw data according to the result of the consumption;
If the first raw data are measured values at the same time point, splicing them into a single value by string concatenation, with a special symbol separating the individual measured values;
If the first raw data are measured values at different time points, storing the data of different measurement times in different rows.
In this embodiment, the standard preset timing field refers to a basic field for defining and identifying time series data, such as a time stamp, a device identification, a data type, and the like.
In this embodiment, the HBase database is an open-source, distributed, columnar-store NoSQL database system for handling large-scale data sets, particularly suited for storing and managing non-relational data.
In this embodiment, raw time-series data refers to data, acquired in time order, that represents a physical or virtual phenomenon.
In this embodiment, the first raw data refers to raw data that is initially acquired during the time series data acquisition process.
In this embodiment, the measured value refers to data representing a certain physical quantity or state, such as temperature, humidity, voltage, air pressure, speed, flow rate, etc., collected by a sensor, device or system.
In this embodiment, the string concatenation means that a plurality of measured values are connected together through specific symbols to form a complete string.
The working principle and beneficial effects of this technical solution are as follows: the time-series data is stored in HBase table A and organized by measurement time, with measured values at the same time point spliced into one value and data at different time points stored in separate rows, which optimizes time-series data storage and querying and improves the flexibility and efficiency of data processing.
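The row layout described in this embodiment can be illustrated with a short sketch. The row key format `device#timestamp`, the `|` separator and the function name `build_rows` are assumptions chosen for the example; the text only requires that a special symbol separate the individual measured values.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Reading = Tuple[str, int, float]  # (device id, timestamp, measured value)


def build_rows(readings: Iterable[Reading], sep: str = "|") -> Dict[str, str]:
    """Group readings into table A rows.

    Values sharing a time point are spliced into one string;
    different time points become different rows.
    """
    grouped: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for device, ts, value in readings:
        grouped[(device, ts)].append(value)
    return {
        f"{device}#{ts}": sep.join(str(v) for v in values)
        for (device, ts), values in sorted(grouped.items())
    }


rows = build_rows([
    ("fan01", 100, 1.5),
    ("fan01", 100, 2.5),  # same time point: spliced into the same row
    ("fan01", 101, 3.0),  # different time point: separate row
])
```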
Example 6:
On the basis of embodiment 5 above, designing a table B corresponding to table A, and preprocessing and normalizing the first data of table A to obtain standard data, comprises:
Obtaining the first data of table A, and converting all of it into second data in a preset format;
Selecting a preset neighbor number K, and calculating the KNN distance between any two measured values in the second data as follows:
[Formula not reproduced in the source text]
Wherein the left-hand side of the formula represents the distance between the i-th measured value and the j-th measured value in the second data; n represents the total number of measured values in the second data; ln() represents the natural logarithm function; the variance term represents the variance of all measured values in the second data; and min and max represent the minimum value and the maximum value, respectively. One further symbol relating to the second data is defined in the original but is not legible in the source text;
Selecting any measured value as an intermediate value based on the preset neighbor number K, screening the K measured values nearest to the intermediate value to form a sample group, and obtaining the average distance of the sample group; if the distance between a measured value in the sample group and the intermediate value is greater than the average distance, judging that measured value to be an abnormal value, and otherwise judging it to be normal;
Removing the abnormal values from the second data to obtain third data, and normalizing the third data to obtain standard data.
In this example, the second data is the data obtained after a certain processing and conversion, and the original data is from table a.
In this embodiment, the preset number of neighbors refers to the number of neighbors selected when calculating the KNN (K-nearest neighbor) distance.
In this embodiment, KNN distance is a core concept in the K-Nearest Neighbor (K-Nearest Neighbor) algorithm, measuring the distance between two data points.
In this embodiment, the intermediate value refers to a measurement value obtained by screening K nearest neighbor data points as a reference point when KNN calculation is performed.
In this embodiment, the sample set refers to K adjacent measured values screened from around the selected intermediate value based on a preset number of neighbors (K) according to KNN algorithm.
In this embodiment, outliers refer to measurements in the dataset that deviate significantly from the overall data trend.
In this embodiment, the third data is a data set from which an outlier is removed, and the result is obtained by performing normalization processing.
In this embodiment, the standard data is data after normalization processing.
The beneficial effects of the above technical scheme are as follows: the distances between measured values are calculated through the KNN algorithm, abnormal values are screened out and removed, and standard data is generated through normalization. This improves the accuracy and quality of the data and supports the reliability and stability of subsequent data analysis results.
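The outlier screening and normalization of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented computation: the distance formula of the embodiment is not reproduced here, so the absolute difference between two measured values is assumed as the distance; the decision rule is a common variant that flags a value when its mean distance to its K nearest neighbors exceeds a multiple of the median of those mean distances; and min-max scaling is assumed for the normalization. The names `knn_outlier_mask`, `min_max_normalize` and `factor` are hypothetical.

```python
import numpy as np


def knn_outlier_mask(values, k=3, factor=2.0):
    """Flag values whose mean distance to their k nearest neighbors is large.

    Assumption: distance is the absolute difference between measured values.
    """
    x = np.asarray(values, dtype=float)
    # Pairwise absolute distances between all measured values.
    dist = np.abs(x[:, None] - x[None, :])
    # Sort each row; column 0 is the zero distance of a value to itself.
    dist_sorted = np.sort(dist, axis=1)
    knn_mean = dist_sorted[:, 1:k + 1].mean(axis=1)
    # Assumed threshold: a multiple of the median mean-neighbor distance.
    threshold = factor * np.median(knn_mean)
    return knn_mean > threshold


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    x = np.asarray(values, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span


# Hypothetical measured values with one obvious spike.
second_data = [10.0, 10.2, 9.9, 10.1, 55.0, 10.3, 9.8]
mask = knn_outlier_mask(second_data, k=3)
third_data = np.asarray(second_data)[~mask]    # abnormal values removed
standard_data = min_max_normalize(third_data)  # normalized result
```

The threshold multiplier is a tuning choice: a smaller `factor` removes more values, a larger one keeps more.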
Example 7:
On the basis of the above embodiment 1, performing time feature extraction on the standard data to obtain time domain features, and partitioning the standard data according to the time domain features to obtain a partition data list, includes:
Obtaining the standard data of table B, and obtaining a preset time-series sampling frequency of the standard data according to the timing field of the standard data;
Performing time feature extraction on the standard data, converting the timestamp information of the standard data into specific time features, and taking the specific time features as the time domain features;
defining a time interval based on the time domain features, specifically:
Wherein the symbols of the formula represent, respectively: the time interval; the starting time point; the time start-point mapping coefficient; the time end-point mapping coefficient; the time domain features; the mean time interval of the standard data; and the maximum time interval of the standard data; T() represents an event handling function;
According to the time interval, performing partition cutting on the time dimension of the standard data, wherein each time partition has a size determined by the time interval, and each time partition together with its corresponding measured values constitutes one unit data of the partition data list.
In this embodiment, the timing field refers to a data field related to time, and is used to indicate a point in time when the data recording occurs.
In this embodiment, the time stamp information refers to a specific time point at which each piece of data is recorded, and exists in the form of a time stamp.
In this embodiment, the time rule matching degree refers to the degree of matching between the extracted time features and the preset time-series sampling frequency.
In this embodiment, the specific time features refer to specific data attributes extracted from the timestamp information in the standard data, for example year, month, day, hour and minute, which accurately describe the time dimension.
The working principle and beneficial effects of the above technical scheme are as follows: time features are extracted from the standard data and compared with the preset sampling frequency, the best-matching time domain features are selected, and a time interval is defined for partition cutting of the data. This optimizes the time processing and partitioning of the data and improves the efficiency of time-series analysis and processing.
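The feature extraction and partition cutting of this embodiment can be sketched as follows. This is a simplified illustration: the time-interval formula of the embodiment is not reproduced, so the interval width is assumed to be a fixed value supplied by the caller, and each unit data is assumed to be a dictionary holding the partition bounds and its measured values. The names `time_features` and `partition_by_interval` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_features(ts):
    """Convert a timestamp into specific time features."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
    }


def partition_by_interval(records, start, width):
    """Cut (timestamp, value) records into fixed-width time partitions.

    Assumption: `width` stands in for the time interval that the embodiment
    derives from the time domain features.
    """
    buckets = defaultdict(list)
    for ts, value in records:
        # timedelta // timedelta yields the integer partition index.
        index = (ts - start) // width
        buckets[index].append(value)
    # Each unit data pairs a time partition with its measured values.
    return [
        {
            "start": start + i * width,
            "end": start + (i + 1) * width,
            "values": buckets[i],
        }
        for i in sorted(buckets)
    ]


# Hypothetical standard data: (timestamp, measured value) pairs.
start = datetime(2025, 2, 13, 0, 0, 0)
records = [
    (start, 10.0),
    (start + timedelta(seconds=30), 10.2),
    (start + timedelta(seconds=70), 9.9),
    (start + timedelta(seconds=125), 10.1),
]
partition_list = partition_by_interval(records, start, timedelta(seconds=60))
```

Empty partitions are omitted from the list in this sketch; a consumer that needs a gap-free timeline would have to add them explicitly.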
Example 8:
On the basis of the above embodiment 1, acquiring the dimension attributes corresponding to the unit data of each partition data list, and storing the partition data list in table C according to the storage logic order, includes:
Traversing the partition data list to obtain the time interval of each unit data;
According to the fields of the standard data corresponding to any unit data in any time interval and the field name-dimension attribute mapping table, acquiring the dimension attributes corresponding to all the fields, these being the dimension attributes of the unit data;
Creating a record for each unit data, and storing its time interval, unit data values and dimension attributes in table C in the storage logic order.
In this embodiment, the field name-dimension attribute mapping table is a mapping structure that associates data fields with their corresponding dimension attributes.
In this embodiment, the dimension attribute refers to descriptive information associated with a data field, for example time, region, product or sales.
The beneficial effects of the above technical scheme are as follows: the partition data list is traversed, the time interval is combined with the field name-dimension attribute mapping table, and a record is created for each unit data and stored in table C according to the storage logic. This improves the efficiency of structured data storage and facilitates subsequent data query and analysis.
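The storage step of this embodiment can be sketched with an in-memory SQLite database. This is only an illustration under assumptions: the schema of table C, the contents of the field name-dimension attribute mapping table, and the interpretation of the storage logic order as ascending interval start time are all hypothetical, as are the names `FIELD_DIMENSIONS`, `store_partitions` and `table_c`.

```python
import sqlite3

# Hypothetical field name-dimension attribute mapping table.
FIELD_DIMENSIONS = {"temperature": "thermal", "vibration": "mechanical"}


def store_partitions(conn, partition_list, mapping):
    """Write one record per field of each unit data into table C."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS table_c ("
        "interval_start TEXT, interval_end TEXT, "
        "field TEXT, value REAL, dimension TEXT)"
    )
    rows = []
    # Assumed storage logic order: ascending interval start time.
    for unit in sorted(partition_list, key=lambda u: u["start"]):
        for field, value in unit["values"].items():
            dimension = mapping.get(field, "unknown")
            rows.append((unit["start"], unit["end"], field, value, dimension))
    conn.executemany("INSERT INTO table_c VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical partition data list, deliberately out of time order.
partition_list = [
    {
        "start": "2025-02-13T00:01:00",
        "end": "2025-02-13T00:02:00",
        "values": {"temperature": 72.0, "pressure": 1.2},
    },
    {
        "start": "2025-02-13T00:00:00",
        "end": "2025-02-13T00:01:00",
        "values": {"temperature": 71.5, "vibration": 0.02},
    },
]

conn = sqlite3.connect(":memory:")
inserted = store_partitions(conn, partition_list, FIELD_DIMENSIONS)
stored = conn.execute(
    "SELECT interval_start, field, dimension FROM table_c ORDER BY rowid"
).fetchall()
```

Fields missing from the mapping table are stored with the placeholder dimension `unknown` here; rejecting such records instead would be an equally reasonable policy.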
It should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that the technical solutions described in the above embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202510156909.4A CN119621855A (en) | 2025-02-13 | 2025-02-13 | Industrial equipment time series data storage and preprocessing method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202510156909.4A CN119621855A (en) | 2025-02-13 | 2025-02-13 | Industrial equipment time series data storage and preprocessing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN119621855A (en) | 2025-03-14 |
Family
ID=94894753
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202510156909.4A Pending CN119621855A (en) | 2025-02-13 | 2025-02-13 | Industrial equipment time series data storage and preprocessing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN119621855A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108256088A (en) * | 2018-01-23 | 2018-07-06 | 清华大学 | A kind of storage method and system of the time series data based on key value database |
CN112307086A (en) * | 2020-10-30 | 2021-02-02 | 湖北烽火平安智能消防科技有限公司 | Automatic data verification method and device in fire service |
CN114048217A (en) * | 2021-10-21 | 2022-02-15 | 微民保险代理有限公司 | Incremental data synchronization method and device, electronic equipment and storage medium |
CN115914360A (en) * | 2022-09-15 | 2023-04-04 | 成都飞机工业(集团)有限责任公司 | A time series data storage method, device, equipment and storage medium |
WO2024037629A1 (en) * | 2022-08-19 | 2024-02-22 | 顺丰科技有限公司 | Data integration method and apparatus for blockchain, and computer device and storage medium |
CN118395290A (en) * | 2024-05-13 | 2024-07-26 | 齐丰科技股份有限公司 | Equipment modeling method suitable for discrete point position table |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108256088A (en) * | 2018-01-23 | 2018-07-06 | 清华大学 | A kind of storage method and system of the time series data based on key value database |
CN112307086A (en) * | 2020-10-30 | 2021-02-02 | 湖北烽火平安智能消防科技有限公司 | Automatic data verification method and device in fire service |
CN114048217A (en) * | 2021-10-21 | 2022-02-15 | 微民保险代理有限公司 | Incremental data synchronization method and device, electronic equipment and storage medium |
WO2024037629A1 (en) * | 2022-08-19 | 2024-02-22 | 顺丰科技有限公司 | Data integration method and apparatus for blockchain, and computer device and storage medium |
CN115914360A (en) * | 2022-09-15 | 2023-04-04 | 成都飞机工业(集团)有限责任公司 | A time series data storage method, device, equipment and storage medium |
CN118395290A (en) * | 2024-05-13 | 2024-07-26 | 齐丰科技股份有限公司 | Equipment modeling method suitable for discrete point position table |
Non-Patent Citations (1)
Title |
---|
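Xie Wenwei et al.: "Artificial Intelligence Technology Series: Deep Learning and Computer Vision, Core Algorithms and Applications", 30 April 2023, Beijing: Beijing Institute of Technology Press, pages: 63 - 64 * |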
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110347116B (en) | A machine tool state monitoring system and monitoring method based on operating data flow | |
US10679135B2 (en) | Periodicity analysis on heterogeneous logs | |
CN105653427B (en) | A Log Monitoring Method Based on Behavior Anomaly Detection | |
KR101611166B1 (en) | System and Method for Deducting about Weak Signal Using Big Data Analysis | |
WO2012073526A1 (en) | Data processing system, and data processing device | |
CN118378195B (en) | Screw air compressor fault prediction method based on multi-source data fusion | |
CN112800061B (en) | Data storage method, device, server and storage medium | |
CN116066343B (en) | An intelligent early warning method and system for oil pump unit fault model | |
Mueen et al. | AWarp: Fast warping distance for sparse time series | |
Egri et al. | Cross-correlation based clustering and dimension reduction of multivariate time series | |
CN109145109B (en) | User group message propagation abnormity analysis method and device based on social network | |
CN117572837B (en) | Intelligent power plant AI active operation and maintenance method and system | |
CN119621855A (en) | Industrial equipment time series data storage and preprocessing method | |
CN107357919A (en) | User behaviors log inquiry system and method | |
CN116910590A (en) | Gas sensor accuracy anomaly identification method and system based on adaptive clustering | |
CN114880584B (en) | A method for fault analysis of generator sets based on community discovery | |
CN116431702A (en) | Industrial big data analysis method and platform based on industrial Internet | |
EP3926428B1 (en) | Control device, control program, and control system | |
CN113064791A (en) | Scattered label feature extraction method based on real-time monitoring of mass log data | |
Supardi et al. | An evolutionary stream clustering technique for outlier detection | |
CN118820910B (en) | Heterogeneous network security big data management method and system | |
CN117251532B (en) | Large-scale literature mechanism disambiguation method based on dynamic multistage matching | |
CN116861204B (en) | Intelligent manufacturing equipment data management system based on digital twinning | |
CN118503884B (en) | Equipment state identification method, equipment and medium | |
CN118820739B (en) | Method, device and medium for visual playback of time series data based on key point recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||