CN114003638A - An intelligent interconnected big data processing system

Info

Publication number: CN114003638A
Application number: CN202111242247.0A
Authority: CN
Inventors: 李锦基; 黄永权; 李明东; 田华雨; 付长财
Original assignee: Gold Sea Comm Corp
Current assignee: Gold Sea Comm Corp
Priority date: 2021-10-25
Filing date: 2021-10-25
Publication date: 2022-02-01

Abstract

The invention relates to the technical field of big data processing and discloses an intelligent interconnected big data processing system comprising a data input unit, a data processing unit, a data classification unit and a data information layering unit. The data input unit comprises data input positioning analysis, data input category analysis and input data weight analysis. The data input category analysis analyzes data that has been input into the system from external raw data. After the input data has passed the data input category analysis, the system then performs a weight analysis on the data and uses the result to determine the storage location of the data. The system processes data with a large weight coefficient and high activity into top data, and data with a small weight coefficient and low activity into settled data, thereby preventing data with small weight coefficients and settled information from interfering with manual data processing.

Description

Intelligent interconnected big data processing system

Technical Field

The invention relates to the technical field of big data processing, in particular to an intelligent interconnected big data processing system.

Background

At present, broadband internet access and intelligent terminals are spreading rapidly, and network data capacity, data processing volume and data intensity are growing far faster than in any previous period. The big data era has arrived and big data processing systems have emerged. These systems are of many kinds and no recognized classification method exists at present. To grasp the current state of the technology clearly, representative systems are classified by load type and by data type, so that the application range of each kind of system can be determined on the one hand, and the fields that remain unsolved can be identified on the other.

Most enterprises can hardly bear the huge labor cost of mastering multiple data analysis technologies at the same time. Data is screened only weakly during input and storage, and junk information is difficult to remove, so data with small weight coefficients occupies too much storage space. Because unimportant information is mixed in, manual screening is difficult and manual data processing is disturbed.

Therefore, an intelligent interconnected big data processing system is proposed.

Disclosure of Invention

The invention mainly solves the technical problems in the prior art and provides an intelligent interconnected big data processing system.

In order to achieve the above object, the present invention adopts the following technical solution: an intelligent interconnected big data processing system comprising a data input unit, a data processing unit, a data classification unit and a data information layering unit. The data input unit comprises data input positioning analysis, data input category analysis and input data weight analysis. The data input category analysis analyzes data that has been input into the system from external original data. After the input data has passed the data input category analysis, the system then performs a weight analysis on the data and uses the result to determine the storage location of the data. The weight is a quantitative value expressing the value, relative importance and proportion of a group of data within the whole, that is, the importance of the analyzed data segment to the entire database; it can be judged and calculated by dividing the problem into multiple levels of indicators. The stored data passes through the data processing unit, and finally the compressed data is layered by the data information layering unit.

Preferably, the data input positioning analysis transmits the original data into the system through positioning on the external device and converts the data from its external format into an internal format that is convenient for the system to process. The positioning device on the external device locates the coordinates of the input data and compiles them to generate a data map, and the data map is used to analyze and display location-related data. The data map is essentially a combination of a general chart and geographic information; its attribute data is not only attached to the spatial distribution position of the corresponding target but also serves as a basis or parameter for retrieving graphics.
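As an illustration only, the idea of attaching attribute data to a spatial position and then retrieving it by location could be sketched as follows. The grid-cell index, the `build_data_map` function, the `cell_size` parameter and the sample records are all assumptions made for this sketch; the patent does not specify a data structure.

```python
from collections import defaultdict


def build_data_map(records, cell_size=1.0):
    """Group attribute data into grid cells derived from (x, y) coordinates.

    The grid index is a hypothetical stand-in for the "data map": each cell
    key is a spatial position, and each value is the attribute data attached
    to that position.
    """
    data_map = defaultdict(list)
    for rec in records:
        cell = (int(rec["x"] // cell_size), int(rec["y"] // cell_size))
        data_map[cell].append(rec["attrs"])
    return dict(data_map)


# Hypothetical input records: coordinates plus free-form attribute data.
records = [
    {"x": 0.2, "y": 0.7, "attrs": {"name": "river A", "length_km": 12}},
    {"x": 0.9, "y": 0.1, "attrs": {"name": "residential area B", "population": 5000}},
    {"x": 3.4, "y": 2.2, "attrs": {"name": "river C", "length_km": 40}},
]

data_map = build_data_map(records)
# Retrieval by position: all attribute records that fall in cell (0, 0).
nearby = data_map[(0, 0)]
```

Looking up a cell returns every attribute record located there, which mirrors the role of attribute data as a retrieval parameter.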

Preferably, the category analysis of data input is completed through data mining, statistical analysis, diagnostic analysis and predictive analysis techniques. Data mining techniques use tools to extract data and examine its key patterns and insights; they accept numbers and convert them into information. Statistical analysis techniques examine samples for information such as the median and deviation, which can help analysts find relevant patterns. Diagnostic analysis techniques explain why certain problems occur by identifying patterns in the data. Predictive analysis techniques use existing data to predict what is likely to happen, which can be a key method of decision making. Data input category analysis can thus help guide analysts to important patterns in quantitative data sets, which is valuable to many industries.
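The statistical analysis step mentions examining a sample for its median and deviation. A minimal sketch using Python's standard `statistics` module is shown below; the `summarize` helper and the sample values are hypothetical, and the population standard deviation is used here as one possible reading of "deviation".

```python
import statistics


def summarize(sample):
    """Return the median and population standard deviation of a sample."""
    return {
        "median": statistics.median(sample),
        "stdev": statistics.pstdev(sample),
    }


# Hypothetical numeric sample taken from the input data.
summary = summarize([2, 4, 4, 4, 5, 5, 7, 9])
```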

Preferably, the weight analysis of the input data uses the analytic hierarchy process: the complex data is decomposed into its constituent factors, the factors are grouped into a hierarchical structure according to their dominance relationships, the relative importance of each factor is determined by pairwise comparison, and the judgments of the decision maker are then synthesized to determine the overall ranking of the relative importance of the decision scheme. The data is weighted in this way, and the system determines the storage location of the data from the weight analysis.
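The pairwise-comparison step of the analytic hierarchy process can be sketched as below. This is a minimal sketch under stated assumptions: it uses the row geometric mean method, a common approximation of the priority vector, because the patent does not say how the weights are computed. The `ahp_weights` function, the three factors and the comparison values are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more important factor i is than factor j;
    the matrix is expected to be reciprocal (pairwise[j][i] == 1 / pairwise[i][j]).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical factors: freshness, access frequency, size.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

weights = ahp_weights(pairwise)  # normalized so the weights sum to 1
```

The resulting weights give the overall ranking of the factors; a data item's weight coefficient could then be a weighted sum of its factor scores, although that aggregation step is likewise an assumption.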

Preferably, the data processing unit performs classification, regression, clustering, similarity matching, frequent item set analysis, statistical description, link prediction and data compression on the stored data.

Preferably, the classification in the data processing unit is a basic data analysis method: the data is divided into different parts and types according to its characteristics and then analyzed further, so that the essence of the object can be explored in more depth. Regression is a widely used statistical analysis method that determines the causal relationship between variables by specifying dependent and independent variables, establishes a regression model, solves the parameters of the model from the measured data, and then evaluates whether the regression model fits the measured data well. Clustering divides the data into clusters according to the intrinsic properties of the data, so that the elements within each cluster share the same characteristics as far as possible and the characteristics of different clusters differ as much as possible; unlike classification analysis, the classes are not known in advance. Similarity matching calculates the degree of similarity between two pieces of data by a certain method, and the degree of similarity is usually expressed as a percentage.
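Similarity matching expressed as a percentage can be sketched with Python's standard `difflib` module. The patent only refers to "a certain method", so the use of `SequenceMatcher` here is a swapped-in technique chosen for illustration, and `similarity_percent` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def similarity_percent(a, b):
    """Return the similarity of two strings as a percentage from 0 to 100."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


identical = similarity_percent("abc", "abc")
unrelated = similarity_percent("abc", "xyz")
close = similarity_percent("big data", "big date")
```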

Preferably, the frequent item set is a set of items that appear together frequently in the cases examined. Statistical description uses certain statistical indicators and indicator systems, chosen according to the characteristics of the data, to express the information conveyed by the data; it is the basic processing work of data analysis. Link prediction is a method for predicting the relationships that should exist among data, and it can be divided into prediction based on node attributes and prediction based on network structure. Link prediction based on node attributes analyzes information such as the attributes of nodes and the relationships between those attributes, and obtains the hidden relationships among nodes by methods such as node information knowledge sets and node similarity. Compared with node attribute data, network structure data is easier to obtain, and one main view in the field of complex networks holds that the characteristics of individuals in a network are less important than the relationships among individuals.
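Link prediction based on network structure can be sketched with the common-neighbors heuristic, which scores an unlinked pair of nodes by how many neighbors they share. The patent does not name a specific scoring rule, so this heuristic, the `common_neighbor_scores` function and the sample graph are assumptions made for illustration.

```python
from itertools import combinations


def common_neighbor_scores(adjacency):
    """Score each unlinked node pair by its number of shared neighbors.

    adjacency maps a node to the set of nodes it is linked to. Pairs that are
    already linked, or that share no neighbor, are omitted from the result.
    """
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # the link already exists, nothing to predict
        shared = adjacency[u] & adjacency[v]
        if shared:
            scores[(u, v)] = len(shared)
    return scores


# Hypothetical undirected graph stored as adjacency sets.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

scores = common_neighbor_scores(adjacency)
```

Only the network structure is used here; no node attributes are consulted, which matches the structure-based branch of link prediction.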

Preferably, the data compression means reducing the amount of data without losing useful information, so as to reduce storage space and improve transmission, storage and processing efficiency, or reorganizing the data according to a certain algorithm so as to reduce the redundancy and storage space of the data.
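Compression that loses no useful information can be illustrated with lossless compression from Python's standard `zlib` module. The choice of zlib (DEFLATE), the `compress_lossless` helper and the sample payload are assumptions; the patent does not name a compression algorithm.

```python
import zlib


def compress_lossless(data: bytes) -> bytes:
    """Compress bytes losslessly with zlib at the highest compression level."""
    return zlib.compress(data, level=9)


# Hypothetical, highly redundant payload.
raw = b"sensor=42;" * 200
packed = compress_lossless(raw)
restored = zlib.decompress(packed)  # round trip back to the original bytes
```

Because the payload is redundant, the compressed form is much smaller, and decompression recovers the original bytes exactly.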

Preferably, the data information layering unit processes data with a large weight coefficient and high activity into top data, and processes data with a small weight coefficient and low activity into settled data.
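The layering rule can be sketched as a simple threshold test on each record's weight coefficient and activity. The thresholds, the field names, the layer labels `top` and `settled`, and the extra `regular` layer for mixed cases are assumptions of this sketch; the patent describes only the two extreme layers and gives no numeric criteria.

```python
def layer_records(records, weight_threshold=0.5, activity_threshold=0.5):
    """Split records into layers by weight coefficient and activity.

    High weight and high activity -> "top"; low weight and low activity ->
    "settled". Mixed cases go to a hypothetical "regular" layer, which the
    source text does not describe.
    """
    layers = {"top": [], "settled": [], "regular": []}
    for rec in records:
        high_weight = rec["weight"] >= weight_threshold
        high_activity = rec["activity"] >= activity_threshold
        if high_weight and high_activity:
            layers["top"].append(rec["id"])
        elif not high_weight and not high_activity:
            layers["settled"].append(rec["id"])
        else:
            layers["regular"].append(rec["id"])
    return layers


# Hypothetical records with normalized weight and activity scores.
records = [
    {"id": "r1", "weight": 0.9, "activity": 0.8},
    {"id": "r2", "weight": 0.1, "activity": 0.2},
    {"id": "r3", "weight": 0.7, "activity": 0.1},
]

layers = layer_records(records)
```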

Advantageous effects

The invention provides an intelligent interconnected big data processing system. The system has the following beneficial effects:

(1) In this intelligent interconnected big data processing system, the original data is transmitted into the system through positioning on the external device and converted from its external format into an internal format that is convenient for the system to process. The positioning device on the external device locates the coordinates of the input data and compiles them to generate a data map, and the map is used to analyze and display location-related data. The map data structure is a hybrid that combines the management mode of a general relational database with the transmission of original data through external device positioning. By connecting the generated data map with people's life tracks, the system makes it convenient for people to retrieve and extract data.

(2) In this intelligent interconnected big data processing system, the data input category analysis analyzes data that has been input into the system from external raw data. Data mining techniques use tools to extract the data and examine its key patterns and insights; they accept numbers and convert them into information. Statistical analysis techniques examine samples for information such as the median and deviation, which can help analysts find relevant patterns. Diagnostic analysis techniques explain why certain problems occur by identifying patterns in the data. Predictive analysis techniques use existing data to predict what may happen, which can be a key method of decision making. Data input category analysis can help guide analysts to important patterns in quantitative data sets and is valuable to many industries. Completing the category analysis through data mining, statistical analysis, diagnostic analysis and predictive analysis techniques makes it convenient to extract valuable information from the data.

(3) In this intelligent interconnected big data processing system, data compression reduces the amount of data without losing useful information, so as to reduce storage space and improve transmission, storage and processing efficiency, or reorganizes the data according to a certain algorithm, thereby reducing the redundancy and storage space of the data.

(4) In this intelligent interconnected big data processing system, the compressed data is finally layered by the data information layering unit: the system processes data with a large weight coefficient and high activity into top data, and data with a small weight coefficient and low activity into settled data. Generating the data map, classifying the data and layering it reduce the amount of manual processing, prevent data with small weight coefficients and settled information from interfering with manual data processing, and improve data processing efficiency.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.

The structures, ratios, sizes and the like shown in the present specification are only used to accompany the contents disclosed in the specification so that those skilled in the art can understand and read them. They are not intended to limit the conditions under which the present invention can be implemented and therefore have no substantive technical significance. Any structural modification, change in proportional relationship or adjustment of size that does not affect the effects and objectives achievable by the present invention should still fall within the range covered by the technical contents disclosed in the present invention.

FIG. 1 is a flow chart of the system of the present invention;

FIG. 2 is a schematic flow chart of a data input unit according to the present invention;

FIG. 3 is a schematic diagram of a data processing unit and a data information layering unit according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example: an intelligent interconnected big data processing system, as shown in FIG. 1 to FIG. 3, comprises a data input unit, a data processing unit, a data classification unit and a data information layering unit. The data input unit comprises data input positioning analysis, data input category analysis and input data weight analysis.

The original data is transmitted into the system through positioning on an external device and converted from its external format into an internal format that is convenient for the system to process. A positioning device on the external device locates the coordinates of the input data and compiles them to generate a data map, and the map is used to analyze and display location-related data. The data map is essentially a combination of a general chart and geographic information. The corresponding graphic features and geographic attribute data are composed of different data items: for example, the attribute data describing a section of river includes its name, code, width, length, grade and navigability, while the attribute data describing a residential area includes its name, code, administrative affiliation, grade, area, population, traffic significance and political and cultural significance. The attribute data is not only attached to the spatial distribution position of the corresponding target but also serves as a basis or parameter for retrieving graphics. Since there is no necessary connection among these items, the attribute data can be organized into several two-dimensional tables. The map data structure is a hybrid that combines the management mode of a general relational database with the transmission of original data through external device positioning, and the generated data map is connected with people's life tracks.

The data input category analysis analyzes data that has been input into the system from external raw data. Data mining techniques use tools to extract the data and examine its key patterns and insights; they accept numbers and convert them into information. Statistical analysis techniques examine samples for information such as the median and deviation, which can help analysts find relevant patterns. Diagnostic analysis techniques explain why certain problems occur by identifying patterns in the data. Predictive analysis techniques use existing data to predict what may happen, which can be a key method of decision making. Data input category analysis can help guide analysts to important patterns in quantitative data sets and is valuable to many industries. The category analysis of data input is completed through data mining, statistical analysis, diagnostic analysis and predictive analysis techniques.

After the input data has passed the category analysis, the system then performs a weight analysis on the data and uses the result to determine the storage location of the data. The weight is a quantitative value expressing the value, relative importance and proportion of a group of data within the whole, that is, the importance of the analyzed data segment to the entire database; it is judged and calculated by dividing the problem into multiple levels of indicators. The analytic hierarchy process decomposes the complex data into its constituent factors, groups the factors into a hierarchical structure according to their dominance relationships, determines the relative importance of each factor by pairwise comparison, and then synthesizes the judgments of the decision maker to determine the overall ranking of the relative importance of the decision scheme. Based on the weight analysis, the data is transmitted to different storage locations.

The stored data passes through the data processing unit, which performs classification, regression, clustering, similarity matching, frequent item set analysis, statistical description, link prediction and data compression on the stored data. Classification is a basic data analysis method: the data is divided into different parts and types according to its characteristics and then analyzed further, so that the essence of the object can be explored in more depth. Regression is a widely used statistical analysis method that determines the causal relationship between variables by specifying dependent and independent variables, establishes a regression model, solves the parameters of the model from the measured data, and then evaluates whether the regression model fits the measured data well; if it does, further predictions can be made from the independent variables. Clustering divides the data into clusters according to the intrinsic properties of the data, so that the elements within each cluster share the same characteristics as far as possible and the characteristics of different clusters differ as much as possible; unlike classification analysis, the classes are not known in advance. Similarity matching calculates the degree of similarity between two pieces of data by a certain method, and the degree of similarity is usually expressed as a percentage. Similarity matching algorithms are used in many different computing scenarios, such as data cleaning, correction of user input errors, recommendation statistics, plagiarism detection systems, automatic grading systems, web page search and DNA sequence matching. A frequent item set is a set of items that appear together frequently in the cases examined. Statistical description uses certain statistical indicators and indicator systems, chosen according to the characteristics of the data, and is the basic processing work of data analysis.

Link prediction is a method for predicting the relationships that should exist among data, and it can be divided into prediction based on node attributes and prediction based on network structure. Link prediction based on node attributes analyzes information such as the attributes of nodes and the relationships between those attributes, and obtains the hidden relationships among nodes by methods such as node information knowledge sets and node similarity. Compared with node attribute data, network structure data is easier to obtain, and one main view in the field of complex networks holds that the characteristics of individuals in a network are less important than the relationships among individuals. Data compression is a technical method that reduces the amount of data without losing useful information, so as to reduce storage space and improve transmission, storage and processing efficiency, or reorganizes the data according to a certain algorithm so as to reduce the redundancy and storage space of the data. Data compression is divided into lossy compression and lossless compression.

Finally, the compressed data is layered by the data information layering unit: the system processes data with a large weight coefficient and high activity into top data, and data with a small weight coefficient and low activity into settled data. Generating the data map, classifying the data and layering it reduce the amount of manual processing, prevent data with small weight coefficients and settled information from interfering with manual data processing, and improve data processing efficiency.

The working principle is as follows: the original data is transmitted into the system through positioning on the external device and converted from its external format into an internal format that is convenient for the system to process. The positioning device on the external device locates the coordinates of the input data and compiles them to generate a data map, and the map is used to analyze and display location-related data. The map data structure is a hybrid that combines the management mode of a general relational database with the transmission of original data through external device positioning. Connecting the generated data map with people's life tracks makes it convenient for people to retrieve and extract data.

The data input category analysis analyzes data that has been input into the system from external raw data. Data mining techniques use tools to extract the data and examine its key patterns and insights; they accept numbers and convert them into information. Statistical analysis techniques examine samples for information such as the median and deviation, which can help analysts find relevant patterns. Diagnostic analysis techniques explain why certain problems occur by identifying patterns in the data. Predictive analysis techniques use existing data to predict what may occur, which can be a key method of decision making. Data input category analysis can help guide analysts to important patterns in quantitative data sets and is valuable to many industries. The category analysis of data input is completed through data mining, statistical analysis, diagnostic analysis and predictive analysis techniques. After the input data has passed the category analysis, the system then performs a weight analysis on the data to determine the importance of the analyzed data segment to the entire database.

On the premise of not losing useful information, the data compression reduces the data volume to reduce the storage space and improve the transmission, storage and processing efficiency of the data, or reorganizes the data according to a certain algorithm to reduce the redundancy and storage space of the data.

Finally, the compressed data is layered by the data information layering unit: the system processes data with a large weight coefficient and high activity into top data, and data with a small weight coefficient and low activity into settled data. Generating the data map, classifying the data and layering it reduce the amount of manual processing, prevent data with small weight coefficients and settled information from interfering with manual data processing, and improve data processing efficiency.

The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. An intelligent interconnected big data processing system, characterized in that: it comprises a data input unit, a data processing unit, a data classification unit and a data information layering unit; the data input unit comprises data input positioning analysis, data input category analysis and input data weight analysis; the data input category analysis analyzes data that has been input into the system from external raw data; after the input data has passed the data input category analysis, the system then performs a weight analysis on the data and uses the result to determine the storage location of the data; the weight is a quantitative value expressing the value, relative importance and proportion of a group of data within the whole, that is, the importance of the analyzed data segment to the entire database, and can be judged and calculated by dividing the problem into multiple levels of indicators; the stored data passes through the data processing unit, and finally the compressed data is layered by the data information layering unit.

2. The intelligent interconnected big data processing system according to claim 1, characterized in that: the data input positioning analysis transmits the original data into the system through positioning on an external device and converts the data from its external format into an internal format that is convenient for the system to process; the positioning device on the external device locates the coordinates of the input data and compiles them to generate a data map, and the map is used to analyze and display location-related data; the data map is essentially a combination of a general chart and geographic information, and its attribute data is not only attached to the spatial distribution position of the corresponding target but also serves as a basis or parameter for retrieving graphics.

3. The intelligent interconnected big data processing system according to claim 1, characterized in that: in the data input category analysis, data mining techniques use tools to extract data and examine its key patterns and insights, accepting numbers and converting them into information; statistical analysis techniques examine samples for information such as the median and deviation, which can help analysts find relevant patterns; diagnostic analysis techniques explain why certain problems occur by identifying patterns in the data; predictive analysis techniques use existing data to predict what is likely to happen, which can be a key method of decision making; data input category analysis can help guide analysts to important patterns in quantitative data sets and is valuable to many industries; and the category analysis of data input is completed through data mining, statistical analysis, diagnostic analysis and predictive analysis techniques.

4. The intelligent interconnected big data processing system according to claim 1, characterized in that: the input data weight analysis uses the analytic hierarchy process to decompose complex data into its constituent factors, groups the factors into a hierarchical structure according to their dominance relationships, determines the relative importance of each factor by pairwise comparison, and then synthesizes the judgments of the decision maker to determine the overall ranking of the relative importance of the decision scheme; the data is weighted in this way, and the system determines the storage location of the data from the weight analysis.

5. The intelligent interconnected big data processing system according to claim 1, characterized in that: the data processing unit performs classification, regression, clustering, similarity matching, frequent item set analysis, statistical description, link prediction and data compression on the stored data.

6. The intelligent interconnected big data processing system according to claim 5, characterized in that: the classification in the data processing unit is a basic data analysis method in which the data is divided into different parts and types according to its characteristics and then analyzed further, so that the essence of the object can be explored in more depth; regression is a widely used statistical analysis method that determines the causal relationship between variables by specifying dependent and independent variables, establishes a regression model, solves the parameters of the model from the measured data, and then evaluates whether the regression model fits the measured data well; clustering divides the data into clusters according to the intrinsic properties of the data, so that the elements within each cluster share the same characteristics as far as possible and the characteristics of different clusters differ as much as possible, and, unlike classification analysis, the classes are not known in advance; similarity matching calculates the degree of similarity between two pieces of data by a certain method, and the degree of similarity is usually expressed as a percentage.

7. The intelligent interconnected big data processing system according to claim 5, characterized in that: the frequent item set is a set of items that appear together frequently in the cases examined; statistical description uses certain statistical indicators and indicator systems, chosen according to the characteristics of the data, to express the information conveyed by the data, and is the basic processing work of data analysis; link prediction is a method for predicting the relationships that should exist among data and can be divided into prediction based on node attributes and prediction based on network structure; link prediction based on node attributes analyzes information such as the attributes of nodes and the relationships between those attributes, and obtains the hidden relationships among nodes by methods such as node information knowledge sets and node similarity; compared with node attribute data, network structure data is easier to obtain, and one main view in the field of complex networks holds that the characteristics of individuals in a network are less important than the relationships among individuals.

8. The intelligent interconnected big data processing system according to claim 5, characterized in that: the data compression means reducing the amount of data without losing useful information, so as to reduce storage space and improve transmission, storage and processing efficiency, or reorganizing the data according to a certain algorithm so as to reduce the redundancy and storage space of the data.

9. The intelligent interconnected big data processing system according to claim 1, characterized in that: the data information layering unit processes data with a large weight coefficient and high activity into top data, and processes data with a small weight coefficient and low activity into settled data.