Disclosure of Invention
The invention provides a secure data transfer method, apparatus, device and medium for data elements in heterogeneous data sources, which solve the problem in the related art of how to transfer data securely and efficiently between heterogeneous data sources.
In order to achieve the above purpose, the application adopts the following technical scheme:
in a first aspect, a method for secure data transfer of a data element in a heterogeneous data source is provided, including:
Based on the established connection relation between the data sources and the super management account, obtaining connection information between the data sources and super user information, and registering the data sources to a management center through corresponding JDBC;
managing a connection process between data sources through the management center;
managing the generation process of the data element through the management center;
Receiving a data query request initiated by a client through a pre-configured task scheduling node, and sending the data query request to the management center;
The management center queries and analyzes metadata information of a source data source and related information corresponding to a target data source according to the data query request, and transmits the analyzed data to the task scheduling node;
the task scheduling node registers a node task according to the data query request, and executes the node task according to the analyzed data to load the data;
the client obtains a data loading result of the task scheduling node through a data query request, and executes a corresponding user task according to the data loading result, and after the user task is executed, a calculation result is obtained and written out.
In a first possible implementation manner of the first aspect, the step of managing, by the management center, a connection procedure between data sources includes:
determining and acquiring roles of users according to the development, production and circulation processes of the data elements;
automatically authorizing or refusing connection to the data source according to the role of the user;
when the data source comes back online, notifying users whose access requests arrived while the data source was offline that the data source has recovered;
detecting the number of JDBC connections of the data source; issuing an early warning when the number of connections exceeds 80% of a preset threshold, ranking the currently connected users by activity, and sending a possible-closure notification to the least active 20% of connected users; and closing the connections of the least active 20% of connected users when the number of connections exceeds the threshold, wherein a connected user is deemed inactive if no data transmission is detected within 10 minutes.
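As a non-limiting illustration, role-based authorization and the recovery notification described above might be sketched as follows. The role names, the `ROLE_POLICY` table and the `ConnectionGate` class are hypothetical and are introduced only for this example; the application does not prescribe them.

```python
from dataclasses import dataclass, field

# Hypothetical policy: which data sources each role may connect to.
ROLE_POLICY = {
    "developer": {"dev_db"},
    "producer": {"dev_db", "prod_db"},
    "circulator": {"prod_db"},
}


@dataclass
class ConnectionGate:
    online: set = field(default_factory=set)     # data sources currently online
    pending: dict = field(default_factory=dict)  # source -> users who asked while it was offline

    def request(self, user: str, role: str, source: str) -> str:
        """Authorize or refuse a connection according to the user's role."""
        if source not in ROLE_POLICY.get(role, set()):
            return "rejected"
        if source not in self.online:
            # Remember the request so the user can be notified on recovery.
            self.pending.setdefault(source, []).append(user)
            return "offline"
        return "authorized"

    def bring_online(self, source: str) -> list:
        """Mark a source online and build recovery notifications for waiting users."""
        self.online.add(source)
        return [f"{user}: {source} is back online" for user in self.pending.pop(source, [])]
```

A request made while the data source is offline is recorded rather than dropped, so that `bring_online` can return one notification per waiting user.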
In a second possible implementation manner of the first aspect, the step of managing, by the management center, a generation process of the data element includes:
periodically scanning and maintaining metadata information of each data source, and recording scanning time;
when the data of the data source is increased, automatically synchronizing and updating metadata information of the data source;
The management center periodically scans heartbeat information of the data source to ensure that the data source is maintained in a usable state;
periodically scanning a registered data source list to ensure that new data sources can be discovered and used at any time;
Based on the virtual JDBC provided by the management center, the virtual JDBC is automatically routed to the JDBC corresponding to the data source according to the data resources required in the development, production and circulation processes of the data element and the metadata information of the data source.
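As a non-limiting illustration, the periodic metadata scan and heartbeat check might be sketched as follows. An in-memory SQLite database stands in for a registered data source; this is an assumption made so that the example is self-contained, and the function names `heartbeat` and `scan_metadata` are hypothetical.

```python
import sqlite3
import time


def heartbeat(conn: sqlite3.Connection) -> bool:
    """Return True if the data source still answers a trivial query."""
    try:
        conn.execute("SELECT 1").fetchone()
        return True
    except sqlite3.Error:
        return False


def scan_metadata(conn: sqlite3.Connection) -> dict:
    """Collect table and column names and record the scan time."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    columns = {}
    for table in tables:
        # Column 1 of PRAGMA table_info is the column name.
        columns[table] = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    return {"tables": columns, "scanned_at": time.time()}
```

Running `scan_metadata` again after a table has been added returns the updated metadata, which corresponds to the synchronous update described above; a scheduler calling both functions at a fixed interval would be one way to make the scan periodic.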
In a third possible implementation manner of the first aspect, the step of transferring the parsed data to the task scheduling node includes:
the data is transferred by reflection, or through a public buffer layer combined with fine-grained permissions.
In a second aspect, there is provided a secure data transfer device for a data element at a heterogeneous data source, comprising:
The data source registration module is used for acquiring connection information and super user information between the data sources based on the established connection relation between the data sources and the super management account, and registering the data sources to a management center through corresponding JDBC;
The data source connection management module is used for managing the connection process between the data sources through the management center;
the data element generation management module is used for managing the generation process of the data element through the management center;
the task scheduling node module is used for receiving a data query request initiated by a client through a pre-configured task scheduling node and sending the data query request to the management center;
the query and analysis module is used for the management center to query and parse the metadata information of the source data source and the related information corresponding to the target data source according to the data query request, and to transmit the parsed data to the task scheduling node;
The node data loading module is used for the task scheduling node to register a node task according to the data query request and execute the node task according to the analyzed data to load data;
and the user task execution module is used for the client to obtain the data loading result of the task scheduling node through the data query request, and execute the corresponding user task according to the data loading result, and the user task obtains the calculation result after the execution is completed and writes out the calculation result.
In a first possible implementation manner of the second aspect, the data source connection management module is specifically configured to:
determining and acquiring roles of users according to the development, production and circulation processes of the data elements;
automatically authorizing or refusing connection to the data source according to the role of the user;
when the data source comes back online, notifying users whose access requests arrived while the data source was offline that the data source has recovered;
detecting the number of JDBC connections of the data source; issuing an early warning when the number of connections exceeds 80% of a preset threshold, ranking the currently connected users by activity, and sending a possible-closure notification to the least active 20% of connected users; and closing the connections of the least active 20% of connected users when the number of connections exceeds the threshold, wherein a connected user is deemed inactive if no data transmission is detected within 10 minutes.
In a second possible implementation manner of the second aspect, the data element generation management module is specifically configured to:
periodically scanning and maintaining metadata information of each data source, and recording scanning time;
when the data of the data source is increased, automatically synchronizing and updating metadata information of the data source;
The management center periodically scans heartbeat information of the data source to ensure that the data source is maintained in a usable state;
periodically scanning a registered data source list to ensure that new data sources can be discovered and used at any time;
Based on the virtual JDBC provided by the management center, the virtual JDBC is automatically routed to the JDBC corresponding to the data source according to the data resources required in the development, production and circulation processes of the data element and the metadata information of the data source.
In a third possible implementation manner of the second aspect, the query and analysis module is specifically configured to:
transfer the data by reflection, or through a public buffer layer combined with fine-grained permissions.
In a third aspect, an electronic device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the secure data transfer method of data elements in heterogeneous data sources according to the first aspect.
In a fourth aspect, there is provided a readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the secure data transfer method of the data element in a heterogeneous data source according to the first aspect.
The beneficial effects are that:
The application avoids the limitation of complicated configuration information of the database depending on personnel management, realizes centralized automatic management of the configuration information, ensures that the configuration of the database is more consistent and standardized, is convenient for monitoring and maintenance, reduces human errors and improves the efficiency.
The application provides an elastic form of data abstraction in which the data can remain invisible to the user; by avoiding the generation of an intermediate process table, it reduces the risk of improper access or leakage, protects data security, reduces data storage requirements, and improves data processing efficiency.
Detailed Description
To further explain the technical means adopted by the present application to achieve its intended purpose, and the effects of those means, the technical solutions in the embodiments of the present application are described clearly below. Obviously, the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the scope of protection of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application are capable of operation in sequences other than those illustrated or otherwise described herein, and that the objects identified by "first," "second," etc. are generally of a type not limited to the number of objects, for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The steps of the method flow described in the specification and the flow chart shown in the drawings of the specification are not necessarily strictly executed according to step numbers, and the execution order of the steps of the method may be changed. Moreover, some steps may be omitted, multiple steps may be combined into one step to be performed, and/or one step may be decomposed into multiple steps to be performed.
The following describes in detail the secure data transfer method, apparatus, device and medium of the data element in heterogeneous data source according to the embodiment of the present application with reference to the attached drawings and the preferred embodiments.
Firstly, an application scenario of the secure data transfer method of the data element in the heterogeneous data source according to the embodiment of the present application will be described in detail.
Different database management systems, file formats, network protocols, etc., result in the heterogeneity of different data sources.
In the data transfer process between heterogeneous data sources, an intermediate process table is often generated, which not only increases the complexity of data management but may also create a risk of data leakage. Existing data transfer methods often ignore the security problems caused by the intermediate process table during data migration, as well as the form in which data elements exist and the security of reading and writing data elements over heterogeneous data connections. These problems leave data vulnerable to attack during transfer and increase the risk of data leakage and tampering.
Aiming at the technical problems, the application provides a secure data transfer method of a data element in heterogeneous data sources, which can effectively manage connection information and metadata of a plurality of data sources, and can ensure the security and integrity of data in the data transfer process so as to meet the high requirements of modern enterprises on the security and reliability of the data.
Referring to fig. 1-2, an embodiment of the present application provides a secure data transfer method of a data element in a heterogeneous data source, where, as shown in fig. 1-2, the secure data transfer method of the embodiment of the present application includes the following steps:
Step S1, based on the established connection relation between the data sources and the super management account, obtaining connection information between the data sources and super user information, and registering the connection information and the super user information to a management center through the corresponding JDBC of the data sources.
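As a non-limiting illustration, the registration of step S1 might be sketched as follows. The `SourceRegistry` and `DataSourceRecord` names and the example URLs are hypothetical; the only external fact relied on is that a JDBC URL has the form `jdbc:<subprotocol>:<subname>`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceRecord:
    name: str
    jdbc_url: str
    super_user: str

    @property
    def subprotocol(self) -> str:
        # "jdbc:mysql://host/db" -> "mysql"
        return self.jdbc_url.split(":")[1]


class SourceRegistry:
    """Hypothetical stand-in for the registry kept by the management center."""

    def __init__(self) -> None:
        self._records = {}

    def register(self, name: str, jdbc_url: str, super_user: str) -> DataSourceRecord:
        if not jdbc_url.startswith("jdbc:"):
            raise ValueError(f"not a JDBC URL: {jdbc_url}")
        if name in self._records:
            raise ValueError(f"data source already registered: {name}")
        record = DataSourceRecord(name, jdbc_url, super_user)
        self._records[name] = record
        return record

    def registered_sources(self) -> list:
        return sorted(self._records)
```

In this sketch the connection information and the super user information are held only by the registry, so that other components would ask the registry for a data source by name instead of keeping their own copies.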
Step S2, managing the connection process between the data sources through the management center.
In some possible embodiments, the managing the connection process includes:
Step S201, determining and acquiring roles of users according to the development, production and circulation processes of the data elements.
Step S202, automatically authorizing or refusing connection to the data source according to the role of the user.
Step S203, automatically bringing the data source online or taking it offline according to its health state, and, when the data source comes back online, notifying users whose access requests arrived while it was offline that the data source has recovered.
Step S204, detecting the number of JDBC connections of the data source; issuing an early warning when the number of connections exceeds 80% of a preset threshold, ranking the currently connected users by activity, and sending a possible-closure notification to the least active 20% of connected users; and closing the connections of the least active 20% of connected users when the number of connections exceeds the threshold. A connected user is deemed inactive if no data transmission is detected within 10 minutes.
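As a non-limiting illustration, the connection-count handling of step S204 might be sketched as follows. The text does not state how the "least active 20%" and the 10-minute inactivity criterion combine; this sketch assumes that the 20% most idle connections are selected first and that only those idle for at least 10 minutes are acted on. The function names are hypothetical.

```python
import math

INACTIVE_SECONDS = 10 * 60   # inactivity criterion: no data transfer for 10 minutes
WARN_RATIO = 0.8             # early warning above 80% of the threshold
LEAST_ACTIVE_RATIO = 0.2     # act on the least active 20% of connected users


def least_active_users(last_transfer: dict, now: float) -> list:
    """Rank users by idle time and keep the inactive ones among the most idle 20%."""
    ranked = sorted(last_transfer, key=lambda user: now - last_transfer[user], reverse=True)
    k = math.ceil(LEAST_ACTIVE_RATIO * len(ranked))
    return [user for user in ranked[:k] if now - last_transfer[user] >= INACTIVE_SECONDS]


def evaluate(last_transfer: dict, threshold: int, now: float) -> tuple:
    """Return ("none" | "notify" | "close", affected users) for the current connections.

    last_transfer maps each connected user to the time of its last data transfer.
    """
    count = len(last_transfer)
    if count <= WARN_RATIO * threshold:
        return ("none", [])
    action = "close" if count > threshold else "notify"
    return (action, least_active_users(last_transfer, now))
```

With a threshold of 10, nine connections would trigger the "notify" branch and eleven connections the "close" branch; recently active users are never selected, even if they fall within the most idle 20%.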
Step S3, managing the generation process of the data element through the management center.
In the generation process of the data elements, unified super account management realizes unified management of multiple data sources; the management center can automatically scan and update the metadata information of the data sources, and the virtual JDBC is authorized to access source data directly through the management center. This keeps the data timely and valid during data element generation and enables elastic routing authorization for the data of the data sources.
In some possible embodiments, managing the generation process of the data element includes:
Step S301, periodically scanning and maintaining metadata information of each data source, and recording the scanning time.
Step S302, when the data of the data source is increased, the metadata information of the data source is automatically updated synchronously.
Step S303, the management center periodically scans the heartbeat information of the data source to ensure that the data source is maintained in a usable state.
Step S304, periodically scans the registered data source list to ensure that new data sources can be discovered and used at any time.
Step S305, based on the virtual JDBC provided by the management center, the virtual JDBC is automatically routed to the JDBC of the corresponding data source according to the data resources required in the development, production and circulation processes of the data element and the metadata information of the data source.
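As a non-limiting illustration, the routing of step S305 might be sketched as a lookup from a requested table to the JDBC URL of the data source whose scanned metadata contains that table. The `VirtualRouter` class and the example URLs are hypothetical; an actual virtual JDBC would sit behind a driver interface and is not shown here.

```python
class VirtualRouter:
    """Hypothetical router: resolves a required table to a concrete JDBC URL."""

    def __init__(self) -> None:
        self._urls = {}         # data source name -> JDBC URL
        self._table_index = {}  # table name -> data source name

    def update_metadata(self, source: str, jdbc_url: str, tables: list) -> None:
        """Record the latest scan result for one data source."""
        self._urls[source] = jdbc_url
        # Drop stale entries of this source before indexing the new table list.
        self._table_index = {t: s for t, s in self._table_index.items() if s != source}
        for table in tables:
            self._table_index[table] = source

    def route(self, table: str) -> str:
        """Return the JDBC URL of the data source that holds the table."""
        source = self._table_index.get(table)
        if source is None:
            raise LookupError(f"no registered data source provides table {table!r}")
        return self._urls[source]
```

Because the index is rebuilt from each scan result, a table that disappears from a data source stops being routable after the next metadata update, and a newly discovered data source becomes routable as soon as it is scanned.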
Step S4, receiving a data query request initiated by the client through a pre-configured task scheduling node, and sending the data query request to the management center.
Step S5, the management center queries and parses the metadata information of the source data source and the related information corresponding to the target data source according to the data query request, and transmits the parsed data to the task scheduling node.
The data may be transmitted by reflection, or through a public buffer layer combined with fine-grained permissions.
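As a non-limiting illustration, the two transmission modes might be sketched as follows. The application does not detail either mode, so everything here is an assumption: "reflection" is shown as looking up a receiving method by name at run time, and the public buffer layer is shown as a store in which each entry can be read once, and only by the reader it was written for. All class and function names are hypothetical.

```python
import secrets


def deliver_by_reflection(target: object, method_name: str, payload: dict):
    """Look up the receiving method by name at run time and call it with the payload."""
    return getattr(target, method_name)(payload)


class PublicBuffer:
    """Hypothetical buffer layer with a per-entry read permission."""

    def __init__(self) -> None:
        self._slots = {}  # token -> (allowed reader, payload)

    def put(self, payload: dict, allowed_reader: str) -> str:
        token = secrets.token_hex(8)
        self._slots[token] = (allowed_reader, payload)
        return token

    def take(self, token: str, reader: str) -> dict:
        """Hand the payload to the permitted reader once, then forget it."""
        entry = self._slots.get(token)
        if entry is None or entry[0] != reader:
            raise PermissionError("reader is not permitted to take this entry")
        del self._slots[token]
        return entry[1]
```

Deleting the entry on the first successful read means that the parsed data does not linger in the buffer after the task scheduling node has received it.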
Step S6, the task scheduling node registers the node task according to the data query request, and executes the node task according to the parsed data to load the data.
Step S7, the client obtains the data loading result of the task scheduling node through the data query request, executes the corresponding user task according to the data loading result, and obtains and writes out a calculation result after the user task is executed.
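As a non-limiting illustration, steps S5 to S7 might be sketched end to end as follows. Two in-memory SQLite databases stand in for the source and target data sources, and the `catalog` dictionary stands in for the metadata held by the management center; these, together with the function names and the example table, are assumptions made so that the example is self-contained.

```python
import sqlite3


def resolve(request: dict, catalog: dict) -> dict:
    """Management center (S5): look up source metadata and target information."""
    src = catalog[request["source"]]
    tgt = catalog[request["target"]]
    return {"src_conn": src["conn"], "tgt_conn": tgt["conn"],
            "table": request["table"], "columns": src["tables"][request["table"]]}


def load(plan: dict) -> list:
    """Task scheduling node (S6): load rows into memory; no intermediate table is created."""
    cols = ", ".join(plan["columns"])
    return plan["src_conn"].execute(f"SELECT {cols} FROM {plan['table']}").fetchall()


def run_user_task(rows: list, plan: dict) -> float:
    """Client (S7): compute on the loaded rows and write the result to the target."""
    total = sum(amount for _id, amount in rows)
    plan["tgt_conn"].execute("CREATE TABLE IF NOT EXISTS result (total REAL)")
    plan["tgt_conn"].execute("INSERT INTO result VALUES (?)", (total,))
    return total


# Example wiring with hypothetical data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 5.5)])
target = sqlite3.connect(":memory:")
catalog = {"src": {"conn": source, "tables": {"orders": ["id", "amount"]}},
           "tgt": {"conn": target, "tables": {}}}

plan = resolve({"source": "src", "target": "tgt", "table": "orders"}, catalog)
total = run_user_task(load(plan), plan)
```

In this sketch the user task only ever sees the loaded rows; the connections are resolved by the management-center function and are not exposed in the request itself.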
In the above steps, the data element can be understood as follows: during data transfer between heterogeneous data sources, the metadata management platform is isolated from the task scheduling and execution platform. A task scheduling node set up in the heterogeneous database system receives the data query request initiated by the query client, and the scheduling node asks the management center to parse the metadata information of the source data source and the related information of the target data source. This ensures that a data element, when transferred on the scheduling platform, does not store its connection information and is used immediately. Across the whole task scheduling platform, only the program knows the metadata information of the data element.
Thus, a data element between heterogeneous data sources can be understood as an elastic data abstraction: when data conversion is performed, it represents a collection of elements that is immutable, partitionable, and computable in parallel. Its elasticity is reflected as follows:
(1) Fault-tolerance elasticity: lost data can be recovered automatically;
(2) Storage elasticity: data is switched automatically between memory and disk;
(3) Computation elasticity: a failed computation is retried;
(4) Partition elasticity: data can be re-partitioned as needed.
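As a non-limiting illustration, an immutable, partitioned collection with a retry mechanism and re-partitioning might be sketched as follows. Only items (3) and (4) are shown; automatic recovery of lost data and switching between memory and disk are not modeled. The `ElasticDataset` class is hypothetical and is not the implementation of the present application.

```python
class ElasticDataset:
    """Hypothetical immutable collection split into partitions."""

    def __init__(self, partitions) -> None:
        # Tuples make the partitions immutable once the dataset is built.
        self._partitions = tuple(tuple(p) for p in partitions)

    @property
    def partitions(self) -> tuple:
        return self._partitions

    def repartition(self, n: int) -> "ElasticDataset":
        """Partition elasticity: redistribute the elements over n partitions."""
        items = [x for p in self._partitions for x in p]
        return ElasticDataset([items[i::n] for i in range(n)])

    def map_partitions(self, fn, retries: int = 2) -> "ElasticDataset":
        """Computation elasticity: retry a failed partition before giving up."""
        results = []
        for partition in self._partitions:
            for attempt in range(retries + 1):
                try:
                    results.append(fn(partition))
                    break
                except Exception:
                    if attempt == retries:
                        raise
        return ElasticDataset(results)
```

Every operation returns a new dataset and leaves the original unchanged, so a failed partition can be recomputed from its input without affecting the other partitions; the partitions are processed sequentially here, although they are independent and could be processed in parallel.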
No intermediate process table is generated while a data element is processed; the data is loaded before the user uses it, and the user never contacts the data source directly, which ensures the reliability and security of the data element during operation.
Based on the technical scheme, the method and the device avoid the limitation that the complicated configuration information of the database depends on personnel management, realize centralized automatic management of the configuration information, enable the configuration of the database to be more consistent and standardized, facilitate monitoring and maintenance, reduce human errors and improve efficiency.
The application provides an elastic form of data abstraction in which the data can remain invisible to the user; by avoiding the generation of an intermediate process table, it reduces the risk of improper access or leakage, protects data security, reduces data storage requirements, and improves data processing efficiency.
Referring to fig. 3, corresponding to the above embodiment of the secure data transfer method of the data element in the heterogeneous data source, the embodiment of the present application provides a secure data transfer device of the data element in the heterogeneous data source, where the secure data transfer device includes:
the data source registration module 1001 is configured to obtain connection information and super user information between data sources based on the established connection relationship between the data sources and the super management account, and register the connection information and the super user information to a management center through corresponding JDBC of the data sources;
a data source connection management module 1002, configured to manage a connection process between data sources through the management center;
a data element generation management module 1003, configured to manage a generation process of a data element by using the management center;
a task scheduling node module 1004, configured to receive, through a task scheduling node configured in advance, a data query request initiated by a client, and send the data query request to the management center;
the query and analysis module 1005 is configured to query and analyze metadata information of a source data source and related information corresponding to a target data source according to the data query request by the management center, and transmit the analyzed data to the task scheduling node;
the node data loading module 1006 is configured to register a node task according to the data query request by the task scheduling node, and execute the node task according to the parsed data to perform data loading;
The user task execution module 1007 is configured to obtain a data loading result of the task scheduling node by using the client through a data query request, execute a corresponding user task according to the data loading result, and obtain a calculation result and write out after the user task is executed.
Further, the data source connection management module is specifically configured to:
determining and acquiring roles of users according to the development, production and circulation processes of the data elements;
automatically authorizing or refusing connection to the data source according to the role of the user;
when the data source comes back online, notifying users whose access requests arrived while the data source was offline that the data source has recovered;
detecting the number of JDBC connections of the data source; issuing an early warning when the number of connections exceeds 80% of a preset threshold, ranking the currently connected users by activity, and sending a possible-closure notification to the least active 20% of connected users; and closing the connections of the least active 20% of connected users when the number of connections exceeds the threshold, wherein a connected user is deemed inactive if no data transmission is detected within 10 minutes.
Further, the data element generation management module is specifically configured to:
periodically scanning and maintaining metadata information of each data source, and recording scanning time;
when the data of the data source is increased, automatically synchronizing and updating metadata information of the data source;
The management center periodically scans heartbeat information of the data source to ensure that the data source is maintained in a usable state;
periodically scanning a registered data source list to ensure that new data sources can be discovered and used at any time;
Based on the virtual JDBC provided by the management center, the virtual JDBC is automatically routed to the JDBC corresponding to the data source according to the data resources required in the development, production and circulation processes of the data element and the metadata information of the data source.
Further, the query and analysis module is specifically configured to:
transfer the data by reflection, or through a public buffer layer combined with fine-grained permissions.
The secure data transfer device of the data element in the heterogeneous data source implements the steps and processes of the above embodiment of the secure data transfer method and can achieve the same technical effects; to avoid repetition, they are not described again here.
Referring to fig. 4, corresponding to the embodiment of the secure data transfer method of the data element in the heterogeneous data source, the embodiment of the present application provides an electronic device, where the electronic device includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the computer program when executed by the processor implements the steps of the embodiment of the secure data transfer method of the data element in the heterogeneous data source and the processes of the embodiment, and can achieve the same technical effects, so that repetition is avoided and no further description is given here.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first memory area storing programs or instructions and a second memory area storing data, wherein the first memory area may store an operating system, application programs or instructions (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like. Further, the memory 1009 may include volatile memory or nonvolatile memory, or the memory 1009 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory, among others. The volatile memory may be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM). Memory 1009 in embodiments of the application includes, but is not limited to, these and any other suitable types of memory.
The processor 1010 may include one or more processing units, and optionally the processor 1010 integrates an application processor that primarily processes operations involving an operating system, user interface, application program, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 1010.
Corresponding to the embodiment of the secure data transfer method of the data element in the heterogeneous data source, the embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and the program or the instruction when executed by a processor implements the steps of the embodiment of the secure data transfer method of the data element in the heterogeneous data source and the processes of the embodiment, and can achieve the same technical effects, so that repetition is avoided and no further description is given here.
The processor is the processor in the electronic device described in the above embodiment of the present application. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.