
CN113990399A - Privacy-protecting genetic data sharing method and device - Google Patents

Publication number
CN113990399A
Authority
CN
China
Prior art keywords
data
encrypted
query
public key
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111274064.7A
Other languages
Chinese (zh)
Other versions
CN113990399B (en)
Inventor
陈智罡
宋新霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Wanli University
Original Assignee
Zhejiang Wanli University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Wanli University
Priority to CN202111274064.7A
Publication of CN113990399A
Application granted
Publication of CN113990399B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/40 Encryption of genetic data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioethics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Computer Hardware Design (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a privacy- and security-preserving gene data sharing method and device that let data providers control who can access and use their data. The method applied to the data providing terminal comprises the following steps: generating a group of virtual individual gene data, so that an attacker is prevented as far as possible from linking encrypted data to plaintext data by observation; generating a key with which the data provider encrypts an original file to obtain a symmetrically encrypted file; assigning an identifier to each virtual individual and each data provider, so that virtual individuals do not affect the statistics computed by a data querier; encrypting the original file, the virtual individual gene data, the key and the identifier with the collective public key and sending the encrypted data to the computing node; and setting an access policy and receiving and replying to data access notifications. The invention combines several complementary technologies, namely homomorphic encryption, encryption-based access control by the data owner, and blockchain technology, to achieve controllable, transparent and secure genome data sharing.

Description

Gene data sharing method and device for protecting privacy and safety
Technical Field
The invention relates to the technical field of data encryption, in particular to a method and a device for sharing gene data for protecting privacy and safety.
Background
As the cost of DNA analysis falls, ever-growing genome datasets are expected to improve preventive medicine and support the development of more targeted therapies, which is where the value of large-scale genome datasets lies. The era of large-scale genomics has arrived. The direct-to-consumer personal genomics market has grown dramatically over the past few years, and large-scale population genomics programs are under way in many countries.
However, this potential of large-scale genomics can only be realized if genomic data is widely available, and the privacy problem of gene data makes large-scale data sharing extremely difficult. Historically, the privacy of health data has been addressed by "de-identification", in particular by deleting fields that reveal the identity of individuals. Personal genome data, however, cannot be protected effectively by traditional de-identification, because even a small fraction of a genome is sufficient to identify an individual or their relatives, as evidenced by the successful use of DNA in forensics.
This has made genomic data privacy a particularly sensitive concern. The real or perceived risk of genetic discrimination in insurance, employment and other areas discourages individuals from participating in population studies and from using direct-to-consumer genetic testing services.
In addition, individuals want to control how their genomic data is used and to share it with researchers without risk of misuse, and population genomics initiatives should guarantee both. However, current population genomics programs do not offer these guarantees.
Over the last few years, researchers in the information security community have proposed some solutions. Some focus on secure storage of genomic data, while others propose ways to perform certain computations securely. To date, however, none of these solutions has been adopted in practice, primarily because their scope and practical applicability are very limited. Furthermore, the trust problem goes beyond secure storage and processing, because it is also closely tied to transparency and to personal control over data sharing and use. Existing genome data sharing platforms all rely on broad consent policies and are designed for data governed by organizations. For example, the Beacon Network built by the Global Alliance for Genomics and Health (GA4GH) facilitates genome data sharing among organizations but does not give the individuals who provide genome data dynamic, fine-grained control. These problems have not been solved effectively.
The present invention is proposed to address these problems.
Disclosure of Invention
The invention aims to provide a privacy- and security-preserving gene data sharing method and device that achieve controllable, transparent and secure genome data sharing in a user-centered manner.
In a first aspect, the present invention provides a method for sharing gene data with privacy and security, which is applied to a data providing terminal, and includes:
generating a group of virtual individuals and corresponding gene data by adopting a virtual individual generation algorithm, namely virtual individual gene data;
generating a symmetric encryption key for a data provider to encrypt an original file to obtain a symmetric encryption file;
assigning an identifier to each virtual individual and data provider;
respectively encrypting the original file, the virtual individual gene data, the key and the identification by using the collective public key to obtain encrypted data, and sending the encrypted data to the computing node;
setting an access policy; and, where the access policy is a dynamic consent policy, further comprising:
upon synchronizing with the blockchain state, receiving a query request notification for the data provider to review and approve or reject, and sending the decision to the computing node.
In a second aspect, the present invention provides a method for sharing gene data to protect privacy and security, which is applied to two or more computing nodes, and includes:
each computing node generates a public/private key pair based on an additively homomorphic encryption algorithm and also generates a secret, and all computing nodes broadcast their own public keys so as to generate a collective public key; all the computing nodes together form a blockchain;
receiving encrypted data sent by a data providing terminal, the encrypted data being verified by all the computing nodes and then stored on the blockchain;
sending the symmetrically encrypted file and the identifier encrypted with the collective public key to a storage unit; performing distributed re-encryption on the original file and the virtual individual gene data encrypted with the collective public key, and sending them to the storage unit;
receiving a query request that is sent by a data query terminal and encrypted with additively homomorphic encryption, the query request being verified by all computing nodes and then stored on the blockchain; performing distributed re-encryption on the query request and sending it to the storage unit;
the distributed re-encryption converts the additively homomorphic encrypted form into a deterministic encrypted form.
Optionally or preferably, when the method is applied to a computing node, the distributed re-encryption uses the ElGamal encryption algorithm on an elliptic curve (EC-ElGamal) and comprises two rounds:
let $E_K(v) = (C_1, C_2) = (rG, v + rK)$ be the ciphertext to be re-encrypted, where v is the original file, the virtual individual gene data, or a query request, K is the collective public key, r is a random number, and G is a base point on the elliptic curve;
Round 1: each computing node i uses its secret $s_i$ to add $s_i G$ to $C_2$ and sends the result to the next computing node i+1; with n computing nodes, the final result is
$(C_1,\; C_2 + \sum_{i=1}^{n} s_i G) = (rG,\; v + rK + \sum_{i=1}^{n} s_i G)$
Round 2: computing node i receives from the previous computing node i-1 the ciphertext
Figure BDA0003328807760000022
and then computes
Figure BDA0003328807760000023
Figure BDA0003328807760000024
The second component of the final result is the deterministically encrypted value, namely
Figure BDA0003328807760000025
where
Figure BDA0003328807760000026
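The round-2 formulas survive only as figure placeholders in this text. As a hedged worked derivation, suppose each node i holds a private key $k_i$ with $K_i = k_i G$ and $K = \sum_i K_i$, and applies the update rule shown on the "Round 2" line below. That update rule and the tilde notation are assumptions chosen so that the randomness r cancels; they are not confirmed by the surviving text.

```latex
\begin{aligned}
\text{Round 1:}\quad
  & (\tilde C_{1,0},\, \tilde C_{2,0})
    = \Big(rG,\; v + rK + \sum_{j=1}^{n} s_j G\Big) \\
\text{Round 2 (assumed update):}\quad
  & \tilde C_{1,i} = s_i\,\tilde C_{1,i-1}, \qquad
    \tilde C_{2,i} = s_i\big(\tilde C_{2,i-1} - k_i\,\tilde C_{1,i-1}\big) \\
\text{After node 1:}\quad
  & \tilde C_{2,1} = s_1\Big(v + r\,(K - K_1) + \sum_{j=1}^{n} s_j G\Big) \\
\text{After node } n\text{:}\quad
  & \tilde C_{2,n} = s\,v + s\sum_{j=1}^{n} s_j G,
    \qquad s = \prod_{i=1}^{n} s_i
\end{aligned}
```

Under these assumptions each node removes its own share $rK_i$ of the mask, so the final second component no longer depends on r: equal plaintexts map to equal values, which is what the storage unit needs for matching.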
Optionally or preferably, when the method is applied to a computing node, the method further includes:
reading an access policy from the blockchain and associating the received encrypted data with the access policy, wherein the access policy is an all-consent policy or a dynamic consent policy; under an all-consent policy, approving or rejecting the query request; under a dynamic consent policy, sending a query notification to the data providing terminal, and receiving and verifying the reply to that notification which the data provider generates on the blockchain;
for a reply that grants consent, the symmetric key encrypted with the collective public key is converted into a symmetric key encrypted with the querier's public key U.
Optionally or preferably, when the method is applied to a computing node, the method further includes:
and receiving a calculation result sent by the storage unit, and storing the calculation result on the block chain after the calculation result is verified by all the calculation nodes.
Optionally or preferably, when the method is applied to a computing node, the method further includes a step of converting the computation result into a query result, in which all computing nodes jointly perform the following distributed key exchange:
let $E_K(R) = (C_1, C_2) = (rG, R + rK)$ be the computation result encrypted with the collective public key K, and let U be the public key of the querier;
first, $E_K(R) = (C_1, C_2)$ is modified to
Figure BDA0003328807760000031
Each computing node i then in turn generates a random number $v_i$ and computes
Figure BDA0003328807760000032
where
Figure BDA0003328807760000033
Figure BDA0003328807760000034
The final result of the conversion, i.e. the query result encrypted with the querier's public key U, is
$E_U(R) = (vG,\; R + vU)$
where $v = v_1 + \dots + v_n$.
The query result encrypted with the querier's public key U is then sent to the data query terminal.
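The distributed key exchange above can be sketched on a toy curve. The intermediate formulas are only figure placeholders in this text, so the per-node update used here (start from $(0, C_2)$, then add $v_i G$ to the first component and $v_i U - k_i C_1$ to the second) is an assumption that reaches the stated final form $(vG, R + vU)$. The curve parameters, function names and fixed keys are illustrative, and the curve is far too small to be secure.

```python
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G = (5, 1) has prime order 19.
# Illustration only: this group is far too small to offer any security.
P_MOD, A, G, ORDER = 17, 2, (5, 1), 19

def add(p, q):
    """Add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, p):
    """Scalar multiplication k*p by double-and-add."""
    k %= ORDER
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

def neg(p):
    return None if p is None else (p[0], -p[1] % P_MOD)

def encrypt(m, K):
    r = secrets.randbelow(ORDER - 1) + 1
    return mul(r, G), add(mul(m, G), mul(r, K))      # (rG, mG + rK)

def decrypt(ct, k):
    point = add(ct[1], neg(mul(k, ct[0])))           # C2 - k*C1 = mG
    return next(m for m in range(ORDER) if mul(m, G) == point)

def key_switch(ct, node_private_keys, U):
    """Assumed update rule: each node removes k_i*C1 and adds fresh v_i*U."""
    c1, c2 = ct
    new_c1, new_c2 = None, c2                        # start from (0, C2)
    for k in node_private_keys:
        v = secrets.randbelow(ORDER - 1) + 1
        new_c1 = add(new_c1, mul(v, G))
        new_c2 = add(add(new_c2, neg(mul(k, c1))), mul(v, U))
    return new_c1, new_c2                            # (vG, R + vU)

node_keys = [3, 7, 11]                               # hypothetical k_i
K = None
for k in node_keys:
    K = add(K, mul(k, G))                            # collective key
u = 6                                                # querier private key
U = mul(u, G)
switched = key_switch(encrypt(4, K), node_keys, U)
result = decrypt(switched, u)
```

No node ever sees R in the clear in this sketch: every intermediate second component is still masked, either by the remaining key shares or by the accumulated $v_i U$.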
In a third aspect, the present invention provides a method for sharing gene data to protect privacy and security, which is applied to a data query terminal, and includes:
assigning query points and a pseudo-random identity to the data querier, and storing them on the blockchain;
generating a public key U and a private key u for the querier, and providing data field information that the querier combines with Boolean operators to build a query request; encrypting the query request with the collective public key and sending it, together with the querier's public key U, to any one of the computing nodes;
receiving the query result sent by the computing node, then updating the query points and storing the updated query points on the blockchain;
and decrypting the query result with the querier's private key u, or decrypting the symmetric key with the querier's private key u and then decrypting the encrypted data with the symmetric key to obtain the real individual data.
In a fourth aspect, the present invention provides a method for sharing gene data with privacy and security, which is applied to a storage unit, and includes:
receiving and storing a symmetric encrypted file, an identifier encrypted by a collective public key, an original file encrypted by the collective public key and then subjected to distributed re-encryption and virtual individual gene data, namely storage information;
receiving a query request subjected to distributed re-encryption, matching the query request with stored information, respectively performing the following two calculations according to the query request, and sending a calculation result to a calculation node:
let Φ be the set of identity identifiers of the successfully matched individuals, including both data providers and virtual individuals;
if the query request asks only for the number of successfully matched individuals, the following homomorphic computation is performed, where $f_i$ is the identifier (1 for a real individual, 0 for a virtual individual) of individual i:
$E_K(R) = \sum_{i \in \Phi} E_K(f_i)$
if the query request asks for the identity identifiers of the successfully matched individuals, the following computation is performed:
each individual's identity identifier, written $\mathrm{id}_i$ here, is multiplied by that individual's homomorphically encrypted identifier, i.e.
$E_K(R_i) = \mathrm{id}_i \cdot E_K(f_i), \quad i \in \Phi$
In this case, the identity identifier of a virtual individual decrypts to 0.
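A minimal sketch of the two storage-unit computations, assuming the count is the homomorphic sum of the encrypted 0/1 identifiers over Φ and the identifier retrieval is a scalar multiplication of each encrypted identifier by the individual's identity identifier. The toy curve, the single key pair standing in for the collective key, the opaque string tags standing in for deterministically re-encrypted values, and all names are illustrative assumptions.

```python
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G = (5, 1) has prime order 19.
# Illustration only: this group is far too small to offer any security.
P_MOD, A, G, ORDER = 17, 2, (5, 1), 19

def add(p, q):
    """Add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, p):
    """Scalar multiplication k*p by double-and-add."""
    k %= ORDER
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

def neg(p):
    return None if p is None else (p[0], -p[1] % P_MOD)

def encrypt(m, K):
    r = secrets.randbelow(ORDER - 1) + 1
    return mul(r, G), add(mul(m, G), mul(r, K))

def decrypt(ct, k):
    point = add(ct[1], neg(mul(k, ct[0])))
    return next(m for m in range(ORDER) if mul(m, G) == point)

def hom_add(a, b):
    """E(x) + E(y) = E(x + y), component by component."""
    return add(a[0], b[0]), add(a[1], b[1])

def hom_scale(c, ct):
    """c * E(x) = E(c * x) for a public scalar c."""
    return mul(c, ct[0]), mul(c, ct[1])

k = 8                       # stand-in for the collective private key
K = mul(k, G)
records = [                 # flag 1 = real individual, 0 = virtual individual
    {"id": 3, "tags": {"t1", "t2"}, "flag": encrypt(1, K)},
    {"id": 11, "tags": {"t1"}, "flag": encrypt(1, K)},
    {"id": 7, "tags": {"t1", "t3"}, "flag": encrypt(0, K)},
    {"id": 5, "tags": {"t2"}, "flag": encrypt(1, K)},
]

phi = [rec for rec in records if "t1" in rec["tags"]]   # equality match

enc_count = (None, None)                                # encryption of 0
for rec in phi:
    enc_count = hom_add(enc_count, rec["flag"])
enc_ids = [hom_scale(rec["id"], rec["flag"]) for rec in phi]

count = decrypt(enc_count, k)
ids = [decrypt(ct, k) for ct in enc_ids]
```

Three records match the tag, but the decrypted count is 2 and the virtual individual's identifier decrypts to 0, so virtual individuals drop out of the querier's statistics without the storage unit learning which records are virtual.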
In a fifth aspect, the present invention provides a gene data sharing apparatus for protecting privacy and security, which is applied to a data providing terminal, and includes:
the virtual individual gene data generation module is used for generating a group of virtual individuals and corresponding gene data by adopting a virtual individual generation algorithm, namely virtual individual gene data;
the data sending module is used for generating a symmetric encryption key for a data provider to encrypt an original file to obtain a symmetric encryption file; distributing an identifier for each virtual individual and each data provider, encrypting the original file, the gene data of the virtual individual, the key and the identifier by using the collective public key respectively to obtain encrypted data, and sending the encrypted data to the computing node;
and the auditing module is used for setting an access strategy, receiving an inquiry request notice for a data provider to review, decide approval or rejection and sending a result to the computing node when the dynamic approval strategy exists and the block chain state is synchronous.
In a sixth aspect, the present invention provides a genetic data sharing apparatus for protecting privacy and security, which is applied to a computing node, and includes:
the data receiving module is used for receiving the encrypted data sent by the data providing terminal, and storing the encrypted data in the block chain after being verified by all the computing nodes; receiving an inquiry request which is sent by a data inquiry terminal and is subjected to homomorphic encryption through addition, wherein the inquiry request is verified by all computing nodes and then stored in a block chain; receiving a calculation result sent by a storage unit, and storing the calculation result on a block chain after the calculation result is verified by all the calculation nodes;
the data processing module is used for generating a pair of public key and private key based on the addition homomorphic algorithm and also generating a secret, and all the computing nodes broadcast own public keys so as to generate a collective public key; all the computing nodes are established into a block chain; sending the symmetric encrypted file and the identifier encrypted by the collective public key to a storage unit; performing distributed re-encryption on the original file and the virtual individual gene data which are subjected to the collective public key encryption according to claim 2 or 3, and sending the original file and the virtual individual gene data to a storage unit; and carrying out distributed re-encryption on the query request and sending the query request to the storage unit.
the access policy processing module is used for reading an access policy from the blockchain and associating the received encrypted data with the access policy, wherein the access policy is an all-consent policy or a dynamic consent policy; under an all-consent policy, approving or rejecting the query request; under a dynamic consent policy, sending a query notification to the data providing terminal, and receiving and verifying the reply to that notification which the data provider generates on the blockchain;
for a reply that grants consent, the symmetric key encrypted with the collective public key is converted into a symmetric key encrypted with the querier's public key U.
In a seventh aspect, the present invention provides a gene data sharing device for protecting privacy and security, which is applied to a data query terminal, and the device includes:
the request generation module is used for assigning query points and a pseudo-random identity to the data querier and storing them on the blockchain; generating a public key U and a private key u for the querier, and providing data field information that the querier combines with Boolean operators to build a query request; encrypting the query request with the collective public key and sending it, together with the querier's public key U, to any one of the computing nodes;
the result receiving module is used for receiving the query result sent by the computing node, then updating the query points and storing the updated query points on the blockchain; and decrypting the query result with the querier's private key u, or decrypting the symmetric key with the querier's private key u and then decrypting the encrypted data with the symmetric key to obtain the real individual data.
In an eighth aspect, the present invention provides a gene data sharing apparatus for protecting privacy and security, which is applied to a storage unit, and the apparatus includes:
the data storage module is used for receiving and storing the symmetric encrypted file, the identifier encrypted by the collective public key, the original file encrypted by the collective public key and then encrypted in a distributed mode and the virtual individual gene data, and the original file and the virtual individual gene data are storage information;
the query matching module is used for receiving the query request subjected to distributed re-encryption, matching the query request with the stored information, respectively performing the following two calculations according to the query request, and sending the calculation result to the calculation node:
let Φ be the set of identity identifiers of the successfully matched individuals, including both data providers and virtual individuals;
if the query request asks only for the number of successfully matched individuals, the following homomorphic computation is performed, where $f_i$ is the identifier (1 for a real individual, 0 for a virtual individual) of individual i:
$E_K(R) = \sum_{i \in \Phi} E_K(f_i)$
if the query request asks for the identity identifiers of the successfully matched individuals, the following computation is performed:
each individual's identity identifier, written $\mathrm{id}_i$ here, is multiplied by that individual's homomorphically encrypted identifier, i.e.
$E_K(R_i) = \mathrm{id}_i \cdot E_K(f_i), \quad i \in \Phi$
In this case, the identity identifier of a virtual individual decrypts to 0.
The method and the device for sharing the gene data for protecting privacy and safety provided by the invention have the following beneficial effects:
By combining homomorphic encryption, encryption-based access control by data providers (i.e., data owners), and blockchain technology, individuals can share their genomic and clinical data in encrypted form. The devices applied to the data providing terminal, the computing nodes, the storage unit and the data query terminal, together with the blockchain, form a secure and trusted platform on which data uploaded by a data provider can be discovered by data queriers (such as researchers) while remaining encrypted at all times, which prevents unauthorized access. In addition, an authorized data querier can access and use the decrypted gene data for further analysis in a manner that is controlled by the data provider and auditable. The method thereby meets the requirements of distributed trust, end-to-end protection of genome data, secure data release, and control and auditability by individual data providers over their own data.
Most earlier research on privacy and security focused on applying a single encryption technology, whereas the invention integrates several complementary technologies to achieve controllable, transparent and secure genome data sharing.
Previously proposed privacy-protection schemes focus on genome data sharing between organizations, whereas the invention is centered on users (data providers and data queriers) and on their needs for personal genomics services, so that users can control their own genomic data.
The invention is general and portable and can easily be integrated with existing centralized secure computation protocols, making them more secure (by distributing trust across multiple nodes) and more user-centric (by giving individuals control over the use of their data).
Drawings
Fig. 1 is a schematic diagram illustrating the communication relationships among the data query terminal, the data providing terminal, the computing nodes, and the storage unit in the gene data sharing apparatus for protecting privacy and security according to the embodiment;
Fig. 2 is a schematic diagram illustrating the communication relationship between the data providing terminal and a computing node in the gene data sharing apparatus for protecting privacy and security in the embodiment;
Fig. 3 is a schematic diagram illustrating the communication relationship between the storage unit and a computing node in the gene data sharing apparatus for protecting privacy and security in the embodiment;
Fig. 4 is a schematic diagram illustrating the communication relationship between the data query terminal and a computing node in the gene data sharing apparatus for protecting privacy and security in the embodiment.
Detailed Description
The technical solutions of the present invention will be explained and illustrated in detail with reference to specific embodiments so that they will be more clearly understood and can be implemented by those skilled in the art. The steps listed in the examples are not limited to the order listed.
In a first embodiment, a method for sharing gene data with privacy and security is provided, and is applied to a data providing terminal, and the method includes:
S11: Generate a group of virtual individuals and their corresponding gene data, i.e. the virtual individual gene data, using a virtual individual generation algorithm.
The purpose of generating virtual individual gene data is to make it as hard as possible for an attacker to link encrypted data to plaintext data by observation. The virtual individual generation algorithm takes as input all the gene data that can be obtained, including a set of clinical attributes, genetic variants and the like, together with the corresponding population statistics, i.e. the frequency distribution of the genetic variants, the prevalence of the clinical variables and so on. It outputs a group of virtual individuals and their corresponding gene data.
A group means at least two virtual individuals, each with its own corresponding gene data. The greater the number of virtual individuals, the stronger the protection for the real data provider.
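The text does not specify the virtual individual generation algorithm. As a minimal sketch, assuming each variant is sampled independently from its population carrier frequency, generation could look like the following. The function name, the variant names and the frequencies are hypothetical.

```python
import random

def generate_virtual_individuals(variant_freqs, n, seed=None):
    """Sample n virtual individuals from per-variant carrier frequencies.

    Assumption: variants are sampled independently of one another, which
    ignores any correlation between variants in real populations.
    """
    if n < 2:
        raise ValueError("a group needs at least two virtual individuals")
    rng = random.Random(seed)
    return [
        {variant: rng.random() < freq for variant, freq in variant_freqs.items()}
        for _ in range(n)
    ]

# Hypothetical variant names and frequencies, for illustration only.
freqs = {"variant_A": 0.30, "variant_B": 0.05, "always": 1.0, "never": 0.0}
virtual = generate_virtual_individuals(freqs, n=4, seed=1)
```

Each returned dictionary maps a variant name to whether that virtual individual carries it. Clinical attributes could be sampled the same way from their prevalence.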
s 12: generating a symmetric encryption key S for a data provideriThe system comprises a first encryption module, a second encryption module and a third encryption module, wherein the first encryption module is used for encrypting an original file to obtain a symmetrical encrypted file; assigning an identifier f to each virtual individual and data provideri
Encrypting S using a collective public key Ki:EK(Si) Respectively encrypting the original file and the virtual individual gene data by using the collective public key K, and simultaneously encrypting the identifier fi::EK(fi) And obtaining the encrypted data and sending the encrypted data to the computing node.
The original file, i.e. the file containing the genetic data and the clinical data, such as the genetic variation data, the clinical medical data, etc., all have a specific standard format.
Assigning an identifier f to each virtual individual and data provideriIn order to eliminate the influence of the virtual individual on the data analysis statistics of the subsequent data inquirer, the identifier may be set according to the requirement, for example, the length of the identifier is set to be 1 bit, the identifier represents a real individual when being 1, and the identifier represents a virtual individual when being 0.
S13: Set an access policy. Where the policy is a dynamic consent policy, upon synchronizing with the blockchain state, receive a query request notification for the data provider to review and approve or reject, and send the decision to the computing node.
The access policy generally has two basic options: all consent and dynamic consent. An all-consent policy defines a set of conditions that must be met: when all of them are met and all computing nodes have verified this, the computing nodes provide data access, and otherwise they refuse. Under a dynamic consent policy, the data providing terminal receives a query notification when it synchronizes with the blockchain state; the notification contains information about the data access request, and the data provider can review this information and then respond by approving or rejecting.
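The two policy options can be sketched as a small decision function. This is an illustration under assumptions: the `AccessPolicy` class, the `decide` function, the request fields and the callback standing in for the data provider's review are hypothetical names, and the verification by all computing nodes and the blockchain interaction are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class AccessPolicy:
    mode: str                                      # "all" or "dynamic"
    required: dict = field(default_factory=dict)   # conditions for "all"

def decide(policy: AccessPolicy, request: dict,
           ask_provider: Optional[Callable[[dict], bool]] = None) -> bool:
    """Return True to grant access, False to refuse."""
    if policy.mode == "all":
        # All-consent: every predefined condition must be met by the request.
        return all(request.get(key) == value
                   for key, value in policy.required.items())
    if policy.mode == "dynamic":
        # Dynamic consent: the data provider reviews each request.
        if ask_provider is None:
            raise ValueError("dynamic consent needs a provider callback")
        return bool(ask_provider(request))
    raise ValueError(f"unknown policy mode: {policy.mode!r}")

all_policy = AccessPolicy("all", {"purpose": "research", "verified": True})
dyn_policy = AccessPolicy("dynamic")
request = {"purpose": "research", "verified": True, "querier": "pseudo-42"}

granted_all = decide(all_policy, request)
refused_all = decide(all_policy, {"purpose": "marketing", "verified": True})
granted_dyn = decide(dyn_policy, request,
                     ask_provider=lambda req: req["purpose"] == "research")
```

In a deployment along the lines described above, the callback would correspond to the notification the data providing terminal receives on synchronizing with the blockchain, and the returned decision to the reply recorded there.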
The hash signature of the original file and the set access policy are linked to the identity of the data provider (the data provider's own public key, used to generate transactions on the blockchain).
In this embodiment, the encrypted data sent to the computing node further includes the access policy and a pseudo-identity of the data provider, which is assigned by the data providing terminal to protect the real data provider.
In a second embodiment, a method for sharing genetic data with privacy and security is provided, and applied to a computing node, the method includes:
S21: Each computing node generates a public/private key pair based on an additively homomorphic encryption algorithm and also generates a secret. For example, with the EC-ElGamal additively homomorphic encryption algorithm, the i-th computing node generates a private key $k_i$, a public key $K_i$, and a secret $s_i$. All computing nodes broadcast their own public keys $K_i$, from which the collective public key is generated:
$K = \sum_{i=1}^{n} K_i$
All computing nodes jointly maintain a blockchain, which serves as an auditable, immutable log of the operations performed by all parties and also stores topology information about the storage units.
The collective public key ensures that data encrypted under it stays protected unless all computing nodes are compromised or the private keys of all computing nodes are stolen. The more computing nodes take part in generating the collective public key, the higher the overall security of the system.
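A minimal sketch of why every private key is needed, assuming the collective public key is the sum of the node public keys (the usual EC-ElGamal construction; the original formula is only a figure placeholder in this text). The textbook toy curve, the function names and the encoding of a message m as the point mG are illustrative assumptions and provide no security.

```python
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G = (5, 1) has prime order 19.
# Illustration only: this group is far too small to offer any security.
P_MOD, A, G, ORDER = 17, 2, (5, 1), 19

def add(p, q):
    """Add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, p):
    """Scalar multiplication k*p by double-and-add."""
    k %= ORDER
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

def neg(p):
    return None if p is None else (p[0], -p[1] % P_MOD)

def keygen():
    """One node's key pair (k_i, K_i = k_i * G)."""
    k = secrets.randbelow(ORDER - 1) + 1
    return k, mul(k, G)

def encrypt(m, K):
    r = secrets.randbelow(ORDER - 1) + 1
    return mul(r, G), add(mul(m, G), mul(r, K))      # (rG, mG + rK)

def collective_decrypt(ct, private_keys):
    c1, c2 = ct
    for k in private_keys:
        c2 = add(c2, neg(mul(k, c1)))    # each node strips its share k_i*C1
    return next(m for m in range(ORDER) if mul(m, G) == c2)

nodes = [keygen() for _ in range(3)]
K = None
for _, K_i in nodes:
    K = add(K, K_i)                      # collective key K = K_1 + K_2 + K_3
ct = encrypt(7, K)
```

With all three private keys, `collective_decrypt` recovers 7. With any key share missing, a nonzero multiple of G remains in the second component, so the recovered value is wrong.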
S22: Receive the encrypted data (containing the access policy and the pseudo-identity of the data provider) sent by the data providing terminal; after verification by all the computing nodes, store the encrypted data on the blockchain.
Send the symmetrically encrypted file and the identifier encrypted with the collective public key to the storage unit; perform distributed re-encryption on the original file and the virtual individual gene data encrypted with the collective public key, converting them from additively homomorphic encrypted form into deterministic encrypted form, and send them to the storage unit.
S23: Receive a query request that is sent by the data query terminal and encrypted with additively homomorphic encryption; after verification by all computing nodes, store the query request on the blockchain; perform distributed re-encryption on the query request, converting it from additively homomorphic encrypted form into deterministic encrypted form, and send it to the storage unit.
The purpose of distributed re-encryption is to convert data from an additively homomorphic encrypted form into a deterministic encrypted form, so that the storage unit can run equality-match queries on the re-encrypted data. Specifically, the gene data uploaded by the data providing terminal, re-encrypted by the computing nodes and held in the storage unit, is matched against the query request data uploaded by the data query terminal and likewise re-encrypted by the computing nodes.
Ordinary encryption algorithms are probabilistic: encrypting the same message twice yields different ciphertexts. Deterministic encryption, by contrast, yields the same ciphertext every time the same message is encrypted. Deterministic encryption may therefore reveal the distribution of the encrypted data. However, because the data providing terminal generates virtual individuals so that the overall distribution of the encrypted data is uniform, an adversary cannot mount a frequency analysis attack. As long as the encrypted individual identifiers are secure, real individuals cannot be distinguished from virtual ones.
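The effect of virtual individuals on frequency analysis can be illustrated with a small Python sketch. It is a deliberately simplified stand-in for the virtual individual generation algorithm, assuming a single categorical field and padding every observed value up to the count of the most frequent one; the function names and the flag convention (1 for a real individual, 0 for a virtual one) are assumptions of this sketch.

```python
from collections import Counter


def virtual_individuals_needed(real_values):
    """For each observed value, how many virtual records level the histogram."""
    counts = Counter(real_values)
    target = max(counts.values())
    return {value: target - n for value, n in counts.items()}


def pad_with_virtual(real_values):
    """Return (value, flag) records: flag 1 marks real, flag 0 marks virtual."""
    padded = [(value, 1) for value in real_values]
    for value, missing in virtual_individuals_needed(real_values).items():
        padded.extend((value, 0) for _ in range(missing))
    return padded


real = ["A", "A", "A", "G", "G", "T"]
padded = pad_with_virtual(real)
histogram = Counter(value for value, _ in padded)
real_count = sum(flag for _, flag in padded)
```

After padding, every value occurs equally often, so identical deterministic ciphertexts no longer reveal which value is common, while the sum of the flags still equals the number of real individuals.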
The distributed re-encryption protocol is a two-round process in which all computing nodes participate. This embodiment takes the ElGamal encryption algorithm on elliptic curves (EC-ElGamal) as an example to describe the protocol. The protocol may also be built on lattice-based cryptography.
Let E_K(v) = (C_1, C_2) = (rG, v + rK) be the ciphertext of v to which distributed re-encryption is applied, where v is an original file encrypted with the collective public key, virtual individual gene data encrypted with the collective public key, or a query request; K is the collective public key; r is a random number; and G is the base point of the elliptic curve.
Round 1: each computing node i uses its secret s_i to add s_i G to C_2 and sends the result to the next computing node i+1. The final result obtained is
Figure BDA0003328807760000081
Round 2: after receiving from the previous computing node i-1 the ciphertext
Figure BDA0003328807760000082
computing node i performs the calculation
Figure BDA0003328807760000083
Figure BDA0003328807760000084
The second component of the final result is taken as the deterministic encryption result, namely
Figure BDA0003328807760000085
where
Figure BDA0003328807760000086
In this way, distributed re-encryption converts the additively homomorphic encrypted form into a deterministic encrypted form.
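The two rounds can be sketched in Python as follows. Round 1 follows the text (each node adds s_i G to C_2). The round 2 update is an assumption of this sketch, since the corresponding formulas are only available as figures: each node is assumed to remove its key share k_i C_1 from C_2 and then multiply both components by its secret s_i. The elliptic-curve group is replaced by a toy group (integers modulo a prime Q), which preserves the algebra but provides no security, and all names are hypothetical.

```python
import secrets

Q = 2**61 - 1   # prime modulus of the toy group
G = 5           # stand-in for the base point G


def smul(k, point):
    """Toy scalar multiplication k * point."""
    return (k * point) % Q


def rand_scalar():
    return secrets.randbelow(Q - 1) + 1


class Node:
    """A computing node with key pair (k, K = kG) and re-encryption secret s."""

    def __init__(self):
        self.k = rand_scalar()
        self.K = smul(self.k, G)
        self.s = rand_scalar()


def encrypt(K, v):
    """Probabilistic ciphertext (rG, vG + rK) under the collective key K."""
    r = rand_scalar()
    return smul(r, G), (smul(v, G) + smul(r, K)) % Q


def round1(nodes, ciphertext):
    """Each node in turn adds s_i * G to the second component."""
    c1, c2 = ciphertext
    for node in nodes:
        c2 = (c2 + smul(node.s, G)) % Q
    return c1, c2


def round2(nodes, ciphertext):
    """Assumed rule: strip k_i * C1 from C2, then scale both parts by s_i."""
    c1, c2 = ciphertext
    for node in nodes:
        c2 = smul(node.s, (c2 - smul(node.k, c1)) % Q)  # uses the incoming c1
        c1 = smul(node.s, c1)
    return c1, c2


def deterministic_tag(nodes, ciphertext):
    """Second component after both rounds; independent of the randomness r."""
    return round2(nodes, round1(nodes, ciphertext))[1]


nodes = [Node() for _ in range(3)]
K = sum(node.K for node in nodes) % Q
tag_a = deterministic_tag(nodes, encrypt(K, 1234))
tag_b = deterministic_tag(nodes, encrypt(K, 1234))
tag_c = deterministic_tag(nodes, encrypt(K, 9999))
```

Under these assumptions the random term rK cancels after the last node, so two independent encryptions of the same value yield the same tag while different values yield different tags, which is what allows the storage unit to perform equality matching.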
S24: Read the access policy on the blockchain and associate the received encrypted data with it. The access policy is set by the data providing terminal and may be a full consent policy or a dynamic consent policy. Under a full consent policy, the query request is approved or rejected. Under a dynamic consent policy, a query notification is sent to the data providing terminal, and the reply to that notification, sent by the data provider and generated on the blockchain, is received and verified. For a reply granting consent, the symmetric key encrypted with the collective public key is converted into a symmetric key encrypted with the querier's public key U.
S25: Receive the calculation result sent by the storage unit; after verification by all computing nodes, the calculation result is stored on the blockchain.
The calculation result is produced by the storage unit after query matching and calculation. All computing nodes jointly execute a distributed key exchange to convert the calculation result into a query result, which is then sent to the data query terminal.
Let E_K(R) = (C_1, C_2) = (rG, R + rK) be the calculation result encrypted with the collective public key K, and let U be the public key of the querier.
First, E_K(R) = (C_1, C_2) is modified to
Figure BDA0003328807760000087
Each computing node then in turn generates a random number v_i and calculates
Figure BDA0003328807760000088
where
Figure BDA0003328807760000089
Figure BDA00033288077600000810
The final result, converted into a query result encrypted with the querier's public key U, is
Figure BDA00033288077600000811
where M = C_2 - (k_1 + ... + k_{n-1}) C_1 and v = v_1 + ... + v_n.
The calculation result is thus converted into a query result encrypted with the querier's public key U; once it has been sent to the data query terminal, the querier can decrypt it with his or her own private key.
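The distributed key exchange can be sketched in Python under stated assumptions. The per-node update is not legible in the text (it appears only in figures), so this sketch assumes that the ciphertext is first modified to (0, C_2) and that each node i adds v_i G to the first component and adds v_i U - k_i C_1 to the second. The sketch also assumes that K is the sum of all node public keys, so all n key shares are removed. The group is a toy stand-in (integers modulo a prime Q) without any security, and all names are hypothetical.

```python
import secrets

Q = 2**61 - 1   # prime modulus of the toy group
G = 5           # stand-in for the base point G


def smul(k, point):
    """Toy scalar multiplication k * point."""
    return (k * point) % Q


def rand_scalar():
    return secrets.randbelow(Q - 1) + 1


def encrypt(K, m):
    """Ciphertext (rG, mG + rK) under public key K."""
    r = rand_scalar()
    return smul(r, G), (smul(m, G) + smul(r, K)) % Q


def key_switch(node_private_keys, ciphertext, U):
    """Convert a ciphertext under the collective key into one under U."""
    c1, c2 = ciphertext
    new_c1, new_c2 = 0, c2                      # assumed initial modification
    for k_i in node_private_keys:
        v_i = rand_scalar()                     # fresh randomness per node
        new_c1 = (new_c1 + smul(v_i, G)) % Q
        new_c2 = (new_c2 - smul(k_i, c1) + smul(v_i, U)) % Q
    return new_c1, new_c2                       # equals (vG, RG + vU)


def decrypt(u, ciphertext):
    """Ordinary ElGamal-style decryption with the querier's private key u."""
    c1, c2 = ciphertext
    return (c2 - smul(u, c1)) % Q


node_keys = [rand_scalar() for _ in range(3)]
K = sum(smul(k, G) for k in node_keys) % Q      # assumed collective key
u = rand_scalar()                               # querier private key
U = smul(u, G)                                  # querier public key
switched = key_switch(node_keys, encrypt(K, 17), U)
recovered = decrypt(u, switched)
```

No node ever sees R in the clear in this construction: each one only removes its own key share and adds a fresh mask under U, and the querier alone can remove the combined mask vU.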
When the querier needs to obtain the original file, the key exchange step converts the key encrypted with the collective public key (that is, the symmetric encryption key generated by the data providing terminal) into a symmetric key encrypted with the querier's public key; the conversion process is the same as the key exchange described above. The data query terminal decrypts the converted key with its own private key and, after downloading the symmetrically encrypted original file, uses the decrypted key to decrypt the file and obtain the original data.
In a third embodiment, a privacy-protecting genetic data sharing method is provided, applied to a storage unit, the method comprising:
S31: Receive and store the following items sent by the computing nodes, which together constitute the stored information: the symmetrically encrypted file, the identifier encrypted with the collective public key, and the original file and virtual individual gene data that were encrypted with the collective public key and then distributedly re-encrypted.
S32: Receive the distributedly re-encrypted query request sent by a computing node, match it against the stored information, perform one of the following two calculations depending on the query request, and send the calculation result to the computing node:
Let Φ be the set of identity identifiers of the successfully matched individuals, including data providers and virtual individuals.
If the query request requires only the number of successfully matched individuals, that is, the sum R of the identifiers of the successfully matched individuals, the following homomorphic calculation is performed:
Figure BDA0003328807760000091
If the query request requires the identity identifiers of the successfully matched individuals, the following calculation is performed:
each individual identity identifier is multiplied by the corresponding homomorphically encrypted identifier, that is,
Figure BDA0003328807760000092
After the calculation result is decrypted, the identifier of each virtual individual is 0, so virtual individuals do not affect the query. Here id denotes an individual identity identifier.
In both cases, executing the query match generates a blockchain transaction that contains the definition of the query and the identifiers of the successful matches. After verification by all computing nodes, the transaction is stored on the blockchain in tamper-proof form.
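The two calculations performed by the storage unit can be sketched in Python. The sketch assumes that the encrypted identifier of a real data provider is 1 and that of a virtual individual is 0 (the text states only that the virtual identifier decrypts to 0), uses a single key pair in place of the collective key, and replaces the elliptic-curve group with a toy group (integers modulo a prime Q) in which recovering m from mG is a modular division; on a real curve this step would be a small discrete-logarithm lookup. All names are hypothetical.

```python
import secrets

Q = 2**61 - 1            # prime modulus of the toy group
G = 5                    # stand-in for the base point G
G_INV = pow(G, -1, Q)    # lets the toy group recover m from m*G


def smul(k, point):
    """Toy scalar multiplication k * point."""
    return (k * point) % Q


def encrypt(K, m):
    """Additively homomorphic ciphertext (rG, mG + rK)."""
    r = secrets.randbelow(Q - 1) + 1
    return smul(r, G), (smul(m, G) + smul(r, K)) % Q


def add_ct(a, b):
    """Homomorphic addition: the sum of ciphertexts encrypts the sum of plaintexts."""
    return (a[0] + b[0]) % Q, (a[1] + b[1]) % Q


def scale_ct(c, ciphertext):
    """Homomorphic multiplication of the plaintext by a public scalar c."""
    return smul(c, ciphertext[0]), smul(c, ciphertext[1])


def decrypt_int(k, ciphertext):
    """Decrypt and map the resulting group element back to an integer."""
    point = (ciphertext[1] - smul(k, ciphertext[0])) % Q
    return smul(G_INV, point)


k = secrets.randbelow(Q - 1) + 1
K = smul(k, G)

# Matched set: (identity identifier, encrypted identifier); 102 is virtual.
matched = [(101, encrypt(K, 1)), (102, encrypt(K, 0)), (103, encrypt(K, 1))]

count_ct = (0, 0)                                   # neutral ciphertext
for _, flag_ct in matched:
    count_ct = add_ct(count_ct, flag_ct)            # case 1: sum of identifiers

id_cts = [scale_ct(identity, flag_ct) for identity, flag_ct in matched]  # case 2

count = decrypt_int(k, count_ct)
ids = [decrypt_int(k, ct) for ct in id_cts]
```

The storage unit works on ciphertexts only; after decryption the count reflects real individuals alone, and the entry belonging to the virtual individual becomes 0.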
In a fourth embodiment, a privacy-protecting genetic data sharing method is provided, applied to a data query terminal, the method comprising:
S41: Allocate query credits and a pseudo-random identity to the data querier and store them on the blockchain; generate the querier's public key U and private key u, and provide data field information that the querier can combine with Boolean operators to build a query request; encrypt the query request with the collective public key and send it, together with the querier's public key U, to any one of the computing nodes.
When a data querier registers at the data query terminal, the terminal records the registration information, performs appropriate authentication, and then allocates a pseudo-random identity and query credits to the querier. The pseudo-random identity protects the querier's privacy.
The query credits are recorded on the blockchain and are consumed by each query. Their purpose is to limit the total number of query requests that the same data querier can send, and thereby to limit how much undisclosed sensitive information about data providers can be inferred from query results. After appropriate authentication, an authorized querier with sufficient query credits can run secure data queries and, depending on his or her authorization level, obtains either the total number of individuals on the platform whose data match or the identifiers of those individuals; that is, different queriers have different query permissions. The query credits do not decrease when the same querier issues the same query request multiple times.
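The credit rule described above can be sketched in Python as a small in-memory ledger. The sketch assumes that each new query costs exactly one credit and that repeated queries are recognised by the SHA-256 digest of the query text; in the embodiment the credits are recorded on the blockchain, which is not modelled here. The class and method names are hypothetical.

```python
import hashlib


class QueryCreditLedger:
    """Tracks remaining credits and previously charged queries per pseudonym."""

    def __init__(self):
        self.credits = {}
        self.seen = {}

    def register(self, pseudonym, initial_credits):
        """Allocate an initial credit budget to a newly registered querier."""
        self.credits[pseudonym] = initial_credits
        self.seen[pseudonym] = set()

    def charge(self, pseudonym, query_text):
        """Return True if the query may run, deducting a credit when it is new."""
        digest = hashlib.sha256(query_text.encode("utf-8")).hexdigest()
        if digest in self.seen[pseudonym]:
            return True                  # repeated identical query: no charge
        if self.credits[pseudonym] <= 0:
            return False                 # budget exhausted: query refused
        self.credits[pseudonym] -= 1
        self.seen[pseudonym].add(digest)
        return True


ledger = QueryCreditLedger()
ledger.register("querier-7", 2)
results = [
    ledger.charge("querier-7", "variant=X AND diagnosis=Y"),   # new, charged
    ledger.charge("querier-7", "variant=X AND diagnosis=Y"),   # repeat, free
    ledger.charge("querier-7", "variant=Z"),                   # new, charged
    ledger.charge("querier-7", "variant=W"),                   # refused
]
```

Bounding the number of distinct queries bounds how many different aggregate answers a single querier can combine, while letting repeats through costs nothing in privacy because they reveal no new information.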
Query requests can carry different query permissions. If only the number of matches is needed, such as the number of individuals with a certain genetic variation, the generated query request mainly contains the content of the query and the pseudo-random identity of the querier. If specific information about the matched individuals, the original files, and the like is needed, the generated query request additionally contains a hash-based file signature and more detailed information about the querier, such as name, affiliation, and a description of the research that requires access to personal data.
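The two permission levels can be sketched in Python as a request builder. This is a sketch under stated assumptions: the level names "count" and "detail", the field names, and the example query string are hypothetical, and a plain SHA-256 digest stands in for the hash-based file signature, whose exact construction is not specified here.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

REQUIRED_INFO = ("name", "affiliation", "research_description")


@dataclass
class QueryRequest:
    pseudonym: str                       # pseudo-random identity of the querier
    query: str                           # Boolean combination of data fields
    level: str = "count"                 # "count" or "detail"
    querier_info: Optional[dict] = None  # only for detail-level requests
    file_digest: Optional[str] = None    # stand-in for the file signature


def build_request(pseudonym, query, level="count",
                  querier_info=None, signed_file=None):
    """Build a request, enforcing the extra fields of detail-level queries."""
    if level == "count":
        return QueryRequest(pseudonym, query)
    if level != "detail":
        raise ValueError("unknown permission level: " + level)
    info = querier_info or {}
    missing = [name for name in REQUIRED_INFO if not info.get(name)]
    if missing or signed_file is None:
        raise ValueError("detail-level requests need querier information and a file")
    digest = hashlib.sha256(signed_file).hexdigest()
    return QueryRequest(pseudonym, query, "detail", dict(info), digest)


count_request = build_request("querier-7", "variant=X AND diagnosis=Y")
detail_request = build_request(
    "querier-7", "variant=X AND diagnosis=Y", level="detail",
    querier_info={"name": "A. Researcher", "affiliation": "Example Institute",
                  "research_description": "cohort recruitment study"},
    signed_file=b"signed consent form bytes",
)
```

Keeping the count-level request free of personal details means that a querier who needs only aggregate numbers reveals nothing beyond the pseudonym and the query itself.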
After the query request is generated, a transaction is generated on the blockchain: a computing node receives the query request sent by the data query terminal together with the querier's public key U, and the transaction is stored on the blockchain after verification by all computing nodes.
S42: Receive the query result sent by the computing node, then update the query credits and store the updated query credits on the blockchain.
For a query result that contains only the number of matches, the query result is decrypted directly with the querier's private key u.
For a query result that also contains detailed information such as the identifiers of matched individuals and the original files, the querier's private key u is used to decrypt the key (that is, the symmetric encryption key generated by the data providing terminal), and the decrypted key is then used to decrypt the encrypted original data to obtain the real individual data.
In a fifth embodiment, a privacy-protecting genetic data sharing apparatus is provided, applied to a data providing terminal, comprising:
a virtual individual gene data generation module, configured to generate a group of virtual individuals and corresponding gene data, namely virtual individual gene data, using a virtual individual generation algorithm;
a data sending module, configured to generate a symmetric encryption key for the data provider to encrypt an original file and obtain a symmetrically encrypted file; to assign an identifier to each virtual individual and to the data provider; to encrypt the original file, the virtual individual gene data, the symmetrically encrypted file, and the identifier separately with the collective public key to obtain encrypted data; and to send the encrypted data to a computing node;
an auditing module, configured to set an access policy and, where a dynamic consent policy is in place, to receive, when synchronizing with the blockchain state, a query request notification for the data provider to review and approve or reject, and to send the approval or rejection to the computing node.
The data providing terminal may be any of various front-end devices comprising a processor and a storage medium, together with a program (for example, a web page) that contains the virtual individual gene data generation module, the data sending module, and the auditing module, is stored in the storage medium, and can be executed by the processor. Through the data providing terminal, data providers (that is, users) can securely upload the original files of their clinical and genetic variation data and share them with others for population genomics research.
In a sixth embodiment, a privacy-protecting genetic data sharing apparatus is provided, applied to a computing node, comprising:
a data receiving module, configured to receive the encrypted data sent by the data providing terminal, which is stored on the blockchain after verification by all computing nodes; to receive an additively homomorphically encrypted query request sent by a data query terminal, which is stored on the blockchain after verification by all computing nodes; and to receive the calculation result sent by the storage unit, which is stored on the blockchain after verification by all computing nodes;
a data processing module, configured to generate a public/private key pair based on an additively homomorphic algorithm and also to generate a secret, all computing nodes broadcasting their own public keys so as to generate a collective public key; all computing nodes forming a blockchain; to send the symmetrically encrypted file and the identifier encrypted with the collective public key to the storage unit; to perform the distributed re-encryption of claim 2 or 3 on the original file and the virtual individual gene data encrypted with the collective public key and send them to the storage unit; and to perform distributed re-encryption on the query request and send it to the storage unit;
an access policy processing module, configured to read the access policy on the blockchain and associate the received encrypted data with it, the access policy being a full consent policy or a dynamic consent policy; under a full consent policy, the query request is approved or rejected; under a dynamic consent policy, a query is sent to the data providing terminal, and the reply to the query, sent by the data provider and generated on the blockchain, is received and verified; for a reply granting consent, the symmetric key encrypted with the collective public key is converted into a symmetric key encrypted with the querier's public key U.
The data processing module is further configured to convert the calculation result into a query result, the conversion being carried out by all computing nodes jointly executing the following distributed key exchange:
Let E_K(R) = (C_1, C_2) = (rG, R + rK) be the calculation result encrypted with the collective public key K, and let U be the public key of the querier.
First, E_K(R) = (C_1, C_2) is modified to
Figure BDA0003328807760000111
Each computing node then in turn generates a random number v_i and calculates
Figure BDA0003328807760000112
where
Figure BDA0003328807760000113
Figure BDA0003328807760000114
The final result, converted into a query result encrypted with the querier's public key U, is
Figure BDA0003328807760000115
where M = C_2 - (k_1 + ... + k_{n-1}) C_1 and v = v_1 + ... + v_n.
The data processing module sends the query result encrypted with the querier's public key U to the data query terminal.
The apparatus applied to a computing node is a server hosted by one of several mutually independent government, academic, or commercial institutions; a plurality of such apparatuses are jointly responsible for securely processing the query requests of data queriers.
In a seventh embodiment, a privacy-protecting genetic data sharing apparatus is provided, applied to a data query terminal, comprising:
a request generation module, configured to allocate query credits and a pseudo-random identity to the data querier and store them on the blockchain; to generate the querier's public key U and private key u, and provide data field information that the querier can combine with Boolean operators to build a query request; and to encrypt the query request with the collective public key and send it, together with the querier's public key U, to any one of the computing nodes;
a result receiving module, configured to receive the query result sent by the computing node, then update the query credits and store the updated query credits on the blockchain; and to decrypt the query result with the querier's private key u, or to decrypt the symmetric key with the querier's private key u and use the symmetric key to decrypt the encrypted data to obtain the real individual data.
Data queriers are typically researchers who use the data query terminal to send query requests and obtain query results, in order to find individuals with clinical and genetic variations of interest and recruit them into clinical studies or drug trials.
In an eighth embodiment, a privacy-protecting genetic data sharing apparatus is provided, applied to a storage unit, comprising:
a data storage module, configured to receive and store the symmetrically encrypted file and the identifier encrypted with the collective public key, together with the original file and virtual individual gene data that were encrypted with the collective public key and then distributedly re-encrypted, which together constitute the stored information;
a query matching module, configured to receive the distributedly re-encrypted query request, match it against the stored information, perform one of the following two calculations depending on the querier's needs, and send the calculation result to the computing node:
Let Φ be the set of identity identifiers of the successfully matched individuals, including data providers and virtual individuals.
If the querier needs only the number of successfully matched individuals, that is, the sum R of the identifiers of the successfully matched individuals, the following homomorphic calculation is performed:
Figure BDA0003328807760000121
If the querier needs the identity identifiers of the successfully matched individuals, the following calculation is performed:
each individual identity identifier is multiplied by the corresponding homomorphically encrypted identifier, that is,
Figure BDA0003328807760000122
After the calculation result is decrypted by the data query terminal, the identifier of each virtual individual is 0. Here id denotes an individual identity identifier.
The apparatus applied to the storage unit may be one or more servers responsible for securely storing the clinical and genomic data of data providers. A data provider may select any one of the servers to store his or her data. These servers may be located at any government, academic, or commercial institution whose IT infrastructure can provide storage for large amounts of data.
Specific examples are used herein to explain the inventive concept in detail; they are given only to aid understanding of the core idea of the invention. Any obvious modifications, equivalent substitutions, or other improvements made by those skilled in the art without departing from the concept of the invention fall within the scope of protection of the invention.

Claims (10)

1. A privacy-protecting genetic data sharing method, applied to a data providing terminal, characterized by comprising:
generating a group of virtual individuals and corresponding gene data, namely virtual individual gene data, using a virtual individual generation algorithm;
generating a symmetric encryption key for the data provider and using it to encrypt an original file to obtain a symmetrically encrypted file;
assigning an identifier to each virtual individual and to the data provider;
encrypting the original file, the virtual individual gene data, the key, and the identifier separately with a collective public key to obtain encrypted data, and sending the encrypted data to a computing node;
setting an access policy; and, where a dynamic consent policy is in place, further comprising:
when synchronizing with the blockchain state, receiving a query request notification for the data provider to review and approve or reject, and sending the result to the computing node.
2. A privacy-protecting genetic data sharing method, applied to computing nodes, characterized in that there are two or more computing nodes, the method comprising:
each computing node generating a public/private key pair based on an additively homomorphic algorithm and also generating a secret, all computing nodes broadcasting their own public keys so as to generate a collective public key; all computing nodes forming a blockchain;
receiving the encrypted data sent by the data providing terminal, the encrypted data being stored on the blockchain after verification by all computing nodes;
sending the symmetrically encrypted file and the identifier encrypted with the collective public key to a storage unit; performing distributed re-encryption on the original file and the virtual individual gene data encrypted with the collective public key, and sending them to the storage unit;
receiving an additively homomorphically encrypted query request sent by a data query terminal, the query request being stored on the blockchain after verification by all computing nodes; performing distributed re-encryption on the query request and sending it to the storage unit;
wherein the distributed re-encryption converts the additively homomorphic encrypted form into a deterministic encrypted form.
3. The method according to claim 2, characterized in that the distributed re-encryption uses the ElGamal encryption algorithm on elliptic curves (EC-ElGamal) and comprises a two-round process:
let E_K(v) = (C_1, C_2) = (rG, v + rK) be the ciphertext of v to which distributed re-encryption is applied, where v is an original file encrypted with the collective public key, virtual individual gene data encrypted with the collective public key, or a query request, K is the collective public key, r is a random number, and G is the base point of the elliptic curve;
Round 1: each computing node i uses its secret s_i to add s_i G to C_2 and sends the result to the next computing node i+1, the final result obtained being
Figure FDA0003328807750000011
Round 2: after receiving from the previous computing node i-1 the ciphertext
Figure FDA0003328807750000012
computing node i performs the calculation
Figure FDA0003328807750000013
the second component of the final result being taken as the deterministic encryption result, namely
Figure FDA0003328807750000014
where
Figure FDA0003328807750000015
4. The method according to claim 2, characterized by further comprising:
reading the access policy on the blockchain and associating the received encrypted data with the access policy, the access policy being a full consent policy or a dynamic consent policy; under a full consent policy, approving or rejecting the query request; under a dynamic consent policy, sending a query to the data providing terminal, and receiving and verifying the reply to the query that is sent by the data provider and generated on the blockchain;
for a reply granting consent, converting the symmetric key encrypted with the collective public key into a symmetric key encrypted with the querier's public key U.
5. The method according to claim 2, characterized by further comprising:
receiving the calculation result sent by the storage unit, the calculation result being stored on the blockchain after verification by all computing nodes;
converting the calculation result into a query result, all computing nodes jointly executing the following distributed key exchange:
let E_K(R) = (C_1, C_2) = (rG, R + rK) be the calculation result encrypted with the collective public key K, and let U be the public key of the querier;
first, E_K(R) = (C_1, C_2) is modified to
Figure FDA0003328807750000021
each computing node then in turn generates a random number v_i and calculates
Figure FDA0003328807750000022
where
Figure FDA0003328807750000023
the final result, converted into a query result encrypted with the querier's public key U, being
Figure FDA0003328807750000024
where v = v_1 + ... + v_n;
sending the query result encrypted with the querier's public key U to the data query terminal.
6. A privacy-protecting genetic data sharing method, applied to a data query terminal, characterized by comprising:
allocating query credits and a pseudo-random identity to a data querier, and storing the query credits and the pseudo-random identity on the blockchain;
generating the querier's own public key U and private key u, and providing data field information that the querier can combine with Boolean operators to build a query request; encrypting the query request with the collective public key and sending it, together with the querier's public key U, to any one of the computing nodes;
receiving the query result sent by the computing node, then updating the query credits and storing the updated query credits on the blockchain;
decrypting the query result with the querier's own private key u, or decrypting the symmetric key with the querier's own private key u and using the symmetric key to decrypt the encrypted data to obtain the real individual data.
7. A privacy-protecting genetic data sharing method, applied to a storage unit, characterized by comprising:
receiving and storing the symmetrically encrypted file and the identifier encrypted with the collective public key, together with the original file and virtual individual gene data that were encrypted with the collective public key and then distributedly re-encrypted, which together constitute the stored information;
receiving the distributedly re-encrypted query request, matching the query request against the stored information, performing one of the following two calculations depending on the query request, and sending the calculation result to the computing node:
let Φ be the set of identity identifiers of the successfully matched individuals, including data providers and virtual individuals;
if the query request requires only the number of successfully matched individuals, the following homomorphic calculation is performed:
Figure FDA0003328807750000025
if the query request requires the identity identifiers of the successfully matched individuals, the following calculation is performed: each individual identity identifier is multiplied by the corresponding homomorphically encrypted identifier, that is,
Figure FDA0003328807750000026
whereby, after the calculation result is decrypted, the identifier of each virtual individual is 0.
8. A privacy-protecting genetic data sharing apparatus, applied to a data providing terminal, characterized in that the apparatus comprises:
a virtual individual gene data generation module, configured to generate a group of virtual individuals and corresponding gene data, namely virtual individual gene data, using a virtual individual generation algorithm;
a data sending module, configured to generate a symmetric encryption key for the data provider to encrypt an original file and obtain a symmetrically encrypted file; to assign an identifier to each virtual individual and to the data provider; to encrypt the original file, the virtual individual gene data, the symmetrically encrypted file, and the identifier separately with the collective public key to obtain encrypted data; and to send the encrypted data to a computing node;
an auditing module, configured to set an access policy; where a dynamic consent policy is in place, the auditing module is further configured to receive, when synchronizing with the blockchain state, a query request notification for the data provider to review and approve or reject, and to send the result to the computing node.
9. A privacy-protecting genetic data sharing apparatus, applied to a computing node, characterized in that the apparatus comprises:
a data receiving module, configured to receive the encrypted data sent by the data providing terminal, which is stored on the blockchain after verification by all computing nodes; to receive an additively homomorphically encrypted query request sent by a data query terminal, which is stored on the blockchain after verification by all computing nodes; and to receive the calculation result sent by the storage unit, which is stored on the blockchain after verification by all computing nodes;
a data processing module, configured to generate a public/private key pair based on an additively homomorphic algorithm and also to generate a secret, all computing nodes broadcasting their own public keys so as to generate a collective public key; all computing nodes forming a blockchain; to send the symmetrically encrypted file and the identifier encrypted with the collective public key to the storage unit; to perform the distributed re-encryption of claim 2 or 3 on the original file and the virtual individual gene data encrypted with the collective public key and send them to the storage unit; and to perform distributed re-encryption on the query request and send it to the storage unit;
an access policy processing module, configured to read the access policy on the blockchain and associate the received encrypted data with the access policy, the access policy being a full consent policy or a dynamic consent policy; under a full consent policy, the query request is approved or rejected; under a dynamic consent policy, a query is sent to the data providing terminal, and the reply to the query that is sent by the data provider and generated on the blockchain is received and verified;
for a reply granting consent, the symmetric key encrypted with the collective public key is converted into a symmetric key encrypted with the querier's public key U.
10. A privacy-protecting genetic data sharing apparatus, applied to a data query terminal, characterized in that the apparatus comprises:
a request generation module, configured to allocate query credits and a pseudo-random identity to a data querier and store the query credits and the pseudo-random identity on the blockchain; to generate the querier's own public key U and private key u, and provide data field information that the querier can combine with Boolean operators to build a query request; and to encrypt the query request with the collective public key and send it, together with the querier's public key U, to any one of the computing nodes;
a result receiving module, configured to receive the query result sent by the computing node, then update the query credits and store the updated query credits on the blockchain; and to decrypt the query result with the querier's own private key u, or to decrypt the symmetric key with the querier's own private key u and use the symmetric key to decrypt the encrypted data to obtain the real individual data.
CN202111274064.7A 2021-10-29 2021-10-29 Genetic data sharing method and device with privacy protection Active CN113990399B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111274064.7A CN113990399B (en) 2021-10-29 2021-10-29 Genetic data sharing method and device with privacy protection

Publications (2)

Publication Number Publication Date
CN113990399A (en) 2022-01-28
CN113990399B (en) 2025-01-14

Family

ID=79744585

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202111274064.7A Active CN113990399B (en) 2021-10-29 2021-10-29 Genetic data sharing method and device with privacy protection

Country Status (1)

Country Link
CN (1) CN113990399B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109768987A (en) * 2019-02-26 2019-05-17 重庆邮电大学 A secure and private storage and sharing method of data files based on blockchain
CN111191288A (en) * 2019-12-30 2020-05-22 中电海康集团有限公司 Block chain data access authority control method based on proxy re-encryption
CN111723354A (en) * 2019-03-21 2020-09-29 宏观基因有限公司 Method for providing biological data, method for encrypting biological data, and method for processing biological data
CN112840403A (en) * 2018-07-17 2021-05-25 李伦京 Methods for preserving and using genomes and genomic data
CN113468570A (en) * 2021-07-15 2021-10-01 湖北央中巨石信息技术有限公司 Private data sharing method based on intelligent contract

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sui Aina et al.: "数字内容安全技术" (Digital Content Security Technology), 31 October 2016, Communication University of China Press (中国传媒大学出版社), page 229 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114996763A (en) * 2022-07-28 2022-09-02 北京锘崴信息科技有限公司 Private data security analysis method and device based on trusted execution environment
CN114996763B (en) * 2022-07-28 2022-11-15 北京锘崴信息科技有限公司 Private data security analysis method and device based on trusted execution environment

Also Published As

Publication number Publication date
CN113990399B (en) 2025-01-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant