[go: up one dir, main page]

CN119046990A - Collusion-resistant multi-file privacy information retrieval method and system - Google Patents

Collusion-resistant multi-file privacy information retrieval method and system Download PDF

Info

Publication number
CN119046990A
CN119046990A CN202411533296.3A CN202411533296A CN119046990A CN 119046990 A CN119046990 A CN 119046990A CN 202411533296 A CN202411533296 A CN 202411533296A CN 119046990 A CN119046990 A CN 119046990A
Authority
CN
China
Prior art keywords
file
retrieval
scheme
files
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202411533296.3A
Other languages
Chinese (zh)
Other versions
CN119046990B (en
Inventor
刘洋
张一炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202411533296.3A priority Critical patent/CN119046990B/en
Publication of CN119046990A publication Critical patent/CN119046990A/en
Application granted granted Critical
Publication of CN119046990B publication Critical patent/CN119046990B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/13File access structures, e.g. distributed indices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata
    • G06F16/148File search processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/172Caching, prefetching or hoarding of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Medical Informatics (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明提出了一种抗合谋多文件隐私信息检索方法及系统,涉及隐私信息检索领域,包括:通过文件分包、编码存储的方式,将若干个文件分布式存储在多个服务器上,构成分布式存储系统;获取需求文件的检索信息,连同存储参数,进行查询轮次的选定,构建多个备选检索方案;根据检索方案的PIR码率,从多个备选检索方案中选取最优的检索方案;通过执行最优的检索方案,得到需求文件的内容;本发明基于边际信息查询和需求信息查询,高效实现在基于任意码的分布式存储系统中的可抵抗任意个服务器合谋的多文件隐私信息检索。

The present invention proposes a method and system for anti-collusion multi-file privacy information retrieval, which relates to the field of privacy information retrieval, including: distributing and storing a number of files on a plurality of servers by means of file subpackaging and encoding storage to form a distributed storage system; obtaining the retrieval information of the required file, selecting the query round together with the storage parameters, and constructing a plurality of alternative retrieval schemes; selecting the best retrieval scheme from a plurality of alternative retrieval schemes according to the PIR code rate of the retrieval scheme; obtaining the content of the required file by executing the best retrieval scheme; the present invention is based on marginal information query and demand information query, and efficiently realizes the search based on arbitrary Distributed storage system with arbitrarily resistant Retrieval of private information from multiple files using collusion between servers.

Description

Collusion-resistant multi-file privacy information retrieval method and system
Technical Field
The invention belongs to the field of private information retrieval, and particularly relates to a collusion-resistant multi-file private information retrieval method and system.
Background
The rapid development of big data and cloud computing technology brings great convenience for people to acquire information and also brings risk of leakage of private information of users, for example, most of current search engines analyze search behaviors of users to make corresponding personalized customization or advertisement popularization, personal privacy of users is protected when some sensitive databases are searched, and privacy information searching (Private information retrieval, PIR) allows users to send queries to a server to search required files, and meanwhile guarantees that the server cannot obtain any clues of the numbers of the searched files.
PIR problems originate in the cryptography field at the earliest and are closely related to research directions such as careless transmission, privacy calculation, safe multiparty calculation and the like, main optimization indexes of the PIR problems are traffic (uploading and downloading required by searching single bits) and algorithm complexity, sun and Jafar popularize the PIR problems in the cryptography field into a distributed storage system in the foundation work of 2017 of Sun and Jafar, and a searching target is also derived from single bits into a sufficiently large file.
A general description of PIR problems in a distributed system is as follows, in one baseIn a distributed storage system for codes,The individual files are stored via encodingIn a personal server, a user wishes to ensure arbitrary search for a single demand file while successfully searching for the demand fileThe server for collusion cannot obtain any information of the number of the searched file, and the measurement index of PIR scheme efficiency of the distributed storage system is called PIR code rate, namely the ratio between the size of the required file and the total download amount.
In the last decade, the PIR problem of the distributed system is one of the leading-edge hot problems of information science, and the articles of which the related subjects are recently published in International information theory flagship journal IEEE Transactions on Information Theory. However, most studies only consider single file retrieval. In the aspect of multi-file PIR retrieval, banawan and Ulukus indicate in 2018 work that the efficiency of simultaneously retrieving a plurality of files is strictly better than that of repeatedly executing a single-file PIR scheme, which indicates that the research of the multi-file PIR retrieval scheme has practical application value. In Banawan and Ulukus operations, if the number of files is retrievedStrictly less thanThe PIR code rate of the scheme still has a certain gap with the theoretical bound of the information theory. At the same time, their work is directed toThis degradation situation (i.e. no collusion between replicated storage systems, servers) does not take into account the general parametersAnd
Therefore, the existing multi-file privacy information retrieval scheme has limitations.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a collusion-resistant multi-file privacy information retrieval method and a collusion-resistant multi-file privacy information retrieval system, which are based on marginal information inquiry and demand information inquiry and are efficiently realized on the basis of any oneResistance to arbitrary in a distributed storage system for codesAnd searching multi-file privacy information colluded by the servers.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
The first aspect of the invention provides a collusion-resistant multi-file privacy information retrieval method.
A collusion-resistant multi-file privacy information retrieval method comprises the following steps:
A plurality of files are stored on a plurality of servers in a distributed mode through file sub-packaging and coding storage modes, so that a distributed storage system is formed;
acquiring search information of a demand file, carrying out query round selection together with storage parameters, and constructing a plurality of alternative search schemes;
selecting an optimal retrieval scheme from a plurality of alternative retrieval schemes according to the PIR code rate of the retrieval scheme;
obtaining the content of the demand file by executing an optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
Further, the method for packetizing and encoding the files is to cut the files into piecesOriginal data packets are divided, and each data packet passes throughCode generationThe data packets after the coding are stored in the following parts respectivelyIn the server.
Further, the storage parameters include the number of divisionsNumber of serversTotal number of files;
The search information of the requirement files comprises the number of the requirement filesNumbering of each demand file and the number of collusion servers
Further, the search scheme comprises a plurality of rounds, each round comprises a plurality of stages, in each stage, a user is required to send a query request to each server, and the server returns a query response;
Wherein the query request is as follows In the form of the sum of the search symbols of the individual documents, this is requested in batchesFiles are stored in file data packages on the server,Is the round sequence number;
The query response is a file data packet pointed by the server return retrieval symbol.
Further, the query round is selected fromIn the individual rounds, selectNumber of rounds, wherein, the frontThe round contains the demand information query and the marginal information query, the firstThe individual rounds only contain the demand information query;
the marginal information inquiry is only inquiry of the sum of retrieval symbols of the non-required files;
The demand information query is a query containing the sum of search symbols of the demand file.
Further, the construction of the alternative retrieval scheme further comprises calculating scheme parameters based on the storage parameters and the retrieval parametersAnd;
Utilization scheme parametersAndThe following calculations were performed:
(1) Calculate the selected The number of stages of each of the rounds;
(2) Utilization scheme parameters AndSetting search symbols of all files;
(3) Utilization scheme parameters AndThe PIR code rate for each scheme is calculated.
Further, the search symbol is set by a pre-selected methodAnd the coding matrix codes the data packet of each file to obtain the retrieval symbol of the file.
The second aspect of the invention provides a collusion-resistant multi-file privacy information retrieval system.
The collusion-resistant multi-file privacy information retrieval system comprises a file storage module, a scheme construction module, a scheme selection module and a retrieval execution module:
the file storage module is configured to store a plurality of files on a plurality of servers in a distributed manner in a file sub-packaging and coding storage mode to form a distributed storage system;
the scheme construction module is configured to acquire the retrieval information of the demand file, select query rounds together with the storage parameters and construct a plurality of alternative retrieval schemes;
The scheme optimizing module is configured to select an optimal searching scheme from a plurality of alternative searching schemes according to the PIR code rate of the searching scheme;
the searching and executing module is configured to obtain the content of the demand file by executing the optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
A third aspect of the present invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the steps in a collusion resistant multi-file privacy information retrieval method according to the first aspect of the present invention.
A fourth aspect of the present invention provides an electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in a collusion resistant multi-file privacy information retrieval method according to the first aspect of the present invention when the program is executed.
The one or more of the above technical solutions have the following beneficial effects:
1) The invention constructs a search scheme consisting of a plurality of rounds including the demand information query and the marginal information query and a round including only the demand information query, eliminates non-demand files in the follow-up round demand information query results by utilizing the marginal information query results to obtain the content of the demand files, and realizes the search scheme based on any one round of demand information query results One of the distributed storage systems of codes is resistant to arbitraryMultiple file privacy information retrieval scheme for collusion of individual servers, may be directed to any suitable parametersAndI.e. allowing the system to be slavedThe copy storage is generalized to be arbitraryAndAnd allows collusion of servers.
2) Banawan and Ulukus, in the previous work, atTake turns toDefault selected in wheelThe wheel, not being aware that the selection here will have an impact on the retrieval efficiency. The invention considers the difference of the selection of each round, and can efficiently calculate and select the PIR code rate of the scheme obtained by each round, thereby aiming at given parametersThe optimal round Q can be selected. For example, in,,,When Banawan and Ulukus default selections of the 5 th round, the PIR code rate of the scheme isThe invention selects the 4 th round, the PIR code rate of the scheme isTherefore, the search efficiency of the scheme is greatly improved.
3) The searching efficiency of the invention is far better than the efficiency of repeated single file searching scheme, thus the searching efficiency can be practically improved in the actual multi-file searching process.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
Fig. 1 is a flow chart of a method of a first embodiment.
Figure 2 is a schematic diagram of interactions of the parties involved in the first embodiment.
Fig. 3 is a diagram showing an example of the retrieval scheme of the first embodiment.
Detailed Description
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
In one embodiment of the present disclosure, a collusion-resistant multi-file privacy information retrieval method is provided, as shown in fig. 1, including the following steps:
step S1, a plurality of files are stored on a plurality of servers in a distributed mode through file sub-packaging and coding storage to form a distributed storage system;
further, the method for packetizing and encoding the files is to cut the files into pieces Original data packets are divided, and each data packet passes throughCode generationThe data packets after the coding are stored in the following parts respectivelyIn the server.
Step S2, acquiring search information of a demand file, carrying out query round selection together with storage parameters, and constructing a plurality of alternative search schemes;
further, the storage parameters include the number of divisions Number of serversTotal number of files;
The search information of the requirement files comprises the number of the requirement filesNumbering of each demand file and the number of collusion servers
Further, the search scheme comprises a plurality of rounds, each round comprises a plurality of stages, in each stage, a user is required to send a query request to each server, and the server returns a query response;
Wherein the query request is as follows In the form of the sum of the search symbols of the individual documents, this is requested in batchesFiles are stored in file data packages on the server,Is the round sequence number;
The query response is a file data packet pointed by the server return retrieval symbol.
Further, the query round is selected fromIn the individual rounds, selectNumber of rounds, wherein, the frontThe round contains the demand information query and the marginal information query, the firstThe individual rounds only contain the demand information query;
the marginal information inquiry is only inquiry of the sum of retrieval symbols of the non-required files;
The demand information query is a query containing the sum of search symbols of the demand file.
Further, the construction of the alternative retrieval scheme further comprises calculating scheme parameters based on the storage parameters and the retrieval parametersAnd;
Utilization scheme parametersAndThe following calculations were performed:
(1) Calculate the selected The number of stages of each of the rounds;
(2) Utilization scheme parameters AndSetting search symbols of all files;
(3) Utilization scheme parameters AndThe PIR code rate for each scheme is calculated.
Further, the search symbol is set by a pre-selected methodAnd the coding matrix codes the data packet of each file to obtain the retrieval symbol of the file.
Step S3, selecting an optimal retrieval scheme from a plurality of alternative retrieval schemes according to the PIR code rate of the retrieval scheme;
Step S4, obtaining the content of the demand file by executing an optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
In the above method, the sum isThe files are stored in the system via encoding, each file being cut intoBy the same packetCode generationThe coded data packets are stored in a distributed storage system respectivelyOn the server, there is no hybrid code between the files.
The number of files that the user needs to retrieve (i.e. the demand files) isAnd has As shown in fig. 2, the user generates a plurality of inquiry requests and sends the inquiry requests to each server, the server carries out feedback response on the inquiry of the user, and the user synthesizes all feedback response results and decodes the result to obtain the file content to be searched.
In this process, anyThe individual servers may collude, i.e., analyze the numbers of files retrieved by the user using the query responses they receive. PIR schemes require that the user not go to this if he can retrieve it correctlyThe collusion server leaks any information about the required file number, according to specific parametersAndThe user selects different inquiry request construction strategies to construct different retrieval schemes, and the method specifically comprises the following steps:
(1) At the position of When the user's query request construction strategy is as follows:
1) Setting the retrieval symbol of the file by Random permutations of eachThe coordinates of the individual files are scrambled, and the replaced coordinates are used as search symbols of the files.
2) The searching scheme is divided into a plurality of rounds, wherein each round comprises a plurality of stages, and in the first roundThe number of rounds is) In the form of a user search to each serverAny of the individual filesThe sum of certain coordinates of the files is totally thatAnd (5) searching.
3) The number of stages per round is set to follow specific rules-in the firstTake turns toOf the wheels, the user selects only one of the wheels (denoted as the first wheelWheel), will be the firstTake turns toThe number of stages of the remaining turns in the turn is set to 0, the turn selectedAnd front(s)The rounds together form a complete query scheme.
The number of the stages of the rounds is non-zero, the specific design follows a certain proportion, the essential purpose of the method is to fully utilize marginal information brought by non-requirement files to improve the PIR code rate, and the calculation details are shown in the definition and setting part of the rounds and the stages.
In general terms, the process is carried out,Has a certain influence on PIR code rate, and is given in parametersThen, for eachCalculating code rate one by one, selecting optimumDesign plan, especially inInteger divisionIn the time-course of which the first and second contact surfaces,The choice of (c) may be arbitrary.
(2) For general parameters,The user's query request construction strategy is as follows:
1) Selecting a particular one And the coding matrix carries out coding pretreatment on each file to generate a search symbol of the file.
2) In the first placeThe number of rounds is) In one stage of (a), the user retrieves from each server in the form ofAny of the individual filesSome of the files are summed together by the sign after the first encoding stepAnd (5) searching.
3) The specific design of the ratio of the number of stages of each round is subject toIs a function of (a) and (b).
In the concrete search, the coded symbols of the related files are required to meet the following settings that the user needs to search the linear independence among the symbols of the files, and the combined part of the non-searched files in the addition of each search is required to be calculated by the searched data on other servers, and meanwhile, in order to ensure the privacy of the scheme, the method is arbitrarily used for searching the filesOn the collusion server, each retrieval symbol of any file needs to be linearly irrelevant. For specific design details, see the "settings for each document retrieval symbol" section below.
The implementation process of the collusion-resistant multi-file privacy information retrieval method of the embodiment is described in detail below from the aspects of construction of a distributed storage system, definition and setting of 'rounds and phases', setting of retrieval symbols of each file, an overall scheme example and a PIR code rate calculation formula respectively.
1. Construction of distributed storage systems
Consider the followingDistributed storage system formed by individual servers, in which system storage is performedFiles, each via the sameThe code is stored in the specific form:
The following files: Wherein, the method comprises the steps of, wherein, I.e. each file is of lengthExpressed as a sizeIs a matrix of (a) in the matrix.
Code generator matrix:
Document is put into effectIs the matrix of (a) represents the (b)The row, which is a cut data packet, is denoted as a row vectorBy usingCode generator matrix with length ofIs a row vector of (2)Coded as lengthVector of (3)First, theThe individual servers store the encoded vectorsAfter traversing all rows of the matrix representation of all files, as shown in FIG. 2, the firstContent stored by individual serversIs that all row vectors are inProjection onto, i.e.
The user needs to search in the systemPersonal files (i.e. demand files), wherein And ensure arbitraryThe collusion server cannot obtain any information about the index file number.
2. Definition and setting of "turns and phases
The structure of a search scheme comprises a plurality of rounds, each round comprises a plurality of stages, and in each stage, a user sends a query request to each server, wherein the specific form of the query request is the sum of search symbols of a plurality of files.
First, theThe number of rounds is) Representing the searchIn individual filesThe sum of the search symbols of different documentsA phase of a round is defined as a phase comprisingAny of the individual filesModule for adding search symbols of individual filesAnd (5) searching.
Wherein, in the frontIn the wheel, the firstThe number of rounds is) One stage of (2) isQueries consisting of the sum of search symbols of non-required documents, called marginal information queries, andA query containing the sum of the symbols of the required documents is called a requirement information query. First, theTo the firstIn the wheel, the firstThe number of rounds is) Is only one stage ofAnd (5) inquiring the demand information.
The round selection setting of the alternative retrieval scheme follows the following rules:
The user is at the first To the firstOne of the wheels is selected, the firstTo the firstSetting the rest turns in the turns to 0, i.e. not carrying out actual search or query, and comparing the selected turns with the previous turnsThe rounds together form a complete retrieval scheme, and the complete retrieval scheme is summed upAlternative search schemes.
The retrieval scheme eliminates the related interference in the subsequent round of demand information inquiry by adjusting the stage number of each round to fully utilize the marginal information inquiry of the non-demand file. First, theTo the point ofThe round does not generate marginal information inquiry, and optionally one round can fully utilize all marginal information inquiry by adjusting the stage number proportion, so that the calculation is carried out one by one on the premise of giving the retrieval information and the storage parametersAfter PIR code rate of each alternative scheme, only the first is selectedTo the point ofOne of the rounds for maximizing the PIR code rate of the scheme is associated with the frontThe rounds form the optimal retrieval scheme.
The number of stages of the search scheme is set to follow the following rules:
Order the AndIn order to satisfy the minimum positive integer of the following formula, in this way, the scheme parameters are setAndIs a value of (2);
in an alternative search scheme, assume that the user has selected the first Wheel) Participate in composing alternative search scheme, number of stages of each turn of schemeThe calculation is as follows:
When (when) When, i.e. scheme No.The number of the wheel stages is. When (when) When, i.e. scheme NoTake turns toIn wheel divide byThe number of stages outside the wheel is set to 0. When (when)When, i.e. scheme NoTake turns toThe number of stages of the wheel is set to
First, theThe number of stages of the wheel is set toTo ensure that the number of stages of each round of the scheme is an integerThe number of stages of the wheel is defined byThe number of stages of the wheel is recursively obtained, the thOne stage of the wheel requires the firstWheel provisionMarginal information inquiry of each stage to eliminate interference in demand information inquiry, thusThe number of stages of the wheel is:
and so on, for the first Wheel) The number of stages is:
3. Arrangement of search symbols for each document
By pre-selection ofThe coding matrix codes the data packet of each file to obtain the retrieval symbol of the file, which is specifically:
provided that in a sufficiently large domain On is provided withCode whose transpose of the generator matrix is noted asOrder-making machineRepresenting a user slaveAll of the aboveSelected uniformly and independently in a full order matrixA random matrix.
In general parametersAndLower, the firstOne of the phases of the rounds relates toThe search symbol added query specific form of each file is set to be isomorphic with an auxiliary matrix, and the auxiliary matrix is constructed byThe size of each symbol is Each symbol appearing in the secondary matrixSecondary and arbitraryColumns share one and the same symbol, i.e. arbitraryThe servers have one and the same query.
The search symbol of the non-demand file is set as follows:
By a specific MDS coding matrix Will be Personal non-required filesSearch symbol precoding as WhereinFor a slave matrixSelected from (a) and (b) Line, ensure that the whole scheme is followedThe selected row is not repeated.
For the required fileIts search symbol is set as
In particular forThe search symbol of the file is set as follows:
the user uses M random permutations to disturb the coordinates of each file, and the coordinates are directly used as search symbols.
4. Overall solution example
,,,When the 5 th round is selected asWheel, and frontThe search scheme composed of individual rounds is shown in FIG. 3, for the first search schemeThe user searches the total of the servers at each stageArbitrary seedQuery of the sum of the search symbols of different documents, shaped asRepresentation ofThe query form of each file is isomorphic with the auxiliary matrix, and represents one query request.
ConsiderA demand information query including a required file and a non-required fileA marginal information query including only corresponding non-required files, each non-required file The search symbols are obtained by the precoding mode in 3), and the search information inquiry only comprises a required fileSearch symbol fromThe required information inquiry comprising two or more required files, only taking the symbol of one required file as the symbol required to be searched for inquiry, setting the other required files as interference in inquiry by using the same precoding mode of non-required files, setting the required information inquiry as the search symbol which can be determined from the search of other servers, and simultaneously, at any timeThe collusion server appears to be linearly independent of the retrieval sign of any file.
5. PIR code rate calculation formula
The PIR code rate of the search scheme refers to the ratio between the size of the demand file and the total download amount, and is calculated by the following iterative method:
Record the first The total load of one stage of the wheel is,Is the firstThe sum of the download amount of the wheel itself and the download amount of the required front-end wheel, the user search file size is,Is the firstThe sum of the retrieval file size of the round itself and the retrieval file size downloaded from other servers for the required preamble round, the firstThe wheel code rate is expressed as In the first placeCode rate of wheelFor the front to be the initial valueThe round may write an iterative calculation formula:
The code rate of the seed search alternative scheme is To the point ofThe code rate of the stage of the wheel is calculated as follows:
Selection of To the point ofHighest stage code rate of wheel asAn optimal search scheme among the alternative search schemes.
Example two
An embodiment of the present disclosure provides a collusion-resistant multi-file privacy information retrieval system, which includes a file storage module, a scheme construction module, a scheme optimization module, and a retrieval execution module:
the file storage module is configured to store a plurality of files on a plurality of servers in a distributed manner in a file sub-packaging and coding storage mode to form a distributed storage system;
the scheme construction module is configured to acquire the retrieval information of the demand file, select query rounds together with the storage parameters and construct a plurality of alternative retrieval schemes;
The scheme optimizing module is configured to select an optimal searching scheme from a plurality of alternative searching schemes according to the PIR code rate of the searching scheme;
the searching and executing module is configured to obtain the content of the demand file by executing the optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
Example III
An object of the present embodiment is to provide a computer-readable storage medium.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs steps in a collusion resistant multi-file privacy information retrieval method according to a first embodiment of the present disclosure.
Example IV
An object of the present embodiment is to provide an electronic apparatus.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor implements the steps in a collusion-resistant multi-file privacy information retrieval method according to the first embodiment of the present disclosure when executing the program.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A collusion resistant multi-file privacy information retrieval method, comprising:
A plurality of files are stored on a plurality of servers in a distributed mode through file sub-packaging and coding storage modes, so that a distributed storage system is formed;
acquiring search information of a demand file, carrying out query round selection together with storage parameters, and constructing a plurality of alternative search schemes;
selecting an optimal retrieval scheme from a plurality of alternative retrieval schemes according to the PIR code rate of the retrieval scheme;
obtaining the content of the demand file by executing an optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
2. The method for searching multi-file privacy information for collusion resistance according to claim 1, wherein the file is sub-packaged and coded and stored in a manner of cutting the file into piecesOriginal data packets are divided, and each data packet passes throughCode generationThe data packets after the coding are stored in the following parts respectivelyIn the server.
3. The collusion resistant multi-file privacy information retrieval method as recited in claim 1, wherein the storage parameters comprise a division numberNumber of serversTotal number of files;
The search information of the requirement files comprises the number of the requirement filesNumbering of each demand file and the number of collusion servers
4. The collusion resistant multi-file privacy information retrieval method as recited in claim 3 wherein the retrieval scheme comprises a plurality of rounds, each round comprising a plurality of phases, in each phase requiring the user to send a query request to each server, the server returning a query response;
Wherein the query request is as follows In the form of the sum of the search symbols of the individual documents, this is requested in batchesFiles are stored in file data packages on the server,Is the round sequence number;
The query response is a file data packet pointed by the server return retrieval symbol.
5. The collusion resistant multiple file privacy information retrieval method as recited in claim 4 wherein the query round is selected from the group consisting ofIn the individual rounds, selectNumber of rounds, wherein, the frontThe round contains the demand information query and the marginal information query, the firstThe individual rounds only contain the demand information query;
the marginal information inquiry is only inquiry of the sum of retrieval symbols of the non-required files;
The demand information query is a query containing the sum of search symbols of the demand file.
6. The collusion resistant multi-file privacy information retrieval method as recited in claim 5 wherein the construction of the alternative retrieval scheme further comprises calculating scheme parameters based on storage parameters, retrieval parametersAnd;
Utilization scheme parametersAndThe following calculations were performed:
(1) Calculate the selected The number of stages of each of the rounds;
(2) Utilization scheme parameters AndSetting search symbols of all files;
(3) Utilization scheme parameters AndThe PIR code rate for each scheme is calculated.
7. The collusion resistant multi-file privacy information retrieval method as recited in claim 6 wherein the search symbol is set by a pre-selected meansAnd the coding matrix codes the data packet of each file to obtain the retrieval symbol of the file.
8. The collusion-resistant multi-file privacy information retrieval system is characterized by comprising a file storage module, a scheme construction module, a scheme selection module and a retrieval execution module:
the file storage module is configured to store a plurality of files on a plurality of servers in a distributed manner in a file sub-packaging and coding storage mode to form a distributed storage system;
the scheme construction module is configured to acquire the retrieval information of the demand file, select query rounds together with the storage parameters and construct a plurality of alternative retrieval schemes;
The scheme optimizing module is configured to select an optimal searching scheme from a plurality of alternative searching schemes according to the PIR code rate of the searching scheme;
the searching and executing module is configured to obtain the content of the demand file by executing the optimal searching scheme;
the searching scheme comprises a plurality of rounds including the requirement information inquiry and the marginal information inquiry and a round including only the requirement information inquiry, and non-requirement files in the follow-up round requirement information inquiry results are eliminated by utilizing the marginal information inquiry results, so that the contents of the requirement files are obtained.
9. An electronic device, comprising:
a memory for non-transitory storage of computer readable instructions;
A processor for executing the computer readable instructions;
Wherein the computer readable instructions, when executed by the processor, perform the method of any of claims 1-7.
10. A storage medium, characterized by non-transitory storing computer readable instructions, wherein the computer readable instructions, when executed by a computer, perform the method of any of claims 1-7.
CN202411533296.3A 2024-10-31 2024-10-31 Collusion-resistant multi-file privacy information retrieval method and system Active CN119046990B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411533296.3A CN119046990B (en) 2024-10-31 2024-10-31 Collusion-resistant multi-file privacy information retrieval method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411533296.3A CN119046990B (en) 2024-10-31 2024-10-31 Collusion-resistant multi-file privacy information retrieval method and system

Publications (2)

Publication Number Publication Date
CN119046990A true CN119046990A (en) 2024-11-29
CN119046990B CN119046990B (en) 2025-03-04

Family

ID=93585653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411533296.3A Active CN119046990B (en) 2024-10-31 2024-10-31 Collusion-resistant multi-file privacy information retrieval method and system

Country Status (1)

Country Link
CN (1) CN119046990B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100185847A1 (en) * 2009-01-20 2010-07-22 New York University Database outsourcing with access privacy
CN110362537A (en) * 2019-07-09 2019-10-22 深圳大学 Based on the decoded population parameter Private information retrieval method of sawtooth
US20190325082A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Private information retrieval with probabilistic batch codes
CN113553615A (en) * 2021-07-07 2021-10-26 深圳前海新心数字科技有限公司 A Matching Query Method for Private Data Sharing System
CN114036565A (en) * 2021-11-19 2022-02-11 上海勃池信息技术有限公司 Private information retrieval system and private information retrieval method
CN114185860A (en) * 2021-10-29 2022-03-15 北京邮电大学 Collusion attack resistant data sharing method and device and electronic equipment
US11308081B1 (en) * 2020-12-18 2022-04-19 Seagate Technology Llc Private information retrieval using one query to perform multiple retrievals

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100185847A1 (en) * 2009-01-20 2010-07-22 New York University Database outsourcing with access privacy
US20190325082A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Private information retrieval with probabilistic batch codes
CN110362537A (en) * 2019-07-09 2019-10-22 深圳大学 Based on the decoded population parameter Private information retrieval method of sawtooth
US11308081B1 (en) * 2020-12-18 2022-04-19 Seagate Technology Llc Private information retrieval using one query to perform multiple retrievals
CN113553615A (en) * 2021-07-07 2021-10-26 深圳前海新心数字科技有限公司 A Matching Query Method for Private Data Sharing System
CN114185860A (en) * 2021-10-29 2022-03-15 北京邮电大学 Collusion attack resistant data sharing method and device and electronic equipment
CN114036565A (en) * 2021-11-19 2022-02-11 上海勃池信息技术有限公司 Private information retrieval system and private information retrieval method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KARIM BANAWAN等: "The Capacity of Private Information Retrieval from Coded Databases", IEEE, 31 December 2018 (2018-12-31) *
刘树波;李艳敏;刘梦君;: "基于密文检索的位置服务用户隐私保护方案", 计算机科学, no. 04, 15 April 2015 (2015-04-15) *
葛奕飞;郑彦斌;: "带有纠删或纠错性质的隐私保护信息检索方案", 广西师范大学学报(自然科学版), no. 03, 31 May 2020 (2020-05-31) *

Also Published As

Publication number Publication date
CN119046990B (en) 2025-03-04

Similar Documents

Publication Publication Date Title
US9305019B2 (en) Method of associating user related data with spatial hierarchy identifiers for efficient location-based processing
CN101257670B (en) Method, equipment and system for searching and downloading phone file
US6745177B2 (en) Method and system for retrieving data from multiple data sources using a search routing database
CN101694672B (en) Distributed safe retrieval system
CN110263061A (en) A kind of data query method and system
CN104217023B (en) It is a kind of to solve the method for map tile storage using packaging technique
CN111262953B (en) Method and device for pushing information in real time
CN107943952A (en) A kind of implementation method that full-text search is carried out based on Spark frames
CN103399945A (en) Data structure based on cloud computing database system
CN105357247B (en) Multidimensional property cloud resource range lookup method based on layering cloud peer-to-peer network
WO2020112580A1 (en) Data retrieval
CN117972795B (en) Method and device for secure retrieval of keywords in secret space based on XOR filter
CN108647266A (en) A kind of isomeric data is quickly distributed storage, exchange method
Gupta et al. Faster as well as early measurements from big data predictive analytics model
CN119046990B (en) Collusion-resistant multi-file privacy information retrieval method and system
CN103425694B (en) The searching method of relational data and device
JP2007317138A (en) Data storage system, file search device and program
US20050027684A1 (en) Database system and data accessing method thereof
Zhang et al. LPPS‐AGC: Location Privacy Protection Strategy Based on Alt‐Geohash Coding in Location‐Based Services
Bugiotti et al. SPARQL Query Processing in the Cloud.
EP2082317A2 (en) System and method for distributing queries to a group of databases and expediting data access
CN115114360A (en) Data comparison method and device, computer equipment and storage medium
Kim et al. Implementation of a front-end and back-end NDN system for climate modeling application
Harshan XOR-based codes for private information retrieval with private side information
Jiang et al. TriMLP: Revenge of a MLP-like Architecture in Sequential Recommendation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant