Block chain-based security management and disclosure tracing method for confidential documents
Technical Field
The application relates to a security management and disclosure tracing method for a confidential document based on a block chain, in particular to a method for performing security management on the confidential document by adopting a block chain technology and effectively tracing the document after the document is disclosed.
Background
In government, military, enterprise and other organizations, confidential documents bear directly on national policy, military command operations, business secrets and the like; once leaked, they can cause immeasurable loss to the country, the military or the enterprise. When a confidential document is printed and distributed, existing security management usually relies on visible plain-text watermarks, document numbering and similar measures. These measures do little to prevent a recipient from leaking the document by copying or photographing it, and photographing only part of the text is especially hard to control. Once a confidential document has been leaked and widely spread, it is technically difficult to trace which recipient the leak came from, and therefore difficult to identify the specific person responsible for the disclosure.
Adding a digital watermark to a document can deter arbitrary disclosure to some extent: once a leak occurs, the responsible person can be traced through the digital watermark. However, if two responsible persons hold the same document, it is difficult to tell which of them leaked it. Moreover, like a plain-text watermark, such a digital watermark can be removed with image-editing tools before the document is leaked. When only part of the key content of a document is photographed and exposed, responsibility is hard to establish by technical means. It is usually pursued through management measures combined with locating the network source of the data, tracking files and the like, which is inefficient, works poorly and yields results of low credibility.
Document dark watermarking technology can embed information into each character in a document in a way that is imperceptible to the naked eye, and people cannot be visually affected by the dark watermark when reading the document. When the document is distributed to different target objects, different dark watermarks can be respectively embedded according to different target object information. When the document is leaked, watermark information in the document is extracted to trace the person responsible for the leakage. However, such a dark watermarking technology usually depends on the association relationship between the dark watermark and the target object information, and the storage of such an association relationship has the problems of data deletion and tampering, unreliable query results, and the like.
A block chain is a special data structure in which data blocks are linked into a chain in chronological order, and which uses cryptography to guarantee a decentralized, trustless distributed shared ledger that cannot be tampered with or forged. Block chain technology uses an encrypted chained block structure to verify and store data, uses P2P network technology and a consensus mechanism to realize verification and communication among distributed nodes, and uses on-chain scripts to implement complex business logic that operates on the data automatically, thereby forming a new method of recording, storing and expressing data.
The application combines document dark watermark technology with block chain technology and provides a block chain-based security management and disclosure tracing method for confidential documents. The method embeds a dark watermark in the confidential document, encrypts the information of the person responsible for the document, and stores the watermark and the encrypted information on the block chain. When a confidential document is leaked and spread, the dark watermark is extracted from the spread document and used as an index to query the block chain; the ciphertext returned by the query is then decrypted, and the person responsible for the disclosure is confirmed. Compared with existing methods of tracing leaks of confidential documents, the dark watermark technology ensures that the embedded watermark information does not affect reading with the naked eye, and the block chain technology ensures that the watermark information and the information of the responsible person cannot be tampered with or deleted, which makes the tracing result accurate.
Disclosure of Invention
The application aims to provide a block chain-based security management and leakage evidence-obtaining method for confidential documents, which manages the printing and distribution of confidential documents securely and, when a confidential document is leaked and spread by copying, photographing and the like, confirms the responsible person accurately and credibly from the spread content.
To achieve this purpose, the application combines dark watermark embedding and extraction technology with block chain technology and provides a block chain-based security management and leakage evidence-obtaining method for confidential documents, as shown in fig. 1. Before a confidential document is printed and distributed, a different hash value (the digital watermark) is generated by hash calculation from the information of each distribution target object, and the information of the distribution target object is encrypted into a ciphertext. The digital watermark and the ciphertext are stored on the block chain platform as a key-value pair, in which the digital watermark is the key and the ciphertext is the value. Different digital watermarks are embedded in the different copies of the document in the form of dark watermarks. When a confidential document is leaked and spread, the person responsible is traced as follows: the digital watermark is extracted from the spread document and used as a key to query the block chain, the corresponding value (which is encrypted data) is found, and the encrypted data is decrypted to confirm the information of the person responsible for the leaked document.
The method mainly comprises 4 modules, as shown in fig. 2: a digital watermark generating and embedding module, a watermark information on-chain evidence storage module, a digital watermark extraction module and a watermark information on-chain responsibility tracing module. The 4 modules are described below.
(1) Digital watermark generating and embedding module
When a confidential document is printed, the object to which the document is distributed must first be identified, for example a certain person of a certain department of a unit; this information is recorded as the text NameText. In addition, the text SuperiorText of the superior department to which the document belongs, the title text TitleText of the document and the text TimeText of the current time are taken into account. These texts are combined and a hash calculation is performed to obtain the hash value HashValue, as shown in the following formula:
Hash(NameText+SuperiorText+TitleText+TimeText)=HashValue
A hash calculation takes a character string of arbitrary length as input and converts it, through a hash algorithm, into a hash value of fixed length. The calculation is irreversible, that is, the input data cannot be inferred from the output, and the output is long enough that two different inputs producing the same hash value is practically impossible. In this method the text information of the distribution object is hashed with the SHA256 hash algorithm. The resulting hash value is a binary bit string 256 bits long, which is embedded into the confidential document at 1 bit per Chinese character, so that one hash value is embedded in every 256 consecutive Chinese characters.
While the confidential document is being printed, the HashValue is hidden in the printed paper document in a way that cannot be detected by the naked eye. If an electronic version such as a pdf document is required, the confidential document carrying the watermark information can likewise be generated through the print output path.
Assume that the image of a character immediately before printing is F and that the gray image obtained after printing and scanning is Fw (white and black pixels are represented by 0 and 1, respectively, and intermediate gray levels by numbers between 0 and 1). Fig. 3 shows a Chinese character and a partial enlargement of it before and after printing and scanning; it can be seen that the image printed on paper exhibits error diffusion.
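The hash step described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function name `make_watermark`, the UTF-8 encoding and the sample field values are assumptions, and the "+" separator follows the formula given above.

```python
import hashlib


def make_watermark(name_text: str, superior_text: str,
                   title_text: str, time_text: str) -> str:
    """Return SHA-256 of the combined fields as a 256-character string of 0/1."""
    # Assumption: fields are joined with "+" and encoded as UTF-8 before hashing.
    combined = "+".join([name_text, superior_text, title_text, time_text])
    digest = hashlib.sha256(combined.encode("utf-8")).digest()  # 32 bytes
    # Expand every byte to 8 bits, most significant bit first.
    return "".join(f"{byte:08b}" for byte in digest)


# Hypothetical sample values, used only to exercise the function.
bits = make_watermark("XYZ", "Superior", "Development plan for X",
                      "2020:08:01-09:10:10")
print(len(bits))  # 256 bits, one per Chinese character
```

Changing any field, for example the recipient name, yields an unrelated bit string, which is what lets copies with identical content carry different watermarks.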
Thus, the transition of the image from F to Fw caused by the print-scan process can be described approximately as a convolution:
$$F_w(x) = (K * F)(x)$$
where K is a kernel function that depends only on the printing process and not on the specific character, and x denotes an image pixel. Let I be a character image region. Integrating both sides of the above formula over I gives:
$$\int_I F_w(x)\,dx = \int_I K(x)\,dx \cdot \int_I F(x)\,dx$$
where $\int_I F(x)\,dx$ is the number of black dots contained in the original character image, $\int_I F_w(x)\,dx$ can be regarded approximately as the number of black dots contained in the scanned image, and $\int_I K(x)\,dx$ is a constant independent of the character image. The above formula therefore shows that the numbers of black dots contained in a character image before and after printing and scanning are linearly related. Let $A$ and $A_w$ be the average numbers of black dots contained in each character image before and after printing and scanning, respectively. Since the same constant relates $A_w$ to $A$, the following formula is obtained:
$$\frac{\int_I F_w(x)\,dx}{A_w} = \frac{\int_I F(x)\,dx}{A}$$
Thus an invariant of the print-scan process is found: the ratio of the number of black dots contained in each character to the average number of black dots over all characters is the same before and after printing and scanning.
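The invariant can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: characters are reduced to 1-D rows of pixels, the blur kernel values are invented, and a full discrete convolution stands in for the print-scan process.

```python
def convolve_full(signal, kernel):
    """Full discrete 1-D convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Two hypothetical "characters" as rows of pixels (1 = black, 0 = white).
char_a = [0, 1, 1, 1, 0, 0, 1, 0]
char_b = [1, 1, 0, 1, 1, 1, 1, 0]
# Hypothetical kernel K; its sum (0.9) models overall ink loss in print-scan.
kernel = [0.2, 0.5, 0.2]

# Black-dot "mass" of each character before and after the simulated print-scan.
mass_before = [sum(char_a), sum(char_b)]
mass_after = [sum(convolve_full(c, kernel)) for c in (char_a, char_b)]

# Ratio of each character's mass to the average mass, before and after.
mean_before = sum(mass_before) / len(mass_before)
mean_after = sum(mass_after) / len(mass_after)
ratios_before = [v / mean_before for v in mass_before]
ratios_after = [v / mean_after for v in mass_after]

print(ratios_before)
print(ratios_after)
```

Both lists agree up to floating-point error, because convolution scales every character's mass by the same factor, the sum of the kernel, and that factor cancels in the ratio.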
When a watermark is embedded in a specific confidential document, the dot-matrix image data of all characters in the document are first obtained and segmented as a binary text image, and the characters are divided into two groups: an embedding part A and an adjusting part B. Let the numbers of black dots contained in the character images of groups A and B be
$$\{x_i\}_{i=1}^{|A|}$$
and
$$\{y_i\}_{i=1}^{|B|}$$
respectively, where $|A|$ and $|B|$ denote the numbers of characters in the two groups. Then the average number of black dots over all character images is calculated:
$$m = \frac{\sum_{i=1}^{|A|} x_i + \sum_{i=1}^{|B|} y_i}{|A| + |B|}$$
The following steps are then performed:
1) Assume that the watermark information to be embedded is $\{w_i\}_{i=1}^{|A|}$. According to whether $w_i$ is 0 or 1, $x_i$ is modified to $x'_i$ such that $x'_i/m$ is the nearest even or odd multiple, respectively, of a selected step $K > 0$. Then the change in the number of black dots of each character in the embedding part A, $\Delta_i = x'_i - x_i$, is calculated, together with the sum of all changes:
$$\Delta = \sum_{i=1}^{|A|} \Delta_i$$
2) The number of black dots $y_i$ contained in each character of part B is modified to $y'_i$ so that
$$\sum_{i=1}^{|B|} (y'_i - y_i) = -\Delta$$
In this way the average number of black dots over all characters of the document is unchanged after steps 1) and 2). Steps 1) and 2) give the number of black dots to be flipped for each character, $\Delta x_i = x'_i - x_i$ and $\Delta y_i = y'_i - y_i$. White or black pixels along the edge of each character image are then flipped in the corresponding number, according to the sign of the flip count, thereby embedding the digital watermark information.
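Steps 1) and 2) can be sketched at the level of black-dot counts. The code below is a hedged illustration: the function name `embed_bits`, the sample counts and the choice to spread the compensation evenly over group B are assumptions, and the actual flipping of edge pixels is not modelled.

```python
def embed_bits(x_counts, y_counts, bits, step):
    """Odd-even quantization of black-dot counts.

    x_counts: black-dot counts of the embedding characters (group A)
    y_counts: black-dot counts of the adjusting characters (group B)
    bits:     watermark bits (0 or 1), one per character of group A
    step:     quantization step K > 0
    Returns the target counts (x_new, y_new).
    """
    if len(bits) != len(x_counts):
        raise ValueError("need exactly one bit per embedding character")
    total = sum(x_counts) + sum(y_counts)
    m = total / (len(x_counts) + len(y_counts))  # average black-dot count
    q = step * m                                 # quantization interval in dots
    x_new = []
    for x, w in zip(x_counts, bits):
        n = round(x / q)
        if n % 2 != w:
            # Wrong parity: move to the nearer neighbouring multiple.
            n += 1 if x / q >= n else -1
        x_new.append(round(n * q))
    delta = sum(x_new) - sum(x_counts)
    # Assumption: spread -delta over group B as evenly as possible so that
    # the total, and hence the average m, is unchanged.
    base, extra = divmod(-delta, len(y_counts))
    y_new = [y + base + (1 if j < extra else 0) for j, y in enumerate(y_counts)]
    return x_new, y_new


# Hypothetical counts for four embedding and four adjusting characters.
x_new, y_new = embed_bits([212, 187, 240, 199], [205, 220, 190, 231],
                          [1, 0, 1, 1], step=0.1)
print(x_new, y_new)
```

Because the target counts are rounded to whole dots, the sketch relies on the interval `step * m` being comfortably larger than one dot; a smaller step would need a finer treatment of rounding.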
(2) Watermark information on-chain evidence storage module
The watermark information stored on the chain as evidence comprises the digital watermark hash value HashValue and the encrypted identity information of the specific responsible person (namely, the data obtained by encrypting NameText+SuperiorText+TitleText+TimeText). As shown in fig. 4, the length of the identity-information string varies and depends strongly on the target object. Before being put on the chain, the identity information is encrypted with the public key of the forensic staff, which ensures that the data can be decrypted only with the forensic staff's private key. The forensic staff are usually security-department personnel who hold the private key and can confirm the person responsible for a disclosure. The hash value and the ciphertext form a key-value pair, which is stored in the block chain system by means of a transaction (Transaction). A transaction is a request initiated by a client to the block chain system, into which the key-value pair to be stored on the chain is written. When the block chain system processes the transaction, it verifies the transaction through the consensus mechanism and records it in the ledger, that is, stores the key-value pair in the block chain system.
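The key-value evidence record can be pictured with a toy hash-chained log. This sketch is a stand-in for illustration only: `ToyLedger` and its methods are hypothetical names, there is no network, consensus or Fabric API involved, and the ciphertext is treated as an opaque string.

```python
import hashlib
import json


class ToyLedger:
    """Append-only, hash-chained key-value log (illustrative, not Fabric)."""

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _digest(index, prev_hash, key, value):
        payload = json.dumps([index, prev_hash, key, value]).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def put(self, key, value):
        """Record one transaction: key = HashValue (hex), value = ciphertext."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        index = len(self.blocks)
        self.blocks.append({
            "index": index,
            "prev_hash": prev_hash,
            "key": key,
            "value": value,
            "hash": self._digest(index, prev_hash, key, value),
        })

    def get(self, key):
        """Return the stored ciphertext for a watermark, or None if absent."""
        for block in self.blocks:
            if block["key"] == key:
                return block["value"]
        return None

    def verify(self):
        """Recompute every block hash and check the links between blocks."""
        prev_hash = "0" * 64
        for block in self.blocks:
            expected = self._digest(block["index"], prev_hash,
                                    block["key"], block["value"])
            if block["prev_hash"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


ledger = ToyLedger()
ledger.put("5" * 64, "ciphertext-for-recipient-1")  # hypothetical records
ledger.put("a" * 64, "ciphertext-for-recipient-2")
print(ledger.get("a" * 64), ledger.verify())
```

Editing a stored value after the fact makes `verify()` fail, which mirrors in miniature why a chained structure makes silent deletion or tampering detectable.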
(3) Digital watermark extraction module
When a certain distributed confidential document is spread in modes of copying, photographing and the like, the forensics personnel of the security department perform digital watermark extraction operation on the intercepted confidential document.
The leaked confidential document embedded with watermark information is scanned to obtain a gray image. The characters are then segmented in the same way as during watermark embedding, and all characters are divided into an embedding part A and an adjusting part B. The numbers of black dots contained in the characters of A and B are calculated and denoted
$$\{x'_i\}_{i=1}^{|A|}$$
and
$$\{y'_i\}_{i=1}^{|B|}$$
respectively. Then the average number of black dots over all characters of the image is calculated:
$$m' = \frac{\sum_{i=1}^{|A|} x'_i + \sum_{i=1}^{|B|} y'_i}{|A| + |B|}$$
Finally, the watermark information is extracted by odd-even quantization: if $\mathrm{round}(x'_i/(K m'))$ is an even number, the watermark bit $w'_i$ of that character is taken to be 0; otherwise $w'_i$ is taken to be 1. Here K is the quantization step used in the watermark embedding process.
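The extraction rule can be sketched directly from the parity test above. The function name `extract_bits` and the sample counts are hypothetical, and the scaling factor 1.3 is an invented stand-in for the uniform gain of a print-scan pass.

```python
def extract_bits(x_counts, y_counts, step):
    """Recover one watermark bit per embedding character by parity.

    x_counts: measured black-dot counts of group A (embedding characters)
    y_counts: measured black-dot counts of group B (adjusting characters)
    step:     quantization step K used at embedding time
    """
    total = sum(x_counts) + sum(y_counts)
    m = total / (len(x_counts) + len(y_counts))  # average over all characters
    return [round(x / (step * m)) % 2 for x in x_counts]


# Hypothetical counts measured from a scanned page.
scanned_x = [232, 168, 232, 189]
scanned_y = [210, 224, 194, 235]
print(extract_bits(scanned_x, scanned_y, step=0.1))  # [1, 0, 1, 1]

# A uniform gain (here 1.3) on every count leaves the recovered bits unchanged,
# because each count is normalized by the average before quantization.
scaled_x = [1.3 * v for v in scanned_x]
scaled_y = [1.3 * v for v in scanned_y]
print(extract_bits(scaled_x, scaled_y, step=0.1))
```

Normalizing by the measured average is what ties extraction to the print-scan invariant: only the ratio of each character's count to the average needs to survive printing and scanning.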
(4) Watermark information on-chain responsibility tracing module
As shown in fig. 5, after the digital watermark has been extracted from a leaked confidential document, a query is performed in the block chain system with the hash value (the digital watermark) as input. If no result is found, either the watermark information of the document was never stored on the chain or the watermark was extracted incorrectly, and the process ends. If a value is returned, the query result (the ciphertext of the identity information of the person responsible for the confidential document) is decrypted with the private key of the forensic staff to obtain the specific identity information, and the person responsible for the confidential document is confirmed.
The advantages of the method are that it provides effective security management for confidential documents and allows the responsible person to be traced by technical means after a document has been leaked. The dark watermark embedding technology ensures that reading with the naked eye is not affected, and the block chain technology ensures that the watermark information and the information of the responsible person are stored as evidence safely and reliably. Evidence can be stored and obtained on any computing node, and the responsible person can be confirmed efficiently by extracting the watermark from the leaked document and obtaining the evidence on the chain. The method has good application prospects for confidentiality departments that manage large numbers of confidential documents and for units with high confidentiality requirements.
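The query-and-decrypt flow can be sketched as follows. All names here are hypothetical, the ledger is a plain dictionary standing in for the on-chain key-value store, and the key is assumed to be the extracted 256-bit string rendered as 64 hexadecimal characters, as the embodiment does later. The `placeholder_decrypt` function is only Base64 decoding, used as a visible stand-in for private-key decryption; it provides no secrecy.

```python
import base64
from typing import Callable, Mapping, Optional


def bits_to_hex(bits: str) -> str:
    """Render a 256-character 0/1 string as a 64-character hexadecimal key."""
    if len(bits) != 256 or set(bits) - {"0", "1"}:
        raise ValueError("expected 256 characters, each '0' or '1'")
    return format(int(bits, 2), "064x")


def trace_responsible_person(bits: str,
                             ledger: Mapping[str, str],
                             decrypt: Callable[[str], str]) -> Optional[str]:
    """Look up the watermark; return decrypted identity, or None if not found."""
    ciphertext = ledger.get(bits_to_hex(bits))
    if ciphertext is None:
        # Not stored on the chain, or the watermark was extracted incorrectly.
        return None
    return decrypt(ciphertext)


def placeholder_decrypt(ciphertext: str) -> str:
    """Stand-in for private-key decryption (Base64 only, NOT encryption)."""
    return base64.b64decode(ciphertext).decode("utf-8")


# Hypothetical ledger content with one stored watermark.
known_bits = "01" * 128
toy_chain = {
    bits_to_hex(known_bits):
        base64.b64encode(b"person Z, department Y, unit X").decode("ascii"),
}

print(trace_responsible_person(known_bits, toy_chain, placeholder_decrypt))
print(trace_responsible_person("10" * 128, toy_chain, placeholder_decrypt))
```

Passing the decryption routine in as a parameter keeps the private key with the forensic staff: the lookup itself only ever handles ciphertext.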
Drawings
FIG. 1 is a general framework diagram of the block chain-based security management and leakage evidence-obtaining method for confidential documents;
FIG. 2 is a block diagram of the components of the method of the present application;
FIG. 3 is a schematic diagram of the errors in a partial enlargement of a Chinese character;
FIG. 4 is a diagram illustrating the process of storing watermark information and identity information on the chain as evidence;
FIG. 5 is a diagram of the process of querying the watermark information on the chain to confirm the responsible person;
FIG. 6 is a diagram of the block chain platform, built on Fabric, for writing and querying watermark information.
Detailed Description
To better describe the block chain-based security management and leakage evidence-obtaining method for confidential documents, a specific embodiment of the present application is given below.
The method uses a block chain system to store the watermark information and the information of the person responsible for each document. Because the printing and distribution of confidential documents usually take place inside a single unit, the computers of that unit that are able to print and distribute documents are connected through a network to form the block chain system. Specifically, the block chain system is built on a Fabric consortium chain over a peer-to-peer distributed network in which all nodes are equal and there is no central node. The same Fabric software can run on every computer, and nodes are given different roles according to how the software is configured. The system contains 4 types of role nodes; a schematic diagram of the network connections among the 4 types of nodes is shown in fig. 6. In a specific implementation the layout can be adjusted to the number of printing computers, that is, the sizes of the Orderer node cluster and the Peer node cluster can be adjusted to the user scale. The functions of the CA node, the Orderer node and the Peer node can be provided directly by the modules that Fabric implements, but chain code for writing and querying watermark information must be implemented so that the information to be stored is kept permanently in the block chain system. The chain code can be called by a program on an external printing computer only after it has been deployed in the Fabric block chain system.
The information to be stored includes the name of the object to which the confidential document is distributed, for example person Z of department Y in unit X, recorded as the text NameText. In addition, the text SuperiorText of the superior department to which the document belongs, the title text TitleText of the document and the text TimeText of the current time are taken into account. These texts are combined, separated by the character "+", and the hash value HashValue is then calculated with the SHA256 hash algorithm. Suppose the text obtained by combining the distribution object information is "XYZ+Superior+development plan for X+2020:08:01-09:10:10"; the hash value obtained by applying the SHA256 hash algorithm to this text string is:
3a6198e74d3e297330f7590a62112656cbf2bdf2f84406e78f2f2f5221be37c1
To store the watermark information on the chain as evidence, an HTTP POST request is sent, through the interface provided by the chain code, to any Peer node of the Peer node cluster in the block chain system. The combined text of NameText, SuperiorText, TitleText and TimeText is encrypted with an asymmetric encryption algorithm, and the HashValue and the resulting ciphertext are passed as parameters. Before asymmetric encryption is performed, a key pair must be generated for the forensic staff. The generated public key is:
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCca8Bsa+LTRpVH6yLldcxYTam4eB+DBSkYS5bA2WzmOrQ20gAYntcVsKxdbfHQ/03ceUwFu+H+QbUOGeBszpd93YWaxU5iiHbOUoY9/CLHkQUc6r+xx1+Nx1s+7mmnH7+f/rxEwQjb1WA/n+ZuG/VPzIE5ekkXz2xNw22TqPjy7QIDAQAB
the generated private key is:
MIICdwIBADANBgkqhkiG9w0BAQEFAASCAmEwggJdAgEAAoGBAJxrwGxr4tNGlUfrIuV1zFhNqbh4H4MFKRhLlsDZbOY6tDY6ABie1xWwrFlt8dD/Tdx5TAW74f5BtTQZ4GzOl33dhZrFTmKIdvRShj38IseRBRzqv7HHX43HWz7uaacfv5/+vETBCNvVYD+f5m4b9U/MgTl6SRfPbE3DbZOo+PLtAgMBAAECgYEAhOMmQpuIqep/bJGIu6QB8No3yaOWktYDk17KHjnmUbCx5bKCIsg4dikw5BoO4gvj6KN7efnek19+sm8uAukjVfZdhevyalFUmjtIQTNLlvl3eGcSDsD+rrbmECoMzh+FONTYmvz7ow320H+shEeY+ge7rDTmhL3j0KY4+s71poECQQDnV9ENsFB5qWVx2zYWnqjmxBctZYexv4+TCAwVpZRjYukx42jQND7s0kwZx+HQnDSrc3Y4SxZt/S3dS05d2aTPAkEArRewO8WSi/wNJL2MnVOBFJQO/11j9qFeFr4icI7p6qz5dwyudEUSOZnreYVOVemmVCB23CTGGECodFQi2s3TgwJBANHAEU8z+QMVz2B3vIatu73fNJR4ZZuHb4mD1lEaG3wxBfWxliqP9C2MTmthiyA1QJvix+EqU1/OGXN2/8qftokCQAk/OMOA27nbOCCUeGUXxzgASWTf6q2NJef1NQXqvRkrMRFpfhD8d+r3XJvbwbnZiGfKbE+LL4KwQdAlhs9GXFOCQFYTGSNT4PQ7MVHpFLVQMkLr3owq/dpIsQPb/3002FtLzKOECOXy6UXBK9miSTNMsNDdIHMVAyB8IvOac5UpTds=
The text "XYZ+Superior+development plan for X+2020:08:01-09:10:10" is encrypted with the public key using the asymmetric encryption algorithm RSA, and the result is:
ScxCK7sP3Z15sKECIUPCgP5HVdRYYlD5bRlxCwIDa2Lx7bP7PFgTBcst83wGiilOL/0oyzGjldV3XtZxusSEh9tUfSmQ39k7+U04NjzUJRNrgCA5vUuvmIW75hNuWYQmu402EWlUmuzylWHzSlSA4u1XuLlArHkWpSOHlw0NNOQ=
The HashValue generated from the information of the responsible person and the ciphertext produced by asymmetric encryption are sent together, in a POST request for on-chain evidence storage, to any Peer node in the block chain system.
After receiving the POST request, the Peer node calls the chain code, which stores the data with the HashValue as the key and the RSA ciphertext as the value; the data are then synchronized to all computing nodes in the block chain system through the consensus mechanism.
While the confidential document is being printed, the HashValue is embedded into the Chinese characters as a dark watermark: the pixels along the edge of each Chinese character are changed so that the character carries a dark watermark bit of 0 or 1, and one complete HashValue, that is, one digital watermark, is embedded in every 256 consecutive Chinese characters. By entering different distribution target object information at printing time, different dark watermark information can be embedded into confidential documents with the same content, and the documents, each carrying unique dark watermark information, are then distributed to the different target objects. A target object receiving a confidential document does not know whether a dark watermark is embedded in it.
When a confidential document is leaked and spread by one of the distribution objects, the specific person responsible for the leak cannot be determined from the document content, because copies of the document are held by different target objects. After intercepting the document, the security department scans it and extracts the digital watermark, which is a bit string 256 bits long and is converted into a hexadecimal character string 64 characters long. The forensic staff of the security department use the digital watermark as a key to query the block chain system. The result obtained is encrypted information; suppose the query result is as follows:
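The storage request can be sketched with the Python standard library. This is an assumption-laden illustration: the URL, the JSON field names `key` and `value`, and the function name `build_store_request` are all hypothetical, since the source does not specify the interface exposed by the chain code. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_store_request(peer_url: str, hash_value: str,
                        ciphertext_b64: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one key-value pair."""
    # Hypothetical payload layout: the watermark as key, the ciphertext as value.
    body = json.dumps({"key": hash_value, "value": ciphertext_b64}).encode("utf-8")
    return urllib.request.Request(
        peer_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and placeholder values.
request = build_store_request(
    "http://peer0.example.internal:8080/watermark",
    "5" * 64,
    "BASE64-CIPHERTEXT-PLACEHOLDER",
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted because the
# endpoint above does not exist.
```

On the receiving side, the chain code would write the pair to the ledger with the hash value as key, matching the key-value layout described above.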
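The assignment of one watermark bit per character can be sketched as a simple tiling. The function name `bits_for_characters` is hypothetical, and two points are assumptions rather than statements of the source: only characters that carry a bit are counted, and a trailing group shorter than one full watermark is left without bits.

```python
def bits_for_characters(num_chars: int, watermark_bits: str) -> list:
    """Return the bit carried by each character, repeating the watermark."""
    block = len(watermark_bits)            # 256 for a SHA-256 watermark
    # Assumption: a trailing partial block carries no watermark bits.
    usable = (num_chars // block) * block
    return [int(watermark_bits[i % block]) for i in range(usable)]


# Hypothetical 256-bit watermark and a page of 600 bit-carrying characters.
sample_watermark = "01" * 128
assigned = bits_for_characters(600, sample_watermark)
print(len(assigned))  # two complete copies of the watermark
```

Repeating the watermark means that any run of 256 consecutive bit-carrying characters aligned to a block boundary contains a complete copy, so a partial photograph can still carry a full watermark if it is long enough.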
ScxCK7sP3Z15sKECIUPCgP5HVdRYYlD5bR1xCwIDa2Lx7bP7PFgTBcst83wGiilOL/OoyzGjldV3XtZxusSEh9tUfSmQ39k7+U04NjzUJRNrgCA5vUuvmIW75hNuWYQmu402EWlUmuzylWHzSlSA4u1XuLlArHkWpSOHlw0NNOQ=
The forensic staff decrypt the ciphertext with their own private key using the asymmetric encryption algorithm RSA, and obtain the specific person responsible for the leaked confidential document, namely:
XYZ+Superior+development plan for X+2020:08:01-09:10:10
Through the above process, the application ensures by technical means that the dark watermark information embedded in each confidential document is unique and does not affect the reading of the characters with the naked eye. The dark watermark information and the information of the specific person responsible for the document are encrypted and then stored in the block chain system, which ensures that the information is stored safely and reliably. After a confidential document has been leaked and spread, the person responsible for the leak can be confirmed quickly and effectively by extracting the dark watermark information from the document and querying the chain.
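The public-key encryption and private-key decryption round trip used in this embodiment can be illustrated with textbook RSA on a single integer. This is a toy sketch only: the primes, the function names and the sample message are assumptions, there is no padding, and the modulus is trivially factorable, so it must not be used to protect real data. A practical implementation would use a vetted cryptographic library with a standard padding scheme; the source does not state which padding it uses.

```python
# Toy textbook RSA for illustration only. NOT secure.
P = 2**127 - 1            # a Mersenne prime
Q = 2**89 - 1             # a Mersenne prime
N = P * Q                 # public modulus
E = 65537                 # public exponent
# Private exponent: modular inverse of E (three-argument pow, Python 3.8+).
D = pow(E, -1, (P - 1) * (Q - 1))


def encrypt(plaintext: str) -> int:
    """Encrypt a short string with the public key (N, E)."""
    m = int.from_bytes(plaintext.encode("utf-8"), "big")
    if m >= N:
        raise ValueError("message too long for this toy modulus")
    return pow(m, E, N)


def decrypt(ciphertext: int) -> str:
    """Decrypt with the private exponent D."""
    m = pow(ciphertext, D, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


# Hypothetical short identity string (the toy modulus holds about 26 bytes).
secret = encrypt("XYZ+Superior")
print(decrypt(secret))
```

Only the holder of `D` can invert the operation, which is the property the method relies on when it stores the recipient's identity on a ledger that every node can read.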