CN114065308B - Gate-level hardware Trojan horse positioning method and system based on deep learning - Google Patents
Gate-level hardware Trojan horse positioning method and system based on deep learning
- Publication number
- CN114065308B (application CN202111412498.9A)
- Authority
- CN
- China
- Prior art keywords
- path
- module
- paths
- positioning
- sub
- Prior art date
- 2021-11-25
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 27
- 238000013135 deep learning Methods 0.000 title claims abstract description 18
- 238000012549 training Methods 0.000 claims abstract description 65
- 238000001514 detection method Methods 0.000 claims abstract description 50
- 238000012360 testing method Methods 0.000 claims abstract description 40
- 238000007781 pre-processing Methods 0.000 claims abstract description 7
- 238000010845 search algorithm Methods 0.000 claims abstract description 7
- 238000010276 construction Methods 0.000 claims description 7
- 238000012163 sequencing technique Methods 0.000 claims description 6
- 238000004364 calculation method Methods 0.000 claims description 3
- 238000000605 extraction Methods 0.000 claims description 3
- 238000004806 packaging method and process Methods 0.000 claims description 3
- 230000004931 aggregating effect Effects 0.000 claims description 2
- 238000010586 diagram Methods 0.000 claims description 2
- 238000005259 measurement Methods 0.000 claims description 2
- 238000011156 evaluation Methods 0.000 abstract 1
- 230000004807 localization Effects 0.000 abstract 1
- 238000002372 labelling Methods 0.000 description 5
- 238000013461 design Methods 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 3
- 238000013527 convolutional neural network Methods 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 239000008358 core component Substances 0.000 description 1
- 238000013136 deep learning model Methods 0.000 description 1
- 238000012938 design process Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/70—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
- G06F21/71—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
- G06F30/32—Circuit design at the digital level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Geometry (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to a gate-level hardware Trojan positioning method and system based on deep learning. The method first obtains seven public gate-level netlist files to build a training set and a test set; during preprocessing, a depth-first search algorithm converts each netlist file into path statements, completing path generation. A TextCNN model for detection and positioning is then constructed and trained. The path set of the test set is input into the model to obtain a pre-detection result; the pre-detection result undergoes path division and construction of virtual positioning coordinates, yielding a short path set SL for positioning; finally, SL is input into the TextCNN model to obtain the positioning result P. The invention enables fast and efficient evaluation of the security of an integrated circuit and even allows threats to be found and targeted.
Description
Technical Field
The invention relates to the fields of computer hardware protection and system-on-chip security, and in particular to a gate-level hardware Trojan positioning method and system based on deep learning.
Background
Integrated circuits (ICs) are the core components of computer hardware, and their design and manufacturing processes are very complex. To reduce costs, many manufacturers outsource part of the IC manufacturing process to so-called third-party vendors, which undoubtedly introduces significant threats to hardware security. A hardware Trojan (HT) is a small piece of circuitry that an attacker inserts into the original IC layout to achieve some malicious purpose. An HT can be inserted at any stage of IC fabrication, and the security threats it poses include altering circuit functions, leaking information, and denying service. Current research on HT detection can be roughly divided into pre-silicon detection, performed before the IC chip is finished, and post-silicon detection, performed after it is finished. Pre-silicon detection clearly reduces cost more, striking a better balance between security and profit. Pre-silicon detection is mainly performed during the IC design stage, and the gate level is the last stage of design, so detecting HT at the gate level is very effective.
An IC design is divided into abstraction levels, ordered from high to low: system level, algorithm level, register-transfer level, gate level, and transistor level. Gate-level detection is a common static detection method that explores new Trojan detection approaches by analyzing the logic structure of the circuit from its gate-level netlist. The key to detecting HT at the gate level is obtaining the netlist file that describes this level, i.e. the gate-level netlist. A gate-level netlist describes the interconnections between circuit elements, which include logic gates and other elements at the same level of abstraction. To date, many efforts have been devoted to preventing and detecting HT at the gate level. The most common approach mines HT features from the gate-level netlist and feeds them into a deep learning model for feature learning, so that HT can be detected effectively. Numerous studies have achieved considerable results, but stopping at the detection stage does not truly resist HT: finding the specific location of an HT is a prerequisite for countering it more precisely, yet studies on locating HT remain very rare.
Disclosure of Invention
In view of the above, the present invention aims to provide a gate-level hardware Trojan positioning method and system based on deep learning that can locate hardware Trojans at the gate level.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A gate-level hardware Trojan positioning method based on deep learning comprises the following steps:
Step A: obtaining seven public gate-level netlist files and dividing the data set by the leave-one-out method to obtain a training set Tr and a test set Ts;
Step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in step A and, in combination with a depth-first search algorithm, obtaining the path set of the training set Tr and the path set of the test set Ts;
Step C: constructing and initializing a TextCNN model for detecting and locating HT, and training it on the path set of the training set Tr obtained in step B;
Step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C to obtain a pre-detection result;
Step E: performing path division on the pre-detection result obtained in step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step F: inputting the short path set SL obtained in step E into the TextCNN model trained in step D to obtain the positioning result P.
Further, the step B specifically includes the following steps:
Step B1: traversing the netlist file with a depth-first search algorithm, using the wire nets as intermediaries, to obtain a tree graph G representing the interconnection relations of the different logic gates;
Step B2: based on the tree graph G obtained in step B1, restoring the situation of the real circuit to obtain a number of unlabeled paths, and then combining these into the unlabeled path set of the netlist;
Step B3: performing steps B1 and B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in step A to finally obtain the unlabeled path sets of the training set Tr and the test set Ts;
Step B4: based on the information in the gate-level netlists of the training set Tr and the test set Ts obtained in step A, assigning labels to the unlabeled paths obtained in step B3 to obtain the labeled path sets of the training set Tr and the test set Ts.
Further, the step C specifically includes:
Step C1: path set of training set Tr obtained in step B Generating a vocabulary for TextCNN model extraction features;
step C2: constructing and initializing TextCNN models;
step C3: path set based on training set Tr obtained in step B The TextCNN model can learn the characteristics of the paths with Trojan and the paths without Trojan respectively, and the training of the model is completed.
Further, the step C1 specifically includes:
Step C11: first converting the path set of the training set Tr obtained in step B into text content;
Step C12: based on the text content obtained in step C11, reading the words one by one and counting the frequency of each word;
Step C13: assigning a sequence number to each word in descending order of its frequency of occurrence, completing the vectorized representation of the words;
Step C14: packaging the words and their corresponding sequence numbers into a dictionary and writing it into a vocabulary file, completing the generation of the vocabulary.
Further, the step D specifically includes:
Step D1: based on the TextCNN model trained in step C, adding a save operation to the last fully connected layer of the model so that the pre-detection result can be recorded;
Step D2: inputting the path set of the test set into the TextCNN model trained in step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, where P_TP is the set of Trojan paths correctly identified as Trojan paths, P_FP is the set of Trojan-free paths misidentified as Trojan paths, P_TN is the set of Trojan-free paths correctly identified as Trojan-free, and P_FN is the set of Trojan paths misidentified as Trojan-free paths;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting the set P_TP of correctly identified Trojan paths as the pre-detection result.
Further, the step E specifically includes:
Step E1: numbering the paths in the pre-detection result obtained in step D to obtain an original long path set LL = {LL_i | i = 1, ..., TP}, where TP is the number of paths contained in the set P_TP of correctly identified Trojan paths obtained in step D2;
Step E2: setting the division length cutlen, dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, and setting virtual positioning coordinates for them;
Step E3: performing the operation of step E2 on every path in the original long path set LL to obtain the short path set SL and the virtual positioning coordinate set, completing the path division and the construction of virtual positioning coordinates.
Further, the step E2 specifically includes:
Step E21: setting the length cutlen of the division;
Step E22: for the long path LL_i, calculating the number of short paths num_i that can be generated after its division as num_i = ⌈length_i / cutlen⌉, where length_i denotes the length of the long path LL_i;
Step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, where j is the index of a short path, indicating that it is the j-th short path divided from the long path LL_i;
Step E24: according to the results of steps E22 and E23, setting a virtual positioning coordinate for each short path to record the possible Trojan position, where t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating step E24 to complete the setting of the virtual positioning coordinates of the num_i short paths.
Further, the step F specifically includes:
Step F1: inputting a path from the short path set SL into the TextCNN model trained in step D and predicting its class;
Step F2: if the prediction output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating steps F1 and F2 until all short paths have been processed, and outputting the final positioning result P to complete the positioning.
A gate-level hardware Trojan positioning system based on deep learning comprises:
a path generation module for generating path statements representing the circuit wiring, comprising a search sub-module, a temporary-path sub-module and a label sub-module; the gate-level netlist files of the input training set Tr and test set Ts are first preprocessed, the search sub-module performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates, the temporary-path sub-module then generates the unlabeled path sets of the training set Tr and the test set Ts, and finally the label sub-module assigns labels to the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
a model generation module for constructing and training the TextCNN model, comprising a vectorization sub-module, a model construction sub-module and a model training sub-module; the vectorization sub-module first generates a vocabulary file from the path set of the training set Tr produced by the label sub-module, the model construction sub-module constructs and initializes the TextCNN model, and finally the model training sub-module inputs the path set into the model and completes its training;
a pre-detection module for obtaining the pre-detection result of the test set Ts, comprising a storage sub-module, a pre-detection sub-module and an output sub-module; the storage sub-module first adds a save operation to the last fully connected layer of the TextCNN model built by the model construction sub-module so that the pre-detection result can be recorded, the pre-detection sub-module then pre-detects the paths in the path set to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, and finally the output sub-module outputs the set P_TP of correctly identified Trojan paths as the pre-detection result;
a path division module for dividing the result paths output by the output sub-module into short paths and narrowing the positioning range, comprising a sequencing sub-module, a division sub-module and a virtual-coordinate sub-module; the paths in the pre-detection result P_TP output by the output sub-module are numbered by the sequencing sub-module and divided into several short paths by the division sub-module, and finally the virtual-coordinate sub-module sets a virtual positioning coordinate for each short path;
a positioning module for completing the positioning of the Trojan, comprising a loading sub-module and an output sub-module; the loading sub-module first loads the short paths into the TextCNN model trained by the model generation module, and after prediction the output sub-module selects the paths predicted as Trojan paths and outputs their corresponding virtual positioning coordinates, completing the positioning.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention realizes the detection of hardware Trojans by applying a convolutional neural network to text classification;
2. The invention converts the hardware Trojan detection problem into a binary classification problem, so that the convolutional neural network learns the contextual features of circuit path statements and autonomously discovers the characteristics of Trojan paths and Trojan-free paths for classification. On this basis, the positioning of hardware Trojans is explored: a path segmentation technique is applied to the positioning problem, and the positioning range of a hardware Trojan is narrowed by dividing a long path in the circuit into several short paths;
3. The invention realizes further positioning on the basis of detection, breaking away from the previous coarse-grained localization of hardware Trojans from integrated-circuit images; positioning is achieved at the gate level, so that hardware Trojans can be resisted more effectively from the design stage of the integrated circuit;
4. The invention can be used in an integrated-circuit security detection system to evaluate the security of an integrated circuit and even to find and target threats, so that designers can take countermeasures against them.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a system according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and examples.
Referring to FIG. 1, the invention provides a gate-level hardware Trojan positioning method based on deep learning, which comprises the following steps:
Step A: first obtaining seven public gate-level netlist files and dividing the data set by the leave-one-out method to obtain a training set Tr and a test set Ts;
and (B) step (B): preprocessing the gate-level netlist file of the training set Tr and the testing set Ts obtained in the step A, and combining a depth-first search algorithm to obtain a path set of the training set Tr and the testing set Ts AndCompleting generation of a path;
Step B1: traversing the netlist file by using a depth-first search algorithm, taking a wire net as an intermediary, and obtaining a tree graph G representing the interconnection relation of different logic gates;
step B2: based on the tree graph G obtained in the step B1, the condition of a real circuit can be restored, a plurality of non-label paths can be obtained, and then the non-label paths are combined into a non-label path set of the netlist;
Step B3: b1 and B2 are carried out on the gate-level netlist files of the training set Tr and the testing set Ts obtained in the step A, and finally a label-free path set of the training set Tr and the testing set Ts is obtained And
Step B4: based on the information of the gate-level netlists of the training set Tr and the testing set Ts obtained in the step A, the label-free path obtained in the step B3 is labeled, and a labeled path set of the training set Tr and the testing set Ts is obtainedAnd
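By way of illustration of steps B1–B4, the sketch below models the netlist as a directed graph whose edges follow the wire nets between logic-gate instances and enumerates root-to-leaf gate paths with a depth-first search; the gate names, the connectivity, and the Trojan-gate annotation are illustrative assumptions, and parsing the graph out of a real Verilog netlist is omitted.

```python
from collections import defaultdict

# Illustrative gate-level connectivity: gate instance -> gates driven through its output net.
# In practice this graph is parsed from the Verilog netlist (step B1).
graph = defaultdict(list, {
    "U1_NAND2": ["U3_XOR2", "U4_INV"],
    "U2_AND2":  ["U3_XOR2"],
    "U3_XOR2":  ["U5_DFF"],
    "U4_INV":   ["U5_DFF"],
})

def dfs_paths(graph, start):
    """Depth-first traversal from a start gate; each root-to-leaf walk becomes one path sentence."""
    paths, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        successors = graph.get(node, [])
        if not successors:
            paths.append(" ".join(path))          # unlabeled path statement (step B2)
            continue
        for nxt in successors:
            if nxt not in path:                   # avoid cycles through feedback nets
                stack.append((nxt, path + [nxt]))
    return paths

unlabeled = dfs_paths(graph, "U1_NAND2")
# Step B4: label a path 1 if it contains a known Trojan gate of the training netlist, else 0.
trojan_gates = {"U4_INV"}                         # assumed ground-truth annotation
labeled = [(p, int(any(g in trojan_gates for g in p.split()))) for p in unlabeled]
print(labeled)
```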
Step C: constructing and initializing a TextCNN model for detecting and locating HT and inputting the path set of the training set Tr obtained in step B, completing the construction and training of the model;
Step C1: generating a vocabulary from the path set of the training set Tr obtained in step B, for the TextCNN model to extract features;
Step C11: first converting the path set of the training set Tr obtained in step B into text content;
Step C12: based on the text content obtained in step C11, reading the words one by one and counting the frequency of each word;
Step C13: assigning a sequence number to each word in descending order of its frequency of occurrence, completing the vectorized representation of the words;
Step C14: packaging the words and their corresponding sequence numbers into a dictionary and writing it into a vocabulary file, completing the generation of the vocabulary.
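Steps C11–C14 can be sketched as follows: the path sentences are treated as plain text, word frequencies are counted, sequence numbers are assigned in descending order of frequency, and the resulting dictionary is written to a vocabulary file. The example paths and the reservation of index 0 for padding are assumptions.

```python
import json
from collections import Counter

# Path sentences from the training set Tr (illustrative content).
paths_tr = ["U1_NAND2 U3_XOR2 U5_DFF", "U1_NAND2 U4_INV U5_DFF", "U2_AND2 U3_XOR2 U5_DFF"]

# Steps C11-C12: treat the path set as text and count word frequencies.
freq = Counter(word for sentence in paths_tr for word in sentence.split())

# Step C13: assign sequence numbers from the most frequent word downward
# (index 0 is reserved here for padding, which is an implementation assumption).
vocab = {word: idx for idx, (word, _) in enumerate(freq.most_common(), start=1)}

# Step C14: package the dictionary into a vocabulary file.
with open("vocab.json", "w") as fh:
    json.dump(vocab, fh, indent=2)

# Vectorized representation of one path sentence.
print([vocab[w] for w in paths_tr[0].split()])
```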
Step C2: constructing and initializing the TextCNN model;
Step C3: based on the path set of the training set Tr obtained in step B, letting the TextCNN model learn the characteristics of Trojan paths and of Trojan-free paths respectively, completing the training of the model.
Step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C to obtain a pre-detection result;
Step D1: based on the TextCNN model trained in step C, adding a save operation to the last fully connected layer of the model so that the pre-detection result can be recorded;
Step D2: inputting the path set of the test set into the TextCNN model trained in step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, where P_TP is the set of Trojan paths correctly identified as Trojan paths, P_FP is the set of Trojan-free paths misidentified as Trojan paths, P_TN is the set of Trojan-free paths correctly identified as Trojan-free, and P_FN is the set of Trojan paths misidentified as Trojan-free paths;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting only the set P_TP of correctly identified Trojan paths as the pre-detection result for the subsequent positioning.
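Steps D1–D3 amount to recording the model's predictions on the test paths and partitioning them into the four sets of the preliminary detection result, of which only P_TP is carried forward. A minimal sketch, assuming the true labels and the model predictions are already available as lists:

```python
def partition_predictions(paths, labels, preds):
    """Split test paths into {P_TP, P_FP, P_TN, P_FN} from true labels and model predictions
    (1 = Trojan path, 0 = Trojan-free path)."""
    result = {"P_TP": [], "P_FP": [], "P_TN": [], "P_FN": []}
    for path, label, pred in zip(paths, labels, preds):
        if pred == 1 and label == 1:
            result["P_TP"].append(path)      # Trojan path correctly identified
        elif pred == 1 and label == 0:
            result["P_FP"].append(path)      # Trojan-free path misidentified as Trojan
        elif pred == 0 and label == 0:
            result["P_TN"].append(path)      # Trojan-free path correctly identified
        else:
            result["P_FN"].append(path)      # Trojan path missed by the model
    return result

sets = partition_predictions(
    ["U1 U3 U5", "U2 U3 U5", "U1 U4 U5"], labels=[1, 0, 1], preds=[1, 0, 0]
)
pre_detection_result = sets["P_TP"]          # step D3: only P_TP is used for positioning
print(pre_detection_result)
```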
Step E: performing path division on the pre-detection result obtained in step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step E1: numbering the paths in the pre-detection result obtained in step D to obtain an original long path set LL = {LL_i | i = 1, ..., TP}, where TP is the number of paths contained in the set P_TP of correctly identified Trojan paths obtained in step D2;
Step E2: setting the division length cutlen, dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, and setting virtual positioning coordinates for them;
Step E21: setting the division length cutlen;
Step E22: for the long path LL_i, calculating the number of short paths num_i that can be generated after its division as num_i = ⌈length_i / cutlen⌉, i.e. length_i divided by cutlen and rounded up, where length_i denotes the length of the long path LL_i;
Step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, where j is the index of a short path, indicating that it is the j-th short path divided from the long path LL_i;
Step E24: according to the results of steps E22 and E23, setting a virtual positioning coordinate for each short path to record the possible Trojan position, where t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating step E24 to complete the setting of the virtual positioning coordinates of the num_i short paths.
Step E3: performing the operation of step E2 on every path in the original long path set LL to obtain the short path set SL and the virtual positioning coordinate set, completing the path division and the construction of virtual positioning coordinates.
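Steps E1–E3 divide each correctly detected long path into short paths of at most cutlen gates and attach a virtual positioning coordinate to each piece. The coordinate formula itself is not reproduced in this text, so the sketch below uses the pair (path number i, division number j) as a stand-in coordinate; this coordinate scheme is an assumption made purely for illustration.

```python
import math

def divide_paths(long_paths, cutlen=4):
    """Divide each long path LL_i into num_i = ceil(length_i / cutlen) short paths of at most
    cutlen logic gates and attach a virtual positioning coordinate to each short path."""
    short_paths, coords = [], []
    for i, long_path in enumerate(long_paths, start=1):    # step E1: number the paths
        gates = long_path.split()
        num_i = math.ceil(len(gates) / cutlen)              # step E22
        for j in range(1, num_i + 1):                       # steps E23-E25
            piece = " ".join(gates[(j - 1) * cutlen : j * cutlen])
            short_paths.append(piece)
            coords.append((i, j))    # stand-in for the virtual coordinate of the j-th division
    return short_paths, coords

SL, SL_coords = divide_paths(["U1 U2 U3 U4 U5 U6 U7 U8 U9", "U10 U11 U12"], cutlen=4)
print(list(zip(SL, SL_coords)))
```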
Step F: inputting the short path set SL obtained in step E into the TextCNN model trained in step D to obtain the positioning result P.
Step F1: inputting a path from the short path set SL into the TextCNN model trained in step D and predicting its class;
Step F2: if the prediction output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating steps F1 and F2 until all short paths have been processed, and outputting the final positioning result P to complete the positioning.
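Step F then classifies every short path and keeps the virtual coordinates of the paths predicted as Trojan. The sketch below reuses the TextCNN model and vocabulary from the earlier sketches, so the same assumptions (padding index 0, fixed input length) apply.

```python
import torch

def locate_trojans(model, vocab, short_paths, coords, max_len=50):
    """Run the trained TextCNN over each short path; collect the virtual coordinates
    of paths predicted as Trojan (class 1) into the positioning result P."""
    model.eval()
    P = []
    with torch.no_grad():
        for path, coord in zip(short_paths, coords):
            ids = [vocab.get(w, 0) for w in path.split()][:max_len]
            ids += [0] * (max_len - len(ids))                # pad to a fixed length
            logits = model(torch.tensor([ids]))
            if logits.argmax(dim=1).item() == 1:             # predicted as a Trojan path
                P.append(coord)                              # step F2: record the coordinate
    return P                                                 # step F3: final positioning result

# P = locate_trojans(model, vocab, SL, SL_coords)
```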
The invention also provides a gate-level hardware Trojan positioning system based on deep learning, comprising:
a path generation module for generating path statements representing the circuit wiring, comprising a search sub-module, a temporary-path sub-module and a label sub-module; the gate-level netlist files of the input training set Tr and test set Ts are first preprocessed, the search sub-module performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates, the temporary-path sub-module then generates the unlabeled path sets of the training set Tr and the test set Ts, and finally the label sub-module assigns labels to the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
a model generation module for constructing and training the TextCNN model, comprising a vectorization sub-module, a model construction sub-module and a model training sub-module; the vectorization sub-module first generates a vocabulary file from the path set of the training set Tr produced by the label sub-module, the model construction sub-module constructs and initializes the TextCNN model, and finally the model training sub-module inputs the path set into the model and completes its training;
a pre-detection module for obtaining the pre-detection result of the test set Ts, comprising a storage sub-module, a pre-detection sub-module and an output sub-module; the storage sub-module first adds a save operation to the last fully connected layer of the TextCNN model built by the model construction sub-module so that the pre-detection result can be recorded, the pre-detection sub-module then pre-detects the paths in the path set to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, and finally the output sub-module outputs the set P_TP of correctly identified Trojan paths as the pre-detection result;
a path division module for dividing the result paths output by the output sub-module into short paths and narrowing the positioning range, comprising a sequencing sub-module, a division sub-module and a virtual-coordinate sub-module; the paths in the pre-detection result P_TP output by the output sub-module are numbered by the sequencing sub-module and divided into several short paths by the division sub-module, and finally the virtual-coordinate sub-module sets a virtual positioning coordinate for each short path;
a positioning module for completing the positioning of the Trojan, comprising a loading sub-module and an output sub-module; the loading sub-module first loads the short paths into the TextCNN model trained by the model generation module, and after prediction the output sub-module selects the paths predicted as Trojan paths and outputs their corresponding virtual positioning coordinates, completing the positioning.
The foregoing description is only of the preferred embodiments of the invention, and all changes and modifications that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Claims (9)
1. A gate-level hardware Trojan positioning method based on deep learning, characterized by comprising the following steps:
Step A: obtaining seven public gate-level netlist files and dividing the data set by the leave-one-out method to obtain a training set Tr and a test set Ts;
Step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in step A and, in combination with a depth-first search algorithm, obtaining the path set of the training set Tr and the path set of the test set Ts;
Step C: constructing and initializing a TextCNN model for detecting and locating HT, and training it on the path set of the training set Tr obtained in step B;
Step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C to obtain a pre-detection result;
Step E: performing path division on the pre-detection result obtained in step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step F: inputting the short path set SL obtained in step E into the TextCNN model trained in step D to obtain the positioning result P.
2. The gate-level hardware Trojan positioning method based on deep learning according to claim 1, wherein step B specifically comprises:
Step B1: traversing the netlist file with a depth-first search algorithm, using the wire nets as intermediaries, to obtain a tree graph G representing the interconnection relations of the different logic gates;
Step B2: based on the tree graph G obtained in step B1, restoring the situation of the real circuit to obtain a number of unlabeled paths, and then combining these into the unlabeled path set of the netlist;
Step B3: performing steps B1 and B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in step A to finally obtain the unlabeled path sets of the training set Tr and the test set Ts;
Step B4: based on the information in the gate-level netlists of the training set Tr and the test set Ts obtained in step A, assigning labels to the unlabeled paths obtained in step B3 to obtain the labeled path sets of the training set Tr and the test set Ts.
3. The gate-level hardware Trojan positioning method based on deep learning according to claim 1, wherein step C specifically comprises:
Step C1: generating a vocabulary from the path set of the training set Tr obtained in step B, for the TextCNN model to extract features;
Step C2: constructing and initializing the TextCNN model;
Step C3: based on the path set of the training set Tr obtained in step B, letting the TextCNN model learn the characteristics of Trojan paths and of Trojan-free paths respectively, completing the training of the model.
4. The gate-level hardware Trojan positioning method based on deep learning according to claim 3, wherein step C1 specifically comprises:
Step C11: first converting the path set of the training set Tr obtained in step B into text content;
Step C12: based on the text content obtained in step C11, reading the words one by one and counting the frequency of each word;
Step C13: assigning a sequence number to each word in descending order of its frequency of occurrence, completing the vectorized representation of the words;
Step C14: packaging the words and their corresponding sequence numbers into a dictionary and writing it into a vocabulary file, completing the generation of the vocabulary.
5. The gate-level hardware Trojan positioning method based on deep learning according to claim 1, wherein step D specifically comprises:
Step D1: based on the TextCNN model trained in step C, adding a save operation to the last fully connected layer of the model so that the pre-detection result can be recorded;
Step D2: inputting the path set of the test set into the TextCNN model trained in step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, where P_TP is the set of Trojan paths correctly identified as Trojan paths, P_FP is the set of Trojan-free paths misidentified as Trojan paths, P_TN is the set of Trojan-free paths correctly identified as Trojan-free, and P_FN is the set of Trojan paths misidentified as Trojan-free paths;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting the set P_TP of correctly identified Trojan paths as the pre-detection result.
6. The gate-level hardware Trojan positioning method based on deep learning according to claim 1, wherein step E specifically comprises:
Step E1: numbering the paths in the pre-detection result obtained in step D to obtain an original long path set LL = {LL_i | i = 1, ..., TP}, where TP is the number of paths contained in the set P_TP of correctly identified Trojan paths obtained in step D2;
Step E2: setting the division length cutlen, dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, and setting virtual positioning coordinates for them;
Step E3: performing the operation of step E2 on every path in the original long path set LL to obtain the short path set SL and the virtual positioning coordinate set, completing the path division and the construction of virtual positioning coordinates.
7. The gate-level hardware Trojan positioning method based on deep learning according to claim 6, wherein step E2 specifically comprises:
Step E21: setting the division length cutlen;
Step E22: for the long path LL_i, calculating the number of short paths num_i that can be generated after its division as num_i = ⌈length_i / cutlen⌉, where length_i denotes the length of the long path LL_i;
Step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain several short paths, where j is the index of a short path, indicating that it is the j-th short path divided from the long path LL_i;
Step E24: according to the results of steps E22 and E23, setting a virtual positioning coordinate for each short path to record the possible Trojan position, where t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating step E24 to complete the setting of the virtual positioning coordinates of the num_i short paths.
8. The gate-level hardware Trojan positioning method based on deep learning according to claim 1, wherein step F specifically comprises:
Step F1: inputting a path from the short path set SL into the TextCNN model trained in step D and predicting its class;
Step F2: if the prediction output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating steps F1 and F2 until all short paths have been processed, and outputting the final positioning result P to complete the positioning.
9. A gate-level hardware Trojan positioning system based on deep learning, comprising:
a path generation module, for generating path statements representing the circuit wiring, comprising a search sub-module, a temporary-path sub-module and a label sub-module; the gate-level netlist files of the input training set Tr and test set Ts are first preprocessed, the search sub-module performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates, the temporary-path sub-module then generates the unlabeled path sets of the training set Tr and the test set Ts, and finally the label sub-module assigns labels to the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
a model generation module, for constructing and training the TextCNN model, comprising a vectorization sub-module, a model construction sub-module and a model training sub-module; the vectorization sub-module first generates a vocabulary file from the path set of the training set Tr produced by the label sub-module, the model construction sub-module constructs and initializes the TextCNN model, and finally the model training sub-module inputs the path set into the model and completes its training;
a pre-detection module, for obtaining the pre-detection result of the test set Ts, comprising a storage sub-module, a pre-detection sub-module and an output sub-module; the storage sub-module first adds a save operation to the last fully connected layer of the TextCNN model built by the model construction sub-module so that the pre-detection result can be recorded, the pre-detection sub-module then pre-detects the paths in the path set to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, and finally the output sub-module outputs the set P_TP of correctly identified Trojan paths as the pre-detection result;
a path division module, for dividing the result paths output by the output sub-module into short paths and narrowing the positioning range, comprising a sequencing sub-module, a division sub-module and a virtual-coordinate sub-module; the paths in the pre-detection result P_TP output by the output sub-module are numbered by the sequencing sub-module and divided into several short paths by the division sub-module, and finally the virtual-coordinate sub-module sets a virtual positioning coordinate for each short path;
a positioning module, for completing the positioning of the Trojan, comprising a loading sub-module and an output sub-module; the loading sub-module first loads the short paths into the TextCNN model trained by the model generation module, and after prediction the output sub-module selects the paths predicted as Trojan paths and outputs their corresponding virtual positioning coordinates, completing the positioning.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111412498.9A CN114065308B (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111412498.9A CN114065308B (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114065308A CN114065308A (en) | 2022-02-18 |
CN114065308B true CN114065308B (en) | 2024-08-02 |
Family
ID=80276358
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111412498.9A Active CN114065308B (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114065308B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109684834A (en) * | 2018-12-21 | 2019-04-26 | 福州大学 | A kind of gate leve hardware Trojan horse recognition method based on XGBoost |
CN113486347A (en) * | 2021-06-30 | 2021-10-08 | 福州大学 | Deep learning hardware Trojan horse detection method based on semantic understanding |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190272375A1 (en) * | 2019-03-28 | 2019-09-05 | Intel Corporation | Trust model for malware classification |
CN113591084B (en) * | 2021-07-26 | 2023-08-04 | 福州大学 | Transformer malicious chip identification method and system based on circuit path statement |
- 2021-11-25: CN application CN202111412498.9A filed; granted as patent CN114065308B (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109684834A (en) * | 2018-12-21 | 2019-04-26 | 福州大学 | A kind of gate leve hardware Trojan horse recognition method based on XGBoost |
CN113486347A (en) * | 2021-06-30 | 2021-10-08 | 福州大学 | Deep learning hardware Trojan horse detection method based on semantic understanding |
Also Published As
Publication number | Publication date |
---|---|
CN114065308A (en) | 2022-02-18 |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | GR01 | Patent grant | 