CN119004443B - A method for identifying and protecting against attackable data in the kernel file system - Google Patents

A method for identifying and protecting against attackable data in the kernel file system

Info

Publication number
CN119004443B
Authority
CN
China
Prior art keywords
data
file
kernel
attackable
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411021889.1A
Other languages
Chinese (zh)
Other versions
CN119004443A (en)
Inventor
申文博
周金梦
胡嘉懿
潘子曰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202411021889.1A
Publication of CN119004443A
Application granted
Publication of CN119004443B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6281 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database at program execution time, where the protection is within the operating system
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • G06F21/80 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in storage media based on magnetic or optical technology, e.g. disks with sectors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/545 Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Storage Device Security (AREA)

Abstract

This invention discloses a method for identifying and protecting against attackable data in a kernel file system, comprising: compiling the source code of the kernel file system into LLVM IR files; scanning the LLVM IR files to obtain core data of different categories, including file metadata, file content, and file read/write direction; tracing the data flow propagation and control flow propagation of the core data based on the LLVM IR files to obtain potentially attackable non-control data; dynamically verifying the potentially attackable non-control data to determine whether it is in fact attackable data; and protecting the data on disk that the attackable data ultimately affects. The invention is an automatic analysis tool for the attack surface created by attackable non-control data in the Linux kernel file system. It can automatically identify and use potentially attackable data based on semantic information, simulate the attack process, verify the validity of the data, and identify the complete attack surface.

Description

Method for identifying and protecting against attackable data in a kernel file system
Technical Field
The invention belongs to the field of operating system kernel protection, and in particular relates to a method for identifying and protecting against attackable data in a kernel file system.
Background
The Linux kernel is the foundation on which the operating system runs and is used very widely, in servers, personal computers, smartphones, and embedded systems. Because the security of the Linux kernel bears directly on the security of a large number of devices and systems, attacks on and defenses of operating system kernels have co-evolved over the past decades. Attack strategies have evolved from the earliest kernel code injection attacks, to code reuse attacks on control data (return addresses and function pointers), to the more recent non-control data attacks. Because various protection techniques have been proposed and deployed against kernel code injection and code reuse attacks, it has become difficult for an attacker to corrupt code and control data in the kernel, so attackers have turned to non-control data.
However, non-control data attacks rely on an understanding of program semantics, and considerable room for research remains. For many years, attackers have achieved privilege escalation by tampering with various attackable non-control data in the kernel, such as page tables, credential data structures, and usermode-helper load paths. These sensitive data were selected manually based on researchers' experience, and no systematic approach exists for identifying potentially sensitive data. Security researchers, for their part, have proposed various protection techniques to monitor and protect such sensitive data. Kenali, a research work targeting the kernel, identifies sensitive non-control data using heuristics based on error codes, but it captures only part of the sensitive data related to permission checks and does not study their impact or exploitability. Many unknown non-control data remain that, once tampered with, can lead to serious consequences such as privilege escalation. A typical case is the Dirty COW vulnerability (CVE-2016-5195), which exploits a race condition to manipulate the state of non-control data and write to a read-only file, resulting in privilege escalation. The more recently disclosed Dirty Pipe vulnerability (CVE-2022-0847) exploits uninitialized non-control data for privilege escalation.
However, these important non-control data were never identified before the real-world attacks occurred. Systematic research on the exploitability of non-control data in the Linux kernel is lacking, so the attack surface it creates remains unknown, and there is no corresponding protection against attacks that may occur at any time. Unlike control data, which has explicit semantics (it represents the execution logic of program jumps), non-control data has diverse semantics, and the semantics of a datum determine whether it can lead to a privilege escalation attack. However, the kernel provides no documentation on the semantics of non-control data, and kernel code changes continuously, so recording the use cases of all non-control data is almost impossible.
Therefore, an identification method is needed that analyzes all data that can serve as attack targets, summarizes the patterns of such data, and provides a protection scheme that effectively resists attacks on non-control data in the file system.
Disclosure of Invention
In view of the shortcomings of the prior art, the embodiments of the application aim to provide a method for identifying and protecting against attackable data in a kernel file system.
According to a first aspect of an embodiment of the present application, there is provided a method for identifying and protecting against attackable data in a kernel file system, including:
compiling the source code of the kernel file system into LLVM IR files, and scanning the LLVM IR files to obtain core data of different categories, wherein the core data comprise file metadata, file content, and file read/write direction;
tracking the data flow propagation and control flow propagation of the core data based on the LLVM IR files to obtain potentially attackable non-control data;
dynamically verifying the potentially attackable non-control data and judging whether it is attackable data;
and protecting the data on disk that the attackable data ultimately affects.
Further, the data structures holding permission data in the abstract file system layer are adopted as the file metadata, the data structures representing file content in the page cache layer and the generic block layer are adopted as the file content, and the read/write events uniformly recorded in the generic block layer when data exchange is triggered are adopted as the file read/write direction.
Further, the data flow propagation and control flow propagation of the core data are tracked through value flow analysis based on type-based access paths.
Further, for core data in the form of a flag variable, data flow propagation includes direct assignment and assignment after a logical operation, and in control flow propagation the flag variable influences the value of another flag variable through branch decisions.
Further, for core data in the form of a pointer reference, data flow propagation includes direct assignment and assignment after an arithmetic operation.
Further, dynamically verifying the potentially attackable non-control data and judging whether it is attackable data includes:
for the non-control data to be verified, a user-mode program performs a legitimate write to a file for which it has write permission, triggering the code in the kernel that uses the non-control data; the kernel instrumentation code automatically records the target value of the non-control data and stores it in an array maintained by the kernel;
the user-mode program attempts to write malicious content to a read-only file, triggering the code in the kernel that uses the non-control data; the kernel instrumentation code locates the data, retrieves the target value from the array, and automatically resets the data to the value recorded in the previous step;
the user-mode program re-reads the read-only file and checks whether its content has changed to the malicious content written in the previous step; if so, the data is attackable data.
According to a second aspect of an embodiment of the present application, there is provided an apparatus for identifying and protecting against attackable data in a kernel file system, including:
a scanning module for compiling the source code of the kernel file system into LLVM IR files and scanning the LLVM IR files to obtain core data of different categories, including file metadata, file content, and file read/write direction;
a tracking module for tracking the data flow propagation and control flow propagation of the core data based on the LLVM IR files to obtain potentially attackable non-control data;
a dynamic verification module for dynamically verifying the potentially attackable non-control data and judging whether it is attackable data;
and a protection module for protecting the data on disk that the attackable data ultimately affects.
According to a third aspect of embodiments of the present application, there is provided a computer program product comprising a computer program/instruction which, when executed by a processor, carries out the method according to the first aspect.
According to a fourth aspect of an embodiment of the present application, there is provided an electronic apparatus including:
One or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of the first aspect.
According to a fifth aspect of embodiments of the present application there is provided a computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the method according to the first aspect.
Compared with the prior art, the technical innovations of the invention are as follows:
1. Semantics-based classification of non-control data: the semantics of attackable non-control data in the file system are modeled, the data are divided into categories according to their semantics, and core data representing each category are selected.
2. Static tracking of potentially attackable data: starting from the core data representing each category, the propagation of the data is tracked statically to obtain all potentially attackable data.
3. Dynamic simulated-attack verification: the process of an attacker tampering with the value of a datum is simulated dynamically, and whether privilege escalation results is judged automatically, thereby verifying whether the datum is attackable data and obtaining the attack surface created by non-control data.
4. Protection for different categories of data: the file system layer in which the data usable for privilege escalation resides is analyzed, and the data are protected in different ways according to the semantic categories defined in the first step.
Accordingly, the invention has the following beneficial effects:
1) Automatic and systematic analysis of attackable non-control data: the invention is an automatic analysis tool for the attack surface created by attackable non-control data in the Linux kernel file system. It can automatically identify and use potentially attackable data according to semantic information, simulate the attack process, verify the validity of the data, and identify the complete attack surface.
2) Low-overhead protection: after obtaining the data that are effective for privilege escalation attacks, the invention designs different protection schemes for the three categories of data and, according to the file system layers through which the data propagate, selects the data that are ultimately affected as the protection target, so as to reduce performance overhead.
3) Compatibility with the drivers of various file systems and storage hardware: the invention automatically identifies attackable non-control data present in the implementations of various file systems, such as ext2, ext4, and FAT. It is also compatible with different hardware and can systematically analyze hardware drivers without additional manual effort.
4) Cross-version support: Linux kernel code is updated and iterated very quickly. The method can be ported to various versions of the Linux kernel without additional manual effort, and because the analysis is based on source code, it is compatible with all mainstream architectures (x86_64, aarch64, and so on).
Therefore, the invention has good prospects for popularization and application.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a flowchart illustrating a method for identifying and protecting against attackable data in a kernel file system, according to an exemplary embodiment.
FIG. 2 is a diagram showing the layers of the file system in the Linux kernel.
FIG. 3 is a flowchart illustrating static extraction of core data and tracking of its propagation, according to an exemplary embodiment.
FIG. 4 is a flowchart illustrating dynamic verification of potentially attackable non-control data, according to an exemplary embodiment.
FIG. 5 is a flowchart illustrating protection of attackable data, according to an exemplary embodiment.
FIG. 6 is a block diagram illustrating an apparatus for identifying and protecting against attackable data in a kernel file system, according to an exemplary embodiment.
FIG. 7 is a schematic diagram of an electronic device, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. Depending on the context, the term "if" as used herein may be interpreted as "upon", "when", or "in response to a determination".
FIG. 1 is a flowchart illustrating a method for identifying and protecting against attackable data in a kernel file system, according to an exemplary embodiment. As shown in FIG. 1, the method is applied to a terminal and may include the following steps:
Step S1: compiling the source code of the kernel file system (shown in FIG. 2) into LLVM IR files, and scanning the LLVM IR files to obtain core data of different categories, wherein the core data comprise file metadata, file content, and file read/write direction. FIG. 2 shows the layers of the file system in the Linux kernel and lists, at each layer, some of the dynamically verified data that can be used for privilege escalation attacks.
Specifically, as shown in FIG. 3, the kernel source code is compiled into LLVM IR files, which are convenient to analyze, and a static analysis tool is then run to scan specific layers of the file system for the non-control data that represent the different categories, i.e., the core data.
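As a rough illustration of the scanning step, the sketch below searches textual LLVM IR for `getelementptr` instructions that access selected struct fields. It is only a sketch under stated assumptions: the `CORE_FIELDS` table, its field indices, the function name `scan_ir`, and the sample IR are all hypothetical, and a real tool would work on the LLVM in-memory representation and not on regular expressions.

```python
import re

# Hypothetical table: (struct name, field index) -> core-data category.
# The indices are illustrative and do not match any real kernel build.
CORE_FIELDS = {
    ("inode", 0): "file metadata",
    ("page", 3): "file content",
    ("bio", 5): "read/write direction",
}

# Matches field accesses such as:
#   getelementptr inbounds %struct.bio, ptr %b, i32 0, i32 5
GEP_RE = re.compile(
    r"getelementptr\s+(?:inbounds\s+)?%struct\.(\w+),\s*ptr\s+%[\w.]+,"
    r"\s*i32\s+0,\s*i32\s+(\d+)"
)


def scan_ir(ir_text):
    """Return (line number, struct, field index, category) per core-data access."""
    hits = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        match = GEP_RE.search(line)
        if match is None:
            continue
        key = (match.group(1), int(match.group(2)))
        if key in CORE_FIELDS:
            hits.append((lineno, key[0], key[1], CORE_FIELDS[key]))
    return hits


# Hand-written sample IR, not produced by compiling the kernel.
SAMPLE_IR = """\
%1 = getelementptr inbounds %struct.inode, ptr %ino, i32 0, i32 0
%2 = load i16, ptr %1
%3 = getelementptr inbounds %struct.bio, ptr %b, i32 0, i32 5
%4 = getelementptr inbounds %struct.bio, ptr %b, i32 0, i32 1
"""

if __name__ == "__main__":
    for hit in scan_ir(SAMPLE_IR):
        print(hit)
```

Keying the table on struct type and field index lets the same scan run over every compiled file system implementation, since the table names only fields from the shared layers and nothing driver-specific.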
The core data are divided into three categories: file metadata, file content, and file read/write direction.
(1) File metadata. By tampering with the permission information in file metadata, an attacker can bypass the security checks at the top layer of the file system and obtain root privileges. As shown in FIG. 2, the abstract file system layers (e.g., the virtual file system layer and the generic block layer) hide the different implementations behind an abstract interface. The data structures holding permission data in the abstract file system layer are selected to represent file metadata.
(2) File content. File content is the user data stored in a file, and it exists in two places: on disk and in memory (the page cache). By modifying non-control data, an attacker can write directly to the content of a read-only file and thereby achieve privilege escalation, for example by modifying the content of /etc/passwd to add the attacker as a root user. The data structures representing file content are therefore selected in the page cache layer and the generic block layer, which, as shown in FIG. 2, are relatively independent of the different file system implementations and disk hardware drivers.
(3) Read/write direction. File metadata and content are stored on disk and loaded into memory when they need to be accessed. Read and write operations pass through the whole file system, travelling down from the top layer to the disk at the bottom to carry out the data exchange. The Linux kernel records data exchange events between disk and host uniformly in the generic block layer. By modifying the read/write flag bit, a read operation can be turned into a write operation, so that a read-only file is written. The read/write events that the generic block layer records uniformly when data exchange is triggered are therefore selected as core data.
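To make the read/write-direction category concrete, the toy model below shows why a single direction bit is security critical. It is a minimal sketch, not kernel code: the `Request` class, the `submit` function, and the `OP_READ`/`OP_WRITE` encoding are hypothetical stand-ins for a block-layer request.

```python
OP_READ = 0   # hypothetical encoding of the direction bit
OP_WRITE = 1


class Request:
    """Toy stand-in for a block-layer I/O request."""

    def __init__(self, op, block, buf):
        self.op = op        # read/write direction (the core data in this category)
        self.block = block  # target disk block number
        self.buf = buf      # memory buffer taking part in the exchange


def submit(disk, req):
    """Toy block layer: the direction of the exchange depends only on req.op."""
    if req.op & OP_WRITE:
        disk[req.block] = bytes(req.buf)   # memory -> disk
    else:
        req.buf[:] = disk[req.block]       # disk -> memory


if __name__ == "__main__":
    disk = {7: b"root:x:0:0"}
    # A read of a read-only file is submitted, but the flag is tampered with first.
    req = Request(OP_READ, 7, bytearray(b"evil:x:0:0"))
    req.op |= OP_WRITE
    submit(disk, req)
    print(disk[7])
```

Because no permission check sits below this point in the model, flipping the bit is enough to turn a permitted read into an unintended write.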
Step S2: tracking the data flow propagation and control flow propagation of the core data based on the LLVM IR files to obtain potentially attackable non-control data.
Specifically, as shown in FIG. 3, the core data are taken as the roots of propagation, and the propagation process is tracked to discover more attackable non-control data. The three categories of data propagate in different ways; by customizing the propagation rules, the data flow and control flow of the core data are tracked and more potentially attackable data are obtained. In code, the data structures of the three categories of core data take two forms:
(1) Flag variables. A flag variable is typically implemented as an integer whose value marks some state of the kernel. Each bit of a flag variable represents a different state, so only logical operations (AND, OR) are applied to it, and arithmetic operations (addition, subtraction, multiplication, division) are not involved. During data flow propagation, the propagation rule therefore covers only direct assignment and assignment after a logical operation. In control flow propagation, one flag variable may influence the value of another flag variable through a branch decision. For example, a specific bit of a flag variable representing permissions indicates that access is legitimate; a branch tests that bit and, if access is legitimate, sets the state of another flag variable to legitimate, and otherwise marks it as illegitimate.
(2) Pointer references. A pointer reference serves as an index for locating a specific region of memory. The information represented by the different bits of a flag variable is hard-coded in the kernel and does not change, so the values representing the different states are readily obtained. The value of a pointer reference, by contrast, changes dynamically, depending on where the target memory is allocated. The value of a pointer is unique and points to exactly one block of memory. Tracking the data flow of a pointer value involves arithmetic operations (e.g., aligning a pointer) and not logical operations, so the data flow propagation rule covers direct assignment and assignment after an arithmetic operation. The value of a pointer is also affected by control flow: in a branch, the value of the pointer is decided by the state of the flag variable in the branch condition, and that flag variable is then treated as newly found data to be propagated. This is because pointer variables do not normally appear in branch conditions; only flag variables appear there, to test the state of the kernel. Such a flag variable is propagated according to the first propagation rule (flag variable propagation).
The above propagation process is recursive and continues until no new data are found.
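The two propagation rules can be sketched as a worklist algorithm over a simplified instruction list. This is a minimal illustration, assuming a hypothetical tuple encoding of instructions (`assign`, `logic`, `arith`, `branch`) and a hypothetical `KINDS` table that labels each variable as a flag or a pointer; the real analysis runs on LLVM IR.

```python
from collections import deque


def propagate(roots, kinds, instrs):
    """Return every variable reachable from the roots under the two rules.

    instrs uses a hypothetical encoding:
      ("assign", dst, src)            dst = src
      ("logic",  dst, a, b)           dst = a AND/OR b
      ("arith",  dst, a, b)           dst = a +,-,*,/ b
      ("branch", cond, [dst, ...])    variables assigned under a branch on cond
    """
    found = set(roots)
    work = deque(roots)
    while work:
        var = work.popleft()
        for ins in instrs:
            new = []
            op = ins[0]
            if op == "assign" and ins[2] == var:
                new.append(ins[1])
            elif op == "logic" and kinds[var] == "flag" and var in ins[2:]:
                new.append(ins[1])          # flags flow through logical operations
            elif op == "arith" and kinds[var] == "ptr" and var in ins[2:]:
                new.append(ins[1])          # pointers flow through arithmetic
            elif op == "branch":
                _, cond, assigned = ins
                if kinds[var] == "flag" and cond == var:
                    # A flag in a condition influences flags set in the branch.
                    new.extend(d for d in assigned if kinds[d] == "flag")
                elif kinds[var] == "ptr" and var in assigned:
                    # The flag deciding a pointer's value becomes new data.
                    new.append(cond)
            for dst in new:
                if dst not in found:
                    found.add(dst)
                    work.append(dst)
    return found


# Hypothetical example program; all names are illustrative.
KINDS = {"perm": "flag", "mask": "flag", "mode": "flag", "ok": "flag",
         "cow": "flag", "sum": "flag", "one": "flag",
         "page": "ptr", "off": "ptr", "aligned": "ptr"}
INSTRS = [
    ("logic", "mode", "perm", "mask"),    # mode = perm & mask
    ("branch", "mode", ["ok"]),           # if (mode) ok = ...
    ("arith", "aligned", "page", "off"),  # aligned = page + off
    ("branch", "cow", ["page"]),          # if (cow) page = ...
    ("arith", "sum", "perm", "one"),      # arithmetic on a flag is not followed
]

if __name__ == "__main__":
    print(sorted(propagate(["perm", "page"], KINDS, INSTRS)))
```

In this example the flag `cow` is discovered only because it decides the value of the pointer `page`, which mirrors the rule that a condition flag guarding a pointer assignment becomes new data to propagate.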
The tracking process is likewise applied to the LLVM IR files. By relying on value flow analysis based on type-based access paths (as in "Practical Program Modularization with Type-Based Dependence Analysis" and "Statically Discovering High-Order Taint Style Vulnerabilities in OS Kernels"), it avoids costly pointer analysis.
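The idea of matching stores and loads by type-based access path, without resolving pointer aliases, can be sketched as follows. This is an illustrative simplification and not the algorithm of the cited works; the function `build_value_flow` and the sample variable names are hypothetical.

```python
from collections import defaultdict


def build_value_flow(stores, loads):
    """Connect each store to every load of the same (struct type, field).

    stores: list of (stored value, struct type, field name)
    loads:  list of (loaded result, struct type, field name)
    Returns value-flow edges (stored value, loaded result).
    """
    loads_by_path = defaultdict(list)
    for result, struct, field in loads:
        loads_by_path[(struct, field)].append(result)

    edges = []
    for value, struct, field in stores:
        # No points-to sets are computed: the access path alone decides the match.
        for result in loads_by_path[(struct, field)]:
            edges.append((value, result))
    return edges


if __name__ == "__main__":
    stores = [("flags_in", "vm_fault", "flags")]
    loads = [("f1", "vm_fault", "flags"), ("m1", "file", "f_mode")]
    print(build_value_flow(stores, loads))
```

The match over-approximates, since two unrelated objects of the same type share an access path; the dynamic verification in step S3 is what filters out candidates that are not actually attackable.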
Step S3: dynamically verifying the potentially attackable non-control data and judging whether it is attackable data.
Specifically, step S2 yields a set of potentially attackable non-control data. To avoid interference among data, the invention extracts one datum at a time from the set until all data have been verified. As shown in FIG. 4, instrumentation must be performed before dynamic verification. Verification consists of three steps and requires the LLVM-instrumented kernel and a user-mode program to work together. The invention instruments the locations of all identified data in the kernel, but verifies only one datum at a time, and the running kernel records and resets the value of that datum only. Selecting the datum to verify and recording or resetting it are implemented by a newly added system call; the user-mode program performs dynamic verification by invoking this system call to trigger the kernel instrumentation. Specifically, the invention defines a new system call whose parameters specify the datum to be verified and the operation to perform (record or reset). All data are numbered in advance during instrumentation, so the system call only needs to supply the number of the datum to be verified, and the operation is selected by an integer parameter. These parameters are used in steps S31 and S32:
S31: for the selected datum, the user-mode program performs a legitimate write to a file for which it has write permission, triggering the code in the kernel that uses the datum; the kernel instrumentation code automatically records the target value of the datum and stores it in an array maintained by the kernel.
S32: the user-mode program attempts to write malicious content to a read-only file, triggering the code in the kernel that uses the datum; the kernel instrumentation code locates the datum, retrieves the recorded value from the array, and automatically resets the datum to the value recorded in the previous step.
S33: the user-mode program re-reads the read-only file used in S32 and checks whether its content has changed to the malicious content written in the previous step; if so, the datum is attackable data and is added to the attackable data set.
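Steps S31 to S33 can be illustrated with a small user-space simulation. It is a sketch under stated assumptions: `ToyKernel`, `sys_verify`, the `RECORD`/`RESET` codes, and the two numbered candidate data are hypothetical, and the real mechanism is LLVM instrumentation inside the kernel driven by a custom system call.

```python
RECORD, RESET = 0, 1  # hypothetical operation codes for the custom system call


class ToyKernel:
    """Toy kernel with two instrumented candidate data in its write path."""

    def __init__(self):
        # name -> [content, writable]
        self.files = {"rw.txt": [b"", True], "ro.txt": [b"secret", False]}
        self.recorded = {}   # kernel-maintained array: data id -> recorded value
        self.target = None   # (data id, operation) selected via sys_verify

    def sys_verify(self, data_id, op):
        """Stand-in for the new system call: pick one datum and one operation."""
        self.target = (data_id, op)

    def _hook(self, data_id, value):
        """Instrumentation at a use site: record or reset only the selected datum."""
        if self.target is None or self.target[0] != data_id:
            return value
        if self.target[1] == RECORD:
            self.recorded[data_id] = value
            return value
        return self.recorded.get(data_id, value)

    def write(self, name, data):
        _, writable = self.files[name]
        allowed = self._hook(0, writable)    # candidate 0: permission decision
        length = self._hook(1, len(data))    # candidate 1: bookkeeping value
        if not allowed:
            return -1
        self.files[name][0] = data
        return length

    def read(self, name):
        return self.files[name][0]


def verify(data_id, payload=b"evil"):
    """Run S31-S33 for one candidate on a fresh kernel; True means attackable."""
    kernel = ToyKernel()
    kernel.sys_verify(data_id, RECORD)
    kernel.write("rw.txt", payload)           # S31: legitimate write, value recorded
    kernel.sys_verify(data_id, RESET)
    kernel.write("ro.txt", payload)           # S32: illegitimate write, value reset
    return kernel.read("ro.txt") == payload   # S33: did the read-only file change?


if __name__ == "__main__":
    print([i for i in (0, 1) if verify(i)])
```

Using a fresh `ToyKernel` for each candidate mirrors the one-datum-at-a-time rule described above, so a value reset for one candidate cannot mask or fake the result for another.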
After steps S31 to S33 have been executed for every datum in the set of potentially attackable non-control data, all data in the attackable data set are reported, including the data structure in which each datum is allocated and its specific offset within that structure. Modifying the value at that offset to the value recorded during verification causes a write to a read-only file and thereby a privilege escalation attack. The report thus gives the attack surface of non-control data in the kernel.
Step S4: protecting the data on disk that the attackable data ultimately affects.
As shown in FIG. 5, all attackable data are analyzed to obtain the data representing file content located at the bottommost layer of the kernel file system, namely the disk; this data comprises the page cache representing file content and the disk block data. Notably, for file metadata and file read/write direction, the attacker ultimately either bypasses permission checks to write the content of the target file or modifies the read/write direction to write file content. Protecting file content alone is therefore sufficient to resist non-control data attacks on the file system.
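The text does not spell out the protection mechanism, so the following is only one possible reading, sketched under that assumption: the final write to a protected block is checked at the lowest layer, independently of any upper-layer flag an attacker may have corrupted. The `GuardedDisk` class and its methods are hypothetical.

```python
class GuardedDisk:
    """Toy bottom layer that refuses writes to protected file-content blocks."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> content
        self.protected = set()       # blocks holding protected file content

    def protect(self, block_no):
        self.protected.add(block_no)

    def write_block(self, block_no, data):
        # The check relies only on state kept at this layer, so tampering with
        # upper-layer metadata or direction flags cannot bypass it.
        if block_no in self.protected:
            raise PermissionError(f"block {block_no} is write-protected")
        self.blocks[block_no] = data


if __name__ == "__main__":
    disk = GuardedDisk({7: b"root:x:0:0", 8: b"scratch"})
    disk.protect(7)
    try:
        disk.write_block(7, b"evil:x:0:0")
    except PermissionError as err:
        print("blocked:", err)
    disk.write_block(8, b"ok")
    print(disk.blocks)
```

Placing the check where the data is finally affected means one guard covers all three categories of attackable data, which is consistent with the stated goal of reducing performance overhead.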
The invention is the first to provide a systematic analysis of the non-control data that serve as attack targets in the kernel. It automatically identifies data that lead to privilege escalation attacks, summarizes the patterns of such data, and provides a corresponding protection scheme. The scope of the analysis is the file system in the kernel, because files play an important role in kernel security: once file access protection is broken, an attacker can easily launch a privilege escalation attack. In addition, the file system has clear layers, which makes it convenient to analyze the semantic information of each layer. The method proceeds as follows. Based on the multi-layer semantic structure of the file system, non-control data are divided into several categories representing different semantics. According to the characteristics of the data semantics, core non-control data representing each category are selected. The semantic propagation of these data is tracked statically to obtain all potentially attackable data in each category. Dynamic instrumentation then automatically monitors and simulates an attacker tampering with the data, verifying which data actually lead to privilege escalation attacks. The identified results include all non-control data exploited by current privilege escalation attacks, including Dirty COW (vm_fault->flags), Dirty Pipe (pipe_buffer->flags), and DirtyCred (file->f_mode). Finally, all attackable non-control data are reported, and different protection schemes are applied to the different categories of data.
Corresponding to the foregoing embodiment of the method for identifying and protecting against attackable data in a kernel file system, the application also provides an embodiment of an apparatus for identifying and protecting against attackable data in a kernel file system.
FIG. 6 is a block diagram of an apparatus for identifying and protecting against attackable data in a kernel file system, according to an exemplary embodiment. Referring to FIG. 6, the apparatus may include:
a scanning module 21, configured to compile the source code of the kernel file system into LLVM IR files and to scan the LLVM IR files to obtain different categories of core data, including file metadata, file content, and file read/write direction;
a tracking module 22, configured to trace the data flow propagation and control flow propagation of the core data based on the LLVM IR files, so as to obtain attackable non-control data;
a dynamic verification module 23, configured to dynamically verify the attackable non-control data and determine whether it is attackable data;
and a protection module 24, configured to protect the data on disk that is ultimately affected by the attackable data.
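As an illustration of what the tracking module 22 computes, the following is a minimal sketch in Python over a hypothetical toy instruction list; it is not the LLVM-based analysis of the application. The instruction format, the variable names (`open_flags`, `masked`, `can_write`) and the function `track_propagation` are assumptions made for the example. Variables are matched by name only, which is a simplification of a real value flow analysis. The sketch follows data flow edges (direct assignment, assignment after an operation) and control flow edges (a branch on one variable deciding a store to another) outward from a core datum.

```python
from collections import deque

# Hypothetical toy IR (not LLVM IR). Each instruction is a tuple:
#   ("assign", dst, src)         dst = src
#   ("binop", dst, lhs, rhs)     dst = lhs <op> rhs
#   ("branch_store", cond, dst)  if (cond) dst = <constant>
TOY_IR = [
    ("assign", "open_flags", "f_mode"),
    ("binop", "masked", "open_flags", "MASK"),
    ("branch_store", "masked", "can_write"),
    ("assign", "unrelated", "counter"),
]


def track_propagation(ir, core_data):
    """Return {variable: edge kind} for everything reachable from core_data."""
    edges = {}
    for ins in ir:
        if ins[0] == "assign":
            _, dst, src = ins
            edges.setdefault(src, []).append((dst, "data"))
        elif ins[0] == "binop":
            _, dst, lhs, rhs = ins
            for src in (lhs, rhs):
                edges.setdefault(src, []).append((dst, "data"))
        elif ins[0] == "branch_store":
            _, cond, dst = ins
            edges.setdefault(cond, []).append((dst, "control"))

    reached = {}
    seen = set(core_data)
    work = deque(core_data)
    while work:  # breadth-first walk over the propagation edges
        var = work.popleft()
        for dst, kind in edges.get(var, []):
            if dst not in seen:
                seen.add(dst)
                reached[dst] = kind
                work.append(dst)
    return reached


candidates = track_propagation(TOY_IR, ["f_mode"])
```

In this toy run, `candidates` contains `open_flags` and `masked` (reached through data flow) and `can_write` (reached through a branch), while `unrelated` is never reached. Every reached variable is only a candidate; whether it is actually attackable is left to the dynamic verification step.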
The specific manner in which the modules of the apparatus in the above embodiment perform their operations has been described in detail in the embodiments of the method and is not repeated here.
Since the apparatus embodiments essentially correspond to the method embodiments, refer to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement the application without creative effort.
Accordingly, the present application also provides a computer program product comprising a computer program/instructions which, when executed by a processor, implement the method for identifying and protecting against attackable data in a kernel file system as described above.
Correspondingly, the application further provides an electronic device comprising one or more processors and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method for identifying and protecting against attackable data in a kernel file system. FIG. 7 is a hardware structure diagram of a device with data processing capability on which the method for identifying and protecting against attackable data in a kernel file system is deployed, according to an embodiment of the application. In addition to the processor, memory, and network interface shown in FIG. 7, such a device generally includes other hardware according to its actual function, which is not described here.
Correspondingly, the application also provides a computer readable storage medium on which computer instructions are stored; when executed by a processor, the instructions implement the method for identifying and protecting against attackable data in a kernel file system. The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer readable storage medium may also be an external storage device, such as a plug-in hard disk, a smart media card (SMC), an SD card, or a flash card provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any device with data processing capability. The computer readable storage medium is used for storing the computer program and other programs and data required by the device with data processing capability, and may also be used for temporarily storing data that has been output or is to be output.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.

Claims (9)

1. A method for identifying and protecting against attackable data in a kernel file system, characterized by comprising:
compiling the source code of the kernel file system into LLVM IR files, and scanning the LLVM IR files to obtain different categories of core data, including file metadata, file content, and file read/write direction;
tracing, based on the LLVM IR files, the data flow propagation and control flow propagation of the core data to obtain attackable non-control data;
dynamically verifying the attackable non-control data to determine whether it is attackable data;
protecting the data on disk that is ultimately affected by the attackable data, the ultimately affected data being data representing file content, including the page cache and the disk block data that represent file content;
wherein the non-control data is divided into file metadata, file content, and file read/write direction, and dynamically verifying the attackable non-control data to determine whether it is attackable data comprises:
according to the non-control data to be verified, a user-space program performs a legitimate write operation on a file for which it has write permission, triggering the kernel code that uses the non-control data, and kernel instrumentation code automatically records the target value of the non-control data and stores it in an array maintained by the kernel;
the user-space program attempts to write malicious content to a read-only file, triggering the kernel code that uses the non-control data, and the target value of the non-control data is retrieved from the array and written, whereupon the kernel relocates the data and automatically resets it to the value recorded in the previous step;
the user-space program rereads the read-only file and checks whether its content has been changed to the malicious content written in the previous step; if so, the data is attackable data.
2. The method according to claim 1, characterized in that the data structures for permission data in the abstract file system layer are used as the file metadata, the data structures representing file content in the page cache layer and the generic block layer are used as the file content, and the read/write events uniformly recorded in the generic block layer when a data exchange is triggered are used as the file read/write direction.
3. The method according to claim 1, characterized in that the data flow propagation and control flow propagation of the core data are traced through value flow analysis based on type-based access paths.
4. The method according to claim 1, characterized in that, for core data in the form of flag variables, the data flow propagation includes direct assignment or assignment after a logical operation, and in the control flow propagation the value of another flag variable is affected through a branch condition.
5. The method according to claim 1, characterized in that, for core data in the form of pointer references, the data flow propagation includes direct assignment and assignment after arithmetic operations.
6. An apparatus for identifying and protecting against attackable data in a kernel file system, characterized by comprising:
a scanning module, configured to compile the source code of the kernel file system into LLVM IR files and to scan the LLVM IR files to obtain different categories of core data, including file metadata, file content, and file read/write direction;
a tracking module, configured to trace, based on the LLVM IR files, the data flow propagation and control flow propagation of the core data to obtain attackable non-control data;
a dynamic verification module, configured to dynamically verify the attackable non-control data and determine whether it is attackable data;
a protection module, configured to protect the data on disk that is ultimately affected by the attackable data, the ultimately affected data being data representing file content, including the page cache and the disk block data that represent file content;
wherein the non-control data is divided into file metadata, file content, and file read/write direction, and dynamically verifying the attackable non-control data to determine whether it is attackable data comprises:
according to the non-control data to be verified, a user-space program performs a legitimate write operation on a file for which it has write permission, triggering the kernel code that uses the non-control data, and kernel instrumentation code automatically records the target value of the non-control data and stores it in an array maintained by the kernel;
the user-space program attempts to write malicious content to a read-only file, triggering the kernel code that uses the non-control data, and the target value of the non-control data is retrieved from the array and written, whereupon the kernel relocates the data and automatically resets it to the value recorded in the previous step;
the user-space program rereads the read-only file and checks whether its content has been changed to the malicious content written in the previous step; if so, the data is attackable data.
7. A computer program product comprising a computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the method according to any one of claims 1-5.
8. An electronic device, characterized by comprising:
one or more processors;
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.
9. A computer readable storage medium having computer instructions stored thereon, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-5.
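The three dynamic verification steps recited in claims 1 and 6 (record on a legitimate write, reset on a write to a read-only file, then reread) can be sketched as a small user-space simulation. This is a minimal sketch under stated assumptions, not the kernel instrumentation of the application: the class `ToyKernel`, its `mode` switch, the file names, and the function `is_attackable` are all hypothetical, and the field under test is modeled as an `f_mode` value checked against a write bit. The constants mirror the Linux values `FMODE_READ` (0x1) and `FMODE_WRITE` (0x2).

```python
FMODE_READ = 0x1
FMODE_WRITE = 0x2
EBADF = 9


class ToyKernel:
    """Hypothetical stand-in for an instrumented kernel write path."""

    def __init__(self):
        self.files = {"scratch.txt": "", "readonly.txt": "original"}
        self.recorded = []   # stands in for the kernel-maintained array
        self.mode = "off"    # "off", "record", or "replay"

    def write(self, name, f_mode, data):
        # Instrumentation at the use site of the field under test.
        if self.mode == "record":
            self.recorded.append(f_mode)     # step 1: record the target value
        elif self.mode == "replay" and self.recorded:
            f_mode = self.recorded[-1]       # step 2: reset to the recorded value
        if not (f_mode & FMODE_WRITE):       # the permission check being tested
            return -EBADF
        self.files[name] = data
        return len(data)

    def read(self, name):
        return self.files[name]


def is_attackable(kernel):
    # Step 1: legitimate write to a writable file, recording the field value.
    kernel.mode = "record"
    kernel.write("scratch.txt", FMODE_READ | FMODE_WRITE, "benign")
    # Step 2: write to a read-only file while the field is reset.
    kernel.mode = "replay"
    kernel.write("readonly.txt", FMODE_READ, "malicious")
    # Step 3: reread the read-only file and compare its content.
    kernel.mode = "off"
    return kernel.read("readonly.txt") == "malicious"


baseline = ToyKernel().write("readonly.txt", FMODE_READ, "malicious")
verdict = is_attackable(ToyKernel())
```

Without the reset, the write to the read-only file is rejected (`baseline` is a negative error code); with the reset, the content changes, so the modeled field is reported as attackable. A field whose value does not influence the permission check would leave the read-only file unchanged and would be filtered out.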
CN202411021889.1A 2024-07-29 2024-07-29 A method for identifying and protecting against attackable data in the kernel file system Active CN119004443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411021889.1A CN119004443B (en) 2024-07-29 2024-07-29 A method for identifying and protecting against attackable data in the kernel file system

Publications (2)

Publication Number Publication Date
CN119004443A CN119004443A (en) 2024-11-22
CN119004443B CN119004443B (en) 2025-11-04

Family

ID=93489257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411021889.1A Active CN119004443B (en) 2024-07-29 2024-07-29 A method for identifying and protecting against attackable data in the kernel file system

Country Status (1)

Country Link
CN (1) CN119004443B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107545174A (en) * 2017-08-22 2018-01-05 武汉大学 A kind of system and method for resisting controlling stream abduction based on LLVM
CN107967426A (en) * 2017-11-27 2018-04-27 华中科技大学 A kind of detection method, defence method and the system of linux kernel Data attack

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1870829B1 (en) * 2006-06-23 2014-12-03 Microsoft Corporation Securing software by enforcing data flow integrity
US11568044B2 (en) * 2018-03-19 2023-01-31 University Of Florida Research Foundation, Incorporated Method and apparatus for vetting universal serial bus device firmware
US10860709B2 (en) * 2018-06-29 2020-12-08 Intel Corporation Encoded inline capabilities
US11010495B1 (en) * 2018-10-23 2021-05-18 Architecture Technology Corporation Systems and methods for runtime enforcement of data flow integrity
CN109918903B (en) * 2019-03-06 2022-06-21 西安电子科技大学 A Protection Method for Program Uncontrolled Data Attack Based on LLVM Compiler
CN111881485B (en) * 2020-07-14 2022-04-05 浙江大学 A Kernel Sensitive Data Integrity Protection Method Based on ARM Pointer Verification

Also Published As

Publication number Publication date
CN119004443A (en) 2024-11-22

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant