Disclosure of Invention
The application provides a method and a system for restoring SMB protocol files, which are used for solving the problem that complete files cannot be restored.
In a first aspect, the present application provides a method for restoring an SMB protocol file, where the method includes:
acquiring a restoring instruction of a target file;
responding to the restoring instruction, and if the message type of the SMB protocol is a downloaded file, acquiring the direction of the message;
if the direction is the request direction, acquiring restored file field information of the request direction message, and storing the restored file field information to a session control structure, wherein the restored file field information comprises an interactive message number, a file offset, a file name, a file size and file total length information;
if the direction is the response direction, acquiring the interactive message number of the response direction message;
if the interactive message number of the response direction message exists in the session control structure, matching the other restored file field information except the interactive message number in the restored file field information;
writing the file block of the request direction message or the response direction message into a memory according to the file offset;
and reading the total length of the written file blocks in the memory and the total length of the file in the corresponding restored file field information, and if the written total length is the same as the total length of the file, determining that restoration of the target file is complete.
In some possible embodiments, after the responding to the restore instruction, the method includes:
if the message type of the SMB protocol is an uploading file, acquiring the direction of the message;
if the direction is the request direction, acquiring the restored file field information of the request direction message, and storing the restored file field information to the session control structure;
if the interactive message number in the restored file field information of the request direction message exists in the session control structure, matching the file offset in the session control structure;
and writing the file block of the request direction message into a memory according to the file offset.
In some possible embodiments, the writing the file block of the request direction message or the response direction message into the memory according to the file offset includes:
detecting a memory file structure body in the session control structure, wherein the session control structure comprises a plurality of structure bodies;
if the memory file structure is in a default initial state, writing the file block of the request direction message or the response direction message into a memory according to the file offset, wherein the memory comprises a plurality of fields.
In some possible embodiments, the writing the file block of the request direction message or the response direction message into the memory according to the file offset includes:
assigning the restored file field information in the session control structure to the field of the memory, wherein the restored file field information and the field have a mapping relation;
setting the offset of the file pointer to the offset field of the memory;
and writing the file block of the request direction message into an offset field corresponding to the memory according to the file offset of the request direction message.
In some possible embodiments, the writing the file block of the request direction message or the response direction message into the memory according to the file offset further includes:
applying for a memory from an operating system, wherein the memory comprises a plurality of memory blocks.
In some possible embodiments, after reading the total length of the written file blocks in the memory and the total length of the file corresponding to the restored file field information, the method includes:
searching restored file field information corresponding to the file block according to the written file block;
if the total length of the written file blocks is the same as the total length of the file in the restored file field information, determining that the complete target file is stored in the memory;
and writing the target file stored in the memory into a hard disk to complete the restoration of the target file.
In some possible embodiments, the saving the restored file field information to the session control structure includes:
acquiring the interval time of the request direction message;
if the interval time is smaller than the time threshold value, acquiring the number of sessions that the session control structure can store; and creating an integer array in the session control structure, wherein the array is used for storing the interactive message number, the file offset and the file size of the request direction message, and the size of the array is the same as the number of sessions that the session control structure can store.
In some possible embodiments, the method further comprises:
acquiring quintuple information;
and creating a session control structure according to the quintuple information.
In some possible embodiments, the writing the file block of the request direction message or the response direction message into the memory further includes:
acquiring the file block type;
if the file block type is an executable file, detecting the file block based on a behavior analysis detection method to obtain a detection result;
and if the detection result is that the malicious behavior is not detected, writing the content of the file block.
In a second aspect, the present application provides an SMB protocol file restore system, configured to perform the SMB protocol file restore method of any one of the first aspect, where the system includes:
the acquisition unit is used for acquiring a restoring instruction of the target file;
responding to the restoring instruction, and if the message type of the SMB protocol is a downloaded file, acquiring the direction of the message by the acquisition unit; if the direction is a request direction, acquiring restored file field information of the request direction message, and storing the restored file field information to a session control structure, wherein the restored file field information comprises an interactive message number, a file offset, a file name, a file size and file total length information; if the direction is the response direction, the interactive message number of the response direction message is obtained;
the matching unit is used for matching the restored file field information except the interactive message number in the restored file field information if the interactive message number of the response direction message exists in the session control structure;
the writing unit is used for writing the file block of the request direction message or the response direction message into a memory according to the file offset;
and the reading unit is used for reading the total length of the written file blocks in the memory and the total length of the file in the corresponding restored file field information, and if the total length of the written file blocks is the same as the total length of the file, determining that restoration of the target file is complete.
According to the technical scheme, the application provides a method and a system for restoring an SMB protocol file. A restoring instruction of a target file is obtained, and if the message type of the SMB protocol is a downloaded file, the direction of the message is obtained. If the direction is the request direction, restored file field information of the request direction message is acquired and stored in a session control structure, wherein the restored file field information comprises an interactive message number, a file offset, a file name, a file size and file total length information. If the direction is the response direction, the interactive message number of the response direction message is acquired; if that interactive message number exists in the session control structure, the restored file field information other than the interactive message number is matched. The file block of the request direction message or the response direction message is written into a memory according to the file offset. The total length of the written file blocks in the memory is then compared with the total length of the file in the corresponding restored file field information, and if the two are the same, restoration of the target file is complete. By storing the restored file field information in the session control structure and using the interactive message number and the file offset, the application solves the problem of out-of-order file blocks in SMB protocol file transmission, so that the complete target file is restored.
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The embodiments described in the examples below do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with some aspects of the present application as detailed in the claims.
SMB is a protocol for file sharing, printer sharing, and general network communication, by which a client application can read and write files on a server under various network environments, and make service requests to the server program. In addition, through the SMB protocol, the application program can access the remote server-side files and the printer and other resources.
However, the SMB protocol may also be exploited by malicious users to conduct attacks, such as man-in-the-middle attacks, or to spread viruses through shared files. To control file transmission over the SMB protocol and to provide functions such as virus prevention, the files in transit need to be restored; these functions require that the file name, the file size, the file content and the like be identified by parsing the transmission protocol.
Network security devices, such as IP protocol ciphers, security routers, line ciphers and firewalls, can only regulate traffic by file name and file size, and their anti-virus function works by computing an MD5 (Message Digest Algorithm 5) digest for each message and matching it.
Functions such as identifying the real file type from the file content and preventing viruses require a complete file. For example, file type identification based on the real file content needs to read the header information of the file, but during an SMB upload or download the header content of the file is not necessarily carried in the first file transmission message, so the file type cannot be identified accurately. Likewise, for an antivirus function based on the MD5 of the traffic, out-of-order file blocks during SMB transmission cause the MD5 to be computed incorrectly, which makes virus detection inaccurate. In summary, the complete file cannot be restored because of problems such as the position of the file header information and out-of-order file blocks.
To solve the problem that complete files cannot be restored because file blocks arrive out of order during SMB protocol file transmission, some embodiments of the present application provide a method for restoring an SMB protocol file. Referring to fig. 1, the method includes:
S100: obtaining a restoring instruction of the target file.
The target file may be a deleted file, a damaged file, a file infected by a virus, or the like. When such a condition of the target file is detected, a restoring instruction for restoring the whole file, that is, the data of the complete target file, can be sent.
Restoration may also be performed on demand: the method first determines whether a function required during SMB protocol file transmission, such as an antivirus function or an auditing function, needs the complete file, and then obtains the restoring instruction.
When the SMB protocol is used to transmit a file, the client sends a request, the server receives and responds to the request, and the server sends the target file to the client. When the anti-virus function needs to be realized, the server can generate a restoring instruction. For example, when the client receives an alert from antivirus software indicating that the target file may carry a virus, the client may request the server to delete or quarantine the file. The server then performs this operation to restore the corresponding file, for example by deleting the virus file or moving it to the quarantine area.
S200: responding to the restoring instruction, and if the message type of the SMB protocol is a downloaded file, acquiring the direction of the message.
The message types of SMB include uploading (writing a file) and downloading (reading a file), so file restoration for the SMB protocol covers both uploaded files and downloaded files. According to RFC (Request For Comments) documents, the message format in SMB protocol transmission includes a NETBIOS (Network Basic Input/Output System) header, an SMB header and text content. For file restoration, fields in the SMB header can be used, such as the operation command (which distinguishes a read download from a write upload), the interactive message ID, the file transfer offset and length, and the total length of the file. Because the message formats differ between downloading and uploading, the data in each field differs and the data structures used in processing differ as well. Upload and download are therefore restored separately.
The client is configured to respond to the restoring instruction of the server by acquiring the direction of the message in the transmission process, where the direction is either the request direction or the response direction. The direction reflects the roles of the sender and the receiver and the order of communication: in the SMB protocol, a request direction message is sent from the client to the server, and a response direction message is sent from the server to the client in reply to a request.
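The classification of a message by operation command and direction can be illustrated with a minimal sketch. It assumes an SMB2-style 64-byte header as laid out in the MS-SMB2 specification (the description above does not name a dialect) and assumes the buffer starts immediately after the NETBIOS header; the names `smb_hdr_info` and `smb2_parse_header` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: SMB2 header layout, 64 bytes, little-endian fields. */
#define SMB2_HDR_LEN        64
#define SMB2_CMD_READ       0x0008u      /* download (read file)  */
#define SMB2_CMD_WRITE      0x0009u      /* upload (write file)   */
#define SMB2_FLAG_RESPONSE  0x00000001u  /* set on server-to-client messages */

enum smb_direction { DIR_REQUEST, DIR_RESPONSE };

struct smb_hdr_info {
    uint16_t           command;
    enum smb_direction direction;
    uint64_t           message_id;
};

/* Read an n-byte little-endian unsigned integer. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    while (n-- > 0)
        v = (v << 8) | p[n];
    return v;
}

/* Returns 0 on success, -1 if buf does not hold an SMB2 header. */
int smb2_parse_header(const uint8_t *buf, size_t len, struct smb_hdr_info *out)
{
    static const uint8_t magic[4] = { 0xFE, 'S', 'M', 'B' };
    uint32_t flags;

    if (buf == NULL || out == NULL || len < SMB2_HDR_LEN)
        return -1;
    if (memcmp(buf, magic, sizeof magic) != 0)
        return -1;

    out->command    = (uint16_t)read_le(buf + 12, 2);  /* Command   */
    flags           = (uint32_t)read_le(buf + 16, 4);  /* Flags     */
    out->message_id = read_le(buf + 24, 8);            /* MessageId */
    out->direction  = (flags & SMB2_FLAG_RESPONSE) ? DIR_RESPONSE : DIR_REQUEST;
    return 0;
}
```

A caller would compare `command` against `SMB2_CMD_READ` or `SMB2_CMD_WRITE` to choose the download or upload path, and then branch on `direction`.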
S210: if the direction is the request direction, acquiring the restored file field information of the request direction message, and storing the restored file field information to the session control structure.
The restored file field information includes the interactive message number message_id, the file offset, the file name, the file size, and the file total length information. When the SMB message is a download (read file) and its direction is the request direction, the request direction message is parsed to obtain its interactive message number message_id, file offset, file name, file size and file total length information. The session control structure comprises a plurality of structures, and different structures are used for storing different restored file field information.
In some embodiments, the session control structure further comprises a linked list for storing and managing a plurality of message_ids in the session structure.
Illustratively, when a request message is received, the server assigns a unique message_id to the request message and adds the unique message_id to a linked list of the session structure, where each node in the linked list contains information about the message_id. Through the session control structure linked list, the server can track and manage a plurality of request messages, and when a certain request needs to be processed or responded, the server can acquire the corresponding message_id and related information by searching the linked list, and then execute corresponding operations.
In the transmission process of the SMB protocol, when the server receives a plurality of request messages in succession, it looks up the corresponding message_id and related information in the linked list of the session control structure, and then processes each request according to its type and the operation to be executed.
After the restored file field information of the request direction message is obtained, it is saved in the session control structure. A session is defined as an interaction process between the client and the server in network communication, and involves the exchange of one or more data packets. The establishment of a session typically begins with the client sending a request to the server, for example an HTTP request. The server responds to the request and sends one or more data packets in response. This process may carry specific information such as the source IP address, destination IP address, source port, destination port and protocol type, that is, the five-tuple information, which is used to uniquely identify a session.
Because the session control structure stores information belonging to a single session, if earlier messages of the same session have already been processed, the session control structure already exists before the current storage, and the restored file field information is stored in that existing session control structure.
In network communications, quintuple information can be used to determine a particular network connection or session. That is, if the five-tuple information of two network connections is the same, they are connections in the same session. For the first occurrence of session information, in some embodiments, five-tuple information is obtained; and creating a session control structure according to the quintuple information. Messages with the same quintuple information belong to the same session.
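Creating a session control structure on the first occurrence of a five-tuple can be sketched as follows. This is only an illustration: the types `five_tuple` and `session` and the function `session_get_or_create` are hypothetical, a simple linked list stands in for whatever lookup structure an implementation would use, and the caller is assumed to normalize the tuple so that request and response messages of one connection produce the same key.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  protocol;
};

struct session {
    struct five_tuple key;
    struct session   *next;
    /* restored file field information would be stored here */
};

/* Field-by-field comparison avoids comparing struct padding bytes. */
static int tuple_equal(const struct five_tuple *a, const struct five_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->protocol == b->protocol;
}

/* Return the session matching key, creating it if this is the first
 * message seen for that five-tuple. Returns NULL if allocation fails. */
struct session *session_get_or_create(struct session **head,
                                      const struct five_tuple *key)
{
    struct session *s;

    for (s = *head; s != NULL; s = s->next)
        if (tuple_equal(&s->key, key))
            return s;

    s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->key  = *key;
    s->next = *head;
    *head   = s;
    return s;
}
```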
Because the interaction pattern of the SMB protocol is not fixed, a plurality of request messages may be sent in succession before the response messages carrying the file data are sent. Referring to the message exchange mode of fig. 2, the file offsets in successive requests are not necessarily sequential, and parts of the file may be skipped, so the message_id of each request and the file offset matched with it need to be recorded. The matching offset information can then be looked up in the session control structure through the message_id.
The restored file field information, such as the file offset, is looked up through the message_id. Because the message_id is looked up frequently, the message_id information of each request can be stored in an array, which allows data to be accessed and operated on efficiently. In some embodiments, the interval time of the request direction message can be obtained; if the interval time is smaller than the time threshold value, the number of sessions that the session control structure can store is acquired, and an integer array is created in the session control structure, wherein the array is used for storing the interactive message number, the file offset and the file size of the request direction message, and the size of the array is the same as the number of sessions that the session control structure can store.
By setting the size of the session control structure and of the array, the number of simultaneous sessions can be limited, preventing excessive concurrent requests from consuming too many system resources.
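A bounded array keyed by message_id might look like the sketch below. It is a simplification under stated assumptions: an array of small structs is used instead of a plain integer array, the capacity `MAX_PENDING` is an arbitrary illustrative value, and the names `pending_req`, `req_table_put` and `req_table_take` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64   /* hypothetical capacity, not taken from the text */

struct pending_req {
    int      in_use;
    uint64_t message_id;
    uint64_t file_offset;
    uint32_t file_size;  /* size of the block covered by this request */
};

struct req_table {
    struct pending_req slots[MAX_PENDING];
};

/* Record a request direction message. Returns 0 on success, or -1 when
 * the table is full, which bounds the resources one session can consume. */
int req_table_put(struct req_table *t, uint64_t id, uint64_t off, uint32_t size)
{
    size_t i;

    for (i = 0; i < MAX_PENDING; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use      = 1;
            t->slots[i].message_id  = id;
            t->slots[i].file_offset = off;
            t->slots[i].file_size   = size;
            return 0;
        }
    }
    return -1;
}

/* Look up a message_id. On a hit, copy the entry to *out and free the
 * slot, since each message_id is used only once. Returns 0 or -1. */
int req_table_take(struct req_table *t, uint64_t id, struct pending_req *out)
{
    size_t i;

    for (i = 0; i < MAX_PENDING; i++) {
        if (t->slots[i].in_use && t->slots[i].message_id == id) {
            *out = t->slots[i];
            t->slots[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

The linear scan keeps the sketch short; with a small fixed capacity it is usually adequate, and a hash on message_id could replace it if the capacity grows.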
After the restored file field information is stored in the session control structure, it is determined whether the session->mem_file structure is empty. If it is empty, the message is the first request message received, and memory of the corresponding size is requested from the memory system at this point so that the file blocks can be stored according to the file offset.
It will be appreciated that each message_id is used only once during a complete file transfer interaction. During a file download or upload, the client sends a request asking the server to send or receive the file, and this request contains a message_id that identifies it. After receiving the request, the server performs the corresponding processing according to the message_id in the request and sends a response to the client. The purpose of the message_id in this process is to uniquely identify each request and its response. Once a message_id has been used, it is discarded and not reused. Therefore, unlike file restoration for other protocols, the SMB download and upload process does not wait for the client and the server to complete one interaction before the next message_id is issued. Each message_id is independent and unique and is associated with only one specific file transfer interaction.
S220: if the direction is the response direction, acquiring the interactive message number of the response direction message.
Each request message corresponds to one message_id, and the response message carries the same message_id field, indicating that the two form one complete interaction. The message_id obtained from the response direction message is therefore the same as that of the corresponding request direction message. Because the restored file field information of the same session was already stored in the session control structure when the request was processed, finding the same message_id there means that the other restored file field information stored together with it can be obtained.
S300: if the interactive message number of the response direction message exists in the session control structure, the other restored file field information except the interactive message number in the restored file field information is matched.
Referring to the message exchange process of fig. 3, the restored file field information other than the interactive message number includes the file offset, file name, file size and file total length information. The file offset indicates the current read or write position in the file. Each read or write operation starts at the current file offset, and the offset then increases by the number of bytes read or written.
S400: writing the file blocks of the request direction message or the response direction message into the memory according to the file offset.
The message format in SMB protocol transmission comprises a NETBIOS header, an SMB header and text content, where the text content is the file data. It will be appreciated that the messages in the uploading or downloading process contain text content, and this text content is the file that needs to be restored in this embodiment. Therefore, when request direction messages or response direction messages are matched, the text content in the messages comprises a plurality of file blocks. As described above, whether the session->mem_file structure is empty is determined by detecting the memory file structure mem_file in the session control structure; if the memory file structure is in a default initial state, memory of the file size given in the restored file field information is requested. An exemplary memory file structure is as follows:
struct mem_file {
    void *file;           /* pointer to the file content */
    int mem_len;          /* length of the file written so far */
    int mem_offset;       /* offset value for writing the file */
    char *mem_file_name;  /* file name */
};
When the memory file structure is in a default initial state, each member variable of the structure is set to its default value. For example, an integer variable in the structure may be initialized to 0, a character variable may be initialized to the null character '\0', and a pointer variable may be initialized to NULL.
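Detecting the default initial state and requesting memory on the first request could be sketched as below. The sketch uses `size_t` rather than `int` for the length fields, which departs from the exemplary structure, and the helper names `mem_file_is_default` and `mem_file_prepare` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct mem_file {
    void  *file;           /* pointer to the file content */
    size_t mem_len;        /* length of the file written so far */
    size_t mem_offset;     /* offset value for writing the file */
    char  *mem_file_name;  /* file name */
};

/* Returns 1 if every member still holds its default value. */
int mem_file_is_default(const struct mem_file *m)
{
    return m->file == NULL && m->mem_len == 0 &&
           m->mem_offset == 0 && m->mem_file_name == NULL;
}

/* On the first request message, allocate a zeroed buffer large enough for
 * the whole file. Does nothing if memory was already prepared.
 * Returns 0 on success, -1 on a zero length or allocation failure. */
int mem_file_prepare(struct mem_file *m, size_t total_len)
{
    if (!mem_file_is_default(m))
        return 0;
    if (total_len == 0)
        return -1;
    m->file = calloc(1, total_len);
    return m->file != NULL ? 0 : -1;
}
```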
To allow the size of the memory to be adjusted dynamically, in some embodiments the memory may be requested from the operating system. The memory comprises a plurality of memory blocks: a memory space of fixed size is requested in advance and divided into a plurality of small blocks, and it is understood that each memory block can be used for storing one file block or a plurality of file blocks. Each memory block may have a unique identifier to facilitate tracking and management. The size of the memory blocks can also be adjusted dynamically according to the size of the target file.
After the memory blocks have been requested, the file name and file offset data in the session control structure can be assigned to the different memory blocks. In some embodiments, the restored file field information in the session control structure is assigned to the fields of the memory; the offset of the file pointer is set to the offset field of the memory; and the file block of the request direction message is written into the corresponding offset field of the memory according to the file offset of the request direction message.
The restored file field information and the fields have a mapping relation; illustratively, the file name is assigned to mem_file_name and the file offset is assigned to mem_offset.
After assignment, the file pointer is moved to mem_offset. When the response direction message is processed, the content of the target file can be written to the correct position only by using the file offset recorded from the request direction, for example offset1 in fig. 4.
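Writing a file block at its recorded offset can be illustrated with the following sketch. The `file_buf` type, its `capacity` field and the function `file_buf_write_block` are hypothetical, and the sketch assumes blocks are neither retransmitted nor overlapping, since otherwise the written length would be counted twice.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct file_buf {
    uint8_t *file;        /* buffer holding the file being restored */
    size_t   capacity;    /* total length of the file */
    size_t   mem_len;     /* bytes written so far */
    size_t   mem_offset;  /* offset of the most recent write */
};

/* Copy one file block to its position in the buffer. The bounds check is
 * written so that offset + len cannot overflow.
 * Returns 0 on success, -1 on bad arguments or an out-of-range block. */
int file_buf_write_block(struct file_buf *m, size_t offset,
                         const void *block, size_t len)
{
    if (m == NULL || m->file == NULL || block == NULL)
        return -1;
    if (offset > m->capacity || len > m->capacity - offset)
        return -1;

    memcpy(m->file + offset, block, len);
    m->mem_offset = offset;
    m->mem_len   += len;
    return 0;
}
```

Because each block lands at its own offset, the order in which response messages arrive does not affect the assembled content.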
In some embodiments, the type of file block may also be obtained before writing the file block into memory; if the file type of the file block is an executable file, detecting the file block based on a behavior analysis detection method to obtain a detection result; if the detection result is that the malicious behavior is not detected, writing the content of the file block.
By setting file type restrictions, the transmission of viruses and other malicious files can be prevented. For example, transmission of certain types of executable files (e.g., .exe, .com, .dll) may be prohibited, thereby reducing the risk of virus transmission. The transmitted file content is then checked to identify potential malicious code. This can be achieved by file scanning, virus signature matching, or detection methods based on behavioral analysis. Behavioral analysis can detect malicious behavior of a file, such as theft of sensitive data, destruction of systems or remote control, and can discover potential malicious activity. If no malicious behavior is detected, the content of the file block may be safely written to the file.
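The gating of executable file blocks before writing can be sketched as follows. The sketch makes several assumptions: the file name extension stands in for file type identification, the behavior analysis detector is represented only by a callback of the hypothetical type `scan_fn`, and the two `scan_reports_*` functions are placeholders for illustration rather than real detectors.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive check against the extensions named in the text. */
int is_executable_name(const char *name)
{
    static const char *const exts[] = { ".exe", ".com", ".dll" };
    const char *dot = name ? strrchr(name, '.') : NULL;
    size_t i;

    if (dot == NULL)
        return 0;
    for (i = 0; i < sizeof exts / sizeof exts[0]; i++) {
        const char *a = dot, *b = exts[i];
        while (*a && *b && tolower((unsigned char)*a) == *b) {
            a++;
            b++;
        }
        if (*a == '\0' && *b == '\0')
            return 1;
    }
    return 0;
}

/* Hypothetical detector: returns nonzero if malicious behavior is found. */
typedef int (*scan_fn)(const void *block, size_t len);

/* Placeholder detectors used only to exercise the gate. */
int scan_reports_clean(const void *b, size_t n)     { (void)b; (void)n; return 0; }
int scan_reports_malicious(const void *b, size_t n) { (void)b; (void)n; return 1; }

/* Returns 1 if the block may be written, 0 if it must be withheld. */
int block_write_allowed(const char *name, const void *block, size_t len,
                        scan_fn scan)
{
    if (!is_executable_name(name))
        return 1;               /* non-executable: no detection required */
    if (scan == NULL)
        return 0;               /* conservative choice: no detector, no write */
    return scan(block, len) == 0;
}
```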
S500: and reading the total length of the written file blocks in the memory and the total length of the files in the corresponding restored file field information, and if the total length of the written file blocks is the same as the total length of the files, representing that the target file restoration is completed.
As the file blocks are gradually written into the memory, if the total length of the written file blocks is the same as the total length of the file, the complete target file is stored in the memory; the file blocks stored in the memory are then written into the hard disk to complete the restoration of the target file. Writing the target file to the hard disk only after it has been fully assembled in memory improves performance and reduces IO operations. It can be understood that the memory holds a plurality of file blocks, and the target file is written into the hard disk once the length of the written file blocks in the memory is the same as the total length of the file in the restored file field information.
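The completion check and the single write to the hard disk can be sketched as below. The `restored_file` type and both function names are hypothetical, and the output path is supplied by the caller.

```c
#include <stddef.h>
#include <stdio.h>

struct restored_file {
    const unsigned char *data;  /* file content assembled in memory */
    size_t written_len;         /* total length of written file blocks */
    size_t total_len;           /* total length from the file field information */
};

/* Returns 1 when the written length equals the total file length. */
int restored_file_complete(const struct restored_file *f)
{
    return f->total_len > 0 && f->written_len == f->total_len;
}

/* Write the assembled file to disk in one pass, but only once complete.
 * Returns 0 on success, 1 if the file is still incomplete, -1 on I/O error. */
int restored_file_flush(const struct restored_file *f, const char *path)
{
    FILE  *fp;
    size_t n;
    int    close_err;

    if (!restored_file_complete(f))
        return 1;
    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    n = fwrite(f->data, 1, f->total_len, fp);
    close_err = fclose(fp);
    return (n == f->total_len && close_err == 0) ? 0 : -1;
}
```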
Taking a read to download a 2M size file as an example, the data structure according to S200-S500 is as follows:
As with a downloaded file, the direction of the message is also obtained after the message type of the SMB protocol is determined to be an uploaded file. In some embodiments, if the message type of the SMB protocol is an upload file, the direction of the message is obtained; if the direction is the request direction, the restored file field information of the request direction message is acquired and stored in the session control structure; if the interactive message number in the restored file field information of the request direction message exists in the session control structure, the file offset in the session control structure is matched; and the file block of the request direction message is written into the memory according to the file offset.
However, unlike downloading, in this embodiment the response direction message need not be considered when a file is uploaded: the restored file field information of the request direction message is obtained directly, and after it is stored in the session control structure, the file offset is obtained through the message_id and the file block is written into the memory.
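The difference between the download path and the upload path can be illustrated with one small dispatcher. This is a sketch under stated assumptions, not the claimed implementation: messages are assumed to be already parsed into the hypothetical `smb_msg` type, the slot count `FLOW_SLOTS` is arbitrary, and for uploads the block is written straight from the request without a separate lookup step.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum msg_type { MSG_DOWNLOAD, MSG_UPLOAD };
enum msg_dir  { MSG_REQUEST, MSG_RESPONSE };

struct smb_msg {
    enum msg_type  type;
    enum msg_dir   dir;
    uint64_t       message_id;
    size_t         offset;    /* meaningful in request messages */
    const uint8_t *data;      /* file block carried by the message, or NULL */
    size_t         data_len;
};

#define FLOW_SLOTS 16         /* hypothetical limit on outstanding requests */

struct flow {
    uint8_t *buf;             /* memory holding the file being restored */
    size_t   total_len;
    size_t   written;
    struct { int used; uint64_t id; size_t offset; } pending[FLOW_SLOTS];
};

static int flow_write(struct flow *f, size_t off, const uint8_t *d, size_t n)
{
    if (d == NULL || off > f->total_len || n > f->total_len - off)
        return -1;
    memcpy(f->buf + off, d, n);
    f->written += n;
    return 0;
}

/* Returns 1 when the file is complete, 0 when more data is expected,
 * -1 on an error or an unmatched message_id. */
int flow_handle(struct flow *f, const struct smb_msg *m)
{
    size_t i;

    if (m->dir == MSG_REQUEST && m->type == MSG_UPLOAD) {
        /* Upload: the request carries both the offset and the file block. */
        if (flow_write(f, m->offset, m->data, m->data_len) != 0)
            return -1;
        return f->written == f->total_len;
    }
    if (m->dir == MSG_REQUEST) {
        /* Download request: remember the offset under its message_id. */
        for (i = 0; i < FLOW_SLOTS; i++) {
            if (!f->pending[i].used) {
                f->pending[i].used   = 1;
                f->pending[i].id     = m->message_id;
                f->pending[i].offset = m->offset;
                return 0;
            }
        }
        return -1;
    }
    if (m->type == MSG_UPLOAD)
        return 0;             /* upload responses carry no file data */

    /* Download response: match the message_id, then write at its offset. */
    for (i = 0; i < FLOW_SLOTS; i++) {
        if (f->pending[i].used && f->pending[i].id == m->message_id) {
            f->pending[i].used = 0;
            if (flow_write(f, f->pending[i].offset, m->data, m->data_len) != 0)
                return -1;
            return f->written == f->total_len;
        }
    }
    return -1;
}
```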
In order to facilitate the execution of the method described above, some embodiments of the present application further provide an SMB protocol file restore system, including:
the acquisition unit is used for acquiring a restoring instruction of the target file;
responding to the restoring instruction, and if the message type of the SMB protocol is a downloaded file, acquiring the direction of the message by the acquisition unit; if the direction is the request direction, the method is used for acquiring restored file field information of a request direction message and storing the restored file field information into a session control structure, wherein the restored file field information comprises an interactive message number, a file offset, a file name, a file size and file total length information; if the direction is the response direction, the interactive message number of the response direction message is obtained;
the matching unit is used for matching the restored file field information except the interactive message number in the restored file field information if the interactive message number of the response direction message exists in the session control structure;
the writing unit is used for writing the file blocks of the request direction message or the response direction message into the memory according to the file offset;
the reading unit is used for reading the total length of the written file blocks in the memory and the total length of the file in the corresponding restored file field information, and if the total length of the written file blocks is the same as the total length of the file, determining that restoration of the target file is complete.
According to the technical scheme, the application provides a method and a system for restoring an SMB protocol file. A restoring instruction of a target file is obtained, and if the message type of the SMB protocol is a downloaded file, the direction of the message is obtained. If the direction is the request direction, restored file field information of the request direction message is acquired and stored in a session control structure, wherein the restored file field information comprises an interactive message number, a file offset, a file name, a file size and file total length information. If the direction is the response direction, the interactive message number of the response direction message is acquired; if that interactive message number exists in the session control structure, the restored file field information other than the interactive message number is matched. The file block of the request direction message or the response direction message is written into a memory according to the file offset. The total length of the written file blocks in the memory is then compared with the total length of the file in the corresponding restored file field information, and if the two are the same, restoration of the target file is complete. By storing the restored file field information in the session control structure and using the interactive message number and the file offset, the application solves the problem of out-of-order file blocks in SMB protocol file transmission, so that the complete target file is restored.
The foregoing detailed description of the embodiments is merely illustrative of the general principles of the present application and should not be taken in any way as limiting the scope of the invention. Any other embodiments developed in accordance with the present application without inventive effort are within the scope of the present application for those skilled in the art.