CN118860993B - Metadata processing method and device based on distributed storage system - Google Patents

Info

Publication number
CN118860993B
Authority
CN
China
Prior art keywords
metadata
layer container
storage system
cache layer
distributed storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411346256.8A
Other languages
Chinese (zh)
Other versions
CN118860993A (en
Inventor
臧林劼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Jinan data Technology Co ltd
Original Assignee
Inspur Jinan data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Inspur Jinan data Technology Co ltd
Priority to CN202411346256.8A
Publication of CN118860993A
Application granted
Publication of CN118860993B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F16/10 File systems; File servers
              • G06F16/17 Details of further file system functions
                • G06F16/172 Caching, prefetching or hoarding of files
              • G06F16/14 Details of searching files based on file metadata
                • G06F16/148 File search processing
              • G06F16/18 File system types
                • G06F16/182 Distributed file systems
          • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
            • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
              • G06F3/0601 Interfaces specially adapted for storage systems
                • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
                  • G06F3/061 Improving I/O performance
                    • G06F3/0611 Improving I/O performance in relation to response time
                  • G06F3/0614 Improving the reliability of storage systems
                • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
                  • G06F3/0638 Organizing or formatting or addressing of data
                    • G06F3/0643 Management of files
                  • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
                    • G06F3/0656 Data buffering arrangements
                • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
                  • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
          • G06F9/00 Arrangements for program control, e.g. control units
            • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
              • G06F9/46 Multiprogramming arrangements
                • G06F9/54 Interprogram communication
                  • G06F9/547 Remote procedure calls [RPC]; Web services
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L67/00 Network arrangements or protocols for supporting network services or applications
            • H04L67/01 Protocols
              • H04L67/10 Protocols in which an application is distributed across nodes in the network
                • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
              • H04L67/133 Protocols for remote procedure calls [RPC]
          • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
            • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
              • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a metadata processing method and device based on a distributed storage system, applied to the technical field of storage. The method comprises: caching part of the metadata in a metadata pool of the distributed storage system to a metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions; when the target metadata corresponding to a metadata access request is in a target metadata partition of the metadata cache layer container, returning the target metadata to a client; and when the target metadata corresponding to the metadata access request is not in a target metadata partition of the metadata cache layer container, acquiring the target metadata from the metadata pool, caching it to the target metadata partition of the metadata cache layer container, and returning it to the client.

Description

Metadata processing method and device based on distributed storage system
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a metadata processing method and apparatus based on a distributed storage system.
Background
With the development of big data and cloud computing technology and the spread of intelligent devices, data volumes and data types are growing rapidly. Traditional storage systems cannot meet large-scale, diversified data storage requirements, whereas distributed storage systems can provide high-performance, highly reliable and scalable storage solutions.
In a distributed storage system, metadata is managed by a metadata service to ensure file consistency and reliability. As the number of metadata access requests grows, existing metadata services struggle to meet demand.
Disclosure of Invention
In view of the above problems, a metadata processing method and apparatus based on a distributed storage system are proposed that overcome or at least partially solve those problems, including:
a metadata processing method based on a distributed storage system provided with a metadata cache layer container, the method comprising:
caching part of the metadata in a metadata pool of the distributed storage system to the metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions;
when target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container, returning the target metadata to a client;
when the target metadata corresponding to a metadata access request is missed in all of the plurality of metadata partitions of the metadata cache layer container, acquiring the target metadata from the metadata pool, caching the target metadata into the target metadata partition of the metadata cache layer container, and returning the target metadata to the client.
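As a non-limiting illustration, the hit and miss handling described above can be sketched as a read-through cache. The class name `PartitionedMetadataCache`, the use of a plain dictionary to stand in for the metadata pool, and the CRC32-based choice of the target partition are all assumptions made for the example; the text does not specify how the target partition is selected.

```python
import zlib
from typing import Dict, List, Optional


class PartitionedMetadataCache:
    """Read-through metadata cache split into several partitions (illustrative)."""

    def __init__(self, metadata_pool: Dict[str, dict], num_partitions: int = 4):
        self.pool = metadata_pool  # stands in for the metadata pool
        self.partitions: List[Dict[str, dict]] = [{} for _ in range(num_partitions)]
        self.hits = 0
        self.misses = 0

    def _target_partition(self, path: str) -> Dict[str, dict]:
        # Assumed routing rule: CRC32 of the path modulo the partition count.
        index = zlib.crc32(path.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def get(self, path: str) -> Optional[dict]:
        partition = self._target_partition(path)
        if path in partition:
            # Hit: return the cached metadata directly.
            self.hits += 1
            return partition[path]
        # Miss: fetch from the pool, cache in the target partition, then return.
        self.misses += 1
        metadata = self.pool.get(path)
        if metadata is not None:
            partition[path] = metadata
        return metadata


pool = {"/data/a.txt": {"size": 10, "mode": 0o644}}
cache = PartitionedMetadataCache(pool)
first = cache.get("/data/a.txt")   # miss, served from the pool
second = cache.get("/data/a.txt")  # hit, served from the cache
```

In this sketch a key can only ever live in its target partition, so checking that one partition is equivalent to checking all of them.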
Optionally, the method further comprises:
dynamically updating the metadata cached in the metadata cache layer container.
Optionally, the dynamically updating the metadata cached in the metadata cache layer container includes:
determining weight information of the metadata cached in the metadata cache layer container, and dynamically updating the metadata cached in the metadata cache layer container according to the weight information.
Optionally, the determining the weight information of the metadata cached in the metadata cache layer container includes:
determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container;
and determining weight information of the metadata cached in the metadata cache layer container according to the access frequency and the cache hit rate.
Optionally, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes:
acquiring the number of accesses to the metadata cached in the metadata cache layer container and a preset time interval;
and determining the access frequency of the metadata cached in the metadata cache layer container according to the number of accesses and the preset time interval.
Optionally, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes:
acquiring the number of accesses and the number of cache hits of the metadata cached in the metadata cache layer container;
and determining the cache hit rate of the metadata cached in the metadata cache layer container according to the number of accesses and the number of cache hits.
Optionally, the dynamically updating the metadata cached in the metadata cache layer container according to the weight information includes:
sorting the metadata cached in the metadata cache layer container according to the weight information, wherein metadata ranked earlier has a greater weight than metadata ranked later;
and removing a preset quantity of the sorted metadata from the metadata cache layer container.
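As a non-limiting illustration, the weight-based update can be sketched as follows. The access frequency is taken as the number of accesses divided by the preset interval, and the hit rate as cache hits divided by accesses, as stated above. Two points are assumptions of the example rather than statements of the text: the weight is formed as a weighted sum controlled by a hypothetical parameter `alpha`, and the entries removed are those at the low-weight end of the ranking.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EntryStats:
    accesses: int = 0  # accesses observed during the preset interval
    hits: int = 0      # accesses that were served from the cache


def access_frequency(stats: EntryStats, interval_seconds: float) -> float:
    return stats.accesses / interval_seconds


def hit_rate(stats: EntryStats) -> float:
    return stats.hits / stats.accesses if stats.accesses else 0.0


def weight(stats: EntryStats, interval_seconds: float, alpha: float = 0.5) -> float:
    # Assumed combination: the text does not say how the two terms are merged.
    return alpha * access_frequency(stats, interval_seconds) + (1 - alpha) * hit_rate(stats)


def evict(cache: Dict[str, dict], stats: Dict[str, EntryStats],
          interval_seconds: float, count: int) -> List[str]:
    """Sort by descending weight and remove `count` entries from the tail."""
    ranked = sorted(cache, key=lambda k: weight(stats[k], interval_seconds), reverse=True)
    n = min(max(count, 0), len(ranked))
    victims = ranked[len(ranked) - n:]
    for key in victims:
        del cache[key]
    return victims


cache = {"/a": {}, "/b": {}, "/c": {}}
stats = {
    "/a": EntryStats(accesses=100, hits=90),
    "/b": EntryStats(accesses=10, hits=5),
    "/c": EntryStats(accesses=1, hits=0),
}
removed = evict(cache, stats, interval_seconds=10.0, count=1)  # removes "/c"
```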
Optionally, the distributed storage system is provided with a plurality of storage nodes, and further includes:
for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request, wherein the second storage node is a node that backs up the metadata in the first storage node.
Optionally, the performing, for the metadata write request, a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request includes:
for a metadata write request, performing a write operation on a first storage node of the distributed storage system according to the metadata write request, and marking the metadata write request as being in an unacknowledged state;
performing a write operation on a second storage node of the distributed storage system according to the metadata write request;
after all the second storage nodes complete the write operation, marking the metadata write request as being in an acknowledged state, so as to complete the metadata write request.
Optionally, before the writing operation at the second storage node of the distributed storage system according to the metadata writing request, the method further includes:
negotiating at a plurality of second storage nodes of the distributed storage system to determine an order of write operations in the plurality of second storage nodes.
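As a non-limiting illustration, the write path described above can be sketched in memory. The classes `StorageNode` and `MetadataWriteRequest` and the function `negotiate_order` are hypothetical names for the example. The text does not specify the negotiation protocol, so the sketch simply orders the second storage nodes by name as a stand-in.

```python
from enum import Enum
from typing import Dict, List, Optional


class WriteState(Enum):
    UNACKNOWLEDGED = "unacknowledged"
    ACKNOWLEDGED = "acknowledged"


class StorageNode:
    def __init__(self, name: str):
        self.name = name
        self.store: Dict[str, dict] = {}

    def write(self, key: str, value: dict) -> None:
        self.store[key] = dict(value)


class MetadataWriteRequest:
    def __init__(self, key: str, value: dict):
        self.key = key
        self.value = value
        self.state: Optional[WriteState] = None


def negotiate_order(backups: List[StorageNode]) -> List[StorageNode]:
    # Placeholder for the negotiation step: a deterministic order by node name.
    return sorted(backups, key=lambda node: node.name)


def replicated_write(request: MetadataWriteRequest, primary: StorageNode,
                     backups: List[StorageNode]) -> List[str]:
    """Write to the first node, then to every backup node, then acknowledge."""
    written = []
    primary.write(request.key, request.value)
    request.state = WriteState.UNACKNOWLEDGED
    written.append(primary.name)
    for node in negotiate_order(backups):
        node.write(request.key, request.value)
        written.append(node.name)
    # Only reached once all backup writes have completed.
    request.state = WriteState.ACKNOWLEDGED
    return written


primary = StorageNode("node-1")
backups = [StorageNode("node-3"), StorageNode("node-2")]
request = MetadataWriteRequest("/dir/file", {"size": 1})
order = replicated_write(request, primary, backups)
```

A real system would also have to handle a backup write that fails or times out, in which case the request would remain unacknowledged; that path is omitted here.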
Optionally, the method further comprises:
dividing the metadata partitions in the metadata cache layer container into a plurality of data fragments by adopting a consistent hash algorithm, and storing the data fragments on different storage nodes in the distributed storage system.
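As a non-limiting illustration, placing data fragments on storage nodes with a consistent hash ring can be sketched as follows. The class name `ConsistentHashRing`, the use of SHA-256 as the ring hash, the number of virtual nodes, and the fragment naming scheme are assumptions of the example and are not taken from the text.

```python
import bisect
import hashlib
from typing import List


def ring_hash(value: str) -> int:
    # First 8 bytes of SHA-256 interpreted as an unsigned integer.
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes: List[str], vnodes: int = 64):
        if not nodes:
            raise ValueError("at least one storage node is required")
        # Each node owns `vnodes` points on the ring to even out the distribution.
        self._ring = sorted(
            (ring_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, fragment_id: str) -> str:
        # The first ring point clockwise from the fragment's hash owns the fragment.
        index = bisect.bisect_right(self._points, ring_hash(fragment_id))
        return self._ring[index % len(self._ring)][1]


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
fragments = [f"partition-0/fragment-{i}" for i in range(200)]
placement = {fragment: ring.node_for(fragment) for fragment in fragments}
```

The property that motivates this design is that removing a node only relocates the fragments that node owned; fragments on the remaining nodes keep their placement.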
Optionally, the method further comprises:
receiving a metadata access request sent by a client through a remote procedure call communication mechanism;
the returning the target metadata to the client comprises the following steps:
and returning the target metadata to the client through a remote procedure call communication mechanism.
Optionally, the distributed storage system is provided with a plurality of storage nodes, and further includes:
receiving a data read-write request sent by a client through a remote procedure call communication mechanism;
in response to the data read-write request, storing data in the form of objects in, or reading data from, a plurality of storage nodes of the distributed storage system.
Optionally, the storing data in the form of objects in a plurality of storage nodes of the distributed storage system includes:
slicing the data to be stored, and storing the sliced data in a plurality of storage nodes of the distributed storage system in the form of objects.
Optionally, each metadata partition stores metadata in a tree data structure, the tree data structure comprising a plurality of tree data structure nodes, the plurality of tree data structure nodes comprising a root node and leaf nodes;
the path from the root node to a leaf node corresponds to a string, and the time complexity of the insert operation and the find operation in the tree data structure is related to the string length.
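As a non-limiting illustration, a tree in which each root-to-node path spells a string can be sketched as a character-level trie keyed by file path. The names `TrieNode` and `MetadataTrie` are hypothetical. Both operations walk one node per character, so their cost grows with the string length rather than with the number of stored entries.

```python
from typing import Dict, Optional


class TrieNode:
    __slots__ = ("children", "metadata")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.metadata: Optional[dict] = None


class MetadataTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, path: str, metadata: dict) -> None:
        node = self.root
        for ch in path:  # one step per character of the path
            node = node.children.setdefault(ch, TrieNode())
        node.metadata = metadata

    def find(self, path: str) -> Optional[dict]:
        node = self.root
        for ch in path:  # one step per character of the path
            node = node.children.get(ch)
            if node is None:
                return None
        return node.metadata


trie = MetadataTrie()
trie.insert("/data/a.txt", {"size": 10})
trie.insert("/data/ab.txt", {"size": 20})
found = trie.find("/data/a.txt")
```

As a simplification, this sketch allows metadata to sit on an interior node when one stored path is a prefix of another.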
Optionally, the distributed storage system provides file system services externally, and the client initiates the metadata access request by accessing a file.
A metadata processing apparatus based on a distributed storage system provided with a metadata cache layer container, the apparatus comprising:
a metadata cache module, configured to cache part of the metadata in a metadata pool of the distributed storage system to the metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions;
a hit processing module, configured to return the target metadata to a client when target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container;
and a miss processing module, configured to, when the target metadata corresponding to a metadata access request is missed in all the metadata partitions of the metadata cache layer container, acquire the target metadata from the metadata pool, cache the target metadata into the target metadata partition of the metadata cache layer container, and return the target metadata to the client.
An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, which when executed by the processor implements a method as described above.
A computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method as described above.
A computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
The embodiment of the invention has the following advantages:
In the embodiments of the invention, part of the metadata in the metadata pool of the distributed storage system is cached to the metadata cache layer container, which is provided with a plurality of metadata partitions. When the target metadata corresponding to a metadata access request is hit in the target metadata partition of the metadata cache layer container, the target metadata is returned to the client. When the target metadata is missed in all of the metadata partitions of the metadata cache layer container, the target metadata is obtained from the metadata pool, cached in the target metadata partition of the metadata cache layer container, and returned to the client. By arranging the metadata cache layer container in the distributed storage system and caching metadata through it, the access demand on metadata is met.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture provided by some embodiments of the invention;
FIG. 2 is a schematic diagram of another system architecture provided by some embodiments of the invention;
FIG. 3 is a flow chart of steps of a metadata processing method based on a distributed storage system according to some embodiments of the present invention;
FIG. 4 is a schematic diagram of another system architecture provided by some embodiments of the invention;
FIG. 5 is a block diagram of a metadata processing apparatus based on a distributed storage system according to some embodiments of the present invention;
FIG. 6 is a block diagram of an electronic device provided in some embodiments of the invention;
FIG. 7 is a block diagram of a storage medium according to some embodiments of the invention;
FIG. 8 is a block diagram of a program product provided by some embodiments of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In a distributed storage system, data can be stored on multiple independent storage nodes and managed in a distributed manner, providing a high-performance, highly reliable and scalable storage solution.
The distributed storage system is designed around scalability and fault tolerance. It adopts a distributed architecture that slices data and stores the slices as objects across a plurality of nodes, keeps capacity and load balanced across the nodes, performs data access and management over the network, and provides a unified storage interface.
Because the bottom layer of the distributed storage system stores everything as sliced objects, providing a file system service externally requires a metadata service to manage file metadata (such as file names, permissions and attributes), to ensure the consistency and reliability of the file system, and to provide a high-performance file service.
As data storage becomes larger in scale and more diverse and storage costs rise, metadata operation requests (on file attributes, permissions, indexes and the like) account for more than 76% of storage IO requests in massive small-file workloads, which challenges the access load and performance of the metadata service. In large-scale distributed file storage, metadata has become the main bottleneck of IO performance in a distributed storage system. The key reasons are that horizontal scaling of metadata in the distributed storage system is limited, and that metadata management is overly complex and lacks flexibility and scalability.
In the related art, to improve the file access efficiency of distributed storage and to avoid contention between data access and metadata management in a distributed storage file system, an MDS (Metadata Server, i.e. metadata service) is designed to be responsible for managing and storing metadata. To meet the scalability needs of the distributed storage cluster, a plurality of MDS service processes are used, and the metadata of the whole file system is managed in partitions.
With manual partitioning, directory subtrees at different levels are statically assigned to each MDS. When the load among the MDSs becomes unbalanced, subtrees are migrated between MDSs by manual adjustment according to the load on each node of the MDS cluster, redistributing subtrees to different MDS nodes to achieve load balancing. An MDS node with a higher load determines the size of the subtree to migrate according to the heat of its local directories, and migrates some subtrees to a more lightly loaded MDS node. It can be seen that the load of each MDS node and the locations of the subtrees are static, so this approach is better suited to scenarios where data locations are relatively fixed. As the business grows and the cluster needs to be expanded, subtrees must also be manually reassigned to newly added MDS nodes, so the metadata load balancing capability is poor.
However, managing and maintaining multiple MDS service processes in a distributed storage cluster is complex, and the metadata cache must be preset manually according to the number of metadata services estimated from the underlying devices of the distributed system, in order to ensure the performance, consistency and reliability of metadata.
Moreover, multiple metadata service processes are difficult to maintain, and tuning them requires substantial labor cost. Under highly concurrent loads of massive small files, the metadata cache cannot be expanded elastically in time, so metadata operations must be looked up in the metadata storage pool through the MDS process, which aggravates storage IO latency. Although metadata and data are stored separately and decoupled, in essence they still need to be preconfigured together, and metadata depends on the efficiency and deployment configuration of the back-end data storage devices; when a performance bottleneck appears in either metadata management or back-end storage, the performance of the storage system drops rapidly.
Moreover, during dynamic subtree partitioning, data may need to be migrated between different MDSs, and data migration may cause problems such as performance degradation, network overhead increase, and the like, and even risk of data loss or data inconsistency may occur. Dynamic subtree partitioning adds to the complexity of management and requires monitoring and adjusting the loading conditions of the different subtrees to ensure overall system balance and stability.
In the embodiments of the invention, metadata is managed through a function-as-a-service (FaaS) model: the metadata of the distributed storage system is combined with the function-as-a-service model to provide an efficient, flexible and elastically scalable metadata management solution. Specifically, client metadata requests are processed through the function-as-a-service model, metadata update events are triggered, and the functions interact with the distributed storage system to ensure the consistency of file system storage IO. With the embodiments of the invention, metadata management becomes more flexible and efficient, data operations are processed faster, the elasticity and scalability of the system are improved, and the operation and maintenance burden is reduced.
The function-as-a-service model is a cloud computing service model that allows developers to write and deploy functionally independent code, which typically runs in the form of functions.
In this model, developers do not need to manage servers, operating systems or network infrastructure and can focus on writing functions, while the cloud service provider is responsible for the underlying infrastructure, scalability and security. A function written by a developer runs as an event-driven microservice and is triggered only when needed; this on-demand execution reduces resource waste while providing high flexibility and scalability. Functions can be triggered by events, enabling asynchronous processing and fast response, and the number of instances can be scaled automatically on demand to adapt to different load conditions.
As shown in FIG. 1, the distributed storage system architecture includes a plurality of components. In a storage pool, client data is sliced into objects with a default fixed size of 4 MB; the objects are grouped, with the group ID being a value computed from Hash(object identifier) and the number of groups; and the objects are stored on the underlying devices.
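As a non-limiting illustration, slicing client data into 4 MB objects and assigning each object to a group can be sketched as follows. The object naming scheme and the use of CRC32 as the hash are assumptions of the example. The exact way the hash and the number of groups are combined is not clear from the text; the sketch assumes a modulo operation, which is one common choice.

```python
import zlib
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # default fixed object size of 4 MB


def slice_into_objects(name: str, data: bytes,
                       object_size: int = OBJECT_SIZE) -> List[Tuple[str, bytes]]:
    """Split data into fixed-size objects; the last object may be shorter."""
    return [
        (f"{name}.{offset // object_size:08d}", data[offset:offset + object_size])
        for offset in range(0, len(data), object_size)
    ]


def group_id(object_id: str, num_groups: int) -> int:
    # Assumed rule: hash of the object identifier modulo the number of groups.
    return zlib.crc32(object_id.encode("utf-8")) % num_groups


data = bytes(9 * 1024 * 1024)  # 9 MB of zero bytes as sample input
objects = slice_into_objects("file-1", data)
groups = [group_id(object_id, 128) for object_id, _ in objects]
```

With 9 MB of input the sketch yields three objects: two of 4 MB and one of 1 MB.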
The distributed storage system architecture includes the following key components:
1. A unified, self-controlling, scalable distributed storage consistency management system. This component is the core component of the distributed storage cluster and provides distributed storage services in the form of objects. In this component, data is stored as objects, each having a unique identifier and associated data. The component is responsible for storing objects across the nodes of the storage cluster in a distributed manner and for providing data replication, recovery, load balancing and similar functions, so as to ensure data reliability and high-performance access.
2. Storage pool. The distributed storage cluster is composed of multiple storage nodes, each of which may contain multiple hard disks or storage devices. The storage pool divides the storage resources into different pools to meet different storage requirements.
3. A controllable and scalable distributed data balanced-placement algorithm. This component distributes data objects uniformly across the nodes of the storage cluster to avoid data hot spots and improve system performance.
4. Metadata service container module and monitoring service cluster. The metadata service container module is responsible for managing the metadata information of the file system, including the attributes and locations of files and directories. The monitoring service cluster is responsible for monitoring the state and configuration information of the cluster.
As shown in FIG. 2, a client of the distributed storage system sends file system metadata requests to the function-as-a-service metadata cache layer, using an RPC (Remote Procedure Call) mechanism for communication. The function-as-a-service metadata cache layer runs in a container, so the metadata cache service and other storage services can run on the distributed storage system at the same time while avoiding direct resource contention between them.
The automatic scaling policy module implements the update mechanism of the cache layer and the parallel processing of metadata, and dynamically scales the metadata in the metadata cache layer container according to the load of metadata requests. A metadata consistency module is designed to ensure the consistency and accuracy of user service storage IO data and to refresh cached metadata that has become invalid.
For data read-write IO, according to the path information obtained from the metadata, data is stored in the OSD (Object Storage Daemon) devices of the storage nodes according to the policy of the distributed storage system, in the form of objects on the physical disks of each node in the cluster. Under the management of the metadata consistency module, metadata is persisted to OSD devices of the distributed storage nodes, and these devices use high-speed SSDs (solid-state disks) for storage.
In particular, the embodiment of the invention has the following advantages:
1. Performance: with the same hardware resources, in a distributed file storage environment, the function-as-a-service management mode for metadata can automatically scale the number of instances according to the request volume, adapt to different load conditions, achieve elastic scaling, and ensure linear growth of the performance of the distributed storage system.
2. The invention is decoupled from the distributed storage system, which improves the maintainability and scalability of the system, reduces its complexity, is transparent to service clients, and provides stability.
3. The components and method modules of the invention can be packaged, so they can be deployed and started quickly, which reduces go-live time and deployment complexity and provides security.
4. With the metadata hierarchical processing logic and the storage form for metadata information and structured attribute information provided by the invention, the function-as-a-service management mode for metadata does not need to consider the infrastructure of the distributed storage system and only needs to concentrate on developing metadata business logic, which can improve the competitiveness of distributed file storage and reduce maintenance cost.
5. The method can process metadata operations in an event-driven manner, achieving immediate response and fast processing, and is portable and general.
The examples of the invention are further described below:
Referring to fig. 3, a flowchart of the steps of a metadata processing method based on a distributed storage system according to some embodiments of the present invention is shown, where the distributed storage system is provided with a metadata cache layer container.
Specifically, the method comprises the following steps:
Step 301, caching part of the metadata in a metadata pool of the distributed storage system to the metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions.
In practical application, a function-as-a-service metadata cache layer container can be arranged in the distributed storage system, and part of the metadata in the metadata pool of the distributed storage system can then be cached to the metadata cache layer container.
The embodiment of the invention makes full use of the function-as-a-service model in cloud computing. The metadata cache layer container is deployed under that model, so the metadata cache directory tree structure can be scaled out automatically, quickly, and transparently in response to bursts of work. Metadata requests can therefore adapt in real time to continuously changing workloads, which improves the resource utilization of the distributed storage system and avoids resource competition with other service modules.
In addition, designing the metadata cache layer container as function-as-a-service addresses problems such as a complex metadata management mode and numerous dependencies. The design does not depend on the management server, operating system, or network infrastructure of the underlying distributed storage system; it focuses instead on writing the functions and on a reasonably designed metadata management architecture, and it supports seamless expansion of the system scale to meet ever-growing data demands.
In some embodiments of the present invention, each metadata partition stores metadata in a tree data structure comprising a plurality of tree data structure nodes including root nodes and leaf nodes, a path from a root node to a leaf node corresponding to a string, and a time complexity of an insert operation and a find operation in the tree data structure being related to a string length.
In practical application, the metadata cache layer container caches metadata information and stores it in memory in tree data structures. Each tree data structure node represents a character, each edge represents the connection between characters, and a path from the root node to a leaf node represents a character string. The time complexity of insert and find operations in this structure depends on the string length and is generally O(m), where m is the length of the string and O() denotes time complexity, so these operations are very fast. Caching metadata in this structure can improve the hit rate of metadata read operations.
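The character-per-node tree described above behaves like a trie. The following is a minimal sketch of one, not the patented implementation: the class names `TrieNode` and `MetadataTrie`, the `terminal` flag, and the metadata dictionary contents are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class TrieNode:
    """One tree node; each edge to a child is labeled with a single character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.terminal = False          # True where a complete path string ends
        self.metadata: Optional[Any] = None


class MetadataTrie:
    """Hypothetical in-memory store for one metadata partition."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, path: str, metadata: Any) -> None:
        # Walks or creates one node per character: O(m) for a path of length m.
        node = self.root
        for ch in path:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
        node.metadata = metadata

    def find(self, path: str) -> Optional[Any]:
        # Follows one edge per character: O(m) regardless of how many paths are stored.
        node = self.root
        for ch in path:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.metadata if node.terminal else None


trie = MetadataTrie()
trie.insert("/mnt/tmp/test1", {"size": 4096, "mode": 0o644})
```

In this sketch a lookup for a prefix such as `/mnt/tmp` returns `None`, because only nodes where a full path ends carry metadata.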
Step 302, when target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container, returning the target metadata to the client.
In some embodiments of the invention, the distributed storage system provides file system services externally, and the client initiates the metadata access request in the form of an access file.
In some embodiments of the invention, further comprising:
And receiving a metadata access request sent by the client through a remote procedure call communication mechanism.
In some embodiments of the present invention, the returning the target metadata to the client includes:
and returning the target metadata to the client through a remote procedure call communication mechanism.
In practical applications, interaction between the client and the distributed storage system can be realized through an RPC communication path, which improves the throughput and flexibility of metadata requests. First, the RPC interface required for metadata request processing is designed, including request types, parameters, and return results. Second, the specific metadata request processing logic is written, including file deletion and modification operations, directory management, and metadata retrieval. A corresponding data structure is designed for each metadata request type, and the serialization and deserialization methods are determined, to ensure the correctness and integrity of the data during network transmission. The specific process is as follows:
1. When the client sends a request, an RPC connection over TCP is established; under dynamic load, a corresponding number of metadata directory tree instances are instantiated in the metadata cache container to serve the different communication connection requests.
2. The metadata cache container submits the metadata request to the cached directory tree structure metadata; if the requested metadata does not exist in the container, a function is called to instantiate a new metadata cache.
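The request types, parameters, return results, and serialization described above can be sketched as follows. This is an illustration only: the message classes `MetadataRequest` and `MetadataResponse`, the operation names (`lookup`, `update`, `delete`), and the choice of JSON as the wire format are assumptions, and the TCP transport itself is omitted.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class MetadataRequest:
    request_id: int
    op: str                      # hypothetical request types: "lookup", "update", "delete"
    path: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class MetadataResponse:
    request_id: int
    ok: bool
    result: Dict[str, Any] = field(default_factory=dict)


def encode(message: Any) -> bytes:
    """Serialize a request or response dataclass for network transmission."""
    return json.dumps(asdict(message), sort_keys=True).encode("utf-8")


def decode_request(payload: bytes) -> MetadataRequest:
    """Deserialize bytes received from the client back into a request."""
    return MetadataRequest(**json.loads(payload.decode("utf-8")))


def handle(request: MetadataRequest, store: Dict[str, Dict[str, Any]]) -> MetadataResponse:
    """Dispatch one request against an in-memory metadata store (a stand-in)."""
    if request.op == "lookup":
        found = store.get(request.path)
        return MetadataResponse(request.request_id, found is not None, found or {})
    if request.op == "update":
        store.setdefault(request.path, {}).update(request.params)
        return MetadataResponse(request.request_id, True, store[request.path])
    if request.op == "delete":
        removed = store.pop(request.path, None)
        return MetadataResponse(request.request_id, removed is not None)
    return MetadataResponse(request.request_id, False, {"error": "unknown op"})


store = {"/mnt/tmp/test1": {"size": 4096}}
wire = encode(MetadataRequest(1, "lookup", "/mnt/tmp/test1"))
reply = handle(decode_request(wire), store)
```

Carrying `request_id` in both messages lets a client match replies to requests when several are in flight on one connection.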
Step 303, when target metadata corresponding to a metadata access request is missed in all metadata partitions of the metadata cache layer container, obtaining target metadata corresponding to the metadata access request from the metadata pool, caching the target metadata into the target metadata partitions of the metadata cache layer container, and returning the target metadata to a client.
In the embodiment of the invention, the method first judges whether the metadata corresponding to the metadata access request is hit in the metadata cache layer container, and then determines whether to obtain the target metadata from the metadata cache layer container or from the metadata pool. This addresses the problem of low metadata access efficiency, which calls for suitable metadata access strategies, including metadata caching and metadata distribution.
As shown in fig. 4, the metadata access request is processed as follows:
1. The client initiates metadata request 1 for the file /mnt/tmp/test1.
2. A lookup is performed in the tree structure in the metadata cache layer container; the metadata is found, so the cache is hit.
3. The metadata is returned directly to the client.
4. The client issues another metadata request, request 2, for the file /mnt/share/test2.
5. The lookup results in a cache miss.
6. The metadata information of test2 is obtained from the metadata pool, directory tree structure cache information for the file is created in the container, and the metadata is returned to the client.
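The hit and miss paths of steps 301 to 303 can be sketched as a read-through cache. This is a simplified illustration under stated assumptions: the class `MetadataCacheLayer`, its `warm` and `get` methods, the use of plain dictionaries for partitions and for the metadata pool, and partition selection by a CRC32 of the parent directory are all hypothetical.

```python
import zlib
from typing import Dict, List, Optional


class MetadataCacheLayer:
    """Hypothetical cache layer container with several metadata partitions."""

    def __init__(self, pool: Dict[str, dict], num_partitions: int = 4) -> None:
        self.pool = pool  # stand-in for the metadata pool of the storage system
        self.partitions: List[Dict[str, dict]] = [{} for _ in range(num_partitions)]
        self.hits = 0
        self.misses = 0

    def _target_partition(self, path: str) -> Dict[str, dict]:
        # Assumption: the partition is chosen from the parent directory, so a
        # miss in the target partition implies a miss in every partition.
        parent = path.rsplit("/", 1)[0]
        index = zlib.crc32(parent.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def warm(self, paths: List[str]) -> None:
        """Step 301: cache part of the pool's metadata in the container."""
        for path in paths:
            if path in self.pool:
                self._target_partition(path)[path] = self.pool[path]

    def get(self, path: str) -> Optional[dict]:
        partition = self._target_partition(path)
        if path in partition:
            self.hits += 1            # step 302: hit, return from the cache
            return partition[path]
        self.misses += 1              # step 303: miss, fall back to the pool
        metadata = self.pool.get(path)
        if metadata is not None:
            partition[path] = metadata  # cache it in the target partition
        return metadata


pool = {"/mnt/tmp/test1": {"size": 1}, "/mnt/share/test2": {"size": 2}}
layer = MetadataCacheLayer(pool)
layer.warm(["/mnt/tmp/test1"])
first = layer.get("/mnt/tmp/test1")     # request 1: served from the cache
second = layer.get("/mnt/share/test2")  # request 2: fetched from the pool
```

A repeated request for `/mnt/share/test2` would then be served from the cache, since the miss path stores the fetched metadata before returning it.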
In some embodiments of the invention, further comprising:
and dynamically updating the metadata cached in the metadata caching layer container.
In the embodiment of the invention, dynamically updating the metadata cached in the metadata cache layer container addresses the problem of metadata load balancing. During metadata management, load balancing of the metadata must be considered so that no single node becomes overloaded and degrades the performance and stability of the system. The load distribution across metadata nodes needs to be monitored and adjusted to keep the whole system balanced.
In some embodiments of the present invention, the dynamically updating the metadata cached in the metadata cache layer container includes determining weight information of the metadata cached in the metadata cache layer container, and dynamically updating the metadata cached in the metadata cache layer container according to the weight information.
In some embodiments of the present invention, the determining the weight information of the metadata cached in the metadata cache layer container includes determining an access frequency and a cache hit rate of the metadata cached in the metadata cache layer container, and determining the weight information of the metadata cached in the metadata cache layer container according to the access frequency and the cache hit rate.
In some embodiments of the present invention, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes obtaining the access times and the preset time intervals of the metadata cached in the metadata cache layer container, and determining the access frequency of the metadata cached in the metadata cache layer container according to the access times and the preset time intervals.
In some embodiments of the present invention, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes obtaining the number of accesses and the number of cache hits of the metadata cached in the metadata cache layer container, and determining the cache hit rate of the metadata cached in the metadata cache layer container according to the number of accesses and the number of cache hits.
In some embodiments of the present invention, the dynamically updating the metadata cached in the metadata cache layer container according to the weight information includes:
According to the weight information, sorting the metadata cached in the metadata caching layer container; and removing the ordered preset quantity of metadata from the metadata cache layer container.
Wherein the weight of the metadata ranked first is greater than the weight of the metadata ranked later.
In the embodiment of the invention, the update mechanism of the cache layer and the parallel processing of metadata are implemented by an automatic expansion strategy. Metadata caching can improve the throughput of the file system, and the amount of cached metadata should satisfy the dynamic workload as far as possible; however, caching too much metadata occupies memory resources, so the cost of the metadata cache must be balanced against memory resources. The embodiment of the invention therefore provides an automatic expansion strategy for managing the metadata cache, as follows:
Define the metadata access frequency: RF(metadata) = number of accesses / time interval;
Define the metadata cache hit rate: HR(metadata) = number of cache hits / total number of accesses;
Calculate the metadata weight: W(metadata) = RF(metadata) × (1 - HR(metadata)).
According to the weight W(metadata), the several metadata entries with the lowest weight are selected for elimination, and the cache size is adjusted dynamically as needed to adapt to changes in the access pattern. The procedure is as follows:
1. Initialize the metadata cache: set the initial cache size, and initialize the access frequency, cache hit rate, and weight of the metadata.
2. Monitor metadata access: each time metadata is accessed, count the accesses and cache hits, and calculate RF and HR for each metadata entry.
3. Calculate the weight W(metadata) of each metadata entry according to the formula and determine which metadata is to be eliminated; the cache is expanded when the weight is higher and shrunk otherwise.
4. Apply the automatic expansion strategy: periodically check the weight of each metadata entry and expand or shrink the cache as needed. The metadata with the lowest weight is replaced or eliminated, making room for metadata with a high access frequency.
5. Perform cache replacement: selectively replace metadata according to the weight calculation results to keep the cache efficiently utilized.
6. Monitor and adjust dynamically: continuously monitor the metadata access pattern and caching effect, and adjust algorithm parameters and strategies according to actual conditions to optimize cache performance.
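The weight formula and the elimination of the lowest-weight entries can be sketched as follows. The names `EntryStats`, `weight`, and `select_for_eviction`, the sample counters, and the 10-second interval are assumptions for illustration; how accesses and hits are attributed to an individual entry is also assumed, since the text does not specify it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EntryStats:
    accesses: int = 0   # total accesses within the interval
    hits: int = 0       # accesses that were served from the cache


def access_frequency(stats: EntryStats, interval_seconds: float) -> float:
    """RF = number of accesses / time interval."""
    return stats.accesses / interval_seconds


def hit_rate(stats: EntryStats) -> float:
    """HR = number of cache hits / total number of accesses (0 if never accessed)."""
    return stats.hits / stats.accesses if stats.accesses else 0.0


def weight(stats: EntryStats, interval_seconds: float) -> float:
    """W = RF * (1 - HR)."""
    return access_frequency(stats, interval_seconds) * (1.0 - hit_rate(stats))


def select_for_eviction(all_stats: Dict[str, EntryStats],
                        interval_seconds: float, count: int) -> List[str]:
    """Sort by descending weight and return the `count` lowest-weight paths."""
    ranked = sorted(all_stats,
                    key=lambda path: weight(all_stats[path], interval_seconds),
                    reverse=True)
    return ranked[-count:] if count > 0 else []


stats = {
    "/mnt/a": EntryStats(accesses=100, hits=50),  # RF = 10, HR = 0.50
    "/mnt/b": EntryStats(accesses=10, hits=9),    # RF = 1,  HR = 0.90
    "/mnt/c": EntryStats(accesses=40, hits=10),   # RF = 4,  HR = 0.25
}
victims = select_for_eviction(stats, interval_seconds=10.0, count=1)
```

With these sample counters `/mnt/b` has the lowest weight and is the one selected, which matches the ordering in the text where higher-weight metadata ranks first and the tail is removed.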
In some embodiments of the present invention, the distributed storage system is provided with a plurality of storage nodes, and further includes:
for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request, wherein the second storage node is a node for backing up metadata in the first storage node.
In some embodiments of the present invention, for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request includes:
For a metadata write request, performing a write operation on a first storage node of the distributed storage system according to the metadata write request, and marking the metadata write request as an unacknowledged state; performing a write operation at a second storage node of the distributed storage system according to the metadata write request; and, after all the second storage nodes complete the write operation, marking the metadata write request as an acknowledged state so as to complete the metadata write request.
In some embodiments of the present invention, before the writing operation at the second storage node of the distributed storage system according to the metadata writing request, the method further includes:
negotiating at a plurality of second storage nodes of the distributed storage system to determine an order of write operations in the plurality of second storage nodes.
In practical application, for the writing operation of metadata, the following operations may be performed:
1. The write request is sent to the master metadata node (metadata 1), and the operation is marked as unacknowledged.
2. After receiving the write request, the master metadata node forwards the write request to all backup replica metadata nodes and requests these nodes to also perform the write operation.
3. After receiving the write request, all the backup replica metadata nodes negotiate to check the consistency of the order of write operations and finally reach agreement.
4. Once all backup replica metadata nodes confirm that the write operation is completed, the master metadata node marks the write request as acknowledged, completing the entire write operation.
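The four steps above can be sketched as a master that acknowledges a write only after every backup has applied it. This is a single-process illustration, not the patented protocol: the classes `MetadataNode` and `MasterNode` are hypothetical, and the negotiation among backups is replaced here by a sequence number assigned by the master, which is a simplifying assumption.

```python
from enum import Enum
from typing import Dict, List


class WriteState(Enum):
    UNACKNOWLEDGED = "unacknowledged"
    ACKNOWLEDGED = "acknowledged"


class MetadataNode:
    """Hypothetical metadata node holding a copy of the metadata."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, dict] = {}
        self.applied: List[int] = []   # sequence numbers in the order applied

    def apply(self, seq: int, path: str, metadata: dict) -> bool:
        self.store[path] = dict(metadata)
        self.applied.append(seq)
        return True                    # confirmation that the write completed


class MasterNode(MetadataNode):
    def __init__(self, name: str, backups: List[MetadataNode]) -> None:
        super().__init__(name)
        self.backups = backups
        self.next_seq = 0
        self.states: Dict[int, WriteState] = {}

    def write(self, path: str, metadata: dict) -> int:
        # Step 1: write on the master and mark the request as unacknowledged.
        seq = self.next_seq
        self.next_seq += 1
        self.apply(seq, path, metadata)
        self.states[seq] = WriteState.UNACKNOWLEDGED
        # Steps 2-3: forward to every backup; the shared sequence number stands
        # in for the order the backups would otherwise agree on by negotiation.
        confirmations = [b.apply(seq, path, metadata) for b in self.backups]
        # Step 4: acknowledge only once all backups have confirmed.
        if all(confirmations):
            self.states[seq] = WriteState.ACKNOWLEDGED
        return seq


backups = [MetadataNode("backup-1"), MetadataNode("backup-2")]
master = MasterNode("metadata-1", backups)
seq = master.write("/mnt/tmp/test1", {"size": 4096})
```

If any backup failed to confirm, the request would stay in the unacknowledged state; retry and failure handling are outside the scope of this sketch.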
In some embodiments of the invention, further comprising:
And dividing the metadata partitions in the metadata cache layer container into a plurality of data fragments by adopting a consistent hash algorithm, and storing the data fragments on different storage nodes in the distributed storage system.
In practical application, the parent directory ID is taken as input and the corresponding hash value (partition) is calculated by the consistent hash algorithm. The partitions correspond to directory tree structure metadata 1 to n, and each partition determines the storage location of its metadata. When nodes are dynamically added or removed, consistent hashing ensures that most data remains mapped to its original node, which reduces the cost of data migration and redistribution. Consistent hashing simplifies metadata management and location lookup, reduces the amount of information to be tracked, and improves the scalability and flexibility of the system. Each deployed metadata cache container instance is then responsible for a partition of the metadata namespace. Finally, the client operates on metadata according to the partition.
In practical applications, metadata partitions 1 to n in the metadata cache layer container are divided into a plurality of fragments using a consistent hashing algorithm and stored on different nodes in the distributed storage system. The hash function is hash = H(metadata partitions 1 to n), and each of the metadata partitions 1 to n in the metadata cache layer container is mapped to a position on the hash ring, position = hash mod n. The value range of the hash ring is generally a fixed interval, and n is the number of nodes in the distributed storage cluster.
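A hash ring with the property described above, where most keys keep their node when the node set changes, can be sketched as follows. This is an illustration under assumptions rather than the formula in the text: the class `HashRing`, the use of SHA-256 truncated to 64 bits, virtual nodes, and clockwise successor lookup are all choices made for the example, and keys such as `dir-17` stand in for parent directory IDs.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_hash(key: str) -> int:
    """Map a string to a point on a ring with the fixed range [0, 2**64)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        self.vnodes = vnodes
        self._points: List[int] = []       # sorted ring positions
        self._owner: Dict[int, str] = {}   # ring position -> storage node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed at several positions to even out the load.
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def locate(self, partition_key: str) -> str:
        # The key belongs to the first node position clockwise from its hash.
        idx = bisect.bisect_right(self._points, ring_hash(partition_key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["node-1", "node-2", "node-3"])
keys = [f"dir-{i}" for i in range(1000)]
before = {k: ring.locate(k) for k in keys}
ring.add_node("node-4")
after = {k: ring.locate(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to `node-4`; keys never move between the original nodes, which is what keeps migration cost low when the cluster grows.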
In some embodiments of the present invention, the distributed storage system is provided with a plurality of storage nodes, and further includes:
Responding to the data read-write request, storing the data in a plurality of storage nodes of the distributed storage system in the form of objects or reading the data from the plurality of storage nodes of the distributed storage system.
In some embodiments of the present invention, the storing data in the form of objects in a plurality of storage nodes of the distributed storage system includes slicing the data to be stored and storing the sliced data in the form of objects in a plurality of storage nodes of the distributed storage system.
In the distributed storage system, a distributed architecture is adopted: data is sliced and stored as objects across a plurality of nodes, the capacity and load of each node are kept balanced, data is accessed and managed over the network, and a unified storage interface is provided.
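Slicing data and spreading the slices over nodes as objects can be sketched as follows. The function names, the object naming scheme, the 1024-byte object size, and round-robin placement are all assumptions for illustration; a real system would use its own placement algorithm rather than round-robin.

```python
from typing import Dict, List, Tuple


def slice_into_objects(name: str, data: bytes, object_size: int) -> List[Tuple[str, bytes]]:
    """Cut data into fixed-size slices, each identified by a zero-padded index."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    return [
        (f"{name}.{offset // object_size:08d}", data[offset:offset + object_size])
        for offset in range(0, len(data), object_size)
    ]


def place_objects(objects: List[Tuple[str, bytes]], nodes: List[str]) -> Dict[str, Dict[str, bytes]]:
    """Round-robin placement, a stand-in for the system's placement policy."""
    placement: Dict[str, Dict[str, bytes]] = {node: {} for node in nodes}
    for index, (object_id, payload) in enumerate(objects):
        placement[nodes[index % len(nodes)]][object_id] = payload
    return placement


def read_back(name: str, placement: Dict[str, Dict[str, bytes]]) -> bytes:
    """Collect the objects of one file from all nodes and join them in order."""
    merged: Dict[str, bytes] = {}
    for stored in placement.values():
        for object_id, payload in stored.items():
            if object_id.startswith(name + "."):
                merged[object_id] = payload
    return b"".join(merged[object_id] for object_id in sorted(merged))


data = bytes(range(256)) * 10                      # 2560 bytes of sample data
objects = slice_into_objects("test1", data, 1024)  # slices of 1024, 1024, 512 bytes
placement = place_objects(objects, ["node-1", "node-2", "node-3"])
```

Zero-padding the slice index keeps lexicographic order equal to numeric order, so `read_back` can reassemble the data by sorting object identifiers.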
Because the bottom layer of the distributed storage system stores everything as sliced objects, when a file system service is provided externally, file metadata (such as file names, permissions, and attributes) must be managed through the metadata service, which ensures the consistency and reliability of the file system and provides a high-performance file service.
In the embodiment of the invention, part of the metadata in the metadata pool of the distributed storage system is cached to the metadata cache layer container, which is provided with a plurality of metadata partitions. When target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container, the target metadata is returned to the client. When the target metadata is missed in all of the metadata partitions of the metadata cache layer container, it is obtained from the metadata pool, cached in the target metadata partition of the metadata cache layer container, and returned to the client. Arranging the metadata cache layer container in the distributed storage system and caching metadata through it thus meets the demand for metadata access.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to fig. 5, a schematic structural diagram of a metadata processing apparatus based on a distributed storage system according to some embodiments of the present invention is shown, where the distributed storage system is provided with a metadata cache layer container, and may specifically include the following modules:
The metadata caching module 501 is configured to cache part of metadata in a metadata pool of the distributed storage system to the metadata caching layer container, where the metadata caching layer container is provided with a plurality of metadata partitions;
The hit processing module 502 is configured to return the target metadata to the client when target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container;
The miss processing module 503 is configured to obtain, when target metadata corresponding to a metadata access request is missed in each of a plurality of metadata partitions of the metadata cache layer container, target metadata corresponding to the metadata access request from the metadata pool, cache the target metadata in the target metadata partition of the metadata cache layer container, and return the target metadata to a client.
Optionally, the method further comprises:
and dynamically updating the metadata cached in the metadata caching layer container.
Optionally, the dynamically updating the metadata cached in the metadata cache layer container includes:
and determining weight information of the metadata cached in the metadata caching layer container, and dynamically updating the metadata cached in the metadata caching layer container according to the weight information.
Optionally, the determining the weight information of the metadata cached in the metadata cache layer container includes:
determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container;
and determining weight information of the metadata cached in the metadata cache layer container according to the access frequency and the cache hit rate.
Optionally, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes:
acquiring access times and preset time intervals of the metadata cached in the metadata caching layer container;
And determining the access frequency of the metadata cached in the metadata caching layer container according to the access times and the preset time interval.
Optionally, the determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container includes:
acquiring the access times and the cache hit times of the metadata cached in the metadata cache layer container;
And determining the cache hit rate of the metadata cached in the metadata cache layer container according to the access times and the cache hit times.
Optionally, the dynamically updating the metadata cached in the metadata cache layer container according to the weight information includes:
Sorting the metadata cached in the metadata caching layer container according to the weight information, wherein the weight of the metadata ranked in front is greater than that of the metadata ranked in back;
And removing the ordered preset quantity of metadata from the metadata cache layer container.
Optionally, the distributed storage system is provided with a plurality of storage nodes, and further includes:
for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request, wherein the second storage node is a node for backing up metadata in the first storage node.
Optionally, for the metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request includes:
for a metadata write request, performing write operation on a first storage node of the distributed storage system according to the metadata write request, and marking the metadata write request as an unacknowledged state;
performing a write operation at a second storage node of the distributed storage system according to the metadata write request;
after all the second storage nodes complete the writing operation, marking the metadata writing request as a confirmation state so as to complete the metadata writing request.
Optionally, before the writing operation at the second storage node of the distributed storage system according to the metadata writing request, the method further includes:
negotiating at a plurality of second storage nodes of the distributed storage system to determine an order of write operations in the plurality of second storage nodes.
Optionally, the method further comprises:
And dividing the metadata partitions in the metadata cache layer container into a plurality of data fragments by adopting a consistent hash algorithm, and storing the data fragments on different storage nodes in the distributed storage system.
Optionally, the method further comprises:
receiving a metadata access request sent by a client through a remote procedure call communication mechanism;
the returning the target metadata to the client comprises the following steps:
and returning the target metadata to the client through a remote procedure call communication mechanism.
Optionally, the distributed storage system is provided with a plurality of storage nodes, and further includes:
receiving a data read-write request sent by a client through a remote procedure call communication mechanism;
In response to the data read-write request, data is stored in the form of objects in or read from a plurality of storage nodes of the distributed storage system.
Optionally, the storing data in the form of objects in a plurality of storage nodes of the distributed storage system includes:
slicing the data to be stored, and storing the sliced data in a plurality of storage nodes of the distributed storage system in the form of objects.
Optionally, each metadata partition stores metadata in a tree data structure, the tree data structure comprising a plurality of tree data structure nodes, the plurality of tree data structure nodes comprising root nodes and leaf nodes;
The path from the root node to the leaf node corresponds to a string, and the time complexity of the insert operation and the find operation in the tree data structure is related to the string length.
Optionally, the distributed storage system provides file system services externally, and the client initiates the metadata access request in the form of an access file.
In the embodiment of the invention, part of the metadata in the metadata pool of the distributed storage system is cached to the metadata cache layer container, which is provided with a plurality of metadata partitions. When target metadata corresponding to a metadata access request is hit in a target metadata partition of the metadata cache layer container, the target metadata is returned to the client. When the target metadata is missed in all of the metadata partitions of the metadata cache layer container, it is obtained from the metadata pool, cached in the target metadata partition of the metadata cache layer container, and returned to the client. Arranging the metadata cache layer container in the distributed storage system and caching metadata through it thus meets the demand for metadata access.
Referring to fig. 6, an electronic device provided by some embodiments of the present invention includes a processor 601, a memory 602, and a computer program stored on the memory 602 and capable of running on the processor 601, the computer program implementing the method as above when executed by the processor.
Referring to fig. 7, a computer readable storage medium 700 is shown, where the computer readable storage medium 700 stores a computer program, which when executed by a processor, implements a method as described above.
Referring to fig. 8, a computer program product 800 provided by some embodiments of the invention is shown, comprising a computer program which, when executed by a processor, implements a method as described above.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that relational terms such as "first" and "second" are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises the element.
The method and apparatus for metadata processing based on a distributed storage system have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the description of these examples is intended only to aid understanding of the method and its core ideas. Those of ordinary skill in the art may, in light of the ideas of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
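As a purely illustrative aid, the read path of the method (a hit in the target metadata partition, otherwise a fetch from the metadata pool followed by caching) and the weight-based update of the cache can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the implementation of the invention: the names `MetadataCache`, `Entry`, `get`, `weight` and `evict` are hypothetical, the use of a CRC32 of the key to select the target partition is an assumption, and the product of access frequency and cache hit rate is only one possible way of combining the two quantities into a weight, since the document does not fix a formula.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Entry:
    value: str
    accesses: int = 0  # lookups that targeted this key
    hits: int = 0      # lookups answered from the cache


class MetadataCache:
    """Hypothetical metadata cache layer container with several partitions."""

    def __init__(self, pool, num_partitions=4, interval_s=60.0):
        self.pool = pool                  # stand-in for the metadata pool
        self.partitions = [{} for _ in range(num_partitions)]
        self.interval_s = interval_s      # preset time interval, in seconds

    def _partition(self, key):
        # Assumption: a CRC32 of the key selects the target partition.
        index = zlib.crc32(key.encode()) % len(self.partitions)
        return self.partitions[index]

    def get(self, key):
        part = self._partition(key)
        entry = part.get(key)
        if entry is None:
            # Miss: fetch from the pool and cache in the target partition.
            entry = part[key] = Entry(self.pool[key])
        else:
            # Hit: the cached value is returned directly.
            entry.hits += 1
        entry.accesses += 1
        return entry.value

    def weight(self, entry):
        # Access frequency = accesses / preset interval.
        frequency = entry.accesses / self.interval_s
        # Cache hit rate = hits / accesses.
        hit_rate = entry.hits / entry.accesses if entry.accesses else 0.0
        # Assumed combination; the source does not specify the formula.
        return frequency * hit_rate

    def evict(self, count):
        # Sort by descending weight, then drop the last `count` entries.
        ranked = sorted(
            ((self.weight(e), k, p)
             for p in self.partitions for k, e in p.items()),
            key=lambda item: item[0],
            reverse=True,
        )
        victims = ranked[max(0, len(ranked) - count):]
        for _, key, part in victims:
            del part[key]
        return [key for _, key, _ in victims]
```

In this sketch an entry that was cached on a miss and never requested again has a hit rate of zero, so it sorts last and is the first to be removed by `evict`.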

Claims (18)

1. A metadata processing method based on a distributed storage system, characterized in that the distributed storage system is provided with a metadata cache layer container, and the method comprises:
caching part of the metadata in a metadata pool of the distributed storage system to the metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions;
when a target metadata partition of the metadata cache layer container hits target metadata corresponding to a metadata access request, returning the target metadata to a client;
when none of the plurality of metadata partitions of the metadata cache layer container hits the target metadata corresponding to the metadata access request, obtaining the target metadata corresponding to the metadata access request from the metadata pool, caching the target metadata in the target metadata partition of the metadata cache layer container, and returning the target metadata to the client;
wherein the distributed storage system is provided with a plurality of storage nodes, and the method further comprises:
for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request, wherein the second storage node is a node that backs up the metadata in the first storage node;
wherein performing the write operation on the first storage node and the second storage node of the distributed storage system according to the metadata write request comprises:
for the metadata write request, performing a write operation on the first storage node of the distributed storage system according to the metadata write request, and marking the metadata write request as being in an unconfirmed state;
performing a write operation on the second storage node of the distributed storage system according to the metadata write request;
after all the second storage nodes have completed the write operation, marking the metadata write request as being in a confirmed state, so as to complete the metadata write request.

2. The method according to claim 1, characterized by further comprising:
dynamically updating the metadata cached in the metadata cache layer container.

3. The method according to claim 2, characterized in that dynamically updating the metadata cached in the metadata cache layer container comprises:
determining weight information of the metadata cached in the metadata cache layer container, and dynamically updating the metadata cached in the metadata cache layer container according to the weight information.

4. The method according to claim 3, characterized in that determining the weight information of the metadata cached in the metadata cache layer container comprises:
determining an access frequency and a cache hit rate of the metadata cached in the metadata cache layer container;
determining the weight information of the metadata cached in the metadata cache layer container according to the access frequency and the cache hit rate.

5. The method according to claim 4, characterized in that determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container comprises:
obtaining a number of accesses to the metadata cached in the metadata cache layer container and a preset time interval;
determining the access frequency of the metadata cached in the metadata cache layer container according to the number of accesses and the preset time interval.

6. The method according to claim 4, characterized in that determining the access frequency and the cache hit rate of the metadata cached in the metadata cache layer container comprises:
obtaining a number of accesses and a number of cache hits of the metadata cached in the metadata cache layer container;
determining the cache hit rate of the metadata cached in the metadata cache layer container according to the number of accesses and the number of cache hits.

7. The method according to claim 3, characterized in that dynamically updating the metadata cached in the metadata cache layer container according to the weight information comprises:
sorting the metadata cached in the metadata cache layer container according to the weight information, wherein the weight of metadata sorted earlier is greater than the weight of metadata sorted later;
removing a preset number of the metadata sorted last from the metadata cache layer container.

8. The method according to claim 1, characterized in that, before performing the write operation on the second storage node of the distributed storage system according to the metadata write request, the method further comprises:
negotiating among a plurality of second storage nodes of the distributed storage system to determine an order of write operations among the plurality of second storage nodes.

9. The method according to claim 1, characterized by further comprising:
using a consistent hashing algorithm to divide the plurality of metadata partitions in the metadata cache layer container into a plurality of data fragments, and storing the data fragments on different storage nodes in the distributed storage system.

10. The method according to any one of claims 1 to 7, characterized by further comprising:
receiving, through a remote procedure call communication mechanism, the metadata access request sent by the client;
wherein returning the target metadata to the client comprises:
returning the target metadata to the client through the remote procedure call communication mechanism.

11. The method according to any one of claims 1 to 7, characterized in that the distributed storage system is provided with a plurality of storage nodes, and the method further comprises:
receiving, through a remote procedure call communication mechanism, a data read/write request sent by the client;
in response to the data read/write request, storing data in the form of objects in the plurality of storage nodes of the distributed storage system, or reading data from the plurality of storage nodes of the distributed storage system.

12. The method according to claim 11, characterized in that storing data in the form of objects in the plurality of storage nodes of the distributed storage system comprises:
slicing the data to be stored, and storing the sliced data in the form of objects in the plurality of storage nodes of the distributed storage system.

13. The method according to claim 1, characterized in that each metadata partition stores metadata in a tree data structure, the tree data structure comprises a plurality of tree data structure nodes, and the plurality of tree data structure nodes comprise a root node and leaf nodes;
a path from the root node to a leaf node corresponds to a string, and the time complexity of insertion and lookup operations in the tree data structure is related to the length of the string.

14. The method according to claim 1, characterized in that the distributed storage system provides file system services externally, and the client initiates the metadata access request in the form of accessing a file.

15. A metadata processing device based on a distributed storage system, characterized in that the distributed storage system is provided with a metadata cache layer container, and the device comprises:
a metadata cache module, configured to cache part of the metadata in a metadata pool of the distributed storage system to the metadata cache layer container, wherein the metadata cache layer container is provided with a plurality of metadata partitions;
a hit processing module, configured to return target metadata to a client when a target metadata partition of the metadata cache layer container hits the target metadata corresponding to a metadata access request;
a miss processing module, configured to, when none of the plurality of metadata partitions of the metadata cache layer container hits the target metadata corresponding to the metadata access request, obtain the target metadata corresponding to the metadata access request from the metadata pool, cache the target metadata in the target metadata partition of the metadata cache layer container, and return the target metadata to the client;
wherein the distributed storage system is provided with a plurality of storage nodes, and further comprising:
for a metadata write request, performing a write operation on a first storage node and a second storage node of the distributed storage system according to the metadata write request, wherein the second storage node is a node that backs up the metadata in the first storage node;
wherein performing the write operation on the first storage node and the second storage node of the distributed storage system according to the metadata write request comprises:
for the metadata write request, performing a write operation on the first storage node of the distributed storage system according to the metadata write request, and marking the metadata write request as being in an unconfirmed state;
performing a write operation on the second storage node of the distributed storage system according to the metadata write request;
after all the second storage nodes have completed the write operation, marking the metadata write request as being in a confirmed state, so as to complete the metadata write request.

16. An electronic device, characterized by comprising a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the method according to any one of claims 1 to 14.

17. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the method according to any one of claims 1 to 14.

18. A computer program product, characterized by comprising a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 14.
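For illustration only, the replicated metadata write recited in claims 1, 8 and 15 (write to the first storage node, mark the request unconfirmed, write to every second storage node, then mark the request confirmed) can be sketched in Python as follows. The sketch rests on assumptions and is not the claimed implementation: `StorageNode`, `WriteRequest`, `WriteState` and `replicated_write` are hypothetical names, the `healthy` flag exists only to simulate a backup that fails to write, and sorting the backups by name merely stands in for the negotiation of write order among the second storage nodes, whose actual protocol the document does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class WriteState(Enum):
    UNCONFIRMED = "unconfirmed"
    CONFIRMED = "confirmed"


class StorageNode:
    """Hypothetical storage node holding metadata in memory."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy   # assumption: simulates a failing node
        self.metadata = {}

    def write(self, key, value):
        # Returns True only when the write was applied.
        if not self.healthy:
            return False
        self.metadata[key] = value
        return True


@dataclass
class WriteRequest:
    key: str
    value: str
    state: WriteState = WriteState.UNCONFIRMED


def replicated_write(request, primary, backups):
    # Step 1: write to the first storage node, mark the request unconfirmed.
    if not primary.write(request.key, request.value):
        raise RuntimeError("write to the first storage node failed")
    request.state = WriteState.UNCONFIRMED

    # Step 2: write to every second (backup) storage node. Sorting by name
    # is a stand-in for the negotiated write order of claim 8.
    ordered = sorted(backups, key=lambda node: node.name)
    acks = [node.write(request.key, request.value) for node in ordered]

    # Step 3: confirm only after all backups have completed the write.
    if all(acks):
        request.state = WriteState.CONFIRMED
    return request.state
```

If any backup fails to apply the write, the request stays in the unconfirmed state in this sketch; how such a request is retried or rolled back is left open here because the claims do not address it.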
CN202411346256.8A 2024-09-26 2024-09-26 Metadata processing method and device based on distributed storage system Active CN118860993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411346256.8A CN118860993B (en) 2024-09-26 2024-09-26 Metadata processing method and device based on distributed storage system

Publications (2)

Publication Number Publication Date
CN118860993A CN118860993A (en) 2024-10-29
CN118860993B CN118860993B (en) 2025-03-07

Family

ID=93173851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411346256.8A Active CN118860993B (en) 2024-09-26 2024-09-26 Metadata processing method and device based on distributed storage system

Country Status (1)

Country Link
CN (1) CN118860993B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115170A (en) * 2020-09-18 2020-12-22 苏州浪潮智能科技有限公司 Metadata caching method, system, device and medium
CN117827787A (en) * 2023-12-13 2024-04-05 天翼云科技有限公司 A metadata management method and system applied to distributed file system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9792298B1 (en) * 2010-05-03 2017-10-17 Panzura, Inc. Managing metadata and data storage for a cloud controller in a distributed filesystem
CN102710790B (en) * 2012-06-20 2015-06-10 深圳市远行科技有限公司 Memcached implementation method and system based on metadata management
CN104317736B (en) * 2014-09-28 2017-09-01 曙光信息产业股份有限公司 A kind of distributed file system multi-level buffer implementation method
CN117033261A (en) * 2023-08-04 2023-11-10 深圳市腾讯计算机系统有限公司 Cache allocation method, device, equipment, medium and program product
CN117093158B (en) * 2023-10-17 2024-02-06 苏州元脑智能科技有限公司 Storage node, system and data processing method and device of distributed storage system

Also Published As

Publication number Publication date
CN118860993A (en) 2024-10-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant