Detailed Description
The embodiments of the present application will be described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
It should be noted that in the description of the present application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and the like in this specification are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The present application will be further described in detail below with reference to the drawings and detailed description for the purpose of enabling those skilled in the art to better understand the aspects of the present application.
The application environment architecture or hardware architecture upon which execution of the storage cluster configuration method depends is first described herein.
The method embodiments provided in the embodiments of the present application may be executed in a server apparatus or a similar computing device. Taking running on a server device as an example, Fig. 1 is a block diagram of a hardware structure for a method of configuring a storage cluster according to an embodiment of the present application. As shown in Fig. 1, the server device may include one or more processors 102 (only one is shown in Fig. 1; the processor 102 may include, but is not limited to, processing means such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those of ordinary skill in the art that the architecture shown in Fig. 1 is merely illustrative and is not intended to limit the architecture of the server apparatus described above. For example, the server device may also include more or fewer components than shown in Fig. 1, or have a different configuration than shown in Fig. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to the storage cluster configuration method in an embodiment of the present application, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-mentioned method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located with respect to the processor 102, which may be connected to the server device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a server device. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as a NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In the related art, the configuration methods of the storage clusters can be divided into three types:
1) In the early stage, a Ceph cluster of tens of nodes can still rely on the "empirical formulas" of senior engineers, namely: replica count = 3, PG count ≈ OSD count × 100, a cache tier with 10% of disks reserved for hot data. However, when the node count grows too large, the hardware is heterogeneous, and the traffic load alternates between peaks and valleys in a "tidal" manner, the empirical formulas begin to fail. To reduce read latency, engineers have to pore over near-impenetrable official documentation overnight, cross-reference manuals, mailing lists, and Redmine tickets, and then write or rewrite an Ansible Playbook. A seemingly simple requirement such as reducing the read P99 latency to 5 ms often goes through multiple iterations of document reading, script changing, canary (gray) release, rollback, and changing again, taking anywhere from days to weeks. More seriously, awareness of "best practices" varies from engineer to engineer, so that the same cluster accumulates multiple "dialect configurations" over the years, with a dramatic decrease in maintainability.
2) Configuration management tools such as Ansible, Puppet, Chef, and SaltStack arose in the "templated script" stage: experience is solidified into YAML/JSON templates, so that batch deployment and version control are realized. However, a template is by nature a "static contract" and must be updated synchronously once the underlying hardware or upper-layer traffic changes. For example, when a business team decides to migrate a cold-data pool originally running on SATA disks to an NVMe all-flash pool to support a real-time recommendation model, the operation and maintenance personnel need to ① re-evaluate the replica policy and EC ratio, ② rewrite the templates, ③ run two weeks of stress testing in a test environment, and ④ bring the change online via a gray release. The whole process is still measured in units of "weeks", which clearly cannot satisfy the "minute-scale scaling, hour-scale rollout" emphasized in the cloud-native era.
3) In recent years, academia and industry have begun to try to find the optimal configuration by machine learning. For example, some studies search for IOPS maxima in a 20-dimensional parameter space using Bayesian optimization, and vendors train regression models on APM data to predict the effect of "tuning pg_num from 1024 to 2048" on latency. However, these schemes generally have three pain points:
The objective is single: most only pay attention to throughput or latency and cannot simultaneously account for multiple objectives such as cost, reliability, and power consumption;
The scenario is closed: training data often comes from a particular storage product on a particular set of hardware, and the model must be retrained whenever disks or the network are changed;
Transaction guarantees are lacking: the "suggested parameters" given by the AI still need to be applied to the production environment manually, and if a performance regression occurs, the rollback operation requires manual intervention, so the risk is uncontrollable.
Further, with the popularity of hybrid clouds and edge clouds, the topology of storage clusters exhibits the ultra-high complexity of "multi-cloud multi-region + multiple hardware generations + multi-traffic-load". Traditional scripts and single-point AI tools cannot bridge the gap between brand differences and protocol differences: the RESTful API of Ceph, the mc CLI of MinIO, the lctl of Lustre, and the dfsadmin of HDFS each go their own way, so the same optimization logic is forced to be re-implemented repeatedly in various scripts, and the maintenance cost grows exponentially.
In summary, although the related art covers the evolution path of "manual experience → templated scripts → single-point intelligence", it still remains in a two-layer "human-machine" interaction paradigm: humans are responsible for understanding the business, writing rules, and executing commands, while machines are responsible for repetitive labor. When business demands change instantaneously, cluster sizes reach tens of thousands of nodes, and parameter dimensions number in the tens of thousands, this paradigm approaches the ceiling of human effort. The industry therefore needs a completely new configuration paradigm that enables "business-language input, zero awareness across brands, and automatic rollback".
In order to solve the above-mentioned problems, a storage cluster configuration method is provided in this embodiment, which is applied to a large language model (i.e., the execution subject of the following steps is a large language model). Fig. 2 is a flowchart of a storage cluster configuration method according to an embodiment of the present application. As shown in Fig. 2, the method includes the following steps S202-S206:
step S202, acquiring a target natural language, wherein the target natural language is used for indicating the configuration of a storage cluster;
It should be noted that the user may input a single sentence of business-level requirements on a unified Web or command-line interface, from which the large language model obtains the target natural language. Illustratively, the target natural language is "ensure that the 5-node Ceph cluster reduces the online order pool read P99 latency to within 5 ms within 1 minute".
Step S204, generating a corresponding configuration strategy according to the target natural language;
Step S206, sending the configuration strategy to the storage cluster through the model context protocol to instruct the storage cluster to execute the corresponding configuration operation according to the configuration strategy.
According to the above steps, the LLM can directly parse the natural language instruction and automatically understand the user's configuration intent, so the user does not need to master complex configuration syntax or deeply understand the working principles of the storage system; the configuration threshold is greatly lowered and configuration efficiency is improved. Second, MCP serves as a unified communication protocol that allows the LLM to interact with different types of storage clusters in a standardized manner: whether the cluster is Ceph, HDFS, or another storage solution, the configuration policy can be issued through MCP, which avoids the cumbersome work of separately writing management scripts for each storage system and achieves cross-brand, cross-protocol configuration consistency. In addition, the application automates the configuration process: from natural language parsing to configuration policy generation and then to policy execution, the whole flow is completed automatically in a short time, which greatly improves the response speed and accuracy of configuration and solves the problems of delay and error caused by manual configuration. By this method, a user can rapidly and accurately configure storage clusters without manually writing scripts or manually adjusting parameters, so that configuration efficiency and the flexibility of cluster management are greatly improved, and the problem that storage clusters cannot be configured efficiently is solved.
In an exemplary embodiment, the above generation of the corresponding configuration policy according to the target natural language may be implemented according to the following steps S11-S14:
step S11, generating a structured intention object corresponding to the target natural language;
In one exemplary embodiment, the structured intent object has three fields: a first field for indicating a traffic demand type, a second field for indicating a quantization constraint corresponding to the traffic demand type, and a third field for indicating an arbitration level when there is a conflict between traffic demand types or quantization constraints.
For better understanding, the structured intent is specifically described as I = {goalType, constraintSet, priority}, and Table 1 below describes goalType (the first field), constraintSet (the second field), and priority (the third field):
TABLE 1
For a better understanding, the following detailed description is given:
goalType: specifies the type of traffic demand that the user wishes the system to optimize or meet, such as latency reduction, throughput improvement, cost reduction, reliability enhancement (resilience), or mixed workload processing (mixed). goalType serves as an index for the system to retrieve best-practice knowledge segments, helping to narrow the search scope and more quickly find configuration practices that match the user's needs.
constraintSet: specific quantization limits or requirements for the goalType, typically expressed as key-value pairs. For example, the user may require that the P99 latency of read operations not exceed 5 ms ({"readP99": 5, "unit": "ms"}) or that the write bandwidth reach 2 GB/s ({"writeBW": "2GB/s"}). The constraintSet, combined with the goalType, gives the system the details it needs to understand the user's requirements; it is used to constrain the range of selectable configuration parameters, ensuring that the final configuration policy can meet the user's performance indicators and business objectives.
priority: the resolution priority when there is a conflict between the user's demands or constraints. priority indicates which constraints or objectives should take precedence in multi-objective optimization. For example, if there is a trade-off between low latency and low cost, priority may indicate that the system should prefer to meet the low-latency requirement (high), even though this may result in increased cost.
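For illustration, the structured intent object I = {goalType, constraintSet, priority} may be sketched in Python as follows; the function name build_intent and its validation logic are illustrative assumptions, not part of the present application:

```python
# Illustrative sketch of the structured intent object I = {goalType, constraintSet, priority}.
# build_intent and the allowed value sets are assumptions for illustration only.

def build_intent(goal_type, constraints, priority="medium"):
    """Assemble a structured intent object from already-extracted fields."""
    if goal_type not in {"latency", "throughput", "cost", "resilience", "mixed"}:
        raise ValueError("unknown goalType: " + goal_type)
    if priority not in {"high", "medium", "low"}:
        raise ValueError("unknown priority: " + priority)
    return {"goalType": goal_type, "constraintSet": constraints, "priority": priority}

# Intent for: "reduce the online order pool read P99 latency to within 5 ms"
intent = build_intent("latency", {"readP99": 5, "unit": "ms"}, priority="high")
```

The high priority here encodes that, under a latency/cost conflict, the low-latency constraint wins.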
It should be noted that, the structuring intention effectively enhances the generation precision and applicability of the configuration policy through the setting of three key fields (service requirement type, quantization constraint and arbitration level). Specifically, the service requirement type field ensures that the configuration policy corresponds closely to the actual service target, and the quantization constraint field converts the fuzzy performance requirement into a definite parameter threshold, so that the configuration adjustment is dependent. In addition, the arbitration level field provides a decision basis when facing multi-objective conflict, ensures that the service requirement with higher priority is met, and avoids overall performance reduction or invalid configuration caused by mutual restriction among objectives. The mechanism remarkably improves the intelligent degree of configuration, so that the storage cluster can be better adapted to complex and changeable service environments, high performance is guaranteed, and meanwhile, the stability and reliability of the system are maintained.
In one exemplary embodiment, field names, value types, and units in the structured intent object are all statically defined in the interface definition language of the model context protocol.
It should be noted that by statically defining field names, value types, and units in the interface definition language of the Model Context Protocol (MCP), all subsequent processing modules (e.g., management components of the storage cluster) may know in advance the exact format and meaning of these fields without additional parsing or format conversion work at runtime. This not only speeds up the execution of the configuration flow, but also eliminates the possibility of parsing errors, enabling configuration policies generated by large language models to be quickly and accurately executed by the storage clusters.
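As an illustrative analogue of such static definition (the actual interface definition language of MCP is not reproduced here), field names, value types, and units can all be fixed at definition time, for example with Python dataclasses, so that consumers need no runtime parsing:

```python
from dataclasses import dataclass

# Illustrative analogue only: fields, types and units are declared statically,
# mirroring the idea of the MCP interface definition language described above.

@dataclass(frozen=True)
class Constraint:
    metric: str   # e.g. "readP99"; the field name is fixed at definition time
    value: float  # numeric threshold
    unit: str     # e.g. "ms"; the unit is declared, not inferred at runtime

@dataclass(frozen=True)
class Intent:
    goalType: str          # latency / throughput / cost / resilience / mixed
    constraintSet: tuple   # tuple of Constraint
    priority: str          # "high" | "medium" | "low"

i = Intent("latency", (Constraint("readP99", 5.0, "ms"),), "high")
```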
Step S12, acquiring current state information of a storage cluster through a query tool in a model context protocol;
it should be noted that the query tool is a service endpoint in a model context protocol (Model Context Protocol, abbreviated as MCP), which allows a large language model (Large Language Model, abbreviated as LLM) to query the state information of the storage clusters.
In one exemplary embodiment, the current state information includes the number of nodes of the storage cluster, the disk type, network topology, load characteristics, and alarms corresponding to the storage cluster.
Disk type: the type of storage medium used in the storage cluster, such as SSD, HDD, or NVMe; this affects data read-write performance and storage cost.
Network topology: the connection mode among nodes in the cluster, including the network structure, node positions (e.g., rack awareness), and the like; it has an important influence on data transmission rate and redundancy strategies.
Load characteristics: the type and characteristics of the workload being processed by the cluster, such as read-write ratio, data size distribution, and service request frequency; these are important for optimizing performance and resource allocation.
Alarms: any potential problems or warning information in the cluster, such as disk failures, increased network latency, excessive resource utilization, or configuration anomalies; these are the basis for immediate adjustment of the configuration policy.
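A minimal sketch of invoking the query tool of step S12 and the context object C it returns; mcp_call, the URI, and the canned values are hypothetical stand-ins, not a defined MCP API:

```python
# Hypothetical sketch: mcp_call stands in for an MCP request; in a real system
# it would reach the query-tool service endpoint of the storage cluster.

def mcp_call(uri, payload=None):
    # Canned context object C with the five kinds of state information above.
    return {
        "nodeCount": 5,
        "diskType": "NVMe",
        "networkTopology": {"racks": 2, "rackAware": True},
        "loadProfile": {"readWriteRatio": "7:3", "avgObjectKB": 64},
        "alarms": [],
    }

context = mcp_call("mcp://cluster/query/state")
```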
Step S13, acquiring a target knowledge segment from a prompt registry of the model context protocol, wherein the target knowledge segment carries configuration guidance information of the storage cluster;
In an exemplary embodiment, the target knowledge segment includes a configuration template, a parameter recommendation, an adjustable range, and a benefit description, wherein the configuration template is a predefined set of parameters used for rapidly deploying or adjusting the configuration of the storage cluster; the parameter recommendation is the optimization suggestion and parameter selection corresponding to a specific type or model of hardware, a software version, and a specific business requirement; the adjustable range is used for indicating the effective adjustment interval of each configuration parameter in the storage cluster; and the benefit description is used for indicating the change or optimization effect on indicators caused by adjusting a specific parameter in different scenarios, the indicators including at least one of performance, cost, or power consumption.
It should be noted that the target knowledge segments constitute a comprehensive, flexible, and instructive best-practice library. A target knowledge segment covers a configuration template, a parameter recommendation, an adjustable range, and a benefit description, providing multi-dimensional optimization suggestions for the configuration of the storage cluster. This design ensures that configuration adjustment is quick and accurate, effectively guides the reasonable selection of parameters, and avoids the performance regressions that blind tuning might cause.
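For illustration, a target knowledge segment with the four parts described above might look as follows; pg_num and osd_memory_target are real Ceph option names, but the values, the segment id, and the expected benefit are assumptions:

```python
# Illustrative target knowledge segment with the four parts described above.
# Option names are Ceph options; values and expected benefit are assumptions.

knowledge_segment = {
    "id": "ceph-nvme-read-latency",
    "configTemplate": {"pg_num": 1024, "osd_memory_target": 8589934592},
    "parameterRecommendation": "On NVMe pools, raise osd_memory_target before pg_num.",
    "adjustableRange": {"pg_num": [256, 4096]},
    "benefit": {"metric": "readP99", "expectedChange": "-30%",
                "scenario": "read-heavy workload"},
}
```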
And step S14, generating the configuration policy according to the structured intent object, the current state information, and the target knowledge segment.
In an exemplary embodiment, the above step S14 can be implemented by utilizing the semantic understanding and reasoning capabilities of the large language model: the configuration policy is generated by performing semantic matching, relevance ranking, and multi-objective trade-offs on the target knowledge segments according to the structured intent object and the current state information.
That is, in this embodiment, the LLM directly reads all available best-practice knowledge segments k_i (the target knowledge segments) from the Prompt Registry, then takes the intent object I and the context object C (i.e., the current state information) as input, performs semantic matching, relevance ranking, and multi-objective balancing on the knowledge segments by using the semantic understanding and reasoning capabilities of the large language model, independently completes the multi-objective optimization calculation without any additional boolean expression or external rule engine, and finally generates and outputs the configuration policy.
In the above step S11, the large language model converts the target natural language instruction input by the user into the structured intent object, which defines the configuration objectives and constraint conditions, thereby greatly simplifying the processing steps. Step S12 rapidly obtains the real-time status of the storage cluster, including key information such as hardware configuration, network topology, and traffic load, through a query tool in MCP, which provides the necessary context for generating the policy. Step S13 retrieves, from the prompt registry of MCP, knowledge segments matching the cluster status and configuration requirements; these contain the best practices and configuration guidelines of the storage cluster, which are an important basis for generating the configuration policy. Step S14 is the process of integrating the above information and generating a specific configuration policy through multi-objective optimization.
The user can generate the configuration strategy conforming to the business target, the cluster state and the best practice with the help of the large language model only by expressing the configuration requirement by using natural language without deep knowledge of the storage technology details.
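One way the inputs of step S14 might be combined is to assemble I, C, and the candidate knowledge segments into a single prompt handed to the LLM; build_policy_prompt and its wording are illustrative assumptions, and no specific model API is implied:

```python
import json

# Sketch of assembling intent object I, context object C and knowledge segments
# k_i into one prompt for the LLM to perform matching, ranking and trade-offs.

def build_policy_prompt(intent, context, segments):
    return (
        "Given the intent, cluster context and best-practice segments below, "
        "perform semantic matching, relevance ranking and multi-objective "
        "trade-offs, then output a configuration policy "
        "P = {parameterMap, actionSequence} as JSON.\n"
        "Intent: " + json.dumps(intent) + "\n"
        "Context: " + json.dumps(context) + "\n"
        "Segments: " + json.dumps(segments) + "\n"
    )

prompt = build_policy_prompt(
    {"goalType": "latency", "constraintSet": {"readP99": 5, "unit": "ms"}},
    {"diskType": "NVMe", "nodeCount": 5},
    [{"id": "ceph-nvme-read-latency"}],
)
```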
In one exemplary embodiment, the configuration policy includes a key-value parameter table including parameters to be configured in the storage cluster and corresponding parameter values, and an ordered action list including a plurality of configuration change operations and execution order information of the plurality of configuration change operations.
For a better understanding, Table 2 below illustrates the key-value parameter table parameterMap in the configuration policy P = {parameterMap, actionSequence}, and Table 3 below illustrates the ordered action list actionSequence:
TABLE 2
TABLE 3
It should be noted that the combination of the key-value parameter table and the ordered action list in the configuration policy provides an accurate and ordered parameter adjustment scheme for the storage cluster. The key-value parameter table defines the specific parameters to be adjusted and their target values, ensuring the clarity and executability of configuration instructions, while the ordered action list defines the execution order of multiple change operations, so that dependency relationships and possible interactions among configuration parameters are taken into account; the consistency of configuration adjustment and the atomicity of the transaction are thereby ensured, and configuration problems caused by an improper operation order are avoided. This design greatly simplifies the workflow of operation and maintenance personnel, reduces the risk of configuration failure, and improves the efficiency and stability of storage cluster configuration.
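For illustration, a configuration policy P = {parameterMap, actionSequence} might be shaped as follows; the option names are Ceph-style, but the values and the action verbs (set_param, wait_rebalance) are assumptions:

```python
# Illustrative configuration policy P = {parameterMap, actionSequence}.
# Option names are Ceph-style; values and action verbs are assumptions.

policy = {
    "parameterMap": {"pg_num": 2048, "osd_recovery_max_active": 1},
    "actionSequence": [
        {"order": 1, "op": "set_param", "target": "pool:orders", "key": "pg_num"},
        {"order": 2, "op": "set_param", "target": "osd.*",
         "key": "osd_recovery_max_active"},
        {"order": 3, "op": "wait_rebalance", "timeoutSec": 600},
    ],
}

# Change operations must be applied strictly in the declared order.
ordered_ops = [a["op"] for a in sorted(policy["actionSequence"],
                                       key=lambda a: a["order"])]
```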
In one exemplary embodiment, sending the configuration policy to the storage cluster via the model context protocol includes packaging the configuration policy into a global transaction via a commit tool in the model context protocol, wherein the commit tool is the tool in the model context protocol responsible for performing the configuration policy issue, and sending the global transaction atomically to the storage cluster.
It should be noted that, in the present application, the transmission and execution of the configuration policy are innovatively integrated into one atomic operation, implemented by a dedicated commit tool (Tool_C) in the Model Context Protocol (MCP). Tool_C is not just a simple messenger; it acts as a constructor and coordinator, encapsulating the complex configuration policy P = {parameterMap, actionSequence} output by the Large Language Model (LLM) into a global transaction with a unique transaction ID (tid).
It should be noted that, in this embodiment, the atomic transaction sending mechanism of the MCP protocol remarkably enhances the reliability and consistency of configuration changes, simplifies the multi-step configuration flow, and accelerates configuration.
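The commit step can be sketched as follows; commit_policy is a hypothetical stand-in for Tool_C, and only the tid generation reflects the described behavior (a 128-bit UUID returned to the LLM):

```python
import uuid

# Sketch of the commit tool Tool_C: the policy is wrapped into one global
# transaction identified by a 128-bit UUID tid. commit_policy is hypothetical.

def commit_policy(policy):
    tid = str(uuid.uuid4())  # 128-bit global transaction ID
    transaction = {"tid": tid, "policy": policy, "state": "PENDING"}
    # ... here the transaction would be shipped atomically to the cluster ...
    return tid

tid = commit_policy({"parameterMap": {}, "actionSequence": []})
```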
In one exemplary embodiment, after the configuration policy is sent to the storage cluster by the commit tool in the model context protocol, the method further includes obtaining a transaction identifier of a global transaction returned by the commit tool, wherein the transaction identifier is an identifier generated by the commit tool after packaging the configuration policy into one global transaction, and invoking a validation tool in the model context protocol to perform atomic validation on the storage cluster based on the transaction identifier to validate a state of the storage cluster after performing the corresponding configuration operation.
It should be noted that, in this embodiment, after the configuration policy is submitted to the storage cluster through the Model Context Protocol (MCP), the verification phase is entered. The MCP commit Tool (tool_C) is not only responsible for the issuing of configuration policies, but also returns a globally unique transaction identifier (tid) after a transaction is successfully committed. This tid, as an identity card for the transaction, will accompany the overall configuration execution and verification process, becoming an important clue to track and audit the configuration change history. Subsequently, the Large Language Model (LLM) invokes the validation Tool (tool_v) in the MCP based on this tid to perform comprehensive atomic validation on the current state of the storage cluster, ensuring proper execution of the configuration operation and achievement of the intended effect.
It should be noted that the application uses the globally unique transaction identifier (tid) to perform atomic validation, which ensures the validity and consistency of storage cluster configuration changes, strengthens the stability and safety of the storage cluster, optimizes the operation and maintenance audit flow, and markedly improves the efficiency and intelligence level of configuration management, bringing real-time validation and instant response to distributed storage operation and maintenance and realizing accurate, controllable configuration management.
In one exemplary embodiment, the validation tool in the model context protocol is invoked to perform atomic validation on the storage clusters based on the transaction identifier, including consistency validation on the storage clusters, health detection on core components in the storage clusters, performance testing and stress testing on the storage clusters, and determining whether the storage clusters are in an expected state.
It should be noted that atomic validation is a key step in ensuring that the storage cluster state meets expectations after a configuration change. Specifically, based on the transaction identifier (tid), the validation tool (Tool_V) in the Model Context Protocol (MCP) performs a series of deep checks: first, consistency validation confirms that the configuration states of all cluster nodes are consistent and that the cluster cooperates as a whole; second, health detection checks the states of core components such as OSD, MON, and MGR, ensuring that these key parts are free from anomalies and maintaining the high availability of the cluster; finally, performance testing and stress testing evaluate the real-time performance and potential load-bearing capacity of the modified cluster, ensuring that key indicators such as read-write performance and latency meet the expected targets. This series of validations is carried out atomically under the guidance of the tid, i.e., either all validations pass or no change takes effect, thereby ensuring the stability and safety of the cluster configuration.
In this embodiment, the atomic validation mechanism ensures the safety and effectiveness of storage cluster configuration changes through consistency checks, health monitoring, and performance testing; it feeds back validation results in time, simplifies the audit flow, remarkably enhances the stable operation and efficient performance of the storage cluster, improves the transparency and controllability of storage cluster management, and realizes comprehensive quality control of configuration changes.
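The three checks and their all-or-nothing result can be sketched as follows; the check functions are stubs standing in for real cluster queries, and the OK/FAIL result shape is an illustrative assumption:

```python
# Sketch of the validation tool Tool_V: three checks run under one tid and the
# result is all-or-nothing. The check functions are stubs for illustration.

def check_consistency(tid):
    return True  # all nodes persisted the same configuration version

def check_health(tid):
    return True  # no OSD / MON / MGR in DOWN or DEGRADED state

def check_performance(tid):
    return True  # short stress test meets the constraintSet

def validate(tid):
    for name, check in (("consistency", check_consistency),
                        ("health", check_health),
                        ("performance", check_performance)):
        if not check(tid):
            return {"status": "FAIL", "reason": name}
    return {"status": "OK"}

result = validate("demo-tid")
```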
In one exemplary embodiment, after the validation tool in the model context protocol is invoked based on the transaction identifier to perform atomic validation on the storage cluster, the method further includes: invoking the commit semantics of the commit tool to make the configuration change of the storage cluster take effect when the atomic validation of the storage cluster passes; and invoking the rollback semantics of the commit tool to return the state of the storage cluster to the state before the transaction identifier was generated when the atomic validation of the storage cluster fails.
It should be noted that, in this embodiment, after the atomic validation passes, the large language model further invokes the commit semantics of the MCP commit tool to formally mark the configuration change as effective; at this point the change is consistently accepted and persisted by all nodes of the cluster, ensuring the integrity of the configuration adjustment and the expected consistency of the cluster state. Conversely, when the validation fails, the large language model immediately invokes the rollback semantics of the commit tool, quickly returning the state of the storage cluster to the baseline before the change and eliminating, through reverse operations, the influence of the configuration that failed validation, thereby protecting the normal operation of the original cluster and avoiding business risks caused by unexpected states. This mechanism ensures that each configuration adjustment is performed under strict verification and control, providing a solid foundation for efficient management and operation of the storage cluster.
In order to better understand the foregoing configuration policy issuing and verifying process, the following details are described below:
The configuration policy P is committed by the LLM to the target storage cluster in one pass through Tool_C of MCP (i.e., the commit tool described above). Tool_C adopts a transaction barrier mechanism: after the interface is invoked and the request payload is received by the cluster side, the idempotent operations are executed in actionSequence order. Tool_C generates a global transaction ID tid (a 128-bit UUID) for the current change and immediately returns the tid to the LLM as the unique identifier for subsequent validation, rollback, or audit.
After receiving the tid, the LLM does not directly commit, but performs a three-part verification through MCP Tool_V, which specifically includes the following:
Consistency check: confirming that all nodes have received and persisted the same configuration version.
Health detection: checking that core components such as OSD, MON, and MGR have no DOWN or DEGRADED status.
Performance bypass stress testing (optional): 5-10 s of random read-write is executed from an isolated client to confirm whether the P99 latency and IOPS meet I.constraintSet.
Note that if all three items pass, Tool_V returns status = OK; if any item fails, it returns status = FAIL together with the failure cause.
After Tool_C returns the global transaction ID tid, the LLM performs atomic validation of the configuration through MCP Tool_V. If the validation passes, the LLM invokes the commit semantics of MCP Tool_C to mark the configuration as effective in one pass and synchronously persist the audit log; if the validation fails, the LLM invokes the rollback semantics of MCP Tool_C to cancel all issued changes in the storage cluster, so that the cluster is rolled back to the state before the tid was generated. The whole process is threaded through by the tid, which ensures idempotency, traceability, and auditability, so that the storage cluster is always in an expected state.
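The end-to-end commit → validate → commit-or-rollback decision can be sketched as follows; the commit and validate callables are injected stubs so both outcomes can be shown, and all names are illustrative:

```python
# End-to-end sketch of the tid-driven flow: commit -> validate -> commit or
# rollback. commit_fn stands for Tool_C, validate_fn for Tool_V.

def apply_with_transaction(policy, commit_fn, validate_fn):
    tid = commit_fn(policy)            # Tool_C returns the global transaction ID
    if validate_fn(tid)["status"] == "OK":
        return ("COMMITTED", tid)      # commit semantics: change takes effect
    return ("ROLLED_BACK", tid)        # rollback semantics: restore pre-tid state

ok = apply_with_transaction({}, lambda p: "tid-1", lambda t: {"status": "OK"})
bad = apply_with_transaction({}, lambda p: "tid-2", lambda t: {"status": "FAIL"})
```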
Therefore, a user does not need to understand the underlying technical details or brand differences; with only one sentence of natural language, the user can obtain, within seconds, an optimal storage cluster configuration that has undergone multi-objective optimization, transaction guarantees, and traceable rollback.
In one exemplary embodiment, the naming convention for the uniform resource identifier URI of the tools in the model context protocol is uniform, and the large language model invokes the tools in the model context protocol in a unified URI manner.
That is, the application can achieve brand decoupling: the URIs of all tools in the model context protocol follow the unified naming convention mcp://<function-domain>/<operation>/<resource>, so that the same set of LLM Prompts can migrate seamlessly among heterogeneous storage systems.
It should be noted that in this embodiment, the tools in the Model Context Protocol (MCP) follow a unified naming convention to form a set of standardized Uniform Resource Identifiers (URIs). This design allows a Large Language Model (LLM) to call different MCP tools in a consistent manner; whether querying the cluster state, executing a configuration change, or conducting transaction verification, the LLM can easily establish communication with the target tool through the predefined URI format, which reduces the invocation difficulty caused by syntax differences or non-uniform interfaces and improves the flexibility and efficiency of managing cross-brand storage clusters.
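A small sketch of the unified URI convention mcp://<function-domain>/<operation>/<resource> follows; the helper names `make_uri` and `parse_uri`, and the example domain/operation/resource values, are assumptions for illustration:

```python
def make_uri(domain, operation, resource):
    """Build a tool URI following mcp://<function-domain>/<operation>/<resource>."""
    return f"mcp://{domain}/{operation}/{resource}"

def parse_uri(uri):
    """Split a tool URI back into its three components."""
    scheme, rest = uri.split("://", 1)
    assert scheme == "mcp", "only the mcp scheme is accepted"
    domain, operation, resource = rest.split("/", 2)
    return {"domain": domain, "operation": operation, "resource": resource}

# Usage: the query tool for cluster status under the assumed convention.
query_uri = make_uri("cluster", "query", "status")
parts = parse_uri(query_uri)
```

Because every tool address decomposes the same way, the LLM can construct and dispatch calls without per-brand URI logic.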
In one exemplary embodiment, in the case of a new function in the storage cluster, configuration instructions or function descriptions related to the new function are added or updated in the prompt registry of the model context protocol, and an operation interface corresponding to the new function is registered in the tool registry of the model context protocol.
It should be noted that the present application can implement knowledge-capability double-layer decoupling: the Prompt Registry carries only "knowledge" (best practices, functional documents), and the Tool Registry carries only "capability" (read, write, verify, rollback). When a new functional module (such as compression or hierarchical caching) comes online in the cluster, only new Prompt fragments and Tools need to be added to the MCP registries, without restarting or retraining the LLM, and the new module takes effect immediately in the next round of the main flow.
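The double-layer decoupling can be sketched with two minimal registries; the class names, the "compression" example module, and the prompt text are hypothetical and serve only to show that bringing a new function online touches the registries, not the model:

```python
class PromptRegistry:
    """Carries only 'knowledge': best-practice fragments keyed by feature."""
    def __init__(self):
        self._fragments = {}
    def add(self, feature, fragment):
        self._fragments[feature] = fragment
    def get(self, feature):
        return self._fragments[feature]

class ToolRegistry:
    """Carries only 'capability': operation handlers keyed by tool URI."""
    def __init__(self):
        self._tools = {}
    def register(self, uri, handler):
        self._tools[uri] = handler
    def call(self, uri, *args):
        return self._tools[uri](*args)

prompts, tools = PromptRegistry(), ToolRegistry()

# Bringing a hypothetical "compression" module online requires only a
# Prompt fragment plus a Tool handler; no LLM restart or retraining.
prompts.add("compression", "zstd level 3 balances ratio and CPU cost")
tools.register("mcp://pool/set/compression", lambda algo: {"applied": algo})
result = tools.call("mcp://pool/set/compression", "zstd")
```

On the next round of the main flow the LLM simply sees one more knowledge fragment and one more callable URI.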
In one exemplary embodiment, the method further comprises obtaining current state information of the storage cluster through a query tool in the model context protocol, generating a repair policy according to the current state information in the event that a storage cluster failure is determined according to the current state information, and transmitting the repair policy to the storage cluster through the model context protocol.
That is, in this embodiment, when the LLM obtains real-time status information of the cluster, such as node status, disk utilization, and network status, through the query tool in the MCP and discovers signs of a fault, it can quickly analyze the cause of the fault and generate a targeted repair policy based on the rich knowledge in the prompt library. This policy is likewise submitted to the storage cluster as an MCP transaction, performing operations including, but not limited to, replacing failed nodes, adjusting data distribution, or optimizing network configuration, so as to enable automated diagnosis and recovery of faults. The method greatly reduces the burden on operation and maintenance personnel of troubleshooting and resolving faults, shortens the fault recovery time, and improves the self-healing capability and availability of the storage cluster.
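The self-healing loop can be sketched as follows; the `diagnose` function, the state dictionary layout, and the repair actions are illustrative assumptions rather than the application's actual fault taxonomy:

```python
def diagnose(state):
    """Return a repair policy (parameter table plus ordered actions) when
    the queried state indicates a fault, or None when the cluster is healthy."""
    down = [node for node, status in state["nodes"].items() if status == "DOWN"]
    if not down:
        return None  # no fault signs in the queried state
    return {
        "parameterMap": {"rebalance": True},
        "actionSequence": [{"op": "replace_node", "node": n} for n in down],
    }

# Usage: state as it might be returned by the MCP query tool.
policy = diagnose({"nodes": {"n1": "UP", "n2": "DOWN", "n3": "UP"}})
```

The resulting policy has the same parameterMap/actionSequence shape as a user-requested configuration policy, so it flows through the same transactional commit and verification path.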
In one exemplary embodiment, generating a corresponding configuration policy based on the target natural language includes determining identity information of a target object submitting the target natural language, determining a modification permission level of the target object to the storage cluster based on the identity information, and generating the corresponding configuration policy based on the modification permission level and the target natural language.
That is, in this embodiment, the step of generating the configuration policy incorporates a rigorous permission management and security control mechanism. When a user puts forward a configuration requirement in natural language, the LLM determines the identity information of the user, including the user's role, department affiliation, and the like, so as to evaluate the user's access and modification authority over the storage cluster. Based on this identity information, the LLM determines the user's permission level, which determines the scope and depth of the configuration changes the user may request. For example, an ordinary operator may be limited to adjusting read-write performance related parameters, while a senior administrator may have the authority to modify configurations relating to data redundancy and security. The LLM then, by means of its intelligent analysis capability, generates a configuration policy that accords with the user's intention without exceeding the user's authority, so that each configuration adjustment is both efficient and safe, avoiding the risks or data security problems that improper permission management might cause. This permission control mechanism not only enhances the security of the system but also provides a fairer, more reasonable, and more controllable storage cluster management environment for users in different roles.
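The example in the paragraph above (operators limited to performance parameters, senior administrators also allowed redundancy and security parameters) can be sketched as a permission gate applied before policy generation; the role-to-level mapping and the parameter classification are hypothetical:

```python
# Assumed mapping from role to permission level, and minimum level
# required to modify each parameter class.
ROLE_LEVELS = {"operator": 1, "senior_admin": 2}
PARAM_MIN_LEVEL = {"read_cache_size": 1, "replica_count": 2, "encryption": 2}

def authorize(role, requested_params):
    """Split a user's requested parameters into allowed and denied sets
    according to the user's permission level."""
    level = ROLE_LEVELS.get(role, 0)
    allowed = {p for p, required in PARAM_MIN_LEVEL.items() if required <= level}
    return {"allowed": [p for p in requested_params if p in allowed],
            "denied": [p for p in requested_params if p not in allowed]}

# Usage: the operator's replica_count request is filtered out before the
# configuration policy is generated; the senior administrator's is not.
op = authorize("operator", ["read_cache_size", "replica_count"])
admin = authorize("senior_admin", ["replica_count", "encryption"])
```

Filtering before generation, rather than after, means the produced policy can never contain an unauthorized change.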
In one exemplary embodiment, the method further includes obtaining historical operational data of the storage cluster, learning a load pattern and performance trend of the storage cluster based on the historical operational data, determining future load changes of the storage cluster, and generating a preventative configuration policy based on the future load changes, and transmitting the preventative configuration policy to the storage cluster via the MCP.
In this embodiment, future load challenges are also handled prospectively. By acquiring historical operating data of the storage cluster, such as read-write request frequency, block size distribution, and CPU and network utilization curves over the past week or month, the LLM analyzes and learns the load patterns and performance trends of the cluster using machine learning algorithms. Based on these historical insights, the LLM is able to predict the load changes that the storage cluster may face over a future period, such as during peak hours or special events, including potential performance bottlenecks or resource shortages. The LLM then automatically generates a set of preventive configuration policies based on the predicted future load, aiming to pre-adjust the parameter settings of the cluster to cope with the upcoming load surge and to ensure that the cluster remains stable and performant under high pressure. This set of policies is sent to the storage cluster as an MCP transaction and takes effect after atomic verification, thereby realizing proactive performance optimization based on historical data analysis, significantly improving the intelligent management and self-adaptation capability of the storage cluster, and reducing the passive response pressure on operation and maintenance personnel.
It will be apparent that the embodiments described above are merely some, but not all, embodiments of the application. For better understanding of the above method, the following description will explain the above process with reference to the examples, but is not intended to limit the technical solution of the embodiments of the present application, specifically:
The application provides the general goal of completing storage cluster configuration with one sentence, which essentially upgrades the traditional human-machine two-layer interaction (manually reading documents, manually executing commands) into a human-LLM-storage-cluster three-layer closed-loop interaction (natural language, LLM intent understanding and decision, automatic execution via the MCP protocol). The method specifically comprises the following steps:
(1) Natural language triggering: a user inputs a sentence of business-level requirements on a unified Web or CLI interface, such as "within 1 minute, ensure that the read P99 latency of the online order library on the 5-node Ceph cluster is within 5 ms". The input is not restricted to any technical terms, and Chinese, English, and mixed expressions are supported.
(2) Intent abstraction: the LLM converts the natural language into a structured intent object I through word segmentation, entity recognition, and semantic alignment. The object contains only three fields: goalType (target class), constraintSet (quantifiable constraint dictionary), and priority (conflict arbitration level). The field names, value types, and units are all statically defined in the MCP IDL, ensuring zero parsing cost for the subsequent modules.
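The three-field intent object I can be sketched as a small dataclass; the field names follow the step above, while the concrete example values (mapping the order-library requirement from step (1)) are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """Structured intent object I with exactly the three defined fields."""
    goalType: str                                       # target class
    constraintSet: dict = field(default_factory=dict)   # quantifiable constraints
    priority: int = 0                                   # conflict arbitration level

# "read P99 of the online order library within 5 ms on 5 nodes" might map to:
intent = Intent(goalType="latency",
                constraintSet={"read_p99_ms": 5, "nodes": 5},
                priority=1)
```

Keeping the object to exactly these statically defined fields is what lets downstream modules consume it without any parsing logic of their own.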
(3) Real-time context collection: the LLM calls MCP Tool_Q (i.e., the query tool described above) in the unified URI manner to pull four types of real-time data (cluster capacity, topology, load, and alarms) in one operation and packages them into a context object C. Tool_Q masks Ceph, MinIO, Lustre, HDFS, and other storage brands, so the LLM does not need to care about the underlying protocols.
(4) Policy calculation: the LLM loads all best-practice knowledge segments K_i from the Prompt Registry (each segment contains configuration templates, parameter recommendations, tunable scopes, and benefit descriptions). Taking I and C as input, the LLM invokes a built-in multi-objective optimizer to perform a Pareto-front search over dozens of dimensions such as replica count, erasure coding scheme, caching strategy, and compression algorithm, and finally generates a policy object P. P consists of a two-level structure of parameterMap (key-value parameter table) and actionSequence (ordered action list), which can be mapped directly to the RESTful API of the storage cluster.
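The two-level structure of policy object P, and its direct mapping onto RESTful calls, can be sketched as follows; the parameter names, the `to_rest_calls` helper, and the `/config/...` path scheme are assumptions for this example:

```python
# Policy object P: key-value parameter table plus ordered action list.
policy = {
    "parameterMap": {"replica_count": 3, "compression": "zstd"},
    "actionSequence": [
        {"op": "set_param", "key": "replica_count", "value": 3},
        {"op": "set_param", "key": "compression", "value": "zstd"},
    ],
}

def to_rest_calls(p):
    """Map each ordered action one-to-one onto a (method, path, body)
    RESTful triple, preserving actionSequence order."""
    return [("PUT", f"/config/{a['key']}", {"value": a["value"]})
            for a in p["actionSequence"]]

calls = to_rest_calls(policy)
```

Because the mapping is one-to-one and order-preserving, the same actionSequence drives both the RESTful issue path and the transactional execution in step (5).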
(5) Transactional execution and verification: the LLM issues P as an atomic transaction through MCP Tool_C and obtains the global transaction ID tid. Tool_C ensures that all actions are performed in actionSequence order and that a failure on any node fails the transaction as a whole. The LLM then invokes MCP Tool_V (i.e., the verification tool described above) to perform triple verification of consistency, health, and performance.
In addition, the application also provides two decoupling mechanisms:
1) Knowledge-capability double-layer decoupling: the Prompt Registry carries only "knowledge" (best practices, functional documents), and the Tool Registry carries only "capability" (read, write, verify, rollback). A new function only requires adding a Prompt and a Tool, without restarting the LLM or retraining the model.
2) Brand decoupling: the URIs of all Tools follow the unified naming convention mcp://<function-domain>/<operation>/<resource>, so that the same set of LLM Prompts can migrate seamlessly between heterogeneous storage systems.
It should be noted that, in the distributed storage field, this embodiment realizes end-to-end configuration automation of "natural language → verifiable transactions within seconds → zero cross-brand perception", which significantly lowers the operation and maintenance threshold and improves system stability.
For better understanding, fig. 3 is an overall flowchart of a storage cluster configuration method according to an embodiment of the present application. As shown in fig. 3, a user proposes a requirement in natural language; the large language model parses the intent object I, pulls the real-time cluster context C through the query tool of the model context protocol, retrieves best-practice Prompts from the Prompt Registry, and obtains the configuration policy P; the policy is issued as a transaction (tid) through the commit tool of the model context protocol, automatically committed or rolled back after verification by the verification tool of the model context protocol, and the final result is returned to the user.
For a better understanding, the following description is given in connection with specific examples:
A leading video platform needs to temporarily raise the concurrent read bandwidth of its "hot-spot playback pool" from 30 GB/s to 80 GB/s on the night of the World Cup final, while the read P99 latency must not exceed 100 ms. The operations staff has only a 2-minute window. Table 4 below illustrates the step-role-action-data list:
TABLE 4
It should be noted that the present application has the following advantages over the prior art:
1) LLM-driven zero-threshold configuration: relying on the deep semantic understanding of natural language by the large language model, a colloquial requirement stated by the user is parsed in real time into a machine-executable intent object, completely eliminating the high cognitive load of traditional manual document reading and script writing.
2) Cross-brand consistency carried by the MCP protocol: the Model Context Protocol (MCP) masks the differences among Ceph, MinIO, Lustre, HDFS, and other storage systems with a unified URI, a unified IDL, and unified transaction semantics, so the same combination of LLM Prompts and MCP Tools can run indiscriminately in multi-cloud, multi-hardware, multi-site environments, reducing maintenance workload by an order of magnitude.
3) Transaction-level reliability guaranteed by the MCP: MCP Tool_C and Tool_V implement "atomic issue, atomic verification, atomic rollback" through the global transaction ID tid, converting the LLM's recommended result into an online transaction with ACID properties; the rollback window is under 100 ms, raising reliability from "manual experience" to "protocol-level guarantee".
4) Continuous evolution capability driven jointly by the LLM and the MCP: a newly added storage function only requires adding knowledge to the Prompt Registry and registering an operation interface in the Tool Registry, without restarting the LLM or retraining the model, thereby achieving plug-in continuous evolution from knowledge to configuration to capability.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment.
The embodiment of the application also provides a configuration device of a storage cluster, which is applied to a large language model, and fig. 4 is a structural block diagram of the configuration device of the storage cluster according to the embodiment of the application, as shown in fig. 4, and the device comprises:
An obtaining module 402, configured to obtain a target natural language, where the target natural language is used to indicate configuration of a storage cluster;
A generating module 404, configured to generate a corresponding configuration policy according to the target natural language;
The configuration module 406 is configured to send the configuration policy to the storage cluster through the model context protocol, so as to instruct the storage cluster to execute the corresponding configuration operation according to the configuration policy.
According to the device, the LLM can directly parse natural language instructions and automatically understand the configuration intention of the user, so the user does not need to master complex configuration syntax or deeply understand the working principles of the storage system, which greatly lowers the configuration threshold and improves configuration efficiency. Secondly, the MCP serves as a unified communication protocol that allows the LLM to interact with different types of storage clusters in a standardized manner; whether the cluster is Ceph, HDFS, or another storage solution, the issuing of a configuration policy can be realized through the MCP, thereby avoiding the tedious work of separately writing management scripts for each storage system and achieving cross-brand, cross-protocol configuration consistency. In addition, the application automates the configuration process: from natural language parsing to configuration policy generation and then to policy execution, the whole process is completed automatically in a short time, which greatly improves the response speed and accuracy of configuration and solves the delay and error problems caused by manual configuration. By this method, a user can rapidly and accurately configure storage clusters without manually writing scripts or manually adjusting parameters, so that the configuration efficiency and the flexibility of cluster management are greatly improved, and the problem that storage clusters cannot be configured efficiently is solved.
In an exemplary embodiment, the generating module 404 is further configured to generate a structured intent object corresponding to the target natural language, obtain, by using a query tool in the model context protocol, current state information of the storage cluster, obtain, from a hint registry of the model context protocol, a target knowledge segment, where the target knowledge segment has configuration guidance information of the storage cluster, and generate a configuration policy according to the structured intent object, the current state information, and the target knowledge segment.
In one exemplary embodiment, the structured intent object contains values in three fields, the three fields including a first field for indicating a traffic demand type, a second field for indicating a quantization constraint corresponding to the traffic demand type, and a third field for indicating an arbitration level when there is a conflict between traffic demand types or quantization constraints.
In one exemplary embodiment, field names, value types, and units in the structured intent object are all statically defined in the interface definition language of the model context protocol.
In one exemplary embodiment, the current state information includes the number of nodes of the storage cluster, the disk type, network topology, load characteristics, and alarms corresponding to the storage cluster.
In an exemplary embodiment, the target knowledge segment comprises a configuration template, parameter recommendation, an adjustable range and benefit description, wherein the configuration template is a set of predefined parameter sets used for rapidly deploying or adjusting the configuration of the storage cluster, the parameter recommendation is optimization suggestion and parameter selection corresponding to specific types or models of hardware and software versions and specific business requirements, the adjustable range is used for indicating an effective adjustment interval of each configuration parameter in the storage cluster, the benefit description is used for indicating change or optimization effects on indexes caused by adjusting the specific parameters in different scenes, and the indexes comprise at least one of performance, cost or power consumption.
In an exemplary embodiment, the generating module 404 is further configured to generate the configuration policy by using semantic understanding and reasoning capabilities of the large language model, and performing semantic matching, relevance ranking and multi-objective trade-off on the target knowledge segments according to the structured intention objects and the current state information.
In an exemplary embodiment, the configuration module 406 is further configured to package the configuration policy into a global transaction through a commit tool in the model context protocol, where the commit tool is the tool in the model context protocol responsible for performing the issuing of the configuration policy, and to atomically send the global transaction to the storage cluster.
In an exemplary embodiment, the device further comprises a verification module, configured to obtain a transaction identifier of a global transaction returned by the commit tool after the configuration policy is sent to the storage cluster by the commit tool in the model context protocol, wherein the transaction identifier is an identifier generated after the commit tool packages the configuration policy into one global transaction, and call the verification tool in the model context protocol to perform atomic verification on the storage cluster based on the transaction identifier to verify a state of the storage cluster after the corresponding configuration operation is performed.
In an exemplary embodiment, the verification module is further configured to perform consistency verification on the storage cluster, perform health detection on a core component in the storage cluster, perform performance testing and stress testing on the storage cluster, and determine whether the storage cluster is in an expected state.
In an exemplary embodiment, the verification module is further configured to, after performing atomic verification on the storage cluster by using a verification tool in a context protocol of the model based on the transaction identifier, invoke commit semantics of a commit tool to determine that a configuration change of the storage cluster is effective if the atomic verification on the storage cluster passes, and invoke rollback semantics of the commit tool to return a state of the storage cluster to a state before the transaction identifier is generated if the atomic verification on the storage cluster fails.
In one exemplary embodiment, the naming convention for the uniform resource identifier URI of the tools in the model context protocol is uniform, and the large language model invokes the tools in the model context protocol in a unified URI manner.
In one exemplary embodiment, in the case of a new function in the storage cluster, configuration instructions or function descriptions related to the new function are added or updated in the prompt registry of the model context protocol, and an operation interface corresponding to the new function is registered in the tool registry of the model context protocol.
In an exemplary embodiment, the apparatus is further configured to obtain current state information of the storage cluster through a query tool in a model context protocol, generate a repair policy according to the current state information in case of determining a failure of the storage cluster according to the current state information, and send the repair policy to the storage cluster through the model context protocol.
In an exemplary embodiment, the generating module 404 is further configured to determine identity information of the target object submitting the target natural language, determine a modification authority level of the target object to the storage cluster according to the identity information, and generate a corresponding configuration policy according to the modification authority level and the target natural language.
In one exemplary embodiment, the configuration policy includes a key-value parameter table including parameters to be configured in the storage cluster and corresponding parameter values, and an ordered action list including a plurality of configuration change operations and execution order information of the plurality of configuration change operations.
The description of the features in the embodiment corresponding to the configuration device of the storage cluster may refer to the related description of the embodiment corresponding to the configuration method of the storage cluster, which is not described herein in detail.
An embodiment of the application also provides an electronic device comprising a memory in which a computer program is stored and a processor arranged to run the computer program to perform the steps of any of the above described embodiments of the configuration method of a storage cluster.
The embodiment of the application also provides a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the steps of any of the above-described embodiments of the configuration method of a storage cluster when run.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing a computer program.
The embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the embodiments of the configuration method of a storage cluster described above.
Embodiments of the present application also provide another computer program product comprising a non-volatile computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the storage cluster configuration method embodiments described above.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and the device for configuring the storage cluster, the electronic equipment, the storage medium and the computer program product provided by the application are described in detail. The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.