
CN119441443A - Automatic prompt construction method based on human-computer dialogue history and semantic retrieval - Google Patents


Info

Publication number
CN119441443A
CN119441443A (application CN202510037633.8A)
Authority
CN
China
Prior art keywords
semantic
vector
user
historical
sparse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510037633.8A
Other languages
Chinese (zh)
Other versions
CN119441443B (en)
Inventor
王雪芳
杨珍豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yian Tianxia Technology Co ltd
Original Assignee
Beijing Yian Tianxia Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yian Tianxia Technology Co ltd filed Critical Beijing Yian Tianxia Technology Co ltd
Priority to CN202510037633.8A priority Critical patent/CN119441443B/en
Publication of CN119441443A publication Critical patent/CN119441443A/en
Application granted granted Critical
Publication of CN119441443B publication Critical patent/CN119441443B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 16/3329: Natural language query formulation
    • G06F 16/3344: Query execution using natural language analysis
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 40/279: Recognition of textual entities
    • G06F 40/30: Semantic analysis
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention relates to the technical field of artificial intelligence and discloses an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval, comprising the following steps: S1, performing text preprocessing on the historical dialogue data and the user input, including dynamically generating query keywords, verifying and parsing user-uploaded files, and extracting their core content; S2, mapping the historical dialogue and the user request into semantic vectors with a pre-trained language model and matching them with keyword weights to construct a historical vector set and a user request vector; S3, extracting the context fragments most relevant to the user request from the historical dialogue vectors based on a convex optimization method. The sparse optimization technique accurately extracts context information highly relevant to the user request, and the dynamic Prompt template design organically integrates the background information, the user request, and the generation instruction, improving the semantic relevance and contextual consistency of the generated reply while avoiding interference from redundant and irrelevant information.

Description

Automatic Prompt construction method based on human-computer dialogue history and semantic retrieval
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval.
Background
With the continuous development of artificial intelligence, multi-turn dialogue systems have gradually become a core application in fields such as intelligent customer service and virtual assistants. Interacting efficiently with users through natural language understanding and generation is the main goal of these systems. However, in multi-turn dialogue scenarios, generating high-quality natural language replies from the dialogue history remains an open technical problem. Although existing methods have made some progress, they still show many limitations in handling complex contexts and generating accurate, consistent replies.
Conventional multi-turn dialogue systems typically rely on rule templates or semantic vector retrieval. Rule templates, although explicitly structured, cannot accommodate diverse user needs because of their static nature, and show large limitations especially when dealing with dynamic contexts and complex dialogue scenarios. Semantic vector retrieval can filter the dialogue history to some extent, but its relevance extraction in the high-dimensional semantic space is not accurate enough and easily introduces redundant or irrelevant content. These problems mean that the generated replies often lack contextual suitability and may even appear stiff and unnatural.
Furthermore, the dynamics and diversity of historical dialogues pose significant challenges for context handling in multi-turn dialogue systems. The system must quickly select the information most relevant to the current request from the dialogue history while avoiding redundancy and noise interference. However, simple linear combinations or retrieval methods often over- or under-select information, affecting the consistency and focus of the generated content. More importantly, the prior art lacks a dynamic control mechanism for generation quality and cannot optimize the accuracy and suitability of the generated content in real time.
Therefore, to solve these problems, the invention provides an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval, which solves the problem of how to accurately extract relevant context information and dynamically construct an optimal Prompt in a multi-turn dialogue system, so as to generate replies that are semantically consistent, relevant in content, and natural.
The invention is realized by the following technical scheme: the automatic Prompt construction method based on human-computer dialogue history and semantic retrieval comprises the following steps:
S1, performing text preprocessing on the historical dialogue data and the user input, including dynamically generating query keywords, verifying and parsing user-uploaded files, and extracting their core content;
S2, mapping the historical dialogue and the user request into semantic vectors with a pre-trained language model and matching them with keyword weights to construct a historical vector set and a user request vector;
S3, extracting the context fragments most relevant to the user request from the historical dialogue vectors based on a convex optimization method, combining them with high-relevance content captured by a web crawler, and screening the context through a sparse optimization problem;
S4, constructing a dynamic Prompt template and integrating the extracted context fragments, the user request, and the generation instruction to generate the Prompt;
S5, inputting the generated Prompt into a generative language model for processing, generating a natural language reply text;
S6, optimizing the sparse optimization parameters and the Prompt template design according to the generation quality evaluation results.
Preferably, the step S1 includes the following substeps:
S1.1, performing sentence segmentation and word segmentation on the historical dialogue data and the user input, filtering stop words, and performing part-of-speech tagging and dependency analysis;
S1.2, dynamically generating query keywords from the user's input content and screening highly relevant keywords by semantic similarity to ensure accurate retrieval;
S1.3, verifying the format of user-uploaded files (PDF, DOCX, and Markdown), cleaning irrelevant characters, and extracting the core content of each file for subsequent knowledge-base retrieval.
Preferably, the step S2 includes the following substeps:
S2.1, mapping the historical dialogue data into vector representations through a sentence embedding model to form a historical vector set $V$;
S2.2, mapping the user request text into a semantic vector $q$ and, combining dynamically generated keyword weights, calculating a relevance score based on word frequency and position weight to optimize the vectorization result;
S2.3, constructing the historical dialogue semantic vector set $V$ and the user request vector $q$.
Preferably, the process of extracting the context segment most relevant to the user request in the step S3 includes the following:
S3.1, the system simulates search engine behavior through a web crawler, captures web page content in real time, and extracts highly relevant text data;
S3.2, a sparse optimization problem is constructed with the objective function:

$$\min_{\alpha}\ \|q - V\alpha\|_2^2 + \lambda\|\alpha\|_1$$

wherein $q$ is the user request vector, $V$ is the matrix whose columns are the historical dialogue vectors, $\alpha$ is the sparse coefficient vector, and $\lambda$ is the sparse regularization parameter;
S3.3, the system has a built-in exception handling mechanism that supports automatic retries of network requests and records an exception log.
Preferably, the sparse optimization problem in step S3.2 is solved by the alternating direction method of multipliers (ADMM), specifically comprising:
Updating the sparse coefficient $\alpha$:

$$\alpha^{k+1} = \arg\min_{\alpha}\ \left\|q - \sum_{i=1}^{N}\alpha_i v_i\right\|_2^2 + \frac{\rho}{2}\left\|\alpha - z^k + \frac{\mu^k}{\rho}\right\|_2^2$$

wherein $\alpha^{k+1}$ is the sparse coefficient vector obtained in the $(k+1)$-th iteration, $q$ is the semantic vector representation of the user request, $v_i$ is the semantic vector representation of the $i$-th historical dialogue turn, $N$ is the total number of historical dialogue turns, $\sum_i \alpha_i v_i$ fits the user request vector as a linear combination of the historical dialogue vectors, $\|q - \sum_i \alpha_i v_i\|_2^2$ is the reconstruction error term, $\rho$ is the penalty factor, $\|\alpha - z^k + \mu^k/\rho\|_2^2$ is the constraint term, $z^k$ is the sparse variable updated in the $k$-th iteration, and $\mu^k$ is the Lagrange multiplier in the $k$-th iteration;
Updating the sparsity constraint variable $z$:

$$z^{k+1} = S_{\lambda/\rho}\!\left(\alpha^{k+1} + \frac{\mu^k}{\rho}\right)$$

wherein $z^{k+1}$ is the sparse variable updated in the $(k+1)$-th iteration, $S_{\lambda/\rho}(\cdot)$ is the soft-threshold function, $\alpha^{k+1}$ is the sparse coefficient vector optimized in the $(k+1)$-th iteration, $\mu^k$ is the Lagrange multiplier in the $k$-th iteration, $\lambda$ is the regularization parameter, and $\rho$ is the penalty factor;
Updating the Lagrange multiplier $\mu$:

$$\mu^{k+1} = \mu^k + \rho\left(\alpha^{k+1} - z^{k+1}\right)$$

wherein $\mu^{k+1}$ is the Lagrange multiplier updated in the $(k+1)$-th iteration, $\mu^k$ is the Lagrange multiplier in the $k$-th iteration, $\rho$ is the penalty factor, $\alpha^{k+1}$ is the sparse coefficient vector obtained in the $(k+1)$-th iteration, and $z^{k+1}$ is the sparse variable updated in the $(k+1)$-th iteration;
The iteration is repeated until the sparse coefficients satisfy a preset convergence condition.
Preferably, the sparse optimization result in step S3.2 is a sparse coefficient vector $\alpha^*$, from which the set $C$ of historical dialogue fragments with non-zero weights is screened:

$$C = \{\, h_i \mid \alpha_i^* \neq 0,\ i = 1, \dots, N \,\}$$

wherein $h_i$ is the $i$-th historical dialogue fragment and $C$ is the set of context fragments most relevant to the user request.
Preferably, the construction of the dynamic Prompt template in step S4 includes:
Background information extraction: sorting the extracted historical dialogue fragments by time order or semantic importance;
User request integration: embedding the request text currently input by the user into the template;
Instruction design: explicitly specifying the generation instruction content, including language, style, and length limits.
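The template construction above can be sketched as a small assembly function. This is a minimal illustration, not the patent's actual template: the section labels (`[Background]`, `[User Request]`, `[Instruction]`) and the function name `build_prompt` are assumptions.

```python
def build_prompt(context_fragments, user_request, instruction):
    """Assemble a Prompt from sorted context fragments, the user request,
    and an explicit generation instruction (sketch of step S4)."""
    # Background: fragments assumed already sorted by time order or semantic importance
    background = "\n".join(f"- {frag}" for frag in context_fragments)
    return (
        f"[Background]\n{background}\n\n"
        f"[User Request]\n{user_request}\n\n"
        f"[Instruction]\n{instruction}"
    )
```

Keeping the three sections explicit makes the instruction parameters (language, style, length) easy to vary per task and dialogue scenario.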
Preferably, the step S5 includes the following specific steps:
S5.1, inputting the generated Prompt into the generative language model and triggering the generation task;
S5.2, performing context consistency verification on the model output based on the historical dialogue fragments and the semantic retrieval results, and selecting the optimal output by computing the semantic relevance score between the generated text and the input Prompt;
S5.3, post-processing the model's output text according to a preset format, including automatic sentence breaking, punctuation adjustment, and weighted ordering of key content;
S5.4, evaluating the generation result, adjusting the Prompt content according to real-time feedback on generation quality, and re-executing the generation step until the output quality requirement is met.
Preferably, the generation quality evaluation in step S6 includes the following indexes:
Semantic relevance: computing the cosine similarity between the user request vector $q$ and the generated reply vector $r$;
Context continuity: evaluating the match between the generated text and a reference text through the BLEU and ROUGE-L indexes;
User satisfaction: subjective evaluation of the generated result based on user scores.
Preferably, the cosine similarity in step S6 is calculated as:

$$\text{sim}(q, r) = \frac{q \cdot r}{\|q\|_2\,\|r\|_2}$$

wherein $\text{sim}(q, r)$ represents the degree of semantic similarity between the user request vector $q$ and the generated reply vector $r$, $q \cdot r$ denotes the dot product of the two, $\|q\|_2$ is the Euclidean norm of the user request vector, and $\|r\|_2$ is the Euclidean norm of the generated reply vector.
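The cosine similarity used for the semantic relevance index can be computed directly; a minimal pure-Python sketch:

```python
from math import sqrt

def cosine_similarity(q, r):
    """sim(q, r) = (q . r) / (||q|| * ||r||) for two semantic vectors."""
    dot = sum(a * b for a, b in zip(q, r))    # dot product q . r
    norm_q = sqrt(sum(a * a for a in q))      # Euclidean norm of q
    norm_r = sqrt(sum(b * b for b in r))      # Euclidean norm of r
    return dot / (norm_q * norm_r)
```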
The invention provides an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval, with the following beneficial effects:
1. The method screens out the context information most relevant to the user request from the historical dialogue by constructing a sparse optimization problem and dynamically builds the Prompt input. The sparse representation can effectively select, in the high-dimensional vector space, the dialogue fragments that contribute most semantically to the generation task while eliminating interference from irrelevant or redundant information. Compared with conventional rule matching or simple vector retrieval, the semantic relevance and contextual suitability of the generated content are markedly improved. Combined with context management and semantic consistency evaluation, the generated replies are more natural and connect smoothly with the historical dialogue content, giving the method higher semantic accuracy and logical consistency in multi-turn dialogue scenarios.
2. The method integrates the screened context fragments, the user request, and the explicit generation instruction into a standardized input structure through dynamic Prompt template design. Convex optimization theory guarantees the global optimality of the context fragment selection, and the dynamic parameters of the generation instruction in the Prompt template precisely control the output of the generative model, so the generated content can be adjusted to different task requirements and dialogue scenarios.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, an embodiment of the invention provides an automatic Prompt construction method based on human-computer dialogue history and semantic retrieval, comprising the following steps:
S1, performing text preprocessing on the historical dialogue data and the user input, including dynamically generating query keywords, verifying and parsing user-uploaded files, and extracting their core content;
The step S1 comprises the following substeps:
S1.1, performing sentence segmentation and word segmentation on the historical dialogue data and the user input, filtering stop words, and performing part-of-speech tagging and dependency analysis;
S1.2, dynamically generating query keywords from the user's input content and screening highly relevant keywords by semantic similarity to ensure accurate retrieval;
S1.3, verifying the format of user-uploaded files (PDF, DOCX, and Markdown), cleaning irrelevant characters, and extracting the core content of each file for subsequent knowledge-base retrieval.
Specifically, in this embodiment, sentence segmentation and word segmentation are first performed on the historical dialogue data and the user input to ensure the accuracy of subsequent language processing. The specific steps are as follows:
The input text is split into sentences with a segmentation algorithm that combines rule-based splitting on punctuation (periods, question marks, commas) with an adaptive language model to dynamically adjust the splitting strategy. Each sentence is then segmented into words by a word segmentation model, and a part-of-speech tagger assigns a grammar tag to each word. A dependency parser builds a syntax tree to analyze the grammatical structure of each sentence, extracting subject-predicate-object relations and modifier components, which provides structural support for subsequent keyword generation.
Core words are extracted from the text input by the user, and a candidate keyword set is constructed by combining the historical dialogue data;
The semantic similarity of each candidate keyword is calculated according to:

$$\text{sim}(v_{k_i}, v_u) = \frac{v_{k_i} \cdot v_u}{\|v_{k_i}\|_2\,\|v_u\|_2}$$

wherein $v_{k_i}$ represents the semantic vector of keyword $k_i$, $v_u$ represents the semantic vector of the text entered by the user, and $\text{sim}(v_{k_i}, v_u)$ is the cosine similarity of the two;
Highly relevant keywords are screened based on the semantic similarity scores to construct a high-priority keyword set $K$;
The screened keyword set $K$ is applied in the semantic retrieval module to acquire context information highly relevant to the user request.
Whether the format of the file uploaded by the user is a supported type (PDF, DOCX, Markdown, etc.) is verified;
For PDF files, the file content is extracted with optical character recognition (OCR), including text in embedded images;
For DOCX files, the text content, tables, charts, and annotations are parsed, and format marks and redundant characters are removed;
For Markdown files, the structured content is parsed to extract titles, paragraphs, code blocks, and similar information;
All parsed text content is preprocessed, including removing irrelevant characters, repeated content, and meaningless stop words;
The preprocessed core text content is vectorized according to:

$$v_{\text{file}} = \frac{\sum_{i=1}^{m} w_i v_i}{\sum_{i=1}^{m} w_i}$$

wherein $v_{\text{file}}$ is the vector representation of the file's core content, $v_i$ is the semantic vector of the $i$-th important vocabulary item in the file, $w_i$ is its weight, and $m$ is the number of important words in the file.
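Assuming the file vector is the weight-normalized sum of the important-vocabulary vectors (the patent leaves the exact combination implicit), the vectorization step might look like:

```python
def file_content_vector(word_vectors, weights):
    """Combine the semantic vectors of a file's important words into one
    core-content vector via a weighted average (sketch; the averaging rule
    is an assumption)."""
    dim = len(word_vectors[0])
    total = sum(weights)
    return [
        sum(w * vec[d] for w, vec in zip(weights, word_vectors)) / total
        for d in range(dim)
    ]
```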
By preprocessing the historical dialogue data and the user input as above, this embodiment not only effectively extracts context information highly relevant to the user request but also makes full use of the external resources the user provides during file parsing, supplying high-quality input data for the subsequent Prompt construction. The combination of word segmentation, keyword generation, and file parsing also keeps the method general and adaptive across input data types and scenario requirements.
S2, mapping the historical dialogue and the user request into semantic vectors by utilizing a pre-training language model, and matching by combining keyword weights to construct a historical vector set and a user request vector;
the step S2 comprises the following substeps:
S2.1, mapping the historical dialogue data into vector representations through a sentence embedding model to form a historical vector set $V$;
S2.2, mapping the user request text into a semantic vector $q$ and, combining dynamically generated keyword weights, calculating a relevance score based on word frequency and position weight to optimize the vectorization result;
S2.3, constructing the historical dialogue semantic vector set $V$ and the user request vector $q$.
Specifically, in this embodiment, the construction of the historical vector set and the user request vector is completed by combining the pre-trained language model with keyword optimization, as follows:
In this embodiment, semantic vectorization is performed on the historical dialogue data through a sentence embedding model, with the following steps:
The system calls a pre-trained language model based on the Transformer architecture (such as BERT or Sentence-BERT), processes each turn in the historical dialogue data, and generates the corresponding sentence embedding vector;
The historical vector set is defined as $V = \{v_1, v_2, \dots, v_N\}$, wherein $v_i$ denotes the semantic vector of the $i$-th historical dialogue turn and $N$ is the number of historical dialogue turns;
The system normalizes each generated historical dialogue semantic vector according to:

$$\hat{v}_i = \frac{v_i}{\|v_i\|_2}$$

wherein $\hat{v}_i$ is the vector $v_i$ scaled to unit length, ensuring the scale consistency of the semantic vectors;
The normalized vector set $V$ is stored for semantic matching against the subsequent user request vector.
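The normalization step is a plain L2 rescaling; a one-function sketch:

```python
from math import sqrt

def l2_normalize(v):
    """Scale a semantic vector to unit Euclidean length so all history
    vectors share the same scale (step S2.1)."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v]
```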
In this embodiment, the system maps the user request text into a semantic vector and optimizes the vectorization result with keyword weights, as follows:
The system calls the generate_query_keywords function to dynamically generate a primary keyword set $K_0$ from the user request text $u$;
For each keyword $k_i$, a corresponding semantic vector $v_{k_i}$ is generated and its semantic similarity to the semantic vector $v_u$ of the user request text is calculated:

$$\text{sim}(k_i, u) = \frac{v_{k_i} \cdot v_u}{\|v_{k_i}\|_2\,\|v_u\|_2}$$

wherein $v_u$ represents the semantic vector of the user request text and $\text{sim}(k_i, u)$ represents the semantic similarity between keyword $k_i$ and the user request;
The system screens the keywords whose semantic similarity exceeds a threshold $\theta$ to form the highly relevant keyword set $K$;
The system calls the weighted_keyword_matching function to set a weight for each keyword in $K$, computing a comprehensive score from word frequency and position weight:

$$w_{k_i} = \frac{f_{k_i}}{\max(f)} \cdot \frac{p_{k_i}}{\max(p)}$$

wherein $f_{k_i}$ is the word frequency of keyword $k_i$ in the user request text, $p_{k_i}$ is the keyword's relative position weight, and $\max(f)$ and $\max(p)$ are the maxima of the word frequencies and position weights, used for normalization;
The semantic vector of the user request text is optimized with the keyword weights:

$$q' = q + \sum_{k_i \in K} w_{k_i} v_{k_i}$$

wherein $q'$ is the optimized user request semantic vector, $w_{k_i}$ is the weight of keyword $k_i$, and $v_{k_i}$ is the keyword's semantic vector.
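Under the assumption that the optimized request vector adds the weighted keyword vectors to the original embedding (one plausible reading of the optimization step), the adjustment can be sketched as:

```python
def optimize_request_vector(q, keyword_vectors, keyword_weights):
    """Shift the request embedding q toward high-weight keyword vectors
    (sketch of the weighted_keyword_matching idea; the additive rule is
    an assumption, not the patent's exact formula)."""
    out = list(q)
    for w, vec in zip(keyword_weights, keyword_vectors):
        for d in range(len(out)):
            out[d] += w * vec[d]   # add w_i * v_{k_i} to each dimension
    return out
```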
In this embodiment, the construction of the semantic vector set is completed by combining the historical dialogue semantic vector set $V$ with the optimized user request vector $q'$, with the following steps:
The historical vector set $V$ and the user request vector $q'$ are kept in the same semantic vector space;
The system calculates the semantic similarity between the user request vector $q'$ and each vector in the historical dialogue vector set $V$:

$$\text{sim}(q', v_i) = \frac{q' \cdot v_i}{\|q'\|_2\,\|v_i\|_2}$$

wherein $\text{sim}(q', v_i)$ represents the semantic similarity between the optimized user request vector $q'$ and the $i$-th historical dialogue semantic vector $v_i$;
The historical dialogue vectors are sorted by semantic similarity score from high to low, and the most relevant vectors are screened into a subset $V'$;
The system passes the optimized user request vector $q'$, together with the screened high-relevance historical vector set $V'$, to the subsequent context extraction and Prompt generation module.
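The ranking-and-screening step amounts to top-k retrieval by cosine similarity; a minimal sketch (the function name and the choice of k are illustrative):

```python
from math import sqrt

def top_k_history(q_opt, history_vectors, k):
    """Return the indices of the k history vectors most similar to the
    optimized request vector, sorted by cosine similarity (descending)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))
    ranked = sorted(range(len(history_vectors)),
                    key=lambda i: cos(q_opt, history_vectors[i]),
                    reverse=True)
    return ranked[:k]
```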
Dynamic keyword generation and optimization are achieved by calling the generate_query_keywords and weighted_keyword_matching functions, combined with the semantic vector optimization of the user request text, to construct the historical dialogue semantic vector set and the optimized user request vector. The dynamic adjustment of keyword weights and the semantic similarity calculation mechanism markedly improve the accuracy and precision of semantic matching, laying a high-quality semantic foundation for subsequent Prompt generation and context extraction.
S3, extracting the context fragments most relevant to the user request from the historical dialogue vectors based on a convex optimization method, combining them with high-relevance content captured by a web crawler, and screening the context through a sparse optimization problem;
The process of extracting the context fragments most relevant to the user request in step S3 includes the following:
S3.1, the system simulates search engine behavior through a web crawler, captures web page content in real time, and extracts highly relevant text data;
S3.2, a sparse optimization problem is constructed with the objective function:

$$\min_{\alpha}\ \|q - V\alpha\|_2^2 + \lambda\|\alpha\|_1$$

wherein $q$ is the user request vector, $V$ is the matrix whose columns are the historical dialogue vectors, $\alpha$ is the sparse coefficient vector, and $\lambda$ is the sparse regularization parameter;
S3.3, the system has a built-in exception handling mechanism that supports automatic retries of network requests and records an exception log.
In step S3.2, the sparse optimization problem is solved by the alternating direction method of multipliers (ADMM), specifically as follows:
Updating the sparse coefficient $\alpha$:

$$\alpha^{k+1} = \arg\min_{\alpha}\ \left\|q - \sum_{i=1}^{N}\alpha_i v_i\right\|_2^2 + \frac{\rho}{2}\left\|\alpha - z^k + \frac{\mu^k}{\rho}\right\|_2^2$$

wherein $\alpha^{k+1}$ is the sparse coefficient vector obtained in the $(k+1)$-th iteration, $q$ is the semantic vector representation of the user request, $v_i$ is the semantic vector representation of the $i$-th historical dialogue turn, $N$ is the total number of historical dialogue turns, $\sum_i \alpha_i v_i$ fits the user request vector as a linear combination of the historical dialogue vectors, $\|q - \sum_i \alpha_i v_i\|_2^2$ is the reconstruction error term, $\rho$ is the penalty factor, $\|\alpha - z^k + \mu^k/\rho\|_2^2$ is the constraint term, $z^k$ is the sparse variable updated in the $k$-th iteration, and $\mu^k$ is the Lagrange multiplier in the $k$-th iteration;
Updating the sparsity constraint variable $z$:

$$z^{k+1} = S_{\lambda/\rho}\!\left(\alpha^{k+1} + \frac{\mu^k}{\rho}\right)$$

wherein $z^{k+1}$ is the sparse variable updated in the $(k+1)$-th iteration, $S_{\lambda/\rho}(\cdot)$ is the soft-threshold function, $\alpha^{k+1}$ is the sparse coefficient vector optimized in the $(k+1)$-th iteration, $\mu^k$ is the Lagrange multiplier in the $k$-th iteration, $\lambda$ is the regularization parameter, and $\rho$ is the penalty factor;
Updating the Lagrange multiplier $\mu$:

$$\mu^{k+1} = \mu^k + \rho\left(\alpha^{k+1} - z^{k+1}\right)$$

wherein $\mu^{k+1}$ is the Lagrange multiplier updated in the $(k+1)$-th iteration, $\mu^k$ is the Lagrange multiplier in the $k$-th iteration, $\rho$ is the penalty factor, $\alpha^{k+1}$ is the sparse coefficient vector obtained in the $(k+1)$-th iteration, and $z^{k+1}$ is the sparse variable updated in the $(k+1)$-th iteration;
The iteration is repeated until the sparse coefficients satisfy the preset convergence condition.
In step S3.2, the sparse optimization result is a sparse coefficient vector $\alpha^*$, from which the set $C$ of historical dialogue fragments with non-zero weights is screened:

$$C = \{\, h_i \mid \alpha_i^* \neq 0,\ i = 1, \dots, N \,\}$$

wherein $h_i$ is the $i$-th historical dialogue fragment and $C$ is the set of context fragments most relevant to the user request.
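The ADMM iteration and the subsequent non-zero screening can be sketched in pure Python for small problems. This is an illustrative implementation of the standard ADMM lasso updates in scaled form, not the patent's code; the matrix layout (history vectors as columns of V) and parameter defaults are assumptions.

```python
from math import copysign

def _solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def soft_threshold(x, t):
    """Soft-threshold operator S_t(x) = sign(x) * max(|x| - t, 0)."""
    return copysign(max(abs(x) - t, 0.0), x)

def admm_lasso(V, q, lam, rho=1.0, iters=200):
    """ADMM (scaled form) for min_a ||q - V a||_2^2 + lam * ||a||_1,
    where column j of V is the j-th history vector."""
    d, n = len(V), len(V[0])
    # Normal-equation matrix (V^T V + rho/2 * I) and right-hand side V^T q;
    # the factor 2 of the squared loss is divided out of both sides.
    G = [[sum(V[r][i] * V[r][j] for r in range(d)) + (rho / 2.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Vtq = [sum(V[r][i] * q[r] for r in range(d)) for i in range(n)]
    a = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable mu / rho
    for _ in range(iters):
        rhs = [Vtq[i] + (rho / 2.0) * (z[i] - u[i]) for i in range(n)]
        a = _solve(G, rhs)                                              # alpha-update
        z = [soft_threshold(a[i] + u[i], lam / rho) for i in range(n)]  # z-update
        u = [u[i] + a[i] - z[i] for i in range(n)]                      # multiplier update
    return z

def screen_fragments(fragments, coeffs, eps=1e-8):
    """Keep the history fragments whose sparse coefficient is non-zero (the set C)."""
    return [frag for frag, c in zip(fragments, coeffs) if abs(c) > eps]
```

The sparsity penalty drives most coefficients exactly to zero, so the screening step simply keeps the fragments whose coefficient survives thresholding.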
Specifically, in this embodiment, the system implements web crawler crawling through Crawler types, and the web content highly related to the user request specifically includes the following steps:
invoking Crawler types to simulate the behavior of a search engine, constructing a query link through a dynamic URL splicing mechanism, and generating a user request keyword and a complete query address of the search engine;
adopting a delay mechanism and a random request head strategy to avoid being blocked by a search engine, and simulating real user behaviors by setting dynamic interval time by the system to reduce the crawling failure probability;
Analyzing an HTML page returned by a search engine, extracting main contents, including a title, a abstract and a text fragment, and taking the contents as external supplementary information for subsequent context expansion;
in order to improve the stability of network requests, the system introduces an exponential backoff mechanism in the _send_request method; when a network request fails, the system exponentially prolongs the retry interval until the preset maximum number of retries is reached;
the system records abnormal information for each crawling task, storing exceptions in the crawling process (such as timeouts and link failures) into a log file through a logging mechanism, facilitating subsequent debugging and problem localization.
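The retry-with-exponential-backoff behavior described above can be sketched as follows. This is a minimal stdlib illustration: the internals of the `_send_request` method are not disclosed in the text, so the fetch function, helper names and default parameters here are assumptions for illustration, and the fetch is injected so no network access is needed.

```python
import time

def backoff_delays(base=1.0, factor=2.0, max_retries=5):
    """Exponential backoff intervals: base, base*factor, base*factor^2, ..."""
    return [base * factor ** i for i in range(max_retries)]

def send_request(fetch, url, max_retries=4, base=0.01, sleep=time.sleep):
    """Retry fetch(url) with exponentially growing waits until it succeeds
    or the preset maximum number of retries is reached."""
    last_error = None
    for delay in backoff_delays(base, 2.0, max_retries):
        try:
            return fetch(url)
        except Exception as exc:   # on failure, back off before retrying
            last_error = exc
            sleep(delay)           # exponentially longer retry interval
    raise last_error

# Simulated fetch that fails twice (e.g. timeouts), then succeeds:
calls = {"n": 0}
def flaky(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "<html>ok</html>"

result = send_request(flaky, "https://example.com", sleep=lambda d: None)
```

In a real crawler the `except` branch would also write the exception to the log file, as the embodiment describes.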
In this embodiment, the system extracts, from the historical dialog vector, a context segment most relevant to the user request by a sparse optimization method, and specifically includes the following steps:
an objective function of the sparse optimization problem is defined as min_α (1/2)‖q − Vα‖₂² + λ‖α‖₁;
Wherein, q is the user request vector, V = [v_1, …, v_N] is the matrix of historical dialogue vectors, α is the sparse coefficient vector, and λ is the sparse regularization parameter;
introducing the reconstruction error term ‖q − Vα‖₂² ensures that the optimized combination of historical dialogue vectors fits the user request vector as closely as possible;
through the ℓ₁ regularization term λ‖α‖₁, a minimal number of historical dialogue vectors are screened to participate in the reconstruction of the user request vector, optimizing the efficiency of context extraction;
the historical dialogue vector set V is combined with the semantic vectors of the web content obtained by the crawler to jointly construct a context vector set for subsequent screening.
In this embodiment, the sparse optimization problem is solved by an alternate direction multiplier method, which specifically includes the following steps:
Updating the sparse coefficient α: α^{k+1} = argmin_α ( (1/2)‖q − Σ_{i=1}^{N} α_i v_i‖₂² + (ρ/2)‖α − z^k + μ^k‖₂² );
Wherein, α^{k+1} is the sparse coefficient vector obtained by the optimization in the (k+1)-th iteration, α is the sparse coefficient vector to be optimized in the current iteration, q is the semantic vector representation of the user request, v_i is the semantic vector representation of the i-th turn of the historical dialogue, N is the total number of turns of the historical dialogue, Σ_{i=1}^{N} α_i v_i is the linear combination of historical dialogue vectors fitting the user request vector q, ‖q − Σ_{i=1}^{N} α_i v_i‖₂² is the reconstruction error term, ρ is the penalty factor, ‖α − z^k + μ^k‖₂² is the constraint term, z^k is the sparse variable updated in the k-th iteration, and μ^k is the Lagrangian multiplier in the k-th iteration.
Updating the sparsity constraint variable z: z^{k+1} = S_{λ/ρ}(α^{k+1} + μ^k);
Wherein, z^{k+1} is the sparse variable updated in the (k+1)-th iteration, S_{λ/ρ}(·) is the soft-threshold function, α^{k+1} is the sparse coefficient vector obtained by the optimization in the (k+1)-th iteration, μ^k is the Lagrangian multiplier in the k-th iteration, λ is the regularization parameter, and ρ is the penalty factor.
Updating the Lagrangian multiplier μ: μ^{k+1} = μ^k + α^{k+1} − z^{k+1};
Wherein, μ^{k+1} is the Lagrangian multiplier updated in the (k+1)-th iteration, μ^k is the Lagrangian multiplier in the k-th iteration, α^{k+1} is the sparse coefficient vector obtained by the optimization in the (k+1)-th iteration, and z^{k+1} is the sparse variable updated in the (k+1)-th iteration.
Iterative convergence conditions:
The optimization process iterates until the sparse coefficient α satisfies the convergence condition ‖α^{k+1} − α^k‖₂ ≤ ε;
Wherein, ε is the preset convergence threshold.
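The three ADMM updates above (least-squares α-step, soft-threshold z-step, dual update, with the stated convergence test) can be sketched in NumPy on synthetic data. The function name `admm_lasso`, the matrix shapes, and the toy vectors are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def soft_threshold(x, t):
    """S_t(x) = sign(x) * max(|x| - t, 0), applied element-wise."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def admm_lasso(V, q, lam=0.1, rho=1.0, eps=1e-6, max_iter=500):
    """Solve min_a 0.5*||q - V a||_2^2 + lam*||a||_1 by scaled ADMM.
    V: (d, N) matrix of historical dialogue vectors (one per column),
    q: (d,) user request vector. Returns the sparse coefficient vector."""
    d, N = V.shape
    a = np.zeros(N); z = np.zeros(N); mu = np.zeros(N)
    # Factor used in every alpha-update: (V^T V + rho*I)^{-1}
    M = np.linalg.inv(V.T @ V + rho * np.eye(N))
    Vq = V.T @ q
    for _ in range(max_iter):
        a_new = M @ (Vq + rho * (z - mu))          # alpha-update
        z = soft_threshold(a_new + mu, lam / rho)  # z-update (soft threshold)
        mu = mu + a_new - z                        # dual (multiplier) update
        if np.linalg.norm(a_new - a) <= eps:       # ||a^{k+1} - a^k|| <= eps
            a = a_new
            break
        a = a_new
    return z  # z carries the exact zeros produced by the l1 step

# Toy example: q is built from historical turns 0 and 2 only.
rng = np.random.default_rng(0)
V = rng.standard_normal((20, 6))
q = 1.5 * V[:, 0] - 0.8 * V[:, 2]
alpha = admm_lasso(V, q, lam=0.05)
```

With a noiseless request vector and small λ, the recovered coefficients concentrate on the two generating columns, which is exactly the non-zero-weight screening described in step S3.2.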
In this embodiment, the system extracts, through the sparse optimization result, a context segment most relevant to the user request, and specifically includes the following steps:
obtaining the sparse coefficient vector α* from the solution of the sparse optimization problem, the historical dialogue fragments corresponding to non-zero weights are screened with the extraction formula C* = {v_i ∈ V : α*_i ≠ 0};
Wherein, C* represents the set of historical dialogue fragments most relevant to the user request, and i ranges over the indices of the non-zero weights in the sparse coefficient vector;
splicing the extracted historical dialogue fragments and the high-correlation text content acquired by the crawler into complete context vectors, and transmitting the complete context vectors to a subsequent generation module;
And using ContextManager to dynamically expand the context, splicing the historical dialogue fragments, the crawler results and the current input of the user into a complete Prompt, and ensuring the consistency and the accuracy of the generation.
S4, constructing a dynamic Prompt template, and integrating the extracted context fragments, the user request and the generation instruction to generate the Prompt;
The construction of the dynamic Prompt template in step S4 comprises the following steps:
extracting background information, namely sorting the extracted historical dialogue fragment sets according to time sequence or semantic importance;
Integrating the request text currently input by the user into a template;
Instruction design, namely explicitly generating instruction content including language, style and length limitation.
Specifically, in this embodiment, the system constructs the background information portion of the Prompt template from the extracted context fragment set, specifically comprising the following steps:
based on the context fragment set C*, the fragments are first sorted in time order to ensure the consistency of the context semantics;
If the user request involves specific keywords or topics, the system filters the fragment set based on semantic similarity, retaining only the fragments highly relevant to the user request. The semantic similarity is calculated as sim(q, c_i) = (q · c_i) / (‖q‖ ‖c_i‖);
Wherein, q is the semantic vector of the user request and c_i is the semantic vector of the context fragment.
The system then further sorts the fragments by importance w_i, which can be evaluated based on a fragment's frequency of occurrence in the context or its degree of matching with the keywords, finally forming a highly relevant background information set.
The filtered background information is used as the leading part of the Prompt template, providing an explicit semantic context for the generation model.
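The filter-then-rank step above can be sketched with plain-Python cosine similarity. The threshold value, the fragment texts and their toy vectors are illustrative assumptions; similarity is used as a stand-in for the importance score.

```python
import math

def cosine(u, v):
    """Cosine similarity sim(u, v) = (u . v) / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def filter_and_rank(query_vec, fragments, threshold=0.5):
    """Keep fragments whose similarity to the query exceeds the threshold,
    then sort the survivors from most to least similar."""
    scored = [(cosine(query_vec, vec), text) for text, vec in fragments]
    kept = [(s, t) for s, t in scored if s >= threshold]
    kept.sort(key=lambda p: p[0], reverse=True)
    return [t for _, t in kept]

fragments = [
    ("user asked about invoice export", [0.9, 0.1, 0.0]),
    ("small talk about the weather",    [0.0, 0.1, 0.9]),
    ("follow-up on invoice format",     [0.8, 0.3, 0.1]),
]
background = filter_and_rank([1.0, 0.2, 0.0], fragments, threshold=0.5)
```

The off-topic fragment falls below the threshold and is dropped; the survivors become the background information set.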
In this embodiment, the system integrates the request text currently input by the user into the template of the Prompt to ensure that the generated content meets the actual requirement of the user, and specifically includes the following steps:
extracting the user input text as the core part of the template and placing it after the background information;
if the user request contains an explicit generation intent or task instruction (such as "generate a summary" or "answer a question"), the system extracts the task instruction and labels it explicitly as an important component of the template;
the combination logic of the user request and the background information must ensure semantic consistency; the system sets clear separators in the template so that the generation model can effectively distinguish the background from the request content.
In this embodiment, the system designs clear generation instructions for the Prompt template to guide the language, style, output length and other aspects of the generation model, specifically comprising the following steps:
according to the scenario requirements of the user request, dynamically adjusting the tone and style of the generation instruction; the instruction design rules include:
if the user request involves formal content (e.g., academic papers, business reports), the system designs the generation instruction as "use formal and professional language";
if the user request is oriented toward a daily communication or interaction scenario, the instruction prompts the generation model to adopt "concise and friendly language";
the length limit of the generated content is set in accordance with the contextual requirements of the user input text, with explicit constraints in the instruction. For example:
"Please keep the output within 300 words."
"The output must include the following points: 1. problem background; 2. detailed analysis; 3. conclusion."
the instruction content adapts to the capability of the generation model according to the user's task requirements (such as whether a detailed explanation or specific steps are needed), ensuring that the Prompt can effectively guide the generation task.
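The three-part template (background, user request, instruction with tone/style/length rules) can be assembled as follows. The section markers, instruction wording and parameter names are illustrative assumptions, not the patent's template.

```python
def build_prompt(background, user_request, *, formal=True, max_words=300):
    """Assemble the three template parts: background, user request, instruction."""
    style = ("Use formal and professional language."
             if formal else "Use concise, friendly language.")
    instruction = (f"{style} Please keep the output within {max_words} words. "
                   "Structure it as: 1. problem background; 2. detailed analysis; "
                   "3. conclusion.")
    sections = [
        "### Background\n" + "\n".join(background),   # clear separators between parts
        "### User request\n" + user_request,
        "### Instruction\n" + instruction,
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    ["Earlier turn: user asked how to export invoices."],
    "Summarize the export steps.",
    formal=False,
    max_words=200,
)
```

The explicit section headers act as the "clear separators" the embodiment requires, so the generation model can distinguish background from request content.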
In this embodiment, the system adjusts the structure and content of the Prompt template in real time according to dynamic changes in the user input and the context content, specifically comprising the following steps:
Dynamically expanding context content, namely combining a multi-round dialogue manager, extracting the most relevant fragments in the current dialogue context, and splicing historical dialogue content with the latest user input;
dynamically integrating external search results, namely adding the search results into the background information part of the Prompt template according to priority if the system acquires external content related to the user request through web crawlers or semantic search;
and optimizing the Prompt template according to the quality of the generated results over repeated runs of the generation model, for example adjusting the number of background fragments or modifying the language and style prompts in the generation instruction.
S5, inputting the generated Prompt into a generative language model for processing, and generating a natural language reply text;
s5, the method comprises the following specific steps:
S5.1, inputting the generated Prompt into the generative language model and triggering the generation task;
S5.2, based on the historical dialogue fragments and the semantic retrieval results, performing context consistency verification on the output of the generation model, and screening the best output by calculating the semantic relevance score between the generated text and the input Prompt;
s5.3, processing the output text of the generated model according to a preset format, wherein the processing comprises automatic sentence breaking, punctuation adjustment and weighted sequencing of key contents;
And S5.4, evaluating the generation result, adjusting the Prompt content according to real-time feedback of the generation quality, and re-executing the generation step until the output quality requirement is met.
And S6, optimizing the sparse optimization parameters and the Prompt template design according to the generated quality evaluation result.
Specifically, in this embodiment, the system first inputs the dynamically generated Prompt into the generative language model, specifically comprising the following steps:
The system takes the constructed template as input directly, and calls the generated language model to execute the generated task;
and dynamically expanding the context: using the ContextManager to dynamically manage the historical dialogue, the retrieval results and the user's current request content, ensuring that the Prompt contains complete context information. The logic of the context splice is P = Concat(H, R, q);
Wherein, H represents the historical dialogue segments, R represents the retrieved relevant content, and q represents the user's current request;
Triggering a generating task, namely after the generating language model receives the Prompt, starting the generating task, and generating a corresponding natural language reply according to the Prompt content.
In this embodiment, the system performs semantic consistency verification on the output of the generated language model, and filters an optimal output result by calculating the semantic relevance between the generated result and the Prompt content, and specifically includes the following steps:
Calculating semantic relevance: scoring the semantic relevance between the output text T_g of the generation model and the Prompt content P, where the semantic relevance score is calculated as S(T_g, P) = (t_g · p) / (‖t_g‖ ‖p‖);
Wherein, t_g and p represent the semantic vectors of the generated text and the Prompt content, respectively;
context consistency verification, namely further carrying out context consistency verification on the content in the generated text, and ensuring semantic matching degree of the content with the historical dialogue fragments and the user request content;
And screening the optimal output, namely screening the generated result from high to low according to the semantic relevance score, and selecting the optimal output text as the final generated result.
In this embodiment, the system processes the output text of the generated model according to a preset format to ensure the semantic clarity and readability of the text, and specifically includes the following steps:
automatic sentence breaking processing, namely segmenting sentences of the generated text, ensuring clear paragraph structure and facilitating reading;
punctuation mark adjustment, namely correcting punctuation mark errors in the generated text, and ensuring grammar specification;
Weighted sorting of key content: sorting the key information in the generated text by semantic importance and placing the key content at the front of the text. The weighted ranking formula is w_i = α · Rel(c_i) + β · Pos(c_i);
Wherein, w_i is the weight of content c_i, Rel(c_i) represents the semantic relevance of the content to the user request, Pos(c_i) represents the position score of the content in the generated text, and α and β are weight parameters.
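The weighted ranking w_i = α·Rel(c_i) + β·Pos(c_i) can be sketched as below. The position score (1.0 for the first piece, decreasing to 0.0 for the last) and the default weights are illustrative assumptions, since the patent only names the two terms.

```python
def weighted_order(contents, alpha=0.7, beta=0.3):
    """Sort content pieces by w_i = alpha*Rel_i + beta*Pos_i, where Rel_i is
    the semantic relevance to the user request and Pos_i rewards earlier
    positions in the generated text."""
    n = len(contents)
    scored = []
    for idx, (text, rel) in enumerate(contents):
        pos_score = 1.0 - idx / max(n - 1, 1)   # 1.0 for first, 0.0 for last
        w = alpha * rel + beta * pos_score
        scored.append((w, text))
    scored.sort(key=lambda p: p[0], reverse=True)
    return [t for _, t in scored]

ordered = weighted_order([
    ("minor caveat",        0.20),
    ("key recommendation",  0.95),
    ("supporting detail",   0.60),
])
```

Note how the position term lets an early, low-relevance piece edge out a later, mid-relevance one; tuning α and β trades off relevance against original ordering.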
In this embodiment, the system dynamically optimizes the Prompt content based on generation quality feedback, specifically comprising the following steps:
Generating quality assessment, namely performing quality assessment on the generated text by a system, wherein the quality assessment comprises the following indexes:
Semantic relativity, namely evaluating semantic consistency of the generated text and a user request;
context consistency, namely evaluating the semantic consistency degree of the generated text and the historical dialogue fragments;
User satisfaction, namely, subjective feedback scores based on users;
Prompt content optimization: dynamically adjusting the content of the Prompt template according to the generation quality evaluation result, for example:
Increasing or decreasing background information segments;
Modifying the language or style requirements of the generated instruction;
Adjusting a length limit of the generated content;
Re-executing the generation task: if the generation quality does not reach the expected standard, the system triggers the generation task again after re-optimizing the Prompt, until the generation result meets the quality requirements.
In this embodiment, the system realizes gradual return of the generated result through the streaming response mechanism, so as to reduce the waiting time of the user, and specifically includes the following steps:
Dividing the output text according to blocks and returning step by step in the generating process of the generating model;
gradually outputting and dynamically updating, namely returning the generated content by the generating task in a streaming mode, and enabling a user to check partial results before the generation is completed so as to improve response efficiency;
Context expansion and update-in a multi-turn dialog scenario, the system dynamically expands context based on real-time feedback of streaming response and adjusts the generation of subsequent portions of the task according to the new user input.
In this embodiment, the sparse optimization parameters and the Prompt template design are further optimized according to the generation quality evaluation result, specifically comprising the following steps:
sparse optimization parameter adjustment: dynamically adjusting the regularization parameter λ and the penalty factor ρ in the sparse optimization according to generation quality feedback, optimizing the selection and ordering of the context fragments;
Prompt template structure optimization: according to semantic analysis of the generated text quality, adjusting the background information order in the Prompt template, the expression of the user request, and the generation instruction content;
and in the multi-round generation task, the system gradually optimizes a context screening mechanism and a template structure according to the generation result, so that the overall quality of the generation task is improved.
The evaluation of the quality of the generation in step S6 includes the following indexes:
semantic relevance: calculating the cosine similarity between the user request vector q and the generated reply vector r;
context continuity, namely evaluating and generating matching degree of the text and the reference text through BLEU and ROUGE-L indexes;
user satisfaction, namely subjective evaluation is carried out on the generated result based on the user score.
The cosine similarity in step S6 is calculated as sim(q, r) = (q · r) / (‖q‖₂ ‖r‖₂);
Wherein, sim(q, r) represents the semantic similarity between the user request vector q and the generated reply vector r, q is the semantic vector representation of the user request, r is the semantic vector representation of the generated reply, q · r denotes the dot product of q and r, ‖q‖₂ is the Euclidean norm of the user request vector, and ‖r‖₂ is the Euclidean norm of the generated reply vector.
Specifically, in this embodiment, the system adopts cosine similarity to quantify the semantic correlation between the user request vector and the generated reply vector, and specifically includes the following steps:
generating the semantic vectors: q represents the user request and r represents the semantic vector of the generated reply text;
the semantic relevance between the user request and the generated reply is calculated by the cosine similarity formula sim(q, r) = (q · r) / (‖q‖₂ ‖r‖₂);
Wherein, sim(q, r) represents the semantic similarity between the user request vector q and the generated reply vector r, q · r denotes their dot product, and ‖q‖₂ and ‖r‖₂ are the Euclidean norms of the two vectors.
The system sets a similarity threshold τ and judges whether the semantic matching degree between the generated reply and the user request meets expectations: if sim(q, r) ≥ τ, the semantic relevance is considered to meet the requirements.
In this embodiment, the system evaluates the context matching degree of the generated text and the reference text through BLEU and ROUGE-L indexes, and specifically includes the following steps:
Based on the generated text T_g and the reference text T_ref, the system calculates the BLEU score and evaluates their consistency at the vocabulary and phrase level. The formula is BLEU = BP · exp( Σ_{n=1}^{N} w_n log p_n );
Wherein, BP is the brevity penalty factor used to control the length of the generated text, w_n is the weight of the order-n n-gram, and p_n is the n-gram precision.
The longest common subsequence (LCS) matching between the generated text and the reference text is evaluated by the ROUGE-L index: P_lcs = LCS(T_g, T_ref) / len(T_g), R_lcs = LCS(T_g, T_ref) / len(T_ref), F_lcs = (1 + β²) · P_lcs · R_lcs / (R_lcs + β² · P_lcs);
Wherein, P_lcs is the ratio of the LCS length matched with the reference text to the length of the generated text, R_lcs is the ratio of the LCS length to the length of the reference text, and β is a tuning parameter used to balance the weights of precision and recall.
The system integrates the scores of BLEU and ROUGE-L and evaluates whether the contextual continuity of the generated text and the reference text meets the quality requirement.
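The ROUGE-L computation described above reduces to a classic LCS dynamic program over tokens plus the F-measure combining P_lcs and R_lcs. A minimal sketch, with whitespace tokenization and β = 1.2 as illustrative assumptions:

```python
def lcs_length(a, b):
    """Dynamic-programming longest common subsequence length over token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def rouge_l(generated, reference, beta=1.2):
    """ROUGE-L F-score: P = LCS/len(generated), R = LCS/len(reference),
    F = (1 + beta^2) * P * R / (R + beta^2 * P)."""
    gen, ref = generated.split(), reference.split()
    lcs = lcs_length(gen, ref)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(gen), lcs / len(ref)
    return (1 + beta**2) * p * r / (r + beta**2 * p)

score_same = rouge_l("the cat sat on the mat", "the cat sat on the mat")
score_part = rouge_l("the cat sat", "the cat sat on the mat")
```

An identical pair scores 1.0; a truncated generation keeps perfect precision but loses recall, so its F-score drops.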
In this embodiment, the system performs subjective evaluation on the generated result based on the user score, and specifically includes the following steps:
The system collects satisfaction degree scores of users on the generated results through interface interaction, wherein the scoring range is 0 to 5 points;
Subjective evaluation weighting: user scores are given a higher weight to assist in adjusting the optimization strategy of the generation model; for example, when the user score is below 3, the system optimizes the Prompt template or the context fragments.
The real-time feedback of the user score can be used as a basis for generating quality optimization together with semantic relevance and context consistency indexes.
In this embodiment, the system comprehensively combines the scores of semantic relevance, context continuity and user satisfaction to form a composite generation quality score, and optimizes the Prompt template and the sparse optimization parameters according to the composite score, specifically comprising the following steps:
the composite score Q of the generation quality is defined as Q = α · S_sem + β · C_ctx + γ · UserScore;
Wherein, α, β, γ are weight coefficients, S_sem is the semantic relevance score, C_ctx is the context continuity indicator derived from BLEU and ROUGE-L, and UserScore is the user score.
And generating quality feedback and optimization, namely dynamically adjusting regularization parameters lambda and penalty factors rho in sparse optimization and context segment sequencing and generating instruction content in a Prompt template according to the change trend of the comprehensive score Q, and ensuring continuous improvement of the generation quality.
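The composite score can be sketched as a weighted sum. The patent names the three terms but not how BLEU and ROUGE-L are merged or how the 0-5 user rating is scaled, so averaging the two continuity metrics, dividing the rating by 5, and the specific weights are all illustrative assumptions.

```python
def composite_quality(sem, bleu, rouge_l, user_score,
                      alpha=0.4, beta=0.3, gamma=0.3):
    """Q = alpha*S_sem + beta*C_ctx + gamma*UserScore, with every input
    scaled to [0, 1] before weighting."""
    context = (bleu + rouge_l) / 2.0   # assumed merge of the two continuity metrics
    user = user_score / 5.0            # user satisfaction rated 0..5 in the text
    return alpha * sem + beta * context + gamma * user

q_good = composite_quality(sem=0.92, bleu=0.55, rouge_l=0.63, user_score=4.5)
q_poor = composite_quality(sem=0.40, bleu=0.20, rouge_l=0.25, user_score=2.0)
```

A falling Q would then trigger the adjustments the text describes: retuning λ and ρ, reordering context fragments, or rewriting the generation instruction.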
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A prompt automatic construction method based on human-computer dialogue history and semantic retrieval, characterized in that it comprises the following steps: S1, performing text preprocessing on historical dialogue data and user input, including dynamically generating query keywords, verifying and parsing user-uploaded files, and extracting core content; S2, using a pre-trained language model to map historical dialogues and user requests into semantic vectors, matching them with keyword weights, and constructing a historical vector set and a user request vector; S3, extracting the context fragments most relevant to the user request from the historical dialogue vectors based on a convex optimization method, combining them with highly relevant content captured by a web crawler, and filtering the context through a sparse optimization problem; S4, building a dynamic Prompt template, and integrating the extracted context fragments, the user request and the generation instruction to generate the Prompt; S5, inputting the generated Prompt into a generative language model for processing to generate a natural-language reply text; S6, optimizing the sparse optimization parameters and the Prompt template design according to the generation quality evaluation results.
2. The method according to claim 1, characterized in that step S1 comprises the following sub-steps: S1.1, performing sentence and word segmentation on the historical dialogue data and user input, filtering stop words, and performing part-of-speech tagging and dependency parsing; S1.2, the system dynamically generates query keywords based on the user input and screens highly relevant keywords by semantic similarity to ensure accurate retrieval; S1.3, verifying the formats of user-uploaded files, including PDF, DOCX and Markdown, cleaning irrelevant characters, and extracting the core content of the files for subsequent knowledge-base retrieval.
3. The method according to claim 1, characterized in that step S2 comprises the following sub-steps: S2.1, mapping the historical dialogue data into vector representations through a sentence embedding model, forming the historical vector set V; S2.2, mapping the user request text into a semantic vector q; combined with dynamically generated keyword weights, a relevance score is calculated based on word frequency and position weights to optimize the vectorization results; S2.3, constructing the historical dialogue semantic vector set V and the user request vector q.
4. The method according to claim 1, characterized in that the process of extracting the context fragments most relevant to the user request in step S3 comprises: S3.1, the system simulates search-engine behavior through a web crawler, crawls web-page content in real time, and extracts highly relevant text data; S3.2, constructing a sparse optimization problem whose objective function is min_α (1/2)‖q − Vα‖₂² + λ‖α‖₁, wherein q is the user request vector, V is the matrix of historical dialogue vectors, α is the sparse coefficient vector, and λ is the sparse regularization parameter; S3.3, the system has a built-in exception-handling mechanism that supports automatic retry of network requests and records exception logs.
5. The method according to claim 4, characterized in that the sparse optimization problem in step S3.2 is solved by the alternating direction method of multipliers, specifically comprising: updating the sparse coefficient α: α^{k+1} = argmin_α ( (1/2)‖q − Σ_{i=1}^{N} α_i v_i‖₂² + (ρ/2)‖α − z^k + μ^k‖₂² ), wherein α^{k+1} is the sparse coefficient vector optimized in the (k+1)-th iteration, α is the sparse coefficient vector to be optimized in the current iteration, q is the semantic vector representation of the user request, v_i is the semantic vector representation of the historical dialogue, N is the total number of turns of the historical dialogue, Σ_{i=1}^{N} α_i v_i is the linear combination of historical dialogue vectors fitting the user request vector q, ‖q − Σ_{i=1}^{N} α_i v_i‖₂² is the reconstruction error term, ρ is the penalty factor, ‖α − z^k + μ^k‖₂² is the constraint term, z^k is the sparse variable updated in the k-th iteration, and μ^k is the Lagrange multiplier in the k-th iteration; updating the sparsity constraint variable z: z^{k+1} = S_{λ/ρ}(α^{k+1} + μ^k), wherein z^{k+1} is the sparse variable updated in the (k+1)-th iteration, S_{λ/ρ}(·) is the soft-threshold function, λ is the regularization parameter, and ρ is the penalty factor; updating the Lagrange multiplier μ: μ^{k+1} = μ^k + α^{k+1} − z^{k+1}, wherein μ^{k+1} is the Lagrange multiplier updated in the (k+1)-th iteration; repeating the iteration until the sparse coefficients meet the preset convergence condition.
6. The method according to claim 4, characterized in that the sparse optimization result in step S3.2 is the sparse coefficient vector α*, from which the set of historical dialogue fragments corresponding to non-zero weights is screened as C* = {v_i ∈ V : α*_i ≠ 0}, wherein C* is the set of context fragments most relevant to the user request.
7. The method according to claim 1, characterized in that the construction of the dynamic Prompt template in step S4 comprises: extracting background information: sorting the extracted set of historical dialogue fragments by chronological order or semantic importance; user request content: integrating the request text currently entered by the user into the template; instruction design: explicitly specifying the generation instruction content, including tone, style and length limits.
8. The method according to claim 1, characterized in that step S5 comprises the following specific steps: S5.1, inputting the generated Prompt into the generative language model and triggering the generation task; S5.2, based on the historical dialogue fragments and semantic retrieval results, verifying the contextual consistency of the output of the generation model, and screening the best output by calculating the semantic relevance score between the generated text and the input Prompt; S5.3, processing the output text of the generation model according to a preset format, including automatic sentence segmentation, punctuation adjustment, and weighted sorting of key content; S5.4, evaluating the generation results, adjusting the Prompt content according to real-time feedback on generation quality, and re-executing the generation steps until the output quality requirements are met.
9. The method according to claim 1, characterized in that the evaluation of generation quality in step S6 includes the following indicators: semantic relevance: calculating the cosine similarity between the user request vector q and the generated reply vector r; contextual coherence: evaluating the matching degree between the generated text and the reference text through the BLEU and ROUGE-L indicators; user satisfaction: subjectively evaluating the generated results based on user ratings.
10. The method according to claim 9, characterized in that the cosine similarity in step S6 is calculated as sim(q, r) = (q · r) / (‖q‖₂ ‖r‖₂), wherein sim(q, r) represents the semantic similarity between the user request vector q and the generated reply vector r, q is the semantic vector representation of the user request, r is the semantic vector representation of the generated reply, q · r denotes the dot product of q and r, ‖q‖₂ is the Euclidean norm of the user request vector, and ‖r‖₂ is the Euclidean norm of the generated reply vector.
CN202510037633.8A 2025-01-10 2025-01-10 Automatic prompt construction method based on human-computer dialogue history and semantic retrieval Active CN119441443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510037633.8A CN119441443B (en) 2025-01-10 2025-01-10 Automatic prompt construction method based on human-computer dialogue history and semantic retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510037633.8A CN119441443B (en) 2025-01-10 2025-01-10 Automatic prompt construction method based on human-computer dialogue history and semantic retrieval

Publications (2)

Publication Number Publication Date
CN119441443A true CN119441443A (en) 2025-02-14
CN119441443B CN119441443B (en) 2025-05-16

Family

ID=94507978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510037633.8A Active CN119441443B (en) 2025-01-10 2025-01-10 Automatic prompt construction method based on human-computer dialogue history and semantic retrieval

Country Status (1)

Country Link
CN (1) CN119441443B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113761143A (en) * 2020-11-13 2021-12-07 北京京东尚科信息技术有限公司 Method, apparatus, apparatus and medium for determining answers to user questions
CN118093796A (en) * 2024-04-29 2024-05-28 浪潮云信息技术股份公司 A multi-round dialogue method, device, equipment and storage medium
CN118364084A (en) * 2024-04-30 2024-07-19 无锡锡商银行股份有限公司 Intelligent customer service question-answering method based on large language model technology
CN118673121A (en) * 2024-06-27 2024-09-20 山东浪潮科学研究院有限公司 RAG-oriented prompt word optimization method, system and device
CN118747210A (en) * 2024-07-17 2024-10-08 太保科技有限公司 A conversation method and device based on large language model
US20240370505A1 (en) * 2023-05-01 2024-11-07 Amadeus S.A.S. System and method for dynamic query management using meta-descriptions in a conversational search engine


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119623477A (en) * 2025-02-17 2025-03-14 一铭寰宇科技(北京)有限公司 An intelligent semantic understanding and generation method and system based on ensemble learning
CN120353899A (en) * 2025-06-18 2025-07-22 西湖心辰(杭州)科技有限公司 Heart physiotherapy healed large model dialogue intelligent body
CN120353899B (en) * 2025-06-18 2025-09-19 西湖心辰(杭州)科技有限公司 Large-scale dialogue agent oriented towards psychological healing
CN120509418A (en) * 2025-07-22 2025-08-19 广东博今信息技术集团有限公司 User feedback-based short message content AI iteration method and system
CN120509418B (en) * 2025-07-22 2025-12-02 广东博今信息技术集团有限公司 A method and system for AI-based iterative analysis of SMS content based on user feedback

Also Published As

Publication number Publication date
CN119441443B (en) 2025-05-16

Similar Documents

Publication Publication Date Title
CN114547329B (en) Method for establishing pre-trained language model, semantic parsing method and device
CN119441443B (en) Automatic prompt construction method based on human-computer dialogue history and semantic retrieval
JP2025527146A (en) Systems and methods for real-time search-based generative artificial intelligence
US7496621B2 (en) Method, program, and apparatus for natural language generation
US20220414463A1 (en) Automated troubleshooter
CN119669485B (en) Legal document intelligent generation method and system based on large model
CN115599901B (en) Machine Question Answering Method, Device, Equipment and Storage Medium Based on Semantic Prompts
CN117370190A (en) Test case generation method and device, electronic equipment and storage medium
CN117435716A (en) Data processing method and system for power grid human-computer interaction terminal
CN119960938B (en) Interface calling method, device, computer equipment and storage medium
CN120371854B (en) A structured query language generation method, device, equipment and storage medium
CN119557424B (en) Data analysis method, system and storage medium
CN118761470B (en) A method and system for implementing AI intelligent question answering based on digital intelligence platform
CN119359252A (en) A method, device and storage medium for implementing a business intelligent flow engine
CN119337850A (en) Method, device, electronic device and storage medium for constructing bid document data set
CN119204025A (en) A method, system, device and medium for fact checking of large language models based on retrieval enhancement
CN118333157A (en) Domain word vector construction method and system for HAZOP knowledge graph analysis
CN120763283A (en) A government affairs question-answering system based on deep government affairs thinking
CN120430294A (en) Project text optimization processing method and device
CN120218038A (en) User manual automatic compilation method and system based on neural network model
CN119557401A (en) A family service support method, device and medium based on large language model
CN118627471A (en) Automatic annotation method and system for government data based on attention-dependent graph convolution
CN118981526A (en) Multimodal zero-code form modeling intelligent question-answering method and related equipment
CN117727291A (en) Self-correction method for synthesizing voice and text, storage medium and electronic equipment
CN118796713B (en) AI intelligent generation application system based on keyword description and requirement document parsing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant