CN109344231A - Method and system for completing corpus of semantic deformity - Google Patents
- Publication number
- CN109344231A CN201811288739.1A
- Authority
- CN
- China
- Prior art keywords
- regular expression
- speech
- completion
- participle
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a method and a system for completing a semantically incomplete corpus. The method comprises the following steps: acquiring a semantically complete corpus sample library, and establishing an audio library, a semantic slot and a regular expression library according to the corpus sample library; acquiring user speech; matching the user speech against the audio library; when a match is found, determining the part of speech corresponding to each matched segment according to the semantic slot, wherein a matched segment is a word segment in the user speech that matches the audio library; comparing the parts of speech of the matched segments with the regular expression library, and completing the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text; and performing semantic parsing on the completed text. By completing the missing components of the corpus, the invention intelligently identifies the user's true intention.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a method and system for completing a semantically incomplete corpus.
Background technique
With the rapid development of the Internet, people's lives are becoming more and more intelligent, and people are increasingly accustomed to using intelligent terminals to meet their various needs. As artificial-intelligence technologies mature, terminals of all kinds are also becoming more intelligent. Voice interaction, one of the mainstream forms of human-computer interaction on intelligent terminals, is increasingly favored by users.
An intelligent terminal recognizes the speech input by the user and then takes corresponding action, so the accuracy of the speech that the user inputs through the terminal strongly affects the feedback the terminal gives.
Unexpected situations may occur while the user is inputting speech: for example, part of the speech input is interrupted, part of the speech is not picked up by the microphone, or the external environment interferes, such as an environment so noisy that only part of the speech can be recognized. Speech acquired in this way has missing components, which makes it difficult to identify the user's true intention accurately.
In addition, younger students, who have only just begun learning, often produce utterances with missing language components and vague intent, which makes it difficult for speech recognition products to identify the user's true intention.
Summary of the invention
The object of the present invention is to provide a method and system for completing a semantically incomplete corpus, so that the user's true intention is intelligently identified by completing the missing components of the corpus.
The technical solution provided by the invention is as follows:
The present invention provides a method for completing a semantically incomplete corpus, characterized by comprising:
acquiring a semantically complete corpus sample library, and establishing an audio library, a semantic slot and a regular expression library according to the corpus sample library;
acquiring user speech;
matching the user speech against the audio library;
when a match is found, determining the part of speech corresponding to each matched segment according to the semantic slot, a matched segment being a word segment in the user speech that matches the audio library;
comparing the parts of speech of the matched segments with the regular expression library, and completing the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text;
performing semantic parsing on the completed text.
Further, acquiring the semantically complete corpus sample library and establishing the audio library, the semantic slot and the regular expression library according to the corpus sample library specifically comprises:
acquiring the semantically complete corpus sample library, and segmenting the corpus samples in the corpus sample library with a word segmentation technique to obtain the word segments contained in the corpus samples and their corresponding parts of speech;
establishing the semantic slot according to the word segments and the parts of speech;
obtaining the audio corresponding to the word segments, and establishing the audio library according to the audio;
analyzing and summarizing the corpus samples to obtain regular expressions, and establishing the regular expression library according to the regular expressions.
Further, analyzing and summarizing the corpus samples to obtain regular expressions and establishing the regular expression library according to the regular expressions specifically comprises:
analyzing the association relationships between the word segments in the corpus samples;
summarizing the parts of speech and the association relationships to obtain regular expressions, and establishing the regular expression library according to the regular expressions.
Further, after acquiring the user speech and before matching the user speech against the audio library, the method comprises:
converting the user speech into a recognized text, and parsing the recognized text;
when the recognized text has missing components, completing the recognized text according to the audio library, the semantic slot and the regular expression library.
Further, comparing the parts of speech of the matched segments with the regular expression library and completing the missing components of the user speech according to the regular expressions in the regular expression library to obtain the completed text specifically comprises:
determining the relative positions of all matched segments in the user speech;
determining the corresponding part-of-speech relative positions according to the relative positions;
comparing the part-of-speech relative positions with the regular expression library, and selecting a preset quantity of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions;
completing the missing components of the user speech according to the target regular expressions to obtain the completed text.
The present invention also provides a system for completing a semantically incomplete corpus, characterized by comprising:
a database establishment module, which acquires a semantically complete corpus sample library and establishes an audio library, a semantic slot and a regular expression library according to the corpus sample library;
an acquisition module, which acquires user speech;
a matching module, which matches the user speech acquired by the acquisition module against the audio library established by the database establishment module;
an analysis module, which, when a match is found, determines the part of speech corresponding to each matched segment according to the semantic slot established by the database establishment module, a matched segment being a word segment in the user speech that matches the audio library;
a processing module, which compares the parts of speech of the matched segments determined by the analysis module with the regular expression library established by the database establishment module, and completes the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text;
a parsing module, which performs semantic parsing on the completed text obtained by the processing module.
Further, the database establishment module specifically comprises:
an acquiring unit, which acquires the semantically complete corpus sample library;
a segmentation unit, which segments the corpus samples in the corpus sample library acquired by the acquiring unit with a word segmentation technique to obtain the word segments contained in the corpus samples and their corresponding parts of speech;
a database establishment unit, which establishes the semantic slot according to the word segments and the parts of speech obtained by the segmentation unit;
the acquiring unit further obtains the audio corresponding to the word segments;
the database establishment unit further establishes the audio library according to the audio obtained by the acquiring unit;
an analysis unit, which analyzes and summarizes the corpus samples in the corpus sample library acquired by the acquiring unit to obtain regular expressions;
the database establishment unit further establishes the regular expression library according to the regular expressions obtained by the analysis unit.
Further, the analysis unit specifically comprises:
an analysis subunit, which analyzes the association relationships between the word segments in the corpus samples;
a generation subunit, which summarizes the parts of speech and the association relationships obtained by the analysis subunit to obtain regular expressions.
Further, the system also comprises:
a conversion module, which converts the user speech acquired by the acquisition module into a recognized text and parses the recognized text;
a completion module, which, when the conversion module finds on parsing that the recognized text has missing components, completes the recognized text according to the audio library, the semantic slot and the regular expression library.
Further, the processing module specifically comprises:
a processing unit, which determines the relative positions of all matched segments in the user speech;
the processing unit further determines the corresponding part-of-speech relative positions according to the relative positions it has determined;
a selecting unit, which compares the part-of-speech relative positions determined by the processing unit with the regular expression library, and selects a preset quantity of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions;
a completion unit, which completes the missing components of the user speech according to the target regular expressions selected by the selecting unit to obtain the completed text.
The method and system for completing a semantically incomplete corpus provided by the invention can bring at least one of the following beneficial effects:
1. In the present invention, an audio library, a semantic slot and a regular expression library are established from a semantically complete corpus sample library, so that the rules followed by semantically complete corpora are derived and a semantically incomplete corpus can subsequently be completed according to those rules.
2. In the present invention, it is first judged whether the acquired user speech has missing components, and the semantics are completed only when components are judged to be missing, which avoids unnecessary work.
3. In the present invention, the acquired user speech is compared with the features summarized from the semantically complete corpus samples (the audio library, the semantic slot and the regular expression library), so that the missing components are completed with the most likely content.
Detailed description of the invention
Preferred embodiments are described below in a clear and understandable manner with reference to the drawings, to further explain the above characteristics, technical features, advantages and implementation of the method and system for completing a semantically incomplete corpus.
Fig. 1 is a flow chart of one embodiment of the method for completing a semantically incomplete corpus of the present invention;
Fig. 2 is a flow chart of the second embodiment of the method for completing a semantically incomplete corpus of the present invention;
Fig. 3 is a flow chart of the third embodiment of the method for completing a semantically incomplete corpus of the present invention;
Fig. 4 is a flow chart of the fourth embodiment of the method for completing a semantically incomplete corpus of the present invention;
Fig. 5 is a structural diagram of the fifth embodiment of the system for completing a semantically incomplete corpus of the present invention;
Fig. 6 is a structural diagram of the sixth embodiment of the system for completing a semantically incomplete corpus of the present invention;
Fig. 7 is a structural diagram of the seventh embodiment of the system for completing a semantically incomplete corpus of the present invention;
Fig. 8 is a structural diagram of the eighth embodiment of the system for completing a semantically incomplete corpus of the present invention.
Drawing reference numeral explanation:
1000 system for completing a semantically incomplete corpus
1100 database establishment module; 1110 acquiring unit; 1120 segmentation unit; 1130 database establishment unit; 1140 analysis unit
1141 analysis subunit; 1142 generation subunit
1200 acquisition module; 1300 matching module; 1400 analysis module
1500 processing module; 1510 processing unit; 1520 selecting unit; 1530 completion unit
1600 parsing module; 1700 conversion module; 1800 completion module
Specific embodiment
In order to explain the embodiments of the invention or the technical solutions of the prior art more clearly, specific embodiments of the invention are described below with reference to the drawings. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings, and other embodiments, from them without creative effort.
For simplicity, each figure shows only the parts relevant to the invention schematically; they do not represent the actual structure of a product. In addition, to keep the figures simple and easy to understand, where a figure contains several components with the same structure or function, only one of them is drawn or labelled. Herein, "one" means not only "only one" but may also mean "more than one".
The first embodiment of the present invention, as shown in Fig. 1, is a method for completing a semantically incomplete corpus, comprising:
S100: acquire a semantically complete corpus sample library, and establish an audio library, a semantic slot and a regular expression library according to the corpus sample library.
Specifically, a large number of semantically complete corpus samples are collected to build the corpus sample library, and all the corpus samples are analyzed to summarize the features of semantically complete corpora, from which the audio library, the semantic slot and the regular expression library are established.
S200: acquire user speech.
Specifically, user speech is acquired. It may be speech that the user inputs in real time, where, for example, the user inputs only part of an utterance, or the system captures only part of it because of environmental or other factors. It may also be downloaded or recorded audio, where, for example, the recording is so noisy that only part of the speech can be recognized.
S500: match the user speech against the audio library.
Specifically, the acquired user speech is matched one by one against the audios in the audio library summarized from the large number of corpus samples.
S600: when a match is found, determine the part of speech corresponding to each matched segment according to the semantic slot, a matched segment being a word segment in the user speech that matches the audio library.
Specifically, when an audio in the audio library matches some part of the acquired user speech, the word segment corresponding to that audio is taken as a matched segment, and the matched segment is looked up in the semantic slot to determine its part of speech.
S700: compare the parts of speech of the matched segments with the regular expression library, and complete the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text.
Specifically, the parts of speech of the matched segments are compared one by one with the regular expressions in the regular expression library to determine the most likely part of speech of each missing part of the user speech. The missing components are then completed according to the content already present in the user speech and the inferred parts of speech of the missing parts, giving the completed text.
S800: perform semantic parsing on the completed text.
Specifically, the completed text is parsed according to the parts of speech of its word segments and the relationships between those parts of speech, giving the semantics of the user speech. The user's true intention is thereby identified, and corresponding feedback or action follows.
In this embodiment, an audio library, a semantic slot and a regular expression library are established from a semantically complete corpus sample library, so that the rules followed by semantically complete corpora are derived and a semantically incomplete corpus can subsequently be completed according to those rules. The acquired user speech is then compared with the features summarized from the semantically complete corpus samples (the audio library, the semantic slot and the regular expression library), so that the missing components are completed with the most likely content.
The second embodiment of the present invention is an optimized embodiment of the first embodiment and, as shown in Fig. 2, comprises:
S110: acquire the semantically complete corpus sample library, and segment the corpus samples in the corpus sample library with a word segmentation technique to obtain the word segments contained in the corpus samples and their corresponding parts of speech.
Specifically, a large number of semantically complete corpus samples are collected to build the corpus sample library. Corpus samples are not limited to written text; they also include speech, audio and the like. The difference is that speech and audio samples must first be converted into the corresponding text before subsequent processing.
The corpus samples are segmented with a word segmentation technique: the structure of each sentence in a corpus sample is determined, the part of speech of each word in the sentence is identified, and the whole sentence is then divided, according to the parts of speech, into segments such as characters, words and phrases. This gives the word segments contained in the corpus samples and their corresponding parts of speech.
S120: establish the semantic slot according to the word segments and the parts of speech.
Specifically, all the word segments contained in all the corpus samples are obtained, the semantic slot is established from these segments and their corresponding parts of speech, and the correspondence between segments and parts of speech is recorded in the semantic slot.
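As an illustration only, the semantic slot can be pictured as a lookup table from each word segment to its part of speech. The sketch below assumes segmentation and part-of-speech tagging have already been done by some external tool; the helper name `build_semantic_slot` and the sample data are hypothetical and do not come from the patent.

```python
from typing import Dict, Iterable, List, Tuple

# One tagged corpus sample: a list of (segment, part_of_speech) pairs.
TaggedSample = List[Tuple[str, str]]


def build_semantic_slot(samples: Iterable[TaggedSample]) -> Dict[str, str]:
    """Record the segment -> part-of-speech correspondence (hypothetical helper)."""
    slot: Dict[str, str] = {}
    for sample in samples:
        for segment, pos in sample:
            # Keep the first tag seen; a fuller system might store several tags.
            slot.setdefault(segment, pos)
    return slot


# Hypothetical tagged sample, loosely following the patent's "autumn" example.
samples = [[("describe", "verb"), ("autumn", "time word"), ("composition", "noun")]]
semantic_slot = build_semantic_slot(samples)
```

Looking up a matched segment is then a plain dictionary access, for example `semantic_slot["autumn"]`.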
S130: obtain the audio corresponding to the word segments, and establish the audio library according to the audio.
Specifically, the audio corresponding to each word segment is obtained. Because of factors such as the user's age and accent, the same segment may correspond to multiple audios, so as many different audios of the same segment as possible are collected, allowing user speech to be recognized comprehensively in one pass later and avoiding omissions. The audio library is then established from all the audios, and the correspondence between segments and audios is recorded in the audio library.
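The one-to-many correspondence between a segment and its audios can be sketched as a two-way index. This is a toy model under a loud assumption: each audio is represented by an opaque string identifier, whereas a real system would store and compare acoustic features. The class `AudioLibrary` and its method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class AudioLibrary:
    """Toy audio library: each segment maps to one or more audio identifiers."""

    def __init__(self) -> None:
        self._by_segment: Dict[str, List[str]] = defaultdict(list)
        self._by_audio: Dict[str, str] = {}

    def add(self, segment: str, audio_id: str) -> None:
        # Several audios (different ages, accents) may share one segment.
        self._by_segment[segment].append(audio_id)
        self._by_audio[audio_id] = segment

    def audios_for(self, segment: str) -> List[str]:
        return list(self._by_segment.get(segment, []))

    def segment_for(self, audio_id: str) -> Optional[str]:
        # Returns None when the audio is not in the library.
        return self._by_audio.get(audio_id)


library = AudioLibrary()
library.add("autumn", "autumn_child.wav")
library.add("autumn", "autumn_adult_accent.wav")
```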
S140: analyze and summarize the corpus samples to obtain regular expressions, and establish the regular expression library according to the regular expressions.
Specifically, each corpus sample is analyzed and summarized one by one to obtain a regular expression, one per corpus sample, and the regular expression library is established from all the regular expressions. The regular expressions are then analyzed statistically for rules whose proportion among all the regular expressions is greater than or equal to a preset ratio, and these rules are used as the rules for completing the missing components of a corpus. One example is the rule that, in a semantically complete corpus, a segment of a certain part of speech should be connected with one or more other specific segments.
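The statistical step can be sketched as a frequency count over the library. The patent does not say exactly what counts as a "rule"; the sketch below assumes, purely for illustration, that a rule is a whole part-of-speech pattern and that its proportion is its share of all patterns. The function name `frequent_rules` is hypothetical.

```python
from collections import Counter
from typing import List


def frequent_rules(patterns: List[str], preset_ratio: float) -> List[str]:
    """Keep the patterns whose share of the library is >= preset_ratio."""
    total = len(patterns)
    if total == 0:
        return []
    counts = Counter(patterns)
    return [p for p, n in counts.items() if n / total >= preset_ratio]


# Hypothetical library: one pattern per corpus sample.
patterns = ["verb#noun", "verb#noun", "noun#verb", "verb#noun"]
rules = frequent_rules(patterns, preset_ratio=0.5)
```

Here `verb#noun` accounts for three of the four patterns and is kept, while `noun#verb` accounts for one of four and is dropped.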
S200: acquire user speech.
S500: match the user speech against the audio library.
S600: when a match is found, determine the part of speech corresponding to each matched segment according to the semantic slot, a matched segment being a word segment in the user speech that matches the audio library.
S700: compare the parts of speech of the matched segments with the regular expression library, and complete the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text.
S800: perform semantic parsing on the completed text.
Step S140, analyzing and summarizing the corpus samples to obtain regular expressions and establishing the regular expression library according to the regular expressions, specifically comprises:
S141: analyze the association relationships between the word segments in the corpus samples.
Specifically, the word segments contained in the corpus samples and their corresponding parts of speech have been obtained above with the word segmentation technique; the association relationships between the segments in a corpus sample are then analyzed according to the sentence components of the corpus sample.
For example, take the corpus sample "What compositions describing autumn are there?". The word segmentation technique determines the parts of speech of the words it contains: describe (verb), autumn (time word), the linking particle (auxiliary word), composition (noun), have (verb), which (pronoun). The relationships between the words are: attributive-head relationship: composition (noun) - describe (verb); verb-object relationship: describe (verb) - autumn (time word).
S142: summarize the parts of speech and the association relationships to obtain regular expressions, and establish the regular expression library according to the regular expressions.
Specifically, regular expressions are obtained by summarizing the parts of speech of the segments in the corpus samples and their association relationships with one another, and the regular expression library is established from all the regular expressions. The regular expressions are then analyzed statistically for rules whose proportion among all the regular expressions is greater than or equal to a preset ratio, and these rules are used as the rules for completing the missing components of a corpus. One example is the rule that, in a semantically complete corpus, a segment of a certain part of speech should be connected with one or more other specific segments.
For example, take again the corpus sample "What compositions describing autumn are there?". The word segmentation technique determines the parts of speech of the words it contains: describe (verb), autumn (time word), the linking particle (auxiliary word), composition (noun), have (verb), which (pronoun). The relationships between the words are: attributive-head relationship: composition (noun) - describe (verb); verb-object relationship: describe (verb) - autumn (time word). The regular expression corresponding to this corpus sample is therefore: verb#time word#auxiliary word#noun#verb#pronoun, where the noun and the first verb are in an attributive-head relationship and the first verb and the time word are in a verb-object relationship.
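The pattern in this example is simply the sequence of part-of-speech tags joined by `#`. A minimal sketch of that step follows; the helper name `to_pos_pattern` and the English stand-ins for the segments are assumptions, and the association relationships (attributive-head, verb-object) are not modelled here.

```python
from typing import List, Tuple


def to_pos_pattern(tagged: List[Tuple[str, str]], sep: str = "#") -> str:
    """Join the part-of-speech tags of one tagged sample into a pattern string."""
    return sep.join(pos for _segment, pos in tagged)


# English stand-ins for the segments of the patent's example sentence.
sample = [
    ("describe", "verb"),
    ("autumn", "time word"),
    ("<particle>", "auxiliary word"),
    ("composition", "noun"),
    ("have", "verb"),
    ("which", "pronoun"),
]
pattern = to_pos_pattern(sample)
```

The resulting string is `verb#time word#auxiliary word#noun#verb#pronoun`, matching the expression given in the text.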
In this embodiment, the semantically complete corpus samples are segmented with a word segmentation technique in order to establish the audio library, the semantic slot and the regular expression library, and the rules followed by semantically complete corpora are derived from them statistically, so that a semantically incomplete corpus can subsequently be completed according to those rules.
The third embodiment of the present invention is an optimized embodiment of the first embodiment and, as shown in Fig. 3, comprises:
S100: acquire a semantically complete corpus sample library, and establish an audio library, a semantic slot and a regular expression library according to the corpus sample library.
S200: acquire user speech.
S300: convert the user speech into a recognized text, and parse the recognized text.
S400: when the recognized text has missing components, complete the recognized text according to the audio library, the semantic slot and the regular expression library.
Specifically, the acquired user speech is converted into a recognized text, and the recognized text is parsed to judge whether it has missing components. If it does, the components of the recognized text are completed according to the audio library, the semantic slot and the regular expression library summarized from the large number of semantically complete corpus samples. If no component is missing, the user's true intention is identified directly from the recognized text, and corresponding feedback or action follows.
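The control flow of S300 and S400 is a simple gate: completion runs only when the recognized text is judged incomplete. The sketch below shows just that gate. The incompleteness check, the completion routine and the parser are passed in as callables, and the toy versions used here are stand-ins invented for illustration, not the patent's procedures.

```python
from typing import Callable


def handle_recognized_text(
    recognized_text: str,
    is_incomplete: Callable[[str], bool],
    complete: Callable[[str], str],
    parse_intent: Callable[[str], str],
) -> str:
    """Complete the text only when it is judged incomplete, then parse it."""
    if is_incomplete(recognized_text):
        recognized_text = complete(recognized_text)
    return parse_intent(recognized_text)


# Toy stand-ins (assumptions for illustration only).
def toy_is_incomplete(text: str) -> bool:
    return "..." in text


def toy_complete(text: str) -> str:
    return text.replace("...", "autumn")


def toy_parse(text: str) -> str:
    return "intent:" + text


result_complete = handle_recognized_text(
    "describe autumn", toy_is_incomplete, toy_complete, toy_parse
)
result_repaired = handle_recognized_text(
    "describe ...", toy_is_incomplete, toy_complete, toy_parse
)
```

Keeping the gate separate from the completion logic reflects the stated benefit of this embodiment: complete utterances skip the completion work entirely.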
S500: match the user speech against the audio library.
S600: when a match is found, determine the part of speech corresponding to each matched segment according to the semantic slot, a matched segment being a word segment in the user speech that matches the audio library.
S700: compare the parts of speech of the matched segments with the regular expression library, and complete the missing components of the user speech according to the regular expressions in the regular expression library to obtain a completed text.
S800: perform semantic parsing on the completed text.
In this embodiment, after the user speech is acquired, it is first judged whether the speech has missing components, and the semantics are completed only when components are determined to be missing, which avoids unnecessary work.
The fourth embodiment of the present invention is an optimized embodiment of the first embodiment and, as shown in Fig. 4, comprises:
S100: acquire a semantically complete corpus sample library, and establish an audio library, a semantic slot and a regular expression library according to the corpus sample library.
S200: acquire user speech.
S500: match the user speech against the audio library.
S600: when a match is found, determine the part of speech corresponding to each matched segment according to the semantic slot, a matched segment being a word segment in the user speech that matches the audio library.
S710: determine the relative positions of all matched segments in the user speech.
Specifically, after the user speech has been matched against the audios in the audio library and the matched segments have been obtained, the position of each matched segment in the user speech and the positions of all matched segments relative to one another are determined.
S720: determine the corresponding part-of-speech relative positions according to the relative positions.
Specifically, after the user speech has been matched against the audios in the audio library and the matched segments have been obtained, the part of speech of each matched segment is determined from the semantic slot, and the relative positions of the matched segments then give the relative positions of their parts of speech.
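Steps S710 and S720 can be sketched as sorting the matched segments by where they occur and then replacing each segment with its part of speech. The sketch assumes each match carries a start offset in seconds and that the semantic slot is a plain dictionary; both the data layout and the function name `pos_relative_positions` are hypothetical.

```python
from typing import Dict, List, Tuple


def pos_relative_positions(
    matches: List[Tuple[float, str]], semantic_slot: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Order matched segments by position and attach their parts of speech."""
    ordered = sorted(matches, key=lambda m: m[0])
    # Segments missing from the semantic slot are skipped in this sketch.
    return [(seg, semantic_slot[seg]) for _offset, seg in ordered if seg in semantic_slot]


slot = {"describe": "verb", "composition": "noun", "which": "pronoun"}
# (start offset in seconds, matched segment), in arbitrary order.
matches = [(2.4, "which"), (0.0, "describe"), (1.1, "composition")]
ordered_pos = pos_relative_positions(matches, slot)
```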
S730: compare the part-of-speech relative positions with the regular expression library, and select a preset quantity of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions.
Specifically, the part-of-speech relative positions are compared one by one with the regular expressions in the regular expression library to obtain the matching ratio of each pair, and a preset quantity of regular expressions whose matching ratio is greater than or equal to the preset ratio are chosen as target regular expressions. The preset quantity and the preset ratio are set automatically by the system or chosen by the user.
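The patent does not define how the matching ratio is computed. The sketch below makes one possible choice, stated here as an assumption: the ratio is the length of the longest common subsequence between the observed part-of-speech sequence and the pattern, divided by the number of observed tags, so a pattern scores 1.0 when every observed tag fits it in order. The function names are hypothetical.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def matching_ratio(observed: List[str], pattern: str, sep: str = "#") -> float:
    """Assumed definition: share of observed tags that fit the pattern in order."""
    if not observed:
        return 0.0
    return lcs_length(observed, pattern.split(sep)) / len(observed)


def select_targets(
    observed: List[str], library: List[str], preset_quantity: int, preset_ratio: float
) -> List[str]:
    """Pick up to preset_quantity patterns with ratio >= preset_ratio, best first."""
    scored = [(matching_ratio(observed, p), p) for p in library]
    kept = [sp for sp in scored if sp[0] >= preset_ratio]
    kept.sort(key=lambda sp: -sp[0])  # stable sort keeps library order on ties
    return [p for _ratio, p in kept[:preset_quantity]]


observed = ["verb", "noun", "pronoun"]
library = [
    "noun#verb",
    "pronoun#verb#noun",
    "verb#time word#auxiliary word#noun#verb#pronoun",
]
targets = select_targets(observed, library, preset_quantity=2, preset_ratio=0.6)
```

Other definitions, such as normalizing by pattern length or using edit distance, would rank the patterns differently; the choice here favors patterns that can absorb every observed tag.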
S740: complete the missing components of the user speech according to the target regular expressions to obtain the completed text.
Specifically, the part-of-speech relative positions of the matched segments are compared with the selected target regular expression to determine the part of speech of each missing part. The most likely meaning is then judged from the part of the user speech that can be recognized and parsed, and words with that meaning are selected within the determined part of speech of each missing part to complete the missing components of the user speech, giving the completed text.
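The first half of S740, finding which parts of speech are missing, can be sketched as aligning the observed segments against the target pattern. The sketch uses a simple greedy left-to-right alignment, which is an assumption and can misalign when a part of speech repeats in the pattern. Choosing the actual words for the gaps depends on the semantic judgment described above and is not modelled. The function name `align_to_pattern` is hypothetical.

```python
from typing import List, Optional, Tuple


def align_to_pattern(
    pattern: str, observed: List[Tuple[str, str]], sep: str = "#"
) -> List[Tuple[str, Optional[str]]]:
    """Greedily align (segment, pos) pairs to the pattern; gaps get None."""
    aligned: List[Tuple[str, Optional[str]]] = []
    i = 0
    for pos in pattern.split(sep):
        if i < len(observed) and observed[i][1] == pos:
            aligned.append((pos, observed[i][0]))
            i += 1
        else:
            aligned.append((pos, None))  # this slot is a missing component
    return aligned


target = "verb#time word#auxiliary word#noun#verb#pronoun"
observed = [("describe", "verb"), ("composition", "noun"), ("which", "pronoun")]
aligned = align_to_pattern(target, observed)
missing = [pos for pos, seg in aligned if seg is None]
```

For this input the gaps fall on the time word, the auxiliary word and the second verb, which are the slots a completion step would then need to fill.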
S800: perform semantic parsing on the completion text.
In this embodiment, the user speech is processed to determine the matching participles, the relative positions of the matching participles and of their parts of speech are obtained, and these are compared with the regular expression library to select the target regular expression. The incomplete components of the user speech are thereby completed, so that the user's true intention can be identified and the corresponding feedback or action taken.
The fifth embodiment of the present invention, as shown in FIG. 5, is a system 1000 for completing a semantically incomplete corpus, comprising:
A database module 1100, which obtains a semantically complete corpus sample database and establishes an audio repository, a semantic slot and a regular expression library from the corpus sample database.
Specifically, a large number of semantically complete corpus samples are collected to establish the corpus sample database, and all corpus samples are analyzed to summarize the features of semantically complete corpora, from which the audio repository, the semantic slot and the regular expression library are established.
An acquisition module 1200, which obtains the user speech.
Specifically, the user speech may be speech input by the user in real time, for example where the user inputs only part of an utterance, or where environmental or other factors cause the system to capture only part of the speech. It may also be downloaded or recorded audio, for example a recording so noisy that only part of the speech can be recognized.
A matching module 1300, which matches the user speech obtained by the acquisition module 1200 against the audio repository established by the database module 1100.
Specifically, the obtained user speech is matched one by one against the audio in the audio repository, which was built from a large number of corpus samples.
An analysis module 1400, which, when a match is found, determines the part of speech of each matching participle from the semantic slot established by the database module 1100, a matching participle being a participle in the user speech that matches the audio repository.
Specifically, when an audio entry in the audio repository matches some part of the obtained user speech, the participle corresponding to that audio is taken as a matching participle, and the matching participle is looked up in the semantic slot to determine its part of speech.
A processing module 1500, which compares the parts of speech of the matching participles determined by the analysis module 1400 with the regular expression library established by the database module 1100, and completes the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain the completion text.
Specifically, the parts of speech of the matching participles are compared one by one with the regular expressions in the regular expression library to determine the most likely part of speech of the missing part of the user speech. The incomplete components are then completed from the content already present in the user speech and the inferred part of speech of the missing part, giving the completion text.
A parsing module 1600, which performs semantic parsing on the completion text obtained by the processing module 1500.
Specifically, the completion text is parsed according to the parts of speech of its participles and the relationships between those parts of speech, giving the meaning of the user speech, so that the user's true intention is identified and the corresponding feedback or action is taken.
In this embodiment, the audio repository, the semantic slot and the regular expression library are established from a semantically complete corpus sample database, so that the features of semantically complete corpora are captured for the later completion of semantically incomplete corpora. The obtained user speech is then compared with these features (the audio repository, the semantic slot and the regular expression library), so that the incomplete components are completed with the most likely content.
The sixth embodiment of the present invention is a preferred embodiment of the fifth embodiment above and, as shown in FIG. 6, comprises:
A database module 1100, which obtains a semantically complete corpus sample database and establishes an audio repository, a semantic slot and a regular expression library from the corpus sample database.
The database module 1100 specifically comprises:
An acquiring unit 1110, which obtains the semantically complete corpus sample database.
Specifically, the acquiring unit 1110 collects a large number of semantically complete corpus samples to establish the corpus sample database. Corpus samples include not only written text but also speech, audio and the like; the difference is that speech and audio samples must first be converted into the corresponding text before further processing.
A participle unit 1120, which segments the corpus samples in the corpus sample database obtained by the acquiring unit 1110 using a word segmentation technique, to obtain the participles contained in the corpus samples and their parts of speech.
Specifically, the participle unit 1120 segments each corpus sample using a word segmentation technique: it determines the structure of each sentence in the corpus sample, identifies the part of speech of each word in the sentence, and then divides the whole sentence into participles such as characters, words and phrases according to those parts of speech. The participles contained in the corpus sample and their parts of speech are thereby obtained.
A database unit 1130, which establishes the semantic slot from the participles and parts of speech obtained by the participle unit 1120.
Specifically, once all participles contained in all the corpus samples have been obtained, the database unit 1130 establishes the semantic slot from all the participles and their parts of speech, and records in the semantic slot the correspondence between each participle and its part of speech.
The acquiring unit 1110 also obtains the audio corresponding to each participle.
The database unit 1130 establishes the audio repository from the audio obtained by the acquiring unit 1110.
Specifically, the acquiring unit 1110 obtains the audio corresponding to each participle. Because of factors such as the user's age and accent, the same participle may correspond to several audio recordings; as many different recordings of the same participle as possible are obtained, so that user speech can later be recognized comprehensively and without omissions. The database unit 1130 then establishes the audio repository from all the audio and records in it the correspondence between each participle and its audio.
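One possible in-memory layout for the semantic slot and the audio repository is sketched below. The patent does not prescribe any data structure, so the dictionaries, the string audio identifiers and the function names are assumptions; in particular, keeping only the first part of speech seen for a participle is a simplification.

```python
from collections import defaultdict


def build_semantic_slot(tagged_samples):
    """Map each participle to a part of speech (first one seen is kept)."""
    slot = {}
    for sample in tagged_samples:
        for word, pos in sample:
            slot.setdefault(word, pos)
    return slot


def build_audio_repository(recordings):
    """Build participle -> audio ids, plus the reverse lookup used when matching.

    recordings: iterable of (participle, audio_id); several ids per participle
    model different ages and accents.
    """
    by_word = defaultdict(set)
    for word, audio_id in recordings:
        by_word[word].add(audio_id)
    by_audio = {audio_id: word for word, ids in by_word.items() for audio_id in ids}
    return dict(by_word), by_audio


slot = build_semantic_slot([[("describe", "verb"), ("autumn", "time word")]])
by_word, by_audio = build_audio_repository(
    [("autumn", "autumn_child.wav"), ("autumn", "autumn_accent.wav")]
)
print(slot["autumn"], sorted(by_word["autumn"]), by_audio["autumn_child.wav"])
```

The reverse lookup is what lets a matched recording be traced back to its participle and, through the semantic slot, to its part of speech.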
An analysis unit 1140, which analyzes and summarizes the corpus samples in the corpus sample database obtained by the acquiring unit 1110 to obtain regular expressions.
Specifically, the analysis unit 1140 analyzes and summarizes the corpus samples one by one, each corpus sample yielding one regular expression, and the regular expression library is established from all the regular expressions. Rules whose proportion among all the regular expressions is greater than or equal to a preset ratio are identified statistically and used as rules for completing the incomplete components of a corpus, for example the rule that, in a semantically complete corpus, a participle of a certain part of speech should be connected to one or more other specific kinds of participle.
The analysis unit 1140 specifically comprises:
An analysis subunit 1141, which analyzes the association relationships between the participles in the corpus samples.
Specifically, with the participles contained in a corpus sample and their parts of speech obtained by the word segmentation technique as above, the analysis subunit 1141 analyzes the association relationships between the participles according to the constituents of the sentences in the corpus sample.
For example, take the corpus sample "Which compositions describing autumn are there?". The word segmentation technique determines the parts of speech of the words it contains, in their original order: describe (verb), autumn (time word), (auxiliary word), composition (noun), have (verb), which (pronoun). The relationships between the words are: an attributive-head relationship, composition (noun) - describe (verb); and a verb-object relationship, describe (verb) - autumn (time word).
A generation subunit 1142, which summarizes the parts of speech and the association relationships obtained by the analysis subunit 1141 into regular expressions.
The database unit 1130 establishes the regular expression library from the regular expressions obtained by the analysis unit 1140.
Specifically, the generation subunit 1142 summarizes the parts of speech of the participles in each corpus sample and their mutual association relationships into a regular expression, and the database unit 1130 establishes the regular expression library from all the regular expressions. Rules whose proportion among all the regular expressions is greater than or equal to a preset ratio are identified statistically and used as rules for completing the incomplete components of a corpus, for example the rule that, in a semantically complete corpus, a participle of a certain part of speech should be connected to one or more other specific kinds of participle.
For example, take the corpus sample "Which compositions describing autumn are there?". The word segmentation technique determines the parts of speech of the words it contains, in their original order: describe (verb), autumn (time word), (auxiliary word), composition (noun), have (verb), which (pronoun). The relationships between the words are: an attributive-head relationship, composition (noun) - describe (verb); and a verb-object relationship, describe (verb) - autumn (time word). The regular expression corresponding to this corpus sample is therefore: verb # time word # auxiliary word # noun # verb # pronoun. The noun and the first verb are in an attributive-head relationship, and the first verb and the time word are in a verb-object relationship.
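The example above can be reproduced with a short sketch that turns a tagged sample into a `#`-separated expression and then keeps the expressions that account for at least a preset share of the library. Treating each whole expression as a "rule" is an assumption, as are the function names; the English words are glosses of the example and `<aux>` stands in for the auxiliary word, which the translated text does not show.

```python
from collections import Counter


def to_expression(tagged_sample):
    """Join the parts of speech of a tagged sample in sentence order."""
    return " # ".join(pos for _, pos in tagged_sample)


def frequent_rules(expressions, preset_ratio):
    """Keep expressions whose share of all expressions is >= preset_ratio."""
    counts = Counter(expressions)
    total = len(expressions)
    return [expr for expr, n in counts.items() if n / total >= preset_ratio]


sample = [
    ("describe", "verb"),
    ("autumn", "time word"),
    ("<aux>", "auxiliary word"),
    ("composition", "noun"),
    ("have", "verb"),
    ("which", "pronoun"),
]
expression = to_expression(sample)
print(expression)
print(frequent_rules([expression, expression, "noun # verb"], preset_ratio=0.5))
```

The association relationships (attributive-head, verb-object) are not encoded in this string form; storing them would need a richer structure than the one sketched here.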
An acquisition module 1200, which obtains the user speech.
A matching module 1300, which matches the user speech obtained by the acquisition module 1200 against the audio repository established by the database module 1100.
An analysis module 1400, which, when a match is found, determines the part of speech of each matching participle from the semantic slot established by the database module 1100, a matching participle being a participle in the user speech that matches the audio repository.
A processing module 1500, which compares the parts of speech of the matching participles determined by the analysis module 1400 with the regular expression library established by the database module 1100, and completes the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain the completion text.
A parsing module 1600, which performs semantic parsing on the completion text obtained by the processing module 1500.
In this embodiment, the semantically complete corpus samples are segmented using a word segmentation technique to establish the audio repository, the semantic slot and the regular expression library, and the rules that semantically complete corpora follow are derived from them statistically, so that semantically incomplete corpora can later be completed according to these rules.
The seventh embodiment of the present invention is a preferred embodiment of the fifth embodiment above and, as shown in FIG. 7, comprises:
A database module 1100, which obtains a semantically complete corpus sample database and establishes an audio repository, a semantic slot and a regular expression library from the corpus sample database.
An acquisition module 1200, which obtains the user speech.
A conversion module 1700, which converts the user speech obtained by the acquisition module 1200 into recognized text and parses the recognized text.
A completion module 1800, which, when the conversion module 1700 finds on parsing that the components of the recognized text are incomplete, completes the recognized text according to the audio repository, the semantic slot and the regular expression library.
Specifically, the conversion module 1700 converts the obtained user speech into recognized text and parses it, and the completion module 1800 judges whether the components of the recognized text are incomplete. If they are, the components of the recognized text are completed according to the audio repository, the semantic slot and the regular expression library summarized from a large number of semantically complete corpus samples. If they are not, the user's true intention is identified directly from the recognized text and the corresponding feedback or action is taken.
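The control flow of this embodiment, where completion runs only when the recognized text is judged incomplete, can be sketched as below. The patent does not state how incompleteness is detected; the sketch assumes a text is complete when its part-of-speech sequence equals some expression in the library. `is_incomplete`, `process` and the `complete` callback are hypothetical names.

```python
def is_incomplete(pos_tags, library):
    """Assumed test: incomplete if the tag sequence matches no library expression."""
    return " # ".join(pos_tags) not in library


def process(pos_tags, library, complete):
    """Return (tags, was_completed); call complete() only when needed."""
    if is_incomplete(pos_tags, library):
        return complete(pos_tags), True
    return pos_tags, False


library = {"verb # time word # auxiliary word # noun # verb # pronoun"}
full = ["verb", "time word", "auxiliary word", "noun", "verb", "pronoun"]
print(process(full, library, complete=lambda tags: full))
print(process(["verb", "time word", "noun"], library, complete=lambda tags: full))
```

Skipping the completion call for complete inputs is the workload saving this embodiment describes.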
A matching module 1300, which matches the user speech obtained by the acquisition module 1200 against the audio repository established by the database module 1100.
An analysis module 1400, which, when a match is found, determines the part of speech of each matching participle from the semantic slot established by the database module 1100, a matching participle being a participle in the user speech that matches the audio repository.
A processing module 1500, which compares the parts of speech of the matching participles determined by the analysis module 1400 with the regular expression library established by the database module 1100, and completes the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain the completion text.
A parsing module 1600, which performs semantic parsing on the completion text obtained by the processing module 1500.
In this embodiment, after the user speech is obtained, it is first judged whether its components are incomplete, and the completion method is applied only when they are, which avoids unnecessary work.
The eighth embodiment of the present invention is a preferred embodiment of the fifth embodiment above and, as shown in FIG. 8, comprises:
A database module 1100, which obtains a semantically complete corpus sample database and establishes an audio repository, a semantic slot and a regular expression library from the corpus sample database.
An acquisition module 1200, which obtains the user speech.
A matching module 1300, which matches the user speech obtained by the acquisition module 1200 against the audio repository established by the database module 1100.
An analysis module 1400, which, when a match is found, determines the part of speech of each matching participle from the semantic slot established by the database module 1100, a matching participle being a participle in the user speech that matches the audio repository.
A processing module 1500, which compares the parts of speech of the matching participles determined by the analysis module 1400 with the regular expression library established by the database module 1100, and completes the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain the completion text.
The processing module 1500 specifically comprises:
A processing unit 1510, which determines the relative positions of all the matching participles in the user speech.
Specifically, after the user speech is matched against the audio in the audio repository and the matching participles are obtained, the processing unit 1510 determines the position of each matching participle in the user speech and the positions of all the matching participles relative to one another.
The processing unit 1510 also determines the corresponding part-of-speech relative positions from the relative positions it has determined.
Specifically, after the user speech is matched against the audio in the audio repository and the matching participles are obtained, the part of speech of each matching participle is determined from the semantic slot. The processing unit 1510 then derives the part-of-speech relative positions of the matching participles from the relative positions of the matching participles.
A selecting unit 1520, which compares the part-of-speech relative positions determined by the processing unit 1510 with the regular expression library, and selects a preset number of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions.
Specifically, the part-of-speech relative positions are compared one by one with the regular expressions in the regular expression library to obtain a matching ratio for each. The selecting unit 1520 chooses a preset number of regular expressions whose matching ratio is greater than or equal to the preset ratio as target regular expressions. The preset number and the preset ratio are either set automatically by the system or chosen by the user.
A completion unit 1530, which completes the incomplete components of the user speech according to the target regular expression selected by the selecting unit 1520 to obtain the completion text.
Specifically, the part-of-speech relative positions of the matching participles are compared with the selected target regular expression to determine the part of speech of the missing part. The most likely meaning is then inferred from the part of the user speech that could be recognized and parsed, and the completion unit 1530 selects a word with that meaning from the words of the inferred part of speech to complete the incomplete components of the user speech, giving the completion text.
A parsing module 1600, which performs semantic parsing on the completion text obtained by the processing module 1500.
In this embodiment, the user speech is processed to determine the matching participles, the relative positions of the matching participles and of their parts of speech are obtained, and these are compared with the regular expression library to select the target regular expression. The incomplete components of the user speech are thereby completed, so that the user's true intention can be identified and the corresponding feedback or action taken.
It should be noted that the above embodiments can be freely combined as needed. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art may make further improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.
Claims (10)
1. A method for completing a semantically incomplete corpus, characterized by comprising:
obtaining a semantically complete corpus sample database, and establishing an audio repository, a semantic slot and a regular expression library from the corpus sample database;
obtaining user speech;
matching the user speech against the audio repository;
when a match is found, determining the part of speech of each matching participle from the semantic slot, a matching participle being a participle in the user speech that matches the audio repository;
comparing the parts of speech of the matching participles with the regular expression library, and completing the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain a completion text;
performing semantic parsing on the completion text.
2. The method for completing a semantically incomplete corpus according to claim 1, characterized in that obtaining the semantically complete corpus sample database and establishing the audio repository, the semantic slot and the regular expression library from the corpus sample database specifically comprises:
obtaining the semantically complete corpus sample database, and segmenting the corpus samples in the corpus sample database using a word segmentation technique to obtain the participles contained in the corpus samples and their parts of speech;
establishing the semantic slot from the participles and the parts of speech;
obtaining the audio corresponding to the participles, and establishing the audio repository from the audio;
analyzing and summarizing the corpus samples to obtain regular expressions, and establishing the regular expression library from the regular expressions.
3. The method for completing a semantically incomplete corpus according to claim 2, characterized in that analyzing and summarizing the corpus samples to obtain regular expressions and establishing the regular expression library from the regular expressions specifically comprises:
analyzing the association relationships between the participles in the corpus samples;
summarizing the parts of speech and the association relationships into regular expressions, and establishing the regular expression library from the regular expressions.
4. The method for completing a semantically incomplete corpus according to claim 1, characterized in that, after obtaining the user speech and before matching the user speech against the audio repository, the method comprises:
converting the user speech into recognized text, and parsing the recognized text;
when the components of the recognized text are incomplete, completing the recognized text according to the audio repository, the semantic slot and the regular expression library.
5. The method for completing a semantically incomplete corpus according to any one of claims 1-4, characterized in that comparing the parts of speech of the matching participles with the regular expression library and completing the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain the completion text specifically comprises:
determining the relative positions of all the matching participles in the user speech;
determining the corresponding part-of-speech relative positions from the relative positions;
comparing the part-of-speech relative positions with the regular expression library, and selecting a preset number of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions;
completing the incomplete components of the user speech according to the target regular expressions to obtain the completion text.
6. A system for completing a semantically incomplete corpus, characterized by comprising:
a database module, which obtains a semantically complete corpus sample database and establishes an audio repository, a semantic slot and a regular expression library from the corpus sample database;
an acquisition module, which obtains user speech;
a matching module, which matches the user speech obtained by the acquisition module against the audio repository established by the database module;
an analysis module, which, when a match is found, determines the part of speech of each matching participle from the semantic slot established by the database module, a matching participle being a participle in the user speech that matches the audio repository;
a processing module, which compares the parts of speech of the matching participles determined by the analysis module with the regular expression library established by the database module, and completes the incomplete components of the user speech according to the regular expressions in the regular expression library to obtain a completion text;
a parsing module, which performs semantic parsing on the completion text obtained by the processing module.
7. The system for completing a semantically incomplete corpus according to claim 6, characterized in that the database module specifically comprises:
an acquiring unit, which obtains the semantically complete corpus sample database;
a participle unit, which segments the corpus samples in the corpus sample database obtained by the acquiring unit using a word segmentation technique, to obtain the participles contained in the corpus samples and their parts of speech;
a database unit, which establishes the semantic slot from the participles and the parts of speech obtained by the participle unit;
the acquiring unit also obtains the audio corresponding to the participles;
the database unit establishes the audio repository from the audio obtained by the acquiring unit;
an analysis unit, which analyzes and summarizes the corpus samples in the corpus sample database obtained by the acquiring unit to obtain regular expressions;
the database unit establishes the regular expression library from the regular expressions obtained by the analysis unit.
8. The system for completing a semantically incomplete corpus according to claim 7, characterized in that the analysis unit specifically comprises:
an analysis subunit, which analyzes the association relationships between the participles in the corpus samples;
a generation subunit, which summarizes the parts of speech and the association relationships obtained by the analysis subunit into regular expressions.
9. The system for completing a semantically incomplete corpus according to claim 6, characterized by further comprising:
a conversion module, which converts the user speech obtained by the acquisition module into recognized text and parses the recognized text;
a completion module, which, when the conversion module finds on parsing that the components of the recognized text are incomplete, completes the recognized text according to the audio repository, the semantic slot and the regular expression library.
10. The system for completing a semantically incomplete corpus according to any one of claims 6-9, characterized in that the processing module specifically comprises:
a processing unit, which determines the relative positions of all the matching participles in the user speech;
the processing unit also determines the corresponding part-of-speech relative positions from the relative positions it has determined;
a selecting unit, which compares the part-of-speech relative positions determined by the processing unit with the regular expression library, and selects a preset number of regular expressions whose matching ratio is greater than or equal to a preset ratio as target regular expressions;
a completion unit, which completes the incomplete components of the user speech according to the target regular expressions selected by the selecting unit to obtain the completion text.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811288739.1A CN109344231B (en) | 2018-10-31 | 2018-10-31 | A method and system for completing semantically incomplete corpus |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811288739.1A CN109344231B (en) | 2018-10-31 | 2018-10-31 | A method and system for completing semantically incomplete corpus |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109344231A true CN109344231A (en) | 2019-02-15 |
| CN109344231B CN109344231B (en) | 2021-08-17 |
Family
ID=65313422
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811288739.1A Active CN109344231B (en) | 2018-10-31 | 2018-10-31 | A method and system for completing semantically incomplete corpus |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109344231B (en) |
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109933198A (en) * | 2019-03-13 | 2019-06-25 | 广东小天才科技有限公司 | Semantic recognition method and device |
| CN109948155A (en) * | 2019-03-12 | 2019-06-28 | 广东小天才科技有限公司 | Multi-intention selection method and device and terminal equipment |
| CN110310641A (en) * | 2019-02-26 | 2019-10-08 | 北京蓦然认知科技有限公司 | A kind of method and device for voice assistant |
| CN110428830A (en) * | 2019-07-17 | 2019-11-08 | 上海麦图信息科技有限公司 | A kind of blank pipe instruction intension recognizing method based on regular expression |
| CN111858867A (en) * | 2019-04-30 | 2020-10-30 | 广东小天才科技有限公司 | A method and device for completing incomplete corpus |
| CN111949797A (en) * | 2019-04-30 | 2020-11-17 | 广东小天才科技有限公司 | A method and device for entity relationship completion based on neural network |
| CN113255343A (en) * | 2021-06-21 | 2021-08-13 | 中国平安人寿保险股份有限公司 | Semantic identification method and device for label data, computer equipment and storage medium |
| CN113362824A (en) * | 2021-06-09 | 2021-09-07 | 深圳市同行者科技有限公司 | Voice recognition method and device and terminal equipment |
| CN113539253A (en) * | 2020-09-18 | 2021-10-22 | 厦门市和家健脑智能科技有限公司 | Audio data processing method and device based on cognitive assessment |
| CN115048447A (en) * | 2022-06-27 | 2022-09-13 | 华中科技大学 | Database natural language interface system based on intelligent semantic completion |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1512395A (en) * | 2002-12-27 | 2004-07-14 | 联想(北京)有限公司 | Establishing method for open type natural language |
| EP1548575A2 (en) * | 2003-12-12 | 2005-06-29 | Alcatel | A fast, scalable pattern-matching engine |
| CN101470700A (en) * | 2007-12-28 | 2009-07-01 | 日电(中国)有限公司 | Text template generator, text generation equipment, text checking equipment and method thereof |
| CN103366740A (en) * | 2012-03-27 | 2013-10-23 | 联想(北京)有限公司 | Voice command recognition method and voice command recognition device |
| CN104572626A (en) * | 2015-01-23 | 2015-04-29 | 北京云知声信息技术有限公司 | Automatic semantic template generation method and device and semantic analysis method and system |
| CN105677642A (en) * | 2015-12-31 | 2016-06-15 | 成都数联铭品科技有限公司 | Machine translation word order adjusting method |
| CN107315737A (en) * | 2017-07-04 | 2017-11-03 | 北京奇艺世纪科技有限公司 | A kind of semantic logic processing method and system |
| CN107632979A (en) * | 2017-10-13 | 2018-01-26 | 华中科技大学 | The problem of one kind is used for interactive question and answer analytic method and system |
Cited By (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110310641A (en) * | 2019-02-26 | 2019-10-08 | 北京蓦然认知科技有限公司 | Method and device for voice assistant |
| CN110310641B (en) * | 2019-02-26 | 2022-08-26 | 杭州蓦然认知科技有限公司 | Method and device for voice assistant |
| CN109948155A (en) * | 2019-03-12 | 2019-06-28 | 广东小天才科技有限公司 | Multi-intention selection method and device and terminal equipment |
| CN109933198A (en) * | 2019-03-13 | 2019-06-25 | 广东小天才科技有限公司 | Semantic recognition method and device |
| CN111858867B (en) * | 2019-04-30 | 2024-10-18 | 广东小天才科技有限公司 | Incomplete corpus completion method and device |
| CN111858867A (en) * | 2019-04-30 | 2020-10-30 | 广东小天才科技有限公司 | A method and device for completing incomplete corpus |
| CN111949797A (en) * | 2019-04-30 | 2020-11-17 | 广东小天才科技有限公司 | A method and device for entity relationship completion based on neural network |
| CN110428830B (en) * | 2019-07-17 | 2021-09-21 | 上海麦图信息科技有限公司 | Regular expression-based air traffic control instruction intent recognition method |
| CN110428830A (en) * | 2019-07-17 | 2019-11-08 | 上海麦图信息科技有限公司 | Regular expression-based air traffic control instruction intent recognition method |
| CN113539253A (en) * | 2020-09-18 | 2021-10-22 | 厦门市和家健脑智能科技有限公司 | Audio data processing method and device based on cognitive assessment |
| CN113539253B (en) * | 2020-09-18 | 2024-05-14 | 厦门市和家健脑智能科技有限公司 | Audio data processing method and device based on cognitive assessment |
| CN113362824A (en) * | 2021-06-09 | 2021-09-07 | 深圳市同行者科技有限公司 | Voice recognition method and device and terminal equipment |
| CN113362824B (en) * | 2021-06-09 | 2024-03-12 | 深圳市同行者科技有限公司 | Voice recognition method and device and terminal equipment |
| CN113255343A (en) * | 2021-06-21 | 2021-08-13 | 中国平安人寿保险股份有限公司 | Semantic identification method and device for label data, computer equipment and storage medium |
| CN115048447A (en) * | 2022-06-27 | 2022-09-13 | 华中科技大学 | Database natural language interface system based on intelligent semantic completion |
| CN115048447B (en) * | 2022-06-27 | 2023-06-16 | 华中科技大学 | Database natural language interface system based on intelligent semantic completion |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109344231B (en) | 2021-08-17 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109344231A (en) | | Method and system for completing corpus of semantic deformity |
| CN114547329B (en) | | Method for establishing pre-trained language model, semantic parsing method and device |
| CN103065630B (en) | | User personalized information voice recognition method and user personalized information voice recognition system |
| US20160328467A1 (en) | | Natural language question answering method and apparatus |
| JP2005084681A (en) | | Method and system for semantic language modeling and reliability measurement |
| KR20030078388A (en) | | Apparatus for providing information using voice dialogue interface and method thereof |
| CN1237259A (en) | | Implicit-Markov-Pronunciation Model Matching Method in Speech Recognition System |
| RU2761940C1 (en) | | Methods and electronic apparatuses for identifying a statement of the user by a digital audio signal |
| CN119418699B (en) | | An intelligent voice interaction system and electronic student ID card |
| KR101677859B1 (en) | | Method for generating system response using knowledgy base and apparatus for performing the method |
| KR20070102267A (en) | | Conversation management method using conversation modeling method based on conversation management device and conversation example |
| CN110992959A (en) | | Voice recognition method and system |
| Gandhe et al. | | Using web text to improve keyword spotting in speech |
| CN109271492A (en) | | Automatic generation method and system of corpus regular expression |
| KR101396131B1 (en) | | Apparatus and method for measuring relation similarity based pattern |
| CN120670583B (en) | | Intelligent chart dynamic generation method based on speech recognition and multimodal interaction |
| Kowtha et al. | | Detecting emotion primitives from speech and their use in discerning categorical emotions |
| CN109360552A (en) | | Method and system for automatically filtering awakening words |
| CN109545202B (en) | | Method and system for adjusting corpus with semantic logic confusion |
| CN106951491A (en) | | A kind of Intelligent dialogue control method and device applied to robot |
| Yu et al. | | Incorporating multimodal sentiments into conversational bots for service requirement elicitation |
| Braunger et al. | | A comparative analysis of crowdsourced natural language corpora for spoken dialog systems |
| CN109766551A (en) | | Method and system for determining polysemous word meaning |
| CN119721046B (en) | | Text semantic analysis system based on deep learning |
| KR100366703B1 (en) | | Human interactive speech recognition apparatus and method thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |