CN109597983A - Spelling error correction method and device - Google Patents
- Publication number
- CN109597983A (application CN201710928606.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a spelling error correction method and device, relating to the technical field of data processing. It addresses the problem that existing spelling error correction associates and corrects only according to the positional relationship of the letters on the keyboard, which makes the correction one-sided and lowers its accuracy. The method comprises: obtaining a pinyin to be corrected; extracting, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected; calculating, according to a keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected; and outputting, according to the error correction probability, a corrected pinyin corresponding to the pinyin to be corrected. The invention is suitable for correcting pinyin spelling errors.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a spelling error correction method and device.
Background
With the continuous development of network technology, more and more people use computers in daily life to work, shop, search for information and so on, and interaction between users and smart devices is very frequent. Users normally enter information into a device by keyboard input, touch-screen handwriting input, voice input and similar means. When typing on a keyboard, however, users often hit the wrong key while spelling. For example, a user who wants to enter the Chinese word for "phone" may type "dianhia" instead of the correct pinyin "dianhua". The incorrect pinyin entered by the user then has to be corrected and associated with the intended text.
At present, error correction for misspellings mainly associates and corrects according to the positional relationship of the letters on the keyboard. This makes the correction one-sided and lowers the accuracy of spelling error correction.
Summary of the invention
In view of the above problems, the present invention provides a spelling error correction method and device, whose main purpose is to correct the pinyin entered by a user in combination with Chinese pinyin rules.
To solve the above technical problem, in a first aspect, the present invention provides a spelling error correction method, comprising:
Obtaining a pinyin to be corrected, the pinyin to be corrected being multiple characters entered by a user;
Extracting, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, wherein the preset database stores all words and the correctly spelled pinyin of each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters;
Calculating, according to a keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected;
Outputting, according to the error correction probability, a corrected pinyin corresponding to the pinyin to be corrected.
Optionally, the method further comprises:
Obtaining and extracting, from the preset database, the cumulative search count corresponding to each similar pinyin.
Optionally, calculating the error correction probability of each similar pinyin with respect to the pinyin to be corrected comprises:
Calculating, according to the keyboard edit distance set on the basis of Chinese pinyin rules and a preset coefficient, the comprehensive edit distance between each similar pinyin and the pinyin to be corrected, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
Determining the product of the cumulative search count of each similar pinyin and the reciprocal of its comprehensive edit distance as the error correction probability of that similar pinyin.
Optionally, outputting, according to the error correction probability, the pinyin corresponding to the pinyin to be corrected comprises:
Sorting the similar pinyins according to the error correction probability;
Extracting the similar pinyins whose error correction probability exceeds a preset probability threshold;
Outputting the extracted similar pinyins according to a preset rule.
Optionally, before obtaining the pinyin to be corrected, the method further comprises:
Obtaining a pinyin to be detected;
Detecting whether the preset database contains a correctly spelled pinyin corresponding to the pinyin to be detected;
If so, outputting the correctly spelled pinyin corresponding to the pinyin to be detected;
If not, determining the pinyin to be detected as the pinyin to be corrected.
In a second aspect, the present invention further provides a spelling error correction device, comprising:
An acquiring unit, configured to obtain a pinyin to be corrected, the pinyin to be corrected being multiple characters entered by a user;
An extraction unit, configured to extract, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, wherein the preset database stores all words and the correctly spelled pinyin of each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters;
A computing unit, configured to calculate, according to a keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected;
An output unit, configured to output, according to the error correction probability, a corrected pinyin corresponding to the pinyin to be corrected.
Optionally, the acquiring unit is further configured to obtain, from the preset database, the cumulative search count corresponding to each similar pinyin;
The extraction unit is further configured to extract the cumulative search count, obtained by the acquiring unit, corresponding to each similar pinyin.
Optionally, the computing unit comprises:
A computing module, configured to calculate, according to the keyboard edit distance set on the basis of Chinese pinyin rules and a preset coefficient, the comprehensive edit distance between each similar pinyin and the pinyin to be corrected, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
A determining module, configured to determine the product of the cumulative search count of each similar pinyin and the reciprocal of its comprehensive edit distance as the error correction probability of that similar pinyin.
Optionally, the output unit comprises:
A sorting module, configured to sort the similar pinyins according to the error correction probability;
An extraction module, configured to extract the similar pinyins whose error correction probability exceeds a preset probability threshold;
An output module, configured to output the extracted similar pinyins according to a preset rule.
Optionally, the device further comprises a detection unit;
The acquiring unit is further configured to obtain a pinyin to be detected;
The detection unit is configured to detect whether the preset database contains a correctly spelled pinyin corresponding to the pinyin to be detected;
The output unit is further configured to, if so, output the correctly spelled pinyin corresponding to the pinyin to be detected;
The determination unit is further configured to, if not, determine the pinyin to be detected as the pinyin to be corrected.
To achieve the above object, according to a third aspect of the present invention, a storage medium is provided. The storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the spelling error correction method described above.
To achieve the above object, according to a fourth aspect of the present invention, a processor is provided. The processor is configured to run a program, wherein the program, when running, executes the spelling error correction method described above.
With the above technical solution, the spelling error correction method and device provided by the present invention address the fact that the prior art, when correcting the pinyin entered by a user, mainly associates and corrects according to the positional relationship of the letters on the keyboard, which makes the correction one-sided. After obtaining the pinyin to be corrected, the present invention extracts all similar pinyins that differ from it by a preset number of characters, then calculates in turn, using the keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability between each extracted similar pinyin and the pinyin to be corrected, and determines the pinyin corresponding to the pinyin to be corrected according to the calculated error correction probability. Compared with the prior art, the present invention therefore combines, when correcting the pinyin entered by a user, the keyboard edit distance set according to Chinese pinyin rules with the frequency with which users search for web page content, so that incorrect pinyin can be corrected more comprehensively and accurately, which improves the accuracy of spelling error correction.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features and advantages of the present invention are easier to understand, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a spelling error correction method provided by an embodiment of the present invention;
Fig. 2 shows a flow chart of another spelling error correction method provided by an embodiment of the present invention;
Fig. 3 shows a block diagram of a spelling error correction device provided by an embodiment of the present invention;
Fig. 4 shows a block diagram of another spelling error correction device provided by an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope fully conveyed to those skilled in the art.
To improve the accuracy of spelling error correction, an embodiment of the present invention provides a spelling error correction method. As shown in Fig. 1, the method comprises:
101. Obtain the pinyin to be corrected.
The pinyin to be corrected is multiple characters entered by a user. It may be the pinyin characters of a single word, the pinyin characters of a phrase, the characters of an English word, and so on. The application scenario of the embodiment may be a browser page, a page in an application (app), and so on; the embodiment of the present invention is not specifically limited in this respect.
It should be noted that the executing subject of the embodiment may be a device configured in a web page to correct spelling. When the device detects that the user has entered an unrecognized pinyin in the web page, the pinyin needs to be corrected, so an acquisition instruction is triggered and the pinyin is then corrected.
102. Extract, from the preset database, multiple similar pinyins corresponding to the pinyin to be corrected.
The preset database stores all words and the correctly spelled pinyin of each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters. The number of similar pinyins corresponding to the pinyin to be corrected may be 3, 10, 20, and so on. A pinyin that differs from the pinyin to be corrected by a preset number of characters may be one in which one or more characters are different, or one that has one or more characters more or fewer than the pinyin to be corrected. For example, if the pinyin to be corrected entered by the user is kaihin, the corresponding similar pinyins may include {kaixin, kaibin, kaiyin, kuaijin}.
Specifically, in the embodiment of the present invention, step 102 may, based on the obtained pinyin to be corrected, directly traverse a pre-created database to query the similar pinyins corresponding to the pinyin to be corrected. Alternatively, it may first generate all pinyins that differ from the pinyin to be corrected by the preset number of characters and then compare them against the database to obtain the final set of similar pinyins.
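The second variant described above (generate candidates first, then compare against the database) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names and the sample set `known` are hypothetical, and the preset number of differing characters is assumed to be 1, so a candidate such as kuaijin, which differs by two characters, is not returned.

```python
import string


def one_edit_variants(pinyin: str) -> set[str]:
    """Return every string one substitution, insertion, or deletion away."""
    letters = string.ascii_lowercase
    splits = [(pinyin[:i], pinyin[i:]) for i in range(len(pinyin) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    replaces = {left + c + right[1:]
                for left, right in splits if right for c in letters}
    inserts = {left + c + right for left, right in splits for c in letters}
    # Exclude the input itself (substituting a letter by itself reproduces it).
    return (deletes | replaces | inserts) - {pinyin}


def similar_pinyins(pinyin: str, known: set[str]) -> set[str]:
    """Keep only the generated variants that exist in the database."""
    return one_edit_variants(pinyin) & known


# Hypothetical database contents (correctly spelled pinyins only).
known = {"kaixin", "kaibin", "kaiyin", "kuaijin", "dianhua"}
print(sorted(similar_pinyins("kaihin", known)))
```

Raising the preset number of characters to 2 would mean applying `one_edit_variants` to each first-round variant, which grows the candidate set quickly; that trade-off is one reason a direct database query may be preferable for larger distances.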
103. Calculate, according to the keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected.
The keyboard edit distance set on the basis of Chinese pinyin rules includes keyboard edit distances set according to different Chinese pinyin rules. For example, according to the rule for flat-tongue and retroflex initials in Chinese pinyin, the keyboard edit distance between c and ch, s and sh, and so on is set to 0.5; according to the rule for initials and finals in Chinese pinyin, the keyboard edit distance for bu is set to 0.5 and the edit distance for bt is set to 1; and according to the pinyin rules of regional accents, the keyboard edit distance between lu and lv is set to 0.5, the keyboard edit distance between mo and me is set to 0.5, and so on. It should be noted that, in the embodiment of the present invention, all pinyins subject to a special Chinese pinyin rule and their keyboard edit distances may be stored in the database in advance, while the keyboard edit distances of the remaining pinyins, which are not subject to any special rule, may be calculated and stored using prior-art methods. Together these yield the keyboard edit distance set on the basis of Chinese pinyin rules referred to in this step. In this step, therefore, the pinyin to be corrected may first be traversed and compared with the keyboard edit distances stored in the database, to obtain the keyboard edit distance of each similar pinyin.
Further, the error correction probability indicates how likely each similar pinyin in the pinyin set is to be the pinyin intended by the incorrect pinyin that the user entered. The larger the calculated error correction probability of a similar pinyin with respect to the pinyin to be corrected, the more likely that pinyin is to be associated with the pinyin to be corrected entered by the user. Conversely, the smaller the calculated error correction probability of a similar pinyin, the less likely that pinyin is to be associated with the pinyin to be corrected entered by the user.
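The stored rule distances described in step 103 can be pictured as a small lookup table with a fallback. The sketch below is only an illustration under stated assumptions: the table `RULE_DISTANCE`, the function name, and the fallback value of 1.0 are hypothetical (the text leaves the fallback to prior-art calculation), while the listed pairs and the value 0.5 come from the examples in this description.

```python
# Hypothetical rule table: unordered pairs that the Chinese pinyin rules
# treat as close, each with the keyboard edit distance 0.5 from the text.
RULE_DISTANCE = {
    frozenset(("c", "ch")): 0.5,    # flat-tongue vs. retroflex initials
    frozenset(("s", "sh")): 0.5,
    frozenset(("lu", "lv")): 0.5,   # regional accent rule
    frozenset(("mo", "me")): 0.5,
    frozenset(("an", "ang")): 0.5,  # pair used in the wanba/wangba example
}

# Assumption: pairs without a special rule fall back to a plain distance of 1.
DEFAULT_DISTANCE = 1.0


def keyboard_edit_distance(a: str, b: str) -> float:
    """Look up the rule-based distance between two pinyin fragments."""
    if a == b:
        return 0.0
    return RULE_DISTANCE.get(frozenset((a, b)), DEFAULT_DISTANCE)


print(keyboard_edit_distance("sh", "s"))    # rule hit
print(keyboard_edit_distance("an", "ang"))  # rule hit
print(keyboard_edit_distance("b", "t"))     # fallback
```

Using `frozenset` keys makes the lookup symmetric, so the distance from s to sh equals the distance from sh to s without storing both orders.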
104. Output, according to the error correction probability, the corrected pinyin corresponding to the pinyin to be corrected.
The corrected pinyin is the correct pinyin offered to the user for selection. The corrected pinyin finally output in this step may be a single pinyin, or multiple pinyins arranged in descending order of error correction probability so that the user can choose among them. As described in step 103, the error correction probability indicates how likely each similar pinyin is to be associated with the pinyin to be corrected. Therefore, once the error correction probability of every similar pinyin has been calculated, the processing result for the pinyin to be corrected can be output according to the obtained error correction probabilities.
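The output step can be sketched as a sort followed by a threshold filter, mirroring the optional sorting, extraction and output described in the summary. The function name, the threshold value and the sample scores below are illustrative assumptions, not values prescribed by the description.

```python
def select_corrections(probabilities: dict[str, float],
                       threshold: float) -> list[str]:
    """Rank candidates by error correction probability, highest first,
    and keep only those strictly above the preset threshold."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [pinyin for pinyin, prob in ranked if prob > threshold]


# Illustrative scores and threshold (assumed values).
scores = {"wanha": 5.0, "wangba": 400.0, "wanna": 1.0}
print(select_corrections(scores, threshold=2.0))
```

Returning a list keeps the descending order, so a caller can show either the first element alone or the whole list for the user to choose from.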
The spelling error correction method provided by the embodiment of the present invention addresses the fact that the prior art, when correcting the pinyin entered by a user, mainly associates and corrects according to the positional relationship of the letters on the keyboard, which makes the correction one-sided. After obtaining the pinyin to be corrected, the present invention extracts all similar pinyins that differ from it by a preset number of characters, then calculates in turn, using the keyboard edit distance set on the basis of Chinese pinyin rules, the error correction probability between each extracted similar pinyin and the pinyin to be corrected, and determines the pinyin corresponding to the pinyin to be corrected according to the calculated error correction probability. Compared with the prior art, the present invention therefore combines, when correcting the pinyin entered by a user, the keyboard edit distance set according to Chinese pinyin rules with the frequency with which users search for web page content, so that incorrect pinyin can be corrected more comprehensively and accurately, which improves the accuracy of spelling error correction.
Further, as a refinement and extension of the embodiment shown in Fig. 1, an embodiment of the present invention provides another pinyin error correction method, as shown in Fig. 2.
201. Obtain the pinyin to be corrected.
The pinyin to be corrected is multiple characters entered by a user. For the concepts of the pinyin to be corrected and of the characters, refer to the corresponding description in step 101, which is not repeated here.
To avoid wasting resources, in the embodiment of the present invention the method may further comprise, before step 201: obtaining a pinyin to be detected; detecting whether the preset database contains a correctly spelled pinyin corresponding to the pinyin to be detected; if so, outputting the correctly spelled pinyin corresponding to the pinyin to be detected; and if not, determining the pinyin to be detected as the pinyin to be corrected. The preset database stores all words and the correctly spelled pinyin of each word, together with cumulative search count identification information for each correctly spelled pinyin, which identifies the number of times users have searched for that correctly spelled pinyin.
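The pre-check before step 201 amounts to a membership test against the database. A minimal sketch follows, assuming the database can be modeled as a mapping from correctly spelled pinyin to cumulative search count; the function name, the result labels and the sample entries are hypothetical.

```python
def route_input(pinyin: str, database: dict[str, int]) -> tuple[str, str]:
    """Return ('correct', pinyin) when the database already contains the
    pinyin, otherwise ('to_correct', pinyin) so that correction is run."""
    if pinyin in database:
        return ("correct", pinyin)
    return ("to_correct", pinyin)


# Hypothetical database: correctly spelled pinyin -> cumulative search count.
database = {"dianhua": 120, "kaixin": 45}
print(route_input("dianhua", database))  # already correct, output directly
print(route_input("dianhia", database))  # unknown, becomes pinyin to be corrected
```

Only inputs that miss the database reach the more expensive candidate extraction and scoring steps, which is the resource saving the paragraph above refers to.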
It should be noted that, in the embodiment of the present invention, each web page may be configured with its own preset database. For each web page, the correctly spelled pinyin data for all content contained in the page is stored in the database, and each time a user searches for content in the page, the search is recorded and the resulting statistic is saved against the pinyin of the corresponding page content. Databases may also be configured per category of web pages. Counting the number of searches for content allows the user search count to be used as a reference factor when correcting pinyin.
202. Extract, from the preset database, multiple similar pinyins corresponding to the pinyin to be corrected.
A similar pinyin corresponding to the pinyin to be corrected is a pinyin that differs from the pinyin to be corrected by a preset number of characters. For the concept of a pinyin that differs from the pinyin to be corrected by a preset number of characters, refer to the corresponding description in step 102, which is not repeated here.
Specifically, step 202 may first generate a corresponding lookup function from the obtained pinyin to be corrected and then search the preset database with that lookup function, but is not limited to this. For example, when the obtained pinyin to be corrected is "niulai", the letters of the pinyin may be replaced one at a time to generate the corresponding functions, such as lookup functions generated from "niula_" and "niu_ai". Letters may also be added to or removed from the pinyin. The functions are then used to search the database, yielding all similar pinyins that differ from the pinyin to be corrected by the preset number of characters.
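One way to read the "niula_" and "niu_ai" lookup functions is as single-character wildcard patterns, similar to `_` in an SQL `LIKE` clause. The sketch below makes that assumption explicit and matches the patterns with Python regular expressions; the function names and the sample set `known` are hypothetical, and only the substitution case is covered (adding or removing letters would need extra patterns).

```python
import re


def wildcard_patterns(pinyin: str) -> list[str]:
    """Replace each position in turn with '_' (one unknown letter)."""
    return [pinyin[:i] + "_" + pinyin[i + 1:] for i in range(len(pinyin))]


def lookup(pattern: str, known: set[str]) -> set[str]:
    """Return database entries matching the pattern, where '_' stands for
    exactly one lowercase letter."""
    regex = re.compile(
        "".join("[a-z]" if ch == "_" else re.escape(ch) for ch in pattern)
    )
    return {p for p in known if regex.fullmatch(p)}


# Hypothetical database entries.
known = {"niunai", "niulan", "pinyin"}
for pattern in wildcard_patterns("niulai"):
    matches = lookup(pattern, known)
    if matches:
        print(pattern, sorted(matches))
```

With a relational database the same patterns could be passed to a `LIKE` query instead of being matched in application code.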
In the embodiment of the present invention, this step may be followed by: obtaining and extracting, from the preset database, the cumulative search count corresponding to each similar pinyin. After step 202, all pinyins that may be the corrected pinyin for the pinyin to be corrected have been obtained. Extracting the cumulative search count of each similar pinyin found at this point makes it easier to process each similar pinyin further, and avoids the tedious and time-consuming work of separately looking up and extracting each pinyin that differs from the pinyin to be corrected by the preset number of characters, which improves the efficiency of pinyin error correction.
Specifically, each similar pinyin and its cumulative search count may be written into a data table. Alternatively, all extracted similar pinyins and their cumulative search counts may be saved directly, with the similar pinyins separated by a preset separator. The embodiment of the present invention is not specifically limited in this respect.
For example, if the obtained pinyin to be corrected is wanba, writing each similar pinyin and its cumulative search count into a data table gives the similar pinyin data table shown in Table 1:
Table 1
Pinyin | Cumulative search count |
wanha | 5 |
wanna | 1 |
rana | 2 |
wanga | 1 |
wangba | 200 |
According to the pinyin data table shown in Table 1, the similar pinyins corresponding to the pinyin to be corrected comprise 5 pinyins, among which the cumulative user search count for wangba is 200, the cumulative user search count for wanga is 1, and so on. Saving together all pinyins that differ from the pinyin to be corrected by the preset number of characters means that they can be extracted directly from the data table when the pinyin data is processed further. This avoids the extraction errors and tedium of extracting them one by one from a large amount of unordered data, and so improves the accuracy and convenience of pinyin error correction.
In the embodiment of the present invention, after this step the pinyin to be corrected and the data table created for it may also be saved. When an entered pinyin to be corrected is found to have been seen before, the previously saved data table for that pinyin can then be obtained directly, without extracting the pinyins and creating the set again. This saves time and avoids wasting resources, improving the efficiency of pinyin error correction.
203, it according to the keypad editor distance and predetermined coefficient based on Chinese phonetic alphabet rule settings, calculates described each
Similar pinyin is with described to the corresponding clean up editing distance of error correction phonetic.
Wherein, the predetermined coefficient is to be carried out according to the similar pinyin with the character quantity to differ between error correction phonetic
Setting, that is to say, that when similar pinyin and when increasing/deleting a character between error correction phonetic, coefficient is determined as 1, when
Coefficient is exactly N when increasing/delete N number of character, likewise, when like phonetic, with when error correction phonetic differs a character, coefficient is
1, when differing N number of character, coefficient is N etc..
Specifically, the step 203 can be by the keypad editor distance based on Chinese phonetic alphabet rule settings and default system
Number is multiplied, by obtained product be determined as each similar pinyin with to the corresponding clean up editing distance of error correction phonetic.Wherein, described
The concept explanation of keypad editor distance based on Pinyin rule setting can be referred in the step 102 and accordingly be described, herein not
It repeats again.
Continuing the example of step 202, when the pinyin to be corrected is wanba, take the similar pinyin wangba as an example of calculating the final edit distance. Because wangba has one more character than wanba, the coefficient set according to the preset rule is 1. The second keyboard edit distance between ang and an, set in advance based on Chinese pinyin rules, is 0.5. The final edit distance between wangba and wanba is therefore 1 × 0.5 = 0.5. The final edit distances between the other similar pinyins and the pinyin to be corrected can be calculated in the same way: between wanha and wanba it is 1 × 1 = 1, between wanna and wanba it is 1 × 1 = 1, between rana and wanba it is 1 × 2 = 2, and between wanga and wanba it is 1 × 1 = 1.
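The multiplication in step 203 can be sketched as follows. This is a minimal illustration only: the lookup table `KEYBOARD_EDIT_DISTANCE` and the function name `final_edit_distance` are hypothetical, and the table holds just the values quoted in the wanba example rather than a real rule set.

```python
# Hypothetical table of keyboard edit distances set from pinyin rules.
# Only the values quoted in the wanba example are filled in.
KEYBOARD_EDIT_DISTANCE = {
    ("wanba", "wangba"): 0.5,  # "an" vs. "ang"
    ("wanba", "wanha"): 1.0,
    ("wanba", "wanna"): 1.0,
    ("wanba", "rana"): 2.0,
    ("wanba", "wanga"): 1.0,
}


def final_edit_distance(target, candidate, coefficient=1):
    """Step 203: preset coefficient multiplied by keyboard edit distance."""
    return coefficient * KEYBOARD_EDIT_DISTANCE[(target, candidate)]


distances = {
    candidate: final_edit_distance(target, candidate)
    for (target, candidate) in KEYBOARD_EDIT_DISTANCE
}
```

With the coefficient left at 1, as in the example, `distances` reproduces the five products listed for wanba.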
204. Determine the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
The error correction probability indicates how likely it is that the incorrect pinyin input by the user is associated with each similar pinyin. Specifically, step 204 is calculated as: cumulative search count × (1 / final edit distance). Continuing from step 203, the error correction probability of each similar pinyin can be calculated in turn from its cumulative search count: for wangba it is 200 × (1/0.5) = 400, for wanha it is 5 × (1/1) = 5, for wanna it is 1 × (1/1) = 1, for rana it is 2 × (1/2) = 1, and for wanga it is 1 × (1/1) = 1.
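The formula of step 204 can be checked with a short sketch. The function name `correction_probability` and the dictionary layout are assumptions made for illustration; the search counts and distances are the ones given in the example.

```python
def correction_probability(search_count, final_distance):
    """Step 204: cumulative search count x (1 / final edit distance)."""
    if final_distance <= 0:
        raise ValueError("final edit distance must be positive")
    return search_count * (1.0 / final_distance)


# (cumulative search count, final edit distance) taken from the example.
candidates = {
    "wangba": (200, 0.5),
    "wanha": (5, 1.0),
    "wanna": (1, 1.0),
    "rana": (2, 2.0),
    "wanga": (1, 1.0),
}

scores = {
    pinyin: correction_probability(count, distance)
    for pinyin, (count, distance) in candidates.items()
}
```

Note that the resulting values are unnormalized scores (400, 5, 1, 1, 1) rather than probabilities in the range 0 to 1; the text does not describe a normalization step, so none is applied here.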
205. Output the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability.
For an explanation of the corrected pinyin, refer to the corresponding description in step 103, which is not repeated here.
Specifically, step 205 may include: sorting the similar pinyins according to the error correction probability; extracting the similar pinyins whose error correction probability exceeds a preset probability threshold; and outputting the extracted similar pinyins according to a preset rule. The preset probability threshold may be 90%, 85%, 70% or the like, which the embodiment of the present invention does not specifically limit. For example, suppose the similar pinyins corresponding to the pinyin to be corrected include five pinyins {pinyin 1, pinyin 2, pinyin 3, pinyin 4, pinyin 5}, and their calculated error correction probabilities with respect to the pinyin to be corrected are 98%, 23%, 3%, 67% and 85% respectively. Sorting the pinyins from high to low by error correction probability gives {pinyin 1, pinyin 5, pinyin 4, pinyin 2, pinyin 3}. If the preset probability threshold is 80%, the pinyins whose error correction probability exceeds the threshold are pinyin 1 and pinyin 5, which are extracted and output in descending order of error correction probability: pinyin 1, pinyin 5.
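The sort-then-filter behaviour of step 205 can be sketched as below, using the five probabilities from the example. The function name `select_corrections` and the keys `pinyin1` to `pinyin5` are placeholders, and the "preset rule" for output is assumed here to be simple descending order.

```python
def select_corrections(probabilities, threshold):
    """Sort candidates from high to low and keep those above the threshold."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [pinyin for pinyin, probability in ranked if probability > threshold]


probs = {
    "pinyin1": 0.98,
    "pinyin2": 0.23,
    "pinyin3": 0.03,
    "pinyin4": 0.67,
    "pinyin5": 0.85,
}

selected = select_corrections(probs, threshold=0.80)
```

Here `selected` contains pinyin1 followed by pinyin5, matching the output described in the example.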
It should be noted that the embodiment of the present invention may set a quantity threshold. When the cumulative search count corresponding to a pinyin among the similar pinyins exceeds this threshold, the order of the similar pinyins may be adjusted accordingly, for example by placing the pinyins that exceed the threshold first in the output. In the embodiment of the present invention, qualifying pinyins are obtained by setting a preset probability threshold and screening against it, and all of the resulting pinyins are output for the user to choose from. This avoids the one-sided correction that results when only a single corrected pinyin is provided, thereby making pinyin error correction more comprehensive and improving the user experience.
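One possible reading of the quantity threshold is sketched below: candidates whose cumulative search count exceeds the threshold are moved to the front, and the existing relative order is otherwise preserved. The function name `promote_frequent` and the sample counts are hypothetical; the text leaves the exact adjustment open.

```python
def promote_frequent(ranked, search_counts, count_threshold):
    """Move candidates whose search count exceeds the threshold to the front.

    The relative order inside each group is kept as it was in `ranked`.
    """
    frequent = [p for p in ranked if search_counts.get(p, 0) > count_threshold]
    others = [p for p in ranked if search_counts.get(p, 0) <= count_threshold]
    return frequent + others


ranked = ["pinyin1", "pinyin5", "pinyin4"]
counts = {"pinyin1": 12, "pinyin5": 500, "pinyin4": 3}  # illustrative values
adjusted = promote_frequent(ranked, counts, count_threshold=100)
```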
Further, as an implementation of the method shown in Fig. 1, an embodiment of the present invention also provides a spelling error correction device for implementing the method shown in Fig. 1. This device embodiment corresponds to the foregoing method embodiment. For ease of reading, the details of the method embodiment are not repeated one by one here, but it should be understood that the device in this embodiment can implement everything in the foregoing method embodiment. As shown in Fig. 3, the device includes an acquiring unit 31, an extraction unit 32, a computing unit 33 and an output unit 34, wherein:
The acquiring unit 31 may be configured to obtain a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user.
The extraction unit 32 may be configured to extract, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected obtained by the acquiring unit 31. The preset database stores all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters.
The computing unit 33 may be configured to calculate, according to the keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin extracted by the extraction unit 32 with respect to the pinyin to be corrected.
The output unit 34 may be configured to output the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability calculated by the computing unit 33.
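The extraction unit's notion of a similar pinyin, one that differs from the pinyin to be corrected by a preset number of characters, can be illustrated with a sketch. It assumes, purely for illustration, that the difference is counted with plain Levenshtein distance and that the preset database can be scanned as a list of pinyin strings; the text does not specify either. The names `char_difference` and `extract_similar` are hypothetical.

```python
def char_difference(a, b):
    """Plain Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def extract_similar(target, database_pinyins, max_diff=1):
    """Return pinyins that differ from target by 1..max_diff characters."""
    return [
        p for p in database_pinyins
        if 0 < char_difference(target, p) <= max_diff
    ]


stored = ["wangba", "wanha", "rana", "wanba", "beijing"]  # illustrative entries
similar = extract_similar("wanba", stored, max_diff=2)
```

A linear scan is used only to keep the sketch short; a real database would more likely rely on an index.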
Further, as an implementation of the method shown in Fig. 2, an embodiment of the present invention also provides another spelling error correction device for implementing the method shown in Fig. 2. This device embodiment corresponds to the foregoing method embodiment. For ease of reading, the details of the method embodiment are not repeated one by one here, but it should be understood that the device in this embodiment can implement everything in the foregoing method embodiment. As shown in Fig. 4, the device includes an acquiring unit 41, an extraction unit 42, a computing unit 43 and an output unit 44, wherein:
The acquiring unit 41 may be configured to obtain a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user.
The extraction unit 42 may be configured to extract, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected obtained by the acquiring unit 41. The preset database stores all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters.
The computing unit 43 may be configured to calculate, according to the keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin extracted by the extraction unit 42 with respect to the pinyin to be corrected.
The output unit 44 may be configured to output the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability calculated by the computing unit 43.
Further, the device also includes a detection unit 45 and a determination unit 46.
The acquiring unit 41 may also be configured to obtain a pinyin to be detected.
The detection unit 45 may be configured to detect whether a correctly spelled pinyin corresponding to the pinyin to be detected exists in the preset database.
The output unit 44 may also be configured to output the correctly spelled pinyin corresponding to the pinyin to be detected if it exists.
The determination unit 46 may be configured to determine the pinyin to be detected as the pinyin to be corrected if it does not exist.
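The check performed by the detection unit and the determination unit can be sketched as a membership test. The function name `detect`, the tuple return value and the use of a Python set to stand in for the preset database are all assumptions made for illustration.

```python
def detect(pinyin, correct_pinyins):
    """Return ("correct", pinyin) if the pinyin is already stored as a
    correctly spelled pinyin, otherwise ("to_correct", pinyin)."""
    if pinyin in correct_pinyins:
        return ("correct", pinyin)
    return ("to_correct", pinyin)


# Illustrative stand-in for the correctly spelled pinyins in the preset database.
stored_pinyins = {"wangba", "wanha", "wanna"}

hit = detect("wangba", stored_pinyins)
miss = detect("wanba", stored_pinyins)
```

Only inputs that come back as `to_correct` would be passed on to the extraction and calculation steps, which is how the device avoids correcting input that is already correct.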
Further,
The computing unit 43 may specifically be configured to calculate the final edit distance between each similar pinyin and the pinyin to be corrected according to the keyboard edit distance set based on Chinese pinyin rules and the preset coefficient.
The determination unit 46 may also be configured to determine the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
Further, the output unit 44 includes:
A sorting module 4401, which may be configured to sort the similar pinyins according to the error correction probability.
An extraction module 4402, which may be configured to extract the similar pinyins whose error correction probability exceeds a preset probability threshold.
An output module 4403, which may be configured to output the extracted similar pinyins according to a preset rule.
The other spelling error correction device provided in an embodiment of the present invention includes an acquiring unit, an extraction unit, a computing unit and an output unit. In the prior art, when the pinyin input by a user is corrected, association and correction are mainly based on the positional relationship of the letters on the keyboard, which makes the correction one-sided. In the present invention, after the pinyin to be corrected is obtained, all similar pinyins that differ from it by a preset number of characters are extracted; then, in combination with the keyboard edit distance set based on Chinese pinyin rules, the error correction probability between each extracted similar pinyin and the pinyin to be corrected is calculated in turn, and the corrected pinyin corresponding to the pinyin to be corrected is determined according to the calculated error correction probabilities. Compared with the prior art, when correcting the pinyin input by a user, the present invention therefore combines the keyboard edit distance set based on Chinese pinyin rules with how frequently users search for web content, so that the incorrect pinyin input by the user can be corrected more comprehensively and accurately, which improves the accuracy of spelling error correction. Meanwhile, after the pinyin input by the user is obtained, the device first detects whether content corresponding to the pinyin exists in the preset database, which ensures that error correction is performed only when no content corresponding to the input pinyin exists in the database. This avoids the waste of resources caused by correcting input that is already correct, thereby saving resources.
The text processing apparatus includes a processor and a memory. The acquiring unit 31, the extraction unit 32, the computing unit 33, the output unit 34 and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the accuracy of spelling error correction is improved by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, in forms such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored, and the program implements the spelling error correction method when executed by a processor.
An embodiment of the present invention provides a processor configured to run a program, wherein the program executes the spelling error correction method when it runs.
An embodiment of the present invention provides a device including a processor, a memory, and a program stored in the memory and executable on the processor. When executing the program, the processor implements the following steps: obtaining a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user; extracting, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, the preset database storing all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin being a pinyin that differs from the pinyin to be corrected by a preset number of characters; calculating, according to the keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected; and outputting the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability.
Further, the method also includes:
Obtaining and extracting, from the preset database, the cumulative search count corresponding to each similar pinyin.
Further, calculating the error correction probability of each similar pinyin with respect to the pinyin to be corrected includes:
Calculating the final edit distance between each similar pinyin and the pinyin to be corrected according to the keyboard edit distance set based on Chinese pinyin rules and a preset coefficient, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
Determining the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
Further, outputting the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability includes:
Sorting the similar pinyins according to the error correction probability;
Extracting the similar pinyins whose error correction probability exceeds a preset probability threshold;
Outputting the extracted similar pinyins according to a preset rule.
Further, before obtaining the pinyin to be corrected, the method also includes:
Obtaining a pinyin to be detected;
Detecting whether a correctly spelled pinyin corresponding to the pinyin to be detected exists in the preset database;
If it exists, outputting the correctly spelled pinyin corresponding to the pinyin to be detected;
If it does not exist, determining the pinyin to be detected as the pinyin to be corrected.
The device in the embodiment of the present invention may be a server, a PC, a PAD, a mobile phone or the like.
An embodiment of the present invention also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: obtaining a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user; extracting, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, the preset database storing all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin being a pinyin that differs from the pinyin to be corrected by a preset number of characters; calculating, according to the keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected; and outputting the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability.
Further, the method also includes:
Obtaining and extracting, from the preset database, the cumulative search count corresponding to each similar pinyin.
Further, calculating the error correction probability of each similar pinyin with respect to the pinyin to be corrected includes:
Calculating the final edit distance between each similar pinyin and the pinyin to be corrected according to the keyboard edit distance set based on Chinese pinyin rules and a preset coefficient, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
Determining the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
Further, outputting the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability includes:
Sorting the similar pinyins according to the error correction probability;
Extracting the similar pinyins whose error correction probability exceeds a preset probability threshold;
Outputting the extracted similar pinyins according to a preset rule.
Further, before obtaining the pinyin to be corrected, the method also includes:
Obtaining a pinyin to be detected;
Detecting whether a correctly spelled pinyin corresponding to the pinyin to be detected exists in the preset database;
If it exists, outputting the correctly spelled pinyin corresponding to the pinyin to be detected;
If it does not exist, determining the pinyin to be detected as the pinyin to be corrected.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
The memory may include volatile memory in a computer-readable medium, in forms such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity or device that includes the element.
The above are only embodiments of the present application and are not intended to limit it. Various modifications and variations of the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (10)
1. A spelling error correction method, characterized in that the method comprises:
Obtaining a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user;
Extracting, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, wherein the preset database stores all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters;
Calculating, according to a keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected;
Outputting a corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability.
2. The method according to claim 1, characterized in that the method further comprises:
Obtaining and extracting, from the preset database, the cumulative search count corresponding to each similar pinyin.
3. The method according to claim 2, characterized in that calculating the error correction probability of each similar pinyin with respect to the pinyin to be corrected comprises:
Calculating the final edit distance between each similar pinyin and the pinyin to be corrected according to the keyboard edit distance set based on Chinese pinyin rules and a preset coefficient, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
Determining the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
4. The method according to claim 1, characterized in that outputting the corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability comprises:
Sorting the similar pinyins according to the error correction probability;
Extracting the similar pinyins whose error correction probability exceeds a preset probability threshold;
Outputting the extracted similar pinyins according to a preset rule.
5. The method according to claim 1, characterized in that, before obtaining the pinyin to be corrected, the method further comprises:
Obtaining a pinyin to be detected;
Detecting whether a correctly spelled pinyin corresponding to the pinyin to be detected exists in the preset database;
If it exists, outputting the correctly spelled pinyin corresponding to the pinyin to be detected;
If it does not exist, determining the pinyin to be detected as the pinyin to be corrected.
6. A spelling error correction device, characterized in that the device comprises:
An acquiring unit, configured to obtain a pinyin to be corrected, the pinyin to be corrected being multiple characters input by a user;
An extraction unit, configured to extract, from a preset database, multiple similar pinyins corresponding to the pinyin to be corrected, wherein the preset database stores all words and the correctly spelled pinyin corresponding to each word, and a similar pinyin is a pinyin that differs from the pinyin to be corrected by a preset number of characters;
A computing unit, configured to calculate, according to a keyboard edit distance set based on Chinese pinyin rules, the error correction probability of each similar pinyin with respect to the pinyin to be corrected;
An output unit, configured to output a corrected pinyin corresponding to the pinyin to be corrected according to the error correction probability.
7. The device according to claim 6, characterized in that:
The acquiring unit is further configured to obtain, from the preset database, the cumulative search count corresponding to each similar pinyin;
The extraction unit is further configured to extract the cumulative search count corresponding to each similar pinyin obtained by the acquiring unit.
8. The device according to claim 7, characterized in that the computing unit comprises:
A computing module, configured to calculate the final edit distance between each similar pinyin and the pinyin to be corrected according to the keyboard edit distance set based on Chinese pinyin rules and a preset coefficient, the preset coefficient being set according to the number of characters by which the similar pinyin differs from the pinyin to be corrected;
A determining module, configured to determine the product of the cumulative search count corresponding to each similar pinyin and the reciprocal of the final edit distance as the error correction probability of that similar pinyin.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the spelling error correction method according to any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when it runs, executes the spelling error correction method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710928606.5A CN109597983B (en) | 2017-09-30 | 2017-09-30 | Spelling error correction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109597983A true CN109597983A (en) | 2019-04-09 |
CN109597983B CN109597983B (en) | 2022-11-04 |
Family
ID=65956394
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710928606.5A Active CN109597983B (en) | 2017-09-30 | 2017-09-30 | Spelling error correction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109597983B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111028834A (en) * | 2019-10-30 | 2020-04-17 | 支付宝(杭州)信息技术有限公司 | Voice message reminding method and device, server and voice message reminding equipment |
CN111694985A (en) * | 2020-06-17 | 2020-09-22 | 北京字节跳动网络技术有限公司 | Search method, search device, electronic equipment and computer-readable storage medium |
CN111739514A (en) * | 2019-07-31 | 2020-10-02 | 北京京东尚科信息技术有限公司 | Voice recognition method, device, equipment and medium |
CN112560452A (en) * | 2021-02-25 | 2021-03-26 | 智者四海(北京)技术有限公司 | Method and system for automatically generating error correction corpus |
CN112765231A (en) * | 2021-01-04 | 2021-05-07 | 珠海格力电器股份有限公司 | Data processing method and device and computer readable storage medium |
CN115437511A (en) * | 2022-11-07 | 2022-12-06 | 北京澜舟科技有限公司 | Pinyin Chinese character conversion method, conversion model training method and storage medium |
CN115905297A (en) * | 2023-01-04 | 2023-04-04 | 脉策(上海)智能科技有限公司 | Method, apparatus and medium for retrieving data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102831177A (en) * | 2012-07-31 | 2012-12-19 | 聚熵信息技术(上海)有限公司 | Statement error correction method and system |
US20140298168A1 (en) * | 2013-03-28 | 2014-10-02 | Est Soft Corp. | System and method for spelling correction of misspelled keyword |
CN106202153A (en) * | 2016-06-21 | 2016-12-07 | 广州智索信息科技有限公司 | The spelling error correction method of a kind of ES search engine and system |
CN106847288A (en) * | 2017-02-17 | 2017-06-13 | 上海创米科技有限公司 | The error correction method and device of speech recognition text |
Non-Patent Citations (1)
Title |
---|
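Zheng Wenxi et al.: "Algorithm Design and System Implementation of Automatic Spelling Proofreading", Science Technology and Industry (《科技和产业》) * |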
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111739514A (en) * | 2019-07-31 | 2020-10-02 | 北京京东尚科信息技术有限公司 | Voice recognition method, device, equipment and medium |
CN111739514B (en) * | 2019-07-31 | 2023-11-14 | 北京京东尚科信息技术有限公司 | Voice recognition method, device, equipment and medium |
CN111028834A (en) * | 2019-10-30 | 2020-04-17 | 支付宝(杭州)信息技术有限公司 | Voice message reminding method and device, server and voice message reminding equipment |
CN111028834B (en) * | 2019-10-30 | 2023-01-20 | 蚂蚁财富(上海)金融信息服务有限公司 | Voice message reminding method and device, server and voice message reminding equipment |
CN111694985A (en) * | 2020-06-17 | 2020-09-22 | 北京字节跳动网络技术有限公司 | Search method, search device, electronic equipment and computer-readable storage medium |
CN111694985B (en) * | 2020-06-17 | 2022-03-01 | 北京字节跳动网络技术有限公司 | Search method, search device, electronic equipment and computer-readable storage medium |
CN112765231A (en) * | 2021-01-04 | 2021-05-07 | 珠海格力电器股份有限公司 | Data processing method and device and computer readable storage medium |
CN112560452A (en) * | 2021-02-25 | 2021-03-26 | 智者四海(北京)技术有限公司 | Method and system for automatically generating error correction corpus |
CN115437511A (en) * | 2022-11-07 | 2022-12-06 | 北京澜舟科技有限公司 | Pinyin Chinese character conversion method, conversion model training method and storage medium |
CN115905297A (en) * | 2023-01-04 | 2023-04-04 | 脉策(上海)智能科技有限公司 | Method, apparatus and medium for retrieving data |
CN115905297B (en) * | 2023-01-04 | 2023-12-15 | 脉策(上海)智能科技有限公司 | Method, apparatus and medium for retrieving data |
Also Published As
Publication number | Publication date |
---|---|
CN109597983B (en) | 2022-11-04 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109597983A (en) | A kind of spelling error correction method and device | |
US11860684B2 (en) | Few-shot named-entity recognition | |
US11016997B1 (en) | Generating query results based on domain-specific dynamic word embeddings | |
CN105989040B (en) | Intelligent question and answer method, device and system | |
CN107704503A (en) | User's keyword extracting device, method and computer-readable recording medium | |
CN105975459B (en) | A kind of the weight mask method and device of lexical item | |
CN110019668A (en) | A kind of text searching method and device | |
US10831993B2 (en) | Method and apparatus for constructing binary feature dictionary | |
CN106610931B (en) | Topic name extraction method and device | |
TWI710917B (en) | Data processing method and device | |
CN106970912A (en) | Chinese sentence similarity calculating method, computing device and computer-readable storage medium | |
CN109344406A (en) | Part-of-speech tagging method, apparatus and electronic equipment | |
US20170185653A1 (en) | Predicting Knowledge Types In A Search Query Using Word Co-Occurrence And Semi/Unstructured Free Text | |
US8290925B1 (en) | Locating product references in content pages | |
US20210109994A1 (en) | Natural language processing using joint sentiment-topic modeling | |
CN106598997B (en) | Method and device for calculating text theme attribution degree | |
CN110019784B (en) | Text classification method and device | |
CN109597982B (en) | Abstract text recognition method and device | |
CN110717008B (en) | Search result ordering method and related device based on semantic recognition | |
CN111597336A (en) | Processing method and device of training text, electronic equipment and readable storage medium | |
CN115563268A (en) | Text abstract generation method and device, electronic equipment and storage medium | |
CN105095826B (en) | A kind of character recognition method and device | |
CN111475641B (en) | Data extraction method and device, storage medium and equipment | |
CN107861950A (en) | The detection method and device of abnormal text | |
CN110019665A (en) | Text searching method and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing. Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. Address before: 100086 8th Floor (A), Cuigong Hotel, No. 76 Zhichun Road, Shuangyushu Area, Haidian District, Beijing. Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | |