CN112989148A - Error correction word ordering method and device, terminal equipment and storage medium - Google Patents
- Publication number
- CN112989148A (application number CN201911279538.XA)
- Authority
- CN
- China
- Prior art keywords
- error correction
- error
- word
- words
- type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9038—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0237—Character input methods using prediction or retrieval techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Document Processing Apparatus (AREA)
Abstract
The embodiments of the present application are applicable to the technical field of input methods and provide a method and an apparatus for sorting error correction words, a terminal device, and a storage medium. The method includes: when a character string input by a user is received, acquiring a plurality of error correction words matching the character string, and determining the error correction type to which each error correction word belongs and the word order weight corresponding to that error correction type; determining a weight value of each error correction word according to the word order weight of its error correction type; and sorting the plurality of error correction words according to their weight values. Because the weight value of each error correction word is re-determined using the word order weight of its error correction type, error correction words corresponding to frequently occurring input errors can be displayed to the user preferentially, which improves the error correction efficiency of the input method. The method can be applied in fields such as artificial intelligence and natural language processing to improve input efficiency.
Description
Technical Field
The application belongs to the technical field of input methods, and particularly relates to a method and a device for sorting error-correcting words, terminal equipment and a storage medium.
Background
A pinyin input method is an input method suited to Chinese: after the user enters a string of pinyin letters, the system converts the input pinyin into a string of Chinese characters. Because it requires no special memorization and matches the way people think, anyone who knows pinyin can use it, and the pinyin input method has become the most widely used input mode at present.
With the popularization of mobile devices, users who use a pinyin input method on a mobile phone or tablet computer with a small screen area are prone to mistyping pinyin. For example, when a user wants to input "you", the letters "i" and "o" are adjacent on the keyboard, so it is easy to press the wrong key and input "yiu" instead. As a result, the Chinese characters offered are not the ones the user wants, and the user has to spend considerable effort re-entering or modifying the pinyin to obtain the desired word.
Currently, various input method vendors provide an automatic error correction function for the pinyin input method. When the user enters incorrect pinyin, the input method attempts to correct it automatically and then offers the Chinese words corresponding to the corrected pinyin. However, different error correction methods may yield several different correction results for the same incorrect pinyin. When there are too many error correction results, the user still has to go through tedious operations to find the word actually wanted.
Disclosure of Invention
The embodiment of the application provides a method and a device for sorting error-correcting words, terminal equipment and a storage medium, which can improve the error-correcting rate of a pinyin input method and enable error-correcting words finally presented to a user to be more matched with the actual requirements of the user.
In a first aspect, an embodiment of the present application provides a method for sorting error-correcting words, including:
when a character string input by a user is received, acquiring a plurality of error correction words matching the character string, and determining the error correction type to which each error correction word belongs and the word order weight corresponding to the error correction type;
determining the weight value of each error correction word according to the word order weight of the error correction type;
and sorting the plurality of error correction words according to the weight value of each error correction word.
For example, the weight value of each error correction word may be determined according to the initial weight value of each error correction word and the word order weight of the corresponding error correction type.
Illustratively, when the error correction words are sorted, the number of error correction words corresponding to each error correction type may be counted; the error correction words in excess of a threshold on the number of error correction words are deleted, and the remaining error correction words are then sorted according to their weight values.
For example, for error-correcting words with a mutually exclusive relationship, error-correcting words with relatively smaller order weights may be deleted according to the order weights of the error-correcting types, so as to reduce the number of error-correcting words.
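The sorting steps of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the type names, the weight values, and the choice to multiply the initial weight by the word order weight are all hypothetical (the text only says that both values are used), and the sketch assumes that the lowest-weighted words of a type are the ones dropped when a type exceeds its count threshold. The handling of mutually exclusive error correction words is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str              # error correction word shown to the user
    correction_type: str   # hypothetical type label, e.g. "adjacent_key"
    initial_weight: float  # initial weight value, e.g. from the lexicon


# Hypothetical word order weights per error correction type (illustrative values).
TYPE_ORDER_WEIGHT = {"adjacent_key": 1.0, "missing_vowel": 0.6}


def rank_candidates(candidates, type_weights=TYPE_ORDER_WEIGHT, max_per_type=2):
    """Return (word, weight) pairs sorted by descending weight value."""
    # Combine the initial weight with the type's word order weight.
    # Using a product is an assumption made for this sketch.
    scored = [
        (c.initial_weight * type_weights[c.correction_type], c)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Keep at most max_per_type words for each error correction type.
    kept = []
    seen = defaultdict(int)
    for weight, cand in scored:
        if seen[cand.correction_type] < max_per_type:
            seen[cand.correction_type] += 1
            kept.append((cand.word, weight))
    return kept
```

Sorting before applying the per-type cap means the words removed from an over-represented type are always its lowest-weighted ones.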
In a second aspect, an embodiment of the present application provides an apparatus for sorting error-correcting words, including:
an error correction word acquisition module, configured to, when a character string input by a user is received, acquire a plurality of error correction words matching the character string, and determine the error correction type to which each error correction word belongs and the word order weight corresponding to the error correction type;
a weight value determining module, configured to determine the weight value of each error correction word according to the word order weight of the error correction type;
and an error correction word sorting module, configured to sort the plurality of error correction words according to the weight value of each error correction word.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for sorting error-correcting words according to any one of the above first aspects when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor of a terminal device, the computer program implements the method for sorting error-correcting words in any one of the above first aspects.
In a fifth aspect, an embodiment of the present application provides a computer program product, which, when running on a terminal device, causes the terminal device to execute the method for sorting error-correcting words in any one of the above first aspects.
Compared with the prior art, the embodiment of the application has the following beneficial effects:
In the embodiments of the present application, when a character string input by a user is received, the error correction words matching the character string are acquired, the weight value of each error correction word is then re-determined according to the word order weight of the error correction type to which it belongs, and the error correction words are displayed to the user after being sorted by the re-determined weight values. Generally, an error correction type with a higher word order weight corrects errors that users make more frequently during input. Re-determining the weight values of the error correction words according to their error correction types therefore allows the error correction words corresponding to frequent input errors to be displayed to the user preferentially, which improves the error correction efficiency of the input method. The method can be applied in fields such as artificial intelligence and natural language processing to improve input efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the embodiments or the description of the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of a hardware structure of a mobile phone to which a method for sorting error-correcting words provided in an embodiment of the present application is applied;
FIG. 2 is a schematic diagram of a software structure of a mobile phone to which a method for sorting error-correcting words provided in an embodiment of the present application is applied;
FIG. 3 is a flowchart illustrating exemplary steps of a method for sorting error correction words according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an input method architecture according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating exemplary steps of a method for ordering error correction words according to another embodiment of the present application;
FIG. 6 is a flowchart illustrating exemplary steps of a method for sorting error correction words according to yet another embodiment of the present application;
FIG. 7 is a block diagram illustrating a structure of an apparatus for sorting error-correcting words according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
The terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include forms such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the embodiments of the present application, "one or more" means one, two, or more than two; "and/or" describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may represent: A alone, both A and B, and B alone, where A and B may each be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
At present, input methods of various brands provide corresponding error correction functions. The basic principle is that a fixed error correction list or strategy is preset; when the pinyin character string input by a user is detected to match the error correction list or strategy, the error correction character string corresponding to the input pinyin character string is looked up in the system lexicon, and the candidate words matching the error correction character string are then displayed to the user.
For example, if the user inputs the pinyin character string "ng", the automatic error correction function of the input method completes it to "lang", "leng", or "ling" according to a preset error correction strategy, and then finds the corresponding Chinese characters and provides them to the user.
For another example, if the user inputs the pinyin character string "yiu", the input method will search for adjacent other letters around the letter "i" to form the character string "you", and then provide the user with the chinese characters corresponding to the character string "you".
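The adjacent-key correction in the "yiu" example can be sketched as below. This is only an illustration under assumptions: the adjacency table lists just a few same-row QWERTY neighbours, and `VALID_PINYIN` is a hypothetical stand-in for the lookup against the system lexicon described above.

```python
# Same-row neighbours on a QWERTY keyboard; only a few keys are listed
# (an illustrative subset, not a complete table).
ADJACENT_KEYS = {"i": "uo", "o": "ip", "u": "yi", "e": "wr", "n": "bm"}

# Hypothetical stand-in for the set of pinyin strings known to the lexicon.
VALID_PINYIN = {"you", "yao", "yin", "yun"}


def adjacent_key_corrections(typed, valid=VALID_PINYIN, adjacency=ADJACENT_KEYS):
    """Replace one letter at a time with a neighbouring key and keep valid pinyin."""
    results = []
    for pos, ch in enumerate(typed):
        for neighbour in adjacency.get(ch, ""):
            candidate = typed[:pos] + neighbour + typed[pos + 1:]
            if candidate in valid and candidate not in results:
                results.append(candidate)
    return results


print(adjacent_key_corrections("yiu"))  # prints ['you']
```

Only single-letter substitutions are tried here; a real input method would likely combine this with other error correction types.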
However, the error correction words provided by the input method after automatic error correction are not necessarily the words the user actually intended. Incorrect or excessive error correction words interfere with normal use.
For example, if the input method detects that the user has input the character string "daohng", the character string cannot be matched directly to any Chinese characters. The input method then corrects the string to obtain "daohang", "daohong", and "daoheng", and offers the corresponding error correction words. In fact, the user may only have wanted to enter "daohang" (navigation), and correcting the input as described above produces too many error correction words. When the input method presents all of these error correction words, the user needs some time to find the word actually intended.
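The "daohng" example can be reproduced with a small sketch that inserts one vowel at each position and keeps the strings that split into known syllables. The syllable set here is a tiny hypothetical inventory chosen for the example; the source does not say how the input method generates these candidates, so this is one plausible approach rather than the described one.

```python
VOWELS = "aeiou"

# Hypothetical, tiny syllable inventory; a real input method would use
# the full table of pinyin syllables.
SYLLABLES = {"dao", "hang", "heng", "hong"}


def is_pinyin(text, syllables=SYLLABLES):
    """Return True if text can be split into a sequence of known syllables."""
    if not text:
        return True
    return any(
        text.startswith(syl) and is_pinyin(text[len(syl):], syllables)
        for syl in syllables
    )


def missing_vowel_corrections(typed, syllables=SYLLABLES):
    """Insert a single vowel at every position and keep the valid results."""
    results = []
    for pos in range(1, len(typed) + 1):
        for vowel in VOWELS:
            candidate = typed[:pos] + vowel + typed[pos:]
            if is_pinyin(candidate, syllables) and candidate not in results:
                results.append(candidate)
    return results


print(missing_vowel_corrections("daohng"))
# prints ['daohang', 'daoheng', 'daohong']
```

All three candidates are equally valid as pinyin, which is why a further ranking step is needed to decide which one to show first.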
Therefore, to solve the above problems, the embodiments of the present application provide a method for sorting error correction words that sorts the error correction words obtained by error correction in a reasonable order, so as to improve the error correction rate of the pinyin input method, make the error correction words finally presented to the user better match the user's actual needs, and reduce the operations the user must perform to pick the intended word out of an excessive number of error correction words.
The method for ordering error-correcting words provided by the embodiment of the present application is described below with reference to a specific technical solution.
The method for sorting error-correcting words provided by the embodiment of the application can be applied to terminal devices such as a mobile phone, a tablet personal computer, a wearable device, a vehicle-mounted device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA) and the like, and the embodiment of the application does not limit the specific types of the terminal devices.
Take a mobile phone as an example of the terminal device. FIG. 1 is a block diagram illustrating a partial structure of a mobile phone according to an embodiment of the present application. Referring to FIG. 1, the mobile phone includes: a radio frequency (RF) circuit 110, a memory 120, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a wireless fidelity (Wi-Fi) module 170, a processor 180, and a power supply 190. Those skilled in the art will appreciate that the structure shown in FIG. 1 does not limit the mobile phone, which may include more or fewer components than shown, combine some components, or arrange the components differently.
The following describes each component of the mobile phone in detail with reference to fig. 1:
The RF circuit 110 may be used to receive and transmit signals while information is being sent and received or during a call. In particular, it receives downlink information from a base station and passes it to the processor 180 for processing, and it transmits uplink data to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 110 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
The memory 120 may be used to store software programs and modules, and the processor 180 executes the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile phone, and the like. Further, the memory 120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 130 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile phone 100. Specifically, the input unit 130 may include a touch panel 131 and other input devices 132. The touch panel 131, also referred to as a touch screen, may collect touch operations performed by a user on or near it (for example, operations performed on or near the touch panel 131 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program. Optionally, the touch panel 131 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch panel 131 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 130 may include other input devices 132 in addition to the touch panel 131. In particular, the other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by the user, information provided to the user, and the various menus of the mobile phone. The display unit 140 may include a display panel 141; optionally, the display panel 141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. Further, the touch panel 131 may cover the display panel 141. When the touch panel 131 detects a touch operation on or near it, the touch operation is transmitted to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although the touch panel 131 and the display panel 141 are shown in FIG. 1 as two separate components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 131 and the display panel 141 may be integrated to implement the input and output functions of the mobile phone.
The mobile phone 100 may also include at least one sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which adjusts the brightness of the display panel 141 according to the brightness of ambient light, and a proximity sensor, which turns off the display panel 141 and/or the backlight when the mobile phone is moved to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally along three axes) and the magnitude and direction of gravity when stationary; it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that can be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described further here.
Wi-Fi is a short-distance wireless transmission technology. Through the Wi-Fi module 170, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although FIG. 1 shows the Wi-Fi module 170, it is understood that the module is not an essential part of the mobile phone 100 and may be omitted as needed without changing the essence of the invention.
The processor 180 is the control center of the mobile phone. It connects the various parts of the entire mobile phone through various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the mobile phone as a whole. Optionally, the processor 180 may include one or more processing units; preferably, the processor 180 may integrate an application processor, which mainly handles the operating system, user interface, application programs, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 180.
The mobile phone 100 also includes a power supply 190 (e.g., a battery) for powering the various components. Preferably, the power supply may be logically connected to the processor 180 via a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown, the mobile phone 100 may also include a camera. Optionally, the camera may be located on the front or on the rear of the mobile phone 100, which is not limited in this embodiment of the application.
Optionally, the mobile phone 100 may include a single camera, a dual camera, or a triple camera, which is not limited in this embodiment.
For example, the mobile phone 100 may include three cameras: a main camera, a wide-angle camera, and a telephoto camera.
Optionally, when the mobile phone 100 includes a plurality of cameras, the plurality of cameras may be all front-mounted, all rear-mounted, or a part of the cameras front-mounted and another part of the cameras rear-mounted, which is not limited in this embodiment of the present application.
In addition, although not shown, the mobile phone 100 may further include a bluetooth module or the like, which is not described herein.
Fig. 2 is a schematic diagram of a software structure of the mobile phone 100 according to the embodiment of the present application. Taking the operating system of the mobile phone 100 as an Android system as an example, in some embodiments, the Android system is divided into four layers, which are an application layer, an application Framework (FWK) layer, a system layer and a hardware abstraction layer, and the layers communicate with each other through a software interface.
As shown in fig. 2, the application layer may include a series of application packages, which may include short message, calendar, camera, video, navigation, gallery, call, and other applications.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer may include some predefined functions, such as functions for receiving events sent by the application framework layer.
As shown in FIG. 2, the application framework layer may include a window manager, a resource manager, a notification manager, and the like.
The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether a status bar exists, lock the screen, take screenshots, and the like. The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages that disappear automatically after a short stay without requiring user interaction; for example, the notification manager is used to announce that a download is complete or to deliver message alerts. A notification may also appear in the top status bar of the system in the form of a chart or scroll-bar text, such as a notification from an application running in the background, or appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone sounds, the electronic device vibrates, or an indicator light flashes.
The application framework layer may further include:
A view system, which includes visual controls such as controls for displaying text and controls for displaying pictures. The view system may be used to build applications. A display interface may be composed of one or more views. For example, a display interface that includes a short message notification icon may include a view for displaying text and a view for displaying pictures.
A phone manager, which provides the communication functions of the mobile phone 100, for example, management of call status (including connected, ended, and so on).
The system layer may include a plurality of functional modules, for example: a sensor service module, a physical state recognition module, a three-dimensional graphics processing library (such as OpenGL ES), and the like.
The sensor service module is used to monitor sensor data uploaded by the various sensors in the hardware layer and to determine the physical state of the mobile phone 100.
The physical state recognition module is used to analyze and recognize user gestures, human faces, and the like.
The three-dimensional graphics processing library is used to implement three-dimensional graphics drawing, image rendering, composition, layer processing, and the like.
The system layer may further include:
The surface manager is used to manage the display subsystem and provide fusion of 2D and 3D layers for multiple applications.
The media library supports a variety of commonly used audio, video format playback and recording, and still image files, among others. The media library may support a variety of audio-video encoding formats, such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, and the like.
The hardware abstraction layer is a layer between hardware and software. The hardware abstraction layer may include a display driver, a camera driver, a sensor driver, etc. for driving the relevant hardware of the hardware layer, such as a display screen, a camera, a sensor, etc.
The following embodiments may be implemented on the mobile phone 100 having the above-described hardware structure and software structure. The following embodiment takes the mobile phone 100 as an example to describe the method for sorting error correction words provided by the embodiments of the present application.
Referring to FIG. 3, a schematic flowchart of the steps of a method for sorting error correction words provided in an embodiment of the present application is shown. By way of example and not limitation, the method may be applied to the mobile phone 100 and may specifically include the following steps:
S301, when a character string input by a user is received, acquiring a plurality of error correction words matching the character string, and determining the error correction type to which each error correction word belongs and the word order weight corresponding to the error correction type.
In this embodiment, the character string input by the user may be a pinyin character string. That is, the user inputs a pinyin character string when using the pinyin input method, and the input method then outputs the corresponding Chinese characters through matching and lookup.
Fig. 4 is a schematic diagram of an input method architecture of the present embodiment. The method for sorting error-correcting words provided by this embodiment can be implemented in the input method architecture shown in fig. 4. The input method architecture shown in fig. 4 mainly includes four core units of input, engine, lexicon, and display. The input unit can provide input modes of keyboard input for a user, including a Pinyin full key, a Pinyin nine key and the like, and the core processing of the input unit is false touch prevention, so that the user can input a correct character string as much as possible. Of course, the input unit can also provide other language input modes for the user, and various types of input modes such as strokes, handwriting, voice and the like.
For the pinyin input method, the engine unit mainly provides an N-gram language model and Viterbi pinyin decoding, so that word output, association, and error correction functions can be realized in combination with the lexicon. The lexicon of the input method includes a basic lexicon, network hot words, user-defined phrases, and the like, and each lexicon can be updated at a certain frequency.
The display unit of the input method can be presented to the user through a display interface of the terminal equipment, and besides the output of the basic candidate words, the display unit of the input method can also comprise recommendation information of other information or service classes.
On the basis of this input method architecture, the sorting process of the candidate words is improved and a new error-correcting word sorting method is provided, as described in detail below.
In general, input errors may occur when a user inputs a pinyin character string. In that case, the input method can automatically correct the erroneously input character string. For example, if the user enters the character string "yiu", the input method automatically corrects it to the character string "you".
After correcting the erroneously input pinyin character string, the input method can output error-correcting words that match the corrected character string. For example, for the character string "yiu" input by the user, the input method may output error-correcting words such as "have", "again", "by", etc. that match the corrected character string "you".
Each error-correcting word is obtained by correcting the erroneous character string according to a particular error correction type. Taking the character string "yiu" again as an example, the error-correcting words "have", "again", "by", etc. output by the input method are obtained by correcting "yiu" according to adjacent key error correction. That is, the erroneously input letter "i" in the character string "yiu" is replaced with the adjacent letter "o".
Of course, the adjacent key error correction provided above is only an example, and the error correction manner provided by the input method may include many kinds, which is not limited in this embodiment.
In this embodiment, for any error correction type, corresponding word order weights may be configured for the error correction type, so that after error correction is performed on a pinyin character string, error-corrected words may be sorted according to the word order weights.
The word order weights of the various error correction types in this embodiment may be calculated from a large amount of statistical data and error correction cases. Specifically, the word order weights are calculated by analyzing the correspondence between a large number of erroneously input character strings and the Chinese characters finally input by the user, so that they reflect the input errors users make most often. Therefore, the error correction types corresponding to the most common input errors, such as fuzzy tone error correction and tail few-input character error correction, generally have large word order weights.
It should be noted that the pinyin character string input by the user is not necessarily an erroneous character string. For example, when the user enters the character string "zai", the user may indeed wish to enter "zai" and obtain the Chinese characters that match it. On the other hand, the input method can also correct the character string to "zhai" according to a corresponding error correction type and provide the Chinese characters matching "zhai" to the user as well. In this case, each Chinese character matching the character string "zai" is a non-error-correcting word, and each Chinese character matching the character string "zhai" is an error-correcting word.
S302, determining the weight value of each error correction word according to the word order weight of the error correction type;
In this embodiment, the various error correction types may be divided into different error correction classes in advance, and a word order weight is set for each error correction class.
As an example of this embodiment, the error correction types may be divided into four classes, namely error correction class one, error correction class two, error correction class three, and error correction class four, and a word order weight may then be set for each class. For example, the word order weight of error correction class one may be set equal to that of error correction class two, the word order weight of error correction class two may be set greater than that of error correction class three, the word order weight of error correction class three may be set greater than that of error correction class four, and so on.
When the weight value of each error correction word is determined according to the word order weight of the error correction type, the word order weight of the error correction class can be assigned directly to the corresponding error correction words. That is, the word order weight of the error correction class is used directly as the weight value of each error correction word obtained according to that class. Alternatively, the word order weight of the error correction class can be applied as an adjustment to the original weight value of the error correction word. For example, if the word order weight of a certain error correction type is 60%, the original weight value of an error correction word obtained according to that type may be multiplied by 60% to give the new weight value of the error correction word. This embodiment does not limit which manner is used.
S303, sequencing the plurality of error-correcting words according to the weight value of each error-correcting word.
After the weight value of each error correction word is determined, the error correction words can be sorted according to their weight values. Generally, the larger the weight value, the earlier the ranking; conversely, the smaller the weight value, the later the ranking.
In the embodiment of the application, when a character string input by a user is received, the error-correcting words matched with the character string are acquired, the weight value of each error-correcting word is re-determined according to the word order weight of its error correction type, and the error-correcting words are sorted by the re-determined weight values before being displayed to the user. Generally, an error correction type with a higher word order weight corrects an input error that users make more frequently. Re-determining the weight values according to the error correction type therefore allows the error-correcting words corresponding to the more frequent input errors to be displayed to the user preferentially, which improves the error correction efficiency of the input method. The method can be widely applied in fields such as Artificial Intelligence (AI) and natural language processing, and helps improve input efficiency.
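Steps S301 to S303 can be sketched in a few lines of Python. This is only an illustrative sketch: the `Candidate` structure, the `rank` function, and the numeric word order weights in `CLASS_WEIGHT` are assumptions made here (the text only states that class one equals class two, which is greater than class three, which is greater than class four), and the multiplicative variant of the weighting is used.

```python
from dataclasses import dataclass

# Hypothetical word order weights per error correction class. Only the
# ordering (class 1 == class 2 > class 3 > class 4) comes from the text;
# the numbers themselves are assumptions.
CLASS_WEIGHT = {1: 1.0, 2: 1.0, 3: 0.8, 4: 0.6}


@dataclass
class Candidate:
    word: str            # error-correcting word offered to the user
    base_weight: float   # weight from the input method's existing ranking
    error_class: int     # error correction class that produced the word


def rank(candidates):
    """Sort candidates by base weight scaled by their class's word order weight."""
    return sorted(
        candidates,
        key=lambda c: c.base_weight * CLASS_WEIGHT[c.error_class],
        reverse=True,  # larger weight value ranks earlier
    )
```

With these assumed weights, a class four word with a high base weight can still fall behind a class one word with a lower base weight, which is the intended effect of the word order weight.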
Referring to fig. 5, a flowchart illustrating schematic steps of a method for sorting error-correcting words according to another embodiment of the present application is shown, where the method specifically includes the following steps:
S501, when a character string input by a user is received, correcting errors of the character string according to multiple preset error correction types respectively, and determining word order weights corresponding to the various error correction types;
It should be noted that the method can be applied to terminal devices such as mobile phones and tablet computers. That is, the execution subject of the present embodiment is a terminal device. Taking a mobile phone as an example of the terminal device, when a user uses a pinyin input method on the mobile phone and the pinyin character string input by the user is erroneous, the erroneous character string can be corrected according to the method provided by this embodiment, and the error-correcting words obtained after correction are reordered, so that the top-ranked error-correcting words are more likely to be the words the user really wants to input.
In this embodiment, the plurality of error correction types provided by the input method may be classified first. For example, the error correction types are divided into error correction class one, error correction class two, error correction class three, and error correction class four, so that each error correction class includes at least one specific error correction type.
Generally, the types of error correction provided by the input method include user-configured fuzzy tone error correction, tail few-input character error correction, adjacent key error correction, default fuzzy tone error correction, multiple-input character error correction, middle few-input character error correction, and swap character error correction, among others. User-configured fuzzy tone error correction may refer to an error correction type manually configured by the user. The input method can provide the user with a function for manually configuring correction of the errors that the user frequently makes during input. For example, if the user frequently confuses the letters "f" and "h" during input, the user may manually configure the two letters as an error correction pair, and in subsequent input the input method may preferentially check whether the two letters have been input in place of each other.
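As a rough illustration of a user-configured error correction pair, the sketch below substitutes one member of a pair for the other and keeps only results that are legal pinyin. The function name `fuzzy_pair_corrections`, the pair list format, and the `valid_pinyin` set are assumptions introduced for this example, not part of the described input method.

```python
def fuzzy_pair_corrections(s, pairs, valid_pinyin):
    """Return legal pinyin reachable by replacing one pair member with the other.

    `pairs` is a list of user-configured confusion pairs such as ("f", "h");
    `valid_pinyin` is a (hypothetical) set of legal pinyin strings.
    """
    found = set()
    for a, b in pairs:
        # Try the substitution in both directions.
        for src, dst in ((a, b), (b, a)):
            start = s.find(src)
            while start != -1:
                candidate = s[:start] + dst + s[start + len(src):]
                if candidate != s and candidate in valid_pinyin:
                    found.add(candidate)
                start = s.find(src, start + 1)
    return sorted(found)
```

For instance, with the pair ("f", "h") configured, the input "fei" would also yield the target string "hei", provided "hei" is in the set of legal pinyin.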
Tail few-input character error correction may refer to the case where a pinyin is not completely input; the corresponding error-correcting words can be obtained by automatically appending all letters that complete it into a legal pinyin. For example, for the input character string "pe", legal pinyin such as "pei", "pen", or "peng" can be obtained by filling in letters at the end.
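Tail completion can be sketched as a prefix lookup against a table of legal pinyin. The small `SYLLABLES` set and the function name `tail_completions` below are assumptions for illustration; a real input method would consult its full syllable table.

```python
# Tiny illustrative subset of legal pinyin syllables (an assumption; a real
# input method would use its complete syllable table).
SYLLABLES = {"pa", "pai", "pan", "pang", "pei", "pen", "peng", "pi", "pin", "ping"}


def tail_completions(prefix, syllables=SYLLABLES):
    """Return every legal syllable that extends `prefix` with trailing letters."""
    return sorted(s for s in syllables if s.startswith(prefix) and s != prefix)
```

With this subset, `tail_completions("pe")` yields the syllables beginning with "pe", each of which can then be matched against the lexicon to obtain error-correcting words.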
Adjacent key error correction may refer to the case where the letter of an adjacent key position is pressed during input; when adjacent key error correction is performed, the mistakenly pressed letter is corrected back. In general, adjacent key error correction should be limited to an offset of no more than one key position.
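A minimal sketch of adjacent key error correction follows. It assumes a full-key QWERTY layout and, for simplicity, treats only same-row neighbours as adjacent; the helper names and the `valid_pinyin` set are hypothetical, and a real input method would use its own keyboard geometry.

```python
# Letter rows of a QWERTY keyboard. Only left/right neighbours in the same row
# are treated as adjacent here, which is a simplifying assumption.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def row_neighbors(ch):
    """Return the keys immediately left and right of `ch` in its row."""
    for row in ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []


def adjacent_key_corrections(s, valid_pinyin):
    """Replace exactly one letter with a neighbouring key; keep legal pinyin only."""
    found = set()
    for i, ch in enumerate(s):
        for n in row_neighbors(ch):
            candidate = s[:i] + n + s[i + 1:]
            if candidate in valid_pinyin:
                found.add(candidate)
    return sorted(found)
```

Replacing only one letter by an immediate neighbour keeps the correction within the one-key-position offset mentioned above; for the input "yiu", the neighbour "o" of "i" produces "you".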
Default fuzzy tone error correction may refer to fuzzy tone error correction that the input method supports by default, without the user having to configure it. Compared with user-configured fuzzy tone error correction, default fuzzy tone error correction has a limited number of error-correcting words and a relatively small word order weight.
Multiple-input character error correction may refer to the case where the user has typed an extra letter in the pinyin character string. The extra letter may be a repeated letter.
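One simple way to model an extra letter is to try deleting each letter in turn and keep the results that are legal pinyin. The function name and the `valid_pinyin` set below are assumptions for illustration.

```python
def extra_letter_corrections(s, valid_pinyin):
    """Delete one letter at each position; keep results that are legal pinyin."""
    found = set()
    for i in range(len(s)):
        candidate = s[:i] + s[i + 1:]
        if candidate in valid_pinyin:
            found.add(candidate)
    return sorted(found)
```

A repeated letter, as in "haoo", is covered by the same loop, since deleting either "o" gives "hao".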
Middle few-input character error correction may refer to the case where a letter is omitted in the middle of the pinyin character string. It should be noted that middle few-input character error correction applies only to letters omitted at a middle position of the character string, not to letters omitted at its beginning or end.
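A sketch of the middle-omission case: one letter is inserted at each interior position, never at the start or the end of the string, and only legal pinyin are kept. The function name and the `valid_pinyin` set are hypothetical.

```python
import string


def middle_missing_corrections(s, valid_pinyin):
    """Insert one letter at an interior position; keep results that are legal pinyin."""
    found = set()
    # range(1, len(s)) skips position 0 (front) and position len(s) (tail),
    # so only omissions in the middle of the string are repaired.
    for i in range(1, len(s)):
        for ch in string.ascii_lowercase:
            candidate = s[:i] + ch + s[i:]
            if candidate in valid_pinyin:
                found.add(candidate)
    return sorted(found)
```

For the input "liag", inserting "n" before the final "g" gives "liang"; an input such as "lian" is not extended to "liang" by this function, because that omission is at the tail.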
Swap character error correction may refer to the case where two adjacent letters in the character string are transposed by mistake; the error can be corrected simply by reversing the order of the two letters.
Table one provides the definitions, scopes, and corresponding descriptions of the various error correction types in this embodiment.
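Transposed letters can be repaired by swapping each adjacent pair back and checking the result. The function name and the `valid_pinyin` set in this sketch are assumptions.

```python
def swap_corrections(s, valid_pinyin):
    """Swap each pair of adjacent letters; keep results that are legal pinyin."""
    found = set()
    for i in range(len(s) - 1):
        if s[i] != s[i + 1]:  # swapping identical letters changes nothing
            candidate = s[:i] + s[i + 1] + s[i] + s[i + 2:]
            if candidate in valid_pinyin:
                found.add(candidate)
    return sorted(found)
```

For example, the input "yuo" becomes "you" after the last two letters are swapped.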
Table one:
The various error correction types described above can be divided into a plurality of error correction classes, and a corresponding threshold of the number of error correction words and a word order weight are set for each error correction class.
In this embodiment, the threshold of the number of error correction words may be the maximum number of error correction words of a given error correction type that are allowed to be presented to the user. For example, if the threshold of the number of error correction words corresponding to a certain error correction type is 5, at most 5 error correction words are provided to the user when the pinyin character string is corrected according to that type.
As an example of this embodiment, when the error correction types are divided into four error correction classes, the threshold of the number of error correction words of error correction class one may be set to be greater than a first number threshold, that of error correction class two may be set to be less than or equal to a second number threshold, that of error correction class three may be set to be less than or equal to a third number threshold, and that of error correction class four may be set to be less than or equal to a fourth number threshold, where the first number threshold may be greater than the second number threshold, the second number threshold may be greater than the third number threshold, and the third number threshold may be greater than the fourth number threshold.
For example, the first number threshold may be set to infinity, which means that when error correction is performed according to an error correction type belonging to error correction class one, all error correction words may be presented to the user without any limit on their number. The second number threshold may be set to 5, the third number threshold may be set to 2, the fourth number threshold may be set to 1, and so on. Of course, the above is only an example of the present embodiment, and a person skilled in the art may select the thresholds of the number of error correction words according to actual needs, which is not limited in the present embodiment.
The word order weight in this embodiment may be a discount factor applied to the weight value of the error correction word. For example, if the word order weight is 1, the weight value of an error correction word obtained according to that error correction type is not discounted; if the word order weight of a certain error correction type is 80%, the weight value of an error correction word obtained according to that type is multiplied by 0.8, thereby reducing the weight value of the error correction word.
Table two shows an example classification of the error correction types in this embodiment.
Table two:
S502, if the character string is successfully corrected according to the target error correction type, generating a target character string corresponding to the target error correction type, wherein the target error correction type is any one of the multiple error correction types;
In this embodiment, when a pinyin character string input by a user is received, the pinyin character string may be corrected according to each of the error correction types, and if the correction succeeds, a target character string corresponding to that error correction type is obtained.
For example, for the character string "png" input by the user, performing error correction according to the above-described error correction types yields the target character strings "pan", "pen", and "ping" corresponding to the few-input character error correction type, and so on.
It should be noted that there may be more than one type of error correction for any string of characters entered by the user. That is, the input method may correct the error of the character string according to a plurality of different error correction types, which is not limited in this embodiment.
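Steps S501 and S502 can be pictured as running every configured corrector over the input and recording a target string only for the types that succeed. The sketch below is illustrative only: `correct_with_all_types`, the two toy correctors, and the tiny `VALID` set are assumptions introduced for this example.

```python
VALID = {"you", "yu"}  # toy set of legal pinyin (an assumption)


def swap_adjacent(s):
    """Toy swap-character corrector: transpose one adjacent pair."""
    cands = {s[:i] + s[i + 1] + s[i] + s[i + 2:] for i in range(len(s) - 1)}
    return sorted(c for c in cands if c in VALID and c != s)


def drop_one_letter(s):
    """Toy multiple-input corrector: delete one letter."""
    cands = {s[:i] + s[i + 1:] for i in range(len(s))}
    return sorted(c for c in cands if c in VALID)


def correct_with_all_types(s, correctors):
    """Map each error correction type that succeeds to its target strings."""
    targets = {}
    for type_name, corrector in correctors.items():
        results = corrector(s)
        if results:  # correction under this type succeeded
            targets[type_name] = results
    return targets
```

Calling `correct_with_all_types("yuo", {"swap": swap_adjacent, "extra letter": drop_one_letter})` returns target strings for both types, which illustrates that a single input may be corrected under more than one error correction type.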
S503, acquiring a plurality of error correction words matched with the target character string;
For the obtained target character strings, the input method may provide the error-correcting words matched with each target character string according to its existing word output manner.
When there are a plurality of error correction types, each error correction type may yield a plurality of error-correcting words.
S504, obtaining the initial weight value of each error correction word, and determining the weight value of each error correction word according to the initial weight value of each error correction word and the word order weight of the error correction type corresponding to each error correction word;
The initial weight value of each error-correcting word is the weight value obtained according to the existing sorting strategy of the input method, and represents the ranking position of the error-correcting word before the processing of this embodiment.
In this embodiment, when obtaining the initial weight value of each error correction word and obtaining the word order weight of the error correction type corresponding to the error correction word, the initial weight value and the word order weight of the corresponding error correction type may be multiplied to obtain the final weight value of each error correction word.
For example, if a certain error correction word belongs to the words given by the adjacent key error correction, the initial weight value may be multiplied by 60% when calculating the final weight value of the error correction word, and after processing, the final weight value will be smaller than the initial weight value. Accordingly, the rank position of the error-correcting word in all candidate words may be moved backward.
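The calculation in S504 can be written compactly. The symbols below are introduced here for illustration and do not appear in the original text: $w_0(c)$ is the initial weight value of error-correcting word $c$, $t(c)$ is the error correction type that produced it, and $\alpha_{t(c)}$ is the word order weight of that type, assumed to lie in $(0, 1]$ because it acts as a discount.

```latex
w(c) = w_0(c) \cdot \alpha_{t(c)}, \qquad 0 < \alpha_{t(c)} \le 1
```

For the adjacent key example above, $\alpha = 0.6$; with a hypothetical initial weight $w_0 = 0.5$, the final weight would be $0.5 \times 0.6 = 0.3$, so the word can only stay in place or move backward in the candidate list.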
After the weight value of each error correction word is determined, the error correction words can be sorted according to the weight value. Generally, the larger the weight value, the higher the ranking.
S505, respectively counting the number of error correction words corresponding to each error correction type;
In this embodiment, since each error correction type may yield a plurality of error correction words, the number of error correction words obtained under each error correction type may be counted, in order to reduce the occurrence of unnecessary error correction words in the candidate word list.
For example, the number of error correction words obtained by user-configured fuzzy tone error correction is counted, the number of error correction words obtained by adjacent key error correction is counted, and so on.
S506, if the number of the error-correcting words is more than the threshold value of the number of the error-correcting words of the corresponding error-correcting type, deleting the error-correcting words more than the threshold value of the number of the error-correcting words;
In this embodiment, after the number of error-correcting words of each error correction type has been counted, the excess error-correcting words may be processed according to the thresholds of the number of error-correcting words shown in table two.
In a specific implementation, as can be seen from table two, the number of error-correcting words obtained by user-configured fuzzy tone error correction is not limited, and all error-correcting words obtained according to that type can be retained among the candidate words. For the few-input character error correction type, at most 5 error-correcting words are allowed to be retained, so if 7 error-correcting words are obtained according to that type, the 2 excess error-correcting words need to be deleted. Other types of error-correcting words may be processed in the same manner, which is not described in detail in this embodiment.
S507, sorting the remaining error-correcting words according to the weight values of the remaining error-correcting words.
Taking the 7 error-correcting words obtained by few-input character error correction above as an example, 2 of them should be deleted because their number exceeds the maximum number of error-correcting words allowed for the few-input character error correction type.
In a specific implementation, 2 error-correcting words with relatively small weight values may be deleted, and the remaining 5 error-correcting words with relatively large weight values may be retained.
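Steps S505 to S507 can be sketched as follows. The thresholds in `MAX_WORDS` follow the example values given earlier (infinity, 5, 2, 1), but keying them by class number, the tuple layout `(word, weight, error_class)`, and the function name `prune_and_sort` are assumptions made for this illustration.

```python
import math
from collections import defaultdict

# Hypothetical thresholds of the number of error-correcting words per class,
# using the example values from the text (infinity, 5, 2, 1).
MAX_WORDS = {1: math.inf, 2: 5, 3: 2, 4: 1}


def prune_and_sort(candidates):
    """Keep the top-weighted words within each class's threshold, then sort.

    Each candidate is a tuple (word, weight, error_class).
    """
    by_class = defaultdict(list)
    for cand in candidates:
        by_class[cand[2]].append(cand)  # S505: group (and so count) per class

    kept = []
    for error_class, group in by_class.items():
        group.sort(key=lambda c: c[1], reverse=True)
        limit = MAX_WORDS[error_class]
        # S506: drop the lowest-weighted words beyond the threshold.
        kept.extend(group if limit == math.inf else group[:int(limit)])

    # S507: sort the remaining words by weight value.
    return sorted(kept, key=lambda c: c[1], reverse=True)
```

With 7 candidates in a class whose threshold is 5, the 2 candidates with the smallest weight values are dropped and the remaining 5 are returned in descending order of weight.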
In the embodiment of the application, the error correction types are classified, and a threshold of the number of error-correcting words and a word order weight are set for each error correction class. After the input pinyin character string is corrected according to the various error correction types, the weight value of each error-correcting word is re-determined according to the word order weight, and the error-correcting words exceeding the number threshold are deleted, yielding the candidate error-correcting words finally presented to the user. Subdividing the error correction types addresses the problems of erroneous correction, failed correction, and excessive correction in input method error correction, so that the words most likely to match what the user really wants to input can be presented to the user. This reduces the negative effects of the automatic error correction function of the input method and improves the error correction efficiency of the pinyin input method.
Referring to fig. 6, a flowchart illustrating schematic steps of a method for sorting error-correcting words according to another embodiment of the present application is shown, where the method specifically includes the following steps:
S601, when a character string input by a user is received, acquiring a plurality of error correction words matched with the character string, and determining an error correction type to which each error correction word belongs and a word order weight corresponding to the error correction type;
S602, determining the weight value of each error correction word according to the word order weight of the error correction type;
S603, sorting the plurality of error-correcting words according to the weight value of each error-correcting word;
Since steps S601 to S603 in this embodiment are similar to steps S301 to S303 and S501 to S507 in the foregoing embodiments, reference may be made to the description of the foregoing embodiments, and details are not repeated here.
S604, judging whether the error correction types corresponding to any two error correction words have a mutual exclusion relationship;
In this embodiment, when the error-correcting words are sorted, it may further be determined whether the error correction types corresponding to any two error-correcting words have a mutual exclusion relationship. A mutual exclusion relationship means that the two error correction types cannot occur at the same time; only one of them may occur.
In a specific implementation, based on the analysis of a large amount of statistical data, error correction class three and error correction class four in the foregoing embodiment may be set to be mutually exclusive. That is, error-correcting words obtained according to error correction class three exclude error-correcting words obtained according to error correction class four.
S605, if the error correction types corresponding to any two error correction words have a mutual exclusion relationship, determining the error correction type to be deleted;
In this embodiment, when the error correction types corresponding to two error correction words have a mutual exclusion relationship, the error correction words of one of the two types need to be deleted.
As an example of the present embodiment, when there are error correction words whose types are mutually exclusive, the error correction words belonging to the error correction type with the smaller word order weight may be deleted.
Therefore, if the error correction types corresponding to any two error correction words have a mutual exclusion relationship, the word order weights of the mutually exclusive error correction types may be compared first, and the error correction type with the smallest word order weight among them is then determined as the error correction type to be deleted.
S606, deleting each error correction word matched with the error correction type to be deleted.
In this embodiment, the word order weight of error correction class four is smaller than that of error correction class three. Therefore, when error correction words obtained according to error correction types belonging to both error correction class three and error correction class four appear, each error correction word corresponding to error correction class four needs to be deleted.
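Steps S604 to S606 can be sketched as below. The class labels, the numeric word order weights, and the function name `apply_mutual_exclusion` are assumptions for illustration; only the rule that the mutually exclusive class with the smaller word order weight is deleted comes from the text.

```python
# Hypothetical word order weights and mutual exclusion pairs.
WORD_ORDER_WEIGHT = {"class3": 0.8, "class4": 0.6}
MUTUALLY_EXCLUSIVE = [("class3", "class4")]


def apply_mutual_exclusion(candidates):
    """Drop words of the lower-weighted class when two exclusive classes co-occur.

    Each candidate is a tuple (word, error_class).
    """
    present = {error_class for _, error_class in candidates}
    to_delete = set()
    for a, b in MUTUALLY_EXCLUSIVE:
        if a in present and b in present:  # S604: exclusive classes co-occur
            # S605: the class with the smaller word order weight is deleted.
            to_delete.add(min((a, b), key=WORD_ORDER_WEIGHT.__getitem__))
    # S606: remove every word of the class(es) to be deleted.
    return [c for c in candidates if c[1] not in to_delete]
```

If only one of the two mutually exclusive classes is present among the candidates, nothing is deleted.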
Of course, the mutual exclusion relationship is one supplementary limitation applied when the error-correcting words are sorted in this embodiment; as shown in table three, other supplementary limitations, such as error correction of the preferred words, may also be set.
Table three:
Of course, the above supplementary limitations are only examples, and those skilled in the art may add other limitations according to actual needs to optimize the sorting order of the error-correcting words.
Table four shows the error-correcting words, and their order, obtained by applying the method for sorting error-correcting words provided in this embodiment to a number of input pinyin character strings.
Table four:
Input pinyin | Error-correcting words and ordering
eili | Qi, calendar and Hitachi |
daxue | College, daxue, xue, capitalization and law |
napi | Where, that |
zian | Zi' an, Xian, first |
zianzai | Now, first |
szn | Crossbow for ten-year, ancestor bird, three and crossbow |
aan | A an, an |
wanan | Night safety |
iy | One to |
vl | Lu, Lu tea |
liag | Two are |
bingkui | Lack of ice |
hngda | Hengda (constant altitude) |
qinren | Relatives, lovers, Qin people |
sanghai | Injury of the mulberry sea |
As the above examples show, the candidate words obtained by correcting and sorting according to the error-correcting word sorting method provided in this embodiment are accurately corrected and moderate in number, which meets the actual needs of users.
When tens of thousands of erroneous pinyin cases were randomly selected and corrected and sorted using the error-correcting word sorting method provided by this embodiment, the error correction rate was improved by 40 percent and the erroneous correction rate was reduced by 30 percent, and relatively balanced results were obtained across the error correction, sorting, and number of output words for pinyin character strings. The method for sorting error-correcting words provided by this embodiment can be applied to the field of artificial intelligence, in particular to its sub-field of natural language processing, where it can effectively improve input efficiency and error correction efficiency.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 7 shows a structural block diagram of an error-correcting word sorting apparatus according to an embodiment of the present application, which corresponds to the error-correcting word sorting method described in the foregoing embodiment, and only shows portions related to the embodiment of the present application for convenience of description.
Referring to fig. 7, the apparatus may be applied to a terminal device, and specifically may include the following modules:
an error correction word obtaining module 701, configured to, when a character string input by a user is received, obtain multiple error correction words matched with the character string, and determine an error correction type to which each error correction word belongs and a word order weight corresponding to the error correction type;
a weight value determining module 702, configured to determine a weight value of each error correction word according to the word order weight of the error correction type;
the error-correcting word sorting module 703 is configured to sort the multiple error-correcting words according to the weight value of each error-correcting word.
In this embodiment of the present application, the error-correcting word obtaining module 701 may specifically include the following sub-modules:
the character string error correction sub-module is used for respectively correcting the character strings according to a plurality of preset error correction types when receiving the character strings input by a user;
the target character string generation submodule is used for generating a target character string corresponding to a target error correction type if the character string is successfully corrected according to the target error correction type, wherein the target error correction type is any one of the multiple error correction types;
and the error-correcting word acquisition sub-module is used for acquiring a plurality of error-correcting words matched with the target character string.
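The three sub-modules above can be sketched as a loop over preset error correction types. This is a minimal sketch under stated assumptions: the two correctors, the `ADJACENT_KEYS` table and the `LEXICON` contents are hypothetical, and "successful correction" is taken here to mean that the corrected string exists in the lexicon.

```python
# Toy lexicon mapping valid pinyin strings to words (illustrative data).
LEXICON = {"nihao": ["你好"], "zhong": ["中", "种"]}

# Hypothetical keyboard-neighbour table; only the entry needed for the demo.
ADJACENT_KEYS = {"p": "ol"}


def correct_adjacent_key(string):
    """Return a lexicon string reachable by replacing one key with a neighbour, else None."""
    for i, ch in enumerate(string):
        for neighbour in ADJACENT_KEYS.get(ch, ""):
            candidate = string[:i] + neighbour + string[i + 1:]
            if candidate in LEXICON:
                return candidate
    return None


def correct_swapped_characters(string):
    """Return a lexicon string reachable by swapping two adjacent characters, else None."""
    for i in range(len(string) - 1):
        candidate = string[:i] + string[i + 1] + string[i] + string[i + 2:]
        if candidate in LEXICON:
            return candidate
    return None


# Preset error correction types, each paired with its corrector.
CORRECTORS = {
    "adjacent_key": correct_adjacent_key,
    "swap_character": correct_swapped_characters,
}


def obtain_error_correction_words(string):
    """Try every preset type; collect (word, type) pairs for each successful correction."""
    results = []
    for correction_type, corrector in CORRECTORS.items():
        target_string = corrector(string)
        if target_string is None:  # correction under this type failed
            continue
        results.extend((word, correction_type) for word in LEXICON[target_string])
    return results
```

Keeping the error correction type next to each word matters because the later weighting and deletion steps are all keyed on that type.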
In this embodiment of the present application, the weight value determining module 702 may specifically include the following sub-modules:
an initial weight value obtaining sub-module, configured to obtain an initial weight value of each error correction word;
and the weight value determining submodule is used for determining the weight value of each error-correcting word according to the initial weight value of each error-correcting word and the word order weight of the error-correcting type corresponding to each error-correcting word.
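How the initial weight value and the word order weight are combined is not stated in this passage. The sketch below assumes a simple product; the function name, the sample weights, and the placeholder words are hypothetical.

```python
# Hypothetical word order weights for two error correction types.
WORD_ORDER_WEIGHT = {"third": 0.6, "fourth": 0.3}


def determine_weight_value(initial_weight, correction_type):
    """Assumed combination rule: initial weight scaled by the type's word order weight."""
    return initial_weight * WORD_ORDER_WEIGHT[correction_type]


# (word, error correction type, initial weight value), all illustrative.
candidates = [("word_a", "fourth", 0.9), ("word_b", "third", 0.5)]
weighted = {
    text: determine_weight_value(initial, ctype)
    for text, ctype, initial in candidates
}
```

Under this assumption `word_b` ends up with the larger weight value even though its initial weight value is smaller, because its error correction type has the higher word order weight.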
In this embodiment of the application, each error correction type further has a corresponding error correction word number threshold, and the error correction word sorting module 703 may specifically include the following sub-modules:
the error correction word number counting submodule is used for respectively counting the number of the error correction words corresponding to each error correction type;
the error correction word deleting submodule is used for deleting, if the number of error correction words of an error correction type is greater than the error correction word number threshold of that type, the error correction words in excess of the threshold;
and the error-correcting word ordering submodule is used for ordering the remaining error-correcting words according to the weight values of the remaining error-correcting words.
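The counting, deleting and ordering sub-modules can be sketched as follows. The passage does not say which words are deleted when a type exceeds its threshold; this sketch assumes the lowest-weighted words of that type are dropped. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def limit_and_sort(words, thresholds):
    """words: list of (text, correction_type, weight_value) tuples."""
    # Count step: group the words by error correction type.
    by_type = defaultdict(list)
    for word in words:
        by_type[word[1]].append(word)

    # Delete step: keep at most `threshold` words per type (best-weighted first, assumed).
    remaining = []
    for correction_type, group in by_type.items():
        group.sort(key=lambda w: w[2], reverse=True)
        remaining.extend(group[: thresholds[correction_type]])

    # Order step: sort what is left by weight value, highest first.
    return sorted(remaining, key=lambda w: w[2], reverse=True)


words = [
    ("a", "third", 0.5),
    ("b", "fourth", 0.4),
    ("c", "fourth", 0.2),
    ("d", "third", 0.3),
    ("e", "third", 0.1),
]
result = limit_and_sort(words, {"third": 2, "fourth": 1})
```

With these inputs one word of the third type and one word of the fourth type are dropped before the final ordering.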
In this embodiment, the apparatus may further include the following modules:
the mutual exclusion relationship judging module is used for judging whether the error correction types corresponding to any two error correction words have a mutual exclusion relationship;
a to-be-deleted type determining module, configured to determine an error correction type to be deleted if the error correction types corresponding to any two error correction words have a mutual exclusion relationship;
and the error correction word deleting module is used for deleting each error correction word matched with the error correction type to be deleted.
In this embodiment of the present application, the to-be-deleted type determining module may specifically include the following sub-modules:
the order weight comparison submodule is used for comparing the word order weights of the error correction types with the mutual exclusion relationship if the error correction types corresponding to any two error correction words have the mutual exclusion relationship;
and the to-be-deleted type determining submodule is used for determining, among the error correction types having the mutual exclusion relationship, the error correction type with the smallest word order weight as the error correction type to be deleted.
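The mutual exclusion handling can be sketched as below. The pair table, the weight values and the function name are assumptions for illustration; only the rule "delete the type with the smaller word order weight" is taken from the text.

```python
# Hypothetical word order weights and one hypothetical mutually exclusive pair.
WORD_ORDER_WEIGHT = {"first": 1.0, "second": 1.0, "third": 0.6, "fourth": 0.3}
MUTUALLY_EXCLUSIVE_PAIRS = [("third", "fourth")]


def remove_mutually_exclusive(words):
    """words: list of (text, correction_type) tuples."""
    present = {correction_type for _, correction_type in words}
    to_delete = set()
    for type_a, type_b in MUTUALLY_EXCLUSIVE_PAIRS:
        # Only act when both types of the pair actually occur among the candidates.
        if type_a in present and type_b in present:
            to_delete.add(min((type_a, type_b), key=lambda t: WORD_ORDER_WEIGHT[t]))
    # Delete every word whose type was chosen for deletion.
    return [w for w in words if w[1] not in to_delete]
```

When only one type of a pair is present, nothing is deleted, so a lower-weighted type can still contribute words on its own.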
In this embodiment of the present application, the error correction type may include at least one of a first error correction type, a second error correction type, a third error correction type, and a fourth error correction type, where a word order weight of the first error correction type is equal to a word order weight of the second error correction type, the word order weight of the second error correction type is greater than the word order weight of the third error correction type, and the word order weight of the third error correction type is greater than the word order weight of the fourth error correction type.
In this embodiment of the present application, the error correction word number threshold of the first error correction type is greater than a first number threshold, the error correction word number threshold of the second error correction type is less than or equal to a second number threshold, the error correction word number threshold of the third error correction type is less than or equal to a third number threshold, and the error correction word number threshold of the fourth error correction type is less than or equal to a fourth number threshold, where the first number threshold is greater than the second number threshold, the second number threshold is greater than the third number threshold, and the third number threshold is greater than the fourth number threshold.
In the embodiment of the present application, the third error correction type and the fourth error correction type have a mutually exclusive relationship.
In an embodiment of the present application, the first error correction type may include user-configured fuzzy tone error correction; the second error correction type may include last few-input character error correction, that is, correction of a character missing at the end of the input; the third error correction type may include at least one of adjacent key error correction and default fuzzy tone error correction; and the fourth error correction type may include at least one of multiple-input character error correction (an extra character was typed), middle few-input character error correction (a character is missing in the middle of the input), and swap character error correction (two characters were typed in the wrong order).
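The ordering constraints on word order weights and word number thresholds described for the four error correction types can be checked mechanically. The following sketch is only an illustration: the function name and every numeric value are assumptions, since the passage fixes the relationships but not the numbers.

```python
TYPES = ("first", "second", "third", "fourth")


def check_configuration(weights, word_limits, number_thresholds):
    """Return True if a configuration satisfies the stated ordering constraints."""
    w1, w2, w3, w4 = (weights[t] for t in TYPES)
    l1, l2, l3, l4 = (word_limits[t] for t in TYPES)
    n1, n2, n3, n4 = number_thresholds
    return (
        w1 == w2 > w3 > w4          # word order weights: first = second > third > fourth
        and l1 > n1                 # first type: limit above the first number threshold
        and l2 <= n2                # second to fourth types: limits capped by their thresholds
        and l3 <= n3
        and l4 <= n4
        and n1 > n2 > n3 > n4       # number thresholds strictly decreasing
    )


# Hypothetical configuration that satisfies every constraint.
weights = {"first": 1.0, "second": 1.0, "third": 0.6, "fourth": 0.3}
word_limits = {"first": 10, "second": 4, "third": 2, "fourth": 1}
is_valid = check_configuration(weights, word_limits, (8, 4, 2, 1))
```

A check of this kind could run when the weights or limits are loaded, so that a misconfigured table is caught before any candidates are ranked.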
For the apparatus embodiment, since it is substantially similar to the method embodiment, it is described relatively simply, and reference may be made to the description of the method embodiment section for relevant points.
Referring to fig. 8, a schematic diagram of a terminal device according to an embodiment of the present application is shown. As shown in fig. 8, the terminal device 800 of the present embodiment includes: a processor 810, a memory 820, and a computer program 821 stored in the memory 820 and operable on the processor 810. The processor 810, when executing the computer program 821, implements the steps in the various embodiments of the error-correcting word sorting method described above, such as the steps S301 to S303 shown in fig. 3. Alternatively, the processor 810, when executing the computer program 821, implements the functions of the modules/units in the device embodiments, such as the functions of the modules 701 to 703 shown in fig. 7.
Illustratively, the computer program 821 may be partitioned into one or more modules/units that are stored in the memory 820 and executed by the processor 810 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which may be used to describe the execution of the computer program 821 in the terminal device 800. For example, the computer program 821 may be divided into an error-correcting word obtaining module, a weight value determining module, and an error-correcting word sorting module, where the specific functions of each module are as follows:
the error-correcting word obtaining module is used for obtaining, when a character string input by a user is received, a plurality of error correction words matched with the character string, and determining the error correction type to which each error correction word belongs and the word order weight corresponding to the error correction type;
the weighted value determining module is used for determining the weighted value of each error correction word according to the word order weight of the error correction type;
and the error-correcting word sorting module is used for sorting the plurality of error-correcting words according to the weight value of each error-correcting word.
The terminal device 800 may be a desktop computer, a notebook computer, a palmtop computer, or another computing device. The terminal device 800 may include, but is not limited to, a processor 810 and a memory 820. Those skilled in the art will appreciate that fig. 8 is only one example of the terminal device 800 and does not limit it: the terminal device 800 may include more or fewer components than shown, may combine some components, or may use different components. For example, the terminal device 800 may also include input-output devices, network access devices, buses, and the like.
The processor 810 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 820 may be an internal storage unit of the terminal device 800, such as a hard disk or a memory of the terminal device 800. The memory 820 may also be an external storage device of the terminal device 800, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the terminal device 800. Further, the memory 820 may include both an internal storage unit and an external storage device of the terminal device 800. The memory 820 is used for storing the computer program 821 and other programs and data required by the terminal device 800. The memory 820 may also be used to temporarily store data that has been output or is to be output.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed error correction word ordering method, apparatus, terminal device and storage medium may be implemented in other ways. For example, the division of the modules or units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which implements the steps of the method embodiments described above when executed by a processor. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the error-correcting word sorting apparatus or the terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals or telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.
Claims (13)
1. A method for ordering error-correcting words, comprising:
when a character string input by a user is received, acquiring a plurality of error correction words matched with the character string, and determining an error correction type to which each error correction word belongs and a word order weight corresponding to the error correction type;
determining the weight value of each error correction word according to the word order weight of the error correction type;
and sorting the plurality of error correction words according to the weight value of each error correction word.
2. The method of claim 1, wherein obtaining a plurality of error correction words matching a character string when the character string input by a user is received comprises:
when receiving a character string input by a user, respectively correcting errors of the character string according to multiple preset error correction types;
if the character string is successfully corrected according to the target error correction type, generating a target character string corresponding to the target error correction type, wherein the target error correction type is any one of the multiple error correction types;
and acquiring a plurality of error correction words matched with the target character string.
3. The method according to claim 1, wherein the determining the weight value of each error correction word according to the word order weight of the error correction type comprises:
acquiring an initial weight value of each error correction word;
and determining the weight value of each error-correcting word according to the initial weight value of each error-correcting word and the word order weight of the error-correcting type corresponding to each error-correcting word.
4. The method according to claim 3, wherein each of the error correction types further has a corresponding threshold number of error correction words, and the sorting the plurality of error correction words according to the weight value of each error correction word comprises:
respectively counting the number of error correction words corresponding to each error correction type;
if the number of error correction words of an error correction type is greater than the error correction word number threshold of that type, deleting the error correction words in excess of the threshold;
and sorting the remaining error correction words according to the weight values of the remaining error correction words.
5. The method of claim 1, further comprising:
judging whether the error correction types corresponding to any two error correction words have a mutual exclusion relationship;
if the error correction types corresponding to any two error correction words have a mutual exclusion relationship, determining the error correction type to be deleted;
and deleting each error correction word matched with the error correction type to be deleted.
6. The method according to claim 5, wherein determining the error correction type to be deleted if the error correction types corresponding to any two error correction words have a mutual exclusion relationship includes:
if the error correction types corresponding to any two error correction words have a mutual exclusion relationship, comparing the word order weights of the error correction types with the mutual exclusion relationship;
and determining the error correction type corresponding to the minimum word order weight value as the error correction type to be deleted in the error correction types with the mutual exclusion relationship.
7. The method according to any one of claims 1 to 6, wherein the error correction type comprises at least one of a first error correction type, a second error correction type, a third error correction type and a fourth error correction type, wherein the word order weight of the first error correction type is equal to the word order weight of the second error correction type, the word order weight of the second error correction type is greater than the word order weight of the third error correction type, and the word order weight of the third error correction type is greater than the word order weight of the fourth error correction type.
8. The method according to claim 7, wherein the error correction word number threshold of the first error correction type is greater than a first number threshold, the error correction word number threshold of the second error correction type is less than or equal to a second number threshold, the error correction word number threshold of the third error correction type is less than or equal to a third number threshold, the error correction word number threshold of the fourth error correction type is less than or equal to a fourth number threshold, the first number threshold is greater than the second number threshold, the second number threshold is greater than the third number threshold, and the third number threshold is greater than the fourth number threshold.
9. The method of claim 7, wherein the third error correction type has a mutually exclusive relationship with the fourth error correction type.
10. The method of claim 7, wherein the first error correction type comprises user-configured fuzzy tone error correction, the second error correction type comprises last few-input character error correction, the third error correction type comprises at least one of adjacent key error correction and default fuzzy tone error correction, and the fourth error correction type comprises at least one of multiple-input character error correction, middle few-input character error correction, and swap character error correction.
11. An apparatus for sorting error-correcting words, comprising:
the error-correcting word obtaining module is used for obtaining, when a character string input by a user is received, a plurality of error correction words matched with the character string, and determining the error correction type to which each error correction word belongs and the word order weight corresponding to the error correction type;
the weighted value determining module is used for determining the weighted value of each error correction word according to the word order weight of the error correction type;
and the error-correcting word sorting module is used for sorting the plurality of error-correcting words according to the weight value of each error-correcting word.
12. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method of ordering error-correcting words according to any one of claims 1 to 10 when executing the computer program.
13. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out a method for ordering error-correcting words according to any one of claims 1 to 10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911279538.XA CN112989148B (en) | 2019-12-13 | 2019-12-13 | Error correction word sorting method, device, terminal device and storage medium |
PCT/CN2020/124484 WO2021114928A1 (en) | 2019-12-13 | 2020-10-28 | Error correction word sorting method and apparatus, terminal device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911279538.XA CN112989148B (en) | 2019-12-13 | 2019-12-13 | Error correction word sorting method, device, terminal device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112989148A (en) | 2021-06-18 |
CN112989148B (en) | 2025-05-16 |
Family
ID=76329550
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911279538.XA Active CN112989148B (en) | 2019-12-13 | 2019-12-13 | Error correction word sorting method, device, terminal device and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112989148B (en) |
WO (1) | WO2021114928A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115268664A (en) * | 2022-08-01 | 2022-11-01 | 腾讯科技(深圳)有限公司 | Control method, device and equipment for displaying error correction words and storage medium |
CN116088692A (en) * | 2021-11-03 | 2023-05-09 | 百度国际科技(深圳)有限公司 | Method and apparatus for presenting candidate character strings and training discriminant models |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113436614B (en) * | 2021-07-02 | 2024-02-13 | 中国科学技术大学 | Speech recognition method, device, equipment, system and storage medium |
CN113468871B (en) * | 2021-08-16 | 2024-08-16 | 北京北大方正电子有限公司 | Text error correction method, device and storage medium |
CN113655895B (en) * | 2021-08-17 | 2024-06-11 | 北京百度网讯科技有限公司 | Information recommendation method and device applied to input method and electronic equipment |
CN113849071B (en) * | 2021-09-10 | 2025-01-24 | 维沃移动通信有限公司 | String processing method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170017636A1 (en) * | 2015-07-17 | 2017-01-19 | Ebay Inc. | Correction of user input |
CN106708893A (en) * | 2015-11-17 | 2017-05-24 | 华为技术有限公司 | Error correction method and device for search query term |
CN106774970A (en) * | 2015-11-24 | 2017-05-31 | 北京搜狗科技发展有限公司 | The method and apparatus being ranked up to the candidate item of input method |
CN107092424A (en) * | 2016-02-18 | 2017-08-25 | 北京搜狗科技发展有限公司 | A kind of display methods of error correction, device and the device of the display for error correction |
CN107870677A (en) * | 2016-09-23 | 2018-04-03 | 北京搜狗科技发展有限公司 | A kind of input method, device and the device for input |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103677299A (en) * | 2012-09-12 | 2014-03-26 | 深圳市世纪光速信息技术有限公司 | Method and device for achievement of intelligent association in input method and terminal device |
- 2019-12-13: CN201911279538.XA (CN112989148B), status: Active
- 2020-10-28: PCT/CN2020/124484 (WO2021114928A1), status: Application Filing
Non-Patent Citations (1)
Title |
---|
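王永景 (Wang Yongjing): "面向文本识别流的自动校对算法研究" [Research on automatic proofreading algorithms for text recognition streams], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Masters' Theses Full-text Database, Information Science and Technology], no. 06, 15 June 2008 (2008-06-15) * |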
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116088692A (en) * | 2021-11-03 | 2023-05-09 | 百度国际科技(深圳)有限公司 | Method and apparatus for presenting candidate character strings and training discriminant models |
CN116088692B (en) * | 2021-11-03 | 2024-04-19 | 百度国际科技(深圳)有限公司 | Method and apparatus for presenting candidate strings and training a discriminant model |
CN115268664A (en) * | 2022-08-01 | 2022-11-01 | 腾讯科技(深圳)有限公司 | Control method, device and equipment for displaying error correction words and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2021114928A1 (en) | 2021-06-17 |
CN112989148B (en) | 2025-05-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||