
CN104463101B - Answer recognition methods and system for character property examination question - Google Patents


Info

Publication number
CN104463101B
Authority
CN
China
Prior art keywords
answer
character
acoustic model
self-adaptive
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410624173.0A
Other languages
Chinese (zh)
Other versions
CN104463101A (en)
Inventor
胡雨隆
胡金水
竺博
魏思
胡国平
胡郁
刘庆峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Knowledge Science & Technology Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by iFlytek Co Ltd
Priority to CN201410624173.0A
Publication of CN104463101A
Application granted
Publication of CN104463101B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/24: Character recognition characterised by the processing or recognition method
    • G06V30/242: Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244: Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses an answer recognition method and system for text test questions. The method includes: acquiring a text test question answer image; obtaining one or more answer character strings to be recognized from the answer image; performing handwriting recognition on the answer character strings to be recognized based on a general acoustic model to obtain a first recognition result; acquiring answer information of the text test question; constructing an adaptive acoustic model according to the first recognition result and the answer information of the text test question; and performing handwriting recognition on the answer character strings to be recognized using the adaptive acoustic model to obtain a final recognition result. With the invention, the recognition accuracy of textual objective questions can be effectively improved, thereby improving marking efficiency and accuracy.

Description

Answer recognition method and system for text test questions
Technical Field
The invention relates to the technical field of information processing, in particular to an answer identification method and system for a text test question.
Background
With the continuous advance of computer technology and education informatization, computer and artificial intelligence technologies have gradually been applied to everyday education and teaching, with corresponding applications in practical scenarios such as teaching assistance and teaching evaluation. In China, however, the main means of assessing basic education and students' learning are still examinations and tests of various kinds, which places a heavy burden of correcting homework and examination papers on teachers. In response, automatic marking systems have gradually been adopted in large, medium-sized, and high-stakes examinations, and these systems can reduce teachers' marking workload to a certain extent.
However, in conventional automatic marking systems, the parts that are marked entirely by computer are mostly the bubble-sheet objective questions (such as multiple-choice questions), while textual test questions (such as fill-in-the-blank and short-answer questions) are still mainly marked by teachers or trained professionals at centralized marking sessions. Because the recognition accuracy achieved by computers on textual test questions has not reached the level required for wide deployment, such questions are still marked manually, which leads to low marking efficiency, heavy consumption of human resources, and marking deviations caused by markers' subjective factors.
Disclosure of Invention
The embodiment of the invention provides an answer recognition method and system for a text test question, which are used for improving the recognition accuracy of a text objective question and further improving the paper marking efficiency and accuracy.
Therefore, the embodiment of the invention provides the following technical scheme:
an answer recognition method for a text test question, comprising:
acquiring a text test question answer image;
obtaining one or more answer character strings to be identified from the answer image;
performing handwriting recognition on the answer character string to be recognized based on a general acoustic model to obtain a first recognition result;
acquiring answer information of the text test questions;
constructing a self-adaptive acoustic model according to the first recognition result and answer information of the text test question;
and performing handwriting recognition on the answer character string to be recognized by utilizing the self-adaptive acoustic model to obtain a final recognition result.
Preferably, the obtaining one or more answer character strings to be recognized from the answer image includes:
for a semi-open type writing layout, performing fine segmentation on the answer image according to context structure information among different lines of characters and statistical information of character component distribution; then merging the fine segmentation results to obtain one or more answer character strings to be identified;
and for the limited region type writing layout, obtaining one or more answer character strings to be identified according to the writing layout information of the answer sheet.
Preferably, the constructing an adaptive acoustic model according to the first recognition result and the answer information of the text test question includes:
determining an acoustic model needing to be subjected to self-adaption according to answer information of the text test questions;
taking the first recognition result as a model self-adaptive training sample, and determining a credible training sample;
performing adaptive iterative training on the acoustic model to be subjected to adaptive training according to the credible training sample to obtain an adaptive transformation matrix;
and obtaining the self-adaptive acoustic model after the self-adaptive iterative training is finished.
Preferably, the text test question comprises: a literal objective question;
the acquiring of answer information of the text test questions comprises:
acquiring an objective question standard answer character list L1 and a frequently wrong character list L2 corresponding to the objective question standard answer characters;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps: and selecting an acoustic model corresponding to the union character of the character list L1 and the character list L2 as the acoustic model needing to be self-adapted.
Preferably, the text test question comprises: a textual subjective question;
the acquiring of answer information of the text test questions comprises:
acquiring a character list L3 of the related range of the subjective question answer;
determining candidate characters from the first recognition result, and generating a candidate character list L4;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps:
and selecting an acoustic model corresponding to the union character of the character list L3 and the character list L4 as the acoustic model needing to be self-adapted.
Preferably, the text test question comprises: a literary objective question and a literary subjective question;
the acquiring of answer information of the text test questions comprises:
acquiring an objective question standard answer character list L1, a frequently wrong character list L2 corresponding to objective question standard answer characters and a character list L3 of a subjective question answer related range;
determining candidate characters from the first recognition result, and generating a candidate character list L4;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps:
and selecting acoustic models corresponding to union characters of the character list L1, the character list L2, the character list L3 and the character list L4 as the acoustic models needing to be self-adapted.
Preferably, the step of using the first recognition result as a model adaptive training sample, and the step of determining a credible training sample comprises:
identifying the training sample based on the acoustic model after the current self-adaptive iteration to obtain an identification first candidate posterior probability;
and if the first candidate posterior probability is larger than the confidence coefficient threshold value, taking the training sample as a credible training sample of the next self-adaptive iteration.
Preferably, the method further comprises:
calculating the first candidate posterior probability of the full character training set on the universal acoustic model;
clustering the universal acoustic model according to the distribution map of the first candidate posterior probability;
and counting the recognition rate of the full character training set on the universal acoustic model, and determining the confidence coefficient threshold value.
An answer recognition system for textual test questions, comprising:
the first acquisition module is used for acquiring a text test question answer image;
the character string acquisition module is used for acquiring one or more answer character strings to be identified from the answer image;
the general recognition module is used for carrying out handwriting recognition on the answer character string to be recognized based on a general acoustic model to obtain a first recognition result;
the second acquisition module is used for acquiring answer information of the text test questions;
the model construction module is used for constructing a self-adaptive acoustic model according to the first recognition result and the answer information of the text test question;
and the self-adaptive recognition module is used for performing handwriting recognition on the answer character string to be recognized by utilizing the self-adaptive acoustic model to obtain a final recognition result.
Preferably, the cutting module comprises:
the first processing unit is used for performing fine segmentation on the answer image according to context structure information among different lines of characters and statistical information of character component distribution on a semi-open type writing layout, and performing merging processing on the obtained fine segmentation results to obtain one or more answer character strings to be recognized;
and the second processing unit is used for obtaining one or more answer character strings to be recognized according to the writing layout information of the answer sheet for the limited region type writing layout.
Preferably, the model building module comprises:
the initialization unit is used for determining an acoustic model needing to be subjected to self-adaption according to the answer information of the text test question;
a training sample determining unit, configured to determine a trusted training sample by using the first recognition result as a model adaptive training sample;
the training unit is used for carrying out self-adaptive iterative training on the acoustic model needing to be subjected to self-adaptation according to the credible training sample to obtain a self-adaptive transformation matrix; and obtaining the self-adaptive acoustic model after the self-adaptive iterative training is finished.
Preferably, the text test question comprises: a literal objective question;
the second obtaining module is specifically configured to obtain an objective question standard answer character list L1 and a frequently wrong character list L2 corresponding to the objective question standard answer characters;
the initialization unit is specifically configured to select an acoustic model corresponding to a union character of the character list L1 and the character list L2 as the acoustic model to be adapted.
Preferably, the text test question comprises: a textual subjective question;
the second acquisition module includes:
a first acquisition unit configured to acquire a character list L3 of a range to which a subjective question answer relates;
a list generating unit, configured to determine a candidate character from the first recognition result, and generate a candidate character list L4;
the initialization unit is specifically configured to select an acoustic model corresponding to a union character of the character list L3 and the character list L4 as the acoustic model to be adapted.
Preferably, the text test question comprises: a literary objective question and a literary subjective question;
the second acquisition module includes:
a second obtaining unit, configured to obtain an objective question standard answer character list L1, a frequently wrong character list L2 corresponding to objective question standard answer characters, and a character list L3 of a subject question answer related range;
a list generating unit, configured to determine a candidate character from the first recognition result, and generate a candidate character list L4;
the initialization unit is specifically configured to select an acoustic model corresponding to a union character of the character list L1, the character list L2, the character list L3, and the character list L4 as the acoustic model to be adapted.
Preferably, the training sample determination unit includes:
the recognition subunit is used for recognizing the training sample based on the acoustic model after the current self-adaptive iteration to obtain a recognition first candidate posterior probability;
and the judging subunit is used for taking the training sample as a credible training sample of the next self-adaptive iteration when the posterior probability of the first candidate is greater than the confidence coefficient threshold value.
Preferably, the system further comprises: a confidence threshold determination module, the confidence threshold determination module specifically comprising:
the posterior probability calculating unit is used for calculating the first candidate posterior probability of the full-character training set on the universal acoustic model;
the clustering unit is used for clustering the universal acoustic model according to the distribution map of the first candidate posterior probability;
and the statistic unit is used for counting the recognition rate of the full character training set on the universal acoustic model and determining the confidence coefficient threshold value.
According to the answer recognition method and system for textual test questions provided by the embodiments of the present invention, an unsupervised adaptive technique is adopted to learn the writing style of the user, so that a recognition model customized to the user's writing habits is generated and the recognition accuracy of textual test question answers is greatly improved. Applied to an automatic marking system, the method and system can solve the problem that, in conventional automatic marking systems, textual test questions cannot be fully and automatically marked by computer on a wide scale because of the low answer recognition rate.
Drawings
In order to more clearly illustrate the embodiments of the present application or technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained by those skilled in the art according to the drawings.
FIG. 1 is a flow chart of an answer recognition method for textual test questions according to an embodiment of the present invention;
FIG. 2 is an example of a semi-open writing layout in an embodiment of the present invention;
FIG. 3 is an example of a defined area writing layout in an embodiment of the present invention;
FIG. 4 is a flow chart of constructing an adaptive acoustic model in an embodiment of the present invention;
FIG. 5 is a flow chart of determining a training sample confidence threshold in an embodiment of the present invention;
FIG. 6 is a block diagram of an answer recognition system for textual test questions according to an embodiment of the present invention;
FIG. 7 is a block diagram of a model building module in an embodiment of the invention;
FIG. 8A is a block diagram of another embodiment of an answer recognition system for textual test questions according to the present invention;
FIG. 8B is a block diagram of another embodiment of an answer recognition system for textual test questions;
FIG. 9 is a block diagram of a confidence threshold determination module in an embodiment of the invention.
Detailed Description
In order to enable those skilled in the art to better understand the solutions of the embodiments of the present invention, the embodiments are further described in detail below with reference to the drawings and specific implementations.
The automatic marking systems adopted in existing large, medium-sized or high-stakes examinations cannot automatically mark textual test questions (including textual objective questions and textual subjective questions). This is mainly because the automatic marking of textual test questions, including judging textual objective questions right or wrong and scoring textual subjective questions, depends heavily on the answer recognition result, while the writing of textual test question answers is open and unconstrained and writing styles differ from one answerer to another; the answer recognition rate is therefore greatly reduced, and the recognition of textual test question answers still falls short of the level expected in practice.
Therefore, the embodiment of the invention provides an answer identification method and an answer identification system for a text test question, and an unsupervised self-adaptive technology is applied to answer identification of the text test question. The method adopts an unsupervised self-adaptive technology to learn the writing style of a user, thereby generating an identification model customized according to the writing habit of the user, greatly improving the identification accuracy, and further solving the problem that the literary test questions cannot be fully and automatically read by a computer because of low answer recognition rate in the traditional automatic paper reading system.
As shown in fig. 1, it is a flowchart of an answer recognition method for text test questions according to an embodiment of the present invention, including the following steps:
step 101, obtaining a text test question answer image.
In the embodiment of the present invention, the answer character string may be a Chinese character string, an English character string, or the like.
The specific process of obtaining the answer image is as follows:
(1) Acquiring an image of the answer sheet.
The answer sheet image can be obtained by scanning with an optical mark reader, or by photographing with a high-speed document camera, a mobile terminal, or the like.
(2) Segmenting and extracting the target answer area according to the layout information of the answer sheet.
In practical applications, before the target answer area is segmented and extracted, the answer sheet image can be preprocessed so that an accurate target answer area is extracted. The preprocessing may include operations such as answer sheet image positioning, calibration, noise reduction, contrast enhancement, and graying; the specific processing methods are the same as the answer sheet preprocessing in existing automatic marking systems and are not described here again.
The layout information of the answer sheet is known prior information. If the answer sheet image needs to be positioned, its positioning information can be obtained first, and the target answer area can then be accurately segmented and extracted by edge detection according to this information.
(3) Extracting a text test question answer image from the target answer area.
After the target answer area is obtained, an answer image, i.e., an image of an answer character string, can be obtained by edge point detection according to the layout information of the answer sheet.
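A minimal sketch of steps (1) to (3) above, assuming OpenCV is available; the function name, the layout dictionary and the fixed box coordinates are illustrative assumptions rather than the patent's exact procedure.

```python
import cv2

def extract_answer_images(sheet_path, layout):
    """layout: known prior information, e.g. {'answer_boxes': [(x, y, w, h), ...]}."""
    img = cv2.imread(sheet_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # graying
    gray = cv2.fastNlMeansDenoising(gray, None, 10)     # noise reduction
    gray = cv2.equalizeHist(gray)                       # contrast enhancement

    # Locate the sheet by edge detection and the largest outer contour,
    # then crop each target answer area at its known layout coordinates.
    edges = cv2.Canny(gray, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    sheet = gray[y:y + h, x:x + w]

    answer_images = []
    for (bx, by, bw, bh) in layout['answer_boxes']:
        answer_images.append(sheet[by:by + bh, bx:bx + bw])
    return answer_images
```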
Step 102, obtaining one or more answer character strings to be recognized from the answer image.
The writing layout of an answer sheet is usually either semi-open or limited-region: FIG. 2 shows an example of a semi-open writing layout, and FIG. 3 shows an example of a limited-region writing layout. In the embodiment of the present invention, therefore, different processing may be performed for the different writing layouts to obtain one or more answer character strings to be recognized, specifically as follows:
For the limited-region writing layout, one or more answer character strings to be recognized can be obtained directly according to the writing layout information of the answer sheet. Of course, before the answer character strings to be recognized are obtained from the writing layout information, the answer image may be preprocessed; the preprocessing may include binarizing the answer image, correcting handwriting inclination, thinning the handwriting, and the like.
For the semi-open writing layout, the answer image can be finely segmented according to the context structure information between different lines of characters and the statistical information of character component distribution, and the fine segmentation results are then merged to obtain one or more answer character strings to be recognized. The context structure information between different lines of characters includes geometric information between connected components of the image, foreground pixel projection information, and the like; this information may be obtained by connected component analysis, projection analysis, skeleton analysis, or the like.
It should be noted that, in practical applications, the answer image may also be preprocessed before the fine segmentation to obtain a more accurate segmentation result; the preprocessing likewise may include binarizing the answer image, correcting handwriting inclination, thinning the handwriting, and the like.
The merging of the fine segmentation results specifically includes: exhaustively merging the fine segmentation results and calculating the reliability of each merge, and then determining the merging result according to the reliability to obtain one or more answer character strings to be recognized.
Exhaustive merging means enumerating all possible merges one by one. For example, if there are 5 finely segmented sub-blocks, the following merges exist:
(1) assuming one character: merge sub-blocks 1, 2, 3, 4 and 5;
(2) assuming two characters: merge sub-blocks 1, 2, 3 and 4 (sub-block 5 alone); merge sub-blocks 1, 2 and 3 and merge sub-blocks 4 and 5; merge sub-blocks 1 and 2 and merge sub-blocks 3, 4 and 5; merge sub-blocks 2, 3, 4 and 5 (sub-block 1 alone);
and so on, up to the assumption of five characters.
The merging reliability represents how accurately the merged block corresponds to a character. Specifically, character features of the merged block, such as its height, width, aspect ratio, outer character spacing and inner character spacing, can be extracted; a merging likelihood score is calculated from these features with a pre-trained rule statistical model, and the merging reliability is determined from the likelihood score (the likelihood score can also be used directly as the corresponding merging reliability).
The rule statistical model is a statistical model trained on features, such as height, width, aspect ratio, outer character spacing and inner character spacing, extracted from segmented training data; it can be, for example, a GMM (Gaussian Mixture Model) or an SVM (Support Vector Machine).
If the reliability is greater than a set threshold, the merge is considered trustworthy; otherwise it is considered untrustworthy. One or more answer character strings to be recognized are then obtained from the trustworthy merging result.
It should be noted that, in practical applications, before merging or when judging the merging reliability, some judgment rules may be set according to experience or experiment (for example, the handwriting of one Chinese character does not span more than 3 fine segmentation sub-blocks) to further assist or guide the judgment of whether the character string segmentation result is correct, thereby improving its accuracy.
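A sketch of the exhaustive merging just described, under stated assumptions: sub_blocks stands for the fine-segmentation sub-blocks, score_merge stands in for the pre-trained rule statistical model (e.g. a GMM or SVM over the height, width, aspect ratio and spacing features), and the limit of 3 sub-blocks per character follows the empirical rule mentioned above; all names are illustrative.

```python
from itertools import combinations

def contiguous_partitions(n, max_span=3):
    """All ways to split sub-blocks 0..n-1 into contiguous groups of <= max_span blocks."""
    for k in range(n):                                   # number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            groups = [tuple(range(a, b)) for a, b in zip(bounds, bounds[1:])]
            if all(len(g) <= max_span for g in groups):  # empirical rule from the text
                yield groups

def best_merge(sub_blocks, score_merge, threshold=0.5):
    best, best_score = None, float('-inf')
    for groups in contiguous_partitions(len(sub_blocks)):
        scores = [score_merge([sub_blocks[i] for i in g]) for g in groups]
        if min(scores) < threshold:                      # any untrustworthy merge rejects this hypothesis
            continue
        avg = sum(scores) / len(scores)
        if avg > best_score:
            best, best_score = groups, avg
    return best                                          # each group becomes one character
```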
Step 103, performing handwriting recognition on the answer character string to be recognized based on a general acoustic model to obtain a first recognition result.
The general acoustic model may be, for example, a GMM (Gaussian Mixture Model) or an MQDF (Modified Quadratic Discriminant Function) model.
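A hedged sketch of the first-pass recognition in step 103, assuming one Gaussian mixture per character class as the general model and equal class priors (so the softmax of the per-class log-likelihoods approximates the first-candidate posterior); the feature extraction and the trained class_models mapping are assumptions, not the patent's exact classifier.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def recognize(feature_vec, class_models):
    """class_models: {character: fitted GaussianMixture} trained on the full character set."""
    chars = list(class_models)
    log_lik = np.array([class_models[c].score_samples(feature_vec.reshape(1, -1))[0]
                        for c in chars])
    post = np.exp(log_lik - log_lik.max())
    post /= post.sum()                         # posterior over candidate characters (equal priors)
    top = int(np.argmax(post))
    return chars[top], float(post[top])        # first candidate and its posterior probability
```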
Step 104, acquiring answer information of the text test questions.
The answer information of different types of text test questions has respective characteristics, which will be described in detail later.
Step 105, constructing a self-adaptive acoustic model according to the first recognition result and the answer information of the text test question.
The construction process of the adaptive acoustic model will be described in detail later.
Step 106, performing handwriting recognition on the answer character string to be recognized by using the self-adaptive acoustic model to obtain a final recognition result.
The answer recognition method for the literary test questions provided by the embodiment of the invention adopts the unsupervised self-adaptive technology to learn the writing style of the user, so that a recognition model customized according to the writing habits of the user is generated, the recognition accuracy of the literary test question answers is greatly improved, and the problem that the literary test questions cannot be widely and automatically read by a computer due to low answer recognition rate in the traditional automatic paper reading system is further solved.
As shown in fig. 4, a specific process for constructing an adaptive acoustic model in the embodiment of the present invention includes the following steps:
step 401, determining an acoustic model to be adapted according to answer information of the text test question.
It should be noted that, in practical applications, the text test questions may include only text objective questions or text subjective questions, or may include both text objective questions and text subjective questions, and the embodiment of the present invention is not limited thereto.
Since textual objective questions and textual subjective questions have different characteristics (for example, a textual objective question has a standard answer, while a textual subjective question has no standard answer but has corresponding keywords), the acoustic models to be adapted can be determined in step 401 according to the characteristics of the two question types. Specifically, the following cases may arise:
(1) For the textual objective question, the answer information of the textual test question may include: the objective question standard answer character list L1 and the frequently wrong character list L2 corresponding to the objective question standard answer characters.
The frequently wrong characters corresponding to the standard answer characters can be determined by test history information statistics or according to teacher experience.
Correspondingly, when the acoustic model to be adapted is determined according to the answer information of the text test question, the acoustic model corresponding to the union character of the character list L1 and the character list L2 may be selected as the acoustic model to be adapted.
(2) For the textual subjective question, the answer information of the textual test question includes: the character list L3 of the range related to the subjective question answer, and the candidate character list L4.
The candidate character list L4 is generated from candidate characters determined from the first recognition result. Specifically, recognition results in the first recognition result whose confidence coefficient is greater than a set confidence threshold may be selected as candidate characters, or a certain number (for example, 50) of recognition results may be selected as candidate characters in order of decreasing confidence.
The characters of the related range of the answers of the subjective questions can also be determined by test history information statistics or according to the experience of teachers.
Correspondingly, when the acoustic model to be adapted is determined according to the answer information of the text test question, the acoustic model corresponding to the union character of the character list L3 and the character list L4 needs to be selected as the acoustic model to be adapted.
(3) For the case of simultaneously including the textual objective questions and the textual subjective questions, the answer information of the textual test questions may include: the objective question standard answer character list L1, the frequently wrong character list L2 corresponding to the objective question standard answer characters, the character list L3 of the subjective question answer related range, and the candidate character list L4.
Correspondingly, when the acoustic model to be adapted is determined according to the answer information of the text test question, the acoustic model corresponding to the union character of the character list L1, the character list L2, the character list L3 and the character list L4 needs to be selected as the acoustic model to be adapted.
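As a small illustration of step 401 in all three cases, the models selected for adaptation are simply those whose characters appear in the union of the relevant lists; all_models and the list arguments are assumed inputs.

```python
def models_to_adapt(all_models, L1=(), L2=(), L3=(), L4=()):
    """all_models maps each character to its general acoustic model."""
    union_chars = set(L1) | set(L2) | set(L3) | set(L4)
    # Only the models whose character appears in the union are adapted; the rest stay general.
    return {c: all_models[c] for c in union_chars if c in all_models}
```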
Step 402, taking the first recognition result as the model adaptive training sample, and determining the trusted training samples.
Specifically, the first recognition result is used as the model adaptive training data, and appearance features of the training data, which may be texture features or gradient features, are extracted. The current training samples are then traversed, the confidence of each training sample is calculated, and trusted training samples are selected according to the training-sample confidence threshold. Concretely, each current training sample is recognized with the acoustic model obtained after the latest adaptive iteration to obtain the posterior probability of the recognized first candidate (i.e., the candidate class with the maximum likelihood value returned by the classifier); if this first-candidate posterior probability is greater than the training-sample confidence threshold, the training sample is taken as a trusted training sample for the next round of adaptive iterative training, otherwise it does not participate in the next round.
It should be noted that different training samples correspond to different confidence thresholds. Before the first adaptive iteration, the training samples are recognized with the general acoustic model; the confidence threshold of each training sample can be determined according to the class of its recognition result, and training samples belonging to the same cluster can share one confidence threshold. The calculation of the training-sample confidence threshold will be described in detail later.
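A sketch of the trusted-sample selection in step 402, assuming recognize() from the earlier first-pass sketch and a per-cluster confidence threshold table produced by the calibration procedure described with FIG. 5 later; char_to_cluster and cluster_thresholds are assumed inputs.

```python
def select_trusted_samples(features, models, char_to_cluster, cluster_thresholds):
    """Keep a sample only if its first-candidate posterior beats its cluster's threshold."""
    trusted = []
    for feat in features:
        char, posterior = recognize(feat, models)              # recognize(): see the earlier sketch
        threshold = cluster_thresholds[char_to_cluster[char]]  # per-cluster threshold (calibrated later)
        if posterior > threshold:
            trusted.append((feat, char, posterior))            # participates in the next adaptive iteration
    return trusted
```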
Step 403, performing adaptive iterative training on the acoustic models to be adapted according to the trusted training samples to obtain an adaptive transformation matrix.
The adaptive transformation matrix A is calculated from the trusted training samples determined in step 402, and the acoustic models to be adapted are updated accordingly. Specifically, A is calculated by minimizing the unsupervised adaptive loss function l:

$$l(A)=\sum_{i=1}^{N} f_i\,\bigl\|A x_i - u_{k_i}\bigr\|^2 + \beta\,\bigl\|A - I\bigr\|^2,\qquad k_i=\arg\min_{1\le j\le M}\bigl\|x_i-u_j\bigr\|^2$$

where $k_i$ is the class whose model mean is closest to the $i$-th training sample, $x_i$ is the feature of the $i$-th sample, $u_j$ is the model mean of class $j$, $M$ is the number of cluster classes, $A$ is the adaptive transformation matrix, $N$ is the total number of adaptive training samples, $f_i$ is the recognition confidence, $I$ is the identity matrix, the second term of the loss function is a quadratic regularization term, and $\beta$ is the regularization coefficient.

Minimizing the above loss function (setting its gradient with respect to $A$ to zero) gives the adaptive transformation matrix $A$:

$$A=\Bigl(\beta I+\sum_{i=1}^{N} f_i\,u_{k_i} x_i^{\top}\Bigr)\Bigl(\beta I+\sum_{i=1}^{N} f_i\,x_i x_i^{\top}\Bigr)^{-1}$$
it should be noted that, in the adaptive iterative training process, after each iteration is completed, the output result needs to be used as a model adaptive training sample of the next iterative training process, and then a trusted training sample is determined from the samples to perform the next iterative training.
Step 404, obtaining the adaptive acoustic model after the adaptive iterative training is finished.
Specifically, the adaptive iterative training may be determined to be finished when the number of iterations exceeds a preset threshold (e.g., 5).
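A sketch, under the assumptions of the earlier sketches, of the closed-form estimate of the adaptive transformation matrix A implied by the reconstructed loss above, together with the iterative re-selection of trusted samples and the five-iteration stopping rule; is_trusted and class_mean are hypothetical helpers standing in for the trusted-sample selection and for access to a class model's mean.

```python
import numpy as np

def estimate_transform(X, U, f, beta=1.0):
    """X: (N, d) trusted sample features; U: (N, d) means of their nearest classes;
    f: (N,) recognition confidences. Returns the A minimizing the quadratic loss."""
    d = X.shape[1]
    lhs = beta * np.eye(d) + (f[:, None] * X).T @ X   # beta*I + sum_i f_i x_i x_i^T
    rhs = beta * np.eye(d) + (f[:, None] * X).T @ U   # beta*I + sum_i f_i x_i u_{k_i}^T
    return np.linalg.solve(lhs, rhs).T                # A = (beta*I + sum f_i u x^T)(beta*I + sum f_i x x^T)^-1

def adapt(samples, models, is_trusted, max_iter=5):
    """samples: list of 1-D feature vectors; is_trusted(feat) -> (char, posterior, ok)
    stands in for the trusted-sample selection sketched after step 402 (hypothetical helper)."""
    d = samples[0].shape[0]
    A = np.eye(d)
    for _ in range(max_iter):                         # preset iteration threshold from the text
        X, U, f = [], [], []
        for x in samples:
            char, posterior, ok = is_trusted(A @ x)   # recognize with the currently adapted feature
            if ok:
                X.append(x)                           # the loss is defined over the original feature
                U.append(class_mean(models[char]))    # class_mean: hypothetical accessor for a class model's mean
                f.append(posterior)                   # weight each sample by its recognition confidence
        if not X:
            break
        A = estimate_transform(np.stack(X), np.stack(U), np.array(f))
    return A
```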
As mentioned above, in the embodiment of the present invention, the confidence threshold for each training sample may be determined according to the class to which the recognition result belongs, and fig. 5 shows a process for determining the confidence threshold of the training sample, which includes the following steps:
step 501, calculating the posterior probability of the first candidate of the full character training set on the universal acoustic model.
Specifically, the full character set training data and the general acoustic models are organized by character class, and the first-candidate posterior probability of every sample in the training data when recognized with the general acoustic model is calculated.
Step 502, clustering the general acoustic models according to the distribution map of the first candidate posterior probability.
According to the first-candidate posterior probability distribution histogram of the samples corresponding to each character acoustic model, calculated in step 501, the K-means algorithm is used for clustering to generate M major classes. The number of clusters is generally selected based on experience or experimental results.
Step 503, counting the recognition rate of the full character training set on the universal acoustic model, and determining a confidence threshold.
According to the labeling of the full-character training set and the recognition results of the training set obtained in step 501, the recognition rate of the training samples corresponding to each major class is counted, and the confidence threshold is determined according to the recognition rate. Specifically, the confidence threshold $k_m$ of class $m$ is chosen so that the proportion of class-$m$ training samples whose first-candidate posterior probability is at least $k_m$ equals the recognition rate of class $m$:

$$\frac{1}{N_m}\sum_{p_m \ge k_m} H_{p_m} = R_m$$

where $k_m$ is the confidence threshold of class $m$, $p_m$ is the first-candidate posterior probability of a training sample in class $m$, $H_{p_m}$ is the number of samples in class $m$ whose first-candidate posterior probability is $p_m$, $R_m$ is the recognition rate of class $m$, and $N_m$ is the number of training samples belonging to class $m$.
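A sketch of the threshold calibration of FIG. 5, under the same assumptions as the earlier sketches: character classes are clustered by K-means on the histograms of their first-candidate posteriors, and each cluster's threshold is the posterior value above which the retained fraction of training samples matches that cluster's recognition rate (mirroring the reconstructed relation above); recognize() and all parameter values are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def calibrate_thresholds(train_feats, train_labels, models, n_clusters=8, bins=20):
    chars = sorted(set(train_labels))
    hists, correct, posteriors = {}, {}, {}
    for c in chars:
        feats = [f for f, y in zip(train_feats, train_labels) if y == c]
        results = [recognize(f, models) for f in feats]            # (first candidate, posterior)
        p = np.array([r[1] for r in results])
        posteriors[c] = p
        correct[c] = np.array([r[0] == c for r in results])
        hists[c], _ = np.histogram(p, bins=bins, range=(0.0, 1.0), density=True)

    # Cluster character classes by the shape of their posterior histograms.
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(np.stack([hists[c] for c in chars]))
    char_to_cluster = dict(zip(chars, km.labels_))

    cluster_thresholds = {}
    for m in range(n_clusters):
        members = [c for c in chars if char_to_cluster[c] == m]
        p_all = np.concatenate([posteriors[c] for c in members])
        rate = np.concatenate([correct[c] for c in members]).mean()   # recognition rate R_m
        # keep the top R_m fraction of samples by posterior => threshold at the (1 - R_m) quantile
        cluster_thresholds[m] = float(np.quantile(p_all, 1.0 - rate))
    return char_to_cluster, cluster_thresholds
```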
Correspondingly, an embodiment of the present invention further provides an answer recognition system for the text test questions, as shown in fig. 6, which is a block diagram of the system.
In this embodiment, the system includes:
a first obtaining module 600, configured to obtain a text test question answer image;
a character string obtaining module 601, configured to obtain one or more answer character strings to be recognized from the answer image;
a general recognition module 602, configured to perform handwriting recognition on the answer character string to be recognized based on a general acoustic model, so as to obtain a first recognition result;
a second obtaining module 603, configured to obtain answer information of the text test question; the model construction module 604 is configured to construct an adaptive acoustic model according to the first recognition result and answer information of the text test question;
and the adaptive recognition module 605 is configured to perform handwriting recognition on the answer character string to be recognized by using the adaptive acoustic model to obtain a final recognition result.
The first obtaining module 600 may obtain the textual test question answer image by segmentation and extraction. One specific structure of this module may include an acquisition unit, an area extraction unit, and an image extraction unit, wherein:
the acquisition unit is used for acquiring an image of the answer sheet;
the area extraction unit is used for segmenting and extracting a target answer area according to the layout information of the answer sheet;
and the image extraction unit is used for extracting a text test question answer image from the target answer area.
The acquisition unit may specifically be an optical mark reader, a high-speed document camera, a mobile terminal device, or the like.
The character string obtaining module 601 may perform different processing on different writing layouts to obtain one or more answer character strings to be recognized, and a specific structure of the module may include: a first processing unit, and/or a second processing unit. Wherein:
the first processing unit is used for performing fine segmentation on the answer image according to context structure information among different lines of characters and statistical information of character component distribution on a semi-open type writing layout; and merging the obtained fine segmentation results to obtain one or more answer character strings to be identified.
And the second processing unit is used for obtaining one or more answer character strings to be recognized according to the writing layout information of the answer sheet for the limited region type writing layout.
The specific processing procedures of the first processing unit performing the fine segmentation on the answer image and merging the fine segmentation results may be referred to the description in the foregoing embodiment of the method of the present invention, and are not described herein again.
The answer recognition system for the literary test questions provided by the embodiment of the invention adopts the unsupervised self-adaptive technology to learn the writing style of the user, so that a recognition model customized according to the writing habits of the user is generated, the recognition accuracy of the literary test question answers is greatly improved, and the problem that the literary test questions cannot be widely and automatically read by a computer due to low answer recognition rate in the traditional automatic paper reading system is further solved.
As shown in fig. 7, a structural block diagram of a model building module in the embodiment of the present invention is shown, which includes:
an initialization unit 701, configured to determine an acoustic model to be adapted according to answer information of the text test question;
a training sample determining unit 702, configured to determine a trusted training sample by using the first recognition result as a model adaptive training sample;
a training unit 703, configured to perform adaptive iterative training on the acoustic model to be subjected to adaptation according to the trusted training sample, to obtain an adaptive transformation matrix; and obtaining the self-adaptive acoustic model after the self-adaptive iterative training is finished.
The answer recognition system for the text test questions provided by the embodiment of the invention can realize accurate recognition of answers of different types of text test questions, for example, the following conditions can be provided:
(1) The textual test questions comprise: a textual objective question. Accordingly, the answer information of the textual test question includes: the objective question standard answer character list L1 and the frequently wrong character list L2 corresponding to the objective question standard answer characters.
In this case, the initialization unit may select an acoustic model corresponding to a union character of the character list L1 and the character list L2 as the acoustic model to be adapted.
(2) The textual test questions comprise: a textual subjective question. Accordingly, the answer information of the textual test question includes: the character list L3 of the range related to the subjective question answer, and the candidate character list L4.
Accordingly, in this case, the second obtaining module 603 includes, as shown in fig. 8A:
a first obtaining unit 801 for obtaining a character list L3 of the range related to the subject question answer
A list generating unit 802, configured to determine a candidate character from the first recognition result, and generate a candidate character list L4.
Accordingly, in this embodiment, the initialization unit in the model building module 604 may select an acoustic model corresponding to a union character of the character list L3 and the character list L4 as the acoustic model to be adapted.
(3) The test question with characters comprises: a literary objective question and a literary subjective question; the answer information of the text test questions comprises: an objective question standard answer character list L1, a frequently wrong character list L2 corresponding to the objective question standard answer characters, a character list L3 of the subjective question answer related range, and a candidate character list L4;
also in this case, the second obtaining module 603 includes, as shown in fig. 8B: a second acquisition unit 803, and the list generation unit 802 described above. The second obtaining unit 803 is configured to obtain an objective question standard answer character list L1, a frequently wrong character list L2 corresponding to objective question standard answer characters, and a character list L3 of a subjective question answer related range.
Accordingly, in this embodiment, the initialization unit in the model building module 604 may select the acoustic model corresponding to the union character of the four lists of the character list L1, the character list L2, the character list L3, and the character list L4 as the acoustic model to be adapted.
The training sample determination unit 702 may specifically recognize a training sample based on the acoustic model after the current adaptive iteration to obtain the recognized first-candidate posterior probability, and determine, according to this posterior probability, whether each training sample can be used as a trusted training sample for the next adaptive iteration. One specific structure of the training sample determination unit 702 includes: a recognition subunit, configured to recognize the training sample based on the acoustic model after the current adaptive iteration to obtain the recognized first-candidate posterior probability;
and the judging subunit is used for taking the training sample as a credible training sample of the next self-adaptive iteration when the posterior probability of the first candidate is greater than the confidence coefficient threshold value.
In the system according to the embodiment of the present invention, the confidence threshold of each training sample may be determined according to the class to which the recognition result belongs, as shown in fig. 9, a structural block diagram of the confidence threshold determining module according to the embodiment of the present invention is provided.
The confidence threshold determination module specifically includes:
the posterior probability calculating unit 901 is used for calculating the first candidate posterior probability of the full-character training set on the universal acoustic model;
a clustering unit 902, configured to cluster the generic acoustic model according to the distribution map of the first candidate posterior probability;
a statistic unit 903, configured to count a recognition rate of the full-character training set on the generic acoustic model, and determine the confidence threshold.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. The above-described system embodiments are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. Moreover, the functions provided by some of the modules can also be implemented by software, and some of the modules can be shared with the same functional modules in the existing devices (such as personal computers, tablet computers and mobile phones). One of ordinary skill in the art can understand and implement it without inventive effort.
The above is a detailed description of the embodiments of the present invention; the specific examples used herein are merely intended to facilitate understanding of the methods and apparatus of the present invention. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (16)

1. An answer recognition method for a text test question, comprising:
acquiring a text test question answer image;
obtaining one or more answer character strings to be identified from the answer image;
performing handwriting recognition on the answer character string to be recognized based on a general acoustic model to obtain a first recognition result;
acquiring answer information of the text test questions;
constructing a self-adaptive acoustic model according to the first recognition result and answer information of the text test question;
and performing handwriting recognition on the answer character string to be recognized by utilizing the self-adaptive acoustic model to obtain a final recognition result.
2. The method of claim 1, wherein obtaining one or more answer character strings to be recognized from the answer image comprises:
for a semi-open type writing layout, performing fine segmentation on the answer image according to context structure information among different lines of characters and statistical information of character component distribution; then merging the fine segmentation results to obtain one or more answer character strings to be identified; and/or
And for the limited region type writing layout, obtaining one or more answer character strings to be identified according to the writing layout information of the answer sheet.
3. The method according to claim 1 or 2, wherein the constructing an adaptive acoustic model according to the first recognition result and answer information of the text question comprises:
determining an acoustic model needing to be subjected to self-adaption according to answer information of the text test questions;
taking the first recognition result as a model self-adaptive training sample, and determining a credible training sample;
performing adaptive iterative training on the acoustic model to be subjected to adaptive training according to the credible training sample to obtain an adaptive transformation matrix;
and obtaining the self-adaptive acoustic model after the self-adaptive iterative training is finished.
4. The method of claim 3, wherein the textual questions comprise: a literal objective question;
the acquiring of answer information of the text test questions comprises: acquiring an objective question standard answer character list L1 and a frequently wrong character list L2 corresponding to the objective question standard answer characters;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps:
and selecting an acoustic model corresponding to the union character of the character list L1 and the character list L2 as the acoustic model needing to be self-adapted.
5. The method of claim 3, wherein the textual questions comprise: a textual subjective question;
the acquiring of answer information of the text test questions comprises:
acquiring a character list L3 of the related range of the subjective question answer; determining candidate characters from the first recognition result, and generating a candidate character list L4;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps:
and selecting an acoustic model corresponding to the union character of the character list L3 and the character list L4 as the acoustic model needing to be self-adapted.
6. The method of claim 3, wherein the textual questions comprise: a literary objective question and a literary subjective question;
the acquiring of answer information of the text test questions comprises:
acquiring an objective question standard answer character list L1, a frequently wrong character list L2 corresponding to objective question standard answer characters and a character list L3 of a subjective question answer related range;
determining candidate characters from the first recognition result, and generating a candidate character list L4;
the step of determining the acoustic model to be self-adapted according to the answer information of the text test question comprises the following steps:
and selecting acoustic models corresponding to union characters of the character list L1, the character list L2, the character list L3 and the character list L4 as the acoustic models needing to be self-adapted.
7. The method of claim 3, wherein the determining the credible training samples using the first recognition result as a model adaptive training sample comprises:
identifying the training sample based on the acoustic model after the current self-adaptive iteration to obtain an identification first candidate posterior probability;
and if the first candidate posterior probability is larger than the confidence coefficient threshold value, taking the training sample as a credible training sample of the next self-adaptive iteration.
8. The method of claim 7, further comprising:
calculating the first candidate posterior probability of the full character training set on the universal acoustic model;
clustering the universal acoustic model according to the distribution map of the first candidate posterior probability;
and counting the recognition rate of the full character training set on the universal acoustic model, and determining the confidence coefficient threshold value.
9. An answer recognition system for textual test questions, comprising:
the first acquisition module is used for acquiring a text test question answer image;
the character string acquisition module is used for acquiring one or more answer character strings to be identified from the answer image;
the general recognition module is used for carrying out handwriting recognition on the answer character string to be recognized based on a general acoustic model to obtain a first recognition result;
the second acquisition module is used for acquiring answer information of the text test questions;
the model construction module is used for constructing a self-adaptive acoustic model according to the first recognition result and the answer information of the text test question;
and the self-adaptive recognition module is used for performing handwriting recognition on the answer character string to be recognized by utilizing the self-adaptive acoustic model to obtain a final recognition result.
10. The system of claim 9, wherein the string obtaining module comprises:
the first processing unit is used for performing fine segmentation on the answer image according to context structure information among different lines of characters and statistical information of character component distribution on a semi-open type writing layout, and performing merging processing on the obtained fine segmentation results to obtain one or more answer character strings to be recognized; and/or
And the second processing unit is used for obtaining one or more answer character strings to be recognized according to the writing layout information of the answer sheet for the limited region type writing layout.
11. The system of claim 9 or 10, wherein the model building module comprises:
the initialization unit is used for determining an acoustic model needing to be subjected to self-adaption according to the answer information of the text test question;
a training sample determining unit, configured to determine a trusted training sample by using the first recognition result as a model adaptive training sample;
the training unit is used for carrying out self-adaptive iterative training on the acoustic model needing to be subjected to self-adaptation according to the credible training sample to obtain a self-adaptive transformation matrix; and obtaining the self-adaptive acoustic model after the self-adaptive iterative training is finished.
12. The system of claim 11, wherein the textual questions comprise: a literal objective question;
the second obtaining module is specifically configured to obtain an objective question standard answer character list L1 and a frequently wrong character list L2 corresponding to the objective question standard answer characters; the initialization unit is specifically configured to select an acoustic model corresponding to a union character of the character list L1 and the character list L2 as the acoustic model to be adapted.
13. The system of claim 11, wherein the textual test questions comprise: a textual subjective question;
the second acquisition module comprises:
a first acquisition unit, configured to acquire a character list L3 covering the range involved in the subjective question answer;
a list generation unit, configured to determine candidate characters from the first recognition result and generate a candidate character list L4;
and the initialization unit is specifically configured to select the acoustic model corresponding to the union of the characters in the character list L3 and the character list L4 as the acoustic model to be adapted.
14. The system of claim 11, wherein the textual test questions comprise: a textual objective question and a textual subjective question;
the second acquisition module comprises:
a second acquisition unit, configured to acquire a standard answer character list L1 of the objective question, a frequently miswritten character list L2 corresponding to the objective question standard answer characters, and a character list L3 covering the range involved in the subjective question answer;
a list generation unit, configured to determine candidate characters from the first recognition result and generate a candidate character list L4;
and the initialization unit is specifically configured to select the acoustic model corresponding to the union of the characters in the character lists L1, L2, L3, and L4 as the acoustic model to be adapted.
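Claims 12 to 14 differ only in which character lists are available; combining them reduces to a set union, as in the short sketch below (the list contents and the submodel call in the trailing comment are assumptions).

def characters_to_adapt(L1=None, L2=None, L3=None, L4=None):
    """Union of standard-answer characters (L1), frequently miswritten
    characters (L2), the subjective-answer character range (L3) and
    first-pass candidates (L4); missing lists are simply skipped."""
    chars = set()
    for lst in (L1, L2, L3, L4):
        if lst:
            chars.update(lst)
    return chars

# e.g. adapt only the character models in the union:
# target_models = general_model.submodel(characters_to_adapt(L1, L2, L3, L4))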
15. The system of claim 11, wherein the training sample determination unit comprises:
a recognition subunit, configured to recognize the training sample based on the acoustic model after the current adaptive iteration to obtain a first-candidate posterior probability;
and a judgment subunit, configured to take the training sample as a trusted training sample for the next adaptive iteration when the first-candidate posterior probability is greater than the confidence threshold.
16. The system of claim 15, further comprising a confidence threshold determination module, which comprises:
a posterior probability calculation unit, configured to calculate the first-candidate posterior probabilities of the full character training set on the general acoustic model;
a clustering unit, configured to cluster the general acoustic model according to the distribution of the first-candidate posterior probabilities;
and a statistics unit, configured to compute recognition-rate statistics of the full character training set on the general acoustic model and determine the confidence threshold.
CN201410624173.0A 2014-11-06 2014-11-06 Answer recognition methods and system for character property examination question Active CN104463101B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410624173.0A CN104463101B (en) 2014-11-06 2014-11-06 Answer recognition methods and system for character property examination question

Publications (2)

Publication Number Publication Date
CN104463101A CN104463101A (en) 2015-03-25
CN104463101B true CN104463101B (en) 2017-08-25

Family

ID=52909117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410624173.0A Active CN104463101B (en) 2014-11-06 2014-11-06 Answer recognition methods and system for character property examination question

Country Status (1)

Country Link
CN (1) CN104463101B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096678B (en) * 2015-07-17 2019-03-22 成都准星云学科技有限公司 For assisting judging the method and device of mathematical problem answer quality
CN105912993A (en) * 2016-03-31 2016-08-31 深圳感官密码科技有限公司 Automatic paper marking image identification method and system
CN107301411B (en) * 2016-04-14 2020-07-10 科大讯飞股份有限公司 Mathematical formula identification method and device
CN106023698A (en) * 2016-07-29 2016-10-12 李铧 Automatic reading and amending method for homework and exercise books
CN107977594A (en) * 2016-10-25 2018-05-01 深圳市寒武纪智能科技有限公司 A kind of interactive robot and its method of telling a story
JP6593303B2 (en) * 2016-10-31 2019-10-23 京セラドキュメントソリューションズ株式会社 Problem creating apparatus, problem creating method, and image forming apparatus
CN106781784A (en) * 2017-01-04 2017-05-31 王骁乾 A kind of intelligence correction system
WO2019001418A1 (en) 2017-06-26 2019-01-03 上海寒武纪信息科技有限公司 Data sharing system and data sharing method therefor
CN109214616B (en) 2017-06-29 2023-04-07 上海寒武纪信息科技有限公司 Information processing device, system and method
CN110413551B (en) * 2018-04-28 2021-12-10 上海寒武纪信息科技有限公司 Information processing apparatus, method and device
CN109426553A (en) 2017-08-21 2019-03-05 上海寒武纪信息科技有限公司 Task cutting device and method, Task Processing Unit and method, multi-core processor
CN109697274B (en) * 2017-10-20 2020-10-02 深圳市鹰硕技术有限公司 Examination paper judging method and examination paper judging system
CN108229428A (en) * 2018-01-30 2018-06-29 上海思愚智能科技有限公司 A kind of character recognition method, device, server and medium
CN108764074B (en) * 2018-05-14 2019-03-19 山东师范大学 Subjective item intelligently reading method, system and storage medium based on deep learning
CN109241869A (en) * 2018-08-16 2019-01-18 邯郸职业技术学院 The recognition methods of answering card score, device and terminal device
CN109241904B (en) * 2018-08-31 2023-10-20 平安科技(深圳)有限公司 Character recognition model training, character recognition method, device, equipment and medium
CN110309503A (en) * 2019-05-21 2019-10-08 昆明理工大学 A scoring model and scoring method for subjective questions based on deep learning BERT--CNN
CN111079641B (en) * 2019-12-13 2024-04-16 科大讯飞股份有限公司 Answer content identification method, related device and readable storage medium
CN111931730B (en) * 2020-09-24 2022-03-29 北京易真学思教育科技有限公司 Question judging method and device, electronic equipment and storage medium
CN112634689A (en) * 2020-12-24 2021-04-09 广州奇大教育科技有限公司 Application method of regular expression in automatic subjective question changing in computer teaching
CN113812965B (en) * 2021-08-19 2024-04-09 杭州回车电子科技有限公司 Sleep state identification method, sleep state identification device, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101256624A (en) * 2007-02-28 2008-09-03 微软公司 Method and system for establishing HMM topological structure being suitable for recognizing hand-written East Asia character
CN101315733A (en) * 2008-07-17 2008-12-03 安徽科大讯飞信息科技股份有限公司 Self-adapting method aiming at computer language learning system pronunciation evaluation
CN101866415A (en) * 2009-08-24 2010-10-20 深圳市海云天科技股份有限公司 Answer sheet recognition device and method of computer marking system
CN102622610A (en) * 2012-03-05 2012-08-01 西安电子科技大学 Handwritten Uyghur character recognition method based on classifier integration
CN102982329A (en) * 2012-11-02 2013-03-20 华南理工大学 Segmentation recognition and semantic analysis integration translation method for mobile devices

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4872214B2 (en) * 2005-01-19 2012-02-08 富士ゼロックス株式会社 Automatic scoring device

Also Published As

Publication number Publication date
CN104463101A (en) 2015-03-25

Similar Documents

Publication Publication Date Title
CN104463101B (en) Answer recognition methods and system for character property examination question
KR101877693B1 (en) Intelligent scoring method and system for text objective question
US11790641B2 (en) Answer evaluation method, answer evaluation system, electronic device, and medium
CN110363194B (en) NLP-based intelligent examination paper reading method, device, equipment and storage medium
WO2021073266A1 (en) Image detection-based test question checking method and related device
CN108304876B (en) Classification model training method and device and classification method and device
CN110472675B (en) Image classification method, image classification device, storage medium and electronic equipment
WO2021051598A1 (en) Text sentiment analysis model training method, apparatus and device, and readable storage medium
CN111476284A (en) Image recognition model training method, image recognition model training device, image recognition method, image recognition device and electronic equipment
CN109189767B (en) Data processing method and device, electronic equipment and storage medium
CN111626177B (en) A PCB component identification method and device
CN108090099B (en) Text processing method and device
CN110490081A (en) A kind of remote sensing object decomposition method based on focusing weight matrix and mutative scale semantic segmentation neural network
CN109086654A (en) Handwriting model training method, text recognition method, device, equipment and medium
CN113657098B (en) Text error correction method, device, equipment and storage medium
CN110110610B (en) Event detection method for short video
CN115937873A (en) Online handwriting verification system and method based on recognizable single character
CN115620312A (en) Cross-modal character handwriting verification method, system, equipment and storage medium
CN116645683A (en) Signature handwriting identification method, system and storage medium based on prompt learning
CN111444905B (en) Image recognition method and related device based on artificial intelligence
CN113505786A (en) Test question photographing and judging method and device and electronic equipment
CN110503101A (en) Font evaluation method, apparatus, device and computer-readable storage medium
CN110059180B (en) Article author identity recognition and evaluation model training method and device and storage medium
KR102604756B1 (en) Server, system, method and program providing essay scoring management service
Chinapas et al. Personal verification system using thai ID Card and face photo for cross-age face

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2018-02-12

Address after: Room 405, Building A2, Voice Teaching Product R&D Building, No. 666 Wangjiang West Road, Hefei High-tech Zone, Anhui

Patentee after: Anhui Knowledge Science & Technology Co., Ltd.

Address before: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province, 230088

Patentee before: iFlytek Co., Ltd.

TR01 Transfer of patent right