CN112800182B - Test question generation method and device - Google Patents
- Publication number
- CN112800182B, CN202110185141.5A
- Authority
- CN
- China
- Prior art keywords
- information
- question
- test question
- test
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The disclosure relates to a test question generation method and device. The method comprises: performing first natural language processing on the question information of a first test question to determine the knowledge points examined in the first test question; performing second natural language processing on the stem information of the first test question to acquire entity information of the first test question, the entity information being the core words contained in the first test question; and generating at least one second test question based on the knowledge points and the entity information, the second test question being of the same question type as the first test question. The method and device can generate different stem descriptions while the examined knowledge points remain unchanged, so that students' understanding and mastery of the knowledge points can be fully examined; meanwhile, because different questions use different stem descriptions, cheating in examinations can be effectively prevented, which better safeguards the fairness and impartiality of examination results.
Description
Technical Field
The disclosure relates to the technical field of information processing, in particular to a test question generation method and device.
Background
In the prior art, to prevent students from cheating in an examination, different test papers are produced by randomly shuffling the order of the test questions (or the order of the candidate answers). Although this anti-cheating method helps to some extent, the stem description of each question generally remains unchanged, so students can still copy from one another by matching questions and candidate answers. In addition, the stem description of a question takes only a single form in the prior art, so students' understanding and mastery of the knowledge points cannot be fully examined.
Disclosure of Invention
Embodiments of the present disclosure provide a test question generation method and device that address two problems of the prior art: students' understanding and mastery of knowledge points cannot be fully examined, and cheating by students cannot be effectively prevented.
According to one aspect of the present disclosure, there is provided a test question generation method including:
performing first natural language processing on the question information of a first test question, and determining the knowledge points examined in the first test question;
Performing second natural language processing on the stem information of the first test question to acquire entity information of the first test question; the entity information is a core word contained in the first test question;
and generating at least one second test question based on the knowledge points and the entity information, wherein the second test question is the same as the first test question in question type.
In some embodiments, performing a first natural language process on topic information of a first test question, determining knowledge points examined in the first test question includes:
Determining keywords of the topic information;
Comparing the keywords with knowledge points of a preset category, and determining semantic similarity of the keywords and the knowledge points of the preset category;
and when the semantic similarity between the keyword and the knowledge points of the preset category is larger than a preset semantic threshold, determining the knowledge points of the preset category with the maximum semantic similarity as the knowledge points examined in the first test question.
In some embodiments, determining keywords for the topic information includes:
Word segmentation processing is carried out on the topic information of the first test question, and candidate keywords are determined;
And determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as keywords of the topic information.
In some embodiments, determining keywords for the topic information includes:
Determining the question type of the first test question;
Determining an information category contained in the question information of the first test question based on the question type of the first test question;
And determining keywords of the topic information according to the information category.
In some embodiments, when the first test question is a multiple-choice question, the question information of the first test question includes stem information and candidate answer information, and determining the keywords of the question information according to the information category includes:
Determining keywords in the stem information and in the candidate answer information of the first test question respectively, and determining the keywords of the stem information that match keywords of the candidate answer information as the keywords of the first test question; or
When the first test question is a short-answer question or an analysis question, the question information of the first test question includes stem information and posed-question information (the question actually asked), and determining the keywords of the question information according to the information category includes:
Determining keywords in the stem information and in the posed-question information of the first test question respectively, and determining the keywords of the stem information that match keywords of the posed-question information as the keywords of the first test question.
In some embodiments, generating at least one second test question based on the knowledge points and the entity information includes:
determining related words corresponding to the core words based on the text features of the core words and the syntactic features of the stem information;
And generating the stem information of at least one second test question conforming to the grammar structure based on the knowledge points, the core words and the related words.
In some embodiments, determining the associated word corresponding to the core word based on the text feature of the core word and the syntactic feature of the stem information includes:
Determining the semantics expressed by the core word in the stem information based on the text characteristics of the core word and the syntactic characteristics of the stem information;
based on the semantics of the core word expressed in the stem information, carrying out semantic expansion on the core word to obtain a first associated word; and/or
Based on the text features of the core words and the syntactic features of the stem information, determining context related words corresponding to the core words in the stem information, and carrying out semantic expansion on the context related words to obtain second related words.
In some embodiments, the text features of the core word include at least one of a category of the core word, a part of speech, a location in the stem information, and a dependency relationship with other core words.
In some embodiments, the method further comprises:
determining a difficulty coefficient of the first test question;
And generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient.
According to one of the schemes of the present disclosure, there is also provided a test question generating apparatus, including:
the determining module is configured to perform first natural language processing on the question information of the first test questions and determine knowledge points examined in the first test questions;
the acquisition module is configured to perform second natural language processing on the stem information of the first test question and acquire entity information of the first test question; the entity information is a core word contained in the first test question;
and the generation module is configured to generate at least one second test question based on the knowledge points and the entity information, wherein the second test question is the same as the first test question in question type.
According to one of the schemes of the present disclosure, there is further provided an electronic device, including a processor and a memory, where the memory is configured to store computer executable instructions, and the processor implements the test question generating method described above when executing the computer executable instructions.
According to one aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement the above-described test question generation method.
According to the test question generation method and device provided by the embodiments of the disclosure, natural language processing is used to determine the examined knowledge points from the question information of the first test question, from which new questions are to be generated, and to extract the entity information from the stem information of the first test question; at least one second test question is then generated from the knowledge points and the entity information. Different stem descriptions can thus be generated while the examined knowledge points remain unchanged, so that students' understanding and mastery of the knowledge points can be fully examined. Meanwhile, because different questions use different stem descriptions, copying among students is effectively obstructed and cheating in examinations is effectively prevented, which better safeguards the fairness and impartiality of examination results.
Drawings
FIG. 1 shows a flowchart of a test question generation method of an embodiment of the present disclosure;
FIG. 2 illustrates an example diagram of a first test question at the time of test question generation in an embodiment of the present disclosure;
FIG. 3 illustrates an example diagram of a second test question generated in an embodiment of the present disclosure;
FIG. 4 illustrates another flow chart of a test question generation method of an embodiment of the present disclosure;
FIG. 5 shows yet another flow chart of a test question generation method of an embodiment of the present disclosure;
FIG. 6 shows yet another flowchart of a test question generation method of an embodiment of the present disclosure;
Fig. 7 shows a schematic structural diagram of a test question generating apparatus according to an embodiment of the present disclosure.
Detailed Description
Various aspects and features of the disclosure are described herein with reference to the drawings.
It should be understood that various modifications may be made to the embodiments of the application herein. Therefore, the above description should not be taken as limiting, but merely as exemplification of the embodiments. Other modifications within the scope and spirit of this disclosure will occur to persons of ordinary skill in the art.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and, together with a general description of the disclosure given above and the detailed description of the embodiments given below, serve to explain the principles of the disclosure.
These and other characteristics of the present disclosure will become apparent from the following description of a preferred form of embodiment, given as a non-limiting example, with reference to the accompanying drawings.
It should also be understood that, although the present disclosure has been described with reference to some specific examples, those skilled in the art can certainly realize many other equivalent forms of the present disclosure.
The above and other aspects, features and advantages of the present disclosure will become more apparent in light of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely examples of the disclosure, which may be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the disclosure in unnecessary or redundant detail. Therefore, specific structural and functional details disclosed herein are not intended to be limiting, but merely serve as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
Fig. 1 shows a flowchart of a test question generation method of an embodiment of the present disclosure. As shown in fig. 1, the present disclosure provides a test question generation method, including:
S1: performing first natural language processing on the question information of the first test question, and determining the knowledge points examined in the first test question.
The first test question is the original test question from which new test questions are generated. The question information is all the information contained in the first test question, and first test questions of different question types contain different question information. For example, as shown in fig. 2, the first test question is a single-choice question, and its question information includes stem information and candidate answer information.
The first natural language processing is text analysis based on text recognition, text classification and the like, used to accurately determine the knowledge points examined by the first test question. For example, as shown in fig. 2, "relational database management system" is the knowledge point determined for the first test question. This knowledge point is explicitly expressed in the stem information of the first test question and can be determined directly through text recognition.
In other embodiments, knowledge points are not explicitly represented in the topic information of the first test question, and need to be obtained through reasoning and summarization, so that the knowledge points need to be determined jointly through a combination of methods such as text recognition and text classification.
S2: performing second natural language processing on the stem information of the first test question to acquire entity information of the first test question; the entity information is a core word contained in the first test question.
Specifically, the core words include keywords, logical relation words, indicator words, direction words and other words in the first test question that play a core role in the sentence of the stem information. As shown in fig. 2, "relational database management system", "management" and "relationship" in the stem information may be core words.
Because the entity information is specific content contained in the first test question, the second natural language processing is mainly text analysis based on text recognition so as to recognize and extract core words from the first test question.
S3: and generating at least one second test question based on the knowledge points and the entity information, wherein the second test question is the same as the first test question in question type.
After knowledge points and entity information are obtained, the knowledge points and the entity information can be combined, text analysis processing is further carried out, and at least one second test question with the same type as the first test question is generated. For example, the second test question shown in fig. 3 may be generated based on the knowledge points and the entity information determined from the first test question shown in fig. 2, and the first test question and the generated second test question are both single choice questions.
Here, generating at least one second test question means generating the stem information of the second test question. For example, when the first test question is a fill-in-the-blank question or a true/false question, the question information includes only the stem information, and the generated second test question is simply the stem information of the second test question. When the first test question is a multiple-choice question, a short-answer question, an analysis question or the like, the question information includes not only the stem information but also candidate answers or posed-question information (the question actually asked); because the entity information is obtained from the stem information of the first test question, what is generated from the knowledge points and the entity information is likewise the stem information of the second test question.
According to the test question generation method provided by the embodiment of the disclosure, natural language processing is used to determine the examined knowledge points from the question information of the first test question, from which new questions are to be generated, and to extract the entity information from the stem information of the first test question; at least one second test question is then generated from the knowledge points and the entity information. Different stem descriptions can thus be generated while the examined knowledge points remain unchanged, so that students' understanding and mastery of the knowledge points can be fully examined. Meanwhile, because different questions use different stem descriptions, copying among students is effectively obstructed and cheating in examinations is effectively prevented, which better safeguards the fairness and impartiality of examination results.
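Steps S1 to S3 can be read as a three-stage pipeline. The following Python sketch is only an illustration under stated assumptions: the `Question` class, the `generate_questions` function and the three callables passed to it are hypothetical names that do not appear in the disclosure, and the stand-in lambdas in the usage lines return hard-coded values instead of performing real natural language processing. Carrying the candidate answers over unchanged is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Question:
    question_type: str                      # e.g. "single_choice"
    stem: str                               # stem information
    candidate_answers: List[str] = field(default_factory=list)


def generate_questions(
    first: Question,
    find_knowledge_point: Callable[[Question], str],
    extract_core_words: Callable[[str], List[str]],
    rewrite_stem: Callable[[str, List[str], str], List[str]],
) -> List[Question]:
    """Run S1, S2 and S3 in order and keep the question type unchanged."""
    knowledge_point = find_knowledge_point(first)                      # S1
    core_words = extract_core_words(first.stem)                        # S2
    new_stems = rewrite_stem(knowledge_point, core_words, first.stem)  # S3
    return [
        Question(first.question_type, stem, list(first.candidate_answers))
        for stem in new_stems
    ]


# Usage with hard-coded stand-ins (hypothetical data, not taken from fig. 2).
first = Question(
    "single_choice",
    "The relationship managed by a relational database management system is",
    ["option A", "option B"],
)
second = generate_questions(
    first,
    find_knowledge_point=lambda q: "relational database management system",
    extract_core_words=lambda stem: ["relationship", "managed"],
    rewrite_stem=lambda kp, words, stem: [stem.replace("managed", "maintained")],
)
print(second[0].stem)
```

Passing the three stages in as callables keeps the sketch independent of any particular text recognition or text classification technique, which the disclosure leaves open.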
In some embodiments, as shown in fig. 4, in step S1, performing a first natural language process on the topic information of a first test question, and determining knowledge points examined in the first test question includes:
S11: determining keywords of the topic information;
S12: comparing the keywords with knowledge points of a preset category, and determining semantic similarity of the keywords and the knowledge points of the preset category;
S13: and when the semantic similarity between the keyword and the knowledge points of the preset category is larger than a preset semantic threshold, determining the knowledge points of the preset category with the maximum semantic similarity as the knowledge points examined in the first test question.
Since the knowledge points examined by the first test question are key points in the whole test question, it is necessary to determine the key words in the question information of the first test question. The determination of keywords may be performed using text recognition methods. For example, the topic information may be segmented and then keywords in the topic information may be determined from different segments or sections. The keywords may be one or a plurality of keywords. For example, in fig. 2, the "relational database management system" is a keyword of the topic information of the first test question.
After determining the keyword, the keyword can be compared with knowledge points of preset categories in a preset knowledge point base, and semantic similarity of the keyword and the knowledge points is calculated, wherein a certain number of accurate knowledge points of known categories are stored in the knowledge point base.
The knowledge points of the preset category may be knowledge points that share the semantic context of the keyword. For example, the subject, the textbook chapter or the like to which "relational database management system" belongs may be determined and taken as the semantic context of the keyword; the knowledge points of that subject, chapter or the like are then extracted from the preset knowledge point library as the knowledge points of the preset category and compared with the keyword. When the semantic similarity between the keyword and a knowledge point of the preset category is larger than the preset semantic threshold, that knowledge point is determined to be a candidate knowledge point of the first test question. After the candidate knowledge points are determined, the candidate with the maximum semantic similarity is determined to be the knowledge point finally examined in the first test question.
When several knowledge points of the preset category exceed the preset threshold, some of them are non-key knowledge points, which can be removed so that the knowledge point finally selected is the one examined in the first test question. For example, when the examined knowledge point is "management of the relational database management system", the candidate knowledge points determined from the keyword "relational database management system" may be "management of the relational database management system", "classification of the relational database management system", and the like; the candidate with the maximum semantic similarity, "management of the relational database management system", is then determined as the knowledge point examined in the first test question, which yields a more accurate knowledge point.
The semantic similarity may be calculated with a statistical method based on word co-occurrence, which mainly counts word frequencies in sentences, such as the TF-IDF algorithm; a method that extracts features by training on a corpus, for example with a neural network, may also be used. The present disclosure does not limit the specific calculation method.
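Steps S12 and S13 amount to scoring each preset knowledge point against the keywords, discarding scores that do not exceed the threshold, and keeping the best one. The sketch below is a minimal illustration that assumes a plain term-frequency cosine similarity, which is a simpler relative of TF-IDF and not necessarily the measure used in the disclosure. The function names, the threshold values and the sample knowledge point library are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Optional


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts over raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def pick_knowledge_point(
    keywords: List[str], preset_points: List[str], threshold: float
) -> Optional[str]:
    """Return the preset knowledge point with the highest similarity to any
    keyword, provided that similarity is strictly above the threshold."""
    best_point, best_score = None, threshold
    for point in preset_points:
        score = max((cosine_similarity(k, point) for k in keywords), default=0.0)
        if score > best_score:
            best_point, best_score = point, score
    return best_point


# Hypothetical knowledge point library for one subject.
preset_points = [
    "management of the relational database management system",
    "classification of the relational database management system",
    "binary search tree",
]
keywords = ["relational database management system"]
print(pick_knowledge_point(keywords, preset_points, threshold=0.5))
```

With a stricter threshold no knowledge point may qualify, in which case the function returns `None`; how such a case should be handled is not specified in the text.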
In some embodiments, in step S11, determining the keyword of the topic information includes:
s111: word segmentation processing is carried out on the topic information of the first test question, and candidate keywords are determined;
S112: and determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as keywords of the topic information.
The word segmentation processing includes word splitting, punctuation filtering and the like. It yields a plurality of segments, each of which may be a single character, a single word, or a compound word composed of several words.
After word segmentation, judging the occurrence probability of each word segmentation segment in a preset keyword library according to a word frequency statistical method and the like, and selecting the word segmentation segment as a candidate keyword if the occurrence probability of the word segmentation segment in the preset keyword library meets a preset threshold; and if the probability of the word segmentation segment appearing in the preset keyword library does not meet the preset threshold, determining that the word segmentation segment is not a candidate keyword. In specific implementation, the candidate keywords may also be determined by determining whether the word segmentation segment exists in a preset keyword library.
After the candidate keywords are determined, synonyms of the candidate keywords are extracted from the preset keyword library, and the candidate keywords and the synonyms together serve as the keywords of the topic information of the first test question, so that more comprehensive and accurate keywords are obtained. For example, in fig. 2, RDBMS is the English abbreviation of relational database management system. When the keyword is determined to be "relational database management system", "RDBMS" or the English full name, which are synonymous with it, should also be selected from the preset keyword library or an existing dictionary. If only the keyword appearing in the topic information were used, the comparison between "RDBMS" and the preset knowledge points might be missed, reducing the accuracy with which the knowledge points are determined.
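Steps S111 and S112 can be sketched as follows. This is an illustration under several assumptions: English text is split on non-alphanumeric characters and grouped into word n-grams as a stand-in for a real word segmenter, candidate keywords are selected by simple membership in the keyword library (the variant mentioned above) instead of by occurrence probability, and `KEYWORD_LIBRARY`, `SYNONYMS`, `segment` and `keywords_of` are hypothetical names with made-up contents.

```python
import re
from typing import Dict, List, Set

# Hypothetical preset keyword library and synonym table.
KEYWORD_LIBRARY: Set[str] = {
    "relational database management system",
    "relationship",
    "index",
}
SYNONYMS: Dict[str, List[str]] = {
    "relational database management system": ["RDBMS"],
}


def segment(text: str, max_len: int = 4) -> List[str]:
    """Filter punctuation, then return all word n-grams of 1..max_len words."""
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    ]


def keywords_of(text: str) -> List[str]:
    """S111: keep segments found in the library; S112: add their synonyms."""
    candidates: List[str] = []
    for seg in segment(text):
        if seg in KEYWORD_LIBRARY and seg not in candidates:
            candidates.append(seg)
    expanded = list(candidates)
    for candidate in candidates:
        for synonym in SYNONYMS.get(candidate, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


print(keywords_of("The relationship managed by a relational database management system is:"))
```

Because "RDBMS" is added alongside the full name, a knowledge point that is stored only under the abbreviation can still be compared in step S12.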
In other embodiments, in step S11, determining the keyword of the topic information includes:
S113: determining the question type of the first test question;
S114: determining an information category contained in the question information of the first test question based on the question type of the first test question;
S115: and determining keywords of the topic information according to the information category.
The question information of different question types contains different categories of information. For example, when the first test question is a multiple-choice question, the question information includes stem information and candidate answer information; if keywords were obtained only through steps S111 and S112, distractor information that occurs frequently in the candidate answers might be taken as a keyword, for example when the same word appears in several distractor options of a single-choice question. Conversely, although keywords can be determined from the stem information alone, doing so may miss important keywords that appear in the correct option, which reduces the accuracy of keyword extraction. Therefore, in steps S113 to S115, the information categories contained in the topic information of the first test question are determined according to its question type, and the keywords of the topic information are determined accordingly.
In a specific embodiment, when the first test question is a multiple-choice question, the question information of the first test question includes stem information and candidate answer information, and determining the keywords of the question information according to the information category includes:
And respectively determining keywords in the stem information and the candidate answer information of the first test question, and determining the keywords of the stem information matched with the keywords of the candidate answer information as the keywords of the first test question.
That is, when the first test question is a multiple-choice question, the information categories of the question information are the stem information and the candidate answer information. Word segmentation is performed on each of them to obtain the keywords in the stem information (first keywords) and the keywords in the candidate answer information (second keywords), and the first keywords are then matched against the second keywords. A keyword contained in both sets is a keyword of the stem information as well as of the candidate answers and is kept. This removes the influence of keywords that appear only in distractor options and ensures that the keywords of the question information are determined accurately.
In another specific embodiment, when the first test question is a short-answer question or an analysis question, the question information of the first test question includes stem information and posed-question information (the question actually asked), and determining the keywords of the question information according to the information category includes:
Determining keywords in the stem information and in the posed-question information of the first test question respectively, and determining the keywords of the stem information that match keywords of the posed-question information as the keywords of the first test question.
The keywords of the question information of a short-answer or analysis question are determined in a similar way to those of a multiple-choice question, which is not repeated here.
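Steps S113 to S115 can be sketched as a lookup from question type to information categories, followed by an intersection of the keyword sets. The sketch assumes that "matching" means exact string equality; the `INFO_CATEGORIES` table, the category labels and the sample keywords are hypothetical and serve only as an illustration.

```python
from typing import Dict, List

# Hypothetical mapping from question type to the information categories
# that its question information contains.
INFO_CATEGORIES: Dict[str, List[str]] = {
    "multiple_choice": ["stem", "candidate_answers"],
    "short_answer": ["stem", "posed_question"],
    "analysis": ["stem", "posed_question"],
    "fill_in_blank": ["stem"],
    "true_false": ["stem"],
}


def question_keywords(
    question_type: str, keywords_by_category: Dict[str, List[str]]
) -> List[str]:
    """Keep the stem keywords that also occur in the second information
    category; question types with only a stem keep all stem keywords."""
    categories = INFO_CATEGORIES[question_type]          # S113 and S114
    stem_keywords = keywords_by_category["stem"]
    if len(categories) == 1:
        return list(stem_keywords)
    other = set(keywords_by_category[categories[1]])
    return [k for k in stem_keywords if k in other]      # S115


# Hypothetical keywords already extracted per category.
extracted = {
    "stem": ["binary tree", "traversal"],
    "candidate_answers": ["traversal", "stack", "queue", "stack"],
}
print(question_keywords("multiple_choice", extracted))
```

In this example "stack" occurs twice among the candidate answers but is dropped because it does not occur in the stem, which is the distractor-filtering effect described above.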
The steps S111 to S112 and the steps S113 to S115 may be performed independently or may be performed in combination, and for example, the keywords of the topic information may be determined in the step S115 using the steps S111 to S112.
In some embodiments, as shown in fig. 5, in step S3, generating at least one second test question based on the knowledge points and the entity information includes:
S31: determining associated words corresponding to the core words based on the text features of the core words and the syntactic features of the stem information;
S32: generating stem information of at least one second test question conforming to a grammar structure based on the knowledge points, the core words and the associated words.
The text features of a core word include at least one of its category, its part of speech, its position in the stem information, and its dependency relationships with other core words. The category of a core word may be a proper noun such as a person name, an organization name or a place name, an action, a meaningful time, and the like. The part of speech of a core word may be noun, verb, adjective, adverb, and the like. The position of a core word in the stem information may be at the beginning, at the end, and so on, and the word order of multiple core words can be determined from their positions. The dependency relationships with other core words include contextual, subordinate and modifying relationships, or whether the core word must be used together with other core words. The syntactic features of the stem information include the sentence structure and the logical semantic relationships among the words, where the sentence structure may be, for example, a subject-predicate structure of the whole stem information or of a phrase within it.
For example, in fig. 2, the stem information "the relation managed by a relational database management system is" may be divided by word segmentation into "relational database management system", "manage", "relation" and "is", where the core words "relational database management system", "manage" and "relation" are the subject, the predicate and the object, respectively, and "manage" and "relation" need to be joined by means of an auxiliary word.
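One possible way to hold the text features listed above is sketched below in Python. The `CoreWord` record, its field names and the helper `word_order` are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class CoreWord:
    text: str
    category: str        # e.g. "proper noun", "action", "time"
    part_of_speech: str  # e.g. "noun", "verb"
    position: int        # index of the word in the segmented stem
    depends_on: list = field(default_factory=list)  # texts of related core words


def word_order(core_words):
    """Order core words by their position in the stem information."""
    return [w.text for w in sorted(core_words, key=lambda w: w.position)]


# Hypothetical feature values for the stem of fig. 2.
words = [
    CoreWord("relation", "concept", "noun", 2, depends_on=["manage"]),
    CoreWord("relational database management system", "proper noun", "noun", 0),
    CoreWord("manage", "action", "verb", 1),
]
print(word_order(words))
```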
By analyzing the text features of the core words and the syntactic features of the stem information, the associated words corresponding to the core words can be obtained, which provides more words related to the core words for recomposing sentences. After the associated words are obtained, the core words and the associated words are screened and combined, so that a larger number of second test questions with more accurate semantic expression can be obtained while the knowledge points stay the same.
It should be noted that "conforming to a grammar structure" above means forming a complete sentence that follows conventional, general grammar.
In some embodiments, in step S31, determining the associated word corresponding to the core word based on the text feature of the core word and the syntactic feature of the stem information includes:
Step S311: determining the semantics expressed by the core word in the stem information based on the text characteristics of the core word and the syntactic characteristics of the stem information;
Step S312: based on the semantics of the core word expressed in the stem information, carrying out semantic expansion on the core word to obtain a first associated word; and/or
Step S313: based on the text features of the core word and the syntactic features of the stem information, determining context-related words corresponding to the core word in the stem information, and carrying out semantic expansion on the context-related words to obtain second associated words.
Steps S311 and S312 expand a core word into its expanded semantic words. Step S311 accurately determines the semantics that the core word expresses in the stem information, which prevents semantic misunderstanding; in particular, it allows accurate semantic recognition when the same core word expresses different semantics in different semantic environments. For example, "relational" in "relational database management system" acts as a modifier of "database management system" and specifies the type of the system, whereas "relation" in "the managed relation" is a noun. The semantics expressed by each core word therefore need to be accurately identified in order to obtain more accurate first associated words. Once the semantics expressed by a core word are accurately determined, the core word is expanded to obtain more words for generating second test questions, so that more second test questions can be generated.
For example, in fig. 2, the stem information is "the relation managed by a relational database management system is". The parts of speech of the core words "relational database management system", "manage" and "relation" are noun, verb and noun, respectively, and the syntactic feature of the stem information is a standard subject-predicate structure. "Relational database management system" is therefore a proper noun that can be used independently, so it can be directly expanded with a synonym to obtain its English abbreviation "RDBMS"; alternatively, semantic reasoning can be applied to "relational database management system" to infer that what such a system manages is relational data, yielding the expanded semantic word "relational data" shown in fig. 3. As another example, "the managed relation" in the stem information is a verb-object relationship, and, based on "manage" and "relation" or the dependency relationship between them, it can be determined that what follows "is" should be a data type. The expanded semantic word "store" shown in fig. 3 can thus be obtained from the two core words "manage" and "relation" together with the syntactic features of the stem information. The screened knowledge points, core words and first associated words are then recombined according to common grammatical or idiomatic expressions to obtain a second test question.
For example, "store" is a verb, and the expressions that commonly follow it are "stored in" or "stored as", whereas "stored is" is generally not used. "Stored in" is typically followed by a noun denoting a place, while "stored as" may be followed by a data type, so "stored as" is chosen for the second test question. The word "manage" used in the first test question may continue to be used. The core word "relational database management system" may be used multiple times, and when it is used again it may be abbreviated as "system". Since in this embodiment the knowledge point and one of the core words are the same, the two may be merged. Finally, the stem information of the first test question, "the relation managed by a relational database management system is", is regenerated as "in an RDBMS, the relations managed by the system are stored as".
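The expansion of step S312 can be pictured as a lookup keyed by a core word together with the sense determined for it in step S311. The Python sketch below is only an illustration under that assumption: the table `EXPANSIONS`, the sense labels and the function `expand_core_word` are hypothetical, and the entries merely mirror the examples given above (the source derives "store" from "manage" and "relation" jointly; the table simplifies this to a single key).

```python
# Hypothetical expansion table keyed by (core word, sense in the stem).
EXPANSIONS = {
    ("relational database management system", "proper noun"): ["RDBMS", "relational data"],
    ("manage", "verb"): ["store"],
}


def expand_core_word(word, sense, table=EXPANSIONS):
    """Step S312 sketch: return the first associated words for a core word in a given sense."""
    # Unknown (word, sense) pairs simply yield no associated words.
    return list(table.get((word, sense), []))


print(expand_core_word("relational database management system", "proper noun"))
print(expand_core_word("manage", "verb"))
```

In practice the table would be replaced by whatever synonym resource or semantic reasoning component an implementation uses; the sketch only shows that the sense, not the bare word, selects the expansion.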
In some embodiments, when the question information further includes information of other information categories, step S31 further includes:
Determining the associated words corresponding to the core word based on the text features of the core word, the syntactic features of the stem information, and the other information in the question information besides the stem information.
Specifically, based on the candidate answers of the single-choice question shown in fig. 2, it can be determined that a file type should follow "is" in the stem information "the relation managed by a relational database management system is". Therefore, once the expanded semantic word "store" is determined, it can be directly determined that "stored as" is used in the second test question, which improves the efficiency of test question generation.
Prior to step S311, the method further comprises:
S3101: judging whether the semantics expressed by the core word are single;
S3102: if so, performing no semantic expansion on the core word; if not, extracting the core word whose expressed semantics are not single and executing step S311.
When the semantics expressed by a core word in the stem information are single, the core word can be determined to be an independent word whose meaning is unique in any semantic environment, so no expansion is needed, which improves the efficiency of text analysis.
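Steps S3101 and S3102 amount to a filter applied before step S311. A minimal Python sketch follows, assuming a hypothetical sense inventory `SENSES` that lists the possible senses of each word; words missing from the inventory are treated as having single semantics, which is an assumption of this sketch rather than something stated in the disclosure.

```python
# Hypothetical sense inventory: word -> senses it can express.
SENSES = {
    "relation": ["modifier", "noun"],
    "RDBMS": ["proper noun"],
}


def has_single_semantics(word, senses=SENSES):
    """Step S3101 sketch: a word with at most one listed sense is unambiguous."""
    return len(senses.get(word, [])) <= 1


def words_to_disambiguate(core_words, senses=SENSES):
    """Step S3102 sketch: keep only the core words that must go through step S311."""
    return [w for w in core_words if not has_single_semantics(w, senses)]


print(words_to_disambiguate(["RDBMS", "relation", "manage"]))  # ['relation']
```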
After the expanded semantic word of the core word is obtained, the knowledge points, the core word and the expanded semantic word can be input into a test question generation model containing a preset grammar structure to generate a second test question.
In the process of generating test questions, a core word and its expanded semantic words are interchangeable, and each may be used multiple times when permuting and combining words into sentences, as actually needed; the specific manner of use is not particularly limited in this embodiment.
In a specific implementation, different test questions (including the first test question and the generated second test questions) use different core words and expanded semantic words as far as possible, so that the test questions are expressed more richly and cheating is effectively prevented.
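The interchange of core words and their expanded semantic words can be sketched as filling the slots of a sentence template with every combination of alternatives. The template string, the slot names and the function `generate_variants` below are hypothetical and stand in for the preset grammar structure; this is an illustrative sketch, not the test question generation model of this embodiment.

```python
from itertools import product


def generate_variants(template, slots):
    """Fill each slot of the template with every combination of its alternatives."""
    names = list(slots)
    variants = []
    for combo in product(*(slots[name] for name in names)):
        variants.append(template.format(**dict(zip(names, combo))))
    return variants


# Hypothetical template and alternatives based on the example of fig. 2 and fig. 3.
template = "In {system}, relations are {verb} as"
slots = {
    "system": ["a relational database management system", "an RDBMS"],
    "verb": ["managed", "stored"],
}
for stem in generate_variants(template, slots):
    print(stem)
```

Assigning a different variant to each test question is one simple way to make different test questions use different core words and expanded semantic words.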
Step S313 determines the context-related words of a core word and expands them. If the second test question is generated only from the knowledge points, the core words and the expanded semantic words of the core words, then, because core words are usually important words such as nouns or verbs, the generated second test question may read stiffly or unnaturally. Therefore, to make the generated second test question more natural and fluent, step S313 obtains the context-related words of the core word in the stem information; a context-related word may be a core word or a non-core word such as a pronoun or a conjunction.
After the context-related words are determined, they are classified: core words form one class and non-core words form another. The core words among the context-related words can be expanded as described in steps S311 and S312, which is not repeated here. The non-core words can likewise be expanded to obtain their expanded semantic words. The context-related words and their expanded semantic words are then taken as second associated words and combined with the knowledge points, the entity information and/or the first associated words to obtain second test questions that are semantically accurate and read more naturally and fluently. Obtaining the context-related words and their expanded semantic words also allows more second test questions to be generated, with richer semantic expression.
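The classification of context-related words can be sketched as a partition by part of speech. The sketch below assumes that each context-related word arrives with a part-of-speech tag and that nouns and verbs count as core words; the set `CORE_POS`, the tag names and the function name are assumptions for illustration only.

```python
# Assumption: nouns and verbs are treated as core words; everything else is non-core.
CORE_POS = {"noun", "verb"}


def classify_context_words(tagged_words, core_pos=CORE_POS):
    """Split (word, part_of_speech) pairs into core and non-core context-related words."""
    core, non_core = [], []
    for word, pos in tagged_words:
        (core if pos in core_pos else non_core).append(word)
    return core, non_core


# Hypothetical tagged context-related words.
tagged = [("system", "noun"), ("it", "pronoun"), ("store", "verb"), ("and", "conjunction")]
core, non_core = classify_context_words(tagged)
print(core, non_core)
```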
In some embodiments, as shown in fig. 6, the method further comprises:
S4: determining a difficulty coefficient of the first test question;
S33: generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient.
Students understand different stem information with different ease. When the stem information is expressed fluently, students can easily grasp its meaning and respond quickly. When the stem information is expressed awkwardly, students must first analyze it to work out its meaning, which lengthens the answering time and easily leads to misunderstanding of the stem; it then becomes difficult to accurately assess the students' understanding and mastery of the knowledge points, and the fairness of the examination may be affected. Therefore, by determining the difficulty coefficient of the first test question and then generating, based on the knowledge points and the entity information, second test questions whose difficulty coefficient is the same as or similar to that of the first test question, the students' understanding and mastery of the knowledge points can be accurately assessed and the fairness of the examination can be ensured.
In a specific implementation, a difficulty coefficient model can be obtained by pre-training on a sufficient number of first and second test questions together with the answering results for them. After the difficulty coefficient of the first test question is obtained from the difficulty coefficient model, the knowledge points, the entity information and the difficulty coefficient are input into a preset text generation model containing the difficulty coefficient, yielding at least one second test question with the same difficulty coefficient as the first test question.
The test question generation method described above mainly concerns generating the stem information of the second test question. When the question information of the first test question contains information of other information categories besides the stem information, the information of those categories can be generated for the second test question at the same time. For example, for a multiple-choice question, candidate answers can be generated together with the stem information; the candidate answers are generated in a manner similar to the stem information, which is not repeated here.
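The disclosure does not specify the form of the difficulty coefficient model. As a stand-in, the Python sketch below uses the classical empirical measure, namely the proportion of incorrect answers, and then keeps candidate second test questions whose coefficient lies within a tolerance of the target. Both the measure and the `tolerance` parameter are assumptions of this sketch, not the trained model described above.

```python
def empirical_difficulty(responses):
    """Difficulty as the share of incorrect answers (True means answered correctly)."""
    if not responses:
        raise ValueError("at least one response is required")
    return 1.0 - sum(responses) / len(responses)


def select_similar(candidates, target, tolerance=0.1):
    """Keep candidate questions whose difficulty is within tolerance of the target."""
    return [question for question, difficulty in candidates
            if abs(difficulty - target) <= tolerance]


# Hypothetical answering results and candidate second test questions.
target = empirical_difficulty([True, True, False, False])
candidates = [("variant 1", 0.45), ("variant 2", 0.8)]
print(target, select_similar(candidates, target))
```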
Fig. 7 shows a schematic structural diagram of a test question generating apparatus according to an embodiment of the present disclosure. As shown in fig. 7, an embodiment of the present disclosure provides a test question generating apparatus, including:
the determining module 10 is configured to perform first natural language processing on the question information of the first test question, and determine the knowledge points examined in the first test question;
The acquiring module 20 is configured to perform second natural language processing on the stem information of the first test question, and acquire entity information of the first test question; the entity information is a core word contained in the first test question;
The generating module 30 is configured to generate at least one second test question based on the knowledge points and the entity information, wherein the second test question is the same as the first test question in question type.
The test question generation device provided by the embodiment of the present disclosure uses natural language processing to determine, from the question information of the first test question, the knowledge points examined by the first test question, extracts the entity information of the first test question from its stem information, and then regenerates at least one second test question from the knowledge points and the entity information. Different stem descriptions can thus be generated while the examined knowledge points remain unchanged, so that the students' understanding and mastery of the knowledge points can be fully examined. Meanwhile, because different test questions use different stem descriptions, copying among students is effectively hindered, cheating in examinations is effectively prevented, and the fairness of the examination results is better guaranteed.
The test question generating device provided in the embodiment of the present disclosure corresponds to the test question generating method in the foregoing embodiment, and based on the test question generating method, those skilled in the art can understand the specific implementation manner of the test question generating device in the embodiment of the present disclosure and various variations thereof, and any optional item in the embodiment of the test question generating method is also suitable for the test question generating device, which is not repeated herein.
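The cooperation of the three modules can be sketched as a small Python class whose fields stand for the determining module 10, the acquiring module 20 and the generating module 30. The class name, the field names and the stub functions are hypothetical; a real device would plug in the natural language processing and generation components described in the method embodiments.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QuestionGenerationDevice:
    determine: Callable[[str], List[str]]               # module 10: knowledge points
    acquire: Callable[[str], List[str]]                 # module 20: entity information
    generate: Callable[[List[str], List[str]], List[str]]  # module 30: second test questions

    def run(self, question_info: str, stem_info: str) -> List[str]:
        knowledge_points = self.determine(question_info)
        entities = self.acquire(stem_info)
        return self.generate(knowledge_points, entities)


# Stub modules used only to show the data flow.
device = QuestionGenerationDevice(
    determine=lambda question_info: ["relational model"],
    acquire=lambda stem_info: ["RDBMS", "relation"],
    generate=lambda kps, ents: [f"[{kps[0]}] question using {', '.join(ents)}"],
)
print(device.run("full question text", "stem text"))
```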
The embodiment of the disclosure also provides an electronic device including a processor and a memory, wherein the memory is used for storing computer-executable instructions, and the processor implements the test question generation method described above when executing the computer-executable instructions.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
The embodiment of the disclosure also provides a computer-readable storage medium, on which computer-executable instructions are stored, which when executed by a processor, implement the test question generation method described above.
The above embodiments are merely exemplary embodiments of the present disclosure, which are not intended to limit the present disclosure, the scope of which is defined by the claims. Various modifications and equivalent arrangements of parts may be made by those skilled in the art, which modifications and equivalents are intended to be within the spirit and scope of the present disclosure.
Claims (8)
1. A test question generation method, comprising:
Performing first natural language processing on the question information of a first test question, and determining the knowledge points examined in the first test question and the difficulty coefficient of the first test question;
Performing second natural language processing on the stem information of the first test question to acquire entity information of the first test question; the entity information is a core word contained in the first test question;
generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient, including: determining related words corresponding to the core words based on the text features of the core words and the syntactic features of the stem information; generating stem information of at least one second test question conforming to a grammar structure based on the knowledge points, the core words and the related words; inputting the knowledge points, the entity information and the difficulty coefficient into a preset text generation model containing the difficulty coefficient to obtain at least one second test question with the same difficulty coefficient as the first test question; the second test questions are the same as the first test questions in question type.
2. The method of claim 1, wherein performing the first natural language processing on the question information of the first test question and determining the knowledge points examined in the first test question comprises:
Determining keywords of the question information;
Comparing the keywords with knowledge points of a preset category, and determining semantic similarity of the keywords and the knowledge points of the preset category;
and when the semantic similarity between the keyword and the knowledge points of the preset category is larger than a preset semantic threshold, determining the knowledge points of the preset category with the maximum semantic similarity as the knowledge points examined in the first test question.
3. The method of claim 2, wherein determining keywords of the question information comprises:
Performing word segmentation on the question information of the first test question to determine candidate keywords;
Determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as the keywords of the question information.
4. The method of claim 2, wherein determining keywords of the question information comprises:
Determining the question type of the first test question;
Determining the information categories contained in the question information of the first test question based on the question type of the first test question;
Determining the keywords of the question information according to the information categories.
5. The method of claim 4, wherein when the first test question is a multiple-choice question, the question information of the first test question includes question stem information and candidate answer information, and determining the keywords of the question information according to the information category comprises:
Determining the keywords in the question stem information and in the candidate answer information of the first test question respectively, and determining the keywords of the question stem information that match keywords of the candidate answer information as the keywords of the first test question; or
When the first test question is a short-answer question or an analysis question, the question information of the first test question includes question stem information and query information, and determining the keywords of the question information according to the information category comprises:
Determining the keywords in the question stem information and in the query information of the first test question respectively, and determining the keywords of the question stem information that match keywords of the query information as the keywords of the first test question.
6. The method of claim 1, wherein determining the associated word corresponding to the core word based on the text feature of the core word and the syntactic feature of the stem information comprises:
Determining the semantics expressed by the core word in the stem information based on the text characteristics of the core word and the syntactic characteristics of the stem information;
based on the semantics of the core word expressed in the stem information, carrying out semantic expansion on the core word to obtain a first associated word; and/or
Based on the text features of the core words and the syntactic features of the stem information, determining context related words corresponding to the core words in the stem information, and carrying out semantic expansion on the context related words to obtain second related words.
7. The method of claim 1, wherein the text characteristics of the core word include at least one of a category of the core word, a part of speech, a location in the stem information, and a dependency relationship with other core words.
8. A test question generation device includes:
the determining module is configured to perform first natural language processing on the question information of the first test questions and determine knowledge points examined in the first test questions and difficulty coefficients of the first test questions;
the acquisition module is configured to perform second natural language processing on the stem information of the first test question and acquire entity information of the first test question; the entity information is a core word contained in the first test question;
The generation module is configured to generate at least one second test question based on the knowledge points, the entity information and the difficulty coefficient, and comprises the following steps: determining related words corresponding to the core words based on the text features of the core words and the syntactic features of the stem information; generating stem information of at least one second test question conforming to a grammar structure based on the knowledge points, the core words and the related words; inputting the knowledge points, the entity information and the difficulty coefficient into a preset text generation model containing the difficulty coefficient to obtain at least one second test question with the same difficulty coefficient as the first test question; the second test questions are the same as the first test questions in question type.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110185141.5A CN112800182B (en) | 2021-02-10 | 2021-02-10 | Test question generation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112800182A (en) | 2021-05-14 |
CN112800182B (en) | 2024-11-26 |
Family
ID=75815087
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110185141.5A Active CN112800182B (en) | 2021-02-10 | 2021-02-10 | Test question generation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112800182B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113505195A (en) * | 2021-06-24 | 2021-10-15 | 作业帮教育科技(北京)有限公司 | Knowledge base, construction method and retrieval method thereof, and question setting method and system based on knowledge base |
CN114037571A (en) * | 2021-10-27 | 2022-02-11 | 南京谦萃智能科技服务有限公司 | Test question expansion method and related device, electronic equipment and storage medium |
CN114201613B (en) * | 2021-11-30 | 2022-10-21 | 北京百度网讯科技有限公司 | Test question generation method, test question generation device, electronic device, and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201349159A (en) * | 2012-05-31 | 2013-12-01 | Han Lin Publishing Co Ltd | Method for generating learning test questions and system thereof |
CN109359290A (en) * | 2018-08-20 | 2019-02-19 | 国政通科技有限公司 | The knowledge point of examination question text determines method, electronic equipment and storage medium |
CN112101017A (en) * | 2020-04-02 | 2020-12-18 | 上海迷因网络科技有限公司 | Method for generating questions for rapid expressive force test |
CN112287659A (en) * | 2019-07-15 | 2021-01-29 | 北京字节跳动网络技术有限公司 | Information generation method and device, electronic equipment and storage medium |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106409041B (en) * | 2016-11-22 | 2020-05-19 | 深圳市鹰硕技术有限公司 | Method and system for generating blank question and judging paper |
CN107273490B (en) * | 2017-06-14 | 2020-04-17 | 北京工业大学 | Combined wrong question recommendation method based on knowledge graph |
CN108334493B (en) * | 2018-01-07 | 2021-04-09 | 深圳前海易维教育科技有限公司 | Question knowledge point automatic extraction method based on neural network |
CN110659352B (en) * | 2019-10-10 | 2023-06-13 | 浙江蓝鸽科技有限公司 | Test question examination point identification method and system |
CN111311459B (en) * | 2020-03-16 | 2023-09-26 | 宋继华 | Interactive question-setting method and system for international Chinese teaching |
CN111815274A (en) * | 2020-07-03 | 2020-10-23 | 北京字节跳动网络技术有限公司 | Information processing method and device and electronic equipment |
CN112069295B (en) * | 2020-09-18 | 2022-12-06 | 科大讯飞股份有限公司 | Similar question recommendation method and device, electronic equipment and storage medium |
CN112164261A (en) * | 2020-09-24 | 2021-01-01 | 浙江太学科技集团有限公司 | Intelligent assessment method |
Also Published As
Publication number | Publication date |
---|---|
CN112800182A (en) | 2021-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10339168B2 (en) | System and method for generating full questions from natural language queries | |
US10339453B2 (en) | Automatically generating test/training questions and answers through pattern based analysis and natural language processing techniques on the given corpus for quick domain adaptation | |
JP4654745B2 (en) | Question answering system, data retrieval method, and computer program | |
US10303767B2 (en) | System and method for supplementing a question answering system with mixed-language source documents | |
US20210149936A1 (en) | System and method for generating improved search queries from natural language questions | |
CN112800182B (en) | Test question generation method and device | |
US9720962B2 (en) | Answering superlative questions with a question and answer system | |
US10303766B2 (en) | System and method for supplementing a question answering system with mixed-language source documents | |
CN109271524B (en) | Entity Linking Method in Knowledge Base Question Answering System | |
US20180075135A1 (en) | System and method for generating full questions from natural language queries | |
JP2011118689A (en) | Retrieval method and system | |
Abidin et al. | Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review | |
Ilyas et al. | Plagiarism detection using natural language processing techniques | |
US8577924B2 (en) | Determining base attributes for terms | |
Malhar et al. | Deep learning based answering questions using t5 and structured question generation system’ | |
Vysotska | Linguistic intellectual analysis methods for Ukrainian textual content processing | |
CN113505889A (en) | Processing method and device of atlas knowledge base, computer equipment and storage medium | |
Cowen-Breen et al. | Logion: Machine-learning based detection and correction of textual errors in greek philology | |
Elema | Developing Amharic Question Answering Model Over Unstructured Data Source Using Deep Learning Approach | |
Arnfield | Enhanced Content-Based Fake News Detection Methods with Context-Labeled News Sources | |
Thenmozhi et al. | An open information extraction for question answering system | |
CN119558280A (en) | A fact-enhanced decoding method for large language models based on part-of-speech judgment | |
Cheatham | The properties of property alignment on the semantic web | |
Yergesh et al. | A System for Classifying Kazakh Language Documents: Morphological Analysis and Automatic Keyword Identification | |
Pimentel et al. | First steps towards improving official statistics data accessibility in Mexico: Query expansion with neural networks and ad-hoc space vectors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |