CN112559747A - Event classification processing method and device, electronic equipment and storage medium

Info

Publication number: CN112559747A (granted as CN112559747B)
Application number: CN202011484245.8A
Authority: CN (China)
Prior art keywords: event, sample, text, candidate, vector
Legal status: Granted; active
Other languages: Chinese (zh)
Other versions: CN112559747B (en)
Inventor: 黄佳艳
Assignee (current and original): Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd; priority to CN202011484245.8A

Classifications

    • G06F 16/35: Information retrieval; unstructured textual data; clustering; classification
    • G06F 16/3334: Information retrieval; querying; query processing; query translation; selection or weighting of terms from queries, including natural language queries
    • G06F 16/335: Information retrieval; querying; filtering based on additional data, e.g. user or group profiles
    • G06F 40/30: Handling natural language data; semantic analysis
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods


Abstract

The application discloses an event classification processing method and device, an electronic device, and a storage medium, and relates to deep learning, knowledge graph, and natural language processing technologies in the field of artificial intelligence. The specific implementation scheme is as follows: acquiring a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type; acquiring a character vector corresponding to each sample event text; performing semantic analysis on each sample event text to label role entities, and acquiring a word vector corresponding to each role entity; and taking the character vector corresponding to each sample event text and the word vector corresponding to each role entity as input information of a preset neural network model, and taking the event type corresponding to the sample event set to which each sample event belongs as output information of the neural network model, thereby training the neural network model to classify events.

Description

Event classification processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the technical field of deep learning, knowledge graph and natural language processing in the technical field of artificial intelligence, and in particular, to an event classification processing method and apparatus, an electronic device, and a storage medium.
Background
Generally, event extraction techniques can extract events of interest to a user from unstructured information and present them to the user in a structured form. Event classification is the basis for event extraction, and the quality of event classification determines the quality of event extraction.
In the prior art, event classification requires a predefined type system of event types; as a result, existing classification techniques can only handle event types in a specific field.
Disclosure of Invention
The application provides a method, an apparatus, a device and a storage medium for event classification processing, relating to deep learning, knowledge graph and natural language processing technologies in the field of artificial intelligence, and provides a technical scheme capable of handling the classification of open-domain events.
According to a first aspect of the present application, there is provided an event classification processing method, including:
acquiring a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type;
acquiring a character vector corresponding to each sample event text;
performing semantic analysis on each sample event text to label role entities, and acquiring word vectors corresponding to each role entity;
and taking the character vector corresponding to each sample event text and the word vector corresponding to the role entity as input information of a preset neural network model, and taking an event type corresponding to a sample event set to which each sample event belongs as output information of the neural network model, so as to train the neural network model to classify events.
According to a second aspect of the present application, there is provided an event classification processing apparatus including:
a first acquisition module, configured to acquire a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type;
the first processing module is used for acquiring a character vector corresponding to each sample event text;
the second processing module is used for carrying out semantic analysis on each sample event text and marking role entities to obtain word vectors corresponding to each role entity;
and the training module is used for taking the character vector corresponding to each sample event text and the word vector corresponding to the role entity as input information of a preset neural network model, taking an event type corresponding to a sample event set to which each sample event belongs as output information of the neural network model, and further training the neural network model to classify events.
According to a third aspect of the present application, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of event classification processing according to an aspect of the present application.
According to a fourth aspect of the present application, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the event classification processing method according to the first aspect of the present application.
According to a fifth aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the event classification processing method according to the first aspect.
According to the technical scheme of the application, an event classification processing method is provided that does not require constructing an event type system in advance and generates more complete event type information.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flow diagram of a method of event classification processing according to one embodiment of the present application;
FIG. 2 is a flow diagram of a method of obtaining a plurality of sample event sets belonging to different event types according to another embodiment of the present application;
FIG. 3 is a flow diagram of a method of generating a vector of an input decoding layer according to yet another embodiment of the present application;
FIG. 4 is a model structure diagram of a neural network model according to an embodiment of the present application;
FIG. 5 is a block diagram of an event classification processing apparatus according to an embodiment of the present application;
FIG. 6 is a block diagram of an event classification processing apparatus according to another embodiment of the present application;
FIG. 7 is a block diagram of an event classification processing apparatus according to yet another embodiment of the present application;
FIG. 8 is a block diagram of an electronic device for implementing the event classification process of an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The application provides an event classification processing method, and the event classification technical scheme of the processing method can be applied to an open domain. When the model is trained, the input information of the model comprises word vectors corresponding to the role entities, the addition of the word vectors can enable the model to extract events more effectively, and the extracted event information is more complete.
Fig. 1 is a flowchart of an event classification processing method according to an embodiment of the present application. It should be noted that the event classification processing method according to the embodiment of the present application can be applied to the event classification processing device according to the embodiment of the present application, and the event classification device can be configured on the electronic device according to the embodiment of the present application. As shown in fig. 1, the event classification processing method may include:
step 101, obtaining a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type.
Generally, a knowledge graph can be composed of internally related events and the relations among them. The same semantics can have multiple expression modes, and the technology for extracting the corresponding semantics from these multiple expressions is event extraction, of which event classification is a key step. In the prior art, event classification can only be applied to a specific field, not to the open domain. To adapt event extraction to wider application scenarios, the event classification model would need to be trained with sample event sets covering a much wider range, a process that consumes a lot of manpower. The application provides an event classification processing method that is not limited to a specific field.
In some embodiments of the present application, first, a plurality of sample event sets belonging to different event types are obtained according to a preset algorithm, where it is to be noted that the preset algorithm may be an unsupervised clustering algorithm in the machine learning field, and compared with a supervised learning algorithm, the unsupervised clustering algorithm does not need to mark data, so that an event classification system does not need to be established in advance. The unsupervised clustering algorithm includes but is not limited to: any one of K-means, single pass and hierarchical clustering algorithm.
The event types obtained by the unsupervised clustering algorithm can be coarse-grained or fine-grained; coarse and fine granularity are relative concepts, with fine-grained classification being more detailed than coarse-grained classification. For example, coarse-grained classifications include, but are not limited to, any one or more of life, current affairs, and filming. When the coarse-grained classification is filming, the corresponding fine-grained classifications include, but are not limited to, any one or more of a movie and a TV series. When finer-grained model output is desired, a fine-grained classification method can be adopted; when coarse-grained output is desired, a coarse-grained classification method may be employed.
It can be understood that a plurality of sample event sets can be obtained through unsupervised clustering; sample event texts in the same sample event set belong to the same event type, while sample event texts in different sample event sets do not. A sample event set may contain multiple sample event texts, whose sources can vary, including but not limited to: news websites, crawled trending websites, and forum posting records. A sample event text may also be a portion taken from a whole document, including but not limited to any one or more of the abstract, the title, and the body text. The sample event text may be in a variety of languages, including but not limited to any one or more of Chinese, English, and Japanese.
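As an illustration of this clustering step, the following is a minimal sketch using K-means over TF-IDF character features via scikit-learn; the example texts, feature settings, and cluster count are illustrative assumptions, not parameters specified by this application.

```python
# A minimal sketch of grouping event texts into sample event sets with
# unsupervised K-means clustering; all inputs here are illustrative.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "the campus football match came to a close",
    "the football league final ended yesterday",
    "a new TV drama premiered last night",
    "the movie opened in theaters nationwide",
]

# Character n-grams avoid language-specific tokenization (useful for Chinese).
features = TfidfVectorizer(analyzer="char", ngram_range=(1, 2)).fit_transform(texts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Texts sharing a cluster label form one sample event set.
sample_event_sets = {}
for text, label in zip(texts, labels):
    sample_event_sets.setdefault(int(label), []).append(text)
```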
It is to be understood that this step merely refers to obtaining a plurality of sample event sets belonging to different event types, and does not require labeling of sample event sets of different event types.
And 102, acquiring a character vector corresponding to each sample event text.
It is understood that after the sample event text is acquired, it needs to be processed by deep learning techniques; therefore, the text information must first be converted into data that a computer can process.
In some embodiments of the present application, the sample event text needs to be split into characters and/or words, which serve as input information of the deep learning model; as a conversion means, the characters and/or words in the sample event text can be converted into corresponding character vectors and/or word vectors, which may then be used as input information for the model.
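A minimal sketch of this conversion, assuming a learned embedding table in PyTorch; the vocabulary and embedding dimension are illustrative assumptions.

```python
# A sketch of converting a sample event text into character vectors through
# an embedding lookup; vocabulary and dimensions are illustrative.
import torch
import torch.nn as nn

text = "football match ended"
char_vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}

char_embedding = nn.Embedding(num_embeddings=len(char_vocab), embedding_dim=8)
char_ids = torch.tensor([char_vocab[ch] for ch in text])
char_vectors = char_embedding(char_ids)  # shape: (len(text), 8)
```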
And 103, performing semantic analysis on each sample event text to label the role entities, and acquiring word vectors corresponding to each role entity.
It is understood that, similar to character vectors, word vectors in the sample event text can also be used as input information of the model, and a word vector can carry more information than a character vector; generally speaking, this information can be a role entity. Role entities can be noun words in a sentence; through semantic analysis, nouns can be assigned different role entities according to their different relations with the predicate. Common role entities include, but are not limited to, one or more of the agent, the patient, and the recipient. The agent is typically the subject of a sentence; the patient is typically the object; the recipient may be a passive participant in the event initiated by the agent.
In some embodiments of the present application, the role entity may further include a trigger. In the prior art, a trigger word may be used as a core, and events are classified by combining with context features of the trigger word, for example: the trigger word of 'I eat' is 'eat'. And the trigger words are added into the role entities, so that the training of the model can be more efficient, and the event classification is more accurate.
In some embodiments of the present application, before the semantic analysis, word segmentation may be performed on each sample event text; word segmentation is a common technique in the natural language processing field. The segmented sample event text is then semantically analyzed to label the role entities. The word corresponding to each role entity has a corresponding word vector, which is what this step acquires.
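The sketch below shows how word vectors for labeled role entities might be collected; the semantic-role-labeling output is hand-written for illustration, since the application does not name a specific SRL tool.

```python
# A sketch of obtaining word vectors for role entities; the (word, role)
# pairs stand in for the output of a real semantic-role-labeling step.
import torch
import torch.nn as nn

labeled_roles = [("team", "agent"), ("match", "patient"), ("won", "trigger")]

word_vocab = {word: idx for idx, (word, _) in enumerate(labeled_roles)}
word_embedding = nn.Embedding(num_embeddings=len(word_vocab), embedding_dim=8)

# One word vector per labeled role entity.
role_word_vectors = {
    word: word_embedding(torch.tensor(word_vocab[word])) for word, _ in labeled_roles
}
```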
Understandably, adding the word vectors corresponding to role entities allows the model to perform finer-grained, more detailed event classification, which is more beneficial to event extraction and event-logic reasoning.
And step 104, taking the character vector corresponding to each sample event text and the word vector corresponding to the role entity as input information of a preset neural network model, and taking the event type corresponding to the sample event set to which each sample event belongs as output information of the neural network model, thereby training the neural network model to classify the events.
It can be understood that the character vector corresponding to the sample event text and the word vector corresponding to the role entity may be used together as input information to be input into the preset neural network model. And the event type corresponding to the sample event set to which the sample event belongs is the output information of the neural network. The preset neural network model is a complex network system formed by a large number of simple processing units which are widely connected with each other, reflects many basic characteristics of human brain functions, and is a highly complex nonlinear dynamical learning system.
In some embodiments of the present application, the neural network model may be composed of different topologies, including but not limited to a Sequence to Sequence (seq2seq) model, an Attention model, and a generative adversarial network model. Training the neural network model with the input information and the output information enables the network to perform event classification.
In some embodiments of the present application, the word vector input into the neural network model may further form a key-value pair structure with its corresponding role entity, and the key-value pair may be converted into a set of key-value pair vectors. The key-value pair vector corresponding to the sample event text and the character vector corresponding to the sample event text can be used as input information to be input into the preset neural network.
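A minimal sketch of the key-value pair structure just described, assuming separate embeddings for roles (keys) and words (values) joined by concatenation; names and dimensions are illustrative.

```python
# A sketch of turning (role entity, word) key-value pairs into key-value
# pair vectors; roles, words, and dimensions are illustrative.
import torch
import torch.nn as nn

roles = ["agent", "patient", "trigger"]  # keys
words = ["team", "match", "won"]         # values, one word per role

role_emb = nn.Embedding(len(roles), 4)
word_emb = nn.Embedding(len(words), 4)

ids = torch.arange(len(roles))
kv_vectors = torch.cat([role_emb(ids), word_emb(ids)], dim=1)  # shape (3, 8)
```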
According to the event classification processing method, a plurality of sample event sets are obtained, wherein each sample event set is the same event type. Training a neural network model, wherein input information of the neural network model comprises: the character vector corresponding to the sample event text and the word vector corresponding to the role entity corresponding to the sample event text. The output information of the neural network model comprises: and the event type corresponding to the sample event set to which the sample event belongs. The trained neural network model has the capability of event classification.
Most existing event classification models are oriented to non-open domains and require a predefined event type set; however, event types are complex and various, and a predefined set covers only a small portion of them. The present method instead classifies sample events with an unsupervised learning model, labels event types on the basis of that classification, and then uses the model trained in this way to classify events in raw data. Because an unsupervised learning model is adopted and no predefined event types are used, the technical scheme of the application can be applied to the open domain. When the neural network model is trained, the input information further comprises the word vectors corresponding to the role entities; this addition lets the model extract event types more easily, and the extracted event type information is more complete and finer-grained.
In a second embodiment of the present application, based on the first embodiment, in order to make model training more efficient and less difficult, the sample event sets may be obtained from candidate event sets with a larger data volume. This can be explained with embodiment two, based on the event classification processing scheme of fig. 1. Optionally, step 101 of obtaining a plurality of sample event sets belonging to different event types may be implemented as steps 201 to 204.
To more clearly illustrate how to obtain a sample event set according to a requirement, fig. 2 may be specifically illustrated, where fig. 2 is a flowchart of a method for obtaining a plurality of sample event sets belonging to different event types according to an embodiment of the present application, and specifically includes:
step 201, obtaining candidate event texts meeting preset conditions.
Understandably, the acquisition range of the candidate event text needs to be limited through a preset condition, so that the training of the model can be more efficient.
In some embodiments of the present application, the preset condition may include, but is not limited to: any one or more of a time limit, a source limit, an acquisition mode limit. The time limit can be that the event text generated in a certain time period is a candidate event text; the source limitation can be that the event texts of some websites are candidate event texts; limitations of the manner of obtaining include, but are not limited to: any one or more of script crawling, manual screening and downloading and background database obtaining.
Step 202, performing clustering processing on the candidate event texts to generate a plurality of candidate event sets belonging to different event types, wherein each candidate event set comprises a plurality of candidate event texts belonging to the same event type.
It can be understood that the candidate event text contains a plurality of event types, and the candidate event text needs to be processed to obtain a candidate event set.
In some embodiments of the present application, the process may be a clustering process, and an unsupervised clustering model may be used to perform the clustering process on the candidate event texts, where the unsupervised clustering algorithm includes, but is not limited to: any one or more of K-means, single pass, hierarchical clustering algorithm. After the clustering process, a plurality of candidate event sets are generated, and the candidate event texts in the candidate event sets are all of the same event type.
In some embodiments of the present application, the clustering process requires similarity calculation; similarity calculation methods include, but are not limited to, a baseline method, word mover's distance, and a trained encoder. Similarity can be calculated with an event type normalization model: some event texts are randomly selected from the candidate event texts, paired two by two, and manually labeled. If the two event texts describe the same event type, the pair is marked as a positive sample; if not, it is marked as a negative sample. The paired event texts are used as the input of a binary classification model and the corresponding manually labeled positive or negative sample as its output, and the binary classification model is trained; the trained model is the event type normalization model, which can be used to calculate event type similarity.
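A hedged sketch of the event type normalization model: the application only specifies a binary classification model over paired texts, so the TF-IDF pair features and logistic regression below are illustrative stand-ins.

```python
# A sketch of the event-type normalization (binary classification) model;
# the pair features and classifier choice are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

pairs = [
    ("football match ended", "football league final over", 1),  # same type
    ("football match ended", "new TV drama premiered", 0),      # different
]

left = [a for a, _, _ in pairs]
right = [b for _, b, _ in pairs]
vec = TfidfVectorizer(analyzer="char", ngram_range=(1, 2)).fit(left + right)

# Represent each pair as the concatenation of its two text vectors.
X = np.hstack([vec.transform(left).toarray(), vec.transform(right).toarray()])
y = [label for _, _, label in pairs]

normalization_model = LogisticRegression().fit(X, y)
```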
Step 203, extracting the set characteristics of each candidate event set, and acquiring the characteristic score corresponding to each set characteristic.
Some event sets in the candidate event sets consist of small-sample data or other data unsuitable for neural network model training; generally speaking, such data can cause an overfitting problem, making the event type information output by the model inaccurate. The candidate event sets can therefore be filtered by their set features.
In some embodiments of the present application, the set features of a candidate event set comprise at least one of: the number of candidate event texts, the number of sites to which the candidate event texts belong, the heat (popularity) value of those sites, and the heat value of the candidate event texts. After the set features are obtained, the feature score corresponding to each set feature is obtained. Understandably, the larger the number of candidate event texts and of the sites to which they belong, the higher the corresponding feature score; likewise, the higher the heat value of the candidate text and of its site, the higher the corresponding feature score.
In some embodiments of the present application, the set feature includes a heat value of the event text, and in order to obtain a feature score corresponding to the heat value of the event text, the following operations may be performed:
and calculating the character similarity between each candidate event text in the candidate event set and the text in the preset database, and determining the text heat matched with the database text in the candidate event set according to the character similarity.
It can be understood that the text in the preset database may correspond to one hot value related data, and the hot value corresponding to the candidate event text may be obtained by obtaining the text in the corresponding preset database according to the candidate event text.
As an example, the candidate event text may be subjected to word segmentation to obtain a plurality of words, and character similarity between the obtained plurality of words and the text in the preset database is calculated, where the calculation method of the character similarity includes, but is not limited to, simhash or TF-IDF (term frequency-inverse document frequency index). The text in the pre-set database may be a keyword that may be used to describe part of the characteristics of a common event type. The text in the preset database may also be a query word input in the retrieval system, and generally, the query word is called: and (5) query. And comparing the character similarity with a preset threshold, if the character similarity is greater than the threshold, retaining the database text, and if the character similarity is less than the threshold, not retaining the database text.
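A minimal sketch of the TF-IDF character-similarity filter just described; the texts and the threshold value are illustrative assumptions.

```python
# A sketch of retaining database queries whose character similarity to a
# candidate event text exceeds a preset threshold.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

candidate = "campus football match came to a close"
queries = ["football match result", "drama premiere date"]

vec = TfidfVectorizer(analyzer="char", ngram_range=(1, 2)).fit([candidate] + queries)
sims = cosine_similarity(vec.transform([candidate]), vec.transform(queries))[0]

THRESHOLD = 0.3  # illustrative; the application presets this empirically
retained_queries = [q for q, s in zip(queries, sims) if s > THRESHOLD]
```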
When the database texts are queries, each query has a corresponding pv (page view) value; the pv values of the multiple queries matched to one candidate event text are mathematically processed to obtain the heat value of that candidate event text. The mathematical processing includes, but is not limited to, any one of accumulation and averaging.
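For example, averaging the pv values of the matched queries (accumulation would use sum instead); the query data below is illustrative.

```python
# A sketch of deriving a candidate event text's heat value from the pv
# values of its matched queries; the numbers are illustrative.
matched_query_pv = {"football match result": 1200, "match final score": 800}

heat_value = sum(matched_query_pv.values()) / len(matched_query_pv)  # 1000.0
```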
A heat model can also be preset; the text heat is processed according to this preset heat model to generate the heat value of the candidate event text.
It can be understood that matching between the database text obtained after the character similarity processing and the candidate event text is not accurate, and the database text can be screened through a hot model to obtain a hot value of the candidate event text.
As an example, the retained database texts are recorded as candidate database texts, which are further screened by a heat model, where the heat model is a binary classification model. The training method of the heat model is as follows: first, the participles obtained by word segmentation of the event texts and the candidate database texts corresponding to those participles are acquired; some sample database texts are randomly selected from the candidate database texts, each paired with its corresponding participle, and every pair is manually labeled. If the semantics of the participle are similar to those of the sample database text, the pair is marked as a positive sample; if not, it is marked as a negative sample. The binary classification model is trained with the sample database text and its corresponding participle as input and the manually labeled positive or negative sample as output; the trained model is the heat model. The heat model screens the candidate database texts: those judged to be positive samples are retained, and the retained candidate database texts are the basis for generating the heat value of the candidate event text.
It can be understood that one candidate event text corresponds to a plurality of screened database texts, and the popularity values of the plurality of screened database texts corresponding to one candidate event text are combined through operation, so that the popularity value of the candidate event text can be obtained.
And 204, selecting a plurality of sample event sets meeting the screening condition from a plurality of candidate event sets according to the feature score corresponding to each set feature, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type.
It is to be understood that after obtaining the feature scores, the set of candidate events may be screened according to the feature scores.
In some embodiments of the present application, an overall feature score can be computed from the feature scores corresponding to the set features by methods including, but not limited to, accumulation, weighted summation, and variance computation. An overall feature score is thus derived for each candidate event set. The sets are sorted by overall feature score, in descending or ascending order, and the top N candidate event sets are selected as the sample event sets satisfying the screening condition. It is understood that each sample event set contains a plurality of sample event texts belonging to the same event type.
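A minimal sketch of this screening step, summing per-set feature scores and keeping the top N; the scores and N are illustrative assumptions.

```python
# A sketch of ranking candidate event sets by overall feature score and
# keeping the top N as sample event sets; values are illustrative.
candidate_sets = {
    "set_a": {"num_texts": 0.9, "num_sites": 0.8, "heat": 0.7},
    "set_b": {"num_texts": 0.2, "num_sites": 0.1, "heat": 0.3},
}

N = 1
ranked = sorted(candidate_sets.items(), key=lambda kv: sum(kv[1].values()), reverse=True)
sample_event_sets = dict(ranked[:N])
```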
With the set features of this embodiment, the heat value of the candidate event text is introduced as a set feature: the higher the heat value, the more likely the event set is to be selected as training input for the model. After this set feature is added, the evaluation criteria of the set features are more comprehensive. One way to obtain the heat value of a candidate event text is to compute it from queries and their pv values; since pv values are updated periodically, the resulting heat value reflects the data that current users care about, and the event sets screened by the set features are thus more representative.
In some embodiments of the present application, the screening of the candidate event set may be completed by comparing the feature score with a preset threshold, and to complete the screening, the specific operations are as follows:
and acquiring preset weight corresponding to each set characteristic.
It will be appreciated that each set of features may be weighted differently depending on their importance. Generally, the more important the set features, the higher the weight; the less important the set feature, the lower the weight.
And calculating the set score of each candidate event set according to the feature score and the weight corresponding to each set feature.
It will be appreciated that the aggregate score may be calculated by: and processing the characteristic scores according to the characteristic weights corresponding to the set characteristics, wherein the higher the weight of the set characteristics, the greater the contribution of the set characteristics to the set scores.
As an example, a feature score corresponding to a set feature and a feature weight corresponding to the feature score may be multiplied, and then the result obtained after the multiplication operation is performed on different set features may be accumulated, where the result obtained after the accumulation operation is the set score of the candidate event set.
And comparing the set score of each candidate event set with a preset threshold, and taking the candidate event set corresponding to the set score larger than the threshold as a sample event set.
It is to be understood that after the collection scores of the candidate event sets are obtained, the candidate event sets need to be screened according to the collection scores.
As an example, the candidate event set may be filtered by using a preset threshold, where the threshold may be preset empirically, and the candidate sample event set with a set score greater than the threshold may be used as the sample event set.
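A minimal sketch of the weighted set score and threshold comparison described above; the weights and threshold are illustrative assumptions.

```python
# A sketch of computing a weighted set score for one candidate event set
# and comparing it against a preset threshold; values are illustrative.
feature_scores = {"num_texts": 0.9, "num_sites": 0.6, "heat": 0.8}
weights = {"num_texts": 0.5, "num_sites": 0.2, "heat": 0.3}

set_score = sum(feature_scores[name] * weights[name] for name in feature_scores)

THRESHOLD = 0.7
is_sample_event_set = set_score > THRESHOLD
```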
The screened sample event sets are more representative, and small-sample data is removed, thereby preventing overfitting.
According to the sample event set acquisition method, the candidate event set is obtained by clustering the candidate event texts, corresponding feature scores are acquired according to the set features of the candidate event set, and the sample event set is screened out according to the feature scores. The method can classify the events without constructing an event classification system in advance, thereby being applied to an open domain. And through the screening of the characteristic scores, the screened sample event set is more representative, so that the training of the model is more efficient.
In a third embodiment of the present application, based on the above embodiments, in order to reduce the training difficulty of the model and make the event types output by the trained model conform better to human habits, the character vector may be processed before being input into the decoding layer. To illustrate this more clearly, the third embodiment, based on the event classification processing schemes of the above embodiments, describes the input processing method. In some embodiments of the present application, before training the neural network model, steps 301 to 304 are further included.
Step 301, a first vector of input information passing through a coding layer in a neural network model is obtained.
It will be appreciated that the neural network model may be of a variety of types, including but not limited to: a baseline model, a Pointer Network.
In some embodiments of the present application, the seq2seq model can be taken as a non-limiting example of how a neural network model learns event classification. The basic seq2seq model has two parts, an encoding layer and a decoding layer, where the encoder and decoder can each be of various kinds, including but not limited to any one of CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), and LSTM (Long Short-Term Memory). The encoding layer converts the input sequence into a fixed-length vector; the decoding layer translates the fixed-length vector into an output sequence. The first vector is the vector obtained after the input information is processed by the encoding layer.
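A minimal seq2seq skeleton matching this description, assuming PyTorch LSTM encoder and decoder and illustrative dimensions; it is a sketch of the general model family, not the application's exact architecture.

```python
# A sketch of a basic seq2seq model: the encoder compresses the input
# sequence into fixed-length states, which the decoder unrolls into an
# output sequence. Dimensions and vocabulary size are illustrative.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        _, state = self.encoder(self.embed(src_ids))        # fixed-length states
        dec_out, _ = self.decoder(self.embed(tgt_ids), state)
        return self.out(dec_out)                            # per-step vocab logits

model = Seq2Seq(vocab_size=100)
logits = model(torch.randint(0, 100, (2, 10)), torch.randint(0, 100, (2, 5)))
```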
Step 302, a second vector of the character vector passing through a full connection layer in the neural network model is obtained.
It will be appreciated that a fully-connected layer is used for feature extraction, which can map the modeled distributed feature signatures into the sample labeling space.
In some embodiments of the present application, the fully connected layer may be implemented by a variety of operations. The input processed by the fully connected layer is the character vector corresponding to the sample event text, whose encoding mode includes, but is not limited to, any one of one-hot and TF-IDF (term frequency-inverse document frequency). Through the feature extraction of the fully connected layer, the model can record the character vectors corresponding to the characters appearing in the input information.
Step 303, the first vector and the second vector are cascaded to generate a third vector.
It will be appreciated that when event classification is performed, many characters of the event type output by the model may already appear in the input information; for example, inputting "the third campus football match in the northeast district came to a close" may output "football match close", and inputting "the TV drama Boxing King began airing" may output "airing". Therefore, when the decoding layer outputs the event type, if the model can be made more inclined to output characters appearing in the input text, the training difficulty of the model is reduced and the output event type is closer to the ideal one. Adding the second vector gives the model this behavior.
In some embodiments of the present application, the first vector and the second vector are concatenated; for example, concatenating vector a = [1, 2, 3] and vector b = [4, 5, 6] generates [1, 2, 3, 4, 5, 6]. The cascade operation may also be any other operation that preserves the characteristics of both vectors.
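For instance, in PyTorch the cascade step reduces to a tensor concatenation; the dimensions below are illustrative assumptions.

```python
# A sketch of cascading the encoder's first vector with the fully
# connected layer's second vector to form the third vector.
import torch

first_vector = torch.randn(1, 64)   # encoding-layer output
second_vector = torch.randn(1, 32)  # fully-connected-layer output
third_vector = torch.cat([first_vector, second_vector], dim=1)  # shape (1, 96)
```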
Step 304, inputting the third vector to a decoding layer in the neural network model.
It is understood that the input information of the decoding layer of the neural network model is the third vector.
Fig. 4 is a schematic structural diagram of a neural network model according to an embodiment of the present application, where encoder is the encoding layer, decoder is the decoding layer, one-hot is a character encoding method, FC (fully connected layer) is the full connection layer, X1,1, X1,2, ..., X1,N are the character vectors corresponding to the characters, and Y1,1, Y1,2, ..., Y1,N are the event types corresponding to the characters. As shown in fig. 4, the word vectors obtained by semantic analysis of a sample event text and the character vectors corresponding to the sample event text are input to the encoding layer; the character vectors also go through the fully connected layer for feature extraction to generate the second vector; the second vector is concatenated with the first vector output by the encoding layer to generate the third vector, which is input to the decoding layer, and the decoding layer outputs the corresponding event type.
According to the embodiment of the application, the application also provides an event classification processing device.
Fig. 5 is a block diagram of an event classification processing apparatus according to an embodiment of the present application. As shown in fig. 5, the event classification processing apparatus 500 may include: a first obtaining module 510, a first processing module 520, a second processing module 530, and a training module 540.
Specifically, the first obtaining module 510 is configured to obtain a plurality of sample event sets belonging to different event types, where each sample event set includes a plurality of sample event texts belonging to the same event type;
a first processing module 520, configured to obtain a character vector corresponding to each sample event text;
a second processing module 530, configured to perform semantic analysis on each sample event text to label a role entity, and obtain a word vector corresponding to each role entity;
the training module 540 is configured to use the character vector corresponding to each sample event text and the word vector corresponding to the role entity as input information of a preset neural network model, and use an event type corresponding to a sample event set to which each sample event belongs as output information of the neural network model, so as to train the neural network model to perform event classification.
In some embodiments of the present application, as shown in fig. 6, the first obtaining module 610 in the event classification processing apparatus 600 may further include: a first acquiring unit 611, a clustering unit 612, a second acquiring unit 613, and a filtering unit 614.
Specifically, the first obtaining unit 611 is configured to obtain a candidate event text that meets a preset condition;
a clustering unit 612, configured to perform clustering on candidate event texts to generate a plurality of candidate event sets belonging to different event types, where each candidate event set includes a plurality of candidate event texts belonging to the same event type;
a second obtaining unit 613, configured to extract a set feature of each candidate event set, and obtain a feature score corresponding to each set feature;
a screening unit 614, configured to select a plurality of sample event sets that satisfy a screening condition from the plurality of candidate event sets according to the feature score corresponding to each set feature.
Wherein 610-640 in fig. 6 and 510-540 in fig. 5 have the same functions and structures.
In some embodiments of the present application, as shown in fig. 7, the event classification processing apparatus 700 may further include: a second obtaining module 750, a third obtaining module 760, a cascade module 770, and an input module 780.
Specifically, the second obtaining module 750 is configured to obtain a first vector of the input information passing through an encoding layer in the neural network model.
And a third obtaining module 760, configured to obtain a second vector of the character vector passing through a fully connected layer in the neural network model.
A cascade module 770, configured to cascade the first vector and the second vector to generate a third vector.
An input module 780, configured to input the third vector to a decoding layer in the neural network model.
Wherein 710-740 in fig. 7 and 610-640 in fig. 6 have the same functions and structures.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
There is also provided, in accordance with an embodiment of the present application, an electronic device, a readable storage medium, and a computer program product.
FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
Computing unit 801 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the respective methods and processes described above, such as the event classification processing method. For example, in some embodiments, the event classification processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the event classification processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the event classification processing method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that, when executed by a processor, implement the event classification processing methods described in the embodiments above; the one or more computer programs are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The Server can be a cloud Server, also called a cloud computing Server or a cloud host, and is a host product in a cloud computing service system, so as to solve the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service ("Virtual Private Server", or simply "VPS"). The server may also be a server of a distributed system, or a server incorporating a blockchain.
According to the technical scheme of the embodiments of the application, since an unsupervised learning model is adopted and no predefined event types are used, the method can be applied to the open domain. When the neural network model is trained, the input information further comprises the word vectors corresponding to the role entities; this addition lets the model extract event types more easily, and the extracted event type information is more complete and finer-grained. In some embodiments of the application, candidate event sets with lower scores can be removed, making the training of the model more efficient and targeted and the generated event types more concentrated, while avoiding the overfitting problem caused by small samples. Meanwhile, the character vectors corresponding to the characters of the input information are fed into the decoding layer, so that the event type output by the decoding layer is more likely to contain characters of the input information itself, which improves training efficiency, reduces training difficulty, and better matches how humans classify events in reality. In some embodiments of the application, the heat value of the candidate event text can be introduced as a set feature, making the evaluation criteria of the set features more comprehensive and the event sets screened by them more representative.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (13)

1. An event classification processing method, comprising:
acquiring a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type;
acquiring a character vector corresponding to each sample event text;
performing semantic analysis on each sample event text to label role entities, and acquiring word vectors corresponding to each role entity;
and taking the character vector corresponding to each sample event text and the word vectors corresponding to the role entities as input information of a preset neural network model, and taking the event type corresponding to the sample event set to which each sample event text belongs as output information of the neural network model, so as to train the neural network model to classify events.
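By way of non-limiting illustration, the assembly of such training pairs might be sketched as follows in Python. Every identifier (char_table, EVENT_SETS, the sample texts and role entities) and every dimension below is a hypothetical placeholder rather than part of the claimed method, and random embeddings stand in for whatever character and word vector tables an embodiment would actually use:

import numpy as np

rng = np.random.default_rng(0)
VOCAB = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz 0123456789")}
CHAR_DIM, WORD_DIM = 16, 16
char_table = rng.normal(size=(len(VOCAB), CHAR_DIM))  # character embedding table
word_table = {}                                       # word embeddings, built lazily

def char_vectors(text):
    # character vector sequence corresponding to one sample event text
    return np.stack([char_table[VOCAB.get(c, 0)] for c in text.lower()])

def word_vector(entity):
    # word vector corresponding to one labeled role entity
    if entity not in word_table:
        word_table[entity] = rng.normal(size=WORD_DIM)
    return word_table[entity]

# sample event sets grouped by event type; role entities are assumed to have
# been labeled by the semantic-analysis step upstream
EVENT_SETS = {
    "traffic accident": [("car hits truck on highway", ["car", "truck"])],
    "market move": [("tech stocks fell three percent", ["tech stocks"])],
}

training_pairs = []  # (input information, output event type) for the model
for event_type, samples in EVENT_SETS.items():
    for text, role_entities in samples:
        features = {
            "chars": char_vectors(text),
            "roles": np.stack([word_vector(r) for r in role_entities]),
        }
        training_pairs.append((features, event_type))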
2. The method of claim 1, wherein the acquiring a plurality of sample event sets belonging to different event types comprises:
acquiring candidate event texts meeting preset conditions;
clustering the candidate event texts to generate a plurality of candidate event sets belonging to different event types, wherein each candidate event set comprises a plurality of candidate event texts belonging to the same event type;
extracting set features of each candidate event set, and acquiring a feature score corresponding to each set feature;
selecting the plurality of sample event sets satisfying a screening condition from the plurality of candidate event sets according to the feature score corresponding to each set feature.
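As a minimal sketch of these four steps, assuming TF-IDF features with k-means as one possible clustering choice and cluster size as a stand-in set feature (the claim fixes none of these, and all data below is hypothetical):

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# candidate event texts that passed the preset condition
candidates = ["earthquake hits city", "earthquake strikes town",
              "stocks surge today", "stocks rally today", "stocks climb again"]

X = TfidfVectorizer().fit_transform(candidates)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

candidate_sets = {}
for text, label in zip(candidates, labels):
    candidate_sets.setdefault(label, []).append(text)

def feature_score(texts):
    # toy set feature: cluster size; a real embodiment could also score
    # e.g. the popularity value of claim 3
    return len(texts)

# keep only the candidate sets whose score meets the screening condition
sample_sets = [ts for ts in candidate_sets.values() if feature_score(ts) >= 2]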
3. The method of claim 2, wherein, when the set feature is a popularity value of the candidate event text, the acquiring a feature score corresponding to each set feature comprises:
calculating the character similarity between each candidate event text in the candidate event set and the texts in a preset database, and determining, according to the character similarity, the text popularity of the candidate event texts that match the database texts;
and processing the text popularity according to a preset popularity model to generate the popularity value of the candidate event text.
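One hedged sketch of these two steps, assuming difflib's SequenceMatcher ratio as the character-similarity measure and an average of matched view counts as the preset popularity model (both choices, and the database contents, are illustrative assumptions, not prescribed by the claim):

from difflib import SequenceMatcher

# preset database mapping known texts to an observed popularity signal
DATABASE = {"earthquake strikes town": 950, "stocks rally today": 120}

def popularity_value(candidate_texts, threshold=0.6):
    matched = []
    for text in candidate_texts:
        for db_text, views in DATABASE.items():
            # character similarity between candidate text and database text
            if SequenceMatcher(None, text, db_text).ratio() >= threshold:
                matched.append(views)
    # toy popularity model: mean popularity of the matched texts
    return sum(matched) / len(matched) if matched else 0.0

print(popularity_value(["earthquake hits city", "earthquake strikes town"]))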
4. The method of claim 2, wherein said selecting the plurality of sample event sets that satisfy a screening condition from the plurality of candidate event sets according to the feature score corresponding to each of the set features comprises:
acquiring a preset weight corresponding to each set feature;
calculating the set score of each candidate event set according to the feature score and the weight corresponding to each set feature;
and comparing the set score of each candidate event set with a preset threshold value, and taking the candidate event set corresponding to the set score larger than the threshold value as the sample event set.
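The screening then reduces to a weighted sum compared against a threshold; a toy sketch with hypothetical feature names, weights, and threshold values:

# preset weight corresponding to each set feature
WEIGHTS = {"size": 0.3, "popularity": 0.7}
THRESHOLD = 50.0

def set_score(feature_scores):
    # weighted sum of the feature scores of one candidate event set
    return sum(WEIGHTS[name] * score for name, score in feature_scores.items())

candidate_sets = {
    "cluster_a": {"size": 40, "popularity": 90},
    "cluster_b": {"size": 5, "popularity": 10},
}

# candidate sets whose set score exceeds the threshold become sample sets
sample_sets = [name for name, feats in candidate_sets.items()
               if set_score(feats) > THRESHOLD]
print(sample_sets)  # ['cluster_a']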
5. The method of any one of claims 1-4, wherein, before the training of the neural network model, the method further comprises:
acquiring a first vector obtained by passing the input information through an encoding layer in the neural network model;
acquiring a second vector obtained by passing the character vector through a fully connected layer in the neural network model;
concatenating the first vector and the second vector to generate a third vector;
inputting the third vector to a decoding layer in the neural network model.
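A minimal sketch of this vector flow, using PyTorch modules as stand-ins for the unspecified encoding, fully connected, and decoding layers; every layer type and dimension below is an assumption made for illustration, and the linear decoding head is a toy substitute for whatever decoder an embodiment would use:

import torch
import torch.nn as nn

ENC_DIM, CHAR_DIM, FC_DIM, N_TYPES = 32, 16, 8, 5
encoder = nn.GRU(input_size=CHAR_DIM, hidden_size=ENC_DIM, batch_first=True)
fc = nn.Linear(CHAR_DIM, FC_DIM)                # fully connected layer for the char vector
decoder = nn.Linear(ENC_DIM + FC_DIM, N_TYPES)  # decoding layer (toy classifier head)

chars = torch.randn(1, 20, CHAR_DIM)        # character vectors of one sample text
_, hidden = encoder(chars)
first = hidden[-1]                          # first vector: input through encoding layer
second = fc(chars.mean(dim=1))              # second vector: char vector through FC layer
third = torch.cat([first, second], dim=-1)  # concatenation yields the third vector
logits = decoder(third)                     # third vector fed to the decoding layer

Re-injecting the character information at the decoding stage is what, per the description above, raises the chance that the predicted event type reuses characters from the input text itself.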
6. An event classification processing apparatus comprising:
the first acquisition module is used for acquiring a plurality of sample event sets belonging to different event types, wherein each sample event set comprises a plurality of sample event texts belonging to the same event type;
the first processing module is used for acquiring a character vector corresponding to each sample event text;
the second processing module is used for carrying out semantic analysis on each sample event text to label role entities and acquiring word vectors corresponding to each role entity;
and the training module is used for taking the character vector corresponding to each sample event text and the word vectors corresponding to the role entities as input information of a preset neural network model, and taking the event type corresponding to the sample event set to which each sample event text belongs as output information of the neural network model, so as to train the neural network model to classify events.
7. The apparatus of claim 6, wherein the first acquisition module comprises:
the first acquisition unit is used for acquiring candidate event texts meeting preset conditions;
the clustering unit is used for clustering the candidate event texts to generate a plurality of candidate event sets belonging to different event types, wherein each candidate event set comprises a plurality of candidate event texts belonging to the same event type;
the second acquisition unit is used for extracting set features of each candidate event set and acquiring a feature score corresponding to each set feature;
and the screening unit is used for selecting the plurality of sample event sets meeting the screening condition from the plurality of candidate event sets according to the feature score corresponding to each set feature.
8. The apparatus of claim 7, wherein the second acquisition unit is specifically configured to:
calculating the character similarity between each candidate event text in the candidate event set and the texts in a preset database, and determining, according to the character similarity, the text popularity of the candidate event texts that match the database texts;
and processing the text popularity according to a preset popularity model to generate the popularity value of the candidate event text.
9. The apparatus of claim 7, wherein the screening unit is specifically configured to:
acquiring a preset weight corresponding to each set feature;
calculating the set score of each candidate event set according to the feature score and the weight corresponding to each set feature;
and comparing the set score of each candidate event set with a preset threshold value, and taking the candidate event set corresponding to the set score larger than the threshold value as the sample event set.
10. The apparatus of any one of claims 6-9, further comprising:
the second acquisition module is used for acquiring a first vector obtained by passing the input information through an encoding layer in the neural network model;
the third acquisition module is used for acquiring a second vector obtained by passing the character vector through a fully connected layer in the neural network model;
the cascade module is used for concatenating the first vector and the second vector to generate a third vector;
an input module for inputting the third vector to a decoding layer in the neural network model.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.
13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1-5.
CN202011484245.8A 2020-12-15 2020-12-15 Event classification processing method, device, electronic equipment and storage medium Active CN112559747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011484245.8A CN112559747B (en) 2020-12-15 2020-12-15 Event classification processing method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112559747A true CN112559747A (en) 2021-03-26
CN112559747B CN112559747B (en) 2024-05-28

Family

ID=75063915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011484245.8A Active CN112559747B (en) 2020-12-15 2020-12-15 Event classification processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112559747B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050131935A1 (en) * 2003-11-18 2005-06-16 O'leary Paul J. Sector content mining system using a modular knowledge base
CN102693219A (en) * 2012-06-05 2012-09-26 苏州大学 Method and system for extracting Chinese event
CN104462229A (en) * 2014-11-13 2015-03-25 苏州大学 Event classification method and device
CN105975478A (en) * 2016-04-09 2016-09-28 北京交通大学 Word vector analysis-based online article belonging event detection method and device
CN106095928A (en) * 2016-06-12 2016-11-09 国家计算机网络与信息安全管理中心 A kind of event type recognition methods and device
CN108920460A (en) * 2018-06-26 2018-11-30 武大吉奥信息技术有限公司 A kind of training method and device of the multitask deep learning model of polymorphic type Entity recognition
CN109145153A (en) * 2018-07-02 2019-01-04 北京奇艺世纪科技有限公司 It is intended to recognition methods and the device of classification
CN110134757A (en) * 2019-04-19 2019-08-16 杭州电子科技大学 A Method of Event Argument Role Extraction Based on Multi-Head Attention Mechanism
CN110489520A (en) * 2019-07-08 2019-11-22 平安科技(深圳)有限公司 Event-handling method, device, equipment and the storage medium of knowledge based map
CN110704598A (en) * 2019-09-29 2020-01-17 北京明略软件系统有限公司 Statement information extraction method, extraction device and readable storage medium
CN111382228A (en) * 2020-03-17 2020-07-07 北京百度网讯科技有限公司 Method and apparatus for outputting information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU Jiangde; LI Xueyu; FAN Xiaozhong; PANG Wenbo: "Event Classification Based on the Maximum Entropy Model", Journal of University of Electronic Science and Technology of China, No. 04 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342978A (en) * 2021-06-23 2021-09-03 杭州数梦工场科技有限公司 City event processing method and device
CN113468289A (en) * 2021-07-23 2021-10-01 京东城市(北京)数字科技有限公司 Training method and device of event detection model
CN113743746A (en) * 2021-08-17 2021-12-03 携程旅游网络技术(上海)有限公司 Model training method, event assignment processing method, device and medium
CN113705220A (en) * 2021-08-19 2021-11-26 上海明略人工智能(集团)有限公司 Method and device for determining work skill, electronic equipment and storage medium
CN114493668A (en) * 2021-12-28 2022-05-13 北京五八信息技术有限公司 Vehicle information processing method, device, equipment and storage medium
CN114493668B (en) * 2021-12-28 2023-04-07 北京爱上车科技有限公司 Vehicle information processing method, device, equipment and storage medium
CN116028812A (en) * 2022-12-30 2023-04-28 北京中科智加科技有限公司 A Construction Method of Pipeline Multi-event Extraction Model
CN118378103A (en) * 2024-06-24 2024-07-23 硕威工程科技股份有限公司 Geographic information system data matching management method based on artificial intelligence
CN119540015A (en) * 2025-01-20 2025-02-28 南京智能计算科技发展有限公司 A public safety incident recognition method based on small sample learning and knowledge graph
CN119540015B (en) * 2025-01-20 2025-04-04 南京智能计算科技发展有限公司 Public safety event identification method based on small sample learning and knowledge graph

Also Published As

Publication number Publication date
CN112559747B (en) 2024-05-28

Similar Documents

Publication Publication Date Title
CN112559747B (en) Event classification processing method, device, electronic equipment and storage medium
Aggarwal et al. Classification of fake news by fine-tuning deep bidirectional transformers based language model.
CN108717408B (en) A sensitive word real-time monitoring method, electronic equipment, storage medium and system
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN105279495B (en) A video description method based on deep learning and text summarization
CN112507715A (en) Method, device, equipment and storage medium for determining incidence relation between entities
CN112148881B (en) Method and device for outputting information
CN105183833A (en) User model based microblogging text recommendation method and recommendation apparatus thereof
CN106126619A (en) A kind of video retrieval method based on video content and system
US11886515B2 (en) Hierarchical clustering on graphs for taxonomy extraction and applications thereof
Zhu et al. CCBLA: a lightweight phishing detection model based on CNN, BiLSTM, and attention mechanism
Aziguli et al. A robust text classifier based on denoising deep neural network in the analysis of big data
CN113051911A (en) Method, apparatus, device, medium, and program product for extracting sensitive word
US20240168999A1 (en) Hierarchical clustering on graphs for taxonomy extraction and applications thereof
CN112417845B (en) Text evaluation method, device, electronic device and storage medium
CN114707517B (en) Target tracking method based on open source data event extraction
CN110110218A (en) A kind of Identity Association method and terminal
Campbell et al. Content+ context networks for user classification in twitter
CN110019763B (en) Text filtering method, system, equipment and computer readable storage medium
Zhu et al. MMLUP: Multi-Source & Multi-Task Learning for User Profiles in Social Network.
CN114611625A (en) Language model training, data processing method, apparatus, equipment, medium and product
CN116226533A (en) Method, device and medium for news association recommendation based on association prediction model
CN115048523A (en) Text classification method, device, equipment and storage medium
CN107544962A (en) Social media text query extended method based on Similar Text feedback
CN113761123A (en) Method, apparatus, computing device and storage medium for keyword acquisition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant