CN112883350B

CN112883350B - Data processing method, device, electronic equipment and storage medium

Info

Publication number: CN112883350B
Application number: CN201911206373.3A
Authority: CN
Inventors: 杨广煜
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-11-29
Filing date: 2019-11-29
Publication date: 2024-12-17
Anticipated expiration: 2039-11-29
Also published as: CN112883350A

Abstract

The embodiment of the application discloses a data processing method, a device, electronic equipment and a storage medium, wherein the method comprises the steps of acquiring target biological information when a primary identity is in a valid state; the method comprises the steps of identifying service intention and target user identity corresponding to the target biological information, obtaining target secondary identity corresponding to the target user identity, wherein the target secondary identity is a sub-identity of the primary identity, and executing service instruction corresponding to the service intention based on the target secondary identity. By adopting the application, the business behavior executed by the terminal equipment can be matched with the user identity.

Description

Data processing method, device, electronic equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, and related devices.

Background

With the rapid development of internet technology, the number of internet users is continuously increasing, and among all application software, video software is one of the software with the highest use frequency of internet users. The data shows that the usage duration of the video software is up to 34.5% of the total usage duration of the mobile device.

In a home scenario, a terminal device (e.g., a smart tv) is often common to multiple family members, and when a family member a logs in to a video application in the terminal device using its own account a and views a video 1, the terminal device records the viewing progress of the family member a on the video 1. If another family member B uses the video application to watch the video 1, and does not switch the login account of the video application, the terminal device automatically jumps to the watching progress of the family member A on the video 1 at this time, so that the service behavior executed by the terminal device is not matched with the user identity.

Disclosure of Invention

The embodiment of the application provides a data processing method, a data processing device and related equipment, which can enable business behaviors executed by terminal equipment to be matched with user identities.

In one aspect, an embodiment of the present application provides a data processing method, including:

when the first-level identity is in a valid state, acquiring target biological information;

identifying a business intention and a target user identity corresponding to the target biological information;

Acquiring a target secondary identity corresponding to the target user identity, wherein the target secondary identity is a sub-identity of the primary identity;

and executing a service instruction corresponding to the service intention based on the target secondary identity.

Wherein the target biological information includes target voice data;

The identifying the business intention and the target user identity corresponding to the target biological information comprises the following steps:

Converting the target voice data into text data, and semantically identifying the text data to obtain the service intention;

Invoking an identification model corresponding to the primary identification to determine a matching result between the target voice data and at least one template user identity, wherein the identification model is a classification model generated according to the at least one template user identity and the template voice data respectively corresponding to the at least one template user identity;

if at least one matching result has a matching result meeting the matching condition, taking the template user identity corresponding to the matching result meeting the matching condition as the target user identity;

The obtaining the target secondary identity corresponding to the target user identity includes:

Extracting a target identity corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity, wherein the secondary identity in the secondary identity set is a child identity of the primary identity.

Wherein, still include:

if the at least one matching result does not have the matching result meeting the matching condition, creating the target user identity;

identifying age information corresponding to the target voice data, and searching an identity head portrait matched with the age information in an image material library;

Creating the target secondary user identity for the target user identity;

Setting the target secondary user identifier as a sub identifier of the primary identity identifier;

and carrying out association storage on the target user identity, the target secondary identity and the identity head portrait.

The identity recognition model comprises a feature generator and a pattern matcher;

the step of calling the identification model corresponding to the primary identification to determine a matching result between the target voice data and at least one template user identity comprises the following steps:

extracting target voiceprint features of the target voice data based on the feature generator;

And determining the matching probability between the target voiceprint feature and at least one template voiceprint feature based on the pattern matcher, wherein the obtained matching probabilities are used as matching results, and the at least one template voiceprint feature is the voiceprint feature corresponding to the at least one template voice data respectively.

Wherein the extracting the target voiceprint feature of the target voice data based on the feature generator includes:

extracting a spectrum parameter and a linear prediction parameter of the target voice data based on the feature generator, wherein the spectrum parameter is a short-time spectrum feature parameter of the target voice data;

and obtaining the target voiceprint feature according to the frequency spectrum parameter and the linear prediction parameter.

Wherein, still include:

acquiring template voice data corresponding to the identity of a template user;

Generating an identity tag vector corresponding to the template voice data;

Acquiring an initial classification model, predicting the matching degree between the sample voice data and the at least one template user identity based on the initial classification model, and acquiring an identity prediction vector according to the acquired matching degree;

and determining a classification error according to the identity tag vector and the identity prediction vector, and training the initial classification model according to the classification error to obtain the identity recognition model.

Wherein, still include:

when a matching result meeting the matching condition exists in the at least one matching result, sending an animation playing instruction to a client, and indicating the client to play a target animation;

and when the execution of the business instruction is completed, sending an instruction for stopping playing the animation to the client, and indicating the client to close the target animation.

The service intention comprises a client secondary login object switching intention;

The executing the service instruction corresponding to the service intention based on the target secondary identity comprises the following steps:

Generating a switching instruction corresponding to the switching intention of the client secondary login object, wherein the switching instruction belongs to the service instruction;

And taking the target secondary identity as a secondary login object of the client according to the switching instruction.

Wherein, still include:

Acquiring behavior data of a user in the client corresponding to the target secondary identity, wherein the behavior data is used for generating recommended service data for the user;

and carrying out association storage on the behavior data and the target secondary identity.

Wherein the business intent comprises a business data query intent;

Generating a query instruction corresponding to the service data query intention, wherein the query instruction belongs to the service instruction;

and inquiring target service data corresponding to the target secondary identity, and returning the target service data to the client.

The user authority of the target secondary identity is the same as the user authority of the primary identity.

Another aspect of an embodiment of the present application provides a data processing apparatus, including:

the first acquisition module is used for acquiring target biological information when the primary identity is in a valid state;

The identification module is used for identifying the service intention and the target user identity corresponding to the target biological information;

The second acquisition module is used for acquiring a target secondary identity corresponding to the target user identity, wherein the target secondary identity is a sub-identity of the primary identity;

and the determining module is used for executing the business instruction corresponding to the business intention based on the target secondary identity.

Wherein the target biological information includes target voice data;

The identification module comprises:

the conversion unit is used for converting the target voice data into text data, and semantically identifying the text data to obtain the business intention;

The system comprises a first-level identity identification unit, a calling unit and a second-level identity identification unit, wherein the first-level identity identification unit is used for identifying the first-level identity identification unit and the second-level identity identification unit, and the first-level identity identification unit is used for identifying the first-level identity identification unit and the second-level identity identification unit;

The first determining unit is used for taking the template user identity corresponding to the matching result meeting the matching condition as the target user identity if the matching result meeting the matching condition exists in at least one matching result;

the second acquisition module includes:

The first extraction unit is used for extracting a target identity corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity, wherein the secondary identity in the secondary identity set is a child identity of the primary identity.

Wherein, still include:

The second determining unit is used for creating the target user identity if the matching result meeting the matching condition does not exist in the at least one matching result, identifying age information corresponding to the target voice data, and searching an identity head portrait matched with the age information in an image material library;

the second acquisition module includes:

The second extraction unit is used for creating the target secondary user identification for the target user identity, setting the target secondary user identification as a sub-identification of the primary identity, and carrying out association storage on the target user identity, the target secondary identity and the identity head portrait.

The calling unit comprises:

an extraction subunit, configured to extract, based on the feature generator, a target voiceprint feature of the target voice data;

And the matching subunit is used for determining the matching probability between the target voiceprint feature and at least one template voiceprint feature based on the pattern matcher, and taking the obtained matching probabilities as matching results, wherein the at least one template voiceprint feature is the voiceprint feature corresponding to the at least one template voice data respectively.

The extraction subunit is specifically configured to extract a spectral parameter and a linear prediction parameter of the target voice data based on the feature generator, and obtain the target voiceprint feature according to the spectral parameter and the linear prediction parameter, where the spectral parameter is a short-time spectral feature parameter of the target voice data, and the linear prediction parameter is a spectral fitting feature parameter of the target voice data.

Wherein, still include:

The training module is used for obtaining template voice data corresponding to the template user identity, generating an identity tag vector corresponding to the template voice data, obtaining an initial classification model, predicting the matching degree between the template voice data and the at least one template user identity based on the initial classification model, obtaining an identity prediction vector according to the obtained matching degree, determining a classification error according to the identity tag vector and the identity prediction vector, and training the initial classification model according to the classification error to obtain the identity recognition model.

Wherein, still include:

the playing module is used for sending an animation playing instruction to the client to instruct the client to play the target animation when the matching result meeting the matching condition exists in the at least one matching result;

and the playing module is also used for sending an instruction for stopping playing the animation to the client when the execution of the business instruction is completed, and indicating the client to close the target animation.

The determining module includes:

The first generation unit is used for generating a switching instruction corresponding to the switching intention of the secondary login object of the client, and taking the target secondary identity as the secondary login object of the client according to the switching instruction, wherein the switching instruction belongs to the service instruction.

Wherein, still include:

The storage module is used for acquiring behavior data of the user in the client corresponding to the target secondary identity, and carrying out association storage on the behavior data and the target secondary identity, wherein the behavior data is used for generating recommended service data for the user.

Wherein the business intent comprises a business data query intent;

The determining module includes:

the second generating unit is used for generating a query instruction corresponding to the service data query intention, querying target service data corresponding to the target secondary identity, and returning the target service data to the client, wherein the query instruction belongs to the service instruction.

Another aspect of the embodiments of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the computer program when executed by the processor causes the processor to perform a method as in one aspect of the embodiments of the present application.

Another aspect of the embodiments of the present application provides a computer storage medium storing a computer program comprising program instructions which, when executed by a processor, perform a method as in one aspect of the embodiments of the present application.

The application can determine the target secondary identity corresponding to the user identity by identifying the user identity and the service intention which generate the current biological information, so that the service instruction executed based on the target secondary identity not only meets the current service intention of the user and is matched with the user identity, but also can determine the service intention and the user identity of the user at the same time by collecting the target biological information of the user, and the user does not need to execute two operations of determining the service intention and determining the user identity, thereby reducing the operation cost of the user and improving the efficiency of the terminal to execute the service instruction matched with the service intention and the user identity of the user.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a system architecture diagram for data processing according to an embodiment of the present application;

FIGS. 2 a-2 d are schematic diagrams of a scenario of data processing according to an embodiment of the present application;

FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;

FIG. 4 is a flowchart of another data processing method according to an embodiment of the present application;

FIG. 5 is a timing diagram of a data processing method according to an embodiment of the present application;

FIG. 6 is a schematic flow chart of determining a target user identity and a target secondary identity according to an embodiment of the present application;

FIG. 7 is a flowchart of another data processing method according to an embodiment of the present application;

FIG. 8 is a timing diagram of another data processing method according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Artificial intelligence (ARTIFICIAL INTELLIGENCE, AI) is the theory, method, technique, and application system that simulates, extends, and extends human intelligence using a digital computer or a machine controlled by a digital computer, perceives the environment, obtains knowledge, and uses the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.

The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

The scheme provided by the embodiment of the application belongs to the technical field of artificial intelligence, namely a voice technology (Speech Technology), natural language processing (Nature Language processing, NLP) and machine learning (MACHINE LEARNING, ML).

Key technologies to speech technology (Speech Technology) are automatic speech recognition technology (ASR) and speech synthesis technology (TTS) and voiceprint recognition technology. The computer can listen, watch, say and feel.

Natural language processing (Nature Language processing, NLP) is an important direction in the fields of computer science and artificial intelligence. It is studying various theories and methods that enable effective communication between a person and a computer in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Thus, the research in this field will involve natural language, i.e. language that people use daily, so it has a close relationship with the research in linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic questions and answers, knowledge graph techniques, and the like.

In the present application, speech technology is concerned with converting a user's speech into text, and natural language processing is concerned with semantically recognizing text to determine the user's intent.

Machine learning is a multi-domain interdisciplinary, involving multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and the like. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of artificial intelligence, a fundamental approach to letting computers have intelligence, which is applied throughout various areas of artificial intelligence. The application relates to machine learning, in particular to a technology for identifying the user identity of a current user, and concretely relates to technology such as an artificial neural network, logistic regression and the like in the machine learning.

Fig. 1 is a system architecture diagram for data processing according to an embodiment of the present application. The present application relates to a server 10d and a terminal device cluster, which may comprise a terminal device 10a, a terminal device 10b, a terminal device 10c, etc.

Taking the terminal device 10a as an example, when the primary identity is in a valid state, the terminal device 10a collects the biological information of the user and sends the collected biological information to the server 10d. The server 10d performs semantic recognition on the biological information to determine the intention of the biological information, and the server 10d determines the identity of the user of the biological information, extracts the secondary identity of the user, and the extracted secondary identity is a child of the primary identity. The server 10d executes instructions related to the above intent based on the determined secondary identity. Subsequently, the server 10d may return the execution result of the instruction to the terminal device 10a.

Identifying the intention of the biometric information, determining the identity of the user of the biometric information, and executing instructions related to the intention may also be accomplished by the terminal device 10 a.

The terminal devices 10a, 10b, 10c, etc. shown in fig. 1 may include a smart tv, a mobile phone, a tablet computer, a notebook computer, a palm computer, a mobile internet device (MID, mobile INTERNET DEVICE), a wearable device (e.g., a smart watch, a smart bracelet, etc.), etc. The server 10d shown in fig. 1 may refer to a single server device or may refer to a server cluster including a plurality of server devices.

The following fig. 2 a-2 d illustrate in detail how the terminal device 10a recognizes the intention of the biometric information, determines the user identity of the biometric information and executes the instructions related to the intention, and the recognition of the intention of the biometric information and the determination of the user identity of the biometric information and the execution of the instructions related to the intention may be specific to the video client in the terminal device 10 a:

Fig. 2a to fig. 2d are schematic diagrams of a scenario of data processing according to an embodiment of the present application. When the video client detects that the primary account number "01" is logged in, but no secondary account number subordinate to the primary account number "01" is logged in, the current user starts the video client in the terminal device 10a, and a prompt message may be displayed on the screen, wherein the prompt message is "the current secondary account number is not logged in, the current secondary account number is input through voice or the current secondary account number can be logged in by clicking, so as to prompt the current user to log in the secondary account number, and the primary account number" 01 "is the primary account number of the user 1.

The current user can input voice of 'logging in a secondary account', the video client can acquire voice data 20b of the voice of 'logging in a secondary account', the video client converts the voice data 20b into text data, the text data is identified semantically, and the intention corresponding to the voice data 20b is determined to be 'logging in a secondary account'.

The video client inputs the voice data 20b into a trained prediction model 20d corresponding to the primary account number "01", the prediction model 20d can extract the voiceprint feature of the voice data 20b, and match the voiceprint feature with a plurality of template voiceprint features, and if the template voiceprint features matched with the voiceprint feature of the voice data 20b exist in the plurality of template voiceprint features, the user identity corresponding to the matched template voiceprint features is extracted (assuming that the extracted user identity is user 2).

Each template voiceprint feature in the predictive model 20d corresponds to 1 user identity, and assuming that the predictive model 20d is trained from 2 template voiceprint features, the user identities corresponding to the 2 template voiceprint features are user 1 and user 2, respectively, and the secondary account of user 1 and the secondary account of user 2 are both sub-accounts of the current primary account "01".

As shown in fig. 2b, the second-level account corresponding to user 2 is found 002 in the user information record table 20e corresponding to the first-level account "01".

As can be seen from FIG. 2b, the user information record table 20e includes 3 user records, wherein the 3 user records respectively correspond to a primary account number "01" and 2 secondary accounts (respectively, a secondary account number "001" and a secondary account number "002") which are subordinate to the primary account number "01", the primary account number "01" and the secondary account number "001" are both accounts of the user 1, the secondary account number "002" is an account number of the user 2, and the history records of the user are all stored in association with the secondary account numbers.

The video client extracts the secondary account number 002 of the user 2, and the intention of the voice data 20b determined in the foregoing is "secondary account number login", so the secondary account number 002 can be used as a secondary login object of the video client.

As shown in page 20f in fig. 2b, the video client may display an animation in the screen in the process of determining the intention of the voice data 20b and determining the secondary account number "002", stop playing the animation when the video client has made the secondary account number "002" the secondary login object, and jump to the first page of the video client.

As shown in page 20g, the video client is currently logged in with secondary account number "002" and primary account number "01".

Alternatively, the foregoing is that it is assumed that there is a template voiceprint feature matching the voiceprint feature of the voice data 20b among the plurality of template voiceprint features, thereby determining that the user identity corresponding to the voice data 20b is user 2.

As shown in fig. 2c, it is assumed that there is no template voiceprint feature of the plurality of template voiceprint features that matches the voiceprint features of the voice data 20b, that is, there is no corresponding user identity for the current user generating the voice data 20b and no corresponding user record in the user information record table 20 c. Because there is no corresponding user record, the video client may create 1 new user record for the current user, and the user records include user identity "user 2", secondary account "002", header, level "level 2", and history (of course, the history is empty at this time).

The video client adds the user record to the user information record table 20c, and a new user information record table 20h can be obtained after the addition.

The video client extracts the newly created secondary account number 002 of the user 2, and the intention of the voice data 20b determined in the foregoing is "secondary account number login", so the secondary account number 002 "can be used as a secondary login object of the video client.

As shown in the page 20i in fig. 2d, the video client creates the secondary account "002", and after logging in the secondary account "002", a prompt message may be displayed in the screen, "no secondary account is detected, a secondary account is newly created for you, and logged in" for prompting the current user to create a new secondary account.

As shown in page 20j, the video client is currently logged in with the newly created secondary account "002" and primary account "01".

The specific process of acquiring the target biometric information (e.g. the voice data 20b of voice "login to secondary account" in the above embodiment), and identifying the service intention (e.g. the intention "secondary account login" in the above embodiment) and the target user identity (e.g. the user 2 in the above embodiment) can be seen in the embodiments corresponding to fig. 3-8 described below.

Referring to fig. 3, a flow chart of a data processing method according to an embodiment of the present application is shown in fig. 3, where the data processing method may include the following steps:

step S101, when the primary identity is in a valid state, acquiring target biological information.

Specifically, the server (e.g., the server 10d in the corresponding embodiment of fig. 1) detects whether the primary id is the primary login object of the current corresponding client (e.g., the video client in the corresponding embodiment of fig. 2 a-2 d), and if the primary id is the primary login object of the client, it indicates that the primary id is in a valid state. The login object of the client may include a primary login object and a secondary login object, where the primary login object corresponds to a primary identity, and the secondary login object corresponds to a secondary identity, and the secondary identity is a child of the primary identity.

The client may be specifically a video client, an instant messaging client, or a mail client, etc.

When the primary id is in a valid state, the server may receive the biometric information (referred to as target biometric information, such as voice data 20b of voice "log in secondary account" in the corresponding embodiment of fig. 2 a-2 d above) sent by the client.

The target biometric information may include voice data (referred to as target voice data), and the target biometric information may also include voice data (referred to as target voice data) and image data (referred to as target image data), wherein the target image data may be facial image data of the current user.

Step S102, identifying the business intention and the target user identity corresponding to the target biological information.

Specifically, when the target biometric information includes target voice data, the server may determine a business intention of the target voice data by converting the target voice data into text data and semantically recognizing the text data.

And, the server may determine the user identity of the current user (referred to as the target user identity, as user 2 in the corresponding embodiment of fig. 2 a-2 d described above) through an identification model (such as predictive model 20d in the corresponding embodiment of fig. 2 a-2 d described above) corresponding to the first identification.

The order in which the server determines the service intent and the identity of the target user is not limited.

The method comprises the steps of converting target voice data into text data, determining the state of each audio frame of the target voice data by adopting an acoustic model (the acoustic model can be a model established based on a dynamic time warping method of pattern matching or a model established based on an artificial neural network recognition method and the like), combining a plurality of states into phonemes, and combining a plurality of phonemes into words.

A language model (the language model can be an N-Gram language model, a Markov N-Gram model, an index model (Exponential Models) or a decision tree model (Decision Tree Models) and the like) is adopted to combine a plurality of words into correct, unambiguous and logical sentences so as to obtain text data.

The text data is semantically identified to determine the business intention of the text data, and an entity-predicate knowledge graph can be adopted to perform pattern matching with the text data, so that the entity and the predicate in the text data are determined. The server may combine the identified entities and predicates into business intent.

For example, after the current user inputs 'query for a history play record' and converts the voice data into text data 'query for a history play record', the knowledge graph can be used to determine that the entity is 'history play record', and the predicate is 'query', so that the business intention is that the history play record is the query.

The identity recognition model is a classification model trained by at least one template user identity and voice data (called template voice data) corresponding to the template user identity, each template user identity has a secondary identity (such as the secondary account "001" and the secondary account "002" in the corresponding embodiments of fig. 2a-2 d, the identity may be a user account) corresponding to the template user identity, and the secondary identity of each template user identity is a sub-identity of the primary identity.

When the target biometric information includes target voice data and target image data, the server may also determine the business intention of the target voice data in the above manner, and determine the user identity (referred to as a first user identity) of the target voice data according to the identity recognition model;

The server may also determine the user identity of the target image data (referred to as the second user identity) based on an image recognition model, which is similar to the identity recognition model, and which is a classification model trained from at least one template user identity and image data corresponding to the template user identity (referred to as template image data).

The server may determine the final target user identity according to the first user identity determined by the identity recognition model and the second user identity determined by the image recognition model, and the target user identity determined based on the two models has higher accuracy.

Alternatively, when the target biometric information includes target voice data and target image data, the server may determine the service intention of the target voice data as well as the target user identity of the target image data based on the image recognition model only in the manner described above.

Step S103, a target secondary identity corresponding to the target user identity is obtained, wherein the target secondary identity is a sub-identity of the primary identity.

Specifically, the server may obtain the target secondary identity of the target user identity from the secondary identity set corresponding to the at least one template user identity (e.g. the secondary account "002" in the corresponding embodiment of fig. 2 a-2 d above), or recreate the target secondary identity, where the target secondary identity is a child identifier of the primary identity.

And step S104, executing a service instruction corresponding to the service intention based on the target secondary identity.

Specifically, when the service intention is the switching intention of the secondary login object of the client, the server generates a switching instruction corresponding to the switching intention of the secondary login object of the client, wherein the switching instruction is used for indicating the server to switch the current secondary login object of the client, and the switching instruction belongs to the service instruction.

And the server can take the target secondary identity as a current secondary login object of the client according to the switching instruction. Subsequently, the server can issue the switching notification message to the client, so that after the client receives the switching notification message, a prompt message can be displayed for prompting the user that the current secondary login object is the target secondary identity.

Subsequently, when the secondary login object of the client is the target secondary identity, the server may receive behavior data (the behavior data may include at least one of viewing behavior data, browsing behavior data, search behavior data and comment behavior data) reported by the client, where the behavior data is user behavior data of the user collected by the client when the secondary login object of the client is the target secondary identity.

The server can store the target secondary identity and behavior data reported by the client in an associated mode. Subsequently, the server can generate recommended service data for the user based on the behavior data so as to achieve the purpose of personalized recommendation.

When the service intention is a service data query intention, the server generates a query instruction corresponding to the service data query intention, wherein the query instruction is used for indicating the server to query the service data, and the query instruction belongs to the service instruction. Such as querying a historical viewing record, querying a viewing progress record, querying a search record, etc.

The server can inquire service data (called target service data) related to the target secondary identity according to the inquiry command, and then the server can return the inquired target service data to the client so that the client can display the target service data after receiving the target service data.

Or after the server generates the query instruction, the target secondary identity is used as a secondary login object of the client, and at the same time, query operation related to the query instruction is executed.

It should be noted that the user authority of the target secondary identity is the same as the user authority of the primary identity, and further, the user authority of the sub-identities of all the primary identities is the same as the user authority of the primary identity.

For example, if the primary id has the member VIP authority, the sub-id of the primary id (including the target secondary id and the secondary id of the template user id in the foregoing) has the member VIP authority.

The user identities of different levels correspond to different functional architectures, and the primary identity can be used for managing member rights and for counting statistics (e.g., total time of viewing, etc.) of all secondary identities. The secondary identity is personalized information for managing each user identity.

It should be noted that, the above steps S101 to S104 are described with the server as an execution body, and the execution body may also be a client installed in a terminal device (such as the terminal device 10a in the corresponding embodiment of fig. 2a to 2 d), and the terminal device may be a smart tv, and the client may be a video client installed in the smart tv.

When the primary identity is in a valid state, the client acquires target biological information, the client recognizes the service intention of the target biological information and invokes an identity recognition model to determine the target user identity, the client acquires a target secondary identity of the target user identity, and executes a service instruction corresponding to the service intention based on the target secondary identity, for example, the target secondary identity is used as a service instruction of a secondary login object of the client, and the service instruction of target service data corresponding to the target secondary identity is queried.

Fig. 4 is a flowchart of another data processing method according to an embodiment of the present application, where the data processing includes the following steps:

in step S201, the flow starts.

In step S202, the server acquires voice data.

Specifically, the current primary account (which can correspond to the primary identity in the application) logs in the client, that is, the primary account is in a valid state, and the server receives voice data sent by the client, wherein the voice data is data acquired by the client when the user inputs a login sub-account in the client in a voice manner.

In step S203, the server determines whether the voiceprint exists.

Specifically, the server semantically recognizes voice data and determines that the service intention is to log in the sub-account.

The server judges whether the template voiceprint features matched with the voiceprint features of the voice data exist or not by calling an identity recognition model of the primary account, if so, the server executes the steps S204 and S206, and if not, the server executes the steps S205-S206.

In step S204, the server logs the secondary account into the client.

Specifically, the server sets the secondary account corresponding to the matched voiceprint feature (which can correspond to the target secondary identity in the application) as the secondary login account of the client, and at this time, the client logs in a primary account and a secondary account.

In step S205, the server creates a new secondary account (which may correspond to the target secondary identity in the present application), where the secondary account is a sub-account of the primary account, stores the newly created secondary account in association with the voiceprint feature, and uses the newly created secondary account as the secondary login account of the client.

Step S206, the flow ends.

Referring to fig. 5, which is a timing chart of a data processing method according to an embodiment of the present application, a video background server, a voice recognition server and a voiceprint recognition server described below all belong to the servers in the present application, and the data processing includes the following steps:

In step S301, the primary account is in an active state, and the client collects the voice data "enter sub-account" input by the user.

In step S302, the client sends the voice data to the video background server.

In step S303, the video backend server transmits the voice data to the voice recognition server.

In step S304, the video background server sends the voice data to the voiceprint recognition server.

In step S305, the voice recognition server performs semantic recognition on the voice data, determines that the service intention is to access the secondary account, and sends the determined service intention back to the video background server.

And step S306, the voiceprint recognition server performs voiceprint recognition on the voice data according to the identity recognition model corresponding to the primary account to obtain a voiceprint recognition result, and the voiceprint recognition server sends the voiceprint recognition result back to the video background server.

Step S307, the video background server generates an access secondary account instruction corresponding to the service intention.

Step S308, the video background server judges whether a corresponding secondary account exists according to the voiceprint recognition result, if so, the video background server returns service data corresponding to the secondary account to the client according to the instruction for accessing the secondary account, and if not, a new secondary account is created, wherein the secondary account is a sub-account of the primary account.

Referring to fig. 6, a flow chart for determining a target user identity and a target secondary identity according to an embodiment of the present application is provided, where determining the target user identity and the target secondary identity includes the following steps S401 to S404, and the steps S401 to S404 are specific embodiments of the steps S102 to S103 in the corresponding embodiment of fig. 3.

Step S401, converting the target voice data into text data, and recognizing the text data semantically to obtain the business intention.

Specifically, when the target biological information is the target voice data, the server divides the target voice data into a plurality of audio frames according to the preset frame length and the preset frame shift, and partial overlap exists between the audio frames, and the overlap length is equal to the preset frame shift.

For example, according to a frame length of 20ms, a frame shift of 10ms divides target voice data with a time dimension of 0-30ms, and can be divided into voice data between 1:0-20ms of audio frames and voice data between 2:10-30ms of audio frames.

Spectral parameters of each audio frame are extracted, wherein the spectral parameters are short-time spectral feature parameters of the audio frame, and the short-time spectral feature parameters are parameters extracted based on physiological structures of sound generating organs such as glottal, vocal tract or nasal cavity, etc.

The short-time spectral feature parameters may include parameters such as at least one of a pitch spectrum and its contour, energy of a pitch frame, a spectral envelope, frequency of occurrence of a pitch formant, and its trajectory.

Extracting linear prediction parameters of each audio frame, wherein the linear prediction parameters are spectral fitting characteristic parameters of the audio frame, the spectral fitting characteristic parameters are parameters which are proposed by simulating the characteristics of human ears on sound frequency perception from the auditory perspective, and the spectral fitting characteristic parameters are voice characteristics estimated by approximating current audio frames by a plurality of 'past' audio frames from the mathematical perspective and using corresponding approximation parameters.

The spectral fit characteristic parameters may include at least one of linear prediction cepstrum (LPCC), line Spectrum Pair (LSP), auto-correlation and logarithmic area ratio, mel frequency cepstrum (MFCC), perceptual Linear Prediction (PLP), etc.

In the above manner, the spectral parameters and the linear prediction parameters extracted for each audio frame are combined into one vector, so that each audio frame can be expressed as a multi-dimensional vector (which may also be referred to as a feature vector). The state of the feature vector corresponding to each audio frame is determined by using an acoustic model, and in general, the states of adjacent audio frames should be the same, because the frame length of each audio frame is relatively short, in the order of milliseconds and ms.

The states corresponding to several audio frames (typically 3 audio frames) are combined into one phoneme, which is the smallest speech unit, which is the unit separated from the perspective of the tone, and one phoneme exists alone or several phonemes are combined together to be called syllables.

And combining the phonemes into words (or words). Because of the time variability, noise and other instabilities of the speech signal, each word has a close relation with the context, and in order to further improve the accuracy of the speech-text conversion, the adaptation is performed according to the context of all the words. Therefore, the server can adopt a language model, and the recognized words are formed into logical and unambiguous sentences, so that text data corresponding to the target voice data can be obtained.

The server may obtain an entity-predicate knowledge graph that includes a plurality of entity strings and predicate strings, and each entity string (or predicate string) identifies whether the string is an entity attribute or a predicate attribute. The server may perform multi-mode string matching on the text data and the entity-predicate knowledge graph by using a multi-mode string matching algorithm (the multi-mode string matching algorithm may include AC automaton, hash function matching, etc.), and determine that a matched string in the text data and the string are entity attributes or predicate attributes. The server may use, as an entity, a string belonging to an entity attribute in the text data, and use, as a predicate, a string belonging to a predicate attribute. The entities and predicates identified from the text data are combined into business intent.

Step S402, calling an identification model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity, wherein the identification model is a classification model generated according to the at least one template user identity and the template voice data respectively corresponding to the at least one template user identity.

Specifically, the server acquires an identification model corresponding to the primary identity, the identification model is a classification model trained according to at least one template user identity and template voice data corresponding to each template user identity, the template user identity can be understood as a user identity which has been created by the server, each template user identity has a secondary identity corresponding to the template user identity, and the secondary identity is a child identity of the primary identity.

The identity recognition model comprises a feature generator and a pattern matcher:

The feature generator is for dividing the target speech data into a plurality of audio frames, and extracting the spectral parameters and the linear prediction parameters of each audio frame (the process of extracting the spectral parameters and the linear prediction parameters of each audio frame may be referred to above in step S401), combining the spectral parameters of all audio frames into the spectral parameters of the target speech data, and combining the linear prediction parameters of all audio frames into the linear prediction parameters of the target speech data. The spectral parameters of the target voice data and the linear prediction parameters of the target voice data are combined into a voiceprint feature (referred to as a target voiceprint feature) of the target voice data in a predetermined order.

The pattern matcher is used for identifying the similarity (or matching probability) between the target voiceprint feature and at least one template voiceprint feature, and taking the obtained at least one matching probability as a matching result, wherein the template voiceprint feature is the voiceprint feature of the template voice data (the extraction process of the template voiceprint feature is the same as the extraction process of the target biological feature).

Since the template voice data is voice data corresponding to the identity of the template user, the similarity (or matching result) between the target biological feature and the at least one template voiceprint feature is equal to the matching between the target voice data and the at least one template user identity.

The pattern matcher may be a model having a prediction classification function, such as a BP (Back Propagation) neural network model, a convolutional neural network model, or various regression models (e.g., a linear regression model, a logistic regression model).

Step S403, if there is a matching result satisfying the matching condition in at least one matching result, taking the template user identity corresponding to the matching result satisfying the matching condition as the target user identity.

Specifically, the server obtains a preset probability threshold. If the matching result is larger than the preset probability threshold, the matching result is the matching result meeting the matching condition.

When at least one obtained matching result has a matching result meeting the matching condition, taking the template user identity corresponding to the matching result meeting the matching condition as a target user identity.

Step S404, extracting a target identity corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity, wherein the secondary identity in the secondary identity set is a child identity of the primary identity.

Specifically, each template user identity has a secondary identity corresponding to the template user identity, the secondary identity of each template user identity is a sub-identity of the primary identity, and the secondary identities of all the template user identities can be combined into a secondary identity set.

The server may extract, from the set of secondary identities, a secondary identity of the target user identity (i.e., a template user identity corresponding to a matching result satisfying the matching condition) as the target secondary identity.

For example, the probability of matching between the target voiceprint feature of the target voice data and the template voiceprint feature 1 (corresponding to the template user identity 1) is 0.1, the probability of matching between the target voiceprint feature and the template voiceprint feature 2 (corresponding to the template user identity 2) is 0.8, and the probability of matching between the target voiceprint feature and the template voiceprint feature 3 (corresponding to the template user identity 3) is 0.1. If the preset probability threshold is 0.5, the server can take the template user identity 2 as the target user identity and take the secondary identity of the template user identity as the target secondary identity, which indicates that the matching result between the target biological feature and the template voiceprint feature 2 meets the matching condition.

Optionally, when at least one obtained matching result has a matching result satisfying the matching condition, the server may send a play animation instruction to the client, so that the client plays the target animation according to the play animation instruction, and the target animation may be a lightweight animation.

Subsequently, when execution of the service instruction corresponding to the service intention is completed, the server may send an instruction to stop playing the animation to the client, so that the client stops the target animation according to the instruction to stop playing the animation.

The above-described step S403 to step S404 describe the case when there is a matching result satisfying the matching condition among the obtained at least one matching result, and the case when there is no matching result satisfying the matching condition among the obtained at least one matching result is described below:

If the matching result is smaller than or equal to the preset probability threshold value, the matching result is the matching result which does not meet the matching condition.

When there is no matching result satisfying the matching condition in the obtained at least one matching result (or when none of the obtained at least one matching result satisfies the matching condition), the server may create a user identity (referred to as a target user identity) for the current user, create a secondary identity (referred to as a target secondary identity) for the target user identity, and set the target secondary identity as a child of the primary identity.

The server can also identify age information corresponding to the target voice data, and search an image matched with the age information from the image material library to be used as an identity head portrait.

The server may store the target user identity, the target secondary identity, and the identity head portrait in association.

Subsequently, the server may take the target user identity as a new template user identity and add the target secondary user identity to the target secondary identity set.

For example, the probability of matching between the target voiceprint feature of the target voice data and the template voiceprint feature 1 (corresponding to the template user identity 1) is 0.1, the probability of matching between the target voiceprint feature and the template voiceprint feature 2 (corresponding to the template user identity 2) is 0.2, and the probability of matching between the target voiceprint feature and the template voiceprint feature 3 (corresponding to the template user identity 3) is 0.2. If the preset probability threshold is 0.5, it is indicated that there is no matching result satisfying the matching condition in the above 3 matching results, then the server may recreate the target user identity (e.g., user identity 4) and recreate the target secondary identity for the target user identity, and set the recreated target secondary identity as a child identity of the primary identity.

Optionally, the foregoing describes a process of using the identity recognition model, and the following describes a training process of the identity recognition model, where the training process is illustrated by taking a model training performed by using a template user identity and corresponding template voice data as an example.

The server acquires template voice data of the template user identity, and generates a tag vector (called an identity tag vector) of the template voice data, wherein the identity tag vector is used for identifying the template user identity to which the template voice data belongs.

And acquiring an initial classification model, predicting the matching degree between the template voice data and at least one template user identity based on the initial classification model, and combining the acquired matching degree into an identity prediction vector.

The difference between the identity tag vector and the identity prediction vector is determined and used as a classification error, and the classification error is reversely propagated to the initial classification model to adjust model parameters in the initial classification model.

For example, 3 template user identities (template user identity 1, template user identity 2 and template user identity 3 respectively) are present, the template user identity 2 is currently trained, and then the identity tag vector of the template voice data of the template user identity 2 is [0,1,0]. If the initial classification model predicts that the matching degree between the template voice data of the template user identity 2 and the template user identity 1 is 0.4, the matching degree between the template voice data and the template user identity 2 is 0.3, and the matching degree between the template voice data and the template user identity 3 is 0.3, the identity prediction vector is [0.4,0.3,0.3]. The classification error may be (0-0.4) ²+(1-0.3)²+(0-0.3)² =0.41. And back-propagating the calculated classification errors to the initial classification model to adjust model parameters in the initial classification model.

The server can train the initial classification model continuously in the mode, and when the training times reach the frequency threshold or the variation of the parameters of the model adjusted in two adjacent times is smaller, the trained initial classification model can be used as the identity recognition model.

As can be seen from the foregoing, the server may newly create a target user identity and a target secondary identity, where the newly created target user identity is to be used as a new template user identity, in which case the server needs to retrain the identity model, and the new identity model is to be newly added with a class output based on the original identity model, and the newly added class output is used to output a probability that the voice data belongs to the newly added target user identity.

Fig. 7 is a flowchart of another data processing method according to an embodiment of the present application, where the data processing may include the following steps:

In step S501, the flow starts.

In step S502, the server acquires voice data.

Specifically, the current primary account (which can correspond to the primary identity in the application) logs in the client, that is, the primary account is in a valid state, and the server receives voice data sent by the client, wherein the voice data is data acquired by the client when the user inputs the sub-account through voice in the client.

In step S503, the server recognizes that the service intention of the voice data is to access the secondary account intention, and extracts the target voiceprint feature of the voice data based on the identity recognition model, and the specific process of extracting the target voiceprint feature can be referred to in step S402 in the corresponding embodiment of fig. 6.

In step S504, the server performs pattern matching on the extracted target voiceprint feature and the existing template voiceprint feature.

Step S505, the server determines whether the template voiceprint features matched with the target biological features exist in the existing template voiceprint features according to the pattern matching result, if so, step S507-step S508 are executed, and if not, step S506 and step S508 are executed.

In step S506, the server creates a new secondary account (which may correspond to the target secondary identity in the present application) according to the intention of accessing the secondary account, establishes an association between the secondary account and the extracted target voiceprint feature, and logs the newly created secondary account into the client.

Step S507, the server searches the secondary account corresponding to the voice print feature of the matched template (which can correspond to the target secondary identity in the application), the secondary account is the existing secondary account, and the server returns the service data under the secondary account to the client.

Step S508, the flow ends.

When the client in the above step is a video client and the video client is installed in the smart television, each family member sharing the smart television uniquely corresponds to a user identity and a secondary account (may correspond to the secondary identity in the present application) through voiceprint features, and the server may determine the viewing history, the focused movie and the voiceprint features of each family member based on the unique secondary account, so as to achieve personalized recommendation.

The following scenario is illustrated by taking an example that the user A has created the secondary account of the client, but the user B has not created the secondary account, wherein the client collects voice data input by the user and enters the sub-account.

Referring to fig. 8, a timing chart of another data processing method according to an embodiment of the present application is shown, where the data processing method includes the following steps:

In step S601, the client collects voice data of the user a.

Specifically, the current primary account logs in the client side in the terminal equipment, namely, the primary account is in a valid state, the user inputs a login sub-account to the voice of the client side, and the client side collects voice data of the user inputting the login sub-account by voice.

In step S602, the client sends the voice data to the server.

In step S603, the server determines the service intention through semantic recognition, matches the service intention to the corresponding secondary account through voiceprint recognition, logs in the secondary account to the client, searches the behavior data (for example, the historical viewing record of the user a, the video concerned, the video of the comment, the search record, etc.) under the secondary account, and generates recommendation data according to the behavior data.

In step S604, the server returns recommended data to the client.

In step S605, the client collects the voice data of the user B, namely 'login sub-account'.

In step S606, the client uploads the voice data of the user B to the server.

In step S607, the server determines the service intention through semantic recognition, and identifies the secondary account which is not matched with the corresponding secondary account through voiceprint, establishes a secondary account, takes the established secondary account as the sub-account of the primary account, and logs in the client with the established secondary account.

In step S608, the user B generates viewing behavior data (viewing video, video of interest, video of comments, search records, etc.) based on the newly created secondary account in the client.

In step S609, the client uploads the movie watching behavior data of the user B to the server.

In step S610, the server stores the movie watching behavior data in association with the newly-built secondary account for subsequent generation of personalized recommendation data for the user B.

Further, please refer to fig. 9, which is a schematic diagram illustrating a structure of a data processing apparatus according to an embodiment of the present application. As shown in fig. 9, the data processing apparatus 1 may be applied to the server in the embodiment corresponding to fig. 3 to 8, and the data processing apparatus 1 may include a first acquisition module 11, an identification module 12, a second acquisition module 13, and a determination module 14.

The first obtaining module 11 is configured to obtain target biological information when the first-level identity is in a valid state;

an identification module 12 for identifying a business intention and a target user identity corresponding to the target biometric information;

The second acquisition module 13 is used for acquiring a target secondary identity corresponding to the target user identity, wherein the target secondary identity is a sub-identity of the primary identity;

And the determining module 14 is used for executing the business instruction corresponding to the business intention based on the target secondary identity.

The specific functional implementation manners of the first acquiring module 11, the identifying module 12, the second acquiring module 13, and the determining module 14 may refer to step S101 to step S104 in the corresponding embodiment of fig. 3, which are not described herein.

Referring to fig. 9, the target bio-information includes target voice data;

the identification module 12 may include a conversion unit 121, a calling unit 122, and a first determination unit 123.

A conversion unit 121, configured to convert the target voice data into text data, and semantically identify the text data to obtain the service intention;

A calling unit 122, configured to call an identification model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity, where the identification model is a classification model generated according to the at least one template user identity and the template voice data corresponding to the at least one template user identity respectively;

A first determining unit 123, configured to, if at least one matching result has a matching result that satisfies a matching condition, take, as the target user identity, a template user identity corresponding to the matching result that satisfies the matching condition;

the second obtaining module 13 may include a first extracting unit 131.

The first extracting unit 131 is configured to extract, from a secondary identity set corresponding to the at least one template user identity, a target identity corresponding to the target user identity, where a secondary identity in the secondary identity set is a child identity of the primary identity.

The identification module 12 may further comprise a second determination unit 124.

A second determining unit 124, configured to create the target user identity if a matching result satisfying the matching condition does not exist in the at least one matching result, identify age information corresponding to the target voice data, and search an identity head portrait matching the age information in an image material library;

the second acquisition module 13 may include a second extraction unit 132.

The second extraction unit 132 is configured to create the target secondary user identifier for the target user identity, set the target secondary user identifier as a child identifier of the primary identity, and store the target user identity, the target secondary identity and the identity head portrait in an associated manner.

The specific processes of the conversion unit 121, the calling unit 122, the first determining unit 123, the second determining unit 124, the first extracting unit 131, and the second extracting unit 132 may be referred to as step S401-step S404 in the corresponding embodiment of fig. 6, and the detailed description thereof is omitted herein.

When the first determining unit 123 and the first extracting unit 131 determine the target user identity and the target secondary identity, the second determining unit 124 and the second extracting unit 132 do not perform the corresponding steps any more, and when the second determining unit 124 and the second extracting unit 132 determine the target user identity and the target secondary identity, the first determining unit 123 and the first extracting unit 131 do not perform the corresponding steps any more.

Referring to fig. 9, the identification model includes a feature generator and a pattern matcher;

The calling unit 122 may include an extraction subunit 1221 and a matching subunit 1222.

An extraction subunit 1221, configured to extract, based on the feature generator, a target voiceprint feature of the target voice data;

A matching subunit 1222, configured to determine, based on the pattern matcher, matching probabilities between the target voiceprint feature and at least one template voiceprint feature, and take the obtained matching probabilities as matching results;

The extracting subunit 1221 is specifically configured to extract, based on the feature generator, a spectral parameter and a linear prediction parameter of the target voice data, where the spectral parameter is a short-time spectral feature parameter of the target voice data, and the linear prediction parameter is a spectral fitting feature parameter of the target voice data, and obtain the target voiceprint feature according to the spectral parameter and the linear prediction parameter.

The specific process of the extracting subunit 1221 and the matching subunit 1222 may refer to step S402 in the corresponding embodiment of fig. 6, and will not be described herein.

Referring to fig. 9, the service intention includes a client secondary login object switching intention;

The determining module 14 includes a first generating unit 141.

The first generating unit 141 is configured to generate a switching instruction corresponding to the switching intention of the second-level login object of the client, and use the target second-level identity as the second-level login object of the client according to the switching instruction, where the switching instruction belongs to the service instruction.

The specific process of the first generating unit 141 may refer to step S104 in the corresponding embodiment of fig. 3, and will not be described herein.

Referring to fig. 9, the service intention includes a service data query intention;

the determining module 14 may include a second generating unit 142.

The second generating unit 142 is configured to generate a query instruction corresponding to the service data query intention, query target service data corresponding to the target secondary identity, and return the target service data to the client, where the query instruction belongs to the service instruction.

The specific process of the second generating unit 142 may refer to step S104 in the corresponding embodiment of fig. 3, which is not described herein.

Referring to fig. 9, the data processing apparatus 1 may include a first acquisition module 11, an identification module 12, a second acquisition module 13, and a determination module 14, and may further include a storage module 15, a training module 16, and a playing module 17.

The storage module 15 is configured to obtain behavior data of a user in the client corresponding to the target secondary identity, and store the behavior data and the target secondary identity in an associated manner, where the behavior data is used to generate recommended service data for the user.

The training module 16 is configured to obtain template voice data corresponding to an identity of a template user, generate an identity tag vector corresponding to the template voice data, obtain an initial classification model, predict a matching degree between the template voice data and the identity of the at least one template user based on the initial classification model, obtain an identity prediction vector according to the obtained matching degree, determine a classification error according to the identity tag vector and the identity prediction vector, and train the initial classification model according to the classification error to obtain the identity recognition model.

A playing module 17, configured to send a playing animation instruction to a client, when a matching result that meets the matching condition exists in the at least one matching result, to instruct the client to play a target animation;

the playing module 17 is further configured to send an instruction for stopping playing the animation to the client when the execution of the service instruction is completed, and instruct the client to close the target animation.

The specific process of the storage module 15, the training module 16, and the playing module 17 may refer to step S404 in the corresponding embodiment of fig. 6, which is not described herein.

Further, please refer to fig. 10, which is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The server in the foregoing embodiments of fig. 3-8 may be an electronic device 1000, as shown in fig. 10, where the electronic device 1000 may include a user interface 1002, a processor 1004, an encoder 1006, and a memory 1008. Signal receiver 1016 is used to receive or transmit data via cellular interface 1010, WIFI interface 1012, a. The encoder 1006 encodes the received data into a computer-processed data format. The memory 1008 has stored therein a computer program, by which the processor 1004 is arranged to perform the steps of any of the method embodiments described above. The memory 1008 may include volatile memory (e.g., dynamic random access memory, DRAM) and may also include non-volatile memory (e.g., one-time programmable read only memory, OTPROM). In some examples, the memory 1008 may further include memory located remotely from the processor 1004, which may be connected to the electronic device 1000 over a network. The user interface 1002 may include a keyboard 1018 and a display 1020.

In the electronic device 1000 shown in fig. 10, the processor 1004 may be configured to invoke the storage of a computer program in the memory 1008 to implement:

It should be understood that the electronic device 1000 described in the embodiment of the present invention may perform the description of the data processing method in the embodiment corresponding to fig. 3 to 8, and may also perform the description of the data processing apparatus 1 in the embodiment corresponding to fig. 9, which is not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.

It should be noted here that, in addition, the embodiment of the present invention further provides a computer storage medium, in which the aforementioned computer program executed by the data processing apparatus 1 is stored, and the computer program includes program instructions, when executed by the processor, can execute the description of the data processing method in the embodiment corresponding to fig. 3 to 8, and therefore, a detailed description will not be given here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer storage medium according to the present invention, please refer to the description of the method embodiments of the present invention.

Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored on a computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random-access Memory (Random Access Memory, RAM), or the like.

The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims

1. A data processing method, comprising:

When the primary identity is in a valid state, obtaining target biometric information, wherein the target biometric information includes target voice data;

Convert the target voice data into text data, semantically identify the text data, and obtain the business intent;

Calling an identity recognition model corresponding to the primary identity identifier to determine a matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated based on the at least one template user identity and the template voice data corresponding to the at least one template user identity;

If there is a matching result satisfying the matching condition in at least one matching result, taking the template user identity corresponding to the matching result satisfying the matching condition as the target user identity, sending a play animation instruction to the client, instructing the client to play the target animation;

Acquire a target secondary identity identifier corresponding to the target user identity; the target secondary identity identifier is a sub-identifier of the primary identity identifier;

Executing a business instruction corresponding to the business intention based on the target secondary identity;

When the execution of the service instruction is completed, a stop animation playing instruction is sent to the client, instructing the client to close the target animation.

2. The method according to claim 1, characterized in that the step of obtaining a target secondary identity corresponding to the target user identity comprises:

A target identity corresponding to the target user identity is extracted from a set of secondary identity identifiers corresponding to the at least one template user identity; the secondary identity identifiers in the set of secondary identity identifiers are sub-identifiers of the primary identity identifier.

3. The method according to claim 2, further comprising:

If there is no matching result satisfying the matching condition in the at least one matching result, creating the target user identity;

Identify age information corresponding to the target voice data, and search for an identity head image matching the age information in an image material library;

The step of obtaining a target secondary identity corresponding to the target user identity includes:

Creating the target secondary user identifier for the target user identity;

Setting the target secondary user identifier as a sub-identifier of the primary identity identifier;

The target user identity, the target secondary identity identifier and the identity avatar are stored in association.

4. The method according to claim 2, characterized in that the identity recognition model includes a feature generator and a pattern matcher;

The calling of the identity recognition model corresponding to the primary identity identifier to determine a matching result between the target voice data and at least one template user identity includes:

Based on the feature generator, extracting target voiceprint features of the target voice data;

Based on the pattern matcher, the matching probability between the target voiceprint feature and at least one template voiceprint feature is determined, and the obtained matching probabilities are all used as matching results; the at least one template voiceprint feature is the voiceprint feature corresponding to the at least one template voice data respectively.

5. The method according to claim 4, characterized in that extracting the target voiceprint features of the target speech data based on the feature generator comprises:

Based on the feature generator, extract the spectrum parameters and linear prediction parameters of the target speech data; the spectrum parameters are the short-time spectrum feature parameters of the target speech data; the linear prediction parameters are the spectrum fitting feature parameters of the target speech data;

The target voiceprint feature is obtained according to the spectrum parameter and the linear prediction parameter.

6. The method according to claim 2, further comprising:

Obtain template voice data corresponding to the template user identity;

Generating an identity tag vector corresponding to the template speech data;

Acquire an initial classification model, predict a matching degree between the template voice data and the at least one template user identity based on the initial classification model, and obtain an identity prediction vector according to the acquired matching degree;

A classification error is determined according to the identity label vector and the identity prediction vector, and the initial classification model is trained according to the classification error to obtain the identity recognition model.

7. The method according to claim 1, characterized in that the business intention includes a secondary login object switching intention of the client;

The executing the business instruction corresponding to the business intention based on the target secondary identity includes:

Generate a switching instruction corresponding to the secondary login object switching intention of the client; the switching instruction belongs to the business instruction;

According to the switching instruction, the target secondary identity is used as a secondary login object of the client.

8. The method according to claim 7, further comprising:

Acquire behavior data of a user corresponding to the target secondary identity in the client; the behavior data is used to generate recommended service data for the user;

The behavior data and the target secondary identity are associated and stored.

9. The method according to claim 1, characterized in that the business intention includes a business data query intention;

Generate a query instruction corresponding to the business data query intention; the query instruction belongs to the business instruction;

Query the target business data corresponding to the target secondary identity identifier, and return the target business data to the client.

10. The method according to claim 1, characterized in that the user rights possessed by the target secondary identity identifier are the same as the user rights possessed by the primary identity identifier.

11. A data processing device, comprising:

A first acquisition module, used for acquiring target biometric information when the primary identity identifier is in a valid state, wherein the target biometric information includes target voice data;

A recognition module is used to convert the target voice data into text data, semantically recognize the text data, and obtain the business intention; call the identity recognition model corresponding to the primary identity to determine the matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated according to the at least one template user identity and the template voice data corresponding to the at least one template user identity; if there is a matching result that meets the matching condition in at least one matching result, the template user identity corresponding to the matching result that meets the matching condition is used as the target user identity, and a play animation instruction is sent to the client to instruct the client to play the target animation;

A second acquisition module is used to acquire a target secondary identity identifier corresponding to the target user identity; the target secondary identity identifier is a sub-identifier of the primary identity identifier;

The determination module is used to execute a business instruction corresponding to the business intention based on the target secondary identity identifier; when the business instruction is executed, a stop animation playing instruction is sent to the client to instruct the client to close the target animation.

12. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the steps of the method according to any one of claims 1 to 10.

13. A computer storage medium, characterized in that the computer storage medium stores a computer program, wherein the computer program comprises program instructions, and when the program instructions are executed by a processor, the method according to any one of claims 1 to 10 is executed.