CN115830487A - Counterfeit detection method, device, storage medium and electronic equipment
- Publication number: CN115830487A (application CN202211226699.4A)
- Authority: CN (China)
- Prior art keywords: feature map, class, object feature, feature, attention
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Abstract
The specification discloses a forgery detection method, a forgery detection device, a storage medium, and an electronic apparatus. The method includes: extracting multi-class object feature maps of target object data through a forgery detection model; performing feature texture enhancement on the multi-class object feature maps to obtain texture enhancement feature maps; performing attention feature enhancement on the object feature maps through the forgery detection model to obtain attention feature maps; performing feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps to obtain object fusion features; and performing forgery detection on the object fusion features and outputting an object forgery detection result.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting forgery, a storage medium, and an electronic device.
Background
With the rapid development of computer technology, the cost of forging data such as images and videos keeps falling while the quality of the generated counterfeit data keeps rising, and counterfeit data can now be encountered everywhere. In daily applications it is necessary to accurately judge whether object data such as face images, face videos, and user images has been forged, and forgery detection of such data has therefore become an important part of everyday scenarios such as security and privacy protection, user authentication, and identity verification.
Disclosure of Invention
The specification provides a forgery detection method, a forgery detection device, a storage medium, and an electronic device. The technical scheme is as follows:
In a first aspect, the present specification provides a forgery detection method, the method comprising:
inputting target object data into a forgery detection model, extracting multi-class object feature maps of the target object data through the forgery detection model, and performing feature texture enhancement on the basis of the multi-class object feature maps to obtain texture enhancement feature maps;

performing attention feature enhancement on the object feature maps through the forgery detection model to obtain attention feature maps;

and performing feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps with the forgery detection model to obtain object fusion features, and outputting an object forgery detection result based on the object fusion features.
In a second aspect, the present specification provides a forgery detection device comprising:
a texture processing module, configured to input target object data into a forgery detection model, extract multi-class object feature maps of the target object data through the forgery detection model, and perform feature texture enhancement on the basis of the multi-class object feature maps to obtain texture enhancement feature maps;

an attention enhancement module, configured to perform attention feature enhancement on the object feature maps through the forgery detection model to obtain attention feature maps;

and a fusion processing module, configured to perform feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps through the forgery detection model to obtain object fusion features, and to output an object forgery detection result based on the object fusion features.
In a third aspect, the present specification provides a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to carry out the above-mentioned method steps.
In a fourth aspect, the present specification provides an electronic device, which may comprise: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical solutions provided by some embodiments of the present specification bring at least the following beneficial effects:
In one or more embodiments of the present description, an electronic device inputs target object data into a forgery detection model, extracts multi-class object feature maps of the target object data through the model, performs feature texture enhancement based on the multi-class object feature maps to obtain texture enhancement feature maps, performs attention feature enhancement on the object feature maps through the model to obtain attention feature maps, and performs feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps to obtain object fusion features. Forgery detection can then be performed accurately on the object fusion features, avoiding erroneous detection results: even in the presence of a counter-detection mechanism, the model can adaptively focus on regions with texture-information differences, adapting better to complex application scenarios and achieving effective detection. The global detection performance of the model in complex scenes is also improved, ensuring the robustness and generality of forgery detection.
Drawings
To illustrate the technical solutions in the present specification or in the prior art more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present specification; a person skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a scenario of a forgery detection system provided by the present specification;
FIG. 2 is a schematic flow chart of a counterfeit detection method provided herein;
FIG. 3 is a schematic diagram of a scenario of feature processing provided herein;
FIG. 4 is a schematic flow chart of a forgery detection method provided by the present specification;
FIG. 5 is a schematic view of a scene of feature map extraction according to the present specification;
FIG. 6 is a schematic view of a scenario for feature map determination in accordance with the present description;
FIG. 7 is a schematic diagram of a scene of a texture enhancement module according to the present disclosure;
FIG. 8 is a schematic flow chart of a forgery detection method provided by the present specification;
FIG. 9 is a schematic diagram of a scenario of model training of a forgery detection model provided in the present specification;
FIG. 10 is a schematic structural diagram of a forgery detection device provided herein;
FIG. 11 is a block diagram of a texture processing module provided herein;
FIG. 12 is a schematic structural diagram of an electronic device provided in this specification;
FIG. 13 is a block diagram of the operating system and user space provided herein;
FIG. 14 is an architectural diagram of the android operating system of FIG. 13;
FIG. 15 is an architectural diagram of the iOS operating system of FIG. 13.
Detailed Description
The technical solutions in the present specification will be described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present specification. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present specification.
In the description herein, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In the description of the present specification, it is to be noted that, unless explicitly stated or limited otherwise, "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus. The specific meanings of the above terms in the present specification can be understood in specific cases by those of ordinary skill in the art. Further, in the description of the present specification, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the related art, a machine learning model is often trained to perform forgery detection on the corresponding target data. However, most forged target data is generated by deep convolutional neural network models whose basic network structures are highly similar, and such data very likely embeds a mechanism that resists forgery detection, i.e., a counter-detection mechanism. As a result, the detection results of such machine learning models after actual deployment are often inaccurate, and their adaptability to practical application scenarios is weak.
The present specification will be described in detail with reference to specific examples.
Please refer to fig. 1, which is a schematic view of a scenario of a counterfeit detection system provided in the present specification. As shown in fig. 1, the forgery detection system can include at least a client cluster and a service platform 100.
The client cluster may include at least one client. As shown in fig. 1, it specifically includes client 1 corresponding to user 1, client 2 corresponding to user 2, ..., and client n corresponding to user n, where n is an integer greater than 0.
Each client in the client cluster may be a communication-enabled electronic device including, but not limited to: wearable devices, handheld devices, personal computers, tablet computers, in-vehicle devices, smart phones, computing devices or other processing devices connected to a wireless modem, and the like. Electronic devices in different networks may be called different names, such as: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote terminal, mobile device, user terminal, wireless communication device, user agent or user equipment, cellular telephone, cordless telephone, personal Digital Assistant (PDA), electronic device in a 5G network or future evolution network, and the like.
The service platform 100 may be a standalone server device, such as rack-mounted, blade, tower, or cabinet server equipment, or hardware with strong computing power such as a workstation or mainframe; it may also be a server cluster composed of multiple servers. The servers in such a cluster may be arranged symmetrically, each server being functionally equivalent and of equal status in a transaction link, and each server may provide services externally on its own, without the assistance of the other servers.
In one or more embodiments of the present description, the service platform 100 may establish a communication connection with at least one client in the client cluster and complete data interaction in the forgery detection process (for example, online transaction data interaction) over this connection. For example, the service platform 100 may deploy a forgery detection model obtained by the forgery detection method of the present description to multiple clients, assisting the clients to collect target object data in an actual forgery detection scenario and to perform forgery detection on the target object based on the model. As another example, the service platform 100 may obtain target object data collected by a client and perform forgery detection on the target object using the forgery detection model.
It should be noted that the service platform 100 establishes a communication connection with at least one client in the client cluster for interactive communication over a network, which may be a wireless network (including but not limited to a cellular network, a wireless local area network, an infrared network, or a Bluetooth network) or a wired network (including but not limited to Ethernet, Universal Serial Bus (USB), or a controller area network). In one or more embodiments of the specification, data exchanged over the network (e.g., object compressed packets) is represented using techniques and/or formats such as Hypertext Markup Language (HTML) and Extensible Markup Language (XML). All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), or Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may be used in place of, or in addition to, the above.
The embodiment of the forgery detection system provided in this specification and the forgery detection method in one or more embodiments belong to the same concept. The execution subject of the forgery detection method may be the electronic device corresponding to the service platform 100, or the electronic device corresponding to a client, determined by the actual application environment. The implementation process of the forgery detection system embodiment can be seen in the following method embodiments and is not repeated here.
Based on the scene diagram shown in fig. 1, the following describes in detail a counterfeit detection method provided in one or more embodiments of the present specification.
Referring to fig. 2, a flow chart of a forgery detection method is provided for one or more embodiments of the present disclosure. The method can be implemented by a computer program and can run on a forgery detection device based on the von Neumann architecture. The computer program may be integrated into an application or run as an independent tool application. The forgery detection device may be an electronic device.
Specifically, the forgery detection method includes:

S102: inputting target object data into a forgery detection model, extracting multi-class object feature maps of the target object data through the forgery detection model, and performing feature texture enhancement on the basis of the multi-class object feature maps to obtain texture enhancement feature maps;
It can be understood that, with the development of artificial intelligence, techniques such as deep learning can be used to forge object data, for example by replacing a facial region or tampering with a limb region in media data (an image or a video); it is difficult for a user to directly tell whether such an image or video has been forged.
In an actual transaction scenario, it is necessary to perform forgery detection on target object data in a corresponding forgery detection scenario, for example, to perform forgery detection processing on target object data such as a user object image and a user object video for a corresponding object part (e.g., a face and a limb part of an object).
Further, the target object data may be subjected to forgery detection based on a previously trained forgery detection model to verify whether the input target object data is subjected to forgery processing, so as to obtain an object forgery detection result for the target object data.
The forgery detection model can be obtained by model training of an initial forgery detection model created based on a machine learning model. The machine learning model can be implemented by fitting one or more of models such as a Convolutional Neural Network (CNN) model, a Deep Neural Network (DNN) model, a Recurrent Neural Network (RNN) model, an embedding model, a Gradient Boosting Decision Tree (GBDT) model, and a Logistic Regression (LR) model. The forgery detection model is obtained by training the initial model with offline sample object data until a training end condition is met; the trained model can then be deployed online to an actual forgery detection scenario to detect target object data.
Further, the forgery detection scenario may be any practical application requiring forgery detection of an object, such as finance, insurance, or security. For example, during user identity verification, a user generally needs to submit object data such as an identity card, a student card, or current face data in an application or on a website; the target object data may take the form of visual data carrying object information, such as videos, images, short videos, or animations.
In one or more embodiments of the present specification, the multi-class object feature maps of the target object data may be extracted by inputting the target object data into the forgery detection model and performing feature extraction on it layer by layer;

the multi-class object feature maps can be understood as the feature maps extracted for the same target object data at different stages of the feature extraction process. The feature extraction part of the forgery detection model usually has a multi-layer structure, and its different hierarchical levels extract different classes of object feature maps;
illustratively, the feature extraction part may be composed of a plurality of backbone networks (e.g., a plurality of backbone networks stacked) for feature extraction, and different backbone networks may extract different types of object feature maps, and the backbone networks may be based on fitting of one or more of the above machine learning models. Furthermore, the feature resolution and semantic expression degree of different classes of object feature maps are different.
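To make the structure concrete, below is a minimal sketch, assuming a stacked 3D convolutional backbone (the stage count, channel widths, and strides are illustrative assumptions, not the patented implementation), in which each stage's output serves as one class of object feature map:

```python
# A hypothetical stacked backbone: each stage halves the spatial resolution
# and deepens the channels, so later feature maps trade resolution for semantics.
import torch
import torch.nn as nn

class StackedBackbone(nn.Module):
    def __init__(self, channels=(16, 32, 64), in_ch=3):
        super().__init__()
        stages = []
        for out_ch in channels:
            stages.append(nn.Sequential(
                nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=(1, 2, 2), padding=1),
                nn.BatchNorm3d(out_ch),
                nn.ReLU(inplace=True),
            ))
            in_ch = out_ch
        self.stages = nn.ModuleList(stages)

    def forward(self, x):                  # x: (N, C, T, H, W) video clip
        feature_maps = []
        for stage in self.stages:
            x = stage(x)
            feature_maps.append(x)         # one class of object feature map per stage
        return feature_maps
```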
In one or more embodiments of the present description, a multi-class object feature map of target object data is extracted through a forgery detection model, and feature texture enhancement is performed on all or part of the multi-class object feature map to obtain a texture enhancement feature map;
For example, during forgery detection model processing, feature texture enhancement can be performed on every class of object feature map extracted by the backbone networks at the different hierarchical levels, yielding a texture enhancement map corresponding to each class of object feature map.

As another example, feature texture enhancement may be performed on only part of the object feature maps extracted by the backbone networks at the different hierarchical levels, yielding a texture enhancement map for each selected class. The selected maps may be preset, for example, as the first class of object feature map corresponding to the first backbone network and the second class of object feature map corresponding to the second backbone network.
Schematically, in performing feature texture enhancement on the object feature map, the forgery detection model mainly focuses on the texture feature information of the map and enhances it so as to amplify the forgery details associated with object forgery (such as face forgery and limb forgery).
In a possible implementation manner, the forgery detection model at least includes a plurality of backbone networks and a texture enhancement module. Extracting the multi-class object feature maps and performing feature texture enhancement based on them may proceed as follows:

extracting the multi-class object feature maps of the target object data through the backbone networks, determining at least one class of target object feature map from them, inputting each class of target object feature map into the texture enhancement module, enhancing texture information of each class of target object feature map through the texture enhancement module, and outputting the texture enhancement feature map corresponding to each class.
Schematically, fig. 3 is a scene schematic diagram of feature processing according to this specification; it illustrates a partial model structure and the data flow of a forgery detection model. The forgery detection model illustrated in fig. 3 may include a plurality of backbone networks, such as backbone network 1, backbone network 2, ..., backbone network n.
Further, from the object feature maps corresponding to the backbone networks, all or part may be selected and input to the texture enhancement module for texture enhancement, obtaining the corresponding texture enhancement feature maps.
In one or more embodiments of the present specification, an object feature map with high feature resolution usually retains more texture information such as forgery traces and forgery artifacts; in some embodiments, such a map is referred to as a shallow object feature map, e.g., object feature map 1 is usually a shallow object feature map.
In a feasible implementation manner, the texture enhancement module can process the input object feature map with operations such as convolution and pooling to enhance its texture information and obtain a texture enhancement feature map.

S104: performing attention feature enhancement on the object feature map through the forgery detection model to obtain an attention feature map;
In one or more embodiments of the present description, an attention feature map can be obtained by having the attention mechanism part of the forgery detection model perform feature extraction on at least one of the multi-class object feature maps;
Illustratively, the attention mechanism part can be at least one attention enhancement module layer, and the attention feature map can be obtained by performing attention feature enhancement on the object feature map through the at least one attention enhancement module layer.
Schematically, attention feature enhancement of the object feature map may be visual attention enhancement and feature extraction over visual regions at the visual level, which can be understood as enhanced visual attention focused on the easily forged regions of the object in the visual dimension, yielding an attention feature map. It may also be semantic attention enhancement and feature extraction over image/video semantic regions at the semantic level, yielding an attention feature map; the attention enhancement thus helps focus on regions sensitive to forgery differences.

S106: performing feature attention fusion on the object feature map, the texture enhancement feature map, and the attention feature map with the forgery detection model to obtain object fusion features, and outputting an object forgery detection result based on the object fusion features.
In one or more embodiments of the present specification, feature attention fusion may be performed on the object feature map, the texture enhancement feature map, and the attention feature map through the forgery detection model: attention pooling is applied to the three maps for multi-layer attention fusion, achieving feature-level adaptive focusing on texture-sensitive regions, and the resulting enhanced object fusion features are finally classified for forgery to obtain the object forgery detection result.
Illustratively, if global average fusion were applied to the object feature map, the texture enhancement feature map, and the attention feature map, the fused object fusion features would be dominated by the overall strength of the attention map and would ignore the differences between region ranges, which works against the goal of focusing on texture information; an attention pooling operation is therefore used as the optimization.
Illustratively, the object forgery detection result generally indicates either that the target object data is forged or that it is genuine.
In one or more embodiments of the present description, an electronic device inputs target object data into a forgery detection model, extracts multi-class object feature maps of the target object data through the model, performs feature texture enhancement based on the multi-class object feature maps to obtain texture enhancement feature maps, performs attention feature enhancement on the object feature maps through the model to obtain attention feature maps, and performs feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps to obtain object fusion features. Forgery detection can then be performed accurately on the object fusion features, avoiding erroneous detection results: even in the presence of a counter-detection mechanism, the model can adaptively focus on regions with texture-information differences, adapting better to complex application scenarios and achieving effective detection. The global detection performance of the model in complex scenes is also improved, ensuring the robustness and generality of forgery detection.
Referring to fig. 4, a schematic flow chart of another embodiment of the forgery detection method according to one or more embodiments of this specification. Specifically, the method comprises the following steps:

S202: inputting target object data into a forgery detection model, and extracting the multi-class object feature maps of the target object data through the respective backbone networks of the forgery detection model;
In one or more embodiments of the present specification, the plurality of backbone networks of the forgery detection model may form a backbone network structure composed of multiple stacked Backbone Layers;
in some embodiments, the object data may be a video type, for example, a section of facial video of a user's face is collected, then the network structure in the forgery detection model is a three-dimensional network structure, the Backbone network may be referred to as 3D Backbone networks, and when extracting features of the target object data, the 3D Backbone network also generally focuses on feature information of the target object data in a time dimension, an object feature legend corresponding to the 3D Backbone network may be in a form of "C × H × W × T", C is a channel parameter, H represents a length or height parameter, W represents a width parameter, and T represents a time parameter. It can be understood that the forgery detection model trained based on the object data of the video type can be applied to the object data of the image type, that is, the forgery detection can be performed on the object data of the image type.
In a possible implementation manner, taking three backbone networks as an example, the plurality of backbone networks may include a first backbone network 3D Backbone Layers 1, a second backbone network 3D Backbone Layers 2, and a third backbone network 3D Backbone Layers 3, and these backbone networks may be stacked.
Further, extracting the multi-class object feature maps of the target object data through the backbone networks may proceed as shown in fig. 5, a schematic view of feature map extraction according to the present specification:
extracting a first class object feature map of the target object data through the first backbone network 3D Backbone Layers 1, a second class object feature map through the second backbone network 3D Backbone Layers 2, and a third class object feature map through the third backbone network 3D Backbone Layers 3;
the feature resolution of the second class of object feature maps is smaller than that of the first class of object feature maps and larger than that of the third class of object feature maps, and the semantic expression degree of the second class of object feature maps is larger than that of the first class of object feature maps and smaller than that of the third class of object feature maps.
Illustratively, the features in the first class of object feature map are usually closest to the input data and contain more pixel-level information, such as color, texture, and edge features; that is, the feature resolution of the first class of object feature map is usually higher than that of the other maps, but its semantic expression degree is lower because it passes through fewer backbone network layers. The third class of object feature map, having passed through multiple backbone networks, typically has a higher semantic expression degree than the other maps.
Illustratively, each backbone network may be a fit of one or more of the machine learning models described in one or more embodiments of this specification. A 3D backbone network extracts object feature maps of the form "C × H × W × T" and is compatible with both image-type and video-type object data for feature map extraction.
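Reusing the StackedBackbone sketch from the earlier section, a short usage example (shapes illustrative) shows the trade-off described above: spatial resolution halves per stage while channel depth, a rough proxy for semantic capacity, grows:

```python
maps = StackedBackbone(channels=(16, 32, 64))(torch.randn(1, 3, 8, 112, 112))
for i, f in enumerate(maps, start=1):
    print(f"class-{i} object feature map:", tuple(f.shape))
# class-1: (1, 16, 8, 56, 56)  high resolution, shallow semantics
# class-2: (1, 32, 8, 28, 28)
# class-3: (1, 64, 8, 14, 14)  low resolution, deep semantics
```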
S204: determining at least one type of target object feature map from the multi-type object feature maps;
It can be understood that each backbone network extracts one corresponding class of object feature map;

schematically, as shown in fig. 6, a schematic view of feature map determination according to this specification, the multi-class object feature maps of the target object data (object feature map 1, object feature map 2, ..., object feature map n) are extracted through the backbone networks, and at least one class of target object feature map is then determined from them. In fig. 6, the determined target object feature maps are object feature map 1 and object feature map 2; these are input into the texture enhancement module, which outputs the texture enhancement feature map corresponding to each class of target object feature map;
In one or more embodiments of the present specification, it is considered that, as technology develops, the capability of forging data such as images and videos keeps improving through artificial intelligence techniques such as deep learning; forged object data can resist a large number of machine learning models in the related art, with ever better forgery details and global receptive fields. The forgery detection method of this specification therefore takes these factors into account: target object feature maps are selected from the multiple classes for texture enhancement, rather than focusing only on a shallow map with high feature resolution (such as the first class of object feature map), so that texture information can be mined from object feature maps with different feature resolutions and different semantic expression degrees and then further enhanced.
In a possible implementation manner, default object feature maps can be set based on the plurality of backbone networks of the forgery detection model; in the actual model application stage, the at least one default class of target object feature map is then determined from the multi-class object feature maps;
for example, assuming that the multi-class object feature maps may be a first class object feature map, a second class object feature map, and a third class object feature map, the default target object feature map may be the first class object feature map + the second class object feature map, or the first class object feature map + the third class object feature map, or the first class object feature map + the second class object feature map + the third class object feature map, or the first class object feature map.
Schematically, a first class object feature map and a second class object feature map are determined from the multi-class object feature maps, each is input into the texture enhancement module, and a first texture enhancement feature map and a second texture enhancement feature map are output; or,

a first class object feature map and a third class object feature map are determined, input into the texture enhancement module, and a first texture enhancement feature map and a third texture enhancement feature map are output; or,

a first class object feature map, a second class object feature map, and a third class object feature map are determined, input into the texture enhancement module, and the corresponding first, second, and third texture enhancement feature maps are output; or,

a first class object feature map alone is determined, input into the texture enhancement module, and the corresponding first texture enhancement feature map is output.
Optionally, a default target object feature map may be set from all object feature maps based on the type of transaction scenario to which the forgery detection model is applied;
in a possible implementation manner, the forgery detection model may be controlled to perform texture enhancement map prediction processing on the multiple types of object feature maps to obtain at least one type of target object feature map.
The texture enhancement map prediction process may be understood as predicting a target object feature map requiring texture enhancement from a plurality of types of object feature maps.
The texture enhancement map prediction processing may perform forgery feature recognition on an object feature map, that is, recognize forgery features from an object feature map that has not yet been texture-enhanced. This recognition is a preliminary scan for suspected forgery features (a suspected feature may turn out not to be forged). It yields recognition result data, including but not limited to the number of suspected forgery features, their distribution, and the area of suspected forged pixels. When the recognition result data indicates that suspected forgery factors are plentiful, fewer object feature maps can be selected for texture enhancement: plentiful suspected forgery factors mean the high-feature-resolution map already shows clear forgery features, and enhancing only that map suffices for the subsequent classification judgment. When the recognition result data indicates that suspected forgery factors are scarce or absent, more object feature maps can be selected for texture enhancement: scarce or absent factors mean the forgery features in the high-resolution map are faint or invisible, and a more advanced counter-forgery mechanism may be resisting detection.
Furthermore, a residual prediction probability can be evaluated from the recognition result data, quantizing the data: the more the recognition result data contains in terms of the number of suspected forgery features, their distribution, and the area of suspected forged pixels, the larger the residual prediction probability, and conversely the smaller it is.
Illustratively, a probability mapping relationship can be established between the amounts of each data type in the recognition result data (number of suspected forgery features, their distribution, area of suspected forged pixels) and the residual prediction probability; the residual prediction probability can then be obtained quickly from the current amount of each type via this mapping.
Optionally, the texture enhancement map prediction processing may use an object feature map with high feature resolution to obtain the residual prediction probability; such a map usually comes from the first backbone network or the first preset number (e.g., 2, 3, etc.) of backbone networks. It can be understood that as network depth increases, the feature resolution of the resulting object feature maps drops below that of these early backbone networks, making suspected forgery features harder to find.
Optionally, a texture prediction mapping relationship may be established between a plurality of reference prediction probability ranges and reference object feature maps, with different ranges corresponding to different numbers and types of reference object feature maps. In the actual model processing stage, after the residual prediction probability of an object feature map is obtained, the target range into which it falls is determined, and the target object feature maps indicated by that range in the texture prediction mapping relationship are obtained.
In a possible implementation manner, performing texture enhancement map prediction processing on the multi-class object feature maps to obtain at least one class of target object feature map may be: performing artifact residual detection on an object feature map to obtain recognition result data (including the number of suspected forgery features, their distribution, and the area of suspected forged pixels), obtaining the residual prediction probability of the object feature map from the recognition result data, and then determining at least one class of target object feature map from the multi-class object feature maps based on the residual prediction probability.
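A hedged sketch of this selection step follows; the thresholds and map subsets are invented for illustration, since the specification leaves the concrete texture prediction mapping open:

```python
def select_target_maps(residual_prob, feature_maps):
    """Pick which object feature maps to texture-enhance from the residual
    prediction probability (higher probability = plentiful suspected-forgery
    cues in the shallow map, so fewer maps need enhancement)."""
    if residual_prob >= 0.7:
        indices = [0]          # shallow, high-resolution map is enough
    elif residual_prob >= 0.3:
        indices = [0, 1]
    else:
        indices = [0, 1, 2]    # cues scarce or absent: enhance more maps
    return [feature_maps[i] for i in indices]
```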
Optionally, the at least one class of target object feature map may include a first object feature map, such as the first class of object feature map.
Optionally, a map prediction module may be configured in the forgery detection model before the texture enhancement module. The map prediction module performs texture enhancement map prediction processing on the multi-class object feature maps to obtain the at least one class of target object feature map; that is, its input is the object feature maps and its output is the at least one class of target object feature map, which is then input to the texture enhancement module.
S206: inputting the various target object feature maps into the texture enhancement module, and respectively outputting texture enhancement feature maps corresponding to the various target object feature maps;
Illustratively, the texture enhancement module may include a pooling layer and a dense convolution layer based on a residual structure, as shown in fig. 7, a scene schematic diagram of the texture enhancement module of this specification. The input of the texture enhancement module is a (target) object feature map and the output is the corresponding texture enhancement feature map. The pooling layer may be the Average Pooling shown in fig. 7, and the residual-structure dense convolution layer may be the 3D Dense Block shown in fig. 7. It can be understood that the dense convolution layer is a 3D dense convolution layer based on a residual structure, suitable for texture enhancement of object feature maps from both video-type and image-type object data; when enhancing the target object feature map, the 3D Dense Block also attends to the texture characteristics of the target object data along the time dimension.
In a possible implementation manner, inputting the various classes of target object feature maps into the texture enhancement module and outputting the corresponding texture enhancement feature maps may be: inputting each class of target object feature map into the texture enhancement module, performing feature average pooling on it through the pooling layer to obtain a pooled feature map, and fitting the target object feature map and the pooled feature map through the dense convolution layer to obtain the texture enhancement feature map corresponding to the target object feature map.
Illustratively, the pooling layer performs average pooling on the target object feature map to obtain a pooled feature map; the pooled feature map is then combined with the high-frequency part of the original target object feature map and input into the residual-structure dense convolution layer for processing, yielding the texture enhancement feature map corresponding to the target object feature map.
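A minimal sketch of that idea, assuming even spatial dimensions and illustrative layer sizes (the real 3D Dense Block would be denser than the two convolutions here): average pooling smooths away high-frequency content, subtracting the smoothed map isolates the high-frequency texture part, and a small residual convolution stack refines it:

```python
import torch
import torch.nn as nn

class TextureEnhance(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.pool = nn.AvgPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.up = nn.Upsample(scale_factor=(1, 2, 2), mode='trilinear',
                              align_corners=False)
        self.dense = nn.Sequential(              # stand-in for the 3D Dense Block
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(ch, ch, 3, padding=1),
        )

    def forward(self, x):                        # x: (N, C, T, H, W)
        low = self.up(self.pool(x))              # low-frequency (smoothed) content
        high = x - low                           # high-frequency texture residual
        return x + self.dense(high)              # residual fit of texture information
```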
Illustratively, by focusing on the texture feature information of the object feature map and enhancing it, the forgery details associated with object forgery (such as face forgery and limb forgery) are amplified, enabling better subsequent forgery classification.

S208: performing attention feature enhancement on the object feature map through the forgery detection model to obtain an attention feature map;
In one or more embodiments of the present specification, the forgery detection model may include an attention enhancement module. Specifically, at least one class of reference object feature map may be selected from the multi-class object feature maps, and the attention mechanism part of the attention enhancement module performs attention enhancement on the texture-sensitive difference features in the at least one class of reference object feature map to obtain the attention feature map.
Optionally, the reference object feature map may be a default object feature map selected from the multi-class object feature maps, determined in advance during the training stage of the forgery detection model.

Schematically, attention feature enhancement of the object feature map may be visual attention enhancement and feature extraction over visual regions at the visual level, which can be understood as enhanced visual attention focused on the easily forged regions of the object in the visual dimension, yielding an attention feature map; it may also be semantic attention enhancement and feature extraction over image/video semantic regions at the semantic level, yielding an attention feature map.
Optionally, the attention enhancement module may be a 3D Attention module with a single-layer attention mechanism. When performing attention enhancement on the texture-sensitive difference features in the reference object feature map, the attention enhancement module also attends to the feature information of the reference object feature map along the time dimension; the attention feature map is obtained by the attention mechanism part of the module performing attention enhancement on the texture-sensitive difference features in the at least one class of reference object feature map.
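A single-layer attention sketch under assumptions (the specification does not fix the exact 3D Attention architecture; a 1×1×1 convolution scoring every spatio-temporal position is one common form):

```python
import torch
import torch.nn as nn

class Attention3D(nn.Module):
    """Single-layer 3D attention: a 1x1x1 conv scores every (t, h, w) position."""
    def __init__(self, ch, num_maps=4):
        super().__init__()
        self.score = nn.Conv3d(ch, num_maps, kernel_size=1)

    def forward(self, x):                    # x: (N, C, T, H, W) reference features
        attn = torch.sigmoid(self.score(x))  # (N, num_maps, T, H, W) attention maps
        attended = attn[:, :1] * x           # features re-weighted by the first map
        return attn, attended
```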
S210: and performing feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map by adopting the forgery detection model to obtain object fusion features, and outputting an object forgery detection result based on the object fusion features.
In one or more embodiments of the present specification, feature attention fusion may be performed on the object feature map, the texture enhancement feature map, and the attention feature map through the forgery detection model: attention pooling is applied to the three maps for multi-layer attention fusion, achieving feature-level adaptive focusing on texture-sensitive regions, and the resulting enhanced object fusion features are finally classified for forgery to obtain the object forgery detection result.
Illustratively, if global average fusion were applied to the object feature map, the texture enhancement feature map, and the attention feature map, the fused object fusion features would be dominated by the overall strength of the attention map and would ignore the differences between region ranges, which works against the goal of focusing on texture information; an attention pooling operation is therefore used as the optimization.
Illustratively, the forgery detection model may include an attention pooling module, and bilinear attention pooling is applied to the object feature map, the texture enhancement feature map, and the attention feature map through this module to perform feature attention fusion. During feature attention fusion, the three maps are mapped to the same map resolution and then fused to obtain the object fusion features, which are input into the classifier to classify the input object as forged or genuine.
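A sketch of bilinear attention pooling in the form common in fine-grained recognition, assumed (not confirmed by the specification) to match the module here: each attention map weights the features, and the normalized weighted sums are concatenated into the object fusion feature passed to the classifier:

```python
import torch

def bilinear_attention_pool(features, attn_maps, eps=1e-6):
    """features: (N, C, T, H, W); attn_maps: (N, M, T, H, W), already resampled
    to the same resolution. Returns an (N, M*C) object fusion feature."""
    parts = []
    for k in range(attn_maps.shape[1]):
        a = attn_maps[:, k:k + 1]                        # (N, 1, T, H, W)
        pooled = (a * features).sum(dim=(2, 3, 4))       # attention-weighted sum
        pooled = pooled / (a.sum(dim=(2, 3, 4)) + eps)   # normalize by attention mass
        parts.append(pooled)                             # (N, C)
    return torch.cat(parts, dim=1)
```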
Illustratively, the object forgery detection result generally indicates either that the target object data is forged or that it is genuine. For example, when the target object data is a user face video, the object forgery detection result states whether the face video is real or forged.
In one or more embodiments of the present description, an electronic device inputs target object data into a forgery detection model, extracts multi-class object feature maps of the target object data through the model, performs feature texture enhancement based on the multi-class object feature maps to obtain texture enhancement feature maps, performs attention feature enhancement on the object feature maps through the model to obtain attention feature maps, and performs feature attention fusion on the object feature maps, the texture enhancement feature maps, and the attention feature maps to obtain object fusion features. Forgery detection can then be performed accurately on the object fusion features, avoiding erroneous detection results: even in the presence of a counter-detection mechanism, the model can adaptively focus on regions with texture-information differences, adapting better to complex application scenarios and achieving effective detection. The global detection performance of the model in complex scenes is also improved, ensuring the robustness and generality of forgery detection.
Referring to fig. 8, fig. 8 is a schematic flow chart of another embodiment of a counterfeit detection method according to one or more embodiments of the present disclosure. Specifically, the method comprises the following steps:
S302: creating an initial counterfeit detection model comprising an object data decoder, inputting sample object data into the initial counterfeit detection model for model training, determining a counterfeit classification loss of the initial counterfeit detection model, and determining an object restoration prediction loss through the object data decoder;
the object data decoder is configured to generate object restoration data for the sample object data based on the object fusion features in the model training process, for example, if the sample object data is a sample object image of an object, the object restoration data is an object restoration image for the sample object image generated based on the object fusion features, the object restoration image is an image generated by re-decoding the object fusion features based on the object data decoder, and the object restoration image corresponds to the original sample object image.
In one or more embodiments of the present disclosure, an initial counterfeit detection model including an object data decoder is created in advance. The initial counterfeit detection model may be created based on a machine learning model and then obtained through model training, where the machine learning model may be implemented by one or a combination of a Convolutional Neural Network (CNN) model, a Deep Neural Network (DNN) model, a Recurrent Neural Network (RNN) model, an embedding model, a Gradient Boosting Decision Tree (GBDT) model, a Logistic Regression (LR) model, and other machine learning models.
Illustratively, a large amount of sample object data may be obtained in advance; the sample object data may be a sample object image of an image type or a sample object video of a video type, and each piece of sample object data may be labeled with an authenticity label, that is, labeled as real object data or counterfeit object data.
Illustratively, the initial counterfeit detection model is trained offline on the sample object data: each piece of sample object data is input into the initial counterfeit detection model for model training. During training, the counterfeit classification loss of the initial counterfeit detection model is determined, and the object restoration prediction loss is determined through the object data decoder; the counterfeit classification loss and the object restoration prediction loss may be combined to obtain a total model loss, and model parameters of the initial counterfeit detection model are adjusted by a back propagation algorithm based on the total model loss until the initial counterfeit detection model satisfies the model training end condition. The trained counterfeit detection model is thereby obtained, and may then be deployed online to an actual counterfeit detection scene to perform counterfeit detection on target object data.
In one or more embodiments of the present specification, the model training end condition may include, for example, the value of the loss function being less than or equal to a preset loss function threshold, the number of iterations reaching a preset threshold, and the like. The specific model training end condition may be determined based on actual conditions and is not described herein again.
In one or more embodiments of the present disclosure, after the initial counterfeit detection model satisfies the model training end condition, the object data decoder in the initial counterfeit detection model may be discarded while the rest of the model network structure is retained, so as to generate the trained counterfeit detection model.
In a possible implementation, the determining, by the object data decoder, the object restoration prediction loss may be:
the initial counterfeit detection model typically includes multiple rounds of model training, each round of model training performing model training based at least on one or a collection of sample object data,
a2, in each round of model training, controlling the initial counterfeit detection model to perform object restoration processing on the sample object data through the object data decoder to obtain object restoration data;
Illustratively, after the initial forgery detection model receives the input sample object data, it performs the following: extracting multi-class object feature maps of the sample object data through the initial forgery detection model, and performing feature texture enhancement based on the multi-class object feature maps to obtain a texture enhancement feature map; performing attention feature enhancement on the object feature map through the initial forgery detection model to obtain an attention feature map; and performing feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map by adopting the initial forgery detection model to obtain object fusion features, and performing forgery detection based on the object fusion features to obtain a forgery identification result of the sample object data;
a4, controlling the initial forgery detection model to perform loss calculation by adopting a comparison loss function based on the object restoration data and the sample object data to obtain the object restoration prediction loss L_2.
The comparison loss function may be a margin loss function (Margin Loss); for its specific form, reference may be made to margin loss functions in the related art. The comparison loss function may also be a decoding loss function based on Euclidean distance.
Controlling the initial forgery detection model to calculate the forgery classification loss L_1 based on the forgery recognition result for the sample object data and the forgery reference result for the sample object data.
In one possible embodiment, the determining of the forgery classification loss of the initial forgery detection model may be: determining a forgery identification result of the initial forgery detection model for the sample object data, and acquiring a forgery reference result for the sample object data; and performing loss calculation by adopting a cross entropy loss function based on the forgery identification result and the forgery reference result to obtain the forgery classification loss L_1.
The forgery reference result is the authenticity label previously marked on the sample object data;
The forgery identification result is the object forgery detection result output by the initial forgery detection model for the sample object data in the current round;
the input of the cross entropy loss function is a fake reference result and a fake identification result, and the output is fake classification loss.
S304: adjusting model parameters of the initial counterfeit detection model based on the counterfeit classification loss and the object restoration prediction loss until a model training end condition is met, so as to obtain a counterfeit detection model without the object data decoder.
In a possible implementation, the performing model parameter adjustment on the initial counterfeit detection model based on the counterfeit classification loss and the object restoration prediction loss may be: inputting the counterfeit classification loss and the object restoration prediction loss into a model comprehensive calculation formula to output a model comprehensive loss, and adjusting model parameters of the initial counterfeit detection model by adopting the model comprehensive loss;
the model comprehensive calculation formula satisfies the following formula:
L_total = a * L_1 + b * L_2

where L_total is the model comprehensive loss, L_1 is the counterfeit classification loss, a is a first hyperparameter weighting the counterfeit classification loss, L_2 is the object restoration prediction loss, and b is a second hyperparameter weighting the object restoration prediction loss.
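As a worked illustration with assumed values (not values prescribed by this specification): with a = 1.0, b = 0.5, L_1 = 0.40 and L_2 = 0.20, the model comprehensive loss is L_total = 1.0 * 0.40 + 0.5 * 0.20 = 0.50. In code, a minimal sketch might be:

```python
def model_comprehensive_loss(l1, l2, a=1.0, b=0.5):
    # L_total = a * L_1 + b * L_2; the default hyperparameter values
    # here are placeholders, not values fixed by this specification.
    return a * l1 + b * l2
```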
Illustratively, in each round of model training for the initial counterfeit detection model, the model comprehensive loss is calculated, and the parameters of the model network, including the weight values and thresholds of the model network, are adjusted based on the model comprehensive loss. Training is completed once the initial counterfeit detection model reaches the training end condition and the model network converges; the object data decoder in the initial counterfeit detection model is then discarded while the other network structures are retained, so as to obtain the trained counterfeit detection model for counterfeit detection. Notably, the object data decoder generates object restoration data in each round; the object restoration data visually feeds back the processing capacity of the initial counterfeit detection model, so that the counterfeit detection capability of the model can be monitored in real time during each round of training.
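Putting the round-by-round procedure together, a schematic training loop might look as follows. The assumption that the model returns both the classification logits and the object restoration data, the Adam optimizer, the hyperparameter values, and the way the decoder is discarded are all illustrative choices of this sketch:

```python
import torch
import torch.nn.functional as F

def train_initial_model(model, loader, epochs=10, a=1.0, b=0.5, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):                   # each round of model training
        for sample, label in loader:          # sample object data + authenticity label
            logits, restored = model(sample)  # forgery logits + object restoration data
            l1 = F.cross_entropy(logits, label)        # forgery classification loss
            l2 = torch.mean((restored - sample) ** 2)  # object restoration prediction loss
            loss = a * l1 + b * l2                     # model comprehensive loss
            opt.zero_grad()
            loss.backward()                   # back propagation
            opt.step()                        # model parameter adjustment
    model.decoder = None  # discard the object data decoder, keep the rest
    return model
```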
For a better understanding of the forgery detection method described in this specification, the model training process of the forgery detection model is illustrated below:
referring to fig. 9, fig. 9 is a schematic view of a scenario of performing model training on a forgery detection model according to the present specification, where an initial forgery detection model may be created in advance, and the initial forgery detection model may include a plurality of backbone networks, a texture enhancement module, an attention pooling module, a classifier, and a data decoder;
Schematically, in fig. 9, the number of backbone networks may be three. The plurality of backbone networks may include a first backbone network, a second backbone network and a third backbone network, which may be, respectively, 3d Backbone layer 1, 3d Backbone layer 2 and 3d Backbone layer 3, and the backbone networks may be stacked. The texture enhancement module may include an average pooling layer (Average Pooling) and a dense convolution layer based on a residual structure (3D Dense block). The attention enhancement module may be a 3D Attention module, which may be an attention module with a single-layer attention mechanism. The attention pooling module may be a Bilinear attention pooling module, used for performing feature attention fusion through a bilinear attention pooling operation (Bilinear Attention Pooling). The Classifier is used for performing forgery classification according to the sample object fusion features. The data decoder may be an SD decoder for generating object restoration data for the sample object data based on the object fusion features during model training;
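Purely as a structural map from fig. 9 to code, the composition of the initial forgery detection model could be sketched as follows; every submodule is a placeholder (`nn.Identity`) standing in for the real component, and all names are invented for this sketch:

```python
import torch.nn as nn

class InitialForgeryDetectionModel(nn.Module):
    """Hypothetical composition mirroring fig. 9 (placeholders only)."""
    def __init__(self):
        super().__init__()
        self.backbone1 = nn.Identity()            # 3d backbone layer 1
        self.backbone2 = nn.Identity()            # 3d backbone layer 2
        self.backbone3 = nn.Identity()            # 3d backbone layer 3
        self.texture_enhancement = nn.Identity()  # average pooling + 3D dense block
        self.attention_3d = nn.Identity()         # single-layer attention mechanism
        self.attention_pooling = nn.Identity()    # bilinear attention pooling
        self.classifier = nn.Identity()           # real / forged classification
        self.decoder = nn.Identity()              # SD decoder, dropped after training
```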
Illustratively, a large amount of sample object data is acquired in advance; the sample object data may be sample object videos of a video type. Each piece of sample object data is input into an initial forgery detection model such as the one shown in fig. 9 for model training. In the model training process, multi-class object feature maps of the sample object data are extracted through the respective backbone networks (the first backbone network, the second backbone network and the third backbone network), namely a first class object feature map corresponding to the first backbone network, a second class object feature map corresponding to the second backbone network and a third class object feature map corresponding to the third backbone network;
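To make the stacked extraction concrete, the sketch below uses one shallow 3D convolution stage per backbone; the widths, strides, and input size are assumptions, chosen only to show the first, second and third class feature maps decreasing in resolution (and, in a real model, increasing in semantic level):

```python
import torch
import torch.nn as nn

def backbone_stage(cin, cout):
    # Stand-in for one stacked "3d backbone layer"; a real backbone
    # would be a much deeper 3D CNN.
    return nn.Sequential(
        nn.Conv3d(cin, cout, kernel_size=3, stride=(1, 2, 2), padding=1),
        nn.BatchNorm3d(cout),
        nn.ReLU(inplace=True),
    )

backbone1, backbone2, backbone3 = (
    backbone_stage(3, 32), backbone_stage(32, 64), backbone_stage(64, 128))

video = torch.randn(2, 3, 8, 112, 112)  # (batch, channels, frames, H, W), assumed size
f1 = backbone1(video)  # first class object feature map:  (2, 32, 8, 56, 56)
f2 = backbone2(f1)     # second class object feature map: (2, 64, 8, 28, 28)
f3 = backbone3(f2)     # third class object feature map:  (2, 128, 8, 14, 14)
```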
further, in the model training process of each wheel pair sample object data, at least one type of target object feature map is determined from the first type of object feature map, the second type of object feature map and the third type of object feature map, as shown in fig. 9, the target object feature maps determined in fig. 9 are also the first type of object feature map and the second type of object feature map;
The target object feature maps are then input into the texture enhancement module, which outputs the texture enhancement feature map corresponding to each target object feature map. Schematically, feature average pooling is performed on the target object feature map through the pooling layer of the texture enhancement module to obtain a pooled feature map, and the target object feature map and the pooled feature map are fitted through the dense convolution layer of the texture enhancement module to obtain the texture enhancement feature map corresponding to the target object feature map.
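One plausible reading of this texture enhancement step is sketched below, assuming 5D video feature maps. The interpretation that the dense convolution layer fits the residual between the feature map and its pooled (texture-smoothed) version is this sketch's reading of the text, and the two-convolution "dense" stand-in is an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureEnhancement3D(nn.Module):
    """Hypothetical texture enhancement: average pooling + dense conv layer."""
    def __init__(self, channels, pool=2):
        super().__init__()
        self.pool = nn.AvgPool3d(kernel_size=(1, pool, pool))
        # Stand-in for the residual-structure 3D dense block.
        self.dense = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, feature_map):               # (B, C, T, H, W)
        pooled = self.pool(feature_map)           # smoothed map, fine texture removed
        up = F.interpolate(pooled, size=feature_map.shape[2:],
                           mode="trilinear", align_corners=False)
        residual = feature_map - up               # texture-sensitive residual
        return self.dense(residual) + residual    # texture enhancement feature map
```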
Schematically, in each round of model training on the sample object data, attention feature enhancement is further performed on one or more object feature maps output by the backbone networks to obtain an attention feature map; as shown in fig. 9, attention feature enhancement is performed on the second class object feature map output by the second backbone network to obtain the attention feature map;
Optionally, at least one class of reference object feature map may be selected from the multi-class object feature maps, and attention enhancement is performed on the texture-sensitive difference features in the at least one class of reference object feature map through the attention mechanism in the attention enhancement module to obtain the attention feature map.
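A minimal sketch of such a single-layer attention mechanism; the use of a 1x1x1 convolution followed by a sigmoid to produce per-location attention weights, and the number of attention maps, are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class Attention3D(nn.Module):
    """Hypothetical single-layer 3D attention over a reference feature map."""
    def __init__(self, channels, num_maps=4):
        super().__init__()
        # A single convolution layer producing `num_maps` attention maps.
        self.conv = nn.Conv3d(channels, num_maps, kernel_size=1)

    def forward(self, reference_feature_map):  # (B, C, T, H, W)
        # Weights near 1 emphasize texture-sensitive difference regions.
        return torch.sigmoid(self.conv(reference_feature_map))
```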
Schematically, in each round of model training, feature attention fusion is performed on the object feature map, the texture enhancement feature map and the attention feature map by the attention pooling module to obtain sample object fusion features, and the sample object fusion features are then input into the classifier to perform forgery classification of the input object, where the object forgery detection result is usually an object forged result or an object genuine result for the sample object data.
In each round of model training, object restoration data for the sample object data is also generated by the object data decoder based on the sample object fusion features; for example, if the sample object data is a sample object image of a certain object, the object restoration data is an object restoration image generated for that sample object image based on the object fusion features.
Further, the forgery classification loss L_1 of the initial forgery detection model in each round is determined, and the object restoration prediction loss L_2 is determined from the object restoration image. The forgery classification loss and the object restoration prediction loss are input into the model comprehensive calculation formula to output the model comprehensive loss, and model parameters of the initial forgery detection model are adjusted by adopting the model comprehensive loss until the initial forgery detection model reaches the training end condition; training is then finished and the model network converges, the object data decoder in the initial forgery detection model is discarded, and the other network structures are retained to obtain the trained forgery detection model for forgery detection.
In one or more embodiments of the present description, an electronic device inputs target object data into a counterfeit detection model, extracts multi-class object feature maps of the target object data through the counterfeit detection model, performs feature texture enhancement based on the multi-class object feature maps to obtain a texture enhancement feature map, performs attention feature enhancement on the object feature map through the counterfeit detection model to obtain an attention feature map, and performs feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map to obtain object fusion features. Counterfeit detection can thus be performed accurately based on the object fusion features, avoiding erroneous counterfeit detection results; by adaptively focusing on texture information difference regions, the model resists counterfeit counter-detection mechanisms and better adapts to complex application scenes to realize effective detection. The global detection effect of the model in complex scenes is also improved, and the robustness and universality of counterfeit detection are ensured.
The forgery detection apparatus provided in the present specification will be described in detail below with reference to fig. 10. It should be noted that the forgery detection apparatus shown in fig. 10 is used for executing the methods of the embodiments shown in fig. 1 to 9 of the present specification; for convenience of description, only the parts relevant to the present specification are shown. For specific technical details not disclosed herein, please refer to the embodiments shown in fig. 1 to 9 of the present specification.
Referring to fig. 10, a schematic structural diagram of a forgery detection apparatus according to the present specification is shown. The forgery detection apparatus 1 may be implemented by software, hardware, or a combination of both as all or part of a user terminal. According to some embodiments, the forgery detection apparatus 1 comprises a texture processing module 11, an attention enhancement module 12 and a fusion processing module 13, which are specifically configured as follows:
the texture processing module 11 is configured to input target object data into a counterfeit detection model, extract multiple types of object feature maps of the target object data through the counterfeit detection model, and perform feature texture enhancement based on the multiple types of object feature maps to obtain a texture enhancement feature map;
an attention enhancement module 12, configured to perform attention feature enhancement on the object feature map through the counterfeit detection model to obtain an attention feature map;
and a fusion processing module 13, configured to perform feature attention fusion on the object feature map, the texture enhancement feature map, and the attention feature map by using the forgery detection model to obtain an object fusion feature, and output an object forgery detection result based on the object fusion feature.
Optionally, the forgery detection model includes a plurality of backbone networks and a texture enhancement module, as shown in fig. 11, where the texture processing module 11 includes:
a feature map extracting unit 111, configured to extract, through each of the backbone networks, a multi-class object feature map of the target object data;
a texture enhancing unit 112, configured to determine at least one class of target object feature map from the multiple classes of object feature maps, input each class of target object feature map into the texture enhancement module, and output a texture enhancement feature map corresponding to each class of target object feature map.
Optionally, the feature map extracting unit 111 is configured to:
determining at least one default class of target object feature map from the multiple classes of object feature maps; or, alternatively,
performing texture enhancement map prediction processing on the multi-class object feature maps to obtain at least one class of target object feature map.
Optionally, the texture enhancing unit 112 is configured to:
carrying out artifact residual detection on the object feature map to obtain a residual prediction probability corresponding to the object feature map, and determining at least one class of target object feature map from the multiple classes of object feature maps based on the residual prediction probability.
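As a sketch of this selection step, assume a small prediction head that maps each class of object feature map to a residual prediction probability (the head itself, and the rule of keeping the top-k classes, are illustrative assumptions):

```python
import torch

def select_target_feature_maps(feature_maps, residual_head, k=2):
    """feature_maps: list of per-class object feature maps.
    residual_head: hypothetical callable returning, per feature map, the
    predicted probability that forgery artifacts leave residuals in it."""
    probs = torch.stack([residual_head(f).mean() for f in feature_maps])
    order = torch.argsort(probs, descending=True)
    # Keep the k classes most likely to carry artifact residuals.
    return [feature_maps[i] for i in order[:k].tolist()]
```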
Optionally, the texture enhancement module includes a pooling layer and a dense convolution layer based on a residual structure, and the texture enhancement unit 112 is configured to:
inputting the various classes of target object feature maps into the texture enhancement module respectively, and performing feature average pooling on the target object feature map through the pooling layer to obtain a pooled feature map;
and fitting the target object feature map and the pooled feature map through the dense convolution layer to obtain the texture enhancement feature map corresponding to the target object feature map.
Optionally, the multiple backbone networks include a first backbone network, a second backbone network, and a third backbone network, and the feature map extracting unit 111 is configured to:
extracting a first class object feature map of the target object data through the first backbone network, extracting a second class object feature map of the target object data through the second backbone network, and extracting a third class object feature map of the target object data through the third backbone network,
the feature resolution of the second class of object feature maps is smaller than that of the first class of object feature maps and larger than that of the third class of object feature maps, and the semantic expression degree of the second class of object feature maps is larger than that of the first class of object feature maps and smaller than that of the third class of object feature maps.
Optionally, the texture enhancing unit 112 is configured to:
determining a first class object feature map and a second class object feature map from the multi-class object feature maps, respectively inputting the first class object feature map and the second class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map and a second texture enhancement feature map corresponding to the second class object feature map; or, alternatively,
determining a first class object feature map, a second class object feature map and a third class object feature map from the multiple classes of object feature maps, respectively inputting the first class object feature map, the second class object feature map and the third class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map, a second texture enhancement feature map corresponding to the second class object feature map and a third texture enhancement feature map corresponding to the third class object feature map; or, alternatively,
determining a first class object feature map from the multiple classes of object feature maps, inputting the first class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map.
Optionally, the forgery detection model includes an attention enhancing module, and the attention enhancing module is configured to:
and selecting at least one type of reference object feature map from the multi-type object feature maps, and performing attention enhancement on the at least one type of reference object feature map through the attention enhancement module to obtain an attention feature map.
Optionally, the apparatus 1 is further configured to:
creating an initial counterfeit detection model comprising an object data decoder, inputting sample object data into the initial counterfeit detection model for model training, determining counterfeit classification loss of the initial counterfeit detection model and determining object restoration prediction loss through the object data decoder;
and adjusting model parameters of the initial counterfeit detection model based on the counterfeit classification loss and the object restoration prediction loss until a model training end condition is met, so as to obtain a counterfeit detection model without the object data decoder.
Optionally, the apparatus 1 is further configured to:
in each round of model training, performing object restoration processing on the sample object data through the object data decoder to obtain object restoration data;
and performing loss calculation by adopting a comparison loss function based on the object restoration data and the sample object data to obtain the object restoration prediction loss.
Optionally, the apparatus 1 is further configured to:
determining a forgery identification result of the initial forgery detection model for the sample object data, and acquiring a forgery reference result for the sample object data;
and performing loss calculation by adopting a cross entropy loss function based on the forgery identification result and the forgery reference result to obtain the forgery classification loss.
Optionally, the apparatus 1 is further configured to:
inputting the counterfeit classification loss and the object restoration prediction loss into a model comprehensive calculation formula to output a model comprehensive loss, and adjusting model parameters of the initial counterfeit detection model by adopting the model comprehensive loss;
the model comprehensive calculation formula satisfies the following formula:
L_total = a * L_1 + b * L_2

where L_total is the model comprehensive loss, L_1 is the counterfeit classification loss, a is a first hyperparameter weighting the counterfeit classification loss, L_2 is the object restoration prediction loss, and b is a second hyperparameter weighting the object restoration prediction loss.
It should be noted that, when the forgery detection apparatus provided in the above embodiments executes the forgery detection method, the division into the above functional modules is merely an example; in practical applications, the above functions may be distributed to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the forgery detection apparatus and the forgery detection method provided by the above embodiments belong to the same concept; for the detailed implementation process, reference may be made to the method embodiments, which are not described herein again.
The above-mentioned serial numbers are for description purposes only and do not represent the merits of the embodiments.
In one or more embodiments of the present specification, the electronic device inputs target object data into a counterfeit detection model, extracts multi-class object feature maps of the target object data through the counterfeit detection model, performs feature texture enhancement based on the multi-class object feature maps to obtain a texture enhancement feature map, performs attention feature enhancement on the object feature maps through the counterfeit detection model to obtain an attention feature map, and performs feature attention fusion on the object feature maps, the texture enhancement feature map and the attention feature map to obtain object fusion features. Counterfeit detection can thus be performed accurately based on the object fusion features, avoiding erroneous counterfeit detection results; by adaptively focusing on texture information difference regions, the model resists counterfeit counter-detection mechanisms and better adapts to complex application scenes to realize effective detection. The global detection effect of the model in complex scenes is also improved, and the robustness and universality of counterfeit detection are ensured.
The present specification further provides a computer storage medium, where a plurality of instructions may be stored, where the instructions are suitable for being loaded by a processor and executing the counterfeit detection method according to the embodiment shown in fig. 1 to 9, and a specific execution process may refer to specific descriptions of the embodiment shown in fig. 1 to 9, which is not described herein again.
The present specification further provides a computer program product, where at least one instruction is stored, and the at least one instruction is loaded by the processor and executes the counterfeit detection method according to the embodiment shown in fig. 1 to 9, where a specific execution process may refer to a specific description of the embodiment shown in fig. 1 to 9, and is not described herein again.
Referring to fig. 12, a block diagram of an electronic device according to an exemplary embodiment of the present disclosure is shown. The electronic device in this specification may include one or more of the following components: a processor 110, a memory 120, an input device 130, an output device 140, and a bus 150. The processor 110, memory 120, input device 130, and output device 140 may be connected by a bus 150.
The Memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 120 includes a non-transitory computer-readable medium. The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described below, and the like; the operating system may be an Android system (including systems deeply developed based on the Android system), an iOS system developed by Apple (including systems deeply developed based on the iOS system), or another system. The data storage area may also store data created by the electronic device during use, such as phone books, audio and video data, chat log data, and the like.
Referring to fig. 13, the memory 120 may be divided into an operating system space, in which an operating system runs, and a user space, in which native and third-party applications run. In order to ensure that different third-party application programs can achieve a better operation effect, the operating system allocates corresponding system resources for the different third-party application programs. However, the requirements of different application scenarios in the same third-party application program on system resources also differ, for example, in a local resource loading scenario, the third-party application program has a higher requirement on the disk reading speed; in the animation rendering scene, the third-party application program has a high requirement on the performance of the GPU. The operating system and the third-party application program are independent from each other, and the operating system cannot sense the current application scene of the third-party application program in time, so that the operating system cannot perform targeted system resource adaptation according to the specific application scene of the third-party application program.
In order to enable the operating system to distinguish a specific application scenario of the third-party application program, data communication between the third-party application program and the operating system needs to be opened, so that the operating system can acquire current scenario information of the third-party application program at any time, and further perform targeted system resource adaptation based on the current scenario.
Taking an operating system as an Android system as an example, programs and data stored in the memory 120 are as shown in fig. 14: a Linux kernel layer 320, a system runtime library layer 340, an application framework layer 360, and an application layer 380 may be stored in the memory 120, where the Linux kernel layer 320, the system runtime library layer 340, and the application framework layer 360 belong to the operating system space, and the application layer 380 belongs to the user space. The Linux kernel layer 320 provides underlying drivers for various hardware of the electronic device, such as a display driver, an audio driver, a camera driver, a Bluetooth driver, a Wi-Fi driver, power management, and the like. The system runtime library layer 340 provides the main feature support for the Android system through a number of C/C++ libraries. For example, the SQLite library provides support for databases, the OpenGL/ES library provides support for 3D drawing, the Webkit library provides support for the browser kernel, and the like. The system runtime library layer 340 also provides the Android runtime library (Android runtime), which mainly provides core libraries allowing developers to write Android applications in the Java language. The application framework layer 360 provides various APIs that may be used in building applications, such as activity management, window management, view management, notification management, content providers, package management, session management, resource management, and location management, and developers may build their own applications by using these APIs. At least one application program runs in the application layer 380; these may be native applications carried by the operating system, such as a contact program, a short message program, a clock program, a camera application, and the like, or third-party applications developed by third-party developers, such as a game application, an instant messaging program, a photo beautification program, and the like.
Taking an operating system as an IOS system as an example, programs and data stored in the memory 120 are shown in fig. 15, and the IOS system includes: a Core operating system Layer 420 (Core OS Layer), a Core Services Layer 440 (Core Services Layer), a Media Layer 460 (Media Layer), and a touchable Layer 480 (Cocoa Touch Layer). The kernel operating system layer 420 includes an operating system kernel, drivers, and underlying program frameworks that provide functionality closer to hardware for use by program frameworks located in the core services layer 440. The core services layer 440 provides system services and/or program frameworks, such as a Foundation framework, an account framework, an advertisement framework, a data storage framework, a network connection framework, a geographic location framework, a motion framework, and so forth, as required by the application. The media layer 460 provides audiovisual related interfaces for applications, such as graphics image related interfaces, audio technology related interfaces, video technology related interfaces, audio video transmission technology wireless playback (AirPlay) interfaces, and the like. Touchable layer 480 provides various common interface-related frameworks for application development, and touchable layer 480 is responsible for user touch interaction operations on the electronic device. Such as a local notification service, a remote push service, an advertising framework, a game tool framework, a messaging User Interface (UI) framework, a User Interface UIKit framework, a map framework, and so forth.
In the framework shown in FIG. 15, the framework associated with most applications includes, but is not limited to: a base framework in the core services layer 440 and a UIKit framework in the touchable layer 480. The base framework provides many basic object classes and data types, provides the most basic system services for all applications, and is UI independent. While the class provided by the UIKit framework is a basic library of UI classes for creating touch-based user interfaces, iOS applications can provide UIs based on the UIKit framework, so it provides an infrastructure for applications for building user interfaces, drawing, processing and user interaction events, responding to gestures, and the like.
The Android system may be referred to as a manner and principle for implementing data communication between the third-party application program and the operating system in the IOS system, and details are not repeated herein.
The input device 130 is used for receiving input instructions or data, and the input device 130 includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 140 is used for outputting instructions or data, and the output device 140 includes, but is not limited to, a display device, a speaker, and the like. In one example, the input device 130 and the output device 140 may be combined, and the input device 130 and the output device 140 are touch display screens for receiving touch operations of a user on or near the touch display screens by using any suitable object such as a finger, a touch pen, and the like, and displaying user interfaces of various applications. Touch displays are typically provided on the front panel of an electronic device. The touch display screen may be designed as a full-face screen, a curved screen, or a profiled screen. The touch display screen can also be designed to be a combination of a full-face screen and a curved-face screen, and a combination of a special-shaped screen and a curved-face screen, which is not limited in the specification.
In addition, those skilled in the art will appreciate that the configurations of the electronic devices illustrated in the above-described figures do not constitute limitations on the electronic devices, which may include more or fewer components than illustrated, or some components may be combined, or a different arrangement of components. For example, the electronic device further includes a radio frequency circuit, an input unit, a sensor, an audio circuit, a wireless fidelity (WiFi) module, a power supply, a bluetooth module, and other components, which are not described herein again.
In this specification, the execution subject of each step may be the electronic apparatus described above. Optionally, the execution subject of each step is an operating system of the electronic device. The operating system may be an android system, an IOS system, or another operating system, which is not limited in this specification.
The electronic device of this specification may further have a display device mounted thereon, and the display device may be any device that can implement a display function, for example: a cathode ray tube display (CRT), a light-emitting diode display (LED), an electronic ink panel, a Liquid Crystal Display (LCD), a Plasma Display Panel (PDP), and the like. A user may use the display device on the electronic device 101 to view displayed text, images, video, and other information. The electronic device may be a smartphone, a tablet computer, a gaming device, an AR (Augmented Reality) device, an automobile, a data storage device, an audio playback device, a video playback device, a notebook computer, a desktop computing device, or a wearable device such as an electronic watch, electronic glasses, an electronic helmet, an electronic bracelet, an electronic necklace, or an electronic garment.
In the electronic device shown in fig. 12, where the electronic device may be a terminal, the processor 110 may be configured to call the counterfeit detection application stored in the memory 120, and specifically performs the following operations:
inputting target object data into a counterfeiting detection model, extracting multi-class object feature maps of the target object data through the counterfeiting detection model, and performing feature texture enhancement on the basis of the multi-class object feature maps to obtain texture enhancement feature maps;
performing attention feature enhancement on the object feature map through the counterfeit detection model to obtain an attention feature map;
and performing feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map by adopting the counterfeiting detection model to obtain object fusion features, and outputting an object counterfeiting detection result based on the object fusion features.
In an embodiment, the forgery detection model includes a plurality of backbone networks and a texture enhancement module, and when extracting the multi-class object feature maps through the forgery detection model and performing feature texture enhancement based on the multi-class object feature maps to obtain the texture enhancement feature map, the processor 110 performs the following steps:
respectively extracting multi-class object feature maps of the target object data through each backbone network;
determining at least one class of target object feature maps from the multiple classes of object feature maps, inputting the various classes of target object feature maps into the texture enhancement module, and respectively outputting the texture enhancement feature maps corresponding to the various classes of target object feature maps.
In one embodiment, the processor 110, in executing the determining at least one class of target object feature map from the multiple classes of object feature maps, performs the following steps:
determining at least one default class of target object feature map from the multiple classes of object feature maps; or, alternatively,
performing texture enhancement map prediction processing on the multi-class object feature maps to obtain at least one class of target object feature map.
In one embodiment, when performing the texture enhancement map prediction processing on the multi-class object feature maps to obtain at least one class of target object feature map, the processor 110 performs the following steps:
carrying out artifact residual detection on the object feature map to obtain a residual prediction probability corresponding to the object feature map, and determining at least one class of target object feature map from the multiple classes of object feature maps based on the residual prediction probability.
In one embodiment, the texture enhancement module includes a pooling layer and a dense convolution layer based on a residual structure, and when inputting each class of target object feature map into the texture enhancement module and outputting the texture enhancement feature map corresponding to each class of target object feature map, the processor 110 performs the following steps:
inputting the various classes of target object feature maps into the texture enhancement module respectively, and performing feature average pooling on the target object feature map through the pooling layer to obtain a pooled feature map;
and fitting the target object feature map and the pooled feature map through the dense convolution layer to obtain the texture enhancement feature map corresponding to the target object feature map.
In one embodiment, the plurality of backbone networks include a first backbone network, a second backbone network and a third backbone network, and when extracting the multi-class object feature maps of the target object data through each of the backbone networks respectively, the processor 110 performs the following steps:
extracting a first class object feature map of the target object data through the first backbone network, extracting a second class object feature map of the target object data through the second backbone network, and extracting a third class object feature map of the target object data through the third backbone network,
the feature resolution of the second class of object feature maps is smaller than that of the first class of object feature maps and larger than that of the third class of object feature maps, and the semantic expression degree of the second class of object feature maps is larger than that of the first class of object feature maps and smaller than that of the third class of object feature maps.
In one embodiment, when determining at least one class of target object feature map from the multiple classes of object feature maps, inputting each class of target object feature map into the texture enhancement module, and outputting the texture enhancement feature map corresponding to each class of target object feature map respectively, the processor 110 performs the following steps:
determining a first class object feature map and a second class object feature map from the multi-class object feature maps, respectively inputting the first class object feature map and the second class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map and a second texture enhancement feature map corresponding to the second class object feature map; or, alternatively,
determining a first class object feature map, a second class object feature map and a third class object feature map from the multiple classes of object feature maps, respectively inputting the first class object feature map, the second class object feature map and the third class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map, a second texture enhancement feature map corresponding to the second class object feature map and a third texture enhancement feature map corresponding to the third class object feature map; or, alternatively,
determining a first class object feature map from the multiple classes of object feature maps, inputting the first class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map.
In one embodiment, the forgery detection model includes an attention enhancement module, and the processor 110 performs the following steps when performing the attention feature enhancement on the object feature map by the forgery detection model to obtain an attention feature map:
and selecting at least one type of reference object feature map from the multi-type object feature maps, and performing attention enhancement on the at least one type of reference object feature map through the attention enhancement module to obtain an attention feature map.
In one embodiment, the processor 110 further performs the following steps before performing the inputting of the target object data into the forgery detection model:
creating an initial counterfeit detection model comprising an object data decoder, inputting sample object data into the initial counterfeit detection model for model training, determining counterfeit classification loss of the initial counterfeit detection model and determining object restoration prediction loss through the object data decoder;
and adjusting model parameters of the initial counterfeit detection model based on the counterfeit classification loss and the object reduction prediction loss until a model training end condition is met to obtain a counterfeit detection model without the object data decoder.
In one embodiment, the processor 110 performs the following steps in performing the determining of the object restoration prediction loss by the object data decoder:
in each round of model training, performing object restoration processing on the sample object data through the object data decoder to obtain object restoration data;
and performing loss calculation by adopting a comparison loss function based on the object restoration data and the sample object data to obtain the object restoration prediction loss.
In one embodiment, the processor 110 performs the following steps in performing the determining of the forgery classification loss of the initial forgery detection model:
determining a forgery identification result of the initial forgery detection model for the sample object data, and acquiring a forgery reference result for the sample object data;
and performing loss calculation by adopting a cross entropy loss function based on the forgery identification result and the forgery reference result to obtain the forgery classification loss.
In one embodiment, the processor 110 performs the following steps in performing the model parameter adjustment of the initial forgery detection model based on the forgery classification loss and the object restoration prediction loss:
inputting the counterfeit classification loss and the object restoration prediction loss into a model comprehensive calculation formula to output a model comprehensive loss, and adjusting model parameters of the initial counterfeit detection model by adopting the model comprehensive loss;
the model comprehensive calculation formula satisfies the following formula:
L_total = a * L_1 + b * L_2

where L_total is the model comprehensive loss, L_1 is the counterfeit classification loss, a is a first hyperparameter weighting the counterfeit classification loss, L_2 is the object restoration prediction loss, and b is a second hyperparameter weighting the object restoration prediction loss.
In one or more embodiments of the present description, an electronic device inputs target object data into a counterfeit detection model, extracts multi-class object feature maps of the target object data through the counterfeit detection model, performs feature texture enhancement based on the multi-class object feature maps to obtain a texture enhancement feature map, performs attention feature enhancement on the object feature map through the counterfeit detection model to obtain an attention feature map, and performs feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map to obtain object fusion features. Counterfeit detection can thus be performed accurately based on the object fusion features, avoiding erroneous counterfeit detection results; by adaptively focusing on texture information difference regions, the model resists counterfeit counter-detection mechanisms and better adapts to complex application scenes to realize effective detection. The global detection effect of the model in complex scenes is also improved, and the robustness and universality of counterfeit detection are ensured.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer readable storage medium and executed by a computer to implement the processes of the embodiments of the methods described above. The storage medium can be a magnetic disk, an optical disk, a read-only memory or a random access memory.
It should be noted that the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, displayed data, etc.) and signals referred to in the embodiments of the present description are authorized by the user or fully authorized by various parties, and the collection, use and processing of the relevant data need to comply with relevant laws and regulations and standards in relevant countries and regions. For example, the target object data, feature processing, forgery detection, and the like referred to in this specification are all performed under sufficient authorization.
The above disclosure is only a preferred embodiment of the present disclosure and is not intended to limit its scope; equivalent variations made according to the claims of the present disclosure therefore remain within the scope of the present disclosure.
Claims (16)
1. A method of counterfeit detection, the method comprising:
inputting target object data into a counterfeiting detection model, extracting multi-class object feature maps of the target object data through the counterfeiting detection model, and performing feature texture enhancement on the basis of the multi-class object feature maps to obtain texture enhancement feature maps;
performing attention feature enhancement on the object feature map through the counterfeit detection model to obtain an attention feature map;
and performing feature attention fusion on the object feature map, the texture enhancement feature map and the attention feature map by adopting the forgery detection model to obtain object fusion features, and outputting an object forgery detection result based on the object fusion features.
2. The method of claim 1, wherein the forgery detection model includes a plurality of backbone networks and a texture enhancement module,
the extracting multi-class object feature maps through the forgery detection model, and performing feature texture enhancement based on the multi-class object feature maps to obtain texture enhancement feature maps includes:
respectively extracting multi-class object feature maps of the target object data through each backbone network;
determining at least one class of target object feature maps from the multiple classes of object feature maps, inputting the various classes of target object feature maps into the texture enhancement module, and respectively outputting the texture enhancement feature maps corresponding to the various classes of target object feature maps.
3. The method of claim 2, wherein said determining at least one class of target object feature maps from said plurality of classes of object feature maps comprises:
determining at least one default class of target object feature map from the multiple classes of object feature maps; or, alternatively,
performing texture enhancement map prediction processing on the multi-class object feature maps to obtain at least one class of target object feature map.
4. The method according to claim 3, wherein said performing texture enhancement map prediction processing on said multiple classes of object feature maps to obtain at least one class of target object feature maps comprises:
carrying out artifact residual detection on the object feature map to obtain a residual prediction probability corresponding to the object feature map, and determining at least one class of target object feature map from the multiple classes of object feature maps based on the residual prediction probability.
5. The method of claim 2, the texture enhancement module comprising a pooling layer and a dense convolution layer based on a residual structure,
the inputting of the various types of target object feature maps into the texture enhancement module and the outputting of the texture enhancement feature maps corresponding to the various types of target object feature maps respectively include:
inputting the various classes of target object feature maps into the texture enhancement module respectively, and performing feature average pooling on the target object feature map through the pooling layer to obtain a pooled feature map;
and fitting the target object feature map and the pooled feature map through the dense convolution layer to obtain the texture enhancement feature map corresponding to the target object feature map.
6. The method of claim 2, the plurality of backbone networks comprising a first backbone network, a second backbone network, and a third backbone network,
the extracting, through each of the backbone networks, a multi-class object feature map of the target object data, respectively, includes:
extracting a first class object feature map of the target object data through the first backbone network, extracting a second class object feature map of the target object data through the second backbone network, and extracting a third class object feature map of the target object data through the third backbone network,
the feature resolution of the second class of object feature maps is smaller than that of the first class of object feature maps and larger than that of the third class of object feature maps, and the semantic expression degree of the second class of object feature maps is larger than that of the first class of object feature maps and smaller than that of the third class of object feature maps.
7. The method according to claim 6, wherein the determining of at least one class of target object feature map from the multi-class object feature maps, the inputting of each class of target object feature map into the texture enhancement module, and the respectively outputting of the texture enhancement feature map corresponding to each class of target object feature map comprise:
determining the first class object feature map and the second class object feature map from the multi-class object feature maps, respectively inputting the first class object feature map and the second class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map and a second texture enhancement feature map corresponding to the second class object feature map; or
determining the first class object feature map and the third class object feature map from the multi-class object feature maps, respectively inputting the first class object feature map and the third class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map and a third texture enhancement feature map corresponding to the third class object feature map; or
determining the first class object feature map, the second class object feature map, and the third class object feature map from the multi-class object feature maps, respectively inputting the three classes of object feature maps into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map, a second texture enhancement feature map corresponding to the second class object feature map, and a third texture enhancement feature map corresponding to the third class object feature map; or
determining the first class object feature map from the multi-class object feature maps, inputting the first class object feature map into the texture enhancement module, and outputting a first texture enhancement feature map corresponding to the first class object feature map.
8. The method of claim 1, wherein the forgery detection model comprises an attention enhancement module, and the performing of attention feature enhancement on the object feature map through the forgery detection model to obtain the attention feature map comprises:
selecting at least one class of reference object feature map from the multi-class object feature maps, and performing attention enhancement on the at least one class of reference object feature map through the attention enhancement module to obtain the attention feature map.
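The claim leaves the attention form open; the sketch below assumes a squeeze-and-excitation-style channel re-weighting as one plausible instantiation of the attention enhancement module:

```python
import torch
import torch.nn as nn

class AttentionEnhancementModule(nn.Module):
    """Assumed channel-attention variant of the attention enhancement module."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, reference_map: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = reference_map.shape
        weights = self.mlp(self.pool(reference_map).view(b, c)).view(b, c, 1, 1)
        # The attention feature map re-weights channels by learned importance.
        return reference_map * weights
```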
9. The method of any of claims 1-8, wherein, prior to the inputting of the target object data into the forgery detection model, the method further comprises:
creating an initial forgery detection model comprising an object data decoder, inputting sample object data into the initial forgery detection model for model training, determining a forgery classification loss of the initial forgery detection model, and determining an object restoration prediction loss through the object data decoder; and
adjusting model parameters of the initial forgery detection model based on the forgery classification loss and the object restoration prediction loss until a model training end condition is met, to obtain a forgery detection model from which the object data decoder has been removed.
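A hedged sketch of one training step under claim 9; `model`, `decoder`, and the two loss callables are assumed interfaces (the claim does not fix them), and the decoder exists only to supply the auxiliary restoration loss before being discarded:

```python
import torch

def train_step(model, decoder, optimizer, sample, label,
               classification_loss, restoration_loss):
    """One optimization round: both losses drive the shared encoder."""
    features, logits = model(sample)   # assumed: encoder features + classifier logits
    restored = decoder(features)       # auxiliary object restoration branch
    loss = classification_loss(logits, label) + restoration_loss(restored, sample)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# After the training end condition is met, only the detection path is kept:
# deployed_model = model   # the object data decoder is discarded, per claim 9
```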
10. The method of claim 9, wherein the determining, through the object data decoder, of the object restoration prediction loss comprises:
in each round of the model training process, performing object restoration processing on the sample object data through the object data decoder to obtain object restoration data; and
performing loss calculation with a contrastive loss function based on the object restoration data and the sample object data to obtain the object restoration prediction loss.
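One plausible contrastive formulation of the object restoration prediction loss, sketched below as an InfoNCE-style loss that pulls each restored sample toward its own original and away from the other originals in the batch; the exact formulation and the temperature value are assumptions:

```python
import torch
import torch.nn.functional as F

def restoration_contrastive_loss(restored: torch.Tensor,
                                 original: torch.Tensor,
                                 temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style sketch: matching (restored, original) pairs are positives."""
    r = F.normalize(restored.flatten(1), dim=1)   # (B, D) restored embeddings
    o = F.normalize(original.flatten(1), dim=1)   # (B, D) original embeddings
    logits = r @ o.t() / temperature              # pairwise cosine similarities
    targets = torch.arange(r.size(0), device=r.device)
    return F.cross_entropy(logits, targets)       # diagonal entries are positives
```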
11. The method of claim 9, wherein the determining of the forgery classification loss of the initial forgery detection model comprises:
determining a forgery identification result of the initial forgery detection model for the sample object data, and obtaining a forgery reference result for the sample object data; and
performing loss calculation with a cross-entropy loss function based on the forgery identification result and the forgery reference result to obtain the forgery classification loss.
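This step is the standard cross-entropy formulation; a minimal sketch, where the identification result is taken as two-class logits and the reference result as integer labels (both shapes are assumptions):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 2)            # forgery identification result for 8 samples
labels = torch.randint(0, 2, (8,))    # forgery reference result: 0 = real, 1 = forged
forgery_classification_loss = F.cross_entropy(logits, labels)
```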
12. The method of claim 9, wherein the adjusting of the model parameters of the initial forgery detection model based on the forgery classification loss and the object restoration prediction loss comprises:
inputting the forgery classification loss and the object restoration prediction loss into a model comprehensive calculation formula, outputting a model comprehensive loss, and adjusting the model parameters of the initial forgery detection model with the model comprehensive loss,
wherein the model comprehensive calculation formula satisfies:

L_total = a * L_1 + b * L_2

where L_total is the model comprehensive loss, L_1 is the forgery classification loss, a is a first hyperparameter weighting the forgery classification loss, L_2 is the object restoration prediction loss, and b is a second hyperparameter weighting the object restoration prediction loss.
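The formula combines directly; the hyperparameter values below are placeholders, not values taken from the patent:

```python
import torch

a, b = 1.0, 0.5                # placeholder first/second hyperparameters
L1 = torch.tensor(0.7)         # example forgery classification loss value
L2 = torch.tensor(0.3)         # example object restoration prediction loss value
L_total = a * L1 + b * L2      # model comprehensive loss, as in claim 12
```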
13. A counterfeit detection apparatus, the apparatus comprising:
a texture processing module, configured to input target object data into a forgery detection model, extract multi-class object feature maps of the target object data through the forgery detection model, and perform feature texture enhancement based on the multi-class object feature maps to obtain a texture enhancement feature map;
an attention enhancement module, configured to perform attention feature enhancement on the object feature map through the forgery detection model to obtain an attention feature map; and
a fusion processing module, configured to perform feature attention fusion on the object feature map, the texture enhancement feature map, and the attention feature map with the forgery detection model to obtain an object fusion feature, and to output an object forgery detection result based on the object fusion feature.
14. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the method steps according to any of claims 1 to 12.
15. A computer program product storing at least one instruction adapted to be loaded by a processor to perform the method steps according to any of claims 1 to 12.
16. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211226699.4A | 2022-10-09 | 2022-10-09 | Counterfeit detection method, device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115830487A (en) | 2023-03-21 |
Family
ID=85524417
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211226699.4A (pending) | Counterfeit detection method, device, storage medium and electronic equipment | 2022-10-09 | 2022-10-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115830487A (en) |
Similar Documents

Publication | Title
---|---
CN111476871B (en) | Method and device for generating video
CN109993150B (en) | Method and device for identifying age
CN111767554B (en) | Screen sharing method and device, storage medium and electronic equipment
CN111275784B (en) | Method and device for generating image
US12141887B2 | Image synthesis device and method for embedding watermark
CN112839223B (en) | Image compression method, image compression device, storage medium and electronic equipment
CN110059623B (en) | Method and apparatus for generating information
CN111127469A (en) | Thumbnail display method, device, storage medium and terminal
US20240320807A1 | Image processing method and apparatus, device, and storage medium
CN113518075B (en) | Phishing warning method, device, electronic equipment and storage medium
CN118135058A (en) | Image generation method and device
CN108470131B (en) | Method and device for generating prompt message
CN115346278A (en) | Image detection method, device, readable medium and electronic equipment
CN112101258A (en) | Image processing method, image processing device, electronic equipment and computer readable medium
CN111107264A (en) | Image processing method, image processing device, storage medium and terminal
CN115131603A (en) | Model processing method and device, storage medium and electronic equipment
CN116168451A (en) | Image living body detection method and device, storage medium and electronic equipment
CN116798129A (en) | Living body detection method and device, storage medium and electronic equipment
CN116343350A (en) | Living body detection method and device, storage medium and electronic equipment
CN116129534A (en) | Image living body detection method and device, storage medium and electronic equipment
CN115830487A (en) | Counterfeit detection method, device, storage medium and electronic equipment
CN113126859A (en) | Contextual model control method, contextual model control device, storage medium and terminal
CN116228391A (en) | Risk identification method and device, storage medium and electronic equipment
CN118537206A (en) | Model construction method, image generation method, device, equipment and medium
CN115620111A (en) | Image identification method and device, storage medium and electronic equipment
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |