
CN114005019B - Method for identifying flip image and related equipment thereof - Google Patents

Method for identifying flip image and related equipment thereof

Info

Publication number
CN114005019B
Authority
CN
China
Prior art keywords
image
feature
identified
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111275642.9A
Other languages
Chinese (zh)
Other versions
CN114005019A (en)
Inventor
毛晓飞
黄灿
王长虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202111275642.9A priority Critical patent/CN114005019B/en
Publication of CN114005019A publication Critical patent/CN114005019A/en
Priority to PCT/CN2022/119635 priority patent/WO2023071609A1/en
Application granted granted Critical
Publication of CN114005019B publication Critical patent/CN114005019B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method for identifying a flipped image and related equipment thereof. The method comprises the following steps: after an image to be identified is obtained, visual feature extraction processing is first performed on the image to be identified to obtain the visual features of the image to be identified; feature splitting processing is then performed on the visual features to obtain at least one feature to be used; feature coding processing is then performed on each feature to be used to obtain the feature coding result of each feature to be used; and finally, whether the image to be identified is a flipped image is determined according to the feature coding result of the at least one feature to be used. Because the feature coding result of the at least one feature to be used can accurately represent the moiré information carried by the image to be identified, it can accurately indicate whether the image to be identified is a flipped image, so that whether a piece of image data is a flipped image can subsequently be identified on that basis.

Description

Method for identifying flip image and related equipment thereof
Technical Field
The application relates to the technical field of image processing, in particular to a method for identifying a flip image and related equipment thereof.
Background
With the spread of information technology, image uploading is applied in an ever wider range of scenarios. For example, image uploading may be used in identity authentication scenarios (e.g., real-name authentication) as well as in digital information extraction scenarios.
Moreover, image uploading may be performed not only on conventional images but also on flipped images. Here, a "conventional image" refers to image data obtained by an image acquisition device (e.g., a camera or a scanner) capturing a physical object (e.g., an identity card or other credential), while a "flipped image" (i.e., a recaptured image) refers to image data obtained by an image acquisition device capturing image data displayed on an image display device (e.g., a display screen).
However, in some application scenarios (such as identity authentication), conventional images are usually legitimate image data while flipped images are usually illegitimate image data. To improve the information security of users, flipped images therefore need to be identified so that an alarm can be raised, and how to identify a flipped image is thus a technical problem to be solved urgently.
Disclosure of Invention
In order to solve the above technical problems, the application provides a method for identifying a flipped image and related equipment thereof, which can accurately identify whether a piece of image data is a flipped image.
In order to achieve the above object, the technical solution provided by the embodiments of the present application is as follows:
the embodiment of the application provides a method for identifying a flip image, which comprises the following steps:
after an image to be identified is obtained, performing visual feature extraction processing on the image to be identified to obtain visual features of the image to be identified;
performing feature splitting processing on the visual features to obtain at least one feature to be used;
performing feature coding processing on each feature to be used to obtain feature coding results of each feature to be used;
and determining whether the image to be identified is a flip image or not according to the feature coding result of the at least one feature to be used.
In one possible implementation, the visual features include a feature map to be used, and the feature map to be used includes N channels of image data; wherein N is a positive integer;
the determining process of the at least one feature to be used comprises the following steps:
flattening the image data of the n-th channel in the feature map to be used to obtain the n-th flattened data; wherein n is a positive integer and n is less than or equal to N;
performing data extraction processing on the n-th flattened data according to preset extraction parameters to obtain at least one extracted data segment of the n-th flattened data; wherein n is a positive integer and n is less than or equal to N;
and determining the at least one feature to be used according to the at least one extracted data segment of the N pieces of flattened data.
In one possible embodiment, the method further comprises:
determining a weighted weight value of each feature to be used;
multiplying each feature to be used by its weighted weight value to obtain each weighted feature;
and the performing feature coding processing on each feature to be used to obtain the feature coding result of each feature to be used comprises:
and respectively carrying out feature coding processing on each weighted feature to obtain feature coding results of each feature to be used.
In a possible implementation manner, the determining the weighted weight value of each feature to be used includes:
and inputting each feature to be used into a pre-constructed weight determination model to obtain a weighted weight value of each feature to be used, which is output by the weight determination model.
In a possible implementation manner, the determining whether the image to be identified is a flip image according to the feature encoding result of the at least one feature to be used includes:
splicing the feature coding results of the at least one feature to be used to obtain coding features of the image to be identified;
performing flip image recognition processing on the coding features of the image to be identified to obtain an identification result of the image to be identified;
and determining whether the image to be identified is a flip image or not according to the identification result of the image to be identified.
In a possible implementation manner, the performing flip image recognition processing on the coding features of the image to be identified to obtain the identification result of the image to be identified comprises:
inputting the coding features of the image to be identified into a pre-constructed flip recognition model to obtain the identification result of the image to be identified output by the flip recognition model.
In one possible implementation manner, the process for constructing the flip recognition model comprises:
acquiring a sample image and an actual label of the sample image; the actual label of the sample image is used for indicating whether the sample image is actually a flip image or not;
determining at least one image to be used according to the sample image;
determining the actual label of the sample image as the actual label of each image to be used;
and constructing the flip recognition model by using the at least one image to be used and the actual label of the at least one image to be used.
In a possible implementation manner, the constructing the flip recognition model by using the at least one image to be used and the actual label of the at least one image to be used includes:
inputting each image to be used into a model to be trained, and obtaining the identification result of each image to be used output by the model to be trained;
updating the model to be trained according to the identification result of the at least one image to be used and the actual label of the at least one image to be used, continuing to execute the step of inputting each image to be used into the model to be trained until a preset stop condition is reached, and then determining the flip recognition model according to the model to be trained.
In one possible implementation, the model to be trained comprises: a visual feature extraction layer, a feature splitting layer, a weight determining layer, a feature weighting layer, a feature coding layer, a feature splicing layer, and a flip recognition layer; the input data of the flip recognition layer comprises the output data of the feature splicing layer; the input data of the feature splicing layer comprises the output data of the feature coding layer; the input data of the feature coding layer comprises the output data of the feature weighting layer; the input data of the feature weighting layer comprises the output data of the weight determining layer and the output data of the feature splitting layer; the input data of the weight determining layer comprises the output data of the feature splitting layer; and the input data of the feature splitting layer comprises the output data of the visual feature extraction layer;
the determining the flip recognition model according to the model to be trained comprises:
determining the flip recognition model according to the flip recognition layer in the model to be trained.
In a possible implementation manner, the determining at least one image to be used according to the sample image includes:
cropping the sample image according to a preset rule to obtain at least one cropped image;
and determining the at least one image to be used according to the sample image and the at least one cropped image.
In one possible implementation manner, the process of acquiring the image to be identified includes:
after a device-captured image is acquired, adjusting the device-captured image according to a preset size to obtain the image to be identified.
The embodiment of the application also provides a device for identifying a flipped image, which comprises:
the extraction unit is used for carrying out visual feature extraction processing on the image to be identified after the image to be identified is acquired, so as to obtain visual features of the image to be identified;
the splitting unit is used for carrying out feature splitting treatment on the visual features to obtain at least one feature to be used;
the coding unit is used for respectively performing feature coding processing on each feature to be used to obtain the feature coding result of each feature to be used;
and the identification unit is used for determining whether the image to be identified is a flip image or not according to the feature coding result of the at least one feature to be used.
The embodiment of the application also provides equipment, which comprises a processor and a memory:
the memory is used for storing a computer program;
the processor is used for executing any implementation mode of the method for identifying the flipped image provided by the embodiment of the application according to the computer program.
The embodiment of the application also provides a computer readable storage medium for storing a computer program for executing any implementation mode of the method for identifying the flipped image.
The embodiment of the application also provides a computer program product, which when being run on the terminal equipment, causes the terminal equipment to execute any implementation mode of the method for identifying the flipped image.
Compared with the prior art, the embodiment of the application has at least the following advantages:
In the technical scheme provided by the embodiment of the application, after the image to be identified is obtained, visual feature extraction processing is first performed on the image to be identified to obtain the visual features of the image to be identified; feature splitting processing is then performed on the visual features to obtain at least one feature to be used; feature coding processing is then performed on each feature to be used to obtain the feature coding result of each feature to be used; and finally, whether the image to be identified is a flipped image is determined according to the feature coding result of the at least one feature to be used. Because the feature coding result of the at least one feature to be used can accurately represent the image information carried by the image to be identified, and in particular the moiré information it carries, it can accurately indicate whether the image to be identified is a flipped image, so that whether a piece of image data is a flipped image can be identified on that basis.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for identifying a flipped image according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a model to be trained according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a device for identifying a flipped image according to an embodiment of the present application.
Detailed Description
In research on flipped images, the inventor found that the moiré pattern information carried by a flipped image differs from that carried by a conventional image, so that the image information carried by a flipped image differs from that carried by a conventional image; whether a piece of image data is a flipped image can therefore be identified based on the image information it carries.
Based on the above findings, and in order to solve the technical problems in the background art, the embodiment of the application provides a method for identifying a flipped image, which comprises: after the image to be identified is obtained, first performing visual feature extraction processing on the image to be identified to obtain the visual features of the image to be identified; then performing feature splitting processing on the visual features to obtain at least one feature to be used; then performing feature coding processing on each feature to be used to obtain the feature coding result of each feature to be used; and finally determining whether the image to be identified is a flipped image according to the feature coding result of the at least one feature to be used.
It can be seen that the "feature coding result of the at least one feature to be used" can accurately represent the image information carried by the image to be identified, and in particular the moiré information it carries, so that it can accurately indicate whether the image to be identified is a flipped image; whether a piece of image data is a flipped image can therefore subsequently be identified on the basis of the "feature coding result of the at least one feature to be used".
In addition, the embodiment of the application does not limit the execution subject of the flipped image identification method. For example, the flipped image identification method provided by the embodiment of the application may be applied to data processing equipment such as a terminal device or a server. The terminal device may be a smartphone, a computer, a personal digital assistant (PDA), a tablet computer, or the like. The server may be a standalone server, a clustered server, or a cloud server.
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Method embodiment I
Referring to fig. 1, the flowchart of a method for identifying a flipped image according to an embodiment of the present application is shown.
The method for identifying the flip image provided by the embodiment of the application comprises the following steps of S1-S4:
S1: after the image to be identified is obtained, visual feature extraction processing is performed on the image to be identified to obtain the visual features of the image to be identified.
The image to be identified refers to image data that needs to undergo flipped image identification processing; the embodiment of the application does not limit the process of acquiring the "image to be identified". For example, it may specifically be: after a device-captured image is acquired, directly determining the device-captured image as the image to be identified. A "device-captured image" refers to image data captured by an image acquisition device.
In addition, the embodiment of the application also provides another possible implementation of acquiring the "image to be identified", which may specifically include: after the device-captured image is acquired, adjusting the device-captured image according to a preset size to obtain the image to be identified. The "preset size" may be set in advance; the embodiment of the application does not limit this "preset size", which may be, for example, 128 × 128.
The embodiment of the application likewise does not limit this size adjustment processing; it may be implemented by any method, existing now or appearing in the future, that can resize a piece of image data (for example, a resize method).
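As a concrete illustration of this acquisition process, the following is a minimal sketch assuming the 128 × 128 example size above, Pillow as the image library, and bilinear resampling; the function name and the resampling choice are illustrative assumptions rather than details fixed by the text.

```python
# Minimal sketch of the size adjustment step; 128 x 128 comes from the
# example above, while Pillow and bilinear resampling are assumptions.
from PIL import Image

PRESET_SIZE = (128, 128)  # example value from the text

def acquire_image_to_identify(path: str) -> Image.Image:
    """Load a device-captured image and adjust it to the preset size."""
    device_captured = Image.open(path).convert("RGB")
    return device_captured.resize(PRESET_SIZE, Image.BILINEAR)
```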
The "visual feature extraction process" is for extracting visual features from one image data; moreover, embodiments of the present application are not limited to this implementation of the "visual feature extraction process," which may be implemented using any visual feature extraction method, existing or occurring in the future, for example. As another example, it may be implemented in particular by means of a pre-built visual feature extraction model.
The visual feature extraction model is used for carrying out visual feature extraction processing on input data of the visual feature extraction model; moreover, embodiments of the application are not limited to this "visual feature extraction model", which may be, for example, a convolutional neural network (Convolutional Neural Networks, CNN) model in particular.
In addition, the embodiment of the application does not limit the construction process of the visual feature extraction model; for example, it may be implemented by any model construction method existing now or appearing in the future. As another example, it may be implemented by the model construction method shown in Method embodiment III below.
The visual characteristic of the image to be identified is used for representing the image information carried by the image to be identified; moreover, the embodiment of the present application does not limit the process of acquiring the visual feature of the image to be identified, and for example, it may specifically include: inputting the image to be identified into a pre-constructed visual feature extraction model to obtain the visual features of the image to be identified, which are output by the visual feature extraction model.
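To make S1 concrete, the sketch below shows one possible CNN-style visual feature extraction model that maps a 128 × 128 RGB input to the 64 × 64 × 32 feature map of the running example; the text only requires some CNN, so the specific layers are assumptions.

```python
# Hedged sketch of a visual feature extraction model; only "a CNN producing
# a 64 x 64 x 32 feature map" follows the text, the exact layers are assumed.
import torch
import torch.nn as nn

class VisualFeatureExtractor(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 128 -> 64
            nn.ReLU(inplace=True),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (batch, 3, 128, 128) -> visual features: (batch, 32, 64, 64)
        return self.backbone(image)
```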
S2: and carrying out feature splitting treatment on the visual features of the image to be identified to obtain at least one feature to be used.
Wherein the "feature splitting process" is used for performing feature division processing for one visual feature (for example, feature map); the embodiment of the application is not limited to the "feature splitting process", and may be implemented by any method that can perform feature division processing on feature data, for example, existing or future.
In addition, the embodiment of the application also provides a possible implementation manner of the feature splitting process, and for convenience of understanding, the description is made below with reference to an example.
As an example, when the "visual feature of the image to be recognized" described above includes a feature map to be used, and the feature map to be used includes image data of N channels, S2 may specifically include S21-S23:
S21: flattening the image data of the nth channel in the feature map to be used to obtain nth flattened data. Wherein N is a positive integer, and N is less than or equal to N.
The feature map to be used is used for representing image information carried by the image to be identified; and the "feature map to be used" may include image data of N channels. Wherein N is a positive integer.
In addition, the embodiment of the present application is not limited to the "feature map to be used", and for example, when N is 32 as described above, the "feature map to be used" may be 64×64×32 image data. Wherein "32" refers to the number of channels of the "feature map to be used"; "64×64" refers to the data dimension that the image data of each channel has in the "feature map to be used".
The flattening process is used for performing unfolding and splicing processing on image data to obtain a data vector, so that the data vector is used for recording pixel information carried by the image data; moreover, embodiments of the present application are not limited to this "flattening process," and may be implemented using any flattening process, either existing or future.
The "nth flattening data" refers to a data vector for representing image information carried by an image to be identified; moreover, the embodiment of the present application is not limited to the "nth flattening data", and for example, when the "image data of the nth channel in the feature map to be used" described above is a 64×64-dimensional data matrix, the "nth flattening data" may be a 4096-dimensional data vector.
S22: and carrying out data extraction processing on the nth flattening data according to preset extraction parameters to obtain at least one extracted data segment of the nth flattening data. Wherein N is a positive integer, and N is less than or equal to N.
Wherein, the 'preset extraction parameters' can be preset; furthermore, embodiments of the present application are not limited to this "preset extraction parameter", and may specifically be 1024×4, for example. Where "1024" refers to the data length of each extracted data segment; "4" refers to the number of extracted data segments.
The above-described "data extraction processing" is used for performing a data division operation with respect to one data vector; moreover, embodiments of the present application are not limited to this "data extraction process" implementation, and may be implemented, for example, using any method that can divide one data vector into a plurality of extracted data segments (e.g., a resize method) that occurs in the present or future.
The "at least one extracted data segment of the nth applanation data" is used for representing image information carried by the nth applanation data; moreover, embodiments of the present application are not limited to the "at least one extracted data segment of nth applanation data", for example, when the nth applanation data is a 4096-dimensional data vector and the preset extraction parameter is 1024×4, the "at least one extracted data segment of nth applanation data" may include 4 1024-dimensional extracted data segments because the nth applanation data can be divided into 4 1024-dimensional extracted data segments.
S23: at least one feature to be used is determined from at least one extracted data segment of the N flattened data.
As an example, when N is 32 and the "at least one extracted data segment of the n-th flattened data" includes 4 extracted data segments of 1024 dimensions, the "at least one extracted data segment of the N pieces of flattened data" includes 32 × 4 = 128 extracted data segments of 1024 dimensions, so S23 may specifically include: determining the 1st 1024-dimensional extracted data segment, the 2nd 1024-dimensional extracted data segment, ..., and the 128th 1024-dimensional extracted data segment each as a feature to be used, to obtain 128 features to be used.
Based on the above-mentioned related content of S21 to S23, after the visual features of the image to be identified are obtained, the visual features may be split into a plurality of pieces of feature data, so that the encoded representation of the visual features can be determined by encoding these pieces of feature data separately, which helps improve subsequent encoding efficiency.
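When the extracted data segments are contiguous and non-overlapping, which the example parameter 1024 × 4 permits, S21 to S23 reduce to two reshapes. A sketch under that assumption:

```python
# Hedged sketch of S21-S23 with the example numbers: a 64 x 64 x 32 feature
# map is flattened per channel into 4096-dim vectors (S21), each divided
# into 4 segments of length 1024 (S22), giving 32 * 4 = 128 features to be
# used (S23). Contiguous, non-overlapping extraction is an assumption.
import torch

def split_features(feature_map: torch.Tensor,
                   segment_length: int = 1024) -> torch.Tensor:
    """feature_map: (N, H, W) -> (number of features to be used, segment_length)."""
    n_channels = feature_map.shape[0]
    flattened = feature_map.reshape(n_channels, -1)  # S21: (32, 4096)
    return flattened.reshape(-1, segment_length)     # S22-S23: (128, 1024)

features_to_use = split_features(torch.randn(32, 64, 64))
print(features_to_use.shape)  # torch.Size([128, 1024])
```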
S3: and respectively carrying out feature coding treatment on each feature to be used to obtain feature coding results of each feature to be used.
Wherein, the characteristic coding process can be preset; moreover, embodiments of the present application are not limited to this "feature encoding process" implementation, and may be implemented using any encoding method, for example, existing or future. As another example, implementation can be performed with the aid of a pre-constructed feature encoding model.
The characteristic coding model is used for coding input data of the characteristic coding model; moreover, embodiments of the present application are not limited to the model structure of the "feature encoding model", which may include 5 encoding networks, for example. It should be noted that the embodiments of the present application are not limited to the implementation of the "coding network", and may be implemented by any coding method (e.g., an encoding module in a transformer model, or a transformer structure).
In addition, the embodiment of the application does not limit the construction method of the feature coding model; for example, it may be implemented by any model construction method existing now or appearing in the future. As another example, it may be implemented by the model construction method shown in Method embodiment III below.
The feature coding result of the k-th feature to be used is used for representing the image information (for example, moiré information) carried by the k-th feature to be used; the embodiment of the application does not limit the determination process of the "feature coding result of the k-th feature to be used". For example, it may specifically be: inputting the k-th feature to be used into a pre-constructed feature coding model to obtain the feature coding result of the k-th feature to be used output by the feature coding model. Here k is a positive integer, k is less than or equal to K, K is a positive integer, and K represents the number of features to be used (e.g., 128).
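As one way to realize such a feature coding model, the sketch below stacks 5 transformer encoder layers, matching the "5 encoding networks" option mentioned above; encoding all K features in one batch, the head count, and the feed-forward defaults are assumptions.

```python
# Hedged sketch of a feature coding model built from 5 transformer-style
# encoding networks; d_model matches the 1024-dim features, nhead=8 is
# an assumption.
import torch
import torch.nn as nn

encoder_layer = nn.TransformerEncoderLayer(d_model=1024, nhead=8,
                                           batch_first=True)
feature_coding_model = nn.TransformerEncoder(encoder_layer, num_layers=5)

features_to_use = torch.randn(1, 128, 1024)              # (batch, K, dim)
coding_results = feature_coding_model(features_to_use)   # (1, 128, 1024)
```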
S4: and determining whether the image to be identified is a flip image or not according to the feature coding result of at least one feature to be used.
The embodiment of the present application is not limited to the implementation manner of S4, and may be, for example: and inputting the feature coding result of at least one feature to be used into a pre-trained flipped image detection model to obtain a detection result output by the flipped image detection model, so that the detection result is used for indicating whether the image to be identified is a flipped image or not.
The 'flip image detection model' is used for performing flip image detection and identification processing on input data of the flip image detection model; moreover, embodiments of the present application are not limited to this "flipped image detection model," which may be, for example, any machine learning model that exists in the present or future. It should be noted that, the embodiment of the present application is not limited to the training process of the "flipped image detection model", and may be implemented by any model training method existing or appearing in the future, for example.
In addition, in order to improve the recognition effect of the flipped image, the embodiment of the present application further provides another possible implementation manner of S4, which may specifically include S41-S43:
S41: and splicing the feature coding results of at least one feature to be used to obtain the coding feature of the image to be identified.
The "coding feature of the image to be identified" is used to represent image information (in particular, moire information, etc.) carried by the image to be identified.
In addition, the embodiment of the present application is not limited to the "splicing" implementation manner, and may be implemented by any splicing method existing or appearing in the future.
S42: and performing the reproduction image recognition processing on the coding features of the image to be recognized to obtain a recognition result of the image to be recognized.
Wherein, the "flipped image recognition processing" is used for identifying whether the coding features of a piece of image data belong to a flipped image; moreover, the embodiment of the application does not limit this "flipped image recognition processing". For example, it may be implemented by means of a pre-built flip recognition model.
The "flip recognition model" is used for performing flipped image recognition processing on its input data; moreover, the embodiment of the application does not limit the model structure of the "flip recognition model". For example, it may include a linear processing layer and a fully connected layer, where the input data of the fully connected layer comprises the output data of the linear processing layer.
In addition, the embodiment of the application does not limit the implementation of the above "linear processing layer"; for example, it may be implemented using any linear processing method existing now or appearing in the future (e.g., the linear module in a transformer model). Likewise, the embodiment of the application does not limit the implementation of the above "fully connected layer"; for example, it may be implemented using any fully connected processing method existing now or appearing in the future (e.g., the softmax module in a transformer model).
In addition, the embodiment of the application does not limit the construction method of the "flip recognition model"; for example, it may be implemented by any model construction method existing now or appearing in the future. As another example, it may be implemented by the model construction method shown in Method embodiment III below.
The above-mentioned "identification result of the image to be identified" is used for indicating the possibility that the image to be identified belongs to a flipped image; moreover, the embodiment of the application does not limit the representation of this identification result. For example, it may include a first probability and a second probability, where the "first probability" indicates the possibility that the image to be identified belongs to a flipped image, and the "second probability" indicates the possibility that it does not.
In addition, the embodiment of the application does not limit the implementation of S42. For example, it may specifically include: inputting the coding features of the image to be identified into a pre-constructed flip recognition model to obtain the identification result of the image to be identified output by the flip recognition model.
S43: and determining whether the image to be identified is a flip image or not according to the identification result of the image to be identified.
In the embodiment of the application, after the identification result of the image to be identified is obtained, whether the identification result reaches a preset flipping condition can be judged; if so, the image to be identified is determined to be a flipped image; if not, the image to be identified is determined not to be a flipped image. The "preset flipping condition" may be set in advance; for example, it may specifically be that the first probability is higher than a first threshold and/or the second probability is lower than a second threshold.
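Putting S41 to S43 together, the following sketch splices the K feature coding results, applies a linear processing layer followed by a fully connected layer with a softmax, and thresholds the first probability; the hidden width and the 0.5 threshold stand in for the unspecified preset flipping condition and are assumptions.

```python
# Hedged sketch of S41-S43; layer sizes and the threshold are assumptions.
import torch
import torch.nn as nn

class FlipRecognitionModel(nn.Module):
    def __init__(self, num_features: int = 128, dim: int = 1024) -> None:
        super().__init__()
        self.linear = nn.Linear(num_features * dim, 256)  # linear processing layer
        self.fc = nn.Linear(256, 2)                       # fully connected layer

    def forward(self, coding_results: torch.Tensor) -> torch.Tensor:
        coding_features = coding_results.flatten(1)       # S41: splice the results
        logits = self.fc(torch.relu(self.linear(coding_features)))
        return torch.softmax(logits, dim=-1)  # [first probability, second probability]

FIRST_THRESHOLD = 0.5  # assumed stand-in for the preset flipping condition

def is_flipped(probs: torch.Tensor) -> bool:
    return bool(probs[0, 0] > FIRST_THRESHOLD)            # S43
```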
Based on the above-mentioned related content of S1 to S4, in the method for identifying a flipped image provided by the embodiment of the application, after the image to be identified is obtained, visual feature extraction processing is performed on the image to be identified to obtain the visual features of the image to be identified; feature splitting processing is then performed on the visual features to obtain at least one feature to be used; feature coding processing is then performed on each feature to be used to obtain the feature coding result of each feature to be used; and finally, whether the image to be identified is a flipped image is determined according to the feature coding result of the at least one feature to be used.
It can be seen that the "feature coding result of the at least one feature to be used" can accurately represent the image information carried by the image to be identified, and in particular the moiré information it carries, so that it can accurately indicate whether the image to be identified is a flipped image; whether a piece of image data is a flipped image can therefore subsequently be identified on the basis of the "feature coding result of the at least one feature to be used".
Method embodiment II
In order to improve the identification effect for flipped images, the embodiment of the application also provides another possible implementation of the flipped image identification method, in which the method may further include S5-S7 in addition to S1, S2, and S4 described above:
S5: determining a weighted weight value of each feature to be used.
The weighted weight value of the k-th feature to be used is used for representing the importance degree of the image information represented by the k-th feature to be used; the embodiment of the application does not limit the method for acquiring the "weighted weight value of the k-th feature to be used". For example, it may be preset. Here k is a positive integer, k is less than or equal to K, K is a positive integer, and K represents the number of features to be used (e.g., 128).
In addition, in order to improve the flexibility of identifying the flipped image, the embodiment of the present application further provides another possible implementation manner of S5, which may specifically include: and inputting each feature to be used into a pre-constructed weight determination model to obtain a weighted weight value of each feature to be used output by the weight determination model.
The weight determining model is used for carrying out information importance measurement processing on the input data of the weight determining model; moreover, embodiments of the present application are not limited to this "weight determination model," and may be any machine learning model (e.g., a 1×1 convolutional network), for example.
In addition, the embodiment of the application does not limit the construction process of the above "weight determination model"; for example, it may be implemented by any model construction method existing now or appearing in the future. As another example, it may be implemented by the model construction method shown in Method embodiment III below.
S6: and multiplying each feature to be used by the weighted weight value of each feature to be used to obtain each weighted feature.
As an example, S6 may specifically include: multiplying the k-th feature to be used by the weighted weight value of the k-th feature to be used to obtain the k-th weighted feature, so that the k-th weighted feature can more accurately represent both the image information carried by the k-th feature to be used and the degree of influence of that information on the subsequent flipped image recognition processing.
S7: and respectively carrying out feature coding processing on each weighted feature to obtain feature coding results of each feature to be used.
It should be noted that the relevant content of S7 is similar to that of S3 above; it suffices to replace the "feature to be used" in S3 with the "weighted feature".
Based on the above-mentioned related content of S5 to S7, for the image to be identified, the at least one feature to be used of the image to be identified may be encoded with reference to the weighted weight values of these features to obtain the feature coding result of each feature to be used, so that the feature coding results can better represent the image information (especially the moiré information) carried by the image to be identified, which helps improve the identification accuracy for flipped images.
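A sketch of S5 to S7 follows, taking the 1 × 1 convolutional network mentioned above as the weight determination model; squashing the weights with a sigmoid and laying the features out as (batch, K, 1024) are assumptions.

```python
# Hedged sketch of S5-S7: a 1 x 1 convolution produces one weighted weight
# value per feature to be used, and S6 multiplies the features by it.
import torch
import torch.nn as nn

class WeightDeterminationModel(nn.Module):
    def __init__(self, dim: int = 1024) -> None:
        super().__init__()
        self.conv = nn.Conv1d(dim, 1, kernel_size=1)  # the 1 x 1 convolution

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, K, dim) -> weights: (batch, K, 1)
        weights = self.conv(features.transpose(1, 2)).transpose(1, 2)
        return torch.sigmoid(weights)

def weight_features(features: torch.Tensor,
                    model: WeightDeterminationModel) -> torch.Tensor:
    return features * model(features)  # S6: one weight per feature, broadcast
```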
Method embodiment III
In order to improve the construction effect of the above models (such as the visual feature extraction model, the feature coding model, the weight determination model, and the flip recognition model), the embodiment of the application further provides a model construction method, which may specifically include steps 11 to 14:
step 11: a sample image and an actual label of the sample image are acquired.
Wherein, the sample image refers to image data required to be used in the model construction process; and the number of sample images is not limited in the embodiment of the application.
The "actual label of the sample image" is used to indicate whether the sample image is actually a flip image; the embodiment of the application is not limited to the acquisition process of the actual label of the sample image, and can be implemented in a manual labeling mode, for example.
In addition, the embodiment of the present application is not limited to the above-described expression of "actual label of sample image", and for example, it may include a third probability and a fourth probability. Wherein the "third probability" is used to indicate the likelihood that the sample image actually belongs to the flip image; the "fourth probability" is used to indicate a possibility that the sample image does not actually belong to the flip image.
Step 12: at least one image to be used is determined from the sample image.
Wherein the "image to be used" is used to represent part of image information or all of image information of the sample image.
In addition, the embodiment of the application does not limit the above "at least one image to be used". For example, it may include one first image sample, H second image samples, and G third image samples, where H and G are both positive integers. Note that the embodiment of the application does not limit the relationship between H and G; for example, G = 2 × H.
The first image sample is used for recording all image information in the sample image; furthermore, embodiments of the present application are not limited to this determination of the "first image sample", and for example, the sample image may be directly determined as the first image sample.
The "second image sample" is used to record the image information at the first cropping scale (e.g., 70%) in the sample image; the embodiment of the application is not limited to the determination process of the "H second image samples", for example, the sample image may be subjected to the H random cropping process according to the first cropping proportion to obtain the H second image sample, so that the same proportion between the image information carried by the H second image sample and the image information carried by the sample image reaches the "first cropping proportion".
The "third image sample" is used to record the image information at the second cropping rate (e.g., 40%) in the sample image; in addition, the embodiment of the present application is not limited to the determination process of the "G third image samples", for example, the sample image may be subjected to a G-th random cropping process according to the second cropping ratio, so as to obtain a G-th third image sample, so that the same ratio between the image information carried by the G-th third image sample and the image information carried by the sample image may reach the "second cropping ratio".
In addition, the embodiment of the present application is not limited to the implementation of step 12, and for example, it may specifically include steps 121 to 122:
Step 121: cropping the sample image according to a preset rule to obtain at least one cropped image.
The preset rule may be set in advance; for example, it may specifically include: performing H cropping processes on the sample image according to the first cropping proportion, and performing G cropping processes on the sample image according to the second cropping proportion.
The "clip image" described above is used to represent part of the image information in the sample image.
Step 122: at least one image to be used is determined from the sample image and the at least one cropping image.
The embodiment of the application does not limit the implementation of step 122. For example, the sample image and each cropped image may be directly determined as images to be used. As another example, step 122 may specifically be: first performing data filling processing on each cropped image according to the image size of the sample image to obtain each filled image, so that the image size of each filled image is consistent with the image size of the sample image; and then determining the sample image and each filled image as images to be used.
Based on the above-mentioned related content of step 12, after the sample image is obtained, different images to be used may be determined with reference to the image information of different areas in the sample image, so that the different images to be used represent the image information carried by different areas of the sample image; each model constructed on the basis of these images to be used therefore has better region-level feature extraction performance, which helps improve the feature extraction effect and thereby the identification effect for flipped images.
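For illustration, the sketch below builds the images to be used with the example values above (H crops at a 70% proportion, G = 2 × H crops at 40%, zero padding back to the sample size); applying the proportion to each side and anchoring the padded paste at the top-left corner are simplifying assumptions.

```python
# Hedged sketch of step 12; crop proportions follow the example, the rest
# (per-side proportion, zero fill, top-left anchoring) is assumed.
import random
from PIL import Image

def random_crop(image: Image.Image, proportion: float) -> Image.Image:
    w, h = image.size
    cw, ch = int(w * proportion), int(h * proportion)
    left, top = random.randint(0, w - cw), random.randint(0, h - ch)
    return image.crop((left, top, left + cw, top + ch))

def pad_to(image: Image.Image, size: tuple) -> Image.Image:
    canvas = Image.new("RGB", size, (0, 0, 0))  # data filling with zeros
    canvas.paste(image, (0, 0))
    return canvas

def images_to_use(sample: Image.Image, h: int = 2) -> list:
    crops = [random_crop(sample, 0.7) for _ in range(h)]       # second samples
    crops += [random_crop(sample, 0.4) for _ in range(2 * h)]  # third samples
    return [sample] + [pad_to(c, sample.size) for c in crops]
```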
Step 13: the actual label of the sample image is determined as the actual label of each image to be used.
In the embodiment of the application, after each image to be used is acquired, the actual label of each image to be used can be determined according to the actual label of the sample image, so that the actual label of each image to be used is consistent with the actual label of the sample image.
Step 14: at least one of a visual feature extraction model, a feature encoding model, a weight determination model, and a tap recognition model is constructed using at least one image to be used and an actual tag of the at least one image to be used.
As an example, step 14 may specifically include steps 141-144:
Step 141: and inputting each image to be used into the model to be trained, and obtaining the identification result of each image to be used output by the model to be trained.
The "model to be trained" is used for performing flipped image recognition processing on its input data; moreover, the embodiment of the application does not limit this "model to be trained". For example, it may be a machine learning model.
In addition, the embodiment of the application does not limit the model structure of the above "model to be trained". For example, as shown in fig. 2, the "model to be trained" may specifically include a visual feature extraction layer, a feature splitting layer, a weight determining layer, a feature weighting layer, a feature coding layer, a feature splicing layer, and a flip recognition layer. The input data of the flip recognition layer comprises the output data of the feature splicing layer; the input data of the feature splicing layer comprises the output data of the feature coding layer; the input data of the feature coding layer comprises the output data of the feature weighting layer; the input data of the feature weighting layer comprises the output data of the weight determining layer and the output data of the feature splitting layer; the input data of the weight determining layer comprises the output data of the feature splitting layer; and the input data of the feature splitting layer comprises the output data of the visual feature extraction layer.
The "visual feature extraction layer" is used for performing visual feature extraction processing on one image data; moreover, embodiments of the present application are not limited to the network structure of the "visual feature extraction layer", for example, it may be implemented using the model structure of the "visual feature extraction model" above.
The characteristic splitting layer is used for carrying out characteristic splitting processing on input data of the characteristic splitting layer; moreover, the working principle of the "feature splitting layer" is not limited, and for example, it may be implemented by using the feature splitting process shown in the above S21-S23.
The weight determining layer is used for carrying out information importance measurement processing on the input data of the weight determining layer; moreover, embodiments of the present application are not limited to the network structure of the "weight determination layer", and may be implemented, for example, using the model structure of the "weight determination model" above.
The "feature weighting layer" is used for performing weighting processing on its input data; moreover, the embodiment of the application does not limit the working principle of the "feature weighting layer". For example, it may be implemented using the weighting process shown in S6 above.
The characteristic coding layer is used for coding input data of the characteristic coding layer; moreover, embodiments of the present application are not limited to the network structure of the "feature encoding layer", and may be implemented, for example, using the model structure of the "feature encoding model" above.
The characteristic splicing layer is used for carrying out splicing processing on input data of the characteristic splicing layer; the working principle of the characteristic splicing layer is not limited, and the characteristic splicing layer can be implemented by any splicing method existing or appearing in the future.
The "flip recognition layer" is used for performing flipped image recognition processing on its input data; moreover, the embodiment of the application does not limit the network structure of the "flip recognition layer". For example, it may be implemented using the model structure of the "flip recognition model" above.
The above-described "recognition result of the image to be used" is used to indicate the predicted possibility that the image to be used belongs to a flipped image.
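The sketch below wires the seven layers of fig. 2 in the order just described; every concrete size reuses the running example (128 × 128 input, 64 × 64 × 32 feature map, 128 features of 1024 dimensions), and every specific layer choice is an assumption carried over from the component sketches above.

```python
# Hedged sketch of the model to be trained in fig. 2; all sizes and layer
# choices are assumptions consistent with the running example.
import torch
import torch.nn as nn

class ModelToBeTrained(nn.Module):
    def __init__(self, dim: int = 1024, num_features: int = 128) -> None:
        super().__init__()
        self.visual = nn.Sequential(                    # visual feature extraction layer
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.weight = nn.Conv1d(dim, 1, kernel_size=1)  # weight determining layer
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=5)  # feature coding layer
        self.head = nn.Sequential(                      # flip recognition layer
            nn.Linear(num_features * dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 2))

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        fmap = self.visual(image)                                 # (B, 32, 64, 64)
        feats = fmap.flatten(1).reshape(image.size(0), -1, 1024)  # feature splitting layer
        w = torch.sigmoid(self.weight(feats.transpose(1, 2))).transpose(1, 2)
        weighted = feats * w                                      # feature weighting layer
        coding = self.encoder(weighted).flatten(1)                # feature splicing layer
        return torch.softmax(self.head(coding), dim=-1)
```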
Step 142: judging whether a preset stopping condition is met, if so, executing step 144; if not, step 143 is performed.
Wherein, the "preset stop condition" may be set in advance; the embodiment of the application does not limit this condition. For example, it may be that the model loss value of the model to be trained falls below a preset loss threshold, that the rate of change of the model loss value falls below a preset change rate threshold, or that the number of updates to the model to be trained reaches a preset count threshold.
The "model loss value of the model to be trained" is used for representing the flipped image recognition performance of the model to be trained; the embodiment of the application does not limit the process of determining this model loss value, which may be implemented by any model loss value calculation method existing now or appearing in the future.
Step 143: and updating the model to be trained according to the identification result of the at least one image to be used and the actual label of the at least one image to be used, and returning to the step 141.
In the embodiment of the application, after it is determined that the model to be trained in the current round does not reach the preset stop condition, it can be concluded that the flipped image recognition performance of the model to be trained is still relatively poor, so the model to be trained can be updated according to the difference between the recognition result of the at least one image to be used and the actual label of the at least one image to be used, so that the updated model has better flipped image recognition performance; step 141 and its subsequent steps are then executed again based on the updated model, realizing the next round of training for the model to be trained.
Step 144: determining at least one of the visual feature extraction model, the feature coding model, the weight determination model, and the flip recognition model according to the model to be trained.
In the embodiment of the application, after it is determined that the model to be trained in the current round reaches the preset stop condition, it can be concluded that the model to be trained has good flipped image recognition performance, so at least one of the visual feature extraction model, the feature coding model, the weight determination model, and the flip recognition model can be determined directly from the model to be trained. The determination process may specifically be: determining the visual feature extraction model according to the visual feature extraction layer in the model to be trained (for example, directly determining the visual feature extraction layer as the visual feature extraction model); determining the feature coding model according to the feature coding layer in the model to be trained (for example, directly determining the feature coding layer as the feature coding model); determining the weight determination model according to the weight determining layer in the model to be trained (for example, directly determining the weight determining layer as the weight determination model); and determining the flip recognition model according to the flip recognition layer in the model to be trained (for example, directly determining the flip recognition layer as the flip recognition model).
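A compact sketch of the loop in steps 141 to 144; the negative log-likelihood loss over the predicted probabilities, the Adam optimizer, and a fixed update count as the preset stop condition all go beyond what the text fixes and are assumptions.

```python
# Hedged sketch of the training loop of steps 141-144.
import torch
import torch.nn as nn

def train(model: nn.Module, images: torch.Tensor, labels: torch.Tensor,
          max_updates: int = 1000) -> nn.Module:
    # labels: (batch,) long tensor, 1 = actually a flipped image
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.NLLLoss()
    for _ in range(max_updates):                         # preset stop condition
        probs = model(images)                            # step 141
        loss = loss_fn(torch.log(probs + 1e-8), labels)  # gap to the actual labels
        optimizer.zero_grad()
        loss.backward()                                  # step 143: update
        optimizer.step()
    return model  # step 144: its layers yield the four sub-models
```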
Based on the above-mentioned related content of steps 11 to 14, after the sample image and its actual label are obtained, at least one image to be used and its actual label can be determined with reference to the sample image and its actual label; at least one of the visual feature extraction model, the feature coding model, the weight determination model, and the flip recognition model is then constructed using the at least one image to be used and its actual label, so that each constructed model has good model performance, which helps improve the recognition performance for flipped images.
Based on the above method for identifying a flipped image provided by the embodiment of the method, the embodiment of the application also provides a flipped image identifying device, which is explained and illustrated below with reference to the accompanying drawings.
Device embodiment
For the technical details of the apparatus for identifying a flipped image provided in the apparatus embodiment, please refer to the above-mentioned method embodiment.
Referring to fig. 3, this figure shows the structure of a device for identifying a flip image according to an embodiment of the present application.
The device 300 for identifying a flip image provided in the embodiment of the present application includes:
an extracting unit 301, configured to perform a visual feature extraction process on an image to be identified after the image to be identified is acquired, so as to obtain visual features of the image to be identified;
a splitting unit 302, configured to perform feature splitting processing on the visual feature to obtain at least one feature to be used;
the encoding unit 303 is configured to perform feature encoding processing on each feature to be used, so as to obtain feature encoding results of each feature to be used;
and the identifying unit 304 is configured to determine whether the image to be identified is a flip image according to the feature encoding result of the at least one feature to be used.
In one possible implementation, the visual features include a feature map to be used, and the feature map to be used includes N channels of image data; wherein N is a positive integer;
The splitting unit 302 is specifically configured to: flatten the image data of the nth channel in the feature map to be used to obtain the nth flattening data, wherein n is a positive integer and n is less than or equal to N; carry out data extraction processing on the nth flattening data according to a preset extraction parameter to obtain at least one extracted data segment of the nth flattening data; and determine the at least one feature to be used from the at least one extracted data segment of the N pieces of flattening data.
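A hedged Python sketch of this flatten-and-extract behaviour follows; treating a fixed segment length seg_len as the preset extraction parameter is an assumption, and any non-divisible tail of a channel is simply dropped here.

import torch

def split_visual_features(feature_map_to_use: torch.Tensor, seg_len: int) -> list[torch.Tensor]:
    # feature_map_to_use: shape (N, H, W), i.e. image data of N channels.
    features_to_use = []
    for n in range(feature_map_to_use.shape[0]):
        flattened = feature_map_to_use[n].reshape(-1)  # nth flattening data
        # data extraction according to the preset extraction parameter
        for start in range(0, flattened.numel() - seg_len + 1, seg_len):
            features_to_use.append(flattened[start:start + seg_len])
    return features_to_use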
In one possible implementation, the device 300 further includes:
a determining unit, configured to determine a weighted weight value of each feature to be used;
the weighting unit is used for multiplying each feature to be used with the weighting weight value of each feature to be used to obtain each weighting feature;
The encoding unit 303 is specifically configured to: respectively perform feature coding processing on each weighted feature to obtain the feature coding result of each feature to be used.
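A minimal sketch of the determining, weighting and encoding units acting together is given below; the feature width, the coding width and the sigmoid-based weight head are all assumptions rather than the application's concrete sub-models.

import torch
import torch.nn as nn

seg_len, code_dim = 64, 32  # assumed widths
weight_determination_model = nn.Sequential(nn.Linear(seg_len, 1), nn.Sigmoid())
feature_coding_model = nn.Linear(seg_len, code_dim)

def encode_features(features_to_use: list[torch.Tensor]) -> list[torch.Tensor]:
    coding_results = []
    for feature in features_to_use:
        weight = weight_determination_model(feature)  # determining unit: weighted weight value
        weighted_feature = feature * weight           # weighting unit: multiply feature by weight
        coding_results.append(feature_coding_model(weighted_feature))  # encoding unit 303
    return coding_results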
In a possible embodiment, the determining unit is specifically configured to: input each feature to be used into a pre-constructed weight determination model to obtain the weighted weight value of each feature to be used, which is output by the weight determination model.
In a possible implementation manner, the identifying unit 304 includes:
a splicing subunit, configured to splice the feature encoding results of the at least one feature to be used to obtain an encoding feature of the image to be identified;
the processing subunit is used for performing flip image recognition processing on the coding features of the image to be identified to obtain an identification result of the image to be identified;
and the determining subunit is used for determining whether the image to be identified is a flip image or not according to the identification result of the image to be identified.
In a possible embodiment, the processing subunit is specifically configured to: input the coding features of the image to be identified into a pre-constructed flip recognition model to obtain the identification result of the image to be identified, which is output by the flip recognition model.
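The three subunits of the identifying unit 304 can be sketched as below; the number and width of the feature coding results, and the linear classification head, are assumptions used only for illustration.

import torch
import torch.nn as nn

num_features, code_dim = 16, 32  # assumed sizes
flip_recognition_model = nn.Linear(num_features * code_dim, 2)  # 2 classes: flip / not flip

def identify(coding_results: list[torch.Tensor]) -> bool:
    coding_feature = torch.cat(coding_results)       # splicing subunit
    result = flip_recognition_model(coding_feature)  # processing subunit: identification result
    return bool(result.argmax() == 1)                # determining subunit: True means flip image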
In one possible implementation, the device 300 further includes:
the construction unit is used for acquiring a sample image and an actual label of the sample image; the actual label of the sample image is used for indicating whether the sample image is actually a flip image or not; determining at least one image to be used according to the sample image; determining the actual label of the sample image as the actual label of each image to be used; and constructing the flip recognition model by using the at least one image to be used and the actual label of the at least one image to be used.
In one possible embodiment, the construction unit comprises:
the identification subunit is used for inputting each image to be used into the model to be trained to obtain an identification result of each image to be used output by the model to be trained;
the updating subunit is used for updating the model to be trained according to the identification result of the at least one image to be used and the actual label of the at least one image to be used, and returning to the identification subunit to continue executing the step of inputting each image to be used into the model to be trained;
and the construction subunit is used for determining the flip recognition model according to the model to be trained when a preset stop condition is reached.
In one possible implementation, the model to be trained includes: a visual feature extraction layer, a feature splitting layer, a weight determining layer, a feature weighting layer, a feature coding layer, a feature splicing layer and a flip recognition layer; the input data of the flip recognition layer comprises the output data of the feature splicing layer; the input data of the feature splicing layer comprises the output data of the feature coding layer; the input data of the feature coding layer comprises the output data of the feature weighting layer; the input data of the feature weighting layer comprises the output data of the weight determining layer and the output data of the feature splitting layer; the input data of the weight determining layer comprises the output data of the feature splitting layer; and the input data of the feature splitting layer comprises the output data of the visual feature extraction layer;
The construction subunit is specifically configured to: determine the flip recognition model according to the flip recognition layer in the model to be trained.
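Purely as an illustration of this layer-to-layer wiring (every concrete size, the convolution and the linear heads are assumptions, not the application's design), the data flow of the model to be trained could be sketched as:

import torch
import torch.nn as nn

class ModelToTrain(nn.Module):
    def __init__(self, channels: int = 4, seg_len: int = 56 * 56, code_dim: int = 32):
        super().__init__()
        self.visual_feature_layer = nn.Conv2d(3, channels, kernel_size=3, stride=4, padding=1)
        self.weight_layer = nn.Sequential(nn.Linear(seg_len, 1), nn.Sigmoid())
        self.coding_layer = nn.Linear(seg_len, code_dim)
        self.recognition_layer = nn.LazyLinear(2)  # infers the spliced width on first call

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fmap = self.visual_feature_layer(x)     # visual feature extraction layer
        b, c, h, w = fmap.shape
        split = fmap.reshape(b, c, h * w)       # feature splitting layer: one segment per channel
        weights = self.weight_layer(split)      # weight determining layer
        weighted = split * weights              # feature weighting layer
        coded = self.coding_layer(weighted)     # feature coding layer
        spliced = coded.reshape(b, -1)          # feature splicing layer
        return self.recognition_layer(spliced)  # flip recognition layer

With the assumed defaults, a 1 x 3 x 224 x 224 input yields 4 channels of 56 x 56 visual features, so each flattened channel matches seg_len = 3136.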
In a possible implementation manner, the construction unit is specifically configured to: cut the sample image according to a preset rule to obtain at least one cropped image; and determine the at least one image to be used according to the sample image and the at least one cropped image.
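One hedged reading of the preset cutting rule follows; the four-corner-crop choice and the crop size are assumptions for illustration only.

import torch

def determine_images_to_use(sample_image: torch.Tensor, crop: int = 224) -> list[torch.Tensor]:
    # sample_image: (C, H, W) with H and W at least `crop`.
    _, h, w = sample_image.shape
    crops = [
        sample_image[:, :crop, :crop],          # top-left
        sample_image[:, :crop, w - crop:],      # top-right
        sample_image[:, h - crop:, :crop],      # bottom-left
        sample_image[:, h - crop:, w - crop:],  # bottom-right
    ]
    return [sample_image] + crops  # the sample image plus its cropped images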
In one possible implementation, the device 300 further includes:
and the adjusting unit is used for adjusting the device-acquired image according to a preset size after the device-acquired image is acquired, so as to obtain the image to be identified.
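A hedged sketch of the adjusting unit, assuming the preset size is 224 x 224 and the device-acquired image is a float tensor:

import torch
import torch.nn.functional as F

def adjust_device_image(device_image: torch.Tensor, preset_size: int = 224) -> torch.Tensor:
    # device_image: (C, H, W) float tensor; resized to the assumed preset size.
    return F.interpolate(device_image.unsqueeze(0), size=(preset_size, preset_size),
                         mode="bilinear", align_corners=False).squeeze(0)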
Based on the above related content of the device 300 for identifying a flip image: after the image to be identified is obtained, the device 300 first performs visual feature extraction processing on the image to be identified to obtain the visual features of the image to be identified; then performs feature splitting processing on the visual features to obtain at least one feature to be used; then performs feature coding processing on each feature to be used to obtain the feature coding result of each feature to be used; and finally determines whether the image to be identified is a flip image according to the feature coding result of the at least one feature to be used. Because the feature coding result of the at least one feature to be used can accurately represent the image information carried by the image to be identified, it can also accurately represent the moire information carried by the image to be identified, and can therefore accurately indicate whether the image to be identified is a flip image; whether the image to be identified is a flip image can thus be identified based on the feature coding result of the at least one feature to be used.
Further, an embodiment of the present application also provides an apparatus, where the apparatus includes a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to execute, according to the computer program, any implementation of the method for identifying a flip image provided by the embodiments of the present application.
Further, an embodiment of the present application also provides a computer readable storage medium for storing a computer program, wherein the computer program is used for executing any implementation of the method for identifying a flip image provided by the embodiments of the present application.
Further, an embodiment of the present application also provides a computer program product which, when run on a terminal device, causes the terminal device to execute any implementation of the method for identifying a flip image provided by the embodiments of the present application.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of" and similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b and c may be singular or plural.
The above description is only of the preferred embodiment of the present invention, and is not intended to limit the present invention in any way. While the invention has been described with reference to preferred embodiments, it is not intended to be limiting. Any person skilled in the art can make many possible variations and modifications to the technical solution of the present invention or modifications to equivalent embodiments using the methods and technical contents disclosed above, without departing from the scope of the technical solution of the present invention. Therefore, any simple modification, equivalent variation and modification of the above embodiments according to the technical substance of the present invention still fall within the scope of the technical solution of the present invention.

Claims (14)

1. A method for identifying a flip image, the method comprising:
after an image to be identified is obtained, performing visual feature extraction processing on the image to be identified to obtain visual features of the image to be identified;
performing feature splitting treatment on the visual features to obtain at least one feature to be used;
performing feature coding processing on each feature to be used to obtain feature coding results of each feature to be used;
Determining whether the image to be identified is a flip image or not according to the feature coding result of the at least one feature to be used;
the visual features comprise a feature map to be used, and the feature map to be used comprises image data of N channels; wherein N is a positive integer;
the determining process of the at least one feature to be used comprises the following steps:
flattening the image data of the nth channel in the feature map to be used to obtain nth flattening data; wherein n is a positive integer, and n is less than or equal to N;
carrying out data extraction processing on the nth flattening data according to a preset extraction parameter to obtain at least one extracted data segment of the nth flattening data; wherein n is a positive integer, and n is less than or equal to N;
the at least one feature to be used is determined from the at least one extracted data segment of the N pieces of flattening data.
2. The method according to claim 1, wherein the method further comprises:
determining a weighted weight value of each feature to be used;
multiplying each feature to be used with a weighted weight value of each feature to be used to obtain each weighted feature;
and performing feature coding processing on each feature to be used to obtain feature coding results of each feature to be used, wherein the feature coding results comprise:
And respectively carrying out feature coding processing on each weighted feature to obtain feature coding results of each feature to be used.
3. The method of claim 2, wherein said determining a weighted weight value for each of said features to be used comprises:
and inputting each feature to be used into a pre-constructed weight determination model to obtain a weighted weight value of each feature to be used, which is output by the weight determination model.
4. The method according to claim 1, wherein the determining whether the image to be identified is a flip image according to the feature coding result of the at least one feature to be used includes:
splicing the feature coding results of the at least one feature to be used to obtain coding features of the image to be identified;
performing flip image recognition processing on the coding features of the image to be identified to obtain an identification result of the image to be identified;
and determining whether the image to be identified is a flip image or not according to the identification result of the image to be identified.
5. The method according to claim 4, wherein the performing flip image recognition processing on the coding features of the image to be identified to obtain the identification result of the image to be identified includes:
Inputting the coding features of the image to be identified into a pre-constructed flip recognition model to obtain the identification result of the image to be identified, which is output by the flip recognition model.
6. The method of claim 5, wherein the process of constructing the flip recognition model comprises:
acquiring a sample image and an actual label of the sample image; the actual label of the sample image is used for indicating whether the sample image is actually a flip image or not;
determining at least one image to be used according to the sample image;
determining the actual label of the sample image as the actual label of each image to be used;
and constructing the flip recognition model by using the at least one image to be used and the actual label of the at least one image to be used.
7. The method of claim 6, wherein constructing the flip recognition model using the at least one image to be used and the actual label of the at least one image to be used comprises:
inputting each image to be used into a model to be trained, and obtaining the identification result of each image to be used output by the model to be trained;
Updating the model to be trained according to the identification result of the at least one image to be used and the actual label of the at least one image to be used, and continuing to execute the step of inputting each image to be used into the model to be trained until a preset stop condition is reached, and determining the flip recognition model according to the model to be trained.
8. The method of claim 7, wherein the model to be trained comprises: a visual feature extraction layer, a feature splitting layer, a weight determining layer, a feature weighting layer, a feature coding layer, a feature splicing layer and a flip recognition layer; the input data of the flip recognition layer comprises the output data of the feature splicing layer; the input data of the feature splicing layer comprises the output data of the feature coding layer; the input data of the feature coding layer comprises the output data of the feature weighting layer; the input data of the feature weighting layer comprises the output data of the weight determining layer and the output data of the feature splitting layer; the input data of the weight determining layer comprises the output data of the feature splitting layer; and the input data of the feature splitting layer comprises the output data of the visual feature extraction layer;
The determining the flip recognition model according to the model to be trained comprises:
determining the flip recognition model according to the flip recognition layer in the model to be trained.
9. The method of claim 6, wherein said determining at least one image to be used from said sample image comprises:
cutting the sample image according to a preset rule to obtain at least one cropped image;
and determining the at least one image to be used according to the sample image and the at least one cropped image.
10. The method according to claim 1, wherein the process of acquiring the image to be identified comprises:
after the device-acquired image is acquired, adjusting the device-acquired image according to a preset size to obtain the image to be identified.
11. A flip image recognition device, comprising:
the extraction unit is used for carrying out visual feature extraction processing on the image to be identified after the image to be identified is acquired, so as to obtain visual features of the image to be identified;
the splitting unit is used for carrying out feature splitting treatment on the visual features to obtain at least one feature to be used;
The coding unit is used for respectively carrying out feature coding processing on each feature to be used to obtain feature coding results of each feature to be used;
the identification unit is used for determining whether the image to be identified is a flip image or not according to the feature coding result of the at least one feature to be used;
the visual features comprise a feature map to be used, and the feature map to be used comprises image data of N channels; wherein N is a positive integer;
the splitting unit is specifically configured to:
flattening the image data of the nth channel in the feature map to be used to obtain nth flattening data; wherein n is a positive integer, and n is less than or equal to N;
carrying out data extraction processing on the nth flattening data according to a preset extraction parameter to obtain at least one extracted data segment of the nth flattening data; wherein n is a positive integer, and n is less than or equal to N;
the at least one feature to be used is determined from the at least one extracted data segment of the N pieces of flattening data.
12. An apparatus comprising a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to perform the method of any of claims 1-10 according to the computer program.
13. A computer readable storage medium, characterized in that the computer readable storage medium is for storing a computer program for executing the method of any one of claims 1-10.
14. A computer program product, characterized in that the computer program product, when run on a terminal device, causes the terminal device to perform the method of any of claims 1-10.
CN202111275642.9A 2021-10-29 2021-10-29 Method for identifying flip image and related equipment thereof Active CN114005019B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111275642.9A CN114005019B (en) 2021-10-29 2021-10-29 Method for identifying flip image and related equipment thereof
PCT/CN2022/119635 WO2023071609A1 (en) 2021-10-29 2022-09-19 Copied image recognition method and related device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111275642.9A CN114005019B (en) 2021-10-29 2021-10-29 Method for identifying flip image and related equipment thereof

Publications (2)

Publication Number Publication Date
CN114005019A CN114005019A (en) 2022-02-01
CN114005019B true CN114005019B (en) 2023-09-22

Family

ID=79925649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111275642.9A Active CN114005019B (en) 2021-10-29 2021-10-29 Method for identifying flip image and related equipment thereof

Country Status (2)

Country Link
CN (1) CN114005019B (en)
WO (1) WO2023071609A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005019B (en) * 2021-10-29 2023-09-22 北京有竹居网络技术有限公司 Method for identifying flip image and related equipment thereof
CN116958795A (en) * 2023-06-30 2023-10-27 北京房多多信息技术有限公司 Method and device for identifying flip image, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008193409A (en) * 2007-02-05 2008-08-21 Ricoh Co Ltd Image processing apparatus
EP3617947A1 (en) * 2018-08-30 2020-03-04 Nokia Technologies Oy Apparatus and method for processing image data
CN111476269B (en) * 2020-03-04 2024-04-16 中国平安人寿保险股份有限公司 Balanced sample set construction and image reproduction identification method, device, equipment and medium
CN111368944B (en) * 2020-05-27 2020-09-08 支付宝(杭州)信息技术有限公司 Method and device for recognizing copied image and certificate photo and training model and electronic equipment
CN113344000A (en) * 2021-06-29 2021-09-03 南京星云数字技术有限公司 Certificate copying and recognizing method and device, computer equipment and storage medium
CN114005019B (en) * 2021-10-29 2023-09-22 北京有竹居网络技术有限公司 Method for identifying flip image and related equipment thereof

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598933A (en) * 2014-11-13 2015-05-06 上海交通大学 Multi-feature fusion based image copying detection method
CN109325462A (en) * 2018-10-11 2019-02-12 深圳斐视沃德科技有限公司 Recognition of face biopsy method and device based on iris
WO2020077866A1 (en) * 2018-10-17 2020-04-23 平安科技(深圳)有限公司 Moire-based image recognition method and apparatus, and device and storage medium
CN109815970A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 Recognition methods, device, computer equipment and the storage medium of reproduction image
CN109815960A (en) * 2018-12-21 2019-05-28 深圳壹账通智能科技有限公司 Recognition method, device, device and medium for remake image based on deep learning
CN109886275A (en) * 2019-01-16 2019-06-14 深圳壹账通智能科技有限公司 Reproduction image recognition method, device, computer equipment and storage medium
CN110110715A (en) * 2019-04-30 2019-08-09 北京金山云网络技术有限公司 Text detection model training method, text filed, content determine method and apparatus
WO2021012526A1 (en) * 2019-07-22 2021-01-28 平安科技(深圳)有限公司 Face recognition model training method, face recognition method and apparatus, device, and storage medium
CN110717450A (en) * 2019-10-09 2020-01-21 深圳大学 Training method and detection method for automatically recognizing remake images of original documents
CN111027584A (en) * 2019-10-23 2020-04-17 宋飞 Classroom behavior identification method and device
CN111191568A (en) * 2019-12-26 2020-05-22 中国平安人寿保险股份有限公司 Method, device, equipment and medium for identifying copied image
CN111461143A (en) * 2020-03-31 2020-07-28 珠海格力电器股份有限公司 Picture copying identification method and device and electronic equipment
CN111462120A (en) * 2020-06-17 2020-07-28 熵智科技(深圳)有限公司 Defect detection method, device, medium and equipment based on semantic segmentation model
CN112070714A (en) * 2020-07-29 2020-12-11 西安工业大学 A method for re-shot image detection based on local ternary counting features
CN112883256A (en) * 2021-01-11 2021-06-01 北京达佳互联信息技术有限公司 Multitasking method and device, electronic equipment and storage medium
CN112861950A (en) * 2021-01-29 2021-05-28 深圳前海微众银行股份有限公司 Method and device for identifying copied image, electronic equipment and storage medium
CN112767394A (en) * 2021-03-04 2021-05-07 重庆赛迪奇智人工智能科技有限公司 Image detection method, device and equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Face spoofing detection with highlight removal effect and distortions; Inhan Kim et al.; 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC); pp. 004299-004304 *
Use of both Invisible and Emergable Watermarks to Deter Illegal Copying of Images; Takaaki Yamada et al.; 17th International Conference in Knowledge Based and Intelligent Information and Engineering Systems - KES2013; pp. 401-410 *
Hash retrieval method for recaptured images based on convolutional neural network; Li Jing; Journal of Communication University of China (Natural Science Edition); pp. 65-71 *
Deep learning-based detection of recaptured images; Xie Xinqian; Liu Xia; Kong Yueping; Computer Knowledge and Technology (No. 16); full text *

Also Published As

Publication number Publication date
CN114005019A (en) 2022-02-01
WO2023071609A1 (en) 2023-05-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant