
CN112529846A - Image processing method and device, electronic equipment and storage medium

Image processing method and device, electronic equipment and storage medium

Info

Publication number
CN112529846A
Authority
CN
China
Prior art keywords
feature
image
target image
feature information
reference image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011337518.6A
Other languages
Chinese (zh)
Inventor
顾津锦
董超
任思捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202011337518.6A
Publication of CN112529846A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30168: Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to an image processing method and apparatus, an electronic device, and a storage medium. The method includes: respectively acquiring first feature information of a target image and second feature information of a reference image, where the target image and the reference image correspond to the same scene; acquiring a feature difference value between the second feature information and the first feature information according to a plurality of first pixel points in the target image and a second pixel point in the reference image, where the spatial distance between each of the plurality of first pixel points and the second pixel point does not exceed a preset range; determining the similarity between the target image and the reference image according to the feature difference value; and determining the image quality of the target image according to the similarity between the target image and the reference image.

Description

Image processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, an electronic device, and a storage medium.
Background
Image quality evaluation is one of the important factors shaping development in the field of image restoration: image restoration models optimized with different image quality evaluation methods produce output images with different perceptual effects.
However, with the development of generative adversarial network (GAN) technology, generated images may exhibit spatial distortion, and existing image quality evaluation methods cannot objectively evaluate the quality of such images. How to objectively determine the quality of various types of images has therefore become a problem to be solved.
Disclosure of Invention
The present disclosure proposes an image processing technical solution.
According to an aspect of the present disclosure, there is provided an image processing method including:
respectively acquiring first feature information of a target image and second feature information of a reference image, where the target image and the reference image correspond to the same scene; acquiring a feature difference value between the second feature information and the first feature information according to a plurality of first pixel points in the target image and a second pixel point in the reference image, where the spatial distance between each of the plurality of first pixel points and the second pixel point does not exceed a preset range; determining the similarity between the target image and the reference image according to the feature difference value; and determining the image quality of the target image according to the similarity between the target image and the reference image.
Through the above-mentioned embodiment, because the spatial distance between the second pixel point and each of the plurality of first pixel points does not exceed the preset range, a relatively true feature difference value can be obtained from the spatial correspondence between the second pixel point and the plurality of first pixel points even when the target image is spatially distorted. The similarity between the target image and the reference image determined from this feature difference value is therefore more accurate, the quality of the target image determined from that similarity is more accurate and objective, and the accuracy of the quality judgment of the target image is effectively improved.
In a possible implementation manner, the obtaining, according to a plurality of first pixel points in the target image and a second pixel point in the reference image, a feature difference value between the second feature information and the first feature information includes: moving the first feature information in at least one direction within the preset range to obtain at least one piece of third feature information, where the third feature information comprises a plurality of first pixel points in the target image; acquiring a first intermediate feature difference value between the second feature information and the at least one piece of third feature information according to a second pixel point in the reference image; and determining a feature difference value between the second feature information and the first feature information according to the at least one first intermediate feature difference value.
Through the disclosed embodiment, an offset (misaligned) subtraction between the second feature information and the first feature information can be realized by using the third feature information obtained after moving the first feature information. This reduces the influence of factors such as spatial distortion of the target image on the accuracy of the computed feature difference value, so that the obtained feature difference value between the second feature information and the first feature information is more accurate, the similarity between the target image and the reference image determined from it is more accurate, and the image quality of the target image is determined more accurately and objectively.
In a possible implementation manner, the obtaining, according to a plurality of first pixel points in the target image and a second pixel point in the reference image, a feature difference value between the second feature information and the first feature information includes: respectively taking at least one pixel point in the reference image as a second pixel point, and taking, as first pixel points, at least one pixel point in the target image whose positional distance from the second pixel point is within the preset range; obtaining at least one second intermediate feature difference value according to the feature information corresponding to the second pixel point in the second feature information and the feature information corresponding to the at least one first pixel point in the first feature information; determining a feature difference value of the second pixel point according to the at least one second intermediate feature difference value; and determining a feature difference value between the second feature information and the first feature information according to the feature difference value of the at least one second pixel point.
Through the disclosed embodiment, the feature information of a second pixel point of the reference image can be compared, by traversal within a certain range, against the feature information of a plurality of first pixel points of the target image, searching for the pixel point in the target image that actually corresponds to each reference pixel point under spatial distortion. A feature difference value determined from these actually corresponding pixel points allows the similarity between the target image and the reference image to be judged more accurately, so the quality of the target image can be evaluated more objectively and accurately.
In a possible implementation manner, the first feature information includes first feature information in at least one scale, and the second feature information includes second feature information in at least one scale; the obtaining, according to a plurality of first pixel points in the target image and second pixel points in the reference image, a feature difference between the second feature information and the first feature information includes: and respectively determining a feature difference value between the second feature information under the at least one scale and the first feature information with the same scale according to the second pixel point in the reference image and the plurality of first pixel points in the target image to obtain the feature difference value under the at least one scale.
Through the disclosed embodiment, the accuracy of the subsequently determined similarity can be improved by obtaining the feature difference values of a plurality of scales, and the objectivity and the accuracy of the image quality of the determined target image are improved.
In a possible implementation manner, the respectively obtaining the first feature information of the target image and the second feature information of the reference image includes: inputting the target image into a feature extraction network, and acquiring first feature information of the target image under at least one scale according to the output of at least one network layer in the feature extraction network; and inputting the reference image into a feature extraction network, and acquiring second feature information of the reference image under at least one scale according to the output of at least one network layer in the feature extraction network.
Through the above-mentioned disclosed embodiment, on the one hand, the first feature information and the second feature information can be obtained accurately and efficiently by using the feature extraction network; on the other hand, first and second feature information at more than one scale can be obtained, so that the extracted feature information can include low-frequency features in the image (such as the picture background) as well as medium- and high-frequency features (such as texture details and contours).
In a possible implementation manner, the determining the similarity between the target image and the reference image according to the feature difference includes: and inputting the characteristic difference value into a regression network, and determining the similarity between the target image and the reference image according to the output of the regression network.
Through the disclosed embodiment, the regression network can be used for efficiently and accurately processing and calculating the characteristic difference value, so that the more accurate similarity is obtained, and the efficiency, the accuracy and the realization convenience of the process of determining the quality of the target image are improved.
In one possible implementation, the feature difference value includes a feature difference value at least one scale; the determining the similarity between the target image and the reference image according to the feature difference comprises: respectively inputting the feature difference values under the at least one scale into a regression network to obtain an output result of the at least one regression network; and fusing the output results of the at least one regression network, and determining the similarity between the target image and the reference image according to the fusion result.
Through the disclosed embodiment, multiple similarity estimates for the target image and the reference image can be obtained from their feature difference values at different scales, so that the similarity between the two images is judged comprehensively from these estimates. This improves the accuracy of the obtained similarity, so the quality of the target image can be evaluated more accurately and objectively.
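As an illustration of this per-scale regression and fusion, the following is a minimal sketch in PyTorch. The module layout, channel sizes, and the choice of averaging as the fusion step are assumptions made for illustration, not structures mandated by the present disclosure.

```python
# Sketch: one small regression head per scale; outputs fused by averaging.
import torch
import torch.nn as nn

class ScaleRegressor(nn.Module):
    """Maps one scale's feature-difference map to a scalar score."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),   # global pooling -> (N, 32, 1, 1)
            nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, diff: torch.Tensor) -> torch.Tensor:
        return self.net(diff)

def fuse_similarity(diffs, regressors):
    """Average the per-scale regression outputs into one similarity score."""
    scores = [reg(d) for d, reg in zip(diffs, regressors)]
    return torch.stack(scores, dim=0).mean(dim=0)

# Example with two assumed scales:
# diffs = [torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28)]
# similarity = fuse_similarity(diffs, [ScaleRegressor(64), ScaleRegressor(128)])
```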
In one possible implementation, the target image is obtained by image preprocessing an initial target image, and the reference image is obtained by image preprocessing an initial reference image; the image preprocessing comprises: regularization processing and/or format conversion.
Through the disclosed embodiment, the obtained target image and the reference image can better meet the requirement of subsequent feature extraction, on one hand, the efficiency of the whole image processing process is improved, on the other hand, the precision of the features obtained by the subsequent feature extraction can also be improved, and therefore the accuracy and the objectivity of the finally determined target image quality are improved.
According to an aspect of the present disclosure, there is provided an image processing apparatus including:
the system comprises a feature extraction module, a feature extraction module and a feature extraction module, wherein the feature extraction module is used for respectively acquiring first feature information of a target image and second feature information of a reference image, and the target image and the reference image correspond to the same scene; a feature difference obtaining module, configured to obtain, according to a plurality of first pixel points in the target image and second pixel points in the reference image, a feature difference between the second feature information and the first feature information, where a distance between the plurality of first pixel points and the second pixel points in a space does not exceed a preset range; a similarity determining module, configured to determine, according to the feature difference, a similarity between the target image and the reference image; and the image quality determining module is used for determining the image quality of the target image according to the similarity between the target image and the reference image.
In a possible implementation manner, the feature difference value obtaining module is configured to: move the first feature information in at least one direction within the preset range to obtain at least one piece of third feature information, where the third feature information comprises a plurality of first pixel points in the target image; acquire a first intermediate feature difference value between the second feature information and the at least one piece of third feature information according to a second pixel point in the reference image; and determine a feature difference value between the second feature information and the first feature information according to the at least one first intermediate feature difference value.
In a possible implementation manner, the feature difference value obtaining module is configured to: respectively take at least one pixel point in the reference image as a second pixel point, and take, as first pixel points, at least one pixel point in the target image whose positional distance from the second pixel point is within the preset range; obtain at least one second intermediate feature difference value according to the feature information corresponding to the second pixel point in the second feature information and the feature information corresponding to the at least one first pixel point in the first feature information; determine a feature difference value of the second pixel point according to the at least one second intermediate feature difference value; and determine a feature difference value between the second feature information and the first feature information according to the feature difference value of the at least one second pixel point.
In a possible implementation manner, the first feature information includes first feature information in at least one scale, and the second feature information includes second feature information in at least one scale; the characteristic difference value obtaining module is used for: and respectively determining a feature difference value between the second feature information under the at least one scale and the first feature information with the same scale according to the second pixel point in the reference image and the plurality of first pixel points in the target image to obtain the feature difference value under the at least one scale.
In one possible implementation, the feature extraction module is configured to: inputting the target image into a feature extraction network, and acquiring first feature information of the target image under at least one scale according to the output of at least one network layer in the feature extraction network; and inputting the reference image into a feature extraction network, and acquiring second feature information of the reference image under at least one scale according to the output of at least one network layer in the feature extraction network.
In one possible implementation manner, the similarity determination module is configured to: and inputting the characteristic difference value into a regression network, and determining the similarity between the target image and the reference image according to the output of the regression network.
In one possible implementation, the feature difference value includes a feature difference value at least one scale; the similarity determination module is to: respectively inputting the feature difference values under the at least one scale into a regression network to obtain an output result of the at least one regression network; and fusing the output results of the at least one regression network, and determining the similarity between the target image and the reference image according to the fusion result.
In one possible implementation, the target image is obtained by image preprocessing an initial target image, and the reference image is obtained by image preprocessing an initial reference image; the image preprocessing comprises: regularization processing and/or format conversion.
According to an aspect of the present disclosure, there is provided an electronic device including:
a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the image processing method described above.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described image processing method.
In the embodiment of the disclosure, the first feature information of the target image and the second feature information of the reference image are respectively acquired, and a feature difference value between the second feature information and the first feature information is obtained according to a second pixel point in the reference image and a plurality of first pixel points in the target image whose spatial distance from the second pixel point does not exceed a preset range; the similarity between the target image and the reference image is then determined from this feature difference value, and the image quality of the target image is determined from that similarity. Because the spatial distance between the second pixel point and each of the plurality of first pixel points does not exceed the preset range, a relatively true feature difference value can be obtained from their spatial correspondence even when the target image is spatially distorted, so the similarity determined from the feature difference value is more accurate, the quality of the target image determined from that similarity is more accurate and objective, and the accuracy of the quality judgment of the target image is effectively improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure.
Fig. 2 illustrates a flow diagram of an image processing method according to an embodiment of the present disclosure.
Fig. 3 illustrates a schematic diagram of obtaining a feature difference value between second feature information and first feature information according to an embodiment of the present disclosure.
Fig. 4 shows a schematic diagram of an application example according to the present disclosure.
Fig. 5 illustrates a block diagram of an image processing apparatus according to an embodiment of the present disclosure.
Fig. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Fig. 7 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the group consisting of A, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure, which may be applied to an image processing apparatus, which may be a terminal device, a server, or other processing device. The terminal device may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like.
In some possible implementations, the image processing method may be implemented by a processor calling computer readable instructions stored in a memory.
As shown in fig. 1, the image processing method may include:
step S11, respectively obtaining first characteristic information of a target image and second characteristic information of a reference image, where the target image and the reference image correspond to the same scene.
Step S12, obtaining a feature difference between the second feature information and the first feature information according to a plurality of first pixel points in the target image and second pixel points in the reference image, where a distance between the plurality of first pixel points and the second pixel points in the space does not exceed a preset range.
Step S13, determining the similarity between the target image and the reference image according to the feature difference value.
Step S14, determining the image quality of the target image according to the similarity between the target image and the reference image.
The target image may be any image whose quality needs to be determined; its concrete form is not limited in the embodiments of the present disclosure and may be selected flexibly according to the actual situation. The manner of acquiring the target image is likewise not limited, and is not restricted to the following embodiments. In one possible implementation, the target image may be an image obtained by some image processing or generation method, and determining its image quality further evaluates the effect of that method. In one possible implementation, the target image may be an image generated by a neural network for some image processing task, and by determining the image quality of the target image, the quality of the neural network can be judged.
The reference image may be an image used as a reference for determining the image quality of the target image; its concrete form is likewise not limited in the embodiments of the present disclosure and may be selected flexibly according to the actual situation. In one possible implementation, the target image and the reference image may correspond to the same scene, so that the image quality of the target image can be determined from the similarity between the two; for example, the reference image may be an original image that has the same image content as the target image, has higher image quality, and has not undergone any distortion processing. The manner of acquiring the reference image is not limited either and may be chosen flexibly; in one possible implementation, the reference image may be an image obtained from one or more image databases, in which case the target image may be an image obtained by applying some image distortion processing to the reference image, an image generated by some image generation or reconstruction method that has the same content as the reference image, or the like.
Based on the target image and the reference image proposed in the above-described embodiments, the first characteristic information of the target image and the second characteristic information of the reference image can be acquired through the above-described step S11, respectively.
The first feature information may be feature information extracted from a target image, and the representation form of the first feature information is not limited in the embodiment of the present disclosure, and in a possible implementation manner, the first feature information may be represented in a data form such as a feature vector set; in a possible implementation manner, the first feature information may be represented by, for example, a form of a feature map, and how to select may be flexibly determined according to an actual situation.
The acquisition mode of the first feature information may be flexibly determined according to actual conditions, and any mode that can perform feature extraction on the target image may be used as the acquisition mode of the first feature information, which is described in detail in the following disclosure embodiments and is not expanded herein. The condition of the feature information included in the first feature information is not limited in the embodiment of the present disclosure, and may be flexibly determined according to an obtaining manner of the first feature information, in a possible implementation manner, the first feature information may include feature information of multiple scales, and in a possible implementation manner, the first feature information may also include only feature information of a certain scale, and the like.
The second feature information may be feature information extracted from a reference image, and the representation form of the second feature information is not limited in the embodiment of the present disclosure, and the first feature information may be referred to, which is not described herein again.
The acquisition mode of the second feature information can also be flexibly determined according to the actual situation, and any mode capable of extracting the features of the reference image can be used as the acquisition mode of the second feature information, which is described in detail in the following disclosure embodiments and is not expanded first. The condition of the feature information included in the second feature information is not limited in this disclosure, and may be flexibly determined according to an obtaining manner of the second feature information, in a possible implementation manner, the second feature information may include feature information of multiple scales, and in a possible implementation manner, the second feature information may also include only feature information of a certain scale, and the like.
It should be noted that, in the embodiment of the present disclosure, "first" and "second" in the first feature information and the second feature information are only used for distinguishing feature information of different images, and no limitation is imposed on the manner or order of acquiring the feature information. The acquisition modes, the representation forms, the implementation modes and the like of the first characteristic information and the second characteristic information can be the same or different, and the first characteristic information and the second characteristic information can be acquired simultaneously or sequentially in a certain sequence and can be flexibly determined according to actual conditions. The numbering of other pairs of feature information appearing subsequently is the same, and is not repeated.
After the first feature information and the second feature information are obtained, in step S12, a feature difference between the second feature information and the first feature information is obtained according to the second pixel point in the reference image and the plurality of first pixel points in the target image.
The second pixel point may be any pixel point in the reference image; which points of the reference image are selected as second pixel points is not limited in the embodiments of the present disclosure and is not restricted to the following examples. In a possible implementation manner, a certain pixel point in the reference image may be selected as the second pixel point; in a possible implementation manner, a plurality of pixel points in the reference image may each be taken as a second pixel point; in a possible implementation manner, every pixel point in the reference image may be taken as a second pixel point.
The first pixel points may be pixel points in the target image, and it can be seen from the above-described disclosed embodiment that, in a possible implementation manner, the number of the first pixel points may be multiple, and the specific number of the multiple first pixel points may be flexibly determined according to an actual situation, which is not limited in the embodiment of the present disclosure.
It can also be seen from the above disclosure that the distance between the first pixel point and the second pixel point in the space does not exceed the preset range, wherein the distance between the first pixel point and the second pixel point in the space can be flexibly determined according to the actual situation. In a possible implementation manner, the positions of the first pixel point and the second pixel point in the space can be respectively determined according to the position of the first pixel point in the target image and the position of the second pixel point in the reference image, and then the distance between the first pixel point and the second pixel point in the space is determined according to the difference between the positions of the first pixel point and the second pixel point in the space; in a possible implementation manner, as described in the above-mentioned disclosure, since the reference image and the target image may correspond to the same scene, the reference image and the target image may have the same corresponding relationship with a space in the scene, and in this case, the distance between the first pixel point and the second pixel point in the space and the like may be determined directly according to the difference between the position of the first pixel point in the target image and the position of the second pixel point in the reference image.
The preset range can be set flexibly according to actual conditions and is not limited in the embodiments of the present disclosure or in the following examples. In a possible implementation manner, the preset range may be the set of positions in the target image whose distance from the position of the second pixel point in the reference image is not greater than d; in a possible implementation manner, it may instead be the set of positions whose x-axis distance and y-axis distance from the position of the second pixel point are each not greater than d. The value of d may be chosen flexibly according to the actual situation; in one example, d may be 3 pixels.
As described in the above-mentioned embodiment, the spatial distance between each first pixel point and the second pixel point does not exceed the preset range, so the first pixel points may be at least some of the pixel points within the preset range around the second pixel point. For example, in one possible implementation manner, the plurality of first pixel points may be all pixel points in the target image whose spatial distance from the second pixel point does not exceed the preset range; in another possible implementation manner, they may be a subset of those pixel points, selected in a manner that is not limited in the embodiments of the present disclosure: in one example, the subset may be selected randomly, and in another example, a certain number of pixel points may be selected in certain directions according to a preset selection rule. It should be noted that, since the preset range can be determined flexibly according to the actual situation, in some possible implementation manners the plurality of first pixel points may include the pixel point whose spatial distance from the second pixel point is 0, that is, a first pixel point whose position in the target image is the same as the position of the second pixel point in the reference image. A small sketch of one such selection scheme is given below.
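The following snippet illustrates, purely as an assumption about one concrete reading of the preset range, the axis-wise variant with d = 3: for a second pixel point at (x, y), it enumerates every in-bounds target-image coordinate whose x- and y-offsets are both at most d. The helper name is hypothetical.

```python
# Hypothetical helper (not from the patent): enumerate candidate first pixel
# points for a second pixel point at (x, y), using the axis-wise criterion
# "x-axis distance and y-axis distance both not greater than d".
def candidate_first_pixels(x: int, y: int, width: int, height: int, d: int = 3):
    """Yield target-image coordinates within the preset range of (x, y)."""
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:  # stay inside the image
                yield nx, ny

# Example: the 49 candidates around (10, 10) in a 64 x 64 image.
candidates = list(candidate_first_pixels(10, 10, width=64, height=64))
```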
In some possible implementations, since the target image and the reference image correspond to the same scene, pixel points located at the same position in the two images should theoretically express the same content; for example, the pixel point at position A in the reference image corresponds to the pixel point at position A in the target image. However, when the target image is spatially distorted, for example when it is an image generated by a GAN, pixel points in the target image may be positionally offset; in that case the pixel point at position A in the reference image may actually correspond to the pixel point at position B in the target image. Directly computing a feature difference value from the pixel point at position A in the reference image and the pixel point at position A in the target image would then fail to reflect the true degree of similarity between the two images. Therefore, in a possible implementation manner, the feature difference value between the second feature information and the first feature information may be obtained according to the second pixel point in the reference image and a plurality of first pixel points in the target image. How step S12 is implemented to obtain this feature difference value can be determined flexibly according to the actual situation and is described in the following disclosed embodiments.
After the feature difference between the second feature information and the first feature information is obtained, the similarity between the target image and the reference image may be determined according to the feature difference. The similarity may reflect the degree of similarity between the target image and the reference image, and the expression form of the similarity may be flexibly determined according to the actual situation, and is not limited to the following embodiments, for example, the similarity may be expressed in the form of a similarity score or a similarity percentage, etc. The specific process of determining the similarity between the target image and the reference image according to the feature difference can be flexibly determined according to the actual situation, for example, regression calculation or other manners can be performed based on the feature difference, for details, see the following disclosed embodiments, and will not be expanded herein.
The image quality may be an evaluation of how good the target image is; the factors affecting it may include the similarity to the reference image as well as additional factors such as image sharpness. After the similarity between the target image and the reference image is determined, how the image quality of the target image is determined from that similarity can be decided flexibly according to the actual situation of the reference image, the criterion used for image quality, and so on, as described in the following disclosed embodiments.
In the embodiment of the disclosure, the first characteristic information of the target image and the second characteristic information of the reference image are respectively obtained, and according to the second pixel point in the reference image and a plurality of first pixel points in the target image, the distance between which and the second pixel point in the space does not exceed a preset range, the characteristic difference between the second characteristic information and the first characteristic information is obtained, so that the similarity between the target image and the reference image is determined according to the characteristic difference, and the image quality of the target image is determined according to the similarity. Through the process, because the distance between the second pixel point and the plurality of first pixel points in the space does not exceed the preset range, under the condition that the target image has distortion in the space, the corresponding relation between the second pixel point and the plurality of first pixel points in the space can be utilized to obtain a relatively real characteristic difference value, so that the similarity between the target image determined based on the characteristic difference value and the reference image is more accurate, the quality of the target image determined based on the similarity is more accurate and objective, and the accuracy of judging the quality of the target image is effectively improved.
As described in the above embodiments, the manner of acquiring the feature information in step S11 can be determined flexibly according to the actual situation. In some possible implementation manners, the first feature information of the target image and the second feature information of the reference image may be extracted by related feature extraction algorithms, where the algorithm used for the target image may be the same as or different from the algorithm used for the reference image.
Fig. 2 shows a flowchart of an image processing method according to an embodiment of the present disclosure. As shown in the figure, in one possible implementation, step S11 may include:
step S111, inputting the target image into a feature extraction network, and acquiring first feature information of the target image under at least one scale according to the output of at least one network layer in the feature extraction network;
step S112, inputting the reference image into the feature extraction network, and acquiring second feature information of the reference image under at least one scale according to the output of at least one network layer in the feature extraction network.
The feature extraction network may be a neural network for extracting features of an image, and its implementation form may be selected flexibly according to the actual situation, not limited to the following examples. In a possible implementation manner, the feature extraction network may be an AlexNet or a VGG convolutional neural network; in one possible implementation, it may also be a modification of a related feature extraction network, such as one in which the max pooling layers are replaced with L2-norm pooling layers.
The specific implementation of step S111 can be determined flexibly according to the actual situation. In some possible implementation manners, when the first feature information includes feature information at only one scale, it may be obtained from the output features of an intermediate network layer or of the last network layer of the feature extraction network; in some possible implementations, when the first feature information includes feature information at multiple scales, it may be obtained from the output features of multiple network layers. For example, when the feature extraction network is an AlexNet convolutional neural network, first feature information at five scales may be obtained from the outputs of the 2nd, 5th, 7th, 9th, and 12th layers; when it is a VGG convolutional neural network, first feature information at five scales may be obtained from the output features of the 4th, 9th, 16th, 23rd, and 30th layers.
The implementation of step S112 may refer to the implementation of step S111, which is not described herein again. In some possible implementations, the feature extraction network for extracting the first feature information may be the same as or different from the feature extraction network for extracting the second feature information, and the implementation manner of step S112 may be the same as or different from that of step S111, and both may be determined flexibly according to actual situations. The implementation order of step S111 and step S112 can also be flexibly determined according to the actual situation, in one possible implementation manner, step S111 and step S112 can be implemented simultaneously, and in one possible implementation manner, step S111 and step S112 can also be executed sequentially according to a certain order.
The target image is input into a feature extraction network, and first feature information of the target image at at least one scale is acquired from the output of at least one network layer; the reference image is input into the feature extraction network in the same way, and second feature information of the reference image at at least one scale is acquired from the output of at least one network layer. Through this process, on the one hand, the first feature information and the second feature information can be obtained accurately and efficiently by using the feature extraction network; on the other hand, feature information at more than one scale can be obtained, so that the extracted features can contain low-frequency features in the image (such as the picture background) as well as medium- and high-frequency features (such as texture details and contours).
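A minimal sketch of this multi-scale extraction follows, assuming the VGG layer indices (4, 9, 16, 23, 30) from the example above and using torchvision's VGG16 with its standard max pooling (the L2-norm pooling variant mentioned earlier is not shown); none of these concrete choices are mandated by the present disclosure.

```python
# Sketch: tap a VGG16 backbone at several layers to obtain multi-scale
# feature information for an image tensor of shape (N, 3, H, W).
import torch
from torchvision.models import vgg16, VGG16_Weights

TAP_LAYERS = {4, 9, 16, 23, 30}  # example layer indices from the text above

def extract_multiscale_features(image: torch.Tensor) -> list[torch.Tensor]:
    backbone = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features.eval()
    feats, x = [], image
    with torch.no_grad():
        for idx, layer in enumerate(backbone):
            x = layer(x)
            if idx in TAP_LAYERS:
                feats.append(x)  # one feature map per tapped scale
    return feats

# Example: five feature maps at decreasing spatial resolutions.
# scales = extract_multiscale_features(torch.randn(1, 3, 224, 224))
```

The same function can be applied to the target image and the reference image to obtain the first and second feature information respectively.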
As described in the foregoing disclosure embodiments, in one possible implementation manner, the first feature information of the target image and the second feature information of the reference image may be obtained through a feature extraction network. In some possible implementations, in order to meet the format requirement or processing requirement of the image by the feature extraction network or other feature extraction methods, a certain image preprocessing may be required on the target image or the reference image. Thus, in one possible implementation, the target image may be obtained by image preprocessing an initial target image, and the reference image may be obtained by image preprocessing an initial reference image; in one possible implementation, the image preprocessing may include: regularization processing and/or format conversion.
The initial target image and the initial reference image may be corresponding unprocessed and directly obtained images, and the obtaining manner may refer to the obtaining manner of the target image and the reference image in each of the above disclosed embodiments, and details are not repeated herein. After the initial target image and the initial reference image are obtained, corresponding image preprocessing can be performed on the initial target image and the initial reference image according to the requirement of feature extraction, wherein the processing mode included in the image preprocessing can be flexibly determined according to the actual situation, and is not limited to the following disclosure embodiments.
In one possible implementation, the image preprocessing may include a regularization process, whose parameters may be set flexibly according to the actual conditions of the image. In one example, for a three-channel RGB image, the mean of the regularization process may be set to [0.485, 0.456, 0.406] and its standard deviation to [0.229, 0.224, 0.225].
In a possible implementation manner, the image preprocessing may also include format conversion; both the target format and the conversion manner may be determined flexibly according to the actual requirements of feature extraction. In one example, the image may be converted into a tensor format through the machine learning library PyTorch.
In a possible implementation manner, the image preprocessing may also include format conversion and regularization processing, where the order of implementing the processing manners such as format conversion and regularization processing may be flexibly set according to actual situations, and is not limited to the following disclosure embodiments. In one example, the image may be first formatted and then regularized.
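Putting the two steps together in the order just described, a minimal sketch using torchvision follows; the file names are placeholders, and the mean/std values are the illustrative ones from the example above (here the regularization amounts to per-channel mean/std normalization).

```python
# Sketch: format conversion first (PIL image -> tensor), then regularization
# (per-channel normalization with the example mean and standard deviation).
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.ToTensor(),  # converts to a float tensor in [0, 1], shape (3, H, W)
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Placeholder file names: apply the same preprocessing to both images.
target = preprocess(Image.open("target.png").convert("RGB"))
reference = preprocess(Image.open("reference.png").convert("RGB"))
```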
In some possible implementations, the image preprocessing can be extended to other processing manners, such as one or more of changing resolution, image cropping, image downsampling, or image interpolation, and how to select can be flexibly determined according to actual situations. In one possible implementation, the image pre-processing performed on the initial target image and the initial reference image may be the same or different.
The target image and the reference image are obtained by carrying out image preprocessing on the initial target image and the initial reference image in various forms, so that the obtained target image and the obtained reference image can better meet the requirement of subsequent feature extraction, the efficiency of the whole image processing process is improved on one hand, the precision of the features obtained by the subsequent feature extraction can also be improved on the other hand, and the accuracy and the objectivity of the finally determined target image quality are improved.
As described in the above embodiments, the implementation manner of step S12 can be flexibly determined according to practical situations. In one possible implementation, step S12 may include:
moving the first characteristic information to at least one direction within a preset range to obtain at least one third characteristic information, wherein the third characteristic information comprises a plurality of first pixel points in the target image;
acquiring a first intermediate characteristic difference value between the second characteristic information and at least one third characteristic information according to a second pixel point in the reference image;
and determining a feature difference value between the second feature information and the first feature information according to the at least one first intermediate feature difference value.
In this case, the pixel points contained in the at least one piece of third feature information obtained by moving the first feature information may be exactly the plurality of first pixel points introduced in the above-mentioned disclosed embodiments, so that the first intermediate feature difference value obtained between the second feature information and the third feature information can better reflect the degree of similarity between the target image and the reference image.
The number of the third feature information is not limited in the embodiments of the present disclosure, and may be flexibly determined according to factors such as the moving direction and the moving distance of the first feature information, and is not limited to the following embodiments of the present disclosure. As described in the foregoing disclosure, the first feature information may move in at least one direction, and accordingly, in one possible implementation manner, in each moving direction of the first feature information, a third feature information may be obtained; in one possible implementation manner, in each moving direction of the first feature information, a plurality of pieces of third feature information may be obtained according to different moving distances.
The moving direction of the first feature information may be set flexibly according to the actual situation and is not limited to the following embodiments. In a possible implementation manner, the first feature information may be moved in only one preset direction; in a possible implementation manner, it may be moved in a plurality of preset directions. The moving distance may likewise be set flexibly and is not limited in the embodiment of the present disclosure; in a possible implementation manner, it may be any distance within the preset range, and the distances in different directions may be the same or different. In one example, the first feature information may be moved by the same distance in eight directions, namely up, down, left, right, upper left, upper right, lower left, and lower right, to obtain eight pieces of third feature information.
After the at least one piece of third feature information is obtained, a first intermediate feature difference value between the second feature information and the at least one piece of third feature information may be calculated; the calculation manner is not limited in the embodiments of the present disclosure or in the following examples. In a possible implementation manner, based on the second pixel point in the reference image, the feature values or feature vectors corresponding to pixel points at the same position in the second feature information and the third feature information may be directly subtracted to obtain a first intermediate feature difference value; in a possible implementation manner, the Manhattan distance between the feature values or feature vectors corresponding to pixel points at the same position in the second feature information and the third feature information may be calculated and used as the first intermediate feature difference value.
At least one first intermediate feature difference value can be obtained by calculating a feature difference value between the second feature information and the at least one third feature information. After obtaining the at least one first intermediate feature difference value, a feature difference value between the second feature information and the first feature information may be determined based on the at least one first intermediate feature difference value.
The manner of determining the feature difference between the second feature information and the first feature information according to the first intermediate feature difference may be flexibly determined according to actual situations, and is not limited to the following disclosed embodiments. In a possible implementation manner, in the case that only one first intermediate feature difference value is obtained, the first intermediate feature difference value may be directly used as a feature difference value between the second feature information and the first feature information; in a possible implementation manner, in the case of obtaining a plurality of first intermediate feature difference values, data of corresponding positions in the plurality of first intermediate feature difference values may be compared, and at each position, data having a smallest numerical value in the plurality of first intermediate feature difference values is used as a feature difference value of the position, so as to obtain a feature difference value between the second feature information and the first feature information; in a possible implementation manner, when a plurality of first intermediate feature difference values are obtained, the data of the first intermediate feature difference values may be compared as a whole, and the first intermediate feature difference value with the smallest overall feature difference may be used as the feature difference value between the second feature information and the first feature information.
Through the above process, a shifted (dislocation) subtraction between the second feature information and the first feature information can be realized by using the third feature information obtained after moving the first feature information, reducing the influence of factors such as spatial distortion of the target image on the accuracy of the feature difference calculation. The resulting feature difference between the second feature information and the first feature information is therefore more accurate, so that the similarity between the target image and the reference image determined from it is more accurate, and the image quality of the target image is determined more accurately and objectively.
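As an illustration of this shifted-subtraction process, consider the following sketch (hypothetical PyTorch code, not part of the disclosure; the function name, the zero padding at the borders, and the choice of the Manhattan distance as the first intermediate feature difference are assumptions made for illustration). It moves the first feature information within the preset range, computes a first intermediate feature difference for each move, and keeps the per-position minimum:

    import torch
    import torch.nn.functional as F

    def shifted_feature_difference(f_a, f_r, d=1):
        # f_a: first feature information of the target image, shape (C, H, W)
        # f_r: second feature information of the reference image, shape (C, H, W)
        _, H, W = f_a.shape
        padded = F.pad(f_a, (d, d, d, d))  # zero padding at borders (assumption)
        candidates = []
        for dy in range(-d, d + 1):
            for dx in range(-d, d + 1):
                # Third feature information: the first feature map moved by (dx, dy).
                shifted = padded[:, d + dy:d + dy + H, d + dx:d + dx + W]
                # First intermediate feature difference: per-position Manhattan
                # distance between the second and the moved first feature map.
                candidates.append((f_r - shifted).abs().sum(dim=0))
        # At each position, keep the smallest intermediate difference value.
        return torch.stack(candidates, dim=0).min(dim=0).values  # shape (H, W)

With d = 1 this produces nine maps (the unmoved original plus eight moves), matching the eight-direction example described above.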
In one possible implementation, step S12 may include:
respectively taking at least one pixel point in the reference image as a second pixel point, and taking at least one pixel point in the target image whose position distance from the second pixel point is within a preset range as a first pixel point;
obtaining at least one second intermediate characteristic difference value according to the corresponding characteristic information of the second pixel point in the second characteristic information and the corresponding characteristic information of the at least one first pixel point in the first characteristic information;
determining a characteristic difference value of a second pixel point according to at least one second intermediate characteristic difference value;
and determining a characteristic difference value between the second characteristic information and the first characteristic information according to the characteristic difference value of at least one second pixel point.
The implementation forms of the first pixel point and the second pixel point may refer to the above disclosed embodiments, and are not described herein again.
In the above-mentioned embodiment, it has been proposed that the first pixel point may be a plurality of pixel points in the target image whose distance from the second pixel point in the space does not exceed a preset range, and therefore, the second intermediate characteristic difference value may also be flexibly changed according to different implementation forms of the first pixel point. In a possible implementation manner, feature information (such as a feature value or a feature vector, etc.) corresponding to the second pixel point in the second feature information may be respectively subjected to feature difference calculation with feature information (such as a feature value or a feature vector, etc.) corresponding to the at least one first pixel point in the first feature information, so as to respectively obtain at least one second intermediate feature difference.
The calculation method of the second intermediate feature difference is not limited in the embodiments of the present disclosure, and reference may be made to the calculation method of the first intermediate feature difference in each of the embodiments, for example, directly calculate a difference between feature values or feature vectors corresponding to a pixel point, or calculate a manhattan distance between feature values or feature vectors corresponding to a pixel point, or calculate a Mean Square Error (MSE) between feature values or feature vectors corresponding to a pixel point. It should be noted that "first" and "second" of the first intermediate feature difference value and the second intermediate feature difference value are only used to distinguish feature difference values calculated based on different objects, and the calculation manner, the calculation order, and the like of the feature difference values are not limited.
After obtaining the at least one second intermediate feature difference, the feature difference of the second pixel point may be further determined from the at least one second intermediate feature difference, and how to determine the feature difference may be flexibly determined according to the actual situation of the obtained second intermediate feature difference. In a possible implementation manner, under the condition that a plurality of second intermediate feature difference values corresponding to the second pixel point are obtained by respectively calculating the second pixel point and a plurality of first pixel points, the second intermediate feature difference value with the minimum numerical value can be used as the feature difference value of the second pixel point; in a possible implementation manner, under the condition that the second pixel point and the plurality of first pixel points respectively calculate to obtain a plurality of second intermediate feature difference values corresponding to the second pixel point, the second intermediate feature difference value having the smallest absolute value may also be used as the feature difference value of the second pixel point.
After determining the feature difference of the second pixel, the feature difference between the second feature information and the first feature information may be determined according to the feature difference of at least one second pixel. In a possible implementation manner, since each pixel point in the second feature information can be traversed and respectively used as a second pixel point, in this case, the feature difference value of each second pixel point is counted, and the feature difference value between the second feature information and the first feature information can be obtained.
FIG. 3 is a diagram illustrating obtaining a feature difference between the second feature information and the first feature information according to an embodiment of the disclosure. As shown in the figure, the second feature information of the reference image can be represented as f_R, the first feature information of the target image can be represented as f_A, the preset range of the corresponding pixel points in the first feature information can be d, and the feature difference between the second feature information and the first feature information can be represented as SWD_d(f_R, f_A).
In one example, the calculation of SWD_d(f_R, f_A) can be expressed by the following formula (1):

    SWD_d(f_R, f_A)[x, y] = f_R[x, y] - f_A[x', y']    (1)

where f_R[x, y] represents the feature vector in the second feature information corresponding to the second pixel point at position (x, y) in the reference image, and f_A[x', y'] represents the feature vector in the first feature information corresponding to the first pixel point at position (x', y') in the target image. Here, the distance between (x', y') and (x, y) is not more than d, and the Euclidean distance between f_A[x', y'] and f_R[x, y] is not greater than the Euclidean distance between f_R[x, y] and the feature vector of any other pixel point within the preset range d of (x, y) in the target image. In one example, the condition satisfied by f_A[x', y'] can be expressed by the following formula (2):

    (x', y') = argmin_{||(x'', y'') - (x, y)|| ≤ d} ||f_A[x'', y''] - f_R[x, y]||_2    (2)
the characteristic difference between the second characteristic information and the first characteristic information is determined according to the characteristic information corresponding to the second pixel point in the second characteristic information and the characteristic information corresponding to the at least one first pixel point in the first characteristic information, the characteristic difference between the second characteristic information and the first characteristic information is determined according to the characteristic difference between the at least one second pixel point, through the process, the characteristic information between the second pixel point of the reference image and the plurality of first pixel points of the target image can be subjected to traversal subtraction within a certain range, the pixel points actually corresponding to the reference image in each pixel point in the target image in the spatial distortion are searched, and the similarity between the target image and the reference image can be more accurately judged based on the characteristic difference determined by the actually corresponding pixel points, thereby more objectively and accurately evaluating the quality of the target image.
As described in the foregoing embodiments, in a possible implementation manner, the first feature information may include first feature information of multiple scales, and the second feature information may also include second feature information of multiple scales. Therefore, in one possible implementation, step S12 may include: and respectively determining the feature difference between the second feature information under at least one scale and the first feature information with the same scale according to the second pixel points in the reference image and the plurality of first pixel points in the target image to obtain the feature difference under at least one scale.
In a possible implementation manner, since the first feature information may include first feature information at a plurality of scales and the second feature information may likewise include second feature information at a plurality of scales, it can be seen from the foregoing disclosure that the feature difference between the second feature information and the first feature information at each scale can be calculated respectively, so as to obtain the feature differences between the second feature information and the first feature information at the plurality of scales.
The calculation method of the feature difference value in each scale may refer to the method proposed in each disclosed embodiment, and is not described herein again. It should be noted that the various implementations of S12 in the embodiment of the present disclosure are only exemplary implementations of S12, and the implementation process of S12 is not limited to the above-described embodiments. In some possible implementations, the implementations of S12 in the above disclosed embodiments may be implemented in combination with each other, and how to combine the implementations may be flexibly determined according to practical situations, and the implementations are not limited in the embodiments of the present disclosure.
By respectively determining, according to the second pixel point in the reference image and the plurality of first pixel points in the target image, the feature difference between the second feature information at at least one scale and the first feature information of the same scale, feature differences at at least one scale are obtained. Obtaining feature differences at a plurality of scales through this process improves the accuracy of the subsequently determined similarity, as well as the objectivity and accuracy of the determined image quality of the target image.
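Assuming feature maps have already been extracted at several scales, the per-scale feature differences can be gathered by reusing the hypothetical swd function sketched above:

    def multi_scale_swd(fs_r, fs_a, d=1):
        # fs_r, fs_a: lists of second/first feature maps, one tensor of shape
        # (C_s, H_s, W_s) per scale, from the reference and target image.
        return [swd(f_r_s, f_a_s, d) for f_r_s, f_a_s in zip(fs_r, fs_a)]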
In one possible implementation, after obtaining the feature difference between the second feature information and the first feature information, the similarity between the target image and the reference image may be determined through step S13. As described in the foregoing embodiments, the implementation manner of step S13 can be flexibly determined according to practical situations, and in one possible implementation manner, step S13 may include:
and inputting the characteristic difference value into a regression network, and determining the similarity between the target image and the reference image according to the output of the regression network.
The regression network may be a neural network formed by one or more network layers, and the regression network may perform processing and calculation according to the input feature difference, thereby outputting the similarity between the target image and the reference image.
As described in the above-mentioned embodiments, the expression form of the similarity between the target image and the reference image can be flexibly determined according to the actual situation, and is not limited to the following embodiments. In a possible implementation manner, the similarity between the target image and the reference image may be represented by a score, where the score interval, the score values, and the like may be flexibly set according to the actual situation and are not limited in the embodiment of the present disclosure. In a possible implementation manner, the similarity between the target image and the reference image may also be expressed as a percentage. The calculation mode and the implementation form of the regression network can change flexibly with the expression form of the similarity between the target image and the reference image. In one example, the regression network may include two convolution layers and one average pooling layer connected in sequence to sequentially calculate and process the input feature difference values, thereby outputting the similarity between the target image and the reference image.
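For that example, a regression network of the stated shape could be sketched as follows (hypothetical PyTorch code; the channel width and kernel sizes are assumptions, since the example fixes only the layer types and their order):

    import torch.nn as nn

    class RegressionNetwork(nn.Module):
        def __init__(self, in_channels):
            super().__init__()
            # Two convolution layers and one average pooling layer in sequence.
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
                nn.Conv2d(32, 1, kernel_size=3, padding=1),
                nn.AdaptiveAvgPool2d(1),
            )

        def forward(self, diff):
            # diff: feature difference value of shape (N, C, H, W);
            # returns one similarity score per image, shape (N, 1).
            return self.net(diff).flatten(1)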
Through the above process, the regression network can process and calculate the feature difference efficiently and accurately, thereby obtaining a more accurate similarity and improving the efficiency, accuracy, and ease of implementation in determining the quality of the target image.
As described in the foregoing embodiments, the feature difference may include a feature difference in at least one scale, and therefore in a possible implementation, the step S13 may also include:
respectively inputting the characteristic difference values under at least one scale into a regression network to obtain an output result of at least one regression network;
and fusing the output results of at least one regression network, and determining the similarity between the target image and the reference image according to the fusion result.
It can be seen from the foregoing disclosure that, in a possible implementation manner, the feature difference values under at least one scale may be respectively input into the regression networks to obtain the output result of at least one regression network.
The implementation form of the regression network may refer to the above-described embodiments, and details are not repeated here. In a possible implementation manner, when the feature difference includes feature differences in multiple scales, multiple feature differences may be respectively input to the same regression network to respectively obtain corresponding output results, where an input sequence may be flexibly set according to an actual situation, and is not limited in the embodiment of the present disclosure; in a possible implementation manner, in a case that the feature difference includes feature differences under multiple scales, the multiple feature differences may also be input into different regression networks respectively to obtain corresponding output results respectively. The implementation manners of the different regression networks may be the same or different, and are not limited in the embodiments of the present disclosure. In one example, the feature difference values at multiple scales can be input into multiple regression networks respectively, and the multiple regression networks have the same network layer structure except that different regression networks have different requirements on the format of the input data.
After the output result of the at least one regression network is obtained, the output results of the at least one regression network may be fused, and the similarity between the target image and the reference image may be determined according to the fusion result. The fusion mode can be flexibly selected according to actual conditions, and is not limited to the following disclosure embodiments. In a possible implementation manner, the output results of at least one regression network may be directly added, and the addition result is used as a fusion result; in a possible implementation manner, an average value of output results of at least one regression network may also be used as a fusion result; in a possible implementation manner, the output result of at least one regression network may also be weighted and averaged to obtain a fusion result, where the weight of the output result of each regression network may be flexibly set according to the actual situation, and is not limited in the embodiment of the present disclosure.
The obtained fusion result may be directly used as the similarity between the target image and the reference image, or may be converted into the similarity between the target image and the reference image after certain processing, such as normalization processing, and how to select may be flexibly set according to the actual situation, which is not limited in the embodiment of the present disclosure.
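The fusion options above admit a simple sketch (illustrative code; the per-scale weights and any subsequent normalization are assumptions):

    import torch

    def fuse(outputs, weights=None):
        # outputs: list of regression-network output results, one per scale,
        # each a tensor of shape (N, 1).
        stacked = torch.stack(outputs, dim=0)        # shape (S, N, 1)
        if weights is None:
            return stacked.sum(dim=0)                # direct addition
        w = torch.tensor(weights).view(-1, 1, 1)     # per-scale weights
        return (w * stacked).sum(dim=0) / w.sum()    # weighted average

Averaging corresponds to stacked.mean(dim=0); as noted above, the fusion result may then be used as the similarity directly or after processing such as normalization.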
By respectively inputting the feature differences at at least one scale into regression networks to obtain the output results of at least one regression network, fusing these output results, and determining the similarity between the target image and the reference image according to the fusion result, feature information at multiple scales can jointly contribute to the determined similarity, which further improves its accuracy.
After determining the similarity of the target image and the reference image, the image quality of the target image may be determined through step S14. The implementation manner of step S14 can be flexibly determined according to actual situations, and is not limited to the following disclosure embodiments. In one possible implementation, the obtained similarity may be directly used as the image quality of the target image; in some possible implementation manners, the quality score of the reference image itself may be combined with the obtained similarity to serve as the image quality of the target image; in some possible implementations, the obtained similarity may be used as a criterion for evaluating the image quality of the target image, and combined with other criteria for evaluating the image quality of the target image, such as sharpness, completeness, or resolution, to perform summation, average calculation, weighted average calculation, or the like to determine the image quality of the target image.
By determining the image quality of the target image according to the similarity between the target image and the reference image, the reference image can be used to more accurately and objectively evaluate and judge the image quality of the target image.
In some possible implementation manners, the image processing method provided in the embodiment of the present disclosure may also be implemented by a target neural network, where an implementation form of the target neural network may be flexibly determined according to an actual situation. In one possible implementation, the target neural network may include a feature extraction network, a feature difference calculation network, a regression network, and the like, which are connected in sequence. The implementation forms of the feature extraction network and the regression network may refer to the above-mentioned embodiments, and are not described herein again. The feature difference calculation network may be implemented by one or more network layers, and the structure thereof is not limited in the embodiment of the present disclosure. The processing method of the feature difference calculation network may refer to the above disclosed embodiments, and is not described herein again.
In one possible implementation, the target neural network may be trained by training data, wherein the training data may include a target image containing a quality score annotation obtained from a related image dataset and a reference image. The method for acquiring the quality score of the target image is not limited in the embodiments of the present disclosure, and is not limited to the following embodiments. In one possible implementation, the quality scoring annotations may be obtained by manual scoring; in a possible implementation manner, the quality scoring label of the target image can also be obtained by a related image quality scoring method, such as an Elo scoring algorithm and the like.
The image data set selected by the training data can be flexibly selected according to actual conditions, and is not limited to the following disclosed embodiments. In one example, the training data may be obtained by an open source Image Processing dataset BAPPS, and in one example, the training data may also be obtained by the Image Processing dataset BAPPS and a Perceptual Image Processing algorithm dataset (PIPAL).
In one possible implementation, the training process of the target neural network may include:
inputting a target image containing quality grading marks and a reference image into a target neural network, and processing the target image and the reference image through the image processing method provided in each disclosed embodiment to obtain the image quality of the target image output by the target neural network;
according to the image quality of a target image output by the target neural network and the quality grade mark of the target image, determining the error loss of the target neural network;
and adjusting at least one network parameter in the target neural network according to the error loss to obtain the trained target neural network.
The calculation method of the error loss may be flexibly selected according to actual situations, and is not limited to the following disclosure embodiments. In one example, a Rank loss function may be selected as the error loss calculation method.
In the process of adjusting the network parameters in the target neural network, which neural network parameters in the target neural network are adjusted can be determined according to the actual situation. In one possible implementation, the network parameters of the feature extraction network, the feature difference calculation network, and the regression network may be adjusted together; in a possible implementation manner, in one training process, only the parameters of one or some of the neural networks may be adjusted. In one example, in one training process, only the network parameters of the feature difference calculation network and the regression network may be adjusted while keeping the network parameters of the feature extraction network unchanged.
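A minimal sketch of one training iteration under these choices follows (hypothetical code: the pairwise margin form of the Rank loss, the optimizer, the layout of the data loader, and the attribute names are all assumptions made for illustration):

    import torch
    import torch.nn as nn

    # target_network(img, ref) -> quality score is assumed to be the full
    # target neural network; loader is assumed to yield (img_a, img_b, ref,
    # order), with order = 1 when img_a carries the higher quality score
    # annotation and order = -1 otherwise.
    for p in target_network.feature_extractor.parameters():
        p.requires_grad_(False)  # keep the feature extraction network unchanged

    rank_loss = nn.MarginRankingLoss(margin=0.1)  # margin value is an assumption
    optimizer = torch.optim.Adam(
        [p for p in target_network.parameters() if p.requires_grad], lr=1e-4)

    for img_a, img_b, ref, order in loader:
        score_a = target_network(img_a, ref)  # image quality of each target
        score_b = target_network(img_b, ref)  # image w.r.t. the same reference
        loss = rank_loss(score_a, score_b, order)  # error loss of the network
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()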
Implementing the image processing method provided by the embodiments of the disclosure through the target neural network enables end-to-end image quality evaluation, improving the convenience, efficiency, and accuracy of the determined image quality. Meanwhile, the method provided by the embodiments of the disclosure can also train the neural networks included in the target neural network jointly or separately, which greatly improves the flexibility of training, can further enhance the precision of the trained target neural network, and effectively improves the objectivity and accuracy of the target neural network in determining the quality of the target image.
Application scenario example
With the rapid development of image processing technology, various forms of distortion may exist in generated images, and how to perform objective quality judgment on the images becomes a problem to be solved urgently at present.
Fig. 4 is a schematic diagram illustrating an application example according to the present disclosure. As shown in the figure, an embodiment of the present disclosure proposes an image processing method that can objectively determine the image quality of a target image based on a reference image; the image processing may be performed through the following steps:
First step, obtaining pictures and preprocessing
The method provided in the embodiment of the present disclosure may be implemented by inputting a target image and a reference image into a target neural network. As shown in the figure, in one possible implementation, the target neural network may include a feature extraction network for feature extraction, a feature difference calculation network for calculating feature differences, and a regression network for score calculation. In one possible implementation manner, a distortion map provided by a user and to be subjected to quality evaluation can be used as the target image, and a reference map used by the user for evaluating the quality of the distortion map can be used as the reference image. The target neural network then scores the quality of the target image according to the reference image. In one possible implementation, the distortion map and the reference map provided by the user may not meet the input requirements of the target neural network, so they may be subjected to image preprocessing. In one example, after the distortion map and the reference map are converted into tensor format by PyTorch, regularization preprocessing may be performed on each to obtain a preprocessed target image and a preprocessed reference image matching the input of the subsequent feature extraction network. In one example, the regularization parameters for the three RGB channels may be: mean [0.485, 0.456, 0.406], standard deviation [0.229, 0.224, 0.225].
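The preprocessing in this example can be sketched as follows (hypothetical code; the disclosure names only PyTorch, so the torchvision transforms are an assumption):

    from torchvision import transforms

    # Convert each picture to a tensor and regularize it per RGB channel
    # with the parameters given above.
    preprocess = transforms.Compose([
        transforms.ToTensor(),  # PIL image -> float CHW tensor in [0, 1]
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    target_image = preprocess(distortion_map)   # distortion_map: a PIL image
    reference_image = preprocess(reference_map)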
Second step, feature extraction
As described in the above application example, in one possible implementation, feature extraction may be performed by a feature extraction network in a target neural network. In a possible implementation manner, the input data of the feature extraction network may be the target image and the reference image obtained in the first step, and the output of the feature extraction network may be feature vectors extracted from the target image and the reference image at different extraction depths.
The implementation form of the feature extraction network can be flexibly selected according to the actual situation. In one example, the feature extraction network may be an Alex network, and the application example of the present disclosure may obtain the output features of network layers such as layers 2, 5, 7, 9, and 12, respectively (i.e., the output features of every convolutional layer except the first); in one example, the feature extraction network may be a VGG network, and the application example of the present disclosure may obtain the output features of network layers such as layers 4, 9, 16, 23, and 30, respectively. The output features obtained in both application examples can include low-frequency feature information (picture background, etc.) as well as medium- and high-frequency feature information (texture details, contours, etc.) of the image. In one example, the feature extraction network proposed in the application example of the present disclosure may be a trained neural network; in one example, it may also be trained with training data. In a possible implementation manner, during training, when the subsequent neural networks such as the regression networks are trained, the parameters of the feature extraction network may be fixed rather than trained together with the rest of the neural networks.
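A sketch of collecting the output features at the VGG layer indices named above follows (hypothetical code assuming torchvision's pretrained VGG16, whose features module is indexed sequentially; the indexing convention and the weight choice are assumptions):

    import torch
    from torchvision import models

    vgg = models.vgg16(weights='IMAGENET1K_V1').features.eval()
    tap_indices = {4, 9, 16, 23, 30}  # layers named in the example above

    def extract_features(image):
        # image: preprocessed tensor of shape (1, 3, H, W)
        feats, x = [], image
        with torch.no_grad():
            for idx, layer in enumerate(vgg):
                x = layer(x)
                if idx in tap_indices:
                    feats.append(x)  # feature information at one scale
        return feats                 # five feature maps at five scales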
Third step, calculating the feature difference
Through the above process, feature extraction at multiple depths of the feature extraction network yields five groups of first feature maps at five scales of the target image as the first feature information, and five groups of second feature maps at five scales of the reference image as the second feature information. Then, the feature difference calculation network may process each group of first feature maps in the first feature information respectively. In an example, the first feature map at each scale may be shifted toward each of its eight neighboring points (i.e., up, down, left, right, upper left, upper right, lower left, and lower right), so that at each scale a total of 9 feature vector maps, including the original first feature map, are obtained as the third feature information. Then, at each scale, the second feature map of the reference image is subtracted in turn from the 9 feature vector maps contained in the third feature information, each pixel point of the feature vector maps is traversed, and for each pixel point the minimum among the 9 Manhattan distances to the second feature information of the reference image is taken as the feature difference value of that pixel point at that scale. The feature difference values of all pixel points at the five scales are summarized respectively, yielding the feature differences between the second feature information and the first feature information at the five scales.
Fourth step, regression calculation of the similarity between the target image and the reference image
Through the above process, five feature differences at different scales are obtained after the feature difference calculation network. The five feature differences enter five regression networks respectively, and a feature similarity score is obtained for each scale; these scores reflect the similarity between the target image and the reference image.
In one example, the five feature similarity scores may be directly added to obtain the finally determined similarity between the target image and the reference image, which is used as the image quality score of the target image.
In an example, the target neural network proposed in the application example of the present disclosure may be trained by training data, the training data may be flexibly selected according to actual conditions, in an example, the training data set BAPPS may be used for training, and in an example, the training data set PIPAL and the training data set BAPPS may also be used for training at the same time. The loss function selected in the training can also be flexibly determined, and in one example, the training can be performed through a Rank loss function.
Through the above process, spatial distortion between the input reference image and the input distorted image can be evaluated objectively, and the method provided by the application example of the disclosure is robust to the novel distortion effects brought by new technologies such as GAN networks; relying on the neural network, the method can be iteratively updated in time and can rapidly cope with any image generation method that produces spatial-distortion-type distortion, thereby enabling more objective image quality evaluation.
Through experimental verification, when evaluating the quality of images generated by super-resolution processing based on a generative adversarial network, the Spearman rank correlation of the method provided by the application example of the disclosure is 8% higher than that of related image processing methods, showing that the method evaluates the quality of spatially distorted images with better objectivity; meanwhile, when evaluating the quality of images containing other types of distortion, the quality determined by the method provided by the application example of the disclosure is also more objective and accurate than that of related image processing methods.
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with each other to form a combined embodiment without departing from the logic of the principle, which is limited by the space, and the detailed description of the present disclosure is omitted. Those skilled in the art will appreciate that in the above methods of the specific embodiments, the specific order of execution of the steps should be determined by their function and possibly their inherent logic.
In addition, the present disclosure also provides an image processing apparatus, an electronic device, a computer-readable storage medium, and a program, which can be used to implement any one of the image processing methods provided by the present disclosure, and the descriptions and corresponding descriptions of the corresponding technical solutions and the corresponding descriptions in the methods section are omitted for brevity.
Fig. 5 illustrates a block diagram of an image processing apparatus according to an embodiment of the present disclosure. The image processing apparatus may be a terminal device, a server or other processing device, etc. The terminal device may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like.
In some possible implementations, the image processing apparatus may be implemented by a processor calling computer readable instructions stored in a memory.
As shown in fig. 5, the image processing apparatus 20 may include:
the feature extraction module 21 is configured to obtain first feature information of a target image and second feature information of a reference image, where the target image and the reference image correspond to the same scene.
The feature difference obtaining module 22 is configured to obtain a feature difference between the second feature information and the first feature information according to a plurality of first pixel points in the target image and second pixel points in the reference image, where a distance between the plurality of first pixel points and the second pixel points in the space does not exceed a preset range.
And a similarity determining module 23, configured to determine a similarity between the target image and the reference image according to the feature difference.
And the image quality determining module 24 is configured to determine the image quality of the target image according to the similarity between the target image and the reference image.
In one possible implementation manner, the feature difference value obtaining module is configured to: moving the first characteristic information to at least one direction within a preset range to obtain at least one third characteristic information, wherein the third characteristic information comprises a plurality of first pixel points in the target image; acquiring a first intermediate characteristic difference value between the second characteristic information and at least one third characteristic information according to a second pixel point in the reference image; and determining a feature difference value between the second feature information and the first feature information according to the at least one first intermediate feature difference value.
In one possible implementation manner, the feature difference value obtaining module is configured to: respectively taking at least one pixel point in the reference image as a second pixel point, and taking at least one pixel point in the target image whose position distance from the second pixel point is within a preset range as a first pixel point; obtaining at least one second intermediate feature difference value according to the feature information corresponding to the second pixel point in the second feature information and the feature information corresponding to the at least one first pixel point in the first feature information; determining a feature difference value of the second pixel point according to the at least one second intermediate feature difference value; and determining a feature difference value between the second feature information and the first feature information according to the feature difference value of at least one second pixel point.
In a possible implementation manner, the first feature information includes first feature information in at least one scale, and the second feature information includes second feature information in at least one scale; the characteristic difference value obtaining module is used for: and respectively determining the feature difference between the second feature information under at least one scale and the first feature information with the same scale according to the second pixel points in the reference image and the plurality of first pixel points in the target image to obtain the feature difference under at least one scale.
In one possible implementation, the feature extraction module is configured to: inputting a target image into a feature extraction network, and acquiring first feature information of the target image under at least one scale according to the output of at least one network layer in the feature extraction network; and inputting the reference image into a feature extraction network, and acquiring second feature information of the reference image under at least one scale according to the output of at least one network layer in the feature extraction network.
In one possible implementation, the similarity determining module is configured to: and inputting the characteristic difference value into a regression network, and determining the similarity between the target image and the reference image according to the output of the regression network.
In one possible implementation, the feature difference values include feature difference values at least one scale; the similarity determination module is to: respectively inputting the characteristic difference values under at least one scale into a regression network to obtain an output result of at least one regression network; and fusing the output results of at least one regression network, and determining the similarity between the target image and the reference image according to the fusion result.
In one possible implementation, the target image is obtained by image preprocessing an initial target image, and the reference image is obtained by image preprocessing an initial reference image; the image preprocessing comprises the following steps: regularization processing and/or format conversion.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
The embodiments of the present disclosure also provide a computer program product, which includes computer readable code, and when the computer readable code runs on a device, a processor in the device executes instructions for implementing the image processing method provided in any one of the above embodiments.
The embodiments of the present disclosure also provide another computer program product for storing computer readable instructions, which when executed cause a computer to perform the operations of the image processing method provided in any of the above embodiments.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 6 illustrates a block diagram of an electronic device 800 in accordance with an embodiment of the disclosure. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like terminal.
Referring to fig. 6, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800, the relative positioning of components, such as a display and keypad of the electronic device 800, the sensor assembly 814 may also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
Fig. 7 illustrates a block diagram of an electronic device 1900 in accordance with an embodiment of the disclosure. For example, the electronic device 1900 may be provided as a server. Referring to fig. 7, electronic device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In some possible implementations, each module included in the image processing apparatus 20 corresponds to each hardware module included in an electronic device provided as a terminal, a server, or another device, and the corresponding mode can be flexibly determined according to the device form of the electronic device, and is not limited to the following disclosed embodiments. For example, in one example, each module included in the image processing apparatus 20 may correspond to the processing component 802 in the electronic device in the terminal form; in one example, each module included in the image processing apparatus 20 may correspond to the processing component 1922 in the electronic device in the form of a server.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and this electronic circuitry may execute the computer-readable program instructions, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (11)

1. An image processing method, comprising:
respectively acquiring first feature information of a target image and second feature information of a reference image, wherein the target image and the reference image correspond to the same scene;
acquiring a feature difference value between the second feature information and the first feature information according to a plurality of first pixel points in the target image and second pixel points in the reference image, wherein the spatial distance between the plurality of first pixel points and the second pixel points does not exceed a preset range;
determining the similarity between the target image and the reference image according to the feature difference value;
and determining the image quality of the target image according to the similarity between the target image and the reference image.
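Read as an algorithm, claim 1 describes a full-reference image quality pipeline: extract features from both images, compare them under a small spatial tolerance, regress a similarity, and report that similarity as the quality score. The following is a minimal sketch in Python/PyTorch; the patent publishes no code, so the function names and signatures here are illustrative assumptions, with the neighborhood-tolerant comparison and the regression deferred to the sketches under the later claims.

```python
import torch

def assess_image_quality(target: torch.Tensor, reference: torch.Tensor,
                         extract_features, feature_difference,
                         regress_similarity) -> torch.Tensor:
    """target, reference: (N, 3, H, W) images of the same scene (assumed layout)."""
    first_feats = extract_features(target)       # first feature information
    second_feats = extract_features(reference)   # second feature information
    # Difference taken over spatially nearby pixel pairs (the "preset range").
    diff = feature_difference(second_feats, first_feats)
    similarity = regress_similarity(diff)
    # The similarity is used directly as the image quality of the target image.
    return similarity
```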
2. The method according to claim 1, wherein the acquiring a feature difference value between the second feature information and the first feature information according to a plurality of first pixel points in the target image and second pixel points in the reference image comprises:
shifting the first feature information in at least one direction within the preset range to obtain at least one piece of third feature information, wherein the third feature information comprises the plurality of first pixel points in the target image;
acquiring a first intermediate feature difference value between the second feature information and the at least one piece of third feature information according to a second pixel point in the reference image;
and determining the feature difference value between the second feature information and the first feature information according to the at least one first intermediate feature difference value.
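Claim 2 realizes the neighborhood comparison by shifting the whole target feature map in each direction within the preset range (each shifted copy being one piece of third feature information), differencing each copy against the reference features to get the first intermediate feature difference values, and then reducing over the shifts. A sketch under two assumptions the claim does not fix, an L1 difference and a minimum reduction:

```python
import torch
import torch.nn.functional as F

def feature_difference_by_shifting(second_feats: torch.Tensor,
                                   first_feats: torch.Tensor,
                                   r: int = 1) -> torch.Tensor:
    """second_feats, first_feats: (N, C, H, W); r is the preset range in pixels."""
    _, _, h, w = first_feats.shape
    padded = F.pad(first_feats, (r, r, r, r), mode='replicate')
    intermediate = []
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            # One piece of third feature information: the target features
            # shifted by (dy - r, dx - r).
            shifted = padded[:, :, dy:dy + h, dx:dx + w]
            # One first intermediate feature difference value (L1, assumed).
            intermediate.append((second_feats - shifted).abs())
    # Reduce over all shifts; taking the minimum rewards the best local
    # alignment, so small misalignments are not punished as quality defects.
    return torch.stack(intermediate, dim=0).min(dim=0).values
```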
3. The method according to claim 1 or 2, wherein the acquiring a feature difference value between the second feature information and the first feature information according to a plurality of first pixel points in the target image and second pixel points in the reference image comprises:
respectively taking at least one pixel point in the reference image as a second pixel point, and taking at least one pixel point in the target image whose positional distance from the second pixel point is within the preset range as a first pixel point;
obtaining at least one second intermediate feature difference value according to the feature information corresponding to the second pixel point in the second feature information and the feature information corresponding to the at least one first pixel point in the first feature information;
determining a feature difference value of the second pixel point according to the at least one second intermediate feature difference value;
and determining the feature difference value between the second feature information and the first feature information according to the feature difference value of the at least one second pixel point.
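Claim 3 states the same comparison pixel-wise: each reference pixel (a second pixel point) is compared against every target pixel within the preset range (the first pixel points), giving one second intermediate feature difference value per neighbor, which are then reduced. torch.nn.functional.unfold expresses this without explicit loops; the L1 metric and minimum reduction are again assumptions:

```python
import torch
import torch.nn.functional as F

def feature_difference_per_pixel(second_feats: torch.Tensor,
                                 first_feats: torch.Tensor,
                                 r: int = 1) -> torch.Tensor:
    n, c, h, w = first_feats.shape
    k = 2 * r + 1
    # Lay out, for each spatial position, the k x k neighborhood of the
    # target features (the candidate first pixel points).
    neighbors = F.unfold(first_feats, kernel_size=k, padding=r)  # (N, C*k*k, H*W)
    neighbors = neighbors.view(n, c, k * k, h, w)
    center = second_feats.unsqueeze(2)  # (N, C, 1, H, W): second pixel points
    # Second intermediate feature difference values: one per first pixel point.
    intermediate = (center - neighbors).abs()
    # Feature difference value of each second pixel point; the map over all
    # second pixel points is the feature difference of the whole image.
    return intermediate.min(dim=2).values
```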
4. The method according to any one of claims 1 to 3, wherein the first feature information comprises first feature information under at least one scale, and the second feature information comprises second feature information under at least one scale;
the acquiring, according to a plurality of first pixel points in the target image and second pixel points in the reference image, a feature difference value between the second feature information and the first feature information comprises:
respectively determining a feature difference value between the second feature information under the at least one scale and the first feature information of the same scale according to the second pixel point in the reference image and the plurality of first pixel points in the target image, to obtain the feature difference value under the at least one scale.
5. The method according to any one of claims 1 to 4, wherein the respectively acquiring the first feature information of the target image and the second feature information of the reference image comprises:
inputting the target image into a feature extraction network, and acquiring first feature information of the target image under at least one scale according to the output of at least one network layer in the feature extraction network;
and inputting the reference image into a feature extraction network, and acquiring second feature information of the reference image under at least one scale according to the output of at least one network layer in the feature extraction network.
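Claims 4 and 5 allow features at several scales, taken from the outputs of intermediate layers of one feature extraction network. A sketch using torchvision's VGG16 feature stack as the assumed backbone (the patent does not name one); the tap indices below correspond to the ReLU outputs of conv1_2, conv2_2, conv3_3, and conv4_3:

```python
import torch
import torchvision

class MultiScaleExtractor(torch.nn.Module):
    """Collects feature maps at several depths of a VGG16 backbone (assumed)."""

    def __init__(self, tap_layers=(3, 8, 15, 22)):
        super().__init__()
        vgg = torchvision.models.vgg16(weights='IMAGENET1K_V1')  # downloads weights
        self.features = vgg.features
        self.tap_layers = set(tap_layers)

    def forward(self, x):
        feats = []
        for i, layer in enumerate(self.features):
            x = layer(x)
            if i in self.tap_layers:
                feats.append(x)  # feature information under one scale
        return feats
```

Feeding both the target image and the reference image through the same instance yields the per-scale first and second feature information, so the per-scale feature difference values of claim 4 can be computed with either of the sketches above.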
6. The method according to any one of claims 1 to 5, wherein the determining the similarity between the target image and the reference image according to the feature difference value comprises:
inputting the feature difference value into a regression network, and determining the similarity between the target image and the reference image according to the output of the regression network.
7. The method of any one of claims 1 to 6, wherein the feature difference values comprise feature difference values under at least one scale;
the determining the similarity between the target image and the reference image according to the feature difference value comprises:
respectively inputting the feature difference values under the at least one scale into at least one regression network to obtain an output result of the at least one regression network;
and fusing the output results of the at least one regression network, and determining the similarity between the target image and the reference image according to the fusion result.
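For claims 6 and 7, each per-scale feature difference value passes through its own regression network and the outputs are fused into one similarity. Averaging is one plausible fusion, assumed here, as is the shape of each regression head; the default channel counts match the VGG16 taps in the earlier sketch:

```python
import torch

class SimilarityRegressor(torch.nn.Module):
    """One small regression head per scale, outputs fused by averaging."""

    def __init__(self, channels_per_scale=(64, 128, 256, 512)):
        super().__init__()
        self.heads = torch.nn.ModuleList(
            torch.nn.Sequential(
                torch.nn.Conv2d(c, 32, kernel_size=3, padding=1),
                torch.nn.ReLU(),
                torch.nn.AdaptiveAvgPool2d(1),
                torch.nn.Flatten(),
                torch.nn.Linear(32, 1),
            )
            for c in channels_per_scale
        )

    def forward(self, diffs):
        # diffs: list of (N, C_i, H_i, W_i) feature difference maps, one per scale.
        scores = [head(d) for head, d in zip(self.heads, diffs)]
        # Fuse the per-scale output results; the similarity is their mean.
        return torch.stack(scores, dim=0).mean(dim=0)
```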
8. The method according to any one of claims 1 to 7, wherein the target image is obtained by performing image preprocessing on an initial target image, and the reference image is obtained by performing image preprocessing on an initial reference image;
the image preprocessing comprises: regularization processing and/or format conversion.
9. An image processing apparatus characterized by comprising:
the system comprises a feature extraction module, a feature extraction module and a feature extraction module, wherein the feature extraction module is used for respectively acquiring first feature information of a target image and second feature information of a reference image, and the target image and the reference image correspond to the same scene;
a feature difference obtaining module, configured to obtain, according to a plurality of first pixel points in the target image and second pixel points in the reference image, a feature difference between the second feature information and the first feature information, where a distance between the plurality of first pixel points and the second pixel points in a space does not exceed a preset range;
a similarity determining module, configured to determine, according to the feature difference, a similarity between the target image and the reference image;
and the image quality determining module is used for determining the image quality of the target image according to the similarity between the target image and the reference image.
10. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any one of claims 1 to 8.
11. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 8.
CN202011337518.6A 2020-11-25 2020-11-25 Image processing method and device, electronic equipment and storage medium Withdrawn CN112529846A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011337518.6A CN112529846A (en) 2020-11-25 2020-11-25 Image processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011337518.6A CN112529846A (en) 2020-11-25 2020-11-25 Image processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112529846A (en) 2021-03-19

Family

ID=74993282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011337518.6A Withdrawn CN112529846A (en) 2020-11-25 2020-11-25 Image processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112529846A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115393405A (en) * 2021-05-21 2022-11-25 北京字跳网络技术有限公司 Image alignment method and device
CN113450323A (en) * 2021-06-22 2021-09-28 深圳盈天下视觉科技有限公司 Quality detection method and device, electronic equipment and computer readable storage medium
WO2023142753A1 (en) * 2022-01-27 2023-08-03 华为技术有限公司 Image similarity measurement method and device
CN116843690A (en) * 2023-09-01 2023-10-03 荣耀终端有限公司 Image quality evaluation method, device and system
CN116843690B (en) * 2023-09-01 2024-03-01 荣耀终端有限公司 Image quality evaluation method, device and system

Similar Documents

Publication Publication Date Title
CN111310616B (en) Image processing method and device, electronic equipment and storage medium
CN110647834B (en) Human face and human hand correlation detection method and device, electronic equipment and storage medium
CN110674719B (en) Target object matching method and device, electronic equipment and storage medium
CN110688951B (en) Image processing method and device, electronic equipment and storage medium
CN111783986B (en) Network training method and device, and gesture prediction method and device
CN109697734B (en) Pose estimation method and device, electronic equipment and storage medium
CN110287874B (en) Target tracking method and device, electronic equipment and storage medium
CN112529846A (en) Image processing method and device, electronic equipment and storage medium
CN111340048B (en) Image processing method and device, electronic equipment and storage medium
CN105809704A (en) Method and device for identifying image definition
CN115100472B (en) Training method and device for display object recognition model and electronic equipment
CN109145970B (en) Image-based question and answer processing method and device, electronic equipment and storage medium
CN111435432B (en) Network optimization method and device, image processing method and device and storage medium
CN112967264A (en) Defect detection method and device, electronic equipment and storage medium
CN111523485A (en) Pose recognition method and device, electronic equipment and storage medium
CN111243011A (en) Key point detection method and device, electronic equipment and storage medium
CN112219224B (en) Image processing method and device, electronic equipment and storage medium
CN109903252B (en) Image processing method and device, electronic equipment and storage medium
CN114066856A (en) Model training method and device, electronic equipment and storage medium
CN111339880A (en) Target detection method and device, electronic equipment and storage medium
CN112184787A (en) Image registration method and device, electronic equipment and storage medium
CN113538310A (en) Image processing method and device, electronic equipment and storage medium
CN111311588B (en) Repositioning method and device, electronic equipment and storage medium
CN112016443B (en) Method and device for identifying same lines, electronic equipment and storage medium
WO2023155393A1 (en) Feature point matching method and apparatus, electronic device, storage medium and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20210319)