CN116996751A

CN116996751A - Method and device for training coder and decoder

Info

Publication number: CN116996751A
Application number: CN202311008864.3A
Authority: CN
Inventors: 徐顺鑫; 吕柯兴; 成超
Original assignee: Shanghai Bilibili Technology Co Ltd
Current assignee: Shanghai Bilibili Technology Co Ltd
Priority date: 2023-08-10
Filing date: 2023-08-10
Publication date: 2023-11-03

Abstract

The embodiment of the application provides a training method of a coder and a decoder, which comprises the following steps: acquiring an image sample and a watermark sample; inputting the image samples and the watermark samples into a watermark encoder to obtain a watermarked image; converting the watermarked image to an intermediate image by a degradation layer; the degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression; inputting the intermediate image into the watermark decoder to obtain a target watermark; obtaining a loss value between the target watermark and the watermark sample; and adjusting network parameters of the watermark encoder and/or network parameters of the watermark decoder according to the loss value. The technical scheme of the embodiment of the application can improve the anti-interference capability of the watermark encoder and the watermark decoder.

Description

Method and device for training coder and decoder

Technical Field

The embodiment of the application relates to the technical field of data processing, in particular to a training method and device of a coder-decoder, computer equipment and a computer readable storage medium.

Background

Digital watermarking is a technique of embedding specific information in digital media (image, audio or video) and can be used for copyright statement and protection. The processing procedure of the digital watermark mainly comprises embedding and extracting. Wherein the embedding step may be performed by a watermark encoder for embedding watermark information in the medium. The extracting step may be accomplished by a watermark decoder for extracting watermark information from the media.

In practical applications, images with digital watermarks may be subject to various disturbances during the propagation process, such as: compression of third party platforms, malicious tampering by a pirate, and the like. These disturbances may affect the image and the digital watermark in the image, resulting in a distinction between the digital watermark extracted from the image and the original digital watermark, which is detrimental to the maintenance of rights. Therefore, there is a need to improve the interference immunity of the codec.

The codec obtained by the conventional training method has certain anti-interference capability. However, conventional training methods tend to fit the codec to a certain scene after training converges, which is not robust to complex compressed scenes.

It should be noted that the foregoing is not necessarily prior art, and is not intended to limit the scope of the present application.

Disclosure of Invention

Embodiments of the present application provide a method, apparatus, computer device, and computer readable storage medium for training a codec, so as to solve or alleviate one or more of the technical problems set forth above.

An aspect of an embodiment of the present application provides a method for training a codec, including:

acquiring an image sample and a watermark sample;

inputting the image samples and the watermark samples into a watermark encoder to obtain a watermarked image;

converting the watermarked image to an intermediate image by a degradation layer; the degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression;

inputting the intermediate image into the watermark decoder to obtain a target watermark;

obtaining a loss value between the target watermark and the watermark sample;

and adjusting network parameters of the watermark encoder and/or network parameters of the watermark decoder according to the loss value.

Optionally, the at least two conversion modes further include a second preset image compression algorithm;

Correspondingly, converting the watermarked image into an intermediate image by a degradation layer, comprising:

randomly selecting one of the first preset image compression algorithm and the second preset image compression algorithm; and

compressing the watermarked image based on the selected preset image compression algorithm to obtain the intermediate image.

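Optionally, the at least two conversion modes further include a direct connection channel; correspondingly, converting the watermarked image into an intermediate image by a degradation layer comprises: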

selecting a direct connection channel, a first preset image compression algorithm or the second preset image compression algorithm;

in case the direct channel is selected, taking the watermarked image as the intermediate image;

under the condition that a first preset image compression algorithm is selected, compressing the watermarked image according to the first preset image compression algorithm to obtain the intermediate image;

and under the condition that a second preset image compression algorithm is selected, carrying out compression processing on the watermarking image according to the second preset image compression algorithm so as to acquire the intermediate image.
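As a non-limiting illustration, the selection among the three conversion modes might be sketched as follows. The function name `degrade`, the uniform random choice among the modes, the callables `first_algo` and `second_algo`, and the quality interval [10, 90] are assumptions made for this sketch only.

```python
import random


def degrade(watermarked, first_algo, second_algo, rng=random):
    """Pick one conversion mode at random and apply it to the watermarked image.

    first_algo(image, quality) and second_algo(image) are hypothetical
    callables standing in for the two preset image compression algorithms.
    """
    mode = rng.choice(["direct", "first", "second"])  # uniform choice is an assumption
    if mode == "direct":
        # Direct connection channel: the watermarked image is the intermediate image.
        return watermarked, mode
    if mode == "first":
        # The first algorithm is configured with a random quality coefficient.
        quality = rng.randint(10, 90)
        return first_algo(watermarked, quality), mode
    return second_algo(watermarked), mode
```

Returning the chosen mode alongside the image is only a convenience for inspection; a training pipeline would normally use the image alone.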

Optionally, the second preset image compression algorithm includes an analog JPEG algorithm; correspondingly, the converting the watermarked image into an intermediate image by a degradation layer comprises:

DCT transformation is carried out on the watermarking image so as to obtain an original DCT coefficient matrix corresponding to the watermarking image;

filtering high-frequency coefficients in the original DCT coefficient matrix to obtain a first target DCT coefficient matrix;

performing inverse DCT on the first target DCT coefficient matrix to obtain the intermediate image.
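The three steps above can be sketched for a single 8x8 block as follows. The block size, the orthonormal DCT-II basis, and the rule for which coefficients count as "high-frequency" (here, those with `u + v >= keep`) are assumptions of this sketch; the application does not fix them.

```python
import math

N = 8  # block size (assumption)


def _alpha(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# Orthonormal DCT-II basis: C[k][n] = alpha(k) * cos((2n + 1) * k * pi / (2N))
C = [[_alpha(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
     for k in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * X * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (valid because C is orthonormal)."""
    return matmul(matmul(transpose(C), coeffs), C)


def simulated_jpeg(block, keep=4):
    """DCT, zero the coefficients with u + v >= keep, then inverse DCT."""
    coeffs = dct2(block)
    filtered = [[coeffs[u][v] if u + v < keep else 0.0 for v in range(N)]
                for u in range(N)]
    return idct2(filtered)
```

Because every operation here is a fixed linear map, a version of this filter written in a deep-learning framework would pass gradients through unchanged.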

Optionally, the first preset image compression algorithm includes a standard JPEG algorithm; correspondingly, the converting the watermarked image into an intermediate image by a degradation layer comprises:

quantizing the original DCT coefficient matrix according to the random quality coefficient to obtain a second target DCT coefficient matrix;

and acquiring the intermediate image according to the second target DCT coefficient matrix.
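As a hedged illustration of the quantization step, the sketch below maps a quality coefficient to quantization steps and rounds each DCT coefficient to the nearest multiple of its step. The application does not state how the quality coefficient is converted into quantization steps; the scaling rule used here follows the convention of the IJG libjpeg library and is an assumption, as are the function names. The base table is the example luminance table from Annex K of the JPEG standard.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scaled_table(quality):
    """Scale the base table for a quality coefficient (libjpeg-style rule, assumed)."""
    quality = max(1, min(100, int(quality)))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in BASE_LUMA]


def quantize(coeffs, quality):
    """Quantize and de-quantize an 8x8 DCT coefficient matrix.

    Each coefficient is rounded to the nearest multiple of its step, which is
    where the information loss of the compression comes from.
    """
    table = scaled_table(quality)
    return [[round(c / step) * step for c, step in zip(c_row, t_row)]
            for c_row, t_row in zip(coeffs, table)]
```

In training, `quality` would be drawn at random for each iteration. A smaller quality coefficient gives larger steps and therefore coarser coefficients.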

Optionally, the plurality of the degradation layers form a cascade structure, and in the cascade structure, the output of the previous degradation layer between the adjacent degradation layers is the input of the next degradation layer;

correspondingly, the converting the watermarked image into an intermediate image by a degradation layer comprises:

inputting the watermarking image into a first degradation layer in a cascade structure so that each degradation layer processes the watermarking image layer by layer;

acquiring the image output by the last degradation layer in the cascade structure and determining it as the intermediate image.
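The cascade structure can be illustrated with a minimal sketch in which each degradation layer is modelled as a callable that takes an image and returns an image. The name `cascade` and this callable interface are assumptions of the sketch.

```python
from functools import reduce


def cascade(layers, watermarked):
    """Feed the output of each degradation layer into the next one.

    The value returned by the last layer is the intermediate image; with an
    empty list of layers the watermarked image is returned unchanged.
    """
    return reduce(lambda image, layer: layer(image), layers, watermarked)
```

For example, `cascade([layer_a, layer_b], image)` evaluates `layer_b(layer_a(image))`, which mimics an image being compressed more than once in succession.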

Another aspect of the embodiment of the present application provides a method for encoding and decoding a digital watermark, where the method includes:

encoding an image to be watermarked with a watermark encoder trained according to any one of the above codec training methods; or

decoding a watermarked image with a watermark decoder trained according to any one of the above codec training methods.

Another aspect of an embodiment of the present application provides a training apparatus for a codec, the apparatus including:

the first acquisition module is used for acquiring an image sample and a watermark sample;

a first input module for inputting the image samples and the watermark samples into a watermark encoder to obtain a watermarked image;

the conversion module is used for converting the watermarking image into an intermediate image through a degradation layer; the degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression;

a second input module for inputting the intermediate image into the watermark decoder to obtain a target watermark;

A second acquisition module, configured to acquire a loss value between the target watermark and the watermark sample;

and the adjusting module is used for adjusting the network parameters of the watermark encoder and/or the network parameters of the watermark decoder according to the loss value.

Another aspect of an embodiment of the present application provides a computer apparatus, including:

at least one processor; and

A memory communicatively coupled to the at least one processor;

wherein: the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.

Another aspect of embodiments of the present application provides a computer-readable storage medium having stored therein computer instructions which, when executed by a processor, implement a method as described above.

The embodiment of the application adopts the technical scheme and can have the following advantages:

The acquired image samples and watermark samples serve as training samples for the watermark codec. The training process is as follows: the image sample and the watermark sample are input into a watermark encoder to obtain a watermarked image. The watermarked image is then converted into an intermediate image by, for example, the first preset image compression algorithm in the degradation layer. The first preset image compression algorithm is configured with a random quality coefficient, which is used to compress the watermarked image. The resulting intermediate image is then input into a watermark decoder to obtain the target watermark. Finally, a loss value between the target watermark and the watermark sample is determined, and the network parameters of the watermark encoder and the network parameters of the watermark decoder are adjusted based on the loss value. Setting the quality coefficient of the first preset image compression algorithm to a random value enlarges the range of compression conditions covered during training. Because the watermarked image is compressed randomly in each training iteration, that is, each iteration uses a random compression scene, the watermark encoder and the watermark decoder are robust to complex compression scenes after training converges, which further improves their anti-interference capability.

Drawings

The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.

Fig. 1 schematically shows a visible digital watermark and an invisible digital watermark.

Fig. 2 schematically shows a process flow diagram of digital watermarking.

Fig. 3 schematically shows a codec training flow chart according to an embodiment of the application.

Fig. 4 schematically shows a flow chart of a method of training a codec according to a first embodiment of the present application.

Fig. 5 schematically shows a sub-step of step S404 in Fig. 4.

Fig. 6 schematically shows an internal configuration of a degradation layer.

Fig. 7 schematically shows a sub-step of step S404 in Fig. 4.

Fig. 8 schematically shows a sub-step of step S404 in Fig. 4.

Fig. 9 schematically shows a sub-step of step S404 in Fig. 4.

Fig. 10 schematically shows a cascade structure of degradation layers.

Fig. 11 schematically shows a sub-step of step S404 in Fig. 4.

Fig. 12 schematically shows an application example diagram of a codec training method according to a first embodiment of the present application.

Fig. 13 schematically shows a block diagram of a training device of a codec according to a third embodiment of the present application.

Fig. 14 schematically shows a hardware architecture diagram of a computer device according to a fourth embodiment of the present application.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

It should be noted that the descriptions of "first," "second," etc. in the embodiments of the present application are for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but it is necessary to base that the technical solutions can be realized by those skilled in the art, and when the technical solutions are contradictory or cannot be realized, the combination of the technical solutions should be considered to be absent and not within the scope of protection claimed in the present application.

In the description of the present application, it should be understood that the numerical references before the steps do not identify the order in which the steps are performed, but are merely used to facilitate description of the present application and to distinguish between each step, and thus should not be construed as limiting the present application.

First, a term explanation is provided in relation to the present application:

Image degradation: the process by which the picture quality of an image is reduced during acquisition, transmission and storage. In the embodiments of the present application, degradation occurs during the transmission and storage phases.

DCT (Discrete Cosine Transform): a transform that converts a signal from the spatial domain (or time domain) to the frequency domain; it is widely used in image processing, audio processing, data processing, and other fields.

Back propagation: a gradient-based optimization algorithm used to adjust the parameters of a neural network so that the network's predictions move closer to the ground truth.

End-to-end training: in deep neural network learning, directly optimizing the overall objective of the task by optimizing the network parameters, without dividing the network into separately trained modules.

JPEG: an image compression standard algorithm.

Quality coefficient (quality factor): a parameter that controls the degree of compression in an image compression algorithm. In the JPEG algorithm, the larger the quality coefficient, the lower the degree of compression and the smaller the image loss; the smaller the quality coefficient, the higher the degree of compression and the greater the image loss. The quality coefficient of JPEG ranges over [0,100].

Channel transmission: refers to the transmission of information in a medium. The medium in the embodiment of the application mainly refers to the internet.

Watermark (Watermark): an identification embedded in a video. Watermarks can be used for a variety of purposes including preventing piracy, tracking data leakage, providing copyrighted information, and the like. In the present application, watermarks refer specifically to invisible watermarks.

Next, in order to facilitate understanding of the technical solutions provided by the embodiments of the present application by those skilled in the art, the following description is made on related technologies:

Digital watermarking embeds specific information in images and can be an effective means of asserting copyright. As shown in fig. 1, digital watermarks include visible watermarks and invisible watermarks. A visible watermark is overlaid on the original image, which may degrade the visual quality of the image and even obscure key content. Because such a watermark is visible, a user can remove it in a targeted manner, for example by using a watermark-removal algorithm or by cropping out the watermark area. An invisible watermark has the clear advantage that it hides information while hardly affecting the visual appearance of the image, and it is difficult to remove in a targeted manner.

The processing flow of the digital watermark is shown in fig. 2. The copyright owner designates copyright information, and the watermark embedding algorithm embeds the copyright information into the original image to obtain a watermarked image. The watermark embedding algorithm may be performed by a watermark encoder. When a watermarked image is distributed on the internet, it may be subject to various disturbances during propagation, such as compression by a third-party platform or malicious tampering by a pirate. These disturbances may affect the image and the watermark information in it. When the copyright owner finds that the copyright has been infringed, the watermark can be extracted from the attacked image and the extracted copyright information used as evidence. The watermark extraction algorithm may be performed by a watermark decoder.

Due to interference and attack, the copyright information extracted from the attacked image may be different from the originally designated copyright information, which is not beneficial to the maintenance of rights. Thus, there is a need to improve the immunity of watermark encoders and watermark decoders, and compression is one of the most common types of interference.

The inventors have appreciated that: in the related art, a compression algorithm is adopted to simulate possible compression interference in channel transmission, so that a watermark encoder and a watermark decoder obtained through training have certain anti-interference capability. However, this tends to make watermark encoders and watermark decoders fit to a certain scene after training convergence, which is less robust for complex compressed scenes.

Therefore, the embodiment of the application provides a training technical scheme of the coder-decoder. In the technical scheme, the method comprises the following steps: (1) The coverage range of a compression algorithm is enlarged by setting the quality coefficient as a random value, so that the watermark encoder and the watermark decoder after training convergence have robustness to complex compression scenes, and the compression interference resistance of the watermark encoder and the watermark decoder is further improved; (2) By arranging a plurality of cascade degradation layers, multiple times of compression are simulated, and the robustness of the watermark encoder and the watermark decoder to complex compression scenes is improved. See in particular below.

For ease of understanding, an exemplary codec training flow chart is provided below.

As shown in fig. 3, the codec training process of the embodiment of the present application uses a watermark encoder, a degradation layer, and a watermark decoder. Wherein:

the watermark encoder is used to embed a watermark in the image. The watermark encoder may be formed of a deep neural network containing learnable network parameters.

The watermark decoder is used to extract the watermark from the image. The watermark decoder may be formed of a deep neural network containing learnable network parameters.

The degradation layer is used to degrade the image quality. The degradation layer is configured to simulate a channel transmission process in end-to-end training of the watermark encoder and the watermark decoder.

It should be noted that the watermark encoder and the watermark decoder may be integrated as a joint model in one system. The watermark encoder and watermark decoder may also be decoupled, distributed in different systems.

The technical scheme of the application is described below through a plurality of embodiments. It should be understood that these embodiments may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.

Example 1

Fig. 4 schematically shows a flow chart of a method of training a codec according to a first embodiment of the present application.

As shown in fig. 4, the method for training a codec may include steps S400 to S410, in which:

step S400, obtaining an image sample and a watermark sample.

Step S402, inputting the image samples and the watermark samples into a watermark encoder to obtain a watermarked image.

Step S404, converting the watermarked image into an intermediate image by a degradation layer. The degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression. The random quality coefficients may quantize the DCT coefficients of the watermarked image.

Step S406, inputting the intermediate image into the watermark decoder to obtain a target watermark.

Step S408, obtaining a loss value between the target watermark and the watermark sample.

Step S410, adjusting the network parameter of the watermark encoder and/or the network parameter of the watermark decoder according to the loss value.

According to the method for training the codec, the acquired image samples and watermark samples are used as training samples for training the watermark codec. The training process is as follows: the image sample and watermark sample are input into a watermark encoder to obtain a watermarked image. The watermarked image is then converted to an intermediate image by the degradation layer. The degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression. The resulting intermediate image is then input to a watermark decoder to obtain the target watermark. Finally, a loss value between the target watermark and the watermark sample is determined. Based on the loss value, the network parameters of the watermark encoder and the network parameters of the watermark decoder are adjusted. It can be known that, in the embodiment of the application, the coverage of the compression algorithm is enlarged by setting the quality coefficient of the first preset image compression algorithm to be a random value. And randomly compressing the watermarked image in each training iteration process, namely, using a random compression scene in each training iteration process, so that the watermark encoder and the watermark decoder after training convergence have robustness to complex compression scenes, and further improving the anti-interference capability of the watermark encoder and the watermark decoder.
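As a non-limiting illustration, one training iteration covering steps S402 to S410 can be wired together as follows. The encoder, degradation layer, decoder, loss function and parameter update are passed in as callables; these interfaces and the name `train_step` are assumptions of the sketch, not components defined by the application.

```python
def train_step(image, watermark, encoder, degradation, decoder, loss_fn, update):
    """Run one training iteration and return the loss value."""
    watermarked = encoder(image, watermark)   # S402: embed the watermark sample
    intermediate = degradation(watermarked)   # S404: degrade, e.g. random-quality compression
    target = decoder(intermediate)            # S406: extract the target watermark
    loss = loss_fn(target, watermark)         # S408: compare with the watermark sample
    update(loss)                              # S410: adjust encoder and/or decoder parameters
    return loss
```

Because the degradation layer sits between the encoder and the decoder, both networks are optimized against images that have already been degraded.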

Each of steps S400 to S410 and optional other steps are described in detail below in conjunction with fig. 4.

Step S400: Image samples and watermark samples are acquired.

The image samples are the original image to be watermarked and may constitute training samples for the watermark encoder and the watermark decoder. In some embodiments, audio samples, video samples, etc. may also be employed in place of image samples.

The watermark samples are watermark information that needs to be embedded in the image. The watermark information may include copyright information, content authentication information, anti-counterfeit identification information, and the like.

Training data for the watermark encoder and the watermark decoder can be constructed from the image samples and the watermark samples and used to train and test the watermark encoder and the watermark decoder.

Step S402: The image samples and the watermark samples are input into a watermark encoder to obtain a watermarked image.

The watermark encoder may be formed by a model such as a deep neural network, i.e. the watermark encoder includes a learnable network parameter. The learnable network parameters can be continuously adjusted through training, so that the performance of the watermark encoder is improved.

Watermark encoders may be used to implement watermark embedding. For example, a watermark encoder may embed a digital watermark in image samples based on watermark samples to obtain a watermarked image. The watermark in the watermarked image is not visible.

In this embodiment, embedding watermark information in an image is achieved by a watermark encoder, resulting in a watermarked image for subsequent training.

Step S404: The watermarked image is converted into an intermediate image by a degradation layer, wherein the degradation layer is configured with a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression.

The watermarked image is an image that contains a digital watermark.

The intermediate image is an image obtained by processing the watermarked image by a degradation layer.

The degradation layer may be used to reduce the picture quality of the watermarked image, for example: compression of images, introduction of image noise, reduction of image sharpness, etc.

The degradation layer may simulate various disturbances that the watermarked image may experience during channel transmission. The degradation layer is added into the end-to-end training of the watermark encoder and the watermark decoder, so that the watermark encoder and the watermark decoder after training convergence have robustness to image interference.

For example, in order to achieve robustness of the watermark encoder and watermark decoder to compression, the degradation layer needs to simulate the compression process. Accordingly, degradation refers to pixel loss that occurs when an image or video is compressed and decompressed.

The degradation layer may be configured with a first preset image compression algorithm that uses a random quality coefficient. The quality coefficient is the parameter with which the first preset image compression algorithm controls the degree of compression, and it is set to a random value. For example, the quality coefficient may be randomly sampled from the interval [10,90]. It is drawn anew for each training iteration, so the values used in different iterations may be the same or different.
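A minimal sketch of the per-iteration sampling is given below. Uniform integer sampling, the seed, and the name `sample_quality` are assumptions; the application only states that the value is random and may, for example, lie in [10,90].

```python
import random


def sample_quality(rng, low=10, high=90):
    """Draw one quality coefficient, inclusive of both interval ends."""
    return rng.randint(low, high)


rng = random.Random(2023)  # fixed seed only so the sketch is reproducible
# One independent draw per training iteration; repeated values are possible.
qualities = [sample_quality(rng) for _ in range(1000)]
```

Over many iterations the draws spread across the interval, so the codec is exposed to both light and heavy compression instead of a single fixed setting.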

The first preset image compression algorithm may be any one of JPEG (Joint Photographic Experts Group), AVIF (AV1 Image File Format) and WebP (Web Picture), or another lossy image compression algorithm. Different lossy image compression algorithms provide different compression scenes. Because the interference introduced by these image compression algorithms is similar, training the watermark encoder and the watermark decoder with one of them (such as JPEG) yields a watermark encoder and a watermark decoder that are robust to JPEG compression scenes and can also achieve good robustness under other image compression algorithms.

Converting the watermarked image into an intermediate image through the degradation layer may consist of compressing the watermarked image with the degradation layer. Specifically, the degradation layer performs lossy compression on the watermarked image using the first preset image compression algorithm, for example a standard JPEG algorithm. During compression, the watermarked image is compressed using the random quality coefficient, and it is this operation that introduces compression interference into the image content. That is, in each training iteration the degradation layer can simulate a random compression scene to process the watermarked image.

It can be known that, in this embodiment, by setting a random quality coefficient for the first preset image compression algorithm, the degradation layer can simulate a plurality of random compression scenes, thereby expanding the coverage range of the compression algorithm.

In some embodiments, the degradation layer may be configured as a blurring algorithm for reducing the sharpness of the image. The degradation layer is added into the end-to-end training of the watermark encoder and the watermark decoder, so that the watermark encoder and the watermark decoder after training convergence have robustness to fuzzy interference.

In other embodiments, the degradation layer may be configured as an image noise algorithm for adding noise to the image. The degradation layer is added into the end-to-end training of the watermark encoder and the watermark decoder, so that the watermark encoder and the watermark decoder after training convergence have robustness to noise interference.
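As a hedged illustration of these two alternative degradations, the sketch below applies a 3x3 box blur and additive Gaussian noise to a grayscale image stored as a list of rows. The specific blur kernel, the noise model, the [0, 255] value range and both function names are assumptions; the application does not specify which blurring or noise algorithm is used.

```python
import random


def box_blur(image):
    """Replace each pixel by the mean of its 3x3 neighbourhood (clipped at borders)."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [image[a][b]
                      for a in range(max(0, i - 1), min(h, i + 2))
                      for b in range(max(0, j - 1), min(w, j + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clamp the result to [0, 255]."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]
```

Either function could take the place of the compression step inside a degradation layer, or be combined with it.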

Step S406: The intermediate image is input into the watermark decoder to obtain a target watermark.

The watermark decoder may be formed by a model such as a deep neural network, i.e. the watermark decoder contains learnable network parameters. The learnable network parameters can be continuously adjusted through training, so that the performance of the watermark decoder is improved.

The watermark decoder may be arranged to extract watermark information from the image carrying the digital watermark. For example, the watermark decoder may extract a target watermark from the intermediate image, and the target watermark may include the watermark information after the interference.

In this embodiment, the target watermark is extracted from the intermediate image by the watermark decoder for subsequent training.

Step S408: A loss value between the target watermark and the watermark sample is obtained.

The target watermark is the watermark information predicted by the watermark decoder, and it may not be exactly the same as the watermark sample. The loss value can therefore be determined from the difference or error between the target watermark and the watermark sample, and it can be used to evaluate the performance of the watermark encoder and the watermark decoder. The smaller the loss value, the closer the target watermark is to the watermark sample and the better the performance of the watermark encoder and the watermark decoder; the larger the loss value, the greater the difference between the target watermark and the watermark sample and the poorer their performance.

There are various schemes for obtaining the loss value. For example, the loss functions may be pre-designed for the watermark encoder and watermark decoder as desired. The loss function may be used to measure the difference between model predictions and true states. The target watermark and watermark samples are input into a loss function, through which corresponding loss values can be output.
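By way of example only, the sketch below shows one possible loss function, together with a bit error rate that is convenient for evaluation. The application leaves the loss function open, so the choice of mean squared error, the representation of the watermark as a list of values in [0, 1], the 0.5 threshold and both function names are assumptions.

```python
def mse_loss(target, sample):
    """Mean squared error between the target watermark and the watermark sample."""
    if len(target) != len(sample):
        raise ValueError("watermarks must have the same length")
    return sum((t - s) ** 2 for t, s in zip(target, sample)) / len(sample)


def bit_error_rate(target, sample, threshold=0.5):
    """Fraction of bits that differ after thresholding both watermarks."""
    wrong = sum((t >= threshold) != (s >= threshold) for t, s in zip(target, sample))
    return wrong / len(sample)
```

For instance, `mse_loss([0.5, 0.5], [1, 0])` evaluates to 0.25, and `bit_error_rate([0.9, 0.2, 0.4], [1, 0, 1])` evaluates to one third because only the last bit is decoded incorrectly.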

The loss values may be used to guide network parameter adjustments to optimize the algorithm, specifically as follows:

Step S410: adjusting the network parameters of the watermark encoder and/or the network parameters of the watermark decoder according to the loss value.

The smaller the loss value between the target watermark and the watermark sample, the more robust the watermark encoder and the watermark decoder are to complex compression scenes. The aim of training the watermark encoder and the watermark decoder is to reduce the loss value as much as possible by continuously adjusting their learnable network parameters. The loss value may be used to guide the adjustment of the network parameters of the watermark encoder and of the watermark decoder in the direction that reduces the loss, thereby improving the performance of both.

For example, the network parameters may be optimized by a back-propagation algorithm. Back propagation first calculates a gradient, which represents the rate of change of the loss function in parameter space; adjusting the current parameters against the gradient reduces the loss value. In each training iteration, the network parameters of the watermark encoder and the network parameters of the watermark decoder are adjusted, so that the loss value is gradually reduced and the resistance of the watermark encoder and the watermark decoder to compression interference in complex compression scenes is improved.
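The parameter adjustment described above can be illustrated with a minimal gradient-descent sketch. The model here is a hypothetical two-weight linear predictor with a squared-error loss, standing in for the much larger networks of the watermark encoder and decoder; the learning rate, the sample data and the function names are assumptions made for illustration only.

```python
def loss_and_grad(weights, features, target):
    """Squared error of a linear predictor and its gradient w.r.t. the weights."""
    pred = sum(w * f for w, f in zip(weights, features))
    err = pred - target
    return err ** 2, [2.0 * err * f for f in features]

def train(weights, samples, lr=0.1, iterations=100):
    """Plain gradient descent: step against the averaged gradient each iteration."""
    history = []
    n = len(samples)
    for _ in range(iterations):
        total = 0.0
        grad_sum = [0.0] * len(weights)
        for features, target in samples:
            loss, grad = loss_and_grad(weights, features, target)
            total += loss
            grad_sum = [g + gi for g, gi in zip(grad_sum, grad)]
        # Moving opposite to the gradient reduces the loss value.
        weights = [w - lr * g / n for w, g in zip(weights, grad_sum)]
        history.append(total / n)  # loss measured before this update
    return weights, history

# Assumed toy data: (features, target watermark value).
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
weights, history = train([0.0, 0.0], samples)
print("first loss:", history[0], "last loss:", history[-1])
print("learned weights:", weights)
```

The recorded loss shrinks from one iteration to the next, which is the behaviour the training of the watermark encoder and decoder relies on.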

In the above embodiment, by configuring the first preset image compression algorithm including the random quality coefficient in the degradation layer, the degradation layer can simulate a plurality of random compression scenes in the end-to-end training, and the coverage of the compression algorithm is enlarged, so that the robustness of the watermark encoder and the watermark decoder to complex compression scenes is improved.

In order to further increase the robustness of the watermark encoder and watermark decoder to more complex scenes, the degradation layer may be flexibly configured.

In an alternative embodiment, the at least two conversion modes further include a second preset image compression algorithm.

Correspondingly, as shown in fig. 5, step S404 may include:

Step S500, randomly selecting one of the first preset image compression algorithm and the second preset image compression algorithm; and

Step S502, performing compression processing on the watermarked image based on the selected preset image compression algorithm, so as to obtain the intermediate image.

Since some operations in the first preset image compression algorithm are not differentiable, they may block gradient back propagation to the watermark encoder during end-to-end training. To solve this problem, a second preset image compression algorithm may be configured in the degradation layer. The second preset image compression algorithm may be a custom, differentiable simulation algorithm that approximates the compression operation, thereby alleviating the blocking of back propagation to the watermark encoder during end-to-end training. For example, the second preset image compression algorithm may include a simulated (analog) JPEG algorithm, a simulated WebP algorithm, or the like. With the analog JPEG algorithm, the second preset image compression algorithm can approximate the compression operation of the standard JPEG algorithm.
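The differentiability issue can be made concrete with a small numerical check. The sketch below assumes that the non-differentiable operation is the rounding used in coefficient quantization, and uses a simple scaling function as a hypothetical differentiable stand-in; neither function is taken from the application.

```python
def hard_quantize(x, step=8.0):
    """Rounding-based quantization: piecewise constant in x."""
    return round(x / step) * step

def soft_attenuate(x, gain=0.5):
    """Hypothetical smooth stand-in: scaling is differentiable everywhere."""
    return gain * x

def finite_difference(fn, x, h=1e-3):
    """Central finite-difference estimate of the derivative of fn at x."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

x = 13.0
print("slope through rounding:", finite_difference(hard_quantize, x))
print("slope through scaling: ", finite_difference(soft_attenuate, x))
```

Away from the rounding boundaries the quantizer has zero slope, so no useful gradient reaches the watermark encoder through it, whereas the smooth operation passes a non-zero gradient.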

In the case that the first preset image compression algorithm and the second preset image compression algorithm are configured in the degradation layer, one of the two image compression algorithms can be randomly selected in each training iteration to compress the watermarked image and obtain the intermediate image. In this way, the degradation layer not only provides a plurality of random compression scenes based on the first preset image compression algorithm, but also provides more compression scenes based on the second preset image compression algorithm, so that the coverage of compression scenes is further enlarged, and the anti-interference capability and the robustness of the watermark encoder and the watermark decoder to more complex scenes are further improved.

In an alternative embodiment, as shown in fig. 6, the at least two conversion modes further comprise a direct connection channel.

Correspondingly, as shown in fig. 7, step S404 may include:

Step S700, selecting the direct connection channel, the first preset image compression algorithm or the second preset image compression algorithm.

Step S702, in the case of selecting the direct connection channel, taking the watermarked image as the intermediate image.

Step S704, in the case of selecting the first preset image compression algorithm, performing compression processing on the watermarked image according to the first preset image compression algorithm, so as to obtain the intermediate image.

Step S706, in the case of selecting the second preset image compression algorithm, performing compression processing on the watermarked image according to the second preset image compression algorithm, so as to obtain the intermediate image.

The direct connection channel takes the input directly as the output without any processing. It can simulate the case in which the image is not disturbed by compression: when the input is the watermarked image, the intermediate image output through the direct connection channel is the watermarked image itself. That is, no external interference is considered in this case during the end-to-end training of the watermark encoder and the watermark decoder.

When the direct connection channel, the first preset image compression algorithm and the second preset image compression algorithm are all configured in the degradation layer, one of the three can be randomly selected in each training iteration to simulate channel transmission and convert the watermarked image into the intermediate image, so that the network parameters of the watermark encoder and the network parameters of the watermark decoder can subsequently be trained and robustness is improved. The direct connection channel (no interference) does not process the watermarked image, whereas the first preset image compression algorithm or the second preset image compression algorithm compresses the watermarked image.
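Steps S700 to S706 can be sketched as a single function that draws one conversion mode per call. The two compression functions below are simplified, hypothetical stand-ins (coarse rounding and horizontal smoothing) rather than real JPEG implementations, and the quality range 10 to 100 is an assumption.

```python
import random

def direct_channel(image):
    """Direct connection: the input is returned unchanged."""
    return image

def first_compression(image, quality):
    """Stand-in for the first preset algorithm: coarser rounding at lower quality."""
    step = max(1, (101 - quality) // 4)
    return [[round(p / step) * step for p in row] for row in image]

def second_compression(image):
    """Stand-in for the second preset algorithm: mild horizontal smoothing."""
    out = []
    for row in image:
        last = len(row) - 1
        out.append([(row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, last)]) / 4.0
                    for i in range(len(row))])
    return out

def degradation_layer(image, rng):
    """Randomly pick one conversion mode (S700) and apply it (S702 to S706)."""
    mode = rng.choice(["direct", "first", "second"])
    if mode == "direct":
        return mode, direct_channel(image)
    if mode == "first":
        quality = rng.randint(10, 100)  # random quality coefficient (assumed range)
        return mode, first_compression(image, quality)
    return mode, second_compression(image)

rng = random.Random(0)
watermarked = [[10, 20, 30], [40, 50, 60]]
for _ in range(3):
    mode, intermediate = degradation_layer(watermarked, rng)
    print(mode, intermediate)
```

Passing the random generator in explicitly keeps the selection reproducible when a seed is fixed, which can help when comparing training runs.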

In an alternative embodiment, as shown in fig. 8, the second preset image compression algorithm includes an analog JPEG algorithm. Correspondingly, step S404 may include:

Step S800, performing DCT transformation on the watermarked image to obtain an original DCT coefficient matrix corresponding to the watermarked image.

Step S802, filtering high frequency coefficients in the original DCT coefficient matrix to obtain a first target DCT coefficient matrix.

Step S804, performing inverse DCT transformation on the first target DCT coefficient matrix to obtain the intermediate image.

The analog JPEG algorithm may be a custom, differentiable image compression algorithm for approximating the compression operations in a standard JPEG algorithm.

The DCT (discrete cosine transform) converts a spatial domain signal to the frequency domain to obtain a set of frequency coefficients, also known as DCT coefficients. The high frequency coefficients may represent details, textures, edges, etc. in the image. The low frequency coefficients may represent low frequency information such as the overall structure of the image and the average color intensity.

In a specific application, the step of converting the watermarked image into an intermediate image by means of an analog JPEG algorithm is specifically as follows:

A DCT is first performed on the watermarked image to obtain the original DCT coefficient matrix corresponding to the watermarked image. In image compression algorithms such as JPEG, the quantization of DCT coefficients mainly affects the high frequencies. Therefore, the high frequency coefficients in the original DCT coefficient matrix can be filtered out and discarded so that only the low frequency coefficients remain, which yields the first target DCT coefficient matrix. An inverse DCT is then performed on the first target DCT coefficient matrix to convert the signal back to the spatial domain and obtain the intermediate image.

In this embodiment, the compression operation approximation is achieved by eliminating the high frequency coefficients and retaining the low frequency coefficients.
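Steps S800 to S804 can be sketched on a single square block as follows. The sketch assumes an orthonormal DCT-II applied to an 8x8 block and a simple cut-off rule that keeps coefficients whose row and column indices sum to less than `keep`; the block size, the cut-off rule and all function names are assumptions, since the application does not specify them.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    """2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (valid because C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def simulated_jpeg(block, keep=4):
    """S800: DCT; S802: zero the high-frequency coefficients; S804: inverse DCT."""
    coeffs = dct2(block)
    n = len(coeffs)
    filtered = [[coeffs[u][v] if u + v < keep else 0.0 for v in range(n)]
                for u in range(n)]
    return idct2(filtered)

# A checkerboard block is dominated by high frequencies, so it is strongly attenuated.
checker = [[(-1.0) ** (i + j) for j in range(8)] for i in range(8)]
smoothed = simulated_jpeg(checker, keep=4)
print([round(v, 3) for v in smoothed[0]])
```

Because zeroing coefficients is a linear operation, this approximation remains differentiable with respect to the input pixels, which is the property the second preset image compression algorithm is meant to provide.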

In an alternative embodiment, as shown in fig. 9, the first preset image compression algorithm includes a standard JPEG algorithm. Correspondingly, step S404 may further include:

Step S900, performing DCT transformation on the watermarked image to obtain an original DCT coefficient matrix corresponding to the watermarked image.

Step S902, quantizing the original DCT coefficient matrix according to the random quality coefficient to obtain a second target DCT coefficient matrix.

Step S904, obtaining the intermediate image according to the second target DCT coefficient matrix.

Standard JPEG algorithms quantize the frequency domain coefficients of an image through a quantization table. The element values in the quantization table directly affect the degree of quantization and thus the degree of compression of the image.

In the compression process of the standard JPEG, different quantization tables can be selected according to different quality coefficients, so that compression operations of different degrees can be performed on the image.

In a specific application, the steps of converting the watermarked image into an intermediate image by means of a standard JPEG algorithm are as follows: a DCT is first performed on the watermarked image to obtain the original DCT coefficient matrix corresponding to the watermarked image. A corresponding quantization table is obtained according to the random quality coefficient, and the original DCT coefficient matrix is quantized with it to obtain a second target DCT coefficient matrix. An inverse DCT is then performed on the second target DCT coefficient matrix to obtain the intermediate image.

In this embodiment, the original DCT coefficient matrix is quantized according to a random quality coefficient, that is, the watermarked image is randomly compressed in each training, so as to expand the coverage of the compression algorithm, thereby improving the robustness of the watermark encoder and the watermark decoder.
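The relationship between the quality coefficient and the quantization table can be sketched as follows. The application only states that a quantization table is obtained according to the quality coefficient; the scaling rule below is the convention used by the IJG libjpeg reference implementation, applied to the example luminance table from Annex K of the JPEG standard, and is assumed here for illustration. The quality range in `random_quality` is likewise an assumption.

```python
import random

# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantization_table(quality):
    """Scale the base table by a quality coefficient in [1, 100] (IJG convention)."""
    quality = min(max(int(quality), 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[min(max((q * scale + 50) // 100, 1), 255) for q in row]
            for row in BASE_LUMA_TABLE]

def quantize(coeffs, table):
    """Quantize and de-quantize DCT coefficients: round(c / q) * q."""
    return [[round(c / q) * q for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def random_quality(rng, low=10, high=95):
    """Draw the random quality coefficient for one training iteration."""
    return rng.randint(low, high)

rng = random.Random(0)
quality = random_quality(rng)
table = quantization_table(quality)
print("quality:", quality, "first row of table:", table[0])
```

Lower quality coefficients produce larger table entries and therefore coarser quantization, so drawing the coefficient at random exposes the encoder and decoder to a range of compression strengths.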

In the above embodiment, each training randomly selects one from the direct connection channel, the first preset image compression algorithm and the second preset image compression algorithm to simulate channel transmission, so as to train network parameters. In practical applications, the compression undergone by the watermarked image may be more than once, and in order to further improve the robustness of the watermark encoder and the watermark decoder against complex scenes (such as multiple compression), the degradation layer may be flexibly configured. An alternative embodiment is provided below.

In an alternative embodiment, as shown in fig. 10, there are a plurality of degradation layers, and the plurality of degradation layers form a cascade structure in which, for each pair of adjacent degradation layers, the output of the preceding degradation layer is the input of the subsequent degradation layer.

Correspondingly, as shown in fig. 11, step S404 may further include:

Step S1100, inputting the watermarked image into the first degradation layer in the cascade structure so that the degradation layers process the watermarked image layer by layer;

Step S1102, acquiring the image output by the last degradation layer in the cascade structure, and determining the image as the intermediate image.

The degradation layer can be formed by cascading $n$ degradation layers, where $n$ is greater than 1 (for example 2, 3, 4, 5, ...), denoted $N_i$ for $i = 1, \ldots, n$. Each degradation layer may include a direct connection channel and one or more compression algorithms. In each degradation layer, one of the direct connection channel and the one or more compression algorithms configured therein (such as the first preset image compression algorithm and the second preset image compression algorithm) can be selected randomly to process the input. Optionally, the selections made by the layers $N_i$ are independent repeated events, i.e. each degradation layer $N_i$ selects its own processing mode for its input, and the selection is random. The quality coefficient $Q$ of the first preset image compression algorithm in each degradation layer $N_i$ is also random, and the quality coefficients of different layers may be the same or different.

For example, the watermarked image $x_{in}$ is input to the first degradation layer $N_1$ of the cascade structure to obtain $x_1$; $x_1$ is then taken as the input of the second degradation layer $N_2$, which outputs $x_2$. Similarly, $x_{n-1}$ is taken as the input of the $n$-th degradation layer $N_n$, which outputs $x_n$, i.e. the intermediate image $x_{out}$. The whole process can be expressed as follows:

$$x_{out} = x_n = N_n(N_{n-1}(\cdots N_2(N_1(x_{in})) \cdots))$$

In this embodiment, the degradation layers are arranged in a cascade structure, and each degradation layer independently and randomly selects a processing mode for the watermarked image, so that a scene of multiple compression can be simulated and the coverage of the compression algorithm is enlarged. It follows that adding the cascaded degradation layers to the end-to-end training of the watermark encoder and the watermark decoder can further improve their robustness to more complex compression scenes.
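The cascade of steps S1100 and S1102 can be sketched as a loop in which every layer draws its own conversion mode and, where applicable, its own quality coefficient. The conversion functions are simplified, hypothetical stand-ins operating on a flat list of pixel values, not real JPEG implementations, and the quality range is an assumption.

```python
import random

def direct(image, rng):
    """Direct connection channel: pass the input through unchanged."""
    return list(image)

def standard_like(image, rng):
    """Stand-in for standard JPEG: rounding with a random quality coefficient."""
    quality = rng.randint(10, 100)
    step = max(1, (101 - quality) // 4)
    return [round(p / step) * step for p in image]

def simulated_like(image, rng):
    """Stand-in for analog JPEG: low-pass smoothing of neighbouring pixels."""
    last = len(image) - 1
    return [(image[max(i - 1, 0)] + 2 * image[i] + image[min(i + 1, last)]) / 4.0
            for i in range(len(image))]

MODES = [("direct", direct), ("standard", standard_like), ("simulated", simulated_like)]

def degradation_layer(image, rng, modes=MODES):
    """One layer N_i: pick a mode at random and apply it."""
    name, fn = rng.choice(modes)
    return name, fn(image, rng)

def cascade(image, n_layers, rng, modes=MODES):
    """Feed each layer's output into the next; selections are independent per layer."""
    trace, x = [], image
    for _ in range(n_layers):
        name, x = degradation_layer(x, rng, modes)
        trace.append(name)
    return x, trace

rng = random.Random(1)
x_out, trace = cascade([10, 20, 30, 40], n_layers=3, rng=rng)
print(trace, x_out)
```

The returned `trace` lists which mode each layer picked, which makes it easy to see that one pass may, for instance, compress the image twice with different quality coefficients.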

To make the application easier to understand, an exemplary application is provided below in connection with fig. 12.

S11, acquiring an original image and initial watermark information.

S12, inputting the original image and the initial watermark information into a watermark encoder, and embedding the watermark into the original image through the watermark encoder to obtain a watermarked image.

S13, inputting the watermarked image into a first degradation layer, wherein the first degradation layer randomly selects one of direct connection, analog JPEG and standard JPEG, processes the watermarked image and outputs the result.

Wherein the quality coefficient of standard JPEG is random.

S14, taking the output of the first degradation layer as the input of a second degradation layer, wherein the second degradation layer randomly selects one of direct connection, analog JPEG and standard JPEG, processes its input and outputs the result.

Wherein the quality coefficient of standard JPEG is random.

S15, and so on: taking the output of the previous degradation layer as the input of the next degradation layer, and processing the watermarked image layer by layer through each degradation layer.

S16, acquiring an image output by the last degradation layer, and determining the image as an attacked image (intermediate image).

S17, inputting the attacked image into a watermark decoder, and extracting target watermark information from the attacked image through the watermark decoder.

S18, according to the target watermark information and the initial watermark information, adjusting the network parameters of the watermark encoder and the network parameters of the watermark decoder.
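The data flow of S11 to S18 can be sketched as one training step. The encoder, degradation layer and decoder below are deliberately trivial, non-learnable stand-ins, and the `update` callback is a hypothetical placeholder for the optimizer that would adjust the real network parameters; only the wiring between the stages is meant to be illustrative.

```python
import random

def toy_encoder(image, watermark, strength=32.0):
    """S12 stand-in: raise or lower one pixel per watermark bit."""
    return [p + strength * (2 * b - 1) for p, b in zip(image, watermark)]

def toy_layer(image, rng):
    """S13 to S15 stand-in: direct connection or rounding with a random step."""
    if rng.random() < 1.0 / 3.0:
        return list(image)
    step = rng.randint(1, 8)
    return [round(p / step) * step for p in image]

def toy_decoder(image, reference=128.0):
    """S17 stand-in: read each bit from the sign of the pixel offset."""
    return [1.0 if p > reference else 0.0 for p in image]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(b)

def training_step(image, watermark, n_layers, rng, update):
    watermarked = toy_encoder(image, watermark)   # S12
    x = watermarked
    for _ in range(n_layers):                     # S13 to S15: layer by layer
        x = toy_layer(x, rng)
    attacked = x                                  # S16: attacked (intermediate) image
    target = toy_decoder(attacked)                # S17
    loss = mse(target, watermark)
    update(loss)                                  # S18: parameter adjustment hook
    return loss

losses = []
image = [128.0] * 4                               # S11: original image (flat, assumed)
watermark = [1, 0, 1, 1]                          # S11: initial watermark information
loss = training_step(image, watermark, n_layers=3, rng=random.Random(0),
                     update=losses.append)
print("loss:", loss)
```

In a real system the stand-ins would be replaced by the trained networks and the update hook by a back-propagation step, while the order of the stages would stay the same.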

The embodiment of the application adopts an end-to-end training mode and enlarges the coverage of the compression algorithm by setting the quality coefficient of the first preset image compression algorithm to a random value. The watermarked image is randomly compressed in each training iteration, that is, a random compression scene is used in each training iteration, so that the watermark encoder and the watermark decoder obtained after training converges are robust to complex compression scenes and their anti-interference capability is further improved. In addition, by arranging a plurality of cascaded degradation layers, multiple rounds of compression are simulated, which improves the robustness of the watermark encoder and the watermark decoder to complex compression scenes.

Example two

The first embodiment introduces a training method of the codec, and the embodiment of the application also provides a codec technical scheme of the digital watermark.

The watermark encoder obtained by the codec training method according to the first embodiment encodes an image to be watermarked; or the watermark decoder obtained by the codec training method according to the first embodiment decodes the watermarked image.

The watermark encoder and the watermark decoder obtained by the encoding and decoding training method of the first embodiment have robustness and anti-interference capability on complex compression scenes. It can be seen that the watermark encoder of this embodiment can embed watermark information into an image to be watermarked, so as to effectively improve the anti-interference capability of the digital watermark. The watermark decoder decodes the watermarked image, can extract watermark information more accurately, and improves the robustness and safety of the watermark.

Example three

Fig. 13 schematically shows a block diagram of a training device of a codec according to a third embodiment of the present application. The device may be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the embodiment of the present application. A program module in the embodiments of the present application refers to a series of computer program instruction segments capable of performing a specified function; the following description describes the function of each program module in detail. As shown in fig. 13, the apparatus 1300 may include: a first acquisition module 1310, a first input module 1320, a conversion module 1330, a second input module 1340, a second acquisition module 1350, and an adjustment module 1360, wherein:

a first acquisition module 1310, configured to acquire an image sample and a watermark sample;

a first input module 1320, configured to input the image samples and the watermark samples into a watermark encoder to obtain a watermarked image;

a conversion module 1330 for converting the watermarked image to an intermediate image by a degradation layer; the degradation layer is configured with at least two conversion modes including a first preset image compression algorithm, and the first preset image compression algorithm is configured with a random quality coefficient for image compression;

a second input module 1340 for inputting the intermediate image into the watermark decoder to obtain a target watermark;

a second acquisition module 1350, configured to obtain a loss value between the target watermark and the watermark sample;

an adjustment module 1360 is configured to adjust a network parameter of the watermark encoder and/or a network parameter of the watermark decoder according to the loss value.

As an optional embodiment, the at least two conversion modes further include a second preset image compression algorithm;

correspondingly, the conversion module 1330 is further configured to:

As an optional embodiment, the at least two conversion modes further include a direct connection channel;

correspondingly, the conversion module 1330 is further configured to:

As an alternative embodiment, the second preset image compression algorithm includes an analog JPEG algorithm; correspondingly, the conversion module 1330 is further configured to:

As an alternative embodiment, the first preset image compression algorithm includes a standard JPEG algorithm; correspondingly, the conversion module 1330 is further configured to:

As an alternative embodiment, there are a plurality of degradation layers forming a cascade structure, in which, for each pair of adjacent degradation layers, the output of the preceding degradation layer is the input of the subsequent degradation layer;

correspondingly, the conversion module 1330 is further configured to:

input the watermarked image into the first degradation layer in the cascade structure so that the degradation layers process the watermarked image layer by layer;

and acquire the image output by the last degradation layer in the cascade structure, and determine the image as the intermediate image.

Example four

Fig. 14 schematically shows a hardware architecture diagram of a computer device 10000 adapted to implement a training method of a codec according to a fourth embodiment of the present application. In some embodiments, the computer device 10000 can be a rack server, a blade server, or a tower server (including a stand-alone server, or a server cluster composed of multiple servers), or the like. As shown in fig. 14, the computer device 10000 includes, but is not limited to, a memory 10010, a processor 10020 and a network interface 10030, which may be communicatively linked to each other via a system bus. Wherein:

The memory 10010 includes at least one type of computer-readable storage medium including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 10010 may be an internal storage module of the computer device 10000, such as a hard disk or memory of the computer device 10000. In other embodiments, the memory 10010 may also be an external storage device of the computer device 10000, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, provided on the computer device 10000. Of course, the memory 10010 may also include both an internal storage module of the computer device 10000 and an external storage device thereof. In this embodiment, the memory 10010 is typically used for storing an operating system installed on the computer device 10000 and various application software, such as program codes of a training method of a codec. In addition, the memory 10010 may be used to temporarily store various types of data that have been output or are to be output.

The processor 10020 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other chip in some embodiments. The processor 10020 is typically configured to control overall operation of the computer device 10000, such as performing control and processing related to data interaction or communication with the computer device 10000. In this embodiment, the processor 10020 is configured to execute program codes or process data stored in the memory 10010.

The network interface 10030 may comprise a wireless network interface or a wired network interface, and is typically used to establish a communication link between the computer device 10000 and other computer devices. For example, the network interface 10030 is used to connect the computer device 10000 to an external terminal through a network, and to establish a data transmission channel and a communication link between the computer device 10000 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, etc.

It should be noted that fig. 14 only shows a computer device having components 10010-10030, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead.

In this embodiment, the method for training a codec stored in the memory 10010 may be further divided into one or more program modules and executed by one or more processors (e.g., the processor 10020) to perform an embodiment of the present application.

Example five

The present application also provides a computer readable storage medium having a computer program stored thereon, wherein the computer program when executed by a processor implements the steps of the codec training method of the embodiments.

In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer-readable storage medium may be an internal storage unit of a computer device, such as a hard disk or a memory of the computer device. In other embodiments, the computer-readable storage medium may also be an external storage device of a computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device. Of course, the computer-readable storage medium may also include both internal storage units of a computer device and external storage devices. In this embodiment, the computer-readable storage medium is typically used to store an operating system installed on a computer device and various types of application software, such as program codes of the codec training method in the embodiment, and the like. Furthermore, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.

It will be apparent to those skilled in the art that the modules or steps of the embodiments of the application described above may be implemented in a general purpose computer device, they may be concentrated on a single computer device, or distributed over a network of multiple computer devices, they may alternatively be implemented in program code executable by a computer device, so that they may be stored in a storage device for execution by the computer device, and in some cases, the steps shown or described may be performed in a different order than what is shown or described, or they may be separately made into individual integrated circuit modules, or a plurality of modules or steps in them may be made into a single integrated circuit module. Thus, embodiments of the application are not limited to any specific combination of hardware and software.

It should be noted that the foregoing is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application, and all equivalent structures or equivalent processes using the descriptions of the present application and the accompanying drawings, or direct or indirect application in other related technical fields, are included in the scope of the present application.

Claims

1. A method of training a codec, the method comprising:

acquiring an image sample and a watermark sample;

obtaining a loss value between the target watermark and the watermark sample;

2. The method of claim 1, wherein the at least two conversion modes further comprise a second preset image compression algorithm;

3. The method of claim 2, wherein the at least two conversion modes further comprise a direct connection channel;

4. The method of claim 2, wherein the second preset image compression algorithm comprises an analog JPEG algorithm; correspondingly, the converting the watermarked image into an intermediate image by a degradation layer comprises:

performing DCT transformation on the watermarked image to obtain an original DCT coefficient matrix corresponding to the watermarked image;

5. The method of claim 1, wherein the first preset image compression algorithm comprises a standard JPEG algorithm; correspondingly, the converting the watermarked image into an intermediate image by a degradation layer comprises:

quantizing the original DCT coefficient matrix according to the random quality coefficient to obtain a second target DCT coefficient matrix;

6. The method of any one of claims 1 to 5, wherein there are a plurality of said degradation layers, the plurality of said degradation layers forming a cascade structure in which the output of a preceding degradation layer between adjacent degradation layers is the input of a subsequent degradation layer;

7. A method for encoding and decoding a digital watermark, comprising:

a watermark encoder trained according to the method of any one of claims 1 to 6 for encoding an image to be watermarked; or

a watermark decoder trained in accordance with the method of any one of claims 1 to 6, for decoding a watermarked image.

8. A training device for a codec, the device comprising:

9. A computer device, comprising:

at least one processor; and

A memory communicatively coupled to the at least one processor; wherein:

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 7.

10. A computer readable storage medium having stored therein computer instructions which when executed by a processor implement the method of any one of claims 1 to 7.