
CN117808860A - Image processing method, device, electronic equipment and storage medium

Info

Publication number: CN117808860A
Application number: CN202311870810.8A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: image, disparity, eye, parallax, sample
Legal status: Pending
Inventor: 吴若溪
Current Assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Application filed by: Guangdong Oppo Mobile Telecommunications Corp Ltd

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The application discloses an image processing method, an image processing device, an electronic device and a storage medium. The image processing method includes the following steps: acquiring an original image as a first-eye image; acquiring a depth image corresponding to the first-eye image; generating a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image; and post-processing the target parallax image to obtain a second-eye image corresponding to the first-eye image. The method can generate, for a monocular image, another-eye image with parallax, so that the field of view corresponding to the image content can be expanded when the image content is displayed.

Description

Image processing method, device, electronic equipment and storage medium

Technical Field

This application relates to the field of image processing technology, and more specifically, to an image processing method, device, electronic device and storage medium.

Background Art

With the rapid advancement of technology and living standards, electronic devices (such as smartphones and tablets) have become some of the most commonly used electronic products in daily life. Since electronic devices usually have a shooting function, people often use them to take photos and videos. However, in the related art, the field of view of the content in images captured by an electronic device is usually limited.

Summary of the Invention

This application proposes an image processing method, device, electronic device and storage medium that can generate, for a monocular image, another-eye image with parallax, thereby expanding the field of view corresponding to the image content when the image content is displayed.

In a first aspect, embodiments of the present application provide an image processing method. The method includes: acquiring an original image as a first-eye image; acquiring a depth image corresponding to the first-eye image; generating a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image; and post-processing the target parallax image to obtain a second-eye image corresponding to the first-eye image.

In a second aspect, embodiments of the present application provide an image processing device. The device includes a first image acquisition module, a second image acquisition module, a parallax image generation module and an image post-processing module. The first image acquisition module is configured to acquire an original image as a first-eye image; the second image acquisition module is configured to acquire a depth image corresponding to the first-eye image; the parallax image generation module is configured to generate a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image; and the image post-processing module is configured to post-process the target parallax image to obtain a second-eye image corresponding to the first-eye image.

In a third aspect, embodiments of the present application provide an electronic device, including: one or more processors; a memory; and one or more applications, where the one or more applications are stored in the memory and configured to be executed by the one or more processors, and the one or more applications are configured to execute the image processing method provided in the first aspect above.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium. The computer-readable storage medium stores program code, and the program code can be called by a processor to execute the image processing method provided in the first aspect above.

In the solution provided by this application, an original image is acquired as a first-eye image; a depth image corresponding to the first-eye image is acquired; a target parallax image is generated based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image; and the target parallax image is post-processed to obtain a second-eye image corresponding to the first-eye image. In this way, another-eye image with parallax can be generated from a monocular image, so that the field of view corresponding to the image content can be expanded when the image content is displayed; and since the parallax image obtained for the monocular image is further post-processed, the accuracy and quality of the obtained other-eye image can be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can also be obtained from these drawings without creative effort.

Figure 1 shows a schematic flowchart of an image processing method according to an embodiment of the present application.

Figure 2 shows a schematic diagram of the principle of the depth estimation model provided by an embodiment of the present application.

Figure 3 shows a schematic flowchart of an image processing method according to another embodiment of the present application.

Figure 4 shows a schematic diagram of the affine transformation provided by an embodiment of the present application.

Figure 5 shows another schematic diagram of the affine transformation provided by an embodiment of the present application.

Figure 6 shows a schematic flowchart of an image processing method according to yet another embodiment of the present application.

Figure 7 shows a schematic diagram of the principle of the bilinear interpolation processing provided by an embodiment of the present application.

Figure 8 shows a schematic diagram of an effect provided by an embodiment of the present application.

Figure 9 shows another schematic diagram of an effect provided by an embodiment of the present application.

Figure 10 shows a schematic flowchart of an image processing method according to still another embodiment of the present application.

Figure 11 shows a schematic flowchart of an image processing method according to yet another embodiment of the present application.

Figure 12 shows a schematic diagram of a scenario provided by an embodiment of the present application.

Figure 13 shows a block diagram of an image processing device according to an embodiment of the present application.

Figure 14 is a block diagram of an electronic device for executing the image processing method according to an embodiment of the present application.

Figure 15 is a storage unit for saving or carrying program code that implements the image processing method according to an embodiment of the present application.

Detailed Description of the Embodiments

In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application.

With the development of science and technology, electronic devices are usually equipped with cameras to provide a shooting function. At present, electronic devices (such as smartphones and tablets) have reached near-universal coverage in daily life, and the camera module has become one of the main functional points of an electronic device. Users can take photos and videos through the camera function of an electronic device, and the resulting images are available immediately, which is convenient and fast. In addition, users also often upload the captured images to the Internet to share with others.

When shooting with an electronic device, the field of view of the resulting image is usually limited, which means that when the captured image is subsequently displayed, only the content within that limited field of view can be shown. In addition, with the application of virtual reality (VR), augmented reality (AR) and extended reality (XR) head-mounted display devices (such as smart glasses) in home scenarios, there is also linkage between electronic devices and head-mounted display devices. When an electronic device is linked with a head-mounted display device, there is usually a scenario in which the electronic device transmits the captured content to the head-mounted display device for display. In such a scenario, if the head-mounted display device directly displays the captured monocular image, only a 2D display effect can be achieved, and the field of view of the displayed image content is limited.

In the related art, in order to solve the problem of the limited field of view of a captured monocular image, the known monocular image and a depth map can be used to complete the field of view, thereby expanding the field of view of the presented image content. However, in the related art, the accuracy and precision of field-of-view completion based on a monocular image are usually insufficient, and the completed image may exhibit shadows, breaks, severe texture errors and the like, which leads to large errors when the image is displayed by a device.

In view of the above problems, the inventor proposes the image processing method, device, electronic device and storage medium provided by the embodiments of the present application, which can generate, from a monocular image, another-eye image with parallax, thereby expanding the field of view corresponding to the image content when the image content is displayed. Moreover, after the parallax image is obtained for the monocular image, the parallax image is further post-processed, so that the accuracy and quality of the obtained other-eye image can be improved. The specific image processing method is described in detail in the following embodiments.

The image processing method provided by the embodiments of the present application is introduced in detail below with reference to the accompanying drawings.

Please refer to Figure 1, which shows a schematic flowchart of an image processing method provided by an embodiment of the present application. In a specific embodiment, the image processing method is applied to the image processing device 600 shown in Figure 13 and the electronic device 100 (Figure 14) configured with the image processing device 600. The following takes an electronic device as an example to describe the specific process of this embodiment. Of course, it can be understood that the electronic device to which this embodiment is applied can be a smartphone, a tablet computer, an e-book reader, a head-mounted display device, etc., which is not limited here. The process shown in Figure 1 is described in detail below. The image processing method may specifically include the following steps:

Step S110: Acquire an original image as a first-eye image.

In the embodiment of the present application, the electronic device can acquire an original image and use it as the first-eye image, so as to generate the other-eye image for the original image, thereby completing the field of view of the other eye and expanding the field of view. The original image can be a monocular RGB image, or one frame of a video. Using the original image as the first-eye image may mean using the original image as the left-eye image or as the right-eye image.

In some implementations, the electronic device can capture an image of a real scene to obtain the captured original image and use it as the first-eye image. The electronic device can be a mobile terminal equipped with a camera, such as a smartphone, a tablet computer or a smart watch, and can collect images through a front camera or a rear camera to obtain the above original image. For example, the electronic device can collect an image through the rear camera and use the collected image as the above original image.

In some implementations, the electronic device can obtain the original image locally, that is, from a locally stored file. For example, when the electronic device is a mobile terminal, the original image can be obtained from a photo album: the electronic device collects images through the camera in advance and stores them in the local album, or downloads images from the network in advance and stores them in the local album, and then reads the original image from the album when the field of view of the image needs to be completed.

In some implementations, when the electronic device is a mobile terminal or a computer, the original image can also be downloaded from the network. For example, the electronic device can download the required original image from a corresponding server through a wireless network, a data network, and the like. When the electronic device is a head-mounted display device, it can also receive images transmitted by other devices and use them as the original image.

Of course, the specific way in which the electronic device obtains the original image is not limited here.

Step S120: Acquire a depth image corresponding to the first-eye image.

In the embodiment of the present application, after acquiring the first-eye image, the electronic device can acquire the depth image corresponding to the first-eye image, so as to generate, from it, an image that has parallax with respect to the first-eye image. A depth image is an image in which the distance (depth) from the image collector to each point in the scene is used as the pixel value. A depth image can reflect the geometry of the visible surfaces of objects; after coordinate transformation it can be converted into point cloud data, and point cloud data with regular structure and the necessary information can also be inversely converted into depth image data.

In some implementations, the electronic device can input the first-eye image into a pre-trained depth estimation model and obtain the depth image output by the depth estimation model as the depth image corresponding to the first-eye image. The depth estimation model can be a neural-network-based depth estimation model and can be trained on a large number of sample monocular images annotated with depth images.

In a possible implementation, the above depth estimation model can be trained in the following manner: obtaining multiple sample monocular images and a depth image corresponding to each sample monocular image; for each sample monocular image, annotating its corresponding depth image; and training an initial model based on the sample monocular images annotated with depth images to obtain the above depth estimation model.

Optionally, when obtaining the multiple sample monocular images and the depth image corresponding to each sample monocular image, binocular images collected by a binocular camera can be obtained, and one of the binocular images can be used as the sample monocular image; for example, the left-eye image or the right-eye image can be used as the sample monocular image. The depth image corresponding to the sample monocular image is then determined based on the sample monocular image and the other-eye image corresponding to it.

Optionally, referring to Figure 2, the above depth estimation model 300 can be an encoder-decoder model. The encoder 301 is used to extract image features from the input image; the decoder 302 is used to map the features extracted by the encoder into a depth image. The encoder can be a network model such as BEiT, Swin2 or ResNet; it can reduce the size of the feature maps layer by layer through multiple convolutional layers and pooling layers to obtain contextual information at different scales and better capture the texture and shape of objects. The decoder can be a recurrent neural network (RNN), a variational autoencoder (VAE), or the like; the decoder can upsample the feature maps extracted by the encoder and fuse feature maps of different layers through skip connections (also called residual connections, that is, connecting an input layer directly to an output layer), so that more detailed information can be obtained, the blurring of the depth map can be reduced, and the vanishing-gradient problem in the network can be alleviated.
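
A minimal sketch of an encoder-decoder depth estimation network with a skip connection, assuming PyTorch is available; the layer sizes and channel counts are illustrative assumptions rather than the network actually used in this application.

```python
import torch
import torch.nn as nn

class TinyDepthEstimator(nn.Module):
    """Toy encoder-decoder: the encoder shrinks the feature map, the decoder
    upsamples it and fuses shallow features through a skip connection."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU())
        self.enc2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(32, 64, 3, padding=1),
                                  nn.BatchNorm2d(64), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec = nn.Sequential(nn.Conv2d(64 + 32, 32, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, rgb):                          # rgb: (N, 3, H, W)
        f1 = self.enc1(rgb)                          # (N, 32, H, W)
        f2 = self.enc2(f1)                           # (N, 64, H/2, W/2)
        fused = torch.cat([self.up(f2), f1], dim=1)  # skip connection: fuse deep and shallow features
        return self.dec(fused)                       # (N, 1, H, W) predicted depth map

depth = TinyDepthEstimator()(torch.randn(1, 3, 64, 64))  # -> torch.Size([1, 1, 64, 64])
```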

The BEiT network predicts the visual tokens of the original image from the encoded vectors of a corrupted image; this network structure uses two representations of an image, namely image patches and visual tokens. It first splits the image into a series of patches and randomly masks some of them, with the mask ratio generally around 40%; the patches are then flattened into vectors and linearly projected to obtain patch embeddings, and position encoding is performed on each patch to obtain the position encoding of the image. Next, the patch embeddings and the position encodings are fused as the input of the Transformer layers, and a Masked Image Modeling (MIM) layer then predicts the token representations of the masked patches. The Swin2 network is a modification of the Transformer encoder network; the main difference between Swin2 and Swin1 is that the LN layer is placed afterwards, that is, normalization is performed after the residual connection, which resolves the impact of large differences in activation amplitudes across layers and of training instability. In addition, the attention mechanism uses cosine attention instead of dot-product attention, which avoids getting stuck at extreme values. The ResNet network is a deep residual network that helps the network learn more complex feature representations by introducing residual connections; residual connections allow gradients to pass through deep networks more directly while preserving the input information, so that the network can learn richer feature representations. Specifically, the residual connections in ResNet include an identity mapping and a residual mapping: the identity mapping passes the input directly to the output, while the residual mapping processes the input through a series of convolutional layers and activation functions and then adds the result to the input to obtain the output. In this way, the sum of the input and the output of the residual mapping is the final output.

Optionally, when training the initial model based on the above sample monocular images annotated with depth images, each sample monocular image can be input into the initial model to obtain the estimation result corresponding to each sample monocular image; a loss value is then determined based on the estimation result corresponding to each sample monocular image and the depth image annotated for each sample monocular image, and the model parameters of the initial model are adjusted according to the calculated loss value, which completes one epoch. The process then returns to the step of inputting each sample monocular image into the initial model to obtain the corresponding estimation results, completing the next epoch; repeating in this way completes multiple epochs. Here, an epoch refers to one use of all of the sample monocular images; in plain terms, the epoch count is the number of times the entire dataset is cycled through, and one epoch equals training once with all of the sample monocular images.

Optionally, based on the loss value determined above, the Adam optimizer can be used to iteratively update the initial model so that the loss value obtained each time becomes smaller, until the loss value converges, at which point the model is saved to obtain the trained depth estimation model. The Adam optimizer combines the advantages of the AdaGrad (Adaptive Gradient) and RMSProp optimization algorithms: it jointly considers the first moment estimation of the gradient (the mean of the gradient) and the second moment estimation (the uncentered variance of the gradient) to compute the update step size. The training end condition of the iterative training may include: the number of iterations reaches a target number, or the above loss value satisfies a set condition. The convergence criterion is to make the target loss value as small as possible; using an initial learning rate of 1e-3, a learning rate that decays with the cosine of the step count, and batch_size = 512, convergence can be considered complete after training for multiple epochs. Here, batch_size can be understood as the batch processing parameter, whose upper limit is the total number of samples in the training set; the loss value satisfying the set condition may include the total loss value being less than a set threshold. Of course, the specific training end condition is not limited here.
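
A minimal sketch of the training loop described above (Adam optimizer, learning rate 1e-3 with cosine decay, mini-batches, multiple epochs); the dataset, loss function and model here are placeholders for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = TinyDepthEstimator()                                   # from the sketch above (assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)      # initial learning rate 1e-3
num_epochs = 10
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
loss_fn = torch.nn.L1Loss()                                    # assumed regression loss on depth

# Dummy data standing in for sample monocular images and their annotated depth images.
images, depths = torch.randn(32, 3, 64, 64), torch.rand(32, 1, 64, 64)
loader = DataLoader(TensorDataset(images, depths), batch_size=8, shuffle=True)

for epoch in range(num_epochs):             # one epoch = one pass over all samples
    for x, y in loader:
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                         # cosine decay of the learning rate
```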

Step S130: Generate a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image.

In the embodiment of the present application, after the depth image corresponding to the first-eye image is obtained, a target parallax image that has parallax with respect to the first-eye image can be generated based on the first-eye image and the depth image, so that the other-eye image corresponding to the first-eye image can subsequently be obtained. A parallax (disparity) image takes one image of an image pair as a reference; its size is the size of the reference image, and its element values are parallax values.

In some implementations, considering that there is generally a correspondence between depth and parallax, for example a correspondence in which Z is the depth of each pixel, Zmax is the maximum depth present in the depth map, and s is a random variable, the above parallax image can be determined according to this correspondence based on the first-eye image and the depth image. Of course, the specific way of generating the target parallax image based on the depth image and the first-eye image is not limited; for example, the parallax image can also be generated by means of artificial intelligence (AI).
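
Since the specific depth-to-parallax formula is not reproduced in this text, the sketch below only illustrates one plausible mapping consistent with the description: Z is the per-pixel depth, Zmax is the maximum depth in the depth map, s is a scaling variable, and nearer points receive larger parallax. The exact formula used is an assumption.

```python
import numpy as np

def depth_to_parallax(depth: np.ndarray, s: float = 8.0) -> np.ndarray:
    """Assumed mapping: parallax grows as depth decreases, normalized by the maximum depth."""
    z_max = depth.max()
    return s * (1.0 - depth / z_max)

depth_map = np.random.uniform(0.5, 10.0, size=(4, 4)).astype(np.float32)
parallax_map = depth_to_parallax(depth_map)
```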

Step S140: Post-process the target parallax image to obtain a second-eye image corresponding to the first-eye image.

In the embodiment of the present application, after the above target parallax image is obtained, the electronic device can further post-process the target parallax image to obtain the second-eye image corresponding to the first-eye image. The post-processing is used to improve the accuracy and the image quality of the target parallax image, and may include interpolation of the target parallax image, calibration using the maximum parallax value, affine transformation and other processing; the specific processing included in the post-processing is not limited and may, for example, also include other processing for improving the accuracy and image quality of the target parallax image. It can be understood that, in the related art, the accuracy of the other-eye image obtained by completing the field of view of a monocular image is usually insufficient and the image quality is poor; therefore, the obtained target parallax image can be post-processed to obtain an other-eye image with higher accuracy and image quality. It should be noted that if the above first-eye image is a left-eye image, the second-eye image is a right-eye image; if the above first-eye image is a right-eye image, the second-eye image is a left-eye image.

In some implementations, the electronic device post-processes the above target parallax image either by processing the target parallax image with the algorithm corresponding to each kind of processing included in the post-processing, or, if an image processing model has been trained in advance for the above post-processing, by processing the target parallax image with that image processing model.

With the image processing method provided by the embodiment of the present application, an original image is acquired as a first-eye image; a depth image corresponding to the first-eye image is acquired; a target parallax image is generated based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image; and the target parallax image is post-processed to obtain the second-eye image corresponding to the first-eye image. In this way, another-eye image with parallax can be generated from a monocular image, so that the field of view corresponding to the image content can be expanded when the image content is displayed; and since the parallax image obtained for the monocular image is further post-processed, the accuracy and quality of the obtained other-eye image can be improved.

Please refer to Figure 3, which shows a schematic flowchart of an image processing method provided by another embodiment of the present application. This image processing method is applied to the above electronic device. The process shown in Figure 3 is described in detail below. The image processing method may specifically include the following steps:

Step S210: Acquire an original image as a first-eye image.

Step S220: Acquire a depth image corresponding to the first-eye image.

Step S230: Generate a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image.

In the embodiment of the present application, for step S210 to step S230, reference can be made to the other embodiments, and details are not repeated here.

Step S240: Perform interpolation processing on the target parallax image to obtain a first parallax image.

In the embodiment of the present application, when post-processing the obtained target parallax image, interpolation processing can be performed on the target parallax image to obtain a first parallax image, so as to improve the parallax variation between far and near points through a nonlinear transformation. Interpolation is a mathematical method for interpolating between known data points to obtain more precise values; in image processing, interpolation can increase the resolution of an image or improve its details. The interpolation applied to the target parallax image can be cubic interpolation, bilinear interpolation, etc. Cubic interpolation is a higher-order interpolation method that fits a cubic polynomial to the gray values of the 16 pixels around the pixel to be computed, and can achieve better results; bilinear interpolation performs linear interpolation in two directions using the gray values of the four pixels adjacent to the pixel to be computed, and can also achieve good results.

In some implementations, the interpolation processing performed by the electronic device on the target parallax image may be cubic interpolation. When performing cubic interpolation on the target parallax image, the target parallax image can be normalized to obtain a first intermediate parallax image; second interpolation processing is performed on the first intermediate parallax image to obtain a second intermediate parallax image; and each pixel value of the second intermediate parallax image is adjusted according to the maximum pixel value in the target parallax image to obtain the first parallax image. Normalization here refers to converting the image data into a standardized form with a uniform distribution, with pixel values all in the range 0 to 1, so as to center the data and facilitate subsequent analysis and processing.

In a possible implementation, the above second interpolation processing may be cubic interpolation. When performing the second interpolation processing on the first intermediate parallax image, a search interval, that is, the range of pixels around the pixel to be computed, can be determined according to the position of the target pixel; within the search interval, a cubic polynomial is selected to fit the gray-value variation trend of the known pixel values; the fitted cubic polynomial is differentiated and set to zero to solve for the extreme points, that is, the possible positions of the pixel to be computed; the best pixel position is selected according to the positions of the extreme points and the corresponding gray values so as to obtain the best interpolation effect; and the best pixel position is then applied to the original image to obtain the above second intermediate parallax image.

In a possible implementation, adjusting each pixel value of the second intermediate parallax image according to the maximum pixel value in the target parallax image may be multiplying, for each pixel position in the second intermediate parallax image, its pixel value by the above maximum pixel value, thereby obtaining the above first parallax image. It can be understood that the first intermediate parallax image is obtained by normalizing the target parallax image, so its pixel values lie between 0 and 1, that is, the above maximum pixel value is not greater than 1, and after the pixel value at each pixel position is multiplied by the maximum pixel value, the pixel value will still not be greater than 1. In addition, the pixel values in the target parallax image represent parallax values, so the maximum pixel value corresponds to the position with the largest parallax; multiplying the pixel value at each pixel position by the above maximum pixel value therefore handles the parallax variation at near points.
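
A minimal sketch of the three sub-steps described above: normalize the target parallax image to [0, 1], apply cubic interpolation, and scale each pixel by the maximum pixel value of the target parallax image. Using OpenCV's cv2.resize with INTER_CUBIC, and the upscaling factor, are implementation assumptions.

```python
import cv2
import numpy as np

def interpolate_parallax(parallax: np.ndarray, scale: float = 2.0) -> np.ndarray:
    p_max = parallax.max()
    normalized = parallax / p_max if p_max > 0 else parallax           # first intermediate image, values in [0, 1]
    h, w = normalized.shape
    resized = cv2.resize(normalized, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_CUBIC)                # second intermediate image (cubic interpolation)
    return resized * p_max                                             # scale each pixel by the maximum value

first_parallax = interpolate_parallax(np.random.rand(120, 160).astype(np.float32))
```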

Step S250: Perform affine transformation on the first parallax image to obtain a second parallax image aligned with the first-eye image, as the second-eye image corresponding to the first-eye image.

In the embodiment of the present application, after interpolation processing is performed on the target parallax image to obtain the first parallax image, affine transformation can be performed on the first parallax image to align it with the first-eye image and obtain the second parallax image, which can then serve as the other-eye image (that is, the second-eye image) that can be displayed together with the first-eye image. The affine transformation of an image performs operations such as translation, rotation and scaling on the image while keeping the relative positional relationships of the image content unchanged. Image alignment is an image processing technique whose purpose is to align two or more images so that their corresponding pixels coincide; in image registration, images from different viewing angles can be aligned through affine transformation for subsequent comparison or fusion. Thus, after the first-eye image and the second-eye image are displayed together, they can be fused by the user's brain, allowing the user to see an image with a larger field of view than that of the original first-eye image.

In some implementations, considering that the above first-eye image may be the left-eye image corresponding to the left eye or the right-eye image corresponding to the right eye, the second parallax image obtained above may correspond to the right eye or to the left eye. Therefore, when performing affine transformation on the first parallax image to align it with the first-eye image, forward mapping (warping) or backward mapping may be performed on the first parallax image. If the first-eye image is a left-eye image, forward mapping can be performed on the first parallax image to obtain the second parallax image aligned with the first-eye image as the second-eye image corresponding to the first-eye image; if the first-eye image is a right-eye image, backward mapping can be performed on the first parallax image to obtain the second parallax image aligned with the first-eye image as the second-eye image corresponding to the first-eye image.

It can be understood that if the first-eye image is a left-eye image, the first parallax image corresponds to the right eye, so the first parallax image needs to be aligned towards the left-eye image; as shown in Figure 4, forward mapping can be performed on the pixels in the first parallax image. If the first-eye image is a right-eye image, the first parallax image corresponds to the left eye, so the first parallax image needs to be aligned towards the right-eye image; as shown in Figure 5, backward mapping can be performed on the pixels in the first parallax image.
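
A minimal sketch of parallax-based forward mapping (warping): each pixel of the reference image is shifted horizontally by its parallax value, and destinations with no source pixel remain empty and become the holes discussed later. The shift direction used for left-eye versus right-eye input is an assumption for illustration.

```python
import numpy as np

def forward_warp(image: np.ndarray, parallax: np.ndarray, direction: int = 1) -> np.ndarray:
    """direction = +1 or -1 selects the horizontal shift direction (left/right eye)."""
    h, w = parallax.shape
    warped = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            x_new = int(round(x + direction * parallax[y, x]))
            if 0 <= x_new < w:
                warped[y, x_new] = image[y, x]
    return warped                           # unfilled positions stay zero (holes)

img = np.random.rand(60, 80, 3).astype(np.float32)
par = np.random.rand(60, 80).astype(np.float32) * 4.0
other_eye = forward_warp(img, par, direction=-1)
```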

The image processing method provided by this embodiment of the present application can generate, from a monocular image, another-eye image with parallax, thereby expanding the field of view corresponding to the image content when the image content is displayed. In addition, after the parallax image is obtained for the monocular image, interpolation processing is performed on the parallax image, which can improve the accuracy of the obtained other-eye image, and affine transformation is performed on the parallax image, so that the obtained other-eye image can be aligned with the original monocular image and the expanded field of view can be accurately presented when the image content is displayed.

Please refer to Figure 6, which shows a schematic flowchart of an image processing method provided by yet another embodiment of the present application. This image processing method is applied to the above electronic device. The process shown in Figure 6 is described in detail below. The image processing method may specifically include the following steps:

Step S310: Acquire an original image as a first-eye image.

Step S320: Acquire a depth image corresponding to the first-eye image.

Step S330: Generate a target parallax image based on the depth image and the first-eye image, where there is parallax between the target parallax image and the first-eye image.

Step S340: Perform interpolation processing on the target parallax image to obtain a first parallax image.

Step S350: Perform affine transformation on the first parallax image to obtain a second parallax image aligned with the first-eye image.

In the embodiment of the present application, for step S310 to step S350, reference can be made to the other embodiments, and details are not repeated here.

Step S360: Based on the occlusion positions in the second parallax image, perform hole filling processing on the second parallax image, and use the second parallax image after the hole filling processing as the second-eye image corresponding to the first-eye image.

In the embodiment of the present application, after the affine transformation is performed on the first parallax image to obtain the second parallax image aligned with the first-eye image, it is considered that, during the affine transformation, regions that should be visible in the second parallax image but are occluded in the first-eye image have no corresponding pixels in the first-eye image to map from, so holes will appear in these regions. Therefore, after the second parallax image is obtained, hole filling processing can also be performed on the second parallax image based on the occlusion positions in the second parallax image, so as to avoid problems such as breaks, shadows, texture errors and unclear textures in the image, thereby improving the image quality of the second parallax image.

In some embodiments, performing hole filling processing on the second parallax image by the electronic device based on the occlusion positions in the second parallax image may include: performing first interpolation processing on the occlusion positions in the second parallax image; and filling target pixel values in the other areas of the second parallax image except the occlusion positions.

In the above embodiments, the occlusion positions in the second parallax image refer to the holes that appear because, during the affine transformation, there are no corresponding pixels in the first-eye image to map from. When the electronic device performs hole filling processing on the second parallax image based on the occlusion positions, the occlusion positions can be identified from the pixel values of each pixel in the second parallax image and of the pixels in the region adjacent to each pixel; for example, regions whose pixel values are black while the pixels in the adjacent region are not black can be identified. After the occlusion positions are identified, the other areas of the second parallax image except the occlusion positions can be determined; these other areas can be regarded as easily deformable areas. When performing hole filling processing on the second parallax image, interpolation processing is performed on the identified occlusion positions, and the target pixel values are filled in the above other areas of the second parallax image.
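
A minimal sketch of locating the occlusion positions as described above: a pixel is treated as a hole if it is empty (black/zero) while its neighborhood contains non-empty pixels. The neighborhood radius and the zero-valued "empty" convention are assumptions.

```python
import numpy as np

def find_holes(img: np.ndarray, radius: int = 2) -> np.ndarray:
    gray = img if img.ndim == 2 else img.mean(axis=2)
    holes = np.zeros(gray.shape, dtype=bool)
    h, w = gray.shape
    for y in range(h):
        for x in range(w):
            if gray[y, x] != 0:
                continue                                       # only empty pixels can be holes
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            if gray[y0:y1, x0:x1].any():                       # neighborhood is not all empty
                holes[y, x] = True
    return holes
```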

In a possible implementation, performing the first interpolation processing on the occlusion positions in the second parallax image may be performing bilinear interpolation processing on the occlusion positions in the second parallax image. Bilinear interpolation processing uses a bilinear polynomial to approximate the gray values between image pixels: it computes the gray value of a new pixel from the gray values of the neighboring pixels and their weights. The formula of the bilinear interpolation processing is as follows:

Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), Q22 = (x2, y2)

f(x, y) = f(Q11)(x2 - x)(y2 - y) + f(Q21)(x - x1)(y2 - y) + f(Q12)(x2 - x)(y - y1) + f(Q22)(x - x1)(y - y1)

Here, f(x, y) denotes the gray value of the target pixel; f(Q11), f(Q21), f(Q12) and f(Q22) denote the gray values of the four pixels obtained by mapping back; and (x2 - x) and (y2 - y), (x - x1) and (y2 - y), (x2 - x) and (y - y1), (x - x1) and (y - y1) denote the differences between the position coordinates of the target pixel and the mapped-back position coordinates. Referring to Figure 7, the principle of the above formula is that linear interpolation is performed in the two directions separately and the results are then combined. Specifically, for the gray value of the target pixel, four weights are first computed from the gray values of the four mapped-back pixels and their distances to the target pixel, and the gray value of the target pixel is then computed from these four weights and the gray values of the corresponding pixels. The above process is performed once for each pixel at the occlusion positions, thereby obtaining a new gray-value distribution for the occlusion positions in the second parallax image.
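
A direct implementation sketch of the bilinear interpolation formula above, used to estimate a missing pixel value from its four known neighbors Q11, Q12, Q21, Q22; unit spacing between the neighbors is assumed, matching the formula as written.

```python
def bilinear(x, y, x1, y1, x2, y2, f_q11, f_q21, f_q12, f_q22):
    """Weighted combination of the four neighboring gray values, as in the formula above."""
    return (f_q11 * (x2 - x) * (y2 - y)
            + f_q21 * (x - x1) * (y2 - y)
            + f_q12 * (x2 - x) * (y - y1)
            + f_q22 * (x - x1) * (y - y1))

# Example: interpolate the value at (0.25, 0.75) inside the unit cell [0, 1] x [0, 1].
value = bilinear(0.25, 0.75, 0, 0, 1, 1, f_q11=10.0, f_q21=20.0, f_q12=30.0, f_q22=40.0)
```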

在一种可能的实施方式中,对第二视差图像中的以上其他区域填充目标像素值,可以是对其他区域中的像素点填充黑色的像素值;也可以是取其他区域中的像素点邻近背景区域像素点的色彩信息对这些区域进行填充。可以理解地,除了遮挡区域出现的孔洞以外,还有少数像素点是空白的,这些像素点视作噪点,因此通过以上的填充,可以提升图像质量。In a possible implementation, the target pixel values are filled in the other areas in the second parallax image, which may be black pixel values for the pixels in the other areas; or the color information of the pixels in the background area adjacent to the pixels in the other areas is used to fill these areas. It can be understood that in addition to the holes in the occluded area, there are a few blank pixels, which are regarded as noise points. Therefore, the image quality can be improved through the above filling.

示例性地,请参阅图8,图8示出了针对室内场景拍摄的图像(第一目图像),通过本申请实施例提供的图像处理方法,得到的另一目图像(第二目图像)的示意图,可以看出针对室内场景拍摄的图像,可以实现边缘处的视野补全,且能得到图像质量较高的图像;请参阅图9,图9示出了针对室外场景拍摄的图像(第一目图像),通过本申请实施例提供的图像处理方法,得到的另一目图像(第二目图像)的示意图,可以看出针对室外场景拍摄的图像,也可以实现边缘处的视野补全,并且也能得到图像质量较高的图像。Exemplarily, please refer to Figure 8. Figure 8 shows an image (first-eye image) taken of an indoor scene and another eye image (second-eye image) obtained through the image processing method provided by the embodiment of the present application. Schematic diagram, it can be seen that images taken for indoor scenes can complete the field of view at the edge and obtain images with higher image quality; please refer to Figure 9, which shows images taken for outdoor scenes (first eye image), through the image processing method provided by the embodiment of the present application, the schematic diagram of the other eye image (the second eye image) obtained can be seen as an image taken for an outdoor scene, and the field of view completion at the edge can also be achieved, and Images with higher image quality can also be obtained.

The image processing method provided by the embodiments of the present application can generate, from a monocular image, another-eye image with parallax, so that the field of view corresponding to the image content can be expanded when the image content is displayed. In addition, after the disparity image is obtained for the monocular image, interpolation is performed on the disparity image, which improves the accuracy of the obtained other-eye image; affine transformation is also performed on the disparity image, so that the obtained other-eye image is aligned with the original monocular image and the expanded field-of-view content can be displayed accurately. Moreover, after the affine transformation, hole filling is performed on the disparity image, which further improves the image quality of the finally obtained other-eye image.

Referring to Figure 10, Figure 10 shows a schematic flowchart of an image processing method provided by yet another embodiment of the present application. The image processing method is applied to the above-mentioned electronic device. The flow shown in Figure 10 is described in detail below; the image processing method may specifically include the following steps:

Step S410: Obtain an original image as the first-eye image.

Step S420: Obtain a depth image corresponding to the first-eye image.

In the embodiments of the present application, step S410 and step S420 may refer to the content of the foregoing embodiments, and details are not repeated here.

Step S430: Input the depth image and the first-eye image into a pre-trained disparity estimation model to obtain a disparity image output by the disparity estimation model as the target disparity image, where the disparity estimation model is trained in advance from first sample images and the depth images corresponding to the first sample images, each first sample image is annotated with a corresponding second sample image, and there is a parallax between the second sample image and the first sample image.

In the embodiments of the present application, when the target disparity image is generated based on the above depth image and the first-eye image, the depth image and the first-eye image may be input into a pre-trained disparity estimation model, and the disparity image output by the disparity estimation model is then obtained. The disparity estimation model may be a convolutional neural network. For example, the disparity estimation model includes an encoding network and a decoding network. The image input to the encoding network passes through convolution, batch normalization (BN) and a ReLU activation function, and image features are output; the decoding network performs convolution, batch normalization and ReLU activation on the input image features, and then outputs the disparity image after passing through several residual blocks and convolution layers.
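The embodiment does not fix the exact layer configuration; the following PyTorch sketch is only one plausible encoder-decoder with convolution, batch normalization, ReLU and residual blocks, and every layer size here is an assumption chosen for illustration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels))

    def forward(self, x):
        return torch.relu(x + self.body(x))

class DisparityEstimator(nn.Module):
    """Takes the first-eye RGB image (3 channels) concatenated with its depth map (1 channel)."""
    def __init__(self, base=64, num_res_blocks=4):
        super().__init__()
        # Encoding network: convolution + BN + ReLU, producing image features.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, base, 3, stride=2, padding=1),
            nn.BatchNorm2d(base), nn.ReLU(inplace=True))
        # Decoding network: convolution + BN + ReLU, residual blocks, then a final conv.
        self.decoder = nn.Sequential(
            nn.Conv2d(base, base, 3, padding=1),
            nn.BatchNorm2d(base), nn.ReLU(inplace=True),
            *[ResidualBlock(base) for _ in range(num_res_blocks)],
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(base, 1, 3, padding=1))   # single-channel disparity image

    def forward(self, image, depth):
        # Assumes even input height/width so the upsampling restores the original size.
        x = torch.cat([image, depth], dim=1)
        return self.decoder(self.encoder(x))
```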

In some implementations, the disparity estimation model is obtained by training as follows: obtaining a plurality of sample image pairs collected by a binocular camera; obtaining, from the plurality of sample image pairs, the sample image corresponding to the first eye as the first sample image and the image corresponding to the second eye as the second sample image; for each first sample image, annotating the first sample image with the corresponding second sample image, and obtaining the depth image corresponding to the first sample image, so as to obtain a sample image set; and training an initial estimation model based on the sample image set to obtain the disparity estimation model. The plurality of sample image pairs collected by the binocular camera may be obtained by capturing images of different scenes. Since each sample image pair is collected by a binocular camera, there is a parallax between the two sample images in the pair; that is, each sample image pair includes a left-eye sample image corresponding to the left eye and a right-eye image corresponding to the right eye. Therefore, one of the images in the pair can be used as the first sample image to be input to the model, and the other image can be used as the label of the first sample image, so as to constrain the model to generate the disparity image accurately.

In some implementations, training the initial estimation model based on the sample image set to obtain the disparity estimation model may include: inputting the first sample images in the sample image set and the depth images corresponding to the first sample images into the initial estimation model to obtain the disparity estimation images output by the initial estimation model; determining a loss value based on the difference between the second sample image with which the first sample image is annotated and the disparity estimation image; and iteratively updating the initial estimation model based on the loss value to obtain the disparity estimation model.

In the above implementations, when the depth image corresponding to the first sample image is obtained, it may be obtained in the same way as the depth image corresponding to the first-eye image in the foregoing embodiments. For example, the first sample image may be input into a pre-trained depth estimation model, so as to obtain the depth image corresponding to the first sample image output by the depth estimation model.
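For illustration only, the sample image set described above could be assembled as a PyTorch dataset along the following lines; the class name, the argument names and the depth model interface are all assumptions rather than parts of the embodiment.

```python
import torch
from torch.utils.data import Dataset

class StereoSampleSet(Dataset):
    """Builds the sample image set from binocular image pairs.

    `pairs` is a list of (left_image, right_image) tensors captured by a binocular
    camera; `depth_model` is a pre-trained depth estimation model. Here the left
    image serves as the first sample image and the right image as its label.
    """
    def __init__(self, pairs, depth_model):
        self.pairs = pairs
        self.depth_model = depth_model

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        first_sample, second_sample = self.pairs[idx]
        with torch.no_grad():
            # Depth image of the first sample image, predicted by the depth estimation model.
            depth = self.depth_model(first_sample.unsqueeze(0)).squeeze(0)
        return first_sample, depth, second_sample
```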

In a possible implementation, after the first sample image and the depth image corresponding to the first sample image are input into the above initial estimation model, the initial estimation model can output the disparity estimation image corresponding to the first sample image; the loss value can then be determined from the difference between the disparity estimation image output by the initial estimation model and the label corresponding to the first sample image (that is, the second sample image with which the first sample image is annotated). Optionally, the above loss value may be determined by means of L2 loss calculation.

In a possible implementation, when the initial estimation model is iteratively updated according to the obtained loss value, the model parameters of the initial estimation model can be adjusted according to the calculated loss value; the process then returns to the step of inputting the first sample images in the sample image set and the depth images corresponding to the first sample images into the initial estimation model to obtain the disparity estimation images output by the initial estimation model, until a training end condition is met, and the trained disparity estimation model is obtained.

Here, one epoch (round) is completed once each first sample image and its depth image have been input into the initial estimation model to obtain the disparity estimation image corresponding to each first sample image, the loss value has been determined from the disparity estimation image corresponding to each first sample image and the second sample image with which each first sample image is annotated, and the model parameters of the initial estimation model have been adjusted according to the calculated loss value. The process then returns to the step of inputting the first sample images in the sample image set and the depth images corresponding to the first sample images into the initial estimation model to obtain the disparity estimation images output by the initial estimation model, so as to complete the next epoch; multiple epochs are completed by repeating in this way. An epoch refers to one pass over all the first sample images and their depth images; informally, the epoch count is the number of times the whole data set is traversed, and one epoch equals training once with all the first sample images and their depth images.

In some implementations, the Adam optimizer may be used to iteratively update the initial estimation model according to the loss value, so that the loss value obtained each time becomes smaller, until the above loss value converges; the model at this point is saved to obtain the trained disparity estimation model. The Adam optimizer combines the advantages of the AdaGrad and RMSProp optimization algorithms, and computes the update step size by jointly considering the first-moment estimate and the second-moment estimate of the gradient. The training end condition of the iterative training may include: the number of iterative training rounds reaching a target number; or the above loss value satisfying a set condition.

Optionally, the convergence condition is to make the target loss value as small as possible; with an initial learning rate of 1e-3, a learning rate that decays with the number of steps following a cosine schedule, and batch_size = 512, convergence can be considered complete after training for multiple epochs. Here, batch_size can be understood as the batch processing parameter, and its upper limit is the total number of samples in the training set.
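Putting the pieces above together, a training loop with L2 loss, the Adam optimizer, a cosine learning-rate schedule and batch_size = 512 could look roughly like this; the dataset interface, the number of epochs and the assumption that the model output and its label share the same shape are all illustrative choices, not part of the embodiment.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_disparity_model(model, sample_set, epochs=100, device='cuda'):
    """sample_set yields (first_sample_image, depth_image, second_sample_image)."""
    model = model.to(device)
    loader = DataLoader(sample_set, batch_size=512, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Learning rate decays with the step count following a cosine schedule.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs * max(len(loader), 1))
    l2_loss = nn.MSELoss()
    for epoch in range(epochs):
        for first_img, depth, second_img in loader:
            first_img, depth, second_img = (t.to(device) for t in (first_img, depth, second_img))
            pred = model(first_img, depth)            # disparity estimation image
            # L2 difference to the annotated label; assumes pred and second_img share a shape.
            loss = l2_loss(pred, second_img)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()
    return model
```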

Optionally, the loss value satisfying the set condition may include: the loss value being smaller than a set threshold. Of course, the specific training end condition is not limited here.

In a possible implementation, if the required disparity estimation model is to output the disparity image corresponding to the right eye from an input left-eye image and its depth image (that is, the above original image serves as the left-eye image and the obtained second-eye image is a right-eye image), then the above first sample image may be the left-eye image in each sample image pair and the second sample image may be the right-eye image in the pair. If the required disparity estimation model is to output the disparity image corresponding to the left eye from an input right-eye image and its depth image (that is, the above original image serves as the right-eye image and the obtained second-eye image is a left-eye image), then the above first sample image may be the right-eye image in each sample image pair and the second sample image may be the left-eye image in the pair.

In a possible implementation, the above initial estimation model includes a generative network and a discriminative network, and the generative network is used to output a corresponding disparity estimation image from each input first sample image and its depth image. That is, the initial estimation model may be a Generative Adversarial Network (GAN). The GAN model is a deep learning model that produces fairly good output through the mutual game-playing learning between at least two models in the framework: a generative model and a discriminative model. In the training process of the GAN, the goal of the generative network is to generate disparity images realistic enough to deceive the discriminative network, while the goal of the discriminative network is to distinguish the images generated by the generative network from the real images; in this way, a dynamic "game process" is formed between the generative network and the discriminative network. During training, the disparity estimation image output by the generative network can be input to the discriminative network, and the second sample image with which the first sample image is annotated can also be input to the discriminative network; in this way, the discrimination result of the discriminative network for the disparity estimation image output by the generative network and the discrimination result for the real second sample image can be obtained. A discrimination loss value of the discriminative network can then be determined from the discrimination result for the disparity estimation image output by the generative network and the discrimination result for the real second sample image.

In addition, a loss value can be determined from the difference between the second sample image with which the first sample image is annotated and the disparity estimation image, and taken as a generation loss value; the total loss value of the initial estimation model is then determined based on the discrimination loss value and the generation loss value, and the initial estimation model is iteratively trained based on the total loss value until the training end condition is met. The generative network obtained at this point is used as the disparity estimation model in the embodiments of the present application.
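As an illustrative sketch of the combined training objective, assuming a binary real/fake discriminator, an L2 generation term and a hypothetical weighting factor lambda_adv (none of which is specified by the embodiment):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l2 = nn.MSELoss()

def gan_losses(generator, discriminator, first_img, depth, second_img, lambda_adv=0.01):
    # Generated disparity estimation image; assumed to share the shape of second_img.
    pred = generator(first_img, depth)
    real_logit = discriminator(second_img)       # discrimination result for the real sample
    fake_logit = discriminator(pred.detach())    # discrimination result for the generated image
    # Discrimination loss: push real samples toward 1 and generated images toward 0.
    d_loss = bce(real_logit, torch.ones_like(real_logit)) + \
             bce(fake_logit, torch.zeros_like(fake_logit))
    # Generation loss: L2 difference to the annotated second sample image,
    # plus an adversarial term that rewards fooling the discriminator.
    adv = bce(discriminator(pred), torch.ones_like(fake_logit))
    g_loss = l2(pred, second_img) + lambda_adv * adv
    # The total loss of the initial estimation model combines both terms.
    return d_loss, g_loss
```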

Step S440: Post-process the target disparity image to obtain a second-eye image corresponding to the first-eye image.

In the embodiments of the present application, step S440 may refer to the content of other embodiments, and details are not repeated here.

The image processing method provided by the embodiments of the present application can generate, from a monocular image, another-eye image with parallax, so that the field of view corresponding to the image content can be expanded when the image content is displayed; moreover, after the disparity image is obtained for the monocular image, the disparity image is post-processed, which improves the accuracy and quality of the obtained other-eye image. In addition, when the disparity image is obtained, it is generated for the monocular image by a pre-trained disparity estimation model, which can further improve the accuracy of the obtained other-eye image.

Referring to Figure 11, Figure 11 shows a schematic flowchart of an image processing method provided by still another embodiment of the present application. The image processing method is applied to the above-mentioned electronic device. The flow shown in Figure 11 is described in detail below; the image processing method may specifically include the following steps:

Step S510: Obtain an original image as the first-eye image.

Step S520: Obtain a depth image corresponding to the first-eye image.

Step S530: Generate a target disparity image based on the depth image and the first-eye image, where there is a parallax between the target disparity image and the first-eye image.

Step S540: Post-process the target disparity image to obtain a second-eye image corresponding to the first-eye image.

In the embodiments of the present application, steps S510 to S540 may refer to the content of the foregoing embodiments, and details are not repeated here.

Step S550: Send the first-eye image and the second-eye image to a head-mounted display device, so that the head-mounted display device displays the first-eye image and the second-eye image.

In the embodiments of the present application, the electronic device may be connected to a head-mounted display device. After obtaining the second-eye image corresponding to the above first-eye image, the electronic device may send the first-eye image and the second-eye image to the head-mounted display device, so that the head-mounted display device displays the first-eye image and the second-eye image. In this way, when the head-mounted display device displays the first-eye image and the second-eye image, the image content of the first-eye image can enter the user's left eye and the image content of the second-eye image can enter the user's right eye; after fusion by the user's brain, the user can see an image whose field of view is larger than that of the first-eye image displayed alone, thereby achieving the expansion of the field of view.

By way of example, referring to Figure 12, the head-mounted display device 200 may be a wireless device. When the head-mounted display device 200 displays content, the head-mounted display device 200 may be connected to the electronic device 100. The first-eye image and the second-eye image obtained by the electronic device 100 are sent to the head-mounted display device 200, and the head-mounted display device 200 then displays the first-eye image and the second-eye image. The head-mounted display device 200 may be AR glasses, an AR helmet, VR glasses, a VR helmet, MR (Mixed Reality) glasses, an MR helmet or the like, which is not limited here.

In some implementations, the application scenario in which the electronic device executes the image processing method provided by the embodiments of the present application may be that the electronic device captures images in real time and executes the image processing method provided by the embodiments of the present application on the captured images. That is, the electronic device may obtain the monocular image collected by the image acquisition apparatus as the first-eye image; obtain the depth image corresponding to the first-eye image; generate, based on the depth image and the first-eye image, a target disparity image with a parallax relative to the first-eye image; post-process the target disparity image to obtain the second-eye image corresponding to the first-eye image; and then send the first-eye image and the second-eye image to the head-mounted display device. In this way, the electronic device can generate, in real time, another-eye image (that is, the above second-eye image) for the captured image (that is, the above first-eye image), and send the captured image together with the other-eye image to the head-mounted display device for display. The above image capture may be photo capture or video capture, which is not limited here.
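The real-time flow described above can be summarized by the following sketch, in which every callable is a placeholder supplied by the caller rather than an API defined by the embodiment:

```python
def process_frame(frame, estimate_depth, estimate_disparity, post_process, send_to_hmd):
    """Illustrative sketch of the real-time flow; all callables are assumptions."""
    first_eye = frame                                     # monocular image as the first-eye image
    depth = estimate_depth(first_eye)                     # depth image of the first-eye image
    target_disparity = estimate_disparity(first_eye, depth)
    second_eye = post_process(target_disparity)           # interpolation, affine transform, hole filling
    send_to_hmd(first_eye, second_eye)                    # head-mounted device displays the stereo pair
    return first_eye, second_eye
```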

In some implementations, after obtaining the above second-eye image, the electronic device may also store the first-eye image and the second-eye image. In this way, when the first-eye image needs to be presented through the head-mounted display device, the electronic device can send the first-eye image and the second-eye image to the head-mounted display device, so that when presenting the first-eye image, the head-mounted display device can present an image with a larger field of view, thereby expanding the field of view corresponding to the image content.

In some implementations, after obtaining the second-eye image corresponding to the above first-eye image, the electronic device may also apply the first-eye image and the second-eye image as stereo data to any stereo model, such as a SLAM system, so as to improve the accuracy of a monocular system.

The image processing method provided by the embodiments of the present application can generate, from a monocular original image, another-eye image with parallax, and send the monocular original image and the generated other-eye image to the head-mounted display device, which displays the monocular original image and the generated other-eye image, so that the user can see an image whose field of view is larger than that of the original image displayed alone, thereby achieving the expansion of the field of view.

Referring to Figure 13, Figure 13 shows a structural block diagram of an image processing apparatus 600 provided by an embodiment of the present application. The image processing apparatus 600 is applied to the above-mentioned electronic device and includes: a first image acquisition module 610, a second image acquisition module 620, a disparity image generation module 630 and an image post-processing module 640. The first image acquisition module 610 is configured to obtain an original image as the first-eye image; the second image acquisition module 620 is configured to obtain a depth image corresponding to the first-eye image; the disparity image generation module 630 is configured to generate, based on the depth image and the first-eye image, a target disparity image with a parallax relative to the first-eye image; and the image post-processing module 640 is configured to post-process the target disparity image to obtain a second-eye image corresponding to the first-eye image.

In some implementations, the image post-processing module 640 may be specifically configured to: perform interpolation on the target disparity image to obtain a first disparity image; and perform affine transformation on the first disparity image to obtain a second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image.

In a possible implementation, the image post-processing module 640 may be specifically configured to: if the first-eye image is a left-eye image, perform forward mapping on the first disparity image to obtain a second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image; and if the first-eye image is a right-eye image, perform backward mapping on the first disparity image to obtain a second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image.

In a possible implementation, the image post-processing module 640 may be specifically configured to: perform affine transformation on the first disparity image to obtain a second disparity image aligned with the first-eye image; perform hole filling on the second disparity image based on the occlusion positions in the second disparity image; and use the second disparity image after the hole filling as the second-eye image.

Optionally, the image post-processing module 640 may be specifically configured to: perform first interpolation on the occlusion positions in the second disparity image; and fill the regions of the second disparity image other than the occlusion positions with target pixel values.

In a possible implementation, the image post-processing module 640 may be specifically configured to: normalize the target disparity image to obtain a first intermediate disparity image; perform second interpolation on the first intermediate disparity image to obtain a second intermediate disparity image; and adjust each pixel value of the second intermediate disparity image according to the maximum pixel value in the target disparity image to obtain the first disparity image.
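A minimal sketch of this normalize-interpolate-rescale sequence, assuming a NumPy disparity map and SciPy's linear interpolation (the function name and the upsampling factor are assumptions):

```python
import numpy as np
from scipy import ndimage

def interpolate_disparity(target_disparity, scale=2.0):
    """Normalize, upsample with linear interpolation, then restore the value range."""
    max_val = float(target_disparity.max())
    if max_val == 0.0:
        max_val = 1.0                                       # avoid division by zero
    first_intermediate = target_disparity / max_val         # normalized disparity
    second_intermediate = ndimage.zoom(first_intermediate, scale, order=1)
    return second_intermediate * max_val                    # first disparity image
```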

In some implementations, the disparity image generation module 630 may be specifically configured to: input the depth image and the first-eye image into a pre-trained disparity estimation model to obtain a disparity image output by the disparity estimation model as the target disparity image, where the disparity estimation model is trained in advance from first sample images and the depth images corresponding to the first sample images, each first sample image is annotated with a corresponding second sample image, and there is a parallax between the second sample image and the first sample image.

In a possible implementation, the image processing apparatus 600 may further include an image pair acquisition module, a sample image acquisition module, an image set construction module and a model training module. The image pair acquisition module is configured to obtain a plurality of sample image pairs collected by a binocular camera; the sample image acquisition module is configured to obtain, from the plurality of sample image pairs, the sample image corresponding to the first eye as the first sample image and the image corresponding to the second eye as the second sample image; the image set construction module is configured to, for each first sample image, annotate the first sample image with the corresponding second sample image and obtain the depth image corresponding to the first sample image so as to obtain a sample image set; and the model training module is configured to train an initial estimation model based on the sample image set to obtain the disparity estimation model.

Optionally, the model training module may be specifically configured to: input the first sample images in the sample image set and the depth images corresponding to the first sample images into the initial estimation model to obtain the disparity estimation images output by the initial estimation model; determine a loss value based on the difference between the second sample image with which the first sample image is annotated and the disparity estimation image; and iteratively update the initial estimation model based on the loss value to obtain the disparity estimation model.

In some implementations, the second image acquisition module 620 may be specifically configured to: input the first-eye image into a pre-trained depth estimation model to obtain a depth image output by the depth estimation model as the depth image corresponding to the first-eye image.

In some implementations, the image processing apparatus 600 may further include an image sending module. The image sending module is configured to send the first-eye image and the second-eye image to a head-mounted display device, so that the head-mounted display device displays the first-eye image and the second-eye image.

In some implementations, the first image acquisition module 610 may be specifically configured to: obtain a monocular image collected by an image acquisition apparatus as the first-eye image.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above may refer to the corresponding processes in the foregoing method embodiments, and details are not repeated here.

In the several embodiments provided in this application, the coupling between modules may be electrical, mechanical or in other forms.

In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist physically alone, or two or more modules may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module.

To sum up, in the solution provided by this application, an original image is obtained as the first-eye image; a depth image corresponding to the first-eye image is obtained; a target disparity image is generated based on the depth image and the first-eye image, with a parallax between the target disparity image and the first-eye image; and the target disparity image is post-processed to obtain a second-eye image corresponding to the first-eye image. In this way, another-eye image with parallax can be generated from a monocular image, so that the field of view corresponding to the image content can be expanded when the image content is displayed, and after the disparity image is obtained for the monocular image, the disparity image is also post-processed, which improves the accuracy and quality of the obtained other-eye image.

Referring to Figure 14, Figure 14 shows a structural block diagram of an electronic device provided by an embodiment of the present application. The electronic device 100 may be an electronic device capable of running applications, such as a smartphone, a tablet computer, an e-book reader or a head-mounted display device. The electronic device 100 in this application may include one or more of the following components: a processor 110, a memory 120, and one or more applications, where the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, and the one or more applications are configured to execute the methods described in the foregoing method embodiments.

The processor 110 may include one or more processing cores. The processor 110 uses various interfaces and lines to connect all parts of the electronic device 100, and executes various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 120 and by calling data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA) and Programmable Logic Array (PLA). The processor 110 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem and the like. The CPU mainly handles the operating system, the user interface, applications and the like; the GPU is responsible for rendering and drawing display content; and the modem is used to handle wireless communication. It can be understood that the above modem may also not be integrated into the processor 110 and may instead be implemented separately through a communication chip.

The memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 120 may be used to store instructions, programs, code, code sets or instruction sets. The memory 120 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function or an image playback function), instructions for implementing the method embodiments described herein, and so on. The data storage area may also store data created during use of the electronic device 100 (such as a phone book, audio and video data, and chat record data).

Referring to Figure 15, Figure 15 shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application. The computer-readable medium 800 stores program code, and the program code can be called by a processor to execute the methods described in the above method embodiments.

The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk or a ROM. Optionally, the computer-readable storage medium 800 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 800 has storage space for program code 810 that performs any of the method steps in the above methods. The program code can be read from or written into one or more computer program products. The program code 810 may, for example, be compressed in an appropriate form.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. An image processing method, characterized in that the method comprises:
obtaining an original image as a first-eye image;
obtaining a depth image corresponding to the first-eye image;
generating a target disparity image based on the depth image and the first-eye image, wherein there is a parallax between the target disparity image and the first-eye image; and
post-processing the target disparity image to obtain a second-eye image corresponding to the first-eye image.

2. The method according to claim 1, characterized in that post-processing the target disparity image to obtain the second-eye image corresponding to the first-eye image comprises:
performing interpolation on the target disparity image to obtain a first disparity image; and
performing affine transformation on the first disparity image to obtain a second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image.

3. The method according to claim 2, characterized in that performing affine transformation on the first disparity image to obtain the second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image comprises:
if the first-eye image is a left-eye image, performing forward mapping on the first disparity image to obtain the second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image; and
if the first-eye image is a right-eye image, performing backward mapping on the first disparity image to obtain the second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image.

4. The method according to claim 2, characterized in that performing affine transformation on the first disparity image to obtain the second disparity image aligned with the first-eye image as the second-eye image corresponding to the first-eye image comprises:
performing affine transformation on the first disparity image to obtain the second disparity image aligned with the first-eye image; and
performing hole filling on the second disparity image based on an occlusion position in the second disparity image, and using the second disparity image after the hole filling as the second-eye image.

5. The method according to claim 4, characterized in that performing hole filling on the second disparity image based on the occlusion position in the second disparity image comprises:
performing first interpolation on the occlusion position in the second disparity image; and
filling regions of the second disparity image other than the occlusion position with target pixel values.

6. The method according to claim 2, characterized in that performing interpolation on the target disparity image to obtain the first disparity image comprises:
normalizing the target disparity image to obtain a first intermediate disparity image;
performing second interpolation on the first intermediate disparity image to obtain a second intermediate disparity image; and
adjusting each pixel value of the second intermediate disparity image according to a maximum pixel value in the target disparity image to obtain the first disparity image.

7. The method according to claim 1, characterized in that generating the target disparity image based on the depth image and the first-eye image comprises:
inputting the depth image and the first-eye image into a pre-trained disparity estimation model to obtain a disparity image output by the disparity estimation model as the target disparity image, wherein the disparity estimation model is trained in advance from a first sample image and a depth image corresponding to the first sample image, the first sample image is annotated with a corresponding second sample image, and there is a parallax between the second sample image and the first sample image.

8. The method according to claim 7, characterized in that the disparity estimation model is obtained by training as follows:
obtaining a plurality of sample image pairs collected by a binocular camera;
obtaining a sample image corresponding to a first eye in the plurality of sample image pairs as the first sample image, and an image corresponding to a second eye in the plurality of sample image pairs as the second sample image;
for each first sample image, annotating the first sample image with the corresponding second sample image, and obtaining the depth image corresponding to the first sample image, so as to obtain a sample image set; and
training an initial estimation model based on the sample image set to obtain the disparity estimation model.

9. The method according to claim 8, characterized in that training the initial estimation model based on the sample image set to obtain the disparity estimation model comprises:
inputting the first sample image in the sample image set and the depth image corresponding to the first sample image into the initial estimation model to obtain a disparity estimation image output by the initial estimation model;
determining a loss value based on a difference between the second sample image with which the first sample image is annotated and the disparity estimation image; and
iteratively updating the initial estimation model based on the loss value to obtain the disparity estimation model.

10. The method according to claim 1, characterized in that obtaining the depth image corresponding to the first-eye image comprises:
inputting the first-eye image into a pre-trained depth estimation model to obtain a depth image output by the depth estimation model as the depth image corresponding to the first-eye image.

11. The method according to any one of claims 1 to 10, characterized in that after post-processing the target disparity image to obtain the second-eye image corresponding to the first-eye image, the method further comprises:
sending the first-eye image and the second-eye image to a head-mounted display device, so that the head-mounted display device displays the first-eye image and the second-eye image.

12. The method according to any one of claims 1 to 10, characterized in that obtaining the original image as the first-eye image comprises:
obtaining a monocular image collected by an image acquisition apparatus as the first-eye image.

13. An image processing apparatus, characterized in that the apparatus comprises a first image acquisition module, a second image acquisition module, a disparity image generation module and an image post-processing module, wherein
the first image acquisition module is configured to obtain an original image as a first-eye image;
the second image acquisition module is configured to obtain a depth image corresponding to the first-eye image;
the disparity image generation module is configured to generate a target disparity image based on the depth image and the first-eye image, wherein there is a parallax between the target disparity image and the first-eye image; and
the image post-processing module is configured to post-process the target disparity image to obtain a second-eye image corresponding to the first-eye image.

14. An electronic device, characterized by comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to perform the method according to any one of claims 1 to 12.

15. A computer-readable storage medium, characterized in that program code is stored in the computer-readable storage medium, and the program code can be called by a processor to execute the method according to any one of claims 1 to 12.
CN202311870810.8A 2023-12-29 2023-12-29 Image processing method, device, electronic equipment and storage medium Pending CN117808860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311870810.8A CN117808860A (en) 2023-12-29 2023-12-29 Image processing method, device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN117808860A true CN117808860A (en) 2024-04-02

Family

ID=90429668


Country Status (1)

Country Link
CN (1) CN117808860A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination