
CN112017243B - Medium visibility recognition method - Google Patents

Medium visibility recognition method

Info

Publication number
CN112017243B
CN112017243B (application CN202010869849.8A, published as CN112017243A, granted as CN112017243B)
Authority
CN
China
Prior art keywords
visibility
target
algorithm
image
binocular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010869849.8A
Other languages
Chinese (zh)
Other versions
CN112017243A (en)
Inventor
王锡纲
李杨
赵育慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Xinwei Technology Co ltd
Original Assignee
Dalian Xinwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Xinwei Technology Co ltd filed Critical Dalian Xinwei Technology Co ltd
Priority to CN202010869849.8A
Publication of CN112017243A
Application granted
Publication of CN112017243B
Legal status: Active
Anticipated expiration

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of visibility recognition and provides a medium visibility recognition method comprising the following steps: collecting video data of a target object with a binocular camera and collecting visibility data with a visibility tester; extracting the position of the target object from each of the two video signals acquired by the binocular camera with a target segmentation algorithm; performing feature matching on the obtained extraction results of the target object; obtaining the distance information of the target object with a binocular ranging algorithm; predicting the image quality visibility of each frame in the two video signals acquired by the binocular camera with an image quality prediction visibility algorithm; predicting the target visual effect visibility of each frame in the two video signals acquired by the binocular camera with a target visual effect prediction visibility algorithm; and performing the final visibility prediction with a visibility balance algorithm. The invention improves the accuracy of medium visibility recognition and adapts to a variety of environments.

Description

A medium visibility recognition method

Technical Field

The present invention relates to the technical field of visibility recognition, and in particular to a medium visibility recognition method.

Background Art

Visibility recognition is of great significance in aviation, navigation, transportation, and other fields. Severe weather conditions and marine environments create safety hazards that endanger people's lives and property. If the relevant authorities can accurately publish the corresponding visibility conditions, they can help all industries improve the quality of their management.

Commonly used visibility recognition methods include manual visual observation and instrument-based measurement. Manual observation judges visibility from dedicated observation stations at each site; because it relies solely on the human eye and subjective judgment, it lacks standardization and objectivity. Instrument-based measurement calculates visibility from the transmittance or extinction coefficient measured by devices such as transmission visibility meters and lidar visibility meters; these devices are expensive, have demanding site requirements, and have significant limitations, so they cannot be widely deployed.

Summary of the Invention

The present invention mainly addresses the technical problems of existing medium visibility recognition, namely high cost, narrow applicability, and low recognition accuracy, and proposes a medium visibility recognition method that improves recognition accuracy and adapts to a variety of environments.

The present invention provides a medium visibility recognition method, comprising the following process:

Step 100, collecting video data of the target object with a binocular camera and collecting visibility data with a visibility tester, to obtain two video signals and a visibility signal;

Step 200, extracting the position of the target object from each of the two video signals collected by the binocular camera with a target segmentation algorithm, to obtain the extraction results of the target object;

Step 300, performing feature matching on the obtained extraction results of the target object;

Step 400, obtaining the distance information of the target object with a binocular ranging algorithm, and from it the deviation between the detected distance and the actual distance of the target object;

Step 500, performing image quality visibility prediction on each frame of the two video signals collected by the binocular camera with an image quality prediction visibility algorithm, to obtain a predicted visibility interval;

Step 600, performing target visual effect visibility prediction on each frame of the two video signals collected by the binocular camera with a target visual effect prediction visibility algorithm, to obtain a predicted visibility interval;

Step 700, performing the final visibility prediction with the visibility balance algorithm.

Further, step 200 includes the following process:

Step 201, extracting features from each frame of the two video signals with a convolutional neural network;

Step 202, performing preliminary classification and regression with a region extraction network;

Step 203, performing an alignment operation on the candidate box feature maps;

Step 204, classifying, regressing, and segmenting the target with a convolutional neural network to obtain the extraction result of the target object.

Further, step 300 includes the following process:

Step 301, extracting key points from the two target contours;

Step 302, locating the obtained key points;

Step 303, determining the feature vector of each key point from the located key points;

Step 304, matching the key points by their feature vectors.

Further, step 400 includes the following process:

Step 401, calibrating the binocular camera;

Step 402, performing binocular correction on the binocular camera;

Step 403, performing binocular matching on the images collected by the binocular camera;

Step 404, calculating the depth information of the binocular-matched images to obtain the distance information of the target object in the image.

Further, step 500 includes the following process:

Step 501, segmenting the image to recognize and locate the target;

Step 502, predicting the image visibility from the target recognition and localization results to obtain the image classification result.

Further, step 600 includes the following process:

Step 601, constructing the network structure of the target visual effect prediction visibility algorithm;

Step 602, inputting the extraction result of the target object obtained in step 200 into the network structure of the target visual effect prediction visibility algorithm to obtain multi-scale feature maps;

Step 603, classifying the image through the network structure of the target visual effect prediction visibility algorithm to obtain the target image classification result, i.e. the predicted visibility interval.

Further, the network structure of the target visual effect prediction visibility algorithm includes: an input layer, a convolution layer, a first feature extraction module, a channel merge, a second feature extraction module, a channel merge, a fully connected layer, and a classification output layer, wherein each feature extraction module includes 5 convolution kernels.

Further, step 700 includes the following process:

Step 701, constructing the network structure of the visibility balance algorithm, which includes an input layer, a recurrent neural network, a fully connected layer, and a visibility interval output layer;

Step 702, inputting the visibility values into the recurrent neural network in sequence to obtain a result that takes the time series into account;

Step 703, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time series.

The medium visibility recognition method provided by the present invention uses binocular cameras to capture video around the clock, uses the deviation between the detected distance and the actual distance of the target object obtained by the ranging algorithm, the visibility interval obtained by the image quality prediction visibility algorithm, and the visibility interval obtained by the target visual effect prediction visibility algorithm, and feeds these results into the visibility balance algorithm to make the final visibility prediction. The present invention can recognize the current medium visibility with high accuracy and stability, adapts well to common conditions, and does not depend on specific video acquisition equipment. Each site of the present invention uses a binocular camera for video acquisition, which serves several purposes at once: the two lenses can be used independently, with each lens treated as an independent video source so that the two signals cross-validate each other, and they can also be used together to increase sensitivity to distance.

The present invention can be applied to seabed visibility recognition, atmospheric visibility recognition in port areas, and other scenarios that require medium visibility recognition. For atmospheric visibility recognition in a port, analysis of the application scenario shows that the port area is large and the operating areas are widely distributed, so the recognition sites must be deployed at multiple points according to the operating areas. Construction in a port area is relatively mature, and the terrain and building appearance are relatively stable, which makes it easy to set detection reference points at each site and further improves the stability and accuracy of recognition. The binocular cameras are deployed at multiple points in the port; through the system's timestamp control, video data from every site at the same moment can be obtained. Because the video data form an image sequence along the time dimension, the method of the present invention can provide atmospheric visibility data for different periods and different locations for use by port operations staff.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of the medium visibility recognition method provided by the present invention;

FIG. 2 is a schematic diagram of the feature pyramid network structure;

FIG. 3 is a schematic diagram of the bottom-up structure;

FIGS. 4a-4e are schematic diagrams of the principle by which each stage of the bottom-up structure produces feature maps;

FIG. 5 is a schematic diagram of the region extraction network structure;

FIG. 6 illustrates the effect of the alignment operation on the feature maps;

FIG. 7 is a schematic diagram of the classification, regression, and segmentation network structures;

FIG. 8 is a schematic diagram of the principle of the binocular ranging algorithm;

FIG. 9 is a schematic diagram of the basic principle of binocular ranging;

FIG. 10 is a schematic diagram of the image segmentation network structure;

FIG. 11 is a schematic diagram of the image visibility prediction network structure;

FIG. 12 is a schematic diagram of the network structure of the target visual effect prediction visibility algorithm;

FIG. 13 is a schematic diagram of the visibility balance algorithm network structure;

FIGS. 14a-14b are schematic diagrams of the recurrent neural network structure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

To make the technical problems solved, the technical solutions adopted, and the technical effects achieved by the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire content.

FIG. 1 is a flowchart of the medium visibility recognition method provided by the present invention. As shown in FIG. 1, the medium visibility recognition method provided by an embodiment of the present invention includes:

Step 100, collecting video data of the target object with a binocular camera, and collecting visibility data with a visibility tester, to obtain two video signals and a visibility signal.

Because the output of the visibility recognition of the present invention is a discrete value, i.e., a numeric interval such as "below 500 meters" or "500 meters to 1000 meters", the continuous value obtained by the ranging algorithm, i.e., a detected distance such as "245.87 meters" or "1835.64 meters", is used to correct the discrete values produced by the other algorithms in order to improve detection accuracy. A target reference object therefore needs to be set. The selection principles for the target reference object are: a fixed object whose position does not change; an object that can be clearly identified both day and night under good visibility; no occlusion between the binocular camera and the target object; and distances between the target objects and the binocular camera that match the distribution of the visibility intervals and are evenly distributed, with a spacing of about 100 meters being appropriate.

Step 200, using the target segmentation algorithm to extract the position of the target object from each of the two video signals collected by the binocular camera, obtaining the extraction results of the target object.

The position of the target object is extracted from each of the two video signals of the binocular camera so that an error in one channel does not cause the subsequent calculation to fail. This step uses an accurate target segmentation algorithm to obtain the precise contour of the target object, so that its exact position can be extracted in preparation for subsequent processing.

Since the field of view of the binocular camera (viewing angle, focal length, etc.) does not change easily, the position at which the target object appears in the image is, in theory, fixed. In practice, however, the camera may shake under wind or waves, other external forces may slightly change the field of view, and interfering objects such as birds or schools of fish may enter the view. To increase detection accuracy, the target segmentation algorithm therefore sets a hotspot region in the field of view according to prior conditions and increases the weight of target objects detected inside the hotspot region.
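
The hotspot weighting can be illustrated with a brief sketch (Python; the box format (x0, y0, x1, y1, score) and the boost factor are illustrative assumptions, not values taken from the disclosure):

```python
def reweight_detections(detections, hotspot, boost=1.2):
    """Raise the score of detections whose box centre lies inside the prior hotspot region.

    Hypothetical sketch: `detections` are (x0, y0, x1, y1, score) tuples and
    `hotspot` is a single (x0, y0, x1, y1) region set from prior conditions.
    """
    hx0, hy0, hx1, hy1 = hotspot
    reweighted = []
    for (x0, y0, x1, y1, score) in detections:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        if hx0 <= cx <= hx1 and hy0 <= cy <= hy1:
            score = min(1.0, score * boost)  # boost targets detected in the hotspot
        reweighted.append((x0, y0, x1, y1, score))
    return reweighted
```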

After the target segmentation algorithm, the two video frame signals of the binocular camera should, in theory, yield the "precise contours" of the two target objects. These "precise contours" are disturbed by conditions in the medium, and the contours detected under different visibility conditions may differ. This interference is tolerated at this stage precisely because it carries visibility information. If two "precise contours" are not obtained, the frame has been recognized incorrectly or recognition could not proceed normally for some reason, for example because one video signal was captured abnormally or one lens was blocked.

If the above conditions are not met, the frame is discarded, and the next frame is awaited and recognized anew. In practical applications of this method, if this situation occurs over several consecutive frames, an alarm must be raised and the video signal saved for inspection by operations staff.

The target segmentation in this step is the first step of image analysis: it is the foundation of computer vision, an important part of image understanding, and one of the most difficult problems in image processing. Image segmentation divides an image into several disjoint regions according to features such as grayscale, color, spatial texture, and geometric shape, so that these features are consistent or similar within a region but clearly different between regions. Put simply, it separates the target from the background in an image. Segmentation greatly reduces the amount of data to be processed in later high-level stages such as image analysis and target recognition, while retaining information about the structural features of the image.

Target segmentation algorithms mainly fall into threshold-based, region-based, edge-based, and deep-learning-based methods. The main steps of the target segmentation algorithm used in this step are:

Step 201, extracting features from each frame of the two video signals with a convolutional neural network.

Because image sharpness varies with the camera parameters, this step adopts a multi-scale feature extraction scheme, namely a feature pyramid network. The structure of the feature pyramid network is shown in FIG. 2.

The feature pyramid network consists of two parts. The left part is the bottom-up structure, which produces feature maps of different scales, C1 to C5 in the figure. From bottom to top the feature maps shrink in size, which also means that the extracted features become increasingly high-level; the overall shape is a pyramid, hence the name feature pyramid network. The right part is the top-down structure, which corresponds to each level of the feature pyramid; the arrows connecting feature processing of the same level between the two structures are lateral connections.

This design is used because the smaller high-level features carry more semantic information, while the larger low-level features carry less semantic information but more positional information. Through these connections, the feature map at every level fuses features of different resolutions and semantic strengths, so the detection of objects at different resolutions is improved.

The bottom-up structure is shown in FIG. 3. It contains five stages, each computing feature maps of a different size with a scaling stride of 2. The principle by which each stage produces its feature maps is shown in FIGS. 4a-4e. The C1, C2, C3, C4, and C5 feature maps output by the stages are used to build the feature pyramid network structure.

The top-down structure is shown on the right side of the pyramid network in FIG. 2. First, the high-level feature maps with stronger semantic information are upsampled to the same size as the lower-level feature maps. Then the feature maps of the same size from the bottom-up and top-down structures are connected laterally and merged by element-wise addition. Finally, to reduce the aliasing effect introduced by upsampling, a convolution layer is applied to each merged feature map, yielding the final feature maps P2, P3, P4, and P5.
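
A minimal sketch of the top-down merge described above, using PyTorch; the channel counts assume a ResNet-style backbone and are not specified in the disclosure:

```python
import torch.nn as nn
import torch.nn.functional as F

class FPNTopDown(nn.Module):
    """Lateral 1x1 convolutions, upsample-and-add merging, 3x3 smoothing convolutions."""
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, c2, c3, c4, c5):
        # lateral projections of the bottom-up maps C2..C5, merged top-down by addition
        p5 = self.lateral[3](c5)
        p4 = self.lateral[2](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[1](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        p2 = self.lateral[0](c2) + F.interpolate(p3, scale_factor=2, mode="nearest")
        # a 3x3 convolution on each merged map reduces the aliasing from upsampling
        return [s(p) for s, p in zip(self.smooth, (p2, p3, p4, p5))]
```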

Step 202, performing preliminary classification and regression with the region extraction network.

The structure of the region extraction network is shown in FIG. 5. Based on the feature maps P2, P3, P4, and P5 obtained from the feature pyramid network, anchor boxes corresponding to the original image are first generated for each point of the feature maps according to the anchor generation rules. The P2-P5 feature maps are then fed into the region extraction network, which consists of a convolution layer and a fully connected layer and outputs the classification and regression results for every anchor box, namely the foreground/background classification scores and the bounding box coordinate corrections of each anchor. Finally, the anchors whose foreground scores meet the threshold are selected and their bounding boxes are corrected; the corrected anchors are called candidate boxes.
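
A sketch of the selection and bounding box correction described above; the (dx, dy, dw, dh) parameterization and the score threshold are common conventions assumed here rather than details of the disclosure:

```python
import torch

def select_proposals(anchors, fg_scores, deltas, score_thresh=0.5):
    """Keep anchors whose foreground score passes the threshold and apply box corrections.

    anchors:   (N, 4) boxes as (x0, y0, x1, y1)
    fg_scores: (N,)   foreground probabilities from the region extraction network
    deltas:    (N, 4) predicted (dx, dy, dw, dh) corrections
    """
    keep = fg_scores > score_thresh
    a, d = anchors[keep], deltas[keep]
    w, h = a[:, 2] - a[:, 0], a[:, 3] - a[:, 1]
    cx, cy = a[:, 0] + 0.5 * w, a[:, 1] + 0.5 * h
    cx, cy = cx + d[:, 0] * w, cy + d[:, 1] * h            # shift the box centre
    w, h = w * torch.exp(d[:, 2]), h * torch.exp(d[:, 3])  # rescale width and height
    return torch.stack([cx - 0.5 * w, cy - 0.5 * h,
                        cx + 0.5 * w, cy + 0.5 * h], dim=1)
```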

Step 203, performing an alignment operation on the candidate box feature maps.

Through the region extraction network, the candidate boxes that meet the score requirement are obtained and mapped back onto the feature maps. The feature pyramid level corresponding to each candidate box is obtained from the following formula:

k = floor(k0 + log2(sqrt(w×h) / 224))

where w is the width of the candidate box, h is its height, k is the index of the feature level to which the candidate box is assigned, and k0 is the level used when w = h = 224, generally taken as 4, corresponding to the P4 level. The feature map region corresponding to each candidate box is then obtained by bilinear interpolation, so that the resulting feature maps all have the same size. The effect of the alignment operation on the feature maps is shown in FIG. 6.
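
A short sketch of the level assignment; clamping to the available levels P2-P5 is an assumption:

```python
import math

def fpn_level(w, h, k0=4, k_min=2, k_max=5):
    """Map a candidate box of size w x h to a feature pyramid level."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / 224.0))
    return max(k_min, min(k_max, k))
```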

Step 204, classifying, regressing, and segmenting the target with a convolutional neural network to obtain the extraction result of the target object.

The classification, regression, and segmentation network structures are shown in FIG. 7. Based on the fixed-size candidate box feature maps obtained above, the classification and regression network computes the classification score and coordinate offsets of each candidate box and applies the bounding box correction, and the segmentation network segments the target inside the candidate box. The target segmentation algorithm thus produces the classification, bounding box regression, and segmentation results of the targets in the image, from which the extraction result of the target object is obtained.

Step 300, performing feature matching on the obtained extraction results of the target object.

The target segmentation algorithm of step 200 yields two target contours, but their positions and angles differ between the two video frames, so this step matches the features of the two target contours. The feature matching algorithm compares the features of the two contours to find the same point of the same object at its different positions in the two images, because the subsequent ranging algorithm must compute from a specific pixel. To ensure as far as possible that the same point is extracted, multiple samples are taken and averaged to determine the final result, and the pixel position of that point in each image is recorded. The process is as follows:

Step 301, extracting key points from the two target contours.

Key points are highly salient points that do not disappear with changes in illumination, scale, or rotation, such as corner points, edge points, bright points in dark regions, and dark points in bright regions. This step searches image locations over all scales and uses a difference-of-Gaussian function to identify candidate interest points that are invariant to scale and rotation.

Step 302, locating the obtained key points.

At each candidate location, a finely fitted model determines the position and scale. Key points are selected according to their stability.

Step 303, determining the feature vector of each key point from the located key points.

Based on the local gradient directions of the image, one or more orientations are assigned to each key point location. All subsequent operations on the image data are performed relative to the orientation, scale, and position of the key points, providing invariance to these transformations.

Step 304, matching the key points by their feature vectors.

The feature vectors of the key points are compared pairwise to find the pairs of feature points that match each other, establishing the correspondence between the features of the objects. The distance between corresponding key points can then be computed from this correspondence.
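
Steps 301 to 304 follow a SIFT-style pipeline; a minimal sketch using OpenCV, assuming the two segmented target regions are available as grayscale crops (the ratio-test threshold is an assumed value):

```python
import cv2

def match_target_keypoints(crop_left, crop_right, ratio=0.75):
    """Detect SIFT key points in both target crops and keep consistent matches."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(crop_left, None)
    kp2, des2 = sift.detectAndCompute(crop_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]  # Lowe's ratio test
    # pixel positions of the same physical point in the two images
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```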

Step 400, using the binocular ranging algorithm to obtain the distance information of the target object and, from it, the deviation between the detected distance and the actual distance of the target object.

The principle of the binocular ranging algorithm is shown in FIG. 8. As FIG. 8 shows, the error of the ranging algorithm is affected by the measurement errors of the distance between the left and right cameras, of the camera focal length, and of the vertical height difference between the cameras and the target object, among other factors. These errors are unavoidable, but this step does not aim to measure the exact distance of the target object; it only establishes the relationship between the actual distance and the detected distance under different visibility conditions. Moreover, because a neural network follows, the error produced in this step can be attenuated by the subsequent network. The output of the ranging algorithm is the detected distance value (a continuous value). The basic principle of binocular ranging is shown in FIG. 9. This step specifically includes the following process:

Step 401, calibrating the binocular camera.

Because of the characteristics of its optical lens, a camera image exhibits radial distortion, which can be described by three parameters k1, k2, and k3. The radial distortion formula is X_dr = X×(1 + k1×r^2 + k2×r^4 + k3×r^6), Y_dr = Y×(1 + k1×r^2 + k2×r^4 + k3×r^6), with r^2 = X^2 + Y^2, where (X, Y) are the undistorted image pixel coordinates and (X_dr, Y_dr) are the distorted image pixel coordinates. Because of assembly errors, the camera sensor and the optical lens are not perfectly parallel, so the image also exhibits tangential distortion, described by two parameters p1 and p2. The tangential distortion formula is X_dt = X + 2×p1×X×Y + p2×(r^2 + 2×X^2), Y_dt = Y + p1×(r^2 + 2×Y^2) + 2×p2×X×Y, with (X, Y) and (X_dt, Y_dt) defined as above. Calibrating a single camera mainly means computing the camera's intrinsic parameters (the focal length f, the imaging origin cx, cy, and the five distortion parameters; in general only k1, k2, p1, and p2 need to be computed, and k3 is required only for lenses with particularly large radial distortion such as fisheye lenses) as well as its extrinsic parameters (the world coordinates of the calibration object). Calibrating a binocular camera requires not only the intrinsic parameters of each camera but also the relative pose of the two cameras measured through calibration (i.e., the rotation matrix R and the translation vector t of the right camera relative to the left camera).
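
A sketch of the distortion model above applied to normalized image coordinates, with the radial and tangential terms combined; in practice these corrections are usually delegated to a calibration library:

```python
def distort(X, Y, k1, k2, k3, p1, p2):
    """Apply radial (k1, k2, k3) and tangential (p1, p2) distortion to coordinates (X, Y)."""
    r2 = X ** 2 + Y ** 2
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    X_d = X * radial + 2.0 * p1 * X * Y + p2 * (r2 + 2.0 * X ** 2)
    Y_d = Y * radial + p1 * (r2 + 2.0 * Y ** 2) + 2.0 * p2 * X * Y
    return X_d, Y_d
```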

Step 402, performing binocular correction on the binocular camera.

Binocular correction uses the monocular intrinsic data obtained from camera calibration (focal length, imaging origin, distortion coefficients) and the relative pose of the two cameras (rotation matrix and translation vector) to eliminate distortion and row-align the left and right views, so that the imaging origins of the two views coincide, the optical axes of the two cameras are parallel, the left and right imaging planes are coplanar, and the epipolar lines are row-aligned. Any point in one image and its corresponding point in the other image then necessarily share the same row number, and the corresponding point can be found by a one-dimensional search along that row.
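
A sketch of this correction step using OpenCV, assuming the intrinsics (camera matrices and distortion coefficients) and the stereo extrinsics (R, T) from step 401 are available:

```python
import cv2

def rectify_pair(frame_l, frame_r, K_l, dist_l, K_r, dist_r, R, T, image_size):
    """Undistort and row-align the left and right views so epipolar lines share a row."""
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_l, dist_l, K_r, dist_r,
                                                image_size, R, T, alpha=0)
    m1x, m1y = cv2.initUndistortRectifyMap(K_l, dist_l, R1, P1, image_size, cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K_r, dist_r, R2, P2, image_size, cv2.CV_32FC1)
    rect_l = cv2.remap(frame_l, m1x, m1y, cv2.INTER_LINEAR)
    rect_r = cv2.remap(frame_r, m2x, m2y, cv2.INTER_LINEAR)
    return rect_l, rect_r, Q  # Q can reproject disparity to depth later if needed
```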

Step 403, performing binocular matching on the images collected by the binocular camera.

Binocular matching associates the image points corresponding to the same scene in the left and right views; its purpose is to obtain disparity data.

Step 404, calculating the depth information of the binocular-matched images to obtain the distance information of the target object in the image.

P is a point on the object to be measured, L and R are the optical centers of the left and right cameras, and the image points of P on the two camera sensors are p and p′ respectively (the camera imaging planes are drawn rotated to lie in front of the lenses). f denotes the camera focal length, b the distance between the two camera centers, and z the distance of the target object. Let the distance between p and p′ be dis; then

dis = b - (X_R - X_L)

According to the principle of similar triangles, dis / b = (z - f) / z, that is:

(b - (X_R - X_L)) / b = (z - f) / z

which gives:

z = f×b / (X_R - X_L)

In the formula, the focal length f and the camera center distance b are obtained by calibration, so the depth information is available as soon as the value of X_R - X_L (i.e., the disparity d) is known. The disparity value is computed from the key points matched by the feature matching algorithm in step 300. The binocular ranging algorithm thus yields the distance information of the target object in the image, from which the deviation between the detected distance and the actual distance of the target object is obtained.
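
A one-line sketch of the final depth computation, following the convention above with d = X_R - X_L:

```python
def target_depth(x_r, x_l, f, b):
    """z = f * b / d, where d is the disparity of the matched key point from step 300."""
    d = x_r - x_l
    if d == 0:
        return float("inf")  # zero disparity: the point is effectively at infinity
    return f * b / d
```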

Step 500, using the image quality prediction visibility algorithm to perform image quality visibility prediction on each frame of the two video signals collected by the binocular camera, obtaining a predicted visibility interval.

The image quality prediction visibility algorithm predicts visibility from the macroscopic information of the image, mainly from the sharpness and contrast of objects seen against the medium background of the image. This stage cannot take the video frame signal directly: the high-frequency information of the near foreground must first be filtered out and the low-frequency information of the medium background extracted, after which the image is analyzed and the prediction is made. So that the present invention can operate both day and night and achieve good prediction accuracy under different conditions, training this algorithm requires a large amount of video data together with the readings of the visibility detector at the same timestamps. The output of this step is a visibility interval value (a discrete value). This step consists of the following two sub-steps:

Step 501, segmenting the image to recognize and locate the target.

The image segmentation network structure is shown in FIG. 10. The image passes through three convolution blocks for feature extraction and then two fully connected layers, which produce the classification score and bounding box position of the target in the image. The highest-scoring output is selected and the bounding box most likely to contain the target is extracted, thereby recognizing and locating the target.
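
A sketch of the step 501 network as described (three convolution blocks followed by two fully connected layers producing class scores and one bounding box); the channel counts and hidden width are assumptions:

```python
import torch.nn as nn

class SimpleDetector(nn.Module):
    """Three conv blocks, two fully connected layers, class scores plus one bounding box."""
    def __init__(self, num_classes=2):
        super().__init__()
        layers, ch = [], 3
        for out_ch in (32, 64, 128):          # three convolution blocks
            layers += [nn.Conv2d(ch, out_ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
            ch = out_ch
        self.features = nn.Sequential(*layers)
        self.fc = nn.Sequential(nn.Flatten(), nn.LazyLinear(256), nn.ReLU())
        self.cls_head = nn.Linear(256, num_classes)  # classification score
        self.box_head = nn.Linear(256, 4)            # bounding box position

    def forward(self, x):
        h = self.fc(self.features(x))
        return self.cls_head(h), self.box_head(h)
```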

Step 502, predicting the image visibility from the target recognition and localization results to obtain the image classification result.

The image visibility prediction network structure is shown in FIG. 11. The predicted-visibility image is obtained on the basis of the segmentation result above. Because the image scene is relatively complex, the network contains three modules, each of which uses four different convolution kernels to extract features of the image at different scales, increasing feature diversity and improving classification accuracy. At the output of each module, the various extracted features are concatenated along the channel dimension to obtain multi-scale feature maps. Finally, a fully connected layer classifies the image, producing the image classification result, i.e., the predicted visibility interval.
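
A sketch of one such module (four parallel convolution kernels whose outputs are concatenated along the channel dimension); the kernel sizes and channel counts are assumptions:

```python
import torch
import torch.nn as nn

class MultiKernelModule(nn.Module):
    """Four parallel convolutions of different kernel sizes, merged along the channel axis."""
    def __init__(self, in_ch, branch_ch=32, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2) for k in kernel_sizes)

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```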

Step 600, using the target visual effect prediction visibility algorithm to perform target visual effect visibility prediction on each frame of the two video signals collected by the binocular camera, obtaining a predicted visibility interval.

The target visual effect prediction visibility algorithm predicts visibility from the microscopic information of the image, mainly from the contour gradient of the target object, the completeness of its contour, and the saturation of its color. The input of this stage is the output of the target segmentation algorithm of step 200. So that the present invention can operate both day and night and achieve good prediction accuracy under different conditions, training this algorithm requires a large amount of video data together with the readings of the visibility detector at the same timestamps. The output of this step is a visibility interval value (a discrete value). Step 600 specifically includes the following process:

Step 601, constructing the network structure of the target visual effect prediction visibility algorithm.

The network structure of the target visual effect prediction visibility algorithm is shown in FIG. 12. It includes an input layer, a convolution layer, a first feature extraction module, a channel merge, a second feature extraction module, a channel merge, a fully connected layer, and a classification output layer; each feature extraction module includes 5 convolution kernels. The target segmentation algorithm of step 200 provides an image containing the target. Because the target image contains relatively little environmental noise, the network constructed in this step contains two feature extraction modules, each of which uses three different convolution kernels to extract features of the image at different scales, increasing feature diversity and improving classification accuracy.

Step 602, inputting the extraction result of the target object obtained in step 200 into the network structure of the target visual effect prediction visibility algorithm to obtain multi-scale feature maps.

Specifically, at the output of each module the extracted features are concatenated along the channel dimension to obtain multi-scale feature maps.

Step 603, classifying the image through the network structure of the target visual effect prediction visibility algorithm to obtain the target image classification result, i.e., the predicted visibility interval.

Specifically, a fully connected layer classifies the image to obtain the target image classification result.

Step 700, using the visibility balance algorithm to make the final visibility prediction.

After the processing of steps 200 to 600, three visibility-related results are obtained for one frame of video data: (1) the deviation (a continuous value) between the detected distance and the actual distance of the target object, obtained by the ranging algorithm in step 400, a deviation that is strongly and directly related to visibility; (2) the visibility interval (a discrete value) obtained by the image quality prediction visibility algorithm in step 500; and (3) the visibility interval (a discrete value) obtained by the target visual effect prediction visibility algorithm in step 600.

For balancing multiple computation results, the traditional strategy is to take the mean directly or to take the mean after excluding outliers. To further improve detection accuracy, this step uses a cyclic check over multi-frame results. Within a period that is short relative to the rate at which visibility changes (for example, 1 minute), several frames are taken at a fixed interval (for example, 5 seconds) and processed, and the per-frame detection results are fed in chronological order into the visibility balance algorithm to obtain the final visibility interval value. Step 700 includes the following process:

Step 701, constructing the network structure of the visibility balance algorithm, which includes an input layer, a recurrent neural network, a fully connected layer, and a visibility interval output layer.

The network structure of the visibility balance algorithm is shown in FIG. 13. As shown in FIG. 13, it includes an input layer, a recurrent neural network, a fully connected layer, and a visibility interval output layer. The network receives the visibility results in chronological order; the visibility feature input at each time step has length 3, consisting of the deviation between the detected distance and the actual distance of the target object from the ranging algorithm in step 400, the visibility interval from the image quality prediction visibility algorithm in step 500, and the visibility interval from the target visual effect prediction visibility algorithm in step 600.
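
A sketch of the balance network as described (a length-3 feature per time step, a recurrent layer, then a fully connected layer); the hidden size and the number of visibility intervals are assumptions:

```python
import torch.nn as nn

class VisibilityBalanceNet(nn.Module):
    """Length-3 features per frame (distance deviation plus two predicted intervals) over time."""
    def __init__(self, hidden_size=32, num_intervals=6):
        super().__init__()
        self.rnn = nn.RNN(input_size=3, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_intervals)

    def forward(self, x):            # x: (batch, time_steps, 3)
        out, _ = self.rnn(x)         # tanh recurrence over the frame sequence (step 702)
        return self.fc(out[:, -1])   # scores over the visibility intervals (step 703)
```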

Step 702, inputting the visibility values into the recurrent neural network in chronological order to obtain a result that takes the time series into account.

The visibility balance algorithm of the present invention balances multiple computation results along the time dimension and reduces the impact of single-frame calculation errors. Results obtained at different timestamps are correlated with one another, that is, visibility does not change drastically over a short period, so the time dimension can be used to correct the multiple detection values. Visibility balancing is therefore first handled by a recurrent neural network, whose characteristic is that each computation takes the result of the previous computation as prior input, which has the effect of correcting the subsequent computations. After this correction, the computation results at different timestamps are obtained and fed into a fully connected neural network, which integrates the multiple results to produce the final result. The recurrent neural network structure, shown in FIGS. 14a-14b, includes an input layer, a recurrent layer, and an output layer.

A recurrent neural network learns recursively over the order of its input data, so it can be used to process sequence-related data. As the network structure shows, a recurrent neural network remembers previous information and uses it to influence the output of later nodes; in other words, the nodes between the hidden layers of a recurrent neural network are connected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous moment.

Given data X = {X_1, X_2, ..., X_t} input in sequence, where the feature length of X is c and the unrolled length is t, the output h_t of the recurrent neural network is computed as:

h_t = tanh(W×X_t + W×h_(t-1))

where W denotes the hidden layer parameters and tanh is the activation function. The formula shows that the output at time t depends not only on the current input X_t but also on the output h_(t-1) of the previous time step.

Step 703, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time series.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, without departing in essence from the scope of the technical solutions of the embodiments of the present invention.

Claims (3)

1. A method for identifying visibility of a medium, comprising the steps of:
step 100, acquiring video data of a target object through a binocular camera, and acquiring visibility data through a visibility tester to obtain two paths of video signals and a visibility signal;
Step 200, respectively extracting the positions of the target objects from two paths of video signals acquired by the binocular camera by using a target segmentation algorithm to obtain extraction results of the target objects;
Step 300, performing feature matching on the obtained extraction result of the target object;
Step 400, obtaining the distance information of the target object by using a binocular distance measuring algorithm, and further obtaining the deviation between the detection distance of the target object and the actual distance; step 400 includes the following steps 401 to 404:
step 401, calibrating a binocular camera;
Step 402, performing binocular correction on a binocular camera;
Step 403, performing binocular matching on the image acquired by the binocular camera;
Step 404, calculating depth information of the binocular-matched image to obtain distance information of a target object in the image;
step 500, predicting the image quality visibility of each frame image in two paths of video signals acquired by a binocular camera by using an image quality prediction visibility algorithm, and predicting the obtained visibility interval; step 500 includes the following steps 501 to 502:
Step 501, dividing an image to realize identification and positioning of a target;
step 502, predicting the visibility of the image according to the identification and positioning results of the target to obtain an image classification result;
Step 600, predicting the target visual effect visibility for each frame image in the two video signals acquired by the binocular camera by using a target visual effect prediction visibility algorithm, to predict a visibility interval; step 600 includes the following steps 601 to 603:
Step 601, constructing a target visual effect prediction visibility algorithm network structure; the target visual effect prediction visibility algorithm network structure comprises: an input layer, a convolution layer, a first feature extraction module, a merging channel, a second feature extraction module, a merging channel, a fully connected layer and a classification output layer; wherein each feature extraction module comprises 5 convolution kernels;
Step 602, inputting the extraction result of the target object obtained in step 200 into the target visual effect prediction visibility algorithm network structure to obtain a multi-scale feature map;
Step 603, classifying images through the target visual effect prediction visibility algorithm network structure to obtain a target image classification result, thereby obtaining the predicted visibility interval;
Step 700, performing the final visibility prediction by using a visibility balance algorithm; step 700 includes the following steps 701 to 703:
Step 701, constructing a visibility balance algorithm network structure, wherein the visibility balance algorithm network structure comprises an input layer, a recurrent neural network, a fully connected layer and a visibility interval output layer;
Step 702, sequentially inputting the visibility values into the recurrent neural network to obtain a result that takes the time sequence into account;
Step 703, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time sequence.
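As an illustrative aside, not forming part of the claims: the binocular ranging stage of claim 1 (steps 401 to 404) could be sketched with OpenCV roughly as follows. The calibration parameters (camera matrices K1/K2, distortion vectors D1/D2 and stereo extrinsics R, T) are assumed to come from a prior cv2.stereoCalibrate run on calibration-board images, and the use of semi-global block matching is an assumption, since the claim does not name a particular matching method:

```python
# Minimal sketch of steps 401-404 (assumptions noted above).
import cv2
import numpy as np

def stereo_depth(img_left, img_right, K1, D1, K2, D2, R, T):
    h, w = img_left.shape[:2]

    # Step 402: binocular rectification from the calibration results.
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, (w, h), R, T)
    m1x, m1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, (w, h), cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, (w, h), cv2.CV_32FC1)
    rect_l = cv2.remap(img_left, m1x, m1y, cv2.INTER_LINEAR)
    rect_r = cv2.remap(img_right, m2x, m2y, cv2.INTER_LINEAR)

    # Step 403: binocular matching, here with semi-global block matching.
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disp = sgbm.compute(cv2.cvtColor(rect_l, cv2.COLOR_BGR2GRAY),
                        cv2.cvtColor(rect_r, cv2.COLOR_BGR2GRAY)).astype(np.float32) / 16.0

    # Step 404: depth from disparity; the Z channel is the per-pixel distance,
    # which can be sampled inside the target's region to obtain its distance.
    points_3d = cv2.reprojectImageTo3D(disp, Q)
    return points_3d[..., 2]
```

The deviation referred to in step 400 would then be the difference between the depth sampled inside the target's extraction result and the target's known actual distance.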
2. The medium visibility identification method according to claim 1, wherein step 200 comprises the following process:
Step 201, extracting features from each frame of the two video signals using a convolutional neural network;
Step 202, performing preliminary classification and regression by using a region extraction network;
Step 203, performing an alignment operation on the candidate box feature maps;
Step 204, classifying, regressing and segmenting the target by using the convolutional neural network to obtain an extraction result of the target object.
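As an illustrative aside, not forming part of the claims: steps 201 to 204 follow the familiar Mask R-CNN pattern of backbone feature extraction, a region proposal stage, an alignment operation on candidate box features, and classification, regression and segmentation heads. A minimal sketch using torchvision's off-the-shelf Mask R-CNN (an assumption; the patent's own network weights and training data are not reproduced here, and the 0.7 score threshold is likewise assumed) might look like:

```python
# Minimal sketch of steps 201-204 (assumptions noted above).
# Assumes torchvision >= 0.13 for the weights= argument.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def extract_targets(frame_rgb, score_thresh=0.7):
    # frame_rgb: H x W x 3 uint8 array taken from one of the two video signals.
    with torch.no_grad():
        output = model([to_tensor(frame_rgb)])[0]
    keep = output["scores"] > score_thresh
    # Boxes, class labels and soft masks of the detected target objects.
    return output["boxes"][keep], output["labels"][keep], output["masks"][keep]
```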
3. The medium visibility identification method according to claim 1, wherein step 300 comprises the following procedure:
Step 301, extracting key points from the two target contours;
Step 302, locating the key points;
Step 303, determining feature vectors of the key points according to the located key points;
Step 304, matching the key points by means of their feature vectors.
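As an illustrative aside, not forming part of the claims: the keypoint extraction, localization, description and matching of steps 301 to 304 could be sketched with OpenCV as follows; the use of SIFT features and Lowe's ratio test (with an assumed 0.75 threshold) is a choice made for illustration, since the claim does not name a specific descriptor:

```python
# Minimal sketch of steps 301-304 (assumptions noted above).
import cv2

def match_target_regions(patch_left, patch_right, ratio=0.75):
    sift = cv2.SIFT_create()
    gray_l = cv2.cvtColor(patch_left, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(patch_right, cv2.COLOR_BGR2GRAY)

    # Steps 301-303: detect and locate key points, then compute their feature vectors.
    kp_l, des_l = sift.detectAndCompute(gray_l, None)
    kp_r, des_r = sift.detectAndCompute(gray_r, None)

    # Step 304: match key points by comparing their feature vectors,
    # keeping only matches that pass the ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(des_l, des_r, k=2)
    good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return kp_l, kp_r, good
```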

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010869849.8A 2020-08-26 2020-08-26 Medium visibility recognition method

Publications (2)

Publication Number Publication Date
CN112017243A (en) 2020-12-01
CN112017243B (en) 2024-05-03

Family

ID=73502310

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202010869849.8A 2020-08-26 2020-08-26 Medium visibility recognition method (Active)

Country Status (1)

Country Link
CN (1) CN112017243B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818748B (en) * 2020-12-31 2024-06-21 北京字节跳动网络技术有限公司 Method and device for determining plane in video, storage medium and electronic equipment
CN113658275B (en) * 2021-08-23 2024-10-29 深圳市商汤科技有限公司 Visibility value detection method, device, equipment and storage medium
CN115220029B (en) * 2021-11-25 2024-07-12 广州汽车集团股份有限公司 Visibility recognition method and device and automobile
CN114202542B (en) * 2022-02-18 2022-04-19 象辑科技(武汉)股份有限公司 A visibility inversion method, device, computer equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101382497A (en) * 2008-10-06 2009-03-11 南京大学 Visibility detection method based on traffic monitoring video
KR20110037180A (en) * 2009-10-06 2011-04-13 충주대학교 산학협력단 Road visibility measurement system using camera and its method
WO2012100522A1 (en) * 2011-01-26 2012-08-02 南京大学 Ptz video visibility detection method based on luminance characteristic
CN104677330A (en) * 2013-11-29 2015-06-03 哈尔滨智晟天诚科技开发有限公司 Small binocular stereoscopic vision ranging system
CN108875794A (en) * 2018-05-25 2018-11-23 中国人民解放军国防科技大学 A Method of Image Visibility Detection Based on Transfer Learning
CN110020642A (en) * 2019-05-14 2019-07-16 江苏省气象服务中心 A kind of visibility recognition methods based on vehicle detection
CN110849807A (en) * 2019-11-22 2020-02-28 山东交通学院 Monitoring method and system suitable for road visibility based on deep learning
CN111028295A (en) * 2019-10-23 2020-04-17 武汉纺织大学 A 3D imaging method based on encoded structured light and binocular
CN111191629A (en) * 2020-01-07 2020-05-22 中国人民解放军国防科技大学 Multi-target-based image visibility detection method
CN111428752A (en) * 2020-02-25 2020-07-17 南通大学 A Visibility Detection Method Based on Infrared Image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant