CN109214319A

CN109214319A - Underwater image target detection method and system

Info

Publication number: CN109214319A
Application number: CN201810965772.7A
Authority: CN
Inventors: 李振波; 彭芳; 苗政; 李光耀; 钮冰姗; 杨晋琪; 岳峻; 李道亮
Original assignee: China Agricultural University
Current assignee: China Agricultural University
Priority date: 2018-08-23
Filing date: 2018-08-23
Publication date: 2019-01-15

Abstract

Embodiments of the present invention provide an underwater image target detection method and system. The method includes: acquiring an underwater image to be detected, and preprocessing it with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image; and inputting the first image into a trained preset feature pyramid network (FPN) and outputting a target detection result, wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added. After the underwater image to be detected is preprocessed, the improved FPN performs target detection on the preprocessed image and outputs the target detection result. No manually designed features are needed, the detection process takes little time, and the detection result is highly accurate, so the method is well suited to practical applications such as automatic detection and automatic fishing.

Description

Underwater image target detection method and system

Technical field

Embodiments of the present invention relate to the technical field of image processing, and more particularly to an underwater image target detection method and system.

Background

Underwater vision, one of the important technologies for perceiving the underwater world, is now increasingly used in ocean engineering, undersea exploration, target detection, biological monitoring and other areas. It provides rich information for oceanographic and fishery science research and lays a foundation for the development of smart oceans and smart fisheries. As a branch of underwater vision, underwater image target detection technology is mounted on underwater robots and applied in the underwater environment to image and video analysis, biomass estimation, biological identification and search, and similar tasks, promoting the development of ocean exploration and fishery automation.

However, underwater environments usually suffer from insufficient illumination, obvious noise, low contrast and severe color cast, and underwater targets often have protective coloration similar to their surroundings. All of these factors greatly limit the performance of detection algorithms on underwater images. Most of the prior art extracts target features based on hand-designed features (such as color, shape, texture, SIFT, HOG and DPM) and then identifies and locates the target using pattern recognition methods.

However, hand-designed feature engineering is time-consuming and laborious and is not robust to complex underwater backgrounds, so recognition accuracy against such backgrounds is low. Existing target detection methods also tend to run slowly, and so cannot adapt well to practical applications such as automatic detection and automatic fishing.

Summary of the invention

Embodiments of the present invention provide an underwater image target detection method and system that overcome the above problems or at least partially solve them.

In a first aspect, an embodiment of the present invention provides an underwater image target detection method, comprising:

acquiring an underwater image to be detected, and preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image;

inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

In a second aspect, an embodiment of the present invention provides an underwater image target detection system, comprising:

an image preprocessing module, configured to acquire an underwater image to be detected and to preprocess the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image;

a target detection module, configured to input the first image into a trained preset feature pyramid network (FPN) and output a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

In a third aspect, an embodiment of the present invention provides an electronic device comprising a processor, a communication interface, a memory and a bus, wherein the processor, the communication interface and the memory communicate with one another via the bus, and the processor can call logic instructions in the memory to execute the underwater image target detection method provided in the first aspect.

In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the underwater image target detection method provided in the first aspect.

The underwater image target detection method and system provided by the embodiments of the present invention preprocess the underwater image to be detected, then use the improved FPN to perform target detection on the preprocessed image and output the target detection result. No manually designed features are needed, the detection process takes little time, and the detection result is highly accurate, so the method and system are well suited to practical applications such as automatic detection and automatic fishing.

Brief description of the drawings

Fig. 1 is a flowchart of an underwater image target detection method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of the C.ReLU structure in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the preset FPN in an embodiment of the present invention;

Fig. 4 is a structural block diagram of an underwater image target detection system provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are evidently some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

Fig. 1 is a flowchart of an underwater image target detection method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:

S101: acquiring an underwater image to be detected, and preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image;

S102: inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

In step S101, because imaging conditions in the underwater environment are poor, underwater images are often distorted and blurred, so the underwater image to be detected must be denoised before detection. The first image, obtained by preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm, can then be input directly into the trained preset FPN.

In step S102, the trained preset FPN is used to detect targets in the first image. The preset FPN replaces the backbone convolutional neural network (CNN) of the existing FPN with the lightweight PVANet and adds the C.ReLU structure. The filters learned by the first few layers of a CNN are negatively correlated; that is, low-level convolution kernels tend to appear in pairs whose parameters are opposite in sign. The C.ReLU structure therefore halves the number of output feature maps computed by convolution, obtains the other half directly by negation, and concatenates the two sets of feature maps, which reduces the number of convolution kernels and improves runtime efficiency. The embodiment of the present invention adds C.ReLU to the shallow convolutional layers of the FPN, which reduces the number of network parameters and shortens running time without lowering accuracy.

Specifically, Fig. 2 is a schematic diagram of the C.ReLU structure, in which Convolution denotes the convolution operation; Negation denotes taking the opposite number; Concatenation denotes joining the two sets of feature maps; Scale/Shift denotes scaling and shifting; ReLU denotes activation with the ReLU function; and Shortcut connection denotes the residual connection.
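The Negation, Concatenation, Scale/Shift and ReLU steps of Fig. 2 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: the function name `crelu` and the per-channel `scale` and `shift` arrays are hypothetical, the convolution and the shortcut connection are omitted, and the input is assumed to be a (channels, height, width) array already produced by a convolution.

```python
import numpy as np

def crelu(features, scale, shift):
    """Apply a C.ReLU-style activation to a (channels, height, width) array.

    features: convolution output holding half of the desired channel count.
    scale, shift: hypothetical per-channel parameters of length 2 * channels.
    """
    negated = -features                                     # Negation
    stacked = np.concatenate([features, negated], axis=0)   # Concatenation
    # Scale/Shift: one multiplier and one offset per output channel.
    adjusted = stacked * scale[:, None, None] + shift[:, None, None]
    return np.maximum(adjusted, 0.0)                        # ReLU

# Two input channels on a 1x2 map become four output channels.
x = np.array([[[1.0, -2.0]], [[-3.0, 4.0]]])
out = crelu(x, scale=np.ones(4), shift=np.zeros(4))
```

Only two feature maps are computed by convolution here, yet four are passed on, which is the source of the saving in convolution kernels described above.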

The feature pyramid structure, the most important part of the existing FPN, is changed accordingly. The layers conv3_4, conv4_4 and conv5_4 are denoted {C3, C4, C5} and represent convolutional feature maps at different scales (the deeper the convolutional layer, the smaller the feature map and the stronger its semantic features). Through lateral connections and the top-down pathway, a feature map pyramid {P3, P4, P5} with strong semantic features is obtained, so that the high-resolution low-level feature maps also acquire strong semantic features, which improves detection accuracy for small-scale targets. Fusing the features of different convolutional layers further improves detection accuracy.

The structure of the preset FPN is shown in Fig. 3. The dashed box drawn around the Feature Pyramid Structure part shows the details of the lateral connections and the top-down pathway: for example, C4 is passed through a 1x1 convolution, combined with the 2x-upsampled P5, and then passed through a 3x3 convolution to obtain P4.
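The computation of P4 from C4 and P5 can be sketched in NumPy as follows. This is a minimal sketch under assumptions the text does not state: the lateral output and the upsampled map are combined by element-wise addition (as in the commonly used FPN formulation), upsampling is nearest-neighbour, the 3x3 convolution uses zero padding, and all function names, shapes and random weights are hypothetical.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weight):
    """3x3 convolution with zero padding; weight is (C_out, C_in, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], window)
    return out

def merge_level(c_lower, p_upper, lateral_w, smooth_w):
    """Build one pyramid level, e.g. P4 from C4 and P5."""
    lateral = conv1x1(c_lower, lateral_w)      # lateral connection
    merged = lateral + upsample2x(p_upper)     # top-down pathway (assumed addition)
    return conv3x3(merged, smooth_w)           # 3x3 convolution on the merged map

rng = np.random.default_rng(0)
c4 = rng.normal(size=(8, 4, 4))    # hypothetical C4: 8 channels, 4x4
p5 = rng.normal(size=(4, 2, 2))    # hypothetical P5: 4 channels, 2x2
p4 = merge_level(c4, p5, rng.normal(size=(4, 8)), rng.normal(size=(4, 4, 3, 3)))
```

The resulting `p4` has the spatial size of C4 and the channel count of P5, so the same function could be applied again to obtain P3 from C3 and P4.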

The underwater image target detection method provided by the embodiment of the present invention preprocesses the underwater image to be detected, then uses the improved FPN to perform target detection on the preprocessed image and outputs the target detection result. No manually designed features are needed, the detection process takes little time, and the detection result is highly accurate, so the method is well suited to practical applications such as automatic detection and automatic fishing.

In the above embodiment, preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain the first image specifically includes:

preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and with the dark channel prior denoising algorithm separately to obtain a second image and a third image;

selecting one of the second image and the third image as the first image using a preset voting mechanism.

Specifically, the preprocessing of the image to be detected can be understood as fusing the MS-CNN denoising algorithm and the dark channel prior denoising algorithm by means of a voting mechanism. Fusion draws on the strengths of multiple denoising algorithms: the same image sample is processed by different denoising algorithms, and a single denoising result is produced by optimal combination. This result is often more reliable than the result of any single denoising algorithm.

In the above embodiment, the voting indices of the preset voting mechanism are the peak signal-to-noise ratio (PSNR) and the image entropy; correspondingly,

selecting one of the second image and the third image as the first image using the preset voting mechanism specifically includes:

if the sum of the PSNR and the image entropy of the second image is determined to be not less than the sum of the PSNR and the image entropy of the third image, selecting the second image as the first image; if the sum of the PSNR and the image entropy of the second image is determined to be less than the sum of the PSNR and the image entropy of the third image, selecting the third image as the first image.

Specifically, the voting indices used in the embodiment of the present invention are the PSNR (denoted p) and the image entropy (denoted e). Let the fusion result be S; it is calculated as follows:

S = S_m if w_p·p_m + w_e·e_m ≥ w_p·p_d + w_e·e_d; otherwise S = S_d (1)

In formula (1), S_m and S_d are the images denoised by MS-CNN and by the dark channel prior algorithm, respectively; w_p and w_e are the weights of the PSNR and the image entropy, respectively; p_m and e_m denote the PSNR and image entropy of the MS-CNN denoised image; and, similarly, p_d and e_d denote the PSNR and image entropy of the dark channel prior denoised image. PSNR is the most widely used objective measure of image quality: the larger the PSNR, the less the image is distorted. Image entropy measures the richness of image information from the viewpoint of information theory: in general, the larger the image entropy, the more information the image carries per unit area and the better the image quality.
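The selection rule can be sketched in Python with NumPy. This is a minimal illustration rather than the implementation of the embodiment: the function names `psnr`, `image_entropy` and `vote` are hypothetical, the images are assumed to be 8-bit single-channel arrays, and PSNR is assumed to be computed against the original underwater image, which the text does not specify. The comparison assumes that formula (1) compares the weighted sum w_p·p + w_e·e of the two denoised images.

```python
import numpy as np

def psnr(reference, image, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - image.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def image_entropy(image):
    """Shannon entropy (bits) of the grey-level histogram of a uint8 image."""
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    prob = prob[prob > 0]            # empty bins contribute nothing
    return float(-np.sum(prob * np.log2(prob)))

def vote(reference, s_m, s_d, w_p=1.0, w_e=1.0):
    """Return s_m if its weighted score is not less than that of s_d, else s_d."""
    score_m = w_p * psnr(reference, s_m) + w_e * image_entropy(s_m)
    score_d = w_p * psnr(reference, s_d) + w_e * image_entropy(s_d)
    return s_m if score_m >= score_d else s_d
```

With both weights set to 1, the rule reduces to comparing the plain sums of PSNR and image entropy, which matches the wording of the selection step above.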

In the above embodiment, before the first image is input into the trained preset feature pyramid network (FPN), the method further includes:

obtaining a training dataset by a data augmentation method;

training the preset FPN using the training dataset to obtain the trained FPN.

On the one hand, deep learning requires far more training images than conventional methods, and underwater images are not easy to obtain; data augmentation effectively enlarges the dataset and reduces the effect of overfitting. On the other hand, a large dataset can effectively improve the algorithm's rotation invariance, scale invariance, data diversity and so on, thereby improving target detection accuracy.

Specifically, before the images in the training dataset are input into the preset FPN, they also need to be preprocessed.

In the above embodiment, obtaining the training dataset by data augmentation specifically includes:

obtaining the training dataset by one or more of horizontal flipping, vertical flipping, rotation by a preset angle, random scaling, random cropping and noise addition.
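The six augmentation operations can be illustrated with a short NumPy sketch. The parameter choices are assumptions made only for the example: the preset rotation angle is a multiple of 90 degrees, scaling uses nearest-neighbour sampling with a factor drawn from [0.5, 1.5], the crop is half the image size, and the noise is Gaussian. The function name `augment` is hypothetical, and bounding box annotations, which would need the same geometric transforms, are not handled.

```python
import numpy as np

def augment(image, rng, quarter_turns=1, noise_std=5.0):
    """Return a dict of augmented copies of an (H, W, C) uint8 image."""
    h, w = image.shape[:2]
    out = {
        "hflip": image[:, ::-1],                   # horizontal flip
        "vflip": image[::-1, :],                   # vertical flip
        "rotate": np.rot90(image, k=quarter_turns),  # preset angle: k * 90 degrees
    }

    # Random scaling by nearest-neighbour index sampling.
    factor = rng.uniform(0.5, 1.5)
    new_h, new_w = max(1, int(h * factor)), max(1, int(w * factor))
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    out["scale"] = image[rows][:, cols]

    # Random crop of half the height and half the width.
    ch, cw = max(1, h // 2), max(1, w // 2)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    out["crop"] = image[top:top + ch, left:left + cw]

    # Additive Gaussian noise, clipped back to the 8-bit range.
    noisy = image.astype(np.float64) + rng.normal(0.0, noise_std, image.shape)
    out["noise"] = np.clip(noisy, 0, 255).astype(np.uint8)
    return out

img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
aug = augment(img, np.random.default_rng(0))
```

Each call yields six variants of one image, so applying it across a small set of underwater images enlarges the training dataset several times over.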

In the above embodiment, training the preset FPN using the training dataset specifically includes:

during training, inputting any image in the training dataset into the RPN in the preset FPN to obtain a first classification loss and a first bounding box regression loss, and inputting the image into the Fast RCNN in the preset FPN to obtain a second classification loss and a second bounding box regression loss;

performing weighted fusion of the first classification loss and the second classification loss, and performing weighted fusion of the first bounding box regression loss and the second bounding box regression loss.

Specifically, the base network of the FPN is Faster RCNN, which has two important components: the RPN (region proposal network) and Fast RCNN, but the two are not closely coupled. The main function of the RPN is to propose candidate target regions; the function of Fast RCNN is to classify the regions proposed by the RPN and refine the candidate boxes. Although they share the convolutional parameters of PVANet, their losses were found to descend in inconsistent directions during training. The reason is that they use the feature pyramid to different degrees: the RPN makes full use of the multi-scale information of the feature pyramid, whereas Fast RCNN uses it only for scale mapping. Joint training adjusts the direction of loss descent, makes full use of the multi-scale information of the feature pyramid, and combines the advantages of both components, which speeds up convergence and improves target detection accuracy.

Let the total loss be L_total, let λ be a user-defined weight, and let L_rpn and L_fast-rcnn be the RPN loss and the Fast RCNN loss, respectively. The formula is as follows:

L_total = L_rpn + λ·L_fast-rcnn (2)
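Formula (2) and the weighted fusion of the two pairs of losses can be related by a small Python sketch. It assumes, beyond what the text states, that each network's loss is the sum of its classification loss and its bounding box regression loss and that the same weight λ is used for both fusions; the function name `joint_loss` and its argument names are hypothetical.

```python
def joint_loss(rpn_cls, rpn_box, rcnn_cls, rcnn_box, lam=1.0):
    """Fuse RPN and Fast RCNN losses with weight lam.

    Returns (fused classification loss, fused box regression loss, total loss).
    """
    cls_loss = rpn_cls + lam * rcnn_cls   # first and second classification losses
    box_loss = rpn_box + lam * rcnn_box   # first and second box regression losses
    # Under the stated assumption this equals L_rpn + lam * L_fast-rcnn.
    return cls_loss, box_loss, cls_loss + box_loss

cls_loss, box_loss, total = joint_loss(0.5, 0.25, 1.0, 0.5, lam=0.5)
```

In a training loop the total would be the quantity that is back-propagated through the shared PVANet parameters.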

During training, the preset FPN is trained using the online hard example mining algorithm.

Specifically, the online hard example mining algorithm is the OHEM algorithm. Underwater image backgrounds are complex and real underwater images are scarce, yet joint training on its own can only train on hard samples together with the simple ones. The OHEM algorithm selects hard samples automatically, so the data are used more effectively and the features of complex environments can be learned further, making training more effective.

OHEM mainly consists of a read-only RoI network (RoINet1) and a hard RoI network (RoINet2), which together select the hard samples:

A. Input all candidate regions of the original image into RoINet1 and compute their classification loss and bounding box loss;

B. Use non-maximum suppression to filter out highly overlapping candidate regions, sort the remaining regions by loss from high to low, and select the top K candidate regions;

C. Input the selected top K candidate regions (which can be understood as the hard samples) into RoINet2, compute the loss of these K candidate regions, and back-propagate the gradients to the convolutional network (i.e. PVANet) to update the entire PVANet-FPN network.
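Step B, the selection of hard samples, can be sketched in NumPy as follows. The sketch assumes that the per-region losses from RoINet1 are already available as an array, that boxes are given as (x1, y1, x2, y2) with positive area, and that non-maximum suppression uses the loss as its ranking score with an IoU threshold of 0.7; the threshold and the function names are assumptions, not values stated in the text. The forward and backward passes of RoINet1 and RoINet2 are not shown.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in (x1, y1, x2, y2) form."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def select_hard_rois(boxes, losses, k, iou_threshold=0.7):
    """Return indices of up to k high-loss regions after loss-ranked NMS."""
    order = np.argsort(-losses)          # highest loss first
    keep = []
    while order.size > 0 and len(keep) < k:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]   # drop highly overlapping regions
    return keep

boxes = np.array([[0, 0, 10, 10], [0, 0, 10, 11],
                  [20, 20, 30, 30], [40, 40, 50, 50]], dtype=np.float64)
losses = np.array([0.9, 0.8, 0.3, 0.5])
hard = select_hard_rois(boxes, losses, k=2)
```

Because the greedy suppression already visits regions in descending order of loss, stopping after K kept regions gives the same result as suppressing first and then taking the top K.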

Fig. 4 is a structural block diagram of an underwater image target detection system provided by an embodiment of the present invention. As shown in Fig. 4, the system includes an image preprocessing module 401 and a target detection module 402, wherein:

The image preprocessing module 401 is configured to acquire an underwater image to be detected and to preprocess the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image. The target detection module 402 is configured to input the first image into a trained preset feature pyramid network (FPN) and output a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

The underwater image target detection system provided by the embodiment of the present invention preprocesses the underwater image to be detected, then uses the improved FPN to perform target detection on the preprocessed image and outputs the target detection result. No manually designed features are needed, the detection process takes little time, and the detection result is highly accurate, so the system is well suited to practical applications such as automatic detection and automatic fishing.

In the above embodiment, the image preprocessing module 401 specifically includes:

a preprocessing submodule, configured to preprocess the underwater image to be detected with the MS-CNN denoising algorithm and with the dark channel prior denoising algorithm separately to obtain a second image and a third image;

a selection submodule, configured to select one of the second image and the third image as the first image using a preset voting mechanism.

In the above embodiment, the voting indices of the preset voting mechanism are the peak signal-to-noise ratio (PSNR) and the image entropy; correspondingly,

the selection submodule specifically includes:

a judging submodule, configured to select the second image as the first image if the sum of the PSNR and the image entropy of the second image is determined to be not less than the sum of the PSNR and the image entropy of the third image, and to select the third image as the first image if the sum of the PSNR and the image entropy of the second image is determined to be less than the sum of the PSNR and the image entropy of the third image.

In the above embodiment, the underwater image target detection system further includes:

a data augmentation module, configured to obtain a training dataset by a data augmentation method;

a training module, configured to train the preset FPN using the training dataset to obtain the trained FPN.

In the above embodiment, the data augmentation module is specifically configured to:

obtain the training dataset by one or more of horizontal flipping, vertical flipping, rotation by a preset angle, random scaling, random cropping and noise addition.

In the above embodiment, the training module is specifically configured to:

during training, input any image in the training dataset into the RPN in the preset FPN to obtain a first classification loss and a first bounding box regression loss, and input the image into the Fast RCNN in the preset FPN to obtain a second classification loss and a second bounding box regression loss;

In the above embodiment, the training module is specifically configured to:

Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in Fig. 5, the electronic device includes a processor 501, a communication interface 502, a memory 503 and a bus 504, wherein the processor 501, the communication interface 502 and the memory 503 communicate with one another via the bus 504. The processor 501 can call logic instructions in the memory 503 to execute a method that includes, for example: acquiring an underwater image to be detected, and preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image; and inputting the first image into a trained preset feature pyramid network (FPN) and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

When the logic instructions in the memory 503 are implemented as software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

An embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the methods provided by the above method embodiments, including, for example: acquiring an underwater image to be detected, and preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image; and inputting the first image into a trained preset feature pyramid network (FPN) and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disc.

The embodiments of the communication device and the like described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

From the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solutions, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in each embodiment or in certain parts of an embodiment.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An underwater image target detection method, characterized by comprising:

acquiring an underwater image to be detected, and preprocessing the underwater image to be detected with the multi-scale convolutional neural network (MS-CNN) denoising algorithm and the dark channel prior denoising algorithm to obtain a first image;

inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

2. The method according to claim 1, characterized in that preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain the first image specifically includes:

preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and with the dark channel prior denoising algorithm separately to obtain a second image and a third image;

selecting one of the second image and the third image as the first image using a preset voting mechanism.

3. The method according to claim 2, characterized in that the voting indices of the preset voting mechanism are the peak signal-to-noise ratio (PSNR) and the image entropy; correspondingly,

selecting one of the second image and the third image as the first image using the preset voting mechanism specifically includes:

if the sum of the PSNR and the image entropy of the second image is determined to be not less than the sum of the PSNR and the image entropy of the third image, selecting the second image as the first image; if the sum of the PSNR and the image entropy of the second image is determined to be less than the sum of the PSNR and the image entropy of the third image, selecting the third image as the first image.

4. The method according to claim 1, characterized in that, before the first image is input into the trained preset feature pyramid network (FPN), the method further includes:

obtaining a training dataset by a data augmentation method;

training the preset FPN using the training dataset to obtain the trained FPN.

5. The method according to claim 4, characterized in that obtaining the training dataset by data augmentation specifically includes:

obtaining the training dataset by one or more of horizontal flipping, vertical flipping, rotation by a preset angle, random scaling, random cropping and noise addition.

6. The method according to claim 4, characterized in that training the preset FPN using the training dataset specifically includes:

during training, inputting any image in the training dataset into the region proposal network (RPN) in the preset FPN to obtain a first classification loss and a first bounding box regression loss, and inputting the image into the Fast RCNN in the preset FPN to obtain a second classification loss and a second bounding box regression loss;

performing weighted fusion of the first classification loss and the second classification loss, and performing weighted fusion of the first bounding box regression loss and the second bounding box regression loss.

7. The method according to claim 4, characterized in that training the preset FPN using the training dataset specifically includes:

8. An underwater image target detection system, characterized by comprising:

an image preprocessing module, configured to acquire an underwater image to be detected and to preprocess the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain a first image;

a target detection module, configured to input the first image into a trained preset feature pyramid network (FPN) and output a target detection result; wherein the convolutional layers of the preset FPN are a PVA network to which a C.ReLU structure is added.

9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a bus, wherein the processor, the communication interface and the memory communicate with one another via the bus, and the processor can call logic instructions in the memory to execute the underwater image target detection method according to any one of claims 1 to 7.

10. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions that cause a computer to execute the underwater image target detection method according to any one of claims 1 to 7.