Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are evidently only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an underwater image target detection method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:
S101: obtaining an underwater image to be detected, and preprocessing the underwater image to be detected using an MS-CNN denoising algorithm and a dark channel prior denoising algorithm to obtain a first image;
S102: inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network (PVANet) to which a C.ReLU structure is added.
In step S101, because imaging conditions under water are poor, underwater images are often distorted and blurred, so the underwater image to be detected needs to be denoised to facilitate subsequent detection. The first image, obtained by preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and the dark channel prior denoising algorithm, can be fed directly into the trained preset FPN as its input.
In step S102, the first image is detected using the trained preset FPN. The preset FPN replaces the backbone convolutional neural network (CNN) of the existing FPN with the lightweight PVANet and adds a C.ReLU structure. The filters learned by the first few layers of a CNN are negatively correlated, that is, low-level convolution kernels occur in pairs whose parameters are opposite in sign. The C.ReLU structure therefore computes only half of the original number of output feature maps by convolution, obtains the other half directly by negation, and then concatenates the two parts, which reduces the number of convolution kernels and improves running efficiency. The embodiment of the present invention adds C.ReLU to the shallow convolutional layers of the FPN, which reduces the number of network parameters and shortens the running time without reducing accuracy.
Specifically, Fig. 2 is a schematic diagram of the C.ReLU structure, in which Convolution denotes the convolution operation; Negation denotes taking the opposite number; Concatenation denotes concatenating the two parts of the feature maps; Scale/Shift denotes scaling and shifting; ReLU denotes activation by the ReLU function; and Shortcut connection denotes a residual connection.
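The Negation, Concatenation, Scale/Shift and ReLU stages of Fig. 2 can be illustrated with a minimal NumPy sketch. It assumes the convolution has already produced the halved set of feature maps, and it leaves out the shortcut connection; the function name `crelu` and the parameter names `scale` and `shift` are hypothetical and are not taken from the embodiment.

```python
import numpy as np

def crelu(features, scale, shift):
    """Sketch of a C.ReLU-style block (shortcut connection omitted).

    features: array of shape (C, H, W), output of the halved convolution.
    scale, shift: arrays of shape (2*C,), per-channel Scale/Shift parameters.
    Returns an array of shape (2*C, H, W).
    """
    # Negation + Concatenation: reuse the computed maps with opposite sign.
    doubled = np.concatenate([features, -features], axis=0)
    # Scale/Shift: per-channel affine transform so the two halves can differ.
    affine = doubled * scale[:, None, None] + shift[:, None, None]
    # ReLU activation.
    return np.maximum(affine, 0.0)

# Two input channels, each a 1x2 feature map.
x = np.array([[[1.0, -2.0]], [[-3.0, 4.0]]])
y = crelu(x, scale=np.ones(4), shift=np.zeros(4))
print(y.shape)
```

Only C feature maps are produced by convolution, while 2*C maps reach the next layer, which is where the saving in convolution kernels comes from.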
The feature pyramid structure, the most important part of the existing FPN, is modified accordingly. The outputs of conv3_4, conv4_4 and conv5_4 are denoted {C3, C4, C5} and represent convolutional feature maps at different scales (the deeper the convolutional layer, the smaller the scale of the feature map and the stronger its semantic features). Through the lateral connections and the top-down pathway, a feature map pyramid {P3, P4, P5} with strong semantic features is obtained, so that the high-resolution low-level feature maps also acquire strong semantic features, which improves the detection accuracy for small-scale targets. Fusing the features of different convolutional layers further improves detection accuracy.
The structure of the preset FPN is shown in Fig. 3, in which the dashed box in the Feature Pyramid Structure part shows the details of the lateral connection and the top-down pathway. For example, C4 is passed through a 1x1 convolution and combined with P5 after 2x upsampling, and P4 is then obtained by a 3x3 convolution.
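The computation of P4 from C4 and P5 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the two branches are assumed to be combined by element-wise addition, as in the standard FPN, the upsampling is assumed to be nearest-neighbour, and all function and weight names are hypothetical.

```python
import numpy as np

def conv1x1(x, weight):
    # x: (C_in, H, W); weight: (C_out, C_in). Mixes channels per pixel.
    return np.einsum("oc,chw->ohw", weight, x)

def upsample2x(x):
    # Nearest-neighbour 2x upsampling (assumed interpolation mode).
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weight):
    # x: (C_in, H, W); weight: (C_out, C_in, 3, 3); zero padding keeps H, W.
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], window)
    return out

def merge_level(c_fine, p_coarse, w_lateral, w_smooth):
    lateral = conv1x1(c_fine, w_lateral)   # lateral connection on C4
    top_down = upsample2x(p_coarse)        # top-down path from P5
    # Element-wise addition is an assumption (standard FPN behaviour).
    return conv3x3(lateral + top_down, w_smooth)

rng = np.random.default_rng(0)
c4 = rng.standard_normal((8, 4, 4))   # finer, more channels
p5 = rng.standard_normal((4, 2, 2))   # coarser pyramid level
w_lat = rng.standard_normal((4, 8))
# Identity 3x3 kernel, used here only to make the result easy to inspect.
w_id = np.zeros((4, 4, 3, 3))
w_id[np.arange(4), np.arange(4), 1, 1] = 1.0
p4 = merge_level(c4, p5, w_lat, w_id)
print(p4.shape)
```

P3 would be obtained in the same way from C3 and P4.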
In the underwater image target detection method provided by the embodiment of the present invention, the underwater image to be detected is preprocessed, target detection is then performed on the preprocessed image using the improved FPN, and a target detection result is output. No hand-crafted features are required, the detection process takes little time, and the detection results are highly accurate, so the method is well suited to practical applications such as automatic detection and automatic fishing.
In the above embodiment, preprocessing the underwater image to be detected using the MS-CNN denoising algorithm and the dark channel prior denoising algorithm to obtain the first image specifically includes:
preprocessing the underwater image to be detected with the MS-CNN denoising algorithm and with the dark channel prior denoising algorithm separately, to obtain a second image and a third image, respectively;
selecting one of the second image and the third image as the first image using a preset voting mechanism.
Specifically, the preprocessing of the image to be detected can be understood as fusing the MS-CNN denoising algorithm and the dark channel prior denoising algorithm through a voting mechanism. Fusion brings out the strengths of multiple denoising algorithms: the same image sample is processed by different denoising algorithms, and a single denoising result is produced after optimized combination. This result is often more reliable than the result produced by a single denoising algorithm.
In the above embodiment, the voting indices of the preset voting mechanism are the peak signal-to-noise ratio (PSNR) and the image entropy. Correspondingly, selecting one of the second image and the third image as the first image using the preset voting mechanism specifically includes:
if it is determined that the sum of the PSNR and the image entropy of the second image is not less than the sum of the PSNR and the image entropy of the third image, selecting the second image as the first image; if it is determined that the sum of the PSNR and the image entropy of the second image is less than the sum of the PSNR and the image entropy of the third image, selecting the third image as the first image.
Specifically, the voting indices used in the embodiment of the present invention are the PSNR (peak signal-to-noise ratio, denoted p) and the image entropy (denoted e). Let the fusion result be S; it is calculated as follows:
S = \begin{cases} S_m, & w_p p_m + w_e e_m \ge w_p p_d + w_e e_d \\ S_d, & \text{otherwise} \end{cases} \qquad (1)
In formula (1), S_m and S_d are the images denoised by the MS-CNN algorithm and the dark channel prior algorithm, respectively; w_p and w_e are the weights of the PSNR and the image entropy, respectively; p_m and e_m denote the PSNR and the image entropy of the MS-CNN denoised image; and, similarly, p_d and e_d denote the PSNR and the image entropy of the dark channel prior denoised image. With w_p = w_e = 1, formula (1) reduces to the unweighted comparison of sums described above. PSNR is the most widely used objective measure of image quality: the larger the PSNR value, the less the image is distorted. Image entropy measures the richness of image information from the viewpoint of information theory: in general, the larger the image entropy, the more information the image carries per unit area and the better the image quality.
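The vote between the two denoised images can be sketched as follows. The sketch rests on assumptions that the embodiment does not state: PSNR needs a reference image, and the original underwater image is used as that reference here; the images are taken to be 8-bit grayscale arrays; and the function names `psnr`, `image_entropy` and `vote` are hypothetical.

```python
import numpy as np

def psnr(reference, image, peak=255.0):
    # Peak signal-to-noise ratio in dB against an assumed reference image.
    diff = reference.astype(np.float64) - image.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def image_entropy(image):
    # Shannon entropy (bits) of the 8-bit grey-level histogram.
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist[hist > 0] / hist.sum()
    return float(-np.sum(prob * np.log2(prob)))

def vote(reference, s_m, s_d, w_p=1.0, w_e=1.0):
    # Weighted PSNR + entropy score for each candidate; ties go to s_m,
    # matching the "not less than" rule in the text.
    score_m = w_p * psnr(reference, s_m) + w_e * image_entropy(s_m)
    score_d = w_p * psnr(reference, s_d) + w_e * image_entropy(s_d)
    return s_m if score_m >= score_d else s_d

# Toy data: s_m is close to the reference, s_d is far from it.
reference = (np.arange(16, dtype=np.uint8) * 10).reshape(4, 4)
s_m = (reference + 1).astype(np.uint8)    # stands in for the MS-CNN result
s_d = (reference + 20).astype(np.uint8)   # stands in for the dark channel result
chosen = vote(reference, s_m, s_d)
print(chosen is s_m)
```

Because PSNR is measured in decibels and entropy in bits, the two weights also act as a way of balancing terms that are on different scales.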
In the above embodiment, before the first image is input into the trained preset feature pyramid network (FPN), the method further includes:
obtaining a training data set by a data augmentation method;
training the preset FPN with the training data set to obtain the trained FPN.
On the one hand, deep learning requires far more training images than conventional methods, and underwater images are not easy to obtain; data augmentation effectively enlarges the data set and reduces the effect of overfitting. On the other hand, a large data set can effectively improve the rotation invariance and scale invariance of the algorithm and the diversity of the data, thereby improving target detection accuracy.
Specifically, before the images in the training data set are input into the preset FPN, they also need to be preprocessed.
In the above embodiment, obtaining the training data set by data augmentation specifically includes:
obtaining the training data set by one or more of horizontal flipping, vertical flipping, rotation by a preset angle, random scaling, random cropping and noise addition.
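Several of the listed augmentation operations can be sketched with NumPy alone. The sketch makes simplifying assumptions: the preset angle is a multiple of 90 degrees, the added noise is Gaussian, and random scaling is left out because it requires interpolation. The function name `augment` and its parameters are hypothetical.

```python
import numpy as np

def augment(image, rng, angle_quarters=1, crop_size=None, noise_std=5.0):
    """Return a dict of augmented copies of an (H, W, C) uint8 image."""
    h, w = image.shape[:2]
    out = {
        "hflip": image[:, ::-1],    # horizontal flip
        "vflip": image[::-1, :],    # vertical flip
        # Rotation by a preset angle, restricted here to k * 90 degrees.
        "rotated": np.rot90(image, k=angle_quarters, axes=(0, 1)),
    }
    # Random crop: default to half the height and half the width.
    ch, cw = crop_size or (h // 2, w // 2)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    out["crop"] = image[top:top + ch, left:left + cw]
    # Additive Gaussian noise (assumed noise model), clipped to 8 bits.
    noisy = image.astype(np.float64) + rng.normal(0.0, noise_std, image.shape)
    out["noisy"] = np.clip(noisy, 0, 255).astype(np.uint8)
    return out

img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
aug = augment(img, np.random.default_rng(0))
print({name: a.shape for name, a in aug.items()})
```

For a detection data set, the bounding box annotations have to be transformed together with each image (flipped, rotated or shifted by the crop offset); that bookkeeping is not shown here.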
In the above embodiment, training the preset FPN with the training data set specifically includes:
during training, inputting any image in the training data set into the RPN in the preset FPN to obtain a first classification loss and a first bounding box regression loss, and inputting the image into the Fast RCNN in the preset FPN to obtain a second classification loss and a second bounding box regression loss;
performing weighted fusion of the first classification loss and the second classification loss, and performing weighted fusion of the first bounding box regression loss and the second bounding box regression loss.
Specifically, the base network of the FPN is Faster RCNN, which has two important components: the RPN (region proposal network) and Fast RCNN. The two are not closely coupled, however. The main function of the RPN is to propose candidate target regions, and the function of Fast RCNN is to classify the proposals of the RPN and refine the candidate boxes. Although they share the convolution parameters of PVANet, their losses are found to descend in inconsistent directions during training. The reason is that they use the feature pyramid to different degrees: the RPN makes full use of the multi-scale information of the feature pyramid, whereas Fast RCNN uses it only for scale mapping. With joint training, the direction of loss descent can be adjusted, so that the multi-scale information of the feature pyramid is fully used and the advantages of both parts are combined, which speeds up convergence and improves the accuracy of target detection.
Let the total loss be L_{total}, let λ be a user-defined weight, and let L_{rpn} and L_{fast-rcnn} be the RPN loss and the Fast RCNN loss, respectively. The calculation formula is as follows:
L_{total} = L_{rpn} + λ L_{fast-rcnn}    (2)
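Formula (2) and the per-term weighted fusion can be tied together with a small sketch. It assumes that each network's loss is the sum of its classification loss and its bounding box regression loss, and that the same weight λ is used for both fusions; neither assumption is stated explicitly in the embodiment, and the function name `joint_loss` is hypothetical.

```python
def joint_loss(rpn_cls, rpn_box, rcnn_cls, rcnn_box, lam=1.0):
    """Fuse RPN and Fast RCNN losses with a single weight lam.

    Returns (total, fused classification loss, fused box regression loss).
    """
    # Weighted fusion of the first and second classification losses.
    cls_loss = rpn_cls + lam * rcnn_cls
    # Weighted fusion of the first and second bounding box regression losses.
    box_loss = rpn_box + lam * rcnn_box
    # Equals (rpn_cls + rpn_box) + lam * (rcnn_cls + rcnn_box), i.e. formula (2)
    # under the assumption that each network loss is cls + box.
    return cls_loss + box_loss, cls_loss, box_loss

total, cls_loss, box_loss = joint_loss(0.5, 0.25, 1.0, 0.5, lam=2.0)
print(total, cls_loss, box_loss)
```

Back-propagating the single total loss through the shared PVANet layers is what lets one weight λ trade off the two descent directions.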
In the above embodiment, training the preset FPN with the training data set specifically includes:
during training, training the preset FPN using an online hard example mining algorithm.
Specifically, the online hard example mining algorithm is the OHEM algorithm. Because underwater backgrounds are complex and real underwater images are scarce, joint training by itself can only proceed with the easy samples mixed in. The OHEM algorithm automatically selects the hard samples, thereby making better use of the data and allowing the features of complex environments to be learned further, so that training is more effective.
OHEM mainly uses a read-only RoI module (RoINet1) and a hard RoI module (RoINet2) to select hard samples:
A. All candidate regions of the original image are input into RoINet1, and their classification losses and bounding box losses are calculated;
B. Highly overlapping candidate regions are filtered out using non-maximum suppression, the remaining regions are sorted by loss from high to low, and the top K candidate regions are selected;
C. The selected top K candidate regions (which can be understood as hard samples) are input into RoINet2, the losses of the K candidate regions are calculated, and the gradients are back-propagated to the convolutional network (i.e., PVANet) to update the entire PVANet-FPN network.
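Step B, the selection of hard samples, can be sketched as follows, assuming the per-region losses from RoINet1 are already available. Non-maximum suppression is run with the loss as the ranking score and stops once K regions are kept. The IoU threshold of 0.7 and the names `iou` and `select_hard_rois` are assumptions made for illustration; the forward and backward passes through RoINet2 are not shown.

```python
import numpy as np

def iou(box, boxes):
    # Boxes are (x1, y1, x2, y2) with positive area; returns IoU of one box vs many.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def select_hard_rois(boxes, losses, k, iou_threshold=0.7):
    """Return indices of up to k high-loss regions after loss-ranked NMS."""
    order = np.argsort(-losses)   # highest loss first
    keep = []
    while order.size > 0 and len(keep) < k:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop lower-loss regions that overlap the kept one too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

boxes = np.array([
    [0.0, 0.0, 10.0, 10.0],
    [0.0, 0.0, 10.0, 11.0],    # nearly identical to the first box
    [20.0, 20.0, 30.0, 30.0],
    [50.0, 50.0, 60.0, 60.0],
])
losses = np.array([0.9, 0.8, 0.3, 0.5])
hard = select_hard_rois(boxes, losses, k=2)
print(hard)
```

Suppressing overlapping regions before taking the top K keeps near-duplicate regions from being counted as several separate hard samples.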
Fig. 4 is a structural block diagram of an underwater image target detection system provided by an embodiment of the present invention. As shown in Fig. 4, the system includes an image preprocessing module 401 and a target detection module 402, wherein:
the image preprocessing module 401 is configured to obtain an underwater image to be detected, and to preprocess the underwater image to be detected using an MS-CNN denoising algorithm and a dark channel prior denoising algorithm to obtain a first image. The target detection module 402 is configured to input the first image into a trained preset feature pyramid network (FPN) and output a target detection result; wherein the convolutional layers of the preset FPN are a PVA network (PVANet) to which a C.ReLU structure is added.
In the underwater image target detection system provided by the embodiment of the present invention, the underwater image to be detected is preprocessed, target detection is then performed on the preprocessed image using the improved FPN, and a target detection result is output. No hand-crafted features are required, the detection process takes little time, and the detection results are highly accurate, so the system is well suited to practical applications such as automatic detection and automatic fishing.
In the above embodiment, the image preprocessing module 401 specifically includes:
a preprocessing submodule, configured to preprocess the underwater image to be detected with the MS-CNN denoising algorithm and with the dark channel prior denoising algorithm separately, to obtain a second image and a third image, respectively;
a selection submodule, configured to select one of the second image and the third image as the first image using a preset voting mechanism.
In the above embodiment, the voting indices of the preset voting mechanism are the peak signal-to-noise ratio (PSNR) and the image entropy. Correspondingly, the selection submodule specifically includes:
a judging submodule, configured to select the second image as the first image if it is determined that the sum of the PSNR and the image entropy of the second image is not less than the sum of the PSNR and the image entropy of the third image, and to select the third image as the first image if it is determined that the sum of the PSNR and the image entropy of the second image is less than the sum of the PSNR and the image entropy of the third image.
In the above embodiment, the underwater image target detection system further includes:
a data augmentation module, configured to obtain a training data set by a data augmentation method;
a training module, configured to train the preset FPN with the training data set to obtain the trained FPN.
In the above embodiment, the data augmentation module is specifically configured to:
obtain the training data set by one or more of horizontal flipping, vertical flipping, rotation by a preset angle, random scaling, random cropping and noise addition.
In the above embodiment, the training module is specifically configured to:
during training, input any image in the training data set into the RPN in the preset FPN to obtain a first classification loss and a first bounding box regression loss, and input the image into the Fast RCNN in the preset FPN to obtain a second classification loss and a second bounding box regression loss;
perform weighted fusion of the first classification loss and the second classification loss, and perform weighted fusion of the first bounding box regression loss and the second bounding box regression loss.
In the above embodiment, the training module is specifically configured to:
during training, train the preset FPN using an online hard example mining algorithm.
Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in Fig. 5, the electronic device includes a processor 501, a communications interface 502, a memory 503 and a bus 504, wherein the processor 501, the communications interface 502 and the memory 503 communicate with one another through the bus 504. The processor 501 can call the logic instructions in the memory 503 to execute the following method, for example: obtaining an underwater image to be detected, and preprocessing the underwater image to be detected using an MS-CNN denoising algorithm and a dark channel prior denoising algorithm to obtain a first image; inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network (PVANet) to which a C.ReLU structure is added.
When the logic instructions in the memory 503 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present invention provides a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions causing a computer to execute the methods provided by the above method embodiments, for example: obtaining an underwater image to be detected, and preprocessing the underwater image to be detected using an MS-CNN denoising algorithm and a dark channel prior denoising algorithm to obtain a first image; inputting the first image into a trained preset feature pyramid network (FPN), and outputting a target detection result; wherein the convolutional layers of the preset FPN are a PVA network (PVANet) to which a C.ReLU structure is added.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment may be implemented by software together with a necessary general-purpose hardware platform, and of course may also be implemented by hardware. Based on this understanding, the above technical solutions in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.