
CN115937613B - A plant temporal image contrast learning method embedding prior distance - Google Patents

A plant temporal image contrast learning method embedding prior distance

Info

Publication number
CN115937613B
CN115937613B (application CN202310033871A)
Authority
CN
China
Prior art keywords
distance
image
plant
image pairs
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310033871.2A
Other languages
Chinese (zh)
Other versions
CN115937613A (en)
Inventor
胡玲艳
许巍
郭睿雅
汪祖民
谷毛毛
陈鹏宇
徐国辉
郭占俊
李国强
秦山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University filed Critical Dalian University
Priority to CN202310033871.2A priority Critical patent/CN115937613B/en
Publication of CN115937613A publication Critical patent/CN115937613A/en
Application granted granted Critical
Publication of CN115937613B publication Critical patent/CN115937613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a plant time-series image contrast learning method that embeds prior distances. The method reads plant time-series images, obtains phenological period information, generates four types of image pairs, and records the corresponding prior distance owned by each type of image pair. An image pair x and y is input into a contrast model, and data enhancement is applied to obtain images v1 and v2, from which feature vectors h1 and h2 are extracted in an encoder. After the feature vectors h1 and h2 are extracted, a small neural network projection head maps the representations to the space where the contrast loss is applied, and the prior distances of the different types of image pairs are fused with the actual distances of the corresponding vectors z1 and z2 through a graded distance and a classification distance, so as to obtain the contrast loss used for further training. The invention uses contrast learning to obtain pre-training weights dedicated to crops, so that the self-supervised contrast learning method can be effectively applied to the pre-training of plant time-series images.

Description

Plant time sequence image contrast learning method embedded with priori distance
Technical Field
The invention relates to the technical field of plant time sequence image processing of deep learning, in particular to a plant time sequence image contrast learning method embedded with prior distance.
Background
Plants often exhibit different characteristics and traits during growth due to genetic differences and environmental influences. The measurable appearance and inherent physiological and biochemical characteristics of an organism, such as shape, structure, size and color, as determined by genotype and environment, are referred to as the plant phenotype. Understanding the phenotypic characteristics and traits of plants is an important proposition for biological research; without exhaustive phenotypic data, the complex effects of genomic and environmental factors on plant phenotypes cannot be deeply understood. Traditional plant phenotype analysis mainly measures various parameters manually and suffers from small analysis scale, low efficiency and large error.
The general research procedure for plant time-series images mainly comprises two stages: information extraction and time-series modeling. In the information extraction stage, digital image processing methods, in particular deep learning methods such as image classification, object detection and semantic segmentation, are generally adopted to extract phenotype data from single images. In the time-series modeling stage, information is accumulated along the time dimension, data from different growth stages are fused, a specific model is built, and joint analysis can be carried out with other external factors.
In the acquisition of plant time-series images, the imaging equipment usually photographs the plants in a specific area at fixed times, so plant time-series images are convenient to acquire. However, training a deep learning model for image information extraction often requires a large number of labeled images, and because plant images contain many fine details, such as the edges of flowers and leaves, manually labeling the data set is costly. This requires that the model obtain good training results with less manually labeled data.
Given this situation, contrast learning, which pre-trains on unlabeled data, is one way to achieve label-efficient training; examples include SimCLR, MoCo and SimSiam. These models learn representations by maximizing the consistency between different enhanced views of the same data instance, and transfer well to downstream tasks. Research applications on plant and crop images have also begun, such as plant phenotype segmentation, plant remote sensing, pest monitoring and seed classification.
However, the number of contrast learning studies in the plant field is far smaller than in other fields, and no study of self-supervised contrast learning on plant time-series images has been reported. Plants grow slowly, so the images of a sequence change little over a period of time and are highly similar; organs such as flowers, leaves and trunks occupy a large proportion of each image, and the semantic information is simple. For a traditional contrast learning model, although research images are easy to acquire, the labeling cost is high, and because of the semantic similarity between different images, during contrast training the model has difficulty judging whether two views are positive samples obtained from the same image by different data enhancements, or negative samples taken from different but similar images.
Disclosure of Invention
The invention aims to provide a plant time sequence image contrast learning method embedded with a priori distance, which can be effectively applied to the pre-training of plant time sequence images and has wide application prospects in the research of various computer vision plant phenotypes.
In order to achieve the above object, the present application provides a plant time sequence image contrast learning method embedded with a priori distance, comprising:
reading the plant time-series images to obtain phenological period information, and generating four types of image pairs, namely image pairs of the same sequence in the same period, the same sequence in different periods, different sequences in the same period, and different sequences in different periods;
recording the corresponding prior distance owned by each type of image pair;
After an image pair x and y is input into the contrast model, data enhancement is carried out. The purpose is to weaken the influence of color on model training, particularly the green of large areas of leaves, so that the model can attend to higher-level semantic information beyond color. For the same image, different views are generated in the training of different epochs, so that the model can better utilize similar images. The data-enhanced images are v1 and v2.
The images v1 and v2 are passed through an encoder to extract feature vectors h1 and h2; the encoder can be chosen freely according to the downstream task.
After extracting the feature vectors h1 and h2, the invention uses a small neural network projection head to map the representations to the space where the contrast loss is applied: a 2-layer MLP with ReLU and BN layers projects the feature vectors h1 and h2 to 256-dimensional vectors z1 and z2. The projection head does not participate in downstream task training.
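As a concrete illustration of the projection step, the following sketch implements a 2-layer MLP projection head in plain Python. It is a minimal sketch only: the BN layers are omitted for brevity, the weight shapes are hypothetical placeholders, and the real head projects to 256 dimensions.

```python
def linear(x, W, b):
    # W has one row per output unit; returns W.x + b
    return [sum(w * xi for w, xi in zip(row, x)) + bb for row, bb in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def projection_head(h, W1, b1, W2, b2):
    # z = W2 . ReLU(W1 . h + b1) + b2 -- a 2-layer MLP; BatchNorm omitted in this sketch
    return linear(relu(linear(h, W1, b1)), W2, b2)
```

Here h stands for the encoder feature vector and z for its projection used only by the contrast loss; as the text notes, the head is discarded for downstream tasks.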
In the stage of calculating the contrast loss, the invention provides two modes, a graded distance and a classification distance, which fuse the prior distances of the different types of image pairs with the actual distances of the corresponding vectors z1 and z2 to obtain the contrast loss.
Compared with the prior art, the advantage of this technical scheme is that important domain knowledge is converted into the prior distance of an image pair and used in contrast learning pre-training. The self-supervised contrast learning method can thus be effectively applied to the pre-training of plant time-series images, and has wide application prospects in various computer vision studies of plant phenotypes.
Drawings
FIG. 1 is a flow chart of a plant time sequence image contrast learning method embedded with a priori distance;
FIG. 2 is a schematic illustration of phenological period division and image extraction;
FIG. 3 is a schematic diagram of the distance metric and contrast loss acquisition process;
FIG. 4 is a schematic diagram of the pre-training and migration process.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the application, i.e., the embodiments described are merely some, but not all, of the embodiments of the application.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
Example 1
As shown in FIG. 1, the application provides a plant time sequence image contrast learning method embedded with a priori distance, which specifically comprises the following steps:
Phenological period information is acquired from the plant images. For the n image sequences of a certain plant, 3-5 example sequences are randomly extracted. For each example sequence, budding is set as the reference time day0, and the start time and duration of the different phenological periods can be obtained by manually interpreting the image sequence. Each time node of the example sequences is then averaged. Since a particular plant has a relatively fixed annual growth period, this average can be taken as approximately representing the phenological periods of all n image sequences.
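The averaging of the manually interpreted time nodes can be sketched as follows; the stage names and day offsets used here are illustrative, with day0 as the budding reference as in the text.

```python
def average_phenology(example_sequences):
    """example_sequences: list of dicts mapping a phenological stage name to its
    start day counted from budding (day0). Returns the per-stage average, taken
    as an approximation for all n image sequences of the plant."""
    stages = example_sequences[0].keys()
    n = len(example_sequences)
    return {stage: sum(seq[stage] for seq in example_sequences) / n for stage in stages}
```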
The required images are then extracted from the images whose phenological periods have been divided. The phenological period change is a gradual process, and at the end of one period and the beginning of the next, the corresponding images may contain similar semantic information. In order to extract images of different phenological periods automatically and accurately, and to maximize the semantic difference between images of different periods, images of adjacent periods near the boundary are discarded, and only images far from the time critical point are selected and recorded for classification.
The images are paired pairwise, i.e., the Cartesian product of the image set with itself is taken. According to the sequences and phenological periods of the two images, each obtained image pair can be recorded as one of four types: same sequence and same period, same sequence and different periods, different sequences and same period, and different sequences and different periods; the type is stored as a label by one-hot coding.
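The pairing and one-hot labelling step can be sketched as follows, assuming each image is identified by a sequence index and a phenological period index; the category ordering is an assumption of this sketch, not fixed by the text.

```python
from itertools import product

def pair_category(seq_a, period_a, seq_b, period_b):
    # 0: same sequence, same period      1: same sequence, different periods
    # 2: different sequences, same period  3: different sequences, different periods
    if seq_a == seq_b:
        return 0 if period_a == period_b else 1
    return 2 if period_a == period_b else 3

def one_hot(category, num_classes=4):
    label = [0.0] * num_classes
    label[category] = 1.0
    return label

def make_pairs(images):
    # images: list of (image_id, sequence_index, period_index);
    # Cartesian product of the image set with itself, each pair with its one-hot label
    return [((a[0], b[0]), one_hot(pair_category(a[1], a[2], b[1], b[2])))
            for a, b in product(images, images)]
```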
The contrast loss is calculated according to the prior distances of the different types of image pairs, and two modes are provided: a graded distance and a classification distance. The graded distance is shown in fig. 3(a); specifically:
For a class-l image pair, a distance coefficient k_l is first defined to characterize the relative distance between image pairs of that class. For any image pair x and y, its prior distance p_xy can be obtained as:
p_xy = k_1*a_1 + k_2*a_2 + k_3*a_3 + k_4*a_4
where (a_1, a_2, a_3, a_4) is the original one-hot label of the image pair. After this processing, the prior distance p_xy of any image pair x and y is marked as a number in [0, 1], and p_xy can be regarded as the probability that the two images of the pair carry the same meaning.
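A minimal sketch of the prior distance, assuming the one-hot label picks out one of four distance coefficients k_l; the coefficient values below are purely illustrative placeholders, since the patent does not fix them.

```python
def prior_distance(label, coeffs=(1.0, 0.7, 0.4, 0.1)):
    # p_xy = k1*a1 + k2*a2 + k3*a3 + k4*a4 for a one-hot label (a1..a4);
    # the k_l values here are hypothetical numbers in [0, 1]
    return sum(k * a for k, a in zip(coeffs, label))
```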
For the vectors z1 and z2 of an image pair x and y, the loss value loss is calculated as:
loss = -[ p_xy * log σ(Sim(z1, z2)/τ) + (1 - p_xy) * log(1 - σ(Sim(z1, z2)/τ)) ]
where Sim(z1, z2) is the similarity between z1 and z2 and σ is the Sigmoid function. The expression can be read as taking the similarity of z1 and z2, passing it through a Sigmoid, and computing the cross entropy with the probability represented by p_xy. τ is a temperature coefficient used to adjust the distribution of the similarity. The similarity may be measured with the negative cosine distance (negative cosine similarity) or the Euclidean distance. When the negative cosine distance is used:
Sim(z1, z2) = -(z1 · z2) / (||z1|| * ||z2||)
The Euclidean distance is the l2 norm:
Sim(z1, z2) = Euc(z1, z2) = ||z1 - z2||_2
For each mini-batch, if the batch size is n, the contrast loss is the average of the loss values over the n image pairs in the batch.
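A runnable sketch of the graded-distance loss described above, under the assumptions that Sim is the negative cosine similarity, that σ(Sim/τ) is compared with p_xy by binary cross entropy, and that the mini-batch loss is the mean over its pairs; the exact sign and averaging conventions are not fully specified in the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neg_cosine(z1, z2):
    # negative cosine similarity: -1 for identical directions, +1 for opposite
    dot = sum(a * b for a, b in zip(z1, z2))
    n1 = math.sqrt(sum(a * a for a in z1))
    n2 = math.sqrt(sum(b * b for b in z2))
    return -dot / (n1 * n2)

def pair_loss(z1, z2, p_xy, tau=0.5):
    # cross entropy between sigmoid(Sim(z1, z2)/tau) and the prior distance p_xy
    q = sigmoid(neg_cosine(z1, z2) / tau)
    eps = 1e-12  # numerical safety for log
    return -(p_xy * math.log(q + eps) + (1.0 - p_xy) * math.log(1.0 - q + eps))

def batch_loss(pairs, tau=0.5):
    # pairs: list of (z1, z2, p_xy) tuples for one mini-batch
    return sum(pair_loss(z1, z2, p, tau) for z1, z2, p in pairs) / len(pairs)
```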
The classification distance is shown in fig. 3(b). Specifically, the distance information of the contrast model is implicitly mapped onto a fully connected layer, and the loss is calculated directly against the classification of the different image pairs.
In calculating the Euclidean distance of z1 and z2, it can be further rewritten as:
Euc(z1, z2) = ||z1 - z2||_2 = sqrt( sum_i (z1_i - z2_i)^2 )
The Euclidean distance calculation yields a single number, which is not convenient to connect to a fully connected layer for classification. When the classification distance computes the contrast loss, z1 and z2 are instead subtracted element-wise, namely:
e = z1 - z2
e is a vector of the same dimension as z1 and z2, and the process of computing e is the core step of computing the Euclidean distance of z1 and z2, so e contains the prior distance information of z1 and z2. e is linearly projected to a fully connected layer t of 4 nodes, and the output o is obtained through softmax processing, namely:
t=eW
o=Softmax(t)
Cross entropy is used to calculate the error between the category information in o and the prior distance information in the image pair's label category, namely:
loss = - sum_c a_c * log(o_c)
where (a_1, ..., a_4) is the one-hot label of the image pair.
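The classification-distance branch can be sketched in plain Python as follows; the weight matrix W is a hypothetical stand-in for the learned fully connected layer.

```python
import math

def class_distance_forward(z1, z2, W):
    # e = z1 - z2; t = eW with W of shape dim x 4; o = Softmax(t)
    e = [a - b for a, b in zip(z1, z2)]
    t = [sum(e[i] * W[i][j] for i in range(len(e))) for j in range(4)]
    m = max(t)                      # numerically stabilized softmax
    exps = [math.exp(x - m) for x in t]
    s = sum(exps)
    return [x / s for x in exps]

def cross_entropy(o, label):
    # error between the predicted distribution o and the one-hot pair label
    return -sum(a * math.log(p + 1e-12) for a, p in zip(label, o))
```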
After the contrast learning pre-training is finished, the obtained weights can be transferred to various downstream supervised tasks. The invention carries out supervised training with a small number of labeled plant images on the basis of the already-trained weights. The trained semantic segmentation network is taken as the information extraction model, semantic segmentation results are recorded along the time dimension, and a plant growth model is established.
As shown in fig. 4, a U-Net semantic segmentation network is used as the information extraction model to segment the regions of stems, flowers, leaves, fruits, background and the like in a plant image at the pixel level. The U-Net network structure is a classical Encoder-Decoder structure, and various backbones can be adopted as the Encoder. The Encoder network yields 5 preliminary effective feature layers. The Decoder network uses deconvolution to up-sample the 5 feature layers and performs feature fusion to obtain an effective feature layer fusing all features. Classifying each feature point of the resulting feature layer yields the semantic segmentation result.
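To make the shape bookkeeping concrete: assuming each Encoder stage halves the spatial resolution — a common U-Net convention, not stated explicitly in the text — the sizes of the 5 feature layers for a square input can be sketched as:

```python
def encoder_feature_sizes(input_size, num_layers=5):
    # hypothetical: each encoder stage halves the spatial resolution
    sizes = []
    s = input_size
    for _ in range(num_layers):
        s //= 2
        sizes.append(s)
    return sizes
```

The Decoder then up-samples back through these scales, fusing each with the matching Encoder layer.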
Kalman filtering is adopted to reduce the influence of ambient light, climate, production activities and the like, remove noise and interference from the growth model system, and obtain more stable observation data.
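A minimal 1-D Kalman filter sketch for smoothing a growth observation series; the noise parameters q and r and the static-state model are illustrative assumptions, since the patent does not specify the filter design.

```python
def kalman_1d(observations, q=1e-3, r=1e-1):
    """Smooth a 1-D growth signal (e.g. a per-day segmented leaf-area ratio).
    q: assumed process noise, r: assumed measurement noise."""
    x, p = observations[0], 1.0   # initial state estimate and covariance
    smoothed = []
    for z in observations:
        p = p + q                 # predict step (static growth model)
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # update with the new observation
        p = (1.0 - k) * p
        smoothed.append(x)
    return smoothed
```

Each smoothed value blends the prediction with the new observation, so isolated noisy measurements are pulled toward the running estimate.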
The foregoing descriptions of specific exemplary embodiments of the present invention are presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application to thereby enable one skilled in the art to make and utilize the invention in various exemplary embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (4)

1. A plant time-series image contrast learning method embedding a prior distance, characterized by comprising:
reading plant time-series images, obtaining phenological period information, and generating four types of image pairs, namely image pairs of the same sequence in the same period, the same sequence in different periods, different sequences in the same period, and different sequences in different periods;
recording the corresponding prior distance owned by each type of image pair;
inputting an image pair x and y into a contrast model and performing data enhancement to obtain images v1 and v2, wherein for the same image, different views are generated in the training of different epochs;
extracting feature vectors h1 and h2 from the images v1 and v2 in an encoder;
after extracting the feature vectors h1 and h2, using a neural network projection head to map the representations to the space where the contrast loss is applied, obtaining vectors z1 and z2;
fusing the prior distances of the different types of image pairs with the actual distances of the corresponding vectors z1 and z2 through two modes, a graded distance and a classification distance, so as to obtain the contrast loss;
wherein the graded distance is specifically: first defining a distance coefficient k_l to characterize the relative distance between image pairs of a class; for an image pair x and y, obtaining its distance p_xy:
p_xy = k_1*a_1 + k_2*a_2 + k_3*a_3 + k_4*a_4
where (a_1, a_2, a_3, a_4) is the original label of the image pair; after this processing, the prior distance of any image pair x and y is marked as a number in [0, 1], and p_xy is regarded as the probability that the sample pair consists of images with the same meaning;
for the vectors z1 and z2 of the image pair x and y, calculating the loss value loss as:
loss = -[ p_xy * log σ(Sim(z1, z2)/τ) + (1 - p_xy) * log(1 - σ(Sim(z1, z2)/τ)) ]
where Sim(z1, z2) is the similarity of z1 and z2; the above formula is read as computing the similarity of z1 and z2 and, after a Sigmoid, taking the cross entropy with the probability represented by p_xy; τ is a temperature coefficient used to adjust the distribution of the similarity;
the similarity is measured by the negative cosine distance or the Euclidean distance; when the negative cosine distance is used:
Sim(z1, z2) = -(z1 · z2) / (||z1|| * ||z2||)
the Euclidean distance is the l2 norm:
Sim(z1, z2) = Euc(z1, z2) = ||z1 - z2||_2
for each mini-batch, if the batch size is n, the contrast loss is the average of the loss values over the n image pairs.
2. The plant time-series image contrast learning method embedding a prior distance according to claim 1, characterized in that the classification distance is specifically: implicitly mapping the distance information of the contrast model onto a fully connected layer, and calculating the loss directly against the classification of the different image pairs; in calculating the Euclidean distance of z1 and z2, the formula is rewritten as:
Euc(z1, z2) = ||z1 - z2||_2 = sqrt( sum_i (z1_i - z2_i)^2 )
when the classification distance calculates the contrast loss, z1 and z2 are directly subtracted element-wise, namely:
e = z1 - z2
the parameter e is a vector of the same dimension as z1 and z2; the process of calculating e is the core step of calculating the Euclidean distance of z1 and z2, so that e contains the prior distance information of z1 and z2.
3. The plant time-series image contrast learning method embedding a prior distance according to claim 2, characterized in that e is linearly projected to a fully connected layer t of 4 nodes and processed by softmax to obtain the output o, namely:
t = eW
o = Softmax(t)
cross entropy is used to calculate the error between the category information in o and the prior distance information in the image pair's label category, namely:
loss = - sum_c a_c * log(o_c)
4. The plant time-series image contrast learning method embedding a prior distance according to claim 1, characterized in that a U-Net semantic segmentation network is adopted as the information extraction model to perform pixel-level segmentation of the branch, flower, leaf, fruit and background regions in a plant image; the U-Net semantic segmentation network has an Encoder-Decoder structure, and the Encoder network obtains 5 feature layers; the Decoder network uses deconvolution to up-sample the 5 feature layers and performs feature fusion to obtain an effective feature layer fusing all features; classifying each feature point of the effective feature layer yields the semantic segmentation result.
CN202310033871.2A 2023-01-10 2023-01-10 A plant temporal image contrast learning method embedding prior distance Active CN115937613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310033871.2A CN115937613B (en) 2023-01-10 2023-01-10 A plant temporal image contrast learning method embedding prior distance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310033871.2A CN115937613B (en) 2023-01-10 2023-01-10 A plant temporal image contrast learning method embedding prior distance

Publications (2)

Publication Number Publication Date
CN115937613A CN115937613A (en) 2023-04-07
CN115937613B true CN115937613B (en) 2026-01-09

Family

ID=86700919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310033871.2A Active CN115937613B (en) 2023-01-10 2023-01-10 A plant temporal image contrast learning method embedding prior distance

Country Status (1)

Country Link
CN (1) CN115937613B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342589B2 (en) * 2008-07-30 2016-05-17 Nec Corporation Data classifier system, data classifier method and data classifier program stored on storage medium
US8218869B2 (en) * 2009-03-29 2012-07-10 Mitsubishi Electric Research Laboratories, Inc. Image segmentation using spatial random walks
US9463132B2 (en) * 2013-03-15 2016-10-11 John Castle Simmons Vision-based diagnosis and treatment
CN108269244B (en) * 2018-01-24 2021-07-06 东北大学 An Image Dehazing System Based on Deep Learning and Prior Constraints
CN115424059B (en) * 2022-08-24 2023-09-01 珠江水利委员会珠江水利科学研究院 Remote sensing land utilization classification method based on pixel level contrast learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cherry growth modeling based on Prior Distance Embedding contrastive learning: Pre-training, anomaly detection, semantic segmentation, and temporal modeling; Wei Xu; Computers and Electronics in Agriculture; 2024-04-25; full text *

Also Published As

Publication number Publication date
CN115937613A (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN112070078B (en) Land use classification method and system based on deep learning
CN106198909B (en) A method for predicting water quality in aquaculture based on deep learning
CN111369498A (en) A Data Augmentation Method for Seedling Growth Viability Evaluation Based on Improved Generative Adversarial Networks
CN110889335B (en) Two-person interaction behavior recognition method based on multi-channel spatio-temporal fusion network human skeleton
CN115115830A (en) Improved Transformer-based livestock image instance segmentation method
CN112184734B (en) Animal long-time gesture recognition system based on infrared image and wearable optical fiber
CN110853070A (en) Underwater sea cucumber image segmentation method based on significance and Grabcut
CN114120359B (en) A method for measuring body size of group-raised pigs based on stacked hourglass network
CN114550164B (en) A research method for tomato leaf disease identification based on deep learning
CN106991666A (en) A kind of disease geo-radar image recognition methods suitable for many size pictorial informations
CN119580049B (en) System for realizing intelligent recognition of crops by multi-mode method based on CLIP
CN118072251B (en) Tobacco pest identification method, medium and system
CN114663791A (en) Branch recognition method for pruning robot in unstructured environment
CN117557914A (en) A method for identifying crop diseases and insect pests based on deep learning
CN117011272B (en) A method for anomaly detection in large-scale plant growth images by incorporating relative distance
An Xception network for weather image recognition based on transfer learning
CN116976392A (en) An unsupervised estimation method for large-scale brain functional connectivity networks
CN115937613B (en) A plant temporal image contrast learning method embedding prior distance
CN120931968A (en) Crop pest intelligent identification and early warning system based on multi-mode large language model
CN118072295B (en) Tobacco leaf identification method, system, storage medium, equipment and program product
CN119206341A (en) A plant disease identification method, system and storage medium based on unsupervised category incremental learning
Stanski et al. Flower detection using object analysis: new ways to quantify plant phenology in a warming tundra biome
CN118521915A (en) Self-adaptive method-based automatic unsupervised remote sensing field extraction method
Gao et al. Plant Event Detection from Time-Varying Point Clouds
Wang et al. Extracting the height of lettuce by using neural networks of image recognition in deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant