WO2025252921A1 - Method to determine disease in plants - Google Patents
Method to determine disease in plants
- Publication number
- WO2025252921A1 (PCT/EP2025/065740)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- disease
- species
- neural network
- plant
- substructure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F16—ENGINEERING ELEMENTS AND UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
- F16M—FRAMES, CASINGS OR BEDS OF ENGINES, MACHINES OR APPARATUS, NOT SPECIFIC TO ENGINES, MACHINES OR APPARATUS PROVIDED FOR ELSEWHERE; STANDS; SUPPORTS
- F16M11/00—Stands or trestles as supports for apparatus or articles placed thereon ; Stands for scientific apparatus such as gravitational force meters
- F16M11/02—Heads
- F16M11/04—Means for attachment of apparatus; Means allowing adjustment of the apparatus relatively to the stand
- F16M11/041—Allowing quick release of the apparatus
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F16—ENGINEERING ELEMENTS AND UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
- F16M—FRAMES, CASINGS OR BEDS OF ENGINES, MACHINES OR APPARATUS, NOT SPECIFIC TO ENGINES, MACHINES OR APPARATUS PROVIDED FOR ELSEWHERE; STANDS; SUPPORTS
- F16M11/00—Stands or trestles as supports for apparatus or articles placed thereon ; Stands for scientific apparatus such as gravitational force meters
- F16M11/20—Undercarriages with or without wheels
- F16M11/22—Undercarriages with or without wheels with approximately constant height, e.g. with constant length of column or of legs
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F16—ENGINEERING ELEMENTS AND UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
- F16M—FRAMES, CASINGS OR BEDS OF ENGINES, MACHINES OR APPARATUS, NOT SPECIFIC TO ENGINES, MACHINES OR APPARATUS PROVIDED FOR ELSEWHERE; STANDS; SUPPORTS
- F16M13/00—Other supports for positioning apparatus or articles; Means for steadying hand-held apparatus or articles
- F16M13/04—Other supports for positioning apparatus or articles; Means for steadying hand-held apparatus or articles for supporting on, or holding steady relative to, a person, e.g. by chains, e.g. rifle butt or pistol grip supports, supports attached to the chest or head
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/188—Vegetation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/68—Food, e.g. fruit or vegetables
Definitions
- Systems, methods, and computer programs disclosed herein relate to the qualitative and quantitative assessment of disease on crop plants using supervised learning.
- Quantitative assessment of disease severity is particularly important to allow comparison between different areas, locations and time points. Especially in research or development trials, including trials necessary for the registration of crop protection products in agriculture, quantification is key but prone to errors (Bock et al., Phytopathology (2020), Vol 2, pp 1-9). There is therefore a need to measure the extent and severity of pathogen infections in a more precise, objective and scalable manner, ultimately leading to faster and more efficient development and registration of crop protection products. This includes estimating the proportion of infected plants in a given area, the degree of tissue damage, and the potential yield loss associated with the disease. Advanced techniques such as digital image analysis, molecular diagnostics, and remote sensing technologies have greatly enhanced the accuracy and efficiency of quantitative assessments.
- Neural networks, one type of machine learning model, are known to be highly suitable for tasks such as pattern or object detection and segmentation of images.
- Mask R-CNNs have been used in a two-step approach to recognize Fusarium Head Blight in wheat (US2023010954); however, achieving high accuracy in plant disease detection, whether quantitative or qualitative, remains an unsolved problem.
- neural networks are known to require high computing capacity, which may pose a challenge for the assessment process in agricultural areas, fields and trials, where a large number of subimages needs to be analyzed in order to assess disease, ultimately with a precision that allows the data to be used in registration-relevant trials. Therefore, the choice of a suitable neural network architecture, as well as the efficacy and computational cost of the training approach, also plays an important role.
- the objects of the present invention include a method, a system and a computer program product for the qualitative and quantitative assessment of disease on crop plants using supervised learning, and a device supporting the acquisition of digital images.
- the present disclosure relates to a computer-implemented method comprising: receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
- the present disclosure provides a computer system comprising: a processor; and a memory storing an application program configured to perform, when executed by the processor, an operation, the operation comprising:
- the present disclosure provides a non-transitory computer readable storage medium having stored thereon software instructions that, when executed by a processor of a computer system, cause the computer system to execute the following steps:
- each of the subimages comprising one or more specific substructures separated from other substructures;
- the present disclosure provides a device comprising an arm including a first, second and third member and first, second and third means for connecting the members; and means for attaching a mobile device, wherein the second member is attached at its proximal end to the middle section of the first member at a 90 degree angle, wherein the second member is attached with its distal end to the third member at a 90 degree angle in relation to the plane formed by the first and second members, and wherein the means for attaching a mobile device is connected to the third member so that the front of the mobile device is parallel to and above the first member.
- Figure 1 is a block diagram of a computer system that can be used in the various embodiments of the invention described.
- Figure 2 shows schematically an embodiment of the computer-implemented method of the present disclosure in form of a flow chart.
- Figure 3 shows in an example image the polygons confining 28 wheat ears in the foreground of the image, assigned by human annotators.
- the polygons have been filled with color for better visibility.
- the corresponding bounding boxes are also shown in the image.
- Figure 4 shows an image of the masks highlighting the infected areas of the ears. Annotated pixels that are not part of an ear have been ignored.
- Figure 5 shows selected training and validation metrics from a typical model training run over 5000 epochs with a batch size of 4 images. Validation metrics have been derived every 300 epochs.
- the Figure shows the convergence of the training and validation loss, indicating that 3000 to 5000 epochs are a good choice to stop the training to avoid overfitting of the model.
- the learning rate schedule includes a warm-up and a decay phase, which support smooth convergence of the model. Box and segmentation average precision on the validation images reach about 58% and about 51% respectively, which is a good result keeping in mind that sometimes the model will select a slightly different set of foreground ears compared to the ground truth annotations.
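The warm-up and decay phases of the learning rate schedule described above can be sketched as follows. The patent only states that there is a warm-up and a decay phase; the linear shape, the peak rate, the warm-up length and the total step count below are illustrative assumptions, not values taken from the disclosure.

```python
def learning_rate(step, peak=0.001, warmup_steps=500, total_steps=5000):
    """Linear warm-up to `peak`, then linear decay back to zero.

    All parameter values are illustrative assumptions; the disclosure
    only says the schedule has a warm-up and a decay phase that support
    smooth convergence of the model.
    """
    if step < warmup_steps:
        # warm-up phase: ramp the rate linearly from 0 up to `peak`
        return peak * step / warmup_steps
    # decay phase: ramp linearly from `peak` back down to 0
    remaining = total_steps - step
    return peak * max(remaining, 0) / (total_steps - warmup_steps)
```

A schedule of this shape avoids large updates while the randomly initialized heads are still unstable, then shrinks the step size as training converges.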
- Figure 6 shows the output of the trained Mask R-CNN. Foreground ears have been detected with high accuracy and the corresponding masks and bounding boxes are overlaid on top of the image.
- Figure 7 shows the subimage of one wheat ear. In the left image of Figure 7 the ear has been cropped (manually) from the original image. The middle image shows the same ear cropped to the bounding box and with the background removed using the results from Mask R-CNN. The right image shows the mask predicted using the trained U-Net, predicting the area of the ear that is infected with Gibberella zeae. For this example, the ear (without background) corresponds to 107012 pixels and the mask contains 11732 pixels, resulting in a disease severity of about 11%.
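The severity figure in the Figure 7 example is a simple pixel ratio between the infection mask and the ear mask. The sketch below reproduces that arithmetic; the function name and the empty-mask guard are our additions, not part of the disclosure.

```python
def disease_severity(ear_pixels, infected_pixels):
    """Disease severity as the fraction of ear pixels predicted infected.

    `ear_pixels` is the pixel count of the ear with background removed
    (from the Mask R-CNN mask); `infected_pixels` is the pixel count of
    the infection mask predicted by the U-Net.
    """
    if ear_pixels == 0:
        raise ValueError("ear mask is empty")
    return infected_pixels / ear_pixels

# Worked example from Figure 7: 11732 infected pixels out of
# 107012 ear pixels give a severity of about 11 %.
severity = disease_severity(107012, 11732)
print(f"{severity:.1%}")  # → 11.0%
```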
- Figure 8 shows an image of wheat ears with bounding boxes (left part) and an image with the resulting masks.
- Figure 9 shows a schematic drawing of the device useful in receiving digital images by a mobile device in a standardized manner.
- the articles “a” and “an” are intended to include one or more items and may be used interchangeably with “one or more” and “at least one.”
- the singular form of “a”, “an”, and “the” include plural referents, unless the context clearly dictates otherwise. Where only one item is intended, the term “one” or similar language is used.
- the terms “has”, “have”, “having”, or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based at least partially on” unless explicitly stated otherwise.
- phrase “based on” may mean “in response to” and be indicative of a condition for automatically triggering a specified operation of an electronic device (e.g., a controller, a processor, a computing device, etc.) as appropriately referred to herein.
- FIG. 2 shows schematically an embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.
- the method (100) comprises the steps:
- a “computer system” is a system for electronic data processing that processes data by means of programmable calculation rules. Such a system usually comprises a “computer”, that unit which comprises a processor for carrying out logical operations, and also peripherals.
- peripherals refer to all devices which are connected to the computer and serve for the control of the computer and/or as input and output devices. Examples thereof are monitor (screen), printer, scanner, mouse, keyboard, drives, camera, microphone, loudspeaker, etc. Internal ports and expansion cards are also considered peripherals in computer technology.
- Computer systems of today are frequently divided into desktop PCs, portable PCs, laptops, notebooks, netbooks and tablet PCs, and so-called handhelds (e.g. smartphones). Computer systems also include cloud-hosted systems, where one or more steps of the computer-implemented method are performed remotely over the internet, as well as web servers, database servers, application servers and cluster computing. All these systems can be utilized for carrying out the invention.
- non-transitory is used herein to exclude transitory, propagating signals or waves, but to otherwise include any volatile or non-volatile computer memory technology suitable to the application.
- the term “computer” should be broadly construed to cover any kind of electronic device with data processing capabilities, including, by way of non-limiting example, personal computers, servers, embedded cores, computing systems, communication devices, processors (e.g., digital signal processors (DSP)), microcontrollers, field programmable gate arrays (FPGA), application specific integrated circuits (ASIC) and other electronic computing devices.
- processor includes a single processing unit or a plurality of distributed or remote such units.
- Figure 1 illustrates a computer system (1) according to some example implementations of the present disclosure in more detail.
- a computer system of exemplary implementations of the present disclosure may be referred to as a computer and may comprise, include, or be embodied in one or more fixed or portable electronic devices.
- the computer may include one or more of each of a number of components such as, for example, a processing unit (20) connected to a memory (50) (e.g., storage device).
- the processing unit (20) may be composed of one or more processors alone or in combination with one or more memories.
- the processing unit (20) is generally any piece of computer hardware that is capable of processing information such as, for example, data, computer programs and/or other suitable electronic information.
- the processing unit (20) is composed of a collection of electronic circuits some of which may be packaged as an integrated circuit or multiple interconnected integrated circuits (an integrated circuit at times more commonly referred to as a “chip”).
- the processing unit (20) may be configured to execute computer programs, which may be stored onboard the processing unit (20) or otherwise stored in the memory (50) of the same or another computer.
- the processing unit (20) may be a number of processors, a multi-core processor or some other type of processor, depending on the particular implementation. For example, it may be a central processing unit (CPU), a field programmable gate array (FPGA), a graphics processing unit (GPU) and/or a tensor processing unit (TPU). Further, the processing unit (20) may be implemented using a number of heterogeneous processor systems in which a main processor is present with one or more secondary processors on a single chip. As another illustrative example, the processing unit (20) may be a symmetric multi-processor system containing multiple processors of the same type.
- the processing unit (20) may be embodied as or otherwise include one or more ASICs, FPGAs or the like.
- the processing unit (20) may be capable of executing a computer program to perform one or more functions
- the processing unit (20) of various examples may be capable of performing one or more functions without the aid of a computer program. In either instance, the processing unit (20) may be appropriately programmed to perform functions or operations according to example implementations of the present disclosure.
- the memory (50) is generally any piece of computer hardware that is capable of storing information such as, for example, data, computer programs (e.g., computer-readable program code (60)) and/or other suitable information either on a temporary basis and/or a permanent basis.
- the memory (50) may include volatile and/or non-volatile memory, and may be fixed or removable. Examples of suitable memory include random access memory (RAM), read-only memory (ROM), a hard drive, a flash memory, a thumb drive, a removable computer diskette, an optical disk, a magnetic tape or some combination of the above.
- Optical disks may include compact disk - read only memory (CD-ROM), compact disk - read/write (CD-R/W), DVD, Blu-ray disk or the like.
- the memory may be referred to as a computer-readable storage medium or data memory.
- the computer-readable storage medium is a non-transitory device capable of storing information, and is distinguishable from computer-readable transmission media such as electronic transitory signals capable of carrying information from one location to another.
- Computer-readable medium as described herein may generally refer to a computer- readable storage medium or computer-readable transmission medium.
- the processing unit (20) may also be connected to one or more interfaces for displaying, transmitting and/or receiving information.
- the interfaces may include one or more communications interfaces and/or one or more user interfaces.
- the communications interface(s) may be configured to transmit and/or receive information, such as to and/or from other computer(s), network(s), database(s) or the like.
- the communications interface may be configured to transmit and/or receive information by physical (wired) and/or wireless communications links.
- the communications interface(s) may include interface(s) (41) to connect to a network, such as using technologies such as cellular telephone, Wi-Fi, satellite, cable, digital subscriber line (DSL), fiber optics and the like.
- the communications interface(s) may include one or more short-range communications interfaces (42) configured to connect devices using short-range communications technologies such as NFC, RFID, Bluetooth, Bluetooth LE, ZigBee, infrared (e.g., IrDA) or the like.
- the user interfaces may include a display (30).
- the display (screen) may be configured to present or otherwise display information to a user, suitable examples of which include a liquid crystal display (LCD), light-emitting diode display (LED), plasma display panel (PDP) or the like.
- the user input interface(s) (11) may be wired or wireless, and may be configured to receive information from a user into the computer system (1), such as for processing, storage and/or display. Suitable examples of user input interfaces include a microphone, image or video capture device, keyboard or keypad, joystick, touch-sensitive surface (separate from or integrated into a touchscreen) or the like.
- the user interfaces may include automatic identification and data capture (AIDC) technology (12) for machine-readable information. This may include barcode, radio frequency identification (RFID), magnetic stripes, optical character recognition (OCR), integrated circuit card (ICC), and the like.
- the user interfaces may further include one or more interfaces for communicating with peripherals such as printers and the like.
- program code instructions (60) may be stored in memory (50), and executed by processing unit (20) that is thereby programmed, to implement functions of the systems, subsystems, tools and their respective elements described herein.
- any suitable program code instructions (60) may be loaded onto a computer or other programmable apparatus from a computer- readable storage medium to produce a particular machine, such that the particular machine becomes a means for implementing the functions specified herein.
- These program code instructions (60) may also be stored in a computer-readable storage medium that can direct a computer, processing unit or other programmable apparatus to function in a particular manner to thereby generate a particular machine or particular article of manufacture.
- the instructions stored in the computer-readable storage medium may produce an article of manufacture, where the article of manufacture becomes a means for implementing functions described herein.
- the program code instructions (60) may be retrieved from a computer- readable storage medium and loaded into a computer, processing unit or other programmable apparatus to configure the computer, processing unit or other programmable apparatus to execute operations to be performed on or by the computer, processing unit or other programmable apparatus.
- Retrieval, loading and execution of the program code instructions (60) may be performed sequentially such that one instruction is retrieved, loaded and executed at a time. In some example implementations, retrieval, loading and/or execution may be performed in parallel such that multiple instructions are retrieved, loaded, and/or executed together. Execution of the program code instructions (60) may produce a computer-implemented process such that the instructions executed by the computer, processing circuitry or other programmable apparatus provide operations for implementing functions described herein.
- a computer system (1) may include processing unit (20) and a computer-readable storage medium or memory (50) coupled to the processing circuitry, where the processing circuitry is configured to execute computer-readable program code instructions (60) stored in the memory (50). It will also be understood that one or more functions, and combinations of functions, may be implemented by special purpose hardware-based computer systems and/or processing circuitry which perform the specified functions, or combinations of special purpose hardware and program code instructions.
- the computer system of the present disclosure may be in the form of a laptop, notebook, netbook, and/or tablet PC; it may also be a component of an MRI scanner, a CT scanner, or an ultrasound diagnostic machine.
- the present disclosure provides a computer program product.
- a computer program product comprises a non-volatile data carrier, such as a CD, a DVD, a USB stick or other medium for storing data.
- a computer program is stored on the data carrier.
- the computer program can be loaded into a working memory of a computer system (in particular, into a working memory of a computer system of the present disclosure), where it can cause the computer system to perform the following steps: Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
- the computer program product may also be marketed in combination with the contrast agent.
- a combination is also referred to as a kit.
- a kit includes the contrast agent and the computer program product.
- Such a kit includes the contrast agent and means for allowing a purchaser to obtain the computer program, e.g., download it from an Internet site. These means may include a link, i.e., an address of the Internet site from which the computer program may be obtained, e.g., from which the computer program may be downloaded to a computer system connected to the Internet.
- Such means may include a code (e.g., an alphanumeric string or a QR code, or a DataMatrix code or a barcode or other optically and/or electronically readable code) by which the purchaser can access the computer program.
- a link and/or code may, for example, be printed on a package of the contrast agent and/or printed on a package insert for the contrast agent.
- a kit is thus a combination product comprising a contrast agent and a computer program (e.g., in the form of access to the computer program or in the form of executable program code on a data carrier) that is offered for sale together.
- a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area is received.
- Digital images are images which are made up of pixels; each pixel has a discrete numeric representation of its intensity or gray level.
- the pixels are defined by their spatial coordinates on the x and y-axis.
- In a binary digital image, a pixel is either black (having a value of 0) or white (having a value of 1).
- In grayscale images, each pixel has a value indicating its shade of grey (usually 8 bits/pixel, allowing 256 shades of grey) without any color information.
- In color images, also known as RGB images, each pixel contains color information as separate values for the channels of the visible spectrum: the red, green and blue components (typically 24 bits/pixel, 8 bits per color channel).
- In RGBA images (A for alpha), a fourth channel is added representing the transparency of each pixel, with a value of zero being completely transparent and a value of 255 (for 8-bit color) being fully opaque.
- In multispectral images, each pixel is represented by a vector of intensity values for each spectral band, including bands outside of the visible spectrum.
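The pixel models above can be illustrated with a few literal values. The bit depths follow the conventions the text names (1-bit binary, 8-bit grayscale, 24-bit RGB, 32-bit RGBA); the helper function is a hypothetical example for illustration, not part of the disclosure.

```python
# Illustrative pixel values for the image types described above.
binary_pixel = 1                  # 1 = white, 0 = black (1 bit/pixel)
gray_pixel = 128                  # mid gray on a 0..255 scale (8 bits/pixel)
rgb_pixel = (255, 200, 0)         # (red, green, blue), 8 bits per channel
rgba_pixel = (255, 200, 0, 255)   # fourth channel: alpha, 255 = fully opaque

def is_opaque(pixel):
    """True for an RGBA pixel whose alpha channel is 255 (fully opaque)."""
    return len(pixel) == 4 and pixel[3] == 255

print(is_opaque(rgba_pixel))  # → True
```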
- Crop plant means a plant or plant population which is specifically grown with human intervention as a plant useful for feed, food, fuel or fibre, or an ornamental plant.
- Crop plants may be plants which can be obtained by conventional breeding and optimization methods or by biotechnological, such as recombinant DNA techniques, genetic engineering methods, or gene editing methods such as CRISPR Cas or combinations of these methods, including the genetically modified plants (transgenic) or gene- edited (also named new genomic technologies) crop plants.
- Crop plants include also plant cultivars which are protectable and non-protectable by plant breeders’ rights, geno- or biotypes.
- Crop plants include the following species or genera: Alfalfa, Anacardiaceae sp.
- beet for example sugar beet and fodder beet
- cereals for example wheat, durum, barley, rye, oats, rice, maize, triticale and millet/sorghum
- citrus fruit for example oranges, lemons, mandarins, grapefruits and tangerines
- cucurbits for example pumpkin/squash, gherkins, calabashes, cucumbers and melons
- fibre plants for example cotton, flax, hemp and jute, cannabis
- Latex plants Lauraceae sp. (e.g. avocado, cinnamon, camphor); legumes, for example beans, lentils, peas and soybeans, common beans and broad beans; Malvaceae sp. (e.g.
- Manihoteae sp. for instance Manihot esculenta, manioc
- oil crops for example Brassica oil seeds such as Brassica napus (e.g. canola, rapeseed), Brassica rapa, B. juncea (e.g. (field) mustard) and Brassica carinata, Arecaceae sp. (e.g. oilpalm, coconut), mustard, poppies, Oleaceae sp. (e.g. olive tree, olives), sunflowers, castor oil plants; Papaveraceae (e.g.
- pome fruit for example apples, pears and quince, Ribesioidae sp., soft fruits for example strawberries, raspberries, blackberries, blueberries, red and black currant and gooseberry; Rubiaceae sp. (for instance coffee); Solanaceae sp. (e.g. tomatoes, potatoes, peppers, bell peppers, capsicum, aubergines, eggplant, tobacco), Stevia rebaudiana; stone fruit for example peaches, nectarines, cherries, plums, common plums, apricots; Theobroma sp. (for instance Theobroma cacao: cocoa); vegetables, for example spinach, lettuce, Asparagaceae (e.g. asparagus), Cruciferae sp.
- Chestnut (Castanea), Chestnuts, including Chinese Chestnut, Malabar chestnut, Sweet Chestnut, Beech (Fagus), Oak (Quercus), Stone-oak, Tanoak (Lithocarpus)); Betulaceae sp. (Alder (Alnus), Birch (Betula), Hazel, Filbert (Corylus), Hornbeam), Leguminosae sp.
- Asteraceae sp. for instance sunflower seed
- Almond, Beech, Butternut, Brazil nut, Candlenut, Cashew, Colocynth, Cotton seed, Cucurbita ficifolia, Filbert, Indian Beech or Pongam Tree, Kola nut, Lotus seed, Macadamia, Mamoncillo, Maya nut, Mongongo, Oak acorns, Ogbono nut, Paradise nut, Pili nut, Pine nut, Pistachio, Pumpkin seed, Water caltrop; soybeans (Glycine sp., Glycine max).
- the digital image is an RGB image.
- the RGB image is taken by using a mobile device.
- the mobile device is a smart phone, mobile phone, handheld or tablet.
- the digital image, in particular the RGB image, is taken using an Unmanned Aerial Vehicle (UAV), in particular a fixed-wing, single-rotor or multi-rotor drone.
- the digital image comprises more than one crop plant. In one embodiment the digital image comprises more than one part of a crop plant, preferably 2 to 50 parts of a crop plant, more preferably 5 to 40 parts, even more preferably 15 to 35 parts.
- the crop plant is a cereal comprising wheat, barley, rye, oats, rice, maize, triticale and millet/sorghum.
- the plant part is an ear or a tassel.
- a digital image comprising two to 40 ears, preferably 15 to 35 ears, of a wheat, barley, rye, rice, maize or triticale plant growing in an area is received.
- An area means a spatially delimitable region of the Earth's surface on which the plants grow.
- the area is at least partly utilized agriculturally in that crop plants are planted in one or more fields, are supplied with nutrients and are harvested.
- the area may also be or comprise a silviculturally utilized region of the Earth's surface (for example a forest). Gardens, parks or the like in which vegetation exists solely for human pleasure are covered by the term area.
- An area may include a multitude of fields for crop plants.
- an area corresponds to a growing area for a crop plant (for a definition of a growing area see, for example, Nachrichtenblatt des Deutschen Pflanzenschutzdienstes, 61 (7), p. 247-253, 2009, ISSN 0027-7479).
- an area corresponds to a biome (for a definition of the equivalent German term Boden-Klima-Raum see, for example, Nachrichtenbl. Deut. Pflanzenschutzd.).
- a location means a preferably contiguous region within an area.
- a location may be one or more fields in which a specific crop plant is being grown.
- the location is preferably being farmed by a person having registered access to a multitude of imaging devices and optionally one or more plant analysis devices.
- the location is a field in which a grower or farmer grows a commercial crop plant. In another embodiment the location is a field in which experimental or demonstration trials have been planted.
- a field trial means a certain area or location in which crop plants are grown under certain conditions, such as defined conditions of sowing and the application of crop protection products, including the timing, amounts and method of application.
- Field trials require a dedicated design in order to be able to assign plots within the trial to the regimes or treatments, so as to test the hypotheses or outcomes the scientist or agronomist wants to investigate. This includes statistical considerations to test the hypotheses in a statistically meaningful way.
- the conditions of a field trial adhere to those conditions defined by regulatory authorities, including the European and Mediterranean Plant Protection Organization. The results of those field trials may be used in the approval process for a plant protection product by the respective regulatory authorities such as the European Food Safety Authority or the Environmental Protection Agency of the United States.
- the field trial is conducted according to the publication Bulletin OEPP/EPPO Bulletin (2012) 42 (3), 419-425, describing the efficacy evaluation of fungicides on foliar and ear diseases in cereals.
- one or more subimages are generated by using a trained neural network, each of the subimages comprising one or more specific substructures separated from other substructures.
- a digital image comprises a plurality of subimages generated by methods including segmentation, in particular instance segmentation.
- the subimage comprises at least one object, the substructure, identified by the segmentation, preferably a plant part.
- a substructure means a part or specific organ of a crop plant above or below the ground, such as a shoot, a leaf, a needle, a stalk, a stem or part of a stem, a flower, an ear, a spike, a fruit body, a fruit, a seed, a root, a tuber or a rhizome.
- a preferred substructure is a spike or a tassel.
- the neural network which is used according to the invention for the second step comprises at least three layers of processing elements: a first layer with input neurons (nodes), an Nth layer with at least one output neuron (node), and N-2 hidden layers, where N is a natural number greater than 2.
- the function of the input neurons is to receive digital images as input values. Normally, there is one input neuron for each pixel of a digital image. Additional input neurons may be provided for additional input values (e.g. conditions that existed when the respective recorded image was created, or additional information about the objects).
- the output neurons are used to predict a mask or pixel set having a certain label, or bounding boxes including a label, for a digital image, indicating whether the object shown in the digital image meets or does not meet a defined criterion.
- the processing elements of the layers between the input neurons and the output neurons are connected to each other in a predefined architecture and may use either pretrained or random connection weights which are then refined during the training phase.
- the neural network is preferably a so-called convolutional neural network (CNN).
- a convolutional neural network is able to process input data in the form of a matrix. This allows digital images represented as a matrix (width x height x number of colour channels) to be used as input data.
- a standard neural network e.g. in the form of a multi-layer perceptron (MLP), on the other hand, requires a vector as input, i.e. in order to use a recorded image as input, the pixels of the image would need to be unravelled into a long chain. This means, for example, that standard neural networks are not able to recognize objects in an image independently of the position of the object in the image. The same object at a different position in the image would have a completely different input vector.
- a CNN consists essentially of filters (Convolutional Layer) and aggregation layers (Pooling Layer), which repeat alternately, and finally one or more layers of “standard”, fully connected neurons (Dense / Fully Connected Layer).
- the neural network training can be carried out, for example, by means of a back propagation method.
- the aim is to achieve the most reliable mapping possible from given input vectors or matrices to given output vectors or matrices for the network.
- the quality of the mapping is described by an error function.
- the goal is to minimise the error function.
- the training of an artificial neural network in the back propagation procedure is carried out by modifying the connection weights.
- connection weights between the processing elements contain information regarding the relationship between the recorded images (input) and labels, bounding boxes or masks (output), which can be used to predict labels, bounding boxes or masks for a new recorded image.
- the annotated data set is split into training, validation and test images.
- during model training the model learns using the training images, and relevant metrics that inform about the training success are derived using the validation set (see the example in Figure 5).
- a test set of images is withheld from this process and can be used to evaluate the accuracy and other metrics on previously unused images.
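The split into training, validation and withheld test images can be sketched as follows (a minimal illustration using Python's standard library; the 80/10/10 proportions and purely random shuffling are assumptions for this example, whereas the examples below use stratified sampling):

```python
import random

def split_dataset(image_ids, train_frac=0.8, val_frac=0.1, seed=42):
    """Split a list of image identifiers into training, validation and test sets.

    The test set is withheld from training entirely and is only used to
    evaluate the final model on previously unused images.
    """
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # withheld for final evaluation
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```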
- the neural network is a CNN. In another embodiment the neural network is a Mask R-CNN.
- a Mask R-CNN is a deep learning model that combines object detection and instance segmentation, allowing it to perform pixel-wise instance segmentation alongside object detection. This is achieved through the addition of an extra "mask head" branch, which generates precise segmentation masks for each detected object.
- the Mask R-CNN is an extension of the Faster R-CNN architecture adding a third branch to predict segmentation masks in addition to bounding boxes and class labels.
- Faster R-CNN is an object detection algorithm that builds upon the previous R-CNN and Fast R-CNN models. It introduces a Region Proposal Network (RPN) that shares convolutional features with the detection network, enabling efficient and accurate generation of region proposals.
- a key strength of Mask R-CNN is its ability to perform pixel-wise instance segmentation alongside object detection, enabling fine-grained pixel-level boundaries for accurate and detailed instance segmentation.
- the mask branch uses a pixel-to-pixel alignment mechanism, often implemented with spatially aligned ROI pooling, to ensure accurate correspondence between the proposed region and the generated mask. It incorporates two critical enhancements: ROIAlign, addressing misalignment issues in traditional ROI pooling, and a Feature Pyramid Network (FPN), providing multi-scale feature extraction.
- the backbone of the Mask R-CNN is a ResNet-50.
- the subimage represents a bounding box, which is a rectangular region that encloses a set of pixels and is here determined to detect one or more substructures of interest. In another embodiment those pixels of the subimage which are not encompassed by the substructure are set to the default RGB value of (0, 0, 0) to enable the third step of the method.
- the subimage comprises one substructure. In a preferred embodiment the subimage comprises one substructure based on a part of the crop plant in the foreground of the digital image. In a preferred embodiment the subimage comprises one substructure being an ear of a cereal crop plant based on a part of the crop plant being a cereal crop plant in the foreground of the digital image.
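The background removal described above (setting pixels outside the substructure to RGB (0, 0, 0)) can be sketched as follows (a pure-Python toy illustration; real implementations operate on image arrays):

```python
def blank_background(pixels, mask):
    """Set every pixel not covered by the substructure mask to RGB (0, 0, 0).

    `pixels` is a list of rows of (R, G, B) tuples; `mask` is a list of rows
    of booleans, True where the pixel belongs to the substructure (e.g. an ear).
    """
    return [
        [px if inside else (0, 0, 0) for px, inside in zip(prow, mrow)]
        for prow, mrow in zip(pixels, mask)
    ]

# 2x2 toy subimage: only the top-left pixel belongs to the ear.
pixels = [[(120, 180, 90), (10, 20, 30)],
          [(40, 50, 60), (70, 80, 90)]]
mask = [[True, False], [False, False]]
out = blank_background(pixels, mask)
# out[0][0] stays (120, 180, 90); all other pixels become (0, 0, 0)
```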
- the neural network of the second step is trained using at least a data set of at least 1000 digital images of crop plants or parts of crop plants, which is divided randomly into a training, validation and test data set.
- one or more trained neural networks are used to generate for each subimage one or more disease pixel sets, classifying each pixel within the disease pixel set regarding plant disease symptoms.
- SAM2 Segment Anything Model 2
- Meta accessible through GitHub https://ai.meta.com/sam2/, https://github.com/facebookresearch/sam2
- the step producing the subimages is performed using two models, model 1a and model 1b, allowing a more precise evaluation of the diseased areas and thereby a higher accuracy of the respective disease index, e.g. the disease severity.
- Model 1a generates bounding boxes around each ear. All or a selection of the bounding boxes detected by model 1a will be passed to model 1b. Criteria for the selection of a box may comprise a detection score, a maximal number of boxes to be processed per image or the position or size of a box.
- Model 1a may be the same model as currently used, e.g. the Mask R-CNN. In one embodiment Faster R-CNN may be used; it is the same model without the part that produces the masks, returning only bounding boxes.
- Model 1b takes the bounding boxes as input.
- SAM2 is used for this step.
- post-processing, e.g. filling holes in the masks, may be applied to the model 1b output.
- Plant diseases are caused by fungi, viruses or bacteria infecting crop plants, including diseases caused by powdery mildew pathogens, for example Blumeria species, for example Blumeria graminis; Podosphaera species, for example Podosphaera leucotricha; Sphaerotheca species, for example Sphaerotheca fuliginea; Uncinula species, for example Erysiphe necator; diseases caused by rust disease pathogens, for example Gymnosporangium species, for example Gymnosporangium sabinae; Hemileia species, for example Hemileia vastatrix; Phakopsora species, for example Phakopsora pachyrhizi, Phakopsora meibomiae or Phakopsora euvitis; Puccinia species, for example Puccinia recondita, Puccinia graminis or Puccinia striiformis; Uromyces species, for example Urom
- Pseudomonas species for example Pseudomonas syringae pv. lachrymans
- Erwinia species for example Erwinia amylovora
- Liberibacter species for example Liberibacter asiaticus
- Xylella species for example Xylella fastidiosa
- Ralstonia species for example Ralstonia solanacearum
- Dickeya species for example Dickeya solani
- Clavibacter species for example Clavibacter michiganensis
- Streptomyces species for example Streptomyces scabies.
- Alternaria leaf spot (Alternaria spec. atrans tenuissima)
- Anthracnose (Colletotrichum gloeosporioides dematium var.)
- phytophthora rot (Phytophthora megasperma), brown stem rot (Phialophora gregata), pythium rot (Pythium aphanidermatum, Pythium irregulare, Pythium debaryanum, Pythium myriotylum, Pythium ultimum), rhizoctonia root rot, stem decay, and damping-off (Rhizoctonia solani), sclerotinia stem decay (Sclerotinia sclerotiorum), sclerotinia southern blight (Sclerotinia rolfsii), thielaviopsis root rot (Thielaviopsis basicola).
- the disease is a disease infecting ear and panicle of cereals, including Alternaria species, for example Alternaria spp.; Aspergillus species, for example Aspergillus flavus; Cladosporium species, for example Cladosporium cladosporioides; Claviceps species, for example Claviceps purpurea; Fusarium species, for example Fusarium culmorum; Gibberella species, for example Gibberella zeae (also called Fusarium graminearum or Fusarium Head Blight); Monographella species, for example Monographella nivalis; Stagonospora species, for example Stagonospora nodorum.
- the disease is Gibberella zeae infecting cereals including wheat, durum, barley and corn.
- Plant disease symptoms may be detected by visual changes in the area of the part of the plant affected by the plant disease.
- Plant disease may cause wilting of parts of the crop plant including leaves, stems and flowers, as well as necrosis and tumors. This results in visual changes of the affected areas of parts of crop plants, including changes in color or shape such as spots.
- the plant disease symptoms are those caused by Gibberella zeae which are discolored spikelets of the ear.
- one trained neural network is used.
- the neural network is a U-Net, a convolutional neural network architecture consisting of a contracting path (encoder) to capture context and a symmetric expanding path (decoder) that enables precise localization.
- the U-net has skip connections between the encoder and decoder paths, allowing the network to combine low-level feature maps with higher-level ones, improving the flow of information and localization accuracy.
- the spatial resolution is decreased while feature depth is increased through a series of convolutional and pooling layers.
- transpose convolutions are used to upsample the feature maps, increasing spatial resolution while reducing feature depth.
- the expansive path combines the upsampled features with the high-resolution features from the encoder path via concatenation, allowing precise localization.
- the contracting-expanding architecture with skip connections allows the U-Net to work with very few training images and yield more precise segmentation maps compared to other architectures. This makes it particularly useful for biomedical and scientific image segmentation tasks where annotated data is limited.
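The resolution/depth behaviour of the contracting path described above can be traced with a small shape calculation (an illustrative sketch only; a 4-level U-Net with 2x2 pooling, channel doubling and a base width of 64 channels is assumed):

```python
def unet_shapes(height, width, base_channels=64, levels=4):
    """Trace (H, W, C) feature-map shapes through a U-Net-style encoder.

    Each level halves the spatial resolution (2x2 max pooling) and doubles
    the feature depth, as described for the contracting path; the expanding
    path reverses this, concatenating the stored encoder features via the
    skip connections.
    """
    shapes = []
    h, w, c = height, width, base_channels
    for _ in range(levels):
        shapes.append((h, w, c))  # feature map kept for the skip connection
        h, w, c = h // 2, w // 2, c * 2
    shapes.append((h, w, c))      # bottleneck
    return shapes

print(unet_shapes(256, 256))
# [(256, 256, 64), (128, 128, 128), (64, 64, 256), (32, 32, 512), (16, 16, 1024)]
```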
- the neural network architecture of a U-Net is particularly advantageous for the technical problem of achieving high accuracy in assessing disease qualitatively and quantitatively by calculating a disease index in a location, field or trial, as it allows a high accuracy even with a limited training data set.
- U-Nets have shown excellent results for biomedical images, which are images generated by medical diagnostic methods such as MRI, X-ray, ultrasound, PET, SPECT, or computed tomography (Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds) Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham. https://doi.org/10.1007/978-3-319-24574-4_28).
- U-Nets have a simpler architecture, leading to an easier implementation and lower requirements regarding, for example, computational cost and storage. This enables faster inference, especially when powerful compute instances are not available, for example in remote agricultural areas.
- the Dice coefficient, also known as the Dice similarity coefficient or F1 score, is a metric used to evaluate the performance of a model in tasks like image segmentation, object detection, and other classification problems. It measures the overlap between the predicted and ground truth segmentation masks or object regions.
- the Dice coefficient is calculated as Dice = 2·TP / (2·TP + FP + FN), where TP denotes the true positives, FP the false positives and FN the false negatives.
- the Dice coefficient ranges from 0 to 1, where 1 indicates a perfect overlap between the predicted and ground truth segmentation masks or object regions, and 0 indicates no overlap at all.
- ground truth segmentation is provided using human annotators.
- the Dice coefficient can be computed using the torchmetrics library, which provides a Dice class and a functional dice implementation. These allow calculating the Dice coefficient for binary or multi-class segmentation tasks, with options to control the averaging method (micro, macro, weighted, etc.) and handle class imbalance by ignoring certain classes.
- the Dice coefficient is often used in combination with other metrics like Intersection over Union (loU) to evaluate the performance of segmentation or object detection models comprehensively.
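Under this definition, the Dice coefficient (and the related IoU) can be computed directly from pixel sets, as in the following minimal sketch (in practice a library such as torchmetrics would be used, as noted above):

```python
def dice_coefficient(predicted, ground_truth):
    """Dice = 2*TP / (2*TP + FP + FN) over two sets of pixel coordinates."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    tp = len(predicted & ground_truth)  # pixels predicted and annotated as diseased
    fp = len(predicted - ground_truth)  # predicted diseased, healthy in ground truth
    fn = len(ground_truth - predicted)  # diseased pixels the model missed
    if tp + fp + fn == 0:
        return 1.0                      # both masks empty: perfect agreement
    return 2 * tp / (2 * tp + fp + fn)

def iou(predicted, ground_truth):
    """Intersection over Union, often reported alongside Dice."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    union = predicted | ground_truth
    return len(predicted & ground_truth) / len(union) if union else 1.0

pred = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
print(dice_coefficient(pred, truth))  # 2*2/(2*2+1+1) = 0.666...
print(iou(pred, truth))               # 2/4 = 0.5
```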
- the learning rate (or step-size) is explained as the magnitude of change/update to model weights during the backpropagation training process in Neural Networks.
- the learning rate is usually specified as a positive value less than 1.0.
- model weights are updated to reduce the error estimates of the loss function. Rather than changing the weights by the full amount, the update is multiplied by the learning rate value. For example, setting the learning rate to 0.5 would mean updating (usually subtracting) the weights with 0.5 times the estimated weight errors (i.e., gradients or total error change w.r.t. the weights).
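The scaled weight update described above amounts to one line per step, as in this sketch of plain gradient descent (the weight and gradient values are purely illustrative):

```python
def sgd_step(weights, gradients, learning_rate=0.5):
    """Update each weight by subtracting learning_rate * gradient,
    i.e. only a fraction of the full estimated weight error is applied."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

weights = [0.8, -0.2]
gradients = [0.4, -0.1]  # d(loss)/d(weight), illustrative values
print(sgd_step(weights, gradients))  # approximately [0.6, -0.15]
```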
- the neural network is trained using a data set of subimages which have been annotated by human agronomists.
- one or more trained neural networks are used to generate for each subimage one or more disease pixel sets, classifying each pixel within the disease pixel set regarding plant disease symptoms.
- the trained neural network is used to generate one disease pixel set classifying each pixel within the disease pixel set as plant disease symptoms of a specific plant disease.
- each pixel is assigned a probability to fall into the respective classification categories. Examples for such categories are healthy or infected. Infected refers to a certain plant disease, in particular caused by Gibberella zeae.
- the pixel is assigned to the category with the highest probability.
- the classification consists of two categories which are healthy and infected.
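The per-pixel decision described above is a simple selection of the category with the highest probability (a sketch; the probability values are illustrative):

```python
def classify_pixel(probabilities):
    """Assign the pixel to the category with the highest probability."""
    return max(probabilities, key=probabilities.get)

# Illustrative network output for one pixel with two categories.
probs = {"healthy": 0.2, "infected": 0.8}
print(classify_pixel(probs))  # infected
```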
- the trained neural network is used to generate two or more disease pixel sets, classifying each pixel within one of the disease pixel sets regarding plant disease symptoms of a specific plant disease.
- the disease is Gibberella zeae.
- two or more trained neural networks are used, each of them trained to recognize a specific plant disease on one or more crop species.
- one or more disease indices are calculated for each substructure based on the classification of the pixels in each substructure.
- Disease severity indicates the percentage of infected area of a plant part.
- Disease incidence indicates the percentage of infected crop plants or infected parts of a crop plant compared to the total number of plant parts or plants.
- a crop plant or a part of a crop plant is infected if one or more pixels of the corresponding substructure are classified as infected.
- the substructure which represents the plant part is an ear or a tassel.
- Multiple values of disease indices may be used to calculate one or more disease indices for an area, a location or a field trial, requiring the analysis of multiple plant parts.
- the disease incidence indicates the percentage of the number of infected crop plants compared to the total number of crop plants.
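The two disease indices defined above can be sketched as follows (a minimal illustration; the input is assumed to be the per-pixel classification of each substructure, e.g. each ear):

```python
def disease_severity(pixel_labels):
    """Percentage of a plant part's pixels classified as infected."""
    infected = sum(1 for label in pixel_labels if label == "infected")
    return 100.0 * infected / len(pixel_labels)

def disease_incidence(substructures):
    """Percentage of plant parts with at least one infected pixel."""
    infected_parts = sum(1 for labels in substructures
                         if any(l == "infected" for l in labels))
    return 100.0 * infected_parts / len(substructures)

# Three toy ears: per-pixel labels from the segmentation step.
ears = [
    ["infected", "healthy", "healthy", "healthy"],   # severity 25%
    ["healthy"] * 4,                                 # severity 0%
    ["infected", "infected", "healthy", "healthy"],  # severity 50%
]
print([disease_severity(e) for e in ears])  # [25.0, 0.0, 50.0]
print(disease_incidence(ears))              # 2 of 3 ears infected -> 66.66...%
```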
- a script is generated for an automated application of crop protection product based on the disease index calculated for a location, a field, a field trial, or an area.
- the efficacy of a plant protection product applied to that agricultural location is calculated based on the disease index calculated for a location, a field, a field trial or an area.
- the disease indices generated through the method are results used to evaluate the outcome of field trials.
- An outcome means the result of testing the efficacy of crop protection products, practices for applying crop protection products (including time points and application methods), the performance of crop plant varieties, or fertilizer application.
- the disease indices derived from the method allow the qualitative or quantitative comparison of disease indices indicative of outcomes across different growing periods and geographies, supported by the accuracy of the results obtained according to the method.
- the deep learning model architecture is a vision transformer capable of performing the object detection and image segmentation in one step.
- a device comprising an arm (200) including a first, second and third member (210, 220, 230) and first and second means (280) and (290) for connecting the members (210, 220, 230); means (240) for attaching a mobile device (270) with means (260) holding the mobile device (270), wherein the second member (220) is attached at its proximal end to the middle section of the first member (210) at an angle alpha using means (290) and wherein the second member (220) is attached with its distal end to the third member (230) at an angle beta in relation to the plane formed by the first and second member using means (280) and wherein the mobile device (270) is connected to the third member (230) using means (250) or (240) so that the front of the mobile device (270) is parallel to and above the first member (210).
- Angle alpha is between 60 and 90 degrees, preferably between 75 and 90 degrees, more preferably between 80 and 90 degrees and most preferably 90 degrees.
- Angle beta is between 45 and 160 degrees, preferably between 60 and 120 degrees, more preferably between 60 and 90 degrees and most preferably 90 degrees.
- the members (210), (220), and (230) may be extendable or fixed in length. Examples for members (210), (220), and (230) include pipes and rods.
- the means (280) and (290) for connecting members may be fixed or removable hinges.
- the mobile device (270) is suitable to be carried by a human operator. The human operator will hold either member (230), means (250), (240) or (260), or the mobile device (270) itself, or use the first member (220) to divide and separate plants in an agricultural area, so that the mobile device (270) is able to take the digital image at a defined distance and angle between the first member (220) and the mobile device (270).
- Means (260) for attaching a mobile device attaches at its distal end to a holder for the mobile device.
- means (280), (290), (240), (250) or (260) are, independently of each other, clamps, magnetic connections, Velcro fasteners, hooks or cases.
- the means are attached to the third member.
- Images that typically included between five and 40 wheat ears were collected from selected field trials for plant protection products between 2020 and 2023 (the collection is ongoing for validation purposes).
- the images cover different wheat species or varieties including spring or winter wheat or awned wheat species, different countries including Germany, Canada, Czech Republic, Lithuania, United States, Italy, France, Great Britain and different levels of Fusarium Head Blight infections.
- images of awned wheat species were included.
- Figure 3 shows an example image with the polygons of 28 wheat ears in the foreground assigned by human annotators.
- a set of 1165 complete images (usually five to 40 wheat ears per image) was annotated by human annotators who are trained agronomists, in order to ensure that the annotators had the training to identify infected areas in an ear. They overlaid masks on the outline of areas of the ear infected with Gibberella zeae. After this the individual ears were cropped out (creating subimages) using bounding boxes, and pixels marked as diseased within the area of the annotated ears were used in the training. The background in each of the cropped-out subimages was removed for training. In total, the annotators analysed 22385 individual ears using this approach.
- the image of Figure 4 shows an example of the masks highlighting the infected areas of the ears. Those images were then overlaid with the polygons outlining the ears, and only those pixels classified as infected within the annotated ears were used.
- the annotated images were split into a training and validation (train-val) dataset (90% of the images) and a test dataset (10% of the images) using stratified sampling with regard to the 14 different trials in the data.
- train-val data set was split into training (80% of the train-val images) and validation (20% of the train-val images) data set.
- the detectron2 implementation available on GitHub (https://github.com/facebookresearch/detectron2/tree/main) was used.
- Box and segmentation average precision on the validation set are shown in Figure 5 as a measure of model precision.
- Figure 6 shows an image in which ears of awned species were detected with a high accuracy using the trained Mask R-CNN.
- the data set with the annotated subimages was split into a train-val (90%) and a test (10%) set. 10% of the train-val set was used for validation during training.
- the U-Net neural network was employed using a PyTorch implementation (see https://pytorch.org/) based on the reference described in Ronneberger et al. (https://arxiv.org/pdf/1505.04597).
- the dice coefficient has been used to evaluate the trained U-Net models.
- the learning rate and weight update procedure are as described above. For this case the best model reaches a validation dice coefficient of 0.65.
- the image in Figure 7 shows the subimage of one wheat ear, with removed background, and the mask predicted using the trained U-Net, predicting the area of the ear that is infected with Gibberella zeae. Based on this area, the disease severity index for that ear was calculated by dividing the number of pixels in the mask by the total number of pixels that belong to the area of the ear (without background).
- the step that produces the masks for individual ears, which is currently performed by the first model, could be split into a task performed by two models, 1a and 1b. Model 2 (ear segmentation) remains unchanged.
- model 1a would return bounding boxes around each ear (technically, points on the ear or even a first version of the mask would also work; in tests, boxes gave better results than points). All or a selection of the boxes detected by model 1a will be passed to model 1b. Criteria for the selection of a box could for example be its detection score, a maximal number of boxes to be processed per image or the position or size of a box.
- Model 1a could be the same model as currently used as model 1 (the Mask R-CNN), although the "Mask" part is not required anymore. Faster R-CNN could be used; it is the same model without the part that produces the masks, returning only bounding boxes. Many other modern object detection models returning bounding boxes exist, so there is a lot of flexibility here.
- Model 1b takes the bounding boxes as input (referred to as "prompts" in the literature) and produces a high-quality outline (mask) for each box. SAM2 (https://ai.meta.com/sam2/, https://github.com/facebookresearch/sam2) is currently used for this task. Again, some post-processing (e.g. filling holes in the masks) can be applied to the model 1b output.
Abstract
Systems, methods, and computer programs disclosed herein relate to: receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area; generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures; generating for each subimage, by using one or more trained neural networks, one or more disease pixel sets classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms; and calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
Description
METHOD TO DETERMINE DISEASE IN PLANTS
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in a Patent Office patent file or records, but otherwise reserves all copyright or rights whatsoever. © 2023 Bayer AG
FIELD OF THE DISCLOSURE
Systems, methods, and computer programs disclosed herein relate to the qualitative and quantitative assessment of disease on crop plants using supervised learning.
BACKGROUND
The assessment of diseases on crop plants is a critical aspect of agricultural management and plant pathology. Fungal, bacterial and viral pathogens are among the most significant threats to crop health and productivity, leading to substantial economic losses worldwide. Qualitative and quantitative assessment involves the identification and characterization of disease species affecting plants, which is essential for understanding the disease cycle, host range, and potential impact on crop yields. It enables farmers and researchers to diagnose the presence of specific pathogens and to determine the severity and the incidence of the infection through visual inspection and laboratory analysis. This information is crucial for the timely application of appropriate control measures to prevent the spread of the disease.
DISEASE SEVERITY
Quantitative assessment of disease severity is particularly important to allow comparison between different areas, locations and time points. Especially in research or development trials, including trials necessary for registration of crop protection products in agriculture, quantification is key, however it is prone to errors (Bock et al., Phytopathology (2020), Vol 2, pp 1-9). Therefore, there is a need to measure the extent and severity of pathogen infections in a more precise, objective and scalable manner, finally leading to faster and more efficient development and registration of crop protection products. This includes estimating the proportion of infected plants in a given area, the degree of tissue damage, and the potential yield loss associated with the disease. Advanced techniques such as digital image analysis, molecular diagnostics, and remote sensing technologies have greatly enhanced the accuracy and efficiency of quantitative assessments. These methods allow for the monitoring of disease progression over time and the evaluation of the effectiveness of fungicides and other control strategies.
Together, qualitative and quantitative assessments provide a comprehensive understanding of fungal diseases in plants. They are indispensable for the development of integrated disease management programs that aim to minimize the impact of fungal pathogens while promoting sustainable agricultural practices. Accurate disease assessment is also vital for breeding programs focused on developing resistant plant varieties, as well as for regulatory purposes where documentation of disease incidence is required for trade and quarantine decisions. In the context of a patent application, the importance of these assessments cannot be overstated, as they underpin the innovation and effectiveness of novel treatments, diagnostic tools, and management strategies designed to combat fungal diseases in agriculture.
Precise determination, first of the affected plant part and then of the infected area caused by a plant pathogen, is a common approach that comes with many challenges regarding accuracy: first, the plant part has to be precisely identified in a digital image; secondly, a distinction between infected and healthy areas is needed; infected areas have to be recognized in an accurate manner; and the accuracy in determining the infected area needs to be high. Also, areas or locations are planted with a large number of crop plants, so there is the need to generate and analyse a sufficient number of images with multiple plant parts, preferably in an automated manner, to provide a precise estimation of disease.
Neural networks, one type of machine learning model, are known to be highly suitable for tasks like pattern or object detection and segmentation of images. Mask R-CNNs have been used in a two-step approach to recognize Fusarium Head Blight in wheat (US2023010954); however, achieving high accuracy in plant disease detection, both in a quantitative and a qualitative manner, is still an unsolved problem. In addition, neural networks are known to require high computing capacity, which may pose a challenge for the assessment process in agricultural areas, fields and trials, where a large number of subimages need to be analyzed in order to assess disease, ultimately with a precision that allows the data to be used in registration-relevant trials. Therefore the choice of a suitable neural network architecture, as well as the efficacy and computational cost of the training approaches, also plays an important role.
SUMMARY
These problems are solved by the subject matter of the independent claims of the present disclosure. Preferred embodiments are defined in the dependent claims, the description and the drawings. The objects of the present invention include a method, a system and a computer program product for qualitative and quantitative assessment of disease on crop plants using supervised learning and a device supporting to receive digital images.
In a first aspect, the present disclosure relates to a computer-implemented method comprising:
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets, classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
In another aspect, the present disclosure provides a computer system comprising: a processor; and a memory storing an application program configured to perform, when executed by the processor, an operation, the operation comprising:
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets, classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
In another aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon software instructions that, when executed by a processor of a computer system, cause the computer system to execute the following steps:
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets, classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
In another aspect the present disclosure provides a device comprising an arm including a first, second and third member and first, second and third means for connecting the members; and means for attaching a mobile device, wherein the second member is attached at its proximal end to the middle section of the first member at a 90 degree angle, wherein the second member is attached with its distal end to the third member at a 90 degree angle in relation to the plane formed by the first and second members, and wherein the means for attaching a mobile device is connected to the third member so that the front of the mobile device is parallel to and above the first member.
BRIEF DESCRIPTION OF THE DRAWINGS
Some implementations of the present disclosure will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all implementations of the disclosure are shown. Indeed, various implementations of the disclosure may be embodied in many different forms and should not be construed as limited to the implementations set forth herein; rather, these example implementations are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Figure 1 is a block diagram of a computer system that can be used in the various embodiments of the invention described.
Figure 2 shows schematically an embodiment of the computer-implemented method of the present disclosure in form of a flow chart.
Figure 3 shows in an example image the polygons confining 28 wheat ears in the foreground of the image, assigned by human annotators. The polygons have been filled with color for better visibility. The corresponding bounding boxes are also shown in the image.
Figure 4 shows an image of the masks highlighting the infected areas of the ears. Annotated pixels that are not part of an ear have been ignored.
Figure 5 shows selected training and validation metrics from a typical model training run over 5000 epochs with a batch size of 4 images. Validation metrics have been derived every 300 epochs. The Figure shows the convergence of the training and validation loss, indicating that 3000 to 5000 epochs are a good choice to stop the training to avoid overfitting of the model. The learning rate schedule includes a warm-up and a decay phase, which support smooth convergence of the model. Box and segmentation average precision on the validation images reach about 58% and about 51% respectively, which is a good result keeping in mind that sometimes the model will select a slightly different set of foreground ears compared to the ground truth annotations.
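The warm-up and decay phases of the learning rate schedule mentioned for Figure 5 can be sketched as follows. This is an illustrative sketch only; the function name, the base rate and the epoch counts are hypothetical examples and not values taken from the disclosure.

```python
import math

def learning_rate(epoch, base_lr=1e-3, warmup_epochs=500, total_epochs=5000):
    """Return the learning rate for a given epoch: linear warm-up
    followed by cosine decay, supporting smooth convergence."""
    if epoch < warmup_epochs:
        # Linear warm-up from near zero up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to zero over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The schedule rises monotonically during warm-up and decays smoothly afterwards, which is one common way to realize the warm-up/decay behaviour shown in the figure.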
Figure 6 shows the output of the trained Mask R-CNN. Foreground ears have been detected with a high accuracy and the corresponding masks and bounding boxes are overlayed on top of the image.
Figure 7 shows the subimage of one wheat ear. In the left image of Figure 7 the ear has been cropped (manually) from the original image. The middle image shows the same ear cropped to the bounding box and with the background removed using the results from Mask R-CNN. The right image shows the mask predicted using the trained U-Net, predicting the area of the ear that is infected with Gibberella zeae. For this example, the ear (without background) corresponds to 107012 pixels and the mask contains 11732 pixels, resulting in a disease severity of about 11%.
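The severity computation illustrated in Figure 7 can be sketched as follows; this is a minimal illustrative sketch, and the function and argument names (as well as the NumPy boolean-array representation of the masks) are assumptions, not part of the disclosure.

```python
import numpy as np

def disease_severity(substructure_mask, disease_mask):
    """Fraction of substructure pixels classified as diseased.

    substructure_mask: boolean array, True where the pixel belongs to
        the ear (background already removed).
    disease_mask: boolean array of the same shape, True where the pixel
        shows disease symptoms.
    """
    ear_pixels = np.count_nonzero(substructure_mask)
    if ear_pixels == 0:
        return 0.0
    # Only count diseased pixels that lie on the ear itself.
    infected = np.count_nonzero(disease_mask & substructure_mask)
    return infected / ear_pixels
```

With the Figure 7 example numbers, 11732 diseased pixels on an ear of 107012 pixels yield a severity of about 0.11, i.e. about 11%.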
Figure 8 shows an image of wheat ears with bounding boxes (left part) and an image with the resulting masks.
Figure 9 shows a schematic drawing of the device useful in receiving digital images by a mobile device in a standardized manner.
DETAILED DESCRIPTION
The invention will be more particularly described below without distinguishing between the aspects of the invention (method, computer system, computer-readable storage medium). On the contrary, the following description is intended to apply analogously to all aspects of the invention, irrespective of the context (method, computer system, computer-readable storage medium) in which they occur.
If steps are stated in an order in the present description or in the claims, this does not necessarily mean that the invention is restricted to the stated order. On the contrary, it is conceivable that the steps may also be executed in a different order or in parallel to one another, unless one step builds upon another step, which absolutely requires that the building step be executed subsequently (this being, however, clear in the individual case). The stated orders are thus preferred embodiments of the present disclosure.
As used herein, the articles “a” and “an” are intended to include one or more items and may be used interchangeably with “one or more” and “at least one.” As used in the specification and the claims, the singular form of “a”, “an”, and “the” include plural referents, unless the context clearly dictates otherwise. Where only one item is intended, the term “one” or similar language is used. Also, as used herein, the terms “has”, “have”, “having”, or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based at least partially on” unless explicitly stated otherwise. Further, the phrase “based on” may mean “in response to” and be indicative of a condition for automatically triggering a specified operation of an electronic device (e.g., a controller, a processor, a computing device, etc.) as appropriately referred to herein.
Some implementations of the present disclosure will be described more fully with reference to the accompanying drawings, in which some, but not all implementations of the disclosure are shown. Indeed, various implementations of the disclosure may be embodied in many different forms and should not be construed as limited to the implementations set forth herein; rather, these example implementations are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Figure 2 shows schematically an embodiment of the computer-implemented method of the present disclosure in form of a flow chart.
The method (100) comprises the steps:
(110) Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
(120) Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
(130) Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets, classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
(140) Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
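The steps above can be sketched as the following control flow. This is an illustrative sketch only: the two trained networks are replaced by stand-in callables, and all names (`assess_disease`, `segment_substructures`, `classify_pixels`) are hypothetical, not identifiers from the disclosure. In practice the first model could be an instance-segmentation network (e.g. a Mask R-CNN) and the second a per-pixel classifier (e.g. a U-Net).

```python
import numpy as np

def assess_disease(image, segment_substructures, classify_pixels):
    """Return one disease index per detected substructure.

    segment_substructures(image) -> list of (subimage, substructure_mask)
    classify_pixels(subimage)    -> boolean per-pixel disease mask
    """
    indices = []
    for subimage, mask in segment_substructures(image):    # step (120)
        disease = classify_pixels(subimage)                # step (130)
        total = np.count_nonzero(mask)
        infected = np.count_nonzero(disease & mask)
        indices.append(infected / total if total else 0.0)  # step (140)
    return indices
```

The received image (step (110)) is passed in as an array; each returned index is the infected fraction of one substructure.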
The operations in accordance with the teachings herein may be performed by at least one computer system specially constructed for the desired purposes, or by a general-purpose computer system specially configured for the desired purpose by at least one computer program stored in a typically non-transitory computer readable storage medium.
A “computer system” is a system for electronic data processing that processes data by means of programmable calculation rules. Such a system usually comprises a “computer”, that unit which comprises a processor for carrying out logical operations, and also peripherals.
In computer technology, “peripherals” refer to all devices which are connected to the computer and serve for the control of the computer and/or as input and output devices. Examples thereof are monitor (screen), printer, scanner, mouse, keyboard, drives, camera, microphone, loudspeaker, etc. Internal ports and expansion cards are also considered to be peripherals in computer technology.
Computer systems of today are frequently divided into desktop PCs, portable PCs, laptops, notebooks, netbooks and tablet PCs, and so-called handhelds (e.g. smartphones). Computer systems also include cloud-hosted systems where one or more steps of the computer-implemented method are performed remotely over the internet, as well as web servers, database servers, application servers and cluster computing. All these systems can be utilized for carrying out the invention.
The term “non-transitory” is used herein to exclude transitory, propagating signals or waves, but to otherwise include any volatile or non-volatile computer memory technology suitable to the application.
The term “computer” should be broadly construed to cover any kind of electronic device with data processing capabilities, including, by way of non-limiting example, personal computers, servers, embedded cores, computing systems, communication devices, processors (e.g., digital signal processors (DSPs), microcontrollers, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), etc.) and other electronic computing devices.
The term “process” as used above is intended to include any type of computation or manipulation or transformation of data represented as physical, e.g., electronic, phenomena which may occur or reside e.g., within registers and/or memories of at least one computer or processor. The term processor includes a single processing unit or a plurality of distributed or remote such units.
Figure 1 illustrates a computer system (1) according to some example implementations of the present disclosure in more detail.
Generally, a computer system of exemplary implementations of the present disclosure may be referred to as a computer and may comprise, include, or be embodied in one or more fixed or portable electronic devices. The computer may include one or more of each of a number of components such as, for example, a processing unit (20) connected to a memory (50) (e.g., storage device).
The processing unit (20) may be composed of one or more processors alone or in combination with one or more memories. The processing unit (20) is generally any piece of computer hardware that is capable of processing information such as, for example, data, computer programs and/or other suitable electronic information. The processing unit (20) is composed of a collection of electronic circuits some of which may be packaged as an integrated circuit or multiple interconnected integrated circuits (an integrated circuit at times more commonly referred to as a “chip”). The processing unit (20) may be configured to execute computer programs, which may be stored onboard the processing unit (20) or otherwise stored in the memory (50) of the same or another computer.
The processing unit (20) may be a number of processors, a multi-core processor or some other type of processor, depending on the particular implementation. For example, it may be a central processing unit (CPU), a field programmable gate array (FPGA), a graphics processing unit (GPU) and/or a tensor processing unit (TPU). Further, the processing unit (20) may be implemented using a number of heterogeneous processor systems in which a main processor is present with one or more secondary processors on a single chip. As another illustrative example, the processing unit (20) may be a symmetric multi-processor system containing multiple processors of the same type. In yet another example, the processing unit (20) may be embodied as or otherwise include one or more ASICs, FPGAs or the like. Thus, although the processing unit (20) may be capable of executing a computer program to perform one or more functions, the processing unit (20) of various examples may be capable of performing one or more functions without the aid of a computer program. In either instance, the processing unit (20) may be appropriately programmed to perform functions or operations according to example implementations of the present disclosure.
The memory (50) is generally any piece of computer hardware that is capable of storing information such as, for example, data, computer programs (e.g., computer-readable program code (60)) and/or other suitable information either on a temporary basis and/or a permanent basis. The memory (50) may include volatile and/or non-volatile memory, and may be fixed or removable. Examples of suitable memory include random access memory (RAM), read-only memory (ROM), a hard drive, a flash memory, a thumb drive, a removable computer diskette, an optical disk, a magnetic tape or some combination of the above. Optical disks may include compact disk - read only memory (CD-ROM), compact disk - read/write (CD-R/W), DVD, Blu-ray disk or the like. In various instances, the memory may be referred to as a computer-readable storage medium or data memory. The computer-readable storage medium is a non-transitory device capable of storing information, and is distinguishable from computer-readable transmission media such as electronic transitory signals capable of carrying information from one location to another. Computer-readable medium as described herein may generally refer to a computer-readable storage medium or computer-readable transmission medium.
In addition to the memory (50), the processing unit (20) may also be connected to one or more interfaces for displaying, transmitting and/or receiving information. The interfaces may include one or more communications interfaces and/or one or more user interfaces. The communications interface(s) may be configured to transmit and/or receive information, such as to and/or from other computer(s), network(s), database(s) or the like. The communications interface may be configured to transmit and/or receive information by physical (wired) and/or wireless communications links. The communications interface(s) may include interface(s) (41) to connect to a network, such as using technologies such as cellular telephone, Wi-Fi, satellite, cable, digital subscriber line (DSL), fiber optics and the like. In some examples, the communications interface(s) may include one or more short-range communications interfaces (42) configured to connect devices using short-range communications technologies such as NFC, RFID, Bluetooth, Bluetooth LE, ZigBee, infrared (e.g., IrDA) or the like.
The user interfaces may include a display (30). The display (screen) may be configured to present or otherwise display information to a user, suitable examples of which include a liquid crystal display (LCD), light-emitting diode display (LED), plasma display panel (PDP) or the like. The user input interface(s) (11) may be wired or wireless, and may be configured to receive information from a user into the computer system (1), such as for processing, storage and/or display. Suitable examples of user input interfaces include a microphone, image or video capture device, keyboard or keypad, joystick, touch-sensitive surface (separate from or integrated into a touchscreen) or the like. In some examples, the user interfaces may include automatic identification and data capture (AIDC) technology (12) for machine-readable information. This may include barcode, radio frequency identification (RFID), magnetic stripes, optical character recognition (OCR), integrated circuit card (ICC), and the like. The user interfaces may further include one or more interfaces for communicating with peripherals such as printers and the like.
As indicated above, program code instructions (60) may be stored in memory (50), and executed by processing unit (20) that is thereby programmed, to implement functions of the systems, subsystems, tools and their respective elements described herein. As will be appreciated, any suitable program code instructions (60) may be loaded onto a computer or other programmable apparatus from a computer-readable storage medium to produce a particular machine, such that the particular machine becomes a means for implementing the functions specified herein. These program code instructions (60) may also be stored in a computer-readable storage medium that can direct a computer, processing unit or other programmable apparatus to function in a particular manner to thereby generate a particular machine or particular article of manufacture. The instructions stored in the computer-readable storage medium may produce an article of manufacture, where the article of manufacture becomes a means for implementing functions described herein. The program code instructions (60) may be retrieved from a computer-readable storage medium and loaded into a computer, processing unit or other programmable apparatus to configure the computer, processing unit or other programmable apparatus to execute operations to be performed on or by the computer, processing unit or other programmable apparatus.
Retrieval, loading and execution of the program code instructions (60) may be performed sequentially such that one instruction is retrieved, loaded and executed at a time. In some example implementations, retrieval, loading and/or execution may be performed in parallel such that multiple instructions are retrieved, loaded, and/or executed together. Execution of the program code instructions (60) may produce a computer-implemented process such that the instructions executed by the computer, processing circuitry or other programmable apparatus provide operations for implementing functions described herein.
Execution of instructions by processing unit, or storage of instructions in a computer-readable storage medium, supports combinations of operations for performing the specified functions. In this manner, a computer system (1) may include processing unit (20) and a computer-readable storage medium or memory (50) coupled to the processing circuitry, where the processing circuitry is configured to execute computer-readable program code instructions (60) stored in the memory (50). It will also be understood that one or more functions, and combinations of functions, may be implemented by special purpose hardware-based computer systems and/or processing circuitry which perform the specified functions, or combinations of special purpose hardware and program code instructions.
The computer system of the present disclosure may be in the form of a laptop, notebook, netbook and/or tablet PC.
In another aspect, the present disclosure provides a computer program product. Such a computer program product comprises a non-volatile data carrier, such as a CD, a DVD, a USB stick or other medium for storing data. A computer program is stored on the data carrier. The computer program can be loaded into a working memory of a computer system (in particular, into a working memory of a computer system of the present disclosure), where it can cause the computer system to perform the following steps:
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets, classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
In the first step a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area is received.
Digital images are images which are made up of pixels; each pixel has a discrete numeric representation of its intensity or grey level. The pixels are defined by their spatial coordinates on the x- and y-axis. In a binary digital image a pixel is only black (having a value of 0) or white (having a value of 1). In grayscale pictures each pixel has a value indicating its shade of grey (usually 8 bits/pixel, allowing 256 shades of grey) without any information on the color. In color images, also known as RGB images, each pixel contains color information as separate values for the channels in the visible spectrum, which are the red, green, and blue components (typically 24 bits/pixel, 8 bits for each color channel). In RGBA images (A for alpha) a fourth channel is added representing, for each pixel, the transparency, with a value of zero being completely transparent and a value of 255 (for 8-bit color) the pixel being fully opaque. In multispectral images each pixel is represented by a vector of intensity values for each spectral band, including bands outside of the visible spectrum.
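The image types described above can be illustrated as arrays; the following sketch uses small, hypothetical image sizes and a NumPy representation purely as an example, not as a format prescribed by the disclosure.

```python
import numpy as np

# Binary image: each pixel is 0 (black) or 1 (white).
binary = np.array([[0, 1], [1, 0]], dtype=np.uint8)

# Grayscale: one 8-bit value per pixel, 256 shades of grey.
grayscale = np.zeros((4, 4), dtype=np.uint8)

# RGB: three 8-bit channels per pixel, i.e. 24 bits/pixel.
rgb = np.zeros((4, 4, 3), dtype=np.uint8)

# RGBA: a fourth (alpha) channel, 0 = transparent, 255 = opaque.
rgba = np.zeros((4, 4, 4), dtype=np.uint8)

# Multispectral: a vector of intensities per pixel, here e.g. 10 bands.
multispectral = np.zeros((4, 4, 10), dtype=np.float32)
```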
Crop plant means a plant or plant population which is specifically grown with human intervention as a plant useful for feed, food, fuel or fibre, or an ornamental plant. Crop plants may be plants which can be obtained by conventional breeding and optimization methods or by biotechnological methods, such as recombinant DNA techniques, genetic engineering methods, or gene editing methods such as CRISPR-Cas, or combinations of these methods, including genetically modified (transgenic) or gene-edited (also named new genomic technologies) crop plants. Crop plants also include plant cultivars which are protectable and non-protectable by plant breeders’ rights, geno- or biotypes. Crop plants include the following species or genera: Alfalfa; Anacardiaceae sp. (mango); beet, for example sugar beet and fodder beet; cereals, for example wheat, durum, barley, rye, oats, rice, maize, triticale and millet/sorghum; citrus fruit, for example oranges, lemons, mandarins, grapefruits and tangerines; cucurbits, for example pumpkin/squash, gherkins, calabashes, cucumbers and melons; fibre plants, for example cotton, flax, hemp, jute and cannabis; latex plants; Lauraceae sp. (e.g. avocado, cinnamon, camphor); legumes, for example beans, lentils, peas and soybeans, common beans and broad beans; Malvaceae sp. (e.g. okra, cocoa); Manihoteae sp. (for instance Manihot esculenta, manioc); oil crops, for example Brassica oil seeds such as Brassica napus (e.g. canola, rapeseed), Brassica rapa, B. juncea (e.g. (field) mustard) and Brassica carinata, Arecaceae sp. (e.g. oil palm, coconut), mustard, poppies, Oleaceae sp. (e.g. olive tree, olives), sunflowers, castor oil plants; Papaveraceae (e.g. poppy); pome fruit, for example apples, pears and quince; Ribesioidae sp.; soft fruits, for example strawberries, raspberries, blackberries, blueberries, red and black currant and gooseberry; Rubiaceae sp. (for instance coffee); Solanaceae sp. (e.g. tomatoes, potatoes, peppers, bell peppers, capsicum, aubergines/eggplant, tobacco); Stevia rebaudiana; stone fruit, for example peaches, nectarines, cherries, plums, common plums and apricots; Theobroma sp. (for instance Theobroma cacao: cocoa); vegetables, for example spinach, lettuce, Asparagaceae (e.g. asparagus), Cruciferae sp. (e.g. white cabbage, red cabbage, broccoli, cauliflower, Brussels sprouts, pak choi, kohlrabi, radishes, horseradish, cress and Chinese cabbage), onions, bell peppers, artichokes and chicory - including root chicory, endive or common chicory, leeks and onions; Umbelliferae sp. (e.g. carrots, parsley, celery and celeriac); Vitis sp. (for instance Vitis vinifera: grape vine, raisins, table grapes); Musaceae sp. (e.g. banana trees, bananas and plantains); nuts of various botanical taxa such as peanuts, Juglandaceae sp. (Walnut, Persian Walnut (Juglans regia), Butternut (Juglans), Hickory, Shagbark Hickory, Pecan (Carya), Wingnut (Pterocarya)), Fagaceae sp. (Chestnut (Castanea), Chestnuts, including Chinese Chestnut, Malabar chestnut, Sweet Chestnut, Beech (Fagus), Oak (Quercus), Stone-oak, Tanoak (Lithocarpus)), Betulaceae sp. (Alder (Alnus), Birch (Betula), Hazel, Filbert (Corylus), Hornbeam), Leguminosae sp. (for instance peanuts, peas and beans - such as climbing beans and broad beans), Asteraceae sp. (for instance sunflower seed), Almond, Beech, Butternut, Brazil nut, Candlenut, Cashew, Colocynth, Cotton seed, Cucurbita ficifolia, Filbert, Indian Beech or Pongam Tree, Kola nut, Lotus seed, Macadamia, Mamoncillo, Maya nut, Mongongo, Oak acorns, Ogbono nut, Paradise nut, Pili nut, Pine nut, Pistachio, Pumpkin seed, water Caltrop; and soybeans (Glycine sp., Glycine max).
In one embodiment the digital image is an RGB image. In another embodiment the RGB image is taken using a mobile device. In one embodiment the mobile device is a smart phone, mobile phone, handheld or tablet. In one embodiment the digital image, in particular the RGB image, is taken using an Unmanned Aerial Vehicle (UAV), in particular a fixed-wing drone, single-rotor or multi-rotor drone.
In one embodiment the digital image comprises more than one crop plant. In one embodiment the digital image comprises more than one part of a crop plant, preferably 2 to 50 parts of a crop plant, more preferably 5 to 40 parts of a crop plant, even more preferably 15 to 35 parts of a crop plant. In one embodiment the crop plant is a cereal comprising wheat, barley, rye, oats, rice, maize, triticale and millet/sorghum. In one embodiment the plant part is an ear or a tassel. In one embodiment a digital image comprising two to 40 ears, preferably 15 to 35 ears, of a wheat, barley, rye, rice, maize or triticale plant growing in an area is received.
An area means a spatially delimitable region of the Earth's surface on which the plants grow. In one embodiment the area is at least partly utilized agriculturally in that crop plants are planted in one or more fields, are supplied with nutrients and are harvested. The area may also be or comprise a silviculturally utilized region of the Earth's surface (for example a forest). Gardens, parks or the like in which vegetation exists solely for human pleasure are also covered by the term area.
An area may include a multitude of fields for crop plants. In one embodiment, an area corresponds to a growing area for a crop plant (for definition of a growing area see, for example, Journal fur Kulturpflanzen, 61 (7). p. 247-253, 2009, ISSN 0027-7479). In another embodiment, an area corresponds to a biome (for definition of equivalent German term Boden-Klima-Raum see, for example, Nachrichtenbl. Deut. Pflanzenschutzd., 59(7), p. 155-161, 2007, ISSN 0027-7479).
A location means a preferably contiguous region within an area. A location may be one or more fields in which a specific crop plant is being grown. The location is preferably being farmed by a person having registered access to a multitude of imaging devices and optionally one or more plant analysis devices.
In one embodiment the location is a field in which a grower or farmer grows a commercial crop plant. In another embodiment the location is a field in which experimental or demonstration trials have been planted.
A field trial means a certain area or location in which crop plants are grown under certain conditions, such as defined conditions of sowing and of application of certain crop protection products, including timing, amounts and method of application. Field trials require a dedicated design in order to be able to assign plots within a field trial to the regimes or treatments, so as to test the hypothesis or outcomes the scientist or agronomist wants to investigate. This includes statistical considerations to test the hypothesis in a statistically meaningful way. In one embodiment the conditions of a field trial adhere to those defined by regulatory authorities, including the European and Mediterranean Plant Protection Organization. The results of those field trials may be used in the approval process for plant protection products by the respective regulatory authorities, such as the European Food Safety Authority or the Environmental Protection Agency of the United States. In one embodiment the field trial is conducted according to the publication Bulletin OEPP/EPPO Bulletin (2012) 42 (3), 419-425, describing the efficacy evaluation of fungicides on foliar and ear diseases in cereals.
In a second step one or more subimages are generated by using a trained neural network, each of the subimages comprising one or more specific substructures separated from other substructures.
A digital image comprises a plurality of subimages generated by methods including segmentation, in particular instance segmentation. The subimage comprises at least one object identified by the segmentation, the substructure, which is preferably a plant part.
A substructure means a specific part or organ of a crop plant above or below the ground, such as a shoot, a leaf, a needle, a stalk, a stem or part of a stem, a flower, an ear, a spike, a fruit body, a fruit, a seed, a root, a tuber or a rhizome. A preferred substructure is a spike or a tassel.
The neural network which is used according to the invention for the second step comprises at least three layers of processing elements: a first layer with input neurons (nodes), an Nth layer with at least one output neuron (node), and N-2 hidden layers, where N is a natural number greater than 2.
The function of the input neurons is to receive digital images as input values. Normally, there is one input neuron for each pixel of a digital image. Additional input neurons may be provided for additional input values (e.g. conditions that existed when the respective recorded image was created, or additional information about the objects).
In that network, the output neurons are used to predict a mask or pixel set having a certain label, or bounding boxes including a label, for a digital image, indicating whether the object shown in the digital image meets or does not meet a defined criterion.
The processing elements of the layers between the input neurons and the output neurons are connected to each other in a predefined architecture and may use either pretrained or random connection weights which are then refined during the training phase.
The neural network is preferably a so-called convolutional neural network (CNN).
A convolutional neural network is able to process input data in the form of a matrix. This allows digital images represented as a matrix (width x height x number of colour channels) to be used as input data. A standard neural network, e.g. in the form of a multi-layer perceptron (MLP), on the other hand, requires a vector as input, i.e. in order to use a recorded image as input, the pixels of the image would need to be unravelled into a long chain. This means, for example, that standard neural networks are not able to recognize objects in an image independently of the position of the object in the image. The same object at a different position in the image would have a completely different input vector.
A CNN consists essentially of filters (Convolutional Layer) and aggregation layers (Pooling Layer), which repeat alternately, and finally one or more layers of “standard”, fully connected neurons (Dense / Fully Connected Layer).
Details can be found in the prior art (see e.g.: S. Khan et al.: A Guide to Convolutional Neural Networks for computer Vision, Morgan & Claypool Publishers 2018, ISBN 1681730227, 9781681730226).
The neural network training can be carried out, for example, by means of a back propagation method. The aim is to achieve the most reliable mapping possible from given input vectors or matrices to given output vectors or matrices for the network. The quality of the mapping is described by an error function. The goal is to minimise the error function. The training of an artificial neural network in the back propagation procedure is carried out by modifying the connection weights.
In the trained state, the connection weights between the processing elements contain information regarding the relationship between the recorded images (input) and labels, bounding boxes or masks (output), which can be used to predict labels, bounding boxes or masks for a new recorded image.
Commonly the available annotated data set is split into training, validation and test images. During model training the model learns using the training images, and relevant metrics that inform about the training success are derived using the validation set (see the example in Figure 5). A test set of images is withheld from this process and can be used to evaluate the accuracy and other metrics on previously unused images.
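The splitting of an annotated data set into training, validation and test images described above may be sketched as follows. This is an illustrative example only; the function name, split fractions and seed are assumptions and not part of the disclosed method:

```python
import random

def split_dataset(image_ids, val_frac=0.1, test_frac=0.1, seed=42):
    """Randomly split annotated image ids into training, validation and test sets."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = ids[:n_test]               # withheld entirely from training
    val = ids[n_test:n_test + n_val]  # used to monitor training metrics
    train = ids[n_test + n_val:]      # used to fit the model weights
    return train, val, test

train, val, test = split_dataset(range(1000))
```

With 1000 images and the assumed fractions, this yields 800 training, 100 validation and 100 test images, with no image appearing in more than one set.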
In one embodiment the neural network is a CNN. In another embodiment the neural network is a Mask R-CNN. A Mask R-CNN is a deep learning model that combines object detection and instance segmentation, allowing pixel-wise instance segmentation to be performed alongside object detection. The Mask R-CNN is an extension of the Faster R-CNN architecture, adding a third branch to predict segmentation masks in addition to bounding boxes and class labels. Faster R-CNN is an object detection algorithm that builds upon the previous R-CNN and Fast R-CNN models. It introduces a Region Proposal Network (RPN) that shares convolutional features with the detection network, enabling efficient and accurate generation of region proposals.
The key innovation of Mask R-CNN is its ability to perform pixel-wise instance segmentation alongside object detection. This is achieved through the addition of an extra "mask head" branch, which generates precise segmentation masks for each detected object. This enables fine-grained pixel-level boundaries for accurate and detailed instance segmentation.
The mask branch uses a pixel-to-pixel alignment mechanism, often implemented with spatially aligned ROI pooling, to ensure accurate correspondence between the proposed region and the generated mask.
It incorporates two critical enhancements: ROIAlign, which addresses misalignment issues in traditional ROI pooling, and a Feature Pyramid Network (FPN), which provides multi-scale feature extraction.
In one embodiment the backbone of the Mask R-CNN is a ResNet-50.
In one embodiment the subimage represents a bounding box, which is a rectangular object that encloses a set of pixels and is here determined to detect one or more substructures of interest. In another embodiment those pixels of the subimage which are not encompassed by the substructure are set to the default RGB value of (0,0,0) to enable the third step of the method. In a preferred embodiment the subimage comprises one substructure. In a preferred embodiment the subimage comprises one substructure based on a part of the crop plant in the foreground of the digital image. In a preferred embodiment the subimage comprises one substructure being an ear of a cereal crop plant, based on a part of the crop plant being a cereal crop plant in the foreground of the digital image.
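The setting of non-substructure pixels to the default RGB value of (0,0,0) within a bounding-box subimage may be sketched as follows. The function name and the array conventions (NumPy, height x width x 3 RGB images, boolean instance masks) are illustrative assumptions:

```python
import numpy as np

def crop_substructure(image, box, mask):
    """Crop a bounding box from the image and set pixels outside the instance mask to (0,0,0).

    image: RGB image as a (H, W, 3) uint8 array
    box:   (x0, y0, x1, y1) bounding box of the detected substructure
    mask:  (H, W) boolean array, True where the substructure is present
    """
    x0, y0, x1, y1 = box
    sub = image[y0:y1, x0:x1].copy()      # copy so the original image is unchanged
    sub_mask = mask[y0:y1, x0:x1]
    sub[~sub_mask] = (0, 0, 0)            # default RGB value for non-substructure pixels
    return sub
```

Each detected instance yields one such subimage, which can then be passed to the disease classification step.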
The neural network of the second step is trained using a data set of at least 1000 digital images of crop plants or parts of crop plants which is divided randomly into a training, a validation and a test data set.
In a third step one or more trained neural networks are used to generate, for each subimage, one or more disease pixel sets classifying each pixel within the disease pixel set regarding plant disease symptoms.
In another embodiment the Segment Anything Model 2 (SAM2), an alternative neural network architecture by Meta accessible through GitHub (https://ai.meta.com/sam2/, https://github.com/facebookresearch/sam2), may be used for the image segmentation; it uses a transformer-based architecture to extract high-level features from the images.
In this embodiment the step producing the subimages is performed using two models, model 1a and model 1b, allowing a more precise evaluation of the diseased areas and thereby a higher accuracy of the respective disease index, e.g. the disease severity. Model 1a generates bounding boxes around each ear. All or a selection of the bounding boxes detected by model 1a will be passed to model 1b. Criteria for the selection of a box may comprise a detection score, a maximal number of boxes to be processed per image, or the position or size of a box. Model 1a may be the same model as currently used, e.g. the Mask R-CNN. In one embodiment Faster R-CNN could be used; it is the same model without the part that produces the masks, returning only bounding boxes. Model 1b takes the bounding boxes as input. In one embodiment SAM2 is used for this step. In one embodiment post-processing (e.g. filling holes in the masks) may be applied to the model 1b output.
Plant diseases are caused by fungi, viruses or bacteria infecting crop plants, including diseases caused by powdery mildew pathogens, for example Blumeria species, for example Blumeria graminis; Podosphaera species, for example Podosphaera leucotricha; Sphaerotheca species, for example Sphaerotheca fuliginea; Uncinula species, for example Erysiphe necator; diseases caused by rust disease pathogens, for example Gymnosporangium species, for example Gymnosporangium sabinae; Hemileia species, for example Hemileia vastatrix; Phakopsora species, for
example Phakopsora pachyrhizi, Phakopsora meibomiae or Phakopsora euvitis; Puccinia species, for example Puccinia recondita, Puccinia graminis or Puccinia striiformis; Uromyces species, for example Uromyces appendiculatus; diseases caused by pathogens from the group of the Oomycetes, for example Albugo species, for example Albugo candida; Bremia species, for example Bremia lactucae; Peronospora species, for example Peronospora pisi or P. brassicae; Phytophthora species, for example Phytophthora infestans; Plasmopara species, for example Plasmopara viticola; Pseudoperonospora species, for example Pseudoperonospora humuli or Pseudoperonospora cubensis; Pythium species, for example Pythium ultimum; leaf blotch diseases and leaf wilt diseases caused, for example, by Alternaria species, for example Alternaria solani; Cercospora species, for example Cercospora beticola; Cladosporium species, for example Cladosporium cucumerinum; Cochliobolus species, for example Cochliobolus sativus (conidial form: Drechslera, syn: Helminthosporium) or Cochliobolus miyabeanus; Colletotrichum species, for example Colletotrichum lindemuthianum; Corynespora species, for example Corynespora cassiicola; Cycloconium species, for example Cycloconium oleaginum; Diaporthe species, for example Diaporthe citri; Elsinoe species, for example Elsinoe fawcettii; Gloeosporium species, for example Gloeosporium laeticolor; Glomerella species, for example Glomerella cingulata; Guignardia species, for example Guignardia bidwelli; Leptosphaeria species, for example Leptosphaeria maculans; Magnaporthe species, for example Magnaporthe grisea; Microdochium species, for example Microdochium nivale; Mycosphaerella species, for example Zymoseptoria tritici (syn: Mycosphaerella graminicola), Mycosphaerella arachidicola or Mycosphaerella fijiensis; Phaeosphaeria species, for example Phaeosphaeria nodorum; Phyllachora species, for example Phyllachora maydis; Pyrenophora species, for example Pyrenophora teres or
Pyrenophora tritici-repentis; Ramularia species, for example Ramularia collo-cygni or Ramularia areola; Rhynchosporium species, for example Rhynchosporium secalis; Septoria species, for example Septoria apii or Septoria lycopersici; Stagonospora species, for example Stagonospora nodorum; Typhula species, for example Typhula incarnata; Venturia species, for example Venturia inaequalis; root and stem diseases caused, for example, by Corticium species, for example Corticium graminearum; Fusarium species, for example Fusarium oxysporum; Gaeumannomyces species, for example Gaeumannomyces graminis; Plasmodiophora species, for example Plasmodiophora brassicae; Rhizoctonia species, for example Rhizoctonia solani; Sarocladium species, for example Sarocladium oryzae; Sclerotium species, for example Sclerotium oryzae; Tapesia species, for example Tapesia acuformis; Thielaviopsis species, for example Thielaviopsis basicola; ear and panicle diseases (including corn cobs) caused, for example, by Alternaria species, for example Alternaria spp.; Aspergillus species, for example Aspergillus flavus; Cladosporium species, for example Cladosporium cladosporioides; Claviceps species, for example Claviceps purpurea; Fusarium species,
for example Fusarium culmorum; Gibberella species, for example Gibberella zeae; Monographella species, for example Monographella nivalis; Stagonospora species, for example Stagonospora nodorum; diseases caused by smut fungi, for example Sphacelotheca species, for example Sphacelotheca reiliana; Tilletia species, for example Tilletia caries or Tilletia controversa; Urocystis species, for example Urocystis occulta; Ustilago species, for example Ustilago nuda; fruit rot caused, for example, by Aspergillus species, for example Aspergillus flavus; Botrytis species, for example Botrytis cinerea; Monilinia species, for example Monilinia laxa; Penicillium species, for example Penicillium expansum or Penicillium purpurogenum; Rhizopus species, for example Rhizopus stolonifer; Sclerotinia species, for example Sclerotinia sclerotiorum; Verticillium species, for example Verticillium albo-atrum; seed- and soil-borne rot and wilt diseases, and also diseases of seedlings, caused, for example, by Alternaria species, for example Alternaria brassicicola; Aphanomyces species, for example Aphanomyces euteiches; Ascochyta species, for example Ascochyta lentis; Aspergillus species, for example Aspergillus flavus; Cladosporium species, for example Cladosporium herbarum; Cochliobolus species, for example Cochliobolus sativus (conidial form: Drechslera, Bipolaris, syn: Helminthosporium); Colletotrichum species, for example Colletotrichum coccodes; Fusarium species, for example Fusarium culmorum; Gibberella species, for example Gibberella zeae; Macrophomina species, for example Macrophomina phaseolina; Microdochium species, for example Microdochium nivale; Monographella species, for example Monographella nivalis; Penicillium species, for example Penicillium expansum; Phoma species, for example Phoma lingam; Phomopsis species, for example Phomopsis sojae; Phytophthora species, for example Phytophthora cactorum; Pyrenophora species, for example Pyrenophora graminea; Pyricularia species, for example
Pyricularia oryzae; Pythium species, for example Pythium ultimum; Rhizoctonia species, for example Rhizoctonia solani; Rhizopus species, for example Rhizopus oryzae; Sclerotium species, for example Sclerotium rolfsii; Septoria species, for example Septoria nodorum; Typhula species, for example Typhula incarnata; Verticillium species, for example Verticillium dahliae; cankers, galls and witches’ broom caused, for example, by Nectria species, for example Nectria galligena; wilt diseases caused, for example, by Verticillium species, for example Verticillium longisporum; Fusarium species, for example Fusarium oxysporum; deformations of leaves, flowers and fruits caused, for example, by Exobasidium species, for example Exobasidium vexans; Taphrina species, for example Taphrina deformans; degenerative diseases in woody plants caused, for example, by Esca species, for example Phaeomoniella chlamydospora, Phaeoacremonium aleophilum or Fomitiporia mediterranea; Ganoderma species, for example Ganoderma boninense;
diseases of plant tubers caused, for example, by Rhizoctonia species, for example Rhizoctonia solani; Helminthosporium species, for example Helminthosporium solani; diseases caused by bacterial pathogens, for example Xanthomonas species, for example Xanthomonas campestris pv. oryzae; Pseudomonas species, for example Pseudomonas syringae pv. lachrymans; Erwinia species, for example Erwinia amylovora; Liberibacter species, for example Liberibacter asiaticus; Xylella species, for example Xylella fastidiosa; Ralstonia species, for example Ralstonia solanacearum; Dickeya species, for example Dickeya solani; Clavibacter species, for example Clavibacter michiganensis; Streptomyces species, for example Streptomyces scabies.
Diseases of soya beans:
Fungal diseases on leaves, stems, pods and seeds caused, for example, by Alternaria leaf spot (Alternaria spec. atrans tenuissima), anthracnose (Colletotrichum gloeosporoides dematium var. truncatum), brown spot (Septoria glycines), cercospora leaf spot and blight (Cercospora kikuchii), choanephora leaf blight (Choanephora infundibulifera trispora (Syn.)), dactuliophora leaf spot (Dactuliophora glycines), downy mildew (Peronospora manshurica), drechslera blight (Drechslera glycini), frogeye leaf spot (Cercospora sojina), leptosphaerulina leaf spot (Leptosphaerulina trifolii), phyllosticta leaf spot (Phyllosticta sojaecola), pod and stem blight (Phomopsis sojae), powdery mildew (Microsphaera diffusa), pyrenochaeta leaf spot (Pyrenochaeta glycines), rhizoctonia aerial, foliage, and web blight (Rhizoctonia solani), rust (Phakopsora pachyrhizi, Phakopsora meibomiae, Phakopsora euvitis), scab (Sphaceloma glycines), stemphylium leaf blight (Stemphylium botryosum), sudden death syndrome (Fusarium virguliforme), target spot (Corynespora cassiicola).
Fungal diseases on roots and the stem base caused, for example, by black root rot (Calonectria crotalariae), charcoal rot (Macrophomina phaseolina), fusarium blight or wilt, root rot, and pod and collar rot (Fusarium oxysporum, Fusarium orthoceras, Fusarium semitectum, Fusarium equiseti), mycoleptodiscus root rot (Mycoleptodiscus terrestris), neocosmospora (Neocosmospora vasinfecta), pod and stem blight (Diaporthe phaseolorum), stem canker (Diaporthe phaseolorum var. caulivora), phytophthora rot (Phytophthora megasperma), brown stem rot (Phialophora gregata), pythium rot (Pythium aphanidermatum, Pythium irregulare, Pythium debaryanum, Pythium myriotylum, Pythium ultimum), rhizoctonia root rot, stem decay, and damping-off (Rhizoctonia solani), sclerotinia stem decay (Sclerotinia sclerotiorum), sclerotinia southern blight (Sclerotinia rolfsii), thielaviopsis root rot (Thielaviopsis basicola).
In one embodiment the disease is a disease infecting ear and panicle of cereals, including Alternaria species, for example Alternaria spp.; Aspergillus species, for example Aspergillus flavus; Cladosporium species, for example Cladosporium cladosporioides; Claviceps species, for example Claviceps purpurea; Fusarium species, for example Fusarium culmorum; Gibberella species, for example Gibberella zeae (also called Fusarium graminearum, causing Fusarium Head Blight); Monographella species, for example
Monographella nivalis; Stagonospora species, for example Stagonospora nodorum. In one embodiment the disease is Gibberella zeae infecting cereals including wheat, durum, barley and corn.
Plant disease symptoms may be detected by visual changes in the area of the part of the plant affected by the plant disease. Plant diseases may cause wilting of a part of the crop plant, including leaves, stems and flowers, as well as necrosis or tumors. This results in visual changes of the affected areas of parts of crop plants, including changes in color or shape such as spots. In one embodiment the plant disease symptoms are those caused by Gibberella zeae, which are discolored spikelets of the ear.
In one embodiment one trained neural network is used.
In one embodiment the neural network is a U-Net, a convolutional neural network architecture consisting of a contracting path (encoder) to capture context and a symmetric expanding path (decoder) that enables precise localization. The U-Net has skip connections between the encoder and decoder paths, allowing the network to combine low-level feature maps with higher-level ones, improving the flow of information and localization accuracy. In the encoder path, the spatial resolution is decreased while feature depth is increased through a series of convolutional and pooling layers. In the decoder path, transpose convolutions are used to upsample the feature maps, increasing spatial resolution while reducing feature depth. The expansive path combines the upsampled features with the high-resolution features from the encoder path via concatenation, allowing precise localization. The contracting-expanding architecture with skip connections allows the U-Net to work with very few training images and yield more precise segmentation maps compared to other architectures. This makes it particularly useful for biomedical and scientific image segmentation tasks where annotated data is limited. The neural network architecture of a U-Net is particularly advantageous for the technical problem of achieving high accuracy when assessing disease qualitatively and quantitatively by calculating a disease index in a location, field or trial, as it allows a high accuracy even with a limited training data set. U-Nets have shown excellent results for biomedical images, which are images generated by medical diagnostic methods such as MRI, X-ray, ultrasound, PET, SPECT, or computer tomography (Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds) Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham, https://doi.org/10.1007/978-3-319-24574-4_28). Also, compared to for example Mask R-CNN, U-Nets have a simpler architecture, leading to an easier implementation and lower requirements regarding, for example, computational costs and storage. This enables faster inference, especially when powerful compute instances are not available, for example in remote agricultural areas. To evaluate the U-Net, the Dice coefficient is used. The Dice coefficient, also known as the Dice similarity coefficient or F1 score, is a metric used to evaluate the performance of a model in tasks like image segmentation, object detection, and other classification problems. It measures the overlap between the predicted and ground truth segmentation masks or object regions. The Dice coefficient is calculated as:
Dice = (2 * TP) / (2 * TP + FP + FN) where:
TP (True Positives) is the number of pixels correctly predicted as belonging to the object/class of interest.
FP (False Positives) is the number of pixels incorrectly predicted as belonging to the object/class.
FN (False Negatives) is the number of pixels incorrectly predicted as not belonging to the object/class.
The Dice coefficient ranges from 0 to 1, where 1 indicates a perfect overlap between the predicted and ground truth segmentation masks or object regions, and 0 indicates no overlap at all. Here the ground truth segmentation is provided using human annotators.
In a PyTorch implementation of the U-Net, the Dice coefficient can be computed using the torchmetrics library, which provides a Dice class and a functional dice implementation. These allow calculating the Dice coefficient for binary or multi-class segmentation tasks, with options to control the averaging method (micro, macro, weighted, etc.) and handle class imbalance by ignoring certain classes.
The Dice coefficient is often used in combination with other metrics like Intersection over Union (IoU) to evaluate the performance of segmentation or object detection models comprehensively.
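For illustration, the Dice formula given above can be computed directly from two binary masks. This NumPy sketch is one possible implementation, not the implementation used in the method; the function name and the small epsilon term (to avoid division by zero for empty masks) are assumptions:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = (2 * TP) / (2 * TP + FP + FN) for two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()   # pixels correctly predicted as the class
    fp = np.logical_and(pred, ~target).sum()  # pixels wrongly predicted as the class
    fn = np.logical_and(~pred, target).sum()  # class pixels the prediction missed
    return (2 * tp) / (2 * tp + fp + fn + eps)
```

A perfect prediction yields a value close to 1, while completely disjoint masks yield 0, matching the range described above.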
The learning rate (or step size) is the magnitude of the change or update to model weights during the backpropagation training process in neural networks. As a configurable hyperparameter, the learning rate is usually specified as a positive value less than 1.0. In backpropagation, model weights are updated to reduce the error estimates of the loss function. Rather than changing the weights by the full amount, we multiply it by some learning rate value. For example, setting the learning rate to 0.5 would mean updating (usually subtracting) the weights with 0.5 * estimated weight errors (i.e., gradients or the change of the total error with respect to the weights).
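The weight update described above may be illustrated with a minimal sketch. The function name is hypothetical, and the example deliberately ignores how the gradients themselves are computed by backpropagation:

```python
def sgd_step(weights, gradients, learning_rate=0.5):
    """One gradient-descent update: subtract learning_rate * gradient from each weight."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# With learning_rate 0.5, only half of each estimated weight error is applied:
updated = sgd_step([1.0, 2.0], [0.2, -0.4], learning_rate=0.5)
```

Here the first weight decreases by 0.5 * 0.2 and the second increases by 0.5 * 0.4, illustrating that the learning rate scales, rather than replaces, the gradient signal.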
The neural network is trained using a data set of subimages which have been annotated by human agronomists.
In a third step one or more trained neural networks are used to generate, for each subimage, one or more disease pixel sets classifying each pixel within the disease pixel set regarding plant disease symptoms.
In one embodiment the trained neural network is used to generate one disease pixel set classifying each pixel within the disease pixel set as plant disease symptoms of a specific plant disease. In the classification process each pixel is assigned a probability of falling into the respective classification categories. Examples of such categories are healthy or infected. Infected refers to a certain plant disease, in particular one caused by Gibberella zeae. In a second step the pixel is assigned to the category with the highest probability. In one embodiment the classification consists of two categories, which are healthy and infected.
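The two-stage assignment described above (per-pixel probabilities, then selection of the category with the highest probability) may be sketched as follows. The function name and the array layout (height x width x number of categories) are illustrative assumptions:

```python
import numpy as np

def assign_categories(probabilities, categories=("healthy", "infected")):
    """Assign each pixel to the category with the highest predicted probability.

    probabilities: array of shape (H, W, num_categories) holding, for every pixel,
    the predicted probability of each classification category.
    """
    idx = probabilities.argmax(axis=-1)               # index of the most probable category
    return np.array(categories, dtype=object)[idx]    # map indices back to category names
```

For the two-category embodiment, a pixel with probabilities (0.3, 0.7) would thus be labelled infected.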
In one embodiment the trained neural network is used to generate two or more disease pixel sets, classifying each pixel within one of the disease pixel sets regarding plant disease symptoms of a specific plant disease. In one embodiment the disease is caused by Gibberella zeae. In one embodiment in the third step two or more trained neural networks are used, each of them trained to recognize a specific plant disease on one or more crop species.
In a fourth step one or more disease indices are calculated for each substructure based on the classification of the pixels in each substructure.
There are several disease indices. Disease severity indicates the percentage of infected area of a plant part. Disease incidence indicates the percentage of infected crop plants or infected parts of a crop plant compared to the total number of plants or plant parts. A crop plant or a part of a crop plant is infected if one or more pixels of the corresponding substructure are classified as infected. In one embodiment the substructure which represents the plant part is an ear or a tassel. Multiple values of disease indices may be used to calculate one or more disease indices for an area, a location or a field trial, requiring the analysis of multiple plant parts.
The disease incidence indicates the percentage of the number of infected crop plants compared to the total number of crop plants.
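By way of illustration, the two disease indices defined above may be computed from per-substructure infection masks. The function names are hypothetical, and the masks are assumed to be boolean arrays in which True marks a pixel classified as infected:

```python
import numpy as np

def disease_severity(mask):
    """Percentage of the pixels of one substructure (e.g. an ear) classified as infected."""
    return 100.0 * mask.sum() / mask.size

def disease_incidence(masks):
    """Percentage of substructures with at least one pixel classified as infected."""
    infected = sum(1 for m in masks if m.any())
    return 100.0 * infected / len(masks)
```

Aggregating such per-substructure values over all analysed plant parts then yields the disease indices for a location, a field or a field trial.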
In one embodiment a script is generated for an automated application of crop protection product based on the disease index calculated for a location, a field, a field trial, or an area.
In one embodiment the efficacy of a plant protection product applied to that agricultural location is calculated based on the disease index calculated for a location, a field, a field trial or an area.
In one embodiment the disease indices generated through the method are results used to evaluate the outcome of field trials. An outcome means the result of testing for the efficacy of crop protection products, practices for applying crop protection products, including time points and application methods, the performance of crop plant varieties, or fertilizer application. More importantly, the disease indices derived from the method allow the comparison, in a qualitative or quantitative manner, of disease indices indicative of outcomes across different growing periods and geographies, owing to the accuracy of the results obtained according to the method.
In one embodiment a computer-implemented method is disclosed comprising
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a deep learning model architecture, one or more disease pixel sets classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms within one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
In one embodiment the deep learning model architecture is a vision transformer capable of performing the object detection and image segmentation in one step.
As shown in the figures, in one embodiment a device is disclosed comprising an arm (200) including a first, second and third member (210, 220, 230) and first and second means (280) and (290) for connecting the members (210, 220, 230); means (240) for attaching a mobile device (270) with means (260) holding the mobile device (270), wherein the second member (220) is attached at its proximal end to the middle section of the first member (210) at an angle alpha using means (290), and wherein the second member (220) is attached with its distal end to the third member (230) at an angle beta in relation to the plane formed by the first and second member using means (280), and wherein the mobile device (270) is connected to the third member (230) using means (250) or (240) so that the front of the mobile device (270) is parallel to and above the first member (210). Means (260) holds the mobile device (270) directly. Angle alpha is between 60 and 90 degrees, preferably between 75 and 90 degrees, more preferably between 80 and 90 degrees and most preferably 90 degrees. Angle beta is between 45 and 160 degrees, preferably between 60 and 120 degrees, more preferably between 60 and 90 degrees and most preferably 90 degrees.
The members (210), (220), and (230) may be extendable or fixed in length. Examples of members (210), (220), and (230) include pipes and rods. The means (280) and (290) for connecting members may be fixed or removable hinges. The mobile device (270) is suitable to be carried by a human operator. The human operator will hold either member (230), means (250), (240) or (260), or the mobile device (270) itself, or use the member (220) to divide plants in an agricultural area so that the mobile device (270) is able to take the digital image at a defined distance and angle between the member (220) and the mobile device (270).
The means (260) for attaching a mobile device attaches at its distal end to a holder for the mobile device. Examples of means (280), (290), (240), (250) or (260), independently of each other, are clamps, magnetic connections, Velcro fasteners, hooks or cases. At the proximal end the means are attached to the third member.
Examples
Some implementations of the present disclosure will be described more fully with reference to the accompanying examples, in which some, but not all implementations of the disclosure are shown. Indeed, various implementations of the disclosure may be embodied in many different forms and should not be construed as limited to the implementations described in the examples; rather, these example implementations are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Preparation of Data sets for training of the Mask R-CNN neural network
Images that typically included between five and 40 wheat ears were collected from selected field trials for plant protection products between 2020 and 2023 (the collection is ongoing for validation purposes). The images cover different wheat species or varieties, including spring or winter wheat or awned wheat species, different countries, including Germany, Canada, the Czech Republic, Lithuania, the United States, Italy, France and Great Britain, and different levels of Fusarium Head Blight infection. In the newer data sets (see below for the Mask R-CNN) images of awned wheat species were included.
The images were taken based on the assessments of agronomists in accordance with the publication Bulletin OEPP/EPPO Bulletin (2012) 42 (3), 419-425, describing the efficacy evaluation of fungicides on foliar and ear diseases in cereals. Phenotypically, at this point in time, healthy ears still appeared green, corresponding to BBCH stages 75 to 85, so that disease symptoms on the ears were clearly visible. Some images were taken three weeks after the application of the crop protection product.
For the training of the Mask R-CNN, the annotators created polygons around foreground ears. Of about 1605 images of wheat ears, 663 images from 14 trials from eight countries, including trials with awned wheat species, were annotated in high quality.
Figure 3 shows an example image with the polygons of 28 foreground wheat ears assigned by human annotators.
Preparation of data sets for training of the U-Net neural network for the analysis of the subimages
A set of 1165 complete images (usually five to 40 wheat ears per image) was annotated by human annotators who are trained agronomists, in order to ensure that the annotators had the training to identify infected areas in an ear. They overlaid masks on the outlines of the areas of the ears infected with Gibberella zeae. After this, the individual ears were cropped out (creating sub-images) using bounding boxes, and only the pixels marked as diseased within the area of the annotated ears were used in the training. The background in each of the cropped-out sub-images was removed for training. In total, the annotators analysed 22385 individual ears using this approach.
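The cropping and background removal described above can be sketched as follows. This is a minimal illustration, not the patent's actual code; images and masks are represented as nested lists of pixel values, and the name `crop_ear_subimage` is hypothetical.

```python
# Illustrative sketch: crop the sub-image for one annotated ear from a
# full image and blank out everything outside the ear's polygon mask.

def crop_ear_subimage(image, ear_mask, bbox, background=0):
    """Crop bbox = (x0, y0, x1, y1); keep only pixels inside the ear mask."""
    x0, y0, x1, y1 = bbox
    sub = []
    for y in range(y0, y1):
        row = []
        for x in range(x0, x1):
            # Pixels outside the annotated ear polygon are set to background.
            row.append(image[y][x] if ear_mask[y][x] else background)
        sub.append(row)
    return sub

# Tiny 4x4 example: an L-shaped "ear" near the centre of the image.
image = [[9] * 4 for _ in range(4)]
ear_mask = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0]]
print(crop_ear_subimage(image, ear_mask, (1, 1, 3, 3)))  # [[9, 9], [9, 0]]
```

In practice this per-pixel loop would be a vectorized array operation, but the logic — bounding-box crop plus mask-based background removal — is the same.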
Figure 4 shows an example of the masks highlighting the infected areas of the ears. These masks were then overlaid with the polygons outlining the ears, and only those pixels classified as infected within the annotated ears were used.
Model Training and results of the Mask R-CNN neural network
The annotated images were split into a training and validation (train-val) dataset (90% of the images) and a test dataset (10% of the images) using stratified sampling with regard to the 14 different trials in the data. During training, the train-val dataset was further split into a training set (80% of the train-val images) and a validation set (20% of the train-val images).
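The stratified split by trial can be sketched as follows. This is an illustrative sketch only; the function name, data layout and seed are assumptions, not the patent's actual procedure.

```python
# Illustrative sketch: split images into train-val and test sets so that
# each trial (stratum) contributes proportionally to both sets.
import random

def stratified_split(items, key, test_frac=0.10, seed=0):
    """Split items into (train_val, test), stratified by key(item)."""
    rng = random.Random(seed)
    by_stratum = {}
    for it in items:
        by_stratum.setdefault(key(it), []).append(it)
    train_val, test = [], []
    for stratum_items in by_stratum.values():
        rng.shuffle(stratum_items)
        # Each stratum gives at least one image to the test set.
        n_test = max(1, round(len(stratum_items) * test_frac))
        test.extend(stratum_items[:n_test])
        train_val.extend(stratum_items[n_test:])
    return train_val, test

# 140 hypothetical images spread over 14 trials of 10 images each.
images = [{"id": i, "trial": f"trial_{i % 14}"} for i in range(140)]
train_val, test = stratified_split(images, key=lambda im: im["trial"])
print(len(train_val), len(test))  # 126 14
```

Stratifying by trial, rather than sampling uniformly, ensures that every trial (and hence every country, variety and infection level it represents) appears in both the train-val and the test set.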
For the Mask R-CNN, the detectron2 implementation available on GitHub (https://github.com/facebookresearch/detectron2/tree/main) was used.
The results below were achieved using a ResNet-50 backbone model and pre-trained weights.
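A setup of this kind can be sketched with detectron2's standard configuration API, assuming detectron2 is installed. The dataset names, class count and solver settings below are illustrative assumptions, not the patent's actual configuration.

```python
# Configuration sketch: Mask R-CNN with a ResNet-50 FPN backbone and
# COCO-pretrained weights via detectron2's model zoo.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("wheat_ears_train",)  # hypothetical registered dataset
cfg.DATASETS.TEST = ("wheat_ears_val",)     # hypothetical registered dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1         # single class: wheat ear
cfg.SOLVER.IMS_PER_BATCH = 4                # batch size of 4 as described

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
# trainer.train()  # starts the actual training run
```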
Figure 5 shows selected training and validation metrics from a model training run of 5000 epochs with a batch size of 4 images. For better visibility, some of the metrics were scaled as indicated in the figure. Validation metrics were derived every 300 epochs.
Box and segmentation average precision on the validation set are shown in Figure 5 as a measure of model precision.
Figure 6 shows an image in which ears of awned species were detected with a high accuracy using the trained Mask R-CNN.
Model Training and results of the U-Net neural network
The data set with the annotated sub-images was split into a train-val (90%) and a test (10%) set. 10% of the train-val set was used for validation during training.
The U-Net neural network was implemented in PyTorch (see https://pytorch.org/) based on the architecture described in Ronneberger et al. (https://arxiv.org/pdf/1505.04597).
The Dice coefficient was used to evaluate the trained U-Net models.
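For binary segmentation masks, the Dice coefficient can be sketched as follows; the function name and flat-list mask representation are illustrative assumptions.

```python
# Illustrative sketch: Dice coefficient of two binary masks, represented
# here as flat lists of 0/1 pixel labels.

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |A intersect B| / (|A| + |B|); eps avoids division by zero."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)

pred   = [1, 1, 0, 0]
target = [1, 0, 1, 0]
print(round(dice_coefficient(pred, target), 3))  # 0.5
```

A Dice coefficient of 1.0 indicates perfect overlap between the predicted and the annotated infected area; 0.0 indicates no overlap.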
Different learning rate schedules were tested. The learning rate (or step size) is the magnitude of the change applied to the model weights during the backpropagation training process of a neural network. As a configurable hyperparameter, the learning rate is usually specified as a positive value less than 1.0. In backpropagation, the model weights are updated to reduce the error estimate of the loss function. Rather than changing the weights by the full amount, the update is multiplied by the learning rate. For example, setting the learning rate to 0.5 means updating (usually subtracting) the weights with 0.5 times the estimated weight errors (i.e., the gradients, or the change of the total error with respect to the weights). For this case the best model reached a validation Dice coefficient of 0.65.
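The weight update described above can be illustrated with a toy gradient-descent step; the function name and values are illustrative, not taken from the training runs.

```python
# Toy illustration: the gradient is scaled by the learning rate before
# being subtracted from the weights (w <- w - lr * grad).

def sgd_step(weights, gradients, learning_rate=0.5):
    """One plain gradient-descent update for a list of weights."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

weights = [1.0, -2.0]
gradients = [0.5, -0.25]
print(sgd_step(weights, gradients))  # [0.75, -1.875]
```

A learning rate schedule then varies `learning_rate` over the course of training, e.g. decaying it after a fixed number of epochs.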
The image in Figure 7 shows the sub-image of one wheat ear, with the background removed, and the mask predicted by the trained U-Net for the area of the ear that is infected with Gibberella zeae. Based on this area, the disease severity index for that ear was calculated by dividing the number of pixels in the mask by the total number of pixels that belong to the area of the ear (excluding the background).
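The per-ear disease severity index can be sketched as follows; the function name and the nested-list mask representation are illustrative assumptions.

```python
# Illustrative sketch: disease severity index of one ear = infected
# pixels divided by all pixels belonging to the ear (background excluded).

def disease_severity(ear_mask, disease_mask):
    """Fraction of the ear's pixels that are predicted as infected."""
    ear_pixels = sum(sum(row) for row in ear_mask)
    diseased = sum(
        sum(e and d for e, d in zip(ear_row, dis_row))
        for ear_row, dis_row in zip(ear_mask, disease_mask)
    )
    return diseased / ear_pixels if ear_pixels else 0.0

ear_mask     = [[1, 1, 1, 1], [1, 1, 1, 1]]   # 8 ear pixels
disease_mask = [[0, 1, 1, 0], [0, 0, 0, 0]]   # 2 infected pixels
print(disease_severity(ear_mask, disease_mask))  # 0.25
```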
In an alternative approach, the step that produces the masks for the individual ears, which is currently performed by the first model, could be split into tasks performed by two models 1a and 1b. Model 2 (the segmentation of the infected areas) remains unchanged.
This change would add more complexity but could increase the precision of the produced ear outlines, leading to an overall more precise evaluation of, for example, the disease severity.
In the alternative version, model 1a would return bounding boxes around each ear (technically, points on the ear or even a first version of the mask would also work; in tests, boxes gave better results than points). All or a selection of the boxes detected by model 1a are passed to model 1b. Criteria for the selection of a box could, for example, be its detection score, a maximal number of boxes to be processed per image, or the position or size of a box. Model 1a could be the same model as currently used as model 1 (the Mask R-CNN), although the "Mask" part is no longer required: Faster R-CNN could be used, which is the same model without the part that produces the masks, so it returns only bounding boxes. Many other modern object detection models return bounding boxes, which provides a lot of flexibility here.
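The box-selection step between models 1a and 1b can be sketched as follows. The thresholds, the box dictionary format and the function name are illustrative assumptions, not values from the disclosure.

```python
# Illustrative sketch: select which model-1a boxes to pass to model 1b,
# using the criteria named above: detection score, box size, and a cap
# on the number of boxes per image.

def select_boxes(boxes, min_score=0.5, min_area=100, max_boxes=50):
    """Filter boxes of the form {x0, y0, x1, y1, score}."""
    kept = [
        b for b in boxes
        if b["score"] >= min_score
        and (b["x1"] - b["x0"]) * (b["y1"] - b["y0"]) >= min_area
    ]
    # Prefer the highest-scoring detections when the cap is exceeded.
    kept.sort(key=lambda b: b["score"], reverse=True)
    return kept[:max_boxes]

boxes = [
    {"x0": 0, "y0": 0, "x1": 20, "y1": 40, "score": 0.9},  # kept
    {"x0": 5, "y0": 5, "x1": 8,  "y1": 8,  "score": 0.8},  # too small
    {"x0": 0, "y0": 0, "x1": 30, "y1": 30, "score": 0.3},  # low score
]
print(len(select_boxes(boxes)))  # 1
```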
Model 1b takes the bounding boxes as input (referred to as "prompts" in the literature) and produces a high-quality outline (mask) for each box. SAM2 (https://ai.meta.com/sam2/, https://github.com/facebookresearch/sam2) is currently used for this task. Again, some post-processing (e.g. filling holes in the masks) can be applied to the model 1b output.
Claims
1. Computer-implemented method comprising
Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using one or more trained neural networks, one or more disease pixel sets classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms;
Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
2. A method according to claim 1, wherein each of the trained neural networks is a convolutional neural network.
3. A method according to claim 1 or 2, wherein the neural network of the second step is a Mask R-CNN and the neural network of the third step is a U-Net.
4. A method according to any of claims 1 to 3, further comprising calculating one or more disease indices for that agricultural area.
5. A method according to any of claims 1 to 4, wherein the disease is Fusarium graminearum, the species of the crop plant is wheat, and the digital image is an RGB image.
6. A method according to any of claims 1 to 5, further comprising generating a script for an automated application of a crop protection product based on the disease index.
7. A method according to any of claims 1 to 5, further comprising calculating the efficacy of a plant protection product applied to that agricultural location.
8. A computer system comprising: a processor; and a memory storing an application program configured to perform, when executed by the processor, an operation, the operation comprising: Receiving a digital image comprising at least one crop plant or at least one part of one or more crop plants growing in an area;
Generating, by using a trained neural network, one or more subimages, each of the subimages comprising one or more specific substructures separated from other substructures;
Generating, for each subimage, by using a trained neural network, one or more disease pixel sets classifying each pixel within the one or more disease pixel sets regarding plant disease symptoms; Calculating, based on the classification of the pixels in the substructure, one or more disease indices for each substructure.
9. A computer system according to claim 8 wherein each of the trained neural networks is a convolutional neural network.
10. A computer system according to claim 8 or 9, wherein the neural network of the second step is a Mask R-CNN and the neural network of the third step is a U-Net.
11. A computer system according to any of claims 8 to 10, further comprising calculating one or more disease indices for that agricultural area.
12. A computer system according to any of claims 8 to 11, wherein the disease is Fusarium graminearum, the species of the crop plant is wheat, and the digital image is an RGB image.
13. A computer system according to any of claims 8 to 12, further comprising generating a script for an automated application of a crop protection product based on the disease index.
14. A computer system according to any of claims 8 to 12, further comprising calculating the efficacy of a plant protection product applied to that agricultural location.
15. A device comprising:
- an arm including a first, second and third member and first, second and third means for connecting the members;
- means for attaching a mobile device, wherein the second member is attached at its proximal end to the middle section of the first member at an angle between 60 and 90 degrees, and wherein the second member is attached with its distal end to the third member at an angle between 45 and 160 degrees in relation to the plane formed by the first and second member, and wherein the means for attaching a mobile device is connected to the third member so that the front of the mobile device is parallel to and above the first member.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24180949 | 2024-06-07 | ||
| EP24180949.0 | 2024-06-07 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025252921A1 (en) | 2025-12-11 |
Family
ID=91470058
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2025/065740 Pending WO2025252921A1 (en) | 2024-06-07 | 2025-06-05 | Method to determine disease in plants |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025252921A1 (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AU2020395845A1 (en) * | 2019-12-03 | 2022-06-23 | Basf Se | System and method for determining damage on crops |
| US20230010954A1 (en) | 2013-11-26 | 2023-01-12 | Taiwan Semiconductor Manufacturing Company, Ltd. | Structure and Method for FinFET Device with Buried Sige Oxide |
Non-Patent Citations (9)
| Title |
|---|
| ABADE ANDRÉ ET AL: "Plant diseases recognition on images using convolutional neural networks: A systematic review", COMPUTERS AND ELECTRONICS IN AGRICULTURE, ELSEVIER, AMSTERDAM, NL, vol. 185, 30 April 2021 (2021-04-30), XP086572486, ISSN: 0168-1699, [retrieved on 20210430], DOI: 10.1016/J.COMPAG.2021.106125 * |
| BOCK ET AL., PHYTOPATHOLOGY, vol. 2, 2020, pages 1 - 9 |
| BULLETIN OEPP/EPPO BULLETIN, vol. 42, no. 3, 2012, pages 419 - 425 |
| JOURNAL FÜR KULTURPFLANZEN, vol. 61, no. 7, 2009, pages 247 - 253, ISSN: 0027-7479 |
| KUMAR DEEPAK ET AL: "An Instance Segmentation Approach for Wheat Yellow Rust Disease Recognition", 2021 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATION (DASA), IEEE, 7 December 2021 (2021-12-07), pages 926 - 931, XP034046863, DOI: 10.1109/DASA53625.2021.9682257 * |
| NACHRICHTENBL. DEUT. PFLANZENSCHUTZD., vol. 59, no. 7, 2007, pages 155 - 161, ISSN: 0027-7479 |
| RONNEBERGER, O.FISCHER, P.BROX, T.: "Medical Image Computing and Computer-Assisted Intervention", vol. 9351, 2015, SPRINGER, article "U-Net: Convolutional Networks for Biomedical Image Segmentation" |
| S. KHAN ET AL.: "A Guide to Convolutional Neural Networks for computer Vision", 2018, MORGAN & CLAYPOOL PUBLISHERS |
| SOWMIYA M ET AL: "Deep Learning Techniques to Detect Crop Disease and Nutrient Deficiency -A Survey", 2021 INTERNATIONAL CONFERENCE ON SYSTEM, COMPUTATION, AUTOMATION AND NETWORKING (ICSCAN), IEEE, 30 July 2021 (2021-07-30), pages 1 - 5, XP033969357, DOI: 10.1109/ICSCAN53069.2021.9526442 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Zhao et al. | Multiple disease detection method for greenhouse-cultivated strawberry based on multiscale feature fusion Faster R_CNN | |
| Forbes et al. | Field assessment of resistance in potato to Phytophthora infestans: International cooperators guide | |
| Oerke | Remote sensing of diseases | |
| Birrell et al. | A field‐tested robotic harvesting system for iceberg lettuce | |
| Mahlein et al. | Hyperspectral sensors and imaging technologies in phytopathology: state of the art | |
| Rani et al. | Pathogen-based classification of plant diseases: A deep transfer learning approach for intelligent support systems | |
| Sapkota et al. | Comparing YOLOv11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment | |
| Partel et al. | Automated vision-based system for monitoring Asian citrus psyllid in orchards utilizing artificial intelligence | |
| Duan et al. | Dynamic quantification of canopy structure to characterize early plant vigour in wheat genotypes | |
| Schor et al. | Development of a robotic detection system for greenhouse pepper plant diseases | |
| Malathy et al. | Disease detection in fruits using image processing | |
| Elisabeth Lof et al. | Achieving durable resistance against plant diseases: scenario analyses with a national-scale spatially explicit model for a wind-dispersed plant pathogen | |
| Garin et al. | A modelling framework to simulate foliar fungal epidemics using functional–structural plant models | |
| Irwin | Implications of movement in developing and deploying integrated pest management strategies | |
| Mirnezami et al. | Automated trichome counting in soybean using advanced image‐processing techniques | |
| Schumann et al. | Detection of three fruit maturity stages in wild blueberry fields using deep learning artificial neural networks | |
| Hund et al. | Non-invasive field phenotyping of cereal development | |
| Vidal et al. | Cultivar architecture modulates spore dispersal by rain splash: A new perspective to reduce disease progression in cultivar mixtures | |
| Wade et al. | Temporal variation in arthropod sampling effectiveness: the case for using the beat sheet method in cotton | |
| Dutta et al. | Application and prospects of artificial intelligence (AI)-based technologies in fruit production systems | |
| Germain et al. | Shared friends counterbalance shared enemies in old forests | |
| WO2025252921A1 (en) | Method to determine disease in plants | |
| Einspanier et al. | High-resolution disease phenotyping reveals distinct resistance mechanisms of tomato crop wild relatives against sclerotinia sclerotiorum | |
| da Cunha et al. | Psyllid Detector: A web-based application to automate insect detection utilizing image processing and deep learning | |
| Miranda et al. | High‐throughput phenotyping and machine learning techniques in soybean breeding: Exploring the potential of aerial imaging and vegetation indices |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 25729159 Country of ref document: EP Kind code of ref document: A1 |