JP7599905B2 - Information processing device and information processing method

Info

Publication number: JP7599905B2
Application number: JP2020179983A
Authority: JP
Inventors: 将史瀧本; 竜也山本; 英太小野; 悟間宮; 茂樹弘岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-10-27
Filing date: 2020-10-27
Publication date: 2024-12-16
Anticipated expiration: 2040-10-27
Also published as: JP2022070747A

Description

本発明は、撮影画像に基づく予測のための技術に関するものである。 The present invention relates to a technique for prediction based on captured images.

近年、農業において、収量の予測、最適な収穫時期の予測や農薬散布量の制御、圃場修復計画等、多様な問題解決に役立てるために、ＩＴ化によって課題を解決する取り組みが盛んに行われている。 In recent years, there have been many efforts to use IT to solve a variety of problems in agriculture, such as predicting yields, predicting optimal harvest times, controlling the amount of pesticide sprayed, and planning field restoration.

例えば、特許文献１には、農作物を生育させる場から取得したセンサ情報とそれら情報を格納したデータベースを適宜参照することにより生育状況や収穫予測を早期に把握し、生育の異常状態を早期に発見して対処する方法が開示されている。 For example, Patent Document 1 discloses a method for quickly determining the growth status and harvest forecast by appropriately referring to sensor information acquired from the field where agricultural crops are grown and a database that stores that information, and for quickly detecting and dealing with abnormal growth conditions.

また、特許文献２には、農作物に関する多種多様なセンサから獲得した情報を基に登録済の情報を参照して任意の推論を行うことで農作物の品質や収量のばらつきを抑制する圃場管理を行う方法が開示されている。 Patent Document 2 discloses a field management method that reduces variation in crop quality and yield by making arbitrary inferences with reference to registered information, based on information acquired from a wide variety of crop-related sensors.

特許文献１: 特開２００５－１３７２０９号公報 Patent Document 1: JP 2005-137209 A
特許文献２: 特開２０１６－４９１０２号公報 Patent Document 2: JP 2016-49102 A

しかしながら、従来から提案されてきた方式では、予測等を実施する圃場に関して過去取得した事例を充分数保持し、該事例に関する情報を基に予測事項が精度良く推定できるような調整作業が済んでいることが前提となっている。 However, the methods proposed so far have been based on the assumption that a sufficient number of past cases for the fields for which predictions are to be performed are stored and that adjustments have been made so that predictions can be estimated with high accuracy based on information about those cases.

一方で、農作物の出来不出来は一般的に、天候・気候等の環境の変動に大きく影響を受け、作業者による肥料・農薬等の散布状態によっても大きく異なる。全ての外的要因による条件が毎年不変であるならば、収量の予測や収穫時期の予測等は実施する必要すら無くなるが、工業と異なり農業は作業者自ら制御不可能な外的要因が多いため、予測は非常に困難である。また、未経験の天候が続いた場合の収量等を予測するような場合、上記の過去に取得した事例から調整された推定システムでは正しい予測が困難である。 On the other hand, the success or failure of agricultural crops is generally greatly influenced by environmental fluctuations such as weather and climate, and can also vary greatly depending on how workers spray fertilizers and pesticides. If all conditions due to external factors remained constant from year to year, there would be no need to predict yields or harvest times, but unlike industry, agriculture has many external factors that workers cannot control, making predictions extremely difficult. Also, when trying to predict yields when unprecedented weather conditions continue, it is difficult to make accurate predictions using estimation systems that have been adjusted from past examples as mentioned above.

最も予測が困難なケースは、新規に上記の予測システムを圃場に導入した場合である。例えば、特定の圃場で収量の予測や、生育不良な領域（枯れ枝・病変）を修繕することを目的とした非生産領域の検出を行う場合を考える。こういったタスクにおいては通常、上記の圃場で過去に収集した農作物に関する画像やパラメータをデータベースに保持しておく。そして、実際に圃場に対して予測等を実施する際には、観測された現在の圃場で撮影された画像やその他センサから取得された生育情報に関わるデータを相互に参照して調整し、精度良く予測する。しかし、上記の如く、これらの予測システムや非生産領域検出器を異なる新規の圃場にも導入した場合、（圃場の）条件が合致しないことが多いために、すぐに適用することができない。こういった場合は、新規の圃場で充分な数のデータの収集を実施して調整するという作業が必要であった。 The most difficult case to predict is when the above prediction system is introduced to a new field. For example, consider the case of predicting yield in a specific field or detecting non-productive areas for the purpose of repairing poorly growing areas (dead branches, lesions). In such tasks, images and parameters of crops collected in the past in the field are usually stored in a database. Then, when actually carrying out predictions on the field, images taken in the current observed field and data related to growth information obtained from other sensors are cross-referenced and adjusted to make accurate predictions. However, as mentioned above, when these prediction systems and non-productive area detectors are introduced to a different new field, they cannot be applied immediately because the conditions (of the field) often do not match. In such cases, it was necessary to collect a sufficient amount of data in the new field and make adjustments.

また、上記の予測システムや非生産領域検出器の調整を人手による調整で行う場合、農作物の生育に関わるパラメータは高次元になるため多くの手間がかかる。また、ディープラーニングやそれらに準じた機械学習的手法で実施する場合であっても、新規の入力に対して良い性能を発揮するためには通常、人手によるラベル付与（アノテーション）作業が必要となるため、作業コストが大きくかかってしまう。 In addition, if the above-mentioned prediction system or non-productive area detector is adjusted manually, it takes a lot of time because the parameters related to crop growth are high-dimensional. Even when deep learning or similar machine learning methods are used, manual labeling (annotation) is usually required to achieve good performance for new input, resulting in high work costs.

本来であれば、予測システムを新規に導入する際や、過去に無かった天災や天候の場合であっても、ユーザの負荷が少ない簡易な設定で良好な予測・推定を行うことが好ましい。 Ideally, when introducing a new prediction system or in the case of unprecedented natural disasters or weather conditions, it is desirable to be able to make good predictions and estimates with simple settings that place a low burden on the user.

本発明では、過去に収集している情報のみから処理が困難である場合や、過去に収集した情報が無い場合であっても、状況に応じた学習モデルによる処理を可能にするための技術を提供する。 The present invention provides technology that enables processing using a learning model according to the situation, even when processing is difficult using only previously collected information or when no previously collected information is available.

本発明の一様態は、オブジェクトの撮影に係る情報に基づいて、互いに異なる学習環境において学習した複数の学習モデルから１以上の学習モデルを候補学習モデルとして選択する第１選択手段と、
前記第１選択手段が選択した候補学習モデルによるオブジェクト検出処理の結果に基づいて、該候補学習モデルから１以上の候補学習モデルを選択する第２選択手段と、
前記第２選択手段が選択した候補学習モデルのうちの少なくともいずれか一つの候補学習モデルを用いて、前記オブジェクトの撮影画像に対するオブジェクト検出処理を行う検出手段と、
前記オブジェクト検出処理の結果として得られるオブジェクトの検出領域に基づいて、農作物の収量の予測、圃場における修繕箇所の検出を行う手段と
を備えることを特徴とする。
One aspect of the present invention is characterized by comprising:
first selection means for selecting, based on information related to the photographing of an object, one or more learning models as candidate learning models from among a plurality of learning models trained in mutually different learning environments;
second selection means for selecting one or more candidate learning models from among the candidate learning models selected by the first selection means, based on the results of object detection processing performed using those candidate learning models;
detection means for performing object detection processing on a captured image of the object using at least one of the candidate learning models selected by the second selection means; and
means for predicting the yield of a crop and detecting locations to be repaired in a field, based on the object detection regions obtained as a result of the object detection processing.
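
The two-stage model selection described above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent text: the names `LearningModel`, `first_selection`, and `second_selection`, the first-stage criterion (the number of shooting attributes that match a model's learning environment), the second-stage criterion (the mean detection confidence on sample images), and the sample attribute values. The text itself only states that the first stage uses information related to photographing and that the second stage uses object detection results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical detector signature: image in, one confidence score per detected region out.
Detector = Callable[[object], List[float]]


@dataclass
class LearningModel:
    name: str
    training_env: Dict[str, str]  # hypothetical description of the learning environment
    detect: Detector


def first_selection(models: List[LearningModel],
                    shooting_info: Dict[str, str],
                    top_k: int = 2) -> List[LearningModel]:
    """Stage 1 (assumed criterion): keep the models whose learning
    environment shares the most attributes with the shooting information."""
    def match(model: LearningModel) -> int:
        return sum(1 for key, value in shooting_info.items()
                   if model.training_env.get(key) == value)
    return sorted(models, key=match, reverse=True)[:top_k]


def second_selection(candidates: List[LearningModel],
                     sample_images: List[object],
                     top_k: int = 1) -> List[LearningModel]:
    """Stage 2 (assumed criterion): run detection with every candidate and
    keep those with the highest mean confidence over the sample images."""
    def score(model: LearningModel) -> float:
        confidences = [c for image in sample_images for c in model.detect(image)]
        return sum(confidences) / len(confidences) if confidences else 0.0
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Toy data: the attribute names and values below are placeholders.
models = [
    LearningModel("A", {"crop": "grape", "season": "spring"}, lambda img: [0.9, 0.8]),
    LearningModel("B", {"crop": "grape", "season": "summer"}, lambda img: [0.4]),
    LearningModel("C", {"crop": "apple", "season": "winter"}, lambda img: [0.99]),
]
candidates = first_selection(models, {"crop": "grape", "season": "spring"})
selected = second_selection(candidates, sample_images=["img1", "img2"])
```

In this toy run, model C never reaches the second stage even though its detector reports the highest confidence, because its learning environment does not match the shooting information. This is the point of filtering by shooting conditions before comparing detection results.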

本発明の構成によれば、過去に収集している情報のみから処理が困難である場合や、過去に収集した情報が無い場合であっても、状況に応じた学習モデルによる処理を可能にすることができる。 The configuration of the present invention makes it possible to process using a learning model according to the situation even when processing is difficult using only previously collected information or when no previously collected information exists.

システムの構成例を示す図。 Diagram showing an example of a system configuration.
システムが行う処理のフローチャート。 Flowchart of processing performed by the system.
ステップＳ２３における処理の詳細を示すフローチャート。 Flowchart showing details of the processing in step S23.
ステップＳ２３３における処理の詳細を示すフローチャート。 Flowchart showing details of the processing in step S233.
カメラ１０による圃場の撮影方法の一例を示す図。 Diagram showing an example of a method of photographing a field with the camera 10.
困難な事例を示す図。 Diagram showing difficult cases.
撮影画像に対してアノテーション作業を行った結果を示す図。 Diagram showing the results of annotation work performed on captured images.
ＧＵＩの表示例を示す図。 Diagram showing a display example of a GUI.
ＧＵＩの表示例を示す図。 Diagram showing a display example of a GUI.
システムが行う処理のフローチャート。 Flowchart of processing performed by the system.
ステップＳ８３における処理の詳細を示すフローチャート。 Flowchart showing details of the processing in step S83.
ステップＳ８３３における処理の詳細を示すフローチャート。 Flowchart showing details of the processing in step S833.
検出領域の検出例を示す図。 Diagram showing an example of detected regions.
ＧＵＩの表示例を示す図。 Diagram showing a display example of a GUI.
（Ａ）はクエリパラメータの構成例を示す図、（Ｂ）は学習モデルのパラメータセットの構成例を示す図、（Ｃ）はクエリパラメータの構成例を示す図。 (A) is a diagram showing a configuration example of query parameters, (B) is a diagram showing a configuration example of a parameter set of a learning model, and (C) is a diagram showing a configuration example of query parameters.

以下、添付図面を参照して実施形態を詳しく説明する。尚、以下の実施形態は特許請求の範囲に係る発明を限定するものではない。実施形態には複数の特徴が記載されているが、これらの複数の特徴の全てが発明に必須のものとは限らず、また、複数の特徴は任意に組み合わせられてもよい。さらに、添付図面においては、同一若しくは同様の構成に同一の参照番号を付し、重複した説明は省略する。 The following embodiments are described in detail with reference to the attached drawings. Note that the following embodiments do not limit the invention according to the claims. Although the embodiments describe multiple features, not all of these multiple features are necessarily essential to the invention, and multiple features may be combined in any manner. Furthermore, in the attached drawings, the same reference numbers are used for the same or similar configurations, and duplicate explanations are omitted.

［第１の実施形態］ [First embodiment]
本実施形態では、カメラによって撮影された圃場の撮影画像から、該圃場における農作物の収量の予測や修繕箇所の検出等、該圃場の分析処理を行うシステムについて説明する。 This embodiment describes a system that performs analysis processing of a field, such as predicting the yield of crops in the field and detecting locations that need repair, based on images of the field captured by a camera.

まず、本実施形態に係るシステムの構成例について、図１を用いて説明する。図１に示す如く、本実施形態に係るシステムは、カメラ１０、クラウドサーバ１２、情報処理装置１３を有する。 First, an example of the configuration of a system according to this embodiment will be described with reference to FIG. 1. As shown in FIG. 1, the system according to this embodiment has a camera 10, a cloud server 12, and an information processing device 13.

まず、カメラ１０について説明する。カメラ１０は圃場の動画像を撮影し、該動画像における各フレームの画像を「圃場の撮影画像」として出力する。もしくはカメラ１０は圃場の静止画像を定期的もしくは不定期的に撮影し、該撮影した静止画像を「圃場の撮影画像」として出力する。撮影画像から後述の予測を正確に行うためには、同じ圃場で撮影した画像は可能な限り同じ環境、条件で撮影されていることが望ましい。カメラ１０から出力された撮影画像はＬＡＮやインターネットなどの通信網１１を介してクラウドサーバ１２や情報処理装置１３に対して送信される。 First, the camera 10 will be described. The camera 10 captures moving images of the field and outputs the images of each frame in the moving images as "captured images of the field". Alternatively, the camera 10 captures still images of the field periodically or irregularly and outputs the captured still images as "captured images of the field". In order to accurately perform the predictions described below from the captured images, it is desirable that images captured in the same field are captured in the same environment and conditions as much as possible. The captured images output from the camera 10 are transmitted to a cloud server 12 or an information processing device 13 via a communication network 11 such as a LAN or the Internet.

カメラ１０による圃場の撮影方法は特定の撮影方法に限らない。カメラ１０による圃場の撮影方法の一例を図３（Ａ）を用いて説明する。図３（Ａ）ではカメラ１０としてカメラ３３およびカメラ３４を用いている。一般的な圃場では、農家によって計画的に植えられた農作物の木が列をなしており、例えば、図３（Ａ）に示す如く、農作物の木の列３０や農作物の木の列３１のように、何列も農作物の木が並んで植えられている。農作業用トラクター３２には、矢印で示している進行方向において左側の農作物の木の列３１を撮影するカメラ３４と、右側の農作物の木の列３０を撮影するカメラ３３と、が設けられている。よって農作業用トラクター３２が列３０と列３１との間を矢印で示す進行方向に移動すると、カメラ３４は列３１における農作物の木の撮影画像を複数枚撮影することになり、カメラ３３は列３０における農作物の木の撮影画像を複数枚撮影することになる。 The method of photographing the field using the camera 10 is not limited to a specific photographing method. An example of the method of photographing the field using the camera 10 will be described with reference to FIG. 3(A). In FIG. 3(A), cameras 33 and 34 are used as the camera 10. In a typical field, rows of crop trees are planted in a planned manner by a farmer. For example, as shown in FIG. 3(A), rows of crop trees are planted side by side, such as row 30 of crop trees and row 31 of crop trees. The farm tractor 32 is provided with a camera 34 that photographs the row 31 of crop trees on the left side in the traveling direction indicated by the arrow, and a camera 33 that photographs the row 30 of crop trees on the right side. Therefore, when the farm tractor 32 moves between the rows 30 and 31 in the traveling direction indicated by the arrow, the camera 34 will take multiple images of the crop trees in row 31, and the camera 33 will take multiple images of the crop trees in row 30.

農作業用トラクター３２が入って作業するようにデザインされ、等間隔に農作物の木が植えられているような多くの圃場では、図３（Ａ）に示す如く農作業用トラクター３２に設置されたカメラ３３，３４で農作物の木を撮影することで、より多くの農作物の木を一定の高さで農作物の木から一定の距離を保った状態で撮影することが比較的容易に実現できる。そのため、ほとんど同じ条件で対象の圃場全ての画像を撮影することが可能となり、望ましい条件での画像撮影が容易に実現される。 In many farm fields that are designed for agricultural tractors 32 to enter and where crop trees are planted at equal intervals, it is relatively easy to photograph a large number of crop trees at a constant height and at a constant distance from the crop trees by photographing the crop trees with cameras 33, 34 installed on the agricultural tractor 32 as shown in FIG. 3(A). This makes it possible to take images of the entire target field under almost the same conditions, making it easy to take images under desirable conditions.

なお、概ね同等の条件で圃場の撮影を行うことが可能であれば、他の撮影方法を採用しても良い。カメラ１０による圃場の撮影方法の一例を図３（Ｂ）を用いて説明する。図３（Ｂ）ではカメラ１０としてカメラ３８およびカメラ３９を用いている。図３（Ｂ）に示す如く、農作物の木の列３５と農作物の木の列３６との間の間隔が狭く、トラクターによる走行が不可能な圃場等では、ドローン３７に取り付けたカメラ３８およびカメラ３９による撮影でも良い。ドローン３７には、矢印で示している進行方向において左側の農作物の木の列３６を撮影するカメラ３９と、右側の農作物の木の列３５を撮影するカメラ３８と、が設けられている。よってドローン３７が列３５と列３６との間を矢印で示す進行方向に移動すると、カメラ３９は列３６における農作物の木の撮影画像を複数枚撮影することになり、カメラ３８は列３５における農作物の木の撮影画像を複数枚撮影することになる。 Note that other photographing methods may be used as long as it is possible to photograph the field under roughly the same conditions. An example of a method of photographing a field using a camera 10 will be described with reference to FIG. 3(B). In FIG. 3(B), cameras 38 and 39 are used as the camera 10. As shown in FIG. 3(B), in a field where the distance between the rows of crop trees 35 and the rows of crop trees 36 is narrow and travel by a tractor is not possible, photographing may be performed using cameras 38 and 39 attached to a drone 37. The drone 37 is provided with a camera 39 that photographs the row of crop trees 36 on the left side in the traveling direction indicated by the arrow, and a camera 38 that photographs the row of crop trees 35 on the right side. Therefore, when the drone 37 moves between the rows 35 and 36 in the traveling direction indicated by the arrow, the camera 39 will take multiple photographs of the crop trees in the row 36, and the camera 38 will take multiple photographs of the crop trees in the row 35.

また、自走ロボットに設置されたカメラによって農作物の木の撮影画像を撮影するようにしても良い。また、撮影に用いるカメラの数は図３（Ａ）、（Ｂ）では２としているが、特定の数に限らない。 Also, a camera installed on the self-propelled robot may be used to capture images of crop trees. Although the number of cameras used for capturing images is two in Figures 3(A) and (B), this is not limited to a specific number.

農作物の木の撮影画像をどのような撮影方法で撮影したとしても、カメラ１０は、撮像画像には、該撮影画像の撮影時における撮影情報（撮影位置（例えばＧＰＳによって測定された撮影位置）、撮影日時、カメラ１０に係る情報等が記録されたＥｘｉｆ情報）を添付して出力する。 Regardless of the method used to photograph the crop trees, the camera 10 attaches to each captured image the shooting information at the time of capture (Exif information recording the shooting position (e.g., a position measured by GPS), the shooting date and time, information about the camera 10, and so on) and outputs the image.
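
As a rough illustration of the shooting information that accompanies each captured image, the following Python sketch models it as a small record. The type and field names (`ShootingInfo`, `CapturedImage`, `attach_shooting_info`) and the sample values are hypothetical. The sketch does not read or write real Exif data and only mirrors the three kinds of information the text names: shooting position, shooting date and time, and camera information.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class ShootingInfo:
    position: Optional[Tuple[float, float]]  # (latitude, longitude), e.g. from GPS; None if unavailable
    captured_at: datetime                    # shooting date and time
    camera: str                              # free-form camera description (assumed format)


@dataclass(frozen=True)
class CapturedImage:
    pixels: bytes        # encoded image data
    info: ShootingInfo   # shooting information attached to the image


def attach_shooting_info(pixels: bytes,
                         position: Optional[Tuple[float, float]],
                         captured_at: datetime,
                         camera: str) -> CapturedImage:
    """Bundle image data with its shooting information before output."""
    return CapturedImage(pixels=pixels,
                         info=ShootingInfo(position, captured_at, camera))


# Placeholder values for illustration only.
image = attach_shooting_info(b"\x00\x01", (35.0, 139.0),
                             datetime(2020, 10, 27, 9, 30), "camera-10")
```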

次に、クラウドサーバ１２について説明する。クラウドサーバ１２には、カメラ１０から送信される撮影画像およびＥｘｉｆ情報が登録される。また、クラウドサーバ１２には、撮影画像から農作物に係る画像領域を検出するための学習モデル（検出器／設定）が複数登録されており、それぞれの学習モデルは互いに異なる学習環境で学習したモデルである。そしてクラウドサーバ１２は、自身が保持している複数の学習モデルのうち、撮影画像から農作物に係る画像領域を検出する際に用いる学習モデルの候補を選択して情報処理装置１３に提示する。 Next, the cloud server 12 will be described. The captured image and Exif information transmitted from the camera 10 are registered in the cloud server 12. In addition, a plurality of learning models (detectors/settings) for detecting an image area related to agricultural products from the captured image are registered in the cloud server 12, and each learning model is a model trained in a different learning environment. Then, the cloud server 12 selects a candidate learning model to be used when detecting an image area related to agricultural products from the captured image from among the plurality of learning models held by the cloud server 12, and presents the candidate learning model to the information processing device 13.

ＣＰＵ１９１は、ＲＡＭ１９２やＲＯＭ１９３に格納されているコンピュータプログラムやデータを用いて各種の処理を実行する。これによりＣＰＵ１９１は、クラウドサーバ１２全体の動作制御を行うと共に、クラウドサーバ１２が行うものとして説明する各種の処理を実行もしくは制御する。 The CPU 191 executes various processes using computer programs and data stored in the RAM 192 and the ROM 193. As a result, the CPU 191 controls the operation of the entire cloud server 12, and executes or controls various processes that will be described as being performed by the cloud server 12.

ＲＡＭ１９２は、ＲＯＭ１９３や外部記憶装置１９６からロードされたコンピュータプログラムやデータを格納するためのエリア、Ｉ／Ｆ１９７を介して外部から受信したデータを格納するためのエリア、を有する。さらにＲＡＭ１９２は、ＣＰＵ１９１が各種の処理を実行する際に用いるワークエリアを有する。このようにＲＡＭ１９２は、各種のエリアを適宜提供することができる。 RAM 192 has an area for storing computer programs and data loaded from ROM 193 or external storage device 196, and an area for storing data received from the outside via I/F 197. RAM 192 also has a work area used by CPU 191 when executing various processes. In this way, RAM 192 can provide various areas as needed.

ＲＯＭ１９３には、クラウドサーバ１２の設定データ、クラウドサーバ１２の起動に係るコンピュータプログラムやデータ、クラウドサーバ１２の基本動作に係るコンピュータプログラムやデータ、などが格納されている。 ROM 193 stores configuration data for cloud server 12, computer programs and data related to starting up cloud server 12, computer programs and data related to the basic operation of cloud server 12, and the like.

操作部１９４は、キーボード、マウス、タッチパネルなどのユーザインターフェースであり、ユーザが操作することで各種の指示をＣＰＵ１９１に対して入力することができる。 The operation unit 194 is a user interface such as a keyboard, mouse, or touch panel, and the user can operate it to input various instructions to the CPU 191.

表示部１９５は、液晶画面やタッチパネル画面などの画面を有し、ＣＰＵ１９１による処理結果を画像や文字などでもって表示することができる。なお、表示部１９５は、画像や文字を投影するプロジェクタなどの投影装置であっても良い。 The display unit 195 has a screen such as an LCD screen or a touch panel screen, and can display the results of processing by the CPU 191 using images, text, etc. The display unit 195 may also be a projection device such as a projector that projects images and text.

外部記憶装置１９６は、ハードディスクドライブ装置などの大容量情報記憶装置である。外部記憶装置１９６には、ＯＳ（オペレーティングシステム）や、クラウドサーバ１２が行うものとして説明する各種の処理をＣＰＵ１９１に実行もしくは制御させるためのコンピュータプログラムやデータが保存されている。外部記憶装置１９６に保存されているデータには、上記の学習モデルに係るデータも含まれている。外部記憶装置１９６に保存されているコンピュータプログラムやデータは、ＣＰＵ１９１による制御に従って適宜ＲＡＭ１９２にロードされ、ＣＰＵ１９１による処理対象となる。 The external storage device 196 is a large-capacity information storage device such as a hard disk drive device. The external storage device 196 stores an OS (operating system) and computer programs and data for causing the CPU 191 to execute or control various processes described as being performed by the cloud server 12. The data stored in the external storage device 196 also includes data related to the learning model described above. The computer programs and data stored in the external storage device 196 are loaded into the RAM 192 as appropriate under the control of the CPU 191, and become the subject of processing by the CPU 191.

Ｉ／Ｆ１９７は、外部とのデータ通信を行うための通信インターフェースであり、クラウドサーバ１２は、Ｉ／Ｆ１９７を介して外部とのデータの送受信を行う。ＣＰＵ１９１、ＲＡＭ１９２、ＲＯＭ１９３、操作部１９４、表示部１９５、外部記憶装置１９６、Ｉ／Ｆ１９７、は何れもシステムバス１９８に接続されている。なお、クラウドサーバ１２の構成は図１に示した構成に限らない。 I/F 197 is a communication interface for performing data communication with the outside, and cloud server 12 transmits and receives data with the outside via I/F 197. CPU 191, RAM 192, ROM 193, operation unit 194, display unit 195, external storage device 196, and I/F 197 are all connected to system bus 198. Note that the configuration of cloud server 12 is not limited to the configuration shown in FIG. 1.

なお、カメラ１０から出力された撮影画像およびＥｘｉｆ情報を一時的に他の装置のメモリに格納し、該メモリから通信網１１を介してクラウドサーバ１２に該撮影画像およびＥｘｉｆ情報を転送するようにしても良い。 The captured image and Exif information output from the camera 10 may be temporarily stored in the memory of another device, and the captured image and Exif information may be transferred from the memory to the cloud server 12 via the communication network 11.

次に、情報処理装置１３について説明する。情報処理装置１３は、ＰＣ（パーソナルコンピュータ）、スマートフォン、タブレット端末装置、などのコンピュータ装置である。情報処理装置１３は、クラウドサーバ１２によって提示された学習モデルの候補をユーザに提示してユーザからの学習モデルの選択を受け付け、ユーザにより選択された学習モデルをクラウドサーバ１２に通知する。クラウドサーバ１２は、情報処理装置１３から通知された学習モデル（候補からユーザが選択した学習モデル）用いて、カメラ１０による撮影画像から農作物に係る画像領域の検出（オブジェクト検出処理）を行って、上記の分析処理を行う。 Next, the information processing device 13 will be described. The information processing device 13 is a computer device such as a PC (personal computer), a smartphone, or a tablet terminal device. The information processing device 13 presents candidates for learning models presented by the cloud server 12 to the user, accepts the user's selection of a learning model, and notifies the cloud server 12 of the learning model selected by the user. The cloud server 12 uses the learning model notified by the information processing device 13 (the learning model selected by the user from the candidates) to detect image areas related to agricultural products from the image captured by the camera 10 (object detection processing), and performs the above-mentioned analysis processing.

ＣＰＵ１３１は、ＲＡＭ１３２やＲＯＭ１３３に格納されているコンピュータプログラムやデータを用いて各種の処理を行う。これによりＣＰＵ１３１は、情報処理装置１３全体の動作制御を行うと共に、情報処理装置１３が行うものとして説明する各種の処理を実行もしくは制御する。 The CPU 131 performs various processes using computer programs and data stored in the RAM 132 and the ROM 133. As a result, the CPU 131 controls the operation of the entire information processing device 13, and also executes or controls various processes that will be described as being performed by the information processing device 13.

ＲＡＭ１３２は、ＲＯＭ１３３からロードされたコンピュータプログラムやデータを格納するためのエリア、入力Ｉ／Ｆ１３５を介してカメラ１０やクラウドサーバ１２から受信したデータを格納するためのエリア、を有する。さらにＲＡＭ１３２は、ＣＰＵ１３１が各種の処理を実行する際に用いるワークエリアを有する。このように、ＲＡＭ１３２は、各種のエリアを適宜提供することができる。 RAM 132 has an area for storing computer programs and data loaded from ROM 133, and an area for storing data received from camera 10 and cloud server 12 via input I/F 135. RAM 132 also has a work area used by CPU 131 when executing various processes. In this way, RAM 132 can provide various areas as appropriate.

ＲＯＭ１３３には、情報処理装置１３の設定データ、情報処理装置１３の起動に係るコンピュータプログラムやデータ、情報処理装置１３の基本動作に係るコンピュータプログラムやデータ、などが格納されている。 ROM 133 stores setting data for information processing device 13, computer programs and data related to the startup of information processing device 13, computer programs and data related to the basic operation of information processing device 13, etc.

出力Ｉ／Ｆ１３４は、情報処理装置１３が各種の情報を外部に出力／送信するために用いるインターフェースである。 The output I/F 134 is an interface used by the information processing device 13 to output/transmit various information to the outside.

入力Ｉ／Ｆ１３５は、情報処理装置１３が各種の情報を外部から入力／受信するために用いるインターフェースである。 The input I/F 135 is an interface used by the information processing device 13 to input/receive various information from the outside.

表示装置１４は、液晶画面やタッチパネル画面を有し、ＣＰＵ１３１による処理結果を画像や文字などでもって表示することができる。なお、表示装置１４は、画像や文字を投影するプロジェクタなどの投影装置であっても良い。 The display device 14 has an LCD screen or a touch panel screen, and can display the results of processing by the CPU 131 as images, text, etc. The display device 14 may also be a projection device such as a projector that projects images and text.

ユーザインターフェース１５は、キーボードやマウスを含み、ユーザが操作することで各種の指示をＣＰＵ１３１に対して入力することができる。なお、情報処理装置１３の構成は図１に示した構成に限らず、例えば、ハードディスクドライブ装置などの大容量情報記憶装置を有し、該ハードディスクドライブ装置に後述するＧＵＩなどのコンピュータプログラムやデータを保存しておいても良い。また、ユーザインターフェース１５には、タッチパネルなどのタッチセンサを含めても良い。 The user interface 15 includes a keyboard and a mouse, and can be operated by the user to input various instructions to the CPU 131. Note that the configuration of the information processing device 13 is not limited to the configuration shown in FIG. 1, and may include, for example, a large-capacity information storage device such as a hard disk drive device, in which computer programs and data such as a GUI described below are stored. The user interface 15 may also include a touch sensor such as a touch panel.

次に、カメラ１０により撮影された圃場の撮影画像から、該圃場で収穫される農作物の収量を収穫時期よりも早い段階で予測するタスクの流れについて説明する。単純に収穫時期に収穫対象である果実等をカウントすることによって収穫量を予測する場合、単純に特定物体検出と称される方法で対象果実を撮影画像から識別器によって検出すれば目的が達成される。これは、果実自体が極めて特徴的な外観を有しているため、この特徴的な外観を学習した識別器によって検出する方法である。 Next, the flow of the task of predicting, from images of a field captured by the camera 10, the yield of the crops to be harvested in that field at a stage earlier than the harvest time will be described. If the yield is predicted simply by counting the fruit to be harvested at harvest time, the objective can be achieved by having a classifier detect the target fruit in the captured images using a method known as specific object detection. Because the fruit itself has a highly distinctive appearance, this method detects it with a classifier that has learned that appearance.

本実施形態では、農作物が果実の場合は、該果実が成熟した後に該果実をカウントすることのみに留まらず、収穫時期よりも早い段階で該果実の収量を予測する。例えば、後に果実となる花序を検出してその数から収量を予測したり、果実が生る可能性の低い枯れ枝や病変領域を検出することで収量を予測したり、木の葉の生い茂り方の状態から収量を予測したりする。このような予測を行うためには、撮影時期や気候によって農作物の生育状況が異なってくることに対応した予測方法が必要となる。つまり、農作物の状況に応じて、予測性能が良い予測方式を選択する必要が有る。この場合、予測対象の圃場に合致した学習モデルによって上記の予測を適切に行うことが期待される。 In this embodiment, when the crop is a fruit, the system does not merely count the fruit after it has ripened but also predicts the yield at a stage earlier than the harvest time. For example, it detects inflorescences that will later become fruit and predicts the yield from their number, detects dead branches or diseased areas that are unlikely to bear fruit and predicts the yield from them, or predicts the yield from how densely the leaves have grown. Such predictions require a prediction method that accounts for the fact that crop growth conditions differ depending on the time of photographing and the climate. In other words, a prediction method with good prediction performance must be selected according to the condition of the crops. In this case, a learning model that matches the field to be predicted is expected to make the above predictions appropriately.

ここで、撮影画像内に写る様々なオブジェクトを農作物の木の幹クラス、枝クラス、枯れ枝クラス、支柱クラス等のクラスに分類し、クラスによって収量を予測する。撮影時期によって木の幹クラスや枝クラス等のクラスに属するオブジェクトの外観は変わるため、万能な予測は困難である。このような困難な事例を図４に示す。 Here, various objects captured in captured images are classified into classes such as crop tree trunk class, branch class, dead branch class, and support class, and yield is predicted based on the class. Universal prediction is difficult because the appearance of objects belonging to classes such as tree trunk class and branch class changes depending on the time of capture. An example of such a difficult case is shown in Figure 4.

図４（Ａ）および図４（Ｂ）は、上記のカメラ１０によって撮影された撮影画像の一例を示している。これらの撮影画像には、ほぼ等間隔に農作物の木が写っているが、収穫される予定の果実等が未だなっていないので、該撮影画像からは果実を検出するタスクは実行できない。図４（Ａ）の撮影画像中の木は、比較的シーズンの早い段階で撮影された農作物の木であり、図４（Ｂ）の撮影画像中の木は、ある程度葉が生い茂った段階で撮影された木である。図４（Ａ）の撮影画像では、どの木の枝も同程度葉が有るため、生育不良な領域は無いと判断でき、全て収穫可能な領域と判定することができる。一方で図４（Ｂ）の撮影画像では、該撮影画像中の中央領域４１付近の枝の葉の生い茂り方が明らかに他と異なっており、生育不良と判断することは容易である。しかし、中央領域４１（葉が少ない領域）の様子は、図４（Ａ）の撮影画像中の領域４０付近でも、同様のパターンとして見つけることができる。この２つの事例が示すことは、農産物の木の異常領域は局所的なパターンでは判定不可能ということである。つまり、上記の特定物体検出のような局所パターンのみの入力で判断はできず、画像全体から得られるコンテキストを反映させることが必要となる。 FIGS. 4(A) and 4(B) show examples of images captured by the camera 10. These captured images show crop trees at approximately equal intervals, but the fruit to be harvested has not yet formed, so the task of detecting fruit cannot be performed on them. The trees in the captured image of FIG. 4(A) were photographed at a relatively early stage of the season, and the trees in the captured image of FIG. 4(B) were photographed after the leaves had grown to a certain extent. In the captured image of FIG. 4(A), the branches of every tree have a similar amount of leaves, so it can be determined that there are no areas of poor growth and that all areas are harvestable. In the captured image of FIG. 4(B), on the other hand, the foliage of the branches near the central region 41 clearly differs from the rest, so it is easy to determine that this region is growing poorly. However, a pattern similar to that of the central region 41 (a region with few leaves) can also be found near the region 40 in the captured image of FIG. 4(A). These two examples show that abnormal areas of crop trees cannot be identified from local patterns alone. In other words, a judgment cannot be made from local patterns only, as in the specific object detection described above, and the context obtained from the entire image must be reflected.

つまり、過去に同様の生育状況の農作物を同様の条件で撮影した画像で学習した学習モデルを用いて上記の特定物体検出を行わなければ、充分な性能を発揮することができない。 In other words, unless the specific object detection is performed using a learning model trained on images of crops in similar growth conditions taken in the past, sufficient performance will not be achieved.

過去に撮影したことのない新規の圃場で撮影した画像が入力された場合や、日照りが続いた、雨量が極端に多かった等の何かの外的要因によって以前に撮影した条件と異なる条件における画像が入力された場合のみならず、ユーザが都合の良い時期に撮影した画像が入力された場合など、のあらゆるケースに対応するためには、毎度、入力画像の条件に近い条件で学習した学習モデルを獲得する必要がある。 In order to handle all kinds of cases, such as when an image is input that was taken in a new field where no images have been taken before, when an image is input that was taken under conditions different from those previously taken due to some external factor such as prolonged drought or extremely heavy rainfall, or when an image is input that was taken at a time convenient for the user, it is necessary to acquire a learning model that is trained under conditions close to those of the input image each time.

ここで、圃場の撮影を行うたびに毎回アノテーション作業とディープラーニングによる学習を実施する場合に、どのようなアノテーション作業が必要となるのかについて説明する。例えば、図４（Ａ）、図４（Ｂ）のそれぞれの撮影画像に対してアノテーション作業を行った結果を、それぞれ図５（Ａ）、図５（Ｂ）に示す。 Here, we will explain what kind of annotation work is required when annotation work and deep learning are performed every time a photograph of a farm field is taken. For example, the results of annotation work performed on the photographed images in Figures 4(A) and 4(B) are shown in Figures 5(A) and 5(B), respectively.

図５（Ａ）の撮影画像における矩形領域５００～５０４がアノテーション作業によって指定された画像領域である。矩形領域５００は、正常な枝の領域として指定された画像領域であり、矩形領域５０１～５０４は、木の幹の領域として指定された画像領域である。矩形領域５００は、木の生育に関して正常な状態を表している画像領域であるため、該画像領域が収量の予測に大きく関連する領域となる。以下では、矩形領域５００のような、木の生育に関して正常な状態を表す領域、果実等が収穫可能な部分の領域、を生産領域と称する。 Rectangular areas 500-504 in the captured image in Figure 5 (A) are image areas designated by annotation work. Rectangular area 500 is an image area designated as an area of normal branches, and rectangular areas 501-504 are image areas designated as areas of the tree trunk. Rectangular area 500 is an image area that represents a normal state of tree growth, and therefore this image area is an area that is highly relevant to yield prediction. Hereinafter, areas that represent a normal state of tree growth, such as rectangular area 500, and areas where fruit, etc. can be harvested, are referred to as production areas.

図５（Ｂ）の撮影画像における矩形領域５０５～５０７、５１１～５１４がアノテーション作業によって指定された画像領域である。矩形領域５０５，５０７は、正常な枝の領域として指定された画像領域であり、矩形領域５０６は、異常な枯れ枝の領域として指定された画像領域である。矩形領域５０６のような、異常な状態を表す領域、果実等が収穫不可能な部分の領域、を非生産領域と称する。矩形領域５１１～５１４は、木の幹の領域として指定された画像領域である。果実等が収穫可能な部分の領域（生産領域）と判断される画像領域は矩形領域５０５、５０７であるから、該矩形領域５０５，５０７が収量の予測に大きく関連する領域となる。 Rectangular regions 505-507 and 511-514 in the captured image in Figure 5 (B) are image regions designated by annotation work. Rectangular regions 505 and 507 are image regions designated as normal branch regions, and rectangular region 506 is an image region designated as an abnormal dead branch region. Regions that indicate abnormal conditions, such as rectangular region 506, and regions where fruit or the like cannot be harvested are called non-productive regions. Rectangular regions 511-514 are image regions designated as tree trunk regions. Rectangular regions 505 and 507 are image regions that are determined to be regions where fruit or the like can be harvested (productive regions), and therefore rectangular regions 505 and 507 are regions that are closely related to yield prediction.
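
The annotated rectangles and the production / non-production distinction can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the class labels (`normal_branch`, `dead_branch`, `trunk`), the coordinates, and in particular the area-based ratio computed by `non_production_ratio` are hypothetical choices, since the text here does not define how such a ratio would be calculated.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Region:
    label: str   # hypothetical class label: "normal_branch", "dead_branch", "trunk", ...
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


# Assumed mapping of labels to region types; trunks count as neither.
PRODUCTION_LABELS = {"normal_branch"}
NON_PRODUCTION_LABELS = {"dead_branch"}


def non_production_ratio(regions: Iterable[Region]) -> float:
    """Share of non-production area among production + non-production area
    (an assumed, area-based definition). Returns 0.0 when neither is present."""
    regions = list(regions)
    productive = sum(r.area for r in regions if r.label in PRODUCTION_LABELS)
    non_productive = sum(r.area for r in regions if r.label in NON_PRODUCTION_LABELS)
    total = productive + non_productive
    return non_productive / total if total else 0.0


# Rough analogue of FIG. 5(B): two normal branch regions, one dead branch
# region, and one trunk region. All coordinates are made up.
annotations = [
    Region("normal_branch", 0, 0, 100, 50),    # like region 505
    Region("dead_branch", 100, 0, 100, 50),    # like region 506
    Region("normal_branch", 200, 0, 200, 50),  # like region 507
    Region("trunk", 40, 50, 20, 120),          # like regions 511-514
]
ratio = non_production_ratio(annotations)
```

With these made-up rectangles, the dead branch region accounts for 5,000 of the 20,000 pixels labeled as either production or non-production, so `ratio` evaluates to 0.25. The trunk region is ignored by this particular definition.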

このようなアノテーション作業を圃場の撮影の度に多数（例えば、数百～数千枚）の撮影画像に対して実施するには、非常にコストがかかる。そこで、本実施形態では、このようなより煩わしいアノテーション作業を実施せずに、良好な予測結果を獲得する。本実施形態では、ディープラーニングによって学習モデルを獲得する。しかし、学習モデルの獲得方法は特定の獲得方法に限らない。また、様々なオブジェクト検出器を学習モデルの代わりに適用してもかまわない。 Performing such annotation work on a large number of captured images (for example, hundreds to thousands) every time a field is photographed is very costly. Therefore, in this embodiment, good prediction results are obtained without performing this cumbersome annotation work. In this embodiment, the learning model is obtained by deep learning. However, the method of obtaining the learning model is not limited to any specific method. Various object detectors may also be applied in place of the learning model.

次に、カメラ１０によって撮影された圃場の撮影画像に基づいて該圃場における収量の予測や該圃場全体に対する非生産率の計算等の分析処理を行うために本実施形態に係るシステムが行う処理について、図２Ａのフローチャートに従って説明する。 Next, the process performed by the system according to this embodiment to perform analytical processes such as predicting the yield in a field and calculating the non-productive rate for the entire field based on the image of the field captured by the camera 10 will be described with reference to the flowchart in FIG. 2A.

ステップＳ２０では、カメラ１０は、農作業用トラクター３２やドローン３７などの移動体が移動中に圃場を撮影することで該圃場の撮影画像を生成する。 In step S20, the camera 10 captures an image of the field while a mobile object such as an agricultural tractor 32 or a drone 37 is moving, thereby generating a captured image of the field.

ステップＳ２１では、カメラ１０は、ステップＳ２０で生成した撮影画像に上記のＥｘｉｆ情報（撮影情報）を添付し、該Ｅｘｉｆ情報が添付された撮影画像を通信網１１を介してクラウドサーバ１２および情報処理装置１３に対して送信する。 In step S21, the camera 10 attaches the above-mentioned Exif information (shooting information) to the captured image generated in step S20, and transmits the captured image with the Exif information attached to the cloud server 12 and the information processing device 13 via the communication network 11.

ステップＳ２２では、情報処理装置１３のＣＰＵ１３１は、カメラ１０が撮影した圃場や農作物などに関する情報（農作物の品種や樹齢、農作物の育成法や剪定法等）を撮影圃場パラメータとして取得する。例えば、ＣＰＵ１３１は、図６（Ａ）に示すＧＵＩ（グラフィカルユーザインターフェース）を表示装置１４に表示させて、ユーザからの撮影圃場パラメータの入力を受け付ける。 In step S22, the CPU 131 of the information processing device 13 acquires information about the field and crops photographed by the camera 10 (such as the variety and age of the crops, and the cultivation and pruning methods of the crops) as photographed field parameters. For example, the CPU 131 displays a GUI (graphical user interface) shown in FIG. 6(A) on the display device 14 and accepts input of the photographed field parameters from the user.

図６（Ａ）のＧＵＩにおいて領域６００には、圃場全体のマップが表示される。領域６００に表示される圃場のマップは複数の区分に分かれており、それぞれの区分には該区分に固有の識別子（ＩＤ）が表示されている。ユーザはユーザインターフェース１５を操作して、カメラ１０により撮影を行った区分（つまりこれから上記の分析処理を行いたい区分）に該当する領域６００内の箇所を指定するか、若しくは該区分の識別子を領域６０１に入力する。ユーザがユーザインターフェース１５を操作して、カメラ１０により撮影を行った区分に該当する領域６００内の箇所を指定した場合、該区分の識別子が領域６０１に表示される。 In the GUI of FIG. 6(A), a map of the entire field is displayed in area 600. The map of the field displayed in area 600 is divided into multiple sections, and each section is displayed with its own unique identifier (ID). The user operates user interface 15 to specify a location in area 600 that corresponds to the section photographed by camera 10 (i.e., the section on which the above-mentioned analysis process will now be performed), or inputs the identifier of the section into area 601. When the user operates user interface 15 to specify a location in area 600 that corresponds to the section photographed by camera 10, the identifier of the section is displayed in area 601.

ユーザはユーザインターフェース１５を操作して領域６０２に作物名（農作物の名称）を入力することができる。またユーザはユーザインターフェース１５を操作して領域６０３に農作物の品種を入力することができる。またユーザはユーザインターフェース１５を操作して領域６０４にＴｒｅｌｌｉｓを入力することができる。Ｔｒｅｌｌｉｓとは、例えば、農作物が葡萄である場合、葡萄圃場で葡萄を生育させるための葡萄の木のデザイン方法である。またユーザはユーザインターフェース１５を操作して領域６０５にＰｌａｎｔｅｄＹｅａｒを入力することができる。ＰｌａｎｔｅｄＹｅａｒとは、例えば、農作物が葡萄である場合、葡萄の木を植えた時期を表す。なお、これら全ての項目について撮影圃場パラメータを入力することは必須ではない。 The user can operate the user interface 15 to input a crop name (the name of the agricultural product) in area 602. The user can also operate the user interface 15 to input a variety of the agricultural product in area 603. The user can also operate the user interface 15 to input Trellis in area 604. For example, if the agricultural product is grapes, Trellis is a method of designing grape vines for growing grapes in a grape field. The user can also operate the user interface 15 to input a planted year in area 605. For example, if the agricultural product is grapes, the planted year indicates the time when the grape vines were planted. Note that it is not necessary to input the photography field parameters for all of these items.

そしてユーザがユーザインターフェース１５を操作して登録ボタン６０６を指示すると、情報処理装置１３のＣＰＵ１３１は、図６（Ａ）のＧＵＩにおいて入力された上記の各項目の撮影圃場パラメータをクラウドサーバ１２に対して送信する。クラウドサーバ１２のＣＰＵ１９１は、情報処理装置１３から送信された撮影圃場パラメータを外部記憶装置１９６に保存（登録）する。 When the user operates the user interface 15 to select the registration button 606, the CPU 131 of the information processing device 13 transmits the photographed field parameters for each of the above items input in the GUI of FIG. 6(A) to the cloud server 12. The CPU 191 of the cloud server 12 stores (registers) the photographed field parameters transmitted from the information processing device 13 in the external storage device 196.

また、ユーザがユーザインターフェース１５を操作して修正ボタン６０７を指示すると、情報処理装置１３のＣＰＵ１３１は、図６（Ａ）のＧＵＩにおいて入力済みの撮影圃場パラメータの修正を可能にする。 In addition, when the user operates the user interface 15 to select the modification button 607, the CPU 131 of the information processing device 13 enables modification of the photographed field parameters that have already been input in the GUI of FIG. 6(A).

図６（Ａ）のＧＵＩは、特に葡萄の圃場を管理することを前提とした撮影圃場パラメータを入力させるためのＧＵＩであるが、同様の目的であったとしても、ユーザに入力させる撮影圃場パラメータは図６（Ａ）に示したものに限らない。また、農作物が葡萄でない場合であっても同様に、ユーザに入力させる撮影圃場パラメータは図６（Ａ）に示したものに限らない。例えば、領域６０２に入力する作物名を変更すると、領域６０３～６０５のタイトルおよび入力させる撮影圃場パラメータを変更するようにしても良い。 The GUI in FIG. 6(A) is a GUI for inputting photographed field parameters that are intended to be used in particular to manage grape fields, but even if the purpose is the same, the photographed field parameters that the user is prompted to input are not limited to those shown in FIG. 6(A). Similarly, even if the crop is not grapes, the photographed field parameters that the user is prompted to input are not limited to those shown in FIG. 6(A). For example, when the crop name entered in area 602 is changed, the titles of areas 603-605 and the photographed field parameters that the user is prompted to input may be changed.

図６（Ａ）のＧＵＩで入力した撮影圃場パラメータは、基本的には一度決定した後は固定のまま利用すれば良いため、例えば、毎年圃場の撮影を行って収量を予測する場合、既に登録済の撮影圃場パラメータを呼び出して利用できる。所望の区分について既に撮影圃場パラメータが登録されていれば、次回からは図６（Ｂ）に示す如く、領域６００内で該所望の区分に該当する箇所を指示することで、該区分に対応する撮影圃場パラメータが領域６０９～６１３に表示される。 The photographed field parameters entered in the GUI in FIG. 6(A) can basically be used as fixed parameters once they have been determined, so for example, if photographing the field every year to predict yields, the already registered photographed field parameters can be called up and used. If photographed field parameters have already been registered for a desired division, from the next time, the photographed field parameters corresponding to the desired division can be displayed in regions 609-613 by specifying the location in region 600 that corresponds to the desired division, as shown in FIG. 6(B).

ここで、正しい撮影圃場パラメータを全て入力することが後段の学習モデル選択のためにも望ましいが、ユーザにとって不明であるがために入力できなかった撮影圃場パラメータがあったとしても、不明のまま後続する処理を行うことができる。 Here, it is desirable to input all correct photographed field parameters in order to select a learning model later, but even if there are photographed field parameters that the user is unable to input because they are unknown, subsequent processing can be carried out while still remaining unknown.

ステップＳ２３では、撮影画像から農作物などのオブジェクトを検出するために用いる学習モデルの候補を選択するための処理が行われる。ステップＳ２３における処理の詳細について、図２Ｂのフローチャートに従って説明する。 In step S23, a process is performed to select candidates for a learning model to be used to detect objects such as agricultural crops from the captured image. Details of the process in step S23 will be described with reference to the flowchart in FIG. 2B.

ステップＳ２３０では、クラウドサーバ１２のＣＰＵ１９１は、カメラ１０から取得したそれぞれの撮影画像に添付されているＥｘｉｆ情報と、外部記憶装置１９６に登録されている撮影圃場パラメータ（撮影画像に対応する区分の撮影圃場パラメータ）と、からクエリパラメータを生成する。 In step S230, the CPU 191 of the cloud server 12 generates query parameters from the Exif information attached to each captured image acquired from the camera 10 and the captured field parameters (the captured field parameters for the section corresponding to the captured image) registered in the external storage device 196.

クエリパラメータの構成例を図１１（Ａ）に示す。図１１（Ａ）のクエリパラメータは、図６（Ｂ）の撮影圃場パラメータが入力された場合に生成されるクエリパラメータである。 An example of the configuration of a query parameter is shown in FIG. 11(A). The query parameter in FIG. 11(A) is generated when the photographed field parameters in FIG. 6(B) are input.

「クエリ名」には、領域６０９に入力された「Ｆ５」が設定されている。「品種」には、領域６１１に入力された「Ｓｈｉｒａｚ」が設定されている。「Ｔｒｅｌｌｉｓ」には、領域６１２に入力された「Ｓｃｏｔｔ－Ｈｅｎｒｙ」が設定されている。「樹齢」には、領域６１３に入力された「２００１」からＥｘｉｆ情報に含まれている撮影日時（年）までの経過年数が樹齢「１９」として設定されている。「撮影日」には、Ｅｘｉｆ情報に含まれている撮影日時（月日）「Ｏｃｔ２０」が設定されている。「撮影時間帯」には、カメラ１０から受信したそれぞれの撮影画像に添付されているＥｘｉｆ情報中の撮影日時（時間）のうち最も過去の撮影日時（時間）から最近の撮影日時（時間）までの間の時間帯「１２：００－１４：００」が設定されている。「緯度、経度」には、Ｅｘｉｆ情報に含まれている撮影位置「３５°２８’Ｓ，１４９°１２’Ｅ」が設定されている。 The "Query Name" is set to "F5" input in area 609. The "Variety" is set to "Shiraz" input in area 611. The "Trellis" is set to "Scott-Henry" input in area 612. The "Age" is set to the number of years elapsed from "2001" input in area 613 to the shooting date and time (year) included in the Exif information, which is the age of the tree "19". The "Shooting Date" is set to "Oct 20", the shooting date and time (month and day) included in the Exif information. The "Shooting Time Zone" is set to "12:00-14:00", the time zone between the oldest shooting date and time (time) and the most recent shooting date and time (time) in the Exif information attached to each captured image received from camera 10. The "latitude, longitude" is set to the shooting position "35°28'S, 149°12'E" contained in the Exif information.
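
The construction of the query parameters described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `build_query`, the dictionary keys, and the `planted_year` field are hypothetical names chosen for this example. The tree age is taken as the capture year minus the planted year, and the shooting time band spans the earliest to the latest capture time.

```python
from datetime import datetime

def build_query(field_params, exif_datetimes, exif_position):
    """Combine registered field parameters with Exif data into one query dict.

    field_params   : dict of registered field parameters (keys are hypothetical)
    exif_datetimes : list of datetime objects, one per captured image
    exif_position  : capture position string taken from the Exif information
    """
    first, last = min(exif_datetimes), max(exif_datetimes)
    planted = field_params.get("planted_year")
    return {
        "query_name": field_params.get("section_id"),
        "variety": field_params.get("variety"),
        "trellis": field_params.get("trellis"),
        # Tree age = capture year minus planted year; None when unknown.
        "age": None if planted is None else first.year - planted,
        "shooting_date": first.strftime("%b%d"),
        # Time band from the earliest to the latest capture time.
        "time_band": f"{first:%H:%M}-{last:%H:%M}",
        "lat_lon": exif_position,
    }
```

Unknown items simply stay `None`, which corresponds to the partially blank query parameters mentioned for the case where some information is missing.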

なお、クエリパラメータの生成方法は上記の方法に限らず、例えば、農作物の農家が既に圃場管理で用いているデータを読み込み、上記の項目に一致するパラメータの集合をクエリパラメータとしても良い。 The method of generating query parameters is not limited to the above method. For example, data that a farmer of a certain crop is already using for field management can be read, and a set of parameters that match the above items can be used as query parameters.

なお、場合によっては一部の項目に関する情報が不明となっている場合も有り得る。例えばＰｌａｎｔｅｄＹｅａｒや品種に関する情報が分からない場合、図１１（Ａ）に例示したような全項目を埋めることができない。この場合のクエリパラメータは図１１（Ｃ）に示す如く、一部空欄になる。 In some cases, information about some items may be unknown. For example, if information about the planted year or variety is unknown, it will not be possible to fill in all items as shown in FIG. 11(A). In this case, some of the query parameters will be left blank, as shown in FIG. 11(C).

次に、ステップＳ２３１では、クラウドサーバ１２のＣＰＵ１９１は、外部記憶装置１９６に保存しているＥ個（Ｅは２以上の整数）の学習モデルのうち候補となるＭ（１≦Ｍ＜Ｅ）個の学習モデル（候補学習モデル）を選択する。該選択では、クエリパラメータが示す環境と類似する環境に基づいて学習した学習モデルを候補学習モデルとして選択する。外部記憶装置１９６には、Ｅ個の学習モデルのそれぞれについて、該学習モデルがどのような環境に基づいて学習したのかを示すパラメータセットが保存されている。外部記憶装置１９６におけるそれぞれの学習モデルのパラメータセットの構成例を図１１（Ｂ）に示す。 Next, in step S231, the CPU 191 of the cloud server 12 selects, from the E learning models (E is an integer equal to or greater than 2) stored in the external storage device 196, M (1≦M<E) learning models as candidates (candidate learning models). In this selection, learning models that were trained on an environment similar to the environment indicated by the query parameters are selected as candidate learning models. For each of the E learning models, the external storage device 196 stores a parameter set indicating the environment on which that learning model was trained. An example of the configuration of the parameter set of each learning model in the external storage device 196 is shown in FIG. 11(B).

「モデル名」は、学習モデルの名称であり、「品種」は、該学習モデルが学習した農作物の品種であり、「Ｔｒｅｌｌｉｓ」は、該学習モデルが学習した「葡萄圃場で葡萄を生育させるための葡萄の木のデザイン方法」である。「樹齢」は、該学習モデルが学習した農作物の樹齢であり、「撮影日」は、該学習モデルが学習に使用した農作物の撮影画像の撮影日時である。「撮影時間帯」は、該学習モデルが学習に使用した農作物の撮影画像のうち最古の撮影日時から最近の撮影日時までの間の期間であり、「緯度、経度」は、該学習モデルが学習に使用した農作物の撮影画像の撮影位置「３５°２８’Ｓ，１４９°１２’Ｅ」である。 "Model name" is the name of the learning model, "variety" is the variety of the crop that the learning model has learned, and "Trellis" is the "method of designing grape trees for growing grapes in a grape field" that the learning model has learned. "Tree age" is the age of the crop that the learning model has learned, and "photographed date" is the date and time of the image of the crop that the learning model used for learning. "Photographed time period" is the period from the oldest date and time of the image of the crop that the learning model used for learning to the most recent date and time of the image, and "latitude, longitude" is the location "35°28'S, 149°12'E" of the image of the crop that the learning model used for learning.

学習モデルによっては、複数の圃場のブロックで収集されたデータセットを混在させて学習しているものもある。そのため、例えばモデル名が「Ｍ００４」、「Ｍ００５」の学習モデルのように、複数の設定（品種や樹齢等）を含むようにパラメータセットが設定されているものがあっても良い。 Some learning models learn by mixing data sets collected from multiple field blocks. For this reason, there may be models whose parameter sets are configured to include multiple settings (variety, tree age, etc.), such as the learning models with the model names "M004" and "M005".

よってクラウドサーバ１２のＣＰＵ１９１は、クエリパラメータと、図１１（Ｂ）に示す学習モデルごとのパラメータセットと、の類似度を求め、該類似度が高い順に上位Ｍ個の学習モデルを候補学習モデルとして選択する。 The CPU 191 of the cloud server 12 therefore calculates the similarity between the query parameters and the parameter set for each learning model shown in FIG. 11(B), and selects the top M learning models in descending order of similarity as candidate learning models.

モデル名＝Ｍ００１、Ｍ００２，…のそれぞれの学習モデルのパラメータセットをＭ_１，Ｍ_２，…と表記すると、クラウドサーバ１２のＣＰＵ１９１は、クエリパラメータＱと、パラメータセットＭ_ｘと、の類似度Ｄ（Ｑ，Ｍ_ｘ）を以下の式（１）を計算することで求める。 If the parameter sets of the learning models with model names M001, M002, ... are denoted M_1, M_2, ..., the CPU 191 of the cloud server 12 obtains the similarity D(Q, M_x) between the query parameter Q and the parameter set M_x by calculating the following formula (1).
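
Formula (1) itself is not reproduced in this text. Judging from the symbols defined in the paragraphs that follow (the elements q_k and m_{x,k}, the per-element distance functions f_k, and the weights α_k), one form consistent with the description would be a weighted sum of per-element distances. The expression below is a reconstruction under that assumption, not the formula as published:

```latex
D(Q, M_x) = \sum_{k} \alpha_k \, f_k\left(q_k, m_{x,k}\right) \tag{1}
```

Under this reading, D behaves as a distance: a smaller value of D(Q, M_x) means that the learning model x was trained under conditions closer to those of the query.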

ここで、ｑ_ｋはクエリパラメータＱにおいて先頭からｋ番目の要素を表す。図１１（Ａ）の場合、クエリパラメータＱには、「品種」、「Ｔｒｅｌｌｉｓ」、「樹齢」、「撮影日」、「撮影時間帯」、「緯度、経度」の６つの要素が含まれているため、ｋ＝１～６である。 Here, q_k represents the k-th element from the top of the query parameter Q. In the case of FIG. 11(A), the query parameter Q includes six elements, namely "variety", "Trellis", "age", "shooting date", "shooting time zone", and "latitude, longitude", so k = 1 to 6.

ｍ_ｘ、ｋはパラメータセットＭ_ｘにおいて先頭からｋ番目の要素を表す。図１１（Ｂ）の場合、パラメータセットには、「品種」、「Ｔｒｅｌｌｉｓ」、「樹齢」、「撮影日」、「撮影時間帯」、「緯度、経度」の６つの要素が含まれているため、ｋ＝１～６である。 m_{x,k} represents the k-th element from the top of the parameter set M_x. In the case of FIG. 11(B), the parameter set includes six elements, namely "variety", "Trellis", "age", "shooting date", "shooting time zone", and "latitude, longitude", so k = 1 to 6.

ｆ_ｋ（ａ_ｋ、ｂ_ｋ）は、要素ａ_ｋとｂ_ｋとの間の距離を求めるための関数であり、予め設定されている。ｆ_ｋ（ａ_ｋ、ｂ_ｋ）は、事前に実験により注意深く設定しても良いが、上記の式（１）による距離定義は基本的に性質の異なる学習モデル程大きな値になるようになっていれば良いため、以下のように簡易に設定すれば良い。 f_k(a_k, b_k) is a function for finding the distance between elements a_k and b_k, and is set in advance. f_k(a_k, b_k) may be tuned carefully in advance by experiment, but since the distance defined by formula (1) above basically only needs to take a larger value for learning models whose properties differ more, it can be set simply as follows.

つまり、基本的に要素は、分類要素（品種、Ｔｒｅｌｌｉｓ）である場合と、連続値要素（樹齢、撮影日…）である場合と、の２類種に分けられる。よって、分類要素間の距離を規定する関数は以下の式（２）のように定義し、連続値要素間の距離を規定する関数は以下の式（３）のように定義する。 In other words, elements are basically divided into two types: classification elements (variety, Trellis) and continuous value elements (age of tree, date of photo, etc.). Therefore, the function that specifies the distance between classification elements is defined as in the following formula (2), and the function that specifies the distance between continuous value elements is defined as in the following formula (3).
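
Formulas (2) and (3) are likewise not reproduced in this text. A simple pair of definitions that matches the description (a mismatch test for classification elements, a magnitude of difference for continuous-value elements) is shown below. Both are assumptions for illustration; the published formulas may differ, for example by normalizing the continuous difference to a fixed range.

```latex
f_k(a_k, b_k) =
\begin{cases}
0 & \text{if } a_k = b_k \\
1 & \text{otherwise}
\end{cases}
\tag{2}

f_k(a_k, b_k) = \left| a_k - b_k \right| \tag{3}
```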

全ての要素（ｋ）に対する関数は事前にルールベースで実装しておく。また、各々の要素の最終的なモデル間距離への影響度に応じてα_ｋを決めておく。例えば、「品種」（ｋ＝１）による違いはそれ程画像の違いに表れないため、α_１は０に限りなく近づけ、「Ｔｒｅｌｌｉｓ」（ｋ＝２）の違いは大きく影響するため、α_２は大きく設定しておく、というように予め調整しておく。 The functions for all elements (k) are implemented in advance on a rule basis. In addition, α_k is determined according to the degree of influence of each element on the final inter-model distance. For example, since differences in "variety" (k=1) do not appear much as differences in the images, α_1 is set as close to 0 as possible, whereas differences in "Trellis" (k=2) have a large influence, so α_2 is set to a large value. The weights are adjusted in advance in this way.

また、図１１（Ｂ）のモデル名が「Ｍ００４」、「Ｍ００５」の学習モデルのように、「品種」や「樹齢」に複数の設定が登録されている学習モデルの場合、例えば「品種」の場合は、「品種」に登録されているそれぞれの設定について距離を求め、その平均距離を「品種」に対応する距離とする。「樹齢」の場合も同様に、「樹齢」に登録されているそれぞれの設定について距離を求め、その平均距離を「樹齢」に対応する距離とする。 In addition, in the case of a learning model in which multiple settings are registered for "variety" or "age," such as the learning models with model names "M004" and "M005" in Figure 11 (B), for example, in the case of "variety," the distance is found for each setting registered for "variety," and the average distance is taken as the distance corresponding to "variety." Similarly, in the case of "age," the distance is found for each setting registered for "age," and the average distance is taken as the distance corresponding to "age."
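
The distance computation described above, including the averaging over multiple registered settings, can be sketched as follows. This is a minimal sketch under stated assumptions: the per-element distances are a 0/1 mismatch for classification elements and an absolute difference for continuous-value elements, the overall distance is a weighted sum, and unknown elements are skipped. The names `element_distance`, `model_distance`, `CATEGORICAL`, and the dictionary keys are hypothetical.

```python
CATEGORICAL = {"variety", "trellis"}  # classification elements (assumed keys)

def element_distance(key, a, b):
    """Distance between two settings of one element."""
    if key in CATEGORICAL:
        return 0.0 if a == b else 1.0   # mismatch test for classification elements
    return abs(float(a) - float(b))     # magnitude of difference for continuous values

def model_distance(query, param_set, weights):
    """Weighted sum of per-element distances between a query and one model.

    weights maps an element key to its weight alpha_k. A model element may
    hold several settings (list or tuple); their distances are averaged.
    """
    total = 0.0
    for key, alpha in weights.items():
        q = query.get(key)
        m = param_set.get(key)
        if q is None or m is None:
            continue  # unknown element: skipped (an assumption of this sketch)
        settings = list(m) if isinstance(m, (list, tuple)) else [m]
        if not settings:
            continue
        dists = [element_distance(key, q, s) for s in settings]
        total += alpha * sum(dists) / len(dists)
    return total
```

For a model trained on a mix of blocks, a setting such as `"age": [15, 21]` contributes the mean of the two age differences rather than either one alone.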

なお、クラウドサーバ１２のＣＰＵ１９１は、上記の類似度に基づいてＭ個の学習モデルを候補学習モデルとして選択するのであれば、その選択方法は特定の選択方法に限らない。例えば、クラウドサーバ１２のＣＰＵ１９１は、閾値以上の類似度を有するＭ個の学習モデルを選択するようにしても良い。 Note that the selection method is not limited to a specific method as long as the CPU 191 of the cloud server 12 selects M learning models as candidate learning models based on the above similarity. For example, the CPU 191 of the cloud server 12 may select M learning models that have a similarity equal to or greater than a threshold value.

ただし、クエリパラメータにおける要素が何れも空の場合は、ステップＳ２３１における処理は行われず、その結果、全ての学習モデルを候補学習モデルとして以降の処理が行われることになる。 However, if all elements in the query parameters are empty, the processing in step S231 is not performed, and as a result, subsequent processing is performed with all learning models as candidate learning models.
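
The selection rules above (top M by similarity, an optional threshold, and the fallback to all models when the query is empty) can be sketched as follows. The sketch assumes that each model already has a distance value where smaller means more similar; `select_candidates`, `max_distance`, and the example values are hypothetical.

```python
def select_candidates(distances, m, max_distance=None, query_is_empty=False):
    """Pick candidate learning models from {model_name: distance}.

    A smaller distance is treated as a higher similarity (assumption).
    If the query has no usable elements, every model stays a candidate.
    """
    if query_is_empty:
        return list(distances)
    ranked = sorted(distances, key=distances.get)  # most similar first
    if max_distance is not None:
        # Threshold variant: keep only sufficiently similar models.
        ranked = [name for name in ranked if distances[name] <= max_distance]
    return ranked[:m]
```

For example, `select_candidates({"M001": 0.9, "M002": 0.1, "M011": 0.3}, m=2)` keeps the two models closest to the query.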

候補学習モデルを選択することによる効果は、多岐に及ぶ。まず、事前知識として可能性の低い学習モデルを本ステップで排除することで、以降続く学習モデルのスコアリングによるランキング作成などに係る処理時間を大幅にカットすることができる。また、ルールベースによる学習モデルのスコアリングであっても、本来比較する必要の無い学習モデルをも候補に入れると、学習モデルの選択精度を落とす可能性があるが、その可能性を最小限に留めることができる。 Selecting candidate learning models has several benefits. First, by using prior knowledge to exclude unlikely learning models in this step, the processing time required for the subsequent steps, such as scoring the learning models and creating a ranking, can be cut significantly. Second, even with rule-based scoring of learning models, including learning models that never needed to be compared can lower the accuracy of the learning model selection, and this step keeps that risk to a minimum.

次に、ステップＳ２３２では、クラウドサーバ１２のＣＰＵ１９１は、カメラ１０から受信した撮影画像からＰ（Ｐは２以上の整数）枚の撮影画像をモデル選択対象画像として選択する。カメラ１０から受信した撮影画像からＰ枚の撮影画像を選択する方法は特定の選択方法に限らない。例えば、ＣＰＵ１９１は、カメラ１０から受信した撮影画像からランダムにＰ枚の撮影画像を選択しても良いし、何らかの基準に従って選択しても良い。 Next, in step S232, the CPU 191 of the cloud server 12 selects P (P is an integer equal to or greater than 2) captured images from the captured images received from the camera 10 as images for model selection. The method for selecting P captured images from the captured images received from the camera 10 is not limited to a specific selection method. For example, the CPU 191 may randomly select P captured images from the captured images received from the camera 10, or may select them according to some criteria.

次に、ステップＳ２３３では、ステップＳ２３２で選択したＰ枚の撮影画像を用いて、Ｍ個の候補学習モデルから１つを選択学習モデルとして選択するための処理が行われる。ステップＳ２３３における処理の詳細について、図２Ｃのフローチャートに従って説明する。 Next, in step S233, a process is performed to select one of the M candidate learning models as a selected learning model using the P captured images selected in step S232. Details of the process in step S233 will be described with reference to the flowchart in FIG. 2C.

ステップＳ２３３０では、クラウドサーバ１２のＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれについて、「Ｐ枚の撮影画像のそれぞれについて、該候補学習モデルを用いて該撮影画像からオブジェクトを検出する処理であるオブジェクト検出処理」を行う。 In step S2330, the CPU 191 of the cloud server 12 performs an object detection process for each of the M candidate learning models, which is a process for detecting an object from each of the P captured images using the candidate learning model.

これにより、Ｐ枚の撮影画像のそれぞれについて、Ｍ個の候補学習モデルのそれぞれの「該撮影画像に対するオブジェクト検出処理の結果」が得られる。本実施形態では、「撮影画像に対するオブジェクト検出処理の結果」は、該撮影画像から検出されたオブジェクトの画像領域（矩形領域、検出領域）の位置情報である。 As a result, for each of the P captured images, the "result of the object detection process for that captured image" for each of the M candidate learning models is obtained. In this embodiment, the "result of the object detection process for the captured image" is the position information of the image area (rectangular area, detection area) of the object detected from the captured image.

ステップＳ２３３１では、ＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれの「Ｐ枚の撮影画像のそれぞれに対するオブジェクト検出処理の結果」に対するスコアを求める。そしてＣＰＵ１９１は、該スコアに基づいてＭ個の候補学習モデルの順位付け（ランキング作成）を行って、Ｍ個の候補学習モデルからＮ（Ｎ≦Ｍ）個の候補学習モデルを選択する。 In step S2331, the CPU 191 obtains a score for each of the M candidate learning models for the "result of the object detection process for each of the P captured images." The CPU 191 then ranks the M candidate learning models based on the scores, and selects N (N≦M) candidate learning models from the M candidate learning models.

このとき、撮影画像にはアノテーション情報が無いため、正確な検出精度評価はできない。しかし、農場のように計画的にデザインされてメンテナンスされている対象では、以下の様なルールを利用してオブジェクト検出処理の精度を予測して評価することが可能である。候補学習モデルによるオブジェクト検出処理の結果に対するスコアは、例えば、以下のようにして求める。 In this case, since the captured images do not contain annotation information, accurate evaluation of detection accuracy is not possible. However, for objects that are systematically designed and maintained, such as farms, it is possible to predict and evaluate the accuracy of the object detection process using the following rules. The score for the results of the object detection process using the candidate learning model is calculated, for example, as follows:

一般的な圃場では、図３（Ａ）、（Ｂ）に示したように、等間隔に農作物が植わっている。よって、図５（Ａ）、（Ｂ）で例示したアノテーション（矩形領域）のようにオブジェクトを検出をする際は、常に等しく画像の左端から右端まで矩形領域が連続して検出されるのが正常に検出した状態である。 In a typical farm field, crops are planted at equal intervals, as shown in Figures 3(A) and (B). Therefore, when detecting an object such as the annotation (rectangular area) shown in Figures 5(A) and (B), a normal detection state is when a rectangular area is always detected continuously and evenly from the left edge to the right edge of the image.

例えば、図５（Ａ）のように撮影画像の左端から右端まで全て果実等が収穫可能な領域と検出される場合、矩形領域５００のように生産領域が検出されるべきである。また、図５（Ｂ）のように、撮影画像内に非生産領域である矩形領域５０６がある場合も、撮影画像の左端から右端まで矩形領域５０５、５０６、５０７と検出されるべきである。もし撮影画像の条件に合わない学習モデルを用いて該撮影画像に対するオブジェクト検出処理を実施すると、上記の矩形領域のうち検出されない矩形領域が発生する可能性がある。撮影画像の条件から遠い条件に対応する学習モデルほど、このような可能性は高くなる。よって、候補学習モデルの評価を行う最も、簡易なスコアリング方法としては、例えば次のような方法が考えられる。 For example, as in FIG. 5(A), if the entire captured image from the left to the right edge is detected as an area where fruit or the like can be harvested, then a production area should be detected as rectangular area 500. Also, as in FIG. 5(B), if the captured image contains rectangular area 506, which is a non-production area, then rectangular areas 505, 506, and 507 should be detected from the left to the right edge of the captured image. If an object detection process is performed on the captured image using a learning model that does not match the conditions of the captured image, there is a possibility that some of the above rectangular areas will not be detected. The more the learning model corresponds to conditions that are farther from the conditions of the captured image, the higher the possibility of this happening. Therefore, the following method, for example, can be considered as the simplest scoring method for evaluating candidate learning models.

着目候補学習モデルにより着目撮影画像からは複数のオブジェクトの検出領域が検出される。よって、該着目撮影画像の垂直方向に検出領域を探索して該検出領域がない領域の画素数Ｃｐをカウントし、該着目撮影画像の幅の画素数に対する画素数Ｃｐの割合を該着目撮影画像の罰則スコアとする。このようにして、着目候補学習モデルによりオブジェクト検出処理を行ったＰ枚の撮影画像のそれぞれについて罰則スコアを求め、該求めた罰則スコアの合計値を、該着目候補学習モデルのスコアとする。このような処理をＭ個の候補学習モデルのそれぞれについて行うことで、それぞれの候補学習モデルのスコアが確定する。そしてＭ個の候補学習モデルをスコアが小さい順に順位付けし、スコアが小さい順に上位Ｎ個の候補学習モデルを選択する。該選択の際には、「スコアが閾値未満」という条件を加えても良い。 The candidate learning model of interest detects the detection areas of multiple objects in the captured image of interest. Accordingly, the captured image of interest is scanned in the vertical direction for detection areas, the number of pixels Cp in the region where no detection area is found is counted, and the ratio of Cp to the number of pixels in the width of the captured image of interest is taken as the penalty score of that image. In this way, a penalty score is obtained for each of the P captured images on which the candidate learning model of interest performed the object detection process, and the sum of these penalty scores is taken as the score of the candidate learning model of interest. Performing this processing for each of the M candidate learning models fixes the score of each candidate learning model. The M candidate learning models are then ranked in ascending order of score, and the top N candidate learning models, that is, those with the smallest scores, are selected. A condition that "the score is less than a threshold value" may be added to this selection.
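
The penalty scoring just described can be sketched as follows. The sketch assumes that each detection area is an axis-aligned box `(x_min, y_min, x_max, y_max)` in pixel coordinates with `x_max` exclusive, and that Cp is the number of pixel columns not covered by any box. The function names are hypothetical.

```python
def penalty_score(boxes, image_width):
    """Fraction of the image width not covered by any detection area."""
    covered = [False] * image_width
    for x_min, _y_min, x_max, _y_max in boxes:
        for x in range(max(0, int(x_min)), min(image_width, int(x_max))):
            covered[x] = True
    cp = covered.count(False)  # columns with no detection area
    return cp / image_width

def model_score(detections_per_image, image_widths):
    """Sum of the penalty scores over the P images of one candidate model."""
    return sum(penalty_score(b, w) for b, w in zip(detections_per_image, image_widths))

def rank_models(scores, n):
    """Names of the n candidate models with the smallest penalty scores."""
    return sorted(scores, key=scores.get)[:n]
```

A model that leaves a gap of 20 columns in a 100-pixel-wide image receives a penalty of 0.2 for that image, and a model that detects nothing receives 1.0.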

また、候補学習モデルのスコアとして、通常等間隔に植えられている木の幹部分の検出領域から類推されるスコアを求めても良い。木の幹は図５（Ａ）に示す如く、矩形領域５０１、５０２、５０３、５０４と凡そ等間隔に検出されるべきであるため、撮影画像の幅に対する「木の幹の領域の検出数」として想定される数は決まっている。想定される数より少ない／多い撮影画像は検出ミスを起こしている可能性が高いため、この検出数をスコアに反映させても良い。 In addition, the score of the candidate learning model may be calculated based on the detection area of the trunk of a tree, which is usually planted at equal intervals. As shown in FIG. 5(A), tree trunks should be detected at approximately equal intervals as rectangular areas 501, 502, 503, and 504, so the expected number of "detected tree trunk areas" for the width of the captured image is fixed. Images with fewer/more than the expected number are likely to have detection errors, so this detection number may be reflected in the score.
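
One way to reflect the trunk detection count in the score is to penalize the deviation from the expected count. The sketch below is only an illustration: the function name `trunk_count_penalty` and the `expected_count` parameter are hypothetical, and how the expected count is derived from the image width and the planting interval is left open here.

```python
def trunk_count_penalty(trunk_boxes, expected_count):
    """Penalty that grows as the number of detected trunks departs from the expected number."""
    return abs(len(trunk_boxes) - expected_count)
```

This value could be added to the penalty score of each captured image, possibly with a weight, so that both missed and spurious trunk detections lower a model's rank.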

そしてＣＰＵ１９１は、Ｐ枚の撮影画像、Ｍ個の候補学習モデルから選択したＮ個の候補学習モデルのそれぞれの「該Ｐ枚の撮影画像に対するオブジェクト検出処理の結果」、該Ｎ個の候補学習モデルに関する情報（モデル名など）、を情報処理装置１３に対して送信する。上記の如く、本実施形態では、「撮影画像に対するオブジェクト検出処理の結果」は、該撮影画像から検出されたオブジェクトの画像領域（矩形領域、検出領域）の位置情報であり、このような位置情報は、例えば、ｊｓｏｎ形式またはｔｘｔ形式等のファイルフォーマットのデータとして情報処理装置１３に対して送信される。 Then, the CPU 191 transmits to the information processing device 13 the P captured images, the "results of the object detection process on the P captured images" of each of the N candidate learning models selected from the M candidate learning models, and information about the N candidate learning models (such as model names). As described above, in this embodiment, the "results of the object detection process on the captured images" are positional information of the image area (rectangular area, detection area) of the object detected from the captured image, and such positional information is transmitted to the information processing device 13 as data in a file format such as json format or txt format.
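
As an illustration of sending the position information in json format, the detection result for one captured image could be serialized as below. The field names (`image`, `model`, `detections`, `x_min`, and so on) are hypothetical; the text only states that position information of the detection areas is transmitted in a file format such as json or txt.

```python
import json

def detections_to_json(image_name, model_name, boxes):
    """Serialize the detection areas of one image for one candidate model."""
    payload = {
        "image": image_name,
        "model": model_name,
        "detections": [
            {"x_min": x0, "y_min": y0, "x_max": x1, "y_max": y1}
            for (x0, y0, x1, y1) in boxes
        ],
    }
    return json.dumps(payload)
```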

次に、選択されたＮ個の候補学習モデルから１つをユーザに選択させる。ステップＳ２３３１の処理の終了時点で未だ候補学習モデルはＮ個残っており、性能を比較する根拠とする出力は上記のＰ枚の撮影画像に対するオブジェクト検出処理の結果であるため、ユーザは、Ｎ×Ｐ枚の撮影画像に対するオブジェクト検出処理の結果を見比べなければならない。そのような状態で適切に１つの候補学習モデルを選択学習モデルとして選択する（１つに絞り込む）のは困難である。 Next, the user is prompted to select one of the selected N candidate learning models. At the end of the processing in step S2331, N candidate learning models still remain, and since the output on which the performance comparison is based is the result of the object detection processing on the P captured images, the user must compare the results of the object detection processing on the N x P captured images. In such a situation, it is difficult to appropriately select one candidate learning model as the selected learning model (narrow it down to one).

よってステップＳ２３３２では、情報処理装置１３のＣＰＵ１３１は、Ｐ枚の撮影画像について、ユーザの主観による比較がしやすい情報提示のためのスコアリング（表示画像スコアリング）を行う。表示画像スコアリングでは、Ｐ枚の撮影画像のそれぞれについて、Ｎ個の候補学習モデル間で検出領域の配置パターンが大きく異なるほど大きいスコアを決定する。このようなスコアは、例えば、以下の式（４）を計算することで求めることができる。 Therefore, in step S2332, the CPU 131 of the information processing device 13 performs scoring (display image scoring) for the P captured images in order to present information that is easy for the user to subjectively compare. In the display image scoring, a larger score is determined for each of the P captured images as the detection area arrangement patterns differ more significantly among the N candidate learning models. Such a score can be obtained, for example, by calculating the following formula (4).
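
Formula (4) is not reproduced in this text. Given the definitions of Score(z) and T_Iz(Ma, Mb) in the next paragraph, one form consistent with the description is the sum of the pairwise differences over the N candidate learning models. The expression below is a reconstruction under that assumption, not the formula as published:

```latex
\mathrm{Score}(z) = \sum_{a=1}^{N} \sum_{b=1}^{N} T_{I_z}\left(M_a, M_b\right) \tag{4}
```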

ここで、Ｓｃｏｒｅ（ｚ）は、撮影画像Ｉｚに対するスコアである。Ｔ_Ｉｚ（Ｍａ、Ｍｂ）は、候補学習モデルＭａが撮影画像Ｉｚに対して行ったオブジェクト検出処理の結果（検出領域の配置パターン）と、候補学習モデルＭｂが撮影画像Ｉｚに対して行ったオブジェクト検出処理の結果（検出領域の配置パターン）と、の差に基づくスコアを求めるための関数である。このような関数には様々な関数が適用可能であり、特定の関数に限らない。例えば、候補学習モデルＭａが撮影画像Ｉｚから検出した検出領域Ｒａごとに、候補学習モデルＭｂが撮影画像Ｉｚから検出した検出領域Ｒｂのうち該検出領域Ｒａに最も近い検出領域Ｒｂ’の位置（例えば左上隅の位置および右下隅の位置）と該検出領域Ｒａの位置（例えば左上隅の位置および右下隅の位置）との差を求め、求めた差の合計を返す関数をＴ_Ｉｚ（Ｍａ、Ｍｂ）としても良い。 Here, Score(z) is the score for the captured image Iz. T_Iz(Ma, Mb) is a function for obtaining a score based on the difference between the result of the object detection process (the arrangement pattern of the detection areas) performed by the candidate learning model Ma on the captured image Iz and the result of the object detection process (the arrangement pattern of the detection areas) performed by the candidate learning model Mb on the captured image Iz. Various functions can be used for this purpose, and the function is not limited to a specific one. For example, T_Iz(Ma, Mb) may be a function that, for each detection area Ra detected by the candidate learning model Ma in the captured image Iz, finds the detection area Rb' closest to Ra among the detection areas Rb detected by the candidate learning model Mb in the captured image Iz, computes the difference between the position of Rb' (for example, the positions of its upper left and lower right corners) and the position of Ra (for example, the positions of its upper left and lower right corners), and returns the sum of these differences.
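
The example function described above can be sketched as follows. The sketch makes several assumptions that the text leaves open: boxes are `(x_min, y_min, x_max, y_max)` tuples, the difference between two boxes is the sum of absolute differences of their corner coordinates, "closest" is measured with that same difference, an unmatched box costs a hypothetical `unmatched_cost`, and the score of an image is the sum over all ordered pairs of candidate models.

```python
def box_diff(ra, rb):
    """Sum of absolute corner-coordinate differences between two boxes."""
    return sum(abs(p - q) for p, q in zip(ra, rb))

def pattern_difference(boxes_a, boxes_b, unmatched_cost=0.0):
    """For each box of model A, add its difference to the closest box of model B."""
    total = 0.0
    for ra in boxes_a:
        if boxes_b:
            total += min(box_diff(ra, rb) for rb in boxes_b)
        else:
            total += unmatched_cost  # no box to compare against (assumption)
    return total

def display_score(detections_by_model):
    """Score of one image: pairwise pattern differences summed over all model pairs."""
    return sum(
        pattern_difference(a, b)
        for a in detections_by_model
        for b in detections_by_model
    )
```

Images on which the candidate models agree score close to zero, so sorting the P images by this score in descending order brings forward the images where the models disagree most.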

上位Ｎ個の候補学習モデルによるオブジェクト検出処理の結果は多くの場合は類似することが多いため、無作為に取り出した画像で見比べてもほとんど差が無い場合が多く、学習モデルを選ぶ際の根拠にならない。よって上記の式（４）でスコアリングした上位の撮影画像だけを見ることによって容易に学習モデルの良し悪しを判断しやすくなる。 The results of object detection processing using the top N candidate learning models are often similar, so even if randomly selected images are compared, there is often little difference, and this does not provide a basis for selecting a learning model. Therefore, by looking at only the top captured images scored using formula (4) above, it becomes easier to judge the quality of the learning model.

ステップＳ２３３３では、情報処理装置１３のＣＰＵ１３１は、上記のＮ個の候補学習モデルのそれぞれについて、クラウドサーバ１２から受信したＰ枚の撮影画像のうちスコアが大きい順に上位Ｆ枚（上位から規定枚数）の撮影画像と、クラウドサーバ１２から受信した該撮影画像に対するオブジェクト検出処理の結果と、を表示装置１４に表示させる（表示制御）。その際、Ｆ枚の撮影画像はスコアが大きい順に左から並べて表示する。 In step S2333, the CPU 131 of the information processing device 13 causes the display device 14 to display (display control) the top F (a specified number from the top) captured images of the P captured images received from the cloud server 12 in descending order of score for each of the N candidate learning models, and the results of the object detection process for the captured images received from the cloud server 12. At that time, the F captured images are displayed in descending order of score from the left.

候補学習モデルごとの撮影画像およびオブジェクト検出処理の結果を表示したＧＵＩの表示例を図７（Ａ）に示す。図７（Ａ）では、Ｎ＝３、Ｆ＝４のケースについて示している。 Figure 7(A) shows an example of a GUI displaying the captured images and the results of the object detection process for each candidate learning model. Figure 7(A) shows the case where N = 3 and F = 4.

最上の行には、スコアが最も大きい候補学習モデルのモデル名「Ｍ００２」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００２」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 The top row displays the model name "M002" of the candidate learning model with the highest score along with a radio button 70, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area of the object detected from the captured image by the candidate learning model with model name "M002" is superimposed on the captured image.

中段の行には、スコアが２番目に大きい候補学習モデルのモデル名「Ｍ０１１」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０１１」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 In the middle row, the model name "M011" of the candidate learning model with the second highest score is displayed along with a radio button 70, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area of the object detected from the captured image by the candidate learning model with model name "M011" is superimposed on the captured image.

下段の行には、スコアが３番目に大きい候補学習モデルのモデル名「Ｍ００９」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００９」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 In the bottom row, the model name "M009" of the candidate learning model with the third highest score is displayed along with a radio button 70, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area of the object detected from the captured image by the candidate learning model with model name "M009" is superimposed on the captured image.

なお、このＧＵＩでは、各々の候補学習モデルによるオブジェクト検出処理の結果を一瞥して比較しやすいように、同列に並ぶ撮影画像は同じ撮影画像になるように表示する。 Note that in this GUI, the captured images lined up in the same column are the same captured image, so that the results of the object detection process by each candidate learning model can be compared at a glance.

そして、Ｆ枚の撮影画像に対するＮ個の候補学習モデルによるオブジェクト検出処理の結果の違いをユーザは目視により確認し、Ｎ個の候補学習モデルのうち１つをユーザインターフェース１５を用いて選択する。 Then, the user visually checks the differences in the results of the object detection process for the F captured images using the N candidate learning models, and selects one of the N candidate learning models using the user interface 15.

ステップＳ２３３４では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ操作、ユーザ入力）を受け付ける。ステップＳ２３３５では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われたか否かを判断する。 In step S2334, the CPU 131 of the information processing device 13 accepts a candidate learning model selection operation (user operation, user input) by the user. In step S2335, the CPU 131 of the information processing device 13 determines whether or not a candidate learning model selection operation (user input) has been performed by the user.

図７（Ａ）の場合、ユーザは、モデル名「Ｍ００２」の候補学習モデルを選択する場合には、最上の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ０１１」の候補学習モデルを選択する場合には、中段の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ００９」の候補学習モデルを選択する場合には、下段の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。図７（Ａ）では、モデル名「Ｍ００２」に対応するラジオボタン７０が選択されているため、モデル名「Ｍ００２」の候補学習モデルが選択されたことを示す枠７４が表示される。 In the case of FIG. 7(A), when the user wants to select a candidate learning model with the model name "M002", the user selects the radio button 70 in the top row using the user interface 15. When the user wants to select a candidate learning model with the model name "M011", the user selects the radio button 70 in the middle row using the user interface 15. When the user wants to select a candidate learning model with the model name "M009", the user selects the radio button 70 in the bottom row using the user interface 15. In FIG. 7(A), because the radio button 70 corresponding to the model name "M002" is selected, a frame 74 is displayed indicating that the candidate learning model with the model name "M002" has been selected.

そしてユーザがユーザインターフェース１５を操作して決定ボタン７１を指示すると、ＣＰＵ１３１は、「ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われた」と判断し、選択したラジオボタン７０に対応する候補学習モデルを選択学習モデルとして選択する。 When the user operates the user interface 15 to select the decision button 71, the CPU 131 determines that "the user has performed a candidate learning model selection operation (user input)," and selects the candidate learning model corresponding to the selected radio button 70 as the selected learning model.

この判断の結果、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われた場合には、処理はステップＳ２３３６に進み、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われていない場合には、処理はステップＳ２３３４に進む。 If the result of this determination is that the user has performed a selection operation (user input) of a candidate learning model, processing proceeds to step S2336; if the user has not performed a selection operation (user input) of a candidate learning model, processing proceeds to step S2334.

ステップＳ２３３６では、情報処理装置１３のＣＰＵ１３１は、最終的に学習モデルが１個のみ選択された状態であるのかを確認する。そして、最終的に学習モデルが１個のみ選択された状態である場合には、処理はステップＳ２４に進み、最終的に学習モデルが１個のみ選択された状態ではない場合には、処理はステップＳ２３３２に進む。 In step S2336, the CPU 131 of the information processing device 13 checks whether only one learning model was ultimately selected. If only one learning model was ultimately selected, the process proceeds to step S24, and if not, the process proceeds to step S2332.

ここで、ユーザが図７（Ａ）の表示を見ただけでは１個に絞ることが出来なかった場合は、複数のラジオボタン７０を選択することで複数の候補学習モデルを選択するようにしても良い。例えば、ユーザがユーザインターフェース１５を操作して、図７（Ａ）においてモデル名「Ｍ００２」に対応するラジオボタン７０とモデル名「Ｍ０１１」に対応するラジオボタン７０とを選択して決定ボタン７１を指定した場合、選択したラジオボタン７０の数「２」をＮに設定して、処理はステップＳ２３３６を介してステップＳ２３３２に進む。この場合、ステップＳ２３３２以降では、Ｎ＝２、Ｆ＝４について同様の処理を行う。このようにして、最終的に選択される学習モデルの数が「１」になるまで処理を繰り返す。 Here, if the user is unable to narrow it down to one just by looking at the display in FIG. 7(A), multiple radio buttons 70 may be selected to select multiple candidate learning models. For example, if the user operates the user interface 15 to select the radio button 70 corresponding to the model name "M002" and the radio button 70 corresponding to the model name "M011" in FIG. 7(A) and presses the decision button 71, the number of selected radio buttons 70, "2", is set to N, and the process proceeds to step S2332 via step S2336. In this case, from step S2332 onwards, the same process is performed for N=2 and F=4. In this way, the process is repeated until the number of learning models finally selected becomes "1".
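The repeated narrowing described above can be sketched as a simple loop. This is only an illustration under assumptions: `ask_user` is a hypothetical callback standing in for the display and selection steps (S2332 to S2335), and `target` is the number of models that must remain (1 in this embodiment).

```python
def narrow_candidates(candidates, ask_user, target=1):
    """Repeat display-and-select rounds until `target` models remain."""
    current = list(candidates)
    while True:
        picked = ask_user(current)  # models whose radio buttons were selected
        # Keep only selections that are among the displayed candidates.
        chosen = [m for m in current if m in picked]
        if not chosen:
            continue  # no valid selection yet: wait for input again
        current = chosen  # N becomes the number of selected models
        if len(current) == target:
            return current
```

For example, selecting "M002" and "M011" in the first round and then "M002" alone in the second round ends the loop with "M002" as the selected learning model.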

また、ユーザは、図７（Ａ）のＧＵＩの代わりに図７（Ｂ）のＧＵＩを用いて学習モデルを選択しても良い。図７（Ａ）のＧＵＩは、ユーザに直接どの学習モデルが良いのかを選択させるＧＵＩとなっている。これに対し、図７（Ｂ）のＧＵＩでは、それぞれの撮影画像にチェックボックス７２が設けられており、ユーザは、縦一列に並ぶ撮影画像列ごとに、該撮影画像列中の撮影画像のうちオブジェクト検出処理の結果が好ましいと判断した撮影画像のチェックボックス７２を、ユーザインターフェース１５を操作して指定してオンにする（チェックマークを付ける）。そしてユーザがユーザインターフェース１５を操作して決定ボタン７５を指示すると、情報処理装置１３のＣＰＵ１３１は、モデル名が「Ｍ００２」、「Ｍ０１１」、「Ｍ００９」の候補学習モデルのうち、チェックボックス７２がオンになっている撮影画像の数が最も多い候補学習モデルを選択学習モデルとして選択する。図７（Ｂ）の例では、モデル名が「Ｍ００２」の候補学習モデルの４枚の撮影画像のうち３つのチェックボックス７２がオンになっており、モデル名が「Ｍ０１１」の候補学習モデルの４枚の撮影画像のうち１つのチェックボックス７２がオンになっており、モデル名が「Ｍ００９」の候補学習モデルの４枚の撮影画像の何れのチェックボックス７２もオンになっていない。この場合は、モデル名が「Ｍ００２」の候補学習モデルが選択学習モデルとして選択されることになる。このようなＧＵＩによる選択学習モデルの選択方法は、例えば、Ｆの値が増加してユーザがどの候補学習モデルが最も良いか判断するのが難しい場合に有効である。 The user may also select a learning model using the GUI of FIG. 7(B) instead of the GUI of FIG. 7(A). The GUI of FIG. 7(A) lets the user directly select which learning model is best. In contrast, in the GUI of FIG. 7(B), a check box 72 is provided for each captured image. For each vertical column of captured images, the user operates the user interface 15 to turn on (put a check mark in) the check box 72 of the captured image in that column whose object detection result the user judges to be preferable. Then, when the user operates the user interface 15 to press the decision button 75, the CPU 131 of the information processing device 13 selects, as the selected learning model, the candidate learning model with the largest number of captured images whose check box 72 is on, from among the candidate learning models with the model names "M002", "M011", and "M009". In the example of FIG. 7(B), three of the check boxes 72 are on among the four captured images of the candidate learning model with the model name "M002", one check box 72 is on among the four captured images of the candidate learning model with the model name "M011", and none of the check boxes 72 is on among the four captured images of the candidate learning model with the model name "M009". In this case, the candidate learning model with the model name "M002" is selected as the selected learning model. This GUI-based method of selecting the selected learning model is effective, for example, when the value of F increases and it becomes difficult for the user to judge which candidate learning model is best.

なお、「チェックボックス７２がオンになっている撮影画像の数」が同数または僅差の候補学習モデルが存在する場合には、ステップＳ２３３６で「最終的に学習モデルが１個のみ選択された状態ではない」と判断して、処理はステップＳ２３３２に進む。そしてステップＳ２３３２以降では、「チェックボックス７２がオンになっている撮影画像の数」が同数または僅差の候補学習モデルを対象にして処理を行う。このような場合でも、最終的に選択される学習モデルの数が「１」になるまで処理を繰り返す。 Note that if there are candidate learning models whose "number of captured images with the check box 72 on" is equal or differs only slightly, it is determined in step S2336 that "the state is not one in which only one learning model has been finally selected," and the process proceeds to step S2332. From step S2332 onward, the process is performed only on the candidate learning models whose "number of captured images with the check box 72 on" is equal or differs only slightly. Even in such a case, the process is repeated until the number of learning models finally selected becomes "1".
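The check-box count and the handling of equal or nearly equal counts can be sketched as follows. The function name `select_by_checkboxes` and the `margin` parameter (how small a difference counts as "slight") are assumptions made for illustration; the embodiment does not specify a numeric margin.

```python
def select_by_checkboxes(checked_counts, margin=0):
    """Return the models whose check count is within `margin` of the best.

    One returned model means the selection is final. Several returned
    models means they are tied or nearly tied and go through another
    round of display and selection.
    """
    best = max(checked_counts.values())
    return [m for m, c in checked_counts.items() if best - c <= margin]
```

With the counts of FIG. 7(B) (3, 1, and 0 checked images), only "M002" is returned, so no further round is needed.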

また、より左に表示された撮影画像は、候補学習モデル間におけるオブジェクト検出処理の結果の差異がより大きい撮影画像であることから、より左に表示される撮影画像ほどより大きい重み値を割り当てても良い。この場合、候補学習モデルごとに、チェックボックス７２がオンになっている撮影画像の重み値の合計を求め、求めた合計が最も大きい候補学習モデルを選択学習モデルとして選択するようにしても良い。 In addition, since the captured image displayed further to the left is an image for which the difference in the results of the object detection process between the candidate learning models is greater, a larger weight value may be assigned to the captured image displayed further to the left. In this case, for each candidate learning model, the sum of the weight values of the captured images for which the check boxes 72 are checked may be calculated, and the candidate learning model with the largest calculated sum may be selected as the selected learning model.
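The weighted variant can be sketched in the same way. The weight values below are hypothetical; the text only requires that images displayed further to the left receive larger weights.

```python
def weighted_vote(checked, weights):
    """Sum the weights of the checked images for each candidate model.

    `checked` maps a model name to a list of booleans, one per displayed
    image from left to right; `weights` lists the weight of each column.
    Returns the model with the largest total and all totals.
    """
    totals = {
        model: sum(w for w, on in zip(weights, flags) if on)
        for model, flags in checked.items()
    }
    best = max(totals, key=totals.get)
    return best, totals
```

With weights 4, 3, 2, 1, a model checked only in the leftmost column (total 4) outranks a model checked in the two rightmost columns (total 3), even though the latter has more check marks.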

そして情報処理装置１３のＣＰＵ１３１は、どのような方法で選択学習モデルを選択したとしても、該選択学習モデルを示す情報（例えば選択学習モデルのモデル名）をクラウドサーバ１２に通知する。 Then, regardless of the method used to select the selected learning model, the CPU 131 of the information processing device 13 notifies the cloud server 12 of information indicating the selected learning model (e.g., the model name of the selected learning model).

ステップＳ２４では、クラウドサーバ１２のＣＰＵ１９１は、情報処理装置１３から通知された情報で特定される選択学習モデルを用いて、撮影画像（カメラ１０がクラウドサーバ１２および情報処理装置１３に送信した撮影画像）に対するオブジェクト検出処理を行う。 In step S24, the CPU 191 of the cloud server 12 performs object detection processing on the captured image (the captured image transmitted by the camera 10 to the cloud server 12 and the information processing device 13) using the selected learning model identified by the information notified from the information processing device 13.

ステップＳ２５では、クラウドサーバ１２のＣＰＵ１９１は、ステップＳ２４におけるオブジェクト検出処理の結果として得られる検出領域から、目的としていた圃場の収量の予測や圃場全体に対する非生産率の計算等の分析処理を行う。この計算は、全撮影画像から検出された生産領域矩形と枯れ枝領域、病変領域等と判定された非生産領域の双方の領域を加味して行われる。 In step S25, the CPU 191 of the cloud server 12 performs analysis processing such as predicting the yield of the target field and calculating the non-productive rate for the entire field from the detection area obtained as a result of the object detection processing in step S24. This calculation is performed taking into account both the production area rectangle detected from all captured images and the non-productive areas determined to be dead branch areas, diseased areas, etc.

なお、本実施形態に係る学習モデルはディープラーニングによって学習されたモデルであるが、各種パラメータで定義されたルールベースによる検出器、ファジィ推論、遺伝的アルゴリズム、等の様々なオブジェクト検出技術を学習モデルとして利用しても良い。 Note that the learning model in this embodiment is a model learned by deep learning, but various object detection techniques such as rule-based detectors defined by various parameters, fuzzy inference, genetic algorithms, etc. may also be used as the learning model.

［第２の実施形態］ Second Embodiment
本実施形態以降では、第１の実施形態との差分について説明し、以下で特に触れない限りは第１の実施形態と同様であるものとする。本実施形態では、工場の生産ラインにおける外観検査を行うシステムを例にとり説明する。本実施形態に係るシステムは、検査対象である工業製品の異常領域を検出する。 In the following embodiments, differences from the first embodiment will be described, and unless otherwise specified below, it is assumed that the present embodiment is the same as the first embodiment. In the present embodiment, a system for performing visual inspection on a production line in a factory will be described as an example. The system according to the present embodiment detects abnormal areas in industrial products that are the objects of inspection.

従来、工場の生産ラインにおける外観検査では、製造ラインごとに検査装置（製品の外観を撮影して検査する装置）の撮影条件等が綿密に調整されており、各製造ラインが立ち上がるごとに検査装置の設定も時間をかけて調整するのが一般的であった。しかし、近年では、顧客ニーズの多様化と市場の移り変わりに即座に対応することが製造現場に望まれている。そして、小ロットであっても短期間でラインを立ち上げて需要に見合う数量の製造を行い、さらに充分な供給が終了すると即座にラインを解体して次の製造ラインに備える、といったスピーディな対応へのニーズが高まっている。 Traditionally, in visual inspections on factory production lines, the imaging conditions of the inspection equipment (equipment that photographs and inspects the appearance of products) were carefully adjusted for each production line, and it was common to spend time adjusting the settings of the inspection equipment each time a production line was started up. However, in recent years, manufacturing sites are expected to respond immediately to diversifying customer needs and changes in the market. There is a growing need for speedy responses, such as setting up a line in a short period of time, even for small lots, to produce an amount that meets demand, and then immediately dismantling the line once sufficient supply has been met to prepare for the next production line.

その際、従来同様に外観検査の設定を製造現場の専門家の経験や勘を基に毎度設定しているのでは迅速な立ち上げに対応しきれない。類似した製品の検査を過去に実施していたような場合、それらに関わる設定パラメータを保持しておいて、類似した検査を行う場合に該過去の設定パラメータを呼び出すことができれば、専門家の経験に頼ることなく誰でも検査装置の設定を行うことが可能になる。 In this case, if the settings for visual inspections are set up each time based on the experience and intuition of experts on the manufacturing site, as in the past, it will not be possible to respond to a rapid start-up. If similar products have been inspected in the past, the related setting parameters can be stored and those past setting parameters can be called up when a similar inspection is to be performed, allowing anyone to set up the inspection equipment without relying on the experience of experts.

第１の実施形態と同様に、既に保持している学習モデルを新規製品の検査対象画像に割り当てることで同様に上記目的が達成される。よって、第２の実施形態にも上記の情報処理装置１３を適用することができる。 As in the first embodiment, the above objective can be achieved by assigning an already-held learning model to the inspection target image of a new product. Therefore, the above information processing device 13 can also be applied to the second embodiment.

本実施形態に係るシステムによる検査装置の設定処理（外観検査用の設定処理）について、図８Ａのフローチャートに従って説明する。なお、外観検査用の設定処理は、製造ラインにおける検査ステップの立ち上げ時に実施することを想定している。 The setting process for the inspection device by the system according to this embodiment (setting process for visual inspection) will be described with reference to the flowchart in FIG. 8A. Note that the setting process for visual inspection is assumed to be performed when starting up an inspection step in the production line.

クラウドサーバ１２の外部記憶装置１９６には、撮影画像における外観検査を行うための学習モデル（外観検査用モデル／設定）が複数登録されており、それぞれの学習モデルは互いに異なる学習環境で学習されたモデルである。 Multiple learning models (appearance inspection models/settings) for performing appearance inspections on captured images are registered in the external storage device 196 of the cloud server 12, and each learning model is a model that has been trained in a different learning environment.

カメラ１０は、外観検査の対象となる製品（検査対象製品）を撮影するためのカメラである。第１の実施形態と同様、カメラ１０は定期的もしくは不定期的に撮影を行うカメラであっても良いし、動画像を撮影するカメラであっても良い。撮影画像から検査対象製品における正確な異常領域の検出を行うために、異常領域を含む検査対象製品が検査工程に入ってきた場合は、可能な限り異常領域が強調されるような条件で撮影されることが望ましい。カメラ１０は、検査対象製品を複数の条件で撮影するのであればマルチカメラであっても良い。 Camera 10 is a camera for photographing a product that is the subject of visual inspection (product under inspection). As in the first embodiment, camera 10 may be a camera that photographs periodically or irregularly, or may be a camera that photographs moving images. In order to accurately detect abnormal areas in the product under inspection from the photographed images, when a product under inspection containing an abnormal area enters the inspection process, it is desirable to photograph the product under conditions that highlight the abnormal area as much as possible. Camera 10 may be a multi-camera if it photographs the product under inspection under multiple conditions.

ステップＳ８０では、カメラ１０は、検査対象製品を撮影することで該検査対象製品の撮影画像を生成する。ステップＳ８１では、カメラ１０は、ステップＳ８０で生成した撮影画像を通信網１１を介してクラウドサーバ１２および情報処理装置１３に対して送信する。 In step S80, the camera 10 captures an image of the product to be inspected to generate a captured image of the product to be inspected. In step S81, the camera 10 transmits the captured image generated in step S80 to the cloud server 12 and the information processing device 13 via the communication network 11.

ステップＳ８２では、情報処理装置１３のＣＰＵ１３１は、カメラ１０が撮影した検査対象製品などに関する情報（検査対象製品の部品名や材質、製造年月日、撮影時の撮像系パラメータ、ロット番号や気温、湿度等）を検査対象製品パラメータとして取得する。例えば、ＣＰＵ１３１は、ＧＵＩを表示装置１４に表示させて、ユーザからの検査対象製品パラメータの入力を受け付ける。そしてユーザがユーザインターフェース１５を操作して登録指示を入力すると、情報処理装置１３のＣＰＵ１３１は、ＧＵＩにおいて入力された上記の各項目の検査対象製品パラメータをクラウドサーバ１２に対して送信する。クラウドサーバ１２のＣＰＵ１９１は、情報処理装置１３から送信された検査対象製品パラメータを外部記憶装置１９６に保存（登録）する。 In step S82, the CPU 131 of the information processing device 13 acquires information about the product to be inspected photographed by the camera 10 (part name and material of the product to be inspected, manufacturing date, imaging system parameters at the time of photographing, lot number, temperature, humidity, etc.) as parameters of the product to be inspected. For example, the CPU 131 displays a GUI on the display device 14 and accepts input of the product to be inspected parameters from the user. When the user operates the user interface 15 to input a registration instruction, the CPU 131 of the information processing device 13 transmits the product to be inspected parameters of each of the above items input in the GUI to the cloud server 12. The CPU 191 of the cloud server 12 stores (registers) the product to be inspected parameters transmitted from the information processing device 13 in the external storage device 196.

ステップＳ８３では、撮影画像から上記の検査対象製品を検出するために用いる学習モデルを選択するための処理が行われる。ステップＳ８３における処理の詳細について、図８Ｂのフローチャートに従って説明する。 In step S83, a process is performed to select a learning model to be used to detect the above-mentioned inspection target product from the captured image. Details of the process in step S83 will be described with reference to the flowchart in FIG. 8B.

ステップＳ８３１では、クラウドサーバ１２のＣＰＵ１９１は、外部記憶装置１９６に保存しているＥ個の学習モデルのうち候補となるＭ個の学習モデル（候補学習モデル）を選択する。ＣＰＵ１９１は、外部記憶装置１９６に登録されている検査対象製品パラメータから第１の実施形態と同様にクエリパラメータを生成し、該クエリパラメータが示す環境と類似する環境について学習した学習モデル（過去の類似した検査で用いた学習モデル）を選択する。 In step S831, the CPU 191 of the cloud server 12 selects M candidate learning models (candidate learning models) from the E learning models stored in the external storage device 196. The CPU 191 generates query parameters from the product parameters to be inspected registered in the external storage device 196 in the same manner as in the first embodiment, and selects a learning model that has learned about an environment similar to the environment indicated by the query parameters (a learning model used in a similar past inspection).

クエリパラメータに「部品名」として「基盤」が含まれている場合、過去の基盤検査に用いられた学習モデルが選ばれやすくなり、さらに「材質」として「ガラスエポキシ」が含まれている場合、ガラスエポキシ基盤の検査に用いられた学習モデルが選ばれやすくなる。 If the query parameters include "circuit board" as the "part name," a learning model that was used in past circuit board inspections is more likely to be selected, and if the query parameters include "glass epoxy" as the "material," a learning model that was used in the inspection of glass epoxy circuit boards is more likely to be selected.

ステップＳ８３１でも第１の実施形態と同様、学習モデルのパラメータセットとクエリパラメータとを用いてＭ個の候補学習モデルを選択するが、その際には、第１の実施形態と同様、上記の式（１）を用いる。 In step S831, as in the first embodiment, M candidate learning models are selected using the learning model parameter set and the query parameters, and in this case, the above formula (1) is used, as in the first embodiment.

次に、ステップＳ８３２では、クラウドサーバ１２のＣＰＵ１９１は、カメラ１０から受信した撮影画像からＰ枚の撮影画像をモデル選択対象画像として選択する。例えば、本製造ラインの本検査工程に流れてくる製品をランダムに選択して実運用時と同様の設定でカメラ１０で撮影した撮影画像からＰ枚の撮影画像を取得する。通常、製造ラインで発生する異常品の数は少ないため、該工程で撮影する製品の数が少ない場合は以降のステップにおける処理が良く機能しない。よって、目安として数百個以上の製品の撮影が望ましい。 Next, in step S832, the CPU 191 of the cloud server 12 selects P captured images from the captured images received from the camera 10 as images for model selection. For example, products flowing through this inspection process of this production line are randomly selected and photographed by the camera 10 with the same settings as in actual operation, and the P captured images are taken from the resulting images. Since the number of defective products that occur on a production line is usually small, the processing in the subsequent steps does not work well if only a few products are photographed in this process. Therefore, as a guideline, it is desirable to photograph several hundred or more products.

次に、ステップＳ８３３では、ステップＳ８３２で選択したＰ枚の撮影画像を用いて、Ｍ個の候補学習モデルから１つを選択学習モデルとして選択するための処理が行われる。ステップＳ８３３における処理の詳細について、図８Ｃのフローチャートに従って説明する。 Next, in step S833, a process is performed to select one of the M candidate learning models as a selected learning model using the P captured images selected in step S832. Details of the process in step S833 will be described with reference to the flowchart in FIG. 8C.

ステップＳ８３３０では、クラウドサーバ１２のＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれについて、「Ｐ枚の撮影画像のそれぞれについて、該候補学習モデルを用いて該撮影画像からオブジェクトを検出する処理であるオブジェクト検出処理」を行う。本実施形態でも、撮影画像に対するオブジェクト検出処理の結果は、該撮影画像から検出されたオブジェクトの画像領域（矩形領域、検出領域）の位置情報である。 In step S8330, the CPU 191 of the cloud server 12 performs "object detection processing, which is processing for detecting an object from each of the P captured images using the candidate learning model" for each of the M candidate learning models. In this embodiment, too, the result of the object detection processing for the captured images is position information of the image area (rectangular area, detection area) of the object detected from the captured image.

ステップＳ８３３１では、ＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれの「Ｐ枚の撮影画像のそれぞれに対するオブジェクト検出処理の結果」に対するスコアを求める。そしてＣＰＵ１９１は、該スコアに基づいてＭ個の候補学習モデルの順位付け（ランキング作成）を行って、Ｍ個の候補学習モデルからＮ個の候補学習モデルを選択する。候補学習モデルによるオブジェクト検出処理の結果に対するスコアは、例えば、以下のようにして求める。 In step S8331, the CPU 191 obtains a score for the "result of the object detection process for each of the P captured images" for each of the M candidate learning models. The CPU 191 then ranks the M candidate learning models based on the scores (creates a ranking) and selects N candidate learning models from the M candidate learning models. The score for the result of the object detection process using the candidate learning models is obtained, for example, as follows.

例えば、プリント基板上の異常を検出するようなタスクにおいて、固定プリントパターン上の各種特定局所パターンに対してオブジェクト検出処理を実施するものとする。ここで、特定の学習モデルでは正常品の撮影画像からは、図９（Ａ）のような検出領域９０１～９０６が得られるとする。製造ラインで生産される製品の異常の発生頻度は極めて少ないため、上記タスクを実施する上で良い学習モデルとは、想定される撮影画像のバラつきに対して安定した結果を出力できる学習モデルである。例えば、エリアセンサ側の環境の変動によって僅かに製品を撮影した画像の見た目が変わることによって図９（Ｂ）のように検出領域９０１～９０６のうち検出領域９０６が検出できなくなることがある。このような場合、僅かな違いしか無い入力に対して検出領域が変わる学習モデルの評価スコアに対しては罰則を与えるべきである。 For example, in a task to detect anomalies on a printed circuit board, object detection processing is performed on various specific local patterns on a fixed print pattern. Here, it is assumed that in a specific learning model, detection areas 901 to 906 as shown in FIG. 9(A) are obtained from a captured image of a normal product. Since the frequency of occurrence of abnormalities in products produced on a manufacturing line is extremely low, a good learning model for performing the above task is one that can output stable results for expected variations in captured images. For example, a slight change in the appearance of the captured image of the product due to environmental fluctuations on the area sensor side may make it impossible to detect detection area 906 out of detection areas 901 to 906 as shown in FIG. 9(B). In such a case, a penalty should be imposed on the evaluation score of a learning model in which the detection area changes in response to an input with only a slight difference.

よって、例えば、クラウドサーバ１２のＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれについて、Ｐ枚の撮影画像間で該候補学習モデルによる検出領域の配置パターンが大きく異なるほど大きいスコアを決定する。このようなスコアは、例えば、上記の式（４）を計算することで求めることができる。そしてＭ個の候補学習モデルをスコアが小さい順に順位付けし、スコアが小さい順に上位Ｎ個の候補学習モデルを選択する。該選択の際には、「スコアが閾値未満」という条件を加えても良い。 Therefore, for example, the CPU 191 of the cloud server 12 determines, for each of the M candidate learning models, a score that becomes larger as the arrangement pattern of the detection areas produced by that candidate learning model differs more greatly among the P captured images. Such a score can be obtained, for example, by calculating the above formula (4). The M candidate learning models are then ranked in ascending order of score, and the top N candidate learning models in ascending order of score (that is, the N models with the smallest scores) are selected. When making this selection, a condition that "the score is less than a threshold value" may be added.
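The ranking in step S8331 can be sketched as follows, taking the per-model scores as given (the computation by formula (4) is not reproduced here). The function name `select_stable_models` and the example threshold are assumptions for illustration.

```python
def select_stable_models(scores, n, threshold=None):
    """Pick up to N models with the smallest variation scores.

    A small score means the detection-area pattern changes little across
    the P captured images, i.e. the model gives stable results.
    If `threshold` is given, only models scoring below it are eligible.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1])  # ascending
    if threshold is not None:
        ranked = [(m, s) for m, s in ranked if s < threshold]
    return [m for m, _ in ranked[:n]]
```

Note that the threshold can leave fewer than N candidates, in which case fewer rows would be shown to the user.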

ステップＳ８３３２では、情報処理装置１３のＣＰＵ１３１は、Ｐ枚の撮影画像について、ユーザの主観による比較がしやすい情報提示のためのスコアリング（表示画像スコアリング）を、第１の実施形態（ステップＳ２３３２）と同様にして行う。 In step S8332, the CPU 131 of the information processing device 13 performs scoring (display image scoring) for the P captured images to present information that is easy for the user to subjectively compare, in the same manner as in the first embodiment (step S2332).

ステップＳ８３３３では、情報処理装置１３のＣＰＵ１３１は、ステップＳ８３３１で選択したＮ個の候補学習モデルのそれぞれについて、クラウドサーバ１２から受信したＰ枚の撮影画像のうちスコアが大きい順に上位Ｆ枚の撮影画像と、クラウドサーバ１２から受信した該撮影画像に対するオブジェクト検出処理の結果と、を表示装置１４に表示させる。その際、Ｆ枚の撮影画像はスコアが大きい順に左から並べて表示する。 In step S8333, the CPU 131 of the information processing device 13 causes the display device 14 to display, for each of the N candidate learning models selected in step S8331, the top F captured images among the P captured images received from the cloud server 12 in descending order of score, and the results of the object detection process for the captured images received from the cloud server 12. At that time, the F captured images are displayed in descending order of score from the left.

候補学習モデルごとの撮影画像およびオブジェクト検出処理の結果を表示したＧＵＩの表示例を図１０（Ａ）に示す。図１０（Ａ）では、Ｎ＝３、Ｆ＝４のケースについて示している。 Figure 10(A) shows an example of a GUI displaying the captured images and the results of the object detection process for each candidate learning model. Figure 10(A) shows the case where N = 3 and F = 4.

最上の行には、スコアが最も大きい候補学習モデルのモデル名「Ｍ００５」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００５」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 The top row displays the model name "M005" of the candidate learning model with the highest score along with a radio button 100, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area detected from the captured image by the candidate learning model with the model name "M005" is superimposed on the captured image.

中段の行には、スコアが２番目に大きい候補学習モデルのモデル名「Ｍ０２３」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０２３」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 In the middle row, the model name "M023" of the candidate learning model with the second highest score is displayed along with a radio button 100, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area detected from the captured image by the candidate learning model with the model name "M023" is superimposed on the captured image.

下段の行には、スコアが３番目に大きい候補学習モデルのモデル名「Ｍ０１４」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０１４」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 In the bottom row, the model name "M014" of the candidate learning model with the third highest score is displayed along with a radio button 100, and to the right of that, the top four captured images with the highest scores are displayed in order from left to right. A frame indicating the detection area detected from the captured image by the candidate learning model with model name "M014" is superimposed on the captured image.

この場合、検出領域の配置パターンの違いについては、製品外観がほとんど固定であり、多くは正常品であることが多いため、図１０（Ａ）に示されるようになる。Ｆ枚の撮影画像はスコアが大きい順に左から並べて表示するが、スコアの大きいものは個別撮影時の撮影条件の違いが大きかった場合もしくは異常領域を含む個体であった場合になる傾向がある。よって、ユーザは事前に製品の撮影画像に対して異常領域へのアノテーション作業を実施するほか、数多くの製品から欠陥品を手作業で探してから検査装置の設定を行っていた従来の方式に比べ、該作業を一切実施しなくともこのＧＵＩを見るだけで異常領域を含む可能性のある製品の撮影画像から優先的にユーザに提示することができるため、省力化になる。ユーザは図１０（Ａ）のＧＵＩでオブジェクト検出処理の結果を見比べながら、正しく異常領域を検出できている学習モデルを選択すれば良い。 In this case, because the product appearance is almost fixed and most products are normal, the differences in the arrangement pattern of the detection areas appear as shown in FIG. 10(A). The F captured images are displayed from the left in descending order of score, and images with high scores tend to be those captured under markedly different imaging conditions or those of products containing abnormal areas. In the conventional method, the user annotated abnormal areas in captured images of the product in advance and manually searched a large number of products for defective ones before setting up the inspection device. In contrast, this GUI presents captured images of products that may contain abnormal areas to the user preferentially, without any such work, which saves labor. The user need only compare the results of the object detection process in the GUI of FIG. 10(A) and select a learning model that correctly detects the abnormal areas.

ステップＳ８３３４では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）を受け付ける。ステップＳ８３３５では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われたか否かを判断する。 In step S8334, the CPU 131 of the information processing device 13 accepts a candidate learning model selection operation (user input) by the user. In step S8335, the CPU 131 of the information processing device 13 determines whether or not a candidate learning model selection operation (user input) has been performed by the user.

図１０（Ａ）の場合、ユーザは、モデル名「Ｍ００５」の候補学習モデルを選択する場合には、最上の行におけるラジオボタン１００をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ０２３」の候補学習モデルを選択する場合には、中段の行におけるラジオボタン１００をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ０１４」の候補学習モデルを選択する場合には、下段の行におけるラジオボタン１００をユーザインターフェース１５を用いて選択する。図１０（Ａ）では、モデル名「Ｍ００５」に対応するラジオボタン１００が選択されているため、モデル名「Ｍ００５」の候補学習モデルが選択されたことを示す枠１０４が表示される。 In the case of FIG. 10(A), when the user selects a candidate learning model with the model name "M005", the user selects the radio button 100 in the top row using the user interface 15. When the user selects a candidate learning model with the model name "M023", the user selects the radio button 100 in the middle row using the user interface 15. When the user selects a candidate learning model with the model name "M014", the user selects the radio button 100 in the bottom row using the user interface 15. In FIG. 10(A), because the radio button 100 corresponding to the model name "M005" is selected, a frame 104 is displayed indicating that the candidate learning model with the model name "M005" has been selected.

そしてユーザがユーザインターフェース１５を操作して決定ボタン１０１を指示すると、ＣＰＵ１３１は、「ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われた」と判断し、選択したラジオボタン１００に対応する候補学習モデルを選択学習モデルとして選択する。 When the user operates the user interface 15 to select the decision button 101, the CPU 131 determines that "the user has performed a candidate learning model selection operation (user input)," and selects the candidate learning model corresponding to the selected radio button 100 as the selected learning model.

この判断の結果、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われた場合には、処理はステップＳ８３３６に進み、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われていない場合には、処理はステップＳ８３３４に進む。 If the result of this determination is that the user has performed a selection operation (user input) of a candidate learning model, processing proceeds to step S8336; if the user has not performed a selection operation (user input) of a candidate learning model, processing proceeds to step S8334.

ステップＳ８３３６では、情報処理装置１３のＣＰＵ１３１は、最終的に学習モデルが「ユーザが希望する数」だけ選択された状態であるのかを確認する。そして、最終的に学習モデルが「ユーザが希望する数」だけ選択された状態である場合には、処理はステップＳ８４に進み、最終的に学習モデルが「ユーザが希望する数」だけ選択された状態ではない場合には、処理はステップＳ８３３２に進む。 In step S8336, the CPU 131 of the information processing device 13 checks whether the "number of learning models desired by the user" has been selected. If the "number of learning models desired by the user" has been selected, the process proceeds to step S84, and if the "number of learning models desired by the user" has not been selected, the process proceeds to step S8332.

Here, the "number desired by the user" is determined mainly according to the time that may be spent on visual inspection (takt time). For example, when the "number desired by the user" is two, a wider range of detection may become possible by varying what each learning model targets, such as using one learning model to detect low-frequency abnormal regions and the other to detect high-frequency defects.

If the user cannot narrow the selection down to the "number desired by the user" just by looking at the display of FIG. 10(A), the user may select multiple candidate learning models by selecting multiple radio buttons 100. For example, if the "number desired by the user" is 1 and two radio buttons 100 are selected, then N=2 and the process proceeds to step S8332 via step S8336. In this case, the same processing is performed from step S8332 onward with N=2 and F=4. The process is repeated in this way until the number of learning models finally selected equals the "number desired by the user".
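The repetition described above can be illustrated with the following minimal sketch. It is not the implementation of the embodiment: the function name `narrow_down`, the callback `ask_user` (standing in for displaying the GUI and reading the user's selection), and the `max_rounds` safeguard are all hypothetical and are introduced only for illustration.

```python
from typing import Callable, List


def narrow_down(
    candidates: List[str],
    desired: int,
    ask_user: Callable[[List[str]], List[str]],
    max_rounds: int = 10,  # hypothetical safeguard, not part of the description
) -> List[str]:
    """Repeat the selection round until exactly `desired` models remain."""
    if desired < 1 or len(candidates) < desired:
        raise ValueError("desired must be between 1 and the number of candidates")
    selected = list(candidates)
    rounds = 0
    # Corresponds to the check in step S8336: stop once the desired count is reached.
    while len(selected) != desired:
        if rounds >= max_rounds:
            raise RuntimeError("selection did not converge")
        # Corresponds to returning to step S8332: show the remaining
        # candidates again and read which ones the user picked.
        picked = [m for m in ask_user(selected) if m in selected]
        # Accept the pick only if enough models remain to reach `desired`.
        if len(picked) >= desired:
            selected = picked
        rounds += 1
    return selected


# Scripted "user": picks two models in the first round, one in the second.
answers = iter([["M005", "M023"], ["M005"]])
result = narrow_down(["M005", "M023", "M014"], 1, lambda shown: next(answers))
print(result)  # ['M005']
```

In this sketch a pick that leaves fewer models than desired is simply ignored and the round is repeated; the description does not specify that case, so this behavior is an assumption.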

The user may also select a learning model using the GUI of FIG. 10(B) instead of the GUI of FIG. 10(A). The GUI of FIG. 10(A) has the user directly choose which learning model is best. In contrast, in the GUI of FIG. 10(B), a check box 102 is provided for each captured image. For each vertical column of captured images, the user operates the user interface 15 to turn on (put a check mark in) the check box 102 of each captured image in that column whose object detection result the user judges to be good. When the user then operates the user interface 15 to press the decision button 1015, the CPU 131 of the information processing device 13 selects, from among the candidate learning models with the model names "M005", "M023", and "M014", the candidate learning model with the largest number of captured images whose check boxes 102 are on as the selected learning model. In the example of FIG. 10(B), two of the check boxes 102 for the four captured images of the candidate learning model "M005" are on, one of the check boxes 102 for the four captured images of the candidate learning model "M023" is on, and one of the check boxes 102 for the four captured images of the candidate learning model "M014" is on. In this case, the candidate learning model with the model name "M005" is selected as the selected learning model. Selecting the selected learning model with this GUI is effective, for example, when the value of F increases and it becomes difficult for the user to judge which candidate learning model is best.
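As an illustration of the check-box-based selection of FIG. 10(B), the following minimal sketch counts the check boxes that are on for each candidate learning model and picks the model with the largest count. The data layout (a dictionary from model name to a list of on/off flags) and the function names `count_checked` and `select_by_checks` are assumptions made for this example; the check pattern follows the counts given for FIG. 10(B), but which of the four images are checked is arbitrary.

```python
from typing import Dict, List


def count_checked(checks: Dict[str, List[bool]]) -> Dict[str, int]:
    """Count, per candidate model, the captured images whose check box is on."""
    return {model: sum(flags) for model, flags in checks.items()}


def select_by_checks(checks: Dict[str, List[bool]]) -> str:
    """Return the model with the most checked images (ties are not resolved here)."""
    counts = count_checked(checks)
    return max(counts, key=lambda model: counts[model])


# Four captured images per model (F=4); True means the check box 102 is on.
checks = {
    "M005": [True, True, False, False],
    "M023": [True, False, False, False],
    "M014": [False, True, False, False],
}
print(count_checked(checks))     # {'M005': 2, 'M023': 1, 'M014': 1}
print(select_by_checks(checks))  # M005
```

Note that `max` simply returns the first model with the largest count, so a tie would need separate handling.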

In the GUI of FIG. 10(B), the simplest way to finally narrow the learning models down to the "number desired by the user" is to take the models in descending order of the number of check boxes that are on, from the top until the "number desired by the user" is reached.

If there are candidate learning models whose "number of captured images with the check box 102 on" is equal or differs only slightly, it is determined in step S8336 that "the number of learning models finally selected does not equal the 'number desired by the user'", and the process proceeds to step S8332. From step S8332 onward, processing is performed on those candidate learning models whose counts are equal or differ only slightly. In this case as well, the process is repeated until the number of learning models finally selected equals the "number desired by the user".
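The top-ranked selection and the handling of equal or nearly equal counts can be sketched as follows. This is only one possible reading: the function name `pick_top`, the `margin` parameter that defines a "small difference", and the rule for which models are passed to the next round are assumptions, since the description does not fix them.

```python
from typing import Dict, List, Tuple


def pick_top(
    counts: Dict[str, int], desired: int, margin: int = 0
) -> Tuple[List[str], List[str]]:
    """Return (confirmed, contested) models for the desired number of models.

    `contested` is non-empty when the counts around the cutoff are equal or
    differ by at most `margin`, meaning another selection round is needed.
    """
    if desired < 1:
        raise ValueError("desired must be at least 1")
    # Rank models by the number of checked images, largest first.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= desired:
        return [name for name, _ in ranked], []
    cutoff = ranked[desired - 1][1]   # count of the last model inside the top group
    runner_up = ranked[desired][1]    # count of the first model outside it
    if cutoff - runner_up > margin:
        # Clear gap: the top `desired` models are selected as they are.
        return [name for name, _ in ranked[:desired]], []
    # Equal or nearly equal counts: keep the clear winners and send the
    # models around the cutoff to another round.
    confirmed = [name for name, c in ranked if c > cutoff]
    contested = [name for name, c in ranked if runner_up <= c <= cutoff]
    return confirmed, contested


counts = {"M005": 2, "M023": 1, "M014": 1}
print(pick_top(counts, 1))  # (['M005'], [])
print(pick_top(counts, 2))  # (['M005'], ['M023', 'M014'])
```

With a desired number of 2, "M023" and "M014" have the same count, so they are returned as contested models to be compared again, which corresponds to returning to step S8332.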

In step S84, the CPU 191 of the cloud server 12 performs object detection processing on the captured images (the captured images that the camera 10 transmitted to the cloud server 12 and the information processing device 13) using the selected learning model identified by the information notified from the information processing device 13. The CPU 191 of the cloud server 12 then makes the final settings of the inspection device based on the detection regions obtained as a result of the object detection processing. Inspection is carried out with the learning model and the various parameters set here once the production line is actually started up.

<Modification>
Each of the above embodiments is an example of a technique for reducing the cost of training a learning model and adjusting settings each time detection/identification processing is performed on a new target in a task that carries out the intended detection/identification processing. Therefore, the techniques described in the above embodiments are not limited to application to predicting crop yields, detecting repair regions, detecting abnormal regions in industrial products under inspection, and the like. They are applicable to agriculture, industry, fisheries, and a wide range of other fields.

The radio buttons and check boxes described above are displayed as examples of selection sections with which the user selects a target, and other display items may be displayed instead as long as they provide a similar effect. In the above embodiments, a configuration has been described in which the learning model used for the object detection processing is selected based on a user operation (S24). However, the configuration is not limited to this, and the learning model used for the object detection processing may be selected automatically. For example, the candidate learning model with the highest score may be automatically selected as the learning model used for the object detection processing.

The entity that performs each process in the above description is merely an example. For example, some or all of the processes described as being performed by the CPU 191 of the cloud server 12 may be performed by the CPU 131 of the information processing device 13. Likewise, some or all of the processes described as being performed by the CPU 131 of the information processing device 13 may be performed by the CPU 191 of the cloud server 12.

In the above description, the system of each embodiment is described as performing the analysis processing. However, the entity that performs the analysis processing is not limited to the system of each embodiment; for example, the analysis processing may be performed by another device or system.

The numerical values, processing timing, processing order, processing entities, and the structure, destination, and source of data (information) used in the above embodiments and modifications are given as examples for the sake of concrete explanation, and are not intended to limit the invention to these examples.

Some or all of the embodiments and modifications described above may be used in appropriate combination. Some or all of the embodiments and modifications described above may also be used selectively.

(Other Embodiments)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. The present invention can also be realized by a circuit (e.g., an ASIC) that implements one or more of the functions.

The invention is not limited to the above-described embodiments, and various changes and modifications are possible without departing from the spirit and scope of the invention. Therefore, the following claims are appended to make the scope of the invention public.

10: Camera
11: Communication network
12: Cloud server
13: Information processing device
14: Display device
15: User interface
131: CPU
132: RAM
133: ROM
134: Output I/F
135: Input I/F
191: CPU
192: RAM
193: ROM
194: Operation unit
195: Display unit
196: External storage device
197: I/F
198: System bus

Claims

1. An information processing device comprising:
a first selection means for selecting one or more learning models as candidate learning models from a plurality of learning models trained in different learning environments based on information related to photographing an object;
a second selection means for selecting one or more candidate learning models from the candidate learning models selected by the first selection means based on a result of an object detection process using the candidate learning models;
a detection means for performing an object detection process on a photographed image of the object by using at least one of the candidate learning models selected by the second selection means; and
a means for predicting a yield of a crop and detecting a repair area in a farm field based on an object detection area obtained as a result of the object detection process.

2. The information processing device according to claim 1, characterized in that the first selection means generates a query parameter based on the information, and selects, from among the multiple learning models, one or more learning models that have been trained in an environment similar to the environment indicated by the query parameter as candidate learning models.

3. The information processing device according to claim 1 or 2, characterized in that the second selection means obtains a score for each candidate learning model selected by the first selection means based on the result of the object detection process using the candidate learning model, and selects one or more candidate learning models from the candidate learning models selected by the first selection means based on the score.

4. The information processing device according to claim 1, further comprising a display control means for displaying a result of the object detection process using the candidate learning model selected by the second selection means.

5. The information processing device according to claim 4, characterized in that the display control means determines, for each of the multiple captured images on which the candidate learning models selected by the second selection means have performed the object detection process, a score that is larger the larger the difference in the results of the object detection process between the candidate learning models, and displays, for each candidate learning model selected by the second selection means, a GUI including the results of the object detection process by that candidate learning model for a specified number of captured images in descending order of score.

6. The information processing device according to claim 5, characterized in that the GUI includes a selection section for selecting a candidate learning model, and the detection means sets the candidate learning model corresponding to the selection section selected in response to a user operation on the GUI as a selected learning model, and performs the object detection process using the selected learning model.

7. The information processing device according to claim 5, characterized in that the detection means selects, as a selected learning model, the candidate learning model for which the largest number of object detection process results were selected in response to a user operation from among the object detection process results displayed for each candidate learning model by the display control means, and performs the object detection process using the selected learning model.

8. The information processing device according to claim 1, wherein the information includes Exif information of the captured image, information relating to a field in which the captured image was captured, and information relating to objects included in the captured image.

9. The information processing device according to any one of claims 1 to 7, characterized in that the information includes information related to an object included in the captured image.

10. The information processing device according to any one of claims 1 to 9, characterized in that the detection means performs the object detection process on the captured image of the object using a candidate learning model selected, based on a user operation, from the candidate learning models selected by the second selection means.

11. An information processing method performed by an information processing device, the method comprising:
a first selection step in which a first selection means of the information processing device selects one or more learning models as candidate learning models from a plurality of learning models trained in different learning environments based on information related to photographing an object;
a second selection step in which a second selection means of the information processing device selects one or more candidate learning models from the candidate learning models selected in the first selection step based on a result of an object detection process using the candidate learning models;
a first detection step in which a first detection means of the information processing device performs an object detection process on a photographed image of the object by using at least one of the candidate learning models selected in the second selection step; and
a second detection step in which a second detection means of the information processing device predicts a yield of a crop and detects a repair area in a field based on a detection area of the object obtained as a result of the object detection process.

12. A computer program for causing a computer to function as each of the means of the information processing device according to any one of claims 1 to 10.