JP6682833B2

JP6682833B2 - Database construction system for machine learning of object recognition algorithm

Info

Publication number: JP6682833B2
Application number: JP2015237644A
Authority: JP
Inventors: 市川　健太郎; 健太郎市川; 文洋奥村
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 2015-12-04
Filing date: 2015-12-04
Publication date: 2020-04-15
Anticipated expiration: 2035-12-04
Also published as: JP2017102838A

Description

The present invention relates to technology for recognizing or detecting events, such as the presence or state of an object, from the output of a sensor through an algorithm that uses machine learning. More specifically, it relates to a system for constructing a database for the machine learning of such a recognition algorithm, usable, for example, when an on-board sensor detects surrounding vehicles, pedestrians, or other obstacles while a vehicle such as an automobile is traveling.

Various techniques have been proposed for detecting objects such as a preceding vehicle, another moving body, or other obstacles from a traveling vehicle or other moving body using on-board sensors such as a camera, LIDAR (Laser Imaging Detection and Ranging), or radar. Each of these sensors has strengths and weaknesses in detection capability that depend on the conditions around the vehicle or moving body and on the state of the object to be detected. With camera-based detection, for example, objects can be detected accurately in bright conditions such as daytime, but detection accuracy drops at night, and the accuracy of detecting whether an object is present and of measuring its distance varies with the distance between the sensor and the object. With LIDAR-based detection, on the other hand, accuracy is largely unaffected by ambient brightness and the distance and direction to an object can be detected with high accuracy, but accuracy drops in rain, and recognition of the object type is generally somewhat less accurate than with camera images. For this reason, when sensors mounted on a vehicle or moving body are used to recognize or detect the presence and/or state of surrounding objects, various configurations have been proposed that use several sensors of different types, either recognizing or detecting objects from each sensor's output separately or integrating the outputs of the sensors to do so.

For example, Patent Document 1 discloses superimposing a LIDAR point sequence, which represents position information of another vehicle ahead of the host vehicle obtained by LIDAR, on the edges of that vehicle identified in a camera image, in order to locate the edges of the other vehicle. Patent Document 2 proposes a device for recognizing objects ahead of a traveling vehicle in which stereo image processing of images captured by two electronic cameras 1 and 2, and a laser range finder, each detect the positions at which objects exist in a planar region ahead of the vehicle; the detection frequencies in the two results are weighted and summed, and an object is judged to exist at any position where the sum is at or above a threshold. That document describes adjusting the weighting ratio by learning. Patent Document 3 describes a configuration for recognizing objects ahead of a traveling vehicle in which, based on the position of each reflection point identified by a laser radar, a group of reflection points lying at approximately equal distance and within approximately one vehicle width is taken as a vehicle candidate point group; this group is transformed into the camera coordinate system and compared with a rectangular region extracted by the camera, and if the transformed group substantially coincides with the rectangular region, it is judged to be a vehicle ahead. Patent Document 4 proposes a device that detects a degree of risk by referring to risk information using a state quantity obtained by converting a feature vector extracted from image data of the area ahead of the vehicle into one dimension; information extracted from the driver's operation information is used as teacher information, and the risk information that determines the degree of risk is learned using the correlation between the one-dimensional state quantity and the teacher information. Patent Document 5 discloses a device that determines road surface conditions from the reflection intensity (reflected light data) of laser light projected onto the road surface while the vehicle travels, in which the reflected light data used for the determination is generated by learning from reflected light data obtained while traveling on known road surfaces. Patent Document 6 discloses a configuration for recognizing objects ahead of a traveling vehicle in which, for a region of interest in an image captured during actual driving, X- and Y-direction vectors and a laser vector for a vehicle candidate region are created from an edge histogram of the camera image and a received-light intensity histogram of the laser radar and fused into a fusion vector; the vehicle candidate region is predicted to be a vehicle when the distance in vector space between this fusion vector and a dictionary fusion vector prepared in advance is small.

Patent Document 1: JP 2009-98025 A
Patent Document 2: JP 2000-329852 A
Patent Document 3: JP 2003-84064 A
Patent Document 4: JP 2008-238831 A
Patent Document 5: JP 2014-228300 A
Patent Document 6: JP 2003-99762 A

As already noted, when a given target is detected or recognized with several types of sensors, as when a camera, LIDAR, radar, and the like are used to detect a preceding vehicle or obstacle ahead of a traveling vehicle, the targets and environments in which accurate detection is possible differ from one sensor type to another, so each sensor has strengths and weaknesses. For a given detection target or detection environment, one sensor may be able to recognize the presence and/or type of an object accurately while the other sensor, on its own, has difficulty recognizing it accurately, or appropriate signal processing for that sensor (for example, processing for recognizing the object type) may not have been established in the first place. Depending on the sensor type, the processing conditions needed for accurate recognition may also change during use. In such cases, that is, when the appropriate signal processing method, conditions, or procedure for recognizing the presence or type of an object and/or estimating its motion state (hereinafter, "object recognition") from a sensor's output are unknown, uncertain, or variable, one conceivable way of adjusting them is to use machine learning: the output of that sensor is collated with detection results obtained from the output of a different sensor that is in a state where it can recognize objects accurately, and the signal processing method, conditions, or procedure are constructed or adjusted so that object recognition from the former sensor's output becomes as accurate as possible. In other words, if the detection results of a first sensor are used as teacher data and the signal processing method, conditions, or procedure for obtaining detection results from the output of a second sensor are constructed or adjusted by machine learning, the accuracy of the second sensor's detection results can be expected to improve. It is generally known that the more accurate the teacher data and the larger and more diverse the database (that is, the more data it contains on vehicles and other objects at various distances, in various orientations, and of various types), the more accurate the second sensor's detection results obtained through machine learning become.

Constructing or adjusting a sensor's signal processing method, conditions, or procedure with machine learning, that is, building a recognition algorithm using machine learning, requires a large amount of supervised learning data: data in which the detection results of the first sensor, which serve as teacher data, are matched with the output and/or detection results of the second sensor, which serve as input data. Conventionally, such teacher data or learning data has been created by having a person manually tag the detection results of the first sensor and the output and/or detection results of the second sensor. However, it is difficult to produce a large amount of supervised learning data manually at low cost, and for some kinds of data manual creation of teacher data is itself difficult. For example, when a bounding box (a frame set within the image) is to be assigned to the image of a preceding vehicle in a picture taken ahead of the vehicle, doing so manually is easy if the image is clear, but it is hard to assign an accurate bounding box in images taken against the light or at night. It is also not easy to manually create teacher data for data with complex shapes or ambiguous boundaries, such as a point cloud acquired by LIDAR or an intensity map from a millimeter-wave radar. Furthermore, it is difficult to manually provide teacher data for information that cannot be observed directly from images or video, such as vehicle speed. Therefore, to prepare a large amount of supervised learning data for building a recognition algorithm using machine learning, it is preferable that such learning data can be collected automatically rather than by hand.

Accordingly, one object of the present invention is to provide, in technology that recognizes objects such as other vehicles, pedestrians, and obstacles around a vehicle using several types of sensors, a system that automatically collects supervised learning data and constructs a database of it, for machine learning in which the detection results of one sensor are used as teacher data so that object recognition can be performed from the output of another sensor.

The above object is achieved by a system that constructs a database accumulating supervised learning data used in machine learning for constructing or adjusting an algorithm that recognizes objects in the area surrounding a vehicle based on sensor output, the system including:
a first sensor that sequentially detects the state of the area surrounding the vehicle;
first object recognition means that sequentially recognizes objects in the area surrounding the vehicle based on the sequentially obtained output data of the first sensor;
a second sensor that sequentially detects the state of the area surrounding the vehicle;
data association means that, when the reliability of the object recognition result data from the first object recognition means is at or above a predetermined degree, uses the sequentially obtained output data of the second sensor as input data for machine learning and the recognition result data of objects sequentially recognized by the first object recognition means as teacher data for machine learning, and associates the teacher data with the input data corresponding to it; and
learning data storage means that stores each associated pair of input data and teacher data as supervised learning data for machine learning.
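The claimed arrangement can be illustrated with a minimal sketch. It assumes, purely for illustration, that each recognition result carries a reliability score normalized to [0, 1]; the names `Recognition`, `LearningSample`, and `LearningDatabase` and the threshold value are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Recognition:
    """Result from the first object recognition means (hypothetical layout)."""
    timestamp: float
    label: str          # e.g. "vehicle", "pedestrian", or "none" when nothing is present
    reliability: float  # assumed to be normalized to [0, 1]


@dataclass
class LearningSample:
    """One item of supervised learning data: input data paired with teacher data."""
    input_data: Any       # raw output of the second sensor
    teacher: Recognition  # recognition result derived from the first sensor


class LearningDatabase:
    """Plays the roles of the data association and learning data storage means."""

    def __init__(self, min_reliability: float) -> None:
        self.min_reliability = min_reliability  # the "predetermined degree" (assumed value)
        self.samples: List[LearningSample] = []

    def add(self, recognition: Recognition, second_output: Any) -> Optional[LearningSample]:
        # Associate and store only when the teacher data is reliable enough.
        if recognition.reliability < self.min_reliability:
            return None
        sample = LearningSample(input_data=second_output, teacher=recognition)
        self.samples.append(sample)
        return sample
```

For example, with `db = LearningDatabase(0.8)`, a recognition with reliability 0.9 is stored together with the second sensor's output, while one with reliability 0.5 is skipped.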

In the above configuration, a "sensor" may be any sensor, and is typically one normally used to detect the state of the area surrounding a vehicle, such as a camera that detects that state as an image, LIDAR, or millimeter-wave radar. "Constructing or adjusting an algorithm that performs object recognition" here means, as already stated, constructing an appropriate signal processing method, conditions, or procedure for performing object recognition based on a sensor's output, or adjusting the various conditions in an algorithm that has already been constructed. As is well known in the field of machine learning, when an object recognition algorithm is built or developed using machine learning, a data set that pairs sensor outputs (input data) with the correct recognition results (teacher data) is consulted, and the structure of the computation and decision processing and/or the various parameters used in them are determined so that the correct recognition result is produced for any output of that sensor. "Constructing or adjusting an algorithm" refers to determining that structure and/or those parameters. In the present invention, "object recognition" means, as stated above, detecting the presence of an object, and/or identifying its type (whether it is stationary, whether the detected object is a vehicle, a person, an animal, or some other obstacle, and so on), and/or estimating its motion state (position, attitude (orientation), speed, and so on). The first sensor and the second sensor are typically sensors of different types or different specifications (for example, differing in measurement position, measurement angle range, or sensitivity), such as a camera combined with LIDAR, but in some embodiments they may be sensors of the same type. The "first object recognition means" may be any means that, according to the type of the "first sensor", produces object recognition results from the output of the first sensor by any method known in this field.

In short, in the above configuration at least two sensors, a first sensor and a second sensor, are provided, and each sequentially detects the state of the area surrounding the vehicle in its own manner. The first sensor is chosen to be one for which an algorithm that performs object recognition from its output has already been established; the second sensor is chosen to be one for which such an algorithm is to be constructed or adjusted by machine learning. The object recognition result data obtained sequentially from the output of the first sensor (which may include both cases where an object is present and cases where none is) is selected as teacher data for machine learning, the output of the second sensor is selected as input data, and the two are associated and stored in the database as supervised learning data.

The association of the first sensor's object recognition result data with the second sensor's output, and the storage of the supervised learning data, are executed automatically by computer processing. In one mode, the output data of the first sensor, the first sensor's object recognition result data, and the output data of the second sensor are accumulated in any data storage device, and the association and storage are performed later (as offline processing) with reference to the vehicle's travel log data. The data storage device may be mounted on the host vehicle, or the data may be transmitted over network communication to a data storage device at an external facility and accumulated there. In another mode, the processing may be executed sequentially, for example while the vehicle is traveling. As stated in the configuration above, the association is performed only when the reliability of the first sensor's object recognition result data is at or above the predetermined degree, that is, only when that data is in a state where it can be used as teacher data.
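The offline mode described above might be sketched as follows. The patent does not say how logged records are matched; this sketch assumes that each record carries a timestamp and pairs each sufficiently reliable recognition result with the nearest second-sensor record in time. The function name, the tuple layouts, and the `max_gap` tolerance are all hypothetical.

```python
from bisect import bisect_left


def pair_by_timestamp(recognitions, second_outputs, min_reliability, max_gap):
    """Pair logged recognition results with logged second-sensor outputs.

    recognitions:   list of (time, result, reliability) from the first sensor
    second_outputs: list of (time, data) from the second sensor, sorted by time
    Returns a list of (input_data, teacher_data) pairs.
    """
    times = [t for t, _ in second_outputs]
    pairs = []
    if not times:
        return pairs
    for t, result, reliability in recognitions:
        # Skip results that are not reliable enough to serve as teacher data.
        if reliability < min_reliability:
            continue
        # Find the second-sensor record closest in time to this result.
        i = bisect_left(times, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        j = min(candidates, key=lambda k: abs(times[k] - t))
        # Discard the match if the two records are too far apart in time.
        if abs(times[j] - t) <= max_gap:
            pairs.append((second_outputs[j][1], result))
    return pairs
```

Matching on a timestamp is only one possible use of the travel log; position or region information from the log could serve the same purpose.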

The configuration of the present invention described above may further include a configuration in which object recognition results obtained from the output data of the second sensor are used as teacher data. That is, before the object recognition algorithm for the second sensor has been constructed, the object recognition results based on the first sensor's output data are used as teacher data; after it has been constructed, the object recognition data based on the first sensor's output and that based on the second sensor's output may be integrated and used as teacher data, which is then associated with the second sensor output currently being obtained. In this case a loop of learning data is formed for the object recognition algorithm based on the second sensor's output, and a further improvement in accuracy can be expected.

Accordingly, the device of the present invention described above may further include second object recognition means that sequentially recognizes objects in the area surrounding the vehicle based on the sequentially obtained output data of the second sensor, and the data association means may be configured to also use the recognition result data of objects sequentially recognized by the second object recognition means as teacher data for machine learning. With this configuration, object recognition accuracy can be improved by repeating the loop, which also helps compensate for a shortage of data. When the object recognition result data obtained from the second sensor's output data is used as teacher data to form the learning data loop, that data preferably has a higher reliability than is required when it is used for various kinds of control. The object recognition result data obtained from the second sensor's output data may therefore be used as teacher data only when its reliability is at or above a predetermined degree that is higher than the degree it must satisfy for use in various kinds of control.
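The stricter gate on second-sensor results can be sketched as below. All three threshold values are hypothetical, and the sketch only decides which results may be stored as teacher data; it does not attempt the integration of the two recognition results, which the text leaves open.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (label, reliability assumed to lie in [0, 1])

CONTROL_MIN = 0.6         # hypothetical level required to use a result for vehicle control
SECOND_TEACHER_MIN = 0.9  # hypothetical, deliberately stricter than CONTROL_MIN
FIRST_TEACHER_MIN = 0.8   # hypothetical level required of the first recognizer


def teacher_candidates(first: Optional[Result],
                       second: Optional[Result]) -> List[Tuple[str, str]]:
    """Return (source, label) pairs that may be stored as teacher data."""
    # The loop only makes sense if the teacher gate is stricter than the control gate.
    assert SECOND_TEACHER_MIN > CONTROL_MIN
    accepted = []
    if first is not None and first[1] >= FIRST_TEACHER_MIN:
        accepted.append(("first", first[0]))
    if second is not None and second[1] >= SECOND_TEACHER_MIN:
        accepted.append(("second", second[0]))
    return accepted
```

A second-sensor result with reliability 0.7 would be good enough for control under these assumed values, but it would not be fed back into the learning data.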

In yet another mode, when at least two sensors are used, whichever of them currently has the more reliable object recognition result data may be selected each time as the first sensor (provided that its reliability is at or above the predetermined degree), with the other selected as the second sensor. A plurality of sensors may also be used as the first sensor, with the first object recognition means configured to integrate their outputs (sensor fusion) to produce the object recognition result; it should be understood that such cases also fall within the scope of the present invention.
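The per-occasion role selection can be sketched as follows, assuming each sensor's current recognition reliability is available as a number. The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def assign_roles(reliabilities: Dict[str, float],
                 min_reliability: float) -> Optional[Tuple[str, List[str]]]:
    """Pick the most reliable sensor as the "first sensor" for this occasion.

    reliabilities maps a sensor name to the reliability of its current result.
    Returns (first_sensor, other_sensors), or None when no pairing should be made.
    """
    if len(reliabilities) < 2:
        return None
    best = max(reliabilities, key=reliabilities.get)
    # Even the best result must reach the predetermined degree to act as teacher.
    if reliabilities[best] < min_reliability:
        return None
    others = [name for name in reliabilities if name != best]
    return best, others
```

With a threshold of 0.8, `{"camera": 0.9, "lidar": 0.6}` makes the camera the first sensor, `{"camera": 0.4, "lidar": 0.85}` (for instance at night) makes the LIDAR the first sensor, and when neither reaches 0.8 no learning data is produced.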

Thus, according to the present invention, in technology that recognizes objects such as vehicles, pedestrians, and obstacles around the host vehicle using several types of sensors, it becomes possible to provide a system that automatically prepares and collects supervised learning data and constructs a database of it, for machine learning in which the detection results of one sensor are used as teacher data so that object recognition can be performed from the output of another sensor. This configuration greatly reduces the user's labor and the cost of building the system, and makes it possible to build a database of supervised learning data even where doing so by hand would be very difficult. In addition, when the configuration includes a loop of learning data accumulation and machine-learning-based construction of the object recognition algorithm, formed by using the second sensor's object recognition algorithm in preparing learning data, the accuracy of the object recognition algorithm improves, which in turn improves the accuracy, diversity, and volume of the resulting database, so that further improvement in the algorithm's accuracy can be expected. In other words, the more this loop of learning data accumulation and algorithm construction is repeated, the more accurate and diverse the database of supervised learning data that can be built, and the larger or more efficiently built it can be.

Other objects and advantages of the present invention will become apparent from the following description of preferred embodiments of the present invention.

FIG. 1(A) is a block diagram showing the configuration of one embodiment of the database construction system according to the present invention, which collects and stores learning data for the machine learning of a recognition algorithm for sensor output. FIG. 1(B) is a flowchart showing the processing performed in constructing the database.
FIGS. 2(A) to 2(C) are block diagrams showing the configuration of another embodiment of the database construction system according to the present invention.
FIGS. 3(A) to 3(C) are block diagrams showing the configuration of still another embodiment of the database construction system according to the present invention.
FIGS. 4(A) and 4(B) are block diagrams showing still another embodiment of the database construction system according to the present invention, in particular the configuration used when there are two kinds of teacher data.
FIGS. 5(A) and 5(B) are block diagrams showing still another embodiment of the database construction system according to the present invention, in particular the configuration in which object recognition results obtained by an object recognition algorithm established using the learning data accumulated in the database are used as teacher data.

SC: camera image output
SL: LIDAR point cloud output
SR: millimeter-wave radar output
Rr, Rr1, Rr2: recognition results

The present invention is described in detail below with reference to the accompanying drawings, in terms of several preferred embodiments. In the drawings, the same reference signs denote the same parts.

Basic Configuration and Operation of the System
In the database construction system according to the present invention, which collects learning data for machine learning of an object recognition algorithm, "supervised learning data" is, as described in the "Summary of the Invention" section and put briefly, prepared, collected, and accumulated sequentially: object recognition results obtained sequentially from the output of a first sensor serve as teacher data, and the sequential output of a second sensor serves as input data. This data is used in machine learning that configures or adjusts an algorithm for recognizing objects based on the output of the second sensor. Here, the "supervised learning data" used for machine learning is a set consisting of the output of the second sensor and the recognition result of the first sensor associated with it. Accordingly, the basic configuration of the database construction system comprises a first sensor; recognition means that sequentially recognizes objects based on the output of the first sensor; a second sensor; association means that associates the recognition results sequentially obtained by the recognition means with the sequentially obtained output of the second sensor; and data storage means that stores each set of associated second-sensor output and first-sensor recognition result as "supervised learning data". The recognition means, the association means, and the data storage means are realized by a computer system, that is, a system including a drive circuit and a microcomputer of the usual type having a CPU, ROM, RAM, and input/output port devices interconnected by a bidirectional common bus. The operation of each of these means is achieved automatically by executing a computer program on that system.
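The pairing of teacher data and input data described above can be pictured with a minimal Python sketch. Everything named here (`Recognition`, `LearningSample`, `LearningDatabase`, `build_database`, and the `recognize` callback) is a hypothetical illustration rather than anything defined in the patent, and the sketch assumes the two sensor streams are already aligned one to one.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List


@dataclass(frozen=True)
class Recognition:
    """Recognition result derived from the first sensor (teacher data)."""
    label: str         # e.g. "vehicle" or "pedestrian"
    confidence: float  # reliability of the result, 0.0 to 1.0


@dataclass(frozen=True)
class LearningSample:
    """One item of supervised learning data: second-sensor output plus teacher data."""
    input_data: Any       # raw output of the second sensor
    teacher: Recognition  # recognition result from the first sensor


@dataclass
class LearningDatabase:
    """Stand-in for the data storage means (hypothetical in-memory store)."""
    samples: List[LearningSample] = field(default_factory=list)

    def store(self, sample: LearningSample) -> None:
        self.samples.append(sample)


def build_database(
    first_outputs: Iterable[Any],
    second_outputs: Iterable[Any],
    recognize: Callable[[Any], Recognition],
) -> LearningDatabase:
    """Recognize objects from each first-sensor output and pair the result
    with the corresponding second-sensor output (assumed pre-aligned)."""
    db = LearningDatabase()
    for first_out, second_out in zip(first_outputs, second_outputs):
        teacher = recognize(first_out)  # recognition means
        db.store(LearningSample(input_data=second_out, teacher=teacher))  # association + storage
    return db
```

In a real system the `recognize` callback would wrap whatever recognizer runs on the first sensor, and the store would be a persistent database rather than a list.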

One of the most basic embodiments of the database construction system according to the present invention is used, as illustrated in FIG. 1(A), to construct a database for a system that recognizes objects in the area surrounding a vehicle. Referring to the figure, in this database construction system an in-vehicle camera is adopted as the "first sensor" and an in-vehicle LIDAR as the "second sensor". The camera and the LIDAR may each be of any type, or any known type, used in this field to recognize objects in the area surrounding the vehicle (in particular, the area ahead). In the illustrated embodiment, the image output SC of the in-vehicle camera, which is the first sensor, is supplied sequentially to an object recognition unit. There, the presence and/or type of object images in the captured image, for example (preceding) vehicles, pedestrians, animals, fixed objects on the roadside, and obstacles on the road, is identified by any method, or any known method, used in this field, and a recognition result Rr is output sequentially. The recognition result Rr obtained from the camera image and the point cloud data SL, which is the LIDAR output for approximately the same time as, or the same region as, that camera image, are then supplied sequentially to an association unit. The association unit associates the recognition result Rr with the corresponding point cloud data SL, the former as teacher data and the latter as input data, so that learning data is prepared sequentially and stored sequentially in a database, that is, the data storage means. In the configuration illustrated in FIG. 1(A), the camera and the LIDAR are mounted on the vehicle, but the object recognition unit, the association unit, and the database may be mounted on the vehicle or installed at any external facility. When they are installed at an external facility, the outputs of the camera and the LIDAR and/or the recognition results of the object recognition unit may be transmitted to the external facility through any form of wireless communication means, network, or the like.

Example of Association Processing
The preparation of learning data in the association unit, that is, the processing that associates the output of the second sensor with the recognition result of the first sensor, can be achieved by any of various methods depending on the representation format or form of each, for example by the procedures described in the series of patent documents cited above or by any other procedure. For example, when, as in FIG. 1(A), the camera-image-based recognition result Rr is adopted as the recognition result of the first sensor and the LIDAR point cloud data SL as the output of the second sensor, the association may be achieved by the processing illustrated in FIG. 1(B). Specifically, referring to that figure, the association unit first acquires the camera-image-based recognition result Rr (step 10) and the LIDAR point cloud data (step 12). The recognition result Rr may be represented, for example, as the extent of the image of a vehicle or other object within the camera image, delimited together with the identified type of that object; the extent may be expressed in image coordinates or in coordinates of the space captured in the image. The LIDAR point cloud data may be expressed as the spatial position coordinates of each detection point (light reflection point). Once both sets of data have been acquired, the LIDAR point cloud data is divided into partial point clouds by grouping the points according to their position in space, in a manner commonly used for processing LIDAR point cloud data (step 14). The division pattern may be any predetermined one. For example, division by equal solid angles may be adopted, or a method in which, for each point, the point and its nearest neighbor are placed in the same group when the distance between them is less than a threshold and in different groups when it is equal to or greater than the threshold.
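As an illustration of the nearest-neighbor division in step 14, the following Python sketch groups 3-D points with a simple union-find structure. The function name `split_into_partial_clouds` and the brute-force O(n²) neighbor search are assumptions made for brevity; a practical implementation would more likely use a spatial index.

```python
import math


def split_into_partial_clouds(points, threshold):
    """Divide 3-D points into partial point clouds.

    Each point is merged with its nearest neighbor's group when that
    neighbor is closer than `threshold`; otherwise the two stay in
    different groups.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i != j:
                d = math.dist(p, q)
                if d < best_d:
                    best_j, best_d = j, d
        if best_j is not None and best_d < threshold:
            parent[find(i)] = find(best_j)  # same group as the nearest neighbor

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())
```

For example, three points spaced 0.1 apart and two further points 5 units away would form two partial point clouds under a threshold of 0.5.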

The partial point clouds obtained above are then projected onto the camera image (step 16), and for each partial point cloud the proportion of its points that fall within the extent of the image of a recognition target in the camera image, for example the image of a vehicle, is calculated (step 18). The projection may be performed using a geometric transformation between the two, chosen according to the representation format of the extent of the recognition target's image and that of the point cloud data, so that the coordinates of the space in which the recognized object exists are consistent with the coordinates of the space in which the point cloud data is expressed. In the camera-image-based recognition result Rr, when the image of an object is recognized, the reliability or degree of certainty of its presence is typically expressed as a proportion such as a percentage (for example, an existence probability of 75%). In that case, to reduce matching errors, the proportion of points falling within the image extent may be calculated only for image extents whose reliability of object presence in the recognition result is at or above a predetermined degree, or only for the image extent of the object nearest to the camera. The predetermined degree may be set appropriately, experimentally or theoretically. A partial point cloud for which the proportion of points falling within the image extent is at or above an arbitrarily set predetermined threshold is associated as a point cloud belonging to the recognition target, for example a vehicle, and a partial point cloud for which the proportion is below the threshold is associated as a point cloud belonging to an object that is not a recognition target. Each is stored in the database as supervised learning data (steps 20 to 24).
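Steps 16 to 24 can be sketched in Python as follows. The patent only calls for "a geometric transformation"; the pinhole camera model, the assumption that points are already expressed in the camera frame (x right, y down, z forward), the rectangular image extent, and all names (`project_point`, `fraction_inside`, `label_partial_clouds`, the `detection` dictionary keys) are illustrative assumptions.

```python
def project_point(point, focal_length, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        return None  # behind the camera, cannot appear in the image
    return (focal_length * x / z + cx, focal_length * y / z + cy)


def fraction_inside(cloud, box, focal_length, cx, cy):
    """Proportion of a partial point cloud that projects inside `box`
    (u_min, v_min, u_max, v_max), as in step 18."""
    if not cloud:
        return 0.0
    u_min, v_min, u_max, v_max = box
    inside = 0
    for p in cloud:
        uv = project_point(p, focal_length, cx, cy)
        if uv is not None and u_min <= uv[0] <= u_max and v_min <= uv[1] <= v_max:
            inside += 1
    return inside / len(cloud)


def label_partial_clouds(clouds, detection, min_confidence, min_fraction,
                         focal_length, cx, cy):
    """Return (cloud, label) pairs for one camera detection.

    Detections below `min_confidence` are skipped to reduce matching
    errors; clouds below `min_fraction` are labeled "other".
    """
    if detection["confidence"] < min_confidence:
        return []
    labeled = []
    for cloud in clouds:
        frac = fraction_inside(cloud, detection["box"], focal_length, cx, cy)
        label = detection["label"] if frac >= min_fraction else "other"
        labeled.append((cloud, label))
    return labeled
```

Both thresholds are left as parameters because the text states that they are to be set experimentally or theoretically.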

The processing illustrated in FIG. 1(B) is executed automatically by computer. Typically, the camera (first sensor) output data, the recognition result Rr (the first sensor's object recognition result data), and the LIDAR point cloud data (second sensor output data) are accumulated in any data storage device, and the series of data association and supervised-learning-data storage steps illustrated in FIG. 1(B) may be executed offline, for example while referring to the vehicle's travel log data (stored together with the sensor data). Alternatively, the data association and the storage of supervised learning data may be executed sequentially as the sensor data is acquired, in which case supervised learning data is prepared and accumulated from moment to moment.
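For the offline case, one plausible way to pick the point cloud that corresponds to each recognition result is to match logged timestamps. The Python sketch below assumes that every logged item carries a timestamp and that both logs are sorted by time; the function name `pair_by_timestamp` and the `tolerance` parameter are hypothetical, and the patent does not prescribe this matching rule.

```python
import bisect


def pair_by_timestamp(recognitions, clouds, tolerance):
    """Pair logged recognition results with logged point clouds.

    `recognitions` and `clouds` are lists of (timestamp, payload) tuples
    sorted by timestamp. Each recognition is paired with the point cloud
    whose timestamp is closest, provided the gap is within `tolerance`.
    """
    cloud_times = [t for t, _ in clouds]
    pairs = []
    for t, rec in recognitions:
        k = bisect.bisect_left(cloud_times, t)
        # Only the neighbors on either side of the insertion point can be closest.
        candidates = [i for i in (k - 1, k) if 0 <= i < len(clouds)]
        if not candidates:
            continue
        best = min(candidates, key=lambda i: abs(cloud_times[i] - t))
        if abs(cloud_times[best] - t) <= tolerance:
            pairs.append((rec, clouds[best][1]))
    return pairs
```

Recognition results with no sufficiently close point cloud are simply dropped, which keeps mismatched pairs out of the learning data.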

The accumulated "supervised learning data" is used, in any manner, for machine learning that configures or adjusts an object recognition algorithm based on the point cloud data detected by the LIDAR. The resulting algorithm may then be used to recognize objects in the area surrounding the vehicle from the LIDAR point cloud data, and the recognition results may be used for various kinds of control in the vehicle. It should be understood that the camera-image-based recognition results may also be used for various kinds of vehicle control. It should further be understood that the illustrated example is only one example of association processing, and that association processing suited to the representation format or form of the recognition result based on the first sensor (teacher data) and of the output of the second sensor (input data) may be executed. What matters is that a computer automatically performs the processing of sequentially acquiring the recognition result based on the first sensor and the output of the second sensor, associating them to prepare learning data, and storing that data.

Examples of Other Embodiments of the Database Construction System According to the Present Invention
In addition to the configuration illustrated in FIG. 1(A), the database construction system according to the present invention may be realized in the forms shown in FIGS. 2 to 5. In each case, the teacher data and the input data are associated by processing suited to their respective representation formats, and the learning data may be prepared and stored in the same manner as above.

(1) When the recognition result based on the LIDAR output is used as teacher data (FIG. 2(A))
In situations where the LIDAR yields more accurate recognition results than the camera (backlight, nighttime, rainy weather, etc.), learning data may be prepared and accumulated using the recognition result based on the LIDAR output as teacher data and the camera image as input data. In this case, the learning data is used to configure or adjust, by machine learning, an algorithm that recognizes objects based on camera images. The system may also be arranged so that either the configuration of FIG. 1(A) or that of FIG. 2(A) can be selected depending on which of the LIDAR and the camera yields correct recognition results in the situation at hand.

(2) Configuration that recognizes objects using both the camera image and the LIDAR point cloud data (sensor fusion) (FIGS. 2(B) and 2(C))
In this case, learning data may be prepared and accumulated using the object recognition result based on both the camera image and the LIDAR point cloud data as teacher data, and either the LIDAR point cloud data (FIG. 2(B)) or the camera image (FIG. 2(C)) as input data. The teacher data may be information extracted from the object recognition result, such as the distance to the object or the speed of the object. The camera image used as input data may be a moving image.

(3) Configuration that recognizes objects using a LIDAR and a millimeter-wave radar (RADAR) (FIGS. 3(A) and 3(B))
In the configurations of FIG. 1(A) and FIGS. 2(A) to 2(C), a millimeter-wave radar may be used in place of the camera. Because the output SR of a millimeter-wave radar is a radar reflection intensity map, when the millimeter-wave radar is used as the first sensor the object recognition unit is a means that recognizes objects by any method based on the radar reflection intensity map SR, and the teacher data is the object recognition result Rr based on that map (FIG. 3(A)). In a configuration that recognizes objects using both the radar reflection intensity map and the LIDAR point cloud data (sensor fusion), the teacher data is the object recognition result Rr based on the radar reflection intensity map SR and the LIDAR point cloud data SL, and the input data is the radar reflection intensity map SR (FIG. 3(B)) or the LIDAR point cloud data (not shown). In particular, because tagging the radar reflection intensity map of a millimeter-wave radar by hand is difficult, the ability to perform this processing automatically by computer, as described above, is highly advantageous.

(4) When other information is added to the learning data (FIG. 3(C))
In addition to the camera image, LIDAR point cloud data, or radar reflection intensity map, information Dt acquired by any sensor or detection device, such as vehicle motion information (for example, vehicle speed) and environmental information (for example, weather), may be added to the learning data. This is expected to enable machine learning adapted to the vehicle's motion information and environmental information.

(5) When multiple kinds of teacher data are used (FIGS. 4(A) and 4(B))
Two or more kinds of data (Rr1, Rr2) may be used as teacher data (the examples described so far use one kind). When there are two or more kinds of teacher data, information extracted as appropriate from each may be used in the association processing. For example, in FIG. 4(A), the information referred to as teacher data may be the position and type of the object's image taken from the camera-image-based recognition result Rr1, together with the distance to and speed of the object taken from the recognition result Rr2 based on the radar reflection intensity map SR of the RADAR. Also, as in FIG. 4(B), when the recognition algorithm for the LIDAR point cloud data that is the target of machine learning is already established with reasonable accuracy, the recognition result Rr2 based on the LIDAR point cloud data SL may be adopted as the second kind of teacher data.

(6) When the recognition result of a recognition algorithm obtained by machine learning is used as teacher data (FIGS. 5(A) and 5(B))
After an object recognition algorithm based on the output of the second sensor has been configured or adjusted by machine learning using the learning data stored in the database, the recognition results produced by that algorithm may in turn be adopted as teacher data. For example, in the configuration illustrated in FIG. 5(A), as in the configuration described for FIG. 1(A), learning data is first prepared and stored over a certain period by associating the camera-image-based recognition result Rr1 with the LIDAR point cloud data SL, and that learning data is then used to configure or adjust, by machine learning, an object recognition algorithm based on the LIDAR point cloud data SL. Thereafter, the algorithm obtained by machine learning recognizes objects from the LIDAR point cloud data SL, and its recognition result Rr2 is also associated, as teacher data, with the LIDAR point cloud data SL to prepare and store learning data. There are now two kinds of teacher data, Rr1 and Rr2; for example, whichever is more accurate in the measurement conditions at hand may be preferentially selected and associated with the input data. More specifically, for example, a single item of teacher data may be prepared by adjusting the relative contributions of Rr1 and Rr2 so that the recognition result with the higher reliability, which may be determined by any method, carries the greater weight, and this item may then be associated with the input data. In this regard, when the recognition result Rr2 of the machine-learned recognition algorithm based on the output of the second sensor is used as teacher data, its reliability should preferably be sufficiently high. Accordingly, Rr2 may be used as teacher data only when its reliability is at or above a predetermined degree that is higher than the degree required when the result is used for various kinds of control. In the above processing, machine learning using the learning data stored in the database may be executed in any manner.
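The reliability-weighted blending of Rr1 and Rr2, together with the stricter reliability gate on Rr2, might look like the Python sketch below. The representation of a recognition result as a dictionary with per-label `scores` and a scalar `reliability`, the linear weighting, and the name `combine_teachers` are all assumptions; the text leaves the weighting method open.

```python
def combine_teachers(rr1, rr2, rr2_min_reliability):
    """Blend two teacher results into a single set of per-label scores.

    `rr1` and `rr2` are dicts with "scores" (label -> score) and
    "reliability" (0.0 to 1.0). Rr2 contributes only when its
    reliability reaches `rr2_min_reliability`; otherwise Rr1 is used alone.
    """
    if rr2 is None or rr2["reliability"] < rr2_min_reliability:
        return dict(rr1["scores"])  # Rr2 not trusted enough to act as teacher data

    w1, w2 = rr1["reliability"], rr2["reliability"]
    total = w1 + w2
    if total == 0:
        return dict(rr1["scores"])

    labels = set(rr1["scores"]) | set(rr2["scores"])
    # The more reliable result carries the greater weight.
    return {
        label: (w1 * rr1["scores"].get(label, 0.0)
                + w2 * rr2["scores"].get(label, 0.0)) / total
        for label in labels
    }
```

Setting `rr2_min_reliability` above the level that would be acceptable for vehicle control reflects the preference stated above for admitting machine-learned results as teacher data only when they are highly reliable.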

As described above, when recognition results obtained through machine learning with the learning data stored in the database are themselves used as teacher data, a so-called loop of learning data is formed in the machine learning, and the more this loop is repeated, the more the recognition accuracy of the recognition algorithm being trained is expected to improve.

Furthermore, as illustrated in FIG. 5(B), the object recognition algorithms for the outputs of both of two sensors of different types may each be configured or adjusted by machine learning using the learning data accumulated in the database. In the example of FIG. 5(B), both object recognition based on camera images and object recognition based on LIDAR point cloud data are executed, and the algorithm for each is configured or adjusted by machine learning using the learning data stored in the database. In the association processing, teacher data based on the outputs of the different types of sensors may be associated as appropriate with each of those outputs (that is, learning data is prepared for each sensor). With this configuration, a loop of learning data is formed for the object recognition algorithm of each sensor type, and the recognition accuracy of the algorithms for both sensors is expected to improve.

Thus, the database construction systems described above make it possible to provide a system that automatically prepares and collects supervised learning data, for machine learning that recognizes objects from the output of one sensor using the detection results of another sensor as teacher data, and that builds a database of such data. This greatly reduces user effort and the cost of building the system, and offers the advantage that a database of supervised learning data can be built even where building one by hand would be very difficult.

Although the above description relates to embodiments of the present invention, many modifications and changes will readily occur to those skilled in the art. It will be clear that the present invention is not limited to the embodiments illustrated above and may be applied to various devices without departing from the concept of the invention.

For example, a sensor used in the present invention may be a sensor fixed outside the vehicle, such as a fixed camera installed at an intersection.

Claims

A system for constructing a database that accumulates supervised learning data used in machine learning to configure or adjust an algorithm for recognizing objects in the area surrounding a vehicle based on the output of a sensor, the system comprising:
a first sensor that sequentially detects the state of the area surrounding the vehicle;
first object recognition means for sequentially recognizing objects in the area surrounding the vehicle based on output data obtained sequentially from the first sensor;
a second sensor that sequentially detects the state of the area surrounding the vehicle;
data association means which, when the reliability of the object recognition result data from the first object recognition means is at or above a predetermined degree, uses the output data sequentially obtained by the second sensor as input data in the machine learning and the recognition result data of objects sequentially recognized by the first object recognition means as teacher data in the machine learning, and associates the teacher data with the input data corresponding to that teacher data; and
learning data storage means for storing each set of associated input data and teacher data as supervised learning data for machine learning that configures or adjusts an algorithm for recognizing objects in the area surrounding the vehicle based on the output of the second sensor.