JP3599255B2

JP3599255B2 - Environment recognition device for vehicles

Info

Publication number: JP3599255B2
Application number: JP31566795A
Authority: JP
Inventors: 千秋青山
Original assignee: Honda Motor Co Ltd
Current assignee: Honda Motor Co Ltd
Priority date: 1995-12-04
Filing date: 1995-12-04
Publication date: 2004-12-08
Anticipated expiration: 2015-12-04
Also published as: JPH09159442A

Description

【０００１】
【発明の属する技術分野】
この発明は、ステレオ視を利用した車両用環境認識装置に関し、一層詳細には、例えば、自動車等の車両に搭載され、当該自動車の位置を基準として、風景や先行車等を含む情景に係る周囲環境を認識する車両用環境認識装置に関する。
【０００２】
【従来の技術】
従来から、周囲環境を認識しようとする場合、ステレオ視を利用したステレオカメラにより得られる２枚の画像（ステレオ画像ともいう。）から三角測量の原理に基づき対象物（単に、物体ともいう。）までの距離を求め、対象物の位置を認識する、いわゆるステレオ法が採用されている。
【０００３】
このステレオ法においては、前記距離を求める際に、レンズを通じて撮像した２枚の画像上において同一物体の対応が採れることが前提条件となる。
【０００４】
撮像した２枚の画像上において同一物体の対応を採る技術として、画像中の領域に着目する方法がある。
【０００５】
この方法は、まず、一方の画像上に適当なサイズのウィンドウを設定し、他方の画像においてこのウィンドウに対応する領域を求めるために、他方の画像に前記ウィンドウと同一サイズの領域を設定する。
【０００６】
次に、両画像上の各ウィンドウ内の画像（単に、ウィンドウ画像ともいう。）を構成する対応する各画素（詳しく説明すると、マトリクス位置が対応する各画素）についての画素データ値を引き算して差を得、さらに差の絶対値を得る。
【０００７】
そして、各画素についての差の絶対値の前記ウィンドウ内の和、いわゆる総和を求める。
【０００８】
このようにウィンドウ内の各画素データ値の差の絶対値の総和を求める計算を他方の画像上のウィンドウの位置を変えて順次行い、前記総和が最小になる他方の画像のウィンドウを、前記一方の画像のウィンドウに対応する領域であると決定する方法である。
【０００９】
この発明においても、基本的には、この画像中の領域に着目する方法を採用している。
【００１０】
【発明が解決しようとする課題】
ところで、車両に搭載されたステレオカメラ、例えば基線上に一定間隔離されて配置された２台のビデオカメラにより撮像する際には、画像情報を有する光が、フロントガラスおよびビデオカメラを構成するレンズを含む光学系を通じてＣＣＤエリアセンサ等の撮像素子に導入され、この撮像素子により光電変換が行われて電気的信号、すなわちビデオ信号として出力される。このビデオ信号が画素に対応して分割され、デジタル信号である画像データに変換されて、フレームメモリ等の画像メモリに格納される。
【００１１】
しかしながら、フロントガラス、レンズ等には歪みがあり、特に大きい歪みを有するレンズでは、いわゆる糸巻き（ピンクッション的）歪みやたる（バレル的）形歪み等の収差が存在するが、この収差を上記従来の技術においては考慮していない。
【００１２】
したがって、通常、レンズの中心を通る水平線を考えた場合、この撮像は、直線となり、上記対応処理により画像の対応が採れるが、レンズの周辺部分の画像については、水平線が湾曲した曲線となるために、直線と仮定して対応を採ることができないという問題があった。
【００１３】
なお、この出願に関連する技術として、例えば、特開平６−２８２７９３号公報に開示された技術を挙げることができるが、この公報には、単に、レンズの周囲で口径食により明るさが減ずるのを補正するための補正値を予め準備しておく内容が記載されているだけであり、レンズの歪みについての動機付けとなる技術は何も開示されていない。
【００１４】
この発明はこのような課題を考慮してなされたものであり、レンズの歪み等、光学系の歪みに基づく画像の歪みを原因として同一物体の対応の採れない状態を回避することを可能とする車両用環境認識装置を提供することを目的とする。
【００１５】
【課題を解決するための手段】
この発明は、例えば、図１および図８に示すように、
画像情報を有する光をとらえる光学部１１Ｒ、１１Ｌ、及び前記各光学部を通じて得た光をそれぞれ電気信号に変換する撮像手段１３Ｒ、１３Ｌとからなる２つのカメラ１Ｒ、１Ｌを有し、前記２つのカメラは、車両の正面方向にある無限遠点が前記２つの撮像手段からそれぞれ得られる各画面中の中心となるように設置され、
各画像中の同一物体画像の対応を採る対応処理手段６と、
対応の採れた同一物体までの距離を三角測量の原理に基づき演算する位置演算手段７とを備え、
対応処理手段は、光学部１１Ｒ、１１Ｌと撮像手段１３Ｒ、１３Ｌとを用いて予め準備された補正データを使用して取得画面の光学的な歪みを補正する歪み補正手段６２を有し、
対応処理手段が、歪みが補正された左右の取得画面内の同一形状を有するウィンドウをエピポーラーラインＥＰに沿って移動させて、両ウィンドウ内の画像の一致度を検出することで同一物体画像の対応を採る際に、一方の画面内でウィンドウを所定距離移動させるとともに、前記一方の画面内のウィンドウの位置を基準とする所定範囲内で他方の画面内のウィンドウを移動させて一致度検出を行い、
前記所定範囲は、前記撮像手段の水平画角、水平方向の画素数、物体の最短検出距離及び前記２つの撮像手段の基線長に応じて定められる範囲であることを特徴とする。
【００１６】
この発明によれば、対応処理手段が、２つの撮像手段から得られる各画像信号中の同一物体画像の対応を採る際に、歪み補正手段により光学部の歪みを原因とする撮像の歪みを補正した後に、対応を採るようにしているので、レンズの歪み等、光学系の歪みに基づく画像の歪みを原因とする同一物体の対応の採れない状態を回避することができる。
【００１７】
また、この発明によれば、前記歪み補正手段は、前記画像メモリからの画素読み出しアドレスを補正するアドレス補正テーブルを前記補正データとして有し、画素単位で画像メモリに格納された画像データを読み出して同一物体画像の対応を採る際に、左右画像内の一方で、ウィンドウをエピポーラーラインに沿って所定距離（例えば、６４０画素）移動させるとともに、他方の画像内でウィンドウを前記撮像手段の水平画角（θ）、水平方向の画素数（Ｎ）、物体の最短検出距離（Ｚｄ）及び前記２つの撮像手段の基線長（Ｄ）に応じて定められる所定範囲内で移動させて同一物体画像の対応を採るようにしているので、簡単な構成で、レンズの歪み等、光学系の歪みに基づく画像の歪みを原因とする対応の採れない状態を回避することができる。
【００１８】
この場合、アドレス補正テーブルには、補正前のアドレスに対応してアドレス補正データを符号付のビットデータとして格納されており、前記一致度の検出時にはパイプライン方式処理と並列処理の少なくとも一方の処理を用いて演算が行われる。このようにすれば、光学部毎に、対応するアドレス補正テーブルを持つ構成とすることが可能となり、結果として、撮像手段の光学部歪み特性に応じたアドレス補正テーブルを当該撮像手段に付設することができる。また、演算時間を低減することができる。
【００１９】
【発明の実施の形態】
以下、この発明の一実施の形態について、図面を参照して説明する。
【００２０】
図１はこの発明の一実施の形態の構成を示すブロック図である。
【００２１】
図１において、ステレオカメラ１が、右側のビデオカメラ（以下、単にカメラまたは右カメラともいう。）１Ｒと、左側のビデオカメラ（同様に、カメラまたは左カメラともいう。）１Ｌとにより構成されている。左右のカメラ１Ｒ、１Ｌは、図２に示すように、自動車（車両ともいう。）Ｍのダッシュボード上に予め定めた所定の間隔、いわゆる基線長Ｄを隔てて設置してある。また、カメラ１Ｒ、１Ｌはダッシュボード上に水平面に対して平行に、かつ車両Ｍの正面方向にある無限遠点が画像の中心となるように設置してある。さらに、カメラ１Ｒ、１Ｌはダッシュボード上に設置してあるために、カメラ１Ｒ、１Ｌを一体として連結することができ、上述の基線長Ｄを維持できる。
【００２２】
また、カメラ１Ｒ、１Ｌは、車両Ｍのワイパーのワイパー拭き取り範囲内に配置し、かつワイパーが左右にあって同方向に回動する場合には、左右のワイパーブレードの始点から同一位置になるように配置することで、ワイパーブレードによる遮光位置の変化が左右のカメラ１Ｒ、１Ｌで同一となり、認識対象物体（物体、対象物、対象物体、または、単に、対象ともいう。）の撮像に対してワイパーブレードの撮像の影響を少なくすることができる。左右のカメラ１Ｒ、１Ｌの光軸１５Ｒ、１５Ｌ（図１参照）は、同一水平面上において平行になるように設定されている。
【００２３】
図１から分かるように、右と左のカメラ１Ｒ、１Ｌには、光軸１５Ｒ、１５Ｌに略直交する方向に、画像情報を有する光ＩＬをとらえる同一の焦点距離Ｆを有する対物レンズ１１Ｒ、１１Ｌと、減光フィルタとしてのＮＤフィルタ組立体１２Ｒ、１２Ｌと、対物レンズ１１Ｒ、１１Ｌによって結像された像を撮像するエリアセンサ型のＣＣＤイメージセンサ（撮像素子部）１３Ｒ、１３Ｌとが配設されている。この場合、それぞれの光学系（光学部ともいう。）とも、例えば、右側の光学系で説明すれば、対物レンズ１１Ｒ、ＮＤフィルタ組立体１２Ｒを構成する１つのＮＤフィルタ（後述する。）または素通しの状態およびＣＣＤイメージセンサ１３Ｒは、いわゆる共軸光学系を構成する。
【００２４】
カメラ１Ｒ、１Ｌには、ＣＣＤイメージセンサ１３Ｒ、１３Ｌの読み出しタイミング、電子シャッタ時間等の各種タイミングを制御したり、ＣＣＤイメージセンサ１３Ｒ、１３Ｌを構成する撮像素子群を走査して得られる光電変換信号である撮像信号を、いわゆる映像信号に変換するための信号処理回路１４Ｒ、１４Ｌが配設されている。
【００２５】
左右のカメラ１Ｒ、１Ｌの出力信号、言い換えれば、信号処理回路１４Ｒ、１４Ｌの出力信号である映像信号は、増幅利得等を調整するＣＣＵ２Ｒ、２Ｌを通じて、例えば、８ビット分解能のＡＤ変換器３Ｒ、３Ｌに供給される。なお、実際上、ＣＣＵ２Ｒ、２Ｌから信号処理回路１４Ｒ、１４Ｌに対して前記電子シャッタ時間を可変する制御信号が送出される。
【００２６】
ＡＤ変換器３Ｒ、３Ｌによりアナログ信号である映像信号がデジタル信号に変換され、水平方向の画素数７６８列、垂直方向の画素数２４０行の画素の信号の集合としての画像信号（以下、必要に応じて、画素データの集合としての画像データともいい、実際上は濃度を基準とする画像信号ではなく輝度を基準とする映像信号データであるので、映像信号データともいう。）としてフレームバッファ等の画像メモリ４Ｒ、４Ｌに格納される。画像メモリ４Ｒ、４Ｌには、それぞれ、Ｎフレーム（Ｎコマ）分、言い換えれば、ラスタディスプレイ上の画面Ｎ枚分に相当する画面イメージが保持される。一実施の形態においてはＮの値として、Ｎ＝２〜６までの値が当てはめられる。２枚以上を保持できるようにしたために、画像の取り込みと対応処理とを並行して行うことが可能である。
【００２７】
画像メモリ（画像を構成する画素を問題とする場合には、画素メモリともいう。）４Ｒ、４Ｌは、この実施の形態においては、上記水平方向の画素数×垂直方向の画素数と等しい値の１フレーム分の画素メモリを有するものと考える。各画素メモリ４Ｒ、４Ｌは８ビットのデータを格納することができる。なお、各画素メモリ４Ｒ、４Ｌに格納されるデータは、上述したように、映像信号の変換データであるので輝度データである。
【００２８】
画像メモリ４Ｒ、４Ｌに格納される画像は、上述したように１枚の画面イメージ分の画像であるので、これを明確にするときには、必要に応じて、全体画像ともいう。
【００２９】
右側用の画像メモリ４Ｒの所定領域の画像データに対して、左側の画像メモリ４Ｌの同じ大きさの領域の画像データを位置（実際には、アドレス）を変えて順次比較して所定演算を行い、物体の対応領域を求める対応処理装置６が、画像メモリ４Ｒ、４Ｌに接続されている。
【００３０】
左右の画像メモリ４Ｒ、４Ｌ中の対象の対応領域（対応アドレス位置）に応じ三角測量法（両眼立体視）に基づいて、対象の相対位置を演算する位置演算装置７が対応処理装置６に接続されている。
【００３１】
対応処理装置６および位置演算装置７における対応処理・位置演算に先立ち、入力側が画像メモリ４Ｒに接続される露光量調整装置８の制御により、ＣＣＤイメージセンサ１３Ｒ、１３Ｌに入射される画像情報を有する光ＩＬの露光量が適正化される。
【００３２】
露光量調整装置８は、画像メモリ４Ｒの所定領域の画像データに基づいて、後述するルックアップテーブル等を参照して露光量を決定し、ＣＣＵ２Ｒ、２Ｌの増幅利得と、ＣＣＤイメージセンサ１３Ｒ、１３Ｌの電子シャッタ時間｛通常の場合、シャッタ速度と称されるが、単位は時間（具体的には、電荷蓄積時間）であるので、この実施の形態においては電子シャッタ時間という。なお、必要に応じて電子シャッタ速度ともいう。｝と、ＮＤフィルタ組立体１２Ｒ、１２Ｌのうちの所望のフィルタとを、それぞれ、同じ値、同じものに同時に決定する。
【００３３】
ＮＤフィルタ組立体１２Ｒ、１２Ｌのうち、所望のＮＤフィルタが、駆動回路５Ｒ、５Ｌを通じて切り換え選定されるが、この切り換えには、ＮＤフィルタを使用しない場合、いわゆる素通し（必要に応じて、素通しのＮＤフィルタとして考える。）の場合も含まれる。
【００３４】
次に、上記実施の形態の動作および必要に応じてさらに詳細な構成について説明する。
【００３５】
図３は、三角測量の原理説明に供される、対象物体Ｓを含む情景を左右のカメラ１Ｒ、１Ｌにより撮像している状態の平面視的図を示している。対象物体Ｓの相対位置をＲＰで表すとき、相対位置ＲＰは、既知の焦点距離ＦからのＺ軸方向（奥行き方向）の距離Ｚｄと右カメラ１ＲのＸ軸方向（水平方向）中心位置からの水平方向のずれ距離ＤＲとによって表される。すなわち、相対位置ＲＰがＲＰ＝ＲＰ（Ｚｄ、ＤＲ）で定義されるものとする。もちろん、相対位置ＲＰは、既知の焦点距離Ｆからの距離Ｚｄと左カメラ１ＬのＸ軸（水平方向）中心位置からの水平方向のずれ距離ＤＬとによって表すこともできる。すなわち、相対位置ＲＰをＲＰ＝ＲＰ（Ｚｄ、ＤＬ）と表すことができる。
【００３６】
図４Ａは、右側のカメラ１Ｒによって撮像された対象物体Ｓを含む画像（右画像または右側画像ともいう。）ＩＲを示し、図４Ｂは、左側のカメラ１Ｌによって撮像された同一対象物体Ｓを含む画像（左画像または左側画像ともいう。）ＩＬを示している。これら画像ＩＲと画像ＩＬとがそれぞれ画像メモリ４Ｒおよび画像メモリ４Ｌに格納されていると考える。右側画像ＩＲ中の対象物体画像ＳＲと左側画像ＩＬ中の対象物体画像ＳＬとは、画像ＩＲ、ＩＬのＸ軸方向の中心線３５、３６に対してそれぞれ視差ｄＲと視差ｄＬとを有している。対象物体画像ＳＲと対象物体画像ＳＬとは、エピポーラーライン（視線像）ＥＰ上に存在する。対象物体Ｓが無限遠点に存在するとき、対象物体画像ＳＲと対象物体画像ＳＬとは、中心線３５、３６上の同一位置に撮像され、視差ｄＲ、ｄＬは、ｄＲ＝ｄＬ＝０になる。
【００３７】
なお、ＣＣＤエリアセンサ１３Ｒ、１３Ｌ上における図３に示す視差ｄＲ、ｄＬとは、画像ＩＲ、ＩＬ上の図４Ａ、図４Ｂに示す視差ｄＲ、ｄＬとは極性が異なるが、ＣＣＤエリアセンサ１３Ｒ、１３Ｌからの読み出し方向を変えることで同一極性とすることができる。光学部に配設するレンズの枚数を適当に設定することによりＣＣＤエリアセンサ１３Ｒ、１３Ｌ上における視差ｄＲ、ｄＬと画像ＩＲ、ＩＬ上の視差ｄＲ、ｄＬの極性とを合わせることもできる。
【００３８】
図３から、次の（１）式〜（３）式が成り立つことが分かる。
【００３９】
ＤＲ：Ｚｄ＝ｄＲ：Ｆ …（１）
ＤＬ：Ｚｄ＝ｄＬ：Ｆ …（２）
Ｄ＝ＤＲ＋ＤＬ …（３）
これら（１）式〜（３）式から距離Ｚｄとずれ距離ＤＲとずれ距離ＤＬとをそれぞれ（４）式〜（６）式で求めることができる。
【００４０】
Ｚｄ＝Ｆ×Ｄ／（ｄＬ＋ｄＲ） …（４）
ＤＲ＝ｄＲ×Ｄ／（ｄＬ＋ｄＲ） …（５）
ＤＬ＝ｄＬ×Ｄ／（ｄＬ＋ｄＲ） …（６）
これら位置情報である距離Ｚｄとずれ距離ＤＲとずれ距離ＤＬとをクラスタリングして、対象物体Ｓについての識別符号としての、いわゆるアイディ（ＩＤ：Ｉｄｅｎｔｉｆｉｃａｔｉｏｎ）を付けることで、車両追従装置等への応用を図ることができる。
【００４１】
なお、実際上の問題として、ＣＣＤイメージセンサ１３Ｒ、１３Ｌの実効１画素の物理的な大きさの測定や焦点距離Ｆの測定は困難であるため、比較的正確に測定可能な画角を利用して距離Ｚｄ、ずれ距離ＤＲ、ＤＬを求める。
【００４２】
すなわち、例えば、カメラ１Ｒ、１Ｌの水平画角をθ、カメラ１Ｒ、１Ｌの水平方向の実効画素数（画像メモリ４Ｒ、４Ｌの水平画素数に等しい画素数）をＮ、視差ｄＲ、ｄＬに対応する画像メモリ４Ｒ、４Ｌ上の画素数をＮＲ、ＮＬとすると、次に示す（７）式〜（９）式から距離Ｚｄとずれ距離ＤＲとずれ距離ＤＬとをそれぞれ求めることができる。
【００４３】
Ｚｄ＝Ｎ×Ｄ／｛２（ＮＬ＋ＮＲ）ｔａｎ（θ／２）｝ …（７）
ＤＲ＝ＮＲ・Ｄ／（ＮＬ＋ＮＲ） …（８）
ＤＬ＝ＮＬ・Ｄ／（ＮＬ＋ＮＲ） …（９）
ここで、水平画角θは測定可能な値であり、水平方向の実効画素数Ｎ（この実施の形態では、上述したようにＮ＝７６８）は予め定められており、視差ｄＲ、ｄＬに対応する画素数ＮＲおよびＮＬも取り込んだ画像から分かる値である。
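Equations (7) to (9) can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `stereo_position` and the sample disparities `n_r=60`, `n_l=45` are assumptions, while θ = 40°, N = 768 and D = 0.5 m are the values quoted elsewhere in the description.

```python
import math

def stereo_position(n_r, n_l, baseline_m, h_fov_deg, n_pixels):
    """Return (Zd, DR, DL) from pixel disparities NR and NL, per equations (7)-(9)."""
    total = n_r + n_l
    if total <= 0:
        # NR + NL = 0 corresponds to an object at the point at infinity.
        raise ValueError("disparity sum must be positive")
    half_angle = math.radians(h_fov_deg) / 2.0
    zd = n_pixels * baseline_m / (2.0 * total * math.tan(half_angle))  # eq. (7)
    dr = n_r * baseline_m / total                                      # eq. (8)
    dl = n_l * baseline_m / total                                      # eq. (9)
    return zd, dr, dl

# Hypothetical disparities of 60 and 45 pixels (sum 105) with the quoted camera values.
zd, dr, dl = stereo_position(n_r=60, n_l=45, baseline_m=0.5, h_fov_deg=40.0, n_pixels=768)
print(f"Zd = {zd:.3f} m, DR = {dr:.3f} m, DL = {dl:.3f} m")
```

Because DR and DL split the base line in the ratio NR : NL, their sum always equals D, which is a convenient sanity check on measured disparities.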
【００４４】
次に、上述の画像の取り込みからＩＤを付けるまでの過程をフローチャートを利用して全体的に説明すれば、図５に示すようになる。
【００４５】
すなわち、ＡＤ変換器３Ｒ、３Ｌから出力される映像信号データがそれぞれ画像メモリ４Ｒ、４Ｌに取り込まれて格納される（ステップＳ１）。
【００４６】
ステップＳ１に続いて、画像メモリ４Ｒに記憶されたある領域の画像に対応する画像を画像メモリ４Ｌから求め、いわゆる画像の左右の対応を取る（ステップＳ２）。
【００４７】
対応を取った後、カメラ１Ｒ、１Ｌにおける視差ｄＲ、ｄＬを求め、位置情報に変換する（ステップＳ３）。
【００４８】
その位置情報をクラスタリングし（ステップＳ４）、ＩＤを付ける（ステップＳ５）。
【００４９】
位置演算装置７の出力である、ＩＤの付けられた出力は、本発明の要部ではないので、詳しく説明しないが、図示していない、例えば、道路・障害物認識装置等に送出されて自動運転システムを構成することができる。この自動運転システムでは、運転者に対する警告、自動車（ステレオカメラ１を積んだ自車）Ｍの衝突回避、前走車の自動追従等の動作を行うことができる。
【００５０】
この実施の形態において、上述の左右の画像の対応を取るステップＳ２では、いわゆる特徴に着目した方法ではなく、基本的には、従来技術の項で説明した画像中の領域に着目する方法を採用している。
【００５１】
すなわち、エッジ、線分、特殊な形など何らかの特徴を抽出し、それらの特徴が一致する部分が対応の取れた部分であるとする特徴に着目する方法は、取り扱う情報量が低下するので採用せず、一方の画像、この実施の形態では、右画像ＩＲから対象物体画像ＳＲを囲む小領域、いわゆるウィンドウを切り出し、この小領域に似た小領域を他方の左画像ＩＬから探すことにより対応を決定する方法を採用している。
【００５２】
この実施の形態において採用した画像中の領域に着目する方法では、２枚の画像ＩＬ、ＩＲ上において同一対象物体Ｓの対応を採る技術として、一方の画像上に適当なサイズのウィンドウを設定し、他方の画像においてこのウィンドウに対応する領域を求めるために、他方の画像に前記ウィンドウと同一サイズの領域を設定する。
【００５３】
次に、両画像上の各ウィンドウ内の画像（単に、ウィンドウ画像ともいう。）を構成する対応する各画素（詳しく説明すると、ウィンドウ画像中のマトリクス位置が対応する各画素）についての画素データ値、すなわち、輝度値を引き算して差を得、さらに輝度差の絶対値を得る。
【００５４】
そして、各対応する画素についての輝度差の絶対値の前記ウィンドウ内の和、いわゆる総和を求める。
【００５５】
この総和を左右画像の一致度（対応度ともいう。）Ｈと定義する。このとき、右画像ＩＲと左画像ＩＬのウィンドウ内の対応座標点（ｘ，ｙ）の輝度（画素データ値）をそれぞれＩＲ（ｘ，ｙ）、ＩＬ（ｘ，ｙ）とし、ウィンドウの横幅をｎ画素（ｎは画素数）、縦幅をｍ画素（ｍも画素数）とするとき、ずらし量をｄｘ（後述する）とすれば、一致度Ｈは、次の（１０）式により求めることができる。
【００５６】
Ｈ（ｘ，ｙ）＝Σ（ｊ＝１→ｍ）Σ（ｉ＝１→ｎ）｜Ｉｄ｜ …（１０）
ここで、
｜Ｉｄ｜＝｜ＩＲ（ｘ＋ｉ，ｙ＋ｊ）−ＩＬ（ｘ＋ｉ＋ｄｘ，ｙ＋ｊ）｜
である。記号Σ（ｉ＝１→ｎ）は、｜Ｉｄ｜についてのｉ＝１からｉ＝ｎまでの総和を表し、記号Σ（ｊ＝１→ｍ）は、Σ（ｉ＝１→ｎ）｜Ｉｄ｜の結果についてのｊ＝１からｊ＝ｍまでの総和を表すものとする。
【００５７】
この（１０）式から、一致度Ｈが小さいほど、言い換えれば、輝度差の絶対値の総和が小さいほど、左右のウィンドウ画像が良く一致していることが分かる。
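The degree of coincidence H of equation (10) is a sum of absolute differences. The following Python sketch computes it directly from the formula. The function name `sad`, the list-of-rows image layout and the synthetic test pattern are assumptions made for illustration only.

```python
def sad(right, left, x, y, dx, n, m):
    """H(x, y) of equation (10): window width n, height m, shift dx.

    Images are lists of rows, indexed as image[row][column].
    """
    total = 0
    for j in range(1, m + 1):          # j = 1 .. m (rows)
        for i in range(1, n + 1):      # i = 1 .. n (columns)
            total += abs(right[y + j][x + i] - left[y + j][x + i + dx])
    return total

def pattern(row, col):
    """Hypothetical luminance pattern used only to build test images."""
    return (col * col + 3 * row) % 17

WIDTH, HEIGHT, TRUE_SHIFT = 32, 8, 3
right_img = [[pattern(r, c) for c in range(WIDTH)] for r in range(HEIGHT)]
# The left image shows the same scene displaced 3 pixels to the right.
left_img = [[pattern(r, c - TRUE_SHIFT) for c in range(WIDTH)] for r in range(HEIGHT)]

h_true = sad(right_img, left_img, x=2, y=1, dx=TRUE_SHIFT, n=4, m=5)
h_zero = sad(right_img, left_img, x=2, y=1, dx=0, n=4, m=5)
print(h_true, h_zero)
```

With the correct shift the two windows are identical and H is 0; with any other shift H is larger, which is the property the matching relies on.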
【００５８】
この場合、分割しようとするウィンドウ、すなわち小領域の大きさが大きすぎると、その領域内に相対距離Ｚｄの異なる他の物体が同時に存在する可能性が大きくなって、誤対応の発生する可能性が高くなる。一方、小領域の大きさが小さすぎると、誤った位置で対応してしまう誤対応、あるいは、ノイズを原因とする誤対応が増加してしまうという問題がある。本発明者等は、種々の実験結果から、最も誤対応が少なくなる小領域の大きさは、横方向の画素数ｎがｎ＝７〜９程度、縦方向の画素数ｍがｍ＝１２〜１５程度の大きさであることをつきとめた。
【００５９】
図６と図７は、対応処理装置６において一致度Ｈを求める対応計算を行う際の領域の動かし方の概念を示している。
【００６０】
図６に示すように、対応を取る元となる右画像ＩＲ上の所定領域（小領域または原領域ともいう。）３１は、Ｘ軸方向左端位置から右へ１画素ずつ６４０画素分移動していき、対応を取られる左画像ＩＬの所定領域（小領域または検索領域ともいう。）３２は、右画像ＩＲの原領域３１の左端位置に対応する位置（以下、原領域３１の水平方向の変移位置という。）から対応計算を行い、ずらし量ｄｘを右方向にエピポーラーラインＥＰ上を０〜最大１２７画素分だけ１画素ずつ移動させて対応計算を行うようにしている。最大１２７画素のずれが有効な一致度Ｈの計算は、合計で（６４０−ｎ）×１２８回行われる。
【００６１】
なお、１２８画素分に限定する理由は、出力結果を利用する側の要求から水平画角θがθ＝４０°、最短の距離ＺｄがＺｄ＝５ｍ、使用できるステレオカメラ１（カメラ１Ｒとカメラ１Ｌ）の水平方向の画素数ＮがＮ＝７６８、設置できる基線長ＤがＤ＝０．５ｍから、下記の（１１）式に当てはめると、ＮＬ＋ＮＲ＝１０５画素となり、ハードウエアにおいて都合のよい２の累乗でこれに近い値の２⁷＝１２８を選んだからである。
【００６２】
ＮＬ＋ＮＲ＝（Ｎ×Ｄ）／｛Ｚｄ×２×ｔａｎ（θ／２）｝
＝（７６８×０．５）／（５×２×ｔａｎ２０°） …（１１）
このことは、右画像ＩＲ中、Ｘ＝０（左端）の位置に撮像された対象が、かならず、左画像ＩＬのずらし量ｄｘがｄｘ＝０〜１２７に対応する０番目の画素位置から１２７番目の画素位置内に撮像されていることを意味する。したがって、Ｘ座標値（変移位置ともいう。）ＸがＸ＝０を基準とする原領域３１内の撮像対象は、左画像ＩＬのＸ座標値ＸがＸ＝０を基準として、ずらし量ｄｘがｄｘ＝０〜１２７の範囲に撮像されていることを意味する。同様にして右画像ＩＲのＸ座標値ＸがＸ＝６４０−ｎを基準とする原領域３１内の撮像対象は、左画像ＩＬのＸ座標値ＸがＸ＝６４０−ｎを基準として、ずらし量ｄｘがｄｘ＝０〜１２７の範囲に撮像されていることになる。
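The arithmetic of equation (11) and the choice of 128 as the search width can be checked with a short Python sketch. The helper names `max_disparity_pixels` and `next_power_of_two` are assumptions; the numerical inputs are those quoted in the description.

```python
import math

def max_disparity_pixels(n_pixels, baseline_m, min_distance_m, h_fov_deg):
    """NL + NR of equation (11): disparity, in pixels, of an object at the shortest distance."""
    half_angle = math.radians(h_fov_deg) / 2.0
    return n_pixels * baseline_m / (min_distance_m * 2.0 * math.tan(half_angle))

def next_power_of_two(value):
    """Smallest power of two that is not less than value."""
    power = 1
    while power < value:
        power *= 2
    return power

disparity = max_disparity_pixels(n_pixels=768, baseline_m=0.5, min_distance_m=5.0, h_fov_deg=40.0)
search_width = next_power_of_two(disparity)
print(f"NL + NR = {disparity:.1f} pixels, search width = {search_width}")
```

The computed disparity is slightly above 105 pixels, so a range of 128 shifts (dx = 0 to 127) covers every object at or beyond the shortest detection distance.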
【００６３】
このとき、検索領域３２の最右端の画素がＸ座標値ＸがＸ＝（６４０−ｎ）＋ｎ＋１２７＝７６７（７６８番目）の最右端の画素になるので、それ以上、右画像ＩＲの原領域３１を右方向にずらすことは、一般に、無意味である。右画像ＩＲ中、Ｘ座標値ＸがＸ＝６４０−ｎより右側の撮像対象は、左画像ＩＬに撮像されないからである。しかし、遠方の画像については対応がとれるため、有意なこともあるので、本発明においては、対応すべき画像のない部分の画素については８ビットの最大値２５５があるものとして一応計算を行っている。メモリや計算時間を節約するためにはＸ座標値ＸをＸ＝６４０−ｎまでで打ち切ることが有効である。
【００６４】
そこで、図７のフローチャートに示すように、まず、右画像ＩＲ中のＸ座標値ＸがＸ＝０を変移位置とする原領域３１を取り出し（ステップＳ１１）、左画像ＩＬの検索領域３２のずらし量ｄｘをｄｘ＝０に設定する（ステップＳ１２）。
【００６５】
次に、ずらし量ｄｘがｄｘ＝１２７を超える値であるかどうか、すなわちｄｘ＝１２８であるかどうかを判定する（ステップＳ１３）。
【００６６】
この判定が否定的であるときには、対応度Ｈの計算をするために、左画像ＩＬの検索領域（小領域）３２分の画素データを取り出す（ステップＳ１４）。
【００６７】
次いで、小領域３１と小領域３２の各画素の差の絶対値の総和、すなわち、（１０）式に示す一致度Ｈを求め記憶する（ステップＳ１５）。
【００６８】
次に、ずらし量ｄｘをｄｘ→ｄｘ＋１（この場合、ｄｘ＝１）として１画素分増加する（ステップＳ１６）。
【００６９】
このとき、ステップＳ１３の判定は成立しないので、次に、ずらし量ｄｘがｄｘ＝１を基準に検索領域３２を取り出し（再び、ステップＳ１４）、このずらし量ｄｘがｄｘ＝１を基準の検索領域３２とＸ座標値（変移位置ともいう。）ＸがＸ＝０の原領域３１とで一致度Ｈを計算して記憶する（再び、ステップＳ１５）。
【００７０】
同様にして、ずらし量ｄｘがｄｘ＝１２８になるまで（ステップＳ１３の判定が成立するまで）Ｘ座標値ＸがＸ＝０の原領域３１についての一致度Ｈを計算する。
【００７１】
ステップＳ１３の判定が肯定的であるとき、すなわち、Ｘ座標値ＸがＸ＝０の原領域３１について計算した一致度Ｈのうち、負のピーク値である最小値Ｈｍｉｎとその近傍の値を求め、記憶しておく（ステップＳ１７）。
【００７２】
次に、繁雑になるので、図７のフローチャート中には記載しないが、右画像ＩＲ中の変移位置ＸがＸ＝１〜７６７（または６４０−ｎ）まで、上述のステップＳ１１〜Ｓ１７を繰り返し、各変移位置Ｘにおける右画像ＩＲの原領域３１に最も対応する左画像ＩＬの検索領域３２を検出する。
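The loop of steps S11 to S17 can be sketched for a single displacement position X as follows. This is an illustrative sketch under stated assumptions: the function name `best_shift` is hypothetical, window indices are 0-based here (equation (10) counts from 1), and pixels that fall outside the left image are treated as the 8-bit maximum 255, as the description states for regions with no corresponding image.

```python
def best_shift(right, left, x, y, n, m, max_dx=127, fill=255):
    """Return (dx_min, scores): the shift with the smallest H and all H values."""
    width = len(left[0])
    scores = []
    for dx in range(max_dx + 1):                  # S12, S13, S16: dx = 0 .. max_dx
        h = 0
        for j in range(m):
            for i in range(n):
                col = x + i + dx
                # Pixels beyond the right edge of the left image count as 255.
                left_value = left[y + j][col] if col < width else fill
                h += abs(right[y + j][x + i] - left_value)   # S14, S15
        scores.append(h)
    dx_min = min(range(len(scores)), key=scores.__getitem__)  # S17: minimum of H
    return dx_min, scores

# Hypothetical test images: the left image is the right image displaced by 3 pixels.
right_img = [[(c * c + 3 * r) % 17 for c in range(32)] for r in range(8)]
left_img = [[((c - 3) ** 2 + 3 * r) % 17 for c in range(32)] for r in range(8)]
dx_min, scores = best_shift(right_img, left_img, x=2, y=1, n=4, m=5, max_dx=8)
print(dx_min, scores[dx_min])
```

Repeating this call for every X along the epipolar line yields the table of minima that the later position calculation consumes.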
【００７３】
図８は、図６の動作説明図、図７のフローチャートに基づいて、一致度Ｈの計算等を行う対応処理装置６の詳細な構成を示すブロック図である。
【００７４】
図８中、スキャン座標生成部６１において、対応処理を行おうとする右画像ＩＲに対する原領域３１と左画像ＩＬに対する検索領域３２の座標（上述の図６に示す変移位置Ｘとずらし量ｄｘおよびエピポーラーラインＥＰのＹ座標値）が生成される。
【００７５】
このスキャン座標生成部６１で生成された座標（Ｘ，Ｙ）に基づいて、画像メモリ４Ｒ、４Ｌから読み出す小領域のアドレスデータが画像メモリアドレス生成部６４により生成されるが、この実施の形態においては、スキャン座標生成部６１で生成された座標（Ｘ，Ｙ）に基づいて、レンズ１１Ｒ、１１Ｌを含む光学部の歪みを原因とする撮像の歪みを補正する補正座標テーブル（歪み補正手段、アドレス補正テーブルの一部として機能する。）６２からアドレス補正座標（Δｘ，Δｙ）が読み出され、画像メモリアドレス生成部６４に供給される。
【００７６】
したがって、画像メモリアドレス生成部６４には、加算器６９で加算された補正後の座標（Ｘ＋Δｘ，Ｙ＋Δｙ）が供給される。この補正後の座標（Ｘ＋Δｘ，Ｙ＋Δｙ）に基づいて、画像メモリ４Ｒ、４Ｌに対する読み出しアドレスデータが画像メモリアドレス生成部６４で生成され、それぞれ、画像メモリ４Ｒ、４Ｌに供給される。
【００７７】
画像メモリ４Ｒ、４Ｌから読み出された画像データに基づく一致度Ｈの計算、いわゆる相関演算が相関演算部６５で行われ、相関演算結果が相関メモリ６７に記憶される。また、ずらし量ｄｘに対応して相関演算結果のピーク値、すなわち一致度Ｈの最小値Ｈｍｉｎ等がピーク値検出部６６により検出され、検出されたピーク値がピーク値メモリ６８に記憶される。
【００７８】
上記補正座標テーブル６２に格納されるアドレス補正座標（Δｘ，Δｙ）は、右側の光学系に係るビデオカメラ１Ｒ用と左側の光学系に係るビデオカメラ１Ｌ用とで、それぞれの光学系に対応した別々の内容の補正座標とすることもできる。
【００７９】
アドレス補正座標（Δｘ，Δｙ）は、光軸に垂直な平面状の物体が、光軸に垂直な撮像面、ここでは、ＣＣＤエリアセンサ１３Ｒ、１３Ｌの撮像面に相似に結像されない収差、すなわち光学系の歪みを補正のためのデータである。
【００８０】
例えば、光軸に垂直な平面状の物体としての被写体を、図９例に示すような長方形（正方形でもよい。）が格子状に配列された長方形格子７３とした場合に、ビデオカメラ１で撮像された画像が図１０に示す、たる形に中央が膨らんだ画像７４になるものとする。
【００８１】
そこで、図１１に示すように２つの画像を合わせて考慮すれば、スキャン座標生成部６１で生成された座標（Ｘ，Ｙ）に対応して、補正座標テーブル６２から読み出されたアドレス補正座標（Δｘ，Δｙ）を加算した補正後の座標（Ｘ＋Δｘ，Ｙ＋Δｙ）に基づく読み出しアドレスデータで画像メモリ４Ｒ、４Ｌから読み出すようにすれば、画像を正確な位置座標で読み出すことができるということが理解される。
【００８２】
アドレス補正座標（Δｘ，Δｙ）は電球を空間的に位置がわかっている点に置き、それが画像上の何処に写っているかを探し、歪みがない場合の位置とのずれをアドレス補正座標（Δｘ，Δｙ）としてテーブルに書き込む。この場合、すべての点について行うと処理時間がかかるため粗な点について補正データを求め、間の点については多項式により近似した値を利用する。
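Filling in the table between coarsely measured points can be sketched as below. The description says only that intermediate points use values approximated by a polynomial. The Lagrange interpolating polynomial, the three measured points and the clamping to the range −8 to 7 are all assumptions chosen for illustration.

```python
def lagrange(points, x):
    """Evaluate the polynomial through the given (x, value) points at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(points):
        term = yk
        for j, (xj, _) in enumerate(points):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total

# Hypothetical measured corrections (pixels) at three columns of one image row.
measured = [(0.0, -6.0), (384.0, 0.0), (767.0, 6.0)]

# Approximate every column, round to whole pixels and clamp to a signed 4-bit range.
corrections = [max(-8, min(7, round(lagrange(measured, float(x))))) for x in range(768)]
print(corrections[0], corrections[384], corrections[767])
```

A least-squares polynomial fit over more measured points would serve equally well; the source does not specify which approximation was used.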
【００８３】
この実施の形態において、補正量であるアドレス補正座標（Δｘ，Δｙ）は、符号付の４ビットを使用し、Ｘ座標、Ｙ座標共に、値が−８〜７までの画素数の補正が可能となっており、レンズ１１Ｒ、１１Ｌ等の歪みの程度としては、垂直方向で約６％（７．５÷１２０×１００）程度まで許容している。
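A table lookup with signed 4-bit corrections can be sketched as follows. Packing Δx and Δy into one byte is an assumption made for this example; the description states only that each correction uses signed 4 bits and ranges from −8 to 7. All function names are hypothetical.

```python
def pack_offsets(dx, dy):
    """Pack two signed 4-bit corrections into one byte (assumed layout: dx high, dy low)."""
    if not (-8 <= dx <= 7 and -8 <= dy <= 7):
        raise ValueError("correction outside the signed 4-bit range -8..7")
    return ((dx & 0xF) << 4) | (dy & 0xF)

def unpack_offsets(byte):
    """Recover (dx, dy) from a packed byte using two's-complement sign extension."""
    def signed4(value):
        return value - 16 if value & 0x8 else value
    return signed4(byte >> 4), signed4(byte & 0xF)

def read_corrected(image, table, x, y):
    """Read the pixel at the corrected coordinates (X + dx, Y + dy)."""
    dx, dy = unpack_offsets(table[y][x])
    return image[y + dy][x + dx]

image = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]
table = [[pack_offsets(0, 0)] * 3 for _ in range(3)]
table[1][1] = pack_offsets(1, -1)   # hypothetical correction for the centre pixel

print(read_corrected(image, table, 1, 1))
```

The ideal coordinates (1, 1) are redirected to column 2, row 0, which mirrors the role of the adder 69 in forming (X+Δx, Y+Δy) before the memory address is generated.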
【００８４】
この場合、アドレス補正座標（Δｘ，Δｙ）は、計算により求めているのではなく、実測値から求めているので、レンズの規則的な歪みの補正にとどまらず、フロントガラス等に存在する歪みのような不規則な歪みをも合わせて補正することができるという利点が得られる。
【００８５】
このように補正座標テーブル６２を設けることの理由および利点を課題と比較してまとめて再度説明すると、左右の画像を対応させる場合において、図９および図１０を参照して説明したように、レンズ１１Ｒ、１１Ｌには収差があるため、中心（レンズ１１Ｒ、１１Ｌの中心であると同時に画像ＩＲ、ＩＬの中心）を通る縦および横のそれぞれの直線は直線として撮像されるが、中心を通らない直線は、たる形歪み的あるいは糸巻き歪み的な曲線として撮像される。このため、対応する領域３１、３２が左右のビデオカメラ１Ｒ、１Ｌでは上下方向に移動してしまい、エピポーラーラインＥＰを基準とした同じ高さで探したのでは対応が採れないことになる。
【００８６】
そこで、中心から離れた領域について対応を採る場合には、この歪みを補正してから対応を採る必要がある。
【００８７】
この実施の形態において、スキャン座標生成部６１で歪みがないと仮定した場合の理想的な状態での対応をとるべき領域３１、３２の座標（Ｘ，Ｙ）が生成される。そして、この個々の座標値（Ｘ，Ｙ）に対して予めルックアップテーブルとしての補正座標テーブル６２に記憶されている補正値である補正座標（Δｘ，Δｙ）を読み出し、これらを加算器６９により合成して画像メモリアドレス生成部６４に供給することで、正確な、いわゆる真の座標に対応するメモリアドレスが画像メモリアドレス生成部６４で生成される。
【００８８】
この補正後のメモリアドレスにより画像メモリ４Ｒ、４Ｌから画像データを読み出して相関演算部６５に供給することにより、レンズ１１Ｒ、１１Ｌに歪みが多少存在しても相関演算部６５で対応が採れることになる。
【００８９】
次に、図１２は、図６、図７を参照して説明した一致度Ｈを求めるための相関演算部６５の詳細な構成を示している。
【００９０】
この相関演算部６５は、基本的には、第１〜第４の演算ブロック８１、８２、８３、８４を有する、いわゆるパイプライン方式的処理である並列処理方式を採用している。
【００９１】
理解の容易化のために、まず、パイプライン方式的処理を考慮しないで、具体的には、ＦＩＦＯメモリ６５ｉが存在しないものとして、第１演算ブロック８１のみで、図６、図７を参照して説明した一致度Ｈを求めるための動作について説明する。そして、上述のように、誤対応が最も少なくなるそれぞれの小領域（原領域３１と検索領域３２）の大きさとしては、横方向の画素数ｎがｎ＝７〜９画素程度、縦方向の画素数ｍがｍ＝１２〜１５画素程度であるが、ここでは、理解を容易にするために、ｎ＝４、ｍ＝５として説明する。
【００９２】
図１３は、このような前提のもとでの、エピポーラーラインＥＰ上に乗る仮想的な右画像データＩｒｄの例を示している。原領域３１の対象となる全画素データ数は、ｍ×６４０＝５×６４０箇であるものとする。
【００９３】
図１４は、同様に、エピポーラーラインＥＰ上に乗る仮想的な左画像データＩｌｄの例を示している。検索領域３２の対象となる全画素データ数は、ｍ×７６８＝５×７６８箇であるものとする。
【００９４】
図１２において、画像メモリ４Ｒから端子８５を通じて原領域３１の右画像データＩｒｄが減算器６５ａの被減算入力端子に供給され、画像メモリ４Ｌから端子８６を通じて検索領域３２の左画像データＩｌｄが減算器６５ａの減算入力端子に供給される。
【００９５】
まず、一般的に説明すると、減算器６５ａでは、縦方向の左右の画素データの差を取り、その差の絶対値が絶対値演算器６５ｂで取られる。加算器６５ｃは、縦方向の左右の画素データの差の絶対値の和を取るとともに、ラッチ６５ｄにラッチされている前列の縦方向の左右の画素データの差の絶対値の和を加算する。
【００９６】
ＦＩＦＯメモリ６５ｅには、横方向の画素数ｎに対応するｎ段分、この実施の形態では、当該列の分を除いて左側（前側）に４（＝ｎ）列分の縦方向の左右の画素データの差の絶対値の和が保持される。すなわち、この実施の形態において、ＦＩＦＯメモリ６５ｅは、最初（入力側）のメモリ６５ｅ１〜最後（出力側）のメモリ６５ｅ４までの４段ある。
【００９７】
具体的に説明すると、１回目の演算（１列１行目）で加算器６５ｃの出力側には、１列１行目の左右の画素データの差の絶対値｜Ａ１−ａ１｜が現れ、かつ、この値｜Ａ１−ａ１｜がラッチ６５ｄに保持される。
【００９８】
２回目の演算（１列２行目）で１列２行目の左右の画素データの差の絶対値｜Ａ２−ａ２｜とラッチ６５ｄに保持されているデータ｜Ａ１−ａ１｜との和、すなわち、｜Ａ２−ａ２｜＋｜Ａ１−ａ１｜が加算器６５ｃの出力側に現れる。
したがって、５回目の演算後には、次の（１３）式に示す１列目の左右の画素データの差の絶対値の和（データ）Σ▲１▼（以下、２列目以降を順次、Σ▲２▼、Σ▲３▼、Σ▲４▼、…Σ６４１とする。）が加算器６５ｃの出力側に現れ、この和Σ▲１▼は、ラッチ６５ｄに保持される。また、このデータΣ▲１▼は、ＦＩＦＯメモリ６５ｅの最初のメモリ６５ｅ１に格納される。
【００９９】
Σ▲１▼＝｜Ａ１−ａ１｜＋｜Ａ２−ａ２｜＋｜Ａ３−ａ３｜＋｜Ａ４−ａ４｜＋｜Ａ５−ａ５｜ …（１３）
この１列目の左右の画素データの差の絶対値の和Σ▲１▼が、最初のメモリ６５ｅ１に格納された後、ラッチ６５ｄは、端子８９から供給される制御信号によりリセットされる。
【０１００】
このようにして、ずらし量ｄｘの値がｄｘ＝０での小領域３１、３２間での全ての１回目の計算が終了する４列（４＝ｎ）５行（５＝ｍ）目の演算終了後のラッチ６５ｄに格納されるデータ値とＦＩＦＯメモリ６５ｅに格納されるデータ値とラッチ６５ｈに格納されるデータ値等を図１５に模式的に示す。
【０１０１】
図１５において、ずらし量ｄｘの値がｄｘ＝０の場合における次の（１４）式に示す最初に求められる一致度Ｈ０が加算器６５ｇの出力側に現れている点に留意する。
【０１０２】
Ｈ０＝Σ▲１▼＋Σ▲２▼＋Σ▲３▼＋Σ▲４▼ …（１４）
次に、５列５行目の演算終了後の図１５に対応する図を図１６に示す。図１６から分かるように、ずらし量ｄｘの値がｄｘ＝０の場合の検索領域３２に対する一致度Ｈ０が出力端子９０に現れる。
【０１０３】
この場合、加算器６５ｆの出力側には、５列目のデータΣ▲５▼と１列目のデータΣ▲１▼との差Σ▲５▼−Σ▲１▼が現れるので、加算器６５ｇの出力側には、ずらし量ｄｘの値がｄｘ＝１の場合の検索領域３２に対する次の（１５）式に示す一致度Ｈ１が現れることになる。
【０１０４】
Ｈ１＝Σ▲２▼＋Σ▲３▼＋Σ▲４▼＋Σ▲５▼ …（１５）
ここで、実際の１５×１５の小領域を水平方向にＸ＝０〜６３９まで移動し、ずらし量ｄｘをｄｘ＝１２８までの各一致度Ｈを求める際に、この実施の形態では、原領域３１の左画像ＩＬ上で１画素分右にずらした位置での対応度Ｈを求めるとき、左端の縦方向の和（上例ではΣ▲１▼）を減じて右に加わる新たな列の縦方向の和（上例ではΣ▲５▼）を加えるようにしているので、演算回数を１５×６４０×１２８＝１，２２８，８００回にすることができる。すなわち、小領域の横方向の幅（画素数）は計算時間に無関係になる。
【０１０５】
もし、上例のように演算しなくて、１５×１５の小領域を移動させこの小領域毎に各領域を構成する画素データの差を取って、一致度Ｈを、水平方向ＸをＸがＸ＝０〜６３９まで、ずらし量ｄｘを１２８まで計算することにすると、演算回数は１５×１５×６４０×１２８＝１８，４３２，０００回となり、最も演算時間のかかる絶対値演算器６５ｂの１回の演算時間を１００ｎｓで実行した場合でも、総演算時間が１８４３ｍｓかかることになる。これに対して上例では、総演算時間が１２３ｍｓであり、約１／１５に低減することができる。
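The saving described above comes from reusing the vertical column sums: each new window total is the previous total plus the entering column and minus the leaving column. The following Python sketch shows that update and reproduces the operation counts quoted in the text. The function name and the sample column sums are assumptions.

```python
def sliding_window_sums(column_sums, n):
    """Totals of every run of n consecutive column sums, updated incrementally."""
    window = sum(column_sums[:n])
    totals = [window]
    for k in range(n, len(column_sums)):
        # Add the new right-hand column, subtract the column that left the window.
        window += column_sums[k] - column_sums[k - n]
        totals.append(window)
    return totals

column_sums = [5, 1, 4, 2, 8, 3]          # hypothetical column sums for six columns
totals = sliding_window_sums(column_sums, n=4)

# Operation counts for a 15 x 15 window, 640 positions and 128 shifts.
naive_ops = 15 * 15 * 640 * 128
fast_ops = 15 * 640 * 128
naive_ms = naive_ops * 100 / 1e6          # 100 ns per operation, expressed in ms
fast_ms = fast_ops * 100 / 1e6
print(totals, naive_ops, fast_ops, naive_ms, fast_ms)
```

The incremental totals match a direct summation of each window, and the two counts differ by exactly the window width of 15.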
【０１０６】
しかし、この総演算時間１２３ｍｓは、ＮＴＳＣ方式のフレームレートである３３ｍｓより大きいので、フレームレート毎に、言い換えれば、１画面毎に一致度Ｈを計算する場合には、総演算時間１２３ｍｓを約１／４以下の時間にする必要がある。
【０１０７】
そこで、この実施の形態では、図１２に示したように、第１演算ブロック８１と同一構成の第２〜第４演算ブロック８２、８３、８４を設け、縦方向の画素数ｍと同数のＦＩＦＯメモリ６５ｉを直列に接続している。
この場合、簡単のために、図１３、図１４と同じ画像データを利用してパイプライン方式的処理動作を説明すれば、最初に、第１と第２の演算ブロック８１、８２を構成するＦＩＦＯメモリ６５ｉを通じて、第３演算ブロック８３を構成するＦＩＦＯメモリ６５ｉに１列目の画素データａ１〜ａ５までを転送する。したがって、この転送時点で、第２演算ブロック８２を構成するＦＩＦＯメモリ６５ｉには２列目の画素データｂ１〜ｂ５が転送され、第１演算ブロック８１を構成するＦＩＦＯメモリ６５ｉには３列目の画素データｃ１〜ｃ５が転送される。
【０１０８】
次に、次の４列目の画素データｄ１〜ｄ５を第１演算ブロック８１のＦＩＦＯメモリ６５ｉに順次転送したとき、第４演算ブロック８４では右１列目の画素データＡ１〜Ａ５と左１列目の画素データａ１〜ａ５に関連する上述の演算が行われ、第３演算ブロック８３では右１列目の画素データＡ１〜Ａ５と左２列目の画素データｂ１〜ｂ５に関連する上述の演算が行われ、第２演算ブロック８２では右１列目の画素データＡ１〜Ａ５と左３列目の画素データｃ１〜ｃ５に関連する上述の演算が行われ、第１演算ブロック８１では右１列目の画素データＡ１〜Ａ５と左４列目の画素データｄ１〜ｄ５に関連する上述の演算が行われる。
【０１０９】
次いで、右２列目の画素データＢ１〜Ｂ５の転送に同期して次の左５列目の画素データｅ１〜ｅ５を第１演算ブロック８１のＦＩＦＯメモリ６５ｉに順次転送したとき、第４演算ブロック８４では右２列目の画素データＢ１〜Ｂ５と左２列目の画素データｂ１〜ｂ５に関連する演算が行われ、第３演算ブロック８３では右２列目の画素データＢ１〜Ｂ５と左３列目の画素データｃ１〜ｃ５に関連する演算が行われ、第２演算ブロック８２では右２列目の画素データＢ１〜Ｂ５と左４列目の画素データｄ１〜ｄ５に関連する演算が行われ、第１演算ブロック８１では右２列目の画素データＢ１〜Ｂ５と左５列目の画素データｅ１〜ｅ５に関連する上述の演算が行われる。
【０１１０】
このようにして、次に、右３列目の画素データＣ１〜Ｃ５の転送に同期して次の左６列目の画素データｆ１〜ｆ５を順次同期して転送するようにすれば、第４演算ブロック８４では、ずらし量ｄｘがｄｘ＝０、ｄｘ＝４、……についての一致度Ｈを計算でき、同様に、第３演算ブロック８３では、ずらし量ｄｘがｄｘ＝１、ｄｘ＝５、……についての一致度Ｈを計算でき、第２演算ブロック８２では、ずらし量ｄｘがｄｘ＝２、ｄｘ＝６、……についての一致度Ｈを計算でき、第１演算ブロック８１では、ずらし量ｄｘがｄｘ＝３、ｄｘ＝７、……についての一致度Ｈを同時に計算することできる。
【０１１１】
このように、パイプライン方式的処理の４並列にすれば、演算時間を約１／４に低減することができる。なお、上述の説明から理解できるように、第４演算ブロック８４中のＦＩＦＯメモリ６５ｉは不要である。
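The division of shifts among the four calculation blocks can be sketched as below. The index convention (0 for block 84, 1 for block 83, 2 for block 82, 3 for block 81) and the function name are assumptions. The 122.88 ms figure is 1,228,800 operations at 100 ns each, as computed in the preceding paragraphs.

```python
def shifts_for_block(block_index, num_blocks=4, max_dx=127):
    """Shifts dx handled by one block: block_index, block_index + 4, ..."""
    return list(range(block_index, max_dx + 1, num_blocks))

assignment = {index: shifts_for_block(index) for index in range(4)}

single_block_ms = 1228800 * 100 / 1e6        # 122.88 ms with one block
four_block_ms = single_block_ms / 4          # time when four blocks run in parallel
print(assignment[0][:3], assignment[3][:3], four_block_ms)
```

Each block handles 32 of the 128 shifts, and the resulting time of about 30.7 ms fits within the 33 ms frame period cited in the text.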
【０１１２】
この場合、図１２例の４並列による動作によれば、１フレームレートで１フレームの画像についての６４０点の距離情報が求まり、左画像ＩＬの横７６８画素×縦１５画素の帯領域の処理が完了するが、これは１画像領域が７６８×２４０画素であることを考えると、全画像領域の１／１６になる。
【０１１３】
なお、左右のカメラ１Ｒ、１Ｌの上下方向の取付位置がずれた場合等を想定した場合には、当初のエピポーラーラインＥＰ上に対応する対象物画像が存在しなくなる場合も考えられる。この場合、図示はしないが、例えば、図８の対応処理装置６の構成を４並列にし、画像の縦方向の処理を４並列にすることにより、横７６８画素、縦１５画素の帯領域４つをフレームレート内で処理することが可能となる。この場合に、領域が重ならないようにすることで、最大１２７画素のずれまで検出できる距離情報を１フレームレート内で（６４０−ｎ）×４点出力できる。
【０１１４】
図１２例の相関演算部６５の処理により、１本のエピポーラーラインＥＰ上における右画像ＩＲ中の６４０個の原領域３１のそれぞれに対して、ずらし量ｄｘがｄｘ＝０〜１２７の検索領域３２についての１２８個の一致度Ｈが演算され、この演算結果の一致度Ｈが、相関メモリ６７に格納される。
【０１１５】
また、１個の原領域３１、すなわち、各変移位置Ｘに対する１２８個の検索領域３２のうち、一致度Ｈが最小値となる値（ピーク値ともいう。）をピーク値検出部６６で検出し、検出したピーク値（最小値）Ｈｍｉｎを、そのときの変移位置Ｘとずらし量ｄｘに対応させてピーク値メモリ６８に記憶する。ピーク値メモリ６８は、一致度Ｈのピーク値（最小値）記憶テーブルとして機能する。
【０１１６】
変移位置Ｘとずらし量ｄｘをアドレスとして一致度Ｈが記憶されている相関メモリ６７と、その最小値としてのピーク値Ｈｍｉｎが記憶されているピーク値メモリ６８が位置演算装置７に接続されている。
【０１１７】
位置演算装置７は、一致度Ｈとそのピーク値Ｈｍｉｎとを参照し、図１７に示すフローチャートに基づいて、対象物体Ｓの３次元空間での位置Ｐを求める。
【０１１８】
変移位置Ｘが所定の変移位置であるＸ＝Ｘｐの原領域３１についての位置Ｐの算出方法について説明する。
【０１１９】
まず、所定の変移位置Ｘｐの原領域３１についての一致度Ｈのピーク値Ｈｍｉｎと、そのときのずらし量ｄｘ（このずらし量ｄｘをずらし量ｄｘｍｉｎと呼ぶ）をピーク値メモリ６８から取り込む（ステップＳ２１）。
【０１２０】
次に、このずらし量ｄｘｍｉｎの近傍の左右各２個の一致度Ｈ、すなわち、ずらし量ｄｘがずらし量ｄｘｍｉｎより２つ少ないずらし量ｄｘｍｉｎ−２および２つ多いずらし量ｄｘｍｉｎ＋２の各位置における一致度Ｈｍｉｎ−２、Ｈｍｉｎ＋２を取り込む（ステップＳ２２）。
【０１２１】
次に、次の（１６）式に基づいて谷の深さ（ピーク深さともいう。）Ｑを求める（ステップＳ２３）。
【０１２２】
Ｑ＝ｍｉｎ｛Ｈｍｉｎ−２／Ｈｍｉｎ，Ｈｍｉｎ＋２／Ｈｍｉｎ｝ …（１６）
この（１６）式は、ピーク値Ｈｍｉｎに対する、これから２つ隣の一致度Ｈｍｉｎ−２、Ｈｍｉｎ＋２の大きさの各比のうち、最小値を取ることを意味する。
【０１２３】
そして、この谷の深さＱが所定の閾値ＴＨ以上の値であるかどうか（Ｑ≧ＴＨ）を判定し（ステップＳ２４）、所定の閾値ＴＨ以上の値である場合には、ピーク値Ｈｍｉｎであり、ずらし量ｄｘｍｉｎの検索領域３２が所定の変移位置Ｘｐの原領域３１に対応する領域であると同定して次のステップＳ２５に進む。
【０１２４】
一方、ステップＳ２４の結果が否定的である場合には、ピーク値Ｈｍｉｎであり、ずらし量ｄｘｍｉｎの検索領域３２が所定の変移位置Ｘｐの原領域３１に対応する領域ではないと判断して、次の変移位置Ｘｐ＋１の原領域３１に対する対応する検索領域３２を求める処理が全て終了したかどうかを判定し（ステップＳ２８）、全ての変移位置Ｘに対応する処理が終了していない場合には、そのステップＳ２１〜Ｓ２４の処理を繰り返す。
【０１２５】
この実施の形態において、一致度Ｈのピーク値Ｈｍｉｎを変移位置Ｘｐの原領域３１に対応する検索領域３２であると直ちに同定しないで、その近傍を見て（ステップＳ２２）、その谷の深さＱを計算し（ステップＳ２３）、その谷の深さＱが所定の閾値ＴＨ以上の場合にのみ、一致度Ｈのピーク値Ｈｍｉｎが得られるずらし量ｄｘｍｉｎの検索領域３２が、変移位置Ｘｐの原領域３１に対応する検索領域３２であると同定する理由は、雑音の混入または画像ＩＲ、ＩＬの被写体の画像濃度が一様である場合等に、一致度Ｈのピーク値Ｈｍｉｎが得られ、ずらし量ｄｘｍｉｎの検索領域３２が、変移位置Ｘｐの原領域３１に必ずしも対応するとは限らないからである。
【０１２６】
すなわち、ずらし量ｄｘｍｉｎの位置の近傍領域を考慮して、谷の深さＱが、所定の閾値ＴＨより小さいものは、対応がよく取れていないと判断し、その一致度Ｈのピーク値Ｈｍｉｎは利用しないこととした。なお、所定の閾値ＴＨは、この実施の形態においては、ＴＨ＝１．２とした。
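The valley-depth test of equation (16) with TH = 1.2 can be sketched as follows. The handling of a minimum too close to either end of the search range and of Hmin = 0 is not described in the source; both cases are assumptions in this sketch, as are the function names and sample values.

```python
def valley_depth(scores, dx_min):
    """Q of equation (16): the smaller of H(dx_min - 2)/Hmin and H(dx_min + 2)/Hmin."""
    if dx_min < 2 or dx_min + 2 >= len(scores):
        return None                      # assumption: neighbours unavailable at the edges
    h_min = scores[dx_min]
    if h_min == 0:
        return float("inf")              # assumption: a perfect match counts as infinitely deep
    return min(scores[dx_min - 2] / h_min, scores[dx_min + 2] / h_min)

def is_reliable(scores, dx_min, threshold=1.2):
    """Steps S23 and S24: accept the match only when Q >= TH."""
    depth = valley_depth(scores, dx_min)
    return depth is not None and depth >= threshold

sharp = [90, 70, 40, 20, 10, 25, 50, 80]     # hypothetical H values with a clear valley
flat = [10, 10, 10, 10, 10, 11, 11]          # hypothetical H values from a uniform scene

print(valley_depth(sharp, 4), is_reliable(sharp, 4), is_reliable(flat, 2))
```

The flat profile yields Q = 1.0, below the threshold, so its minimum is discarded, which is the behaviour described for noisy or uniformly bright subjects.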
【０１２７】
ステップＳ２４の判断が肯定的であるとき、ずらし量ｄｘの真の値（真のピーク位置という）ｄｓを次に示す補間処理により求める（ステップＳ２５）。すなわち、図１８に示すように、最小位置座標を（ｄｘｍｉｎ，Ｈｍｉｎ）とし、その前後の位置座標をそれぞれ（ｄｘｍｉｎ−１，Ｈｍｉｎ−１）、（ｄｘｍｉｎ＋１，Ｈｍｉｎ＋１）とするとき、前後の一致度Ｈｍｉｎ−１、Ｈｍｉｎ＋１の大きさを比較して、それぞれ次の（１７）式〜（１９）式で示す値に推定する。
【０１２８】
Ｈｍｉｎ−１＜Ｈｍｉｎ＋１の場合、
ｄｓ＝ｄｘｍｉｎ−｛（Ｈｍｉｎ−１−Ｈｍｉｎ＋１）／（２・（Ｈｍｉｎ−Ｈｍｉｎ＋１））｝…（１７）
Ｈｍｉｎ−１＝Ｈｍｉｎ＋１の場合、
ｄｓ＝ｄｘｍｉｎ …（１８）
Ｈｍｉｎ−１＞Ｈｍｉｎ＋１の場合、
ｄｓ＝ｄｘｍｉｎ＋｛（Ｈｍｉｎ＋１−Ｈｍｉｎ−１）／（２・（Ｈｍｉｎ−Ｈｍｉｎ−１））｝…（１９）
この（１７）式〜（１９）式の補間式を用いて真のピーク位置ｄｓを求めた場合には、補間しない場合に比較して、位置精度が３倍向上することを実験的に確認することができた。
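The interpolation of equations (17) to (19) translates directly into Python. The function name `subpixel_shift` and the sample values are assumptions. The three branches follow the three cases of the source.

```python
def subpixel_shift(dx_min, h_prev, h_min, h_next):
    """True peak position ds from the minimum and its two neighbours.

    h_prev is H at dx_min - 1, h_next is H at dx_min + 1.
    """
    if h_prev < h_next:                                              # eq. (17)
        return dx_min - (h_prev - h_next) / (2.0 * (h_min - h_next))
    if h_prev > h_next:                                              # eq. (19)
        return dx_min + (h_next - h_prev) / (2.0 * (h_min - h_prev))
    return float(dx_min)                                             # eq. (18)

ds_left = subpixel_shift(20, h_prev=10, h_min=2, h_next=18)
ds_right = subpixel_shift(20, h_prev=18, h_min=2, h_next=10)
ds_centre = subpixel_shift(20, h_prev=10, h_min=2, h_next=10)
print(ds_left, ds_right, ds_centre)
```

The estimate moves toward the neighbour with the smaller H and never by more than half a pixel. The denominators cannot be zero, because Hmin does not exceed the smaller neighbour and the larger neighbour strictly exceeds it.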
【０１２９】
結局、ステップＳ２５の補間処理終了後に、変移位置Ｘｐの原領域３１に最も対応する検索領域３２の真のピーク位置ｄｓが求まることになる。
【０１３０】
このようにして求められた変移位置Ｘｐと真のピーク位置ｄｓは、それぞれ、図４に示す右画像ＩＲ上の対象物体画像ＳＲの視差ｄＲと左画像ＩＬ上の対象物体画像ＳＬの視差ｄＬに対応する。
【０１３１】
しかし、実際上、上述したように、フロントガラスやカメラ１Ｒ、１Ｌの対物レンズ１１Ｒ、１１Ｌの光学特性によって、左右の画像ＩＲ、ＩＬには、例えば、ピンクッション的歪み、あるいはバレル的歪みが存在するので、これらによる歪み補正を行った視差ｄＲと視差ｄＬとを求める（ステップＳ２６）。
【０１３２】
そこで、これら歪み補正を行った視差ｄＲと視差ｄＬを測定値として、上述の（４）式〜（６）式から対象物体Ｓまでの奥行き方向の距離Ｚｄと、その距離Ｚｄからの左右の偏差にかかるずれ距離ＤＲとずれ距離ＤＬとの３次元位置情報を求めることができる（ステップＳ２７）。
【０１３３】
ステップＳ２８では、エピポーラーラインＥＰ上の全ての変移位置Ｘでの原領域３１に対応する検索領域３２中の真のピーク位置ｄｓを求める演算が終了したかどうか、すなわち、変移位置ＸがＸ＝７６７であるかどうかを確認して処理を終了する。
【０１３４】
位置演算装置７で作成された、これら３次元位置情報である距離Ｚｄとずれ距離ＤＲとずれ距離ＤＬとはクラスタリングされ、対象物体Ｓについての識別符号としての、いわゆるアイディ（ＩＤ：Ｉｄｅｎｔｉｆｉｃａｔｉｏｎ）が付けられて、出力端子９０を通じて、次の処理過程である、図示しない道路・障害物認識装置等に接続される。
【０１３５】
道路・障害物認識装置等は、自動運転システムを構成し、運転者に対する警告、車体の自動衝突回避、前走車への自動追従走行などの動作を行うことができる装置である。この場合、例えば、自動追従走行を行うシステムとして、本出願人の出願による「物体検出装置およびその方法」（特願平７−２４９７４７号）を挙げることができる。
【０１３６】
なお、この発明は上述の実施の形態に限らず、この発明の要旨を逸脱することなく種々の構成を採り得ることはもちろんである。
【０１３７】
【発明の効果】
以上説明したように、この発明によれば、２つのカメラは、車両の正面方向にある無限遠点が前記２つの撮像手段からそれぞれ得られる各画面中の中心となるように設置されており、対応処理手段が、２つの撮像手段から得られる各画像中の同一物体画像の対応を採る際に、歪み補正手段により光学部の歪みを原因とする撮像の歪みを補正した後に対応を採るようにしているので、レンズの歪み等、光学系の歪みに基づく画像の歪みを原因とする同一物体の対応の採れない状態を回避することができるという効果が達成される。
【図面の簡単な説明】
【図１】この発明の一実施の形態の構成を示すブロック図である。
【図２】ステレオカメラの据えつけ位置の説明に供される概略斜視図である。
【図３】三角測量の原理で距離を求める際の説明に供される平面視的図である。
【図４】対象物体にかかる左右画像上での視差の説明に供される線図であって、Ａは、右側画像、Ｂは、左側画像をそれぞれ表す図である。
【図５】図１例の装置の全体的な動作説明に供されるフローチャートである。
【図６】左右の小領域の対応処理の仕方の説明に供される図である。
【図７】図６例の説明に供されるフローチャートである。
【図８】対応処理装置の詳細な構成を含む装置の構成を示すブロック図である。
【図９】被写体としての長方形格子を示す図である。
【図１０】レンズの収差による歪みを含んで撮像された画像を示す図である。
【図１１】図９の図形と図１０の図形とを重ね合わせた補正座標の説明に供される図である。
【図１２】相関演算部の詳細な構成を示す回路ブロック図である。
【図１３】エピポーラーライン上の右画像データの一部を模式的に表す線図である。
【図１４】エピポーラーライン上の左画像データの一部を模式的に表す線図である。
【図１５】図１２例中、第１演算ブロックの動作説明に供されるブロック図である。
【図１６】図１２例中、第１演算ブロックの動作説明に供される他のブロック図である。
【図１７】位置演算装置の動作説明に供されるフローチャートである。
【図１８】補間演算の説明に供される線図である。
【符号の説明】
１…ステレオカメラ
１Ｒ、１Ｌ…ビデオカメラ
２Ｒ、２Ｌ…ＣＣＵ
４Ｒ、４Ｌ…画像メモリ
５Ｒ、５Ｌ…駆動回路
６…対応処理装置
７…位置演算装置
８…露光量調整装置
１１Ｒ、１１Ｌ…対物レンズ
１３Ｒ、１３Ｌ…ＣＣＤイメージセンサ
１５Ｒ、１５Ｌ…光軸
６２…補正座標テーブル
７３…長方形格子
７４…たる形に中央が膨らんだ画像
[0001]
[Technical Field of the Invention]
The present invention relates to a vehicle environment recognition device that uses stereo vision. More specifically, it relates to a device that is mounted on a vehicle such as an automobile and that recognizes, with the position of that automobile as the reference, the surrounding environment of a scene including the landscape, preceding vehicles, and the like.
[0002]
[Prior art]
Conventionally, the surrounding environment has been recognized by the so-called stereo method: the distance to a target object (also simply called an object) is obtained, on the principle of triangulation, from two images (also called stereo images) captured by a stereo camera using stereo vision, and the position of the object is recognized from that distance.
[0003]
In this stereo method, it is a prerequisite that, when obtaining the distance, a correspondence of the same object can be obtained on two images captured through a lens.
[0004]
As a technique for taking the correspondence of the same object on two captured images, there is a method of focusing on a region in the images.
[0005]
In this method, first, a window having an appropriate size is set on one image, and an area having the same size as the window is set on the other image in order to obtain an area corresponding to the window in the other image.
[0006]
Next, for each pair of corresponding pixels (more precisely, pixels at the same matrix position) in the images inside the two windows (also simply called window images), the pixel data values are subtracted to obtain a difference, and the absolute value of that difference is taken.
[0007]
Then the absolute differences of all pixels in the window are added up to obtain the so-called total sum.
[0008]
This calculation of the total sum of absolute differences between the pixel data values in the windows is repeated while the position of the window on the other image is changed, and the window of the other image for which the total sum is smallest is determined to be the region corresponding to the window of the one image.
[0009]
The present invention also basically employs a method that focuses on the region in the image.
[0010]
[Problems to be solved by the invention]
When images are captured by a stereo camera mounted on a vehicle, for example by two video cameras placed a fixed distance apart on a base line, light carrying image information is led through an optical system, which includes the windshield and the lenses of the video cameras, to an image pickup device such as a CCD area sensor. The image pickup device performs photoelectric conversion and outputs an electric signal, that is, a video signal. This video signal is divided into pixels, converted into image data in the form of a digital signal, and stored in an image memory such as a frame memory.
[0011]
However, windshields, lenses, and the like have distortion. A lens with particularly large distortion exhibits aberrations such as so-called pincushion distortion and barrel distortion, and the conventional technique described above does not take these aberrations into account.
[0012]
Consequently, a horizontal line passing through the center of the lens is normally imaged as a straight line, so the correspondence processing described above can match the images there. In the peripheral part of the lens, however, a horizontal line is imaged as a bent curve, so a correspondence cannot be obtained under the assumption that the line is straight.
[0013]
A technique related to this application is disclosed, for example, in Japanese Patent Application Laid-Open No. H6-282793. That publication, however, merely describes preparing in advance correction values that compensate for the loss of brightness toward the periphery of the lens caused by vignetting; it discloses nothing that provides motivation concerning lens distortion.
[0014]
The present invention has been made in view of this problem. Its object is to provide a vehicle environment recognition device that avoids the state in which the same object cannot be matched because of image distortion arising from distortion of the optical system, such as lens distortion.
[0015]
[Means for Solving the Problems]
As shown, for example, in FIGS. 1 and 8, the present invention has two cameras 1R and 1L, each consisting of an optical unit 11R, 11L that captures light carrying image information and an imaging means 13R, 13L that converts the light obtained through the respective optical unit into an electric signal, the two cameras being installed so that the point at infinity in the front direction of the vehicle lies at the center of each screen obtained from the two imaging means,
and comprises a correspondence processing means 6 that matches images of the same object in the respective images, and
a position calculation means 7 that calculates the distance to the matched object on the principle of triangulation,
wherein the correspondence processing means has a distortion correction means 62 that corrects optical distortion of the acquired screens using correction data prepared in advance with the optical units 11R, 11L and the imaging means 13R, 13L,
wherein, when the correspondence processing means matches images of the same object by moving windows of identical shape in the distortion-corrected left and right acquired screens along the epipolar line EP and detecting the degree of coincidence of the images in the two windows, it moves the window in one screen by a predetermined distance and performs the coincidence detection while moving the window in the other screen within a predetermined range referenced to the position of the window in the one screen, and
wherein the predetermined range is determined according to the horizontal angle of view of the imaging means, the number of pixels in the horizontal direction, the shortest detection distance of an object, and the base line length of the two imaging means.
[0016]
According to the present invention, when the correspondence processing means matches images of the same object in the image signals obtained from the two imaging means, it does so after the distortion correction means has corrected the imaging distortion caused by distortion of the optical units. This avoids the state in which the same object cannot be matched because of image distortion arising from distortion of the optical system, such as lens distortion.
[0017]
Further, according to the present invention, the distortion correction means has, as the correction data, an address correction table that corrects the pixel read addresses of the image memory. When image data stored pixel by pixel in the image memory is read out to match images of the same object, the window in one of the left and right images is moved a predetermined distance (for example, 640 pixels) along the epipolar line, while the window in the other image is moved within a predetermined range determined according to the horizontal angle of view (θ) of the imaging means, the number of pixels in the horizontal direction (N), the shortest detection distance of an object (Zd), and the base line length (D) of the two imaging means. With this simple configuration, the state in which a correspondence cannot be obtained because of image distortion arising from distortion of the optical system, such as lens distortion, is avoided.
[0018]
In this case, the address correction table stores address correction data as signed bit data in association with the uncorrected addresses, and the degree of coincidence is computed using at least one of pipeline processing and parallel processing. This makes it possible to provide each optical unit with its own address correction table, so that an address correction table matched to the distortion characteristics of the optical unit of an imaging means can be attached to that imaging means. The computation time can also be reduced.
[0019]
[Embodiments of the Invention]
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
[0020]
FIG. 1 is a block diagram showing a configuration of an embodiment of the present invention.
[0021]
In FIG. 1, a stereo camera 1 consists of a right video camera 1R (hereinafter also simply called a camera or the right camera) and a left video camera 1L (likewise also called a camera or the left camera). As shown in FIG. 2, the left and right cameras 1R and 1L are installed on the dashboard of an automobile M (also called a vehicle), separated by a predetermined interval, the so-called base line length D. The cameras 1R and 1L are installed on the dashboard parallel to the horizontal plane and so that the point at infinity in the front direction of the vehicle M lies at the center of the image. Because the cameras 1R and 1L are installed on the dashboard, they can be joined into a single unit, which maintains the base line length D.
[0022]
Further, the cameras 1R and 1L are arranged within the wiping range of the wipers of the vehicle M. When the wipers are on the left and right and rotate in the same direction, the cameras are placed at the same position relative to the starting points of the left and right wiper blades, so that the change in the position at which a wiper blade blocks light is the same for the left and right cameras 1R and 1L. This reduces the influence of the wiper blades on the imaging of the recognition target (also referred to as a target object, an object, or simply a target). The optical axes 15R and 15L (see FIG. 1) of the left and right cameras 1R and 1L are set to be parallel on the same horizontal plane.
[0023]
As can be seen from FIG. 1, the right and left cameras 1R and 1L have objective lenses 11R and 11L of the same focal length F, arranged substantially orthogonally to the optical axes 15R and 15L to take in light IL carrying image information; ND filter assemblies 12R and 12L as neutral density filters; and area sensor type CCD image sensors (imaging element units) 13R and 13L for capturing the images formed by the objective lenses 11R and 11L. Taking the right optical system as an example of each optical system (also referred to as an optical unit), the objective lens 11R, one ND filter of the ND filter assembly 12R (described later), and the CCD image sensor 13R constitute a so-called coaxial optical system.
[0024]
The cameras 1R and 1L are also provided with signal processing circuits 14R and 14L, which control various timings such as the readout timing and the electronic shutter time of the CCD image sensors 13R and 13L, and which convert the image pickup signal, that is, the photoelectric conversion signal obtained by scanning the group of image pickup elements constituting the CCD image sensors 13R and 13L, into a video signal.
[0025]
The output signals of the left and right cameras 1R and 1L, in other words the video signals output by the signal processing circuits 14R and 14L, are supplied through the CCUs 2R and 2L, which adjust the amplification gain and the like, to AD converters 3R and 3L having, for example, 8-bit resolution. In practice, a control signal for varying the electronic shutter time is sent from the CCUs 2R and 2L to the signal processing circuits 14R and 14L.
[0026]
The AD converters 3R and 3L convert the analog video signal into a digital image signal, which is stored in the image memories 4R and 4L as a set of pixels with 768 columns in the horizontal direction and 240 rows in the vertical direction. (Hereinafter, as necessary, this signal is also referred to as image data, meaning a set of pixel data; it is in fact video signal data based on luminance, not density.) The image memories 4R and 4L each hold screen images for N frames, in other words, N screens of a raster display. In one embodiment, N is set to a value from 2 to 6. Since two or more images can be held, image capture and correspondence processing can be performed in parallel.
[0027]
In this embodiment, each image memory (also referred to as a pixel memory when the pixels constituting an image are considered) is assumed to consist of as many pixel memories as the number of pixels in the horizontal direction × the number of pixels in the vertical direction. Each pixel memory of the image memories 4R and 4L can store 8-bit data. Note that the data stored in each pixel memory is luminance data, because it is converted from the video signal as described above.
[0028]
Since the images stored in the image memories 4R and 4L are one-screen images as described above, they are also referred to as whole images when this needs to be made explicit.
[0029]
A correspondence processing device 6 is connected to the image memories 4R and 4L. It obtains the corresponding area of the target object by performing a predetermined operation that sequentially compares the image data of a predetermined area of the right image memory 4R with the image data of an area of the same size in the left image memory 4L while changing the position (in practice, the address) of the latter.
[0030]
A position calculation device 7, which calculates the relative position of the target object by triangulation (binocular stereo vision) from the corresponding areas (corresponding address positions) of the target object in the left and right image memories 4R and 4L, is connected to the correspondence processing device 6.
[0031]
Prior to the correspondence processing and position calculation in the correspondence processing device 6 and the position calculation device 7, the exposure amount of the light IL carrying image information that is incident on the CCD image sensors 13R and 13L is optimized on the input side under the control of the exposure adjustment device 8 connected to the image memory 4R.
[0032]
The exposure adjustment device 8 refers to a look-up table or the like, described later, based on the image data of a predetermined area of the image memory 4R, and determines the exposure amount: the amplification gains of the CCUs 2R and 2L, the electronic shutter time of the CCD image sensors 13R and 13L, and the desired filter of the ND filter assemblies 12R and 12L are each determined simultaneously and set to the same value for left and right. (The electronic shutter time is usually called the shutter speed; because its unit is time, specifically the charge accumulation time, it is called the electronic shutter time in this embodiment, and also the electronic shutter speed as needed.)
[0033]
In the ND filter assemblies 12R and 12L, the desired ND filter is selected by switching through the drive circuits 5R and 5L. The switching choices include a so-called clear position (which may be regarded as a transparent ND filter) in which no ND filter is used.
[0034]
Next, the operation of the above embodiment and a more detailed configuration as necessary will be described.
[0035]
FIG. 3 is a plan view, used to explain the principle of triangulation, showing the scene including the target object S being imaged by the left and right cameras 1R and 1L. When the relative position of the target object S is denoted RP, it is represented by the distance Zd in the Z-axis (depth) direction from the position of the known focal length F and the horizontal displacement distance DR from the center position of the right camera 1R in the X-axis (horizontal) direction. That is, RP = RP (Zd, DR). Of course, the relative position RP can equally be represented by the distance Zd and the horizontal displacement distance DL from the center position of the left camera 1L in the X-axis direction, that is, RP = RP (Zd, DL).
[0036]
FIG. 4A shows an image (also referred to as the right image) IR including the target object S captured by the right camera 1R, and FIG. 4B shows an image (also referred to as the left image) IL including the same target object S captured by the left camera 1L. It is assumed that these images IR and IL are stored in the image memories 4R and 4L, respectively. The target object image SR in the right image IR and the target object image SL in the left image IL have parallaxes dR and dL with respect to the center lines 35 and 36 in the X-axis direction of the images IR and IL, respectively. The target object image SR and the target object image SL lie on an epipolar line (line-of-sight image) EP. When the target object S is at the point at infinity, the target object image SR and the target object image SL are imaged at the same position on the center lines 35 and 36, and the parallaxes become dR = dL = 0.
[0037]
The parallaxes dR and dL shown in FIG. 3 on the CCD area sensors 13R and 13L have the opposite polarity to the parallaxes dR and dL shown in FIGS. 4A and 4B on the images IR and IL, but the same polarity can be obtained by changing the reading direction from the CCD area sensors 13R and 13L. The polarities can also be matched by appropriately setting the number of lenses provided in the optical unit.
[0038]
From FIG. 3, it can be seen that the following equations (1) to (3) hold.
[0039]
DR: Zd = dR: F (1)
DL: Zd = dL: F (2)
D = DR + DL (3)
From these equations (1) to (3), the distance Zd, the shift distance DR, and the shift distance DL can be obtained by the equations (4) to (6), respectively.
[0040]
Zd = F × D / (dL + dR) (4)
DR = dR × D / (dL + dR) (5)
DL = dL × D / (dL + dR) (6)
By clustering this position information (the distance Zd, the shift distance DR, and the shift distance DL) and attaching a so-called ID (identification) as an identification code for the target object S, the device can be applied to a vehicle tracking device or the like.
[0041]
In practice, it is difficult to measure the physical size of one effective pixel of the CCD image sensors 13R and 13L and the focal length F, so the distance Zd and the shift distances DR and DL are obtained using the angle of view, which can be measured relatively accurately.
[0042]
That is, let θ be the horizontal angle of view of the cameras 1R and 1L, N the effective number of pixels in the horizontal direction of the cameras 1R and 1L (equal to the number of horizontal pixels of the image memories 4R and 4L), and NR and NL the numbers of pixels on the image memories 4R and 4L corresponding to the parallaxes dR and dL. Then the distance Zd, the shift distance DR, and the shift distance DL can be obtained from the following equations (7) to (9).
[0043]
Zd = N × D / {2 (NL + NR) tan (θ / 2)} (7)
DR = NR · D / (NL + NR) (8)
DL = NL · D / (NL + NR) (9)
Here, the horizontal angle of view θ is a measurable value, the number N of effective pixels in the horizontal direction (N = 768 in this embodiment, as described above) is predetermined, and the numbers of pixels NR and NL corresponding to the parallaxes dR and dL can be read from the captured images.
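Equations (7) to (9) can be illustrated with a minimal Python sketch. The function name relative_position and its parameter names are hypothetical; the default values N = 768, D = 0.5 m, and θ = 40° are the example values quoted elsewhere in this embodiment and are used here only as assumptions.

```python
import math

def relative_position(nr, nl, n_pixels=768, baseline_m=0.5, fov_deg=40.0):
    """Return (Zd, DR, DL) in metres from the pixel parallaxes NR and NL.

    Follows equations (7) to (9). The defaults are example values only.
    """
    total = nr + nl
    if total <= 0:
        # Zero parallax corresponds to a target at the point at infinity.
        raise ValueError("parallax sum must be positive")
    half_angle = math.radians(fov_deg) / 2.0
    zd = n_pixels * baseline_m / (2.0 * total * math.tan(half_angle))  # (7)
    dr = nr * baseline_m / total                                       # (8)
    dl = nl * baseline_m / total                                       # (9)
    return zd, dr, dl
```

For instance, relative_position(60, 45) corresponds to a total parallax of 105 pixels, which places the target at roughly 5 m with the assumed defaults.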
[0044]
Next, the entire process from the above-described image capture to ID assignment will be described with reference to a flowchart, as shown in FIG.
[0045]
That is, the video signal data output from the AD converters 3R and 3L are captured and stored in the image memories 4R and 4L, respectively (step S1).
[0046]
Subsequent to step S1, an image corresponding to an image of a certain area stored in the image memory 4R is obtained from the image memory 4L, and a so-called left / right correspondence of the image is obtained (step S2).
[0047]
After taking the correspondence, the parallaxes dR and dL in the cameras 1R and 1L are obtained and converted into position information (step S3).
[0048]
The position information is clustered (step S4), and an ID is assigned (step S5).
[0049]
The ID-tagged output of the position calculation device 7 is not a main part of the present invention and will not be described in detail, but an automatic driving system can be configured using it. Such an automatic driving system can perform operations such as warning the driver, avoiding a collision of the automobile M (the own vehicle equipped with the stereo camera 1), and automatically following a preceding vehicle.
[0050]
In this embodiment, step S2 for associating the left and right images basically adopts the method focusing on regions in the image described in the section on the related art, rather than the so-called feature-based method.
[0051]
In other words, this embodiment does not adopt the method that extracts features such as edges, line segments, or special shapes and treats the portions whose features match as corresponding portions, because that method reduces the amount of information handled. Instead, a small area surrounding the target object image SR, a so-called window, is cut out from one image, the right image IR, and a small area similar to it is searched for in the other image, the left image IL, to determine the correspondence.
[0052]
In the region-based method adopted in this embodiment, the same target object S is matched between the two images IL and IR as follows: a window of an appropriate size is set on one image, and, to find the area corresponding to this window in the other image, an area of the same size as the window is set in the other image.
[0053]
Next, for each pair of corresponding pixels (specifically, pixels at the same matrix position within the window images) constituting the images in the windows on the two images (also simply referred to as window images), the pixel data values, that is, the luminance values, are subtracted to obtain a difference, and the absolute value of the luminance difference is then obtained.
[0054]
Then, a sum in the window of the absolute value of the luminance difference for each corresponding pixel, that is, a so-called sum is obtained.
[0055]
This sum is defined as the degree of coincidence (also called the degree of correspondence) H between the left and right images. Let IR (x, y) and IL (x, y) be the luminance (pixel data value) at the coordinate point (x, y) in the right image IR and the left image IL, respectively, let the window be n pixels wide and m pixels high, and let the shift amount be dx (described later). The degree of coincidence H is then obtained by the following equation (10).
[0056]
H (x, y) = Σ (j = 1 → m) Σ (i = 1 → n) | Id | (10)
where
| Id | = | IR (x + i, y + j) − IL (x + i + dx, y + j) |.
The symbol Σ (i = 1 → n) represents the sum of | Id | from i = 1 to i = n, and the symbol Σ (j = 1 → m) represents the sum of the results Σ (i = 1 → n) | Id | from j = 1 to j = m.
[0057]
From equation (10), it can be seen that the smaller the degree of coincidence H, in other words, the smaller the sum of the absolute values of the luminance differences, the better the left and right window images match.
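The degree of coincidence of equation (10) can be sketched in Python as follows. This is only an illustration: the function name coincidence is hypothetical, and the images are assumed to be lists of rows of integer luminance values indexed as image[row][column].

```python
def coincidence(right, left, x, y, dx, n, m):
    """Sum of absolute luminance differences over an n x m window (equation (10)).

    right, left: images as lists of rows (an assumed layout).
    A smaller return value means a better match.
    """
    total = 0
    for j in range(1, m + 1):        # rows y+1 .. y+m
        for i in range(1, n + 1):    # columns x+1 .. x+n
            total += abs(right[y + j][x + i] - left[y + j][x + i + dx])
    return total
```

When the left window contains the same pattern as the right window displaced by dx, the function returns 0, the best possible degree of coincidence.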
[0058]
If the window, that is, the small area, is too large, it becomes more likely that other objects at different relative distances Zd are present in the area at the same time, and erroneous correspondences become more likely. Conversely, if the small area is too small, erroneous correspondences at wrong positions or caused by noise increase. From various experimental results, the present inventors found that the number of erroneous correspondences is smallest when the small area is about n = 7 to 9 pixels in the horizontal direction and about m = 12 to 15 pixels in the vertical direction.
[0059]
FIGS. 6 and 7 illustrate how the areas are moved when the correspondence processing device 6 performs the correspondence calculation for obtaining the degree of coincidence H.
[0060]
As shown in FIG. 6, the predetermined area (also referred to as the small area or the original area) 31 on the right image IR, for which a correspondence is sought, is shifted rightward one pixel at a time over 640 pixels from the left end position in the X-axis direction. For the predetermined area (also referred to as the small area or the search area) 32 of the left image IL, the correspondence calculation starts from the position corresponding to the left end position of the original area 31 of the right image IR (hereinafter also referred to as the displacement position of the original area 31), and is repeated while the shift amount dx is moved rightward along the epipolar line EP one pixel at a time from 0 to 127 pixels. With the maximum shift of 127 pixels, the degree of coincidence H is calculated (640 − n) × 128 times in total.
[0061]
The search is limited to 128 pixels for the following reason. With a horizontal angle of view θ = 40°, a shortest distance Zd = 5 m, N = 768 pixels in the horizontal direction for the usable stereo camera 1 (cameras 1R and 1L), and an installable base line length D = 0.5 m, the following equation (11) gives NL + NR = 105 pixels. 2^7 = 128, a power of 2 close to this value and convenient for hardware, was therefore selected.
[0062]
NL + NR = (N × D) / {Zd × 2 × tan (θ / 2)}
= (768 × 0.5) / (5 × 2 × tan 20°) (11)
This means that a target imaged at X = 0 (the left end) in the right image IR is always imaged in the left image IL between the 0th and the 127th pixel position, corresponding to a shift amount dx = 0 to 127. In other words, taking the X coordinate value (also referred to as the displacement position) X = 0 of the original area 31 as the reference, the target appears in the left image IL within the range dx = 0 to 127 measured from X = 0. Similarly, a target in the original area 31 of the right image IR at X = 640 − n appears in the left image IL within the range dx = 0 to 127 measured from X = 640 − n.
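The arithmetic of equation (11) can be checked with a short Python sketch. The function name search_range is hypothetical, and rounding up to the next power of two is an assumed rule for arriving at 128; the text only states that a power of 2 close to 105 and convenient for hardware was chosen.

```python
import math

def search_range(n_pixels, baseline_m, min_distance_m, fov_deg):
    """Return (maximum parallax in pixels, search width as a power of two).

    The parallax follows equation (11). The power-of-two rounding is an
    assumption made for this illustration.
    """
    half_angle = math.radians(fov_deg / 2.0)
    parallax = n_pixels * baseline_m / (min_distance_m * 2.0 * math.tan(half_angle))
    needed = math.ceil(parallax)
    width = 1 << (needed - 1).bit_length()   # smallest power of two >= needed
    return parallax, width
```

With the embodiment's values, search_range(768, 0.5, 5.0, 40.0) gives a parallax of about 105.5 pixels and a search width of 128.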
[0063]
At this point, the rightmost pixel of the search area 32 is the rightmost pixel of the image, with X coordinate value X = (640 − n) + n + 127 = 767 (the 768th pixel), so shifting the original area 31 of the right image IR further to the right is generally pointless. This is because a target lying to the right of X = 640 − n in the right image IR is, in general, not captured in the left image IL. However, the calculation can still be meaningful for a distant target, so in the present invention the calculation is performed provisionally by assuming that pixels with no corresponding image have the maximum 8-bit value of 255. To save memory and calculation time, it is effective to stop at X = 640 − n.
[0064]
Accordingly, as shown in the flowchart of FIG. 7, the original area 31 at the displacement position X = 0 in the right image IR is first extracted (step S11), and the shift amount dx of the search area 32 of the left image IL is set to dx = 0 (step S12).
[0065]
Next, it is determined whether the shift amount dx exceeds 127, that is, whether dx = 128 (step S13).
[0066]
If this determination is negative, the pixel data of the search area (small area) 32 of the left image IL is extracted in order to calculate the degree of coincidence H (step S14).
[0067]
Next, the sum of the absolute values of the differences between the pixels of the small area 31 and the small area 32, that is, the coincidence H shown in the equation (10) is obtained and stored (step S15).
[0068]
Next, the shift amount dx is increased by one pixel as dx → dx + 1 (in this case, dx = 1) (step S16).
[0069]
Since the determination in step S13 is again negative, the search area 32 for the shift amount dx = 1 is extracted (step S14 again), and the degree of coincidence H between this search area 32 and the original area 31 at the X coordinate value (also referred to as the displacement position) X = 0 is calculated and stored (step S15 again).
[0070]
Similarly, the degree of coincidence H is calculated for the original area 31 where the X coordinate value X is X = 0 until the shift amount dx becomes dx = 128 (until the determination in step S13 is satisfied).
[0071]
When the determination in step S13 becomes affirmative, the minimum value Hmin, which is a negative peak value, and the values in its vicinity are obtained from the degrees of coincidence H calculated for the original area 31 at X = 0, and are stored (step S17).
[0072]
Next, although this is omitted from the flowchart of FIG. 7 to avoid complexity, steps S11 to S17 are repeated for the displacement positions X = 1 to 767 (or 640 − n) in the right image IR, so that the search area 32 of the left image IL that best corresponds to the original area 31 of the right image IR is detected at each displacement position X.
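The loop of steps S11 to S17 can be sketched in Python as follows. This is a simplified illustration, not the device's implementation: the function name best_shifts and the list-of-rows image layout are assumptions, only the minimum Hmin is kept (the values near Hmin that step S17 also stores are omitted), and pixels beyond the right edge of the left image are treated as 255, following the provisional calculation described above.

```python
def best_shifts(right, left, y, n, m, x_count, max_shift=128):
    """For each displacement position X, return (dx at Hmin, Hmin)."""
    results = []
    for x in range(x_count):                      # S11: original area at X
        best_dx, h_min = None, None
        for dx in range(max_shift):               # S12, S13, S16: dx = 0 .. max_shift-1
            h = 0
            for j in range(m):                    # S14, S15: sum of |differences|
                for i in range(n):
                    col = x + i + dx
                    row = left[y + j]
                    left_value = row[col] if col < len(row) else 255
                    h += abs(right[y + j][x + i] - left_value)
            if h_min is None or h < h_min:        # S17: keep the minimum
                best_dx, h_min = dx, h
        results.append((best_dx, h_min))
    return results
```

The shift dx at which H is smallest is the pixel parallax for that displacement position, which step S3 then converts into position information.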
[0073]
FIG. 8 is a block diagram showing a detailed configuration of the correspondence processing device 6 that calculates the degree of coincidence H based on the operation explanatory diagram of FIG. 6 and the flowchart of FIG.
[0074]
In FIG. 8, the scan coordinate generation unit 61 generates the coordinates of the original area 31 for the right image IR and of the search area 32 for the left image IL shown in FIG. 6 (the displacement position X, the shift amount dx, and the Y coordinate value of the epipolar line EP described above).
[0075]
Based on the coordinates (X, Y) generated by the scan coordinate generation unit 61, the image memory address generation unit 64 generates the address data of the small areas to be read from the image memories 4R and 4L. In this embodiment, address correction coordinates (Δx, Δy) for correcting the imaging distortion caused by distortion of the optical units including the lenses 11R and 11L are additionally read, according to the coordinates (X, Y) generated by the scan coordinate generation unit 61, from a correction coordinate table (distortion correcting means, address correction table) 62 and are supplied to the image memory address generation unit 64.
[0076]
Therefore, the corrected coordinates (X + Δx, Y + Δy) added by the adder 69 are supplied to the image memory address generator 64. Based on the corrected coordinates (X + Δx, Y + Δy), read address data for the image memories 4R and 4L is generated by the image memory address generator 64, and supplied to the image memories 4R and 4L, respectively.
[0077]
The calculation of the degree of coincidence H based on the image data read from the image memories 4R and 4L, that is, the so-called correlation operation is performed by the correlation operation unit 65, and the result of the correlation operation is stored in the correlation memory 67. Further, the peak value of the correlation operation result, that is, the minimum value Hmin of the degree of coincidence H or the like is detected by the peak value detection unit 66 corresponding to the shift amount dx, and the detected peak value is stored in the peak value memory 68.
[0078]
The address correction coordinates (Δx, Δy) stored in the correction coordinate table 62 are provided per optical system, for the video camera 1R of the right optical system and for the video camera 1L of the left optical system, and the two sets of correction coordinates may differ.
[0079]
The address correction coordinates (Δx, Δy) are data for correcting distortion of the optical system, that is, the aberration by which a planar object perpendicular to the optical axis is not imaged as a similar figure on the imaging surface perpendicular to the optical axis, here the imaging surfaces of the CCD area sensors 13R and 13L.
[0080]
For example, suppose that the object, a planar object perpendicular to the optical axis, is a rectangular lattice 73 made up of rectangles (or squares) as in the example of FIG. 9, and that the captured image is an image 74 whose central portion bulges in a barrel shape as shown in FIG. 10.
[0081]
Considering the two images superimposed as shown in FIG. 11, it can be seen that the image can be read at accurate position coordinates by reading from the image memories 4R and 4L with read address data based on the corrected coordinates (X + Δx, Y + Δy), obtained by adding the address correction coordinates (Δx, Δy) read from the correction coordinate table 62 to the coordinates (X, Y) generated by the scan coordinate generation unit 61.
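The corrected read can be sketched in a few lines of Python. The function name read_pixel and the layout of the table (a two-dimensional list of (Δx, Δy) pairs indexed by the ideal coordinates) are assumptions made for illustration; the actual table format of the device is not specified at this level of detail.

```python
def read_pixel(image, correction_table, x, y):
    """Read the pixel for ideal coordinates (x, y) at the corrected address.

    correction_table[y][x] holds the pair (delta_x, delta_y); the pixel is
    read from (x + delta_x, y + delta_y). Both structures are assumed to be
    lists of rows.
    """
    delta_x, delta_y = correction_table[y][x]
    return image[y + delta_y][x + delta_x]
```

Because the correction is applied to the read address, the stored image itself is never resampled; only the lookup position changes.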
[0082]
The address correction coordinates (Δx, Δy) are obtained by placing a light bulb at a point whose spatial position is known, finding where the light bulb appears in the image, and storing the deviation from its distortion-free position in the table as the address correction coordinates (Δx, Δy). Because processing every point would take too long, correction data is measured at coarsely spaced points, and values approximated by a polynomial are used for the points in between.
[0083]
In this embodiment, each of the address correction coordinates (Δx, Δy), which are the correction amounts, is a signed 4-bit value, so that a correction of −8 to 7 pixels can be applied to both the X coordinate and the Y coordinate. In the vertical direction, distortion of the lenses 11R and 11L and the like of up to about 6% (7.5 / 120 × 100) is allowed.
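A signed 4-bit correction value with the range −8 to 7 can be illustrated as follows. The two's complement encoding and the packing of Δx and Δy into one byte are assumptions made for this sketch, and all function names are hypothetical; the text states only that signed 4-bit values are used.

```python
def encode_nibble(value):
    """Encode an integer in -8..7 as a 4-bit two's complement pattern."""
    if not -8 <= value <= 7:
        raise ValueError("correction must be in the range -8 to 7")
    return value & 0xF

def decode_nibble(bits):
    """Decode a 4-bit two's complement pattern back to an integer."""
    return bits - 16 if bits & 0x8 else bits

def pack(delta_x, delta_y):
    """Pack (delta_x, delta_y) into one byte (an assumed storage layout)."""
    return (encode_nibble(delta_x) << 4) | encode_nibble(delta_y)

def unpack(byte):
    """Recover (delta_x, delta_y) from a packed byte."""
    return decode_nibble(byte >> 4), decode_nibble(byte & 0xF)
```

Under this assumed layout, one table entry per pixel address costs a single byte, which is small enough for a table to be supplied with each optical unit.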
[0084]
In this case, the address correction coordinates (Δx, Δy) are obtained from actual measurements rather than by calculation, which has the advantage that irregular distortion can be corrected as well.
[0085]
The reason for and the advantage of providing the correction coordinate table 62 are restated here in relation to the problem. When the left and right images are matched, the lenses 11R and 11L have aberration, as described with reference to FIG. 9 and FIG. 10: vertical and horizontal straight lines passing through the center (the center of the lenses 11R and 11L, which is also the center of the images IR and IL) are imaged as straight lines, but straight lines that do not pass through the center are imaged as barrel-shaped or pincushion-shaped curves. As a result, the corresponding areas 31 and 32 are displaced vertically between the left and right video cameras 1R and 1L, and no correspondence can be obtained if the search is performed at the same height along the epipolar line EP.
[0086]
Therefore, when taking correspondence in a region distant from the center, the distortion must be corrected before the correspondence is taken.
[0087]
In this embodiment, the scan coordinate generation unit 61 generates the coordinates (X, Y) of the areas 31 and 32 to be matched as they would be in the ideal, distortion-free state. The correction coordinates (Δx, Δy), stored in advance as correction values in the correction coordinate table 62 serving as a look-up table, are then read for each coordinate value (X, Y), combined with it by the adder 69, and supplied to the image memory address generation unit 64, which thus generates an accurate memory address corresponding to the so-called true coordinates.
[0088]
By reading the image data from the image memories 4R and 4L with the corrected memory addresses and supplying it to the correlation operation unit 65, correspondence can be taken by the correlation operation unit 65 even if the lenses 11R and 11L have some distortion.
[0089]
Next, FIG. 12 shows a detailed configuration of the correlation operation unit 65 for obtaining the degree of coincidence H described with reference to FIGS.
[0090]
The correlation operation unit 65 basically has first to fourth operation blocks 81, 82, 83, and 84 and employs a parallel, so-called pipeline, processing method.
[0091]
For ease of understanding, the operation for obtaining the degree of coincidence H is first described for the first operation block 81 alone, without considering the pipeline processing, specifically assuming that the FIFO memory 65i does not exist. As described above, erroneous correspondences are fewest when each small area (the original area 31 and the search area 32) is about n = 7 to 9 pixels in the horizontal direction and about m = 12 to 15 pixels in the vertical direction. In this example, however, n = 4 and m = 5 are used for ease of understanding.
[0092]
FIG. 13 shows an example of virtual right image data Ird on the epipolar line EP under such a premise. It is assumed that the total number of pixel data to be processed in the original area 31 is m × 640 = 5 × 640.
[0093]
FIG. 14 similarly shows an example of virtual left image data Ild riding on the epipolar line EP. It is assumed that the total number of pixel data targeted for the search area 32 is m × 768 = 5 × 768.
[0094]
In FIG. 12, the right image data Ird of the original area 31 is supplied from the image memory 4R through the terminal 85 to the minuend input terminal of the subtractor 65a, and the left image data Ild of the search area 32 is supplied from the image memory 4L through the terminal 86 to the subtrahend input terminal of the subtractor 65a.
[0095]
First, in general terms, the subtractor 65a calculates the difference between left and right pixel data along the vertical direction, and the absolute value calculator 65b calculates the absolute value of that difference. The adder 65c accumulates the sum of these absolute differences in the vertical direction by adding each new absolute difference to the running sum of the preceding rows held in the latch 65d.
[0096]
The FIFO memory 65e has n stages, corresponding to the number n of pixels in the horizontal direction; in this example it holds, for the four (= n) columns to the left of (preceding) the current column, the sums of the absolute values of the left-right pixel data differences. That is, in this example the FIFO memory 65e has four stages, from the first (input side) memory 65e1 to the last (output side) memory 65e4.
[0097]
More specifically, the absolute value | A1-a1 | of the difference between the left and right pixel data in the first column and the first row appears on the output side of the adder 65c in the first operation (the first column and the first row), The value | A1-a1 | is held in the latch 65d.
[0098]
In the second operation (first column, second row), the sum of the absolute value | A2-a2 | of the difference between the left and right pixel data in the first column and second row and the data | A1-a1 | held in the latch 65d, that is, | A2-a2 | + | A1-a1 |, appears on the output side of the adder 65c.
Therefore, after the fifth operation, the sum Σ (1) of the absolute values of the differences between the left and right pixel data in the first column, shown in the following equation (13), appears on the output side of the adder 65c and is held in the latch 65d (the column sums are written Σ (1), Σ (2), Σ (3), Σ (4), ..., Σ (641) below). The sum Σ (1) is then stored in the first memory 65e1 of the FIFO memory 65e.
[0099]
Σ (1) = | A1-a1 | + | A2-a2 | + | A3-a3 | + | A4-a4 | + | A5-a5 | (13)
After the sum Σ (1) of the absolute values of the differences between the left and right pixel data in the first column is stored in the first memory 65e1, the latch 65d is reset by the control signal supplied from the terminal 89.
[0100]
FIG. 15 schematically shows the data values stored in the latch 65d, the FIFO memory 65e, the latch 65h, and so on after the calculation has proceeded in this manner through the fourth column (4 = n) and fifth row (5 = m), that is, when the first calculation between the small areas 31 and 32 for the shift amount dx = 0 is complete.
[0101]
In FIG. 15, it should be noted that, when the value of the shift amount dx is dx = 0, the first degree of coincidence H0 shown in the following equation (14) appears on the output side of the adder 65g.
[0102]
H0 = Σ① + Σ② + Σ③ + Σ④   (14)
Next, FIG. 16 shows the state corresponding to FIG. 15 after completion of the calculation for the fifth column and fifth row. As can be seen from FIG. 16, the degree of coincidence H0 for the search area 32 at a shift amount dx = 0 appears at the output terminal 90.
[0103]
In this case, the difference ⑤ - ① between the fifth-column data ⑤ and the first-column data ① appears on the output side of the adder 65f, and the degree of coincidence H1 shown in the following equation (15), for the search area 32 at a shift amount dx = 1, is obtained.
[0104]
H1 = Σ② + Σ③ + Σ④ + Σ⑤   (15)
Here, consider the actual case in which a 15 × 15 small area is moved in the horizontal direction from X = 0 to 639 and the degree of coincidence H is obtained for each shift amount dx up to dx = 128. In this embodiment, when the degree of coincidence H is obtained at a position shifted by one pixel to the right on the left image IL for the original area 31, the vertical sum of the leftmost column (Σ① in the above example) is subtracted and the vertical sum of the new column added on the right (Σ⑤ in the above example) is added, so the number of operations is 15 × 640 × 128 = 1,228,800. That is, the calculation time is independent of the width (number of pixels) of the small area in the horizontal direction.
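The running-sum scheme described above can be sketched in Python as follows. This is only an illustrative software model, not the circuit of FIG. 12: the function names, the list-of-rows image layout, and the choice to slide the window position x at a fixed shift dx are assumptions made for the example. The direct computation is included so the two results can be compared.

```python
def sad_direct(right, left, x, dx, n):
    """Degree of coincidence H computed directly: the sum of absolute
    differences over all rows and the n columns of the window at position x."""
    return sum(
        abs(r_row[x + j] - l_row[x + dx + j])
        for r_row, l_row in zip(right, left)
        for j in range(n)
    )


def sad_running(right, left, dx, n):
    """H for every window position x at a fixed shift dx, using column sums."""
    width = min(len(right[0]), len(left[0]) - dx)
    # Vertical sum of |right - left| for each column (the circled sums in the text).
    col = [
        sum(abs(r_row[c] - l_row[c + dx]) for r_row, l_row in zip(right, left))
        for c in range(width)
    ]
    h = sum(col[:n])  # first window: add up the first n column sums
    result = [h]
    for x in range(1, width - n + 1):
        # Drop the column that left the window, add the one that entered.
        h += col[x + n - 1] - col[x - 1]
        result.append(h)
    return result
```

After the first window, each further window position costs one subtraction and one addition of column sums, regardless of the window width n.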
[0105]
If the calculation were not performed as in the above example, the 15 × 15 small area would be moved and the differences between all the pixel data of the two areas would be recalculated at every position. For X = 0 to 639 and shift amounts dx up to 128, the number of operations would be 15 × 15 × 640 × 128 = 18,432,000. Even if the operation time of the absolute value operation unit 65b, which requires the longest time per operation, is 100 ns, the total calculation time would be 1843 ms. In the above example, by contrast, the total operation time is 123 ms, a reduction to about 1/15.
[0106]
However, the total operation time of 123 ms is still longer than the NTSC frame period of 33 ms. To calculate the degree of coincidence H at the frame rate, in other words for every screen, the total operation time of 123 ms must be reduced to about 1/4 or less.
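The operation counts and times quoted above can be checked with a few lines of Python. The variable names are illustrative; the figures (15 × 15 window, 640 positions, 128 shift amounts, 100 ns per operation, 33 ms frame period) are the ones stated in the text.

```python
import math

ROWS, COLS = 15, 15        # small-area size in pixels
POSITIONS = 640            # window positions X = 0 .. 639
SHIFTS = 128               # shift amounts dx
OP_TIME_NS = 100           # time per operation assumed in the text
FRAME_PERIOD_MS = 33       # NTSC frame period used in the text

direct_ops = ROWS * COLS * POSITIONS * SHIFTS   # recompute every window in full
running_ops = ROWS * POSITIONS * SHIFTS         # one new column per step

direct_ms = direct_ops * OP_TIME_NS / 1e6
running_ms = running_ops * OP_TIME_NS / 1e6

# Number of parallel blocks needed to fit the running-sum scheme in one frame.
blocks_needed = math.ceil(running_ms / FRAME_PERIOD_MS)

print(direct_ops, running_ops)               # operation counts
print(round(direct_ms), round(running_ms))   # total times in ms
print(blocks_needed)
```

The ratio of the two operation counts is exactly the window width, 15, which is where the "about 1/15" figure comes from.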
[0107]
Therefore, in this embodiment, as shown in FIG. 12, second to fourth operation blocks 82, 83, and 84 having the same configuration as the first operation block 81 are provided, and FIFO memories 65i, with as many stages as the number m of pixels in the vertical direction, are connected in series.
For simplicity, the pipeline processing operation will be described using the same image data as in FIGS. 13 and 14. First, the pixel data a1 to a5 of the first left column are transferred to the FIFO memory 65i of the third operation block 83 via the FIFO memories of the first and second operation blocks 81 and 82. At the time of this transfer, the pixel data b1 to b5 of the second left column have been transferred to the FIFO memory 65i of the second operation block 82, and the pixel data c1 to c5 of the third left column have been transferred to the FIFO memory 65i of the first operation block 81.
[0108]
Next, when the pixel data d1 to d5 of the fourth left column are sequentially transferred to the FIFO memory 65i of the first operation block 81, the fourth operation block 84 performs the above-described calculation on the pixel data A1 to A5 of the first right column and the pixel data a1 to a5 of the first left column. Likewise, the third operation block 83 performs the calculation on A1 to A5 and the pixel data b1 to b5 of the second left column, the second operation block 82 on A1 to A5 and the pixel data c1 to c5 of the third left column, and the first operation block 81 on A1 to A5 and the pixel data d1 to d5 of the fourth left column.
[0109]
Next, when the pixel data e1 to e5 of the fifth left column are sequentially transferred to the FIFO memory 65i of the first operation block 81 in synchronization with the transfer of the pixel data B1 to B5 of the second right column, the fourth operation block 84 performs the calculation on B1 to B5 and the pixel data b1 to b5 of the second left column, the third operation block 83 on B1 to B5 and the pixel data c1 to c5 of the third left column, the second operation block 82 on B1 to B5 and the pixel data d1 to d5 of the fourth left column, and the first operation block 81 on B1 to B5 and the pixel data e1 to e5 of the fifth left column.
[0110]
Continuing in this way, with the pixel data f1 to f5 of the sixth left column sequentially transferred in synchronization with the transfer of the pixel data C1 to C5 of the third right column, and so on, the fourth operation block 84 can calculate the degree of coincidence H for shift amounts dx = 0, 4, ...; the third operation block 83 for dx = 1, 5, ...; the second operation block 82 for dx = 2, 6, ...; and the first operation block 81 for dx = 3, 7, ..., all simultaneously.
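The way the shift amounts are divided among the four operation blocks can be written as a simple modulo rule. The sketch below is only a bookkeeping illustration of that assignment; the function names and the default of 128 shift amounts (dx = 0 to 127) are assumptions for the example, and the block numbers are the reference numerals used in the text.

```python
# Reference numerals of the operation blocks, ordered by the remainder dx % 4
# they handle: block 84 takes dx = 0, 4, ...; block 81 takes dx = 3, 7, ...
BLOCKS = (84, 83, 82, 81)


def block_for_shift(dx):
    """Operation block that computes the degree of coincidence for shift dx."""
    return BLOCKS[dx % len(BLOCKS)]


def shifts_for_block(block, num_shifts=128):
    """All shift amounts handled by one block when dx runs over 0..num_shifts-1."""
    return [dx for dx in range(num_shifts) if block_for_shift(dx) == block]
```

With 128 shift amounts, each block handles 32 of them, which is consistent with the roughly fourfold reduction in calculation time.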
[0111]
As described above, the calculation time can be reduced to about 1/4 by performing four parallel processes of the pipeline system. As can be understood from the above description, the FIFO memory 65i in the fourth operation block 84 is unnecessary.
[0112]
In this case, with the four-parallel operation in the example of FIG. 12, distance information for 640 points of one frame image is obtained within one frame period, and processing of a band area of 768 × 15 pixels of the left image IL is completed. Considering that one image is 768 × 240 pixels, however, this band is only 1/16 of the entire image area.
[0113]
In addition, if the mounting positions of the left and right cameras 1R and 1L are displaced in the vertical direction, the corresponding target image may not lie on the initial epipolar line EP. In this case, although not shown, four band regions of 768 pixels horizontally by 15 pixels vertically can be processed within one frame period by, for example, providing the corresponding processing device 6 of FIG. 9 four times in parallel so that the image is processed four ways in parallel in the vertical direction. If the regions are kept from overlapping, (640-n) × 4 points of distance information, detectable up to a shift of 127 pixels, can be output within one frame period.
[0114]
Through the processing of the correlation operation unit 65 in the example of FIG. 12, for each of the 640 original areas 31 in the right image IR on one epipolar line EP, 128 degrees of coincidence H are calculated for the search areas 32 at shift amounts dx = 0 to 127, and the calculated degrees of coincidence H are stored in the correlation memory 67.
[0115]
For each original area 31, that is, for each transition position X, the peak value detection unit 66 detects the minimum value (also referred to as the peak value) of the degree of coincidence H among the 128 search areas 32. The detected peak value (minimum value) Hmin is stored in the peak value memory 68 in association with the transition position X and the shift amount dx at that time. The peak value memory 68 functions as a storage table for the peak values (minimum values) of the degree of coincidence H.
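In software terms, the peak detection amounts to finding, for each transition position X, the shift amount at which H is smallest. The following is a minimal sketch under assumed data structures: the correlation memory is modeled as a dict mapping X to a list of H values indexed by dx, and the function names are hypothetical.

```python
def detect_peak(h_values):
    """Return (dx_min, h_min): the shift amount with the smallest degree of
    coincidence H and that smallest value. Ties resolve to the smallest dx."""
    dx_min = min(range(len(h_values)), key=lambda dx: h_values[dx])
    return dx_min, h_values[dx_min]


def build_peak_table(correlation):
    """Model of the peak value table: transition position X -> (dx_min, h_min)."""
    return {x: detect_peak(h_values) for x, h_values in correlation.items()}
```

For example, `detect_peak([9, 7, 3, 5, 8])` returns `(2, 3)`.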
[0116]
The correlation memory 67, in which the degree of coincidence H is stored using the transition position X and the shift amount dx as addresses, and the peak value memory 68, in which the peak value Hmin (the minimum value) is stored, are connected to the position calculation device 7.
[0117]
The position calculation device 7 refers to the coincidence H and its peak value Hmin, and obtains the position P of the target object S in the three-dimensional space based on the flowchart shown in FIG.
[0118]
The method of calculating the position P will be described for the original area 31 at a predetermined transition position X = Xp.
[0119]
First, the peak value Hmin of the degree of coincidence H of the original area 31 at the predetermined transition position Xp, and the shift amount dx at that time (referred to as the shift amount dxmin), are fetched from the peak value memory 68 (step S21).
[0120]
Next, the degrees of coincidence H two positions to either side of the shift amount dxmin, that is, the degrees of coincidence Hmin-2 and Hmin+2 at the shift amounts dxmin-2 and dxmin+2, are fetched (step S22).
[0121]
Next, a valley depth (also referred to as a peak depth) Q is obtained based on the following equation (16) (step S23).
[0122]
$Q = \min\{H_{min-2} / H_{min},\ H_{min+2} / H_{min}\}$   (16)
Equation (16) means that the smaller of the two ratios of the degrees of coincidence Hmin-2 and Hmin+2, located two positions to either side of the peak, to the peak value Hmin is taken.
[0123]
Then, it is determined whether the valley depth Q is equal to or greater than a predetermined threshold TH (Q ≧ TH) (step S24). If it is, the search area 32 having the peak value Hmin and the shift amount dxmin is identified as the area corresponding to the original area 31 at the predetermined transition position Xp, and the process proceeds to the next step S25.
[0124]
On the other hand, if the result of step S24 is negative, it is determined that the search area 32 having the peak value Hmin and the shift amount dxmin is not the area corresponding to the original area 31 at the predetermined transition position Xp. It is then determined whether the processing for all the transition positions X has been completed (step S28). If it has not, steps S21 to S24 are repeated to obtain the search area 32 corresponding to the original area 31 at the next transition position Xp+1.
[0125]
In this embodiment, the search area 32 at the peak value Hmin of the degree of coincidence H is not immediately identified as corresponding to the original area 31 at the transition position Xp. Instead, the vicinity of the peak is examined (step S22), the valley depth Q is calculated (step S23), and only when the valley depth Q is equal to or greater than the predetermined threshold TH is the search area 32 at the shift amount dxmin, where the peak value Hmin is obtained, identified as the search area 32 corresponding to the original area 31 at the transition position Xp. The reason is that, when noise is mixed in or when the image density of the subject in the images IR and IL is uniform, the search area 32 at the shift amount dxmin where the peak value Hmin is obtained does not always correspond to the original area 31 at the transition position Xp.
[0126]
That is, when the valley depth Q, which takes the neighborhood of the position of the shift amount dxmin into account, is smaller than the predetermined threshold TH, the correspondence is judged to be poor and the peak value Hmin of the degree of coincidence H is not used. In this embodiment, the predetermined threshold TH is set to TH = 1.2.
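The valley-depth test of steps S22 to S24 can be sketched as follows, using equation (16) and the threshold TH = 1.2 from the text. The function names are hypothetical, and the handling of two edge cases is an assumption not covered by the description: a peak within two positions of either end of the search range is treated as unusable, and a peak value of exactly zero is treated as infinitely deep if both neighbours are non-zero.

```python
TH = 1.2  # threshold used in this embodiment


def valley_depth(h_values, dx_min):
    """Valley depth Q of equation (16), or None if it cannot be evaluated."""
    if dx_min < 2 or dx_min + 2 >= len(h_values):
        return None  # assumption: neighbours fall outside the search range
    h_min = h_values[dx_min]
    h_left, h_right = h_values[dx_min - 2], h_values[dx_min + 2]
    if h_min == 0:
        # Assumption: a perfect match counts as infinitely deep
        # unless a neighbour is also zero.
        return float("inf") if min(h_left, h_right) > 0 else None
    return min(h_left / h_min, h_right / h_min)


def is_reliable_match(h_values, dx_min, threshold=TH):
    """Step S24: accept the peak only if Q >= threshold."""
    q = valley_depth(h_values, dx_min)
    return q is not None and q >= threshold
```

A sharply defined minimum gives a large Q and is accepted, while a nearly flat curve (as with a uniform image density) gives Q close to 1 and is rejected.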
[0127]
When the determination in step S24 is affirmative, the true value ds of the shift amount dx (referred to as the true peak position) is obtained by the following interpolation processing (step S25). That is, as shown in FIG. 18, when the coordinates of the minimum are (dxmin, Hmin) and the coordinates before and after it are (dxmin-1, Hmin-1) and (dxmin+1, Hmin+1), respectively, the magnitudes of Hmin-1 and Hmin+1 are compared and the true peak position ds is estimated by the corresponding one of the following equations (17) to (19).
[0128]
If $H_{min-1} < H_{min+1}$,
$ds = dx_{min} - \frac{H_{min-1} - H_{min+1}}{2\,(H_{min} - H_{min+1})}$   (17)
If $H_{min-1} = H_{min+1}$,
$ds = dx_{min}$   (18)
If $H_{min-1} > H_{min+1}$,
$ds = dx_{min} + \frac{H_{min+1} - H_{min-1}}{2\,(H_{min} - H_{min-1})}$   (19)
It was confirmed experimentally that obtaining the true peak position ds with the interpolation formulas (17) to (19) improves the position accuracy by a factor of three compared with the case where no interpolation is performed.
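Equations (17) to (19) translate directly into a small function. The sketch below assumes that h_min is the smallest of the three values, as guaranteed by the peak detection, so neither denominator can be zero; the function and parameter names are illustrative.

```python
def true_peak_position(dx_min, h_prev, h_min, h_next):
    """Sub-pixel peak position ds from equations (17) to (19).

    h_prev, h_min, h_next are the degrees of coincidence at
    dx_min - 1, dx_min and dx_min + 1; h_min must be the smallest.
    """
    if h_prev < h_next:   # equation (17): true minimum lies toward dx_min - 1
        return dx_min - (h_prev - h_next) / (2 * (h_min - h_next))
    if h_prev > h_next:   # equation (19): true minimum lies toward dx_min + 1
        return dx_min + (h_next - h_prev) / (2 * (h_min - h_prev))
    return float(dx_min)  # equation (18): symmetric neighbours
```

As a check, sampling the V-shaped curve H(d) = 4·|d − 10.25| at d = 9, 10, 11 gives 5, 1, 3, and `true_peak_position(10, 5, 1, 3)` recovers 10.25. The correction never exceeds half a pixel in either direction.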
[0129]
As a result, the interpolation processing in step S25 yields the true peak position ds of the search area 32 that best corresponds to the original area 31 at the transition position Xp.
[0130]
The transition position Xp and the true peak position ds obtained in this way correspond, respectively, to the parallax dR of the target object image SR on the right image IR and the parallax dL of the target object image SL on the left image IL shown in FIG.
[0131]
However, in practice, as described above, the left and right images IR and IL have, for example, pincushion-like distortion or barrel-like distortion due to the windshield and the optical characteristics of the objective lenses 11R and 11L of the cameras 1R and 1L. Therefore, the parallax dR and the parallax dL that have been subjected to the distortion correction based on these are obtained (step S26).
[0132]
Then, using the distortion-corrected parallax dR and parallax dL as measured values, the three-dimensional position information, namely the distance Zd in the depth direction to the target object S and the left-right shift distances DR and DL at that distance Zd, can be obtained from the above equations (4) to (6) (step S27).
[0133]
In step S28, it is determined whether the calculation for finding the true peak position ds in the search area 32 corresponding to the original area 31 has been completed for all the transition positions X on the epipolar line EP, that is, whether the transition position X has reached X = 767. If it has, the process is terminated.
[0134]
The distance Zd, the shift distance DR, and the shift distance DL, which constitute the three-dimensional position information created by the position calculation device 7, are clustered, and a so-called ID (identification) is attached as an identification code for the target object S. The result is then supplied via the output terminal 90 to a road/obstacle recognition device (not shown), which is the next processing stage.
[0135]
The road / obstacle recognition device and the like are devices that constitute an automatic driving system and can perform operations such as a warning to a driver, automatic collision avoidance of a vehicle body, and automatic following of a preceding vehicle. In this case, for example, as an example of a system that performs automatic following travel, an “object detection apparatus and method” (Japanese Patent Application No. 7-249747) filed by the present applicant can be cited.
[0136]
It should be noted that the present invention is not limited to the above-described embodiment, but may adopt various configurations without departing from the gist of the present invention.
[0137]
[Effect of the invention]
As described above, according to the present invention, the two cameras are installed such that the point at infinity in the forward direction of the vehicle lies at the center of each screen obtained from the two imaging means, and when the correspondence processing means establishes the correspondence of the same object image between the screens obtained from the two imaging means, it does so after the distortion of the image caused by the distortion of the optical unit has been corrected by the distortion correcting means. This achieves the effect of avoiding a state in which the same object cannot be matched because of image distortion arising from distortion of the optical system, such as lens distortion.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an embodiment of the present invention.
FIG. 2 is a schematic perspective view for explaining an installation position of the stereo camera.
FIG. 3 is a plan view used for explanation when obtaining a distance based on the principle of triangulation.
FIG. 4 is a diagram for explaining parallax of a target object on left and right images, where A is a left image and B is a right image.
FIG. 5 is a flowchart for explaining the overall operation of the apparatus of FIG. 1.
FIG. 6 is a diagram which is used for describing a method of handling left and right small areas.
FIG. 7 is a flowchart provided for explaining the example in FIG. 6;
FIG. 8 is a block diagram illustrating a configuration of an apparatus including a detailed configuration of a corresponding processing apparatus.
FIG. 9 is a diagram showing a rectangular grid as a subject.
FIG. 10 is a diagram illustrating an image captured including distortion due to aberration of a lens.
FIG. 11 is a diagram provided for describing correction coordinates obtained by superimposing the graphic in FIG. 9 and the graphic in FIG. 10;
FIG. 12 is a circuit block diagram illustrating a detailed configuration of a correlation operation unit.
FIG. 13 is a diagram schematically illustrating a part of left image data on an epipolar line.
FIG. 14 is a diagram schematically illustrating a part of right image data on an epipolar line.
FIG. 15 is a block diagram for explaining the operation of a first operation block in the example of FIG. 12;
FIG. 16 is another block diagram used for describing the operation of the first operation block in the example of FIG. 12.
FIG. 17 is a flowchart used to describe the operation of the position calculation device.
FIG. 18 is a diagram provided for explanation of interpolation calculation.
[Explanation of symbols]
1 ... Stereo camera
1R, 1L ... Video camera
2R, 2L ... CCU
4R, 4L ... Image memory
5R, 5L ... Drive circuit
6 ... Corresponding processing device
7 ... Position calculating device
8 ... Exposure amount adjusting device
11R, 11L ... Objective lens
13R, 13L ... CCD image sensor
15R, 15L ... Optical axis
62 ... Correction coordinate table
73 ... Rectangular lattice
74 ... Image with the center bulging in a barrel shape

Claims

An environment recognition device for vehicles, having two cameras each comprising an optical unit that takes in light carrying image information and imaging means that captures the image formed through the optical unit,
the two cameras being installed such that the point at infinity in the forward direction of the vehicle lies at the center of each screen obtained from the two imaging means,
and comprising correspondence processing means for establishing the correspondence of the same object image in the respective screens, and
position calculation means for calculating, on the principle of triangulation, the distance to the object for which the correspondence has been established,
wherein the correspondence processing means has distortion correction means that corrects optical distortion of the acquired screens using correction data prepared in advance using the optical unit and the imaging means,
wherein the correspondence processing means, when establishing the correspondence of the same object image by moving windows of the same shape along the epipolar line in the distortion-corrected left and right acquired screens and detecting the degree of coincidence of the images in the two windows, moves the window in one screen by a predetermined distance at a time and moves the window in the other screen within a predetermined range referenced to the position of the window in the one screen to detect the degree of coincidence, and
wherein the predetermined range is a range determined according to the horizontal angle of view of the imaging means, the number of pixels in the horizontal direction, the shortest detection distance of the object, and the base line length of the two imaging means.

The environment recognition device for vehicles according to claim 1, wherein address correction data is stored in an address correction table as signed bit data corresponding to the address before correction, and the calculation for detecting the degree of coincidence is performed using at least one of pipeline processing and parallel processing.