JP7577608B2

JP7577608B2 - Location determination device, location determination method, and location determination system

Info

Publication number: JP7577608B2
Application number: JP2021094448A
Authority: JP
Inventors: サキブアジム; 拓実仁藤; 克行中村
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-06-04
Filing date: 2021-06-04
Publication date: 2024-11-05
Anticipated expiration: 2041-06-04
Also published as: JP2022186299A

Description

The present disclosure relates to a location determination device, a location determination method, and a location determination system.

In recent years, with the advance of IT, large numbers of sensors have been deployed throughout society and extremely large amounts of data have been accumulated. Against this background, various ways of using the accumulated video data are being studied. In particular, as the volume of video content such as photographs, videos, and images grows, there is demand for a means of generating an environment map from this content and accurately determining the position of a moving object that moves within that environment.

Simultaneous Localization and Mapping (SLAM) has long been known as a means of performing self-position estimation and environment map creation at the same time. In SLAM, a moving object equipped with a sensor such as LiDAR senses its surroundings while traveling, and thereby creates a two-dimensional or three-dimensional environment map.
With SLAM, a moving object can create an environment map in an unknown environment, and can then use the resulting map information to carry out a specific task while avoiding obstacles.

However, SLAM uses only the static objects in the video (e.g., non-moving objects such as buildings, furniture, and walls) to create the environment map. When dynamic objects (e.g., moving objects such as humans, animals, and automobiles) are present in the video, the static objects and background needed for map creation are partially hidden, and the accuracy of the map decreases.
Therefore, in order to generate a highly accurate environment map using SLAM, it is important to identify the dynamic objects in the video and exclude them from SLAM processing.

Several methods have been proposed for identifying dynamic objects in video and excluding them from SLAM processing.
For example, Chinese Patent Application Publication No. 111950561 (Patent Document 1) describes the following technique: "The present invention discloses a method for removing semantic SLAM dynamic points based on semantic segmentation, including the following steps: 1) Perform semantic segmentation on image frames using a PSPNet semantic segmentation network, and perform classification according to the pixel segmentation result. 2) Extract ORB features of the new frame I, calculate the descriptor, and then match the new frame I with F. 3) Set a threshold M. 4) Classify the feature points according to the motion e of the dynamic points and static points. 5) Associate semantic labels with the classified static point set to construct a local semantic map. 6) Repeat steps 1) to 5) to update the local semantic map until all local maps are created and a global map is obtained. In this method, dynamic feature points in the environment that have a significant impact on the accuracy of the system can be eliminated, and static feature points can be used to build an octomap that is highly accurate and interpretable with semantic information."

Patent Document 1: Chinese Patent Application Publication No. 111950561

The method described in Patent Document 1 performs semantic segmentation on every frame of the target video, which makes it possible to identify and exclude dynamic objects. Once the dynamic objects have been excluded, an environment map can be generated by performing SLAM processing on the static objects.

However, because the method described in Patent Document 1 performs semantic segmentation on every frame of the input video, it requires a huge amount of computing resources. It is therefore difficult to adopt where the available computing resources are limited, for example on a smart device. In addition, dynamic objects are in many cases present in only some frames of the input video, so performing semantic segmentation on frames that contain no dynamic objects is wasteful.
Furthermore, in the method described in Patent Document 1, whether an object is dynamic or static is determined based only on the amount of movement of the object. As a result, when a movable object such as a human, an animal, or an automobile (hereinafter, a "dynamic class object") temporarily stops moving, it may be mistaken for a static object. If an environment map is then generated in SLAM processing using a dynamic class object that has been mistaken for a static object, the correctness of the environment map is significantly impaired and its accuracy decreases.
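The misclassification described above can be illustrated with a small sketch. It is not taken from Patent Document 1 or from the present disclosure; the class labels, the `TrackedObject` type, and the threshold value are all hypothetical and chosen only to contrast a motion-only rule with a rule that also considers the object class.

```python
from dataclasses import dataclass

# Hypothetical set of movable ("dynamic class") labels, for illustration only.
DYNAMIC_CLASSES = {"person", "animal", "car"}


@dataclass
class TrackedObject:
    label: str           # semantic class of the object
    displacement: float  # movement between two frames, in pixels (assumed unit)


def is_dynamic_motion_only(obj: TrackedObject, motion_threshold: float = 2.0) -> bool:
    """Motion-only rule: an object counts as dynamic only if it moved."""
    return obj.displacement > motion_threshold


def is_dynamic_class_aware(obj: TrackedObject, motion_threshold: float = 2.0) -> bool:
    """Class-aware rule: movable classes stay dynamic even while stopped."""
    return obj.label in DYNAMIC_CLASSES or obj.displacement > motion_threshold


# A car that has temporarily stopped, and a wall that never moves.
parked_car = TrackedObject(label="car", displacement=0.0)
wall = TrackedObject(label="wall", displacement=0.0)
```

Under the motion-only rule the stopped car is treated as static and would end up in the map, whereas the class-aware rule keeps it out; the wall is static under both rules.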

Accordingly, an object of the present disclosure is to provide a location determination means that predicts which frames of an input video are likely to contain dynamic content (hereinafter, "dynamic frames") and performs dynamic object segmentation processing only on those dynamic frames, thereby generating a highly accurate environment map and determining the position of a moving object while keeping the required computing resources low.

In order to solve the above problem, one representative location determination device of the present disclosure is a location determination device that determines the position of a moving object moving in an arbitrary environment, and includes: an image feature extraction unit that analyzes an input video showing the environment and extracts features for each of a plurality of frames included in the input video; a selective segmentation determination unit that predicts, from the input video, dynamic frames containing dynamic content, based on any one of the amount of movement between the plurality of frames included in the input video, the movement speed of the imaging unit that acquired the input video, and the ratio of pixels in a frame that belong to objects corresponding to a dynamic class; a dynamic object segmentation unit that performs dynamic object segmentation processing on the dynamic frames and generates image semantic information indicating, for each pixel in a dynamic frame, whether the pixel belongs to an object corresponding to a dynamic class or to an object corresponding to a static class; a feature classification unit that classifies each feature as a dynamic feature or a static feature based on the image semantic information and the features; and a SLAM unit that generates, based on the static features and the image semantic information, a three-dimensional map representing the environment in three dimensions and first moving object position information indicating the position and orientation of the moving object.
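The core idea of this configuration, running segmentation only on frames predicted to be dynamic, can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `run_selective_segmentation`, the callables `predict_dynamic` and `segment`, and the use of `None` for frames that are skipped are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_selective_segmentation(
    frames: Sequence[str],
    predict_dynamic: Callable[[int, str], bool],
    segment: Callable[[str], str],
) -> Tuple[List[Optional[str]], int]:
    """Segment only the frames predicted to be dynamic.

    Returns one entry per frame (a segmentation result, or None when the
    frame was skipped) together with the number of segmentation calls made.
    """
    results: List[Optional[str]] = []
    segmented = 0
    for index, frame in enumerate(frames):
        if predict_dynamic(index, frame):
            results.append(segment(frame))  # expensive step, dynamic frames only
            segmented += 1
        else:
            results.append(None)  # assumption: skipped frames carry no mask
    return results, segmented


# Toy input: four frames, of which a stand-in predictor flags two as dynamic.
frames = ["f0", "f1", "f2", "f3"]
flagged = {1, 2}
results, segmented = run_selective_segmentation(
    frames,
    predict_dynamic=lambda i, f: i in flagged,
    segment=lambda f: f"mask({f})",
)
```

In this toy run the segmentation step is invoked twice instead of four times, which is the kind of saving the configuration is aiming for when dynamic content appears in only part of the video.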

According to the present disclosure, by predicting which frames of an input video are likely to contain dynamic content and performing dynamic object segmentation processing only on those dynamic frames, it is possible to provide a location determination means that generates a highly accurate environment map and determines the position of a moving object while keeping the required computing resources low.
Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

FIG. 1 is a diagram illustrating a computer system for implementing an embodiment of the present disclosure.
FIG. 2 is a diagram illustrating an example of a hardware configuration of a location determination device according to a first embodiment of the present disclosure.
FIG. 3 is a block diagram illustrating the flow of a map generation process in the location determination device according to the first embodiment of the present disclosure.
FIG. 4 is a diagram illustrating the flow of a location determination process in the location determination device according to the first embodiment of the present disclosure.
FIG. 5 is a block diagram illustrating the flow of a dynamic content determination process in the location determination device according to the first embodiment of the present disclosure.
FIG. 6 is a block diagram illustrating the flow of a feature classification process in the location determination device according to the first embodiment of the present disclosure.
FIG. 7 is a diagram illustrating an example of a hardware configuration of a location determination device according to a second embodiment of the present disclosure.
FIG. 8 is a block diagram illustrating the flow of a moving object position information correction process in the location determination device according to the second embodiment of the present disclosure.
FIG. 9 is a block diagram illustrating the flow of a GPS association process and a map reading process in the location determination device according to the second embodiment of the present disclosure.
FIG. 10 is a diagram illustrating an example of a hardware configuration of a location determination device according to a third embodiment of the present disclosure.
FIG. 11 is a block diagram illustrating the flow of a risk management process in the location determination device according to the third embodiment of the present disclosure.
FIG. 12 is a block diagram illustrating the functions of an interface unit according to an embodiment of the present disclosure.
FIG. 13 is a diagram illustrating an example of the use of skip mask parameters according to an embodiment of the present disclosure.

Embodiments of the present invention are described below with reference to the drawings. The present invention is not limited to these embodiments. In the drawings, identical parts are denoted by identical reference numerals.

First, a computer system 100 for implementing an embodiment of the present disclosure will be described with reference to FIG. 1. The mechanisms and devices of the various embodiments disclosed herein may be applied to any suitable computing system. The main components of the computer system 100 include one or more processors 102, a memory 104, a terminal interface 112, a storage interface 113, an I/O (input/output) device interface 114, and a network interface 115. These components may be interconnected via a memory bus 106, an I/O bus 108, a bus interface unit 109, and an I/O bus interface unit 110.

The computer system 100 may include one or more general-purpose programmable central processing units (CPUs) 102A and 102B, collectively referred to as the processor 102. In some embodiments, the computer system 100 may include multiple processors, and in other embodiments, the computer system 100 may be a single-CPU system. Each processor 102 executes instructions stored in the memory 104 and may include an on-board cache.

In some embodiments, the memory 104 may include a random access semiconductor memory, a storage device, or a storage medium (either volatile or non-volatile) for storing data and programs. The memory 104 may store all or part of the programs, modules, and data structures that implement the functions described herein. For example, the memory 104 may store a location determination application 150. In some embodiments, the location determination application 150 may include instructions or statements that execute the functions described below on the processor 102.

In some embodiments, the location determination application 150 may be implemented in hardware via semiconductor devices, chips, logic gates, circuits, circuit cards, and/or other physical hardware devices, instead of or in addition to a processor-based system. In some embodiments, the location determination application 150 may include data other than instructions or statements. In some embodiments, a camera, sensor, or other data input device (not shown) may be provided so as to communicate directly with the bus interface unit 109, the processor 102, or other hardware of the computer system 100.

The computer system 100 may include a bus interface unit 109 that handles communication among the processor 102, the memory 104, a display system 124, and the I/O bus interface unit 110. The I/O bus interface unit 110 may be coupled to the I/O bus 108 for transferring data to and from various I/O units. The I/O bus interface unit 110 may communicate via the I/O bus 108 with the multiple I/O interface units 112, 113, 114, and 115, which are also known as I/O processors (IOPs) or I/O adapters (IOAs).

The display system 124 may include a display controller, a display memory, or both. The display controller can provide video data, audio data, or both to a display device 126. The computer system 100 may also include devices, such as one or more sensors, configured to collect data and provide that data to the processor 102.

For example, the computer system 100 may include a biometric sensor that collects heart rate data, stress level data, and the like, an environmental sensor that collects humidity data, temperature data, pressure data, and the like, and a motion sensor that collects acceleration data, movement data, and the like. Other types of sensors may also be used. The display system 124 may be connected to a display device 126 such as a standalone display screen, a television, a tablet, or a handheld device.

The I/O interface units provide the ability to communicate with various storage or I/O devices. For example, the terminal interface unit 112 allows the attachment of user I/O devices 116, which include user output devices such as a video display or a television with speakers, and user input devices such as a keyboard, mouse, keypad, touchpad, trackball, buttons, light pen, or other pointing device. By operating a user input device through a user interface, a user may enter data and instructions into the user I/O device 116 and the computer system 100 and receive output data from the computer system 100. The user interface may, for example, be shown on a display device, played through a speaker, or printed via a printer, by way of the user I/O device 116.

The storage interface 113 allows the attachment of one or more disk drives or direct access storage devices 117 (typically magnetic disk drive storage devices, but possibly an array of disk drives or other storage devices configured to appear as a single disk drive). In some embodiments, the storage device 117 may be implemented as any secondary storage device. The contents of the memory 104 may be stored in the storage device 117 and read from the storage device 117 as needed. The I/O device interface 114 may provide an interface to other I/O devices such as printers and fax machines. The network interface 115 may provide a communication path so that the computer system 100 and other devices can communicate with each other. This communication path may be, for example, a network 130.

In some embodiments, the computer system 100 may be a device that has no direct user interface and receives requests from other computer systems (clients), such as a multi-user mainframe computer system, a single-user system, or a server computer. In other embodiments, the computer system 100 may be a desktop computer, a portable computer, a laptop, a tablet computer, a pocket computer, a telephone, a smartphone, or any other suitable electronic device.

Next, the hardware configuration of the location determination device according to the first embodiment of the present disclosure will be described with reference to FIG. 2.

FIG. 2 is a diagram illustrating an example of the hardware configuration of the location determination device 200 according to the first embodiment of the present disclosure.
The location determination device 200 shown in FIG. 2 is a device configured to execute the location determination means according to an embodiment of the present disclosure. It may be realized as, for example, a smartphone, a tablet, a laptop, or a dedicated circuit, but is not limited to these.
The location determination device 200 according to the embodiment of the present disclosure is assumed to be provided on a moving object 275 and to move through the environment together with the moving object 275. The moving object 275 is not particularly limited as long as it can move autonomously; examples include a human, an animal, and a robot with an autonomous driving function. The form of the location determination device 200 may be determined according to the moving object 275. For example, when the moving object 275 is a human, the location determination device 200 may be realized as a personal terminal such as a smartphone, a tablet, or a laptop, and when the moving object 275 is a robot, the location determination device 200 may be a dedicated circuit included in the robot.

In the present disclosure, "environment" means the environment that is the target of map generation and of determining the position of the moving object 275, and may include any environment, such as a city, a park, a factory, an office building, a station, or an airport.
In the present disclosure, "environment map" means a map that describes information about the surrounding environment, such as buildings, structures, fixed objects, elevation differences, and the presence or absence of obstacles that affect movement.

As shown in FIG. 2, the location determination device 200 includes an imaging unit 205, a processor 210, a memory 215, an image feature extraction unit 220, a selective segmentation determination unit 225, a dynamic object segmentation unit 230, a feature classification unit 235, a SLAM unit 240, a location determination unit 245, and an interface unit 250.

The imaging unit 205 is a functional unit for acquiring a video showing the environment (hereinafter, the "input video"). For example, the imaging unit 205 may be a laser range scanner (range sensor, LiDAR), a camera (RGB camera, infrared camera, 360-degree camera), an encoder, or the like. As an example, when the location determination device 200 is realized using a smartphone, the imaging unit 205 may be the camera built into that smartphone. It is desirable that the imaging unit 205 be arranged so that it can capture images of the environment while the location determination device 200 is moving.
The input video may be data that includes a plurality of frames, such as a moving image showing the environment. Here, a frame is one of the individual still images that make up the input video.
Although FIG. 2 shows, as an example, a configuration in which the imaging unit 205 is mounted on the location determination device 200, the present disclosure is not limited to this. For example, a system configuration is also possible in which the location determination device 200 itself is installed as a remote server geographically separated from the environment, and the imaging unit 205 is provided on the moving object 275 that moves within the environment.

Likewise, although FIG. 2 shows, as an example, a configuration of the location determination device 200 that includes a single imaging unit 205, the present disclosure is not limited to this. For example, the location determination device 200 may include a plurality of imaging units 205 arranged to capture different angles, or a plurality of imaging units 205 of different types (such as an RGB camera and LiDAR). As an example, the location determination device 200 may include a first imaging unit that captures the area in front of the moving object 275 and a second imaging unit that captures the area behind it, or may include LiDAR together with a 360-degree camera capable of capturing the full 360 degrees around the moving object 275. Using cameras that capture the surroundings of the moving object 275, and not only the area in front of it, makes it possible to generate a more accurate three-dimensional map.

The processor 210 is a computing device that executes instructions for realizing the functions of each functional unit of the location determination device 200. It is substantially the same as, for example, the processor 102 shown in FIG. 1, so its description is omitted here.

The memory 215 is a random access semiconductor memory, storage device, or storage medium (either volatile or non-volatile) that stores the data and program instructions used by each functional unit of the location determination device 200. It is substantially the same as, for example, the memory 104 shown in FIG. 1, so its description is omitted here.

The image feature extraction unit 220 is a functional unit that analyzes the input video acquired by the imaging unit 205 and extracts features for each of the plurality of frames included in the input video. A feature here is information (a vector or the like) that characterizes an object in a frame, and may include, for example, color, size, and shape. The image feature extraction unit 220 may use any existing technique, such as SIFT (Scale-Invariant Feature Transform) or ORB (Oriented FAST and Rotated BRIEF).

The selective segmentation determination unit 225 is a functional unit that predicts, from among the plurality of frames included in the input video acquired by the imaging unit 205, the frames that are likely to contain dynamic content as the dynamic frames to be subjected to dynamic object segmentation processing (so-called semantic segmentation). More specifically, the selective segmentation determination unit 225 predicts dynamic frames containing dynamic content from the input video based on any one of the amount of movement between the plurality of frames included in the input video, the movement speed of the imaging unit 205 that acquired the input video, and the ratio of pixels in a frame that belong to objects corresponding to a dynamic class.
Here, dynamic content means image information corresponding to dynamic objects that are able to move (e.g., humans, animals, and automobiles), whereas static content means image information corresponding to static objects that are not able to move (e.g., buildings, furniture, and walls).
The processing of the selective segmentation determination unit 225 is described in detail later, so its description is omitted here.
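As a rough illustration of a prediction based on any one of the three quantities named above, the following sketch flags a frame as dynamic when at least one quantity exceeds a threshold. The disclosure defers the actual procedure to a later section, so everything here is an assumption: the `FrameStats` and `Thresholds` types, the threshold values, and in particular the direction of each comparison are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameStats:
    inter_frame_motion: float   # amount of movement relative to the previous frame
    camera_speed: float         # movement speed of the imaging unit
    dynamic_pixel_ratio: float  # share of pixels in dynamic-class objects, 0.0 to 1.0


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; a real system would tune these.
    motion: float = 5.0
    speed: float = 1.5
    ratio: float = 0.05


def is_dynamic_frame(stats: FrameStats, th: Thresholds = Thresholds()) -> bool:
    """Flag a frame as dynamic if any one criterion exceeds its threshold.

    Assumption: larger values make dynamic content more likely for all three
    quantities. The source does not state the comparison direction here.
    """
    return (
        stats.inter_frame_motion > th.motion
        or stats.camera_speed > th.speed
        or stats.dynamic_pixel_ratio > th.ratio
    )


quiet_frame = FrameStats(inter_frame_motion=1.0, camera_speed=0.5, dynamic_pixel_ratio=0.01)
crowded_frame = FrameStats(inter_frame_motion=0.0, camera_speed=0.0, dynamic_pixel_ratio=0.2)
```

With these example values, `crowded_frame` is flagged because of its dynamic pixel ratio alone, while `quiet_frame` is skipped and would not be sent to segmentation.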

The dynamic object segmentation unit 230 is a functional unit that performs dynamic object segmentation processing on the dynamic frames predicted by the selective segmentation determination unit 225 and generates image semantic information indicating, for each pixel in a dynamic frame, whether the pixel belongs to an object corresponding to a dynamic class or to an object corresponding to a static class.
The dynamic object segmentation unit 230 may use any existing technique, such as Mask R-CNN (Mask Region-Based Convolutional Neural Network), PSPNet (Pyramid Scene Parsing Network), or BiSeNet (Bilateral Segmentation Network).
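The per-pixel semantic information described above can be pictured as a binary mask derived from a class label map. The sketch below assumes a segmentation network has already produced an integer class ID per pixel; the class IDs, the `DYNAMIC_CLASS_IDS` set, and the helper names are hypothetical and do not correspond to any particular network's output format.

```python
from typing import List

# Hypothetical class IDs: 0 = background/wall, 1 = person, 2 = car.
DYNAMIC_CLASS_IDS = {1, 2}


def to_dynamic_mask(label_map: List[List[int]]) -> List[List[bool]]:
    """Mark each pixel True if its class is a dynamic class, else False."""
    return [[label in DYNAMIC_CLASS_IDS for label in row] for row in label_map]


def dynamic_pixel_ratio(mask: List[List[bool]]) -> float:
    """Fraction of pixels that belong to dynamic-class objects."""
    total = sum(len(row) for row in mask)
    dynamic = sum(sum(row) for row in mask)
    return dynamic / total if total else 0.0


# Toy 2x3 label map standing in for a segmentation network's output.
label_map = [
    [0, 0, 1],
    [0, 2, 1],
]
mask = to_dynamic_mask(label_map)
ratio = dynamic_pixel_ratio(mask)
```

Because the mask depends on the class of each pixel and not on observed motion, a person or car that is standing still is still marked as dynamic.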

The feature classification unit 235 is a functional unit that classifies each feature of a frame of the input video as a dynamic feature (that is, a feature belonging to a dynamic object) or a static feature (that is, a feature belonging to a static object), based on the image semantic information generated by the dynamic object segmentation unit 230 and the features extracted by the image feature extraction unit 220.
In some embodiments, the feature classification unit 235 may classify a feature as a static feature or a dynamic feature based on the amount of movement of the feature between frames. Here, the "amount of movement between frames" means the difference between a plurality of frames (e.g., the difference in pixel information) caused by the movement of the moving object 275.

ＳＬＡＭ部２４０は、特徴分類部２３５によって分類された静的特徴と、動的オブジェクト分割部２３０によって生成された画像の意味的情報とに基づいて、環境を立体的に示す３次元地図を生成する機能部である。
ここでのＳＬＡＭ部２４０は、既存のＳｉｍｕｌｔａｎｅｏｕｓＬｏｃａｌｉｚａｔｉｏｎａｎｄＭａｐｐｉｎｇ手段を用いてもよく、本開示では特に限定されない。 The SLAM unit 240 is a functional unit that generates a three-dimensional map that shows the environment in a stereoscopic manner based on the static features classified by the feature classification unit 235 and the semantic information of the images generated by the dynamic object division unit 230.
The SLAM unit 240 here may use existing Simultaneous Localization and Mapping means, and is not particularly limited in this disclosure.

位置特定部２４５は、ＳＬＡＭ部２４０によって生成される環境の３次元地図と、特徴分類部２３５によって分類される静的特徴と、動的オブジェクト分割部２３０によって生成される画像の意味的情報とに基づいて、移動体２７５の環境における位置を特定するための機能部である。 The position identification unit 245 is a functional unit for identifying the position of the moving body 275 in the environment based on the three-dimensional map of the environment generated by the SLAM unit 240, the static features classified by the feature classification unit 235, and the semantic information of the image generated by the dynamic object segmentation unit 230.

インターフェース部２５０は、各種情報を位置特定装置２００のユーザに提供すると共に、ユーザからの入力を受け付けるための機能部である。一例として、インターフェース部２５０は、タッチ画面であってもよい。インターフェース部２５０は、例えばＳＬＡＭ部２４０によって生成される３次元地図を表示したり、当該３次元地図における移動体２７５の位置を表示したり、移動体の目的地の入力を受け付けたり、当該目的地までの推奨の経路を表示したりしてもよい。 The interface unit 250 is a functional unit that provides various information to the user of the position identification device 200 and accepts input from the user. As an example, the interface unit 250 may be a touch screen. The interface unit 250 may display a three-dimensional map generated by the SLAM unit 240, display the position of the mobile object 275 on the three-dimensional map, accept input of the mobile object's destination, and display a recommended route to the destination.

以上説明したように構成した位置特定装置２００によれば、入力映像において、動的内容を含む可能性が高い動的フレームを予測し、これらの動的フレームのみに対して動的オブジェクト分割処理を施すことで、必要なコンピューティング資源量を抑制しつつ、高精度の環境地図を生成し、移動体の位置を特定することが可能な位置特定手段を提供することができる。 The position identification device 200 configured as described above can predict dynamic frames in the input video that are likely to contain dynamic content, and perform dynamic object segmentation processing only on these dynamic frames, thereby providing a position identification means that can generate a highly accurate environmental map and identify the position of a moving object while reducing the amount of computing resources required.

次に、図３を参照して、本開示の実施例１に係る位置特定装置における地図生成処理の流れについて説明する。 Next, the flow of the map generation process in the position identification device according to the first embodiment of the present disclosure will be described with reference to FIG. 3.

図３は、本開示の実施例１に係る位置特定装置２００における地図生成処理３００の流れを示すブロック図である。図３に示す地図生成処理３００は、位置特定装置２００を備える移動体（例えば、図２に示す移動体２７５）が移動する環境の３次元地図を生成するための処理であり、位置特定装置２００の各機能部によって実行される。 FIG. 3 is a block diagram showing the flow of a map generation process 300 in the position identification device 200 according to the first embodiment of the present disclosure. The map generation process 300 shown in FIG. 3 is a process for generating a three-dimensional map of an environment in which a mobile body equipped with the position identification device 200 (e.g., the mobile body 275 shown in FIG. 2) moves, and is executed by each functional unit of the position identification device 200.

まず、撮影部（例えば、図２に示す位置特定装置２００における撮影部２０５、図３には図示せず）は、入力映像３０５を取得する。上述したように、この入力映像３０５は、移動体が移動する環境を示す映像であり、複数のフレームから構成されている。一例として、ここでは、撮影部は、環境を撮影可能な状態で移動体に配置され、移動体が移動しながら環境を撮影することで入力映像３０５を取得してもよい。
撮影部によって取得される入力映像３０５は、画像特徴抽出部２２０及び選択的分割判定部２２５に入力される。 First, an image capturing unit (for example, the image capturing unit 205 in the position identification device 200 shown in FIG. 2, not shown in FIG. 3) acquires an input video 305. As described above, this input video 305 is an image showing an environment in which the mobile body moves, and is composed of a plurality of frames. As an example, here, the image capturing unit may be disposed on the mobile body in a state in which it is possible to capture the environment, and the input video 305 may be acquired by capturing images of the environment while the mobile body is moving.
The input video 305 acquired by the image capturing unit is input to the image feature extraction unit 220 and the selective division determination unit 225.

画像特徴抽出部２２０は、撮影部によって取得された入力映像３０５を入力した後、当該入力映像３０５を解析し、各フレームの特徴を抽出する。上述したように、ここでの特徴とは、フレームにおけるオブジェクトの特徴を表す情報（ベクトル等）であり、例えば色、大きさ、形状等を含んでもよい。
画像特徴抽出部２２０によって抽出される特徴は、選択的分割判定部２２５に入力される。 The image feature extraction unit 220 receives the input video 305 captured by the image capture unit, analyzes the input video 305, and extracts features of each frame. As described above, the features here are information (e.g., vectors) that represent the features of an object in a frame, and may include, for example, color, size, shape, and the like.
The features extracted by the image feature extraction unit 220 are input to the selective division determination unit 225.

選択的分割判定部２２５は、撮影部によって取得された入力映像３０５に含まれる複数のフレームの中から、動的内容（Ｄｙｎａｍｉｃｃｏｎｔｅｎｔ）を含む可能性が高いフレームを、動的オブジェクト分割処理の対象となる動的フレームとして予測する。より具体的には、選択的分割判定部２２５は、入力映像３０５と、画像特徴抽出部２２０によって抽出された各フレームの特徴と、後述する移動体位置履歴情報３１０（なお、地図生成処理３００が初めて行われる場合は、当該移動体位置履歴情報３１０は存在しない）を入力し、入力映像３０５に含まれる複数のフレーム間の移動量、入力映像３０５を取得した撮影部の移動速度、及びフレームにおいて動的クラスに対応するオブジェクトの画素の比率のいずれかに基づいて、入力映像３０５の中から、動的内容を含む動的フレームを予測する。
動的内容を含むと予測される動的フレームは、動的オブジェクト分割部２３０に入力され、動的内容を含まないと予測される非動的フレームは、特徴分類部２３５に入力される。
選択的分割判定部２２５の処理の詳細については後述するため、ここではその説明を省略する。 The selective division determination unit 225 predicts, from among a plurality of frames included in the input video 305 acquired by the imaging unit, a frame that is likely to include dynamic content as a dynamic frame to be subjected to dynamic object division processing. More specifically, the selective division determination unit 225 inputs the input video 305, the features of each frame extracted by the image feature extraction unit 220, and moving object position history information 310 (described later) (note that, when the map generation processing 300 is performed for the first time, the moving object position history information 310 does not exist), and predicts a dynamic frame including dynamic content from the input video 305 based on any one of the movement amount between the plurality of frames included in the input video 305, the movement speed of the imaging unit that acquired the input video 305, and the ratio of pixels of an object corresponding to a dynamic class in the frame.
Dynamic frames that are predicted to contain dynamic content are input to the dynamic object division unit 230, and non-dynamic frames that are predicted not to contain dynamic content are input to the feature classification unit 235.
The details of the process of the selective division determination unit 225 will be described later, and therefore will not be described here.

動的オブジェクト分割部２３０は、選択的分割判定部２２５によって予測された動的フレームに対して、動的オブジェクト分割処理を施し、動的フレームにおける各画素について、当該画素が動的クラスに対応するオブジェクトに属するか静的クラスに対応するオブジェクトに属するかを示す画像の意味的情報３２０を生成する。ここでは、動的オブジェクト分割部２３０は、動的クラスとみなされるオブジェクトの情報を示す動的クラス情報３１５に基づいて当該画素が動的クラスに対応するオブジェクトに属するか静的クラスに対応するオブジェクトに属するかを判定してもよい。
動的オブジェクト分割部２３０によって生成される画像の意味的情報３２０はＳＬＡＭ部２４０及び特徴分類部２３５に入力される。 The dynamic object division unit 230 performs dynamic object division processing on the dynamic frame predicted by the selective division determination unit 225, and generates, for each pixel in the dynamic frame, image semantic information 320 indicating whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class. Here, the dynamic object division unit 230 may determine whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class based on dynamic class information 315 indicating information of an object considered to be a dynamic class.
The image semantic information 320 generated by the dynamic object division unit 230 is input to the SLAM unit 240 and the feature classification unit 235.
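As a rough illustration of the image semantic information 320, the following sketch converts a per-pixel class-label map into a per-pixel dynamic/static flag. The label-map format, the class names, and the `DYNAMIC_CLASSES` set (standing in for the dynamic class information 315) are assumptions made for this example; the disclosure does not specify a data format, and a real system would obtain the labels from a segmentation network.

```python
# Hypothetical stand-in for the dynamic class information 315.
DYNAMIC_CLASSES = {"person", "animal", "car"}


def to_semantic_info(label_map):
    """Return a grid of booleans: True where the pixel belongs to a dynamic-class object."""
    return [[label in DYNAMIC_CLASSES for label in row] for row in label_map]


# A tiny 2x3 "frame" whose pixels are already labeled (assumed input).
labels = [
    ["wall", "wall", "person"],
    ["floor", "car", "person"],
]
semantic_info = to_semantic_info(labels)
```

Pixels flagged `True` correspond to dynamic classes, and all other pixels are treated as belonging to static classes.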

特徴分類部２３５は、動的オブジェクト分割部２３０によって生成された画像の意味的情報３２０と、画像特徴抽出部２２０によって抽出された特徴とに基づいて、入力映像３０５のフレームの各特徴を動的特徴（つまり、動的オブジェクトに属する特徴）又は静的特徴（つまり、静的オブジェクトに属する特徴）として分類する。ある実施例では、特徴分類部２３５は、フレーム間における特徴の移動量に基づいて特徴を静的特徴又は動的特徴として分類してもよい。
静的特徴として分類された特徴はＳＬＡＭ部２４０に入力され、動的特徴として分類された特徴は外れ値として扱われ、ＳＬＡＭ部２４０の処理から排除される。 The feature classifier 235 classifies each feature of the frames of the input video 305 as a dynamic feature (i.e., a feature belonging to a dynamic object) or a static feature (i.e., a feature belonging to a static object) based on the image semantic information 320 generated by the dynamic object segmenter 230 and the features extracted by the image feature extractor 220. In one embodiment, the feature classifier 235 may classify a feature as a static feature or a dynamic feature based on the amount of movement of the feature between frames.
Features classified as static features are input to the SLAM unit 240, while features classified as dynamic features are treated as outliers and excluded from the SLAM unit 240 processing.

ＳＬＡＭ部２４０は、特徴分類部２３５によって分類された静的特徴と、動的オブジェクト分割部２３０によって生成された画像の意味的情報３２０とに基づいて３次元地図３２５と、第１の移動体位置情報３３０とを生成する。
ここでの３次元地図３２５は、入力映像３０５に示された環境に対応する３次元地図である。図３に示す地図生成処理３００を、同じ環境の異なる部分を示す入力映像に基づいて行い、生成するそれぞれの３次元地図を結合することで、環境の全体を示す３次元地図を生成することができる。 The SLAM unit 240 generates a three-dimensional map 325 and first moving object position information 330 based on the static features classified by the feature classification unit 235 and the semantic information 320 of the image generated by the dynamic object division unit 230.
The 3D map 325 here is a 3D map corresponding to the environment shown in the input video 305. By performing the map generation process 300 shown in FIG. 3 on input videos showing different parts of the same environment and combining the resulting 3D maps, a 3D map showing the entire environment can be generated.

また、ここでの第１の移動体位置情報３３０は、移動体の位置（例えば、地理的座標等）及び向き（北、東、西、南等）を示す情報である。実際には、この第１の移動体位置情報３３０は、撮影部によって取得された入力映像から導出された撮影部の位置及び向きを示す情報であるが、撮影部は移動体に備えられているため、移動体の位置及び向きを示す情報でもある。
また、この第１の移動体位置情報３３０に示されている、複数の異なるフレームにおける移動体／撮影部の位置の変化から、移動体／撮影部の移動を時系列に示す移動体位置履歴情報３１０を生成することができる。後述するように、この移動体位置履歴情報３１０は、動的フレームを予測する際に用いられてもよい。
ＳＬＡＭ部２４０によって生成される３次元地図３２５及び第１の移動体位置情報３３０は、インターフェース部２５０に入力される。 The first moving object position information 330 here is information indicating the position (e.g., geographic coordinates, etc.) and orientation (north, east, west, south, etc.) of the moving object. In reality, the first moving object position information 330 is information indicating the position and orientation of the imaging unit derived from the input video acquired by the imaging unit, but since the imaging unit is provided in the moving object, it is also information indicating the position and orientation of the moving object.
Furthermore, it is possible to generate moving object position history information 310 indicating the movement of the moving object/image capture unit in a time series from the change in the position of the moving object/image capture unit in a plurality of different frames, which is indicated in this first moving object position information 330. As will be described later, this moving object position history information 310 may be used when predicting dynamic frames.
The three-dimensional map 325 and the first moving object position information 330 generated by the SLAM unit 240 are input to the interface unit 250.

インターフェース部２５０は、ＳＬＡＭ部２４０によって生成される３次元地図３２５及び第１の移動体位置情報３３０を出力する。一例として、インターフェース部２５０は、３次元地図３２５を表示すると共に、当該３次元地図３２５における移動体の位置を表示してもよい。また、ある実施例では、インターフェース部２５０は、移動体の目的地の入力を受け付けた後、当該目的地までの推奨の経路を表示してもよい。 The interface unit 250 outputs the three-dimensional map 325 and the first mobile object position information 330 generated by the SLAM unit 240. As an example, the interface unit 250 may display the three-dimensional map 325 and the position of the mobile object on the three-dimensional map 325. In addition, in one embodiment, the interface unit 250 may receive input of a destination of the mobile object and then display a recommended route to the destination.

以上説明した地図生成処理３００によれば、動的内容を含む可能性が高い動的フレームを予測し、これらの動的フレームのみに対して動的オブジェクト分割処理を施すことで、必要なコンピューティング資源量を抑制しつつ、高精度の環境地図を構築することができる。 According to the map generation process 300 described above, dynamic frames that are likely to contain dynamic content are predicted, and dynamic object segmentation processing is performed only on these dynamic frames, making it possible to construct a highly accurate environmental map while reducing the amount of computing resources required.

次に、図４を参照して、本開示の実施例１に係る位置特定装置における位置特定処理の流れについて説明する。 Next, the flow of the position identification process in the position identification device according to the first embodiment of the present disclosure will be described with reference to FIG. 4.

図４は、本開示の実施例１に係る位置特定装置２００における位置特定処理４００の流れを示す図である。図４に示す位置特定処理４００は、図３を参照して説明した地図生成処理３００によって予め生成された３次元地図３２５を用いて、位置特定装置２００を備える移動体（人間、ロボット等）の任意の環境における位置を特定するための処理であり、位置特定装置２００の各機能部によって実行される。
なお、図４に示す位置特定処理４００は、３次元地図３２５を生成するＳＬＡＭ部に代えて、予め作成された３次元地図３２５に基づいて移動体の位置を特定する位置特定部２４５を用いる点以外、基本的な処理の流れは図３を参照して説明した地図生成処理３００と同様であるため、繰り返しとなる説明を省略する。 Fig. 4 is a diagram showing a flow of a position identification process 400 in the position identification device 200 according to the first embodiment of the present disclosure. The position identification process 400 shown in Fig. 4 is a process for identifying a position of a moving body (human, robot, etc.) equipped with the position identification device 200 in an arbitrary environment by using a three-dimensional map 325 generated in advance by the map generation process 300 described with reference to Fig. 3, and is executed by each functional unit of the position identification device 200.
Note that the basic process flow of the position identification process 400 shown in FIG. 4 is similar to that of the map generation process 300 described with reference to FIG. 3, except that, instead of the SLAM unit that generates the 3D map 325, a position identification unit 245 that identifies the position of a moving object based on a pre-created 3D map 325 is used. Therefore, repeated explanations will be omitted.

位置特定部２４５は、特徴分類部２３５によって分類される静的特徴と、動的オブジェクト分割部２３０によって生成される画像の意味的情報３２０と、図３を参照して説明した地図生成処理３００によって生成され、メモリ２１５に格納される３次元地図３２５とを入力し、解析することで、３次元地図３２５における移動体の位置（例えば、地理的座標等）及び向き（北、東、西、南等）を示す第１の移動体位置情報３３０を生成する。
この第１の移動体位置情報３３０は、インターフェース部２５０に転送され、インターフェース部２５０にて出力される。 The position identification unit 245 inputs and analyzes the static features classified by the feature classification unit 235, the semantic information 320 of the image generated by the dynamic object division unit 230, and the three-dimensional map 325 generated by the map generation process 300 described with reference to Figure 3 and stored in the memory 215, to generate first moving object position information 330 indicating the position (e.g., geographical coordinates, etc.) and direction (north, east, west, south, etc.) of the moving object on the three-dimensional map 325.
This first moving object position information 330 is transferred to the interface unit 250 and output by the interface unit 250.

以上説明した位置特定処理４００によれば、動的内容を含む可能性が高い動的フレームを予測し、これらの動的フレームのみに対して動的オブジェクト分割処理を施すことで、必要なコンピューティング資源量を抑制しつつ、移動体の環境における位置を正確に特定することができる。 According to the position identification process 400 described above, dynamic frames that are likely to contain dynamic content are predicted, and dynamic object segmentation processing is performed only on these dynamic frames, thereby making it possible to accurately identify the position of a moving object in an environment while reducing the amount of computing resources required.

次に、図５を参照して、本開示の実施例１に係る位置特定装置における動的内容判定処理の流れについて説明する。 Next, the flow of the dynamic content determination process in the position identification device according to the first embodiment of the present disclosure will be described with reference to FIG. 5.

上述したように、本開示では、映像フレームを構成する複数のフレームの中から、動的内容を含むと予測される動的フレームのみに対して動的オブジェクト分割処理を施し、動的内容を含まないと予測される非動的フレームを動的オブジェクト分割処理から排除することで、必要なコンピューティング資源量を抑制しつつ、移動体の環境における位置を正確に特定することができる。
より具体的には、本開示の実施例に係る動的内容の判定は、スキップマスクパラメータ５５０に基づいて行われる。このスキップマスクパラメータ５５０は、入力映像において、動的オブジェクト分割処理から排除するフレーム数を指定するパラメータである。このスキップマスクパラメータ５５０は、入力映像に含まれる複数のフレーム間の移動量、入力映像を取得した撮影部の移動速度、及びフレームにおいて動的クラスに対応するオブジェクトの画素の比率に基づいて設定及び更新をすることができる。
このスキップマスクパラメータ５５０を用いることで、動的内容を含む可能性が低いフレームがスキップ（つまり、動的オブジェクト分割処理から排除）され、動的内容を含む可能性が高いフレームのみが動的オブジェクト分割処理の対象となるため、入力映像の全てのフレームに対して動的オブジェクト分割処理を行った場合に比べて、コンピューティング資源を大幅に節約することができる。 As described above, in the present disclosure, from among the multiple frames that make up a video frame, dynamic object segmentation processing is performed only on dynamic frames that are predicted to contain dynamic content, and non-dynamic frames that are predicted not to contain dynamic content are excluded from the dynamic object segmentation processing, thereby making it possible to accurately identify the position of a moving object in an environment while reducing the amount of computing resources required.
More specifically, the determination of dynamic content according to the embodiment of the present disclosure is based on a skip mask parameter 550, which is a parameter that specifies the number of frames in the input video to be excluded from the dynamic object segmentation process. The skip mask parameter 550 can be set and updated based on the amount of movement between multiple frames included in the input video, the movement speed of the camera that captured the input video, and the ratio of pixels of objects corresponding to dynamic classes in the frames.
By using this skip mask parameter 550, frames which are unlikely to contain dynamic content are skipped (i.e., excluded from the dynamic object segmentation process), and only frames which are likely to contain dynamic content are subjected to the dynamic object segmentation process, thereby significantly saving computing resources compared to performing dynamic object segmentation on all frames of the input video.

図５は、本開示の実施例１に係る位置特定装置における動的内容判定処理５００の流れを示すブロック図である。図５に示す動的内容判定処理５００は、スキップマスクパラメータ５５０を用いて動的内容を含む可能性が高い動的フレームを予測するための処理であり、主に画像特徴抽出部、選択的分割判定部、及び動的オブジェクト分割部によって実行される。 FIG. 5 is a block diagram showing the flow of a dynamic content determination process 500 in a localization device according to a first embodiment of the present disclosure. The dynamic content determination process 500 shown in FIG. 5 is a process for predicting dynamic frames that are likely to contain dynamic content using skip mask parameters 550, and is mainly executed by an image feature extraction unit, a selective split determination unit, and a dynamic object splitter.

まず、入力映像３０５は、選択的分割判定部２２５及び画像特徴抽出部２２０に入力される。 First, the input video 305 is input to the selective division determination unit 225 and the image feature extraction unit 220.

上述したように、画像特徴抽出部２２０は、撮影部によって取得された入力映像３０５を入力した後、当該入力映像３０５を解析し、各フレームの特徴を抽出する。これらの特徴は、後述する移動量推定の際に用いられる。 As described above, the image feature extraction unit 220 inputs the input video 305 acquired by the image capture unit, analyzes the input video 305, and extracts features of each frame. These features are used when estimating the amount of movement, which will be described later.

選択的分割判定部２２５は、入力映像３０５を構成する各フレーム毎に、当該フレームの順番を示す画像指標（ｉｍａｇｅｉｎｄｅｘ）とスキップマスクパラメータ５５０を用いて除算演算（ｍｏｄｕｌｏｏｐｅｒａｔｉｏｎ）を行い、余りが「０」の場合（つまり、画像指標のスキップマスクパラメータ５５０による除算の余りが０の場合）、当該フレームを動的フレームとし、動的オブジェクト分割部２３０に入力する。一方、除算演算の余りが「０」でない場合、当該フレームに対して後述の移動量推定を実行する。
ここでは、この除算演算を用いることで、スキップマスクパラメータに設定されているフレーム数毎に、フレームが動的オブジェクト分割部に入力され、それ以外のフレームについては、後述の移動量推定が実行される。
なお、入力映像における１番目のフレームについては、スキップマスクパラメータ５５０はまだ未設定であるため、この１番目のフレームは直接に動的オブジェクト分割部２３０に入力される。 The selective division determination unit 225 performs a modulo operation for each frame constituting the input video 305 using an image index indicating the order of the frame and the skip mask parameter 550, and if the remainder is "0" (i.e., if the remainder of the division of the image index by the skip mask parameter 550 is 0), the frame is treated as a dynamic frame and inputs it to the dynamic object division unit 230. On the other hand, if the remainder of the division operation is not "0", a movement amount estimation described below is performed on the frame.
Here, by using this modulo operation, one frame out of every number of frames set in the skip mask parameter 550 is input to the dynamic object division unit 230, and the movement amount estimation described below is performed on the remaining frames.
Note that since the skip mask parameter 550 has not yet been set for the first frame in the input video, this first frame is input directly to the dynamic object division unit 230.
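The modulo-based selection described above can be sketched as follows. This is only a minimal illustration: the function name, the use of `None` for an unset skip mask parameter, and the zero-based image index are assumptions made for this example.

```python
def goes_to_segmentation(image_index, skip_mask):
    """Decide whether a frame is sent directly to dynamic object segmentation."""
    # Skip mask parameter not set yet (e.g. the first frame): segment directly.
    if skip_mask is None:
        return True
    # Remainder 0 means the frame is treated as a dynamic frame.
    return image_index % skip_mask == 0


# With a skip mask parameter of 5, frames 0 and 5 out of the first ten are segmented.
selected = [i for i in range(10) if goes_to_segmentation(i, 5)]
```

Frames for which the function returns `False` are the ones passed on to the movement amount estimation.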

次に、選択的分割判定部２２５は、除算演算の余りが「０」のフレームについて、画像特徴抽出部２２０によって抽出された特徴及び対象のフレームの、入力映像における前のフレーム５１０の情報（以下、「前フレーム」という）に基づいて、移動量推定を行う。
ここでは、選択的分割判定部２２５は、対象のフレームと、前フレーム５１０との特徴を比較することで、フレーム間の移動量ｅを推定する。上述したように、ここでのフレーム間の移動量ｅは、移動体の移動によって発生する複数のフレームの差分（例えば、画素情報の差分）を意味する。推定した移動量ｅが所定の移動量基準Ｍを超える場合、この対象のフレームは動的フレームとみなされ、動的オブジェクト分割部２３０に入力される。
一方、推定した移動量ｅが所定の移動量基準Ｍ以下の場合、この対象のフレームは非動的フレームとみなされ、特徴分類部（図５には図示せず）に入力される。
このように、非動的フレームに対して移動量推定を行うことで、仮にスキップマスクパラメータの設定によって排除されたフレームの中で動的内容が存在した場合でも、当該動的内容を検出し、この動的内容を含むフレームを動的フレームとして再分類することができる。このように、動的内容を含むフレームをＳＬＡＭ部から高い精度で排除することができるため、高精度の３次元地図の生成が可能となる。 Next, for frames where the remainder of the modulo operation is not "0", the selective division determination unit 225 performs movement amount estimation based on the features extracted by the image feature extraction unit 220 and information on the frame 510 that precedes the target frame in the input video (hereinafter referred to as the "previous frame").
Here, the selective division determination unit 225 estimates the inter-frame movement amount e by comparing the characteristics of the target frame with the previous frame 510. As described above, the inter-frame movement amount e here means the difference between multiple frames (e.g., the difference in pixel information) that occurs due to the movement of a moving object. If the estimated movement amount e exceeds a predetermined movement amount standard M, the target frame is considered to be a dynamic frame and is input to the dynamic object division unit 230.
On the other hand, if the estimated movement amount e is less than or equal to the predetermined movement amount reference M, the target frame is regarded as a non-dynamic frame and is input to the feature classification unit 235 (not shown in FIG. 5).
By performing movement amount estimation on non-dynamic frames in this way, dynamic content can be detected even if it is present in frames that were excluded by the skip mask parameter setting, and the frames containing this dynamic content can be reclassified as dynamic frames. Because frames containing dynamic content can thus be excluded from the SLAM unit with high accuracy, a highly accurate 3D map can be generated.
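One possible way to picture the comparison of the movement amount e against the reference M is shown below. The disclosure only states that e is estimated by comparing features of the target frame and the previous frame; using the mean Euclidean displacement of already-matched feature points is an assumption of this sketch, as are the function names.

```python
import math


def movement_amount(prev_points, curr_points):
    """Assumed metric for e: mean displacement of matched feature points (same order, same length)."""
    if not prev_points:
        return 0.0
    total = sum(
        math.hypot(cx - px, cy - py)
        for (px, py), (cx, cy) in zip(prev_points, curr_points)
    )
    return total / len(prev_points)


def is_dynamic_frame(prev_points, curr_points, reference_m):
    """A frame is dynamic only when e strictly exceeds the reference M."""
    return movement_amount(prev_points, curr_points) > reference_m


prev_pts = [(0.0, 0.0), (1.0, 1.0)]
curr_pts = [(3.0, 4.0), (1.0, 1.0)]
e = movement_amount(prev_pts, curr_pts)  # displacements 5.0 and 0.0
```

A frame whose movement amount equals M is treated as non-dynamic, matching the "less than or equal to" condition in the text.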

動的オブジェクト分割部２３０は、上述したように、動的クラス情報３１５を用いて、選択的分割判定部２２５によって予測された動的フレームに対して、動的オブジェクト分割処理を施し、動的フレームにおける各画素について、当該画素が動的クラスに対応するオブジェクトに属するか静的クラスに対応するオブジェクトに属するかを示す画像の意味的情報３２０を生成する。 As described above, the dynamic object segmentation unit 230 uses the dynamic class information 315 to perform dynamic object segmentation processing on the dynamic frame predicted by the selective segmentation determination unit 225, and generates image semantic information 320 for each pixel in the dynamic frame that indicates whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class.

次に、選択的分割判定部２２５は、動的オブジェクト分割部２３０によって生成された画像の意味的情報３２０を用いて、当該フレームにおける動的内容の比率に基づいてスキップマスクパラメータ５５０を設定する。より具体的には、ここでは、選択的分割判定部２２５は、当該フレームの全画素数に対して、動的クラスに対応するオブジェクトの画素数の比率を計算し、計算した比率が所定の動的内容の割合に関する基準を満たすか否かを判定する。
計算した比率が所定の動的内容の割合に関する基準を満たさない場合、スキップマスクパラメータ５５０を設定・更新しない。一方、計算した比率が所定の動的内容の割合に関する基準を満たす場合、スキップマスクパラメータ５５０は「１」に設定される。 Next, the selective split determination unit 225 sets the skip mask parameters 550 based on the ratio of dynamic content in the frame using the semantic information 320 of the image generated by the dynamic object splitter 230. More specifically, the selective split determination unit 225 calculates the ratio of the number of pixels of the object corresponding to the dynamic class to the total number of pixels of the frame, and determines whether the calculated ratio satisfies a predetermined criterion regarding the ratio of dynamic content.
If the calculated ratio does not satisfy the predetermined criterion regarding the ratio of dynamic content, the skip mask parameter 550 is neither set nor updated. If the calculated ratio satisfies the criterion, the skip mask parameter 550 is set to "1"; since the remainder of any image index divided by 1 is 0, every subsequent frame is then treated as a dynamic frame.
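The ratio-based update can be sketched as follows. The criterion is assumed here to be "ratio greater than or equal to a threshold", and the threshold value of 0.3 is purely hypothetical; the disclosure leaves both unspecified.

```python
def dynamic_ratio(dynamic_mask):
    """Fraction of pixels flagged as belonging to dynamic-class objects."""
    total = sum(len(row) for row in dynamic_mask)
    dynamic = sum(sum(row) for row in dynamic_mask)  # True counts as 1
    return dynamic / total if total else 0.0


def update_skip_mask(skip_mask, dynamic_mask, ratio_threshold=0.3):
    """Set the skip mask parameter to 1 when the (assumed) criterion is met; otherwise keep it."""
    if dynamic_ratio(dynamic_mask) >= ratio_threshold:
        return 1
    return skip_mask


sparse = [[True, False], [False, False]]  # ratio 0.25
dense = [[True, True], [False, False]]    # ratio 0.5
```

With these inputs, `update_skip_mask(5, sparse)` leaves the parameter at 5, while `update_skip_mask(5, dense)` sets it to 1.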

また、選択的分割判定部２２５は、移動体位置履歴情報３１０を用いてスキップマスクパラメータ５５０を設定してもよい。上述したように、この移動体位置履歴情報３１０は、複数の異なるフレームにおける移動体／撮影部の位置を示す情報である。選択的分割判定部２２５は、この移動体位置履歴情報３１０に示されている移動体／撮影部の位置の変化に基づいて、移動体／撮影部の移動速度を推定する。その後、選択的分割判定部２２５は、推定した移動速度に基づいて、スキップマスクパラメータ５５０を設定・更新してもよい。ここでは、原則として、選択的分割判定部２２５は、移動体／撮影部の移動速度が早ければ早い程、スキップマスクパラメータ５５０をより大きい値とし、移動体／撮影部の移動速度が遅ければ遅い程、スキップマスクパラメータ５５０をより小さい値とすることが望ましい。
例えば、ある実施例では、選択的分割判定部２２５は、計算した移動体／撮影部の移動速度を事前に設定された閾値に基づいて「Fast」、「Medium」、「Slow」等のカテゴリーのいずれかに分類し、カテゴリーに応じてスキップマスクパラメータ５５０を設定してもよい。一例として、選択的分割判定部２２５は、移動体／撮影部の移動速度を「Fast」と判定した場合、スキップマスクパラメータ５５０を「２」に設定し、移動体／撮影部の移動速度を「Medium」と判定した場合、スキップマスクパラメータ５５０を「５」に設定し、移動体／撮影部の移動速度を「Slow」と判定した場合、スキップマスクパラメータ５５０を「１０」に設定してもよい。 In addition, the selective division determination unit 225 may set the skip mask parameter 550 using the moving object position history information 310. As described above, this moving object position history information 310 is information indicating the position of the moving object/photographing unit in a plurality of different frames. The selective division determination unit 225 estimates the moving speed of the moving object/photographing unit based on the change in the position of the moving object/photographing unit indicated in this moving object position history information 310. Thereafter, the selective division determination unit 225 may set/update the skip mask parameter 550 based on the estimated moving speed. Here, as a general rule, it is desirable for the selective division determination unit 225 to set the skip mask parameter 550 to a larger value the faster the moving speed of the moving object/photographing unit is, and to set the skip mask parameter 550 to a smaller value the slower the moving speed of the moving object/photographing unit is.
For example, in one embodiment, the selective division determination unit 225 may classify the calculated movement speed of the moving object/image capturing unit into one of several categories, such as "Fast", "Medium", and "Slow", based on preset thresholds, and set the skip mask parameter 550 according to the category. As an example, the selective division determination unit 225 may set the skip mask parameter 550 to "2" when it determines the movement speed to be "Fast", to "5" when it determines the movement speed to be "Medium", and to "10" when it determines the movement speed to be "Slow".
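A minimal sketch of the speed-based setting is given below. It uses the example values from the text (Fast: 2, Medium: 5, Slow: 10); the history format of `(time, x, y)` tuples, the averaging method, and the category thresholds are all hypothetical choices made for illustration.

```python
import math

# Skip mask values taken from the example in the text.
SKIP_MASK_BY_CATEGORY = {"Fast": 2, "Medium": 5, "Slow": 10}


def estimate_speed(history):
    """Average speed over a position history of (time_s, x, y) tuples (assumed format)."""
    if len(history) < 2:
        return None
    distance = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(history, history[1:])
    )
    elapsed = history[-1][0] - history[0][0]
    return distance / elapsed if elapsed > 0 else None


def speed_category(speed, slow_max=0.5, medium_max=1.5):
    """Hypothetical thresholds; the disclosure only says they are preset."""
    if speed <= slow_max:
        return "Slow"
    if speed <= medium_max:
        return "Medium"
    return "Fast"


def skip_mask_from_history(history, current_skip_mask=None):
    """Keep the current value when no speed can be estimated (e.g. on the first run)."""
    speed = estimate_speed(history)
    if speed is None:
        return current_skip_mask
    return SKIP_MASK_BY_CATEGORY[speed_category(speed)]


history = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 6.0, 8.0)]
skip_mask = skip_mask_from_history(history)  # 10 units in 2 s, i.e. speed 5.0
```

Keeping the current value when the history is too short reflects the note that the moving object position history information 310 does not exist when the map generation process 300 runs for the first time.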

以上説明した動的内容判定処理５００によれば、入力映像に含まれる複数のフレーム間の移動量、入力映像を取得した撮影部の移動速度、及びフレームにおいて動的クラスに対応するオブジェクトの画素の比率に基づいてスキップマスクパラメータ５５０を設定及び更新することができる。また、このスキップマスクパラメータ５５０を用いて入力映像を処理することで、動的内容を含まない非動的フレーム（例えば、フレーム間移動が少ないフレーム、移動体／撮影部の移動速度が遅いフレーム、動的オブジェクトに対応する画素が少ないフレーム等）をスキップしつつ、動的内容を含む可能性が高い動的フレーム（例えば、フレーム間移動が多いフレーム、移動体／撮影部の移動速度が速いフレーム、動的オブジェクトに対応する画素が多いフレーム等）を判定することができる。 According to the dynamic content determination process 500 described above, the skip mask parameters 550 can be set and updated based on the amount of movement between multiple frames included in the input video, the movement speed of the image capture unit that captured the input video, and the ratio of pixels of objects corresponding to dynamic classes in the frames. In addition, by processing the input video using the skip mask parameters 550, it is possible to skip non-dynamic frames that do not contain dynamic content (e.g., frames with little movement between frames, frames with a slow movement speed of the moving body/image capture unit, frames with few pixels corresponding to dynamic objects, etc.), while determining dynamic frames that are likely to contain dynamic content (e.g., frames with a lot of movement between frames, frames with a fast movement speed of the moving body/image capture unit, frames with many pixels corresponding to dynamic objects, etc.).

次に、図６を参照して、本開示の実施例１に係る位置特定装置における特徴分類処理の流れについて説明する。 Next, the flow of the feature classification process in the position identification device according to the first embodiment of the present disclosure will be described with reference to FIG. 6.

図６は、本開示の実施例１に係る位置特定装置における特徴分類処理６００の流れを示すブロック図である。図６に示す特徴分類処理６００は、入力映像の特徴を静的特徴か動的特徴かに分類するための処理であり、主に画像特徴抽出部２２０、選択的分割判定部２２５、及び特徴分類部２３５によって実行される。 Figure 6 is a block diagram showing the flow of feature classification processing 600 in the position identification device according to the first embodiment of the present disclosure. The feature classification processing 600 shown in Figure 6 is processing for classifying the features of an input video as static features or dynamic features, and is mainly executed by the image feature extraction unit 220, the selective division determination unit 225, and the feature classification unit 235.

上述したように、画像特徴抽出部２２０は、撮影部によって取得された入力映像３０５を入力した後、当該入力映像３０５を解析し、各フレームの特徴６１５を抽出する。これらの特徴６１５は、後述する移動量推定及び特徴分類の際に用いられる。 As described above, the image feature extraction unit 220 inputs the input video 305 acquired by the image capture unit, analyzes the input video 305, and extracts features 615 for each frame. These features 615 are used for estimating the amount of movement and for feature classification, which will be described later.

選択的分割判定部２２５は、入力映像３０５を構成する各フレーム毎に、当該フレームの順番を示す画像指標（ｉｍａｇｅｉｎｄｅｘ）とスキップマスクパラメータ５５０を用いて除算演算（ｍｏｄｕｌｏｏｐｅｒａｔｉｏｎ）を行い、余りが「０」の場合（つまり、画像指標のスキップマスクパラメータ５５０による除算の余りが０の場合）、当該フレームを動的フレームとし、動的オブジェクト分割部２３０に入力する。一方、除算演算の余りが「０」でない場合、当該フレームの特徴に対して特徴分類を実行する。
なお、入力映像における１番目のフレームの場合、スキップマスクパラメータ５５０はまだ未設定であるため、この１番目のフレームは直接に動的オブジェクト分割部２３０に入力される。 The selective division determination unit 225 performs a modulo operation for each frame constituting the input video 305 using an image index indicating the order of the frame and the skip mask parameter 550, and if the remainder is "0" (i.e., if the remainder of the division of the image index by the skip mask parameter 550 is 0), it sets the frame as a dynamic frame and inputs it to the dynamic object division unit 230. On the other hand, if the remainder of the division operation is not "0", feature classification is performed on the features of the frame.
Note that since the skip mask parameter 550 has not yet been set for the first frame in the input video, this first frame is input directly to the dynamic object division unit 230.

動的オブジェクト分割部２３０は、上述したように、予め作成された動的クラス情報３１５を用いて、選択的分割判定部２２５によって予測された動的フレームに対して、動的オブジェクト分割処理を施し、動的フレームにおける各画素について、当該画素が動的クラスに対応するオブジェクトに属するか静的クラスに対応するオブジェクトに属するかを示す画像の意味的情報３２０を生成する。
この画像の意味的情報３２０は、特徴分類部２３５に入力される。 As described above, the dynamic object splitter 230 uses the pre-created dynamic class information 315 to perform dynamic object splitting processing on the dynamic frame predicted by the selective split determination unit 225, and generates image semantic information 320 for each pixel in the dynamic frame indicating whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class.
This image semantic information 320 is input to the feature classification unit 235.

特徴分類部２３５は、動的オブジェクト分割部２３０によって生成された画像の意味的情報３２０と、画像特徴抽出部２２０によって抽出された特徴６１５とに基づいて、入力映像のフレームの各特徴を動的特徴６２２又は静的特徴の候補６２４として分類する。ここで、動的特徴６２２として分類された特徴は外れ値として扱われ、ＳＬＡＭ部２４０の処理から排除される。
一方、静的特徴の候補６２４として分類された特徴については、特徴分類部２３５は、上述したフレーム間の移動量eに基づいて、更なる特徴分類を行ってもよい。より具体的には、特徴分類部２３５は、対象のフレームと、前フレーム５１０との特徴の比較に基づいて推定された移動量ｅを所定の移動量基準Ｍに比較し、移動量ｅが所定の移動量基準Ｍを超える場合、当該特徴を動的特徴６２２として分類し、移動量ｅが所定の移動量基準Ｍ以下の場合、当該特徴を静的特徴６３２として分類する。 The feature classification unit 235 classifies each feature of the input video frame as a dynamic feature 622 or a candidate static feature 624 based on the image semantic information 320 generated by the dynamic object segmentation unit 230 and the features 615 extracted by the image feature extraction unit 220. Here, features classified as dynamic features 622 are treated as outliers and excluded from the processing of the SLAM unit 240.
On the other hand, for features classified as static feature candidates 624, the feature classification unit 235 may perform further feature classification based on the above-mentioned inter-frame movement amount e. More specifically, the feature classification unit 235 compares the movement amount e estimated based on the comparison of features between the target frame and the previous frame 510 with a predetermined movement amount reference M, and classifies the feature as a dynamic feature 622 if the movement amount e exceeds the predetermined movement amount reference M, and classifies the feature as a static feature 632 if the movement amount e is equal to or less than the predetermined movement amount reference M.
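As a non-limiting illustration, the two-stage classification described above can be sketched as follows. The `Feature` class, the boolean mask representation of the image semantic information 320, and the function name `classify_features` are assumptions introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Feature:
    x: int            # pixel column of the feature
    y: int            # pixel row of the feature
    movement: float   # estimated inter-frame movement amount e


def classify_features(
    features: Sequence[Feature],
    dynamic_mask: Sequence[Sequence[bool]],  # True where a pixel belongs to a dynamic-class object
    movement_reference: float,               # predetermined movement amount reference M
) -> Tuple[List[Feature], List[Feature]]:
    """Split features into (dynamic, static) lists."""
    dynamic: List[Feature] = []
    static: List[Feature] = []
    for f in features:
        if dynamic_mask[f.y][f.x]:
            # Stage 1: the semantic information marks the pixel as dynamic.
            dynamic.append(f)
        elif f.movement > movement_reference:
            # Stage 2: a static feature candidate that moved more than M.
            dynamic.append(f)
        else:
            static.append(f)
    return dynamic, static
```

Only the features in the returned static list would be passed on to the SLAM unit 240; the dynamic features are treated as outliers.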

以上説明した特徴分類処理６００によれば、入力映像の特徴を静的特徴６３１または動的特徴６２２のいずれかに分類することができる。更に、以上説明したように、特徴分類を画像の意味的情報３２０と、フレーム間の移動量とに基づいて二重に行うことにより、スキップマスクパラメータの設定によりスキップされた非動的フレームに存在する動的内容を特定することができると共に、画像の意味的情報３２０に基づいた判定により誤認されたオブジェクト（動的クラスに対応するのに移動していないオブジェクトや、静的クラスに対応するのに移動したオブジェクト）を正しく分類することができる。 According to the feature classification process 600 described above, the features of the input image can be classified into either static features 631 or dynamic features 622. Furthermore, as described above, by performing feature classification twice based on the image semantic information 320 and the amount of movement between frames, it is possible to identify dynamic content present in non-dynamic frames that have been skipped by setting the skip mask parameters, and to correctly classify objects that have been misidentified based on the image semantic information 320 (objects that do not move even though they correspond to a dynamic class, or objects that have moved even though they correspond to a static class).

次に、図７を参照して、本開示の実施例２に係る位置特定装置のハードウェア構成について説明する。 Next, the hardware configuration of the location identification device according to the second embodiment of the present disclosure will be described with reference to FIG. 7.

図７は、本開示の実施例２に係る位置特定装置７００のハードウェア構成の一例を示す図である。
上述したように、本開示の実施例１に係るＳＬＡＭ部２４０及び位置特定部２４５は、任意の環境の３次元地図において、移動体２７５の位置及び向きを示す第１の移動体位置情報を生成する。本開示の実施例２に係る位置特定装置７００は、この第１の移動体位置情報と、移動体の移動に関する移動センサデータから導出される第２の移動位置情報とに基づいて、移動体２７５の位置及び向きをより正確に示す補正済み位置情報を生成することができる。
なお、本開示の実施例２に係る位置特定装置７００は、上述した実施例１に係る位置特定装置２００の構成に加えて、慣性計測部７０５、ＧＰＳ部７１０、フィルタ部７１５、及び位置情報補正部７２０を含む点以外は、実施例１に係る位置特定装置２００と実質的に同様の構成であるため、以下では、繰り返しとなる説明を省略する。 FIG. 7 is a diagram illustrating an example of a hardware configuration of a position identifying device 700 according to the second embodiment of the present disclosure.
As described above, the SLAM unit 240 and the position identification unit 245 according to the first embodiment of the present disclosure generate first moving object position information indicating the position and orientation of the moving object 275 in a three-dimensional map of an arbitrary environment. The position identification device 700 according to the second embodiment of the present disclosure can generate corrected position information that indicates the position and orientation of the moving object 275 more accurately, based on this first moving object position information and on second moving object position information derived from movement sensor data relating to the movement of the moving object.
The position identification device 700 according to the second embodiment has substantially the same configuration as the position identification device 200 according to the first embodiment, except that it additionally includes an inertial measurement unit 705, a GPS unit 710, a filter unit 715, and a position information correction unit 720. Repeated explanations are therefore omitted below.

慣性計測部（ｉｎｅｒｔｉａｌｍｅａｓｕｒｅｍｅｎｔｕｎｉｔ）７０５は、運動を司る３軸の角度（または角速度）と加速度とを移動センサデータとして検出するための機能部である。慣性計測部７０５は、取得した３軸のジャイロと３方向の加速度計によって、移動体２７５の３次元の角速度及び加速度を計算することができる。 The inertial measurement unit 705 is a functional unit for detecting, as movement sensor data, the angles (or angular velocities) about three axes and the accelerations that characterize motion. Using a three-axis gyroscope and a three-axis accelerometer, the inertial measurement unit 705 can calculate the three-dimensional angular velocity and acceleration of the moving body 275.

ＧＰＳ(ＧｌｏｂａｌＰｏｓｉｔｉｏｎｉｎｇＳｙｓｔｅｍ)部７１０は、ＧＰＳ衛星からの信号に基づいて、移動体２７５の地理的位置情報（座標等）を取得するための機能部である。 The GPS (Global Positioning System) unit 710 is a functional unit for acquiring geographical position information (coordinates, etc.) of the mobile unit 275 based on signals from GPS satellites.

フィルタ部７１５は、慣性計測部７０５によって取得された移動センサデータに対して所定のフィルタ処理を施すことで、当該移動センサデータからノイズを除去するための機能部である。 The filter unit 715 is a functional unit that applies a predetermined filter process to the movement sensor data acquired by the inertial measurement unit 705 to remove noise from the movement sensor data.

位置情報補正部７２０は、上述したＳＬＡＭ部２４０又は位置特定部２４５によって生成される第１の移動体位置情報と、慣性計測部７０５によって取得される移動センサデータから導出される第２の移動体位置情報とに基づいて、移動体２７５の位置及び向きをより正確に示す補正済み移動体位置情報を生成するための機能部である。 The position information correction unit 720 is a functional unit for generating corrected moving body position information that more accurately indicates the position and orientation of the moving body 275 based on the first moving body position information generated by the above-mentioned SLAM unit 240 or the position identification unit 245 and the second moving body position information derived from the moving sensor data acquired by the inertial measurement unit 705.

以上説明したように構成した、本開示の実施例２に係る位置特定装置７００によれば、実施例１に係る位置特定装置２００に比べて、移動体２７５の位置をより正確に特定することができると共に、ＳＬＡＭ部２４０によって生成される３次元地図をＧＰＳデータに連携させることができる。 The position determining device 700 according to the second embodiment of the present disclosure, configured as described above, can determine the position of the moving body 275 more accurately than the position determining device 200 according to the first embodiment, and can link the 3D map generated by the SLAM unit 240 to GPS data.

次に、図８を参照して、本開示の実施例２に係る位置特定装置における移動体位置情報補正処理の流れについて説明する。 Next, the flow of the moving object position information correction process in the position identification device according to the second embodiment of the present disclosure will be described with reference to FIG. 8.

図８は、本開示の実施例２に係る位置特定装置における移動体位置情報補正処理８００の流れを示すブロック図である。図８に示す移動体位置情報補正処理８００は、ＳＬＡＭ部２４０又は位置特定部２４５によって生成される第１の移動体位置情報と、慣性計測部によって生成される第２の移動体位置情報とに基づいて、移動体２７５の位置及び向きをより正確に示す補正済み移動体位置情報を生成するための処理である。そして、移動体位置情報補正処理８００は、ＳＬＡＭ部２４０、位置特定部２４５、慣性計測部７０５、フィルタ部７１５、及び位置情報補正部７２０によって実行される。 FIG. 8 is a block diagram showing the flow of a moving object position information correction process 800 in a position identification device according to a second embodiment of the present disclosure. The moving object position information correction process 800 shown in FIG. 8 is a process for generating corrected moving object position information that more accurately indicates the position and orientation of the moving object 275, based on the first moving object position information generated by the SLAM unit 240 or the position identification unit 245 and the second moving object position information generated by the inertial measurement unit. The moving object position information correction process 800 is executed by the SLAM unit 240, the position identification unit 245, the inertial measurement unit 705, the filter unit 715, and the position information correction unit 720.

まず、撮影部によって取得される入力映像３０５は、位置特定装置のＳＬＡＭ部２４０又は位置特定部２４５に入力される。ＳＬＡＭ部２４０又は位置特定部２４５は、図３に示す地図生成処理３００又は図４に示す位置特定処理４００を行うことで、移動体の位置（例えば、地理的座標等）及び向き（北、東、西、南等）を示す第１の移動体位置情報３３０を生成する。 First, the input image 305 acquired by the imaging unit is input to the SLAM unit 240 or the position identification unit 245 of the position identification device. The SLAM unit 240 or the position identification unit 245 performs the map generation process 300 shown in FIG. 3 or the position identification process 400 shown in FIG. 4 to generate first mobile object position information 330 indicating the position (e.g., geographic coordinates, etc.) and direction (north, east, west, south, etc.) of the mobile object.

また、慣性計測部７０５は、移動体が移動しながら、慣性計測部７０５を含む位置特定装置を備えた移動体の３次元の角速度及び加速度を示す移動センサデータ８０５を取得する。 While the moving body is moving, the inertial measurement unit 705 acquires movement sensor data 805 indicating the three-dimensional angular velocity and acceleration of the moving body, which is equipped with the position identification device including the inertial measurement unit 705.

次に、フィルタ部７１５は、慣性計測部７０５によって取得された移動センサデータ８０５に対して、所定のフィルタ処理を施すことで、当該センサデータからノイズを除去し、フィルタ済みデータ８１０を生成する。
ここでは、移動センサデータ８０５からノイズを除去するために、任意の既存のノイズ除去手段を用いてもよく、本開示では特に限定されない。 Next, the filter unit 715 performs a predetermined filtering process on the movement sensor data 805 acquired by the inertial measurement unit 705 to remove noise from the sensor data and generate filtered data 810 .
Any existing noise removal technique may be used to remove noise from the movement sensor data 805; the present disclosure does not limit the technique.
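As one non-limiting example of such noise removal, a first-order low-pass filter (exponential moving average) applied to a single sensor channel can be sketched as follows. The choice of filter, the function name `low_pass`, and the smoothing factor `alpha` are assumptions for illustration; the disclosure leaves the filter unspecified.

```python
from typing import List, Sequence


def low_pass(samples: Sequence[float], alpha: float = 0.2) -> List[float]:
    """Smooth one channel of sensor samples with an exponential moving average.

    alpha is a hypothetical smoothing factor in (0, 1]; smaller values suppress
    more high-frequency noise at the cost of a slower response.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered: List[float] = []
    for sample in samples:
        if not filtered:
            # The first output is initialized to the first sample.
            filtered.append(sample)
        else:
            filtered.append(alpha * sample + (1.0 - alpha) * filtered[-1])
    return filtered
```

In this sketch, each axis of the angular velocity and acceleration in the movement sensor data 805 would be filtered independently to obtain the filtered data 810.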

次に、位置情報補正部７２０は、フィルタ部７１５によって生成されたフィルタ済みデータ８１０に基づいて、移動体の位置及び向きを示す第２の移動体位置情報８３０を生成する。なお、実際には、この第２の移動体位置情報８３０は、慣性計測部７０５によって取得された移動センサデータから導出された慣性計測部７０５の位置及び向きを示す情報である。そして、この第２の移動体位置情報８３０は、慣性計測部７０５は移動体に備えられているため、移動体の位置及び向きを示す情報でもある。
ここでは、フィルタ済みデータ８１０から移動体の位置及び向きを導出するためには、位置情報補正部７２０は、任意の既存の手段を用いてもよく、本開示では特に限定されない。 Next, the position information correction unit 720 generates second moving object position information 830 indicating the position and orientation of the moving object based on the filtered data 810 generated by the filter unit 715. Note that, in reality, this second moving object position information 830 is information indicating the position and orientation of the inertial measurement unit 705 derived from the mobile sensor data acquired by the inertial measurement unit 705. And, since the inertial measurement unit 705 is provided in the moving object, this second moving object position information 830 is also information indicating the position and orientation of the moving object.
The position information correction unit 720 may use any existing technique to derive the position and orientation of the moving object from the filtered data 810; the present disclosure does not limit the technique.

次に、位置情報補正部７２０は、上述した第１の移動体位置情報３３０と、第２の移動体位置情報８３０とに基づいて、移動体の位置及び向きを第１の移動体位置情報３３０又は第２の移動体位置情報８３０よりも正確に示すことができる補正済み移動体位置情報８５０を生成する。
ここで生成される補正済み移動体位置情報８５０は、例えば位置特定装置のインターフェース部（例えば図７に示す位置特定装置７００のインターフェース部２５０）を介して出力されてもよい。 Next, the position information correction unit 720 generates corrected moving body position information 850 based on the above-mentioned first moving body position information 330 and second moving body position information 830, which can indicate the position and orientation of the moving body more accurately than the first moving body position information 330 or the second moving body position information 830.
The corrected moving body position information 850 generated here may be output, for example, via an interface unit of the position identification device (for example, the interface unit 250 of the position identification device 700 shown in FIG. 7).
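The passage above does not specify how the first and second moving body position information are combined. Purely as an illustrative assumption, a fixed-weight blend of two planar poses can be sketched as follows; the `Pose` class, the function name `fuse_poses`, and the weight `weight_first` are hypothetical, and a practical device might instead use a Kalman filter or a similar estimator.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float        # position in map coordinates
    y: float
    heading: float  # orientation in radians


def fuse_poses(first: Pose, second: Pose, weight_first: float = 0.7) -> Pose:
    """Blend a SLAM-derived pose (first) with an IMU-derived pose (second)."""
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must be in [0, 1]")
    w1, w2 = weight_first, 1.0 - weight_first
    x = w1 * first.x + w2 * second.x
    y = w1 * first.y + w2 * second.y
    # Blend headings on the unit circle so that angles near +pi and -pi
    # average to pi rather than to 0.
    s = w1 * math.sin(first.heading) + w2 * math.sin(second.heading)
    c = w1 * math.cos(first.heading) + w2 * math.cos(second.heading)
    return Pose(x, y, math.atan2(s, c))
```

The heading is averaged through its sine and cosine rather than directly, because a direct average of two nearly opposite-signed angles close to the wrap-around point would point in the wrong direction.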

以上説明したように、本開示の実施例２に係る位置特定装置において、ＳＬＡＭ部２４０又は位置特定部２４５によって生成された第１の移動体位置情報３３０と、慣性計測部７０５によって生成される第２の移動体位置情報８３０とに基づいて、移動体の位置及び向きをより正確に示す補正済み移動体位置情報を生成することができる。 As described above, in the position determination device according to the second embodiment of the present disclosure, corrected moving body position information that more accurately indicates the position and orientation of the moving body can be generated based on the first moving body position information 330 generated by the SLAM unit 240 or the position determination unit 245 and the second moving body position information 830 generated by the inertial measurement unit 705.

次に、図９を参照して、本開示の実施例２に係る位置特定装置におけるＧＰＳ対応付け処理及び地図読み込み処理の流れについて説明する。 Next, the flow of the GPS association process and the map reading process in the position identification device according to the second embodiment of the present disclosure will be described with reference to FIG. 9.

上述したように、本開示の実施例２に係る位置特定装置７００は、ＧＰＳ部７１０を備える。このＧＰＳ部７１０を用いて、ＧＰＳデータをＳＬＡＭ部２４０によって生成される３次元地図に対応付けることで、所定の環境における移動体の位置のみならず、移動体の全地球における位置（ＧｌｏｂａｌＰｏｓｉｔｉｏｎ）を特定することが可能となる。 As described above, the positioning device 700 according to the second embodiment of the present disclosure includes a GPS unit 710. By using the GPS unit 710 to associate GPS data with the three-dimensional map generated by the SLAM unit 240, it becomes possible to determine not only the position of a mobile body in a specific environment, but also the global position of the mobile body.

図９は、本開示の実施例２に係る位置特定装置におけるＧＰＳ対応付け処理９００及び地図読み込み処理９５０の流れを示すブロック図である。 Figure 9 is a block diagram showing the flow of the GPS association process 900 and the map reading process 950 in the position determination device according to the second embodiment of the present disclosure.

まず、ＧＰＳ対応付け処理９００について説明する。ＧＰＳ対応付け処理９００とは、ＧＰＳデータをＳＬＡＭ部２４０によって生成される３次元地図に対応付けて地図データベース９２５格納するための処理である。 First, the GPS association process 900 will be described. The GPS association process 900 is a process for associating GPS data with a three-dimensional map generated by the SLAM unit 240 and storing the data in the map database 925.

まず、ＳＬＡＭ部２４０は、例えば図３に示す地図生成処理３００に従って、入力映像３０５に対応する３次元地図３２５を生成する。 First, the SLAM unit 240 generates a 3D map 325 corresponding to the input image 305, for example, according to the map generation process 300 shown in FIG. 3.

３次元地図３２５が作成された後、ＧＰＳ部７１０は、移動体の全地球における位置（ＧｌｏｂａｌＰｏｓｉｔｉｏｎ）を示すＧＰＳデータ９０５を取得する。その後、ＧＰＳ部７１０は、取得したＧＰＳデータ９０５を、３次元地図３２５に対応付ける。ここで、ＧＰＳデータ９０５を３次元地図３２５に対応付ける手段として、ＧＰＳ部７１０は、３次元地図３２５において、特定の地点（公園、オフィスビル、駅等）に対して、当該地点の地理的座標（緯度、経度）を示すジオタグを追加してもよい。
その後、ＧＰＳ部７１０は、ＧＰＳデータ９０５を対応付けた３次元地図３２５を、インターネット等の通信ネットワーク９１５を介して、地理的に離れた場所に設置されているサーバ装置（クラウドサーバ等）に管理されている地図データベース（以下、「地図ＤＢ」）９２５に格納する。 After the three-dimensional map 325 is created, the GPS unit 710 acquires GPS data 905 indicating the global position of the mobile object. Then, the GPS unit 710 associates the acquired GPS data 905 with the three-dimensional map 325. Here, as a means for associating the GPS data 905 with the three-dimensional map 325, the GPS unit 710 may add a geotag indicating the geographical coordinates (latitude, longitude) of a specific point (such as a park, an office building, or a station) in the three-dimensional map 325.
Thereafter, the GPS unit 710 sends the three-dimensional map 325 associated with the GPS data 905 over a communication network 915 such as the Internet and stores it in a map database (hereinafter, "map DB") 925 managed by a server device (such as a cloud server) installed at a geographically remote location.
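As a non-limiting illustration, the geotagging and storage steps of the GPS association process 900 can be sketched as follows. The `GeoTag` and `Map3D` classes, the function names, and the in-memory dictionary standing in for the map DB 925 are all assumptions made for this sketch; no network transfer is modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class GeoTag:
    label: str          # e.g. "station" or "office building"
    latitude: float     # degrees
    longitude: float    # degrees
    map_point: Point3D  # the tagged point in map coordinates


@dataclass
class Map3D:
    map_id: str
    points: List[Point3D]
    geotags: List[GeoTag] = field(default_factory=list)


def attach_geotag(map3d: Map3D, label: str, latitude: float,
                  longitude: float, map_point: Point3D) -> None:
    """Associate GPS coordinates with a specific point of the 3D map."""
    if not -90.0 <= latitude <= 90.0 or not -180.0 <= longitude <= 180.0:
        raise ValueError("latitude/longitude out of range")
    map3d.geotags.append(GeoTag(label, latitude, longitude, map_point))


def store_map(map_db: Dict[str, Map3D], map3d: Map3D) -> None:
    """Store the geotagged map; a plain dict stands in for the remote map DB."""
    map_db[map3d.map_id] = map3d
```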

次に、地図読み込み処理９５０について説明する。地図読み込み処理９５０は、ＧＰＳデータに対応付けられている３次元地図を読み込むための処理である。この地図読み込み処理９５０は、ＧＰＳデータ９０５を対応付けた３次元地図３２５が既に作成されており、地図ＤＢ９２５に格納されている環境の中に移動体が入場した際に自動的に行われてもよい。 Next, the map loading process 950 will be described. The map loading process 950 is a process for loading a three-dimensional map associated with GPS data. This map loading process 950 may be performed automatically when a mobile object enters an environment in which a three-dimensional map 325 associated with GPS data 905 has already been created and stored in the map DB 925.

移動体のＧＰＳ部７１０は、位置特定装置７００（つまり、移動体）の地理的情報（緯度、経度等）９５５を含む地図要求を、通信ネットワーク９１５を介して、地図ＤＢ９２５に送信する。なお、本開示において「地図要求」とは、移動体の現在の地理的情報に対応する、予め作成されている環境地図（例えば、上述したＧＰＳ対応付け処理９００によって地図ＤＢ９２５に格納された３次元地図）を要求する情報である。
その後、地図ＤＢ９２５は、位置特定装置７００から受信した地図要求に含まれる地理的情報９５５に対応する３次元地図３２５を検索し、検索した３次元地図３２５を、通信ネットワーク９１５を介して送信する。その後、３次元地図３２５を受信した位置特定装置７００は、この３次元地図３２５を用いて、例えば図４を参照して説明した位置特定処理４００を行うことで、位置特定装置７００を備えた移動体の全地球における位置を特定することができる。 The GPS unit 710 of the mobile body transmits a map request including geographical information (latitude, longitude, etc.) 955 of the position identification device 700 (i.e., the mobile body) to the map DB 925 via the communication network 915. Note that in this disclosure, a "map request" is information requesting a pre-created environmental map (e.g., a three-dimensional map stored in the map DB 925 by the above-mentioned GPS association process 900) that corresponds to the current geographical information of the mobile body.
Thereafter, the map DB 925 searches for the three-dimensional map 325 corresponding to the geographical information 955 included in the map request received from the position identification device 700, and transmits the retrieved three-dimensional map 325 to the position identification device 700 via the communication network 915. The position identification device 700 that has received the three-dimensional map 325 can then use it to perform, for example, the position identification process 400 described with reference to FIG. 4, thereby identifying the global position of the mobile object equipped with the position identification device 700.
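As a non-limiting illustration, the search performed by the map DB 925 in response to a map request can be sketched as a nearest-geotag lookup. The entry layout, the function names, and the search radius `max_distance_m` are assumptions for this sketch; the disclosure does not specify how a map is matched to the geographical information 955.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical map DB entry: (map_id, latitude, longitude) of a geotag on that map.
MapEntry = Tuple[str, float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def find_map(entries: List[MapEntry], latitude: float, longitude: float,
             max_distance_m: float = 500.0) -> Optional[str]:
    """Return the id of the nearest stored map within max_distance_m, or None."""
    best_id: Optional[str] = None
    best_distance = max_distance_m
    for map_id, lat, lon in entries:
        distance = haversine_m(latitude, longitude, lat, lon)
        if distance <= best_distance:
            best_id, best_distance = map_id, distance
    return best_id
```

When no stored map lies within the search radius, the function returns `None`, which would correspond to the case in which the mobile object has to generate a new three-dimensional map.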

以上説明したＧＰＳ対応付け処理９００及び地図読み込み処理９５０によれば、ＧＰＳデータをＳＬＡＭ部２４０によって生成される環境の３次元地図に対応付けて格納しておくことで、後に当該環境に入場する移動体は、予め生成された３次元地図を地図ＤＢ９２５から取得し、利用することができるため、何もない状態から３次元地図を生成することが不要となる上、移動体の全地球における位置を特定することができる。 According to the GPS association process 900 and map reading process 950 described above, by storing GPS data in association with a three-dimensional map of the environment generated by the SLAM unit 240, a moving object that later enters the environment can obtain and use the previously generated three-dimensional map from the map DB 925, eliminating the need to generate a three-dimensional map from scratch and enabling the location of the moving object on the entire globe to be identified.

次に、図１０を参照して、本開示の実施例３に係る位置特定装置のハードウェア構成について説明する。 Next, the hardware configuration of the location identification device according to the third embodiment of the present disclosure will be described with reference to FIG. 10.

図１０は、本開示の実施例３に係る位置特定装置１０００のハードウェア構成の一例を示す図である。
上述したように、本開示の実施例１及び実施例２に係る位置特定装置は、任意の環境の３次元地図において、移動体２７５の位置及び向きを示す移動体位置情報を生成することができる。本開示の実施例３に係る位置特定装置１０００は、この移動体位置情報と、環境におけるリスクを示すリスク位置情報とに基づいて、移動体２７５に対して影響を及ぼす可能性があるリスクを知らせることができる。
なお、本開示の実施例３に係る位置特定装置１０００は、上述した実施例２に係る位置特定装置７００の構成に加えて、リスク管理部１０１０及び通知部１０２０を含む点以外、構成が実施例２に係る位置特定装置７００と実質的に同様であるため、以下では、繰り返しとなる説明を省略する。 FIG. 10 is a diagram illustrating an example of a hardware configuration of a position identifying device 1000 according to a third embodiment of the present disclosure.
As described above, the positioning devices according to the first and second embodiments of the present disclosure can generate mobile object position information indicating the position and orientation of the mobile object 275 in a three-dimensional map of an arbitrary environment. The positioning device 1000 according to the third embodiment of the present disclosure can notify the mobile object 275 of a risk that may affect the mobile object 275 based on the mobile object position information and risk position information indicating a risk in the environment.
The position identification device 1000 according to the third embodiment has substantially the same configuration as the position identification device 700 according to the second embodiment, except that it additionally includes a risk management unit 1010 and a notification unit 1020. Repeated explanations are therefore omitted below.

リスク管理部１０１０は、移動体２７５の環境における位置及び向きを示す補正済み移動体位置情報と、リスクの候補の位置を示すリスク位置情報とに基づいて、移動体２７５に対して影響を及ぼす可能性があるリスクを判定するための機能部である。ここでのリスクとは、例えば、移動体２７５の移動を阻止したり、移動体２７５にダメージや障害を与える危険性を有する物事を意味し、例えば大型機械、有害物質、治安が悪い領域等を含んでもよい。 The risk management unit 1010 is a functional unit for determining risks that may affect the mobile unit 275, based on corrected mobile unit position information indicating the position and orientation of the mobile unit 275 in the environment, and risk position information indicating the positions of risk candidates. Here, risk means, for example, things that have the potential to prevent the movement of the mobile unit 275 or cause damage or obstruction to the mobile unit 275, and may include, for example, large machinery, hazardous substances, areas with poor security, etc.

通知部１０２０は、リスク管理部１０１０によって判定されたリスクを移動体２７５に知らせるためのリスク通知を生成し、出力するための機能部である。ある実施例では、この通知は、移動体２７５に影響を及ぼす可能性があると判定されたリスクの位置（地理的座標等）や、当該リスクを回避するための指示等を含んでもよい。 The notification unit 1020 is a functional unit for generating and outputting a risk notification to inform the mobile unit 275 of the risk determined by the risk management unit 1010. In one embodiment, the notification may include the location (e.g., geographic coordinates) of the risk determined to have the potential to affect the mobile unit 275, instructions for avoiding the risk, etc.

以上説明したように、本開示の実施例３に係る位置特定装置１０００によれば、移動体２７５が移動する環境において存在するリスクを判定すると共に、当該リスクを移動体２７５に知らせることができる。これにより、当該環境における移動体２７５の安全性を向上させることができる。 As described above, the positioning device 1000 according to the third embodiment of the present disclosure can determine the risks present in the environment in which the mobile object 275 moves and can notify the mobile object 275 of the risks. This can improve the safety of the mobile object 275 in the environment.

次に、図１１を参照して、本開示の実施例３に係る位置特定装置におけるリスク管理処理の流れについて説明する。 Next, the flow of risk management processing in the location identification device according to the third embodiment of the present disclosure will be described with reference to FIG. 11.

図１１は、本開示の実施例３に係る位置特定装置におけるリスク管理処理１１００の流れを示すブロック図である。リスク管理処理１１００は、移動体に対して影響を及ぼす可能性があるリスクを判定し、当該リスクを知らせる通知を出力するための処理であり、図１０に示す位置特定装置１０００におけるリスク管理部１０１０及び通知部１０２０によって実行される。 FIG. 11 is a block diagram showing the flow of a risk management process 1100 in a position identification device according to a third embodiment of the present disclosure. The risk management process 1100 is a process for determining risks that may affect a moving body and outputting a notification of the risk, and is executed by the risk management unit 1010 and the notification unit 1020 in the position identification device 1000 shown in FIG. 10.

まず、リスク管理部１０１０は、ＳＬＡＭ部２４０によって生成された３次元地図３２５と、位置情報補正部７２０によって生成される補正済み移動体位置情報８５０と、当該環境におけるリスクの位置を示すリスク位置情報１１１５を取得する。ここでのリスク位置情報１１１５は、例えばインターネット等の通信ネットワークを介してアクセス可能なサーバ装置に格納されていてもよい。一例として、対象となる環境が工場の場合、リスク位置情報１１１５は、工場のサーバに格納されていてもよい。
また、ここでは、補正済み移動体位置情報８５０を用いる場合を一例として説明するが、本開示はこれに限定されず、補正済み移動体位置情報８５０の代わりに、第１の移動体位置情報や第２の移動体位置情報を用いてもよい。ただし、補正済み移動体位置情報８５０は、第１の移動体位置情報及び第２の移動体位置情報に比べて精度が高いため、補正済み移動体位置情報８５０を用いることが望ましい。 First, the risk management unit 1010 acquires the 3D map 325 generated by the SLAM unit 240, the corrected mobile object position information 850 generated by the position information correction unit 720, and risk position information 1115 indicating the position of the risk in the environment. The risk position information 1115 here may be stored in a server device accessible via a communication network such as the Internet. As an example, when the target environment is a factory, the risk position information 1115 may be stored in a server in the factory.
In addition, although a case where the corrected moving body position information 850 is used will be described as an example here, the present disclosure is not limited thereto, and the first moving body position information or the second moving body position information may be used instead of the corrected moving body position information 850. However, since the corrected moving body position information 850 has higher accuracy than the first moving body position information and the second moving body position information, it is desirable to use the corrected moving body position information 850.

次に、リスク管理部１０１０は、取得した３次元地図３２５、補正済み移動体位置情報８５０及びリスク位置情報１１１５に基づいて、移動体に対して影響を及ぼす可能性があるリスクを判定する。ここでは、リスク管理部１０１０は、例えば、リスク位置情報１１１５に示されているリスクの候補の中から、移動体の現在位置に対する距離が所定の距離基準未満のリスクの候補を、移動体に対して影響を及ぼす可能性があるリスクとして判定してもよい。 Next, the risk management unit 1010 determines risks that may affect the mobile body based on the acquired three-dimensional map 325, the corrected mobile body position information 850, and the risk position information 1115. Here, the risk management unit 1010 may determine, for example, from among the risk candidates shown in the risk position information 1115, risk candidates whose distance to the current position of the mobile body is less than a predetermined distance standard, as risks that may affect the mobile body.
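As a non-limiting illustration, the distance-based determination described above can be sketched as follows. The function name `risks_near`, the dictionary of risk candidates, and the use of straight-line distance in map coordinates are assumptions for this sketch.

```python
import math
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def risks_near(position: Point3D, candidates: Dict[str, Point3D],
               distance_reference: float) -> List[str]:
    """Return the names of risk candidates closer than the distance reference.

    position is the current position of the moving body, and candidates maps a
    hypothetical risk name to its position, both in the same map coordinates.
    """
    return [
        name
        for name, risk_position in candidates.items()
        # A candidate is a risk only if it is strictly closer than the reference.
        if math.dist(position, risk_position) < distance_reference
    ]
```

The names returned by this function would then be passed to the notification unit 1020 to generate the risk notification.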

その後、通知部１０２０は、リスク管理部１０１０によって判定されたリスクを移動体に知らせるためのリスク通知を生成し、出力する。上述したように、この通知は、移動体に影響を及ぼす可能性があると判定されたリスクの位置（地理的座標等）や、当該リスクを回避するための指示等を含んでもよい。
ある実施例では、通知部１０２０は、作成したリスク通知を、インターフェース部（例えば、図３に示す位置特定装置１０００におけるインターフェース部２５０）に出力し、タッチ画面等を介してユーザに表示してもよい。 Thereafter, the notification unit 1020 generates and outputs a risk notification to inform the mobile unit of the risk determined by the risk management unit 1010. As described above, this notification may include the location (e.g., geographic coordinates) of the risk determined to have the potential to affect the mobile unit, instructions for avoiding the risk, and the like.
In one embodiment, the notification unit 1020 may output the created risk notification to an interface unit (e.g., the interface unit 250 in the position identification device 1000 shown in FIG. 10) and display it to the user via a touch screen or the like.

以上説明したリスク管理処理１１００によれば、移動体が移動する環境において存在するリスクを判定すると共に、当該リスクを移動体に知らせることができる。これにより、当該環境における移動体の安全性を向上させることができる。 The risk management process 1100 described above can determine the risks present in the environment in which the mobile body moves and can notify the mobile body of the risks. This can improve the safety of the mobile body in the environment.

次に、図１２を参照して、本開示の実施例に係るインターフェース部の機能について説明する。 Next, the functions of the interface unit according to the embodiment of the present disclosure will be described with reference to FIG. 12.

上述したように、本開示の実施例に係るインターフェース部は、各種情報をユーザ（例えば、移動体が位置特定装置を備えた個人用端末を持つ人（人間）のユーザの場合）に提供すると共に、ユーザからの入力を受け付けるための機能部である。一例として、インターフェース部２５０は、タッチ画面であってもよい。インターフェース部２５０は、例えばＳＬＡＭ部によって生成された３次元地図を表示したり、当該３次元地図における移動体の位置（つまり、ユーザの位置）を表示したり、移動体の目的地の入力を受け付けたり、当該目的地までの推奨の経路を表示したりしてもよい。
図１２は、本開示の実施例に係るインターフェース部２５０の機能を示すブロック図である。図１２に示すように、インターフェース部２５０は、３次元地図表示機能１２０５、移動体位置表示機能１２１０、目的地設定機能１２１５、経路案内機能１２２０、地図検索機能１２２５、地図読み込み機能１２３０及びリスク表示機能１２３５を含む。 As described above, the interface unit according to the embodiment of the present disclosure is a functional unit for providing various information to a user (e.g., in the case where the mobile object is a human user having a personal terminal equipped with a position identification device) and accepting input from the user. As an example, the interface unit 250 may be a touch screen. The interface unit 250 may, for example, display a three-dimensional map generated by the SLAM unit, display the position of the mobile object on the three-dimensional map (i.e., the user's position), accept input of a destination of the mobile object, and display a recommended route to the destination.
FIG. 12 is a block diagram showing the functions of the interface unit 250 according to an embodiment of the present disclosure. As shown in FIG. 12, the interface unit 250 includes a three-dimensional map display function 1205, a mobile object position display function 1210, a destination setting function 1215, a route guidance function 1220, a map search function 1225, a map reading function 1230, and a risk display function 1235.

３次元地図表示機能１２０５は、図３を参照して説明した地図生成処理３００よって生成される３次元地図、又は地図データベースから取得される３次元地図を表示するための機能である。ここで、インターフェース部２５０は、３次元地図を、例えばタッチ画面等を介してユーザに表示してもよい。 The three-dimensional map display function 1205 is a function for displaying a three-dimensional map generated by the map generation process 300 described with reference to FIG. 3, or a three-dimensional map obtained from a map database. Here, the interface unit 250 may display the three-dimensional map to the user via, for example, a touch screen.

移動体位置表示機能１２１０は、３次元地図における移動体の位置を表示するための機能である。ここで、移動体位置表示機能１２１０は、上述した第１の移動体位置情報、第２の移動体位置情報、及び補正済み移動体位置情報のいずれかに従って移動体の位置を３次元地図に表示してもよいが、補正済み移動体位置情報は、第１の移動体位置情報及び第２の移動体位置情報に比べて精度が高いため、補正済み移動体位置情報を用いることが望ましい。 The mobile object position display function 1210 is a function for displaying the position of a mobile object on a three-dimensional map. Here, the mobile object position display function 1210 may display the position of a mobile object on a three-dimensional map according to any of the above-mentioned first mobile object position information, second mobile object position information, and corrected mobile object position information. However, since the corrected mobile object position information has higher accuracy than the first mobile object position information and the second mobile object position information, it is preferable to use the corrected mobile object position information.

目的地設定機能１２１５は、ユーザ（例えば、移動体が位置特定装置を備えた個人用端末を持つ人間ユーザの場合）の入力に基づいて、移動体（つまり、ユーザ）の目的地を設定するための機能である。例えば、ここでは、人のユーザは、インターフェース部２５０によってタッチ画面等で表示される３次元地図に対して、希望の目的地を指等で選択することにより、目的地を設定してもよい。 The destination setting function 1215 is a function for setting the destination of a mobile object (i.e., the user) based on input from a user (e.g., when the mobile object is a human user having a personal terminal equipped with a positioning device). For example, here, the human user may set the destination by selecting the desired destination with a finger or the like on a three-dimensional map displayed on a touch screen or the like by the interface unit 250.

経路案内機能１２２０は、移動体を、目的地設定機能１２１５で設定された目的地まで案内するための推奨の経路を判定し、表示するための機能である。ここでは、経路案内１２２０は、例えば既存の自動運転機能等に従って経路を判定してもよい。 The route guidance function 1220 is a function for determining and displaying a recommended route for guiding a mobile object to the destination set by the destination setting function 1215. Here, the route guidance function 1220 may determine the route according to, for example, an existing automated driving function.

地図検索機能１２２５は、上述したＧＰＳ部等によって取得された移動体の地理的情報に基づいて、当該地理的情報に対応する作成済みの３次元地図を、サーバに保存される地図ＤＢ（例えば、図９に示す地図ＤＢ９２５）から取得するための機能である。 The map search function 1225 is a function for obtaining a created 3D map corresponding to the geographical information of a moving object obtained by the above-mentioned GPS unit or the like from a map DB (e.g., map DB 925 shown in FIG. 9) stored in the server based on the geographical information of the moving object.

地図読み込み機能１２３０は、地図検索機能１２２５によって地図ＤＢから取得された作成済みの３次元地図を読み込んでインターフェース部を介して表示するための機能である。 The map loading function 1230 is a function for loading a created 3D map obtained from the map DB by the map search function 1225 and displaying it via the interface unit.

リスク表示機能１２３５は、図１１を参照して説明したリスク管理処理１１００によって判定され、通知されたリスクを表示するための機能である。ここで、移動体に対して影響を及ぼす可能性があるリスクの情報は、インターフェース部を介して表示される３次元地図において表示されてもよい。 The risk display function 1235 is a function for displaying the risks determined and notified by the risk management process 1100 described with reference to FIG. 11. Here, information on risks that may affect the mobile object may be displayed on a three-dimensional map displayed via the interface unit.

以上説明したインターフェース部によれば、環境内で移動する人のユーザは、位置特定装置に出力される各種情報を容易に確認することができる。 The interface unit described above allows a human user moving within the environment to easily check the various pieces of information output by the position identification device.

次に、図１３を参照して、本開示の実施例に係るスキップマスクパラメータの使用例について説明する。 Next, an example of the use of the skip mask parameter according to an embodiment of the present disclosure will be described with reference to FIG. 13.

As described above, the dynamic content determination process according to the embodiment of the present disclosure (for example, the dynamic content determination process 500 shown in FIG. 5) is performed based on a skip mask parameter. The skip mask parameter specifies the number of frames in the input video to be excluded from the dynamic object segmentation process. With this parameter, frames that are unlikely to contain dynamic content are skipped (that is, excluded from the dynamic object segmentation process), and only frames that are likely to contain dynamic content are subjected to the process. Compared with performing the dynamic object segmentation process on every frame of the input video, this significantly saves computing resources.
The skip mask parameter is set and updated based on the amount of movement between the frames included in the input video, the movement speed of the image capture unit that captured the input video, and the ratio of pixels in a frame that belong to objects of the dynamic class.
FIG. 13 is a diagram showing an example of using the skip mask parameter according to an embodiment of the present disclosure. For convenience of explanation, the following example describes a case where the skip mask parameter is set based on the ratio of dynamic content in a frame. As described above, however, the present disclosure is not limited to this, and the skip mask parameter may be set according to not only the ratio of dynamic content but also the movement speed of the mobile object or image capture unit, the movement between frames, and the like.

First, assume that an input video 1300 including at least a plurality of frames, such as frame 1301, frame 1307, frame 1308, frame 1309, and frame 1310, is acquired, input to the location determination device according to an embodiment of the present disclosure, and subjected to the dynamic content determination process. As described above, when the dynamic content determination process is performed for the first time, the skip mask parameter has not yet been set, so the selective segmentation determination unit inputs frame 1301, the first frame, directly to the dynamic object segmentation unit.

When the dynamic object segmentation unit analyzes frame 1301, it finds no dynamic objects such as humans, so the ratio of dynamic content is low and the frame is determined not to satisfy the dynamic content ratio criterion. The skip mask parameter is therefore set to a predetermined value such as "5 frames." The value of the skip mask parameter may be set by, for example, an administrator of the location determination device, or may be determined based on input video previously used for map generation.
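The ratio-based update described above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the label values `DYNAMIC` and `STATIC`, the function names, and the 5% threshold standing in for the dynamic content ratio criterion are assumptions. Only the example parameter values "5 frames" and "0 frames" come from the description of FIG. 13.

```python
from typing import Sequence

# Hypothetical per-pixel labels, as a segmentation result might provide them.
DYNAMIC = 1  # pixel belongs to an object of a dynamic class (e.g., a human)
STATIC = 0   # pixel belongs to an object of a static class


def dynamic_content_ratio(mask: Sequence[Sequence[int]]) -> float:
    """Fraction of pixels in the frame labeled as belonging to a dynamic class."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    dynamic = sum(1 for row in mask for label in row if label == DYNAMIC)
    return dynamic / total


def update_skip_mask(ratio: float, threshold: float = 0.05,
                     default_skip: int = 5) -> int:
    """Return 0 when the (assumed) ratio criterion is met, else the preset skip value."""
    return 0 if ratio >= threshold else default_skip


# A 2 x 4 frame in which two of the eight pixels belong to a dynamic object.
frame_mask = [
    [STATIC, DYNAMIC, DYNAMIC, STATIC],
    [STATIC, STATIC, STATIC, STATIC],
]
print(update_skip_mask(dynamic_content_ratio(frame_mask)))  # prints 0
```

A frame with no dynamic pixels would instead yield the preset value of 5 under these assumptions.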

Next, in accordance with the skip mask parameter, the selective segmentation determination unit skips the five frames following frame 1301 in the input video 1300 (that is, frames 1302 to 1306, which are not shown in FIG. 13) and inputs frame 1307 to the dynamic object segmentation unit.

When the dynamic object segmentation unit analyzes frame 1307, it finds a dynamic object (a human), so the ratio of dynamic content is high and the frame is determined to satisfy the dynamic content ratio criterion. The skip mask parameter is therefore updated to, for example, "0 frames."

Next, in accordance with the skip mask parameter, the selective segmentation determination unit skips no frames after frame 1307 in the input video 1300 and inputs the next frame, frame 1308, to the dynamic object segmentation unit.

When the dynamic object segmentation unit analyzes frame 1308, it finds a dynamic object (a human), as in frame 1307, so the ratio of dynamic content is high and the frame is determined to satisfy the dynamic content ratio criterion. The skip mask parameter is therefore left unchanged at "0 frames."

Frame 1309, like frames 1307 and 1308, contains a dynamic object (a human), so the ratio of dynamic content is high and the frame is determined to satisfy the dynamic content ratio criterion. The skip mask parameter is therefore left unchanged at "0 frames."
In this manner, the selective segmentation determination unit inputs each frame to the dynamic object segmentation unit without skipping any frames until a frame no longer satisfies the dynamic content ratio criterion.

On the other hand, when frame 1310 is input to the dynamic object segmentation unit and analyzed, it is found to contain no dynamic objects such as humans, so the ratio of dynamic content is low and the frame is determined not to satisfy the dynamic content ratio criterion. The skip mask parameter is therefore set to a predetermined value such as "5 frames," and the selective segmentation determination unit, in accordance with the skip mask parameter, skips the five frames following frame 1310 in the input video 1300 and then inputs the next frame (not shown in FIG. 13) to the dynamic object segmentation unit.

In this way, by using the skip mask parameter, it is possible to identify dynamic frames that are likely to contain dynamic content while skipping non-dynamic frames that do not contain dynamic content. Non-dynamic frames can thus be excluded from the dynamic object segmentation process while still being supplied for map generation, which significantly saves computing resources compared with performing the dynamic object segmentation process on every frame of the input video.
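The walkthrough of FIG. 13 can be reproduced with a short simulation. The following Python sketch is illustrative only: the function name, the callback `meets_criterion` (standing in for the result of the dynamic object segmentation process and the dynamic content ratio criterion), and the assumed video length of 16 frames (1301 to 1316) are not part of the disclosure. The skip values of 5 and 0 frames follow the example above.

```python
from typing import Callable, List


def select_frames_for_segmentation(
    num_frames: int,
    meets_criterion: Callable[[int], bool],
    default_skip: int = 5,
) -> List[int]:
    """Return the indices of frames passed to the dynamic object segmentation process."""
    segmented: List[int] = []
    index = 0  # the first frame is always segmented because no parameter is set yet
    while index < num_frames:
        segmented.append(index)
        # Update the skip mask parameter from the result for the frame just analyzed.
        skip = 0 if meets_criterion(index) else default_skip
        index += skip + 1
    return segmented


FIRST = 1301
dynamic_frames = {1307, 1308, 1309}  # frames containing a human in FIG. 13

picked = select_frames_for_segmentation(
    num_frames=16,  # assumed video length: frames 1301 to 1316
    meets_criterion=lambda i: FIRST + i in dynamic_frames,
)
print([FIRST + i for i in picked])
# prints [1301, 1307, 1308, 1309, 1310, 1316]
```

In this assumed 16-frame video, only 6 frames reach the dynamic object segmentation process, which illustrates where the savings in computing resources come from.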

Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications are possible without departing from the gist of the present invention.

200, 700, 1000 Location determination device
205 Image capture unit
210 Processor
215 Memory
220 Image feature extraction unit
225 Selective segmentation determination unit
230 Dynamic object segmentation unit
235 Feature classification unit
240 SLAM unit
245 Position identification unit
250 Interface unit
705 Inertial measurement unit
710 GPS unit
715 Filter unit
720 Position information correction unit
1010 Risk management unit
1020 Notification unit

Claims

1. A location determination device for determining the position of a moving object in an arbitrary environment, comprising:
an image feature extraction unit that analyzes an input video showing the environment and extracts features for each of a plurality of frames included in the input video;
a selective segmentation determination unit that predicts, from the input video, a dynamic frame including dynamic content based on any one of an amount of movement between the plurality of frames included in the input video, a movement speed of an image capture unit that captured the input video, and a ratio of pixels of an object corresponding to a dynamic class in the frame;
a dynamic object segmentation unit that performs a dynamic object segmentation process on the dynamic frame and generates semantic information of the image indicating, for each pixel in the dynamic frame, whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class;
a feature classification unit that classifies each of the features as a dynamic feature or a static feature based on the semantic information of the image and the features; and
a SLAM unit that generates, based on the static features and the semantic information of the image, a three-dimensional map that shows the environment three-dimensionally and first moving object position information that indicates a position and an orientation of the moving object.

2. The location determination device according to claim 1, wherein the selective segmentation determination unit:
estimates the movement speed of the image capture unit based on a change, across the plurality of frames, in the position of the image capture unit that acquired the input video;
sets, based on the movement speed of the image capture unit, a skip mask parameter that specifies the number of frames in the input video to be excluded from the dynamic object segmentation process; and
processes the input video based on the skip mask parameter to skip non-dynamic frames that do not include dynamic content and to predict the dynamic frames.

3. The location determination device according to claim 2, wherein the selective segmentation determination unit:
determines an amount of movement between frames by comparing the features extracted from the non-dynamic frame with the features of the frame immediately preceding the non-dynamic frame in the input video; and
determines the non-dynamic frame to be a dynamic frame if the amount of movement between the frames satisfies a predetermined movement amount criterion.

4. The location determination device according to claim 3, wherein the feature classification unit:
determines the amount of movement of the static features between the frames; and
inputs into the SLAM unit only those static features whose amount of movement between the frames does not satisfy a predetermined movement amount criterion.

5. The location determination device according to claim 4, wherein the selective segmentation determination unit sets the skip mask parameter to a lower value if, according to the semantic information of the image, the ratio of pixels of the object corresponding to the dynamic class in a target frame included in the input video satisfies a predetermined dynamic content ratio criterion.

6. The location determination device according to claim 1, further comprising:
an inertial measurement unit that acquires movement sensor data relating to the movement of the moving object;
a filter unit that performs a filtering process on the movement sensor data to generate filtered movement sensor data, and generates, based on the filtered movement sensor data, second moving object position information indicating a position and an orientation of the moving object; and
a position information correction unit that generates, based on the first moving object position information and the second moving object position information, corrected moving object position information that more accurately indicates the position and the orientation of the moving object.

7. The location determination device according to claim 6, further comprising:
a risk management unit; and
a notification unit,
wherein the risk management unit:
determines a risk that may affect the moving object based on the corrected moving object position information and risk position information indicating the position of a risk candidate; and
generates and outputs a risk notification indicating instructions for avoiding the risk.

8. The location determination device according to claim 7, further comprising:
a GPS unit that acquires GPS data indicating the position of the moving object on the globe and stores the GPS data in a map database in association with the three-dimensional map.

9. A location determination method for determining the position of a moving object in an arbitrary environment, comprising:
analyzing an input video showing the environment and extracting features for each of a plurality of frames included in the input video;
setting a skip mask parameter that specifies the number of frames to be excluded from a dynamic object segmentation process based on any one of an amount of movement between the plurality of frames included in the input video, a movement speed of an image capture unit that captured the input video, and a ratio of pixels of an object corresponding to a dynamic class in the frames;
processing the input video based on the skip mask parameter to skip non-dynamic frames that do not contain dynamic content and to predict dynamic frames that contain dynamic content;
performing the dynamic object segmentation process on the dynamic frame to generate semantic information of the image indicating, for each pixel in the dynamic frame, whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class;
classifying each of the features as a dynamic feature or a static feature based on the semantic information of the image and the features;
determining the amount of movement of the static features between the frames; and
generating, based on the semantic information of the image and on those static features whose amount of movement between the frames does not satisfy a predetermined movement amount criterion, a three-dimensional map that shows the environment three-dimensionally and first moving object position information that indicates a position and an orientation of the moving object.

10. A location determination system for determining the position of a moving object in an arbitrary environment, in which
a location determination device that is installed on the moving object and generates a three-dimensional map showing the environment, and
a map database for storing the three-dimensional map
are connected via a communication network,
wherein the location determination device comprises:
an image feature extraction unit that analyzes an input video showing the environment and extracts features for each of a plurality of frames included in the input video;
a selective segmentation determination unit that predicts, from the input video, a dynamic frame including dynamic content based on any one of an amount of movement between the plurality of frames included in the input video, a movement speed of an image capture unit that captured the input video, and a ratio of pixels of an object corresponding to a dynamic class in the frame;
a dynamic object segmentation unit that performs a dynamic object segmentation process on the dynamic frame and generates semantic information of the image indicating, for each pixel in the dynamic frame, whether the pixel belongs to an object corresponding to a dynamic class or an object corresponding to a static class;
a feature classification unit that classifies each of the features as a dynamic feature or a static feature based on the semantic information of the image and the features; and
a SLAM unit that generates, based on the static features and the semantic information of the image, the three-dimensional map that shows the environment three-dimensionally and first moving object position information that indicates a position and an orientation of the moving object, and stores the three-dimensional map in the map database via the communication network.

11. The location determination system according to claim 10, wherein the location determination device further comprises:
a GPS unit that acquires GPS data indicating the position of the moving object on the globe and stores the GPS data in the map database in association with the three-dimensional map.

12. The location determination system according to claim 11, wherein the GPS unit:
transmits a map request including GPS data indicating the position of the moving object on the globe to the map database via the communication network; and
receives from the map database a three-dimensional map corresponding to the position of the moving object on the globe.