JP7808295B2 - Method for controlling a robotic device

Info

Publication number: JP7808295B2
Application number: JP2022080087A
Authority: JP
Inventors: ロソレオネル; デイヴヴェーダント
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2021-05-17
Filing date: 2022-05-16
Publication date: 2026-01-29
Anticipated expiration: 2042-05-16
Also published as: KR20220155921A; JP2022176917A; DE102021204961B4; DE102021204961A1; CN115351780A

Description

本開示は、ロボットデバイスを制御するための方法に関する。 The present disclosure relates to a method for controlling a robotic device.

多くの用途では、ロボットが、場合によっては動的で構造化されていない環境において自律的に動作することが望まれる。このためには、ロボットは、自身の周辺環境の中でどのように動き、どのように対話するかを学習する必要がある。そうするために、ロボットは、単純な動作を実行したり、複雑なタスクを複数のスキルの組み合わせとして実行したりするために使用できるスキルのライブラリに依存する場合がある。動作スキルを学習する手法は、人間の例を介して、デモンストレーションから学ぶこと（ＬｆＤ：Learning from demonstrations）として公知である。これには、ロボットによって模倣されるべき特定の動作を１回または複数回示す専門家（典型的には人間）が必要である。 In many applications, it is desirable for robots to operate autonomously, sometimes in dynamic and unstructured environments. To do this, robots need to learn how to move and interact in their surroundings. To do so, robots may rely on a library of skills that can be used to perform simple actions or to perform complex tasks as a combination of multiple skills. A technique for learning behavioral skills is known as learning from demonstrations (LfD), which requires an expert (typically a human) to demonstrate one or more times the specific behavior to be imitated by the robot.

The publication "Using probabilistic movement primitives in robotics" by A. Paraschos et al., Autonomous Robots, 42:529-551, 2018, describes probabilistic movement primitives (ProMPs), a probabilistic framework for learning and synthesizing robot motion skills. A ProMP represents a trajectory distribution based on a compact basis-function representation. Its probabilistic formulation enables modulation of movements, co-activation of movements, and exploitation of variance information in control.

While ProMPs have been used to learn Cartesian motions, their formulation cannot handle orientation motions in the form of quaternion trajectories. Quaternions, however, have properties that are favorable for robot control: they provide a nearly minimal representation and strong stability in closed-loop orientation control. An approach that enables learning robot control from demonstrations that include quaternion trajectories is therefore desirable.

発明の開示
様々な実施形態によれば、ロボットデバイスを制御するための方法が提供され、この方法は、ロボットスキルのためのデモンストレーションを提供するステップであって、ここで、各デモンストレーションは、ロボット構成のシーケンスを含む軌道をデモンストレーションし、ここで、各ロボット構成は、リーマン多様体の構造を有する予め定められた構成空間の要素によって記述されるステップを含む。本方法はさらに、各デモンストレーションされた軌道について、重みベクトルに従った基本運動の組み合わせと、デモンストレーションされた軌道との間の距離測定値を最小化する重みベクトルを検索することによって、ロボットデバイスの予め定められた基本運動の重みベクトルとしての軌道の表現を決定するステップであって、ここで、組み合わせは、多様体に写像されるステップを含む。本方法はさらに、デモンストレーションされた軌道について決定された重みベクトルに確率分布を適合させることによって重みベクトルの確率分布を決定するステップと、重みベクトルの決定された確率分布に従って基本運動を実行することによってロボットデバイスを制御するステップとを含む。 DISCLOSURE OF THE INVENTION According to various embodiments, a method for controlling a robotic device is provided, the method including providing demonstrations for a robotic skill, where each demonstration demonstrates a trajectory including a sequence of robot configurations, where each robot configuration is described by an element of a predetermined configuration space having the structure of a Riemannian manifold. The method further includes, for each demonstrated trajectory, determining a representation of the trajectory as a weight vector of predetermined primitive motions of the robotic device by searching for a weight vector that minimizes a distance measure between the demonstrated trajectory and a combination of primitive motions according to the weight vector, where the combination is mapped to the manifold. The method further includes determining a probability distribution of the weight vector by fitting a probability distribution to the weight vector determined for the demonstrated trajectory, and controlling the robotic device by executing the primitive motions according to the determined probability distribution of the weight vector.

According to various embodiments, the method described above provides robot control using a Riemannian manifold approach for encoding, reproducing, and adapting probabilistic movement primitives (using multivariate geodesic regression, as described in detail below). In particular, according to various embodiments, the space of quaternion trajectories is treated as a Riemannian manifold. Compared with geometry-unaware approaches (such as classical ProMP), this approach lets a robot learn and reproduce skills with less risk of encoding inaccurate data or reproducing distorted trajectories. Because it does not rely on coarse approximations, the model is also easier to interpret. The approach additionally provides adaptation capabilities such as modulation of trajectory distributions and blending of movement primitives.

According to various embodiments, each demonstrated trajectory is represented by a weight vector obtained by geodesic regression. This means that a geodesic can be regarded as being fitted to each demonstrated trajectory.

以下では様々な例が与えられる。 Various examples are given below.

実施例１は、上述したようなロボットデバイスを制御するための方法である。 Example 1 is a method for controlling a robot device as described above.

実施例２は、実施例１による方法であり、ここで、重みベクトルの確率分布は、デモンストレーションされた軌道について決定された重みベクトルにガウス分布を適合させることによって決定される。 Example 2 is a method according to Example 1, where the probability distribution of the weight vector is determined by fitting a Gaussian distribution to the weight vector determined for the demonstrated trajectory.

Using a Gaussian distribution for training and reproduction provides reliable control in control scenarios that were not seen in the demonstrations.

Example 3 is the method according to Example 1 or 2, wherein each demonstrated trajectory includes a robot configuration for each time point of a predetermined sequence of time points, and each combination of primitive motions according to a weight vector specifies a robot configuration for each time point of the predetermined sequence of time points;
for each demonstrated trajectory, the weight vector is determined by determining, from a set of possible weight vectors, the weight vector for which the distance between the combination of primitive motions according to the weight vector, mapped to the manifold, and the demonstrated trajectory is minimal among the set of possible weight vectors; and
the distance between the combination of primitive motions mapped to the manifold and the demonstrated trajectory is given by a sum of terms over the time points of the sequence of time points, including, for each time point, a term containing the value, or a power of the value, of the manifold metric between the element of the manifold given by the combination of primitive motions at that time point when mapped to the manifold and the demonstrated trajectory at that time point.

This provides an efficient way of representing a demonstrated trajectory by a weight vector, namely by fitting the weight vector to the demonstrated trajectory. The combination may be mapped to the manifold by selecting a point on the manifold and mapping the combination to the manifold with the exponential map of the tangent space of the manifold at the selected point.

実施例４は、実施例１～３までのいずれか１つによる方法であり、この方法は、デモンストレーションされた軌道の１つについて、重みベクトルに従った基本運動の組み合わせと、デモンストレーションされた軌道との間の距離測定値が最小化されるような、多様体の点および重みベクトルを検索するステップを含み、ここで、組み合わせは、点における接空間から前記多様体に写像され、ここで、各デモンストレーションされた軌道について、多様体への各組み合わせの写像は、選択された点における接空間から前記組み合わせを写像することによって実行される。 Example 4 is a method according to any one of Examples 1 to 3, the method including: searching for a point and a weight vector on a manifold such that, for one of the demonstrated trajectories, a distance measure between a combination of primitive movements according to a weight vector and the demonstrated trajectory is minimized, where the combination is mapped onto the manifold from a tangent space at a point, and where, for each demonstrated trajectory, the mapping of each combination onto the manifold is performed by mapping the combination from a tangent space at a selected point.

In other words, the tangent space (i.e., the point of the manifold at which the tangent space is taken) is determined by performing, for one demonstrated trajectory, an optimization over both the weights and the point. This tangent space is then used, for all demonstrated trajectories, to map to the manifold the combination, or any combination that needs to be mapped during the search. That is, the same tangent space, and hence the same exponential map, is used for all demonstrated trajectories. This is an effective way of overcoming the problem that using different tangent spaces for different trajectories would make the tangent weight vectors vary widely.

実施例５は、実施例１～４までのいずれか１つによる方法であり、ここで、軌道は、配向軌道であり、各デモンストレーションは、位置軌道をさらにデモンストレーションし、各ロボット構成は、三次元空間におけるベクトルによって記述される姿勢と、予め定められた構成空間の要素によって記述される向きとを含む。 Example 5 is the method according to any one of Examples 1 to 4, wherein the trajectories are orientation trajectories, each demonstration further demonstrates a position trajectory, and each robot configuration includes a pose described by a vector in three-dimensional space and an orientation described by elements of a predetermined configuration space.

したがって、スキルは、ロボットの姿勢のシーケンス、例えばエンドエフェクタの位置および向きをデモンストレーションすることによって学習されてもよく、ここで、向きのためのモデルは、リーマン多様体に基づくアプローチを使用して学習される。 Thus, skills may be learned by demonstrating sequences of robot poses, e.g., end-effector positions and orientations, where a model for orientation is learned using a Riemannian manifold-based approach.

Example 6 is the method according to any one of Examples 1 to 5, the method including: providing demonstrations for a plurality of robot skills; determining, for each skill, the representations of the trajectories as weight vectors and a probability distribution of the weight vectors; determining, for each skill, a Riemannian Gaussian distribution of manifold points (per time point) from the probability distribution of the weight vectors; determining the product distribution of the Riemannian Gaussian distributions of the skills; and controlling the robotic device by sampling (per time point) from the determined product distribution.

これにより、リーマン多様体上のデモンストレーションから学んだスキルのためのスキルのブレンディングが可能になる。 This allows for skill blending for skills learned from demonstrations on Riemannian manifolds.

Example 7 is a robot device controller configured to carry out the method according to any one of Examples 1 to 6.

実施例８は、命令がプロセッサによって実行されるときに、該プロセッサに実施例１から６までのいずれか１つによる方法を実行させる命令を含んでいるコンピュータプログラムである。 Example 8 is a computer program comprising instructions that, when executed by a processor, cause the processor to perform a method according to any one of Examples 1 to 6.

実施例９は、命令がプロセッサによって実行されるときに、該プロセッサに実施例１から６までのいずれか１つによる方法を実行させる命令が格納されているコンピュータ可読媒体である。 Example 9 is a computer-readable medium having stored thereon instructions that, when executed by a processor, cause the processor to perform a method according to any one of Examples 1 to 6.

図面において、同様の参照符号は、一般に、異なる図面を通して同じ部品を指している。これらの図面は必ずしも縮尺通りではなく、代わりに本発明の原理を一般的に説明することに重点が置かれている。以下の明細書では、以下の図面を参照しながら様々な態様が説明される。 In the drawings, like reference numbers generally refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon generally illustrating the principles of the invention. In the following specification, various aspects are described with reference to the following drawings:

FIG. 1 shows a robot.
FIG. 2 shows a sphere manifold $S^2$ whose points may each represent, for example, a possible orientation of a robot end effector.
FIG. 3 illustrates multivariate general linear regression on the sphere manifold $S^2$ according to one embodiment.
FIG. 4 shows, for illustration, an example in which an embodiment is applied to letters on a sphere.
FIG. 5 shows, for illustration, a blending process according to one embodiment for letters on a sphere.
FIG. 6 is a flowchart illustrating a method for controlling a robotic device.

以下の詳細な説明は、本発明が実施され得る本開示の特定の詳細および態様を例示として示す添付の図面を参照している。また、本発明の保護範囲から逸脱することなく、他の態様を使用したり、構造的、論理的、および電気的な変更を行ったりしてもよい。本開示のいくつかの態様は、新たな態様を形成するために本開示の１つ以上の他の態様と組み合わせることができるので、本開示の様々な態様は、必ずしも相互に排他的であるとは限らない。 The following detailed description refers to the accompanying drawings, which show, by way of example, specific details and aspects of the present disclosure in which the present invention may be practiced. Furthermore, other aspects may be used and structural, logical, and electrical changes may be made without departing from the scope of protection of the present invention. Various aspects of the present disclosure are not necessarily mutually exclusive, as some aspects of the present disclosure can be combined with one or more other aspects of the present disclosure to form new aspects.

以下では、様々な例をより詳細に説明する。 Various examples are explained in more detail below.

図１は、ロボット１００を示す。 Figure 1 shows the robot 100.

このロボット１００は、作業部品（または１つ以上の他の対象物）を操作したり、組み立てたりするためのロボットアーム１０１、例えば産業用ロボットアームを含む。このロボットアーム１０１は、マニピュレータ１０２，１０３，１０４と、これらのマニピュレータ１０２，１０３，１０４が支持されている基台（または支持台）１０５とを含む。「マニピュレータ」という用語は、ロボットアーム１０１の可動部材を指し、それらの操作が、例えば作業を実行するために環境との物理的な相互作用を可能にしている。制御のために、ロボット１００は、制御プログラムに従って環境との相互作用を実施するように構成された（ロボット）コントローラ１０６を含む。マニピュレータ１０２，１０３，１０４の（支持台１０５から最も離れた）最後の部材１０４は、エンドエフェクタ１０４とも称され、１つ以上のツール、例えば溶接トーチ、把持器具、塗装設備などを含むことができる。 The robot 100 includes a robotic arm 101, e.g., an industrial robotic arm, for manipulating or assembling work pieces (or one or more other objects). The robotic arm 101 includes manipulators 102, 103, and 104 and a base (or support) 105 on which the manipulators 102, 103, and 104 are supported. The term "manipulator" refers to the movable members of the robotic arm 101, the manipulation of which allows for physical interaction with the environment, e.g., to perform a task. For control, the robot 100 includes a (robot) controller 106 configured to implement the interaction with the environment according to a control program. The final member 104 of the manipulators 102, 103, and 104 (furthest from the support 105), also referred to as the end effector 104, may include one or more tools, e.g., a welding torch, a gripping tool, painting equipment, etc.

The other manipulators 102, 103 (closer to the support 105) may form a positioning device so that, together with the end effector 104, a robot arm 101 with the end effector 104 at its end is provided. The robot arm 101 is a mechanical arm that can provide functions similar to those of a human arm (possibly with a tool at its end).

ロボットアーム１０１は、マニピュレータ１０２，１０３，１０４を互いに相互接続し、さらに支持台１０５にも相互接続する関節要素１０７，１０８，１０９を含むことができる。関節要素１０７，１０８，１０９は、１つ以上の関節を含むことができ、それらの各々は、互いに関連するマニピュレータに対して回転可能な動作（すなわち回転動作）および／または並進動作（すなわち変位）を提供することができる。マニピュレータ１０２，１０３，１０４の運動は、コントローラ１０６によって制御されるアクチュエータを用いて開始することができる。 The robotic arm 101 may include joint elements 107, 108, and 109 that interconnect the manipulators 102, 103, and 104 with each other and with the support base 105. The joint elements 107, 108, and 109 may include one or more joints, each of which may provide rotatable (i.e., rotational) and/or translational (i.e., displacement) motion for the associated manipulators. Movement of the manipulators 102, 103, and 104 may be initiated using actuators controlled by the controller 106.

「アクチュエータ」という用語は、駆動されることに応じて機構やプロセスに影響を与えるように適合された構成部品として理解されてもよい。アクチュエータは、コントローラ１０６によって出力された命令（いわゆる起動）を、機械的な運動として実行することができる。アクチュエータ、例えば電気機械変換器は、駆動に応じて電気エネルギを機械エネルギに変換するように構成されていてもよい。 The term "actuator" may be understood as a component adapted to affect a mechanism or process in response to being actuated. The actuator may execute commands (so-called actuations) output by the controller 106 as mechanical movements. An actuator, for example an electromechanical converter, may be configured to convert electrical energy into mechanical energy in response to being actuated.

「コントローラ」という用語は、任意のタイプの論理実装された項目として理解されてもよく、これは、例えば、記録媒体に格納されたソフトウェア、ファームウェア、またはそれらの組み合わせを実行することができ、例えば、本例のアクチュエータに命令を出力することができる回路および／またはプロセッサを含むことができる。コントローラは、例えば、システム、本例ではロボットの運用を制御するためにプログラムコード（例えばソフトウェア）によって構成されていてもよい。 The term "controller" may be understood as any type of logic-implemented item, which may include, for example, circuitry and/or a processor capable of executing software, firmware, or a combination thereof stored on a recording medium, and capable of outputting instructions to, for example, an actuator in this example. The controller may be configured, for example, by program code (e.g., software) to control the operation of a system, in this example, a robot.

本例では、コントローラ１０６は、１つ以上のプロセッサ１１０と、コードおよびデータを格納したメモリ１１１とを含み、これらのコードおよびデータに基づいて、プロセッサ１１０はロボットアーム１０１を制御する。様々な実施形態によれば、コントローラ１０６は、メモリ１１１に格納された機械学習モデル１１２に基づいて、ロボットアーム１０１を制御する。 In this example, the controller 106 includes one or more processors 110 and a memory 111 that stores code and data based on which the processor 110 controls the robotic arm 101. According to various embodiments, the controller 106 controls the robotic arm 101 based on a machine learning model 112 stored in the memory 111.

According to various embodiments, a Riemannian manifold approach is used to learn orientation movement primitives with ProMP. That is, an extension of classical ProMP, referred to as "orientation ProMP", is provided using a Riemannian manifold formulation.

The original (i.e., classical) probabilistic movement primitives (ProMP) approach handles robot skills in Euclidean space and therefore cannot learn and reproduce quaternion trajectories (which represent robot orientations).

以下に説明するＰｒｏＭＰのリーマン定式化は、四元数データの学習および再現を可能にさせる。その上さらに、本明細書で与えられる一般的な処理のため、一般的なリーマン多様体に対する使用が可能になる。 The Riemannian formulation of ProMP described below allows for the learning and reproduction of quaternion data. Furthermore, the general treatment given here allows for its use on general Riemannian manifolds.

In the following, an introduction to ProMP for handling robot skills in Euclidean space is given.

In the following, the notation below is used.

In general, for a single movement execution, a given trajectory $\tau = \{y_t\}_{t=1}^{T}$ is denoted as a time series of the variable $y$. Here $y_t$, also referred to as the robot configuration at time $t$, may represent either the joint angles or the Cartesian position in task space at time step $t$ (time derivatives of $y$ may additionally be considered). Following the classical ProMP notation, $y_t$ is a $d$-dimensional vector representing the measurements of a system with $d$ degrees of freedom (DoF), e.g., the robot arm 101 with 7 DoF. Each point of the trajectory $\tau$ can be represented as a linear basis function model

$$y_t = \Psi_t w + \epsilon_y \;\Rightarrow\; P(y_t \mid w) = \mathcal{N}(y_t \mid \Psi_t w, \Sigma_y) \qquad (1)$$

where $w$ is a $dN_\phi$-dimensional weight vector, $\Psi_t$ is a $d \times dN_\phi$ block-diagonal matrix containing the time-dependent basis functions $\phi_t$ for each DoF (the basis functions for one DoF are also referred to as primitive motions, e.g., a motion in a given direction or a rotation about a given axis), $N_\phi$ denotes the number of basis functions, and $\epsilon_y \sim \mathcal{N}(0, \Sigma_y)$ is zero-mean i.i.d. Gaussian noise with uncertainty $\Sigma_y$.

ProMP assumes that each demonstration is characterized by a different value of the weight vector $w$, which gives rise to the distribution $P(w;\theta) = \mathcal{N}(w \mid \mu_w, \Sigma_w)$. A complete trajectory can then be modeled as a composition of the basis functions at each $t$ with weights $w$ drawn from $P(w;\theta)$. The distribution $P(y_t;\theta)$ of the state at time $t$ is therefore

$$P(y_t;\theta) = \int \mathcal{N}(y_t \mid \Psi_t w, \Sigma_y)\,\mathcal{N}(w \mid \mu_w, \Sigma_w)\,dw = \mathcal{N}(y_t \mid \Psi_t \mu_w,\; \Psi_t \Sigma_w \Psi_t^{\top} + \Sigma_y) \qquad (2)$$

from which both the mean and the variance at each time step $t$ are obtained.
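As a minimal sketch of the marginalization described above, the following NumPy snippet evaluates the mean $\Psi_t \mu_w$ and the covariance $\Psi_t \Sigma_w \Psi_t^{\top} + \Sigma_y$ of the state distribution at one time step, which is the standard Gaussian marginal of the linear model (1). The function name `promp_marginal` and all numerical values are illustrative assumptions and are not taken from the disclosure.

```python
import numpy as np

def promp_marginal(Psi_t, mu_w, Sigma_w, Sigma_y):
    """Mean and covariance of y_t after integrating out the weights w."""
    mean = Psi_t @ mu_w
    cov = Psi_t @ Sigma_w @ Psi_t.T + Sigma_y
    return mean, cov

# Toy setting (assumed values): d = 2 DoF, N_phi = 3 basis functions per DoF.
phi_t = np.array([0.2, 0.5, 0.3])            # basis activations at one time step
Psi_t = np.kron(np.eye(2), phi_t[None, :])   # d x (d * N_phi) block-diagonal matrix
mu_w = np.arange(6, dtype=float)             # mean weight vector
Sigma_w = 0.1 * np.eye(6)                    # weight covariance
Sigma_y = 0.01 * np.eye(2)                   # observation noise

mean, cov = promp_marginal(Psi_t, mu_w, Sigma_w, Sigma_y)
```

The block-diagonal structure means each DoF reuses the same basis activations, so the per-step mean of each DoF is simply the activation-weighted sum of that DoF's weights.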

When learning from demonstrations, the example trajectories often differ in duration. ProMP overcomes this by introducing a phase variable $z$ that decouples the data from the time instances, which in turn allows temporal modulation. In this case the demonstration ranges from $z_0 = 0$ to $z_T = 1$, and the demonstrated trajectory is redefined as $\tau = \{y_{z_t}\}_{t=0}^{T}$. The basis functions forming $\Psi$ then also depend on the phase variable $z$. Specifically, for stroke-based movements ProMP uses Gaussian basis functions with width $h$ and centers $c_i$, defined as $b_i(z_t) = \exp\!\left(-\frac{(z_t - c_i)^2}{2h}\right)$; the width and centers are often chosen empirically. These Gaussian basis functions are then normalized, giving

$$\phi_i(z_t) = \frac{b_i(z_t)}{\sum_{j=1}^{N_\phi} b_j(z_t)}.$$
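The phase-dependent basis functions can be sketched as follows. This is a minimal illustration that assumes evenly spaced centers $c_i$ on $[0, 1]$ and a hand-picked width $h$; the helper names `normalized_gaussian_basis` and `block_basis_matrix` are hypothetical.

```python
import numpy as np

def normalized_gaussian_basis(z, n_basis, h):
    """Return a (len(z), n_basis) matrix of normalized Gaussian basis values."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    centers = np.linspace(0.0, 1.0, n_basis)   # assumption: evenly spaced centers c_i
    b = np.exp(-((z[:, None] - centers[None, :]) ** 2) / (2.0 * h))
    return b / b.sum(axis=1, keepdims=True)    # each row sums to one

def block_basis_matrix(phi_t, d):
    """Psi_t: d x (d * N_phi) block-diagonal matrix with phi_t repeated per DoF."""
    return np.kron(np.eye(d), phi_t[None, :])

z = np.linspace(0.0, 1.0, 50)                  # phase from z_0 = 0 to z_T = 1
Phi = normalized_gaussian_basis(z, n_basis=5, h=0.02)
Psi_0 = block_basis_matrix(Phi[0], d=3)        # basis matrix at the first phase value
```

Because the basis is evaluated on the phase rather than on wall-clock time, demonstrations of different durations map onto the same matrix once each is resampled to a common phase grid.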

Generally speaking, the learning process of ProMP mainly consists of estimating the weight distribution $P(w;\theta)$. To this end, the weight vector $w_i$ representing the $i$-th demonstration as in (1) is estimated by maximum likelihood estimation. This leads to a linear ridge regression solution of the form

$$w_i = (\Psi^{\top}\Psi + \lambda I)^{-1}\Psi^{\top} Y_i \qquad (3)$$

where $Y_i = [y_{i,1}^{\top} \ldots y_{i,T}^{\top}]^{\top}$ concatenates all observed trajectory points and $\Psi$ is composed of the basis function matrices $\Psi_t$ for all time instances. Given a set of $N$ demonstrations, the weight distribution parameters $\theta = \{\mu_w, \Sigma_w\}$ can then be estimated by maximum likelihood. To adapt to new situations, ProMP allows trajectory modulation to via-points or target positions by conditioning the movement to reach a desired trajectory point $y_t^{*}$ with associated covariance $\Sigma_y^{*}$. This results in the conditional probability $P(w \mid y_t^{*}) \propto \mathcal{N}(y_t^{*} \mid \Psi_t w, \Sigma_y^{*})\,P(w)$, whose parameters can be computed as follows (assuming Gaussian distributions):

$$\mu_w^{*} = \mu_w + K\,(y_t^{*} - \Psi_t \mu_w), \qquad \Sigma_w^{*} = \Sigma_w - K\,\Psi_t \Sigma_w, \qquad K = \Sigma_w \Psi_t^{\top}\left(\Sigma_y^{*} + \Psi_t \Sigma_w \Psi_t^{\top}\right)^{-1}$$
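The learning and conditioning steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions: a single DoF, synthetic sine-shaped demonstrations, and the gain form of Gaussian conditioning (which is algebraically equivalent to the information form). The function names, the regularizer value, and the toy data are hypothetical.

```python
import numpy as np

def fit_weights(Psi, Y, lam=1e-6):
    """Ridge-regression solution w = (Psi^T Psi + lam I)^-1 Psi^T Y, as in (3)."""
    A = Psi.T @ Psi + lam * np.eye(Psi.shape[1])
    return np.linalg.solve(A, Psi.T @ Y)

def fit_weight_distribution(W):
    """Maximum-likelihood Gaussian over the rows of W (one weight vector per demo)."""
    mu_w = W.mean(axis=0)
    diff = W - mu_w
    return mu_w, diff.T @ diff / W.shape[0]

def condition_on_via_point(mu_w, Sigma_w, Psi_t, y_star, Sigma_y_star):
    """Condition N(mu_w, Sigma_w) on reaching y_star at the time step of Psi_t."""
    S = Sigma_y_star + Psi_t @ Sigma_w @ Psi_t.T
    K = Sigma_w @ Psi_t.T @ np.linalg.inv(S)
    mu_new = mu_w + K @ (y_star - Psi_t @ mu_w)
    Sigma_new = Sigma_w - K @ Psi_t @ Sigma_w
    return mu_new, Sigma_new

# Synthetic single-DoF demonstrations (assumed data): scaled sine waves.
rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 40)
centers = np.linspace(0.0, 1.0, 8)
B = np.exp(-((z[:, None] - centers[None, :]) ** 2) / (2.0 * 0.01))
Psi = B / B.sum(axis=1, keepdims=True)            # (T, N_phi) basis matrix
demos = [a * np.sin(2 * np.pi * z) for a in rng.uniform(0.8, 1.2, size=10)]

W = np.stack([fit_weights(Psi, Y) for Y in demos])
mu_w, Sigma_w = fit_weight_distribution(W)

Psi_t = Psi[10:11]                                # basis row at one time step
y_star = np.array([1.5])                          # desired via-point
mu_c, Sigma_c = condition_on_via_point(mu_w, Sigma_w, Psi_t, y_star, 1e-8 * np.eye(1))
```

With a very small via-point covariance, the conditioned mean trajectory passes almost exactly through `y_star` at the chosen time step, while the remaining variance is reduced along the directions the via-point constrains.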

By computing the product of trajectory distributions, different movement primitives can be blended into a single movement. Specifically, for a set of $S$ different ProMPs $P_s(y_t) = \mathcal{N}(y_t \mid \mu_{t,s}, \Sigma_{t,s})$ whose influence on the final movement varies according to blending weights $\alpha_{t,s}$, the blended trajectory at each time step $t$ follows the distribution

$$P(\hat{y}_t) \propto \prod_{s=1}^{S} P_s(y_t)^{\alpha_{t,s}}.$$

The parameters of $P(\hat{y}_t) = \mathcal{N}(\hat{y}_t \mid \hat{\mu}_t, \hat{\Sigma}_t)$ are then easily estimated from the weighted product of Gaussians:

$$\hat{\Sigma}_t = \left(\sum_{s=1}^{S} \alpha_{t,s}\,\Sigma_{t,s}^{-1}\right)^{-1}, \qquad \hat{\mu}_t = \hat{\Sigma}_t \sum_{s=1}^{S} \alpha_{t,s}\,\Sigma_{t,s}^{-1}\mu_{t,s}.$$
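A weighted product of Gaussians at a single time step can be sketched as below. The precision of the blend is the weighted sum of the individual precisions, and the mean is the corresponding precision-weighted average. The function name `blend_gaussians` and the toy values are assumptions made for illustration.

```python
import numpy as np

def blend_gaussians(mus, Sigmas, alphas):
    """Parameters of the normalized product of N(mu_s, Sigma_s)^alpha_s."""
    precisions = [a * np.linalg.inv(S) for a, S in zip(alphas, Sigmas)]
    Sigma = np.linalg.inv(sum(precisions))
    mu = Sigma @ sum(P @ m for P, m in zip(precisions, mus))
    return mu, Sigma

# Two toy primitives at one time step (assumed values).
mus = [np.zeros(2), np.array([2.0, 2.0])]
Sigmas = [np.eye(2), np.eye(2)]

mu_half, Sigma_half = blend_gaussians(mus, Sigmas, alphas=[0.5, 0.5])
mu_first, Sigma_first = blend_gaussians(mus, Sigmas, alphas=[1.0, 0.0])
```

Varying the blending weights over time, for example from (1, 0) to (0, 1), moves the blended movement smoothly from the first primitive to the second.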

Task parameters allow, for example, the robot movement to be adapted to a target object in order to accomplish a task. Such information is often available during demonstrations and can be integrated into the ProMP formulation. Formally, ProMP considers an external state and learns an affine mapping from this external state to the mean weight vector $\mu_w$, which results in a joint probability distribution over the weights and the external state; the parameters $\{O, o\}$ of the affine mapping are learned using linear ridge regression.

上述したように、四元数は、ロボット制御に適した特性を備えている。ただし、（ロボット制御に使用される）四元数は、単位ノルム制約を満たすため、ベクトル空間を形成せず、したがって、（単位ノルムを伴う）四元値を有する変数を処理し、分析するための従来のユークリッド空間法の使用は不十分である。 As mentioned above, quaternions have properties that make them suitable for robotic control. However, because quaternions (as used in robotic control) satisfy unit norm constraints, they do not form vector spaces, and therefore the use of traditional Euclidean space methods for processing and analyzing quaternion-valued variables (with unit norm) is insufficient.

様々な実施形態によれば、リーマン幾何学は、四元数空間上でＰｒｏＭＰを定式化するために活用される。 According to various embodiments, Riemannian geometry is leveraged to formulate ProMP on quaternion space.

A Riemannian manifold $\mathcal{M}$ is an $m$-dimensional topological space in which each point locally resembles the Euclidean space $\mathbb{R}^m$ and which has a globally defined differential structure. For each point $x \in \mathcal{M}$ there exists a tangent space $T_x\mathcal{M}$, a vector space consisting of the tangent vectors of all possible smooth curves passing through $x$. A Riemannian manifold is equipped with a smoothly varying positive-definite inner product, called a Riemannian metric, which makes it possible to define the length of curves in $\mathcal{M}$. The minimum-length curves between two points of $\mathcal{M}$ are called geodesics; they generalize the straight lines of Euclidean space to Riemannian manifolds.

FIG. 2 shows an illustration of a sphere manifold $S^2$ whose points may each represent, for example, a possible orientation of a robot end effector.

２つの点ｘおよびｙは、ロボットエンドエフェクタ１０４の２つの異なる方向を表すためにコントローラ１０６によって使用されてもよい球面上に示されている。 Two points x and y are shown on a sphere that may be used by the controller 106 to represent two different orientations of the robot end effector 104.

周囲空間における２点間の最短距離は直線２０１となるが、多様体上の最短経路は測地線２０２である。 The shortest distance between two points in ambient space is a straight line 201, but the shortest path on the manifold is a geodesic line 202.

To exploit the Euclidean tangent spaces, mappings back and forth between the tangent space $T_x\mathcal{M}$ and the manifold $\mathcal{M}$ may be used; these are denoted as the exponential map and the logarithmic map, respectively.

The exponential map $\mathrm{Exp}_x : T_x\mathcal{M} \to \mathcal{M}$ maps a point $u$ in the tangent space of $x$ to a point $y$ on the manifold such that $y$ lies on the geodesic that starts at $x$ in the direction of $u$, and such that the geodesic distance $d_{\mathcal{M}}$ between $x$ and $y$ equals the norm of $u$ (the distance between $x$ and $u$ in the tangent space). The inverse operation is called the logarithmic map $\mathrm{Log}_x : \mathcal{M} \to T_x\mathcal{M}$, i.e., $u = \mathrm{Log}_x(y)$.

Another useful operation on manifolds is parallel transport $\Gamma_{x \to y} : T_x\mathcal{M} \to T_y\mathcal{M}$, which moves elements between tangent spaces such that the inner product between two elements of a tangent space is preserved.

For example, FIG. 2 shows two vectors and their parallel-transported counterparts, moved from one tangent space to another (for simplicity, the index is omitted).

In the following, the Riemannian Gaussian distribution of a random variable $p \in \mathcal{M}$ is introduced as

$$\mathcal{N}_{\mathcal{M}}(p \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} \exp\!\left(-\tfrac{1}{2}\,\mathrm{Log}_\mu(p)^{\top}\Sigma^{-1}\mathrm{Log}_\mu(p)\right)$$

with mean $\mu \in \mathcal{M}$ and covariance $\Sigma \in T_\mu\mathcal{M}$. This Riemannian Gaussian corresponds to an approximate maximum-entropy distribution for Riemannian manifolds.

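The following are the expressions for the Riemannian distance, the exponential map, the logarithmic map, and the parallel transport operation for the sphere manifold $S^m$:

$$d_{\mathcal{M}}(x, y) = \arccos(x^{\top} y)$$
$$\mathrm{Exp}_x(u) = x \cos(\|u\|) + \bar{u} \sin(\|u\|), \qquad \bar{u} = \frac{u}{\|u\|}$$
$$\mathrm{Log}_x(y) = d_{\mathcal{M}}(x, y)\,\frac{y - (x^{\top} y)\,x}{\|y - (x^{\top} y)\,x\|}$$
$$\Gamma_{x \to y}(v) = \left(-x \sin(\|u\|)\,\bar{u}^{\top} + \bar{u} \cos(\|u\|)\,\bar{u}^{\top} + (I - \bar{u}\bar{u}^{\top})\right) v, \qquad u = \mathrm{Log}_x(y)$$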
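The sphere operations named above can be sketched in NumPy using the standard closed forms for the unit sphere embedded in Euclidean space. The function names are hypothetical, and the small-norm thresholds are an implementation choice to avoid division by zero.

```python
import numpy as np

def sphere_dist(x, y):
    """Geodesic distance arccos(x^T y) between unit vectors x and y."""
    return np.arccos(np.clip(x @ y, -1.0, 1.0))

def sphere_exp(x, u):
    """Exponential map: move from x along the tangent vector u."""
    n = np.linalg.norm(u)
    if n < 1e-12:
        return x.copy()
    return np.cos(n) * x + np.sin(n) * u / n

def sphere_log(x, y):
    """Logarithmic map: tangent vector at x pointing toward y."""
    v = y - (x @ y) * x                      # component of y orthogonal to x
    n = np.linalg.norm(v)
    if n < 1e-12:
        return np.zeros_like(x)
    return sphere_dist(x, y) * v / n

def sphere_transport(x, y, v):
    """Parallel transport of v from the tangent space at x to that at y."""
    u = sphere_log(x, y)
    n = np.linalg.norm(u)
    if n < 1e-12:
        return v.copy()
    e = u / n
    return v + (e @ v) * ((np.cos(n) - 1.0) * e - np.sin(n) * x)

x = np.array([0.0, 0.0, 1.0])
y = np.array([1.0, 0.0, 0.0])
u = sphere_log(x, y)                         # tangent vector from x toward y
y_back = sphere_exp(x, u)                    # should recover y
v1, v2 = np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])
t1, t2 = sphere_transport(x, y, v1), sphere_transport(x, y, v2)
```

In this example the transported vectors lie in the tangent space at `y` and keep the same inner product as `v1` and `v2`, which is the property the text attributes to parallel transport.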

According to various embodiments, geodesic regression, which generalizes linear regression to the Riemannian manifold setting, is used (e.g., by the controller 106). The geodesic regression model is defined as

$$y = \mathrm{Exp}_{\hat{y}}(\epsilon), \qquad \hat{y} = \mathrm{Exp}_p(x u)$$

where $y \in \mathcal{M}$ and $x \in \mathbb{R}$ are the output and input variables, respectively, $p \in \mathcal{M}$ is a base point on the manifold, $u \in T_p\mathcal{M}$ is a vector in the tangent space at $p$, and the error term $\epsilon$ is a random variable taking values in the tangent space at $\mathrm{Exp}_p(xu)$. As in linear regression, $(p, u)$ can be interpreted as an intercept $p$ and a slope $u$.

Consider now a set of points $\{y_1, \ldots, y_T\} \in \mathcal{M}$ and $\{x_1, \ldots, x_T\} \in \mathbb{R}$. The goal of geodesic regression is to find a geodesic curve $\gamma \in \mathcal{M}$ that best models the relationship between all $T$ pairs $(x_i, y_i)$. To this end, the sum of squared Riemannian distances (i.e., errors) between the model estimates and the observations is minimized, i.e.,

$$E(p, u) = \frac{1}{2}\sum_{i=1}^{T} d_{\mathcal{M}}(\hat{y}_i, y_i)^2$$

where $\hat{y}_i = \mathrm{Exp}_p(x_i u)$ is the model estimate on the manifold $\mathcal{M}$, $d_{\mathcal{M}}(\hat{y}_i, y_i) = \|\mathrm{Log}_{\hat{y}_i}(y_i)\|$ is the Riemannian error, and the pair $(p, u) \in T\mathcal{M}$ is an element of the tangent bundle $T\mathcal{M}$. The least-squares estimator of the geodesic model can be formulated as the minimizer of this sum of squared Riemannian distances, i.e.,

$$(\hat{p}, \hat{u}) = \arg\min_{(p, u) \in T\mathcal{M}} \frac{1}{2}\sum_{i=1}^{T} d_{\mathcal{M}}(\hat{y}_i, y_i)^2 \qquad (9)$$

However, equation (9) does not admit an analytical solution like equation (3). A solution can be obtained by steepest descent, which requires computing the derivative of the Riemannian distance function and the derivative of the exponential map. The latter splits into derivatives with respect to the initial point $p$ and the initial velocity $u$. These gradients can be computed in terms of Jacobi fields (i.e., solutions of a second-order equation subject to certain initial conditions under the Riemannian curvature tensor).
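The descent procedure can be illustrated on the sphere $\mathcal{S}^2$. The sketch below is an assumption-laden toy, not the patented implementation: instead of exact Jacobi fields it moves each error term to $T_p\mathcal{M}$ by parallel transport and uses the result as an approximate descent direction. The step size, the iteration count, the averaging over the data points, and all function names are illustrative choices.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exp_map(x, u):
    n = norm(u)
    if n < 1e-12:
        return list(x)
    y = [xi * math.cos(n) + ui / n * math.sin(n) for xi, ui in zip(x, u)]
    s = norm(y)
    return [yi / s for yi in y]  # renormalize to limit numerical drift

def log_map(x, y):
    c = max(-1.0, min(1.0, dot(x, y)))
    v = [yi - c * xi for xi, yi in zip(x, y)]
    n = norm(v)
    if n < 1e-12:
        return [0.0] * len(x)
    return [math.acos(c) * vi / n for vi in v]

def transport(x, y, v):
    u = log_map(x, y)
    n = norm(u)
    if n < 1e-12:
        return list(v)
    e = [ui / n for ui in u]
    a = dot(e, v)
    return [vi + a * ((math.cos(n) - 1.0) * ei - math.sin(n) * xi)
            for vi, ei, xi in zip(v, e, x)]

def predict(p, u, x):
    """Model estimate Exp_p(x u)."""
    return exp_map(p, [x * ui for ui in u])

def sq_error(p, u, xs, ys):
    """Half the sum of squared Riemannian distances between estimates and data."""
    return 0.5 * sum(norm(log_map(predict(p, u, x), y)) ** 2
                     for x, y in zip(xs, ys))

def fit_geodesic(xs, ys, p, u, step=0.5, iters=300):
    n_pts = len(xs)
    for _ in range(iters):
        d_p = [0.0] * len(p)  # averaged descent direction for the intercept p
        d_u = [0.0] * len(p)  # averaged descent direction for the slope u
        for x, y in zip(xs, ys):
            y_hat = predict(p, u, x)
            e = transport(y_hat, p, log_map(y_hat, y))  # error moved to T_p
            d_p = [g + ei / n_pts for g, ei in zip(d_p, e)]
            d_u = [g + x * ei / n_pts for g, ei in zip(d_u, e)]
        p_new = exp_map(p, [step * g for g in d_p])
        # Keep u tangent to the manifold at the updated base point.
        u = transport(p, p_new, [ui + step * g for ui, g in zip(u, d_u)])
        p = p_new
    return p, u

# Noise-free toy data generated on a known geodesic of S^2.
xs = [i / 19 for i in range(20)]
ys = [predict([0.0, 0.0, 1.0], [0.5, 0.3, 0.0], x) for x in xs]
s0 = norm([0.1, 0.1, 1.0])
p0, u0 = [0.1 / s0, 0.1 / s0, 1.0 / s0], [0.0, 0.0, 0.0]
p_fit, u_fit = fit_geodesic(xs, ys, p0, u0)
print(sq_error(p0, u0, xs, ys), sq_error(p_fit, u_fit, xs, ys))
```

After each update of `p`, the slope `u` is transported to the new tangent space; without this step `u` would no longer be a valid tangent vector at the base point.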

Note that the above geodesic model considers only a scalar independent variable $x \in \mathbb{R}$. This means that the derivatives are obtained from Jacobi fields along a single geodesic curve parameterized by a single tangent vector $u$. The computation of the Jacobi fields relies on so-called adjoint operators, which in practice play the role of parallel transport of the error term of the geodesic regression. Extending the model to the multivariate case $x \in \mathbb{R}^n$ requires a slightly different approach, which involves identifying multiple geodesic curves (these can be regarded as the counterpart of "basis" vectors in Euclidean space). Multivariate general linear models (MGLMs) on Riemannian manifolds provide a solution to this problem.

The MGLM uses a geodesic basis $U = [u_1 \ldots u_n]$ formed by multiple tangent vectors $u_j \in T_p\mathcal{M}$, one for each dimension of $x$. Problem (9) can then be reformulated as

$$(\hat{p}, \hat{u}_j) = \underset{(p, u_j) \in T\mathcal{M},\ \forall j}{\arg\min}\ \frac{1}{2}\sum_{i=1}^{T} d_{\mathcal{M}}(\hat{y}_i, y_i)^2, \qquad \hat{y}_i = \mathrm{Exp}_p(U x_i). \tag{10}$$

To solve equation (10), the corresponding gradients can be computed by exploiting the insight that the adjoint operators resemble parallel transport operations. In this way, the hurdle of designing special adjoint operators for the multivariate case is avoided; instead, parallel transport operations may be performed to approximate the required gradients. This multivariate framework serves the purpose of computing a weight vector, analogous to equation (3), for each demonstration lying on the Riemannian manifold $\mathcal{M}$.

The following explains how the MGLM can be used when the demonstration data correspond to quaternion trajectories, i.e., when $\mathcal{M} \equiv \mathcal{S}^3$.

If the human demonstrations (provided via kinesthetic teaching or teleoperation) are characterized by Cartesian motion patterns, a learning model 112 is needed that covers both the translational and the rotational motion of the robot end effector. This means that a given demonstration trajectory $\tau = \{y_t\}_{t=1}^{T}$ is now composed of data points $y_t \in \mathbb{R}^3 \times \mathcal{S}^3$ that represent the full Cartesian pose of the end effector at time step $t$. The challenge in this case is learning a ProMP in orientation space, since the Euclidean case in $\mathbb{R}^3$ follows the classical ProMP.

First, an equivalent expression for $\hat{y}$ is introduced in the MGLM framework so that it resembles the linear basis function model in equation (1). Specifically, the estimate is

$$\hat{y} = \mathrm{Exp}_p(U x) = \mathrm{Exp}_p\Big(\sum_{j=1}^{n} x_j u_j\Big) \equiv \mathrm{Exp}_p(X u), \tag{11}$$

where $X = [x_1 I \ \ldots\ x_n I]$, $I$ is the identity matrix, and $u = [u_1^\top \ldots u_n^\top]^\top$.

This equivalence proves useful when establishing the analogy between the classical ProMP formulation and the proposed approach for orientation trajectories. As in equation (1), a point $y_t \in \mathcal{M}$ of $\tau$ can be expressed as a geodesic basis function model:

$$P(y_t \mid w) = \mathcal{N}_{\mathcal{M}}(y_t \mid \mathrm{Exp}_p(\Psi_t w), \Sigma_y), \tag{12}$$

where $p$ is a fixed base point on $\mathcal{M}$, $w = [w_1^\top \ldots w_{N_\phi}^\top]^\top$ is a large weight vector concatenating $N_\phi$ weight vectors $w_n \in T_p\mathcal{M}$, $\Psi_t$ is the same matrix of time-dependent basis functions as in equation (1), and $\Sigma_y$ is a covariance matrix encoding the uncertainty on $T_{\mathrm{Exp}_p(\Psi_t w)}\mathcal{M}$. Two aspects of this formulation are particularly noteworthy: (i) the mean of the Riemannian Gaussian distribution in equation (12), i.e., $\mathrm{Exp}_p(\Psi_t w) \in \mathcal{M}$, exploits the equivalent MGLM formulation discussed above, and (ii) the weight vectors forming $w$ in equation (12) correspond to the vectors that make up the geodesic basis of the MGLM.
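To make the mean of equation (12) concrete, the sketch below evaluates $\mathrm{Exp}_p(\Psi_t w)$ on $\mathcal{S}^2$ as a basis-weighted sum of tangent weight vectors followed by the exponential map. The normalized Gaussian basis functions, their width, and the function names are assumptions made for illustration; the source does not specify the basis in this section.

```python
import math

def rbf_basis(t, n_basis, width=0.02):
    """Normalized Gaussian basis functions with centers uniform on [0, 1]."""
    centers = [i / (n_basis - 1) for i in range(n_basis)]
    raw = [math.exp(-0.5 * (t - c) ** 2 / width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def exp_map(p, u):
    """Exponential map on the unit sphere."""
    n = math.sqrt(sum(ui * ui for ui in u))
    if n < 1e-12:
        return list(p)
    return [pi * math.cos(n) + ui / n * math.sin(n) for pi, ui in zip(p, u)]

def model_mean(p, weights, t):
    """Mean of the geodesic basis function model at phase t in [0, 1]."""
    phi = rbf_basis(t, len(weights))
    # Weighted sum of the tangent weight vectors w_n, i.e. Psi_t w in T_p M.
    tangent = [sum(ph * w[k] for ph, w in zip(phi, weights))
               for k in range(len(p))]
    return exp_map(p, tangent)

p = [0.0, 0.0, 1.0]
# Each w_n is tangent at p, so its component along p is zero.
weights = [[0.0, 0.0, 0.0], [0.4, 0.2, 0.0], [0.8, -0.1, 0.0]]
trajectory = [model_mean(p, weights, k / 10) for k in range(11)]
```

Because the combination is formed in the tangent space and only then mapped through `exp_map`, every trajectory point stays on the sphere regardless of the weights.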

Since each demonstration is characterized by a different weight vector $w$, a distribution $P(w;\theta) = \mathcal{N}(w \mid \mu_w, \Sigma_w)$ can again be obtained. The marginal distribution of $y_t$ can therefore be computed as

$$P(y;\theta) = \int \mathcal{N}_{\mathcal{M}}(y \mid \mathrm{Exp}_p(\Psi w), \Sigma_y)\,\mathcal{N}(w \mid \mu_w, \Sigma_w)\,dw, \tag{13}$$

where the marginal depends on two probability distributions that lie on different manifolds (for simplicity, the time index is omitted here and below). However, the mean $\mu_y$ depends on a single fixed point $p \in \mathcal{M}$ and on $\mu_w \in T_p\mathcal{M}$. These two observations are exploited to solve the marginal (13) on the tangent space $T_p\mathcal{M}$:

$$P(\mathrm{Log}_p(y);\theta) = \int \mathcal{N}(\mathrm{Log}_p(y) \mid \Psi w, \tilde{\Sigma}_y)\,\mathcal{N}(w \mid \mu_w, \Sigma_w)\,dw = \mathcal{N}\big(\mathrm{Log}_p(y) \mid \Psi\mu_w,\ \Psi\Sigma_w\Psi^\top + \tilde{\Sigma}_y\big),$$

where $\tilde{\Sigma}_y = \Gamma_{\mu_y \to p}(\Sigma_y)$ is the covariance $\Sigma_y$ parallel-transported from $\mu_y$ to $p$. Note that this marginal distribution still lies on the tangent space $T_p\mathcal{M}$, so it is mapped back to $\mathcal{M}$ using the exponential map. The final marginal is

$$P(y;\theta) = \mathcal{N}_{\mathcal{M}}(y \mid \hat{\mu}_y, \hat{\Sigma}_y), \tag{14}$$

with $\hat{\mu}_y = \mathrm{Exp}_p(\Psi\mu_w)$ and $\hat{\Sigma}_y = \Gamma_{p \to \hat{\mu}_y}\big(\Psi\Sigma_w\Psi^\top + \tilde{\Sigma}_y\big)$.

As mentioned above, the learning process of a ProMP boils down to estimating the weight distribution $P(w;\theta)$. To do so, for each demonstration $i$, the controller 106 uses the MGLM to estimate the weight vector $w_i$. First, the equivalent expression for $\hat{y}_t$ introduced earlier is used with the basis function vector as input, $x \equiv \phi_t \in \mathbb{R}^{N_\phi}$, where $N_\phi$ is the number of basis functions. Furthermore, consider a demonstrated quaternion trajectory $\tau = \{y_t\}_{t=1}^{T}$ with $y_t \in \mathcal{S}^3$. Then, analogously to equation (3) in Euclidean space, the weight estimate is obtained by leveraging equation (10):

$$(\hat{p}, \hat{w}_n) = \underset{(p, w_n) \in T\mathcal{M},\ \forall n}{\arg\min}\ E(p, w_n), \qquad E(p, w_n) = \frac{1}{2}\sum_{t=1}^{T} d_{\mathcal{M}}\big(\mathrm{Exp}_p(W\phi_t),\, y_t\big)^2, \tag{15}$$

where $\phi_t$ is the vector of basis functions at time $t$ and $W = [w_1 \ldots w_{N_\phi}]$ contains the set of estimated tangent weight vectors $w_n \in T_p\mathcal{M}$ (i.e., $N_\phi$ tangent vectors emanating from the point $p \in \mathcal{M}$).

Figure 3 illustrates the multivariate general linear regression on the sphere manifold $\mathcal{S}^2$ used to learn the weights of the orientation ProMP. Given a trajectory $y$, the origin $p$ of the tangent space $T_p\mathcal{M}$ and the tangent weight vectors $w_n$ are estimated via equation (15).

To solve equation (15), the gradients of $E(p, w_n)$ with respect to $p$ and to each $w_n$ are computed. As explained above, these gradients depend on so-called adjoint operators which, roughly speaking, bring each error term $\mathrm{Log}_{\hat{y}_t}(y_t)$ from $T_{\hat{y}_t}\mathcal{M}$ to $T_p\mathcal{M}$, with $\hat{y}_t = \mathrm{Exp}_p(W\phi_t)$. These adjoint operators can therefore be approximated by parallel transport operations $\Gamma_{\hat{y}_t \to p}$. This leads to the following reformulation of the error function of equation (15):

$$E(p, w_n) = \frac{1}{2}\sum_{t=1}^{T} \big\lVert \Gamma_{\hat{y}_t \to p}\big(\mathrm{Log}_{\hat{y}_t}(y_t)\big) \big\rVert^2. \tag{16}$$

The approximate gradients of the error function $E(p, w_n)$ are then

$$\nabla_p E(p, w_n) = -\sum_{t=1}^{T} \Gamma_{\hat{y}_t \to p}\big(\mathrm{Log}_{\hat{y}_t}(y_t)\big), \qquad \nabla_{w_n} E(p, w_n) = -\sum_{t=1}^{T} \phi_{t,n}\,\Gamma_{\hat{y}_t \to p}\big(\mathrm{Log}_{\hat{y}_t}(y_t)\big), \tag{17}$$

where $\hat{y}_t = \mathrm{Exp}_p(W\phi_t)$, $\Gamma_{\hat{y}_t \to p}$ denotes parallel transport from $\hat{y}_t$ to $p$, and $\phi_{t,n}$ is the $n$-th basis function at time $t$. Using these gradients, the controller 106 can estimate, for each demonstration $i$, both the point $p_i$ and the weight matrix $W_i$ formed by the $N_\phi$ vectors $w_n$. Note that each demonstration may lead to a different estimate of $p$, which defines the origin on the manifold $\mathcal{M}$ used to estimate each tangent weight vector $w_n \in T_p\mathcal{M}$. This may produce different tangent spaces across demonstrations and, consequently, widely differing tangent weight vectors. An effective way to overcome this problem is to assume that all demonstrations share the same tangent space origin $p$, which is the same assumption made when defining the geodesic basis function model (equation (12)). Thus, according to various embodiments, the controller 106 estimates $p$ for a single demonstration and uses it to estimate all tangent weight vectors for the whole set of demonstrations. Then, given a set of $N$ demonstrations, the weight distribution parameters $\theta = \{\mu_w, \Sigma_w\}$ can be estimated by standard maximum likelihood as

$$\mu_w = \frac{1}{N}\sum_{i=1}^{N} w_i, \qquad \Sigma_w = \frac{1}{N}\sum_{i=1}^{N} (w_i - \mu_w)(w_i - \mu_w)^\top.$$
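The maximum-likelihood fit of the weight distribution is a sample mean and a (biased, $1/N$) sample covariance over the per-demonstration weight vectors. A minimal sketch, assuming each demonstration's tangent weight vectors have already been estimated with a shared $p$ and flattened into one list per demonstration (the function name is illustrative):

```python
def fit_weight_distribution(weight_vectors):
    """Maximum-likelihood mean and covariance of a list of weight vectors."""
    n = len(weight_vectors)
    d = len(weight_vectors[0])
    mean = [sum(w[k] for w in weight_vectors) / n for k in range(d)]
    cov = [[sum((w[i] - mean[i]) * (w[j] - mean[j]) for w in weight_vectors) / n
            for j in range(d)]
           for i in range(d)]
    return mean, cov

# Two toy "demonstrations" with two weights each.
mu_w, sigma_w = fit_weight_distribution([[1.0, 2.0], [3.0, 6.0]])
```

Because all weight vectors live in the same tangent space $T_p\mathcal{M}$, ordinary Euclidean statistics apply here; this is the practical benefit of sharing $p$ across demonstrations.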

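An example of a learning algorithm for the robot control model 112 based on orientation ProMPs, which the controller 106 can execute after a set of $N$ demonstrations has been provided (e.g., by a user moving the robot arm 101 by hand), is as follows:

1. Estimate the tangent space origin $p$ from a single demonstration via equation (15).
2. For each demonstration $i$, estimate the tangent weight vectors $w_n$ via equation (15) with $p$ held fixed, and concatenate them into $w_i$.
3. Fit the Gaussian weight distribution $\mathcal{N}(w \mid \mu_w, \Sigma_w)$ to the vectors $w_i$ by maximum likelihood.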

As in classical ProMP, the controller 106 can perform trajectory modulation (i.e., adaptation to new situations, that is, to new control scenarios) by conditioning the motion to reach a desired trajectory point $y_t^* \in \mathcal{M}$ with associated covariance $\Sigma_y^*$. This results in the conditional probability

$$P(w \mid y_t^*) \propto \mathcal{N}_{\mathcal{M}}(y_t^* \mid \mathrm{Exp}_p(\Psi_t w), \Sigma_y^*)\,\mathcal{N}(w \mid \mu_w, \Sigma_w),$$

which, like equation (13), depends on two probability distributions that lie on different manifolds. Here again, use is made of the fact that the mean $\mu_y$ depends on a single fixed $p \in \mathcal{M}$, which is the base point of the tangent space $T_p\mathcal{M}$ on which the weight distribution lies. The conditional distribution can therefore be rewritten as

$$P(w \mid y_t^*) = \mathcal{N}(w \mid \mu_w^*, \Sigma_w^*) \propto \mathcal{N}\big(\mathrm{Log}_p(y_t^*) \mid \Psi_t w, \tilde{\Sigma}_y^*\big)\,\mathcal{N}(w \mid \mu_w, \Sigma_w),$$

where $\mu_w^*$ and $\Sigma_w^*$ are the parameters to estimate for the resulting conditional distribution and $\tilde{\Sigma}_y^* = \Gamma_{y_t^* \to p}(\Sigma_y^*)$ is the desired covariance parallel-transported to $p$. Since both distributions now lie on $T_p\mathcal{M}$, which is embedded in Euclidean space, the new distribution parameters can be estimated as in the classical ProMP conditioning procedure, with special care for the parallel transport of the covariance matrices. The new weight distribution parameters are then

$$\mu_w^* = \Sigma_w^*\big(\Psi_t^\top (\tilde{\Sigma}_y^*)^{-1}\,\mathrm{Log}_p(y_t^*) + \Sigma_w^{-1}\mu_w\big), \tag{18}$$

$$\Sigma_w^* = \big(\Sigma_w^{-1} + \Psi_t^\top (\tilde{\Sigma}_y^*)^{-1}\Psi_t\big)^{-1}. \tag{19}$$

結果としての新たな重み分布からは、新たな周辺分布Ｐ（ｙ；θ^＊）がここでは式（１４）を介して得られる可能性もある。 From the resulting new weight distribution, new marginal distributions P(y;θ ^* ) may also be obtained, here via equation (14).

ブレンディングに関して、古典的なＰｒｏＭＰは、ガウス分布の積を使用することによって、運動プリミティブの集合をブレンドする。Ｍにおいてプリミティブをブレンドする場合、各軌道分布は、異なる接空間ＴｐＭ上にある重みベクトルの集合によってパラメータ化されることを考慮する必要がある。したがって、ガウス分布の加重積を再定式化する必要がある。そうするために、様々な実施形態によれば、リーマン多様体上のガウス積の定式化が使用され、ここで、積の対数尤度は、勾配ベースのアプローチを使用して繰り返し最大化される。 Regarding blending, classical ProMP blends a set of motion primitives by using a product of Gaussian distributions. When blending primitives in M, it is necessary to consider that each trajectory distribution is parameterized by a set of weight vectors on a different tangent space TpM. Therefore, the weighted product of Gaussian distributions needs to be reformulated. To do so, according to various embodiments, a formulation of a Gaussian product on a Riemannian manifold is used, where the log-likelihood of the product is iteratively maximized using a gradient-based approach.

Formally, the log-likelihood of the product of Riemannian Gaussian distributions is given (up to constant terms) by

$$J(y) = -\frac{1}{2}\sum_{s=1}^{S} \alpha_s\,\mathrm{Log}_{\mu_{y,s}}(y)^\top\,\Sigma_{y,s}^{-1}\,\mathrm{Log}_{\mu_{y,s}}(y), \tag{20}$$

where $\mu_{y,s}$ and $\Sigma_{y,s}$ are the parameters of the marginal distribution $P_s(y;\theta)$ for skill $s$, $\alpha_s$ is the blending weight of skill $s$, and $S$ is the number of skills. Note that the logarithmic maps in equation (20) act on different tangent spaces $T_{\mu_{y,s}}\mathcal{M}$. To perform the log-likelihood maximization, the base and the argument of the maps are switched while ensuring that the original log-likelihood function remains unchanged. To do so, the relation $\mathrm{Log}_x(y) = -\mathrm{Log}_y(x)$ and parallel transport operations can be used, which gives

$$J(\mu^+) = -\frac{1}{2}\sum_{s=1}^{S} \alpha_s\,\mathrm{Log}_{\mu^+}(\mu_{y,s})^\top\,\tilde{\Sigma}_{y,s}^{-1}\,\mathrm{Log}_{\mu^+}(\mu_{y,s}), \tag{21}$$

where $\mu^+$ is the mean of the resulting (estimated) Gaussian and $\tilde{\Sigma}_{y,s} = \Gamma_{\mu_{y,s} \to \mu^+}(\Sigma_{y,s})$. Equation (21) can be rewritten by defining the vector $\epsilon(\mu^+) = [\mathrm{Log}_{\mu^+}(\mu_{y,1})^\top \ldots \mathrm{Log}_{\mu^+}(\mu_{y,S})^\top]^\top$ and the block-diagonal matrix $\Lambda = \mathrm{blockdiag}(\alpha_1\tilde{\Sigma}_{y,1}^{-1}, \ldots, \alpha_S\tilde{\Sigma}_{y,S}^{-1})$. As a result, $J$ takes the form of the objective function used to compute the empirical mean $\nu$ of a Gaussian distribution on a Riemannian manifold $\mathcal{M}$,

$$J(\nu) = -\frac{1}{2}\,\epsilon(\nu)^\top \Lambda\,\epsilon(\nu),$$

from which the mean can be computed iteratively as

$$\Delta = -\big(\boldsymbol{J}^\top \Lambda \boldsymbol{J}\big)^{-1}\boldsymbol{J}^\top \Lambda\,\epsilon(\nu_k), \qquad \nu_{k+1} = \mathrm{Exp}_{\nu_k}(\Delta),$$

where $\boldsymbol{J}$ is the Jacobian of $\epsilon(\nu)$ with respect to the basis of the tangent space of $\mathcal{M}$ at $\nu_k$. The controller 106 can perform a similar iterative estimation of the mean $\mu^+$:

$$\mu^+_{k+1} = \mathrm{Exp}_{\mu^+_k}(\Delta), \qquad \text{with} \quad \Delta = \Big(\sum_{s=1}^{S}\alpha_s\tilde{\Sigma}_{y,s}^{-1}\Big)^{-1}\sum_{s=1}^{S}\alpha_s\tilde{\Sigma}_{y,s}^{-1}\,\mathrm{Log}_{\mu^+_k}(\mu_{y,s}).$$

After convergence at iteration $K$, the controller 106 obtains the final parameters of the distribution $P(y^+) = \mathcal{N}_{\mathcal{M}}(y^+ \mid \mu^+, \Sigma^+)$ as

$$\mu^+ \leftarrow \mu^+_K, \qquad \Sigma^+ = \Big(\sum_{s=1}^{S}\alpha_s\tilde{\Sigma}_{y,s}^{-1}\Big)^{-1}.$$
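The iterative mean estimation can be sketched on $\mathcal{S}^2$ under a strong simplifying assumption that is not part of the source: every skill covariance is isotropic, $\Sigma_{y,s} = \sigma_s^2 I$. In that case parallel transport leaves the covariance unchanged, and each update is a precision-weighted average of logarithmic maps followed by an exponential map. Function names and the stopping tolerance are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exp_map(x, u):
    n = norm(u)
    if n < 1e-12:
        return list(x)
    return [xi * math.cos(n) + ui / n * math.sin(n) for xi, ui in zip(x, u)]

def log_map(x, y):
    c = max(-1.0, min(1.0, dot(x, y)))
    v = [yi - c * xi for xi, yi in zip(x, y)]
    n = norm(v)
    if n < 1e-12:
        return [0.0] * len(x)
    return [math.acos(c) * vi / n for vi in v]

def blend(means, alphas, variances, iters=50, tol=1e-12):
    """Blend isotropic Riemannian Gaussians; returns (mean, isotropic variance)."""
    lam = [a / v for a, v in zip(alphas, variances)]  # alpha_s / sigma_s^2
    total = sum(lam)
    mu = list(means[0])  # start the iteration at the first skill's mean
    for _ in range(iters):
        delta = [0.0] * len(mu)
        for m, l in zip(means, lam):
            delta = [d + l / total * e for d, e in zip(delta, log_map(mu, m))]
        mu = exp_map(mu, delta)
        if norm(delta) < tol:
            break
    return mu, 1.0 / total

mu_plus, var_plus = blend([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                          alphas=[0.5, 0.5], variances=[0.1, 0.1])
```

With equal weights and variances the blended mean is the geodesic midpoint of the two skill means; with full covariance matrices the transport of each $\Sigma_{y,s}$ to the current estimate would have to be carried out explicitly.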

As explained above, in classical ProMP the weight distribution $P(w;\theta) = \mathcal{N}(w \mid \mu_w, \Sigma_w)$ can be adapted as a function of an external task parameter, where it is assumed that the value of the task parameter is available for each demonstration. Task parameterization applies in the same way to orientation ProMPs, because the weight vectors $w_n \in T_p\mathcal{M}$ lie in a tangent space; equation (6) can therefore be applied directly as long as the task parameter is Euclidean. If, however, the task parameter belongs to a Riemannian manifold, a more general approach is required.

If the task parameter lies on a Riemannian manifold, the controller 106 can learn the joint probability distribution of the weight vector and the task parameter using a Gaussian mixture model on Riemannian manifolds. Subsequently, when a new task parameter is provided, the controller 106 can employ Gaussian mixture regression during reproduction to compute the conditional distribution of the weight vector given the new task parameter.

To better illustrate how model learning, trajectory reproduction, waypoint adaptation, and skill blending work in orientation ProMPs, a dataset of handwritten letters was used. The original trajectories were generated in $\mathbb{R}^2$ and then projected onto $\mathcal{S}^2$ by a simple mapping to unit-norm vectors. Each letter in the dataset was demonstrated $N = 8$ times, and a simple smoothing filter was applied to each trajectory, mainly for visualization purposes. Four ProMP models were trained, one for each letter in the set {G, I, J, S}. The models trained for I and J used $N_\phi = 30$ basis functions with uniformly distributed centers, and the models for the letters G and S used $N_\phi = 60$ basis functions. The orientation ProMP models were trained according to the algorithm given above, with an initial learning rate $\alpha = 0.005$ and a corresponding upper bound $\alpha_{max} = 0.03$.
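The source only says that the projection is "a simple mapping to unit-norm vectors" and does not give the mapping itself. One possible choice, shown purely as an assumption for planar input data, lifts each 2-D point to three dimensions with a constant third coordinate and normalizes the result:

```python
import math

def project_to_sphere(point_2d):
    """Lift (a, b) to (a, b, 1) and scale it to unit norm (hypothetical mapping)."""
    a, b = point_2d
    n = math.sqrt(a * a + b * b + 1.0)
    return [a / n, b / n, 1.0 / n]

# A short planar stroke mapped onto S^2.
stroke = [(0.0, 0.0), (0.1, 0.2), (0.3, 0.4)]
on_sphere = [project_to_sphere(pt) for pt in stroke]
```

This particular lift maps the planar origin to the pole `[0, 0, 1]` and keeps all points on the upper hemisphere; other unit-norm mappings would serve equally well for the illustration.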

Figure 4 shows the demonstration data, the marginal distribution $P(y;\theta)$ computed via equation (13), and the waypoint adaptation obtained from equations (18) and (19), for the models trained on the letters G and S. The mean of the marginal distribution follows the demonstrated pattern, and the corresponding covariance profile captures the variability of the demonstrations on $\mathcal{S}^2$. It is worth noting that the trajectories of the letters G and S exhibit very elaborate "motion" patterns that may be even more complex than those observed in realistic robot settings. For the waypoint adaptation, a random point $y^* \in \mathcal{S}^2$ with an associated covariance $\Sigma_y^*$ was used (i.e., high precision was required when passing through $y^*$).

図４に示すように、配向ＰｒｏＭＰは、所与の通過点を正確に通過しながら、軌道と関連する共分散プロファイルとの両方をスムーズに適合させることができる。 As shown in Figure 4, Orientation ProMP can smoothly fit both the trajectory and the associated covariance profile while accurately passing through a given waypoint.

図５は、｛Ｇ，Ｉ｝および｛Ｓ，Ｊ｝に対する配向ＰｒｏＭＰのブレンディングプロセスを示す。 Figure 5 shows the blending process of oriented ProMP for {G,I} and {S,J}.

The goal was to generate a trajectory that starts by following the profile of the first letter of the set and then smoothly switches midway to the trajectory distribution of the second letter. Figure 5 shows the resulting blended trajectories for the two cases, where the orientation ProMP smoothly blends the two given trajectory distributions by following the blending procedure introduced above. Note that the blending behavior strongly depends on the temporal evolution of the weights $\alpha_s \in [0,1]$ associated with each skill $s$. In this set of experiments, a sigmoid-like function was used for the weights $\alpha_G$ and $\alpha_S$, while $\alpha_I = 1 - \alpha_G$ and $\alpha_J = 1 - \alpha_S$. These results show that orientation ProMPs successfully learn and reproduce trajectory distributions on $\mathcal{S}^2$ and provide full waypoint adaptation and blending capabilities.
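A sigmoid-like weight schedule of the kind mentioned for the blending experiments can be sketched as follows. The steepness, the switching phase, and the choice to define the second weight as the complement of the first are assumptions for illustration; the source does not give the exact function.

```python
import math

def blend_weights(t, steepness=20.0, switch=0.5):
    """Weights (first skill, second skill) at phase t in [0, 1]."""
    first = 1.0 / (1.0 + math.exp(steepness * (t - switch)))
    return first, 1.0 - first

# The first skill dominates early, the second skill dominates late.
schedule = [blend_weights(k / 10) for k in range(11)]
```

A larger `steepness` makes the hand-over between the two skills more abrupt, while `switch` moves the point at which both skills contribute equally.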

Experiments show that this also holds in a robot setting, for example for a reorientation skill that corresponds to lifting a previously grasped object, rotating the end effector 104, and placing the object back at its original location but with a different orientation. This robot skill features large changes in both position and orientation and is therefore well suited to showcasing the functionality of orientation ProMPs.

To train a robot skill such as the reorientation skill, each demonstration provides, for example, a full-pose robot end-effector trajectory $\tau = \{y_t\}_{t=1}^{T}$, where $y_t \in \mathbb{R}^3 \times \mathcal{S}^3$ represents the pose of the end effector at time step $t$. Each demonstration thus demonstrates a position trajectory (a time series of positions, each described by an element of $\mathbb{R}^3$) and an orientation trajectory (a time series of orientations, each described by an element of $\mathcal{S}^3$). The raw data from these trajectories may be used to train a ProMP model 112 on $\mathbb{R}^3 \times \mathcal{S}^3$ that comprises a sub-model for position and a sub-model for orientation, where the position model is learned using the classical ProMP approach and the orientation model is learned using the orientation ProMP approach (e.g., the algorithm described above). The same set of basis functions (e.g., $N_\phi = 40$) may be used for both sub-models, applied to the respective components (each position component in the position sub-model and each orientation component in the orientation sub-model).
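Splitting a recorded pose into the parts handled by the two sub-models can be sketched as below. The 7-element layout `[x, y, z, qw, qx, qy, qz]` and the function name are assumptions; the source does not specify a storage format. The quaternion is renormalized so that it lies on $\mathcal{S}^3$.

```python
import math

def split_pose(pose):
    """Split [x, y, z, qw, qx, qy, qz] into a position and a unit quaternion."""
    position = list(pose[:3])
    q = list(pose[3:7])
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("quaternion must be non-zero")
    return position, [c / n for c in q]

demo = [[0.1, 0.2, 0.3, 2.0, 0.0, 0.0, 0.0],
        [0.1, 0.2, 0.4, 1.0, 1.0, 0.0, 0.0]]
positions, orientations = zip(*(split_pose(p) for p in demo))
```

The position series would go to the classical (Euclidean) sub-model and the orientation series to the orientation sub-model. Note that a quaternion and its negative encode the same rotation, so a recorded series may need a consistent sign before learning.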

要約すると、様々な実施形態に従って、本方法は、図６に示されるように提供される。 In summary, according to various embodiments, the method is provided as shown in Figure 6.

図６は、ロボットデバイスを制御するための方法を示すフローチャート６００を示す。 Figure 6 shows a flowchart 600 illustrating a method for controlling a robotic device.

ステップ６０１では、デモンストレーションがロボットスキルのために提供され、ここで、各デモンストレーションは、ロボット構成のシーケンスを含む軌道をデモンストレーションし、ここで、各ロボット構成は、リーマン多様体の構造を有する予め定められた構成空間の要素によって記述される。 In step 601, demonstrations are provided for a robot skill, where each demonstration demonstrates a trajectory including a sequence of robot configurations, where each robot configuration is described by an element of a predetermined configuration space having the structure of a Riemannian manifold.

ステップ６０２では、各デモンストレーションされた軌道について、ロボットデバイスの予め定められた基本運動の重みベクトルとしての軌道の表現が、重みベクトルに従った基本運動の組み合わせと、デモンストレーションされた軌道との間の距離測定値を最小化する重みベクトルを検索することによって決定され、ここで、組み合わせは、多様体に写像される。 In step 602, for each demonstrated trajectory, a representation of the trajectory as a weight vector of predetermined primitive motions of the robotic device is determined by searching for a weight vector that minimizes a distance measure between the combination of primitive motions according to the weight vector and the demonstrated trajectory, where the combination is mapped to a manifold.

ステップ６０３では、重みベクトルの確率分布が、デモンストレーションされた軌道について決定された重みベクトルに確率分布を適合させることによって決定される。 In step 603, a probability distribution for the weight vector is determined by fitting a probability distribution to the weight vector determined for the demonstrated trajectory.

ステップ６０４では、ロボットデバイスが、重みベクトルの決定された確率分布に従って基本運動を実行することによって制御される。 In step 604, the robotic device is controlled by performing primitive movements according to the determined probability distribution of the weight vector.

これは、（式（１）に従って）重みベクトルの確率分布からサンプリングし、サンプルベクトルに従って基本運動を実行することを含むことができる。また、（式（１４）に従って）軌道の確率分布を導出することも可能であり、そのうちの１つを制御のためにサンプリングすることができ、それらは上記説明のような軌道の混合などの高度な制御に使用されてもよい。 This can involve sampling from a probability distribution of weight vectors (according to equation (1)) and performing primitive movements according to the sampled vector. It is also possible to derive probability distributions of trajectories (according to equation (14)), one of which can be sampled for control, which may be used for more advanced control such as blending trajectories as described above.
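The sampling-based control step can be sketched on $\mathcal{S}^2$: draw a weight vector from the learned Gaussian and map the resulting tangent-space combination onto the manifold at each time step. The sketch assumes a diagonal weight covariance, normalized Gaussian basis functions, and illustrative function names; none of these choices comes from the source.

```python
import math
import random

def rbf_basis(t, n_basis, width=0.02):
    """Normalized Gaussian basis functions with centers uniform on [0, 1]."""
    centers = [i / (n_basis - 1) for i in range(n_basis)]
    raw = [math.exp(-0.5 * (t - c) ** 2 / width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def exp_map(p, u):
    n = math.sqrt(sum(ui * ui for ui in u))
    if n < 1e-12:
        return list(p)
    return [pi * math.cos(n) + ui / n * math.sin(n) for pi, ui in zip(p, u)]

def sample_weights(mean, std, rng):
    """Draw each tangent weight vector from N(mean, diag(std^2))."""
    return [[rng.gauss(m, s) for m, s in zip(m_vec, s_vec)]
            for m_vec, s_vec in zip(mean, std)]

def rollout(p, weights, n_steps):
    """Trajectory on the sphere generated by the sampled weights."""
    traj = []
    for k in range(n_steps):
        phi = rbf_basis(k / (n_steps - 1), len(weights))
        tangent = [sum(ph * w[i] for ph, w in zip(phi, weights))
                   for i in range(len(p))]
        traj.append(exp_map(p, tangent))
    return traj

p = [0.0, 0.0, 1.0]
mean = [[0.0, 0.0, 0.0], [0.4, 0.2, 0.0], [0.8, -0.1, 0.0]]
# Zero spread along p keeps every sampled vector tangent at p.
std = [[0.05, 0.05, 0.0]] * 3
trajectory = rollout(p, sample_weights(mean, std, random.Random(0)), 20)
```

Each call with a different random seed yields a different trajectory drawn from the learned distribution, and each of them could serve as a reference for the robot controller.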

図６の方法は、１つ以上のデータ処理ユニットを含む１つ以上のコンピュータによって実行されてもよい。用語「データ処理ユニット」は、データまたは信号の処理を可能にする任意のタイプの項目として理解することができる。例えば、データまたは信号は、データ処理ユニットによって実行される少なくとも１つの（すなわち１つ以上の）特定の機能に従って処理されてもよい。データ処理ユニットは、アナログ回路、デジタル回路、コンポジット信号回路、ロジック回路、マイクロプロセッサ、マイクロコントローラ、中央処理装置（ＣＰＵ）、グラフィックス処理ユニット（ＧＰＵ）、デジタル信号プロセッサ（ＤＳＰ）、プログラマブルゲートアレイ（ＦＰＧＡ）集積回路、またはそれらの任意の組み合わせを含むことができ、あるいはそれらから形成されてもよい。それぞれの機能を実装する任意の他の手法は、データ処理ユニットまたは論理回路として理解されてもよい。本明細書に詳細に記載される方法ステップのうちの１つ以上は、データ処理ユニットによって実行される１つ以上の特定の機能を通して、データ処理ユニットによって実行（例えば、実装）されてもよいことが理解されるであろう。 The method of FIG. 6 may be performed by one or more computers including one or more data processing units. The term "data processing unit" may be understood as any type of item that enables processing of data or signals. For example, data or signals may be processed according to at least one (i.e., one or more) specific functions performed by the data processing unit. The data processing unit may include or be formed from an analog circuit, a digital circuit, a composite signal circuit, a logic circuit, a microprocessor, a microcontroller, a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a programmable gate array (FPGA) integrated circuit, or any combination thereof. Any other manner of implementing the respective functions may be understood as a data processing unit or logic circuit. It will be understood that one or more of the method steps detailed herein may be performed (e.g., implemented) by a data processing unit through one or more specific functions performed by the data processing unit.

様々な実施形態は、例えば、デモンストレーションのデータを取得するために、ビデオ、レーダ、ＬｉＤＡＲ、超音波、サーマルイメージング、ソナーなどのような、様々な視覚センサ（カメラ）から画像データを受信し、使用することができる。 Various embodiments may receive and use image data from various visual sensors (cameras), such as video, radar, LiDAR, ultrasound, thermal imaging, sonar, etc., to acquire data for demonstration purposes.

図６のアプローチは、例えば、ロボット、車両、家電製品、電動工具、製造機械、パーソナルアシスタント、またはアクセス制御システムなどのコンピュータ制御された機械のような物理システムを制御するための制御信号を計算するために使用することができる。様々な実施形態によれば、物理システムを制御するためのポリシーが学習され、次いで、この物理システムがそれに応じて操作されてもよい。 The approach of FIG. 6 can be used to compute control signals for controlling a physical system, such as a computer-controlled machine, such as a robot, a vehicle, an appliance, a power tool, a manufacturing machine, a personal assistant, or an access control system. According to various embodiments, a policy for controlling the physical system may be learned, and the physical system may then be operated accordingly.

一実施形態によれば、この方法はコンピュータに実装される。 According to one embodiment, this method is computer-implemented.

本明細書では、特定の実施形態が示され説明されてきたが、当業者であるならば、図示され説明されてきたこれらの特定の実施形態を、本発明の保護範囲から逸脱することなく様々な代替的および／または等価的な実装形態に入れ替えてもよいことは明らかであろう。本出願では、本明細書で論じられる特定の実施形態の何らかの適合化または変化形態をカバーすることが意図されている。それゆえ、本発明は、本出願の特許請求の範囲および等価物によってのみ限定されることが意図される。 While specific embodiments have been shown and described herein, it will be apparent to those skilled in the art that the specific embodiments shown and described may be substituted with various alternative and/or equivalent implementations without departing from the scope of protection of the present invention. This application is intended to cover any adaptations or variations of the specific embodiments discussed herein. Therefore, it is intended that the present invention be limited only by the claims of this application and their equivalents.

Claims

1. A method for controlling a robotic device, the method comprising:
providing demonstrations for a robotic skill, each demonstration demonstrating a trajectory including a sequence of robot configurations, each robot configuration being described by an element of a predetermined configuration space having the structure of a Riemannian manifold;
determining, for each demonstrated trajectory, a representation of the trajectory as a weight vector of predetermined primitive movements of the robotic device by searching for a weight vector that minimizes a distance measure between the demonstrated trajectory and a combination of primitive movements according to the weight vector, the combination being mapped onto the manifold;
determining a probability distribution for a weight vector by fitting a probability distribution to the weight vector determined for the demonstrated trajectory;
and controlling the robotic device by executing primitive movements according to the determined probability distribution of the weight vector.

2. The method of claim 1, wherein the probability distribution of the weight vector is determined by fitting a Gaussian distribution to the weight vector determined for the demonstrated trajectory.

3. The method of claim 1 or 2, wherein each of the demonstrated trajectories includes a robot configuration for each time point of a predetermined sequence of time points, and each combination of primitive movements according to a weight vector specifies a robot configuration for each time point of the predetermined sequence of time points, wherein, for each of the demonstrated trajectories, the weight vector is determined by determining, from a set of possible weight vectors, the weight vector for which the distance measure between the combination of primitive movements according to the weight vector, mapped onto the manifold, and the demonstrated trajectory is smallest among the set of possible weight vectors, and wherein the distance measure between the combination of primitive movements mapped onto the manifold and the demonstrated trajectory is given by a sum over the time points of the sequence of time points, the sum including, for each time point, a term that contains a value, or a power of a value, of the metric of the manifold between the element of the manifold given by the demonstrated trajectory at that time point and the element of the manifold given by the combination of primitive movements at that time point when mapped onto the manifold.

4. The method of claim 1, further comprising: searching for a point of the manifold and a weight vector such that, for one of the demonstrated trajectories, the distance measure between the combination of primitive movements according to the weight vector, mapped onto the manifold from the tangent space at the point, and the demonstrated trajectory is minimized; and, for each of the demonstrated trajectories, performing the mapping of each combination onto the manifold by mapping the combination from the tangent space at the selected point.

5. The method of claim 1, wherein the trajectories are orientation trajectories, wherein each of the demonstrations further demonstrates a position trajectory, and wherein each of the robot configurations includes a position described by a vector in three-dimensional space and an orientation described by an element of the predetermined configuration space.

6. The method of claim 1, further comprising: providing demonstrations for a plurality of robot skills; determining, for each skill, the representations of the trajectories as weight vectors and the probability distribution of the weight vectors; determining, for each skill, a Riemannian Gaussian distribution of manifold points from the probability distribution of the weight vectors; determining a product probability distribution of the Riemannian Gaussian distributions of the skills; and controlling the robotic device by sampling from the determined product probability distribution.

7. A robotic device controller configured to perform the method of claim 1 or 2.

8. A computer program comprising instructions which, when executed by a processor, cause the processor to carry out the method of claim 1 or 2.

9. A computer-readable medium having stored thereon instructions which, when executed by a processor, cause the processor to perform the method of claim 1 or 2.