JP2026500177A - Systems and methods for anatomy segmentation and anatomical structure tracking

Info

Publication number: JP2026500177A
Application number: JP2025532959A
Authority: JP
Inventors: Dani Kiyasseh; Fabrizio Santini
Original assignee: Vicarious Surgical Inc.
Priority date: 2022-12-06
Filing date: 2023-12-06
Publication date: 2026-01-06
Also published as: EP4629925A1; WO2024123888A1

Abstract

Systems and methods for anatomy segmentation and anatomical structure tracking are provided. The system receives an image from a camera assembly of the system. The image may include a representation of one or more anatomical structures of a subject. The system extracts a visual representation from the image. The system determines position and orientation data associated with a robotic assembly of the system. The position and orientation data indicates a pose of the robotic assembly. The system generates a state representation based at least in part on the pose and the visual representation. The state representation indicates a state of the system. The system identifies one or more anatomical landmarks in an anatomical space in which the robotic assembly is operating. The system generates a plurality of segmentation maps. Each segmentation map identifies which of the anatomical structures the robotic assembly should avoid contacting.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of priority to U.S. Provisional Patent Application No. 63/430,513, filed December 6, 2022, the entire contents of which are incorporated herein by reference.

Surgical robotic systems allow a user (also referred to herein as an "operator" or "user") to perform actions using robotically controlled instruments in order to carry out tasks and functions during a procedure. However, the user (e.g., a surgeon) is still limited to visual feedback of the patient's interior. Such visual feedback can limit the user's ability to navigate and orient themselves within a small and often confusing operating room.

Navigation is typically performed using prior knowledge of the target area and anatomical landmarks. While prior knowledge can be acquired through significant preoperative planning time, recognizing anatomical landmarks can be a challenge if the user does not know the camera orientation relative to the patient or lacks sufficient expertise. This issue can significantly increase intraoperative time and the likelihood of injury, for example, through resection of the wrong structure.

A surgical robotic system is presented. The surgical robotic system includes a robotic assembly. The robotic assembly includes a camera assembly configured to generate one or more images (e.g., a series of video frames, live video footage, a series of photographs captured in burst mode, a single photograph, and/or other suitable images) of an internal cavity of a subject (e.g., a patient) and a robotic arm assembly positioned within the internal cavity to perform a surgical procedure. The surgical robotic system also includes a memory that stores one or more instructions and a processor configured or programmed to read the one or more instructions stored in the memory. The processor is operably coupled to the robotic assembly. The processor is configured to receive an image from the camera assembly. The image may include a representation of one or more anatomical structures of the subject (e.g., organs, organ ducts, blood vessels (arteries and veins), nerves, other delicate structures, and/or obstructing structures that obstruct the robotic arm assembly). The processor is further configured to extract a visual representation from the image. The visual representation is a compact representation (e.g., a vector that represents the image by reducing the dimensions of the image to one, or any other suitable representation that represents the image in a compact manner, such as with reduced data size). The processor is further configured to determine position and orientation data associated with the robotic assembly. The position and orientation data indicates a pose of the robotic assembly. The processor is further configured to generate a pose representation of the robotic assembly based at least in part on the position and orientation data. The processor is further configured to generate a state representation based at least in part on the pose representation and the visual representation. The state representation indicates a state of the surgical robotic system. The processor is further configured to identify, based at least in part on the state representation, one or more anatomical landmarks (e.g., inguinal triangle, triangle of doom, triangle of pain, etc.) within an anatomical space in which the robotic assembly is operating. The processor is further configured to generate a plurality of segmentation maps. Each segmentation map identifies which of the one or more anatomical structures the robotic assembly should avoid contacting.
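The data flow described above (image to visual representation, position and orientation data to pose representation, both combined into a state representation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the tile averaging used as the compact visual representation, the six-value pose layout (x, y, z, yaw, pitch, roll), and concatenation as the fusion step are all hypothetical choices made for the example, as are the function names.

```python
from typing import List, Sequence


def extract_visual_representation(image: Sequence[Sequence[float]],
                                  block: int = 2) -> List[float]:
    """Compress a 2D grayscale image into a 1D vector by averaging tiles.

    Hypothetical stand-in for whatever encoder produces the compact
    visual representation; the source does not specify one.
    """
    rows, cols = len(image), len(image[0])
    vector = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = [image[i][j]
                    for i in range(r, min(r + block, rows))
                    for j in range(c, min(c + block, cols))]
            vector.append(sum(tile) / len(tile))
    return vector


def make_pose_representation(position: Sequence[float],
                             orientation: Sequence[float]) -> List[float]:
    """Flatten position and orientation data into one pose vector (assumed layout)."""
    return [float(v) for v in (*position, *orientation)]


def make_state_representation(pose_repr: Sequence[float],
                              visual_repr: Sequence[float]) -> List[float]:
    """Combine pose and visual representations; concatenation is an assumption."""
    return list(pose_repr) + list(visual_repr)


# Toy 4x4 image and an arbitrary pose, for illustration only.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.5, 0.5, 0.25, 0.25],
         [0.5, 0.5, 0.25, 0.25]]
visual = extract_visual_representation(image)
pose = make_pose_representation((10.0, 20.0, 30.0), (0.1, 0.2, 0.3))
state = make_state_representation(pose, visual)
```

In this sketch the state vector has ten entries: six from the pose and four from the image. A downstream landmark classifier or segmentation model would consume `state`; that step is omitted because the source does not describe it at this point.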

In some embodiments, the processor is further configured to determine a similarity between the state representation and a previous state representation. The previous state representation is generated based at least in part on a previous pose representation extracted from previous position and orientation data and a previous visual representation extracted from a previous image. In response to determining that the similarity is equal to or greater than a similarity threshold, the processor is further configured to average the state representation and the previous state representation to generate an averaged state representation. The processor is further configured to identify the one or more anatomical landmarks based at least in part on the averaged state representation. The processor is further configured to generate a plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures the robotic assembly should avoid contacting. In some embodiments, the processor is further configured to determine a three-dimensional (3D) reconstruction of the anatomical space in which the robotic assembly is operating. The processor is further configured to determine a location of each of the one or more identified anatomical structures within the anatomical space based at least in part on the 3D reconstruction.
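The similarity-gated averaging described above can be illustrated with a short sketch. Several details here are assumptions rather than statements from the source: cosine similarity is used as the similarity measure, the threshold value of 0.9 is arbitrary, and the fallback of returning the current state unchanged when the similarity is below the threshold is a hypothetical choice (the source only describes the case where the threshold is met).

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def fuse_states(current: Sequence[float],
                previous: Sequence[float],
                threshold: float = 0.9) -> List[float]:
    """Average the two states element-wise when they are similar enough.

    Below the threshold, the current state is returned unchanged
    (an assumption; the source does not describe this branch).
    """
    if cosine_similarity(current, previous) >= threshold:
        return [(c + p) / 2.0 for c, p in zip(current, previous)]
    return list(current)


similar = fuse_states([2.0, 2.0], [4.0, 4.0])      # same direction: averaged
dissimilar = fuse_states([1.0, 0.0], [0.0, 1.0])   # orthogonal: current kept
```

Averaging only similar consecutive states is one way to smooth frame-to-frame noise without blending states from clearly different views.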

These and other features and advantages of the present invention will be more fully understood by reference to the following detailed description taken in conjunction with the accompanying drawings, in which like reference characters refer to like elements throughout the various views. The drawings illustrate the principles of the invention and, although not to scale, show relative dimensions.

FIG. 1 illustrates an exemplary surgical robotic system, according to some embodiments.
FIG. 2A is an exemplary perspective view of a patient cart including a robotic support system coupled to a robotic subsystem of a surgical robotic system, according to some embodiments.
FIG. 2B is an exemplary perspective view of an exemplary operator console of the surgical robotic system of the present disclosure, according to some embodiments.
FIG. 3A illustrates an exemplary side view of a surgical robotic system performing a surgical procedure within an internal cavity of a subject, according to some embodiments.
FIG. 3B illustrates an exemplary top view of the surgical robotic system performing the surgical procedure within the internal cavity of the subject of FIG. 3A, according to some embodiments.
FIG. 4A is an exemplary perspective view of a single robotic arm subsystem, according to some embodiments.
FIG. 4B is an exemplary side perspective view of a single robotic arm of the single robotic arm subsystem of FIG. 4A, according to some embodiments.
FIG. 5 is a front perspective view of a camera assembly and a robotic arm assembly, according to some embodiments.
FIG. 6 is an exemplary graphical user interface of a robot pose view including a frustum view, a robot pose view, and a camera view of a patient cavity and a pair of robotic arms of a surgical robotic system, according to some embodiments.
FIG. 7 illustrates an exemplary anatomy segmentation and tracking module for anatomy segmentation and anatomical structure tracking, according to some embodiments.
FIG. 8 is a flowchart illustrating steps for anatomical structure segmentation and anatomical landmark identification performed by a surgical robotic system, according to some embodiments.
FIG. 9 is a flowchart illustrating steps for anatomical structure tracking performed by a surgical robotic system, according to some embodiments.
FIG. 10 is a flowchart illustrating steps for training a surgical robotic system to automatically identify anatomical structures, according to some embodiments.
FIG. 11A is a diagram of an exemplary computing module that may be used to implement one or more steps of the methods provided by the exemplary embodiments.
FIG. 11B illustrates exemplary system code that may be executable by the computing module of FIG. 11A, according to some embodiments.
FIG. 12 is a diagram illustrating computer hardware and network components on which the system may be implemented.

Embodiments taught and described herein provide systems and methods for anatomy segmentation and anatomical structure tracking. In some embodiments, the surgical robotic systems taught and described herein may provide an augmented view of a complex surgical environment by displaying anatomically relevant structures currently within the field of view of the surgical robotic system's camera assembly, such as by highlighting and tracking anatomical landmarks, organs, organ ducts, blood vessels (arteries and veins), nerves, other delicate structures, and obstructing structures (e.g., structures obstructing the robotic arm assembly) in a real-time video stream (e.g., live video footage). For example, the surgical robotic system may provide anatomy segmentation and tracking, including anatomical landmark identification, anatomical structure segmentation, and anatomical structure tracking, which may reduce the intraoperative time a user spends orienting themselves and increase the user's awareness of the surgical stage. Because the anatomy segmentation and tracking taught and described herein can assist users during surgical procedures, the training time and skill required to use surgical robotic systems are reduced, thereby making surgical robotic systems accessible to a wider range of users. The anatomical landmark identification taught and described herein can also enable advanced user interfaces that provide users with tools such as tissue and anatomical structure identification, shape navigation, and user-definable safety keepout zones to prevent unwanted tissue collisions. The anatomical structure segmentation taught and described herein can further enable surgical robotic systems to automatically define safety keepout zones that protect subjects from on-camera and off-camera damage, and can enable surgical robotic systems to become semi-autonomous surgical robotic systems that interact with human tissue while navigating and performing surgical procedures within subjects, such as suturing and tissue dissection, safely and efficiently under user supervision.
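As a rough illustration of how segmentation maps could feed a safety keepout zone, the sketch below merges several binary masks into one keepout mask and tests whether an image location falls inside it. Everything here is an assumption made for the example: the masks are tiny hand-written 2D grids, the keepout zone is taken as the simple union of the masks, and the function names are hypothetical. A real system protecting against off-camera damage would need a 3D representation, which this sketch does not attempt.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 1 marks a pixel belonging to a structure to avoid


def build_keepout_mask(segmentation_maps: Sequence[Mask]) -> List[List[bool]]:
    """Union of per-structure binary masks (assumed definition of a keepout zone)."""
    rows, cols = len(segmentation_maps[0]), len(segmentation_maps[0][0])
    return [[any(m[r][c] for m in segmentation_maps) for c in range(cols)]
            for r in range(rows)]


def violates_keepout(mask: Sequence[Sequence[bool]], pixel: Tuple[int, int]) -> bool:
    """Return True if the (row, col) pixel lies inside the keepout zone."""
    r, c = pixel
    in_bounds = 0 <= r < len(mask) and 0 <= c < len(mask[0])
    return in_bounds and mask[r][c]


# Two toy segmentation maps, e.g. one for a vessel and one for a nerve.
vessel = [[0, 1, 0],
          [0, 1, 0],
          [0, 0, 0]]
nerve = [[0, 0, 0],
         [0, 0, 1],
         [0, 0, 1]]
keepout = build_keepout_mask([vessel, nerve])
```

A controller could call `violates_keepout` on the projected instrument tip before executing a motion command and refuse or warn when it returns True.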

In comparison to conventional anatomy segmentation systems and methods, which are limited by the type of anatomical structure, anatomically constrained regions, the environment being tracked, the object being tracked, and the data modality, the anatomy segmentation and tracking taught and described herein can use an artificial intelligence (AI)-based framework (e.g., machine learning/deep learning models) to segment various types of anatomical structures (e.g., blood vessels, nerves, other delicate structures, obstructing structures, or the like) and can simultaneously provide information about multiple anatomical structures. The anatomy segmentation and tracking taught and described herein can also perform anatomical structure segmentation regardless of where the anatomical structures are located within the body, by segmenting anatomical structures at various anatomical locations. Additionally, the anatomy segmentation and tracking taught and described herein can track anatomical structures over time and simultaneously localize them in a reconstructed three-dimensional (3D) map of an internal body cavity. Furthermore, the anatomy segmentation and tracking taught and described herein can utilize multiple data modalities (e.g., robotic arm assembly pose information, visual information, dye and fluorescence data, or other suitable modality data) to perform anatomical landmark identification, anatomical structure segmentation, and anatomical structure tracking.

Before providing additional specific details about anatomy segmentation and tracking with respect to FIGS. 7-12, a surgical robotic system in which some embodiments may be employed is described below with respect to FIGS. 1-6.

While various embodiments have been taught and described herein, it will be apparent to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It will be understood that various alternatives to the embodiments of the invention described herein may be used.

As used in this specification and the claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. It will be further understood that the terms "comprises" and/or "comprising," or "include" and/or "including," when used herein, indicate the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The term "and/or," as used herein, includes any and all combinations of one or more of the associated listed items.

Unless specifically stated or apparent from the context, the term "about" as used herein is understood to mean within normal tolerances in the art, e.g., within two standard deviations of the mean. "About" may be understood to mean within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise apparent from the context, all numerical values provided herein are modified by the term "about."
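For readers who prefer a worked form of the percentage reading of "about," the following sketch checks whether a measured value lies within a chosen fraction of a stated value. The function name and the default tolerance of 10% are illustrative assumptions; the text above lists many permissible tolerances and does not single one out.

```python
def is_about(value: float, stated: float, tolerance: float = 0.10) -> bool:
    """Return True if value is within tolerance (as a fraction) of the stated value."""
    return abs(value - stated) <= tolerance * abs(stated)


within_ten_percent = is_about(105.0, 100.0)           # 5 <= 10
outside_ten_percent = is_about(111.0, 100.0)          # 11 > 10
within_one_percent = is_about(100.5, 100.0, 0.01)     # 0.5 <= 1
```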

While exemplary embodiments are described herein, or in the documents incorporated by reference, as employing multiple units to perform exemplary processes, it is understood that the exemplary processes may also be performed by one or more modules. Additionally, it is understood that the term controller/control unit may refer to a hardware device that includes a memory and a processor and that is specifically programmed to perform processes according to some embodiments described herein. In some embodiments, the memory is configured to store the modules, and the processor is specifically configured to execute the modules to perform one or more processes described further below. In some embodiments, multiple different controllers or control units, or multiple different types of controllers or control units, may be employed to perform one or more processes. In some embodiments, different controllers or control units may be implemented in different portions of the surgical robotic system.

Surgical Robotic System
Some embodiments may be used with a surgical robotic system. A system for robotic surgery may include a robotic subsystem. The robotic subsystem includes at least a portion, which may also be referred to herein as a robotic assembly, that may be inserted into a patient via a trocar through a single incision point or site. The portion inserted into the patient via the trocar is small enough to be deployed in vivo at a surgical site and is sufficiently maneuverable when inserted that it can be moved within the body to perform various surgical procedures at multiple different points or sites. Herein, the portion inserted into the body that performs functional tasks may be referred to as a surgical robot module or a robotic assembly. The surgical robot module may include multiple different sub-modules or parts that may be inserted into the trocar separately. The surgical robot module or robotic assembly may include multiple separate robotic arms that can be deployed within the patient along different or separate axes. These multiple separate robotic arms may be collectively referred to herein as a robotic arm assembly. Additionally, a surgical camera assembly may also be deployed along a separate axis. The surgical robot module or robotic assembly may also include the surgical camera assembly. Thus, the surgical robot module or robotic assembly employs multiple distinct components, such as a pair of robotic arms and a surgical or robotic camera assembly, each of which is deployable along a different axis and is separately operable, maneuverable, and movable. An arrangement in which the robotic arms and the camera assembly can be disposed along separate, independently operable axes is referred to herein as a split-arm (SA) architecture. The SA architecture is designed to simplify the insertion of robotic surgical instruments through a single trocar at a single insertion site and to make that insertion more efficient, while also assisting in the deployment of the surgical instruments into a surgery-ready state and in their subsequent removal through the trocar. As an example, surgical instruments can be inserted through the trocar to access a patient's abdominal cavity and perform surgery in vivo. In some embodiments, a variety of surgical instruments may be used or employed, including, but not limited to, robotic surgical instruments and other surgical instruments known in the art.

The systems, devices, and methods disclosed herein may be incorporated into and utilized in conjunction with, for example, the robotic surgical devices and related systems disclosed in U.S. Pat. No. 10,285,765 and PCT Patent Application No. PCT/US2020/39203, and/or the camera assemblies and systems disclosed in U.S. Patent Publication No. 2019/0076199, and/or the systems and methods for exchanging surgical tools in an implantable surgical robotic system disclosed in PCT Patent Application No. PCT/US2021/058820. The entire contents and teachings of the foregoing patents, patent applications, and publications are incorporated herein by reference. The surgical robot module forming part of the present invention may, in some embodiments, form part of a surgical robotic system that includes a user workstation with appropriate sensors and displays, and a robotic support system (RSS) for interacting with and supporting the robotic subsystem of the present invention. The robotic subsystem, in some embodiments, includes a motor and a surgical robot module that includes one or more robotic arms and one or more camera assemblies. The robotic arms and camera assemblies may form part of a single support axis robotic system, may form part of a split-arm (SA) architecture robotic system, or may have other arrangements. The robotic support system may provide multiple degrees of freedom so that the robot module can be maneuvered to a single position or to multiple different positions within a patient. In one embodiment, the robotic support system may be attached directly to the operating table or to the floor or ceiling of the operating room. In another embodiment, attachment is achieved by various fastening means, including, but not limited to, clamps, screws, or combinations thereof. In other embodiments, the structure may be freestanding. The robotic support system may mount a motor assembly that is coupled to the surgical robot module, which includes the robotic arm assembly and the camera assembly. The motor assembly may include gears, motors, drivetrains, electronics, and the like for powering the components of the surgical robot module.

The robotic arm assembly and the camera assembly are capable of multiple degrees of freedom of movement. According to some embodiments, when the robotic arm assembly and the camera assembly are inserted into a patient through a trocar, they are capable of movement in at least the axial, yaw, pitch, and roll directions. Each robotic arm of the robotic arm assembly is designed to incorporate and utilize a multi-degree-of-freedom robotic arm with an end effector mounted at its distal end that corresponds to a wrist area or joint of the user. In other embodiments, the working end (e.g., the end effector end) of the robotic arm is designed to incorporate, use, or employ other robotic surgical instruments, such as, for example, the surgical instruments described in U.S. Publication No. 2018/0221102, the entire contents of which are incorporated herein by reference.

Like numerical identifiers are used throughout the figures to refer to the same elements.

FIG. 1 is a schematic diagram of a surgical robotic system 10 in which aspects of the present disclosure may be employed, in accordance with some embodiments of the present disclosure. The surgical robotic system 10 includes an operator console 11 and a robotic subsystem 20, in accordance with some embodiments.

The operator console 11 includes a display 12, an image computation module 14 (which may be a three-dimensional (3D) computation module), a hand controller 17 having a sensing and tracking module 16, and a computation module 18. Additionally, the operator console 11 may include a foot pedal array 19 including a plurality of pedals. The image computation module 14 may include a graphical user interface 39. The graphical user interface 39, the controller 26 or the image rendering device 30, or both, may render one or more images or one or more graphical user interface elements on the graphical user interface 39. For example, a pillar box associated with a mode of operating the surgical robotic system 10, or any of the various components of the surgical robotic system 10, may be rendered on the graphical user interface 39. Live video footage captured by the camera assembly 44 may also be rendered on the graphical user interface 39 by the controller 26 or the image rendering device 30.

The operator console 11 may include a visualization system 9 that includes the display 12, which may be any selected type of display for displaying information, images, or video generated by the image computation module 14, the computation module 18, and/or the robotic subsystem 20. The display 12 may include or form part of, for example, a head-mounted display (HMD), an augmented reality (AR) display (e.g., an AR display, or AR glasses combined with a screen or display), a screen or display, a two-dimensional (2D) screen or display, a three-dimensional (3D) screen or display, or the like. The display 12 may also include an optional sensing and tracking module 16A. In some embodiments, the display 12 may include an image display for outputting images from the camera assembly 44 of the robotic subsystem 20.

ハンドコントローラ１７は、外科手術ロボットシステム１０を操作するために、オペレータの手及び／又は腕の移動を感知するように構成されている。ハンドコントローラ１７は、感知及び追跡モジュール１６、回路、並びに／又は他のハードウェアを含み得る。感知及び追跡モジュール１６は、オペレータの手の移動を感知する１つ以上のセンサ又は検出器を含み得る。一部の実施形態では、オペレータの手の移動を感知する１つ以上のセンサ又は検出器は、オペレータの手によって把持又は関与されるハンドコントローラ１７内に配置される。一部の実施形態では、オペレータの手の移動を感知する１つ以上のセンサ又は検出器は、オペレータの手及び／又は腕に連結される。例えば、感知及び追跡モジュール１６のセンサは、指、手首領域、肘領域、及び／又は肩領域などの手及び／又は腕の領域に連結され得る。一部の実施形態では、追加のセンサを、オペレータの頭部及び／又は頸部領域にも連結することができる。一部の実施形態では、感知及び追跡モジュール１６は、外部であってもよく、電気部品及び／又は装着ハードウェアを介してハンドコントローラ１７に連結されてもよい。一部の実施形態では、任意選択のセンサ及び追跡モジュール１６Ａは、オペレータの身体に取り付けられたセンサに加えて、又はその代わりに、オペレータの撮像に少なくとも部分的に基づいて、オペレータの頭部、オペレータの目、又はオペレータの首の少なくとも一部分の、オペレータの頭部のうちの１つ以上の動きを感知及び追跡し得る。 The hand controller 17 is configured to sense the movement of the operator's hand and/or arm to operate the surgical robotic system 10. The hand controller 17 may include a sensing and tracking module 16, circuitry, and/or other hardware. The sensing and tracking module 16 may include one or more sensors or detectors that sense the movement of the operator's hand. In some embodiments, the one or more sensors or detectors that sense the movement of the operator's hand are disposed within the hand controller 17, which is grasped or engaged by the operator's hand. In some embodiments, the one or more sensors or detectors that sense the movement of the operator's hand are coupled to the operator's hand and/or arm. For example, sensors in the sensing and tracking module 16 may be coupled to regions of the hand and/or arm, such as the fingers, wrist region, elbow region, and/or shoulder region. In some embodiments, additional sensors may also be coupled to the operator's head and/or neck region. In some embodiments, the sensing and tracking module 16 may be external and coupled to the hand controller 17 via electrical components and/or mounting hardware. 
In some embodiments, the optional sensing and tracking module 16A may sense and track movement of one or more of the operator's head, the operator's eyes, or at least a portion of the operator's neck, based at least in part on imaging of the operator, in addition to or instead of sensors attached to the operator's body.

一部の実施形態では、感知及び追跡モジュール１６は、オペレータの胴体又は任意の他の身体部分に連結されたセンサを用いることができる。一部の実施形態では、感知及び追跡モジュール１６は、センサに加えて、例えば、加速度計、ジャイロスコープ、磁気計、及びモーションプロセッサを有する慣性計測ユニット（ＩＭＵ）を用いることができる。磁気計の追加は、垂直軸の周りのセンサドリフトを低減することができる。一部の実施形態では、感知及び追跡モジュール１６はまた、手袋、外科手術スクラブ、又は外科手術ガウンなどの外科手術材料内に置かれたセンサを含む。センサは、再利用可能又は使い捨てであってもよい。一部の実施形態では、センサは、手術室などの部屋の固定された場所など、オペレータの外部に配設され得る。外部センサ３７は、計算モジュール１８によって処理され、したがって外科手術ロボットシステム１０によって用いられ得る外部データ３６を生成することができる。 In some embodiments, the sensing and tracking module 16 may use sensors coupled to the operator's torso or any other body part. In some embodiments, the sensing and tracking module 16 may use, in addition to the sensors, an inertial measurement unit (IMU) having, for example, an accelerometer, a gyroscope, a magnetometer, and a motion processor. Adding a magnetometer can reduce sensor drift about the vertical axis. In some embodiments, the sensing and tracking module 16 also includes sensors placed within surgical materials such as gloves, surgical scrubs, or a surgical gown. The sensors may be reusable or disposable. In some embodiments, sensors may be disposed external to the operator, such as at a fixed location in a room such as an operating room. The external sensors 37 may generate external data 36 that may be processed by the computing module 18 and thus used by the surgical robotic system 10.

センサは、オペレータの手及び／又は腕の位置及び／又は配向を示す位置及び／又は配向データを生成する。感知及び追跡モジュール１６及び／又は１６Ａは、カメラ組立品４４及びロボットサブシステム２０のロボットアーム組立品４２の移動（例えば、位置及び／又は配向の変化）を制御するために利用され得る。感知及び追跡モジュール１６によって生成される追跡及び位置データ３４は、少なくとも１つのプロセッサ２２によって処理するために計算モジュール１８に伝達され得る。 The sensors generate position and/or orientation data indicative of the position and/or orientation of the operator's hands and/or arms. The sensing and tracking modules 16 and/or 16A may be utilized to control the movement (e.g., changes in position and/or orientation) of the camera assembly 44 and the robotic arm assembly 42 of the robot subsystem 20. The tracking and position data 34 generated by the sensing and tracking modules 16 may be transmitted to the computing module 18 for processing by at least one processor 22.

計算モジュール１８は、追跡及び位置データ３４及び３４Ａから、オペレータの手又は腕の位置及び／又は配向、並びにオペレータの頭部の一部の実施形態では、同様に決定又は計算し、追跡及び位置データ３４及び３４Ａをロボットサブシステム２０に伝達することができる。追跡及び位置データ３４、３４Ａは、プロセッサ２２によって処理され得、例えば、ストレージ２４に記憶され得る。追跡及び位置データ３４及び３４Ａはまた、コントローラ２６によって使用され得、このコントローラは、それに応答して、ロボットアーム組立品４２及び／又はカメラ組立品４４の移動を制御するための制御信号を生成することができる。例えば、コントローラ２６は、カメラ組立品４４の少なくとも一部分、ロボットアーム組立品４２の少なくとも一部分、又はその両方の位置及び／若しくは配向を変更し得る。一部の実施形態では、コントローラ２６はまた、カメラ組立品４４のパン及び傾斜を調整して、オペレータの頭部の移動に追従することができる。 The computing module 18 can determine or calculate the position and/or orientation of the operator's hands or arms, and in some embodiments, the operator's head, from the tracking and position data 34 and 34A and communicate the tracking and position data 34 and 34A to the robotic subsystem 20. The tracking and position data 34, 34A can be processed by the processor 22 and stored, for example, in the storage 24. The tracking and position data 34 and 34A can also be used by the controller 26, which can responsively generate control signals for controlling the movement of the robotic arm assembly 42 and/or the camera assembly 44. For example, the controller 26 can change the position and/or orientation of at least a portion of the camera assembly 44, at least a portion of the robotic arm assembly 42, or both. In some embodiments, the controller 26 can also adjust the pan and tilt of the camera assembly 44 to follow the movement of the operator's head.
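
The head-following behavior described above can be illustrated with a minimal sketch. It assumes that head yaw and pitch, in degrees, map one-to-one onto camera pan and tilt and are clamped to mechanical limits. The function name `follow_head` and the limit values are hypothetical and do not come from the disclosure.

```python
def follow_head(head_yaw_deg, head_pitch_deg,
                pan_limit_deg=90.0, tilt_limit_deg=60.0):
    """Map head yaw/pitch to camera pan/tilt, clamped to assumed limits."""
    def clamp(value, limit):
        # Keep the commanded angle inside [-limit, +limit].
        return max(-limit, min(limit, value))

    pan = clamp(head_yaw_deg, pan_limit_deg)
    tilt = clamp(head_pitch_deg, tilt_limit_deg)
    return pan, tilt
```

In this sketch, a head pose inside the limits is passed through unchanged, and a pose outside them saturates at the limit rather than being rejected.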

ロボットサブシステム２０は、モータ４０及びトロカール５０又はトロカールマウントを有するロボット支持システム（ＲＳＳ）４６と、ロボットアーム組立品４２と、カメラ組立品４４とを含むことができる。ロボットアーム組立品４２及びカメラ組立品４４は、米国特許第１０，２８５，７６５号に開示及び説明されるような単一の支持軸ロボットユニットの一部を形成することができるか、又はＰＣＴ特許出願第ＰＣＴ／ＵＳ２０２０／０３９２０３号に開示及び説明されるような分割アーム（ＳＡ）アーキテクチャロボットシステムの一部を形成することができ、どちらも参照によりそれら全体が本明細書に組み込まれる。 The robotic subsystem 20 may include a robotic support system (RSS) 46 having a motor 40 and a trocar 50 or trocar mount, a robotic arm assembly 42, and a camera assembly 44. The robotic arm assembly 42 and camera assembly 44 may form part of a single support axis robotic unit such as that disclosed and described in U.S. Pat. No. 10,285,765, or may form part of a split-arm (SA) architecture robotic system such as that disclosed and described in PCT Patent Application No. PCT/US2020/039203, both of which are incorporated herein by reference in their entireties.

ロボットサブシステム２０は、異なる又は別個の軸に沿って配備可能な複数の異なるロボットアームを用いることができる。一部の実施形態では、複数の異なるカメラ要素を用いることができるカメラ組立品４４はまた、共通の別個の軸に沿って配備され得る。したがって、外科手術ロボットシステム１０は、異なる軸に沿って配備可能な一対の別個のロボットアーム及びカメラ組立品４４などの複数の異なる構成要素を用いることができる。一部の実施形態では、ロボットアーム組立品４２及びカメラ組立品４４は、別個に操作可能であり、操縦可能であり、かつ移動可能である。ロボットアーム組立品４２及びカメラ組立品４４を含むロボットサブシステム２０は、別個の操作可能な軸に沿って配置可能であり、本明細書ではＳＡアーキテクチャと呼ばれる。ＳＡアーキテクチャは、単一の挿入点又は部位における単一のトロカールを通したロボット外科手術器具の挿入を単純化し、効率を高めると同時に、外科手術器具の外科手術準備完了状態への配備、並びに以下に更に説明するように、トロカール５０を通した外科手術器具のその後の除去を補助するように設計される。 The robotic subsystem 20 may employ multiple distinct robotic arms deployable along different or separate axes. In some embodiments, the camera assembly 44, which may employ multiple distinct camera elements, may also be deployed along a common, separate axis. Thus, the surgical robotic system 10 may employ multiple distinct components, such as a pair of separate robotic arms and the camera assembly 44, that are deployable along different axes. In some embodiments, the robotic arm assembly 42 and the camera assembly 44 are separately manipulable, maneuverable, and movable. The robotic subsystem 20, which includes the robotic arm assembly 42 and the camera assembly 44, can be disposed along separate manipulable axes, an arrangement referred to herein as the SA architecture. The SA architecture is designed to simplify and increase the efficiency of inserting robotic surgical instruments through a single trocar at a single insertion point or site, while also assisting in the deployment of the surgical instruments into a surgical-ready state and in the subsequent removal of the surgical instruments through the trocar 50, as further described below.

ＲＳＳ４６は、モータ４０及びトロカール５０又はトロカールマウントを含むことができる。ＲＳＳ４６は、その遠位端に連結されたモータ４０を支持する支持部材を更に含むことができる。モータ４０は、カメラ組立品４４及びロボットアーム組立品４２の各々に連結され得る。支持部材は、ロボットサブシステム２０の１つ以上の構成要素を直線的に、又は任意の他の選択された方向若しくは配向で移動するように構成及び制御され得る。一部の実施形態では、ＲＳＳ４６は、自立することができる。一部の実施形態では、ＲＳＳ４６は、一端部においてロボットサブシステム２０に連結され、且つ対向端部において調整可能な支持部材又は要素に連結されるモータ４０を含むことができる。 The RSS 46 may include a motor 40 and a trocar 50 or trocar mount. The RSS 46 may further include a support member supporting the motor 40 coupled to its distal end. The motor 40 may be coupled to each of the camera assembly 44 and the robotic arm assembly 42. The support member may be configured and controlled to move one or more components of the robotic subsystem 20 linearly or in any other selected direction or orientation. In some embodiments, the RSS 46 may be freestanding. In some embodiments, the RSS 46 may include a motor 40 coupled at one end to the robotic subsystem 20 and at an opposite end to an adjustable support member or element.

モータ４０は、コントローラ２６によって生成される制御信号を受信することができる。モータ４０は、ロボットアーム組立品４２及びカメラ組立品４４に個別に又は一緒に給電及び駆動するための、歯車、１つ以上のモータ、ドライブトレイン、電子機器等を含むことができる。モータ４０はまた、ロボットアーム組立品４２、カメラ組立品４４、並びに／又はＲＳＳ４６及びロボットサブシステム２０の他の構成要素に、機械的な動力、電力、機械的な通信、及び電気的な通信を提供することができる。モータ４０は、計算モジュール１８によって制御され得る。したがって、モータ４０は、例えば、各アームの各関節運動する関節の位置及び配向、並びにカメラ組立品４４を含む、ロボットアーム組立品４２を制御及び駆動できる１つ以上のモータを制御するための信号を生成することができる。モータ４０は、トロカール５０を通してロボットサブシステム２０の各構成要素を挿入及び除去するために最初に利用される並進又は直線の自由度を更に提供することができる。モータ４０はまた、トロカール５０を通して患者１００に挿入されるときに、ロボットアーム組立品の各ロボットアーム４２の挿入深さを調整するために採用され得る。 The motor 40 can receive control signals generated by the controller 26. The motor 40 can include gears, one or more motors, drive trains, electronics, and the like for powering and driving the robotic arm assembly 42 and the camera assembly 44, individually or together. The motor 40 can also provide mechanical power, electrical power, mechanical communication, and electrical communication to the robotic arm assembly 42, the camera assembly 44, and/or other components of the RSS 46 and the robotic subsystem 20. The motor 40 can be controlled by the computing module 18. Thus, the motor 40 can generate signals for controlling one or more motors that in turn control and drive the robotic arm assembly 42 (including, for example, the position and orientation of each articulating joint of each arm) and the camera assembly 44. The motor 40 can further provide a translational or linear degree of freedom that is initially utilized to insert and remove each component of the robotic subsystem 20 through the trocar 50. The motor 40 may also be employed to adjust the insertion depth of each robotic arm of the robotic arm assembly 42 as it is inserted into the patient 100 through the trocar 50.

トロカール５０は、一部の実施形態では、錐（金属又はプラスチックの鋭利又はブレードなしの先端であり得る）、カニューレ（本質的に中空管）、及びシールからなり得る医療装置である。トロカール５０は、ロボットサブシステム２０の少なくとも一部分を、対象（例えば、患者）の内部空洞内に置くために使用され得、体腔からガス及び／又は流体を引き出すことができる。ロボットサブシステム２０をトロカール５０を通して挿入して、患者の体腔内にアクセスし、インビボで手術を行うことができる。一部の実施形態では、本発明のロボットサブシステム２０は、ロボットアーム組立品４２及びカメラ組立品４４が、単一位置又は複数の異なる位置に患者内で操縦され得るように、少なくとも部分的に、複数の自由度でトロカール５０又はトロカールマウントによって支持され得る。一部の実施形態では、ロボットアーム組立品４２及びカメラ組立品４４が、患者内で単一位置又は複数の異なる位置に操縦され得るように、ロボットアーム組立品４２及びカメラ組立品４４は、複数の自由度でトロカール５０又はトロカールマウントによって支持され得る。 The trocar 50 is a medical device that, in some embodiments, may be made up of an awl (which may be a sharpened or bladeless metal or plastic tip), a cannula (essentially a hollow tube), and a seal. The trocar 50 may be used to place at least a portion of the robotic subsystem 20 within an internal cavity of a subject (e.g., a patient) and may be used to withdraw gases and/or fluids from the body cavity. The robotic subsystem 20 may be inserted through the trocar 50 to access the patient's body cavity and perform surgery in vivo. In some embodiments, the robotic subsystem 20 of the present invention may be supported, at least in part, by the trocar 50 or a trocar mount with multiple degrees of freedom so that the robotic arm assembly 42 and the camera assembly 44 may be maneuvered within the patient to a single position or to multiple different positions. In some embodiments, the robotic arm assembly 42 and the camera assembly 44 may be supported by the trocar 50 or a trocar mount with multiple degrees of freedom so that the robotic arm assembly 42 and the camera assembly 44 may be maneuvered within the patient to a single position or to multiple different positions.

一部の実施形態では、ＲＳＳ４６は、システム構成要素（例えば、ディスプレイ１２、感知及び追跡モジュール１６、ロボットアーム組立品４２、カメラ組立品４４など）のうちの１つ以上からの入力データを処理するための、並びにそれに応答して制御信号を生成するための任意選択のコントローラを更に含み得る。モータ４０はまた、一部の実施形態では、データを記憶するための記憶要素を含むことができる。 In some embodiments, the RSS 46 may further include an optional controller for processing input data from one or more of the system components (e.g., the display 12, the sensing and tracking module 16, the robotic arm assembly 42, the camera assembly 44, etc.) and for generating control signals in response thereto. The motor 40 may also, in some embodiments, include a memory element for storing data.

ロボットアーム組立品４２は、一部の実施形態では、及び一部の動作モードでは、関連するセンサによって感知される、オペレータのアーム及び／又は手のスケールダウンされた移動又は動きに従うように制御され得る。ロボットアーム組立品４２は、第一のロボットアームの遠位端に配置された器具先端を有する第一のエンドエフェクタを含む第一のロボットアームと、第二のロボットアームの遠位端に配置された器具先端を有する第二のエンドエフェクタを含む第二のロボットアームと、を含む。一部の実施形態では、ロボットアーム組立品４２は、肩関節、肘関節、及び手首関節並びにオペレータの指と関連付けられ得る移動と関連付けられ得る、部分又は領域を有することができる。例えば、ロボット肘関節は、ヒトの肘の位置及び配向に追従し得、ロボット手首関節は、ヒトの手首の位置及び配向に追従し得る。ロボットアーム組立品４２はまた、それと関連付けられる末端領域を有し得、これは、一部の実施形態では、例えば、ユーザが人差し指及び親指を一緒に挟む際に、人差し指など、オペレータの１つ以上の指の移動に追従するエンドエフェクタで終端し得る。一部の実施形態では、ロボットアーム組立品４２は、一部の制御モードでオペレータのアームの移動に従ってもよく、一方で、ロボット組立品の仮想胸部は、静止したままであってもよい（例えば、器具制御モード）。一部の実施形態では、オペレータの胴体の位置及び配向は、オペレータの腕及び／又は手の位置及び配向から差し引かれる。この差し引きにより、オペレータは、ロボットアームが動くことなく、胴体を移動させることが可能である。ロボット組立品の個別のアームの移動の更なる開示制御は、国際特許出願公開第２０２２／０９４０００Ａ１号及び同第２０２１／２３１４０２Ａ１号に提供され、その各々は、参照によりその全体が本明細書に組み込まれる。 In some embodiments, and in some modes of operation, the robotic arm assembly 42 may be controlled to follow scaled-down movements or motions of an operator's arm and/or hand, as sensed by associated sensors. The robotic arm assembly 42 includes a first robotic arm including a first end effector having an instrument tip disposed at the distal end of the first robotic arm, and a second robotic arm including a second end effector having an instrument tip disposed at the distal end of the second robotic arm. In some embodiments, the robotic arm assembly 42 may have portions or regions associated with movements associated with shoulder, elbow, and wrist joints, as well as the fingers of an operator. For example, a robotic elbow joint may track the position and orientation of a human elbow, and a robotic wrist joint may track the position and orientation of a human wrist. The robotic arm assembly 42 may also have a terminal region associated therewith, which in some embodiments may terminate in an end effector that tracks the movement of one or more fingers of the operator, such as the index finger, when the user pinches the index finger and thumb together, for example. 
In some embodiments, the robotic arm assembly 42 may follow the movement of the operator's arms in some control modes (e.g., in an instrument control mode), while the virtual chest of the robotic assembly remains stationary. In some embodiments, the position and orientation of the operator's torso are subtracted from the position and orientation of the operator's arms and/or hands. This subtraction allows the operator to move the torso without the robotic arms moving. Further disclosure regarding control of the movement of the individual arms of the robotic assembly is provided in International Patent Application Publication Nos. WO 2022/094000 A1 and WO 2021/231402 A1, each of which is incorporated herein by reference in its entirety.
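
The torso subtraction and scaled-down following described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: it handles positions only (orientation is omitted for brevity), represents positions as (x, y, z) tuples, and uses a hypothetical scale factor of 0.5. The function names are likewise hypothetical.

```python
def hand_relative_to_torso(hand_pos, torso_pos):
    """Subtract the torso position so that whole-body motion cancels out."""
    return tuple(h - t for h, t in zip(hand_pos, torso_pos))


def scaled_arm_command(prev_rel, curr_rel, scale=0.5):
    """Scale the change in torso-relative hand position (assumed factor)."""
    return tuple(scale * (c - p) for p, c in zip(prev_rel, curr_rel))


# The operator steps sideways: hand and torso shift together by 0.5 in x.
before = hand_relative_to_torso((0.25, 0.5, 1.0), (0.0, 0.25, 0.5))
after = hand_relative_to_torso((0.75, 0.5, 1.0), (0.5, 0.25, 0.5))
step_command = scaled_arm_command(before, after)

# The hand alone moves by 0.5 in x while the torso stays put.
reach = hand_relative_to_torso((0.75, 0.5, 1.0), (0.0, 0.25, 0.5))
reach_command = scaled_arm_command(before, reach)
```

Because only the torso-relative hand position is tracked, the sideways step yields a zero command, while the reach yields a command scaled down by the assumed factor.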

カメラ組立品４４は、オペレータに、例えば、手術又は手術部位のライブビデオフィードなどの画像データ４８を提供するだけでなく、オペレータがカメラ組立品４４の一部を形成するカメラを作動及び制御することを可能にするように構成されている。一部の実施形態では、カメラ組立品４４は、１つ以上のカメラ（例えば、一対のカメラ）を含み得、その光学軸は、選択された距離だけ軸方向に離間し、これは、カメラ間距離として知られ、外科手術部位の立体視又は画像を提供する。一部の実施形態では、オペレータは、オペレータの手に連結されたセンサを介して、又はオペレータの手によって把持若しくは保持されたハンドコントローラ１７を介して、手の移動を介してカメラの移動を制御することができ、したがって、オペレータは、直感的かつ自然な様式で手術部位の所望の視界を得ることができる。一部の実施形態では、オペレータは、オペレータの頭部の移動を介して、カメラの移動を追加的に制御することができる。カメラ組立品４４は、視野の方向に関して、例えば、ヨー方向、ピッチ方向及びロール方向を含む、複数の方向に移動可能である。一部の実施形態では、立体カメラの構成要素は、自然で快適なユーザ体験を提供するように構成され得る。一部の実施形態では、カメラ間の軸間距離は、オペレータによって知覚される手術部位の奥行きを調整するように修正され得る。 The camera assembly 44 is configured to provide the operator with image data 48, such as a live video feed of the procedure or surgical site, as well as to allow the operator to operate and control the cameras forming part of the camera assembly 44. In some embodiments, the camera assembly 44 may include one or more cameras (e.g., a pair of cameras) whose optical axes are axially spaced a selected distance, known as the inter-camera distance, to provide a stereoscopic view or image of the surgical site. In some embodiments, the operator can control camera movement through hand movement, via a sensor coupled to the operator's hand or via a hand controller 17 grasped or held by the operator's hand, thus allowing the operator to obtain a desired view of the surgical site in an intuitive and natural manner. In some embodiments, the operator can additionally control camera movement through movement of the operator's head. The camera assembly 44 is movable in multiple directions relative to the direction of view, including, for example, yaw, pitch, and roll. In some embodiments, the stereoscopic camera components may be configured to provide a natural and comfortable user experience. In some embodiments, the axial distance between the cameras can be modified to adjust the depth of the surgical site as perceived by the operator.

カメラ組立品４４によって生成される画像又はビデオデータ４８は、ディスプレイ１２上に表示され得る。実施形態では、ディスプレイ１２がＨＭＤを含む場合、ディスプレイは、ＨＭＤのヨー方向、ピッチ方向及びロール方向の未加工の配向データ、並びにＨＭＤのデカルト空間（ｘ、ｙ、ｚ）内の位置データを取得する、組み込み式感知及び追跡モジュール１６Ａを含むことができる。一部の実施形態では、オペレータの頭部に関する位置及び配向データは、別個の頭部追跡モジュールを介して提供され得る。一部の実施形態では、感知及び追跡モジュール１６Ａを使用して、ＨＭＤの組み込み式追跡システムの代わりに、又はそれに加えて、ディスプレイの補足的な位置及び配向追跡データを提供し得る。一部の実施形態では、オペレータの頭部追跡は使用又は採用されない。一部の実施形態では、オペレータの画像は、オペレータの頭部の少なくとも一部分を追跡するために、感知及び追跡モジュール１６Ａによって使用され得る。 Image or video data 48 generated by camera assembly 44 may be displayed on display 12. In embodiments, if display 12 includes an HMD, the display may include an embedded sensing and tracking module 16A that acquires raw orientation data in the yaw, pitch, and roll directions of the HMD, as well as position data in Cartesian space (x, y, z) for the HMD. In some embodiments, position and orientation data for the operator's head may be provided via a separate head tracking module. In some embodiments, sensing and tracking module 16A may be used to provide supplemental position and orientation tracking data for the display instead of, or in addition to, the HMD's embedded tracking system. In some embodiments, operator head tracking is not used or employed. In some embodiments, an image of the operator may be used by sensing and tracking module 16A to track at least a portion of the operator's head.

図２Ａは、一部の実施形態による、モバイル患者カートに組み込まれた、又はモバイル患者カート上に装着された、外科手術ロボットシステム１０の例示的なロボット組立品２０（本明細書ではロボットサブシステムとも呼ばれる）を示す。一部の実施形態では、ロボットサブシステム２０は、ＲＳＳ４６を含み、これは今度はモータ４０を含み、ロボットアーム組立品４２は、エンドエフェクタ４５を有し、カメラ組立品４４は、１つ以上のカメラ４７を有し、トロカール５０又はトロカールマウントも含み得る。 FIG. 2A shows an exemplary robotic assembly 20 (also referred to herein as a robotic subsystem) of a surgical robotic system 10 integrated into or mounted on a mobile patient cart, according to some embodiments. In some embodiments, the robotic subsystem 20 includes an RSS 46, which in turn includes motors 40, a robotic arm assembly 42 having an end effector 45, a camera assembly 44 having one or more cameras 47, and may also include a trocar 50 or trocar mount.

図２Ｂは、一部の実施形態による、本開示の外科手術ロボットシステム１０のオペレータコンソール１１の実施例を示す。オペレータコンソール１１は、ディスプレイ１２、ハンドコントローラ１７を含み、ロボットアーム組立品４２の制御、カメラ組立品４４の制御、及びシステムの他の態様の制御のためのフットペダルアレイ１９などの１つ以上の追加のコントローラも含む。 FIG. 2B shows an example of an operator console 11 of the presently disclosed surgical robotic system 10, according to some embodiments. The operator console 11 includes a display 12, a hand controller 17, and one or more additional controllers, such as a foot pedal array 19, for controlling the robotic arm assembly 42, the camera assembly 44, and other aspects of the system.

図２Ｂはまた、オペレータコンソールの左ハンドコントローラサブシステム２３Ａ及び右ハンドコントローラサブシステム２３Ｂを図示する。左ハンドコントローラサブシステム２３Ａは、左ハンドコントローラ１７Ａを含み、かつそれをサポートし、右ハンドコントローラサブシステム２３Ｂは、右ハンドコントローラ１７Ｂを含み、かつそれをサポートする。一部の実施形態では、左ハンドコントローラサブシステム２３Ａは、左ハンドコントローラ１７Ａに取り外し可能に接続又は係合することができ、右ハンドコントローラサブシステム２３Ｂは、右ハンドコントローラ１７Ｂに取り外し可能に接続又は係合することができる。一部の実施形態では、接続は、左ハンドコントローラサブシステム２３Ａ及び右ハンドコントローラサブシステム２３Ｂが、それぞれ、左ハンドコントローラ１７Ａ又は右ハンドコントローラ１７Ｂのボタン又はタッチ入力装置上のユーザ選択から受信した入力を伝える信号を含む、左ハンドコントローラ１７Ａ及び右ハンドコントローラ１７Ｂから信号を受信し得るように、物理的及び電子的の両方であり得る。 FIG. 2B also illustrates the left hand controller subsystem 23A and the right hand controller subsystem 23B of the operator console. The left hand controller subsystem 23A includes and supports the left hand controller 17A, and the right hand controller subsystem 23B includes and supports the right hand controller 17B. In some embodiments, the left hand controller subsystem 23A can be removably connected to or engaged with the left hand controller 17A, and the right hand controller subsystem 23B can be removably connected to or engaged with the right hand controller 17B. In some embodiments, the connections can be both physical and electronic, such that the left hand controller subsystem 23A and the right hand controller subsystem 23B can receive signals from the left hand controller 17A and the right hand controller 17B, respectively, including signals conveying input received from user selections on buttons or touch input devices of the left hand controller 17A or the right hand controller 17B.

左ハンドコントローラサブシステム２３Ａ及び右ハンドコントローラサブシステム２３Ｂの各々は、それぞれの左ハンドコントローラ１７Ａ及び右ハンドコントローラ１７Ｂの可動域を可能にする構成要素を含むことができ、その結果、左ハンドコントローラ１７Ａ及び右ハンドコントローラ１７Ｂは、三次元で並進又は変位することができ、ロール、ピッチ、及びヨー方向に追加的に移動し得る。加えて、左ハンドコントローラサブシステム２３Ａ及び右ハンドコントローラサブシステム２３Ｂの各々は、それぞれの左ハンドコントローラ１７Ａ及び右ハンドコントローラ１７Ｂの動作を前述の方向のそれぞれに登録することができ、こうした移動情報を提供する信号を外科用ロボットシステム１０のプロセッサ２２（図１に示す通り）に送信することができる。 Each of the left hand controller subsystem 23A and the right hand controller subsystem 23B can include components that enable a range of motion for the respective left hand controller 17A and right hand controller 17B, such that the left hand controller 17A and the right hand controller 17B can translate or displace in three dimensions, and can additionally move in the roll, pitch, and yaw directions. Additionally, each of the left hand controller subsystem 23A and the right hand controller subsystem 23B can register the movement of the respective left hand controller 17A and right hand controller 17B in each of the aforementioned directions and can transmit signals providing such movement information to the processor 22 (as shown in FIG. 1) of the surgical robotic system 10.

一部の実施形態では、左ハンドコントローラサブシステム２３Ａ及び右ハンドコントローラサブシステム２３Ｂの各々は、異なるハンドコントローラ（図示せず）を受容及び接続するか、又は係合するように構成され得る。例えば、異なる構成のボタン及びタッチ入力装置を有するハンドコントローラが提供され得る。加えて、異なる形状を有するハンドコントローラが提供され得る。ハンドコントローラは、特定の外科用ロボットシステム又は特定の外科用ロボット手順と適合するように選択され得るか、又はオペレータにとってより快適さ及び容易さを提供するために、ボタン及び入力装置に対するオペレータの好みに基づいて、又はハンドコントローラの形状に関して選択され得る。 In some embodiments, each of the left hand controller subsystem 23A and the right hand controller subsystem 23B may be configured to receive and connect to or engage a different hand controller (not shown). For example, hand controllers having different configurations of buttons and touch input devices may be provided. In addition, hand controllers having different shapes may be provided. The hand controllers may be selected to be compatible with a particular surgical robotic system or a particular surgical robotic procedure, or may be selected based on operator preferences for buttons and input devices or for hand controller shape to provide greater comfort and ease for the operator.

図３Ａは、一部の実施形態による、及び一部の外科手術処置のための、対象１００の内部空洞１０４内の外科手術を実施する外科手術ロボットシステム１０の側面図を概略的に示す。図３Ｂは、対象１００の内部空洞１０４内で外科手術を実施する外科手術ロボットシステム１０の斜視上面図を概略的に示す。対象１００（例えば、患者）は、手術テーブル１０２（例えば、外科手術テーブル１０２）上に置かれる。一部の実施形態では、及び一部の外科手術処置のために、患者１００に切開が行われ、内部空洞１０４へのアクセスが得られる。次に、トロカール５０は、選択された場所で患者１００内に挿入されて、内部空洞１０４又は手術部位へのアクセスを提供する。次いで、ＲＳＳ４６は、患者１００及びトロカール５０上の位置に操作され得る。一部の実施形態では、ＲＳＳ４６は、トロカール５０に結合するトロカールマウントを含む。カメラ組立品４４及びロボットアーム組立品４２は、モータ４０に連結されて、トロカール５０を通して患者１００の中に、したがって、患者１００の内部空洞１０４内に、個別に及び／又は順次、挿入され得る。カメラ組立品４４及びロボットアーム組立品４２は、使用時に対象の身体の外部に留まる一部の部分を含み得るが、ロボットアーム組立品４２及び／又はカメラ組立品４４を対象の内部空洞内に挿入すること、並びにロボットアーム組立品４２及び／又はカメラ組立品４４を対象の内部空洞内に配置することへの言及は、使用中に対象の内部空洞内にあることが意図されるロボットアーム組立品４２及びカメラ組立品４４の部分を指す。順次挿入方法は、より小さなトロカールを支持する利点を有し、それゆえに、患者１００においてより小さな切開を行うことができ、それゆえ、患者１００が経験する外傷を低減する。一部の実施形態では、カメラ組立品４４及びロボットアーム組立品４２は、任意の順序又は特定の順序で挿入され得る。一部の実施形態では、カメラ組立品４４の後に、ロボットアーム組立品４２の第一のロボットアーム４２Ａが続き、その後に、ロボットアーム組立品４２の第二のロボットアーム４２Ｂが続き、その全てが、トロカール５０内、ひいては、内部空洞１０４内に挿入され得る。患者１００に挿入されると、ＲＳＳ４６は、ロボットアーム組立品４２及びカメラ組立品４４を、オペレータコンソール１１によって手動又は自動的に制御される手術部位に移動させることができる。 FIG. 3A schematically illustrates a side view of a surgical robotic system 10 performing surgery within an internal cavity 104 of a subject 100, according to some embodiments and for some surgical procedures. FIG. 3B schematically illustrates a perspective top view of a surgical robotic system 10 performing surgery within an internal cavity 104 of a subject 100. The subject 100 (e.g., a patient) is positioned on a surgical table 102 (e.g., surgical table 102). In some embodiments and for some surgical procedures, an incision is made in the patient 100 to gain access to the internal cavity 104. A trocar 50 is then inserted into the patient 100 at a selected location to provide access to the internal cavity 104 or surgical site. The RSS 46 can then be manipulated into position on the patient 100 and trocar 50. In some embodiments, the RSS 46 includes a trocar mount that couples to the trocar 50. 
The camera assembly 44 and the robotic arm assembly 42 are coupled to the motor 40 and may be inserted individually and/or sequentially through the trocar 50 into the patient 100, and thus into the internal cavity 104 of the patient 100. Although the camera assembly 44 and the robotic arm assembly 42 may include some portions that remain outside the subject's body during use, references to inserting the robotic arm assembly 42 and/or the camera assembly 44 into the internal cavity of the subject, and to disposing the robotic arm assembly 42 and/or the camera assembly 44 within the internal cavity of the subject, refer to the portions of the robotic arm assembly 42 and the camera assembly 44 that are intended to be within the internal cavity of the subject during use. Sequential insertion has the advantage of supporting a smaller trocar, which allows a smaller incision to be made in the patient 100 and therefore reduces the trauma experienced by the patient 100. In some embodiments, the camera assembly 44 and the robotic arm assembly 42 may be inserted in any order or in a specific order. In some embodiments, the camera assembly 44 may be followed by a first robotic arm 42A of the robotic arm assembly 42, followed by a second robotic arm 42B of the robotic arm assembly 42, all of which may be inserted into the trocar 50 and thus into the internal cavity 104. Once the assemblies are inserted into the patient 100, the RSS 46 may move the robotic arm assembly 42 and the camera assembly 44 to the surgical site, under manual or automatic control from the operator console 11.

ロボットアーム組立品の個別のアームの移動の管理に関する開示制御は、国際特許出願公開第２０２２／０９４０００Ａ１号及び同第２０２１／２３１４０２Ａ１号に提供され、その各々は、参照によりその全体が本明細書に組み込まれる。 Further disclosure regarding control of the movement of the individual arms of a robotic arm assembly is provided in International Patent Application Publication Nos. WO 2022/094000 A1 and WO 2021/231402 A1, each of which is incorporated herein by reference in its entirety.

図４Ａは、一部の実施形態による、ロボットアーム部分組立品２１の斜視図である。ロボットアーム部分組立品２１は、ロボットアーム４２Ａと、器具先端１２０（例えば、単極はさみ、針ドライバ／ホルダ、双極把持器、又は任意の他の適切なツール）を有するエンドエフェクタ４５と、ロボットアーム４２Ａを支持するシャフト１２２とを含む。シャフト１２２の遠位端は、ロボットアーム４２Ａに連結されており、シャフト１２２の近位端は、（図２Ａに示すように）モータ４０のハウジング１２４に連結される。シャフト１２２の少なくとも一部分は、（図３Ａ及び図３Ｂに示すように）内部空洞１０４の外部にあり得る。シャフト１２２の少なくとも一部分は、（図３Ａ及び図３Ｂに示すように）内部空洞１０４の中に挿入され得る。 Figure 4A is a perspective view of the robotic arm subassembly 21, according to some embodiments. The robotic arm subassembly 21 includes a robotic arm 42A, an end effector 45 having an instrument tip 120 (e.g., monopolar scissors, a needle driver/holder, a bipolar grasper, or any other suitable tool), and a shaft 122 that supports the robotic arm 42A. The distal end of the shaft 122 is coupled to the robotic arm 42A, and the proximal end of the shaft 122 is coupled to the housing 124 of the motor 40 (as shown in Figure 2A). At least a portion of the shaft 122 can be external to the internal cavity 104 (as shown in Figures 3A and 3B). At least a portion of the shaft 122 can be inserted into the internal cavity 104 (as shown in Figures 3A and 3B).

図４Ｂは、ロボットアーム組立品４２の側面図である。一部の実施態様によると、ロボットアーム組立品４２は、仮想肩を形成する肩関節１２６、位置センサ１３２（例えば、容量近接センサ）を有し、仮想肘を形成する肘関節１２８、仮想手首を形成する手首関節１３０、及びエンドエフェクタ４５を含む。一部の実施形態では、肩関節１２６、肘関節１２８、手首関節１３０は、一連のヒンジ及び回転ジョイントを含んで、エンドエフェクタ４５に対する追加の１つの把持自由度とともに、各アームに位置決め可能な７自由度を提供することができる。 Figure 4B is a side view of the robotic arm assembly 42. According to some embodiments, the robotic arm assembly 42 includes a shoulder joint 126 that forms a virtual shoulder, an elbow joint 128 having a position sensor 132 (e.g., a capacitive proximity sensor) that forms a virtual elbow, a wrist joint 130 that forms a virtual wrist, and an end effector 45. In some embodiments, the shoulder joint 126, elbow joint 128, and wrist joint 130 can include a series of hinges and revolute joints to provide seven positionable degrees of freedom for each arm, along with an additional grasping degree of freedom for the end effector 45.

図５は、患者の内部体腔内への挿入のために構成されたロボット組立品２０の一部分の斜視正面図を例示する。ロボット組立品２０は、ロボットアーム４２Ａ及びロボットアーム４２Ｂを含む。２つのロボットアーム４２Ａ及び４２Ｂは、一部の実施形態では、ロボット組立品２０の仮想胸部１４０を画定し得る。いくつかの実施形態では、仮想胸部１４０は、ロボットアーム４２Ａ（例えば、肩関節１２６）の最近位ジョイントの第一の旋回点１４２Ａと、ロボットアーム４２Ｂの最近位ジョイントの第二の旋回点１４２Ｂと、カメラ４７のカメラ撮像中心点１４４との間に延在する、胸部平面によって画定され得る。仮想胸部１４０の旋回中心１４６は、仮想胸部１４０の中央にある。 FIG. 5 illustrates a perspective front view of a portion of a robotic assembly 20 configured for insertion into a patient's internal body cavity. The robotic assembly 20 includes a robotic arm 42A and a robotic arm 42B. The two robotic arms 42A and 42B may, in some embodiments, define a virtual chest 140 of the robotic assembly 20. In some embodiments, the virtual chest 140 may be defined by a chest plane extending between a first pivot point 142A of the most proximal joint (e.g., shoulder joint 126) of the robotic arm 42A, a second pivot point 142B of the most proximal joint of the robotic arm 42B, and a camera imaging center point 144 of the camera 47. A pivot center 146 of the virtual chest 140 is at the center of the virtual chest 140.
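
As a geometric illustration of a plane defined by the two pivot points and the camera imaging center point, the sketch below computes a unit normal and a center for three 3D points. Taking the center as the centroid of the three points is an assumption made for this example; the disclosure states only that the pivot center 146 is at the center of the virtual chest 140. The function name `chest_plane` is hypothetical.

```python
import math


def chest_plane(pivot_a, pivot_b, camera_center):
    """Return (unit_normal, center) of the plane through three 3D points."""
    # Two edge vectors lying in the plane.
    u = tuple(b - a for a, b in zip(pivot_a, pivot_b))
    v = tuple(c - a for a, c in zip(pivot_a, camera_center))
    # Their cross product is perpendicular to the plane.
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("points are collinear and do not define a plane")
    normal = tuple(c / length for c in n)
    # Assumed definition of the center: centroid of the three points.
    center = tuple((a + b + c) / 3.0
                   for a, b, c in zip(pivot_a, pivot_b, camera_center))
    return normal, center
```

The sign of the normal depends on the order in which the points are supplied, so a caller that needs a forward-facing direction would have to fix that order by convention.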

一部の実施形態では、ロボットアーム４２Ａ及びロボットアーム４２Ｂの一方又は両方にあるセンサは、システム１０によって使用されて、ロボットアーム４２Ａ及び４２Ｂの各々又は両方の少なくとも一部分の三次元空間内の場所の変化を決定し得る。一部の実施形態では、第一のロボットアーム４２Ａ及び第二のロボットアーム４２Ｂの一方又は両方にあるセンサは、外科手術システム１０によって使用されて、他のロボットアームの少なくとも一部分の三次元空間内の場所に対して、１つのロボットアームの少なくとも一部分の三次元空間内の場所を決定することができる。 In some embodiments, sensors on one or both of the robotic arms 42A and 42B may be used by the system 10 to determine a change in location in three-dimensional space of at least a portion of each or both of the robotic arms 42A and 42B. In some embodiments, sensors on one or both of the first robotic arm 42A and the second robotic arm 42B may be used by the surgical system 10 to determine a location in three-dimensional space of at least a portion of one robotic arm relative to a location in three-dimensional space of at least a portion of the other robotic arm.

一部の実施形態では、カメラ組立品４４は、外科手術ロボットシステム１０が三次元空間内の相対位置を決定することができる画像を取得するように構成される。例えば、カメラ組立品４４は、複数のカメラを含むことができ、そのうちの少なくとも２つは、撮像軸に対して互いに横方向にずれており、システムは、内部体腔内の特徴までの距離を決定するように構成され得る。特徴までの距離を決定するためのカメラ組立品及び関連システムを含む外科用ロボットシステムに関する更なる開示は、「ＳｙｓｔｅｍａｎｄＭｅｔｈｏｄｆｏｒＤｅｔｅｒｍｉｎｉｎｇＤｅｐｔｈＰｅｒｃｅｐｔｉｏｎＩｎＶｉｖｏｉｎａＳｕｒｇｉｃａｌＲｏｂｏｔｉｃＳｙｓｔｅｍ」と題され、２０２１年８月１２日に公開され、参照によりその全体が本明細書に組み込まれる、国際特許出願公開第２０２１／１５９４０９号に見出され得る。カメラの特徴までの距離に関する情報及び光学特性に関する情報は、三次元空間内の相対位置を決定するためにシステムによって使用され得る。 In some embodiments, the camera assembly 44 is configured to acquire images that enable the surgical robotic system 10 to determine its relative position in three-dimensional space. For example, the camera assembly 44 can include multiple cameras, at least two of which are laterally offset from one another relative to an imaging axis, and the system can be configured to determine distances to features within an internal body cavity. Further disclosure regarding surgical robotic systems including camera assemblies and associated systems for determining distances to features can be found in International Patent Application Publication No. WO 2021/159409, entitled "System and Method for Determining Depth Perception In Vivo in a Surgical Robotic System," published August 12, 2021, and incorporated herein by reference in its entirety. Information regarding the distances to features and information regarding the optical properties of the cameras can be used by the system to determine their relative position in three-dimensional space.
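
One common way to obtain distance from two laterally offset cameras is the textbook pinhole-stereo relation, depth = focal length × baseline / disparity. The sketch below illustrates that relation only; it is not the method of the referenced publication. It assumes rectified images with parallel optical axes, a focal length expressed in pixels, and a hypothetical function name and units.

```python
def depth_from_disparity(focal_px, baseline_mm, x_left_px, x_right_px):
    """Depth (mm) of a feature seen at x_left_px and x_right_px.

    Assumes rectified cameras, so the feature lies on the same image row
    in both views and only its horizontal position differs.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front")
    return focal_px * baseline_mm / disparity
```

For example, with an assumed focal length of 800 pixels and a 4 mm camera separation, a feature with a 40-pixel disparity would lie 80 mm from the cameras. A larger camera separation produces a larger disparity for the same feature, which is consistent with the inter-camera distance affecting perceived depth.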

FIG. 6 illustrates a graphical user interface 150 formatted to include a left pillarbox 198 and a right pillarbox 199 to the left and right, respectively, of a live video feed 168 of a patient's cavity. The graphical user interface 150 may be overlaid on the live video feed 168. In some embodiments, the live video feed 168 is formatted by the controller 26 to accommodate the left pillarbox 198 and the right pillarbox 199. In some embodiments, the live video feed 168 may be displayed at a predetermined size and location on the display 12, and the left pillarbox 198 and the right pillarbox 199 may be displayed on either side of the live video feed 168 at a particular size based on the remaining area of the display 12 not occupied by the live video feed 168. The graphical user interface 150 includes several different graphical user interface elements, which are described in more detail below.

Robotic arms 42B and 42A are also visible in the live video feed. The left pillarbox 198 may include a state identifier 173, for example, an engaged or disengaged state identifier associated with the instrument tip 120 of robotic arm 42B. The "engaged" state identifier 173 indicates that the user's left hand and arm are engaged with the left hand controller 201, and therefore the instrument tip 120 is also engaged. The "disengaged" state identifier 173 indicates that the user's left hand and arm are not engaged with the left hand controller 201, and therefore the instrument tip 120 is also disengaged. When the user's left hand and arm are disengaged from the left hand controller 201, the surgical robotic system 10 can be completely disengaged. That is, the surgical robotic system 10 may remain on but will not respond until the user's hand re-engages the hand controller. The instrument tip 120 may be represented by a graphical symbol 179 containing the name of the instrument tip 120 to confirm to the user which type of end effector or instrument tip is currently in use. In FIG. 6, the instrument tip 120 represented by the graphical symbol 179 is a bipolar grasper. Notably, the present disclosure is not limited to the bipolar grasper or scissors shown in FIG. 6.

Similarly, the right pillarbox 199 may include a state identifier 175, such as an engaged or disengaged state identifier, associated with the instrument tip 120 of the robotic arm 42A. In some embodiments, based on the state of the end effector, the graphical user interface may also provide a visual representation of the state in addition to text. For example, the end effector iconography may be "grayed out" or made less prominent when it is disengaged.

The state identifier 175 may be "engaged," thereby indicating that the user's right hand and arm are engaged with the right hand controller 202, and therefore the instrument tip 120 is also engaged. Alternatively, the state identifier 175 may be "disengaged," thereby indicating that the user's right hand and arm are not engaged with the right hand controller 202, and therefore the instrument tip 120 is also disengaged. The instrument tip 120 may be represented by a graphical symbol 176 containing the name of the instrument tip 120 to confirm to the user which type of end effector or instrument tip is currently in use. In FIG. 6, the instrument tip 120 represented by the graphical symbol 176 is a pair of monopolar scissors. Notably, the present disclosure is not limited to the monopolar scissors shown in FIG. 6.
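
The state identifiers and the optional "grayed out" iconography described above amount to a small mapping from hand-controller engagement to what each pillarbox displays. A minimal sketch of that mapping follows; the names PillarboxStatus and pillarbox_status are hypothetical and are not part of the disclosed system, and the sketch covers only the label and icon emphasis, not the behavior of the system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PillarboxStatus:
    label: str        # text of the state identifier ("Engaged" / "Disengaged")
    instrument: str   # instrument-tip name shown in the graphical symbol
    grayed_out: bool  # whether the icon is de-emphasized


def pillarbox_status(hand_engaged: bool, instrument: str) -> PillarboxStatus:
    """Derive what one pillarbox shows from whether the user's hand
    is engaged with the corresponding hand controller (assumed boolean input)."""
    label = "Engaged" if hand_engaged else "Disengaged"
    # The icon is de-emphasized only while the instrument tip is disengaged.
    return PillarboxStatus(label=label, instrument=instrument,
                           grayed_out=not hand_engaged)


left = pillarbox_status(True, "Bipolar Grasper")
right = pillarbox_status(False, "Monopolar Scissors")
print(left.label, left.grayed_out)    # Engaged False
print(right.label, right.grayed_out)  # Disengaged True
```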

The left pillarbox 198 may also include a robot pose view 171. The robot pose view 171 includes a simulated view of the robot arms 42B and 42A, the camera assembly 44, and the support arm, thereby allowing the user to obtain a third-person perspective of the robot arm assembly 42, the camera assembly 44, and the robot support system 46. The simulated view of the robot arms 42B and 42A is represented by a pair of simulated robot arms 191 and 192. The simulated view of the camera assembly 44 is represented by a simulated camera 193. The robot pose view 171 also includes a simulated camera view associated with the patient's cavity or a portion of the cavity, which represents the placement or location of the pair of simulated robot arms 191 and 192 relative to the frustum 151. More specifically, the camera view may be the field of view of the camera assembly 44, which is equivalent to the frustum 151.

The right pillarbox 199 may also include a robot pose view 172 that includes a simulated view of the robot arms 42B and 42A, the camera assembly 44, and the support arm, thereby allowing the user to obtain a third-person perspective of the robot arm assembly 42, the camera assembly 44, and the support arm. The simulated view of the robot arms 42B and 42A is represented by a pair of simulated robot arms 165 and 166. The simulated view of the camera assembly 44 is represented by a simulated camera 193. The robot pose view 172 also includes a simulated camera view associated with the patient's cavity or a portion of the cavity, which represents the placement or location of the pair of simulated robot arms 165 and 166 relative to a frustum 167. More specifically, the camera view may be the field of view of the camera, which is the frustum 167. The robot pose view 172 provides elbow height awareness and situational awareness, particularly when maneuvering in a face-up/face-down configuration.

状況認識は、ロボットアーム４２Ａ及び４２Ｂが患者の空洞の内部にあるときに、時間及び空間に関して特定のロボット要素を理解する方法として特徴付けられ得る。例えば、ロボット姿勢ビュー１７１に示すように、模擬ロボットアーム１９２の肘は下向きに曲げられ、それによって、実際のロボットアーム４２Ａの肘が実際にどのように配向され、患者の空洞内に位置付けられているかを知る能力をユーザに提供する。ロボットアーム４２Ａ及び４２Ｂに対するカメラ組立品４４の位置付けのために、ロボットアーム４２Ａ及び４２Ｂの全長は、ライブビデオ映像１６８では見えない場合があることに留意されたい。結果として、ユーザは、ロボットアーム４２Ａ及び４２Ｂが患者の空洞内にどのように配向及び位置付けられているかの可視化を有しない場合がある。模擬ロボットアーム１６５及び１６６、並びに模擬ロボットアーム１９１及び１９２は、ユーザに、患者の空洞内の実際のロボットアーム４２Ａ及び４２Ｂの少なくとも位置及び配向の状況認識を提供する。 Situational awareness can be characterized as a way of understanding particular robotic elements with respect to time and space when robotic arms 42A and 42B are inside the patient cavity. For example, as shown in robot pose view 171, the elbow of simulated robotic arm 192 is bent downward, thereby providing the user with the ability to know how the elbow of actual robotic arm 42A is actually oriented and positioned within the patient cavity. Note that due to the positioning of camera assembly 44 relative to robotic arms 42A and 42B, the full length of robotic arms 42A and 42B may not be visible in live video feed 168. As a result, the user may not have visualization of how robotic arms 42A and 42B are oriented and positioned within the patient cavity. Simulated robotic arms 165 and 166, as well as simulated robotic arms 191 and 192, provide the user with situational awareness of at least the position and orientation of actual robotic arms 42A and 42B within the patient cavity.

ディスプレイ１２上に与えられるグラフィカルユーザインターフェース１５０の各側面上の二つの異なる視点からの、二つの別個のビュー（ロボット姿勢ビュー１７１及びロボット姿勢ビュー１７２）が存在し得る。ロボット姿勢ビュー１７１及びロボット姿勢ビュー１７２は、ロボットアーム４２Ａ及び４２Ｂをビュー内に維持しながら、トロカール５０の中心に留まるように自動的に更新される。ロボット姿勢ビュー１７１及び１７２はまた、空間認識をユーザに提供する。 There may be two separate views (robot pose view 171 and robot pose view 172) from two different perspectives on each side of the graphical user interface 150 presented on the display 12. Robot pose view 171 and robot pose view 172 are automatically updated to stay centered on the trocar 50 while keeping the robot arms 42A and 42B within view. Robot pose views 171 and 172 also provide spatial awareness to the user.

空間認識は、空洞内の他の物体及び空洞自体に対するロボット姿勢ビュー１７１及び１７２で見たときの、ロボットアーム４２Ａ及び４２Ｂの配置又は位置として特徴付けられ得る。ロボット姿勢ビュー１７１及び１７２は、模擬ロボットアーム１９１及び１９２をロボット姿勢ビュー１７１で、並びに模擬ロボットアーム１６５及び１６６をロボット姿勢ビュー１７２で見ることによって、実際のロボットアーム４２Ａ及び４２Ｂが空洞内のどこに位置するかを決定する能力をユーザに提供する。例えば、ロボット姿勢ビュー１７１は、錐台１５１に対する模擬ロボットアーム１９１及び１９２の位置及び場所を示す。ロボット姿勢ビュー１７１は、支持アーム、並びに支持アームに取り付けられた模擬ロボットアーム１９１及び１９２の側面図から、錐台１５１に対して模擬ロボットアーム１９１及び１９２を描写する。この特定のロボット姿勢は、空洞内の解剖学的特徴への近接をより良く確かめる能力をユーザに提供する。 Spatial awareness may be characterized as the placement or location of robot arms 42A and 42B when viewed in robot pose views 171 and 172 relative to other objects within the cavity and the cavity itself. Robot pose views 171 and 172 provide the user with the ability to determine where the actual robot arms 42A and 42B are located within the cavity by viewing simulated robot arms 191 and 192 in robot pose view 171 and simulated robot arms 165 and 166 in robot pose view 172. For example, robot pose view 171 shows the position and location of simulated robot arms 191 and 192 relative to frustum 151. Robot pose view 171 depicts simulated robot arms 191 and 192 relative to frustum 151 from a side view of the support arm and the simulated robot arms 191 and 192 attached to the support arm. This particular robot pose provides the user with the ability to better ascertain proximity to anatomical features within the cavity.

ロボット姿勢ビュー１７２はまた、実際のロボットアーム４２Ａ及び４２Ｂが互いにどの程度近いか、又はそれらが互いにどの程度離れているかをより良く確かめる能力をユーザに提供することができる。またさらに、ロボット姿勢ビュー１７２はまた、実際のロボットアーム４２Ａ及び４２Ｂが、ロボットアーム４２Ａ及び４２Ｂの左右にある患者の空洞の内部に対して、どこに位置決めされ得る又は位置し得るかを示すことができ、それによって、ロボットアーム４２Ａ及び４２Ｂが空洞内のどこにあるか、並びにそれらが空洞内の解剖学的特徴に対してどこにあるかの空間認識をユーザに提供する。上述のように、ロボットアーム４２Ａ及び４２Ｂの全長は、ライブビデオ映像１６８では見えないため、模擬ロボットアーム１６５及び１６６は、実際のロボットアーム４２Ａ及び４２Ｂが互いにどの程度近いか、又は離れているかを知るための空間認識をユーザに提供し得る。ロボット姿勢ビュー１７２によって提供されるビューは、ユーザが空洞の内部の範囲を見ているかのように見えるビューである。ロボット姿勢ビュー１７２は、ユーザが、仮想肘１２８が互いに近づくように右ハンドコントローラ２０２及び左ハンドコントローラを操作する場合、仮想肘１２８が互いにどの程度近いか、並びに実際のロボットアーム４２Ａ及び４２Ｂが互いにどの程度近いかを知るための空間認識をユーザに提供する。例えば、ユーザがロボットアーム４２Ａ及び４２Ｂを真っ直ぐにするように左ハンドコントローラ２０１及び右ハンドコントローラ２０２を操作すると、模擬ロボットアーム１６６及び１６５は、互いに平行になり、模擬ロボットアーム１６５の肘と模擬ロボットアーム１６６の肘の間の距離が減少する。逆に、ユーザがロボットアーム４２Ａ及び４２Ｂを曲げるように左ハンドコントローラ２０１及び右ハンドコントローラ２０２を操作すると、その結果、ロボットアーム４２Ａ及び４２Ｂの仮想肘１２８間の距離はさらに離れ、模擬ロボットアーム１６６及び１６５は互いに平行ではなくなり、模擬ロボットアーム１６５の肘と模擬ロボットアーム１６６の肘の間の距離は増加することになる。ライブビデオ映像１６８はロボットアーム４２Ａ及び４２Ｂの全長の可視化を提供しないため、ロボット姿勢ビュー１７１及び１７２が、外科手術処置中に空間認識をユーザに提供する。 The robot pose view 172 can also provide the user with the ability to better ascertain how close or far apart the actual robot arms 42A and 42B are to each other. Furthermore, the robot pose view 172 can also show where the actual robot arms 42A and 42B could be positioned or located relative to the interior of the patient's cavity to the left and right of the robot arms 42A and 42B, thereby providing the user with spatial awareness of where the robot arms 42A and 42B are within the cavity and where they are relative to anatomical features within the cavity. As mentioned above, because the full length of the robot arms 42A and 42B is not visible in the live video feed 168, the simulated robot arms 165 and 166 can provide the user with spatial awareness of how close or far apart the actual robot arms 42A and 42B are to each other. The view provided by the robot pose view 172 is a view that appears as if the user is looking into the interior of the cavity. 
The robot pose view 172 provides the user with spatial awareness of how close the virtual elbows 128 are to each other, and how close the actual robot arms 42A and 42B are to each other, when the user manipulates the right hand controller 202 and the left hand controller 201 so that the virtual elbows 128 move closer together. For example, if the user manipulates the left hand controller 201 and the right hand controller 202 to straighten the robot arms 42A and 42B, the simulated robot arms 166 and 165 become parallel to each other and the distance between the elbows of the simulated robot arms 165 and 166 decreases. Conversely, if the user manipulates the left hand controller 201 and the right hand controller 202 to bend the robot arms 42A and 42B, the virtual elbows 128 of the robot arms 42A and 42B move further apart, the simulated robot arms 166 and 165 are no longer parallel to each other, and the distance between the elbows of the simulated robot arms 165 and 166 increases. Because the live video feed 168 does not show the full length of the robot arms 42A and 42B, the robot pose views 171 and 172 provide the user with spatial awareness during the surgical procedure.

In FIG. 6, simulated robotic arms 191 and 192 are shown within the field of view of camera assembly 44 associated with frustum 151, which provides the user with situational and spatial awareness of where robotic arms 42B and 42A are located or positioned within the captured portion of the patient's actual cavity. The camera view associated with robot pose view 171 is a simulated view of robotic arms 42B and 42A as if the user were viewing the actual robotic arms 42B and 42A from the side within the patient's cavity. As noted above, the camera view may be the field of view of camera assembly 44, which is frustum 167. That is, robot pose view 171 provides the user with a side view of simulated robotic arms 191 and 192, which are simulated views corresponding to robotic arms 42B and 42A, respectively.

一部の実施形態では、グラフィカルユーザインターフェース１５０は、図６に示される空洞内の異なる領域に対する、空洞並びにロボットアーム４２Ｂ及びロボットアーム４２Ａの視野を含む単一の視座からのライブビデオ映像１６８を表示することができる。結果として、ユーザは、ロボットアーム４２Ｂ及びロボットアーム４２Ａの仮想肘１２８がどのように位置付けられているかを常に判定できるとは限らない場合がある。これは、カメラ組立品４４が、ロボットアーム４２Ｂの仮想肘１２８のビデオ映像及びロボットアーム４２Ａの肘のビデオ映像を常に含むとは限らない場合があり、したがって、ユーザが、患者の空洞内で操作することを望む場合、右ハンドコントローラ２０２及び左ハンドコントローラ２０１をどのように調整するかを決定することができない場合があるためである。左状況認識カメラビューパネルはロボットアーム１９１及び１９２の全長の模擬視野を含むため、ロボットアーム４２Ｂ（ロボットアーム１９１）の模擬ビュー及びロボットアーム４２Ａ（ロボットアーム１９２）の模擬ビューは、ユーザがロボットアーム４２Ａ及びロボットアーム４２Ｂの仮想肘１２８の位置付けを決定することを可能にする視点をユーザに提供する。ロボット姿勢ビュー１７１の模擬視野は、ロボットアーム１９１及び１９２の仮想肘１２８のビューを含むため、ユーザは、左ハンドコントローラ２０１及び右ハンドコントローラ２０２を操作し、左ハンドコントローラ２０１及び右ハンドコントローラ２０２の操作に従ってロボットアーム１９１及び１９２がどのように移動するかを注意して見ることによって、ロボットアーム４２Ｂ及びロボットアーム４２Ａの位置付けを調整することができる。 In some embodiments, the graphical user interface 150 can display live video feed 168 from a single viewpoint that includes the view of the cavity and robotic arms 42B and 42A for different regions within the cavity as shown in FIG. 6. As a result, the user may not always be able to determine how the virtual elbows 128 of robotic arms 42B and 42A are positioned. This is because the camera assembly 44 may not always include video feed of the virtual elbow 128 of robotic arm 42B and video feed of the elbows of robotic arm 42A, and therefore the user may not be able to determine how to adjust the right hand controller 202 and the left hand controller 201 when wishing to operate within the patient's cavity. Because the left situation awareness camera view panel includes a simulated view of the entire length of the robot arms 191 and 192, the simulated view of the robot arm 42B (robot arm 191) and the simulated view of the robot arm 42A (robot arm 192) provide the user with a perspective that enables the user to determine the positioning of the virtual elbows 128 of the robot arms 42A and 42B. 
Because the simulated view of the robot pose view 171 includes views of the virtual elbows 128 of the robot arms 191 and 192, the user can adjust the positioning of the robot arms 42B and 42A by manipulating the left hand controller 201 and the right hand controller 202 and carefully watching how the robot arms 191 and 192 move in accordance with the manipulation of the left hand controller 201 and the right hand controller 202.

The graphical user interface 150 may include a robot pose view 172 that contains a frustum 167, which is the field of view of the camera assembly 44 associated with a portion of the patient's cavity, a simulated camera 158, and simulated robot arms 165 and 166 together with a simulated robot support arm that supports them.

In FIG. 6, simulated robotic arms 165 and 166 are shown within frustum 167, which represents the location and positioning of robotic arms 42B and 42A within the patient's actual cavity. The view shown in robot pose view 172 is a simulated view of robotic arms 42B and 42A as if the user were viewing them top-down within the patient's cavity. That is, robot pose view 172 provides the user with a top-down view of simulated robotic arms 165 and 166, which are simulated views corresponding to robotic arms 42B and 42A, respectively. The top-down view allows the user to maintain a certain level of situational awareness of robotic arms 42B and 42A while performing a procedure within the cavity. Because robot pose view 172 includes a simulated top-down view of simulated robot arms 165 and 166, simulated camera 158, and the support arm of the robot assembly, the view of simulated robot arm 165 corresponding to robot arm 42B and the view of simulated robot arm 166 corresponding to robot arm 42A give the user an overhead perspective from which the positioning of robot arms 42B and 42A can be determined. 
Because the simulated field of view of camera assembly 44, outlined by frustum 167, includes a top-down view of simulated robot arms 165 and 166, the user can adjust the positioning of robot arms 42B and 42A by manipulating left hand controller 201 and right hand controller 202 and noting how simulated robot arms 165 and 166 move forward or backward within the portion of the cavity within frustum 167 in accordance with the manipulation of left hand controller 201 and right hand controller 202.

ロボット姿勢ビュー１７１及び１７２におけるロボットアーム４２Ｂ及び４２Ａの模擬ビューは、ロボットアーム４２Ｂ及び４２Ａをビュー内に維持しながら、トロカール５０の中心に留まるように自動的に更新される。一部の実施形態では、これは、ロボットアーム４２Ｂ及び４２Ａ上にある感知及び追跡モジュール１６からの一つ以上のセンサに基づいて達成されることができ、情報を右ハンドコントローラ２０２及び左ハンドコントローラ２０１に提供する。センサは、エンコーダ若しくはホール効果センサ又は他の適切なセンサとすることができる。 The simulated views of the robot arms 42B and 42A in the robot pose views 171 and 172 are automatically updated to keep the robot arms 42B and 42A in view while remaining centered over the trocar 50. In some embodiments, this can be achieved based on one or more sensors from the sensing and tracking module 16 located on the robot arms 42B and 42A, providing information to the right hand controller 202 and the left hand controller 201. The sensors can be encoders or Hall effect sensors or other suitable sensors.

Anatomy Segmentation and Tracking
The anatomy segmentation and tracking described herein can be used in conjunction with any of the surgical robotic systems described above, or any other suitable surgical robotic system. Additionally, some embodiments described herein can be employed with semi-robotic endoscopic surgical systems that are only partially robotic.

本明細書に教示される生体構造セグメンテーション及び追跡の実施例は、以下に記載される図７～図１２に示される実施形態を参照して理解され得る。便宜上、特に断りがない限り、図面に示される様々な実施形態の類似の特徴を参照するために同様の参照番号が使用される。 Examples of the anatomy segmentation and tracking taught herein can be understood with reference to the embodiments shown in Figures 7-12 described below. For convenience, unless otherwise noted, like reference numerals will be used to refer to similar features of the various embodiments shown in the drawings.

FIG. 7 illustrates an exemplary anatomy segmentation and tracking module 300 for anatomy segmentation and anatomical structure tracking, according to some embodiments. The anatomy segmentation and tracking module 300 may be executed by the processor 22 of the computing module 18 of the surgical robotic system 10. In some embodiments, the anatomy segmentation and tracking module 300 may be part of the system code 2000, as described with respect to FIGS. 11A and 11B. The anatomy segmentation and tracking module 300 may include a machine learning model 320 that is fed by an input 310 and generates an output 340, a confidence module 346, and a tracking module 360. In some embodiments, the anatomy segmentation and tracking module 300 may include components different from those shown in FIG. 7. For example, the anatomy segmentation and tracking module 300 may further include a training module for training the machine learning model. Some embodiments may include the confidence module 346, and some embodiments may not.

The input 310 includes position and orientation data 312 indicating the pose of the robot assembly 20, which includes the robotic arm assembly 42 and the camera assembly 44. For example, the surgical robotic system 10 can determine the position and orientation data 312 using sensors coupled to the robotic arm assembly 42 and the camera assembly 44. With reference to FIGS. 1 and 4-6, the robotic arm assembly 42 can include sensors (e.g., sensor 132 in FIG. 4B) that detect the position and orientation (e.g., roll, pitch, yaw) of one or more portions of each of the robotic arms (e.g., various joints, end effectors, or other portions) in 3D space (e.g., within the abdominal cavity of the patient). Sensors from the sensing and tracking module 16 on the robotic arms 42B and 42A and the camera assembly 44 can detect the position and orientation data 312 of the robotic arms 42B and 42A and the camera assembly 44.
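
One way to picture position and orientation data of this kind is as a list of per-portion records, each holding a 3D position and a roll, pitch, and yaw, flattened into a single numeric vector. The sketch below is an assumption made for illustration: the record layout, the names PartPose and flatten_poses, and the sample values are hypothetical, and the disclosure does not specify a data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartPose:
    """Pose of one tracked portion (e.g., a joint or an end effector)."""
    name: str
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def flatten_poses(parts: List[PartPose]) -> List[float]:
    """Concatenate position and orientation of every tracked portion
    into one flat vector (six values per portion, in list order)."""
    vector: List[float] = []
    for p in parts:
        vector.extend([p.x, p.y, p.z, p.roll, p.pitch, p.yaw])
    return vector


# Hypothetical readings for two tracked portions.
poses = [
    PartPose("arm_42A_elbow", 10.0, 0.0, 50.0, 0.0, 0.5, 0.0),
    PartPose("camera_44", 0.0, 0.0, 30.0, 0.0, 0.0, 0.1),
]
print(len(flatten_poses(poses)))  # 12
```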

The input 310 may also include an image 314 including a representation of one or more anatomical structures of the subject. In some embodiments, as shown in FIG. 6, an image 168 (e.g., a video frame of a live video stream) captured by the camera assembly 44 within its field of view may include an incarcerated hernia 200 and a blood vessel 210 within an anatomical space 220 (e.g., the abdominal cavity) of the subject (e.g., a patient). For example, the anatomy segmentation and tracking module 300 may access the video stream from the camera assembly 44 in real time to obtain video showing content within the surgical field (e.g., the field of view of the camera assembly 44), which may include organs, vasculature, and other tissues. The anatomy segmentation and tracking module 300 may decompose the obtained video into a series of video frames recorded at a particular rate (e.g., 30 Hz). The anatomy segmentation and tracking module 300 may input each video frame to a machine learning model 320, as described below. In some embodiments, the image 314 may be a single photograph, one of a series of photographs captured in burst mode, or another suitable image depicting the subject's anatomy.

一部の実施形態では、入力３１０はまた、蛍光データ（例えば、インドシアニングリーン（ＩＣＧ）撮像から、及び／又は腹腔鏡システムから）、カメラ組立品４４によって捕捉される環境の３Ｄデータ若しくはマップ、及び／又は外科手術ロボットシステム１０が動作している内部体腔に関連付けられた他の適切なデータなどの、他のシステムからのデータを含み得る。特に、本開示は、本明細書に記載の入力データに限定されない。 In some embodiments, input 310 may also include data from other systems, such as fluorescence data (e.g., from indocyanine green (ICG) imaging and/or from a laparoscopic system), 3D data or maps of the environment captured by camera assembly 44, and/or other suitable data associated with the internal body cavity in which surgical robotic system 10 is operating. Notably, the present disclosure is not limited to the input data described herein.

The machine learning model 320 may include an encoder-decoder architecture and a classifier architecture. The encoder-decoder architecture may include a pose encoder 322, a visual encoder 326, and a decoder 332. The classifier architecture may include a classifier 324 (e.g., a multi-class classifier). The machine learning model 320 may segment multiple types of anatomical structures (e.g., blood vessels, nerves, other vessel-like structures, organs, other delicate structures, and/or obstruction structures) within the surgical field (e.g., the field of view of the camera assembly 44) across various anatomical locations and may identify anatomical landmarks (e.g., in real time) as described below.

In some embodiments, the machine learning model 320 may include one or more neural networks (e.g., a convolutional neural network or another type of neural network). The pose encoder 322, visual encoder 326, decoder 332, and classifier 324 may be part of a single neural network or may be different neural networks. In some embodiments, the machine learning model 320 may be generated using a training process based on supervised learning, semi-supervised learning, unsupervised learning, reinforcement learning, and/or other suitable learning methods.

姿勢エンコーダ３２２は、ロボット組立品２０の位置及び配向データ３１２によって供給され、位置及び配向データ３１２から、ロボット組立品２０の姿勢のコンパクト表現を抽出することができる。抽出されたコンパクト表現は、姿勢表現３２４と称され得る。抽出されたコンパクト表現は、位置及び配向データ３１２の次元数を低減することによってロボット組立品２０の姿勢を表すベクトルであってもよく、又は位置及び配向データ３１２をコンパクトな様式（例えば、データサイズの低減など）で表す任意の他の適切な表現であってもよい。 The pose encoder 322 is supplied with the position and orientation data 312 of the robot assembly 20 and can extract a compact representation of the pose of the robot assembly 20 from the position and orientation data 312. The extracted compact representation may be referred to as a pose representation 324. The extracted compact representation may be a vector representing the pose of the robot assembly 20 by reducing the dimensionality of the position and orientation data 312, or any other suitable representation that represents the position and orientation data 312 in a compact manner (e.g., reducing data size, etc.).

視覚的エンコーダ３２６は、画像３１４によって供給され、画像３１４から、画像３１４のコンパクト表現を抽出することができる。抽出されたコンパクト表現は、視覚的表現３２８と称され得る。コンパクト表現は、画像３１４の次元数を低減することによって画像を表すベクトルであってもよく、又はコンパクトな様式（例えば、データサイズの低減など）で画像を表す任意の他の適切な表現であってもよい。 The visual encoder 326 is fed by the image 314 and can extract from the image 314 a compact representation of the image 314. The extracted compact representation can be referred to as a visual representation 328. The compact representation can be a vector that represents the image by reducing the dimensionality of the image 314, or any other suitable representation that represents the image in a compact manner (e.g., reducing data size, etc.).

The machine learning model 320 can aggregate the pose representation 324 and the visual representation 328 into a single state representation 330. The state representation 330 reflects the current state of the robotic assembly 20 and can be used to accomplish multiple tasks of interest, such as identifying anatomical landmarks and delineating delicate structures within the surgical field. In some embodiments, the machine learning model 320 can aggregate the pose representation 324 and the visual representation 328 by averaging the representations, weighting each representation equally. In some embodiments, the machine learning model 320 can aggregate the pose representation 324 and the visual representation 328 using a weighted average of the representations, in which the weight of each representation is determined according to the relative importance of each data modality. For example, if the content of the image 314 changes only slightly as the robotic assembly 20 moves, the pose representation 324 may have a greater weight than the visual representation 328. In some embodiments, the machine learning model 320 can concatenate the representations, such that the size of the state representation 330 increases with each added representation.
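
The three aggregation options named above (equal averaging, weighted averaging, and concatenation) can be sketched on plain Python lists. This is a minimal illustration rather than the disclosed implementation: the function names and sample vectors are hypothetical, and the averaging variants assume that all representations have the same length, which the text does not state.

```python
from typing import List, Sequence


def weighted_average(reps: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Element-wise weighted average of equally sized representations."""
    if not reps or len(reps) != len(weights):
        raise ValueError("need one weight per representation")
    dim = len(reps[0])
    if any(len(r) != dim for r in reps):
        raise ValueError("averaging assumes representations of equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [sum(w * r[i] for r, w in zip(reps, weights)) / total
            for i in range(dim)]


def average(reps: Sequence[Sequence[float]]) -> List[float]:
    """Equal weighting is a weighted average with all weights set to 1."""
    return weighted_average(reps, [1.0] * len(reps))


def concatenate(reps: Sequence[Sequence[float]]) -> List[float]:
    """State size grows with every representation that is added."""
    return [value for r in reps for value in r]


pose_rep = [1.0, 3.0]    # hypothetical pose representation
visual_rep = [3.0, 1.0]  # hypothetical visual representation

print(average([pose_rep, visual_rep]))                       # [2.0, 2.0]
print(weighted_average([pose_rep, visual_rep], [3.0, 1.0]))  # [1.5, 2.5]
print(concatenate([pose_rep, visual_rep]))                   # [1.0, 3.0, 3.0, 1.0]
```

Averaging keeps the state size fixed but requires the modalities to share a dimension, whereas concatenation places no such constraint on the encoders at the cost of a state that grows with each modality.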

一部の実施形態では、状態表現３３０は、任意の数のモダリティ固有の表現を説明することができ、姿勢表現３２４及び視覚的表現３２８に限定されないという点で拡張可能である。例えば、機械学習モデル３２０は、入力３１０に関して記載されるように、蛍光データ、３Ｄマップデータ、及び／又は他の適切なデータなどの、追加の入力からそれぞれのコンパクト表現を抽出するためにより多くのエンコーダを含み得る。 In some embodiments, state representation 330 is extensible in that it can describe any number of modality-specific representations and is not limited to pose representation 324 and visual representation 328. For example, machine learning model 320 may include more encoders to extract respective compact representations from additional inputs, such as fluorescence data, 3D map data, and/or other suitable data, as described with respect to input 310.

デコーダ３３２は、状態表現３３０によって供給され、複数のセグメンテーションマップ３４２を生成することができる。各セグメンテーションマップ３４２は、１つ以上の解剖学的構造のうちのどれのロボットアーム組立品４２との接触を回避すべきかを識別することができる。一部の実施形態では、デコーダ３３２は、複数の解剖学的構造を識別することができる。例えば、執刀医が把握しておきたい複数の繊細な構造があり得る。例えば、執刀医は、動脈（解剖学的構造２）に沿って誘導されている間、神経（解剖学的構造１）を損傷することを回避したい場合がある。複数の解剖学的構造のセグメンテーションを可能にするために、各デコーダ３３２は、特定の解剖学的構造（例えば、神経、動脈など）を識別するように訓練され得る。各デコーダ３３２は、画像３１４の各ピクセルが特定の解剖学的構造を示す確率を出力することができる。その値が値閾値を超えるピクセルは、視野内のその解剖学的構造を描写すると見なされ得る。例えば、Ｎ個の繊細な構造がある場合、機械学習モデル３２０は、Ｎ個のセグメンテーションデコーダを有し得る。一部の実施形態では、単一のデコーダは、複数のセグメンテーションマップ３４２を生成することができる。 The decoder 332 is fed by the state representation 330 and can generate multiple segmentation maps 342. Each segmentation map 342 can identify which of one or more anatomical structures the robot arm assembly 42 should avoid contacting. In some embodiments, the decoder 332 can identify multiple anatomical structures. For example, there may be multiple sensitive structures that the surgeon wants to keep track of. For example, the surgeon may want to avoid damaging a nerve (anatomical structure 1) while being guided along an artery (anatomical structure 2). To enable segmentation of multiple anatomical structures, each decoder 332 can be trained to identify a particular anatomical structure (e.g., nerve, artery, etc.). Each decoder 332 can output a probability that each pixel of the image 314 represents a particular anatomical structure. Pixels whose value exceeds a value threshold can be considered to depict that anatomical structure within the field of view. For example, if there are N sensitive structures, the machine learning model 320 can have N segmentation decoders. In some embodiments, a single decoder can generate multiple segmentation maps 342.
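As a minimal sketch of the thresholding step described above, the following Python example turns per-pixel probabilities from N hypothetical decoders into N binary segmentation maps. The 2x2 probability grids, the structure names used as dictionary keys, and the threshold of 0.5 are assumptions for illustration only.

```python
from typing import Dict, List

ProbMap = List[List[float]]   # per-pixel probabilities, one row per image row
BinaryMap = List[List[bool]]  # True where the structure is considered present


def segmentation_map(prob_map: ProbMap, threshold: float = 0.5) -> BinaryMap:
    """Mark pixels whose probability exceeds the threshold."""
    return [[p > threshold for p in row] for row in prob_map]


# Hypothetical outputs of two structure-specific decoders on a 2x2 image.
decoder_outputs: Dict[str, ProbMap] = {
    "nerve": [[0.9, 0.2], [0.6, 0.1]],
    "artery": [[0.1, 0.8], [0.3, 0.7]],
}

# One segmentation map per structure (N structures give N maps).
maps = {name: segmentation_map(probs, threshold=0.5)
        for name, probs in decoder_outputs.items()}
```

Because each structure has its own map, a single pixel may be flagged by more than one decoder, which is consistent with overlapping or adjacent structures.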

分類器３３４は、状態表現３３０によって供給され、ロボット組立品２０が動作している解剖学的空間内の１つ以上の解剖学的ランドマークを識別することができる。解剖学的ランドマークの例には、腹壁の領域を指す鼠径三角、内側を精管によって、側方を精巣動静脈によって、及び下側を腹膜ひだによって画定された解剖学的三角形を指すトライアングル・オブ・ドゥーム、並びに腸恥索、睾丸動静脈、及び腹膜ひだで囲まれた領域である疼痛の三角が含まれ得る。 The classifier 334 is fed by the state representation 330 and can identify one or more anatomical landmarks within the anatomical space in which the robotic assembly 20 is operating. Examples of anatomical landmarks may include the inguinal triangle, which refers to an area of the abdominal wall; the triangle of doom, which refers to an anatomical triangle bounded medially by the vas deferens, laterally by the testicular artery and vein, and inferiorly by the peritoneal fold; and the triangle of pain, which is the area bounded by the iliopubic tract, the testicular artery and vein, and the peritoneal fold.

一部の実施形態では、分類器３３４は、カメラ組立品４４が視認している領域が、複数の解剖学的ランドマークのうちのどれに属するかの確率を出力することができる。例えば、分類器３３４は、カメラ組立品４４が視認している領域を異なる解剖学的ランドマークに分類するための多クラス分類器であり得る。多クラス分類器は、内部体腔（例えば、腹腔又は他の内部空間）内の解剖学的ランドマークの予め定義されたリストによって訓練され得る。一部の実施形態では、予め定義されたリストは、既存の外科診療に基づいてもよく、外科医が現在、腹腔をどのようにナビゲートするかを手本にしている。例えば、臓器が腹筋を通って突出する状態である、鼠径ヘルニアを修復するための手術の文脈において、主要な解剖学的ランドマークには、鼠径三角、トライアングル・オブ・ドゥーム、及び疼痛の三角を含む。分類器３３４は、カメラ組立品４４が視認している領域が、鼠径三角、トライアングル・オブ・ドゥーム、及び疼痛の三角のうちのどれに属するかを決定することができる。 In some embodiments, the classifier 334 can output a probability that the area viewed by the camera assembly 44 belongs to one of multiple anatomical landmarks. For example, the classifier 334 can be a multi-class classifier for classifying the area viewed by the camera assembly 44 into different anatomical landmarks. The multi-class classifier can be trained with a predefined list of anatomical landmarks within an internal body cavity (e.g., the abdominal cavity or other internal space). In some embodiments, the predefined list can be based on existing surgical practice and model how surgeons currently navigate the abdominal cavity. For example, in the context of surgery to repair an inguinal hernia, a condition in which organs protrude through the abdominal muscles, key anatomical landmarks include the inguinal triangle, the triangle of doom, and the triangle of pain. The classifier 334 can determine whether the area viewed by the camera assembly 44 belongs to the inguinal triangle, the triangle of doom, or the triangle of pain.
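The multi-class output described above can be sketched in Python as a softmax over one score per predefined landmark. The softmax function, the raw scores, and the helper names are assumptions for illustration; the specification states only that the classifier outputs probabilities over a predefined list of landmarks.

```python
import math
from typing import Dict, List, Sequence

# Predefined landmark list for the inguinal hernia example in the text.
LANDMARKS = ("inguinal triangle", "triangle of doom", "triangle of pain")


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores: Sequence[float]) -> Dict[str, float]:
    """Map each predefined landmark to its probability."""
    if len(scores) != len(LANDMARKS):
        raise ValueError("need one score per landmark")
    return dict(zip(LANDMARKS, softmax(scores)))


# Hypothetical raw classifier scores for the region currently in view.
probs = classify([2.0, 0.5, 0.1])
best = max(probs, key=probs.get)
```

Reporting the full probability distribution rather than only the top landmark leaves room for downstream logic to weigh competing hypotheses.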

一部の実施形態では、機械学習モデルは、訓練モジュール（例えば、図１１Ｂの訓練モジュール２４００）によって訓練されて、機械学習モデル３２０を生成することができる。一部の実施形態では、訓練モジュール２４００は、標識された解剖学的構造及び標識された解剖学的ランドマーク並びに既知の位置及び配向データを備える画像を有する訓練セットを含み得る。一部の実施形態では、訓練モジュール２４００は、リモートサーバに格納された訓練セットにアクセスすることができる。訓練モジュール２４００は、訓練セットを、訓練される機械学習モデルに供給することができる。訓練モジュール２４００は、訓練プロセス中に機械学習モデル内の重み及び他のパラメータを調整して、機械学習モデルの出力と予想される出力の間の差を低減することができる。訓練された機械学習モデル３２０は、データベース又は生体構造セグメンテーション及び追跡モジュール３００に格納され得る。一部の実施形態では、訓練モジュール２４００は、訓練セットのグループを検証セットとして選択し、訓練された機械学習モデル３２０を検証セットに適用して、訓練された機械学習モデル３２０を評価することができる。 In some embodiments, the machine learning model may be trained by a training module (e.g., training module 2400 of FIG. 11B) to generate the machine learning model 320. In some embodiments, the training module 2400 may include a training set having images with labeled anatomical structures and labeled anatomical landmarks and known position and orientation data. In some embodiments, the training module 2400 may access a training set stored on a remote server. The training module 2400 may provide the training set to the machine learning model to be trained. The training module 2400 may adjust weights and other parameters in the machine learning model during the training process to reduce the difference between the output of the machine learning model and an expected output. The trained machine learning model 320 may be stored in a database or in the anatomy segmentation and tracking module 300. In some embodiments, the training module 2400 may select a group of the training set as a validation set and apply the trained machine learning model 320 to the validation set to evaluate the trained machine learning model 320.

信頼モジュール３４６は、セグメンテーションマップ３４２及び識別された解剖学的ランドマーク３４４を有する様々な出力３４０間の相互依存を作り出すことができる。信頼モジュール３４６は、二方向性であってもよく、それによって、識別された解剖学的ランドマーク３４４は、異なる解剖学的構造に対して生成されたセグメンテーションマップ３４２に有し得る相対的信頼度に影響を与え得る。例えば、術野内のある領域における特定の解剖学的ランドマーク（例えば、神経を含まないか、又は限定された神経を有するもの）の存在は、その領域が神経である可能性を自動的に排除することができる。生体構造セグメンテーション及び追跡モジュール３００は、神経に関連するセグメンテーションマップ３４２の重みを下げることができる。信頼モジュール３４６は双方向であるため、一部の事例では、セグメンテーションマップ３４２はまた、識別された解剖学的ランドマーク３４４の相対的信頼度に影響を与え得る。例えば、生体構造セグメンテーション及び追跡モジュール３００が複数の構造（例えば、血管及び神経）をセグメント化する場合、生体構造セグメンテーション及び追跡モジュール３００は、ロボット組立品２０が、識別された解剖学的ランドマークのサブセットと一致する実体である神経血管束を見ているというより高い信頼度を有し得る。生体構造セグメンテーション及び追跡モジュール３００は、（例えば、外科手術及び解剖学的ドメインの知識に基づいて）このような神経血管束を有する可能性が低い、識別された解剖学的ランドマークの重みを下げることができる。 The confidence module 346 can create interdependencies between the various outputs 340, including the segmentation maps 342 and the identified anatomical landmarks 344. The confidence module 346 can be bidirectional, whereby the identified anatomical landmarks 344 can influence the relative confidence that can be placed in the segmentation maps 342 generated for different anatomical structures. For example, the presence of a particular anatomical landmark (e.g., one that does not contain nerves or has limited nerves) in a region within the surgical field can automatically eliminate the possibility that the region is a nerve. The anatomy segmentation and tracking module 300 can downweight segmentation maps 342 associated with nerves. Because the confidence module 346 is bidirectional, in some cases the segmentation maps 342 can also influence the relative confidence of the identified anatomical landmarks 344. For example, if the anatomy segmentation and tracking module 300 segments multiple structures (e.g., blood vessels and nerves), the anatomy segmentation and tracking module 300 may have a higher degree of confidence that the robotic assembly 20 is seeing an entity, a neurovascular bundle, that matches a subset of the identified anatomical landmarks. 
The anatomy segmentation and tracking module 300 may downweight identified anatomical landmarks that are unlikely to contain such a neurovascular bundle (e.g., based on knowledge of the surgical and anatomical domains).
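One way to picture the bidirectional interdependence described above is with a compatibility table between landmarks and structures. The following Python sketch is an assumption made for illustration: the table `COMPAT`, its values, the placeholder landmark names, and the multiplicative reweighting rule are all hypothetical and are not taken from the specification, which states only that each output can raise or lower the relative confidence placed in the other.

```python
from typing import Dict, Iterable

# Hypothetical domain knowledge: how plausible each structure is in each landmark.
COMPAT: Dict[str, Dict[str, float]] = {
    "landmark_a": {"nerve": 0.1, "vessel": 0.9},  # few or no nerves expected
    "landmark_b": {"nerve": 0.8, "vessel": 0.8},  # neurovascular bundle plausible
}


def reweight_structures(landmark_probs: Dict[str, float],
                        structure_conf: Dict[str, float]) -> Dict[str, float]:
    """Landmarks -> segmentation: scale each map's confidence by its plausibility."""
    return {
        s: conf * sum(p * COMPAT[lm][s] for lm, p in landmark_probs.items())
        for s, conf in structure_conf.items()
    }


def reweight_landmarks(landmark_probs: Dict[str, float],
                       detected: Iterable[str]) -> Dict[str, float]:
    """Segmentation -> landmarks: downweight landmarks unlikely to hold the structures."""
    detected = list(detected)
    scores = {}
    for lm, p in landmark_probs.items():
        score = p
        for s in detected:
            score *= COMPAT[lm][s]
        scores[lm] = score
    total = sum(scores.values())
    if total <= 0:
        return dict(landmark_probs)  # nothing to renormalize; keep the input
    return {lm: v / total for lm, v in scores.items()}


# Direction 1: a landmark with few nerves lowers confidence in the nerve map.
adjusted = reweight_structures({"landmark_a": 0.9, "landmark_b": 0.1},
                               {"nerve": 0.7, "vessel": 0.7})

# Direction 2: seeing both a nerve and a vessel favors the bundle-bearing landmark.
updated = reweight_landmarks({"landmark_a": 0.5, "landmark_b": 0.5},
                             ["nerve", "vessel"])
```

In this sketch, both directions draw on the same table, so a single body of surgical and anatomical domain knowledge drives the adjustment of segmentation maps and of landmark hypotheses.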

追跡モジュール３６０は、セグメンテーションマップ３４２及び識別された解剖学的ランドマーク３４４を更新して、カメラ組立品４４の視野に変化を反映することができる。例えば、ビデオフレームは、特定のサンプリングレート（例えば、３０Ｈｚ）で経時的に記録される。記録期間中、カメラ組立品４４は、ユーザによって移動される、及び／又は視野内の物体は位置を変える。したがって、カメラ組立品４４の視野は、経時的に変化し得る。追跡モジュール３６０は、セグメンテーションマップ３４２及び識別された解剖学的ランドマーク３４４を更新して、生体構造セグメンテーション及び追跡モジュール３００が経時的に（例えば、リアルタイムで）解剖学的ランドマーク及び解剖学的構造を識別できるように、変化を反映することができる。 The tracking module 360 can update the segmentation maps 342 and the identified anatomical landmarks 344 to reflect changes in the field of view of the camera assembly 44. For example, video frames are recorded over time at a particular sampling rate (e.g., 30 Hz). During the recording period, the camera assembly 44 may be moved by the user and/or objects in the field of view may change position. Thus, the field of view of the camera assembly 44 may change over time. The tracking module 360 can update the segmentation maps 342 and the identified anatomical landmarks 344 to reflect the changes so that the anatomy segmentation and tracking module 300 can identify anatomical landmarks and anatomical structures over time (e.g., in real time).

追跡モジュール３６０は、ある期間にわたって所定の時間間隔（例えば、ビデオのフレームレート、サンプリングレート、又はユーザが定義した時間間隔）で一連の入力３１０を取得し、機械学習モデル３２０を各入力３１０に適用して、出力３４０を生成することができる。例えば、時間スロットＴ_１で、追跡モジュール３６０は、Ｔ_１で取得された入力３１０に機械学習モデル３２０を適用し、対応する出力３４０及び中間出力（例えば、状態表現３３０、姿勢表現３２４、及び／又は視覚的表現３２８）を保存することによって、データ処理３６２Ａを実施することができる。時間スロットＴ_２で、追跡モジュール３６０は、Ｔ_２で取得された入力３１０に機械学習モデル３２０を適用し、対応する出力３４０及び中間出力を保存することによって、データ処理３６２Ｂを実施することができる。時間スロットＴ_ｎで、追跡モジュール３６０は、Ｔ_ｎで取得された入力３１０に機械学習モデル３２０を適用し、対応する出力３４０及び中間出力を保存することによって、データ処理３６２Ｎを実施することができる。 The tracking module 360 may acquire a series of inputs 310 at predetermined time intervals (e.g., a video frame rate, a sampling rate, or a user-defined time interval) over a period of time and apply the machine learning model 320 to each input 310 to generate an output 340. For example, at time slot T_1, the tracking module 360 may perform data processing 362A by applying the machine learning model 320 to the input 310 acquired at T_1 and saving the corresponding output 340 and intermediate outputs (e.g., state representation 330, pose representation 324, and/or visual representation 328). At time slot T_2, the tracking module 360 may perform data processing 362B by applying the machine learning model 320 to the input 310 acquired at T_2 and saving the corresponding output 340 and intermediate outputs. At time slot T_n, the tracking module 360 may perform data processing 362N by applying the machine learning model 320 to the input 310 acquired at T_n and saving the corresponding output 340 and intermediate outputs.

一部の実施形態では、追跡モジュール３６０は、現在の時間スロット（例えば、Ｔ_ｎ）における状態表現と、以前の時間スロット（Ｔ_２又はＴ_１）からの以前の状態表現の間の類似性を決定し得る。以前の状態表現は、以前の時間スロットで取得された以前の位置及び配向データから抽出された以前の姿勢表現と、以前の時間スロットで取得された以前の画像から抽出された以前の視覚的表現とを使用して生成され得る。一部の実施形態では、現在の時間スロットは、以前の時間スロットに隣接している。例えば、ビデオフレームは、特定のサンプリングレート又はフレームレート（例えば、３０Ｈｚ）で経時的に記録される。現在の時間スロット及び以前の時間スロットは、隣接するフレームを取得するための隣接する時間スロットである。一部の実施形態では、現在の時間スロットと以前の時間スロットの間に、不連続なフレームを取得するための時間スロットなどの（例えば、フレームレートに基づく）所定の時間間隔がある。類似性が類似性閾値以上であると追跡モジュール３６０が判定する場合、追跡モジュール３６０は、状態表現及び以前の状態表現を平均化して、平均化された状態表現を生成し得る。追跡モジュール３６０は、平均化された状態表現を機械学習モデル３２０に供給して、ロボット組立品２０が現在動作している解剖学的空間内の１つ以上の解剖学的ランドマークを識別し、複数のセグメンテーションマップを生成することができる。 In some embodiments, the tracking module 360 may determine a similarity between a state representation in a current time slot (e.g., T_n) and a previous state representation from a previous time slot (T_2 or T_1). The previous state representation may be generated using a previous pose representation extracted from previous position and orientation data acquired in the previous time slot and a previous visual representation extracted from previous images acquired in the previous time slot. In some embodiments, the current time slot is adjacent to the previous time slot. For example, video frames are recorded over time at a particular sampling rate or frame rate (e.g., 30 Hz). The current time slot and the previous time slot are adjacent time slots for acquiring adjacent frames. In some embodiments, there is a predetermined time interval (e.g., based on the frame rate) between the current time slot and the previous time slot, such as a time slot for acquiring discontinuous frames. If the tracking module 360 determines that the similarity is equal to or greater than a similarity threshold, the tracking module 360 may average the state representation and the previous state representation to generate an averaged state representation. 
The tracking module 360 can feed the averaged state representation to the machine learning model 320 to identify one or more anatomical landmarks within the anatomical space in which the robot assembly 20 is currently operating and generate multiple segmentation maps.

例えば、追跡モジュール３６０は、ある期間にわたる隣接するビデオフレームが、類似の視野（すなわち、最小限の変化）を示すかどうかを決定し得る。追跡モジュール３６０は、隣接するフレームに対応する状態表現の類似性を定量化することができる。類似性が閾値を超える場合、追跡モジュール３６０は、フレームが視野内に有する変化が最小限であると決定し、以前のフレームの以前の状態表現を時間的に先に伝播する。追跡モジュール３６０は、機械学習モデル３２０に状態表現を供給する前に、状態表現を平均化することができる。したがって、追跡モジュール３６０の能力は、履歴情報を利用して、より確実に解剖学的構造をセグメント化することができる。 For example, the tracking module 360 may determine whether adjacent video frames over a period of time exhibit similar fields of view (i.e., minimal change). The tracking module 360 can quantify the similarity of the state representations corresponding to adjacent frames. If the similarity exceeds a threshold, the tracking module 360 determines that the field of view has changed minimally between the frames and propagates the previous state representation of the previous frame forward in time. The tracking module 360 can average the state representations before providing them to the machine learning model 320. Thus, the tracking module 360 can utilize historical information to segment anatomical structures more reliably.
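The similarity check and averaging described above can be sketched in Python as follows. Cosine similarity and the threshold of 0.95 are assumptions chosen for illustration; the specification does not name a similarity measure or a threshold value, and the function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two state representations."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)


def smooth_state(current: Sequence[float],
                 previous: Optional[Sequence[float]],
                 threshold: float = 0.95) -> List[float]:
    """Average with the previous state only when the view has barely changed."""
    if previous is not None and cosine_similarity(current, previous) >= threshold:
        return [(c + p) / 2.0 for c, p in zip(current, previous)]
    return list(current)  # view changed (or no history): use the current state


# Nearly identical consecutive frames: the states are averaged.
smoothed = smooth_state([1.0, 0.0], [1.0, 0.1])
# Very different frames: the previous state is ignored.
unsmoothed = smooth_state([1.0, 0.0], [0.0, 1.0])
```

Gating the average on similarity keeps history from contaminating the state after a large camera motion, while still damping frame-to-frame noise when the view is stable.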

一部の実施形態では、追跡モジュール３６０は、対象の生体構造内のセグメント化された解剖学的構造の局在化を可能にし得る。追跡モジュール３６０は、他の構造に対するセグメント化された解剖学的構造の場所をユーザに提供し得る。例えば、追跡モジュール３６０は、ロボット組立品２０が動作している解剖学的空間の３Ｄ再構築を決定することができる。例えば、カメラ組立品４４は、光検出及び測距（ＬＩＤＡＲ）又はドットマトリックスプロジェクタを含んで、体腔の少なくとも一部分の３Ｄ表現又はマップを取得することができる。ユーザが、ロボット組立品２０を簡単に制御して、解剖学的構造を回避する、及び／又は解剖学的構造を操作することができるように、追跡モジュール３６０は、３Ｄ再構築を使用して、生体構造内のロボット組立品２０の場所及びセグメント化された解剖学的構造の場所を決定し得る。 In some embodiments, the tracking module 360 may enable localization of the segmented anatomical structure within the target anatomy. The tracking module 360 may provide the user with the location of the segmented anatomical structure relative to other structures. For example, the tracking module 360 may determine a 3D reconstruction of the anatomical space in which the robotic assembly 20 is operating. For example, the camera assembly 44 may include a light detection and ranging (LIDAR) or a dot matrix projector to obtain a 3D representation or map of at least a portion of a body cavity. The tracking module 360 may use the 3D reconstruction to determine the location of the robotic assembly 20 and the location of the segmented anatomical structure within the anatomy so that the user can easily control the robotic assembly 20 to avoid and/or manipulate the anatomical structure.

一部の実施形態では、信頼モジュール３４６及び／又は追跡モジュール３６０は、機械学習モデル３２０に含まれ得る。一部の実施形態では、追跡モジュール３６０は、機械学習モデル３２０を含み得る。 In some embodiments, the confidence module 346 and/or the tracking module 360 may be included in the machine learning model 320. In some embodiments, the tracking module 360 may include the machine learning model 320.

図８は、一部の実施形態による、外科手術ロボットシステム１０によって実施される解剖学的構造のセグメンテーション及び解剖学的ランドマークの識別のためのステップ４００を示すフローチャートである。ステップ４０２で、外科手術ロボットシステム１０は、外科手術ロボットシステム１０のカメラ組立品４４から画像を受信する。画像は、対象の１つ以上の解剖学的構造の表現を含み得る。実施例は、図７の画像３１４に関して記載されている。 FIG. 8 is a flowchart illustrating steps 400 for anatomical structure segmentation and anatomical landmark identification performed by the surgical robotic system 10, according to some embodiments. In step 402, the surgical robotic system 10 receives an image from the camera assembly 44 of the surgical robotic system 10. The image may include a representation of one or more anatomical structures of a subject. An example is described with respect to image 314 of FIG. 7.

ステップ４０４で、外科手術ロボットシステム１０は、画像からその視覚的表現を抽出する。視覚的表現は、画像のコンパクト表現であり得る。実施例は、図７の視覚的エンコーダ３２６に関して記載されている。 In step 404, the surgical robot system 10 extracts a visual representation from the image. The visual representation may be a compact representation of the image. An example is described with respect to the visual encoder 326 in FIG. 7.

ステップ４０６で、外科手術ロボットシステム１０は、ロボット組立品２０に関連付けられた位置及び配向データを決定する。ロボット組立品２０は、ロボットアーム組立品４２及びカメラ組立品４４を含む。位置及び配向データは、ロボット組立品２０の姿勢を示す。実施例は、図７の位置及び配向データ３１２に関して記載されている。 In step 406, the surgical robotic system 10 determines position and orientation data associated with the robotic assembly 20. The robotic assembly 20 includes a robotic arm assembly 42 and a camera assembly 44. The position and orientation data indicates the pose of the robotic assembly 20. An example is described with respect to the position and orientation data 312 in FIG. 7.

ステップ４０８で、外科手術ロボットシステム１０は、位置及び配向データに少なくとも部分的に基づいて、ロボット組立品２０の姿勢表現を生成する。実施例は、図７の姿勢エンコーダ３２２に関して記載されている。 In step 408, the surgical robot system 10 generates a pose representation of the robot assembly 20 based at least in part on the position and orientation data. An example is described with respect to the pose encoder 322 in FIG. 7.

ステップ４１０で、外科手術ロボットシステム１０は、視覚的表現及び姿勢表現に少なくとも部分的に基づいて、状態表現を生成する。実施例は、図７の状態表現３３０に関して記載されている。 In step 410, the surgical robot system 10 generates a state representation based at least in part on the visual representation and the pose representation. An example is described with respect to the state representation 330 in FIG. 7.

ステップ４１２で、外科手術ロボットシステム１０は、状態表現に少なくとも部分的に基づいて、ロボット組立品２０が動作している解剖学的空間内の１つ以上の解剖学的ランドマークを識別する。実施例は、図７の分類器３３４及び識別された解剖学的ランドマーク３４４に関して記載されている。 At step 412, the surgical robotic system 10 identifies one or more anatomical landmarks within the anatomical space in which the robotic assembly 20 is operating based at least in part on the state representation. An example is described with respect to the classifier 334 and identified anatomical landmarks 344 in FIG. 7.

ステップ４１４で、外科手術ロボットシステム１０は、複数のセグメンテーションマップを生成する。各セグメンテーションマップは、１つ以上の解剖学的構造のうちのどれのロボット組立品２０との接触を回避すべきかを識別する。実施例は、図７のデコーダ３３２及びセグメンテーションマップ３４２に関して記載されている。 In step 414, the surgical robotic system 10 generates a plurality of segmentation maps. Each segmentation map identifies which of the one or more anatomical structures the robotic assembly 20 should avoid contacting. An example is described with respect to the decoder 332 and segmentation maps 342 in FIG. 7.

図９は、一部の実施形態による、外科手術ロボットシステム１０によって実施される解剖学的構造追跡のためのステップ５００を示すフローチャートである。ステップ５０２で、外科手術ロボットシステム１０は、カメラ組立品４４によって捕捉された複数の画像のうちの１つの画像を受信する。画像は、対象の１つ以上の解剖学的構造の表現を含む。実施例は、図７の画像３１４及び追跡モジュール３６０に関して記載されている。 FIG. 9 is a flowchart illustrating steps 500 for anatomical structure tracking performed by the surgical robotic system 10, according to some embodiments. In step 502, the surgical robotic system 10 receives one image of a plurality of images captured by the camera assembly 44. The image includes a representation of one or more anatomical structures of a subject. An example is described with respect to the image 314 and tracking module 360 of FIG. 7.

ステップ５０４で、外科手術ロボットシステム１０は、画像からその視覚的表現を抽出する。視覚的表現は、画像のコンパクト表現であり得る。実施例は、図７の視覚的エンコーダ３２６及び追跡モジュール３６０に関して記載されている。 In step 504, the surgical robot system 10 extracts a visual representation from the image. The visual representation may be a compact representation of the image. An example is described with respect to the visual encoder 326 and tracking module 360 in FIG. 7.

ステップ５０６で、外科手術ロボットシステム１０は、ロボット組立品２０に関連付けられた位置及び配向データを決定する。実施例は、図７の位置及び配向データ３１２及び追跡モジュール３６０に関して記載されている。 In step 506, the surgical robotic system 10 determines position and orientation data associated with the robotic assembly 20. An example is described with respect to the position and orientation data 312 and tracking module 360 in FIG. 7.

ステップ５０８で、外科手術ロボットシステム１０は、位置及び配向データに少なくとも部分的に基づいて、ロボット組立品２０の姿勢表現を生成する。実施例は、図７の姿勢エンコーダ３２２及び追跡モジュール３６０に関して記載されている。 In step 508, the surgical robot system 10 generates a pose representation of the robot assembly 20 based at least in part on the position and orientation data. An example is described with respect to the pose encoder 322 and tracking module 360 of FIG. 7.

ステップ５１０で、外科手術ロボットシステム１０は、視覚的表現及び姿勢表現に少なくとも部分的に基づいて、状態表現を生成する。実施例は、図７の状態表現３３０及び追跡モジュール３６０に関して記載されている。 In step 510, the surgical robot system 10 generates a state representation based at least in part on the visual representation and the pose representation. An example is described with respect to the state representation 330 and tracking module 360 in FIG. 7.

ステップ５１２で、外科手術ロボットシステム１０は、状態表現と以前の状態表現の間の類似性を決定する。以前の状態表現は、以前の位置及び配向データから抽出された以前の姿勢表現、並びに複数の画像のうちの１つの画像であり、かつその画像の時間的に前である以前の画像から抽出された以前の視覚的表現に少なくとも部分的に基づいて生成される。実施例は、図７の追跡モジュール３６０に関して記載されている。 In step 512, the surgical robotic system 10 determines a similarity between the state representation and a previous state representation. The previous state representation is generated based at least in part on a previous pose representation extracted from previous position and orientation data and a previous visual representation extracted from a previous image, the previous image being one of the plurality of images and temporally preceding the image. An example is described with respect to the tracking module 360 in FIG. 7.

ステップ５１４で、類似性が類似性閾値以上であると決定することに応答して、外科手術ロボットシステム１０は、状態表現及び以前の状態表現を平均化して、平均化された状態表現を生成する。実施例は、図７の追跡モジュール３６０に関して記載されている。 In step 514, in response to determining that the similarity is greater than or equal to the similarity threshold, the surgical robot system 10 averages the state representation and the previous state representation to generate an averaged state representation. An example is described with respect to the tracking module 360 in FIG. 7.

ステップ５１６で、外科手術ロボットシステム１０は、平均化された状態表現に少なくとも部分的に基づいて、ロボット組立品２０が動作している解剖学的空間内の１つ以上の解剖学的ランドマークを識別する。実施例は、図７の分類器３３４、識別された解剖学的ランドマーク３４４、及び追跡モジュール３６０に関して記載されている。 In step 516, the surgical robotic system 10 identifies one or more anatomical landmarks within the anatomical space in which the robotic assembly 20 is operating based at least in part on the averaged state representation. Examples are described with respect to the classifier 334, identified anatomical landmarks 344, and tracking module 360 of FIG. 7.

ステップ５１８で、外科手術ロボットシステム１０は、複数のセグメンテーションマップを生成する。各セグメンテーションマップは、１つ以上の解剖学的構造のうちのどれのロボット組立品２０との接触を回避すべきかを識別する。実施例は、図７のデコーダ３３２、セグメンテーションマップ３４２、及び追跡モジュール３６０に関して記載されている。 In step 518, the surgical robotic system 10 generates a plurality of segmentation maps. Each segmentation map identifies which of the one or more anatomical structures the robotic assembly 20 should avoid contacting. Examples are described with respect to the decoder 332, segmentation maps 342, and tracking module 360 in FIG. 7.

図１０は、一部の実施形態による、解剖学的構造を自動的に識別するように外科手術ロボットシステム１０を訓練するための方法６００のステップを示すフローチャートである。 FIG. 10 is a flowchart illustrating steps of a method 600 for training a surgical robotic system 10 to automatically identify anatomical structures, according to some embodiments.

ステップ６０２で、計算デバイス、例えば、コンピューティングモジュール１８は、複数の標識された画像並びに外科手術ロボットシステム１０のロボット組立品２０に関連付けられた既知の位置及び配向データを有する訓練セットに少なくとも部分的に基づいて、機械学習モデルを訓練する。例えば、図１１Ｂの訓練モジュール２４００は、標識された解剖学的構造及び標識された解剖学的ランドマークを有する標識された画像並びに既知の位置及び配向データを有する訓練セットを含み得る。一部の実施形態では、訓練モジュール２４００は、非一時的コンピュータ可読媒体、例えば、サーバに記憶された訓練セットにアクセスすることができる。各標識された画像は、１つ以上の標識された解剖学的構造、及び一部の実施形態では、１つ以上の標識された解剖学的ランドマークを有し得る。既知の位置及び配向データは、ロボットアーム組立品４２の姿勢、カメラ組立品４４の姿勢、又はその両方など、ロボット組立品２０の姿勢を示す。位置及び配向データは、図７に記載されるように、１つの画像に加えた、機械学習モデルへの入力のうちの１つである。位置及び配向データは、方向、位置、配向、又は画像が捕捉されている間にロボットアーム組立品４２及びカメラ組立品４４がどのように位置付けられているかに関する任意の他の適切な情報など、カメラ組立品４４によって捕捉された画像に追加情報を提供することができる。訓練モジュール２４００は、訓練される機械学習モデルに訓練セットを供給して、訓練された機械学習モデル３２０を生成することができる。訓練モジュール２４００は、訓練プロセス中に機械学習モデル内の重み及び他のパラメータを調整して、機械学習モデルの出力と予想される出力との間の差を低減することができる。例えば、訓練モジュール２４００は、機械学習モデル内の１つ以上のパラメータを調整して、機械学習モデルによって識別された１つ以上の解剖学的ランドマークと、対応する標識された解剖学的ランドマークの間の差、及び機械学習モデルによって生成された各セグメンテーションマップ内の１つ以上の解剖学的構造と、対応する標識された解剖学的構造の間の差を低減することができる。訓練された機械学習モデル３２０は、非一時的記憶媒体上に、又は生体構造セグメンテーション及び追跡モジュール３００の構成要素として格納され得る。一部の実施形態では、訓練モジュール２４００は、訓練セットのグループを検証セットとして選択し、訓練された機械学習モデル３２０を検証セットに適用して、訓練された機械学習モデル３２０を評価することができる。 In step 602, a computing device, e.g., computing module 18, trains a machine learning model based at least in part on a training set having a plurality of labeled images and known position and orientation data associated with the robot assembly 20 of the surgical robot system 10. For example, training module 2400 of FIG. 11B may include a training set having labeled images with labeled anatomical structures and labeled anatomical landmarks and known position and orientation data. In some embodiments, training module 2400 may access a training set stored on a non-transitory computer-readable medium, e.g., a server. Each labeled image may have one or more labeled anatomical structures and, in some embodiments, one or more labeled anatomical landmarks. 
The known position and orientation data indicates the pose of the robot assembly 20, such as the pose of the robot arm assembly 42, the pose of the camera assembly 44, or both. The position and orientation data is one of the inputs to the machine learning model, added to an image, as described in FIG. 7. The position and orientation data may provide additional information to the images captured by the camera assembly 44, such as the direction, position, orientation, or any other suitable information regarding how the robotic arm assembly 42 and camera assembly 44 are positioned while the images are being captured. The training module 2400 may supply the training set to the machine learning model to be trained to generate a trained machine learning model 320. The training module 2400 may adjust weights and other parameters in the machine learning model during the training process to reduce differences between the output of the machine learning model and expected outputs. For example, the training module 2400 may adjust one or more parameters in the machine learning model to reduce differences between one or more anatomical landmarks identified by the machine learning model and corresponding labeled anatomical landmarks, and between one or more anatomical structures in each segmentation map generated by the machine learning model and corresponding labeled anatomical structures. The trained machine learning model 320 may be stored on a non-transitory storage medium or as a component of the anatomy segmentation and tracking module 300. In some embodiments, the training module 2400 may select a group of the training set as a validation set and apply the trained machine learning model 320 to the validation set to evaluate the trained machine learning model 320.
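As a minimal sketch of the "difference" that training reduces, the following Python example combines a landmark classification term with a per-pixel segmentation term. The choice of cross-entropy and binary cross-entropy, the unweighted sum, and all function names are assumptions for illustration; the specification states only that parameters are adjusted to reduce the differences between the model outputs and the corresponding labels.

```python
import math
from typing import Dict, List, Sequence

EPS = 1e-12  # guards against log(0)


def landmark_loss(probs: Sequence[float], label_index: int) -> float:
    """Cross-entropy between predicted landmark probabilities and the label."""
    return -math.log(max(probs[label_index], EPS))


def segmentation_loss(pred: List[List[float]], target: List[List[int]]) -> float:
    """Mean per-pixel binary cross-entropy for one structure's map."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, EPS), 1.0 - EPS)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def total_loss(probs: Sequence[float], label_index: int,
               seg_preds: Dict[str, List[List[float]]],
               seg_targets: Dict[str, List[List[int]]]) -> float:
    """Sum of the landmark term and one segmentation term per labeled structure."""
    seg = sum(segmentation_loss(seg_preds[name], seg_targets[name])
              for name in seg_targets)
    return landmark_loss(probs, label_index) + seg


# Hypothetical labeled example: landmark 0 is correct; one 1x2 nerve mask.
targets = {"nerve": [[1, 0]]}
good = total_loss([0.9, 0.05, 0.05], 0, {"nerve": [[0.9, 0.1]]}, targets)
bad = total_loss([0.1, 0.45, 0.45], 0, {"nerve": [[0.2, 0.8]]}, targets)
```

A training procedure would adjust the model parameters so that this combined value decreases across the training set, and would compute the same quantity on the held-out validation set to evaluate the trained model.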

ステップ６０４で、外科手術ロボットシステム１０は、訓練された機械学習モデル３２０を展開する。例えば、訓練された機械学習モデル３２０は、ロボット組立品２０が動作している解剖学的空間内の１つ以上の解剖学的ランドマークを識別することができる。実施例は、図７の分類器３３４及び識別された解剖学的ランドマーク３４４に関して記載されている。訓練された機械学習モデル３２０は、複数のセグメンテーションマップを生成することができ、各セグメンテーションマップは、１つ以上の解剖学的構造のうちのどれのロボット組立品２０との接触を回避すべきかを識別する。実施例は、図７のデコーダ３３２及びセグメンテーションマップ３４２に関して記載されている。 In step 604, the surgical robotic system 10 deploys the trained machine learning model 320. For example, the trained machine learning model 320 can identify one or more anatomical landmarks within the anatomical space in which the robotic assembly 20 is operating. An example is described with respect to the classifier 334 and identified anatomical landmarks 344 in FIG. 7. The trained machine learning model 320 can generate a plurality of segmentation maps, each identifying which of the one or more anatomical structures the robotic assembly 20 should avoid contacting. An example is described with respect to the decoder 332 and segmentation maps 342 in FIG. 7.

一部の実施形態では、図１２に関して記載される訓練プロセスは、図１０のリモート計算サーバ１１０２ａ～１１０２ｎで実施され得る。計算サーバ１１０２ａ～１１０２ｎは、機械学習モデルを訓練し、解剖学的構造を自動的に識別するために、訓練された機械学習モデル３２０を外科手術ロボットシステム１０に送信することができる。 In some embodiments, the training process described with respect to FIG. 12 may be performed on the remote computational servers 1102a-1102n of FIG. 10. The computational servers 1102a-1102n may train the machine learning model and transmit the trained machine learning model 320 to the surgical robot system 10 for automatic identification of anatomical structures.

図１１Ａは、例示的な実施形態によって提供される方法の１つ以上のステップを実施するために使用され得る、例示的なコンピューティングモジュール１８の図である。コンピューティングモジュール１８は、例示的な実施形態を実装するための１つ以上のコンピュータ実行可能命令又はソフトウェアを記憶するための１つ以上の非一時的コンピュータ可読媒体を含む。非一時的コンピュータ可読媒体には、１つ以上のタイプのハードウェアメモリ、非一時的有形媒体（例えば、１つ以上の磁気記憶ディスク、１つ以上の光ディスク、１つ以上のＵＳＢフラッシュドライブ）などを含み得るが、これらに限定されない。例えば、コンピューティングモジュール１８に含まれるメモリ１００６は、例示的な実施形態（例えば、システムコード２０００）を実装するためのコンピュータ可読及びコンピュータ実行可能命令又はソフトウェアを格納することができる。コンピューティングモジュール１８はまた、メモリ１００６に格納されたコンピュータ可読及びコンピュータ実行可能命令又はソフトウェア、並びにシステムハードウェアを制御するための他のプログラムを実行するための、プロセッサ２２及び関連するコア１００４を含む。プロセッサ２２は、単一のコアプロセッサ又は複数のコア（１００４）プロセッサとすることができる。 FIG. 11A is a diagram of an exemplary computing module 18 that may be used to implement one or more steps of a method provided by an exemplary embodiment. The computing module 18 includes one or more non-transitory computer-readable media for storing one or more computer-executable instructions or software for implementing an exemplary embodiment. The non-transitory computer-readable media may include, but are not limited to, one or more types of hardware memory, non-transitory tangible media (e.g., one or more magnetic storage disks, one or more optical disks, one or more USB flash drives), etc. For example, the memory 1006 included in the computing module 18 may store computer-readable and computer-executable instructions or software for implementing an exemplary embodiment (e.g., system code 2000). The computing module 18 also includes a processor 22 and associated cores 1004 for executing the computer-readable and computer-executable instructions or software stored in the memory 1006, as well as other programs for controlling the system hardware. The processor 22 may be a single-core processor or a multi-core (1004) processor.

メモリ１００６は、ＤＲＡＭ、ＳＲＡＭ、ＥＤＯＲＡＭなどのコンピュータシステムメモリ又はランダムアクセスメモリを含み得る。メモリ１００６は、他のタイプのメモリ、又はそれらの組み合わせも含み得る。ユーザは、グラフィカルユーザインターフェース（ＧＵＩ）３９を表示できるタッチスクリーンディスプレイ又はコンピュータモニタなどのディスプレイ１２を介して、コンピューティングモジュール１８と相互作用することができる。ディスプレイ１２はまた、例示的な実施形態に関連付けられた他の態様、変換器、及び／又は情報若しくはデータを表示することができる。コンピューティングモジュール１８は、ユーザからの入力を受信するための他のＩ／Ｏデバイス、例えば、キーボード又は任意の適切なマルチポイントタッチインターフェース１００８、ポインティングデバイス１０１０（例えば、ペン、スタイラス、マウス、又はトラックパッド）を含み得る。キーボード１００８及びポインティングデバイス１０１０は、視覚的表示デバイス１２に連結され得る。コンピューティングモジュール１８は、他の適切な従来のＩ／Ｏ周辺装置を含み得る。 The memory 1006 may include computer system memory or random access memory, such as DRAM, SRAM, EDO RAM, etc. The memory 1006 may also include other types of memory, or combinations thereof. A user may interact with the computing module 18 through a display 12, such as a touchscreen display or computer monitor capable of displaying a graphical user interface (GUI) 39. The display 12 may also display other aspects, transducers, and/or information or data associated with the exemplary embodiments. The computing module 18 may include other I/O devices for receiving input from a user, such as a keyboard or any suitable multi-point touch interface 1008, and a pointing device 1010 (e.g., a pen, stylus, mouse, or trackpad). The keyboard 1008 and pointing device 1010 may be coupled to the visual display device 12. The computing module 18 may include other suitable conventional I/O peripherals.

コンピューティングモジュール１８はまた、本明細書に記載される外科手術ロボットシステム１０、又はその一部の例示的な動作／ステップを実行するデータ及びコンピュータ可読命令、アプリケーション、及び／又はソフトウェア（これらは、ディスプレイ１２上にＧＵＩ３９を生成するために実行され得る）を記憶するための、ハードドライブ、ＣＤ－ＲＯＭ、又は他のコンピュータ可読媒体などの、１つ以上の記憶装置２４を含むことができる。例示的な記憶装置２４はまた、例示的な実施形態を実施するために必要な任意の適切な情報を記憶するための１つ以上のデータベースを記憶することができる。データベースは、データベース内の１つ以上のアイテムを追加、削除、又は更新するために、ユーザによって、又は任意の適切な時点で自動的に更新され得る。例示的な記憶装置２４は、供給されたデータ、及び本明細書に記載のシステム及び方法の例示的な実施形態を実施するために使用される他のデータ／情報を記憶するための１つ以上のデータベース１０２６を格納することができる。 The computing module 18 may also include one or more storage devices 24, such as a hard drive, CD-ROM, or other computer-readable medium, for storing data and computer-readable instructions, applications, and/or software (which may be executed to generate the GUI 39 on the display 12) that perform the exemplary operations/steps of the surgical robotic system 10, or portions thereof, described herein. The exemplary storage device 24 may also store one or more databases for storing any suitable information necessary to implement the exemplary embodiments. The databases may be updated by a user to add, delete, or update one or more items in the databases, or automatically at any suitable time. The exemplary storage device 24 may store one or more databases 1026 for storing provisioned data and other data/information used to implement the exemplary embodiments of the systems and methods described herein.

The computing module 18 may include a network interface 1012 configured to interface, via one or more network devices 1020, with one or more networks, such as a local area network (LAN), a wide area network (WAN), or the Internet, through various connections including, but not limited to, standard telephone lines, LAN or WAN links (e.g., 802.11, T1, T3, 56 kb, X.25), broadband connections (e.g., ISDN, Frame Relay, ATM), wireless connections, controller area networks (CAN), or combinations of any or all of the above. The network interface 1012 may include an internal network adapter, a network interface card, a PCMCIA network card, a card bus network adapter, a wireless network adapter, a USB network adapter, a modem, or any other device suitable for interfacing the computing module 18 to any type of network with which it can communicate and for performing the operations described herein.
Furthermore, computing module 18 may be any computer system, such as a workstation, desktop computer, server, laptop, handheld computer, tablet computer (e.g., an iPad® tablet computer), mobile computing or communication device (e.g., an iPhone® communication device), or other form of computing or communication device capable of communications and having sufficient processor power and memory capacity to perform the operations described herein.

The computing module 18 may run any operating system 1016, such as any version of the Microsoft® Windows® operating system, different releases of the Unix and Linux operating systems, any version of MacOS® for Macintosh computers, any embedded operating system, any real-time operating system, any open source operating system, any proprietary operating system, any operating system for a mobile computing device, or any other operating system capable of running on a computing device and performing the operations described herein. In some embodiments, the operating system 1016 may run in native mode or in an emulated mode. In some embodiments, the operating system 1016 may run on one or more cloud machine instances.

The computing module 18 may also include an antenna 1030, which may transmit wireless transmissions to, and receive wireless transmissions from, a radio frequency (RF) front end.

FIG. 11B illustrates exemplary system code 2000 that may be executable by the computing module 18, according to some embodiments. The system code 2000 (non-transitory, computer-readable instructions) may be stored on a computer-readable medium, such as the storage device 24 and/or the memory 1006, and may be executable by the hardware processor 22 of the computing module 18. The system code 2000 may include various custom-written software modules that perform the steps/processes described herein, including, but not limited to, a data collection module 2100 that collects the input 310 of FIG. 7, an anatomy segmentation and tracking module 300, an encoder 2200 that includes a pose encoder 322 and a visual encoder 326, a state representation aggregation module 2300 that aggregates pose representations 324 and visual representations 328, a structure segmentation decoder 332, a classifier 334, a confidence module 346, a tracking module 360, and a training engine 2400. Each component of system 100 is described with respect to FIG. 7.
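As an illustration of what the state representation aggregation module 2300 might compute, the following minimal sketch shows the two aggregation options recited in the claims: an equally weighted or weighted average, and concatenation. It assumes both representations are plain fixed-length feature vectors; the function names and the `w_visual` parameter are hypothetical and do not appear in the specification.

```python
from typing import List, Sequence


def aggregate_average(visual: Sequence[float],
                      pose: Sequence[float],
                      w_visual: float = 0.5) -> List[float]:
    """Element-wise weighted average of two equal-length vectors.

    w_visual = 0.5 (hypothetical default) weights both inputs equally;
    any other value in [0, 1] produces a weighted average.
    """
    if len(visual) != len(pose):
        raise ValueError("averaging requires vectors of equal length")
    if not 0.0 <= w_visual <= 1.0:
        raise ValueError("w_visual must lie in [0, 1]")
    w_pose = 1.0 - w_visual
    return [w_visual * v + w_pose * p for v, p in zip(visual, pose)]


def aggregate_concat(visual: Sequence[float],
                     pose: Sequence[float]) -> List[float]:
    """Concatenate the vectors; the result grows with every input added."""
    return list(visual) + list(pose)


if __name__ == "__main__":
    visual_repr = [1.0, 3.0]   # stand-in for visual representation 328
    pose_repr = [3.0, 5.0]     # stand-in for pose representation 324
    print(aggregate_average(visual_repr, pose_repr))
    print(aggregate_average(visual_repr, pose_repr, w_visual=0.75))
    print(aggregate_concat(visual_repr, pose_repr))
```

Averaging keeps the state representation at a fixed size, whereas concatenation preserves each input separately at the cost of a state representation that grows with every representation appended.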

The system code 2000 may be programmed using any suitable programming language, including, but not limited to, C, C++, C#, Java, and Python. Additionally, the system code 2000 may be distributed across multiple computer systems communicating with each other via a communications network, and/or stored and executed on a cloud computing platform and remotely accessed by computer systems in communication with the cloud platform.

FIG. 12 is a diagram illustrating computer hardware and network components on which the system 1100 may be implemented. The system 1100 may include the surgical robotic system 10 and a plurality of computation servers 1102a-1102n, each having at least one processor (e.g., one or more graphics processing units (GPUs), microprocessors, central processing units (CPUs), tensor processing units (TPUs), application-specific integrated circuits (ASICs), etc.) and memory for executing the computer instructions and methods described above (which may be embodied as the system code 2000). The system 1100 may also include a plurality of data storage servers 1104a-1104n for storing data. The computation servers 1102a-1102n, the data storage servers 1104a-1104n, and the surgical robotic system 10 accessed by a user 1112 may communicate via a communications network 1108.

10 Surgical robotic system
11 Operator console
12 Display
14 Image computation module
16 Tracking module
16A Tracking module
17 Hand controller
17A Left hand controller
17B Right hand controller
18 Computing module
19 Foot pedal array
20 Robot subsystem
21 Robot arm subassembly
22 Processor
23A Left hand controller subsystem
23B Right hand controller subsystem
24 Storage
26 Controller
30 Image rendering device
34, 34A Position data
36 External data
37 External sensor
39 Graphical user interface
40 Motor
42 Robot arm assembly
42A First robot arm
42B Second robot arm
44 Camera assembly
45 End effector
46 Robot support system
47 Camera
48 Image data
50 Trocar
100 Patient
102 Operating table
104 Internal cavity
120 Instrument tip
122 Shaft
124 Housing
126 Shoulder joint
128 Elbow joint
130 Wrist joint
132 Sensor
140 Virtual chest
142A First pivot point
142B Second pivot point
144 Camera imaging center
146 Pivot center
150 Graphical user interface
151 Frustum
158 Simulated camera
165 and 166 Simulated robot arms
167 Frustum
168 Live video feed
171 and 172 Robot pose views
173 State identifier
175 State identifier
176 Iconographic symbol
179 Iconographic symbol
191 and 192 Simulated robot arms
193 Simulated camera
198 Left pillarbox
199 Right pillarbox
200 Incarcerated hernia
201 Left hand controller
202 Right hand controller
210 Blood vessel
220 Anatomical space
300 Tracking module
310 Input
312 Orientation data
314 Image
320 Machine learning model
322 Pose encoder
324 Classifier
326 Visual encoder
328 Visual representation
330 State representation
332 Decoder
334 Classifier
340 Output
342 Segmentation map
344 Anatomical landmarks
346 Confidence module
360 Tracking module
362A Data processing
362B Data processing
362N Data processing
1004 Core
1006 Memory
1008 Multi-point touch interface
1010 Pointing device
1012 Network interface
1016 Operating system
1020 Network device
1026 Database
1030 Antenna
1100 System
1102a-1102n Remote computation servers
1104a-1104n Data storage servers
1108 Communication network
1112 User
2000 System code
2100 Data collection module
2200 Encoder
2300 State representation aggregation module
2400 Training module

Claims

1. A surgical robotic system, comprising:
a robotic assembly comprising:
a camera assembly configured to generate one or more images of an interior cavity of a subject; and
a robotic arm assembly to be disposed within the interior cavity to perform a surgical procedure;
a memory storing one or more instructions; and
a processor configured or programmed to read the one or more instructions stored in the memory, the processor operably coupled to the robotic assembly to:
receive an image from the camera assembly, the image including a representation of one or more anatomical structures of the subject;
extract a visual representation from the image, the visual representation being a compact representation of the image;
determine position and orientation data associated with the robotic assembly, the position and orientation data indicating a pose of the robotic assembly;
generate a pose representation of the robotic assembly based at least in part on the position and orientation data;
generate a state representation based at least in part on the visual representation and the pose representation, the state representation representing a state of the surgical robotic system;
identify one or more anatomical landmarks within an anatomical space in which the robotic assembly is operating based at least in part on the state representation; and
generate a plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures should avoid contact with the robotic assembly.

2. The surgical robotic system of claim 1, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to execute a machine learning model to:
identify the one or more anatomical landmarks in the image based on the state representation; and
generate the plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures should avoid contact with the robotic assembly.

3. The surgical robotic system of claim 1, wherein the state representation is generated by aggregating the visual representation and the pose representation.

4. The surgical robotic system of claim 3, wherein the visual representation and the pose representation are aggregated by averaging the visual representation and the pose representation with each weighted equally, or by generating a weighted average of the visual representation and the pose representation.

5. The surgical robotic system of claim 3, wherein the visual representation and the pose representation are aggregated by concatenating the visual representation and the pose representation such that the size of the state representation increases with each visual representation and pose representation added.

6. The surgical robotic system of claim 1, wherein the one or more anatomical landmarks include at least one of: the inguinal triangle, which refers to a region of the abdominal wall; the triangle of doom, which refers to an anatomical triangle bounded medially by the vas deferens, laterally by the testicular vessels, and inferiorly by the peritoneal fold; or the triangle of pain, which refers to a region bounded by the iliopubic tract, the testicular vessels, and the peritoneal fold.

7. The surgical robotic system of claim 1, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to:
create an interdependency between the plurality of segmentation maps and the identified one or more anatomical landmarks; and
adjust weights of the plurality of segmentation maps and the identified one or more anatomical landmarks based at least in part on the interdependency.

8. A surgical robotic system, comprising:
a robotic assembly comprising:
a camera assembly configured to generate a plurality of images of an interior cavity of a subject; and
a robotic arm assembly to be disposed within the interior cavity to perform a surgical procedure;
a memory storing one or more instructions; and
a processor configured or programmed to read the one or more instructions stored in the memory, the processor operably coupled to the robotic assembly to:
receive an image of the plurality of images, the image including a representation of one or more anatomical structures of the subject;
extract a visual representation from the image, the visual representation being a compact representation of the image;
determine position and orientation data associated with the robotic assembly, the position and orientation data indicating a pose of the robotic assembly;
generate a pose representation of the robotic assembly based at least in part on the position and orientation data;
generate a state representation based at least in part on the visual representation and the pose representation, the state representation representing a current state of the surgical robotic system;
determine a similarity between the state representation and a previous state representation, the previous state representation being generated based at least in part on a previous pose representation extracted from previous position and orientation data and a previous visual representation extracted from a previous image, the previous image being one of the plurality of images and temporally preceding the image;
responsive to determining that the similarity is greater than or equal to a similarity threshold, average the state representation and the previous state representation to generate an averaged state representation;
identify one or more anatomical landmarks based at least in part on the averaged state representation; and
generate a plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures should avoid contact with the robotic assembly.
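The similarity-gated averaging recited in claim 8 can be sketched as follows. This is an illustrative sketch only: the claim does not fix a similarity metric, so cosine similarity is assumed here, and the threshold value, the function names, and the fallback of returning the current state unchanged when the threshold is not met are all hypothetical choices.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (assumed similarity metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def fuse_states(current: Sequence[float],
                previous: Sequence[float],
                threshold: float = 0.9) -> List[float]:
    """Average the two state vectors only when they are similar enough.

    The fallback branch (keep the current state) is an assumption; the
    claim only specifies what happens when the threshold is met.
    """
    if len(current) != len(previous):
        raise ValueError("state representations must have equal length")
    if cosine_similarity(current, previous) >= threshold:
        return [(c + p) / 2.0 for c, p in zip(current, previous)]
    return list(current)


if __name__ == "__main__":
    print(fuse_states([2.0, 0.0], [4.0, 0.0]))  # similar states: averaged
    print(fuse_states([1.0, 0.0], [0.0, 1.0]))  # dissimilar: current kept
```

Gating the average on similarity means that frames showing nearly the same field of view reinforce one another, while a sudden change of view does not blend unrelated states.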

9. The surgical robotic system of claim 8, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to:
determine a three-dimensional (3D) reconstruction of an anatomical space in which the robotic assembly is operating; and
determine a location of each of one or more identified anatomical structures within the anatomical space based at least in part on the 3D reconstruction.

10. The surgical robotic system of claim 8, wherein the plurality of images includes a plurality of video frames.

11. The surgical robotic system of claim 8, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to execute a machine learning model to:
identify the one or more anatomical landmarks based at least in part on the averaged state representation; and
generate the plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures should avoid contact with the robotic assembly.

12. The surgical robotic system of claim 8, wherein the state representation is generated by aggregating the visual representation and the pose representation.

13. The surgical robotic system of claim 12, wherein the visual representation and the pose representation are aggregated by averaging the visual representation and the pose representation with each weighted equally, or by generating a weighted average of the visual representation and the pose representation.

14. The surgical robotic system of claim 12, wherein the visual representation and the pose representation are aggregated by concatenating the visual representation and the pose representation such that the size of the state representation increases with each visual representation and pose representation added.

15. The surgical robotic system of claim 8, wherein the one or more anatomical landmarks include at least one of: the inguinal triangle, which refers to a region of the abdominal wall; the triangle of doom, which refers to an anatomical triangle bounded medially by the vas deferens, laterally by the testicular vessels, and inferiorly by the peritoneal fold; or the triangle of pain, which refers to a region bounded by the iliopubic tract, the testicular vessels, and the peritoneal fold.

16. The surgical robotic system of claim 8, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to:
create an interdependency between the plurality of segmentation maps and the identified one or more anatomical landmarks; and
adjust weights of the plurality of segmentation maps and the identified one or more anatomical landmarks based at least in part on the interdependency.

17. The surgical robotic system of claim 8, wherein the similarity between the state representation and the previous state representation is determined by determining whether adjacent images of the plurality of images over a period of time exhibit similar fields of view.

18. The surgical robotic system of claim 8, wherein the processor is further configured or programmed to read the one or more instructions stored in the memory to update the plurality of segmentation maps and the identified one or more anatomical landmarks in real time to reflect changes in the field of view of the camera assembly.

19. A computer-implemented method for training a surgical robotic system to automatically identify anatomical structures, comprising:
training a machine learning model based at least in part on a training set having a plurality of labeled images and known position and orientation data associated with a robotic assembly of the surgical robotic system, wherein each labeled image has one or more labeled anatomical structures and one or more labeled anatomical landmarks, and the known position and orientation data indicates a pose of the robotic assembly; and
deploying the trained machine learning model to:
identify one or more anatomical landmarks within an anatomical space in which the robotic assembly is operating; and
generate a plurality of segmentation maps, each segmentation map identifying which of the one or more anatomical structures should avoid contact with the robotic assembly.

20. The computer-implemented method of claim 19, wherein training the machine learning model includes adjusting one or more parameters of the machine learning model to reduce differences between the one or more anatomical landmarks identified by the machine learning model and the corresponding labeled anatomical landmarks, and between the one or more anatomical structures in each segmentation map generated by the machine learning model and the corresponding labeled anatomical structures.