KR20250015581A - Method for performing hand tracking and wearable electronic device supporting the same

Info

Publication number: KR20250015581A
Application number: KR1020230107703A
Authority: KR
Inventors: 김성오; 강진모; 구본곤; 류지수; 박성권; 염동현
Original assignee: 삼성전자주식회사
Priority date: 2023-07-20
Filing date: 2023-08-17
Publication date: 2025-02-03

Abstract

A wearable electronic device according to one embodiment may include a camera including a first camera and a second camera, a display, and at least one processor operatively connected to the camera and the display. The at least one processor may be configured to obtain a position of a hand of a user based on at least one first image acquired through the first camera. The at least one processor may be configured to obtain a gaze point of the user based on at least one second image acquired through the second camera. The at least one processor may be configured to determine whether the position of the hand corresponds to the gaze point. The at least one processor may be configured to obtain, based on the position of the hand corresponding to the gaze point, a skeleton including key points related to the hand. The at least one processor may be configured to perform an operation related to the hand based on the obtained skeleton.

Description

METHOD FOR PERFORMING HAND TRACKING AND WEARABLE ELECTRONIC DEVICE SUPPORTING THE SAME

The present disclosure relates to a method for performing hand tracking and a wearable electronic device supporting the same.

The variety of services and additional functions provided through wearable electronic devices, such as augmented reality (AR) glasses, virtual reality (VR) glasses, and head mounted display (HMD) devices, is gradually increasing. To increase the utility of these wearable electronic devices and satisfy the needs of various users, communication service providers and wearable electronic device manufacturers are competitively developing wearable electronic devices that provide various functions and differentiate them from those of other companies. Accordingly, the functions provided through wearable electronic devices are also becoming increasingly advanced.

A wearable electronic device may interact with a user through various methods. For example, the wearable electronic device may track the user's hand (the position of the hand and/or the movement of the hand) and perform an operation of recognizing a gesture based on the tracked hand (and/or an operation of displaying a virtual hand corresponding to the hand) (hereinafter referred to as "hand tracking").

The above information may be provided as related art for the purpose of aiding understanding of the present disclosure. No assertion or determination is made as to whether any of the above is applicable as prior art with regard to the present disclosure.

Hand tracking may include a plurality of operations. For example, hand tracking may include an operation of obtaining a position of a hand within an image (or depth map) acquired through a camera (or sensor), an operation of obtaining a skeleton related to the hand based on the position of the hand, an operation of recognizing a gesture indicated by the hand, and/or an operation of rendering a virtual hand corresponding to the hand.
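The plurality of operations listed above can be pictured as a staged pipeline. The following is a minimal Python sketch under stated assumptions: the names `HandTrackingResult` and `run_hand_tracking` and the four stage callables are hypothetical placeholders introduced for illustration, not functions defined in this disclosure, and the early return when no hand is found is an assumed behavior.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class HandTrackingResult:
    """Outputs of the four stages (all names are illustrative)."""
    hand_position: Optional[Point] = None    # stage 1: position of the hand
    skeleton: Optional[List[Point]] = None   # stage 2: key points of the hand
    gesture: Optional[str] = None            # stage 3: recognized gesture
    rendered: bool = False                   # stage 4: virtual hand rendered


def run_hand_tracking(
    image: object,
    locate_hand: Callable[[object], Optional[Point]],
    estimate_skeleton: Callable[[object, Point], List[Point]],
    recognize_gesture: Callable[[List[Point]], str],
    render_virtual_hand: Callable[[List[Point]], None],
) -> HandTrackingResult:
    """Run the stages in order; each stage is supplied by the caller."""
    result = HandTrackingResult()
    result.hand_position = locate_hand(image)
    if result.hand_position is None:
        # Assumption: with no hand in the frame, later stages are skipped.
        return result
    result.skeleton = estimate_skeleton(image, result.hand_position)
    result.gesture = recognize_gesture(result.skeleton)
    render_virtual_hand(result.skeleton)
    result.rendered = True
    return result
```

Passing the stages as callables keeps the sketch independent of any particular detector or renderer; each stage can be replaced by a real model without changing the control flow.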

A wearable electronic device performs the plurality of operations included in hand tracking when an input for performing hand tracking is received. In this case, because the wearable electronic device performs the plurality of operations included in hand tracking regardless of the interaction with the user or the user's intention (or level of interest), the wearable electronic device may consume a large amount of power.

The present disclosure relates to a method for performing hand tracking that takes into account a user's gaze point (and/or a position of an object), and a wearable electronic device supporting the same.

The technical problems to be solved by the present disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from the description below by a person having ordinary skill in the technical field to which the present invention belongs.

A method for performing hand tracking in a wearable electronic device according to one embodiment may include an operation of obtaining a position of a hand of a user based on at least one first image acquired through a first camera of the wearable electronic device. The method may include an operation of obtaining a gaze point of the user based on at least one second image acquired through a second camera of the wearable electronic device. The method may include an operation of determining whether the position of the hand corresponds to the gaze point. The method may include an operation of obtaining, based on the position of the hand corresponding to the gaze point, a skeleton including key points related to the hand. The method may include an operation of performing an operation related to the hand based on the obtained skeleton.
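The gating step of this method, in which the skeleton is obtained only when the hand position corresponds to the gaze point, might be sketched as follows. This passage does not define "corresponds", so the test used here (the gaze point lying inside the hand's bounding box, optionally widened by a margin) is an assumption made only for illustration. The names `hand_corresponds_to_gaze`, `gated_skeleton`, and the `estimate_skeleton` callable are likewise hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def hand_corresponds_to_gaze(hand_box: Box, gaze: Point, margin: float = 0.0) -> bool:
    """Assumed criterion: the gaze point falls inside the (expanded) hand box."""
    x_min, y_min, x_max, y_max = hand_box
    gx, gy = gaze
    inside_x = x_min - margin <= gx <= x_max + margin
    inside_y = y_min - margin <= gy <= y_max + margin
    return inside_x and inside_y


def gated_skeleton(
    hand_box: Optional[Box],
    gaze: Optional[Point],
    estimate_skeleton: Callable[[Box], List[Point]],
    margin: float = 0.0,
) -> Optional[List[Point]]:
    """Run the costly skeleton stage only when the hand is being looked at."""
    if hand_box is None or gaze is None:
        return None
    if not hand_corresponds_to_gaze(hand_box, gaze, margin):
        return None  # skeleton estimation is skipped entirely
    return estimate_skeleton(hand_box)
```

Because `estimate_skeleton` is never called when the check fails, the skeleton stage and everything that depends on it is not executed for a hand the user is not looking at, which is the source of the power saving the disclosure describes.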

In one embodiment, in a non-transitory computer-readable medium having computer-executable instructions recorded thereon, the computer-executable instructions, when executed, may cause a wearable electronic device including at least one processor to obtain a position of a hand of a user based on at least one first image acquired through a first camera of the wearable electronic device. The computer-executable instructions, when executed, may cause the wearable electronic device to obtain a gaze point of the user based on at least one second image acquired through a second camera of the wearable electronic device. The computer-executable instructions, when executed, may cause the wearable electronic device to determine whether the position of the hand corresponds to the gaze point. The computer-executable instructions, when executed, may cause the wearable electronic device to obtain, based on the position of the hand corresponding to the gaze point, a skeleton including key points related to the hand. The computer-executable instructions, when executed, may cause the wearable electronic device to perform an operation related to the hand based on the obtained skeleton.

A method for performing hand tracking and a wearable electronic device supporting the same, according to one embodiment of the present disclosure, may reduce the power consumed by the wearable electronic device by performing hand tracking in consideration of a gaze point (and/or a position of an object).

FIG. 1 is a block diagram of an electronic device within a network environment according to one embodiment.
FIG. 2 is a perspective view illustrating the internal configuration of a wearable electronic device according to an embodiment of the present disclosure.
FIG. 3A is a diagram illustrating a front side of a wearable electronic device according to one embodiment.
FIG. 3B is a diagram illustrating a rear side of a wearable electronic device according to one embodiment.
FIG. 4 is a block diagram of a wearable electronic device according to one embodiment.
FIG. 5 is a flowchart illustrating a method of performing hand tracking according to one embodiment.
FIG. 6 is a diagram illustrating a method for obtaining a position of a user's hand according to one embodiment.
FIG. 7 is a diagram illustrating a method for obtaining a user's gaze point according to one embodiment.
FIG. 8 is a drawing for explaining an operation of obtaining a skeleton according to one embodiment.
FIG. 9 is a diagram for explaining an operation of rendering a virtual hand according to one embodiment.
FIG. 10 is a flowchart illustrating a method of performing hand tracking according to one embodiment.
FIG. 11 is a flowchart illustrating a method of performing hand tracking according to one embodiment.
FIG. 12 is a diagram for explaining a method of performing hand tracking according to one embodiment.
FIG. 13 is a flowchart illustrating a method of performing hand tracking according to one embodiment.

FIG. 1 is a block diagram of an electronic device (101) within a network environment (100), according to one embodiment.

Referring to FIG. 1, in a network environment (100), an electronic device (101) may communicate with an electronic device (102) via a first network (198) (e.g., a short-range wireless communication network), or may communicate with at least one of an electronic device (104) or a server (108) via a second network (199) (e.g., a long-range wireless communication network). According to one embodiment, the electronic device (101) may communicate with the electronic device (104) via the server (108). According to one embodiment, the electronic device (101) may include a processor (120), a memory (130), an input module (150), an audio output module (155), a display module (160), an audio module (170), a sensor module (176), an interface (177), a connection terminal (178), a haptic module (179), a camera module (180), a power management module (188), a battery (189), a communication module (190), a subscriber identification module (196), or an antenna module (197). In some embodiments, at least one of these components (e.g., the connection terminal (178)) may be omitted from the electronic device (101), or one or more other components may be added. In some embodiments, some of these components (e.g., the sensor module (176), the camera module (180), or the antenna module (197)) may be integrated into one component (e.g., the display module (160)).

The processor (120) may, for example, execute software (e.g., a program (140)) to control at least one other component (e.g., a hardware or software component) of the electronic device (101) connected to the processor (120), and may perform various data processing or computations. According to one embodiment, as at least a part of the data processing or computations, the processor (120) may store a command or data received from another component (e.g., the sensor module (176) or the communication module (190)) in a volatile memory (132), process the command or data stored in the volatile memory (132), and store the resulting data in a nonvolatile memory (134). According to one embodiment, the processor (120) may include a main processor (121) (e.g., a central processing unit or an application processor) or an auxiliary processor (123) (e.g., a graphics processing unit, a neural processing unit (NPU), an image signal processor, a sensor hub processor, or a communication processor) that can operate independently of or together with the main processor. For example, when the electronic device (101) includes the main processor (121) and the auxiliary processor (123), the auxiliary processor (123) may be configured to use less power than the main processor (121) or to be specialized for a designated function. The auxiliary processor (123) may be implemented separately from the main processor (121) or as a part thereof.

The auxiliary processor (123) may control at least some of the functions or states related to at least one of the components of the electronic device (101) (e.g., the display module (160), the sensor module (176), or the communication module (190)), for example, in place of the main processor (121) while the main processor (121) is in an inactive (e.g., sleep) state, or together with the main processor (121) while the main processor (121) is in an active (e.g., application execution) state. According to one embodiment, the auxiliary processor (123) (e.g., an image signal processor or a communication processor) may be implemented as a part of another functionally related component (e.g., the camera module (180) or the communication module (190)). According to one embodiment, the auxiliary processor (123) (e.g., a neural processing unit) may include a hardware structure specialized for processing an artificial intelligence model. The artificial intelligence model may be generated through machine learning. Such learning may be performed, for example, on the electronic device (101) itself on which the artificial intelligence model is executed, or may be performed through a separate server (e.g., the server (108)). The learning algorithm may include, for example, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning, but is not limited to these examples. The artificial intelligence model may include a plurality of artificial neural network layers. The artificial neural network may be one of a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), deep Q-networks, or a combination of two or more of the above, but is not limited to these examples. The artificial intelligence model may additionally or alternatively include a software structure in addition to the hardware structure.
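The division of control between the main processor (121) and the auxiliary processor (123) described above can be illustrated with a small sketch. The `MainState` enum and the `controllers_for` function are hypothetical names introduced here; the sketch only encodes the two cases named in the text (auxiliary alone while the main processor is inactive, both together while it is active).

```python
from enum import Enum
from typing import Tuple


class MainState(Enum):
    INACTIVE = "inactive"  # e.g., sleep
    ACTIVE = "active"      # e.g., executing an application


def controllers_for(main_state: MainState) -> Tuple[str, ...]:
    """Return which processor(s) control a component such as the display module."""
    if main_state is MainState.INACTIVE:
        # The auxiliary processor acts in place of the sleeping main processor.
        return ("auxiliary",)
    # While active, the main processor controls the component together with
    # the auxiliary processor.
    return ("main", "auxiliary")
```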

The memory (130) may store various data used by at least one component (e.g., the processor (120) or the sensor module (176)) of the electronic device (101). The data may include, for example, software (e.g., the program (140)) and input data or output data for commands related thereto. The memory (130) may include the volatile memory (132) or the nonvolatile memory (134).

The program (140) may be stored as software in the memory (130) and may include, for example, an operating system (142), middleware (144), or an application (146).

The input module (150) may receive a command or data to be used by a component of the electronic device (101) (e.g., the processor (120)) from outside the electronic device (101) (e.g., from a user). The input module (150) may include, for example, a microphone, a mouse, a keyboard, a key (e.g., a button), or a digital pen (e.g., a stylus pen).

The audio output module (155) may output an audio signal to the outside of the electronic device (101). The audio output module (155) may include, for example, a speaker or a receiver. The speaker may be used for general purposes, such as multimedia playback or playback of recordings. The receiver may be used to receive incoming calls. According to one embodiment, the receiver may be implemented separately from the speaker or as a part thereof.

The display module (160) may visually provide information to the outside of the electronic device (101) (e.g., to a user). The display module (160) may include, for example, a display, a hologram device, or a projector, and a control circuit for controlling the corresponding device. According to one embodiment, the display module (160) may include a touch sensor configured to detect a touch, or a pressure sensor configured to measure the intensity of a force generated by the touch.

The audio module (170) may convert sound into an electrical signal or, conversely, convert an electrical signal into sound. According to one embodiment, the audio module (170) may obtain sound through the input module (150), or may output sound through the audio output module (155) or through an external electronic device (e.g., the electronic device (102), such as a speaker or headphones) connected directly or wirelessly to the electronic device (101).

The sensor module (176) may detect an operating state (e.g., power or temperature) of the electronic device (101) or an external environmental state (e.g., a user state), and may generate an electrical signal or data value corresponding to the detected state. According to one embodiment, the sensor module (176) may include, for example, a gesture sensor, a gyro sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an infrared (IR) sensor, a biometric sensor, a temperature sensor, a humidity sensor, or an illuminance sensor.

The interface (177) may support one or more designated protocols that can be used for the electronic device (101) to connect directly or wirelessly to an external electronic device (e.g., the electronic device (102)). According to one embodiment, the interface (177) may include, for example, a high definition multimedia interface (HDMI), a universal serial bus (USB) interface, an SD card interface, or an audio interface.

The connection terminal (178) may include a connector through which the electronic device (101) can be physically connected to an external electronic device (e.g., the electronic device (102)). According to one embodiment, the connection terminal (178) may include, for example, an HDMI connector, a USB connector, an SD card connector, or an audio connector (e.g., a headphone connector).

The haptic module (179) may convert an electrical signal into a mechanical stimulus (e.g., vibration or movement) or an electrical stimulus that a user can perceive through the sense of touch or kinesthesia. According to one embodiment, the haptic module (179) may include, for example, a motor, a piezoelectric element, or an electrical stimulation device.

The camera module (180) may capture still images and videos. According to one embodiment, the camera module (180) may include one or more lenses, image sensors, image signal processors, or flashes.

The power management module (188) may manage power supplied to the electronic device (101). According to one embodiment, the power management module (188) may be implemented, for example, as at least a part of a power management integrated circuit (PMIC).

The battery (189) may supply power to at least one component of the electronic device (101). According to one embodiment, the battery (189) may include, for example, a non-rechargeable primary cell, a rechargeable secondary cell, or a fuel cell.

The communication module (190) may support establishment of a direct (e.g., wired) communication channel or a wireless communication channel between the electronic device (101) and an external electronic device (e.g., the electronic device (102), the electronic device (104), or the server (108)), and communication through the established communication channel. The communication module (190) may operate independently of the processor (120) (e.g., an application processor) and may include one or more communication processors that support direct (e.g., wired) communication or wireless communication. According to one embodiment, the communication module (190) may include a wireless communication module (192) (e.g., a cellular communication module, a short-range wireless communication module, or a global navigation satellite system (GNSS) communication module) or a wired communication module (194) (e.g., a local area network (LAN) communication module or a power line communication module). Among these communication modules, the corresponding communication module may communicate with the external electronic device (104) via the first network (198) (e.g., a short-range communication network such as Bluetooth, wireless fidelity (WiFi) direct, or infrared data association (IrDA)) or the second network (199) (e.g., a long-range communication network such as a legacy cellular network, a 5G network, a next-generation communication network, the Internet, or a computer network (e.g., a LAN or WAN)). These various types of communication modules may be integrated into a single component (e.g., a single chip) or implemented as a plurality of separate components (e.g., a plurality of chips). The wireless communication module (192) may identify or authenticate the electronic device (101) within a communication network, such as the first network (198) or the second network (199), using subscriber information (e.g., an international mobile subscriber identity (IMSI)) stored in the subscriber identification module (196).

The wireless communication module (192) may support a 5G network following a 4G network and next-generation communication technology, for example, new radio (NR) access technology. The NR access technology may support high-speed transmission of high-capacity data (enhanced mobile broadband (eMBB)), minimization of terminal power and connection of a large number of terminals (massive machine type communications (mMTC)), or high reliability and low latency (ultra-reliable and low-latency communications (URLLC)). The wireless communication module (192) may support, for example, a high-frequency band (e.g., the mmWave band) to achieve a high data rate. The wireless communication module (192) may support various technologies for securing performance in the high-frequency band, such as beamforming, massive multiple-input and multiple-output (massive MIMO), full dimensional MIMO (FD-MIMO), array antennas, analog beamforming, or large-scale antennas. The wireless communication module (192) may support various requirements specified for the electronic device (101), an external electronic device (e.g., the electronic device (104)), or a network system (e.g., the second network (199)). According to one embodiment, the wireless communication module (192) may support a peak data rate (e.g., 20 Gbps or more) for realizing eMBB, a loss coverage (e.g., 164 dB or less) for realizing mMTC, or a U-plane latency (e.g., 0.5 ms or less for each of the downlink (DL) and the uplink (UL), or 1 ms or less for a round trip) for realizing URLLC.
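The example thresholds in the preceding paragraph can be written down as simple checks. The sketch below is illustrative only: the `LinkMetrics` fields and the three functions are hypothetical names, and treating the round-trip latency as the sum of the downlink and uplink latencies is an assumption not stated in the text.

```python
from dataclasses import dataclass


@dataclass
class LinkMetrics:
    """Measured values for one link (field names are illustrative)."""
    peak_data_rate_gbps: float
    coverage_loss_db: float
    dl_latency_ms: float
    ul_latency_ms: float


def meets_embb(m: LinkMetrics) -> bool:
    # Peak data rate of 20 Gbps or more.
    return m.peak_data_rate_gbps >= 20.0


def meets_mmtc(m: LinkMetrics) -> bool:
    # Loss coverage of 164 dB or less.
    return m.coverage_loss_db <= 164.0


def meets_urllc(m: LinkMetrics) -> bool:
    # 0.5 ms or less for each of DL and UL, or 1 ms or less round trip.
    one_way_ok = m.dl_latency_ms <= 0.5 and m.ul_latency_ms <= 0.5
    # Assumption: round-trip latency is modeled as DL + UL.
    round_trip_ok = (m.dl_latency_ms + m.ul_latency_ms) <= 1.0
    return one_way_ok or round_trip_ok
```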

The antenna module (197) may transmit a signal or power to the outside (e.g., an external electronic device) or receive a signal or power from the outside. According to one embodiment, the antenna module (197) may include an antenna including a radiator made of a conductor or a conductive pattern formed on a substrate (e.g., a PCB). According to one embodiment, the antenna module (197) may include a plurality of antennas (e.g., an array antenna). In this case, at least one antenna suitable for the communication scheme used in a communication network, such as the first network (198) or the second network (199), may be selected from the plurality of antennas by, for example, the communication module (190). A signal or power may be transmitted or received between the communication module (190) and the external electronic device through the selected at least one antenna. According to some embodiments, a component other than the radiator (e.g., a radio frequency integrated circuit (RFIC)) may additionally be formed as a part of the antenna module (197).
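The selection of at least one suitable antenna from a plurality of antennas might look like the sketch below. The `Antenna` record, its `schemes` field, and the scheme labels are hypothetical; the text does not say how suitability is determined, so a simple membership test stands in for it here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Antenna:
    name: str
    schemes: FrozenSet[str]  # communication schemes this antenna suits


def select_antennas(antennas: Sequence[Antenna], scheme: str) -> List[Antenna]:
    """Pick every antenna suited to the scheme used on the current network."""
    return [a for a in antennas if scheme in a.schemes]


# Usage with made-up antennas and scheme labels.
array = [
    Antenna("ant0", frozenset({"bluetooth", "wifi"})),
    Antenna("ant1", frozenset({"5g-mmwave"})),
    Antenna("ant2", frozenset({"5g-mmwave", "lte"})),
]
selected = select_antennas(array, "5g-mmwave")
```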

According to various embodiments, the antenna module (197) may form an mmWave antenna module. According to one embodiment, the mmWave antenna module may include a printed circuit board, an RFIC disposed on or adjacent to a first surface (e.g., a bottom surface) of the printed circuit board and capable of supporting a designated high-frequency band (e.g., an mmWave band), and a plurality of antennas (e.g., an array antenna) disposed on or adjacent to a second surface (e.g., a top surface or a side surface) of the printed circuit board and capable of transmitting or receiving signals in the designated high-frequency band.

At least some of the above components may be connected to each other and exchange signals (e.g., commands or data) with each other via an inter-peripheral communication scheme (e.g., a bus, general purpose input and output (GPIO), serial peripheral interface (SPI), or mobile industry processor interface (MIPI)).

According to one embodiment, commands or data may be transmitted or received between the electronic device (101) and the external electronic device (104) via the server (108) connected to the second network (199). Each of the external electronic devices (102 or 104) may be a device of the same type as, or a different type from, the electronic device (101). According to one embodiment, all or some of the operations executed in the electronic device (101) may be executed in one or more of the external electronic devices (102, 104, or 108). For example, when the electronic device (101) needs to perform a function or service automatically, or in response to a request from a user or another device, the electronic device (101) may, instead of or in addition to executing the function or service itself, request one or more external electronic devices to perform at least a part of the function or service. The one or more external electronic devices that receive the request may execute at least a part of the requested function or service, or an additional function or service related to the request, and transmit the result of the execution to the electronic device (101). The electronic device (101) may provide the result, as is or after further processing, as at least a part of a response to the request. For this purpose, for example, cloud computing, distributed computing, mobile edge computing (MEC), or client-server computing technology may be used. The electronic device (101) may provide an ultra-low latency service by using, for example, distributed computing or mobile edge computing. In another embodiment, the external electronic device (104) may include an Internet of Things (IoT) device. The server (108) may be an intelligent server using machine learning and/or a neural network. According to one embodiment, the external electronic device (104) or the server (108) may be included in the second network (199). The electronic device (101) may be applied to intelligent services (e.g., smart home, smart city, smart car, or healthcare) based on 5G communication technology and IoT-related technology.
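The offloading behavior described above can be illustrated with a minimal sketch. The function name `run_function`, its parameters, and the fallback to local execution on a connection error are assumptions made for illustration; the description does not specify any particular interface or failure handling.

```python
from typing import Callable, Optional


def run_function(
    local: Callable[[], str],
    remote: Optional[Callable[[], str]] = None,
    post_process: Callable[[str], str] = lambda result: result,
) -> str:
    """Execute a function on the device or delegate it to an external device.

    `remote` stands in for a request sent to an external electronic device or
    server (hypothetical). `post_process` models the optional additional
    processing applied before the result is returned as the response.
    """
    if remote is not None:
        try:
            result = remote()  # ask the external device to do the work
        except ConnectionError:
            result = local()   # assumed fallback: run on the device itself
    else:
        result = local()
    return post_process(result)
```

A caller could pass a function that wraps a network request as `remote` and leave `post_process` at its default to return the external result as is.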

The electronic device according to various embodiments disclosed in this document may be one of various types of devices. The electronic device may include, for example, a portable communication device (e.g., a smartphone), a computer device, a portable multimedia device, a portable medical device, a camera, a wearable device, or a home appliance. The electronic device according to embodiments of this document is not limited to the devices described above.

It should be understood that the various embodiments of this document and the terms used herein are not intended to limit the technical features described in this document to specific embodiments, but include various modifications, equivalents, or substitutes of the embodiments. In connection with the description of the drawings, similar reference numerals may be used for similar or related components. The singular form of a noun corresponding to an item may include one or more of the items, unless the relevant context clearly indicates otherwise. In this document, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B, or C", "at least one of A, B, and C", and "at least one of A, B, or C" may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof. Terms such as "1st" and "2nd", or "first" and "second", may be used merely to distinguish one component from another, and do not limit the components in any other respect (e.g., importance or order). When a component (e.g., a first component) is referred to as "coupled" or "connected" to another component (e.g., a second component), with or without the term "functionally" or "communicatively", it means that the component may be connected to the other component directly (e.g., by wire), wirelessly, or through a third component.

The term "module" used in various embodiments of this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit of the component or a part thereof, that performs one or more functions. For example, according to one embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

Various embodiments of this document may be implemented as software (e.g., a program (140)) including one or more instructions stored in a storage medium (e.g., an internal memory (136) or an external memory (138)) readable by a machine (e.g., the electronic device (101)). For example, a processor (e.g., the processor (120)) of the machine (e.g., the electronic device (101)) may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to be operated to perform at least one function according to the at least one called instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" only means that the storage medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between data being stored semi-permanently in the storage medium and data being stored temporarily.

According to one embodiment, the method according to various embodiments disclosed in this document may be provided as part of a computer program product. The computer program product may be traded as a commodity between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., a compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) via an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a part of the computer program product may be at least temporarily stored, or temporarily generated, in a machine-readable storage medium such as a memory of a manufacturer's server, a server of an application store, or a relay server.

According to various embodiments, each of the above-described components (e.g., a module or a program) may include a single entity or multiple entities, and some of the multiple entities may be separately disposed in other components. According to various embodiments, one or more of the above-described components or operations may be omitted, or one or more other components or operations may be added. Alternatively or additionally, multiple components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the multiple components in the same or a similar manner as they were performed by the corresponding component before the integration. According to various embodiments, operations performed by a module, a program, or another component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

FIG. 2 is a perspective view illustrating the internal configuration of a wearable electronic device (200) according to one embodiment of the present disclosure.

Referring to FIG. 2, the wearable electronic device (200) according to one embodiment of the present disclosure may include at least one of a light output module (211), a display member (201), or a camera module (250).

According to one embodiment of the present disclosure, the light output module (211) may include a light source capable of outputting an image and a lens that guides the image to the display member (201). According to one embodiment of the present disclosure, the light output module (211) may include at least one of a liquid crystal display (LCD), a digital mirror device (DMD), a liquid crystal on silicon (LCoS) display, a light emitting diode on silicon (LEDoS) display, an organic light emitting diode (OLED), or a micro light emitting diode (micro LED).

According to one embodiment of the present disclosure, the display member (201) may include an optical waveguide (e.g., a waveguide). According to one embodiment of the present disclosure, an image output from the light output module (211) and incident on one end of the optical waveguide may propagate inside the optical waveguide and be provided to the user. According to one embodiment of the present disclosure, the optical waveguide may include at least one of a diffractive element (e.g., a diffractive optical element (DOE) or a holographic optical element (HOE)) or a reflective element (e.g., a reflective mirror). For example, the optical waveguide may guide the image output from the light output module (211) to the user's eye by using the at least one diffractive element or reflective element.

According to one embodiment of the present disclosure, the camera module (250) may capture still images and/or video. According to one embodiment, the camera module (250) may be disposed within a lens frame, around the display member (201).

According to one embodiment of the present disclosure, the first camera module (251) may capture and/or recognize the user's eye (e.g., the pupil or iris) or the trajectory of the user's gaze. According to one embodiment of the present disclosure, the first camera module (251) may periodically or aperiodically transmit information related to the user's eye or gaze trajectory (e.g., trajectory information) to a processor (e.g., the processor (120) of FIG. 1).

According to one embodiment of the present disclosure, the second camera module (253) may capture an external image.

According to one embodiment of the present disclosure, the third camera module (255) may be used for hand detection and tracking and for recognition of a user's gesture (e.g., a hand movement). The third camera module (255) according to one embodiment of the present disclosure may be used for 3DoF (3 degrees of freedom) or 6DoF head tracking, position (space, environment) recognition, and/or movement recognition. According to one embodiment of the present disclosure, the second camera module (253) may also be used for hand detection and tracking and for user gesture recognition. According to one embodiment of the present disclosure, at least one of the first camera module (251) through the third camera module (255) may be replaced with a sensor module (e.g., a LiDAR sensor). For example, the sensor module may include at least one of a vertical cavity surface emitting laser (VCSEL), an infrared sensor, and/or a photodiode.

FIG. 3A is a diagram showing the front of a wearable electronic device (300) according to one embodiment.

FIG. 3B is a diagram showing the rear of the wearable electronic device (300) according to one embodiment.

Referring to FIGS. 3A and 3B, in one embodiment, camera modules (311, 312, 313, 314, 315, 316) and/or a depth sensor (317) for obtaining information related to the surrounding environment of the wearable electronic device (300) may be disposed on a first surface (310) of the housing.

In one embodiment, the camera modules (311, 312) may acquire images related to the environment surrounding the wearable electronic device.

In one embodiment, the camera modules (313, 314, 315, 316) may acquire images while the wearable electronic device is worn by the user. The camera modules (313, 314, 315, 316) may be used for hand detection and tracking and for recognition of a user's gesture (e.g., a hand movement). The camera modules (313, 314, 315, 316) may be used for 3DoF or 6DoF head tracking, position (space, environment) recognition, and/or movement recognition. In one embodiment, the camera modules (311, 312) may also be used for hand detection and tracking and for user gesture recognition.

In one embodiment, the depth sensor (317) may be configured to transmit a signal and receive the signal reflected from a subject, and may be used to determine the distance to an object, for example by time of flight (TOF). For example, instead of or in addition to the depth sensor (317), the camera modules (313, 314, 315, 316) may determine the distance to an object.

According to one embodiment, camera modules (325, 326) for face recognition and/or a display (321) (and/or a lens) may be disposed on a second surface (320) of the housing.

In one embodiment, the face recognition camera modules (325, 326) adjacent to the display may be used to recognize the user's face, or may recognize and/or track both of the user's eyes.

In one embodiment, the display (321) (and/or lens) may be disposed on the second surface (320) of the wearable electronic device (300). In one embodiment, the wearable electronic device (300) may not include the camera modules (315, 316) among the plurality of camera modules (313, 314, 315, 316). Although not shown in FIGS. 3A and 3B, the wearable electronic device (300) may further include at least one of the components shown in FIG. 2.

As described above, the wearable electronic device (300) according to one embodiment may have a form factor for being worn on the user's head. The wearable electronic device (300) may further include a strap and/or a wearing member for fixing it to a part of the user's body. While worn on the user's head, the wearable electronic device (300) may provide a user experience based on augmented reality, virtual reality, and/or mixed reality.

FIG. 4 is a block diagram of a wearable electronic device (401) according to one embodiment.

Referring to FIG. 4, in one embodiment, the wearable electronic device (401) may be AR glasses such as the wearable electronic device (200) of FIG. 2, or VR glasses such as the wearable electronic device (300) of FIGS. 3A and 3B.

In one embodiment, the wearable electronic device (401) may include a display (410), a camera (420), a sensor (430), a memory (440), and/or a processor (450).

In one embodiment, the display (410) may be the display module (160) of FIG. 1, the light output module (211) of FIG. 2, and/or the display (321) of FIGS. 3A and 3B.

In one embodiment, the camera (420) may be at least one of the camera module (180) of FIG. 1, the camera module (250) of FIG. 2, and/or the camera modules (311, 312, 313, 314, 315, 316, 325, 326) of FIGS. 3A and 3B.

In one embodiment, the camera (420) may include a first camera (421) and a second camera (422).

In one embodiment, the first camera (421) may be a camera for tracking the user's hand. For example, based on an image acquired through the first camera (421), the following operations may be performed: detecting the user's hand, acquiring the position of the user's hand, acquiring a skeleton related to the user's hand, recognizing a user's gesture (e.g., a hand movement), and/or rendering the user's hand. The operation of tracking the user's hand using the first camera (421) is described in detail later.

In one embodiment, the first camera (421) may be a stereo camera. For example, the first camera (421) may be a stereo camera including a plurality of cameras (e.g., two cameras) that are disposed at different positions on the wearable electronic device (401) and can simultaneously acquire images (e.g., two images) of the same subject. However, the first camera (421) is not limited thereto and may be a single camera capable of tracking the user's hand.
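One common reason to use two cameras at different positions is that the distance to a subject can be estimated from the shift (disparity) between its positions in the two images. The sketch below uses the standard pinhole relation Z = f·B/d for a rectified stereo pair; the description does not state that this particular formula is used, and the function name and parameters are hypothetical.

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Estimate depth (in meters) of a point seen by a rectified stereo pair.

    focal_px     -- focal length of the cameras, in pixels (assumed equal)
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

With a 600-pixel focal length and a 0.1 m baseline, a hand feature with a 120-pixel disparity would be estimated at about 0.5 m. Larger disparities correspond to closer subjects.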

In one embodiment, the first camera (421) may be the second camera module (253) and/or the third camera module (255) of FIG. 2. In one embodiment, the first camera (421) may be one or more of the plurality of camera modules (313, 314, 315, 316) of FIG. 3A.

In one embodiment, the second camera (422) may be a camera for acquiring the gaze point of the user's eyes (hereinafter also referred to as the "gaze point"). For example, based on an image acquired through the second camera (422), the directions in which the user's two eyes gaze (also referred to as "gaze directions") may be acquired. Based on the acquired directions and the distance between the two eyes (also referred to as "binocular disparity"), the gaze point (e.g., the three-dimensional coordinates of the point at which the user's eyes gaze) may be acquired (e.g., calculated) using triangulation. The operation of acquiring the gaze point using the second camera (422) is described in detail later.
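As an illustration of the triangulation step, the sketch below treats each eye as the origin of a ray along its gaze direction and returns the midpoint of the shortest segment between the two rays. This is one standard way to triangulate from two directions and a known eye separation; the description does not specify the exact computation, and the function name, coordinate convention, and tolerance are assumptions.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate_gaze(
    left_eye: Vec3, left_dir: Vec3, right_eye: Vec3, right_dir: Vec3,
    eps: float = 1e-12,
) -> Optional[Vec3]:
    """Return the 3D point closest to both gaze rays, or None if they are parallel."""
    w = tuple(l - r for l, r in zip(left_eye, right_eye))
    a = _dot(left_dir, left_dir)
    b = _dot(left_dir, right_dir)
    c = _dot(right_dir, right_dir)
    d = _dot(left_dir, w)
    e = _dot(right_dir, w)
    denom = a * c - b * b          # zero when the gaze directions are parallel
    if denom < eps:
        return None
    t_left = (b * e - c * d) / denom    # parameter along the left-eye ray
    t_right = (a * e - b * d) / denom   # parameter along the right-eye ray
    p_left = tuple(o + t_left * v for o, v in zip(left_eye, left_dir))
    p_right = tuple(o + t_right * v for o, v in zip(right_eye, right_dir))
    return tuple((p + q) / 2.0 for p, q in zip(p_left, p_right))


# Eyes 64 mm apart on the x-axis (assumed), both looking at a point 0.5 m ahead.
gaze_point = triangulate_gaze(
    (-0.032, 0.0, 0.0), (0.032, 0.0, 0.5),
    (0.032, 0.0, 0.0), (-0.032, 0.0, 0.5),
)  # approximately (0.0, 0.0, 0.5)
```

Using the midpoint of the closest approach tolerates measured gaze rays that do not intersect exactly, which is the usual case with noisy eye-tracking data.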

In one embodiment, the second camera (422) may be the first camera module (251) of FIG. 2 or the camera modules (325, 326) of FIG. 3B.

In one embodiment, the sensor (430) may include a first sensor (431), a second sensor (432), and/or a third sensor (433).

In one embodiment, the first sensor (431) may be a sensor for tracking the user's hand. For example, the first sensor (431) may be the depth sensor (317) of FIG. 3A. For example, the first sensor (431) may acquire (e.g., sense) data for obtaining information (e.g., a depth map or a depth image) about the distance (or depth) between the wearable electronic device (401) and the user's hand, using a time of flight (TOF) method (e.g., direct TOF (dTOF) or indirect TOF (iTOF) using light of a certain wavelength (e.g., infrared light)) or a structured light method. Based on the data acquired through the first sensor (431), the following operations may be performed: detecting the user's hand (e.g., determining whether a hand is present within the field of view of the first sensor (431)), acquiring the position of the user's hand (e.g., the center position of the back of the hand or of the palm), acquiring a skeleton related to the user's hand, recognizing a user's gesture (e.g., a hand movement), and/or rendering the user's hand.
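The two TOF variants named above convert different measurements into distance. The sketch below uses the textbook relations: direct TOF halves the round-trip travel distance of a light pulse, and indirect TOF derives distance from the phase shift of modulated light. The function names and the single-frequency iTOF model are assumptions for illustration, not details taken from the description.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def dtof_distance(round_trip_s: float) -> float:
    """Direct TOF: distance from the measured round-trip time of a light pulse."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def itof_distance(phase_rad: float, modulation_hz: float) -> float:
    """Indirect TOF: distance from the phase shift of continuously modulated light.

    The result is unambiguous only while the phase stays below 2*pi, that is,
    within c / (2 * modulation_hz) of the sensor.
    """
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    return SPEED_OF_LIGHT_M_S * phase_rad / (4.0 * math.pi * modulation_hz)
```

A hand about 0.5 m from the device corresponds to a round trip of roughly 3.3 nanoseconds, which is why dTOF requires very fine timing resolution while iTOF measures phase instead.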

In one embodiment, the second sensor (432) may be a sensor for acquiring the gaze point of the user's eyes. For example, the second sensor (432) may be a sensor capable of acquiring the directions in which the user's two eyes gaze. For example, the second sensor (432) may be a sensor that enables the user's gaze direction to be acquired by detecting the user's pupil (e.g., the center position of the pupil) and detecting the direction or amount of light, such as infrared light, reflected from the cornea of the user's eye. Based on the gaze directions acquired using the second sensor (432) and the binocular disparity, the gaze point may be acquired using triangulation.

In one embodiment, the third sensor (433) may be a sensor used for head tracking. For example, the third sensor (433) may be a sensor supporting 3DoF (3 degrees of freedom) (a 3-axis sensor) or a sensor supporting 6DoF (a 6-axis sensor). Based on sensor data acquired through the third sensor (433), the direction in which the user's head faces and/or the position of the user's head may be acquired.

Although FIG. 4 illustrates the sensor (430) as including the first sensor (431), the second sensor (432), and/or the third sensor (433), the sensor (430) is not limited thereto. For example, the sensor (430) may further include at least one component included in the sensor module (176) of FIG. 1. For example, at least one sensor included in the sensor (430) may be included in an electronic device (e.g., a wearable electronic device) external to the wearable electronic device (401). Based on a sensing value acquired from the sensor included in the external electronic device, information related to at least one of the user's hand, gaze, or head may be acquired.

In one embodiment, the memory (440) may be the memory (130) of FIG. 1.

In one embodiment, the memory (440) may store information for performing hand tracking.

In one embodiment, the memory (440) may include a hand tracking module (441), an eye tracking module (442), a hand rendering module (443), and/or a head tracking module (444).

In one embodiment, the hand tracking module (441) may include instructions that, when executed by the processor (450), cause the wearable electronic device (401) to perform hand tracking.

In one embodiment, the eye tracking module (442) may include instructions that, when executed by the processor (450), cause the wearable electronic device (401) to perform eye tracking.

In one embodiment, the hand rendering module (443) may include instructions that, when executed by the processor (450), cause the wearable electronic device (401) to generate a virtual hand (or hand model) representing the user's hand and to display the generated virtual hand through the display (410).

핸드 트래킹 모듈(441), 아이 트래킹 모듈(442), 및 핸드 렌더링 모듈(443)에 의해 수행되는 동작들에 대해서는, 도 5 내지 도 13을 참조하여 상세히 후술하도록 한다.The operations performed by the hand tracking module (441), the eye tracking module (442), and the hand rendering module (443) will be described in detail later with reference to FIGS. 5 to 13.

In one embodiment, the head tracking module (444) may include instructions that, when executed by the processor (450), cause the wearable electronic device (401) to obtain, within a coordinate system of a three-dimensional space (e.g., a three-dimensional real space or a three-dimensional virtual space), the direction in which the wearable electronic device (401) (or the face of a user wearing the wearable electronic device (401)) is facing and the position of the wearable electronic device (401). For example, when the wearable electronic device (401) is powered on or when an application is executed on the wearable electronic device (401), the coordinate system of the three-dimensional space may be set based on the position and direction of the wearable electronic device (401) obtained through the third sensor (433). After the coordinate system of the three-dimensional space is set, when the position of the wearable electronic device (401) changes (e.g., due to movement of the user wearing the wearable electronic device (401)) and/or when the direction of the wearable electronic device (401) changes (e.g., when the head of the user wearing the wearable electronic device (401) rotates), the head tracking module (444) may obtain the changed position and/or direction of the wearable electronic device (401) within the coordinate system of the three-dimensional space based on data obtained through the third sensor (433). The changed position and/or direction of the wearable electronic device (401) within the coordinate system of the three-dimensional space may be used to obtain the position of the user's hand (e.g., the three-dimensional coordinates of the user's hand within the coordinate system of the three-dimensional space) and the gaze point (e.g., the three-dimensional coordinates of the point at which the user gazes within the coordinate system of the three-dimensional space).

일 실시예에서, 도 4에서 메모리(440)가, 핸드 트래킹 모듈(441), 아이 트래킹 모듈(442), 핸드 렌더링 모듈(443), 및/또는 헤드 트래킹 모듈(444)을 포함하는 것으로 도시하고 있지만 이에 제한되지 않는다. 예를 들어, 메모리(440)는, 사용자의 제스처(예: 손 동작)를 인식하기 위한 제스처 인식 모듈을 더 포함할 수도 있다.In one embodiment, the memory (440) is illustrated in FIG. 4 as including, but not limited to, a hand tracking module (441), an eye tracking module (442), a hand rendering module (443), and/or a head tracking module (444). For example, the memory (440) may further include a gesture recognition module for recognizing a user's gestures (e.g., hand movements).

일 실시예에서, 전술한 예시들에서, 핸드 트래킹 모듈(441), 아이 트래킹 모듈(442), 핸드 렌더링 모듈(443), 및/또는 헤드 트래킹 모듈(444)이 소프트웨어(software)로서 메모리(440)에 포함되는 것으로 예시하고 있지만, 이에 제한되지 않는다. 핸드 트래킹 모듈(441), 아이 트래킹 모듈(442), 핸드 렌더링 모듈(443), 및/또는 헤드 트래킹 모듈(444) 중 적어도 하나의 모듈은, 하드웨어로서 웨어러블 전자 장치(401)에 포함될 수 있다.In one embodiment, in the examples described above, the hand tracking module (441), the eye tracking module (442), the hand rendering module (443), and/or the head tracking module (444) are illustrated as being included in the memory (440) as software, but are not limited thereto. At least one of the hand tracking module (441), the eye tracking module (442), the hand rendering module (443), and/or the head tracking module (444) may be included in the wearable electronic device (401) as hardware.

In one embodiment, although FIG. 4 illustrates the hand tracking module (441), the eye tracking module (442), the hand rendering module (443), and/or the head tracking module (444) as independent components, the disclosure is not limited thereto. For example, at least some of the hand tracking module (441), the eye tracking module (442), the hand rendering module (443), and/or the head tracking module (444) may be implemented as an integrated module.

일 실시예에서, 프로세서(450)는 도 1의 프로세서(120)일 수 있다.In one embodiment, the processor (450) may be the processor (120) of FIG. 1.

일 실시예에서, 프로세서(450)는 핸드 트래킹을 수행하는 동작을 전반적으로 제어할 수 있다. 일 실시예에서, 프로세서(450)는 핸드 트래킹을 수행하기 위한 하나 이상의 프로세서들을 포함할 수 있다.In one embodiment, the processor (450) may generally control the operation of performing hand tracking. In one embodiment, the processor (450) may include one or more processors for performing hand tracking.

이하, 도 5 내지 도 13을 참조하여, 프로세서(450)가 핸드 트래킹을 수행하는 동작에 대하여 설명하도록 한다.Hereinafter, with reference to FIGS. 5 to 13, the operation of the processor (450) performing hand tracking will be described.

도 4에서, 웨어러블 전자 장치(401)가, 디스플레이(410), 카메라(420), 센서(430), 메모리(440), 및/또는 프로세서(450)를 포함하는 것으로 도시하고 있지만, 이에 제한되지 않는다.In FIG. 4, a wearable electronic device (401) is illustrated as including, but not limited to, a display (410), a camera (420), a sensor (430), a memory (440), and/or a processor (450).

일 실시예에서, 웨어러블 전자 장치(401)는 도 4에 도시된 구성들(components) 중 일부를 포함하지 않을 수 있다. 예를 들어, 웨어러블 전자 장치(401)는, 핸드 트래킹을 수행하기 위한 구성들로서 제 1 카메라(421) 및 제 1 센서(431) 중 하나를 포함할 수 있다. 예를 들어, 웨어러블 전자 장치(401)는, 아이 트래킹을 수행하기 위한 구성들로서 제 2 카메라(422) 및 제 2 센서(432) 중 하나를 포함할 수 있다.In one embodiment, the wearable electronic device (401) may not include some of the components illustrated in FIG. 4. For example, the wearable electronic device (401) may include one of the first camera (421) and the first sensor (431) as components for performing hand tracking. For example, the wearable electronic device (401) may include one of the second camera (422) and the second sensor (432) as components for performing eye tracking.

일 실시예에서, 웨어러블 전자 장치(401)는 도 4에 도시된 구성들 외에 적어도 하나의 구성을 더 포함할 수 있다. 예를 들어, 웨어러블 전자 장치(401)는, 웨어러블 전자 장치(401)를 사용자의 신체 부위(예: 사용자의 머리) 상에 고정하기 위한 스트랩(strap), 및/또는 착용 부재를 더 포함할 수 있다. 예를 들어, 웨어러블 전자 장치(401)는, 도 1의 전자 장치(101)에 포함된 적어도 하나의 구성(예: 통신 모듈(190))을 더 포함할 수 있다.In one embodiment, the wearable electronic device (401) may further include at least one component in addition to the components illustrated in FIG. 4. For example, the wearable electronic device (401) may further include a strap for fixing the wearable electronic device (401) on a body part of a user (e.g., the user's head), and/or a wearing member. For example, the wearable electronic device (401) may further include at least one component (e.g., the communication module (190)) included in the electronic device (101) of FIG. 1.

도 5는, 일 실시예에 따른, 핸드 트래킹을 수행하는 방법을 설명하기 위한 흐름도(500)이다.FIG. 5 is a flowchart (500) for explaining a method of performing hand tracking according to one embodiment.

도 5를 참조하면, 동작 501에서, 일 실시예에서, 프로세서(450)는, 제 1 카메라(421)를 통하여 획득된 이미지(이하, "제 1 이미지" 또는 "적어도 하나의 제 1 이미지"로 지칭됨)에 기반하여, 사용자의 손의 위치를 획득(예: 산출)할 수 있다.Referring to FIG. 5, in operation 501, in one embodiment, the processor (450) may obtain (e.g., calculate) the position of the user's hand based on an image obtained through the first camera (421) (hereinafter, referred to as "the first image" or "at least one first image").

일 실시예에서, 프로세서(450)는, 웨어러블 전자 장치(401)에서 어플리케이션이 실행됨에 기반하여, 제 1 카메라(421)를 활성화할 수 있다. 프로세서(450)는, 활성화된 제 1 카메라(421)를 이용하여 제 1 이미지(예: 적어도 하나의 제 1 이미지)를 획득할 수 있다.In one embodiment, the processor (450) may activate the first camera (421) based on an application running on the wearable electronic device (401). The processor (450) may acquire a first image (e.g., at least one first image) using the activated first camera (421).

일 실시예에서, 프로세서(450)는, 웨어러블 전자 장치(401)가 파워 온(power on)(또는 턴 온(turn on))됨에 기반하여, 제 1 카메라(421)를 활성화할 수 있다. 프로세서(450)는, 활성화된 제 1 카메라(421)를 이용하여 적어도 하나의 제 1 이미지를 획득할 수 있다.In one embodiment, the processor (450) can activate the first camera (421) based on the wearable electronic device (401) being powered on (or turned on). The processor (450) can acquire at least one first image using the activated first camera (421).

In one embodiment, the processor (450) may acquire a plurality of (e.g., two) first images through the first camera (421) (e.g., a stereo camera including a plurality of (e.g., two) cameras that are disposed at different positions on the wearable electronic device (401) and are capable of simultaneously acquiring a plurality of (e.g., two) images of the same subject).

일 실시예에서, 프로세서(450)는, 적어도 하나의 제 1 이미지에 기반하여, 사용자의 손의 위치를 획득할 수 있다. 이하, 도 6을 참조하여, 적어도 하나의 제 1 이미지에 기반하여 사용자의 손의 위치를 획득하는 동작에 대하여 설명하도록 한다.In one embodiment, the processor (450) can obtain the position of the user's hand based on at least one first image. Hereinafter, with reference to FIG. 6, an operation of obtaining the position of the user's hand based on at least one first image will be described.

도 6은, 일 실시예에 따른, 사용자의 손의 위치를 획득하는 방법을 설명하기 위한 도면(600)이다.FIG. 6 is a drawing (600) for explaining a method for obtaining the position of a user's hand according to one embodiment.

도 6을 참조하면, 일 실시예에서, 참조 부호 610은 제 1 이미지(예: 제 1 카메라(421)가 스테레오 카메라인 경우 2개의 제 1 이미지들 중 하나의 제 1 이미지)를 나타낼 수 있다.Referring to FIG. 6, in one embodiment, reference numeral 610 may represent a first image (e.g., one of two first images when the first camera (421) is a stereo camera).

일 실시예에서, 프로세서(450)는, 제 1 이미지(610) 내에서 사용자의 손(예: 손들(621, 622))을 검출하고, 사용자의 손의 위치(예: 2차원 좌표)를 획득(예: 산출)할 수 있다. In one embodiment, the processor (450) may detect a user's hand (e.g., hands (621, 622)) within the first image (610) and obtain (e.g., calculate) a location (e.g., two-dimensional coordinates) of the user's hand.

In one embodiment, the processor (450) may use an artificial intelligence model to detect the user's hand in the first image (610) and obtain the coordinates of the user's hand. For example, the processor (450) may input the first image (610), as input data, to an artificial intelligence engine that uses an artificial intelligence model related to hand detection. The processor (450) may obtain from the artificial intelligence engine, as output data, the area representing the hand in the first image (610) and the position of the hand (e.g., points (631, 632)) (e.g., the center point of the palm or the center point of the back of the hand). Although the above example uses an artificial intelligence model to detect the user's hand and obtain the coordinates of the hand, the disclosure is not limited thereto. For example, the processor (450) may detect the user's hand in the first image (610) and obtain the coordinates of the user's hand using an algorithm related to hand detection.

일 실시예에서, 프로세서(450)는, 제 1 이미지(610) 내에서 손을 나타내는 영역 및 손의 위치(예: 지점들(631, 632))가 획득됨에 기반하여, 바운딩 박스(bounding box)를 획득(예: 생성)할 수 있다. 예를 들어, 프로세서(450)는, 손의 위치를 기준으로 손을 나타내는 영역을 포함하는 영역을 지시하는 바운딩 박스를 생성할 수 있다. 예를 들어, 도 6에서, 프로세서(450)는, 제 1 이미지(610) 내에서, 지점(631)이 중심점으로 설정되고 손(621)을 나타내는 영역을 포함하는 영역의 경계선을 나타내는 바운딩 박스(641)와 지점(632)이 중심점으로 설정되고 손(622)을 나타내는 영역을 포함하는 영역의 경계선을 나타내는 바운딩 박스(642)를 생성할 수 있다. 일 실시예에서, 손을 나타내는 영역을 포함하는 영역은, 바운딩 박스(예: 바운딩 박스(641), 바운딩 박스(642))로 제한되지 않는다. 예를 들어, 손을 나타내는 영역을 포함하는 영역의 형태는, 손을 나타내는 영역을 포함하는 다양한 형태(예: 원, 타원, 또는 비정형 형태)를 포함할 수 있다.In one embodiment, the processor (450) may acquire (e.g., generate) a bounding box based on acquiring an area representing a hand and a location of the hand (e.g., points (631, 632)) within the first image (610). For example, the processor (450) may generate a bounding box indicating an area including an area representing a hand based on the location of the hand. For example, in FIG. 6, the processor (450) may generate a bounding box (641) indicating a boundary line of an area including an area representing a hand (621) with point (631) set as a center point, and a bounding box (642) indicating a boundary line of an area including an area representing a hand (622) with point (632) set as a center point within the first image (610). In one embodiment, the area including the area representing the hand is not limited to the bounding box (e.g., bounding box (641), bounding box (642)). For example, the shape of the region including the region representing the hand may include various shapes (e.g., a circle, an ellipse, or an irregular shape) including the region representing the hand.
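The bounding-box construction described above can be illustrated with a minimal Python sketch. It is only one possible reading, not the implementation of the disclosure: the names `BoundingBox` and `centered_bounding_box`, the representation of the hand area as a collection of pixel coordinates, and the clamping to the image bounds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def centered_bounding_box(center, hand_pixels, image_size):
    """Smallest box centered on `center` that contains every hand pixel.

    `center` is the detected hand position (e.g., point 631), `hand_pixels`
    is an iterable of (x, y) pixels in the hand area, and `image_size` is
    (width, height). The box is clamped to the image, so near an image edge
    it may no longer be exactly centered (an assumption of this sketch).
    """
    pixels = list(hand_pixels)
    cx, cy = center
    width, height = image_size
    half_w = max(abs(x - cx) for x, _ in pixels)
    half_h = max(abs(y - cy) for _, y in pixels)
    return BoundingBox(
        left=max(0, cx - half_w),
        top=max(0, cy - half_h),
        right=min(width, cx + half_w + 1),
        bottom=min(height, cy + half_h + 1),
    )
```

A circle, an ellipse, or an irregular region could be substituted for the rectangle, as the paragraph above notes.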

In one embodiment, the processor (450) may crop, from the first image (610), the area indicated by a bounding box (e.g., bounding boxes (641, 642)). The cropped area may be used in the skeleton-obtaining operation of operation 507, which is described later.
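As a small illustration of the cropping step, the following Python sketch treats an image as a list of pixel rows and a bounding box as a `(left, top, right, bottom)` tuple with exclusive right and bottom edges. Both representations and the function name `crop` are assumptions for this example, not details given in the disclosure.

```python
def crop(image, box):
    """Return the sub-image covered by `box`.

    `image` is a list of rows (each row a list of pixel values) and `box`
    is (left, top, right, bottom) with exclusive right/bottom edges.
    """
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

Cropping before skeleton estimation means the later stage only has to process the pixels around the hand rather than the full frame.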

In one embodiment, the processor (450) may obtain a three-dimensional position of the hand based on the two-dimensional position of the hand obtained from at least one first image. For example, the processor (450) may obtain two two-dimensional positions of the hand from two first images acquired through two cameras (e.g., a stereo camera). Based on the obtained two-dimensional positions of the hand and the positional difference between the two cameras (e.g., the disparity between the two cameras), the processor (450) may obtain a three-dimensional position of the hand (e.g., three-dimensional coordinates of the hand) relative to the position and direction of the wearable electronic device (401) (e.g., the current position and current direction of the wearable electronic device (401) obtained using the third sensor (433)), that is, with the current position and current direction of the wearable electronic device (401) as the reference. Based on the current position and current direction of the wearable electronic device (401) and the three-dimensional position of the hand relative to the position and direction of the wearable electronic device (401), the processor (450) may obtain the three-dimensional position of the hand within the coordinate system of a virtual space (e.g., real space) (e.g., the three-dimensional coordinates of the hand within the coordinate system of the virtual space).
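The two steps in this paragraph (stereo disparity to a device-relative position, then device-relative to the space coordinate system) can be sketched in Python as follows. This is a hedged illustration that assumes a rectified pinhole stereo pair, a focal length expressed in pixels, and a device pose given as a 3x3 rotation matrix plus a translation. None of these parameters or function names come from the disclosure.

```python
def triangulate_rectified(left_px, right_px, focal_px, baseline_m, principal_point):
    """Device-relative 3D point from one matched pixel pair.

    Assumes rectified cameras, so depth follows Z = f * B / disparity,
    where disparity is the horizontal pixel shift between the two images.
    """
    (ul, vl), (ur, _) = left_px, right_px
    disparity = ul - ur
    if disparity <= 0:
        raise ValueError("non-positive disparity: no finite depth")
    cx, cy = principal_point
    z = focal_px * baseline_m / disparity
    return ((ul - cx) * z / focal_px, (vl - cy) * z / focal_px, z)


def device_to_world(point, rotation, translation):
    """Apply p_world = R * p_device + t.

    `rotation` (3x3 nested lists) and `translation` stand in for the current
    direction and position of the device obtained by head tracking.
    """
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
```

For example, with a 500-pixel focal length and a 0.1 m baseline, a 100-pixel disparity corresponds to a depth of about 0.5 m.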

전술한 예시들에서는, 제 1 카메라(421)로서 스테레오 카메라를 이용하여 손의 위치를 획득하는 동작을 설명하고 있지만, 이에 제한되지 않는다. 예를 들어, 프로세서(450)는, 제 1 카메라(421)를 대체하여 또는 제 1 카메라(421)에 추가적으로, 제 1 센서(431)(예: 뎁스 센서)를 이용하여, 손의 위치를 획득할 수 있다. 예를 들어, 프로세서(450)는, 제 1 센서(431)를 통하여 뎁스 정보(예: 뎁스 맵 또는 뎁스 이미지)를 획득할 수 있다. 프로세서(450)는, 뎁스 정보에 기반하여, 제 1 센서(431)의 화각 내에 사용자의 손이 존재하는지 여부(예: 사용자의 손이 위치하는지 여부)를 확인하고, 손의 위치(예: 가상 공간 좌표계 내에서 손의 3차원 좌표)를 획득할 수 있다.In the above examples, the operation of obtaining the position of the hand using a stereo camera as the first camera (421) is described, but is not limited thereto. For example, the processor (450) may obtain the position of the hand using the first sensor (431) (e.g., a depth sensor) instead of or in addition to the first camera (421). For example, the processor (450) may obtain depth information (e.g., a depth map or a depth image) through the first sensor (431). Based on the depth information, the processor (450) may determine whether the user's hand exists within the field of view of the first sensor (431) (e.g., whether the user's hand is positioned) and obtain the position of the hand (e.g., 3D coordinates of the hand within a virtual space coordinate system).

도 5를 다시 참조하면, 동작 503에서, 일 실시예에서, 프로세서(450)는, 제 2 카메라(422)를 통하여 획득된 이미지(이하, "제 2 이미지" 또는 "적어도 하나의 제 2 이미지"로 지칭됨)에 기반하여, 사용자의 응시 지점(gaze point 또는 point of gaze)(이하, "응시 지점"으로도 지칭됨)를 획득할 수 있다. 이하, 도 7을 참조하여, 응시 지점을 획득하는 동작에 대하여 설명하도록 한다.Referring back to FIG. 5, in operation 503, in one embodiment, the processor (450) may acquire a user's gaze point (or point of gaze) (hereinafter, also referred to as "gaze point") based on an image acquired through the second camera (422) (hereinafter, referred to as "second image" or "at least one second image"). Hereinafter, an operation of acquiring the gaze point will be described with reference to FIG. 7.

도 7은, 일 실시예에 따른, 사용자의 응시 지점을 획득하는 방법을 설명하기 위한 도면이다.FIG. 7 is a diagram illustrating a method for obtaining a user's gaze point according to one embodiment.

도 7을 참조하면, 일 실시예에서, 도 7을 통하여 설명되는 응시 지점을 획득하는 방식은 각막 반사법(pupil centre corneal reflection; PCCR)일 수 있다.Referring to FIG. 7, in one embodiment, the method for obtaining the gaze point described through FIG. 7 may be a pupil centre corneal reflection (PCCR) method.

In one embodiment, the corneal reflection method may be used for eye tracking. For example, the corneal reflection method may be a method in which, when light from an infrared light source is directed into the user's eye, at least one image of the light reflected from the cornea and the lens of the eye is acquired, and the gaze direction and the gaze point are obtained from that image.

일 실시예에서, 프로세서(450)는, 제 2 카메라(422)(예: 좌안에 대한 제 2 이미지를 획득하도록 구성된 제 2-1 카메라 및 우안에 대한 제 2 이미지를 획득하도록 구성된 제 2-2 카메라)를 이용하여, 사용자의 양쪽 눈들에 대한 2개의 제 2 이미지들을 획득할 수 있다. 일 실시예에서, 참조 부호 701 및 참조 부호 702에서 제 2 이미지(710)는 좌안에 대한 제 2 이미지를 나타낼 수 있다.In one embodiment, the processor (450) may acquire two second images for both eyes of the user using a second camera (422) (e.g., a 2-1 camera configured to acquire a second image for the left eye and a 2-2 camera configured to acquire a second image for the right eye). In one embodiment, the second image (710) at reference numerals 701 and 702 may represent the second image for the left eye.

In one embodiment, the processor (450) may control a light emitting unit (e.g., a light emitting diode (LED)) so that light (e.g., infrared light) emitted (e.g., radiated) from the light emitting unit is incident on the user's eye. For example, the processor (450) may control a plurality of light emitting units so that a plurality of lights emitted from the respective light emitting units are incident on the user's eye.

일 실시예에서, 프로세서(450)는, 발광부가 광을 방출하는 동안, 제 2 카메라(422)를 통하여, 사용자의 눈에 대한 적어도 하나의 제 2 이미지를 획득할 수 있다.In one embodiment, the processor (450) can acquire at least one second image of the user's eye through the second camera (422) while the light emitter is emitting light.

In one embodiment, the processor (450) may obtain, within at least one second image, the position of the pupil (e.g., the center point of the pupil) and the position of a glint (e.g., a sparkle caused by light emitted from a light emitting unit being reflected off the user's eye). For example, at reference numeral 701, the processor (450) may obtain, within the second image (710), the position of the pupil (721) of the left eye (733) (e.g., the center position of the pupil (721)) and the positions of a plurality of glints (731) caused by a plurality of lights.

In one embodiment, the processor (450) may obtain the user's gaze direction (e.g., a two-dimensional gaze direction) based on the positions of the plurality of glints (731) and the position of the pupil (721). For example, at reference numeral 702, the processor (450) may obtain a gaze vector (720) representing (or corresponding to) the user's gaze direction (e.g., a two-dimensional gaze direction) based on the center position (732) of the plurality of glints (731) and the position of the pupil (721).
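One common way to form such a vector in pupil centre corneal reflection methods is to take the offset from the centroid of the glints to the pupil center. The Python sketch below assumes that convention. The disclosure only states that the vector is based on the two positions, so the direction convention and the function name `gaze_vector_2d` are assumptions.

```python
def gaze_vector_2d(pupil_center, glints):
    """2D vector from the centroid of the glints to the pupil center.

    `pupil_center` is (x, y) in image pixels and `glints` is a non-empty
    list of (x, y) glint positions in the same image.
    """
    gx = sum(x for x, _ in glints) / len(glints)
    gy = sum(y for _, y in glints) / len(glints)
    return (pupil_center[0] - gx, pupil_center[1] - gy)
```

Mapping this image-plane vector to an actual gaze direction typically requires a per-user calibration, which is outside the scope of this sketch.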

In one embodiment, the position of the pupil (721) may change as the user's eye moves (e.g., as the user's gaze direction changes), whereas the positions of the plurality of glints (731) (and the center point (732) of the plurality of glints (731)) may remain unchanged (e.g., fixed) even when the user's eye moves. Accordingly, when the user's eye moves, the gaze vector (720) may change.

일 실시예에서, 도 7에서는, 좌안(733)에 대한 시선 벡터(720)를 획득하는 동작을 설명하였지만, 이에 제한되지 않는다. 일 실시예에서, 프로세서(450)는, 도 7을 통하여 설명한 동작들과 동일 또는 유사한 동작들을 수행함으로써, 우안의 시선 방향(예: 2차원 시선 방향)을 나타내는 시선 벡터를 획득할 수 있다.In one embodiment, although FIG. 7 describes an operation of obtaining a gaze vector (720) for the left eye (733), it is not limited thereto. In one embodiment, the processor (450) can obtain a gaze vector representing a gaze direction (e.g., a two-dimensional gaze direction) of the right eye by performing operations identical or similar to those described through FIG. 7.

일 실시예에서, 프로세서(450)는, 상기 좌안의 시선 벡터(예: 좌안의 2차원 시선 방향을 나타내는 시선 벡터) 및 상기 우안의 시선 벡터(예: 우안의 2차원 시선 방향을 나타내는 시선 벡터)에 적어도 일부 기반하여, 좌안의 3차원 시선 방향 및 우안의 3차원 시선 방향을 획득할 수 있다.In one embodiment, the processor (450) can obtain a 3D gaze direction of the left eye and a 3D gaze direction of the right eye based at least in part on the gaze vector of the left eye (e.g., a gaze vector representing a 2D gaze direction of the left eye) and the gaze vector of the right eye (e.g., a gaze vector representing a 2D gaze direction of the right eye).

일 실시예에서, 참조 부호 703에서 도시된 바와 같이, 프로세서(450)는, 좌안(751)의 3차원 시선 방향(741), 우안(752)의 3차원 시선 방향(742), 및 양안 간 거리(양안 시차)(d)(예: 약 6.5 cm)에 기반하여, 삼각 측량법을 이용하여, 웨어러블 전자 장치(401)의 위치 및 방향에 상대적인 응시 지점(761)(예: 좌안의 시선 방향 및 우안의 시선 방향이 수렴하는 위치)을 획득할 수 있다.In one embodiment, as illustrated in reference numeral 703, the processor (450) may obtain a gaze point (761) (e.g., a position where the gaze direction of the left eye and the gaze direction of the right eye converge) relative to the position and direction of the wearable electronic device (401) by using triangulation based on a three-dimensional gaze direction (741) of the left eye (751), a three-dimensional gaze direction (742) of the right eye (752), and an inter-ocular distance (binocular disparity) (d) (e.g., about 6.5 cm).
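The triangulation at reference numeral 703 can be sketched in Python as finding the point where the two gaze lines come closest, since measured gaze lines rarely intersect exactly. The sketch assumes a device-relative frame with the eyes placed at (-d/2, 0, 0) and (+d/2, 0, 0). That frame, the midpoint-of-closest-approach choice, and the function names are assumptions for illustration only.

```python
def closest_point_between_lines(p1, d1, p2, d2, eps=1e-12):
    """Midpoint of the shortest segment between lines p1 + s*d1 and p2 + t*d2.

    Returns None when the lines are (nearly) parallel. The inputs are
    treated as infinite lines, so diverging gaze directions yield a point
    behind the eyes rather than an error.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if denom <= eps * a * c:
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    return tuple((x + y) / 2 for x, y in zip(q1, q2))


def gaze_point(left_dir, right_dir, eye_distance_m):
    """Device-relative gaze point from two 3D gaze directions."""
    half = eye_distance_m / 2
    return closest_point_between_lines(
        (-half, 0.0, 0.0), left_dir, (half, 0.0, 0.0), right_dir
    )
```

With an eye distance of 0.065 m and both directions aimed at a point 0.5 m straight ahead, the function returns approximately (0, 0, 0.5).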

일 실시예에서, 프로세서(450)는, 웨어러블 전자 장치(401)의 위치 및 방향과, 웨어러블 전자 장치(401)의 위치 및 방향에 상대적인 응시 지점에 기반하여, 가상 공간(예: 현실 공간)의 좌표계 내에서 응시 지점을 획득할 수 있다.In one embodiment, the processor (450) may obtain a gaze point within a coordinate system of a virtual space (e.g., real space) based on the position and orientation of the wearable electronic device (401) and the gaze point relative to the position and orientation of the wearable electronic device (401).

In one embodiment, although FIG. 7 illustrates PCCR using a plurality of light emitting units and the second camera (422), the disclosure is not limited thereto. In one embodiment, the processor (450) may obtain the gaze point using the second sensor (432) instead of, or in addition to, the second camera (422). For example, the processor (450) may obtain the gaze point using a second sensor (432) that includes a light emitting unit with a scanning mirror for controlling the direction of light (e.g., capable of changing the angle at which the light travels toward the eye) and a light receiving unit that receives the light. The processor (450) may control the light emitting unit to emit light whose direction is changed by the scanning mirror. While controlling the light emitting unit, the processor (450) may obtain the position at which the light is reflected off the user's eye (e.g., the center position of the cornea) at the time point when the intensity (e.g., the amount) of the light reflected off the user's eye (e.g., the light received through the light receiving unit) is at its maximum. The processor (450) may determine the direction connecting the obtained position and the center position of the user's eye as the user's gaze direction. The processor (450) may obtain the gaze point based on the gaze directions of both eyes and the distance between the two eyes.

다만, 응시 지점을 획득하는 방식들은 전술한 예시들에 제한되지 않는다.However, the methods for obtaining the gaze point are not limited to the examples described above.

도 5를 다시 참조하면, 도 5에서는 동작 501의 손의 위치를 획득하는 동작이 동작 503의 응시 지점을 획득하는 동작에 선행하여 수행되는 것으로 도시되어 있지만, 이에 제한되지 않는다. 예를 들어, 프로세서(450)는, 동작 501 및 동작 503을 병렬적으로(또는 동시에) 수행할 수 있다. 예를 들어, 프로세서(450)는, 동작 501에서 손의 위치가 획득된 경우(예: 적어도 하나의 제 1 이미지 내에서 손이 검출된 경우) 동작 503에서 응시 지점을 획득하는 동작을 수행할 수 있다.Referring back to FIG. 5, although FIG. 5 illustrates that the operation of acquiring the hand position in operation 501 is performed prior to the operation of acquiring the gaze point in operation 503, it is not limited thereto. For example, the processor (450) may perform operations 501 and 503 in parallel (or simultaneously). For example, the processor (450) may perform the operation of acquiring the gaze point in operation 503 when the hand position is acquired in operation 501 (e.g., when the hand is detected within at least one first image).

동작 505에서, 일 실시예에서, 프로세서(450)는 손의 위치가 응시 지점에 대응하는지 여부를 확인할 수 있다.In operation 505, in one embodiment, the processor (450) may determine whether the position of the hand corresponds to a gaze point.

일 실시예에서, 프로세서(450)는, 손의 위치(예: 도 6의 지점들(631, 632)) 및 응시 지점(예: 도 7의 응시 지점(761)) 간 거리가 지정된 거리 이하인지 여부를 확인할 수 있다. 예를 들어, 프로세서(450)는, 가상 공간의 좌표계 내에서, 손의 3차원 좌표 및 응시 지점의 3차원 좌표 간 거리가 지정된 거리 이하인지 여부를 확인할 수 있다.In one embodiment, the processor (450) can determine whether a distance between a hand position (e.g., points (631, 632) of FIG. 6) and a gaze point (e.g., gaze point (761) of FIG. 7) is less than or equal to a specified distance. For example, the processor (450) can determine whether a distance between a three-dimensional coordinate of the hand and a three-dimensional coordinate of the gaze point within a coordinate system of a virtual space is less than or equal to a specified distance.

일 실시예에서, 손의 위치가 응시 지점에 대응하는 경우는, 손의 위치 및 응시 지점 간 거리가 지정된 거리 이하인 경우일 수 있다. 손의 위치가 응시 지점에 대응하지 않는 경우는, 손의 위치 및 응시 지점 간 거리가 지정된 거리를 초과하는 경우일 수 있다.In one embodiment, the case where the hand position corresponds to the gaze point may be when the distance between the hand position and the gaze point is less than or equal to a specified distance. The case where the hand position does not correspond to the gaze point may be when the distance between the hand position and the gaze point exceeds a specified distance.
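The check in operation 505 reduces to a Euclidean distance comparison, as the following minimal Python sketch shows. The function name and the use of meters as the unit are assumptions. The disclosure does not give a value for the specified distance.

```python
import math


def hand_corresponds_to_gaze(hand_xyz, gaze_xyz, max_distance_m):
    """True when the hand is within `max_distance_m` of the gaze point.

    Both points are 3D coordinates in the same coordinate system. A distance
    equal to the threshold counts as corresponding ("less than or equal to").
    """
    return math.dist(hand_xyz, gaze_xyz) <= max_distance_m
```

Only hands for which this returns True would proceed to the skeleton-obtaining operation of operation 507.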

In operation 507, in one embodiment, the processor (450) may obtain a skeleton including key points related to the hand, based on the position of the hand corresponding to the gaze point. The skeleton-obtaining operation of operation 507 is described below with reference to FIG. 8.

도 8은, 일 실시예에 따른, 스켈레톤을 획득하는 동작을 설명하기 위한 도면(800)이다.FIG. 8 is a drawing (800) for explaining an operation of obtaining a skeleton according to one embodiment.

Referring to FIG. 8, in one embodiment, a skeleton may include key points (also referred to as "feature points," "nodes," or "joints") and lines connecting the key points. For example, in FIG. 8, the skeleton (811) related to (e.g., corresponding to) the left hand (621) may include key points (e.g., key point (821)) and lines connecting the key points (e.g., line (831)), and the skeleton (812) related to the right hand may include key points (e.g., key point (822)) and lines connecting the key points (e.g., line (832)).

일 실시예에서, 프로세서(450)는, 스켈레톤 획득과 관련된 인공 지능 모델에 기반하여, 손과 관련된 스켈레톤을 획득할 수 있다. 예를 들어, 프로세서(450)는, 입력 데이터로서 바운딩 박스(예: 도 6의 바운딩 박스들(641, 642))가 지시하고 적어도 하나의 제 1 이미지로부터 크롭된 영역(예: 동작 501을 통하여 적어도 하나의 제 1 이미지로부터 크롭된 영역)을, 스켈레톤 획득과 관련된 인공 지능 모델을 이용한 인공 지능 엔진으로 입력할 수 있다. 프로세서(450)는, 상기 인공 지능 엔진으로부터, 출력 데이터로서 키 포인트들(예: 키 포인트들의 2차원 좌표들 또는 가상 공간의 좌표계에서 키 포인트들의 3차원 좌표들)을 포함하는 스켈레톤(예: 스켈레톤들(811, 812))을 획득할 수 있다.In one embodiment, the processor (450) may acquire a skeleton associated with a hand based on an artificial intelligence model related to skeleton acquisition. For example, the processor (450) may input, as input data, a bounding box (e.g., bounding boxes (641, 642) of FIG. 6) indicated and a cropped region (e.g., a region cropped from at least one first image through operation 501) from at least one first image, into an artificial intelligence engine using an artificial intelligence model related to skeleton acquisition. The processor (450) may acquire, from the artificial intelligence engine, a skeleton (e.g., skeletons (811, 812)) including key points (e.g., two-dimensional coordinates of key points or three-dimensional coordinates of key points in a coordinate system of a virtual space) as output data.
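The crop-then-infer step described above can be sketched as follows. The names `crop_region` and `estimate_skeleton`, the `(left, top, right, bottom)` box layout, and the `keypoint_model` callable are all hypothetical; the disclosure does not define the engine's interface. Shifting the returned key points back into full-image coordinates is likewise an assumption of this sketch.

```python
def crop_region(image, box):
    """Crop a region from an image given as a list of pixel rows.

    box is (left, top, right, bottom) with right/bottom exclusive;
    the box is clamped to the image bounds.
    """
    left, top, right, bottom = box
    height = len(image)
    width = len(image[0]) if height else 0
    left, top = max(0, left), max(0, top)
    right, bottom = min(width, right), min(height, bottom)
    return [row[left:right] for row in image[top:bottom]]


def estimate_skeleton(image, box, keypoint_model):
    """Feed the cropped hand region to a key-point model (hypothetical).

    keypoint_model is any callable that takes the crop and returns
    2D key points in crop coordinates.
    """
    crop = crop_region(image, box)
    left, top = max(0, box[0]), max(0, box[1])
    # Shift crop-relative key points back into image coordinates.
    return [(x + left, y + top) for (x, y) in keypoint_model(crop)]
```

Cropping before inference keeps the model input small, which is one plausible reason for passing only the bounding-box region rather than the full first image.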

Although the above example obtains the skeleton related to the hand using an artificial intelligence model, the disclosure is not limited thereto. For example, the processor (450) may obtain the skeleton related to the hand using an algorithm related to skeleton acquisition. For example, to obtain a shape related to the hand, the processor (450) may extract features of the parts corresponding to the hand from an input image and distinguish hand poses. For example, the processor (450) may learn patterns for each part of the hand through artificial intelligence, sort the acquired images according to each pattern, classify each part of the hand, and identify the pose of the hand. According to one embodiment, the processor (450) may use a support vector machine (SVM), which is a machine-learning algorithm, to find the data set closest to the hand features of the input image and output it as the determination result. In addition, the processor (450) may distinguish continuous hand motions and identify their patterns using, for example, a hidden Markov model (HMM) for dynamic pattern analysis of the hand, or dynamic time warping (DTW) so that the result is not affected by the speed or duration of a specific motion.
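To illustrate why DTW is insensitive to the speed of a motion, the following is a minimal sketch of the textbook DTW distance between two one-dimensional sequences. It is not taken from the disclosure: the function name `dtw_distance`, the absolute-difference local cost, and the use of scalar samples (for instance, one joint angle per frame) are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences.

    a and b are sequences of numbers; the local cost is |a[i] - b[j]|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A motion performed at half speed, such as `[0, 0, 1, 1, 2, 2]` against `[0, 1, 2]`, aligns with zero cost, whereas a frame-by-frame comparison would treat the two as different.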

Referring back to FIG. 5, in one embodiment, the processor (450) may not perform the operation of obtaining the skeleton including key points related to the hand, based on the hand position not corresponding to the gaze point. For example, the processor (450) may not perform the operation of obtaining the skeleton based on the distance between the hand position and the gaze point exceeding the specified distance. For example, based on the hand position not corresponding to the gaze point, the processor (450) may continue to perform operations 501 and 503 without performing the operation of obtaining the skeleton.
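The gating behavior of operations 501 to 507 can be sketched as a loop that runs the skeleton step only for frames where the hand is near the gaze point. The function name `run_pipeline`, the per-frame `(hand_pos, gaze_point)` tuples, and the `estimate_skeleton` callable are hypothetical stand-ins for the operations of FIG. 5.

```python
import math


def run_pipeline(frames, specified_distance, estimate_skeleton):
    """Run skeleton estimation only when the hand corresponds to the gaze.

    frames yields (hand_pos, gaze_point) pairs, i.e. the outputs of
    operations 501 and 503; either may be None when not obtained.
    Returns one entry per frame: a skeleton, or None when skipped.
    """
    results = []
    for hand_pos, gaze_point in frames:
        if hand_pos is None or gaze_point is None:
            results.append(None)  # nothing to compare; keep tracking
            continue
        if math.dist(hand_pos, gaze_point) <= specified_distance:
            results.append(estimate_skeleton(hand_pos))  # operation 507
        else:
            results.append(None)  # skip 507, continue 501 and 503
    return results
```

Skipping the skeleton step for hands the user is not looking at avoids running the comparatively expensive key-point estimation on every frame.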

In operation 509, in one embodiment, the processor (450) may perform an operation related to the hand based on the obtained skeleton.

In one embodiment, the processor (450) may recognize a gesture of the user (e.g., a hand motion) based on the obtained skeleton. For example, based on the coordinates (e.g., three-dimensional coordinates) of the key points included in the skeleton, the processor (450) may recognize the user's gesture (e.g., a grab gesture, a drag gesture using a finger, or a drop gesture using a finger) using an artificial intelligence model related to gesture recognition (e.g., a gesture classification model) or an algorithm. For example, the processor (450) may recognize the user's gesture based on the shape of the skeleton (or a change in the shape of the skeleton).
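As one illustration of recognizing a gesture from key-point coordinates and from a change in skeleton shape, the sketch below uses a simple rule rather than a trained classification model. Everything here is an assumption: the key-point names `thumb_tip` and `index_tip`, the threshold value, and the event labels are hypothetical and do not come from the disclosure.

```python
import math


def is_pinching(keypoints, pinch_threshold=0.02):
    """Rule-based check: thumb tip and index tip are close together.

    keypoints maps hypothetical joint names to 3D coordinates.
    """
    gap = math.dist(keypoints["thumb_tip"], keypoints["index_tip"])
    return gap <= pinch_threshold


def pinch_events(frames, pinch_threshold=0.02):
    """Detect changes in skeleton shape across a sequence of frames."""
    events = []
    previous = False
    for index, keypoints in enumerate(frames):
        current = is_pinching(keypoints, pinch_threshold)
        if current and not previous:
            events.append((index, "pinch_start"))
        elif previous and not current:
            events.append((index, "pinch_end"))
        previous = current
    return events
```

A start event followed by hand movement and then an end event could be interpreted by an application as a drag followed by a drop, although that mapping is left to the application.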

In one embodiment, the processor (450) may render the user's hand based on the obtained skeleton. The operation of rendering the user's hand is described below with reference to FIG. 9.

FIG. 9 is a diagram (900) for explaining an operation of rendering a virtual hand according to one embodiment.

Referring to FIG. 9, in one embodiment, the processor (450) may generate a virtual hand corresponding to the hand based on the skeleton corresponding to the hand. For example, the processor (450) may identify a hand model for hand rendering. Using the hand model, the processor (450) may generate a virtual hand corresponding to (e.g., mapped to) the skeleton obtained through operation 507.

In one embodiment, the processor (450) may display the generated virtual hand, through the display (410), at the position of the obtained skeleton (e.g., at the coordinates (e.g., three-dimensional coordinates) of the key points included in the skeleton). For example, as illustrated in FIG. 9, the processor (450) may display, through the display (410), a screen (910) including virtual hands (911, 912) (e.g., a screen for the left eye) and a screen (920) including virtual hands (921, 922) (e.g., a screen for the right eye).
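To illustrate how one three-dimensional key point could land at different positions on the left-eye and right-eye screens, the sketch below applies a basic pinhole projection from two horizontally offset eye positions. This is an assumption for illustration only: the disclosure does not describe the projection, and the function name `project_to_eyes`, the interpupillary distance `ipd`, and the `focal` parameter are hypothetical.

```python
def project_to_eyes(point, ipd=0.064, focal=1.0):
    """Project a 3D point onto left-eye and right-eye image planes.

    point is (x, y, z) in a head-centered frame with z pointing
    forward; the eyes sit at x = -ipd/2 and x = +ipd/2.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer")
    half = ipd / 2.0
    # Express x relative to each eye, then apply perspective division.
    left = (focal * (x + half) / z, focal * y / z)
    right = (focal * (x - half) / z, focal * y / z)
    return left, right
```

The horizontal difference between the two projections shrinks as the point moves away, which is what gives the rendered virtual hand its apparent depth.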

Referring back to FIG. 5, in one embodiment, the processor (450) may perform, based on the obtained skeleton, an operation of recognizing a gesture of the user (e.g., a hand motion) and an operation of rendering the hand. However, the operations that the processor (450) performs based on the skeleton are not limited to recognizing the user's gesture and rendering the hand.

In one embodiment, if the position of the hand is not obtained within the at least one first image acquired through the first camera (421) (e.g., if no hand is detected within the at least one first image) during or after operation 507, the processor (450) may perform operation 501 and/or operation 503 without performing operations 505 to 507.

FIG. 10 is a flowchart (1000) for explaining a method of performing hand tracking according to one embodiment.

In one embodiment, the operations of FIG. 10 may be included in operations 501 and 503 of FIG. 5.

Referring to FIG. 10, in operation 1001, in one embodiment, the processor (450) may determine whether the position of the user's hand is obtained, based on at least one first image acquired through the first camera (421).

In one embodiment, operation 1001 is at least partially identical or similar to operation 501 of FIG. 5, and a redundant description is therefore omitted.

In one embodiment, the processor (450) may determine whether the position of the user's hand is obtained by determining whether the user's hand is detected within (or from) the at least one first image (e.g., whether a region corresponding to the user's hand exists within the at least one first image). For example, the processor (450) may determine that the position of the user's hand is obtained based on the user's hand being detected within the at least one first image. For example, the processor (450) may determine that the position of the user's hand is not obtained based on the user's hand not being detected within the at least one first image.

In one embodiment, if the position of the user's hand is not obtained in operation 1001, the processor (450) may perform operation 1001 repeatedly (or continuously).

If the position of the user's hand is obtained in operation 1001, in operation 1003, in one embodiment, the processor (450) may obtain the user's gaze point based on at least one second image acquired through the second camera (422).

In one embodiment, when the position of the hand is obtained within the at least one first image, the processor (450) may perform the operation of obtaining the gaze point based on the at least one second image. For example, the processor (450) may start the operation of obtaining the gaze point based on the at least one second image when the position of the hand is obtained within the at least one first image.
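The ordering in FIG. 10, where gaze acquisition starts only once a hand has been found, can be sketched as follows. The function name `acquire_hand_then_gaze` and the `detect_hand` and `acquire_gaze` callables are hypothetical placeholders for operations 1001 and 1003.

```python
def acquire_hand_then_gaze(first_images, detect_hand, acquire_gaze):
    """Repeat hand detection; obtain the gaze point only after a hit.

    detect_hand(image) returns a hand position or None (operation 1001).
    acquire_gaze() returns a gaze point (operation 1003) and is not
    called at all while no hand has been detected.
    """
    for image in first_images:
        hand_pos = detect_hand(image)
        if hand_pos is None:
            continue  # repeat operation 1001 on the next first image
        return hand_pos, acquire_gaze()
    return None  # no hand found in any of the supplied images
```

Deferring the gaze step in this way means the second camera's images need not be processed while no hand is in view.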

FIG. 11 is a flowchart (1100) for explaining a method of performing hand tracking according to one embodiment.

FIG. 12 is a diagram (1200) for explaining a method of performing hand tracking according to one embodiment.

Referring to FIGS. 11 and 12, in operation 1101, in one embodiment, the processor (450) may obtain the position of the user's hand based on an image acquired through the first camera (421).

In one embodiment, operation 1101 is at least partially identical or similar to operation 501 of FIG. 5, and a detailed description is therefore omitted.

In operation 1103, in one embodiment, the processor (450) may obtain the user's gaze point based on an image acquired through the second camera (422).

In one embodiment, operation 1103 is at least partially identical or similar to operation 503 of FIG. 5, and a detailed description is therefore omitted.

In operation 1105, in one embodiment, the processor (450) may determine whether the position of the hand corresponds to the gaze point.

In one embodiment, operation 1105 is at least partially identical or similar to operation 505 of FIG. 5, and a detailed description is therefore omitted.

In operation 1107, in one embodiment, the processor (450) may determine whether an object is located within a specified distance from the position of the hand. For example, if the position of the hand corresponds to the gaze point in operation 1105, the processor (450) may determine whether an object is located within the specified distance from the position of the hand.

In one embodiment, an object (also referred to as a “user interface”) may be an object that is placed in a virtual space set up by the wearable electronic device (401) and with which the user can interact. For example, the object may include an icon, a widget, an image, text, and/or a window that is placed in the virtual space set up by the wearable electronic device (401) and is executable by user input. However, the object is not limited to these examples.

In one embodiment, in FIG. 12, reference numeral 1210 may denote a real space as seen by a user wearing the wearable electronic device (401). In one embodiment, the processor (450) may display, through the display (410), an object (1240) (e.g., a virtual object) within the real space (1210).

In one embodiment, in FIG. 12, based on the distance (d1) between the position (1221) of the user's hand (1220) (e.g., the center position of the back of the hand) and the gaze point (1231) being less than or equal to a specified distance, the processor (450) may determine whether the distance (d2) between the position (1221) of the user's hand (1220) and the position (1241) of the object (1240) is less than or equal to a specified distance.

In one embodiment, the specified distance compared with the distance (d1) between the position (1221) of the user's hand (1220) (e.g., the center position of the back of the hand) and the gaze point (1231) may be set to be the same as, or different from, the specified distance compared with the distance (d2) between the position (1221) of the user's hand (1220) and the position (1241) of the object (1240).

In operation 1109, in one embodiment, the processor (450) may obtain a skeleton including key points related to the hand, based on an object being located within the specified distance from the position of the hand in operation 1107.

In one embodiment, based on no object being located within the specified distance from the position of the hand in operation 1107, the processor (450) may perform operation 1101 (and operation 1103) without performing the operation of obtaining the skeleton.
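The two-stage check of operations 1105 and 1107 can be sketched as two distance tests applied in sequence, one on d1 and one on d2. The function name `should_acquire_skeleton`, the separate `gaze_threshold` and `object_threshold` parameters, and the use of Euclidean distance are assumptions; the text only states that the two specified distances may be the same or different.

```python
import math


def should_acquire_skeleton(hand_pos, gaze_point, object_positions,
                            gaze_threshold, object_threshold):
    """Decide whether to proceed to skeleton acquisition (operation 1109).

    d1 is the hand-to-gaze distance (operation 1105); d2 is the distance
    from the hand to each interactable object (operation 1107).
    """
    d1 = math.dist(hand_pos, gaze_point)
    if d1 > gaze_threshold:
        return False  # hand does not correspond to the gaze point
    # Proceed only if at least one object is close enough to the hand.
    return any(
        math.dist(hand_pos, obj) <= object_threshold
        for obj in object_positions
    )
```

Compared with the flow of FIG. 5, the extra object test further narrows the frames on which key-point estimation runs to those where an interaction is plausible.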

In one embodiment, operation 1109 is at least partially identical or similar to operation 507 of FIG. 5, and a redundant description is therefore omitted.

In operation 1111, in one embodiment, the processor (450) may perform an operation based on the obtained skeleton.

In one embodiment, operation 1111 is at least partially identical or similar to operation 509 of FIG. 5, and a detailed description is therefore omitted.

FIG. 13 is a flowchart (1300) for explaining a method of performing hand tracking according to one embodiment.

In one embodiment, the operations of FIG. 13 may be included in operations 501 and 503 of FIG. 5.

Referring to FIG. 13, in operation 1301, in one embodiment, the processor (450) may determine whether the position of the user's hand is obtained, based on at least one first image acquired through the first camera (421).

In one embodiment, operation 1301 is at least partially identical or similar to operation 501 of FIG. 5, and a redundant description is therefore omitted.

In one embodiment, the processor (450) may determine whether the position of the user's hand is obtained by determining whether the user's hand is detected within (or from) the at least one first image. For example, the processor (450) may determine that the position of the user's hand is obtained based on the user's hand being detected within the at least one first image. For example, the processor (450) may determine that the position of the user's hand is not obtained based on the user's hand not being detected within the at least one first image.

If the position of the user's hand is not obtained in operation 1301, in operation 1303, in one embodiment, the processor (450) may set the second camera (422) to acquire images (at least one second image) at a first frame rate (e.g., a first FPS (frames per second)).

In one embodiment, the processor (450) may obtain the gaze point based on the at least one second image acquired at the first frame rate through the second camera (422).

If the position of the user's hand is obtained in operation 1301, in operation 1305, in one embodiment, the processor (450) may set the second camera (422) to acquire images (at least one second image) at a second frame rate (e.g., a second FPS (frames per second)) higher than the first frame rate.

In one embodiment, the processor (450) may obtain the gaze point based on the at least one second image acquired at the second frame rate through the second camera (422).

In one embodiment, by controlling the second camera (422) to acquire the at least one second image at the second frame rate, which is higher than the first frame rate, when a hand is detected within the at least one first image, the processor (450) may allow a more accurate (or more precise) gaze point to be obtained while the hand is detected.

In one embodiment, by controlling the second camera (422) to acquire the at least one second image at the first frame rate, which is lower than the second frame rate, when no hand is detected within the at least one first image, the processor (450) may reduce the power consumed by the wearable electronic device (401) while no hand is detected.
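The frame-rate selection of operations 1303 and 1305 can be sketched as a small function. The function name `select_gaze_frame_rate` and the default values of 30 and 90 frames per second are placeholders; the disclosure only requires that the second frame rate be higher than the first.

```python
def select_gaze_frame_rate(hand_detected, first_fps=30, second_fps=90):
    """Choose the frame rate for the second (eye-facing) camera.

    Returns the lower first_fps while no hand is detected (saves power)
    and the higher second_fps once a hand is detected (finer gaze data).
    The default numbers are illustrative assumptions only.
    """
    if second_fps <= first_fps:
        raise ValueError("second_fps must be higher than first_fps")
    return second_fps if hand_detected else first_fps
```

The result would then be applied to whatever camera-control interface the device exposes, which is outside the scope of this sketch.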

A wearable electronic device (401) according to one embodiment may include a camera (420) including a first camera (421) and a second camera (422), a display (410), and at least one processor (450) operatively connected to the camera (420) and the display (410). The at least one processor (450) may be configured to obtain a position of a hand of a user based on at least one first image acquired through the first camera (421). The at least one processor (450) may be configured to obtain a gaze point of the user based on at least one second image acquired through the second camera (422). The at least one processor (450) may be configured to determine whether the position of the hand corresponds to the gaze point. The at least one processor (450) may be configured to obtain a skeleton including key points related to the hand, based on the position of the hand corresponding to the gaze point. The at least one processor (450) may be configured to perform an operation related to the hand based on the obtained skeleton.

In one embodiment, the at least one processor (450) may be configured to obtain, within the at least one first image, a region representing the hand and the position of the hand. The at least one processor (450) may be configured to obtain a bounding box that indicates, with reference to the position of the hand, a region including the region representing the hand.
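One way to build a bounding box around the hand region with reference to the hand position is sketched below. This is an assumption for illustration: the function name `bounding_box_around`, the choice to center the box on the hand position, and the `margin` parameter are hypothetical, since the text does not say how the box is derived.

```python
def bounding_box_around(hand_pixels, hand_pos, margin=0):
    """Return (left, top, right, bottom) centered on hand_pos.

    hand_pixels is a non-empty list of (x, y) pixels belonging to the
    hand region. The box is symmetric about hand_pos and large enough
    to contain every hand pixel plus an optional margin.
    """
    if not hand_pixels:
        raise ValueError("hand_pixels must not be empty")
    cx, cy = hand_pos
    # Largest horizontal and vertical reach of the region from the center.
    half_w = max(abs(x - cx) for x, _ in hand_pixels) + margin
    half_h = max(abs(y - cy) for _, y in hand_pixels) + margin
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

Centering on the hand position keeps the hand near the middle of the crop, at the cost of a box that can be larger than the tightest possible one when the region is asymmetric.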

In one embodiment, the at least one processor (450) may be configured to perform the operation of obtaining the gaze point of the user based on the position of the hand being obtained.

In one embodiment, the at least one processor (450) may be configured to obtain a position of a pupil of the user within the at least one second image. The at least one processor (450) may be configured to obtain, within the at least one second image, positions of glints that appear in the at least one second image when light emitted from a plurality of light-emitting units of the electronic device is reflected from an eye of the user. The at least one processor (450) may be configured to obtain the gaze point of the user based on the position of the pupil and on a center position of the glints that is obtained from the positions of the glints.
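The glint-center and pupil-position step can be sketched as follows. The centroid computation follows directly from the text; the final mapping from the pupil-to-glint vector to a gaze point is not described in this passage, so the per-axis linear mapping with `gain` and `offset` (values that would come from a calibration step) is purely an assumption, as are all function names.

```python
def glint_center(glints):
    """Centroid of the glint positions found in the eye image."""
    if not glints:
        raise ValueError("at least one glint is required")
    count = len(glints)
    return (sum(x for x, _ in glints) / count,
            sum(y for _, y in glints) / count)


def pupil_glint_vector(pupil, glints):
    """Vector from the glint center to the pupil position."""
    cx, cy = glint_center(glints)
    return (pupil[0] - cx, pupil[1] - cy)


def gaze_point_from_vector(vector, gain=(1.0, 1.0), offset=(0.0, 0.0)):
    """Map the pupil-glint vector to a gaze point (assumed linear map).

    gain and offset are hypothetical calibration parameters.
    """
    return (gain[0] * vector[0] + offset[0],
            gain[1] * vector[1] + offset[1])
```

Using the glint center as a reference makes the vector depend mainly on eye rotation rather than on small shifts of the device relative to the face, which is the usual motivation for combining the two measurements.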

In one embodiment, the at least one processor (450) may be configured to determine whether the distance between the position of the hand and the gaze point is less than or equal to a specified distance.

In one embodiment, the at least one processor (450) may be configured to perform the operation of obtaining the skeleton based on the distance between the position of the hand and the gaze point being less than or equal to the specified distance. The operation of obtaining the skeleton may not be performed based on the distance between the position of the hand and the gaze point exceeding the specified distance.

In one embodiment, the at least one processor (450) may be configured to recognize, based on the obtained skeleton, a gesture indicated by a movement of the hand of the user.

In one embodiment, the at least one processor (450) may be configured to display, through the display (410), a virtual hand corresponding to the hand of the user based on the obtained skeleton.

In one embodiment, the at least one processor (450) may determine whether an object is located within a specified distance from the position of the hand, based on the position of the hand corresponding to the gaze point. The at least one processor (450) may be configured to obtain the skeleton based on the object being located within the specified distance from the position of the hand.

In one embodiment, the at least one processor (450) may be configured to control the second camera (422) to acquire the at least one second image at a first frame rate when the position of the hand of the user is not obtained. The at least one processor (450) may be configured to control the second camera (422) to acquire the at least one second image at a second frame rate higher than the first frame rate when the position of the hand of the user is obtained.

일 실시예에 따른 웨어러블 전자 장치(401)에서 핸드 트래킹을 수행하는 방법은, 상기 웨어러블 전자 장치(401)의 제 1 카메라(421)를 통하여 획득된 적어도 하나의 제 1 이미지에 기반하여, 사용자의 손의 위치를 획득하는 동작을 포함할 수 있다. 상기 방법은, 상기 웨어러블 전자 장치(401)의 제 2 카메라(422)를 통하여 획득된 적어도 하나의 제 2 이미지에 기반하여, 상기 사용자의 응시 지점을 획득하는 동작을 포함할 수 있다. 상기 방법은, 상기 손의 위치가 상기 응시 지점에 대응하는지 여부를 확인하는 동작을 포함할 수 있다. 상기 방법은, 상기 손의 위치가 상기 응시 지점에 대응함에 기반하여, 상기 손과 관련된 키 포인트들를 포함하는 스켈레톤을 획득하는 동작을 포함할 수 있다. 상기 방법은, 상기 획득된 스켈레톤에 기반하여, 상기 손과 관련된 동작을 수행하는 동작을 포함할 수 있다.A method for performing hand tracking in a wearable electronic device (401) according to one embodiment may include an operation of acquiring a position of a hand of a user based on at least one first image acquired through a first camera (421) of the wearable electronic device (401). The method may include an operation of acquiring a gaze point of the user based on at least one second image acquired through a second camera (422) of the wearable electronic device (401). The method may include an operation of checking whether the position of the hand corresponds to the gaze point. The method may include an operation of acquiring a skeleton including key points related to the hand based on whether the position of the hand corresponds to the gaze point. The method may include an operation of performing an operation related to the hand based on the acquired skeleton.

일 실시예에서, 상기 사용자의 손의 위치를 획득하는 동작은, 상기 적어도 하나의 제 1 이미지 내에서, 상기 손을 나타내는 영역 및 상기 손의 상기 위치를 획득하는 동작을 포함할 수 있다. 상기 사용자의 손의 위치를 획득하는 동작은, 상기 손의 상기 위치를 기준으로 상기 손을 나타내는 영역을 포함하는 영역을 지시하는 바운딩 박스를 획득하는 동작을 포함할 수 있다.In one embodiment, the operation of obtaining the position of the hand of the user may include an operation of obtaining an area representing the hand and the position of the hand within the at least one first image. The operation of obtaining the position of the hand of the user may include an operation of obtaining a bounding box indicating an area including the area representing the hand based on the position of the hand.

일 실시예에서, 상기 사용자의 응시 지점을 획득하는 동작은, 상기 손의 위치가 획득됨에 기반하여, 상기 사용자의 상기 응시 지점을 획득하는 동작을 수행하는 동작을 포함할 수 있다.In one embodiment, the act of obtaining the user's gaze point may include performing the act of obtaining the user's gaze point based on the position of the hand being obtained.

일 실시예에서, 상기 사용자의 응시 지점을 획득하는 동작은, 상기 적어도 하나의 제 2 이미지 내에서 상기 사용자의 동공의 위치를 획득하는 동작을 포함할 수 있다. 상기 사용자의 응시 지점을 획득하는 동작은, 상기 적어도 하나의 제 2 이미지 내에서, 상기 전자 장치의 복수의 발광부들로부터 방출된 광이 상기 사용자의 눈에 반사됨으로써 상기 적어도 하나의 제 2 이미지 내에 표시되는 글린트들의 위치들을 획득하는 동작을 포함할 수 있다. 상기 사용자의 응시 지점을 획득하는 동작은, 상기 글린트들의 위치들에 기반하여 획득된 상기 글린트들의 중심 위치 및 상기 동공의 위치에 기반하여, 상기 사용자의 상기 응시 지점을 획득하는 동작을 포함할 수 있다.In one embodiment, the operation of obtaining the gaze point of the user may include an operation of obtaining a position of a pupil of the user within the at least one second image. The operation of obtaining the gaze point of the user may include an operation of obtaining positions of glints displayed within the at least one second image by light emitted from a plurality of light-emitting units of the electronic device reflected to an eye of the user within the at least one second image. The operation of obtaining the gaze point of the user may include an operation of obtaining the gaze point of the user based on a center position of the glints and a position of the pupil obtained based on the positions of the glints.

일 실시예에서, 상기 손의 위치가 상기 응시 지점에 대응하는지 여부를 확인하는 동작은, 상기 손의 위치 및 상기 응시 지점 간 거리가 지정된 거리 이하인지 여부를 확인하는 동작을 포함할 수 있다.In one embodiment, the act of determining whether the position of the hand corresponds to the gaze point may include determining whether the distance between the position of the hand and the gaze point is less than or equal to a specified distance.

일 실시예에서, 상기 스켈레톤을 획득하는 동작은, 상기 손의 위치 및 상기 응시 지점 간 거리가 지정된 거리 이하임에 기반하여, 상기 스켈레톤을 획득하는 동작을 수행하는 동작을 포함할 수 있다. 상기 손의 위치 및 상기 응시 지점 간 거리가 지정된 거리를 초과함에 기반하여, 상기 스켈레톤을 획득하는 동작이 수행되지 않을 수 있다.In one embodiment, the action of acquiring the skeleton may include performing the action of acquiring the skeleton based on a distance between the position of the hand and the point of gaze being less than or equal to a specified distance. The action of acquiring the skeleton may not be performed based on a distance between the position of the hand and the point of gaze exceeding a specified distance.

In one embodiment, the operation of performing the operation related to the hand may include recognizing, based on the obtained skeleton, a gesture indicated by a movement of the hand of the user.

In one embodiment, the operation of performing the operation related to the hand may include displaying, based on the obtained skeleton, a virtual hand corresponding to the hand of the user through the display (410) of the wearable electronic device (401).

In one embodiment, the operation of performing the operation related to the hand may include determining, based on the position of the hand corresponding to the gaze point, whether an object is located within a specified distance from the position of the hand. The operation of performing the operation related to the hand may include obtaining the skeleton based on the object being located within the specified distance from the position of the hand.
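The two-stage condition in this embodiment (gaze correspondence first, then object proximity) can be sketched as below. The function name, the use of Euclidean distance, and the representation of objects as a list of positions are assumptions made for illustration only.

```python
import math
from typing import Iterable, Sequence


def should_obtain_skeleton(
    hand_pos: Sequence[float],
    gaze_point: Sequence[float],
    object_positions: Iterable[Sequence[float]],  # hypothetical scene objects
    gaze_distance: float,
    object_distance: float,
) -> bool:
    """Decide whether skeleton acquisition should run.

    Stage 1: the hand must correspond to the gaze point.
    Stage 2: at least one object must lie within object_distance of the hand.
    """
    if math.dist(hand_pos, gaze_point) > gaze_distance:
        return False  # hand is not where the user is looking
    return any(
        math.dist(hand_pos, obj) <= object_distance for obj in object_positions
    )
```

The object check is evaluated only after the gaze check passes, matching the order in which the paragraph states the two conditions.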

In one embodiment, the method may further include controlling the second camera (422) so that the second camera (422) obtains the at least one second image at a first frame rate when the position of the hand of the user is not obtained. The method may further include controlling the second camera (422) so that the second camera (422) obtains the at least one second image at a second frame rate higher than the first frame rate when the position of the hand of the user is obtained.
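The frame-rate selection can be sketched as a small policy function. The concrete rates (30 and 90 frames per second) and the function name are hypothetical; the text requires only that the second frame rate be higher than the first.

```python
from typing import Optional, Sequence


def select_second_camera_fps(
    hand_pos: Optional[Sequence[float]],
    first_frame_rate: float = 30.0,   # hypothetical rate while no hand is detected
    second_frame_rate: float = 90.0,  # hypothetical rate once a hand is detected
) -> float:
    """Choose the capture rate of the eye-facing (second) camera."""
    if second_frame_rate <= first_frame_rate:
        raise ValueError("second frame rate must exceed the first frame rate")
    # None models the case in which the hand position has not been obtained.
    return first_frame_rate if hand_pos is None else second_frame_rate
```

Running the eye-facing camera at the lower rate while no hand is present is one plausible way to save power, since the gaze point is only needed once there is a hand position to compare it against.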

In addition, the structure of the data used in the embodiments of the present disclosure described above may be recorded on a computer-readable recording medium through various means. The computer-readable recording medium includes storage media such as magnetic storage media (e.g., ROM, floppy disk, hard disk) and optical reading media (e.g., CD-ROM, DVD).

Claims

A wearable electronic device (401) comprising:
a camera (420) including a first camera (421) and a second camera (422);
a display (410); and
at least one processor (450) operatively connected to the camera (420) and the display (410),
wherein the at least one processor (450) is configured to:
obtain a position of a hand of a user based on at least one first image obtained through the first camera (421),
obtain a gaze point of the user based on at least one second image obtained through the second camera (422),
determine whether the position of the hand corresponds to the gaze point,
obtain, based on the position of the hand corresponding to the gaze point, a skeleton including key points related to the hand, and
perform an operation related to the hand based on the obtained skeleton.

The wearable electronic device (401) of claim 1,
wherein the at least one processor (450) is configured to:
obtain, within the at least one first image, an area representing the hand and the position of the hand, and
obtain, based on the position of the hand, a bounding box indicating a region that includes the area representing the hand.

The wearable electronic device (401) of claim 1 or 2,
wherein the at least one processor (450) is configured to
perform the operation of obtaining the gaze point of the user based on obtaining the position of the hand.

The wearable electronic device (401) of any one of claims 1 to 3,
wherein the at least one processor (450) is configured to:
obtain a position of a pupil of the user within the at least one second image,
obtain, within the at least one second image, positions of glints that appear in the at least one second image when light emitted from a plurality of light-emitting units of the wearable electronic device (401) is reflected by an eye of the user, and
obtain the gaze point of the user based on the position of the pupil and a center position of the glints, the center position being obtained based on the positions of the glints.

The wearable electronic device (401) of any one of claims 1 to 4,
wherein the at least one processor (450) is configured to
determine whether the distance between the position of the hand and the gaze point is less than or equal to a specified distance.

The wearable electronic device (401) of any one of claims 1 to 5,
wherein the at least one processor (450) is
configured to perform the operation of obtaining the skeleton based on the distance between the position of the hand and the gaze point being less than or equal to a specified distance, and
wherein the operation of obtaining the skeleton is not performed based on the distance between the position of the hand and the gaze point exceeding the specified distance.

The wearable electronic device (401) of any one of claims 1 to 6,
wherein the at least one processor (450) is configured to
recognize, based on the obtained skeleton, a gesture indicated by a movement of the hand of the user.

The wearable electronic device (401) of any one of claims 1 to 7,
wherein the at least one processor (450) is configured to
display, based on the obtained skeleton, a virtual hand corresponding to the hand of the user through the display (410).

The wearable electronic device (401) of any one of claims 1 to 8,
wherein the at least one processor (450) is configured to:
determine, based on the position of the hand corresponding to the gaze point, whether an object is located within a specified distance from the position of the hand, and
obtain the skeleton based on the object being located within the specified distance from the position of the hand.

The wearable electronic device (401) of any one of claims 1 to 9,
wherein the at least one processor (450) is configured to:
control the second camera (422) so that the second camera (422) obtains the at least one second image at a first frame rate when the position of the hand of the user is not obtained, and
control the second camera (422) so that the second camera (422) obtains the at least one second image at a second frame rate higher than the first frame rate when the position of the hand of the user is obtained.

A method for performing hand tracking in a wearable electronic device (401), the method comprising:
obtaining a position of a hand of a user based on at least one first image obtained through a first camera (421) of the wearable electronic device (401);
obtaining a gaze point of the user based on at least one second image obtained through a second camera (422) of the wearable electronic device (401);
determining whether the position of the hand corresponds to the gaze point;
obtaining, based on the position of the hand corresponding to the gaze point, a skeleton including key points related to the hand; and
performing an operation related to the hand based on the obtained skeleton.

The method of claim 11,
wherein obtaining the position of the hand of the user comprises:
obtaining, within the at least one first image, an area representing the hand and the position of the hand; and
obtaining, based on the position of the hand, a bounding box indicating a region that includes the area representing the hand.

The method of claim 11 or 12,
wherein obtaining the gaze point of the user comprises
performing the operation of obtaining the gaze point of the user based on obtaining the position of the hand.

The method of any one of claims 11 to 13,
wherein obtaining the gaze point of the user comprises:
obtaining a position of a pupil of the user within the at least one second image;
obtaining, within the at least one second image, positions of glints that appear in the at least one second image when light emitted from a plurality of light-emitting units of the wearable electronic device (401) is reflected by an eye of the user; and
obtaining the gaze point of the user based on the position of the pupil and a center position of the glints, the center position being obtained based on the positions of the glints.

The method of any one of claims 11 to 14,
wherein determining whether the position of the hand corresponds to the gaze point comprises
determining whether the distance between the position of the hand and the gaze point is less than or equal to a specified distance.

The method of any one of claims 11 to 15,
wherein obtaining the skeleton comprises
performing the operation of obtaining the skeleton based on the distance between the position of the hand and the gaze point being less than or equal to a specified distance, and
wherein the operation of obtaining the skeleton is not performed based on the distance between the position of the hand and the gaze point exceeding the specified distance.

The method of any one of claims 11 to 16,
wherein performing the operation related to the hand comprises
recognizing, based on the obtained skeleton, a gesture indicated by a movement of the hand of the user.

The method of any one of claims 11 to 17,
wherein performing the operation related to the hand comprises
displaying, based on the obtained skeleton, a virtual hand corresponding to the hand of the user through a display (410) of the wearable electronic device (401).

The method of any one of claims 11 to 18,
wherein performing the operation related to the hand comprises:
determining, based on the position of the hand corresponding to the gaze point, whether an object is located within a specified distance from the position of the hand; and
obtaining the skeleton based on the object being located within the specified distance from the position of the hand.

The method of any one of claims 11 to 19, further comprising:
controlling the second camera (422) so that the second camera (422) obtains the at least one second image at a first frame rate when the position of the hand of the user is not obtained; and
controlling the second camera (422) so that the second camera (422) obtains the at least one second image at a second frame rate higher than the first frame rate when the position of the hand of the user is obtained.