KR102431386B1 - Method and system for interaction holographic display based on hand gesture recognition

Info

Publication number: KR102431386B1
Application number: KR1020200118393A
Authority: KR
Inventors: 이찬휘; 권순철; 유지상
Original assignee: 광운대학교 산학협력단
Priority date: 2020-09-15
Filing date: 2020-09-15
Publication date: 2022-08-09
Anticipated expiration: 2040-09-15
Also published as: KR20220036146A

Abstract

According to the present invention, an interactive hologram display system based on hand gesture recognition may be provided, the system comprising: a sensor for gesture recognition; a hand gesture interaction processing device configured to receive depth image data from the sensor, generate hologram data based on hand gesture recognition, and transmit the generated hologram data in real time; and a hologram display device configured to provide a hologram display based on the hologram data received from the hand gesture interaction processing device.

Description

Interaction hologram display method and system based on hand gesture recognition

The present invention relates to an interactive hologram display method and system based on hand gesture recognition. More specifically, it relates to a hologram display system that a user can interact with through hand gestures, recognized by applying deep learning and image processing to depth images, and to a method therefor.

Recently, there have been many attempts at hand-based interaction between people and devices such as computers and mobile phones. A representative product, Leap Motion, offers excellent hand detection performance, letting users control a computer through gesture interaction and even play simple games. For example, Leap Motion developed a system that, applied to a computer, can control it without a mouse or keyboard, such as raising or lowering the volume and turning pages on the screen. Gesture interaction thus makes it easy to control other devices. More recently, mobile phones can take a photo without being touched, triggered by a gesture of closing and opening the palm. As these examples show, interaction with other devices through hand gesture recognition is being actively researched.

Three-dimensional (3D) display technology for experiencing content realistically is also being actively researched, with global companies such as Microsoft and Google releasing 3D display products. 3D display technology can be broadly divided into glasses-type and glasses-free methods. In the glasses-type method, the viewer must wear something in front of the eyes, such as glasses, to see the image; in the glasses-free method, the viewer can observe the 3D display without any device worn in front of the eyes. Among glasses-free methods, the digital hologram display has the advantage that, unlike the others, it does not cause fatigue from focus-convergence mismatch.

Holographic display technology makes a digital hologram display observable from any direction over 360 degrees. Because a holographic display relies on the diffraction and interference of light, all lighting must be turned off for observation. Under this condition, the display must be operated in a dark environment without light, and the image shown on the hologram display has to be controlled through a computer.

Therefore, a new hand gesture recognition system is needed that makes it easy to control a hologram through gesture interaction in a dark environment.

Republic of Korea Patent Registration No. 10-1092909

An object of the present invention is to provide an interactive hologram display method and system that let a user interact with a hologram display through hand gestures by applying deep learning and image processing to depth images.

Another object of the present invention is to provide an interactive hologram display method and system that enable accurate, real-time interaction through intuitive gestures in the dark environment required for observing a hologram display.

Another object of the present invention is to provide a hand gesture interaction hologram display method and system that can define and recognize more intuitive gestures using joint information.

The problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

In one embodiment of the present invention, an interactive hologram display system based on hand gesture recognition may be provided, the system comprising: a sensor for gesture recognition; a hand gesture interaction processing device configured to receive depth image data from the sensor, generate hologram data based on hand gesture recognition, and transmit the generated hologram data in real time; and a hologram display device configured to provide a hologram display based on the hologram data received from the hand gesture interaction processing device.

In addition, the hand gesture interaction processing device may include an interaction control unit configured to perform hand gesture recognition, and the interaction control unit may be configured to receive a depth image from the sensor.

In addition, the interaction control unit may perform pre-processing of the depth image, and the pre-processing may include background subtraction for removing the depth information of the background and of structures other than the user.

In addition, the background subtraction may remove the background and structures by using only the depth information whose difference between the first frame and the subsequent frames exceeds a predetermined threshold.

In addition, the pre-processing of the depth image may include region of interest (ROI) setting, and the hand may be detected only within the set ROI.

In addition, the interaction control unit may obtain joint information for the entire hand by passing the pre-processed image through a deep learning model trained in advance on a data set.

In addition, the interaction control unit may be configured to set a bounding box around the center of the hand based on the joint information of the entire hand, and to perform gesture recognition when a finger moves beyond the bounding box.

In addition, the interaction control unit may be configured to recognize up, down, left, and right gestures based on whether the index finger or middle finger moves away from the center of the hand by a predetermined reference or more.

In addition, the interaction control unit may be configured to recognize a zoom-in or zoom-out gesture by using an array that continuously stores the distance between the tips of the thumb and index finger, based on whether the distance values accumulated in the array change by a predetermined reference or more.

In addition, the hand gesture interaction processing device may further include a real-time hologram data transmitter configured to transmit the hologram data to the hologram display device.

In another embodiment of the present invention, a method for providing an interactive hologram display based on hand gesture recognition may be provided, the method comprising: acquiring depth image data with a gesture recognition sensor; receiving, in a hand gesture interaction processing device, the depth image data from the gesture recognition sensor, generating hologram data based on hand gesture recognition, and transmitting the generated hologram data in real time; and providing a hologram display on a hologram display device based on the hologram data received from the hand gesture interaction processing device.

In addition, the method may further include performing pre-processing of the depth image in the hand gesture interaction processing device, and the pre-processing may include background subtraction for removing the depth information of the background and of structures other than the user.

In addition, the background subtraction may remove the background and structures by using only the depth information whose difference between the previous frame and the subsequent frames exceeds a predetermined threshold.

In addition, the pre-processing of the depth image may further include an ROI (region of interest) setting step, and the hand may be detected only within the set ROI.

In addition, the method may further include obtaining, in the hand gesture interaction processing device, joint information for the entire hand by passing the pre-processed image through a deep learning model trained in advance on a data set.

In addition, the method may further include setting, in the hand gesture interaction processing device, a bounding box around the center of the hand based on the joint information of the entire hand, and performing gesture recognition when a finger moves beyond the bounding box.

In addition, the method may further include recognizing, in the hand gesture interaction processing device, up, down, left, and right gestures based on whether the index finger or middle finger moves away from the center of the hand by a predetermined reference or more.

In addition, the method may further include recognizing, in the hand gesture interaction processing device, a zoom-in or zoom-out gesture by using an array that continuously stores the distance between the tips of the thumb and index finger, based on whether the distance values accumulated in the array change by a predetermined reference or more.

In addition, the hand gesture interaction processing device may transmit the hologram data to the hologram display device through a real-time hologram data transmitter.

In yet another embodiment of the present invention, a computer-readable recording medium storing a program for implementing the above-described method may be provided.

According to the present invention, interaction using the user's hand gestures, which has been difficult in the dark environment in which a hologram display is observed, becomes possible: using only the hands and no other tools, the user can manipulate the hologram accurately into the desired view with intuitive gestures and observe it comfortably.

In addition, according to the present invention, a person in a situation where touching objects by hand is difficult, such as a surgeon in the middle of an operation, can view and control the hologram display as desired through gestures alone.

In addition, according to the present invention, it is possible to provide an interactive hologram display method and system that let a user interact with a hologram display through hand gestures by applying deep learning and image processing to depth images.

In addition, according to the present invention, it is possible to provide a hand gesture interaction hologram display method and system that can define and recognize more intuitive gestures using joint information.

The effects of the present invention are not limited to those mentioned above, and other technical effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is an exemplary view showing the actual appearance of a hologram display device.
FIG. 1 is a block diagram illustrating a gesture interaction operation according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an interactive hologram display system according to an embodiment of the present invention.
FIG. 3 is an exemplary view illustrating the pre-processing of a depth image according to an embodiment of the present invention.
FIG. 4 is an exemplary view showing a background subtraction result according to an embodiment of the present invention.
FIG. 5 is a conceptual diagram illustrating the structure of CrossInfoNet, a deep learning model for finding hand joint information using only depth information, according to an embodiment of the present invention.
FIG. 6 is an exemplary view showing an ROI setting result, hand joint information, and a bounding box processing result according to an embodiment of the present invention.
FIG. 7 is an exemplary view illustrating gesture recognition for up, down, left, right, zoom-in, zoom-out, and continuous rotation (left and right) according to an embodiment of the present invention.
FIG. 8 is an exemplary view showing the hologram in its default state and after the basic up, down, left, and right rotations according to an embodiment of the present invention.
FIG. 9 is an exemplary view showing the hologram in its default state and after zooming in and out according to an embodiment of the present invention.
FIGS. 10A and 10B are tables showing experimental results according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice the invention. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the embodiments clearly.

The terms used herein are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression may include the plural unless the context clearly indicates otherwise.

In the present specification, terms such as "comprise", "have", or "include" are intended to indicate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In addition, the components shown in the embodiments of the present invention are illustrated independently to represent different characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. That is, the components are listed separately for convenience of description, and at least two of them may be combined into one component, or one component may be divided into several components that together perform its function. Such integrated and separated embodiments of the components are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, the following embodiments are provided to explain the invention more clearly to those of ordinary skill in the art, and the shapes and sizes of elements in the drawings may be exaggerated for clarity.

Hereinafter, preferred embodiments of the present invention are described with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating a gesture interaction operation according to an embodiment of the present invention.

Most high-performing hand detection models use color information. In a hologram display environment, however, color information is unavailable, so only depth information is used. Because a hologram display is surrounded by many other installed structures, background subtraction and ROI setting are applied so that only the necessary depth information is retained for detecting the hand. Gestures are defined using the joint information of the detected hand, and when the user performs a matching motion the gesture is recognized and output. The system can also be designed to operate with a delay of one second between gestures.
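The one-second delay between gestures could be realized as a simple cooldown. The following is a minimal sketch, not the patent's implementation; the class name `GestureCooldown` and the injectable `clock` parameter are assumptions introduced here for illustration and testing.

```python
import time


class GestureCooldown:
    """Suppress gestures that arrive too soon after the last accepted one.

    Hypothetical helper: the source only states that a 1-second delay
    between gestures may be used, not how it is implemented.
    """

    def __init__(self, delay_s=1.0, clock=time.monotonic):
        self.delay_s = delay_s
        self.clock = clock
        self._last = None  # time at which the last gesture was accepted

    def accept(self, gesture):
        """Return the gesture if the cooldown has elapsed, otherwise None."""
        now = self.clock()
        if self._last is not None and now - self._last < self.delay_s:
            return None
        self._last = now
        return gesture
```

Each recognized gesture would be passed through `accept` before being forwarded, so that a single hand motion does not trigger several commands in quick succession.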

More specifically, referring to FIG. 1, depth image data may first be acquired from a gesture recognition sensor such as a depth camera (S110). For example, the gesture recognition sensor may be any of various sensor products that include a depth camera, such as Azure Kinect.

The depth image obtained from the sensor may be pre-processed, including background subtraction and region of interest (ROI) setting (S120). For example, background subtraction is applied first: the difference between the first frame captured when the video starts and each subsequent frame is computed, only depth values whose difference is at or above a given threshold are kept, and all others are set to 0, which removes the depth information of the background and of structures other than the user from the image. The purpose of the ROI setting step is to avoid estimating a hand from depth information that background subtraction may have failed to remove.
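The first-frame background subtraction described above can be sketched as follows. This is an illustrative sketch under stated assumptions: depth frames are plain nested lists of numeric depth values, the function name `subtract_background` is hypothetical, and the comparison is "at or above the threshold" as in this paragraph (the summary of the invention words it as "exceeds").

```python
def subtract_background(first_frame, frame, threshold):
    """Keep depth values that changed by at least `threshold` since the first frame.

    first_frame, frame: 2D lists of depth values with the same shape (assumed format).
    Pixels whose depth is close to the reference frame are treated as static
    background or structures and set to 0.
    """
    result = []
    for ref_row, row in zip(first_frame, frame):
        result.append(
            [d if abs(d - r) >= threshold else 0 for r, d in zip(ref_row, row)]
        )
    return result
```

For example, with a reference frame of constant depth 1000 and a threshold of 50, a pixel that now reads 600 (a hand entering the scene) is kept, while a pixel that reads 1005 is zeroed.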

In addition, ROI processing during pre-processing allows hand detection to be performed quickly and efficiently. For example, because the user's hand is always in roughly the same place when performing gestures, the ROI location is chosen with a mouse click and used as a fixed ROI, so that depth information from other structures is not mistaken for the hand. Since no hand is searched for outside the ROI, the amount of data processed decreases and processing speed increases.
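A fixed ROI of this kind might be applied as in the sketch below. The source says only that the ROI is chosen with a mouse click; defining it from two clicked corner points, and the function names `roi_from_clicks` and `apply_roi`, are assumptions made for illustration.

```python
def roi_from_clicks(p1, p2):
    """Build a fixed ROI (x0, y0, x1, y1) from two clicked corners, in any order.

    x1 and y1 are exclusive bounds. The two-click convention is an assumption.
    """
    (xa, ya), (xb, yb) = p1, p2
    return (min(xa, xb), min(ya, yb), max(xa, xb) + 1, max(ya, yb) + 1)


def apply_roi(depth, roi):
    """Zero every depth pixel outside the ROI so no hand is searched for there."""
    x0, y0, x1, y1 = roi
    return [
        [d if (x0 <= x < x1 and y0 <= y < y1) else 0 for x, d in enumerate(row)]
        for y, row in enumerate(depth)
    ]
```

Masking rather than cropping keeps pixel coordinates unchanged, which is convenient if later steps report joint positions in full-image coordinates; cropping would reduce the data volume further.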

Next, hand detection may be performed on the pre-processed image (S130). For example, the joint information of the hand can be found by passing the pre-processed image through CrossInfoNet, a deep learning model for finding hand joint information, trained in advance on the NYU data set. The structure of CrossInfoNet will be described later with reference to FIG. 6.

In addition, the joint information found can be used to set and draw a bounding box centered on the user's hand. This bounding box acts as the reference threshold: a gesture is output only when the gesturing motion goes beyond the bounding box.
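The bounding-box threshold can be illustrated with a short sketch. The source does not specify how the hand center or the box size is computed; taking the centroid of all 2D joints as the center and using a fixed half-size are assumptions, and all function names are hypothetical.

```python
def hand_center(joints):
    """Centroid of the 2D joint positions (an assumed definition of the hand center)."""
    n = len(joints)
    return (sum(x for x, _ in joints) / n, sum(y for _, y in joints) / n)


def bounding_box(center, half_size):
    """Axis-aligned square box (x0, y0, x1, y1) centered on the hand."""
    cx, cy = center
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


def is_outside(point, box):
    """True when a joint (for example a fingertip) has moved beyond the box."""
    x, y = point
    x0, y0, x1, y1 = box
    return x < x0 or x > x1 or y < y0 or y > y1
```

Gesture recognition would then be triggered only for frames in which `is_outside` holds for the relevant fingertip.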

Next, gesture interaction is performed, and the hologram display may be controlled based on it (S140). For example, gestures may be defined and recognized using the joint information found by a deep learning model such as CrossInfoNet.

First, the up, down, left, and right gestures may be defined, for example, as swiping the user's index and middle fingers in the respective direction; when the index and middle fingers move away from the center of the hand by the reference threshold or more, the corresponding gesture is output.
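One way to turn this definition into code is sketched below. It is an assumption-laden illustration: the two fingertips are averaged into a single point, the dominant axis of its offset from the hand center picks the direction, and image coordinates are assumed to have y increasing downward. The function name `swipe_direction` is hypothetical.

```python
def swipe_direction(center, index_tip, middle_tip, threshold):
    """Return 'up', 'down', 'left', 'right', or None if below the threshold.

    center, index_tip, middle_tip: (x, y) image coordinates, y growing downward.
    Averaging the index and middle fingertips is an illustrative choice.
    """
    cx, cy = center
    tx = (index_tip[0] + middle_tip[0]) / 2
    ty = (index_tip[1] + middle_tip[1]) / 2
    dx, dy = tx - cx, ty - cy
    if max(abs(dx), abs(dy)) < threshold:
        return None  # fingertips have not moved far enough from the hand center
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

In practice the threshold would correspond to the bounding box around the hand center, so that small finger movements inside the box produce no output.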

For the zoom-in and zoom-out gestures, for example, the distance between the thumb and index finger may be stored continuously in a heap-type array from the moment the video starts. When the stored thumb-to-index distance increases, a zoom-in gesture is output; when it decreases, a zoom-out gesture is output.
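The distance-history idea can be sketched with a rolling buffer. Note the substitution: instead of the heap-type array mentioned in the text, this sketch uses a fixed-length `collections.deque` and compares the newest distance with the oldest one. The class name, window length, and minimum change are hypothetical parameters.

```python
from collections import deque
import math


class PinchZoomDetector:
    """Detect zoom gestures from the thumb-tip to index-tip distance over time."""

    def __init__(self, window=10, min_change=30.0):
        self.history = deque(maxlen=window)  # most recent distances
        self.min_change = min_change         # assumed reference for a valid change

    def update(self, thumb_tip, index_tip):
        """Add one frame's fingertips; return 'zoom_in', 'zoom_out', or None."""
        self.history.append(math.dist(thumb_tip, index_tip))
        if len(self.history) < self.history.maxlen:
            return None  # not enough frames collected yet
        change = self.history[-1] - self.history[0]
        if change >= self.min_change:
            self.history.clear()
            return "zoom_in"
        if change <= -self.min_change:
            self.history.clear()
            return "zoom_out"
        return None
```

Clearing the history after a detection keeps one pinch motion from being reported repeatedly on consecutive frames.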

For the continuous left and right rotation gestures, for example, the gesture may be defined as swiping left or right with the hand fully open; when the tips of all fingers except the thumb move away from the center of the hand by the reference threshold or more, the continuous left or right rotation gesture is output.

To avoid confusion between the up, down, left, and right gestures and the continuous left and right rotation gestures, the dot product of the vectors between each of the five fingers and the center of the hand may be used. The dot product is normalized to a value between -1 and 1: it is close to 1 when the finger is extended and close to -1 when the finger is bent, which makes it possible to disambiguate the gestures.
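A normalized dot product of this kind is the cosine of the angle between two vectors. The source does not say exactly which vectors are used; the sketch below assumes the vector from the hand center to a finger's base joint and the vector from that base joint to the fingertip, which gives a score near 1 for an extended finger and near -1 for a finger folded back toward the palm. Function names and the `min_score` cutoff are hypothetical.

```python
import math


def extension_score(center, finger_base, finger_tip):
    """Cosine between (center -> base) and (base -> tip), in the range [-1, 1]."""
    v1 = (finger_base[0] - center[0], finger_base[1] - center[1])
    v2 = (finger_tip[0] - finger_base[0], finger_tip[1] - finger_base[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate joints: treat as undecided
    return (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)


def is_open_hand(center, fingers, min_score=0.5):
    """fingers: list of (base, tip) pairs for all five fingers."""
    return all(extension_score(center, b, t) >= min_score for b, t in fingers)
```

With such a score, a swipe performed with all five fingers extended could be routed to the continuous rotation gestures, while a swipe with only the index and middle fingers extended could be routed to the up, down, left, and right gestures.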

A gesture output through this process is sent to a server, for example over UDP/IP: when the binary code of the output gesture is transmitted, the corresponding command message is delivered to the hologram display, which changes from its default state to a new display reflecting the gesture. This whole process can operate in real time.
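Sending a gesture code over UDP might look like the sketch below. The one-byte code table, the function names, and the host and port arguments are all hypothetical; the source states only that a binary code for the gesture is sent to a server over UDP/IP.

```python
import socket

# Hypothetical one-byte codes; the actual binary codes are not given in the source.
GESTURE_CODES = {
    "up": 0x01, "down": 0x02, "left": 0x03, "right": 0x04,
    "zoom_in": 0x05, "zoom_out": 0x06,
    "rotate_left": 0x07, "rotate_right": 0x08,
}


def encode_gesture(name):
    """Encode a gesture name as a single-byte datagram payload."""
    return bytes([GESTURE_CODES[name]])


def decode_gesture(payload):
    """Inverse of encode_gesture, as the receiving server might apply it."""
    code = payload[0]
    for name, value in GESTURE_CODES.items():
        if value == code:
            return name
    raise ValueError(f"unknown gesture code: {code}")


def send_gesture(name, host, port):
    """Send one gesture as a UDP datagram (no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_gesture(name), (host, port))
```

UDP keeps latency low, which suits real-time control, but a lost datagram means a lost gesture; whether that trade-off is acceptable depends on the deployment.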

FIG. 2 is a block diagram illustrating an interactive hologram display system according to an embodiment of the present invention.

The gesture recognition sensor 100 is a sensor for gesture recognition and may be any of various sensors that include a depth camera, such as Azure Kinect.

The hand gesture interaction processing device 200 may be configured to receive depth image data from the sensor, generate hologram data based on hand gesture recognition, and transmit the generated hologram data in real time. It may include a CGH generator 210, a hologram data storage unit 220, an interaction control unit 230, and a real-time hologram data transmitter 240.

The CGH generator 210 is configured to generate holograms as computer-generated holograms (CGH), and the generated holograms may be transferred as files and stored in the hologram data storage unit 220. The hologram data storage unit 220 may consist of one or more memories of various types, and the hologram data may also be stored on a cloud server as needed.

The interaction control unit 230 is configured to perform hand gesture recognition and may receive, from the gesture recognition sensor 100, 3D motion data including a depth image that contains the user's hand.

Here, the interaction control unit 230 may include a program or program module executable by one or more processors. The programs or program modules included in the interaction control unit 230 may take the form of an operating system, an application program, or another program, and may be physically stored on various widely used types of storage devices. Such programs or program modules may include, without limitation, one or more routines, subroutines, programs, objects, components, instructions, data structures, and other forms that perform particular tasks or implement particular data types.

The interaction control unit 230 performs preprocessing of the depth image, and this preprocessing may include background subtraction to remove the depth information of the background and structures other than the user. Here, the background subtraction may include removing the background and structures by using only depth information for which the difference in depth between the first frame and subsequent frames exceeds a predetermined threshold.

In addition, the interaction control unit 230 may obtain joint information for the entire hand by passing the preprocessed image through CrossInfoNet, a deep learning model trained in advance on a dataset. The interaction control unit 230 also sets a bounding box around the center of the hand based on the joint information of the entire hand, and may be configured to perform gesture recognition when the movement of a finger crosses the bounding box.

In addition, the interaction control unit 230 is configured to recognize up, down, left, and right gestures based on whether the index or middle finger moves away from the center of the hand by at least a predetermined amount. It may also be configured to recognize a zoom-in or zoom-out gesture by using an array that continuously stores the distance between the tips of the thumb and index finger, based on whether the distance values accumulated in the array change by at least a predetermined amount.

The real-time hologram data transmitter 240 may include a communication unit configured to transmit hologram data, whose transmission is controlled by the interaction control unit 230, to the hologram display device 300. When the gesture result defined and recognized in the process above is fed to the server as input, for example over UDP/IP communication, the server can send the corresponding hologram to the transmitting end.
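As a rough illustration of passing a recognized gesture to a server over UDP, the following Python sketch serializes a gesture label and sends it as a single datagram. The patent does not specify a message format, so the JSON payload, the field names `gesture` and `seq`, and the function names are all hypothetical.

```python
import json
import socket


def encode_gesture(gesture: str, sequence: int) -> bytes:
    """Serialize a gesture result; the JSON layout is an assumption, not from the patent."""
    return json.dumps({"gesture": gesture, "seq": sequence}).encode("utf-8")


def decode_gesture(payload: bytes) -> dict:
    """Inverse of encode_gesture, as the receiving server might apply it."""
    return json.loads(payload.decode("utf-8"))


def send_gesture(gesture: str, sequence: int, host: str, port: int) -> None:
    """Send one gesture result as a UDP datagram (no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_gesture(gesture, sequence), (host, port))
```

Because UDP does not guarantee delivery or ordering, a sequence number like the one assumed here lets the receiver discard stale or duplicated datagrams.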

The hologram display device 300 may be configured to provide a hologram display based on the hologram data received from the hand gesture interaction processing device. The hologram display device 300 reads the signal received in this way and can show a hologram display driven by the hand gesture interaction.

With this configuration, the user can observe a new hologram display in response to gestures performed in real time, and because the interaction takes place with the hand positioned over the hologram display device 300, the 3D hologram display and the user's hand can interact with each other.

FIG. 3 is an exemplary diagram illustrating the preprocessing of a depth image according to an embodiment of the present invention.

With reference to FIG. 3, a method is described that uses only depth information for which the difference in depth between the first frame, captured when imaging starts, and subsequent frames exceeds a threshold. For example, when the camera is turned on, the first frame received contains only the background and structures, without the user. The user enters the scene in later frames, and the depth difference from the first frame is computed continuously. By using only the depth information of pixels whose depth difference exceeds a predefined reference threshold, the background and surrounding structures are removed, yielding an image in which only the depth information of the user and the user's hand is available.
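The first-frame differencing described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the frame representation (nested lists of depth values in millimetres, with 0 meaning "no data"), the function name `remove_background`, and the 50 mm default threshold are assumptions.

```python
from typing import List

DepthFrame = List[List[int]]  # depth in millimetres; 0 means "no measurement" (assumed)


def remove_background(first: DepthFrame, current: DepthFrame,
                      threshold_mm: int = 50) -> DepthFrame:
    """Keep only pixels whose depth differs from the first frame by more than the threshold."""
    result = []
    for ref_row, cur_row in zip(first, current):
        result.append([
            cur if abs(cur - ref) > threshold_mm else 0
            for ref, cur in zip(ref_row, cur_row)
        ])
    return result


# The first frame holds only the static scene; the later frame contains a hand at about 400 mm.
background = [[1000, 1000, 1000],
              [1000, 1000, 1000]]
with_hand = [[1000, 400, 1010],
             [990, 420, 1000]]
foreground = remove_background(background, with_hand)
```

Small fluctuations such as the 10 mm differences in this example fall below the threshold and are suppressed, which is why the threshold needs to sit above the sensor's noise level.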

FIG. 4 is an exemplary diagram showing a background removal result according to an embodiment of the present invention.

For example, when a gesture recognition sensor 100 such as Azure Kinect is used, the bottom part of the hologram display lies closer to the sensor than the user's hand. In addition, several cameras for eye tracking are attached to the hologram display and produce depth information of their own, which can interfere with accurate hand detection. Because the depth information of everything other than the moving person must therefore be removed, background subtraction can be used during preprocessing to delete the depth information of the background and structures. The left image of FIG. 4 shows the view before background removal, and the right image of FIG. 4 shows the view after it.

FIG. 5 is a conceptual diagram illustrating the structure of CrossInfoNet, a deep learning model for finding hand joint information using only depth information, according to an embodiment of the present invention.

According to one example of the present invention, the CrossInfoNet deep learning model was used to detect the user's hand. CrossInfoNet is a deep learning model that detects a hand from depth information. Unlike other existing models, which detect the whole hand image and then find the joint information, CrossInfoNet detects the whole hand once and then detects the palm and the fingers again separately.

Referring to FIG. 5, the whole hand is first detected to obtain coarse joint information for the fingers and the palm. The palm and finger joint information is then refined in two separate branches, and each branch passes its refined joint information to the other. That is, the refined palm information is passed to the branch that refines the fingers, and the refined finger information is passed to the branch that refines the palm. The palm branch holds the coarse palm joint information received through a skip connection, the refined palm joint information, and the refined finger information shared by the other branch. When subtraction is applied to this information, the coarse and refined palm information cancels out and only the refined finger information remains.

The finger joint information is obtained in this way. Likewise, the finger branch can perform subtraction using the coarse finger joint information received through a skip connection, the refined finger joint information, and the shared refined palm information. The result is that the refined palm information remains.

The finger and palm information obtained through this process is shared continuously as training proceeds, which ultimately yields finger and palm joint information with high accuracy. Finally, the finger and palm joint information is combined to obtain the joint information of the entire hand.

FIG. 6 is an exemplary diagram showing the ROI setting result, the hand joint information, and the bounding box result according to an embodiment of the present invention.

Although the background subtraction described above erases the background and structures, some noise remains. A region of interest (ROI) can therefore be set so that the hand is searched for only inside the ROI. In a hologram display environment, the camera position and the place where the user gestures are always the same, so the ROI can be set statically.
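A static ROI of this kind can be illustrated with a short Python sketch that zeroes out every depth value outside a fixed rectangle, so that later hand detection only sees the gesture area. The `(top, left, bottom, right)` tuple layout and the name `mask_outside_roi` are assumptions made for this example.

```python
from typing import List, Tuple

DepthFrame = List[List[int]]
Roi = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive (assumed)


def mask_outside_roi(frame: DepthFrame, roi: Roi) -> DepthFrame:
    """Zero out depth values outside the fixed ROI while keeping pixel coordinates unchanged."""
    top, left, bottom, right = roi
    return [
        [value if top <= y < bottom and left <= x < right else 0
         for x, value in enumerate(row)]
        for y, row in enumerate(frame)
    ]


frame = [[5, 5, 5],
         [5, 5, 5],
         [5, 5, 5]]
masked = mask_outside_roi(frame, (1, 1, 3, 3))
```

Masking rather than cropping keeps joint coordinates in the original image frame, which avoids an offset correction later; cropping would be an equally reasonable choice.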

In addition, after the hand is detected, a bounding box is set around the center of the hand. The bounding box is drawn according to the threshold used for gesture recognition, and when a finger crosses the box the corresponding gesture can be output.

Referring to the right image of FIG. 6, the joint information found within the ROI can be used to set a bounding box drawn around the user's hand. The bounding box acts as the reference threshold, so a gesture is output only when the gesture motion goes beyond the box. In one embodiment, training on the NYU hand dataset yields 14 hand joints including the palm center, and the detected hand with its joint points and bounding box can be seen as shown in FIG. 7.
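The bounding-box rule can be sketched as a small Python function: no gesture is reported while the fingertip stays inside a box centered on the hand, and a direction is reported once it leaves. The square box, the `half_size` parameter, the image coordinate convention (y grows downward), and the function name are assumptions for illustration; the patent does not fix these details, and left and right may be mirrored depending on how the camera faces the user.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward (assumed)


def direction_outside_box(center: Point, fingertip: Point,
                          half_size: float) -> Optional[str]:
    """Return the swipe direction once the fingertip leaves the box, else None."""
    dx = fingertip[0] - center[0]
    dy = fingertip[1] - center[1]
    if abs(dx) <= half_size and abs(dy) <= half_size:
        return None  # fingertip still inside the bounding box: no gesture
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

For instance, with the hand center at (100, 100) and a half-size of 40, a fingertip at (100, 40) lies above the box and yields "up".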

FIG. 7 is an exemplary diagram illustrating gesture recognition for the up, down, left, right, zoom-in, zoom-out, and continuous rotation (left, right) gestures according to an embodiment of the present invention.

Each gesture can be defined using the relative positions of the 14 joints obtained from hand detection as in FIG. 6, the dot product for each finger, and the distances between fingers.

The gesture types are up, down, left, right, zoom-in, zoom-out, and continuous rotation left and right. An up, down, left, or right gesture can be output when the relative position of the middle finger above, below, left of, or right of the hand center exceeds the reference threshold and, for each of the ring finger and the little finger, the dot product of its vector with the palm is negative.

The zoom-in and zoom-out gestures can use a heap-type array that continuously stores the distance between the tips of the thumb and index finger. A zoom-in gesture can be output when the distance values accumulated in the array tend to increase beyond the reference threshold, and a zoom-out gesture when they tend to decrease beyond the reference threshold.
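One way to picture the pinch-distance rule is the Python sketch below. It swaps in a fixed-length `collections.deque` for the "heap-type array" named in the text, and compares the newest stored distance with the oldest one; the window length, the threshold value, the class name `PinchTracker`, and the choice to clear the buffer after a gesture fires are all assumptions.

```python
from collections import deque
from typing import Deque, Optional


class PinchTracker:
    """Tracks thumb-to-index fingertip distance and reports zoom gestures."""

    def __init__(self, window: int = 10, threshold: float = 30.0) -> None:
        self.distances: Deque[float] = deque(maxlen=window)  # oldest entries drop off
        self.threshold = threshold

    def update(self, distance: float) -> Optional[str]:
        """Store the latest distance and return a gesture label if the change is large enough."""
        self.distances.append(distance)
        if len(self.distances) < 2:
            return None
        change = self.distances[-1] - self.distances[0]
        if change > self.threshold:
            self.distances.clear()  # start fresh so one motion fires only once
            return "zoom_in"
        if change < -self.threshold:
            self.distances.clear()
            return "zoom_out"
        return None


tracker = PinchTracker(window=5, threshold=30.0)
opening = [tracker.update(d) for d in (20, 30, 45, 60)]
closing = [tracker.update(d) for d in (60, 40, 25)]
```

In this run the fingers spread from 20 to 60 units, which crosses the threshold on the fourth sample, and then close from 60 to 25, which crosses it in the opposite direction on the third sample.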

For the continuous rotation left and right gestures, the dot product of the vector with the palm must first exceed the reference threshold for all five fingers. Then, unlike the basic left and right rotation, the gesture can be output when the positions of the ring finger and little finger relative to the palm center also exceed the respective left or right reference threshold.

Because the basic left and right gestures can overlap with the continuous rotation left and right gestures, they can be told apart using the dot product of the finger and palm vectors. For each finger, the dot product is taken between the finger vector and the vector between the finger and the hand center, and its value is normalized to the range from -1 to 1. When a finger is bent, the dot product becomes a negative value close to -1, and when the finger is extended, it becomes close to 1. This makes it possible to determine for each of the five fingers whether it is bent or extended, and to distinguish the gestures accordingly.
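The bent-versus-extended test can be illustrated with a normalized dot product (cosine) in Python. The text does not say exactly which joints define the two vectors, so this sketch assumes one plausible choice: the vector from the hand center to the finger's base joint, and the vector from the base joint to the fingertip. The function names are likewise hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def normalized_dot(u: Vec, v: Vec) -> float:
    """Cosine of the angle between u and v, in [-1, 1]; 0.0 if either vector is zero."""
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def finger_extension(center: Vec, base: Vec, tip: Vec) -> float:
    """Near 1 when the finger points away from the hand center, near -1 when it curls back."""
    return normalized_dot(sub(base, center), sub(tip, base))


extended = finger_extension((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0))
bent = finger_extension((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.5, 0.0))
```

Normalizing by the vector lengths makes the score independent of hand size and distance from the sensor, so a single threshold can be applied to all five fingers.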

For example, there may be eight hand gestures in total: basic rotation up, down, left, and right; zoom-in and zoom-out; and continuous rotation left and right. First, basic rotation up, down, left, and right can be identified as a swipe with the index and middle fingers held straight. Intuitively, swiping up, down, left, or right outputs the gesture matching the swipe direction. Second, the zoom gestures use the thumb and index finger: spreading them apart from a pinched position outputs the zoom-in gesture, and bringing them together from a spread position outputs the zoom-out gesture. Finally, continuous rotation left and right can be recognized as a swipe to the left or right with the whole hand open.

FIG. 8 is an exemplary diagram showing the hologram in the default state and after basic rotation up, down, left, and right according to an embodiment of the present invention.

FIG. 8 shows the results of the basic rotation gestures. From the default state, an up gesture rotates the hologram image upward and a down gesture rotates it downward. Likewise, left and right gestures rotate it to the left and right, respectively.

FIG. 9 is an exemplary diagram showing the hologram in the default state and after zooming in and out according to an embodiment of the present invention.

FIG. 9 shows the results of the zoom-in and zoom-out gestures. From the default state, a zoom-in gesture enlarges the hologram image and a zoom-out gesture shrinks it.

FIGS. 10A and 10B are tables showing experimental results according to an embodiment of the present invention.

An experiment was conducted with 10 people on the 8 gestures defined according to the present invention. All subjects were told how to perform the gestures and that there is a 1-second delay, and each performed every gesture 10 times. All subjects were tested in the same environment, with the hand kept 35 to 50 cm from the Azure Kinect and the subject facing the camera directly. Each subject thus performed 80 trials in total across the 8 gestures, so each gesture was tested 100 times.

The table in FIG. 10A shows the TP, FP, FN, precision, recall, and F1 score for each gesture over 100 trials. The Up gesture produced 2 FN results, and the zoom-in and zoom-out gestures each produced 1 FP result. The continuous rotation Right gesture produced 7 FN results. Precision, recall, and F1 score were calculated from these results. All gestures except zoom-in and zoom-out achieved a precision of 100, while zoom-in and zoom-out each achieved a precision of 99. All gestures except Up and continuous rotation Right achieved a recall of 100; Up achieved a recall of 98 and continuous rotation Right a recall of 93. The F1 score calculated from precision and recall was 98.98 for Up, 99.49 for zoom-in and zoom-out, 96.37 for continuous rotation Right, and 100 for the remaining gestures.
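The reported metrics follow from the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = 2 · precision · recall / (precision + recall). The Python sketch below recomputes them from the counts given in the text. The TP values (98 for Up, 100 for zoom, 93 for continuous rotation Right) are inferred from 100 trials per gesture and are an assumption, since the table itself is not reproduced here.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) as fractions in [0, 1]."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


up = precision_recall_f1(tp=98, fp=0, fn=2)
zoom = precision_recall_f1(tp=100, fp=1, fn=0)
rotate_right = precision_recall_f1(tp=93, fp=0, fn=7)
```

Under these assumed counts the computed values agree with the reported figures to within the last displayed digit, for example an F1 of about 0.9637 for continuous rotation Right.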

The various embodiments described herein may be implemented in hardware, middleware, microcode, software, and/or combinations thereof. For example, various embodiments may be implemented within one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, other electronic units designed to perform the functions presented herein, or a combination thereof.

Also, for example, various embodiments may be embodied in or encoded on a computer-readable medium containing instructions. Instructions embodied in or encoded on a computer-readable medium may cause a programmable processor or other processor to perform a method, for example when the instructions are executed. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. For example, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures accessible by a computer.

Such hardware, software, firmware, and the like may be implemented within the same device or within separate devices to support the various operations and functions described herein. In addition, the elements, units, modules, components, and the like described as "units" in the present invention may be implemented together, or separately as discrete but interoperable logic devices. The depiction of different features as modules, units, and the like is intended to highlight different functional aspects and does not necessarily imply that they must be realized by separate hardware or software components. Rather, the functionality associated with one or more modules or units may be performed by separate hardware or software components, or integrated within common or separate hardware or software components.

Although operations are shown in the drawings in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of the various components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described components can generally be integrated together in a single software product or packaged into multiple software products.

The present invention has been described with reference to the embodiments shown in the drawings, but these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

100: gesture recognition sensor
200: hand gesture interaction device
210: CGH generator
220: hologram data storage unit
230: interaction control unit
240: real-time hologram data transmitter
300: hologram display device

Claims

An interactive hologram display system based on hand gesture recognition, comprising:
a sensor for gesture recognition;
a hand gesture interaction processing device configured to receive depth image data from the sensor, generate hologram data based on hand gesture recognition, and transmit the generated hologram data in real time; and
a hologram display device configured to provide a hologram display based on the hologram data received from the hand gesture interaction processing device,
wherein the hand gesture interaction processing device includes an interaction control unit configured to perform hand gesture recognition, the interaction control unit being configured to receive a depth image from the sensor,
wherein the interaction control unit sets a bounding box around the center of the hand based on joint information of the entire hand, and is configured to perform gesture recognition when the movement of a finger crosses the bounding box,
wherein the interaction control unit is configured to recognize up, down, left, and right gestures when the relative position of the middle finger above, below, left of, or right of the center of the hand exceeds a reference threshold and the dot product of the vector with the palm for each of the ring finger and the little finger is negative,
wherein the interaction control unit is configured to recognize continuous rotation left and right gestures when the value of the dot product of the vector with the palm exceeds a reference threshold for all five fingers, and
wherein the interaction control unit uses a heap-type array that continuously stores the distance between the tips of the thumb and the index finger, and is configured to recognize a zoom-in or zoom-out gesture based on whether the distance values accumulated in the array change by at least a predetermined reference amount.

delete

The interactive hologram display system according to claim 1, wherein the interaction control unit performs preprocessing of the depth image, and the preprocessing of the depth image includes background subtraction processing for removing depth information of the background and structures other than the user.

The interactive hologram display system according to claim 3, wherein the background subtraction processing removes the background and the structures by using only depth information for which the difference in depth between a first frame and subsequent frames exceeds a predetermined threshold.

The interactive hologram display system according to claim 3, wherein the pre-processing of the depth image includes an ROI (region of interest) setting process, and is configured to detect a hand only within the set ROI.

The interactive hologram display system according to claim 3, wherein the interaction control unit obtains joint information of the entire hand by passing the preprocessed image through a deep learning model trained in advance on a dataset.

delete

The interactive hologram display system according to claim 1, wherein the hand gesture interaction processing device further comprises a real-time hologram data transmitter configured to transmit the hologram data to the hologram display device.

A method for providing an interactive hologram display based on hand gesture recognition, the method comprising:
acquiring depth image data from a gesture recognition sensor;
receiving, in a hand gesture interaction processing device, the depth image data from the gesture recognition sensor, generating hologram data based on hand gesture recognition, and transmitting the generated hologram data in real time; and
providing, in a hologram display device, a hologram display based on the hologram data received from the hand gesture interaction processing device,
the method further comprising:
setting, in the hand gesture interaction processing device, a bounding box around the center of the hand based on joint information of the entire hand, and performing gesture recognition when the movement of a finger crosses the bounding box;
recognizing, in the hand gesture interaction processing device, up, down, left, and right gestures when the relative position of the middle finger above, below, left of, or right of the center of the hand exceeds a reference threshold and the dot product of the vector with the palm for each of the ring finger and the little finger is negative;
recognizing, in the hand gesture interaction processing device, continuous rotation left and right gestures when the value of the dot product of the vector with the palm exceeds a reference threshold for all five fingers; and
recognizing, in the hand gesture interaction processing device, a zoom-in or zoom-out gesture by using a heap-type array that continuously stores the distance between the tips of the thumb and the index finger, based on whether the distance values accumulated in the array change by at least a predetermined reference amount.

The method of claim 11, further comprising performing preprocessing of a depth image in the hand gesture interaction processing device, wherein the preprocessing of the depth image includes background subtraction processing for removing depth information of the background and structures other than the user.

The method according to claim 12, wherein the background subtraction processing removes the background and the structures by using only depth information for which the difference in depth between a previous frame and subsequent frames exceeds a predetermined threshold.

The method of claim 13, wherein the preprocessing of the depth image further comprises an ROI (region of interest) setting step, and a hand is detected only within the set ROI.

The method of claim 12, further comprising obtaining, in the hand gesture interaction processing device, joint information of the entire hand by passing the preprocessed image through a deep learning model trained in advance on a dataset.

delete

The method of claim 11, wherein the hand gesture interaction processing device transmits hologram data to the hologram display device through a real-time hologram data transmitter.

A computer-readable recording medium storing a program for implementing the method according to any one of claims 11 to 15 and 19.