KR102044526B1

KR102044526B1 - Method of increasing speech recognition based on artificial intelligence and device of implementing thereof

Info

Publication number: KR102044526B1
Application number: KR1020170164586A
Authority: KR
Inventors: 이재훈; 원재용; 이흥규
Original assignee: 엘지전자 주식회사
Priority date: 2017-12-01
Filing date: 2017-12-01
Publication date: 2019-11-13
Anticipated expiration: 2037-12-01
Also published as: KR20190065094A

Abstract

The present invention relates to a method for improving speech recognition based on artificial intelligence and a device implementing the same. A device for improving speech recognition according to an embodiment of the present invention includes: a voice input unit that receives a user's voice; a user confirmation unit that confirms the user who uttered the voice; a central control unit that analyzes the input voice and the confirmation result of the user confirmation unit to generate a start word determination result for the voice, a start word being a word indicating that a command controlling a function of the device or of a peer device adjacent to the device will follow; and a device function control unit that controls the function of the device under the control of the central control unit.

Description

METHOD OF INCREASING SPEECH RECOGNITION BASED ON ARTIFICIAL INTELLIGENCE AND DEVICE OF IMPLEMENTING THEREOF

The present invention relates to a method for improving speech recognition and a device implementing the same.

Recently, various technologies have been proposed for controlling home appliances and similar devices by voice. In particular, research continues on ways to raise the speech recognition rate so that such devices recognize human speech quickly, for example by installing microphones in various arrangements or by developing new speech recognition processing modules.

Meanwhile, because speech recognition must work in situations where many variables arise, a device needs to respond to those variables in ways that increase recognition accuracy. This specification therefore presents a way for devices placed in a home or in an office space of a building to perform speech recognition accurately.

To solve the problem described above, this specification aims to provide a method and device that handle ambient noise or an uncertain speech recognition state when changing the mode of the device so that it can receive a command.

This specification also aims to provide a method and device in which the device or adjacent devices confirm the presence of a person, in order to increase the accuracy of start word recognition for entering the command input mode.

This specification further aims to provide a method and device that temporarily control the sensitivity of speech recognition while entering the command input mode, increasing the accuracy of start word recognition, and that restore the sensitivity after the start word is recognized so that the recognition rate for ordinary commands is maintained.

The objects of the present invention are not limited to those mentioned above. Other objects and advantages not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means recited in the claims and combinations thereof.

A device for improving speech recognition according to an embodiment of the present invention includes: a voice input unit that receives a user's voice; a user confirmation unit that confirms the user who uttered the voice; a central control unit that analyzes the input voice and the confirmation result of the user confirmation unit to generate a start word determination result for the voice, a start word being a word indicating that a command controlling a function of the device or of a peer device adjacent to the device will follow; and a device function control unit that controls the function of the device under the control of the central control unit.

A method for improving speech recognition according to another embodiment of the present invention, performed in a device that carries out speech recognition, includes: receiving, by a voice input unit, a user's voice; confirming, by a user confirmation unit, the user who uttered the voice; and generating, by a central control unit, a start word determination result for the voice by analyzing the input voice and the confirmation result of the user confirmation unit, a start word being a word indicating that a command controlling a function of the device or of a peer device adjacent to the device will follow.

When the present invention is applied, if an input start word does not match the originally stored voice model but comes close to the sensitivity parameter, the device enters a start word verification mode, which can raise the recognition rate of a start word uttered subsequently.

In addition, when the present invention is applied, devices in the start word verification mode reduce noise or control a microphone or the like to favor start word recognition, which can raise the recognition rate of the start word.

In addition, when the present invention is applied, if it is uncertain whether a recognized voice is a start word and a person is nearby, the sensitivity parameter is raised for a specific time and device noise is reduced. If the start word is then uttered once more, it can be accepted as a start word the second time even if its similarity is at the same level as before.

The effects of the present invention are not limited to those described above, and those skilled in the art can readily derive various effects of the present invention from its configuration.

FIG. 1 shows the configuration of a device according to an embodiment of the present invention.
FIG. 2 shows the operation of the device when a user's voice is input, according to an embodiment of the present invention.
FIG. 3 shows the modes the device may have, according to an embodiment of the present invention.
FIGS. 4 and 5 show the operation of a device whose user confirmation unit includes a PIR sensor, according to an embodiment of the present invention.
FIGS. 6 and 7 show the operation of a device whose user confirmation unit includes a camera sensor, according to another embodiment of the present invention.
FIG. 8 shows a process of recognizing a start word in cooperation with peer devices, according to an embodiment of the present invention.
FIG. 9 shows a process in which devices around a device that has entered the start word verification mode also reduce noise or perform operations that support start word input, according to an embodiment of the present invention.
FIG. 10 shows a process in which multiple devices perform speech recognition, according to another embodiment of the present invention.
FIG. 11 shows a process in which a device performs only speech recognition and controls other adjacent devices to enter the command input mode, according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may be implemented in many different forms and is not limited to the embodiments described herein.

To describe the present invention clearly, parts unrelated to the description are omitted, and identical or similar components are given the same reference numerals throughout the specification. Some embodiments of the present invention are described in detail with reference to exemplary drawings. In assigning reference numerals to the components of each drawing, identical components are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related well-known configurations or functions may be omitted where they could obscure the gist of the invention.

In describing the components of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, order, or number of the components. When a component is described as being "connected", "coupled", or "joined" to another component, the component may be directly connected or joined to that other component, but it should be understood that another component may be "interposed" between them, or that the components may be "connected", "coupled", or "joined" through another component.

In implementing the present invention, components may be subdivided for convenience of description, but these components may be implemented within a single device or module, or a single component may be implemented across multiple devices or modules.

In this specification, home appliances and the like that are placed in a particular space and perform predetermined functions are collectively called devices. Among them, devices that perform speech recognition are called speech recognition devices. Two or more devices placed in a particular space can exchange control messages using a communication function.

In this specification, the messages that a user utters and that devices need to recognize are divided into two kinds in order to raise the speech recognition rate. The language a user utters is divided into three categories in total: start words, commands, and dummy words. A start word tells the device that a command will follow. It may be, for example, the category name of a device ("TV", "radio", "refrigerator"), a device brand ("Whisen", "Tromm"), or an interjection or conversational word ("hey", "here").

A command instructs the device to operate and may be configured in various ways according to the type of device. In one embodiment, a command may turn the device on or off or instruct it to perform a specific function provided in the device. Commands may be configured differently for each device.

Below, the configuration and method by which a device distinguishes between start words and commands uttered by a user, and responds to each, are examined in more detail.

FIG. 1 shows the configuration of a device according to an embodiment of the present invention. The central control unit 150, which controls the device 100, controls the various components of the device 100. The components presented are a voice input unit 110, a user confirmation unit 120, a device function control unit 130, a start word database unit 160, a command database unit 170, and a communication unit 180. The central control unit 150 can recognize a start word or approve a recognized start word, and can control the operation of the device.

Approval of a start word works as follows. The central control unit 150 controls the device 100 to enter the start word verification mode based on the start word determination result for an input voice (the first voice). If the voice input unit 110 then receives a voice (the second voice) containing a start word or a command, the first voice, which was input before the device entered the start word verification mode, is stored in the start word database unit 160 so that the same first voice can be recognized as a start word in the future. This is described in detail at S60a and S60b of FIGS. 5 and 7.

FIG. 1 does not show the components with which the device 100 provides its specific functions. For example, if the device 100 is a TV, it may separately have a display panel and a power supply unit. If the device 100 is an air conditioner, it may separately have components that provide air conditioning. If the device 100 is a washing machine, it may separately have components that provide washing. If the device 100 is a refrigerator, it may separately have components that provide refrigeration and freezing.

The voice input unit 110 is a module that receives the user's voice; a microphone is one embodiment. The voice input unit 110 may be integrated into the device 100, or it may be placed outside the device, converting the input voice into a file and providing it to the device 100. The voice input unit 110 receives voice data from the microphone and passes it to the central control unit 150 so that the central control unit 150 can recognize the voice.

The user confirmation unit 120 confirms the user who uttered the voice. Confirming the user includes checking whether the user is currently in the space where the device is placed. In one embodiment, the user confirmation unit 120 may be a sensor that senses whether a person is located near the device. The sensor may be a camera sensor that photographs the surroundings of the device, a passive infrared (PIR) sensor that detects whether a person is present, or a motion sensor that detects a person's movement. The user confirmation unit 120 of the present invention is not limited to a particular sensor.

The user confirmation unit 120 can also confirm that a user is present in the space by means other than sensing. In one embodiment, the user confirmation unit 120 checks the time at which the device 100, or a peer device (another device placed near the device 100 in the same space), was controlled. If the device or the peer device was controlled within a preset time, the user confirmation unit 120 can inform the central control unit 150 that a user has been confirmed. The central control unit 150 then treats a user as confirmed near the device 100 and evaluates the voice received by the voice input unit 110.
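The sensor-free presence check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `RecentControlPresence`, the 60-second default window, and the use of plain numeric timestamps are all assumptions introduced for the example.

```python
import time
from typing import Optional


class RecentControlPresence:
    """Infers user presence from how recently a device or peer was controlled.

    Hypothetical helper: the name, the default window, and the timestamp
    representation are assumptions, not taken from the patent text.
    """

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        self._last_control: Optional[float] = None

    def record_control(self, timestamp: Optional[float] = None) -> None:
        # Called whenever this device or a peer device is operated by a user.
        self._last_control = time.monotonic() if timestamp is None else timestamp

    def user_confirmed(self, now: Optional[float] = None) -> bool:
        # A user counts as present if a control event fell within the window.
        if self._last_control is None:
            return False
        current = time.monotonic() if now is None else now
        return 0 <= current - self._last_control <= self.window_seconds
```

Passing explicit timestamps keeps the check deterministic in tests; in a running device the monotonic clock default would be used instead.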

The device function control unit 130 controls the functions of the device and may itself be controlled by the central control unit 150. The functions vary with the type of device. The device function control unit 130 further includes a control interface unit 135, which receives signals from a remote control with which the user controls the device, or which is arranged on the device in the form of buttons to provide an interface through which the user can control functions. The control interface unit 135 may further include a speaker that outputs voice or sound to the outside.

The start word database unit 160 stores the data needed to check, by comparison, whether a voice received by the voice input unit 110 is a start word. The start word database unit 160 can also store data so that, depending on the voice characteristics of the user who utters it, different voicings of the same start word can each be treated as the start word. The command database unit 170 stores the data needed to check, by comparison, whether a voice received by the voice input unit 110 is a command.

The start word database unit 160 or the command database unit 170 may store voice file data for preset start words or commands, or data needed to identify voice files, and the stored data may be updated or extended. The start word database unit 160 may also store the start words of other adjacent devices in addition to those of the device itself. This is described with reference to FIG. 10.

Although not shown in the drawings, the central control unit 150 may include separate storage, such as a memory card or memory chip, to increase processing speed or to temporarily store voice files input during a certain period.

To summarize: the central control unit 150 analyzes the voice received by the voice input unit 110 and the result of the user confirmation unit 120 checking for the user's presence, and generates a start word determination result for the input voice, where a start word indicates that a command controlling a function of the device 100 or of a peer device adjacent to the device 100 will follow. The start word determination result is a judgment of how closely the input voice matches a start word.

A start word indicates that a command will follow and switches the device 100 to the command input mode (mode switch or mode entry). Depending on the start word determination result, the device 100 may be in one of the command input mode, the start word verification mode, or the normal mode. The central control unit 150 can compare the voice data received from the voice input unit 110 with the acoustic model stored in the start word database unit 160 to determine whether the input voice data corresponds to a start word.

When the device 100 switches to (enters) the command input mode, it may announce through a built-in speaker or the like that it is ready to receive a command, or blink an LED lamp. For example, after confirming the start word, the device 100 outputs a voice prompt such as "Please speak" so that the user can confirm that it has switched to a mode in which it can receive commands.

Alternatively, an LED lamp that is off in the normal mode may blink in the command input mode so that the user can confirm that the device 100 is currently able to receive commands. According to another embodiment of the present invention, in the start word verification mode the device may also output a voice prompt such as "Yes?" to ask the user to say the start word once more.

The start word determination result may be generated numerically. The input is compared with the data stored in the start word database unit 160, and the degree of match with the stored start word data is computed as a number. If this number is at or above a certain threshold (for example, 80% or 70%), the start word determination result is start word confirmation.

Conversely, if the degree of match with the start word data stored in the start word database unit 160 is low (for example, 40% or 50% or below), the start word determination result is start word mismatch.

For results that fall between start word confirmation and start word mismatch, this specification treats the result as start word ambiguity, and the central control unit 150 controls the device to enter the start word verification mode.
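The three-way determination described above can be sketched as a small classifier. The thresholds 0.8 and 0.4 are taken from the examples in the text; the names `Determination` and `classify`, and the representation of the match as a fraction between 0 and 1, are assumptions made for illustration.

```python
from enum import Enum


class Determination(Enum):
    CONFIRMED = "start word confirmation"   # device enters command input mode
    AMBIGUOUS = "start word ambiguity"      # device enters start word verification mode
    MISMATCH = "start word mismatch"        # device stays in normal mode


def classify(match: float, confirm_at: float = 0.8, mismatch_at: float = 0.4) -> Determination:
    """Maps a numeric match score (0.0 to 1.0) to a determination result.

    Hypothetical function: the thresholds follow the examples in the text
    (80% and 40%) but are assumptions as configured defaults.
    """
    if not 0.0 <= match <= 1.0:
        raise ValueError("match must be between 0.0 and 1.0")
    if match >= confirm_at:
        return Determination.CONFIRMED
    if match <= mismatch_at:
        return Determination.MISMATCH
    return Determination.AMBIGUOUS
```

Keeping the two thresholds as parameters reflects that the text offers alternative example values (80% or 70%, 40% or 50%) rather than fixed constants.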

The start word verification mode is a state the device switches to when the voice input is not confirmed as a start word but has some similarity to one, so that the device can better receive the start word. That is, in one embodiment, when the central control unit 150 generates a start word determination result of start word ambiguity, the device switches to a mode in which, within a preset time, it controls the speech recognition parameters of the voice input unit to raise recognition sensitivity, reduces noise generated by the device itself, or reduces noise generated by adjacent peer devices.

When the central control unit 150 judges, under the preset sensitivity parameter, that a voice is similar to a start word but is not one, the user confirmation unit 120 checks for the user's presence. If a user is confirmed, the sensitivity parameter is temporarily raised and the noise of operating appliances is reduced, which can raise the start word recognition rate. The sensitivity parameter is raised only temporarily: it is raised for start word recognition and then restored when a command is to be received, so that commands are received accurately.

For example, suppose the start word is "시작하자" ("sijakhaja", meaning "let's start") but the input voice is "시자카자" ("sijakaja"). If the sensitivity parameter is raised in order to recognize the start word when it is uttered again, the start word recognition rate rises and the voice "sijakaja" can be recognized as the start word.

A command, however, instructs the device 100 to perform a specific operation and must be recognized precisely. In one embodiment, therefore, the sensitivity parameter is restored after the start word is input so that the command can be received.
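The raise-then-restore behavior of the sensitivity parameter can be sketched with a context manager, which restores the original value even if recognition fails midway. The `Recognizer` class, the `raised_sensitivity` helper, the 0.0 to 1.0 scale, and the boost size are all hypothetical; the text does not specify how the parameter is represented.

```python
from contextlib import contextmanager
from typing import Iterator


class Recognizer:
    """Stand-in for the voice input unit's recognizer (hypothetical)."""

    def __init__(self, sensitivity: float = 0.5) -> None:
        self.sensitivity = sensitivity


@contextmanager
def raised_sensitivity(recognizer: Recognizer, boost: float = 0.2) -> Iterator[Recognizer]:
    # Raise sensitivity only for the duration of start word verification.
    original = recognizer.sensitivity
    recognizer.sensitivity = min(1.0, original + boost)
    try:
        yield recognizer
    finally:
        # Always restore, so that commands are recognized at the normal setting.
        recognizer.sensitivity = original
```

A caller would wrap only the start word re-listening step in `with raised_sensitivity(recognizer): ...` and perform command recognition after the block exits.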

Furthermore, suppose only the voice model for "시작하자" ("sijakhaja") is stored in the start word database unit 160 as a start word. If the central control unit 150 confirms that, with the sensitivity parameter raised, the voice "시자카자" ("sijakaja") keeps being recognized as the start word (for example, a command is input after "sijakaja" is recognized as the start word), the central control unit 150 approves the voice model "sijakaja" as a new start word and stores it in the start word database unit 160. This is examined in more detail below.
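The approval step can be sketched as follows: the ambiguous first voice is held while the device is in verification mode and is added to the start word store only once a follow-up utterance is accepted. The names `StartWordStore` and `VerificationSession` are hypothetical, and plain strings stand in for voice models, which in practice would be acoustic data.

```python
from typing import Iterable, Optional, Set


class StartWordStore:
    """Stand-in for the start word database unit (hypothetical)."""

    def __init__(self, models: Iterable[str]) -> None:
        self.models: Set[str] = set(models)


class VerificationSession:
    """Holds the ambiguous first voice until a follow-up confirms it."""

    def __init__(self, store: StartWordStore) -> None:
        self.store = store
        self.pending: Optional[str] = None

    def enter_verification(self, first_voice: str) -> None:
        # The first voice was ambiguous; remember it for possible approval.
        self.pending = first_voice

    def on_followup_accepted(self) -> bool:
        # A second voice (start word or command) arrived: approve the first voice.
        if self.pending is None:
            return False
        self.store.models.add(self.pending)
        self.pending = None
        return True

    def on_abandoned(self) -> None:
        # No follow-up arrived; discard the candidate without storing it.
        self.pending = None
```

For instance, a session started with `enter_verification("sijakaja")` adds that model to the store when `on_followup_accepted()` is called, and leaves the store unchanged if `on_abandoned()` is called instead.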

FIG. 2 shows the operation of the device when a user's voice is input, according to an embodiment of the present invention. FIG. 2 consists of two flows; a timer is added so that speech recognition suited to the start word verification mode, which verifies start word input, can be performed for a certain period.

The process may start in the normal mode, a standby state in which no start word or command has been input. First, a user's voice is input to the voice input unit 110 (S1). The central control unit 150 then determines whether the input voice is a start word (S2). It does so by comparing the input voice with information stored in the start word database unit 160 (such as a voice file or word file used for comparison), measuring the similarity, and judging according to that similarity.

If the similarity measurement confirms a start word, that is, if the voice comparison result obtained by comparing the input voice with the information stored in the start word database unit 160 satisfies the parameter indicating the preset voice matching state (S3), the central control unit 150 confirms that a start word was uttered and proceeds to the command input mode.

When the command input mode is entered directly from the normal mode, the timer is not running (S4), so the components of the device 100 perform command recognition (S6) and finish. Finishing means that the device 100 enters the normal mode to await a new wake word or command. If the timer is running (S4), the device cancels the timer, restores the sensitivity, and restores its operation to the previous state (S5), and then proceeds to step S6.

Meanwhile, if the parameter is not satisfied in S3 but the voice comparison result is close or similar to the parameter indicating the preset voice matching state (S8), the device can decide whether to enter the wake word verification mode. To that end, the user confirmation unit 120 checks whether a user is present (S9). If a user is determined to be present, the central control unit 150 enters the wake word verification mode and sets or resets the timer (S10).

In this mode, to verify the wake word, the central control unit 150 increases the voice recognition sensitivity of the voice input unit 110 for a fixed period (the time set in the timer) and reduces the noise of the device itself or of nearby devices (S20). For this purpose, the communication unit 180 of the device 100 may transmit a message instructing nearby devices to reduce their noise.

In S20, when the wake word determination result produced by the central control unit 150 indicates that the wake word is ambiguous (wake word ambiguity), the central control unit 150 enters the wake word verification mode based on that result. Wake word ambiguity means that the input voice is not accepted as the wake word but its similarity is at or above a preset threshold.

For example, if a voice is recognized as the wake word only when its waveform matches by 80% or more, a match of 60% to 79% can be judged as wake word ambiguity. Alternatively, if K or more of the N characters of the wake word are recognized as identical, the device can treat this as an ambiguous state and enter the wake word verification mode.
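The two ambiguity criteria above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `WakeWordResult`, `classify_by_similarity`, and `classify_by_characters` are hypothetical, the 0.80 and 0.60 thresholds are taken from the example figures, and the position-by-position character comparison is an assumption, since the description does not say how matching characters are counted.

```python
from enum import Enum


class WakeWordResult(Enum):
    ACCEPT = "accept"        # treated as the wake word
    AMBIGUOUS = "ambiguous"  # wake word ambiguity: candidate for verification mode
    REJECT = "reject"        # not a wake word


ACCEPT_THRESHOLD = 0.80     # "80% or more" in the example
AMBIGUOUS_THRESHOLD = 0.60  # lower bound of the "60% to 79%" band


def classify_by_similarity(similarity: float) -> WakeWordResult:
    """Classify a waveform similarity score in the range 0.0 to 1.0."""
    if similarity >= ACCEPT_THRESHOLD:
        return WakeWordResult.ACCEPT
    if similarity >= AMBIGUOUS_THRESHOLD:
        return WakeWordResult.AMBIGUOUS
    return WakeWordResult.REJECT


def classify_by_characters(wake_word: str, heard: str, k: int) -> WakeWordResult:
    """Classify by counting characters that match at the same position.

    Spaces are ignored. Position-wise comparison is an assumption made
    for this sketch.
    """
    target = wake_word.replace(" ", "")
    candidate = heard.replace(" ", "")
    if candidate == target:
        return WakeWordResult.ACCEPT
    matched = sum(1 for a, b in zip(target, candidate) if a == b)
    if matched >= k:
        return WakeWordResult.AMBIGUOUS
    return WakeWordResult.REJECT
```

With K = 3, an utterance heard as "엘지 드롬" against the four-character wake word "엘지 트롬" matches three characters and is classified as ambiguous rather than rejected.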

In one embodiment, the central control unit 150 raises the sensitivity of speech recognition by controlling the voice recognition parameters of the voice input unit 110 for a preset time. During this time, the voice input unit 110 can sense and receive quieter sounds than in the normal mode.

Also in S20, when the wake word determination result is wake word ambiguity and the central control unit 150 enters the wake word verification mode, the communication unit 180, which exchanges messages with nearby peer devices, may send a peer device a message instructing it to reduce the noise it generates. In one embodiment, an air conditioner that is checking a wake word sends a message asking an adjacent peer device, a TV, to lower its volume.
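The noise reduction message exchanged with a peer device might look like the sketch below. The description does not define a message format, so everything here is an assumption made for illustration: the JSON encoding, the `REDUCE_NOISE` type, the field names, the `QUIET_VOLUME` level, and both function names.

```python
import json

QUIET_VOLUME = 3  # hypothetical volume level a peer drops to while the wake word is verified


def build_noise_reduction_request(sender_id: str, duration_s: float) -> str:
    """Build the message a device sends to a peer when it enters verification mode."""
    message = {"type": "REDUCE_NOISE", "sender": sender_id, "duration_s": duration_s}
    return json.dumps(message)


def handle_peer_message(raw: str, current_volume: int) -> int:
    """Return the volume a peer (for example a TV) should use after receiving a message."""
    message = json.loads(raw)
    if message.get("type") == "REDUCE_NOISE":
        # Never raise the volume: a peer that is already quiet stays as it is.
        return min(current_volume, QUIET_VOLUME)
    return current_volume
```

A peer would be expected to return to its earlier volume once `duration_s` has elapsed, mirroring the restoration performed when the timer expires.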

After S20, the process returns to step S1 to wait for the user's voice input. The timer set earlier keeps running during this time. If the timer expires (S12), the device assumes that no further wake word will be input. To move from the wake word verification mode to the normal mode, it cancels the timer, restores the sensitivity, and ends the paused or muted state that the device and other devices entered for noise reduction, restoring their previous operation (S13).

If the user speaks again after S20 (S1), the central control unit 150 can recognize the wake word more accurately (S2), because the sensitivity has been increased and device noise reduced, and can act accordingly.

If no wake word is input in this second round of voice input, or if no voice at all is input before the time set in the timer elapses, the central control unit 150 switches the device 100 to the normal mode when the timer expires.

In summary, the normal mode is the mode in which the device 100 can receive a wake word. If the wake word is recognized with sufficient confidence, the device 100 enters the command input mode. If the wake word is recognized only incompletely, the device 100 enters the wake word verification mode and, for a fixed time, suppresses the noise of the device or nearby devices so that the wake word can be input and recognized.

FIG. 3 is a diagram showing the modes that a device according to an embodiment of the present invention can have. From the standpoint of speech recognition, the device 100 has three modes. In one embodiment these are the normal mode (STATE_N), in which no voice has been recognized and the device waits for voice input; the command input mode (STATE_C), in which a wake word has been input and the device receives a command; and the wake word verification mode (STATE_R), in which the input could not be clearly confirmed as a wake word and the device waits for the wake word to be input again.

When a voice is input in the normal mode (STATE_N) (S31), the device checks whether it is a wake word and, depending on the result (the wake word determination result), enters the command input mode (STATE_C) (S32), enters the wake word verification mode (STATE_R) (S33), or, if the voice is found not to be a wake word, returns to the normal mode (STATE_N) (S34).

If the voice is determined to be the wake word, the central control unit 150 switches the device to the command input mode so that a command can be received. To let the user know that the command input mode has been entered, the device 100 may output a voice prompt such as "Please speak".

When a command is input in the command input mode (STATE_C), the device executes the command and then enters the normal mode (STATE_N). If no command is input within a fixed time, it likewise returns to the normal mode (STATE_N) (S35).

Similarly, in the wake word verification mode (STATE_R), the device enters the command input mode (STATE_C) when the wake word is input again (S36), and returns to the normal mode (STATE_N) if no command is input within a fixed time or if the subsequent voice is not a wake word (S37). On entering the wake word verification mode (STATE_R), the device 100 can increase the sensitivity of the voice input unit 110 and reduce noise, making it more likely that the wake word is picked up.

In the wake word verification mode (STATE_R), which is entered when the central control unit 150 judges the wake word determination result to be wake word ambiguity, the central control unit 150 can control the device function control unit 130 to reduce the noise generated by the device. It can also control nearby devices to reduce their noise.
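The three modes and the transitions S32 to S37 can be written as a small lookup table. This is only a sketch: the mode names come from FIG. 3, but the event labels (`wake_word`, `ambiguous`, `not_wake_word`, `command`, `timeout`) and the `next_mode` function are hypothetical. The `ambiguous` event is assumed to be raised only after the user presence check has succeeded, and unlisted combinations are assumed to leave the mode unchanged.

```python
from enum import Enum


class Mode(Enum):
    STATE_N = "normal"
    STATE_C = "command_input"
    STATE_R = "wake_word_verification"


# (current mode, event) -> next mode, annotated with the step labels of FIG. 3.
TRANSITIONS = {
    (Mode.STATE_N, "wake_word"): Mode.STATE_C,      # S32
    (Mode.STATE_N, "ambiguous"): Mode.STATE_R,      # S33
    (Mode.STATE_N, "not_wake_word"): Mode.STATE_N,  # S34
    (Mode.STATE_C, "command"): Mode.STATE_N,        # S35, after the command is executed
    (Mode.STATE_C, "timeout"): Mode.STATE_N,        # S35, no command within the time limit
    (Mode.STATE_R, "wake_word"): Mode.STATE_C,      # S36
    (Mode.STATE_R, "not_wake_word"): Mode.STATE_N,  # S37
    (Mode.STATE_R, "timeout"): Mode.STATE_N,        # S37
}


def next_mode(mode: Mode, event: str) -> Mode:
    """Return the mode that follows `event`; unknown pairs keep the current mode."""
    return TRANSITIONS.get((mode, event), mode)
```

For example, an ambiguous first utterance followed by a clearly recognized wake word takes the device from STATE_N to STATE_R and then to STATE_C.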

The description of FIGS. 2 and 3 can be summarized as follows. When the user speaks, the voice input unit 110 passes the voice to the central control unit 150. The central control unit 150 measures its similarity to the acoustic model stored in the wake word database unit 160 and, if the configured sensitivity parameter is satisfied, confirms it as the wake word and enters the command input mode.

If, on the other hand, the voice is rejected as not being the wake word but its similarity falls within a preset range, or the user appears to be nearby, the user confirmation unit 120 can check whether a user is present. Alternatively, the user check can be performed together with the similarity range check.

If the user check finds that the user is near the device, for example through a camera or PIR sensor, or because nearby devices are being operated by the user, the device sets a timer for a specific time, raises the sensitivity parameter, and reduces the noise of the operating device or nearby devices. If voice data arrives at the voice input unit 110 again before the timer expires, the higher sensitivity and lower noise make even similar-sounding words more likely to be recognized as the wake word. Once the wake word is recognized, the sensitivity and device operation are restored, returning the device to its ordinary speech recognition state. The sensitivity and device operation are also restored when the timer expires. This prevents commands or wake words from being misrecognized because of excessive sensitivity, and lets the device and nearby devices that were paused or in a noise reduction mode resume operation.
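The timer-bounded sensitivity boost summarized above can be sketched as a small class. This is an illustrative sketch under stated assumptions, not the patent's implementation: `VerificationWindow`, its method names, the numeric sensitivity scale, and the 5-second default window are all hypothetical. Time is passed in explicitly as `now` (in seconds) so the logic can be followed without a real clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerificationWindow:
    sensitivity: float = 0.5           # current sensitivity parameter (hypothetical scale)
    noise_reduced: bool = False        # True while the device or its peers are quieted
    deadline: Optional[float] = None   # time at which the timer expires, None if not running
    _base_sensitivity: Optional[float] = None

    def enter(self, now: float, window_s: float = 5.0, boosted: float = 0.8) -> None:
        """Set or reset the timer, raise the sensitivity, and reduce noise."""
        if self.deadline is None:
            # Remember the normal sensitivity only on first entry, so that
            # resetting the timer never overwrites it with the boosted value.
            self._base_sensitivity = self.sensitivity
        self.sensitivity = boosted
        self.noise_reduced = True
        self.deadline = now + window_s

    def restore(self) -> None:
        """Cancel the timer and restore sensitivity and device operation."""
        if self.deadline is None:
            return
        self.sensitivity = self._base_sensitivity
        self.noise_reduced = False
        self.deadline = None

    def tick(self, now: float) -> None:
        """Restore the normal state once the timer has expired."""
        if self.deadline is not None and now >= self.deadline:
            self.restore()
```

Calling `restore()` directly corresponds to the wake word being recognized before the deadline, while `tick()` covers the case where the timer runs out first. Both paths end in the same restored state, which is what keeps the raised sensitivity from lingering.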

FIGS. 4 and 5 are diagrams showing the operation of a device whose user confirmation unit includes a PIR sensor, according to an embodiment of the present invention. In one embodiment, 100a in FIG. 4 is a washing machine. The user confirmation unit 120a includes a PIR sensor.

With the configuration shown in FIG. 4, the washing machine 100a can operate in response to the wake word according to the process shown in FIG. 5. The washing machine 100a may include the components described with reference to FIG. 1.

As in S41a of FIG. 5, the user utters "LG Tromm" ("엘지 트롬") as the wake word for operating the washing machine 100a. The central control unit 150 of the washing machine 100a recognizes whether the voice data input through the voice input unit 110, such as a microphone, is the wake word and measures the similarity (S42). The measured similarity may be rejected at the configured sensitivity parameter level and yet be close to the wake word (S43a).

For example, because of ambient noise or the user's speech habits or pronunciation, the utterance may be recognized as "LG Dromm" ("엘지 드롬"). Or comparison with the voice model stored in the wake word database unit 160 may show that only three of the four characters that make up the wake word match. In such cases, before deciding whether to accept or reject the wake word, the central control unit 150 can check whether a user is nearby.

That is, the user confirmation unit 120a, here the PIR sensor, checks the surroundings to determine whether a person is nearby (S44a). If a person is found near the washing machine 100a (S45), the central control unit 150 resets the timer to a fixed duration (S46), raises the sensitivity parameter, and reduces the noise of the washing machine 100a (S47a).

Raising the sensitivity parameter means allowing quieter sounds to be picked up. The sensitivity parameter of the washing machine 100a is normally set to suit the device, but it is temporarily raised in order to receive the wake word again, so that a wake word input within a short time is recognized reliably.

Because raising the sensitivity parameter can also raise the false recognition rate, this specification sets or resets a timer (S46) so that the sensitivity parameter stays high only for a fixed time. If no person is found in S45, the device enters the normal mode (S53) and waits for a new voice to be recognized (S54).

In addition, if the washing machine 100a is running, the central control unit 150 controls the device function control unit 130 to pause the washing machine 100a. The central control unit 150 may of course skip the noise reduction step if the washing machine 100a is not running, or if it is performing an operation that generates little noise.

At this point the washing machine 100a has entered the wake word verification mode. It may output a voice file such as "Please say that again" or "Pardon?" to ask the user to repeat the wake word. If no person is detected in S44a, the device determines that no wake word was input and enters the normal mode.

The user then utters "LG Tromm", or a word that can be judged similar to it, once more (S48a). Because the user uttered the wake word earlier but the device gave no notice that it had switched to the mode for receiving commands, the user repeats the wake word. Even if the repeated voice data is measured at a similarity level close to that of the first utterance, it is now evaluated against the temporarily raised sensitivity, so it is more likely to be accepted as the wake word.

That is, because the sensitivity has been raised, the input voice is more likely to be recognized as the wake word than in S42, and it is recognized as the wake word (S49). The central control unit 150 then stops the timer, restores the previous sensitivity (S50), and enters the command input mode, in which a command can be received (S51).

To notify the user that the command input mode has been entered, the central control unit 150 may output a voice file such as "Please speak" through the speaker. To improve command recognition, the device operation may be restored only after the command has been input (S52a). Alternatively, the device operation may be restored at the same time as the sensitivity.

In one embodiment in which step S52a is performed before S51, the operation of the washing machine 100a whose noise was reduced is itself a low-noise operation (for example, water intake or soaking), so S52a can be performed before S51. For a noisy operation such as spin-drying, on the other hand, the central control unit 150 can restore the operation only after the command input mode has been entered and the command has been input.
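The ordering choice described above, restoring a quiet operation before command input but a noisy one only afterwards, can be sketched as a function that returns the sequence of steps. The operation names, the `LOW_NOISE_OPERATIONS` set, the step labels, and `resume_plan` are all hypothetical; only the relative order of S50, S51, and S52a reflects the description.

```python
# Operations assumed to be quiet enough to resume before the command is heard.
LOW_NOISE_OPERATIONS = {"water_intake", "soak"}


def resume_plan(paused_operation: str) -> list:
    """Return the ordered steps to take after the wake word is recognized (S49)."""
    if paused_operation in LOW_NOISE_OPERATIONS:
        # Quiet operation: S52a may run before S51.
        return [
            "restore_sensitivity",  # S50
            "restore_operation",    # S52a
            "enter_command_mode",   # S51
            "receive_command",
        ]
    # Noisy operation such as spin-drying: keep it paused until the command is in.
    return [
        "restore_sensitivity",  # S50
        "enter_command_mode",   # S51
        "receive_command",
        "restore_operation",    # S52a
    ]
```

Treating every unlisted operation as noisy is a conservative default chosen for this sketch, since resuming too early costs recognition accuracy while resuming late only delays the wash cycle briefly.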

Also, if no wake word is input after S47a, the device can enter the normal mode after a fixed time has passed (when the timer expires).

Meanwhile, after S52a, the voice input in S41a can be stored in the wake word database unit 160 so that a later utterance recognized as "LG Dromm" is also recognized as the wake word, improving user convenience. Alternatively, instead of storing it directly as a wake word, the voice can be stored as a wake word candidate and added to the wake word database unit 160 only if the candidate keeps being recognized. That is, after entering the wake word verification mode based on the wake word determination result for a first voice input in S41a or S41b, the central control unit 150 stores the first voice in the wake word database unit 160 if a second voice subsequently received by the voice input unit (see S48a and S48b) is confirmed as a wake word or a command. In other words, the central control unit 150 stores the first voice, input before the wake word verification mode was entered, in the wake word database unit 160 so that the same voice can be recognized as a wake word when it is input later.

FIGS. 4 and 5 use the washing machine 100a as an example, but the approach can be applied to various other home appliances. Likewise, a human body (motion) detection sensor is presented as one embodiment of the user confirmation unit to illustrate threshold adjustment for wake word input, but various other sensors may be used.

FIGS. 6 and 7 are diagrams showing the operation of a device whose user confirmation unit includes a camera sensor, according to another embodiment of the present invention. In one embodiment, 100b in FIG. 6 is an air conditioner. The user confirmation unit 120b includes a camera sensor.

With the configuration shown in FIG. 6, the air conditioner 100b can operate in response to the wake word according to the process shown in FIG. 7. The air conditioner 100b may include the components described with reference to FIG. 1. Because FIG. 7 is similar to the process of FIG. 5, the description focuses on the differences, which include S41b, S43b, S44b, S47b, S52b, and S60b.

As in S41b of FIG. 7, the user utters "LG Whisen" ("엘지 휘센") as the wake word for operating the air conditioner 100b. The central control unit 150 of the air conditioner 100b recognizes whether the voice data input through the voice input unit 110, such as a microphone, is the wake word and measures the similarity (S42). The measured similarity may be rejected at the configured sensitivity parameter level and yet be close to the wake word (S43b).

For example, because of ambient noise or the user's speech habits or pronunciation, the utterance may be recognized as "LG Hisen" ("엘지 히센"). Or comparison with the voice model stored in the wake word database unit 160 may show that only three of the four characters that make up the wake word match. In such cases, before deciding whether to accept or reject the wake word, the central control unit 150 can check whether a user is nearby.

That is, the user confirmation unit 120b, here the camera sensor, checks the surroundings to determine whether a person is nearby (S44b). If a person is found near the air conditioner 100b (S45), the central control unit 150 resets the timer to a fixed duration (S46), raises the sensitivity parameter, and reduces the noise of the air conditioner 100b (S47b).

Raising the sensitivity parameter means allowing quieter sounds to be picked up. The sensitivity parameter of the air conditioner 100b is normally set to suit the device, but it is temporarily raised in order to receive the wake word again, so that a wake word input within a short time is recognized reliably. In addition, because the camera sensor has already located the user, the central control unit 150 may steer the microphone of the voice input unit 110 toward the user's position. If no person is found in S45, the process proceeds to S53 and S54.

In addition, if the air conditioner 100b is running, the central control unit 150 controls the device function control unit 130 to reduce the airflow of the air conditioner 100b or switch it to indirect airflow. In one embodiment, the central control unit 150 controls the airflow of the device 100b so that the wake word can be picked up more easily. The central control unit 150 may of course skip the noise reduction step if the air conditioner 100b is not running, or if it is performing an operation that generates little noise.

At this point the air conditioner 100b has entered the wake word verification mode, and steps S48b to S52b proceed as described for FIG. 5. The air conditioner 100b may output a voice file such as "Please say that again" or "Pardon?" to ask the user to repeat the wake word. If no person is detected in S44b, the device determines that no wake word was input and enters the normal mode.

The user then utters "LG Whisen", or a word that can be judged similar to it, once more (S48b). Because the user uttered the wake word earlier but the device gave no notice that it had switched to the mode for receiving commands, the user repeats the wake word. Even if the repeated voice data is measured at a similarity level close to that of the first utterance, it is now evaluated against the temporarily raised sensitivity, so it is more likely to be accepted as the wake word.

That is, because the sensitivity has been raised, the input voice is more likely to be recognized as the wake word than in S42. It is recognized as the wake word (S49), and steps S50 to S52b are performed. S52b includes restoring the airflow that was reduced earlier. Reducing the airflow of the air conditioner 100b improves the signal-to-noise ratio (SNR), so the air conditioner 100b recognizes the wake word more reliably.

Meanwhile, after S52b, the voice input in S41b can be stored in the wake word database unit 160 so that a later utterance recognized as "LG Hisen" is also recognized as the wake word, improving user convenience. Alternatively, instead of storing it directly as a wake word, the voice can be stored as a wake word candidate and added to the wake word database unit 160 only if the candidate keeps being recognized. For example, "LG Hisen" can be stored as a wake word once it has been input a preset number of times (for example, five times) or more.
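The candidate mechanism above can be sketched with a counter that promotes an utterance to a wake word once it has been seen a preset number of times. The class `WakeWordStore`, its method `record_candidate`, and the use of plain strings to stand in for voice models are assumptions made for illustration; the default of five follows the example figure.

```python
from collections import Counter


class WakeWordStore:
    """Hypothetical stand-in for the wake word database unit 160."""

    def __init__(self, wake_words, promote_after: int = 5):
        self.wake_words = set(wake_words)
        self.candidates = Counter()  # how often each candidate has been verified
        self.promote_after = promote_after

    def record_candidate(self, heard: str) -> bool:
        """Count one verified occurrence; return True if `heard` was just promoted."""
        if heard in self.wake_words:
            return False  # already a wake word, nothing to do
        self.candidates[heard] += 1
        if self.candidates[heard] >= self.promote_after:
            self.wake_words.add(heard)
            del self.candidates[heard]
            return True
        return False
```

With the default setting, the first four verified occurrences of "엘지 히센" only increase its count, and the fifth adds it to the set of wake words.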

In FIGS. 4 to 7, device noise can be controlled by having a running device operate weakly, slowly, or in a low-noise mode for a specific time, or by pausing it. Each device can reduce noise in a way suited to its characteristics. For example, an air conditioner can weaken its airflow or direct it away from the user to raise the speech recognition rate. A washing machine can pause or run slowly. A vacuum cleaner can lower its suction power, and a robot vacuum cleaner in particular can stop moving or slow down. A device that controls other devices, such as a smart hub, can lower its volume.

Also, because temporarily raising the sensitivity parameter to verify the wake word can increase the false recognition rate along with the recognition rate, the wake word verification mode is kept active only for a short time based on the timer, which keeps the false recognition rate low.

Meanwhile, S60a and S60b in FIGS. 5 and 7 involve adding a new wake word. The voice input in S41a and S41b does not match the wake word. However, if the central control unit 150 enters the wake word verification mode based on the wake word determination result, and the voice subsequently received by the voice input unit (the voice received in the wake word verification mode) is the wake word, as in S48a and S48b, the central control unit 150 can store the voice that triggered entry into the wake word verification mode (that is, the voice input in S41a and S41b) in the wake word database unit 160.

In another embodiment of the present invention, a command input in the start word verification mode corresponds to a case in which the user uttered the command believing that the start word had already been uttered. Therefore, even when a command is input in the start word verification mode, the central controller 150 may store the voice that triggered entry into the start word verification mode (that is, the voice input in S41a and S41b) in the start word database unit 160.

Although not shown in the drawings, when the user controls the devices 100a and 100b while they are in the start word verification mode, that is, when a function of the device is controlled externally, the central controller 150 may conclude that the user's intention was to control the device and store the voice that triggered entry into the start word verification mode (that is, the voice input in S41a and S41b) in the start word database unit 160.
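The three confirming events described above (a start word, a command, or external control of the device during the verification mode) can be sketched as a small learner that holds the triggering utterance until it is confirmed. The class, method, and event names are hypothetical, and discarding the pending utterance on timeout is an assumption of this sketch rather than something the text states.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Events that, per the description, confirm the user's intent to control
# the device. The string labels are hypothetical.
CONFIRMING_EVENTS = {"start_word", "command", "external_control"}


@dataclass
class StartWordLearner:
    database: List[bytes] = field(default_factory=list)  # stands in for unit 160
    pending: Optional[bytes] = None

    def on_enter_verification(self, triggering_voice: bytes) -> None:
        # Remember the utterance that caused entry into verification mode.
        self.pending = triggering_voice

    def on_event(self, event: str) -> bool:
        # Store the pending utterance only when a confirming event follows.
        if self.pending is not None and event in CONFIRMING_EVENTS:
            self.database.append(self.pending)
            self.pending = None
            return True
        return False

    def on_timeout(self) -> None:
        # Assumption: an unconfirmed utterance is dropped when the timer ends.
        self.pending = None
```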

FIG. 8 illustrates a process of recognizing a start word in cooperation with peer devices according to an embodiment of the present invention. According to this embodiment, while entering the start word verification mode, the device may use adjacent peer devices to check whether a user is nearby, and may send a message so that the peer devices temporarily reduce their noise. This is described in more detail below.

Peer devices 100p and 100q are disposed around the device 100 that performs voice recognition. The voice input unit 110 of the device 100 receives a voice (S61), and recognition of the voice results in a start word ambiguity state (S62). As described above, this state covers the case where the voice does not match the start word but its matching rate is above a certain threshold and thus too high to reject, and the case where the voice corresponds to part of the start word or is recognized as having a similar pronunciation.
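The ambiguity state can be read as the middle band of a three-way decision on the matching rate. The sketch below assumes a score in the range 0 to 1 and two hypothetical thresholds; the original text does not give numeric values.

```python
def classify_utterance(score, accept_at=0.9, reject_below=0.6):
    """Classify a start-word match score.

    accept_at and reject_below are assumed values for illustration.
    Scores between them are too low to match but too high to reject,
    which corresponds to the start word ambiguity state.
    """
    if score >= accept_at:
        return "accept"
    if score >= reject_below:
        return "ambiguous"
    return "reject"
```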

In the start word ambiguity state, a nearby user may be identified as in the approaches of FIGS. 4 to 7. In addition, as shown in FIG. 8, the device 100 and the surrounding peer devices 100p and 100q may determine whether a user is present by checking whether a user is nearby and by checking the history of control by the user. That is, if the device 100 or the surrounding peer devices 100p and 100q have been controlled by the user within a short time (for example, 1 minute or 3 minutes), the user is likely still nearby, and this is checked. The device 100 checks its own history of user control (S63) and transmits a user confirmation request message to the adjacent peer devices 100p and 100q (S64p, S64q).

The first peer device 100p checks, in its user confirmation unit, whether a user is nearby using a camera sensor or a PIR sensor (S65p). If the first peer device 100p senses a user, it transmits the sensed result as a user confirmation result message (S66p). Likewise, if no user is sensed by the first peer device 100p, it transmits that result as a user confirmation result message (S66p).

Meanwhile, the second peer device 100q checks the history of the user controlling the second peer device 100q (S65q). For example, if the second peer device 100q is a TV and there is a history of the user controlling it, such as raising the TV volume or changing the channel, the second peer device 100q determines that a user is nearby and transmits the result as a user confirmation result message (S66q).

The device 100 combines the results of S63, S66p, S66q, and so on, and enters the start word verification mode when it determines that a user is nearby (S67).

FIG. 8 can be summarized as follows. The user confirmation unit of the device 100 checks the time at which the device 100 or the peer devices 100p and 100q were controlled within the space in which the device 100 is disposed. If the device 100 or the peer devices 100p and 100q were controlled within a preset time (for example, 1 minute or 3 minutes), the device 100 may determine that a user is nearby. As a result, the central controller 150 of the device 100 may generate the start word determination result based on the user having been confirmed in the space. When a user is present and a voice similar to the start word has been input, the device 100 may enter the start word verification mode.
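The combination of S63, S66p, and S66q can be sketched as a single presence check. The function name, the report dictionary keys (`sensed`, `last_control`), and the use of seconds as the time unit are assumptions for illustration; the 180-second default mirrors the 3-minute example in the text.

```python
def user_nearby(now, own_last_control, peer_reports, window_s=180.0):
    """Decide whether a user is likely nearby.

    now and the control timestamps are in seconds on a shared clock
    (an assumption). Each peer report is a dict that may carry:
      "sensed": True if a camera/PIR sensor detected a person (S65p)
      "last_control": timestamp of the last user control (S65q)
    """
    def recent(timestamp):
        return timestamp is not None and 0 <= now - timestamp <= window_s

    if recent(own_last_control):  # S63: the device's own control history
        return True
    for report in peer_reports:   # S66p / S66q: peer results
        if report.get("sensed") or recent(report.get("last_control")):
            return True
    return False
```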

FIG. 9 illustrates a process in which devices around a device that has entered the start word verification mode also reduce noise or perform operations for start word input, according to an embodiment of the present invention.

When the device 100 enters the start word verification mode (S71), it transmits a start word verification mode request message to the adjacent peer devices 100p and 100q (S72p, S72q). This message includes a request that the adjacent peer devices 100p and 100q also reduce noise, stop operating, or perform voice input. As a result, the first peer device 100p reduces its noise or stops operating (S73p). The second peer device 100q raises the sensitivity of its user input unit so that it can better receive the voice uttered by the user (S73q).

The second peer device 100q then transmits the input voice file to the device 100 (S74). In this case, the device 100 may receive the start word on its own and may also receive, as in S74, the voice received by the adjacent second peer device 100q. Using these, the device 100 checks for the start word (S75). After confirming that the start word was uttered, the device 100 enters the command input mode (S76).

In this process, the device 100 transmits a start word verification mode end message to the adjacent peer devices 100p and 100q (S77p, S77q). A time interval may be placed between S76 and the transmission of the message (S77p, S77q), so that the states of S73p and S73q are maintained for a short time and the command can be input more accurately in the command input mode. Thereafter, the peer devices 100p and 100q may undo S73p and S73q by restoring noise and operation (S78p) and restoring the sensitivity of the user input unit (S78q), respectively.
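The message exchange of FIG. 9 can be sketched with in-process objects standing in for networked peers. The message names, the `Peer` class, and the `hold` callback (representing the optional delay between S76 and S77) are hypothetical; sending the end message even when no start word is confirmed is an assumption of this sketch.

```python
class Peer:
    """Stand-in for a peer device; role is "quiet" (100p) or "listen" (100q)."""

    def __init__(self, name, role):
        self.name = name
        self.role = role
        self.noise_reduced = False
        self.mic_boosted = False

    def handle(self, message):
        if message == "VERIFY_REQUEST":      # S72p / S72q
            if self.role == "quiet":
                self.noise_reduced = True    # S73p
            else:
                self.mic_boosted = True      # S73q
        elif message == "VERIFY_END":        # S77p / S77q
            self.noise_reduced = False       # S78p
            self.mic_boosted = False         # S78q


def run_verification(peers, confirm_start_word, hold=lambda: None):
    """Drive one verification round and return the modes the device visited."""
    visited = ["verification"]               # S71
    for peer in peers:
        peer.handle("VERIFY_REQUEST")
    if confirm_start_word():                 # S75
        visited.append("command_input")      # S76
        hold()                               # keep peers quiet a little longer
    for peer in peers:
        peer.handle("VERIFY_END")
    return visited
```

The `hold` step is where a real device would wait briefly so that the command following the start word is still captured under reduced noise.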

FIG. 10 illustrates a process in which a plurality of devices perform voice recognition according to another embodiment of the present invention.

The first device 100a and the second device 100b may each build a database of its own start word. In one embodiment, each may also build a start word database for other devices. For example, the washing machine 100a of FIG. 4 uses "LG Tromm" as its own start word, but may also store "LG Whisen", the start word of another device, the air conditioner 100b of FIG. 5, as a start word. When the start word of another device is input, the input start word may be forwarded to that device.

The first device 100a, for example a washing machine, is in the normal mode (S81). Likewise, the second device 100b, for example an air conditioner, is in the normal mode (S82). In this state, the user says "LG Tromm" near the second device 100b, and this is input as voice to the second device 100b (S83). The central controller 150 of the second device 100b analyzes the "LG Tromm" received by the voice input unit 110 of the second device 100b and determines that it differs from "LG Whisen", the start word of the second device 100b.

However, the central controller 150 of the second device 100b confirms that the input voice matches the start word of the first device 100a stored in the second device 100b. That is, the central controller 150 of the second device 100b determines from the voice recognition result that the input voice is the start word of the first device (S84) and transmits the input voice file to the first device 100a (S85). Of course, the start word recognition result may be transmitted in S85. Then, when the first device 100a confirms the start word based on the received voice file (or start word recognition result) (S86), it switches to the command input mode (S87).
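The forwarding decision of FIG. 10 can be sketched as a lookup over the start words a device has stored for itself and for its peers. The function name, the return tuples, and the use of case-insensitive text comparison in place of acoustic matching are simplifications assumed for illustration; the start-word spellings are illustrative as well.

```python
def route_start_word(heard, own_id, start_words):
    """Decide what a device does with a recognized phrase.

    start_words maps a device id to its start word. Text comparison here
    stands in for acoustic matching, which this sketch does not model.
    """
    normalized = heard.strip().casefold()
    for device_id, word in start_words.items():
        if normalized == word.casefold():
            if device_id == own_id:
                return ("enter_command_mode", own_id)
            return ("forward", device_id)    # S84 / S85: send to the owner
    return ("ignore", None)


# Illustrative start words stored on the second device 100b.
STORED = {"100a": "LG Tromm", "100b": "LG Whisen"}
```

With these stored words, the air conditioner 100b hearing the washing machine's start word forwards it to 100a instead of rejecting it.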

FIGS. 8 to 10 have described processes of recognizing a start word or entering the start word verification mode in cooperation with adjacent devices. According to an embodiment of the present invention, when the voice data of a start word uttered by the user is similar to the start word under the configured voice recognition sensitivity parameter but is rejected because it is not judged identical, the device or adjacent devices (peer devices) confirm that a user is present, so that the recognition rate can be raised while false recognition is prevented. User confirmation may use a camera sensor or a PIR sensor, or may use the history of the user controlling the devices.

When voice data similar to the start word is input to the device, the device performs user confirmation and, if a person is present, raises its voice recognition sensitivity parameter for a specific period of time so that even similar words can be recognized as the start word. In addition, the device and the surrounding peer devices reduce their noise (by operating slowly, weakly, or in a low-noise mode, or by pausing) to secure the SNR and raise the voice recognition rate when the start word is uttered again.

If no user is detected as a result of user confirmation, the existing values and operation are kept as they are. If a user is present but no start word is detected even though the recognition rate was raised for a specific period of time, the existing values and operation are restored, so that false recognition in voice recognition does not increase.

FIG. 11 illustrates a process of performing only voice recognition and controlling other adjacent devices to enter the command input mode, according to an embodiment of the present invention.

FIG. 11 takes as an embodiment a hub device 300 serving as a kind of command recognition hub. Of the components of FIG. 1, the hub device 300 may optionally omit the device function control unit 130 and the control interface unit 135. It may also omit the command database unit 170.

The hub device 300 receives a voice and determines from the voice recognition result that the start word is ambiguous (start word ambiguity state) (S90). This result is obtained by comparing the voice against both the start word of the first device 100a and the start word of the second device 100b, which the hub device 300 controls. Because the start word was recognized ambiguously, the hub device 300, the first device 100a, and the second device 100b all perform user confirmation (S91). Performing user confirmation by sensing a person or checking the control history was described above.

If S91 determines that a person is present, the hub device 300 enters the start word verification mode (S92) and transmits a start word verification mode request message to the first device 100a and the second device 100b (S93a, S93b), as described for S72p/S72q of FIG. 9. The hub device 300, the first device 100a, and the second device 100b then all enter the start word verification mode (S94). When a new voice is input in this state, the hub device 300, with its sensitivity parameter raised, confirms that the input start word is the start word of the first device 100a (S95). The hub device 300 then notifies the devices that the start word verification mode has ended. That is, the hub device 300 transmits a start word verification mode end message to the first device 100a and the second device 100b (S96a, S96b), and each device ends the start word verification mode (S97). Operations such as ending the timer, removing noise, and restoring the sensitivity parameter of the user input unit were described above for S78p and S78q of FIG. 9.

Because the start word of the first device has been confirmed, the hub device 300 transmits a message instructing the first device 100a to switch to the command input mode (S98). As a result, the first device may enter the command input mode (S99).
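The hub's decision across S90 to S98 can be sketched as a two-pass scoring procedure. This is an illustration under assumptions: scores are in the range 0 to 1, all three thresholds are hypothetical, and `rescore` is a hypothetical callback that returns the scores of the utterance captured during the verification mode.

```python
def hub_identify(first_scores, user_present, rescore,
                 accept_at=0.9, ambiguous_at=0.6, relaxed_at=0.7):
    """Return the id of the device to switch to command input mode, or None.

    first_scores maps each controlled device id to the match score of the
    first utterance against that device's start word.
    """
    best_id, best = max(first_scores.items(), key=lambda item: item[1])
    if best >= accept_at:
        return best_id                  # clear match, no verification needed
    if best < ambiguous_at or not user_present:
        return None                     # reject, or nobody nearby (S91)
    # S92 to S95: verification mode, with the threshold relaxed.
    second_scores = rescore()
    best_id, best = max(second_scores.items(), key=lambda item: item[1])
    return best_id if best >= relaxed_at else None
```

The returned id identifies the device that would receive the mode-switch message of S98.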

Of course, according to another embodiment of the present invention, the hub device 300 may receive the command as well and then transmit a message instructing the first device 100a to perform the function corresponding to that command. This may be configured in various ways depending on the implementation.

When the present invention is applied, if an input start word does not match the originally stored voice model but comes close to the sensitivity parameter, the recognition rate of a subsequently uttered start word can be raised. In a noisy environment in particular, recognition often fails even when the user utters the start word loudly and accurately. To address this, the devices can reduce noise or control the microphone and the like so that the start word recognition rate is higher. To improve accuracy, whether a person is nearby can be checked in various ways. If a person is nearby, the sensitivity parameter is raised and device noise is reduced for a specific period of time, so that when the start word is uttered once more, it can be accepted the second time even if it has the same level of similarity as before.

Therefore, as shown in FIG. 3, the same start word may be rejected in the normal mode but accepted in the start word verification mode. Because the recognition rate of the start word is raised only within a limited time, an increase in the false recognition rate when recognizing other commands can be prevented.

In addition, in the start word verification mode, the operation of the devices can be controlled for a short time to prevent noise from the device or its surroundings. This raises the voice recognition rate without affecting the operation of the device, and thus makes the device more likely to respond to the start word. In particular, raising the recognition rate when the start word is uttered two or more times lets the user control the device easily by voice. Moreover, even for a voice file not stored in the start word database, voice data that triggered entry into the start word verification mode within a certain period can be stored as a start word, so that the start word database can be built to accommodate users with various pronunciation tendencies.

Even though all components constituting the embodiments of the present invention have been described as being combined into one or operating in combination, the present invention is not necessarily limited to these embodiments; within the scope of the purpose of the present invention, all of the components may operate by being selectively combined into one or more. In addition, although each of the components may be implemented as a single independent piece of hardware, some or all of the components may be selectively combined and implemented as a computer program having program modules that perform some or all of the combined functions on one or more pieces of hardware. The codes and code segments constituting the computer program can be easily inferred by those skilled in the art. Such a computer program may be stored in computer-readable media and read and executed by a computer, thereby implementing the embodiments of the present invention. Storage media for the computer program include magnetic recording media, optical recording media, and semiconductor recording elements. A computer program implementing an embodiment of the present invention also includes a program module transmitted in real time through an external device.

The above description has focused on embodiments of the present invention, but those of ordinary skill in the art may make various changes and modifications. Such changes and modifications are therefore to be understood as falling within the scope of the present invention as long as they do not depart from it.

100: device 110: voice input unit
120: user confirmation unit 150: central controller
160: start word database unit

Claims

A device for improving speech recognition, comprising:
A voice input unit for receiving a voice of a user;
A user confirmation unit for confirming the user who uttered the voice;
A central controller configured to analyze the input voice and a confirmation result of the user confirmation unit to generate a start word determination result as to whether the voice indicates that a command for controlling a function of the device or of a peer device adjacent to the device is to follow; and
A device function control unit for controlling a function of the device under the control of the central controller,
wherein the central controller enters one of a command input mode, a start word verification mode, and a normal mode based on the start word determination result, and, when entering the start word verification mode, sets a timer and controls a parameter of the voice input unit to raise the sensitivity of speech recognition so that a word similar to the start word is recognized as the start word,
when the timer expires, the central controller restores the sensitivity of the speech recognition,
the normal mode is a standby state in which no start word or command has been input, and
the sensitivity of the speech recognition in the start word verification mode is higher than the sensitivity of the speech recognition in the normal mode, so that a start word rejected in the normal mode is accepted in the start word verification mode.

The device of claim 1,
wherein the false recognition rate of speech increases in the start word verification mode.

The device of claim 1,
wherein, when the central controller enters the start word verification mode based on the start word determination result, the central controller controls the device function control unit to reduce noise generated by the device.

The device of claim 1,
further comprising a communication unit for transmitting and receiving messages between the device and the peer device,
wherein, when the central controller enters the start word verification mode based on the start word determination result, the communication unit transmits to the peer device a message instructing it to reduce noise generated by the peer device.

The device of claim 1,
wherein the user confirmation unit further comprises a sensor for sensing whether a person is located around the device.

The device of claim 1,
wherein the user confirmation unit checks the time at which the device or the peer device was controlled in the space in which the device is disposed, and, when the device or the peer device was controlled within a preset time,
the central controller generates the start word determination result based on the user having been confirmed in the space.

The device of claim 1,
wherein the central controller switches the device to the command input mode based on the start word determination result.

The device of claim 1,
further comprising a start word database unit for storing the start word,
wherein the central controller enters the start word verification mode based on a start word determination result for a first voice input in the normal mode, and, when a second voice subsequently received by the voice input unit is a start word or a command, the central controller stores the first voice, input before entering the start word verification mode, in the start word database unit.

A method of improving speech recognition in a device that performs speech recognition, the method comprising:
Receiving, by a voice input unit, a voice of a user;
Confirming, by a user confirmation unit, the user who uttered the voice; and
Analyzing, by a central controller, the input voice and a confirmation result of the user confirmation unit, and generating a start word determination result as to whether the voice indicates that a command for controlling a function of the device or of a peer device adjacent to the device is to follow,
wherein the central controller enters one of a command input mode, a start word verification mode, and a normal mode based on the start word determination result, and, when entering the start word verification mode, sets a timer and controls a parameter of the voice input unit to raise the sensitivity of speech recognition so that a word similar to the start word is recognized as the start word, and when the timer expires, the central controller restores the sensitivity of the speech recognition,
the normal mode is a standby state in which no start word or command has been input, and
the sensitivity of the speech recognition in the start word verification mode is higher than the sensitivity of the speech recognition in the normal mode, so that a start word rejected in the normal mode is accepted in the start word verification mode.

The method of claim 9,
wherein the false recognition rate of speech increases in the start word verification mode.

The method of claim 9, further comprising:
Entering, by the central controller, the start word verification mode based on the start word determination result; and
Controlling, by the central controller, a device function control unit that controls a function of the device, to reduce noise generated by the device.

The method of claim 9,
wherein the device further includes a communication unit for transmitting and receiving messages between the device and the peer device, the method further comprising:
Entering, by the central controller, the start word verification mode based on the start word determination result; and
Transmitting, by the communication unit, to the peer device a message instructing it to reduce noise generated by the peer device.

The method of claim 9,
further comprising sensing, by the user confirmation unit using a sensor, whether a person is located around the device.

The method of claim 9, further comprising:
Checking, by the user confirmation unit, a time at which the device or the peer device was controlled in a space in which the device is disposed;
Confirming, by the central controller, based on the checked result, that the device or the peer device was controlled within a preset time; and
Controlling, by the central controller, the device to enter the start word verification mode.

The method of claim 9,
further comprising switching the device to the command input mode based on the start word determination result.

The method of claim 9,
wherein the device further includes a start word database unit for storing the start word, the method further comprising:
Controlling, by the central controller, entry into the start word verification mode based on a start word determination result for a first voice that the voice input unit received in the normal mode;
Receiving, by the voice input unit, a voice including a start word or a command; and
Storing, by the central controller, the first voice, input before entering the start word verification mode, in the start word database unit.