KR20000001476U

KR20000001476U - Design of door lock device by speaker recognition

Info

Publication number: KR20000001476U
Application number: KR2019980011207U
Authority: KR
Inventors: 이기영; 조병호
Original assignee: 조병호; 이기영
Priority date: 1998-06-20
Filing date: 1998-06-20
Publication date: 2000-01-25

Abstract

Because each person's voice has distinct characteristics, the voice can serve as a password, so that the opening and closing of a door lock device is controlled by voice instead of a key. This requires speaker recognition technology that determines whether a specific sentence, spoken into a microphone by an arbitrary speaker, was uttered by a reference speaker (an enrolled, authorized speaker).

The LPC cepstrum is used as the feature that captures the speaker-specific characteristics contained in the speech, and vector quantization is used to cluster the extracted characteristics by collecting the representative features of each reference speaker. In the first stage, DTW is used to select which reference speaker the arbitrary speaker's voice corresponds to. The input is then encoded with the codebook built in advance by vector quantization from the voice of the reference speaker selected by DTW. In the second stage, a discrete HMM and the Viterbi algorithm are applied to confirm that the input matches the initially selected reference speaker.

If both stages are satisfied, the specific sentence is confirmed to be that of the reference speaker, and the opening and closing of the door lock device is controlled accordingly.

The device provides security through both a person's voice pattern and a pre-agreed specific sentence, and controls the opening and closing of the door lock. It is more convenient to use than existing secure access devices based on RF ID cards and the like, and cheaper to install than access devices based on fingerprint or iris recognition, so it is mainly used for doors that require security.

Description

Design of door lock device by speaker recognition

Because each person's voice has distinct characteristics, the voice can serve as a password, so that the opening and closing of a door lock device is controlled by voice instead of a key. To this end, speaker recognition is performed to determine whether a specific sentence spoken into a microphone by an arbitrary speaker is that of a reference speaker, and the door lock device is opened or closed according to the result.

Speaker recognition of a specific sentence is performed so that the reference speaker's voice serves as a password and the door lock device can be opened and closed without a key. The core techniques used for speaker recognition are LPC cepstrum extraction [1], vector quantization [2], DTW [3], and the HMM with the Viterbi algorithm [1].
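As a rough illustration of the first technique listed above, the following sketch derives LPC cepstral coefficients from one speech frame. It assumes the autocorrelation method with the Levinson-Durbin recursion and the standard LPC-to-cepstrum recursion; the source does not state which analysis method, model order, or number of coefficients is used, so the function names and parameters here are hypothetical.

```python
def autocorrelation(frame, order):
    """Autocorrelation of one frame for lags 0..order."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p (x[n] ~ sum a_k x[n-k])."""
    a = [0.0] * (order + 1)  # a[0] is unused so that a[k] holds a_k
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # degenerate frame (e.g. silence); stop early
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused so that c[m] holds c_m
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]
```

In practice each frame would typically be pre-emphasized and windowed before the autocorrelation step; those steps are omitted here to keep the sketch short.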

(1) Conventional speaker recognition technology

The basic structure of conventional speaker recognition is shown in Figure 1. Features are extracted from the speech waveform and compared with the reference patterns of speakers registered in advance. As in speech recognition, the recognition result is determined by a distance or similarity measure.

If the distance between the input speech pattern and the reference pattern is smaller than a threshold, the speaker is accepted as a registered speaker; otherwise the speaker is rejected.
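The threshold rule above can be sketched as follows. This is a minimal illustration that assumes the two patterns are already time-aligned to the same number of frames and that the distance is a mean Euclidean distance; the source does not specify the distance measure at this point, and the function names are hypothetical.

```python
import math


def pattern_distance(input_pattern, reference_pattern):
    """Mean Euclidean distance between two equal-length feature sequences."""
    if len(input_pattern) != len(reference_pattern):
        raise ValueError("patterns must have the same number of frames")
    total = 0.0
    for x, r in zip(input_pattern, reference_pattern):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(x, r)))
    return total / len(input_pattern)


def accept_speaker(input_pattern, reference_pattern, threshold):
    """Accept as a registered speaker only if the distance is below the threshold."""
    return pattern_distance(input_pattern, reference_pattern) < threshold
```

The threshold trades off false acceptances against false rejections, so it would normally be tuned on recorded data rather than fixed in advance.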

Figure 2 shows the results of speaker recognition on a specific sentence uttered by 100 male and female speakers. The horizontal axis is the number of speakers and the vertical axis is the misrecognition rate.

As features representing the individuality inherent in the voice, the LPC cepstrum (a spectral parameter) and its dynamic feature were used.

[Equation: definition of the dynamic feature; symbols lost in extraction]

Here, the window term is either a symmetric window of a fixed length or 1. An effective measure for the distance between these features is as follows.

[Equation: weighted distance measure; symbols lost in extraction]

In this expression, the weighting coefficient is inversely proportional to a variance, and the two feature terms are those of the reference speaker and of the input unknown speaker, respectively.
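Because the equations themselves did not survive in the text, the following is only a sketch of one common way such features and distances are computed, not the authors' exact formulas. It assumes a regression-style dynamic feature over a symmetric window of length 2K+1 with all window weights equal to 1, and a squared distance weighted by the inverse variance of each feature component. All names and the choice of K are hypothetical.

```python
def delta_features(frames, K=2):
    """Regression-based dynamic feature over a symmetric window of length 2K+1.

    Assumption: window weights are all 1 and edge frames are repeated.
    """
    T = len(frames)
    dim = len(frames[0])
    denom = sum(k * k for k in range(-K, K + 1))
    deltas = []
    for t in range(T):
        row = []
        for m in range(dim):
            num = 0.0
            for k in range(-K, K + 1):
                idx = min(max(t + k, 0), T - 1)  # clamp at the sequence edges
                num += k * frames[idx][m]
            row.append(num / denom)
        deltas.append(row)
    return deltas


def inverse_variance_weights(training_frames):
    """One weight per feature component, inversely proportional to its variance."""
    n = len(training_frames)
    dim = len(training_frames[0])
    weights = []
    for m in range(dim):
        mean = sum(f[m] for f in training_frames) / n
        var = sum((f[m] - mean) ** 2 for f in training_frames) / n
        weights.append(1.0 / var if var > 0 else 0.0)
    return weights


def weighted_distance(reference, unknown, weights):
    """Weighted squared distance between a reference and an unknown feature vector."""
    return sum(w * (r - x) ** 2 for w, r, x in zip(weights, reference, unknown))
```

Weighting by inverse variance keeps high-variance components from dominating the distance, which is the role the text assigns to the weighting coefficient.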

(2) Speaker recognition technology and door lock control

The flowchart of the proposed speaker recognition system is shown in [Fig. 2]. When a speech waveform is input, LPC analysis extracts its spectral features and DTW normalizes the time axis. The features are then quantized with the codebooks of the registered speakers, which were obtained in advance by vector quantization during training. The codebook that gives the minimum quantization distance is used, which selects, as a second step, the candidate speaker for the input speech. The Viterbi algorithm is then run to compare the code sequence quantized with this candidate speaker's codebook against the HMM generated in advance for each speaker.
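The DTW and codebook-selection steps described above can be sketched as follows. This is an illustrative outline under stated assumptions: a basic DTW with the three standard local moves and Euclidean frame distance, and codebooks given as plain lists of codewords per speaker. The source does not give the DTW path constraints or codebook size, and the function names are hypothetical.

```python
import math


def frame_dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated distance along the best time alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def quantize(frames, codebook):
    """Map each frame to its nearest codeword; return (codes, mean distance)."""
    codes, total = [], 0.0
    for f in frames:
        dists = [frame_dist(f, c) for c in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        codes.append(best)
        total += dists[best]
    return codes, total / len(frames)


def select_candidate(frames, codebooks):
    """Pick the speaker whose codebook gives the minimum quantization distance."""
    best_speaker, best_codes, best_dist = None, None, float("inf")
    for speaker, codebook in codebooks.items():
        codes, dist = quantize(frames, codebook)
        if dist < best_dist:
            best_speaker, best_codes, best_dist = speaker, codes, dist
    return best_speaker, best_codes, best_dist
```

The code sequence returned by `select_candidate` is the discrete observation sequence that a later HMM stage would score.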

The speaker whose HMM yields the maximum probability in this algorithm is the speaker recognized by the system. The proposed system adds an HMM to a conventional speaker recognition system, which achieves a recognition rate of about 97%, and thereby re-verifies the time series of feature vectors, so it achieves a speaker recognition rate of 97% or higher. The speaker recognition result is therefore connected to the door lock to control its opening and closing.
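Selecting the speaker whose HMM gives the maximum probability can be sketched with a log-domain Viterbi pass over a discrete HMM. This is a minimal illustration; the model topology, number of states, and parameter values are not given in the source, so the model layout `(start, trans, emit)` and the function names are assumptions.

```python
import math


def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_prob(obs, start, trans, emit):
    """Log probability of the best state path of a discrete HMM for `obs`.

    start[s]: initial probability of state s
    trans[p][s]: transition probability from state p to state s
    emit[s][o]: probability that state s emits code o
    """
    n_states = len(start)
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    for o in obs[1:]:
        score = [
            max(score[p] + _log(trans[p][s]) for p in range(n_states)) + _log(emit[s][o])
            for s in range(n_states)
        ]
    return max(score)


def recognize(obs, models):
    """Return the speaker whose HMM gives the highest Viterbi score."""
    return max(models, key=lambda name: viterbi_log_prob(obs, *models[name]))
```

Working in the log domain avoids numerical underflow on long code sequences, which matters once utterances span many frames.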

The LPC cepstrum is used as the feature for capturing the speaker's individual characteristics, and vector quantization is used to collect the representative features of each reference speaker. As in the flowchart of [Fig. 2], DTW is used in the first stage to select which reference speaker the arbitrary speaker's voice corresponds to. The input is then encoded with the codebook obtained in advance by vector quantization from the voice of that selected reference speaker, and a discrete HMM is applied in the second stage to confirm that the input matches the initially selected reference speaker. If both stages are satisfied, the specific sentence is confirmed to be that of the reference speaker, and the opening and closing of the door lock device is controlled accordingly.
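The two-stage check described above can be sketched as a small decision function. It takes the per-speaker DTW distances and per-speaker HMM scores as inputs and accepts only when both stages point to the same reference speaker. The acceptance threshold on the DTW distance is an assumption carried over from the conventional scheme, and the function and parameter names are hypothetical.

```python
def verify_two_stage(dtw_distances, hmm_log_probs, dtw_threshold):
    """Return the verified speaker name, or None if either stage fails.

    dtw_distances: speaker -> DTW distance to that speaker's reference pattern
    hmm_log_probs: speaker -> Viterbi log probability under that speaker's HMM
    dtw_threshold: assumed upper bound on an acceptable DTW distance
    """
    # Stage 1: DTW selects the closest reference speaker.
    first = min(dtw_distances, key=dtw_distances.get)
    if dtw_distances[first] >= dtw_threshold:
        return None
    # Stage 2: the HMM must independently pick the same speaker.
    second = max(hmm_log_probs, key=hmm_log_probs.get)
    return first if second == first else None
```

Requiring agreement between the two stages makes a false acceptance less likely than with either stage alone, at the cost of rejecting more genuine attempts.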

Fig. 1 is an overall block diagram for controlling the opening and closing of the door lock device by determining whether a specific sentence spoken into the microphone by an arbitrary speaker is that of a reference speaker.

Fig. 2 is a flowchart for speaker recognition of an utterance spoken by an arbitrary speaker.

Speaker recognition technology is used to control the opening and closing of the door lock device with the voice of the person seeking entry. As shown in the overall block diagram of [Fig. 1], the entry monitoring apparatus consists of a microphone, a computer, and a door lock device. When the person speaks into the microphone the specific sentence agreed upon for speaker recognition, the speech is transmitted over an electrical wire to the computer. In the computer, the input speech passes through software built according to the flowchart of [Fig. 2] for speaker recognition of a specific sentence, and is compared with the feature patterns of previously recorded speech from reference speakers who are authorized to enter. If the patterns match, the computer sends an electrical signal over an electrical line to the door lock device to open it.
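The control flow from microphone input to the door lock signal can be outlined as below. The recognizer and the unlock signal are passed in as callables because the source does not describe the audio capture interface or the electrical interface to the lock; every name in this sketch is hypothetical.

```python
from typing import Callable, Iterable, Optional, Sequence


def handle_utterance(
    samples: Sequence[float],
    recognize: Callable[[Sequence[float]], Optional[str]],
    authorized: Iterable[str],
    send_unlock_signal: Callable[[], None],
) -> bool:
    """Open the door only if the utterance is recognized as an authorized speaker.

    recognize: stands in for the speaker recognition software; returns a
        speaker name, or None when no reference speaker matches.
    send_unlock_signal: stands in for the electrical signal sent to the lock.
    """
    speaker = recognize(samples)
    if speaker is not None and speaker in set(authorized):
        send_unlock_signal()
        return True
    return False
```

Keeping the lock interface behind a callable lets the same logic be exercised with a stub during testing and with real hardware in deployment.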

The device locks the door by providing security through both a person's voice pattern and a pre-agreed specific sentence. It is more convenient to use than existing secure access devices based on RF ID cards and the like, and cheaper to install than access devices based on fingerprint or iris recognition, so it is mainly used for doors that require security.

Claims

In the present invention, when a predetermined sentence is spoken and input to a computer, speaker recognition is performed to determine whether the voice is that of a predetermined reference speaker, thereby controlling the opening and closing of the door lock device.

The speaker recognition according to claim 1, wherein the speaker recognition is performed by software configured in accordance with the flowchart, shown in the drawings, for speaker recognition of a specific sentence.

* references

[1] L. R. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall International, Inc., 1993

[2] Y. Linde, A. Buzo, R. M. Gray, "An Algorithm for Vector Quantizer Design," IEEE Trans. Communications, Vol. COM-28, pp. 84-95, 1980

[3] H. Sakoe, S. Chiba, "Dynamic Programming Algorithm Optimization for Spoken Word Recognition," IEEE Trans. Acoust., Speech, Signal Processing, Vol. ASSP-26, No. 1, pp. 43-49, 1978