KR101741141B1

KR101741141B1 - Apparatus for suppressing noise and method thereof

Info

Publication number: KR101741141B1
Application number: KR1020150181624A
Authority: KR
Inventors: 이석필; 서지훈; 한혁수
Original assignee: 상명대학교산학협력단
Priority date: 2015-12-18
Filing date: 2015-12-18
Publication date: 2017-05-29
Anticipated expiration: 2035-12-18
Also published as: WO2017104876A1

Abstract

The present invention relates to a speech signal processing method. A noise removal method according to one aspect of the invention comprises: receiving a mixed signal including a speech signal and a noise signal; obtaining the noise signal using a section of the mixed signal in which the speech signal is absent; obtaining an a posteriori signal-to-noise ratio (SNR) using the noise signal and the mixed signal; estimating the a priori SNR of the current frame using the a posteriori SNR, the noise signal of the previous frame, and the a priori SNR of the previous frame; calculating a weight value using the estimated a priori SNR; calculating a filter value for each frequency using the calculated weight value; and multiplying the mixed signal by the calculated filter value to obtain an enhanced estimated speech signal.

Description

APPARATUS FOR SUPPRESSING NOISE AND METHOD THEREOF

The present invention relates to signal processing for speech enhancement, and more particularly to a signal processing method and apparatus that improve the intelligibility of speech by removing wind noise contained in the speech.

As smartphones become more widespread, speech recognition technology is being used in a variety of ways. Apple's Siri and Google's Google Now are representative smartphone services that use speech recognition.

In quiet surroundings, such speech recognition services achieve a high recognition rate, and the other party's voice can be heard clearly during an ordinary call. However, when the surroundings are noisy, or when wind noise is mixed with the user's voice as it enters the smartphone, the recognition rate of the speech recognition service drops and the other party's voice may become hard to make out.

When wind noise is mixed in this way, the prior art has attempted to reduce it simply by cutting away a specific band of the signal with a low pass filter (LPF) or a high pass filter (HPF).

Korean Patent Application No. 10-2005-0120682 relates to a method for automatically removing wind noise according to its level: the mixed signal is filtered with a low-pass filter, its level is measured, a control signal is generated according to the measured level, and the wind noise is removed through a high-pass filter.

However, such simple filtering causes loss not only in the wind noise but also in the user's speech band, so the speech recognition rate does not improve.

The present invention was conceived against the technical background described above. Its object is to provide an apparatus and a method that obtain filter coefficients using the a priori SNR and the a posteriori SNR and use them to remove wind noise.

The objects of the present invention are not limited to the object mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the object described above, a noise removal method according to one aspect of the present invention comprises: receiving a mixed signal including a speech signal and a noise signal; obtaining the noise signal using a section of the mixed signal in which the speech signal is absent; obtaining an a posteriori SNR using the noise signal and the mixed signal; estimating the a priori SNR of the current frame using the a posteriori SNR, the noise signal of the previous frame, and the a priori SNR of the previous frame; calculating a weight value using the estimated a priori SNR; calculating a filter value for each frequency using the calculated weight value; and multiplying the mixed signal by the calculated filter value to obtain an enhanced estimated speech signal.

A noise removal apparatus according to another aspect of the present invention includes one or more processors, and the processor implements: an input unit that receives a mixed signal including a speech signal and a noise signal; a frequency signal conversion unit that converts the mixed signal into a frequency-domain signal; an operation unit that obtains the noise signal using a section of the mixed signal in which the speech signal is absent, obtains an a posteriori SNR using the noise signal and the mixed signal, estimates the a priori SNR of the current frame using the a posteriori SNR, the noise signal of the previous frame, and the a priori SNR of the previous frame, calculates a weight value using the estimated a priori SNR, and calculates a filter value for each frequency using the calculated weight value; a filter unit that multiplies the mixed signal by the calculated filter value to obtain an enhanced speech signal; and a time-domain signal conversion unit that converts the enhanced speech signal into a time-domain signal.

According to the present invention, a signal mixed with wind noise is filtered with a filter formed using the a priori SNR and the a posteriori SNR. This provides improved speech enhancement, which raises the speech recognition rate and the intelligibility of speech during calls.

FIG. 1 is a flowchart of a noise removal method according to an embodiment of the present invention.
FIG. 2 is a diagram showing the signal flow of the noise removal method according to an embodiment of the present invention.
FIG. 3 is a structural diagram of a noise removal apparatus according to another embodiment of the present invention.
FIG. 4 is a structural diagram of a computer apparatus in which a noise removal method according to yet another embodiment of the present invention is implemented.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in a variety of different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that the scope of the invention is fully conveyed to those of ordinary skill in the art to which the invention pertains; the invention is defined only by the scope of the claims. The terminology used in this specification is for describing the embodiments and is not intended to limit the invention. In this specification, the singular form includes the plural unless specifically stated otherwise. As used in the specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more other components, steps, operations, and/or elements besides those mentioned.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 shows a flowchart of a noise removal method according to an embodiment of the present invention.

To remove noise for speech signal enhancement, the mixed signal is first received as input (S110).

The input mixed signal is normally a time-domain signal, so an FFT (Fast Fourier Transform) is performed to convert it into a frequency-domain signal. The signal converted to the frequency domain by the FFT consists of a magnitude signal and a phase signal. Because the present invention operates only on the magnitude signal, the phase signal is passed to the output side unmodified.
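The magnitude and phase split described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it substitutes a naive O(n^2) DFT written with the Python standard library for the FFT, and the names `dft` and `split_magnitude_phase` are hypothetical.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of one frame (stand-in for an FFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def split_magnitude_phase(spectrum):
    """Separate each complex bin into its magnitude and its phase angle."""
    magnitudes = [abs(c) for c in spectrum]
    phases = [cmath.phase(c) for c in spectrum]
    return magnitudes, phases

# One period of a sampled sine wave: energy lands in bins 1 and 3.
frame = [0.0, 1.0, 0.0, -1.0]
mags, phases = split_magnitude_phase(dft(frame))
```

Only `mags` would be modified by the later steps; `phases` is kept aside and reused unchanged when the signal is converted back to the time domain.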

Obtaining the a priori SNR requires the noise signal, the mixed signal, and the a posteriori SNR. Because only the mixed signal is received as input, the noise signal and the a posteriori SNR are estimated from the mixed signal.

First, the noise signal is obtained using a section of the mixed signal in which no speech is present. A human voice is not always present in the mixed signal. A short section right after the mixed signal input begins is therefore assumed to contain no voice, and the noise signal is obtained on the assumption that only the noise signal is present in that section.
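As a rough sketch of this step, the noise can be estimated by averaging the squared magnitudes of the first few frames in each frequency bin. The text does not say how long the initial speech-free section is, so `num_noise_frames` is an assumed parameter, and `estimate_noise_power` is a hypothetical helper name.

```python
def estimate_noise_power(magnitude_frames, num_noise_frames):
    """Average |Y(p,k)|^2 over the leading frames, separately for each bin k.

    magnitude_frames: list of frames, each a list of per-bin magnitudes.
    num_noise_frames: how many leading frames are assumed to be speech-free
                      (an assumption; the source gives no specific length).
    """
    if not 0 < num_noise_frames <= len(magnitude_frames):
        raise ValueError("num_noise_frames must be between 1 and the frame count")
    lead = magnitude_frames[:num_noise_frames]
    num_bins = len(lead[0])
    return [
        sum(frame[k] ** 2 for frame in lead) / num_noise_frames
        for k in range(num_bins)
    ]

# Two leading noise-only frames followed by a louder frame that is ignored.
frames = [[1.0, 2.0], [3.0, 2.0], [10.0, 10.0]]
noise_power = estimate_noise_power(frames, 2)
```

Averaging the squared magnitude follows the claims' reference to "the average of the squared magnitude of the noise signal"; a different averaging rule would change the later SNR values.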

Once the noise signal has been obtained, the a posteriori SNR can be obtained from the noise signal and the mixed signal, as given by Equation 1 (S120).

Here the a posteriori SNR is the value at the p-th frame and the k-th frequency index, and Y(p,k) and N(p,k) denote the mixed signal and the noise signal at the p-th frame and the k-th frequency index, respectively. The noise signal uses the value assumed in the preceding step.
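The equation itself does not survive in this text, but claim 2 describes the a posteriori SNR as the magnitude of the mixed signal divided by the magnitude of the noise signal. A minimal sketch under that reading follows; the symbol γ(p,k) = |Y(p,k)| / |N(p,k)|, the function name, and the small `floor` guard against division by zero are all assumptions added for illustration.

```python
def a_posteriori_snr(mixed_mag, noise_mag, floor=1e-12):
    """Per-bin ratio |Y(p,k)| / |N(p,k)| for one frame p.

    floor is an assumed safeguard (not from the source) that avoids
    dividing by zero when a noise bin is empty.
    """
    return [y / max(n, floor) for y, n in zip(mixed_mag, noise_mag)]

gamma = a_posteriori_snr([2.0, 1.0], [1.0, 4.0])
```

A value above 1 in a bin means the mixed signal is stronger than the estimated noise there; a value at or below 1 suggests the bin is dominated by noise.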

The a priori SNR is then calculated from the calculated a posteriori SNR (S130), as given by Equation 2.

The estimated speech signal is the mixed signal with the noise signal removed. Before the calculation according to the present invention starts, the speech signal is initialized to 0; once the speech signal of a frame has been estimated, it is used to calculate the a priori SNR from the next frame onward.

α is a preset proportional coefficient. In estimating the speech signal, it balances the influence of the immediately preceding frame's estimated speech signal and noise signal against the influence of the a posteriori SNR accumulated from the first frame up to the previous frame.

That is, α takes a value between 0 and 1. The closer it is to 1, the more the result is influenced by the value of the immediately preceding frame; the closer it is to 0, the more it is influenced by the values accumulated from the first frame to the immediately preceding frame, in other words, the greater the influence of the history.
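Equation 2 is not reproduced in this text, so the following is only a sketch under an assumption: it uses the well-known decision-directed form, which matches the terms named in claim 3 (the squared previous speech estimate over the mean squared noise, and the larger of γ − 1 and 0, blended by α). The machine-translated claim also mentions accumulation from the first frame, and how that enters the original equation is unclear; this sketch models only the frame-to-frame recursion. The symbol ξ and the function name are hypothetical.

```python
def a_priori_snr(prev_speech_mag, noise_power, post_snr, alpha, floor=1e-12):
    """Assumed decision-directed estimate for one frame, per bin k:

        xi = alpha * |S_hat(p-1,k)|^2 / mean|N(k)|^2
             + (1 - alpha) * max(gamma(p,k) - 1, 0)

    prev_speech_mag: estimated speech magnitudes of the previous frame
                     (all zeros before the first frame, as the text states).
    floor: assumed guard against division by zero.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        alpha * (s * s) / max(n2, floor) + (1.0 - alpha) * max(g - 1.0, 0.0)
        for s, n2, g in zip(prev_speech_mag, noise_power, post_snr)
    ]

xi = a_priori_snr([2.0], [4.0], [3.0], alpha=0.5)
```

Because the previous speech estimate starts at 0, the first frame's value is driven entirely by the second term.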

Once the a priori SNR has been extracted, it is used to calculate the weight value (S140), which can be obtained by Equation 3.

μ is the weight parameter. A large estimated a priori SNR means the speech signal is large, so the weight value should also be large; conversely, a small a priori SNR means the speech signal is small relative to the noise signal, so the weight value should also be small.

Once the weight value and the a priori SNR have been obtained, the two values can be used to obtain the filter value H(p,k) used for noise removal (S150), as given by Equation 4.
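Equations 3 and 4 are missing from this text, so the sketch below reads the machine-translated claims literally and should be checked against the original equations: the weight as sqrt(ξ² + |ξ|) / |ξ| (claim 1) and the filter value as μξ / (μξ + 1) (claim 10). The function names are hypothetical, and returning a weight of 0 when ξ is 0 is an assumption chosen so that the filter value matches the limit of μξ, which tends to 0.

```python
import math

def weight(xi):
    """Weight from a literal reading of claim 1: sqrt(xi^2 + |xi|) / |xi|."""
    a = abs(xi)
    if a == 0.0:
        return 0.0  # assumption: avoids division by zero, yields filter value 0
    return math.sqrt(xi * xi + a) / a

def filter_value(xi, mu):
    """Filter value from a literal reading of claim 10: mu*xi / (mu*xi + 1)."""
    g = mu * xi
    return g / (g + 1.0)

h_low = filter_value(0.01, weight(0.01))    # noise-dominated bin
h_high = filter_value(100.0, weight(100.0))  # speech-dominated bin
```

Under this reading the product μξ equals sqrt(ξ² + ξ) for non-negative ξ, so the filter value rises from 0 toward 1 as the a priori SNR grows: bins dominated by noise are attenuated and bins dominated by speech pass almost unchanged.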

Because a filter value can be obtained for each frequency index in the frame, multiplying the mixed signal by this noise removal filter value yields the final speech signal with reduced noise (S160). This process is given by Equation 5.

Y(p,k) denotes the mixed signal and, as described above, the estimated speech signal obtained in this way is used to obtain the a priori SNR in the next frame.

FIG. 2 shows the signal flow from the mixed signal containing the noise signal, through filtering, to the output of a signal in which the noise signal is attenuated.

The finally estimated speech signal is the magnitude signal of the speech. It is therefore converted to a time-domain signal by the IFFT (Inverse Fast Fourier Transform) together with the unmodified phase signal, providing the user with a signal from which the noise has been removed.
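The multiplication by the filter value and the return to the time domain can be sketched as follows. This is an illustration only: a naive inverse DFT stands in for the IFFT, the function names are hypothetical, and taking the real part of the result is an assumption that discards the tiny imaginary residue left by floating-point rounding.

```python
import cmath
import math

def idft(spectrum):
    """Naive inverse discrete Fourier transform (stand-in for an IFFT)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def apply_filter_and_resynthesize(mixed_mag, mixed_phase, gains):
    """Scale each magnitude by its filter value, reattach the untouched
    phase, and convert the frame back to time-domain samples."""
    speech_mag = [h * m for h, m in zip(gains, mixed_mag)]
    spectrum = [cmath.rect(m, ph) for m, ph in zip(speech_mag, mixed_phase)]
    samples = [c.real for c in idft(spectrum)]
    return speech_mag, samples

# Spectrum of the frame [0, 1, 0, -1]: magnitude 2 in bins 1 and 3.
mags = [0.0, 2.0, 0.0, 2.0]
phases = [0.0, -math.pi / 2, 0.0, math.pi / 2]
speech_mag, samples = apply_filter_and_resynthesize(mags, phases, [0.5] * 4)
```

`speech_mag` is the value that would be carried over to the next frame for the a priori SNR calculation, while `samples` is what the listener hears.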

Removing noise by estimating the a priori SNR in this way achieves a better noise removal effect than removing noise with a simple filter such as a conventional LPF.

FIG. 3 shows a structural diagram of a noise removal apparatus according to another embodiment of the present invention.

The input unit (310) receives a mixed signal in which a speech signal and a noise signal are mixed. The input unit may consist of a microphone or the like, or it may receive file input such as an audio file or a video file and extract only the mixed signal, which is the audio signal.

Because the present invention processes the signal in the frequency domain, the frequency signal conversion unit (320) converts the input signal into a frequency signal by a method such as the FFT. Besides the FFT, methods such as the DFT (Discrete Fourier Transform), the DCT (Discrete Cosine Transform), or a filterbank may be used for the frequency signal conversion.

The operation unit (330) extracts the filter value for noise removal from the input signal.

The a posteriori SNR is first obtained from the input mixed signal and the noise signal, as in Equation 1 above. The speech signal and the noise signal cannot be separated within the mixed signal, but the initial part of the input is assumed to contain no speech signal, so the signal in that section is taken as the noise signal when calculating the a posteriori SNR.

The a priori SNR is obtained using the a posteriori SNR calculated in this way, the speech signal estimated in the previous frame, and the average value of the noise signal obtained earlier. In this process, the proportional coefficient can be used to adjust the ratio in which the estimate of the immediately preceding frame and the history values of the earlier frames, including the immediately preceding frame, affect the a priori SNR.

Raising the proportion given to the immediately preceding frame's value has the advantage of responding sensitively to changes between frames, but the frequent changes may be uncomfortable for the user. Raising the proportion given to the history value suppresses abrupt changes and yields a natural-sounding speech signal, but cannot respond quickly to signals that change rapidly over time. An optimal value between the two can therefore be determined by experiment.

Once the a priori SNR has been obtained by Equation 2, the weight value can be obtained. A large a priori SNR means the speech signal is expected to be large, so the weight value is made large; in the opposite case the weight value is made small, in order to reduce the influence of the noise signal. The weight value is obtained by Equation 3.

The filter value is finally obtained using the weight value and the a priori SNR, as given by Equation 4.

The filter unit (340) multiplies the mixed signal by the filter value obtained in this way to obtain the noise-removed signal, as given by Equation 5.

Because the signal that has passed through the filter unit (340) with its noise removed is a frequency-domain signal, it is finally passed through the time signal conversion unit (350), which converts the speech signal into a time-domain signal and provides it to the output unit so that the user can hear the noise-removed signal.

The time signal conversion unit (350) may convert the frequency-domain signal into a time-domain signal using a method such as the IFFT, IDFT (Inverse DFT), IDCT (Inverse DCT), or an inverse filterbank.
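The chain of units 320 to 350 can be illustrated end to end with a compact sketch. It is not the patented implementation: it uses a naive DFT in place of the FFT, assumes the first `num_noise_frames` frames are speech-free, and assumes the equation forms read from the claims (magnitude-ratio a posteriori SNR, decision-directed a priori SNR, and filter value μξ / (μξ + 1) with μξ = sqrt(ξ² + ξ)). The function name, default values, and the `eps` guard are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def suppress_noise(frames, num_noise_frames=2, alpha=0.9, eps=1e-12):
    """Filter equal-length frames of samples; returns the filtered frames."""
    spectra = [dft(f) for f in frames]                     # unit 320
    mags = [[abs(c) for c in s] for s in spectra]
    phases = [[cmath.phase(c) for c in s] for s in spectra]
    bins = len(mags[0])
    lead = mags[:num_noise_frames]                         # assumed speech-free
    noise_pow = [sum(m[k] ** 2 for m in lead) / len(lead) for k in range(bins)]
    noise_mag = [math.sqrt(v) for v in noise_pow]
    prev_speech = [0.0] * bins                             # initialized to 0
    out = []
    for mag, ph in zip(mags, phases):                      # unit 330, per frame
        speech = []
        for k in range(bins):
            gamma = mag[k] / max(noise_mag[k], eps)        # a posteriori SNR
            xi = (alpha * prev_speech[k] ** 2 / max(noise_pow[k], eps)
                  + (1.0 - alpha) * max(gamma - 1.0, 0.0))  # a priori SNR
            g = math.sqrt(xi * xi + xi)                    # weight times xi
            speech.append(g / (g + 1.0) * mag[k])          # unit 340
        spec = [cmath.rect(m, p) for m, p in zip(speech, ph)]
        out.append([c.real for c in idft(spec)])           # unit 350
        prev_speech = speech
    return out

# Toy input: a steady tone as "noise" for two frames, then noise plus "speech".
noise = [0.1, -0.1, 0.1, -0.1]
speech = [1.0, 0.0, -1.0, 0.0]
frames = [noise, noise, [n + s for n, s in zip(noise, speech)]]
cleaned = suppress_noise(frames)
```

In this idealized example the noise and the speech occupy different frequency bins, so the noise-only frames come out nearly silent and the third frame comes out close to `speech`. Real wind noise overlaps the speech band and varies over time, so results in practice would be far less clean.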

Meanwhile, the noise removal method according to an embodiment of the present invention may be implemented in a computer system or recorded on a recording medium. As shown in FIG. 4, the computer system may include at least one processor (421), a memory (423), a user input device (426), a data communication bus (422), a user output device (427), and a storage (428). Each of these components communicates data via the data communication bus (422).

The computer system may further include a network interface (429) coupled to a network. The processor (421) may be a central processing unit (CPU) or a semiconductor device that processes instructions stored in the memory (423) and/or the storage (428).

The memory (423) and the storage (428) may include various forms of volatile or non-volatile storage media. For example, the memory (423) may include a ROM (424) and a RAM (425).

Accordingly, the noise removal method according to an embodiment of the present invention may be implemented as a computer-executable method. When the noise removal method according to an embodiment of the present invention is performed on a computer device, computer-readable instructions can carry out the recognition method according to the present invention.

Meanwhile, the noise removal method according to the present invention described above can be implemented as computer-readable code on a computer-readable recording medium. Computer-readable recording media include every kind of recording medium that stores data readable by a computer system, for example ROM (Read Only Memory), RAM (Random Access Memory), magnetic tape, magnetic disks, flash memory, and optical data storage devices. The computer-readable recording medium may also be distributed across computer systems connected by a computer network, and stored and executed as code that can be read in a distributed manner.

The configuration of the present invention has been described in detail above with reference to the accompanying drawings, but this is only an example, and those of ordinary skill in the art to which the invention pertains can of course make various modifications and changes within the scope of the technical idea of the invention. Accordingly, the scope of protection of the present invention should not be limited to the embodiments described above but should be determined by the following claims.

310: input unit 320: frequency signal conversion unit
330: operation unit 340: filter unit
350: time signal conversion unit

Claims

1. A noise removal method comprising:
receiving a mixed signal including a speech signal and a noise signal;
obtaining the noise signal using a section of the mixed signal in which the speech signal is absent;
obtaining an a posteriori SNR using the noise signal and the mixed signal;
estimating an a priori SNR of a current frame using the a posteriori SNR, the noise signal of a previous frame, and the a priori SNR of the previous frame;
calculating a weight value by dividing the square root of the sum of the square of the a priori SNR of the current frame and the absolute value of the a priori SNR of the current frame by the absolute value of the a priori SNR of the current frame;
calculating a filter value for each frequency using the calculated weight value; and
multiplying the mixed signal by the calculated filter value to obtain an enhanced estimated speech signal.

2. The noise removal method of claim 1, wherein obtaining the a posteriori SNR sets, as the a posteriori SNR, a value obtained by dividing the magnitude of the mixed signal by the magnitude of the noise signal.

3. The noise removal method of claim 1, wherein the a priori SNR is set to a value obtained by adding together, from the first frame to the previous frame, (i) a value obtained by dividing the square of the magnitude of the estimated speech signal of the previous frame by the average of the squared magnitude of the noise signal, and (ii) a value obtained by multiplying the larger of 0 and the a posteriori SNR minus 1 by 1 minus the predetermined proportional coefficient.

4. (Deleted)

5. The noise removal method of claim 1, wherein the filter value is a value obtained by dividing the product of the a priori SNR and the weight value by the sum of 1 and the product of the a priori SNR and the weight value.

6. A noise removal apparatus comprising one or more processors, wherein the processor implements:
an input unit that receives a mixed signal including a speech signal and a noise signal;
a frequency signal conversion unit that converts the mixed signal into a frequency-domain signal;
an operation unit that obtains the noise signal using a section of the mixed signal in which the speech signal is absent, obtains an a posteriori SNR using the noise signal and the mixed signal, estimates an a priori SNR of a current frame using the a posteriori SNR, the noise signal of a previous frame, and the a priori SNR of the previous frame, calculates a weight value by dividing the square root of the sum of the square of the a priori SNR of the current frame and the absolute value of the a priori SNR of the current frame by the absolute value of the a priori SNR of the current frame, and calculates a filter value for each frequency using the calculated weight value;
a filter unit that multiplies the mixed signal by the calculated filter value to obtain an enhanced speech signal; and
a time-domain signal conversion unit that converts the enhanced speech signal into a time-domain signal.

7. The noise removal apparatus of claim 6, wherein the operation unit sets, as the a posteriori SNR, a value obtained by dividing the magnitude of the mixed signal by the magnitude of the noise signal.

8. The noise removal apparatus of claim 6, wherein the operation unit sets, as the a priori SNR, a value obtained by adding together, from the first frame to the previous frame, (i) a value obtained by dividing the square of the magnitude of the estimated speech signal of the previous frame by the average of the squared magnitude of the noise signal, and (ii) a value obtained by multiplying the larger of 0 and the a posteriori SNR minus 1 by 1 minus the predetermined proportional coefficient.

9. (Deleted)

10. The noise removal apparatus of claim 6, wherein the operation unit sets, as the filter value, a value obtained by dividing the product of the a priori SNR and the weight value by the sum of 1 and the product of the a priori SNR and the weight value.