CN113345455A - Wearable device voice signal processing device and method - Google Patents

Wearable device voice signal processing device and method

Info

Publication number
CN113345455A
Authority
CN
China
Prior art keywords
voice
target direction
signal processing
inclination angle
wearable device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110616157.7A
Other languages
Chinese (zh)
Inventor
王鸣
梁家恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-02
Filing date
2021-06-02
Publication date
2021-09-03
Application filed by Unisound Intelligent Technology Co Ltd, Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority to CN202110616157.7A
Publication of CN113345455A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C9/00 Measuring inclination, e.g. by clinometers, by levels
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The embodiment of the invention discloses a wearable device voice signal processing device and method. The device comprises: a voice acquisition module that acquires the original voice signal; an inclination angle acquisition module that acquires the inclination angle of the voice acquisition device; a beam direction determination module that determines the target direction of beamforming according to the inclination angle; and a signal processing module that enhances sound from the target direction and attenuates sound from non-target directions. The invention addresses the poor voice processing performance of existing wearable voice devices when they are tilted.

Description

Wearable device voice signal processing device and method
Technical Field
The embodiment of the invention relates to the technical field of voice processing, in particular to a device and a method for processing a voice signal of wearable equipment.
Background
Currently, for multi-speaker separation, common practice in the industry is either to use hardware devices (such as microphone arrays or bidirectional microphones) to separate speakers at the sound collection stage, or to apply a clustering algorithm over sound features to separate speakers in monaural audio. Speaker separation only classifies voice audio by speaker; it does not identify the specific speaker to whom a voice belongs, which is the task of voiceprint recognition (speaker identification). Speaker separation is therefore used to extract the voices of interest from large volumes of speech. To separate speakers, the speech must first be segmented, and the resulting segments are then labeled with speaker information.
Voiceprint-based schemes struggle when the speakers' voices are similar, semantic-based schemes perform poorly in open-conversation scenarios, and schemes that rely on signal processing alone handle poorly the case in which the device is tilted while being worn.
Disclosure of Invention
Embodiments of the present invention provide a device and a method for processing a voice signal of a wearable device, so as to solve the problem of poor voice processing effect of the existing wearable voice device in an inclined state.
In order to achieve the above object, the embodiments of the present invention mainly provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides a wearable device voice signal processing apparatus comprising: a voice acquisition module that acquires the original voice signal; an inclination angle acquisition module that acquires the inclination angle of the voice acquisition device; a beam direction determination module that determines the target direction of beamforming according to the inclination angle; and a signal processing module that enhances sound from the target direction and attenuates sound from non-target directions.
Furthermore, the voice acquisition module acquires original input voice through a microphone area array installed on the wearable device.
Further, after the voice acquisition module acquires the original voice, the voice acquisition module converts the voice signal into an electric signal and preprocesses the electric signal, wherein the preprocessing operation comprises the following steps: filtering and removing impurities.
Further, the inclination angle acquisition module acquires the inclination angle of the current device through an acceleration sensor installed on the wearable device, and sends the inclination angle to the beam direction determination module.
Furthermore, the beam direction judging module judges the target direction of beam forming according to the inclination angle, and accurate target angle information can be provided for a beam forming algorithm through the inclination angle.
Furthermore, the signal processing module enhances the voice of the speaker in the target direction by using a beam forming algorithm, attenuates the voice in the non-target direction, and acquires the voice of the speaker in the target direction.
In a second aspect, an embodiment of the present invention further provides a method for processing a speech signal of a wearable device, where the method includes:
acquiring original voice through a microphone area array arranged on the wearable device;
acquiring the inclination angle of the current equipment by using an acceleration sensor on the wearable equipment;
judging the beam forming direction according to the inclination angle;
and enhancing the sound in the target direction by utilizing a beam forming algorithm, and attenuating the sound in the non-target direction.
Further, after the original voice is obtained, the voice signal is converted into an electric signal, and the electric signal is filtered and purified.
Furthermore, after enhancing the sound in the target direction, the beam forming algorithm obtains the voice of the speaker in the target direction, separates and reduces noise of the voice, and performs voice separation and recognition.
The technical scheme provided by the embodiment of the invention at least has the following advantages:
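The invention discloses a wearable device voice signal processing device and method that determine the target direction of beamforming from the inclination angle of the device, which addresses the poor voice processing performance of existing wearable voice devices when they are tilted.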
Drawings
Fig. 1 is a flowchart of a method for processing a speech signal of a wearable device according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an area array arrangement of microphones according to an embodiment of the present invention.
Fig. 3 is a schematic diagram, according to an embodiment of the present invention, showing that the inclination angle of the device relative to the direction of gravitational acceleration is consistent with the angle between the human voice and the device.
Detailed Description
The following description of the embodiments of the present invention is provided for illustrative purposes, and other advantages and effects of the present invention will become apparent to those skilled in the art from the present disclosure.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
Examples
This embodiment discloses a wearable device voice signal processing device comprising: a voice acquisition module that acquires the original voice signal; an inclination angle acquisition module that acquires the inclination angle of the voice acquisition device; a beam direction determination module that determines the target direction of beamforming according to the inclination angle; and a signal processing module that enhances sound from the target direction and attenuates sound from non-target directions.
The voice acquisition module acquires the original input voice through a microphone area array mounted on the wearable device. After acquiring the original voice, it converts the voice signal into an electrical signal and preprocesses that signal by filtering and removing impurities. Preprocessing yields a cleaner voice signal, which facilitates the later voice recognition and voice separation operations.
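The text does not specify which filter the preprocessing uses. As a minimal sketch of one possibility, assuming the goal is to remove a DC offset and very low-frequency drift from the digitized microphone signal, a first-order high-pass filter could look like the following. The function name `high_pass` and the coefficient `alpha` are illustrative assumptions, not part of the disclosure.

```python
from typing import List, Sequence


def high_pass(samples: Sequence[float], alpha: float = 0.95) -> List[float]:
    """First-order high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).

    alpha close to 1.0 gives a low cutoff frequency (assumed tuning value).
    """
    if not samples:
        return []
    out: List[float] = []
    prev_x = samples[0]  # start from the first sample to avoid a start-up transient
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A constant (DC) input is removed entirely.
dc_removed = high_pass([1.0] * 10)
```

A real device would more likely apply a band-limiting filter matched to the speech band and the sampling rate; this sketch only shows the shape of the step.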
Referring to fig. 2 and 3, the inclination angle obtaining module obtains an inclination angle of the current device through an acceleration sensor mounted on the wearable device, and sends the inclination angle to the beam direction determining module, and the beam direction determining module determines a target direction of beam forming according to the inclination angle, so that accurate target angle information can be provided for a beam forming algorithm through the inclination angle.
The principle by which an acceleration sensor measures the inclination angle is as follows. When at rest, the sensor is subject only to gravity and therefore measures a gravitational acceleration of 1 g. By measuring the components of the gravitational acceleration along the sensor's X and Y axes, its inclination angle in the vertical plane can be calculated. A 2-axis acceleration sensor measures only the gravity components along the X and Y axes, so it can measure the tilt angle only in the X-Y plane. Because an object tilted in space rarely tilts exactly within the X-Y plane, measuring with a 2-axis sensor alone is limiting, and a 3-axis acceleration sensor is therefore considered. A 3-axis acceleration sensor measures the gravity components along the X, Y, and Z axes, from which the spatial inclination angle is calculated by formula.
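The formula itself is not reproduced in this text. As a hedged sketch, the widely used arctangent relations for inclination sensing with a 3-axis accelerometer can be written as below; they are assumed here and may differ from the formula the applicants had in mind. Inputs are the gravity components in units of g, and the function name `tilt_angles` is illustrative.

```python
import math
from typing import Tuple


def tilt_angles(ax: float, ay: float, az: float) -> Tuple[float, float, float]:
    """Return (x_tilt, y_tilt, inclination) in degrees from gravity components.

    x_tilt: angle between the X axis and the horizontal plane
    y_tilt: angle between the Y axis and the horizontal plane
    inclination: angle between the Z axis and the gravity vector
    """
    x_tilt = math.degrees(math.atan2(ax, math.hypot(ay, az)))
    y_tilt = math.degrees(math.atan2(ay, math.hypot(ax, az)))
    inclination = math.degrees(math.atan2(math.hypot(ax, ay), az))
    return x_tilt, y_tilt, inclination


# Device lying flat: gravity is entirely on the Z axis.
flat = tilt_angles(0.0, 0.0, 1.0)

# Device tilted 45 degrees about the Y axis: gravity is split between X and Z.
tilted = tilt_angles(math.sin(math.radians(45)), 0.0, math.cos(math.radians(45)))
```

Using `atan2` with all three components keeps the result well defined over the full range of orientations, which a 2-axis calculation cannot do.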
Note that this method relies on the sensor being subject only to gravity when at rest; if the object also has motion acceleration, the spatial inclination formula is no longer accurate, and a constraint must be added to the formula. Regarding the hardware implementation of 3-axis measurement, the acceleration sensors currently used in consumer products fall into two types: digital output (e.g., ADXL345) and analog output (e.g., ADXL335). A digital-output acceleration sensor can be connected directly to the MCU through an I2C or SPI bus, whereas an analog-output acceleration sensor must be sampled with an ADC. Most MCUs in common use have built-in ADC channels, so either type of sensor can be connected to the MCU easily to implement the measurement function.
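The constraint is not spelled out in this text. One common choice, assumed here purely for illustration, is to trust a reading only when the magnitude of the measured acceleration is close to 1 g, which indicates that little motion acceleration is present. The function name `is_quasi_static` and the tolerance value are hypothetical.

```python
import math


def is_quasi_static(ax: float, ay: float, az: float, tolerance_g: float = 0.05) -> bool:
    """Return True when the acceleration magnitude is within tolerance of 1 g.

    Components are in units of g. tolerance_g is an assumed tuning value.
    """
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    return abs(magnitude - 1.0) <= tolerance_g


at_rest = is_quasi_static(0.6, 0.0, 0.8)   # magnitude is 1.0 g
moving = is_quasi_static(0.5, 0.0, 1.2)    # magnitude is 1.3 g
```

Readings that fail the check could be skipped so that the previous tilt estimate is held until the device is steady again.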
The signal processing module uses a beamforming algorithm to enhance the voice of the speaker in the target direction and attenuate sound from non-target directions, thereby acquiring the voice of the speaker in the target direction.
Referring to fig. 1, this embodiment further discloses a method for processing a speech signal of a wearable device, where the method includes:
acquiring original voice through a microphone area array arranged on the wearable device, converting a voice signal into an electric signal, and filtering and removing impurities from the electric signal;
acquiring the inclination angle of the current equipment by using an acceleration sensor on the wearable equipment;
judging the beam forming direction according to the inclination angle;
and enhancing the sound in the target direction by utilizing a beam forming algorithm, attenuating the sound in the non-target direction, acquiring the voice of the speaker in the target direction, separating and reducing noise of the voice, and performing voice separation and recognition.
Beamforming derives from the concept of the adaptive antenna: at the receiving end, signal processing forms the desired signal by weighting and combining the signals received by each element of a multi-antenna array. From the viewpoint of the antenna pattern, this corresponds to forming a beam in a specified direction; for example, an originally omnidirectional receiving pattern is converted into a lobed pattern with nulls and a direction of maximum gain. The same principle applies at the transmitting end, where adjusting the amplitude and phase of the feed to each array element forms a pattern of the required shape. Beamforming is thus a signal processing technique for directional signal transmission or reception in a sensor array: signals at particular angles undergo constructive interference while others undergo destructive interference. A beamforming algorithm can therefore enhance sound from the target direction and attenuate sound from non-target directions.
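The text does not name a specific beamforming algorithm. As a minimal sketch of the constructive and destructive interference described above, the following uses narrowband delay-and-sum weights for a uniform linear array of microphones. The linear geometry (the disclosure uses an area array), the microphone count and spacing, the frequency, and the use of the tilt angle directly as the steering angle are all assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)


def steering_weights(num_mics, spacing_m, freq_hz, target_deg):
    """Delay-and-sum weights for a uniform linear array (assumed geometry)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    s = math.sin(math.radians(target_deg))
    # Conjugate of the expected phase at each microphone, normalised by N.
    return [cmath.exp(-1j * k * m * spacing_m * s) / num_mics
            for m in range(num_mics)]


def array_gain(weights, spacing_m, freq_hz, arrival_deg):
    """Magnitude of the array output for a unit plane wave from arrival_deg."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    s = math.sin(math.radians(arrival_deg))
    return abs(sum(w * cmath.exp(1j * k * m * spacing_m * s)
                   for m, w in enumerate(weights)))


tilt_deg = 20.0  # hypothetical tilt angle reported by the accelerometer
weights = steering_weights(num_mics=4, spacing_m=0.04, freq_hz=2000.0,
                           target_deg=tilt_deg)
on_target = array_gain(weights, 0.04, 2000.0, tilt_deg)   # phases align
off_target = array_gain(weights, 0.04, 2000.0, -40.0)     # phases partly cancel
```

For a wave arriving from the steered direction the per-microphone phases cancel exactly and the gain is 1, while a wave from another direction sums with mismatched phases and is attenuated. Practical speech systems work across many frequency bands and often use adaptive beamformers rather than fixed weights.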
In summary, the invention discloses a wearable device voice signal processing device and method in which the target direction of beamforming is determined from the inclination angle of the device.
The disclosed embodiments of the present invention provide a computer-readable storage medium having stored therein computer program instructions which, when run on a computer, cause the computer to perform the above-described method.
In an embodiment of the invention, the processor may be an integrated circuit chip having signal processing capability. The Processor may be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.
The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or which may include both volatile and nonvolatile memory.
The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash Memory.
The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM).
The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.
Those skilled in the art will appreciate that the functionality described in the present invention may be implemented in a combination of hardware and software in one or more of the examples described above. When software is applied, the corresponding functionality may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The embodiments above describe the objects, technical solutions, and advantages of the present invention in further detail. It should be understood that these are only exemplary embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (9)

1. A wearable device voice signal processing apparatus, the apparatus comprising: the voice acquisition module acquires original voice signals, the inclination angle acquisition module acquires the inclination angle of the voice acquisition device, the beam direction judgment module judges the target direction of beam forming according to the inclination angle, and the signal processing module enhances the sound in the target direction and attenuates the sound in the non-target direction.
2. The speech signal processing apparatus of claim 1, wherein the speech acquisition module acquires the original input speech through an array of microphones mounted on the wearable device.
3. The wearable device voice signal processing apparatus of claim 2, wherein the voice acquisition module obtains original voice, converts the voice signal into an electrical signal, and pre-processes the electrical signal, and the pre-processing operation comprises: filtering and removing impurities.
4. The speech signal processing device of claim 1, wherein the tilt angle acquiring module acquires a tilt angle of the current device through an acceleration sensor mounted on the wearable device, and sends the tilt angle to the beam direction determining module.
5. The speech signal processing apparatus of claim 1, wherein the beam direction determining module determines the target direction of the beam forming according to an inclination angle, and the inclination angle provides accurate target angle information for the beam forming algorithm.
6. The speech signal processing apparatus of claim 1, wherein the signal processing module is further configured to enhance the voice of the speaker in the target direction by using a beam forming algorithm, and attenuate the voice in the non-target direction to obtain the voice of the speaker in the target direction.
7. A speech signal processing method for wearable equipment is characterized by comprising the following steps:
acquiring original voice through a microphone area array arranged on the wearable device;
acquiring the inclination angle of the current equipment by using an acceleration sensor on the wearable equipment;
judging the beam forming direction according to the inclination angle;
and enhancing the sound in the target direction by utilizing a beam forming algorithm, and attenuating the sound in the non-target direction.
8. The method for processing the speech signal of the wearable device as claimed in claim 7, wherein the original speech is obtained and then converted into an electrical signal, and the electrical signal is filtered and purified.
9. The method as claimed in claim 7, wherein the beamforming algorithm enhances the sound in the target direction, obtains the voice of the speaker in the target direction, performs noise reduction on the voice, and performs voice separation recognition.
CN202110616157.7A 2021-06-02 2021-06-02 Wearable device voice signal processing device and method Pending CN113345455A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110616157.7A CN113345455A (en) 2021-06-02 2021-06-02 Wearable device voice signal processing device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110616157.7A CN113345455A (en) 2021-06-02 2021-06-02 Wearable device voice signal processing device and method

Publications (1)

Publication Number Publication Date
CN113345455A true CN113345455A (en) 2021-09-03

Family

ID=77472896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110616157.7A Pending CN113345455A (en) 2021-06-02 2021-06-02 Wearable device voice signal processing device and method

Country Status (1)

Country Link
CN (1) CN113345455A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050201549A1 (en) * 2004-03-10 2005-09-15 Mitel Networks Corporation Method and apparatus for optimizing speakerphone performance based on tilt angle
US20100128892A1 (en) * 2008-11-25 2010-05-27 Apple Inc. Stabilizing Directional Audio Input from a Moving Microphone Array
US20110158425A1 (en) * 2009-12-25 2011-06-30 Fujitsu Limited Microphone directivity control apparatus
US20140093093A1 (en) * 2012-09-28 2014-04-03 Apple Inc. System and method of detecting a user's voice activity using an accelerometer
CN107925817A (en) * 2015-07-27 2018-04-17 索诺瓦公司 Clip microphone assembly
CN108831498A (en) * 2018-05-22 2018-11-16 出门问问信息科技有限公司 The method, apparatus and electronic equipment of multi-beam beam forming
CN110178386A (en) * 2017-01-09 2019-08-27 索诺瓦公司 Microphone assembly for being worn at user's chest
US20200202880A1 (en) * 2018-12-20 2020-06-25 Gn Hearing A/S Hearing device with acceleration-based beamforming
CN111683319A (en) * 2020-06-08 2020-09-18 北京爱德发科技有限公司 Call pickup noise reduction method, earphone and storage medium

Similar Documents

Publication Publication Date Title
CN106653041B (en) Audio signal processing apparatus, method and electronic apparatus
JP7011075B2 (en) Target voice acquisition method and device based on microphone array
CN106872945B (en) Sound source positioning method and device and electronic equipment
CN107221336B (en) Device and method for enhancing target voice
CN100559461C (en) The apparatus and method of voice activity detection
CN111629301B (en) Method and device for controlling multiple loudspeakers to play audio and electronic equipment
US20180146306A1 (en) Audio Analysis and Processing System
US8654998B2 (en) Hearing aid apparatus
WO2016183791A1 (en) Voice signal processing method and device
CN110379439B (en) Audio processing method and related device
CN106448722A (en) Sound recording method, device and system
CN1288223A (en) Device adaptive for direction characteristic used for speech voice control
CN109104683B (en) Method and system for correcting phase measurement of double microphones
CN110428851B (en) Beam forming method and device based on microphone array and storage medium
US9699549B2 (en) Audio capturing enhancement method and audio capturing system using the same
CN111916101A (en) Deep learning noise reduction method and system fusing bone vibration sensor and double-microphone signals
CN110830870B (en) Earphone wearer voice activity detection system based on microphone technology
WO2022027423A1 (en) Deep learning noise reduction method and system fusing signal of bone vibration sensor with signals of two microphones
US10972844B1 (en) Earphone and set of earphones
US11805360B2 (en) Noise suppression using tandem networks
US20180146285A1 (en) Audio Gateway System
CN112489674A (en) Speech enhancement method, device, equipment and computer readable storage medium
CN113345455A (en) Wearable device voice signal processing device and method
JP7079189B2 (en) Sound source direction estimation device, sound source direction estimation method and its program
JP2005303574A (en) Voice recognition headset

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210903
