CN111263253B

CN111263253B - A method and device for collecting voice signals using a microphone array

Info

Publication number: CN111263253B
Application number: CN201811461804.6A
Authority: CN
Inventors: 郑进吉; 李希才; 邓智威; 李飞燕
Original assignee: Yunnan Normal University
Current assignee: Yunnan Normal University
Priority date: 2018-12-02
Filing date: 2018-12-02
Publication date: 2025-03-25
Anticipated expiration: 2038-12-02
Also published as: CN111263253A

Abstract

A voice signal acquisition method and device for a microphone array. The method comprises the following steps: acquiring multiple groups of voice signals, each group including the voice signals of two microphones that are spaced apart or adjacent; converting the two voice signals in each group respectively to obtain a digital signal corresponding to each; and synthesizing the obtained digital signals to obtain a multi-channel stereo signal. Because the microphone voice signals are acquired in a cross-injection manner, the voice signals of adjacent or spaced microphones can serve as sampling references. As a result, the acquisition channel of each group of voice signals has independent acquisition capability while the synchronous acquisition requirements of different acquisition positions are still met, which improves the mixed acquisition of voice signals, reduces the design difficulty of synchronous acquisition for multi-channel microphone arrays, and improves the expandability of microphone-array voice signal acquisition.

Description

Voice signal acquisition method and acquisition device for microphone array

Technical Field

The application relates to the field of voice signal processing, in particular to a voice signal acquisition method and device for a microphone array.

Background

At present, microphone signal collectors generally use an independent analog-to-digital converter (ADC) under the control of a microcontroller to acquire the digital voice signal. Compared with this traditional independent-ADC acquisition technology, dual-channel stereo acquisition technology offers low cost, low power consumption, a high number of sampling bits, programmable gain control, good synchronous acquisition characteristics and fast AD conversion. It is widely used in artificial-intelligence voice signal acquisition terminals, digital communication cameras, and voice signal acquisition and processing on mobile devices, and is a key technology for equipment digitization.

As shown in fig. 1, the basic structure of the prior-art scheme for data acquisition with an independent analog-to-digital converter (ADC) includes a microphone voice sensing unit (microphone) 1, a signal amplifying unit 2, an analog-to-digital conversion (ADC) unit 3, a data read-write control unit 11, a clock generating unit 10, a PC upper computer unit 12 and a storage unit 13. The clock generating unit 10 supplies a clock signal of frequency fCLK. Driven by this clock, the analog-to-digital conversion unit 3 converts the two-channel analog voice signal into a digital signal; the data read-write control unit 11 then reads out the digital signal using a sampling clock of the same frequency fCLK, and the storage unit 13 finally stores it, thereby obtaining the desired voice signal.

In the technical scheme illustrated in fig. 1, the signal amplifying circuit and the analog-to-digital conversion circuit are separate, the amplification gains of the amplifying circuits are difficult to keep consistent, and the gain coefficient cannot be adjusted dynamically according to the amplitude of the input signal, so saturation distortion caused by over-amplification can occur. Fig. 2 shows the test result of the prior-art data collector, where x1 shows the waveforms of the collected voice signals of two channels as two curves. In the experiment, the two microphones were placed at the same distance from the sound source. The curves show that characteristic points of the signals, such as peaks and troughs, fall at the same moments on the time axis, so the synchronism of the two-channel acquisition can be judged qualitatively; if the accuracy of the synchronous acquisition needs to be evaluated quantitatively, this can be done with a cross-correlation function.
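The quantitative check mentioned above can be illustrated with a minimal sketch. It assumes the two channels are available as equal-rate lists of samples and estimates their relative delay as the lag that maximizes the cross-correlation. The function name `cross_correlation_lag` and the test pulse are illustrative only and do not come from the patent.

```python
def cross_correlation_lag(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation of x and y.

    A positive result means y is delayed relative to x; 0 means the two
    channels are aligned to within one sample.
    """
    n = min(len(x), len(y))
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += x[i] * y[j]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag


# Illustrative data (an assumption): a short pulse and a copy delayed by 3 samples.
channel_1 = [0.0] * 10 + [1.0, 3.0, 5.0, 3.0, 1.0] + [0.0] * 10
channel_2 = [0.0] * 3 + channel_1[:-3]
delay_samples = cross_correlation_lag(channel_1, channel_2, max_lag=5)
print("estimated delay in samples:", delay_samples)
```

Dividing the estimated lag by the sampling rate gives the delay in seconds; a lag of zero for two microphones at equal distance from the source is consistent with synchronous acquisition.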

Therefore, the prior art has the following defects. (1) The sampling rate of the voice signal collector is limited by the number of acquisition channels and by the main frequency and digital transmission bandwidth of the core controller of the data read-write control unit. For example, when the PC upper computer communicates with the data read-write control unit through a USB cable, the limited USB data transmission bandwidth means that the number of channels of a collector with a sampling rate of 200 kHz and 12-bit quantization cannot be expanded without limit when data is read over the USB interface; generally only about 10 channels of data can be collected simultaneously.
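The arithmetic behind this channel limit can be sketched as follows. The sampling rate and bit depth are the figures quoted above; the usable link throughput is a hypothetical value chosen only to reproduce the "about 10 channels" figure, since the text does not state the actual USB throughput.

```python
def required_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw payload bit rate (bit/s) for the given number of channels."""
    return sample_rate_hz * bits_per_sample * channels


def max_channels(link_bit_rate, sample_rate_hz, bits_per_sample):
    """Largest whole number of channels whose raw payload fits in the link."""
    return link_bit_rate // (sample_rate_hz * bits_per_sample)


SAMPLE_RATE_HZ = 200_000      # from the text: sampling rate of 200K
BITS_PER_SAMPLE = 12          # from the text: 12-bit quantization
ASSUMED_LINK_BPS = 24_000_000 # hypothetical usable throughput, not from the patent

per_channel = required_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, 1)
ten_channels = required_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, 10)
print(per_channel, ten_channels)
print(max_channels(ASSUMED_LINK_BPS, SAMPLE_RATE_HZ, BITS_PER_SAMPLE))
```

Each channel needs 2.4 Mbit/s of raw payload, so ten channels need 24 Mbit/s before any protocol overhead is counted.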

(2) Because the clock frequency of the analog-to-digital converter (ADC) is relatively high, the design of a multi-channel microphone voice signal acquisition circuit can suffer from asynchronous acquisition caused by insufficient clock driving capability and signal attenuation, as well as from complex circuit design and high cost. In addition, because the parameters of electronic components such as resistors and capacitors vary during circuit design and integration, the amplification factors of the channels of a common collector's pre-amplifying circuit differ considerably, and the circuit is highly susceptible to changes in the external environment; this aggravates the asynchrony of the analog-to-digital converter's signal acquisition and produces serious noise.

Disclosure of Invention

The technical problem actually solved by the application is how to acquire the voice signals of the microphone array in a plurality of channels and obtain a better acquisition effect. In order to solve the above problems, the present application provides a method and device for collecting voice signals for a microphone array.

According to a first aspect, an embodiment of the present application provides a method for collecting a voice signal for a microphone array, including the steps of:

acquiring multiple groups of voice signals, wherein each group of voice signals comprises the voice signals of two microphones that are spaced apart or adjacent;

converting the two voice signals in each group respectively to obtain digital signals respectively corresponding to the two voice signals; and

synthesizing the digital signals to obtain a multi-channel stereo signal.

Acquiring the multiple groups of voice signals includes:

acquiring one voice signal from one microphone and another voice signal from another microphone spaced apart from that microphone, so that a group of voice signals includes the voice signals of two spaced-apart microphones; and/or

acquiring one voice signal from one microphone and another voice signal from another microphone adjacent to that microphone, so that a group of voice signals includes the voice signals of two adjacent microphones.

Converting the two voice signals in each group respectively to obtain the corresponding digital signals includes:

for the two voice signals in each group, performing synchronous analog-to-digital conversion on the two voice signals according to a preset number of sampling bits and a preset sampling rate to obtain digital signals respectively corresponding to the two voice signals.

The method further includes a signal amplification step before the two voice signals in each group are converted, the signal amplification step including:

amplifying each voice signal according to a preset amplification gain, so that analog-to-digital conversion is performed on the amplified voice signal.

Synthesizing the digital signals to obtain the multi-channel stereo signal includes:

storing the digital signals obtained by converting each group of voice signals; and

synthesizing the synchronously stored digital signals to obtain the multi-channel stereo signal.

According to a second aspect, an embodiment of the present application provides a voice signal acquisition device for a microphone array, including:

a plurality of acquisition modules, each configured to acquire one group of voice signals, wherein each group of voice signals comprises the voice signals of two microphones that are spaced apart or adjacent;

a plurality of conversion modules, connected to the acquisition modules in one-to-one correspondence and configured to convert the two voice signals in the group obtained by each acquisition module to obtain digital signals respectively corresponding to the two voice signals; and

a processing module, connected to the plurality of acquisition modules and configured to synthesize the digital signals to obtain a multi-channel stereo signal.

The acquisition module acquires one voice signal from one microphone and acquires the other voice signal from the other microphone which is arranged at intervals with the microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are arranged at intervals, and/or,

The acquisition module acquires one voice signal from one microphone and acquires the other voice signal from another microphone adjacent to the one microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are adjacently arranged.

Each conversion module comprises two analog-to-digital conversion circuits, a data read-write control unit and a clock generation unit, wherein the two analog-to-digital conversion circuits are each connected to the acquisition module associated with the conversion module, and the data read-write control unit and the clock generation unit are both connected to the two analog-to-digital conversion circuits;

For two paths of voice signals in a group of voice signals obtained by each obtaining module, the data read-write control unit and the clock generation unit control the two analog-to-digital conversion circuits so as to perform synchronous analog-to-digital conversion on the two paths of voice signals according to a preset sampling bit number and a preset sampling rate, and obtain digital signals corresponding to the two paths of voice signals respectively.

The voice signal acquisition device also comprises a plurality of signal amplification modules, wherein each signal amplification module is arranged between each acquisition module and a conversion module connected with the acquisition module;

the plurality of signal amplifying modules amplify the voice signals of each path according to preset amplifying gains respectively so as to carry out analog-to-digital conversion on the amplified voice signals.

The voice signal acquisition device further comprises a plurality of storage units, and the storage units are respectively connected with the conversion modules;

The storage units respectively store the digital signals after each group of voice signal conversion processing, so that the processing module obtains the digital signals stored synchronously from the storage units and synthesizes the digital signals to obtain the multichannel stereo signals.

The beneficial effects of the present application are as follows:

According to the voice signal acquisition method and device for a microphone array of the above embodiments, the method comprises acquiring multiple groups of voice signals, each group including the voice signals of two microphones that are spaced apart or adjacent; converting the two voice signals in each group respectively to obtain the corresponding digital signals; and synthesizing the obtained digital signals to obtain a multi-channel stereo signal. In the first aspect, because the microphone voice signals are acquired in a cross-injection manner, the voice signals of adjacent or spaced microphones can serve as sampling references, so that the acquisition channel of each group has independent acquisition capability while the synchronous acquisition requirements of different acquisition positions are met; this improves the mixed acquisition of voice signals, reduces the design difficulty of synchronous acquisition for a multi-channel microphone array, and improves the expandability of microphone-array voice signal acquisition. In the second aspect, a symmetrical amplification and conversion structure performs synchronous amplification and analog-to-digital conversion on each group of voice signals, which helps guarantee the consistency and synchronism of the amplification and conversion processes and avoids, as far as possible, noise and distortion caused by asynchrony. In the third aspect, the acquisition device built on the voice signal acquisition method uses a data read-write control module to control the conversion of the voice signals, so that the converted digital signals can be stored and transmitted through a double-buffer mechanism; this helps improve the signal-to-noise ratio and avoids inconsistent signal trace lengths and difficult maintenance. In the fourth aspect, the voice signal acquisition device uses multiple acquisition modules, conversion modules and storage units to jointly form the acquisition channels for multiple voice signals, which reduces the design difficulty of synchronous acquisition for large microphone arrays, greatly increases the bandwidth of the voice signals, and better optimizes the wiring structure of the microphone array.

Drawings

FIG. 1 is a schematic diagram of a voice signal collector of a conventional microphone array;

FIG. 2 is a diagram showing the test effect of a voice signal collector of a microphone array according to the prior art;

FIG. 3 is a schematic diagram of a voice signal acquisition device for a one-dimensional microphone array according to one embodiment;

FIG. 4 is a detailed schematic diagram of a voice signal acquisition device for a one-dimensional microphone array according to an embodiment;

FIG. 5 is a simplified schematic diagram of a voice signal acquisition device for a two-dimensional microphone array according to one embodiment;

FIG. 6 is a flow chart of a method of speech signal acquisition for a microphone array;

FIG. 7 is a schematic diagram illustrating a data flow of a method for collecting voice signals of a microphone array according to an embodiment;

FIG. 8 is a diagram showing a test effect of a voice signal acquisition device of a microphone array according to the present application;

FIG. 9 is a second diagram showing the effect of the synchronization test of the voice signal acquisition device of the microphone array according to the present application.

Detailed Description

The application will be described in further detail below with reference to the drawings by means of specific embodiments.

Embodiment one:

Referring to fig. 3, a voice signal acquisition device for a microphone array includes a plurality of acquisition modules (e.g. 111, 112, 113, 114), a plurality of conversion modules (e.g. 121, 122, 123, 124) and a processing module 13, which are described below.

Here, the microphone array of the present application is a one-dimensional array or a two-dimensional array, or even a multi-dimensional array, formed by arranging a plurality of microphones in rows and columns. In this embodiment, a one-dimensional array of a plurality of microphones (such as M1, M2, M3, M4, and M5) will be described as an example, where the microphones are used to receive the original voice signal and convert the natural voice signal into an analog signal for output, and the microphones may be MEMS microphone sensors with high sensitivity, electret or capacitive microphone sensors, and there is no limitation on the specific type and specific number of microphones.

The plurality of acquisition modules (111, 112, 113, 114) are each configured to acquire one group of voice signals, each group comprising the voice signals of two microphones that are spaced apart or adjacent. In one embodiment, an acquisition module acquires one voice signal from one microphone and another voice signal from another microphone spaced apart from that microphone, so that the acquired group includes the voice signals of two spaced-apart microphones. Of course, an acquisition module may also acquire the microphone voice signals in another way: it acquires one voice signal from one microphone and another voice signal from another microphone adjacent to that microphone, so that the acquired group includes the voice signals of two adjacent microphones; see acquisition module 111 for a specific example. In this embodiment, the spaced and adjacent acquisition modes are preferably used together: the adjacent mode may be used for microphones at the edge of the array, and the spaced mode for microphones in the middle.

The acquisition modules (111, 112, 113, 114) may be ports of communication lines; for example, the acquisition module 111 may be a communication port that acquires the voice signals from the microphone M1 and the microphone M2 respectively. Furthermore, in a one-dimensional microphone array such as that shown in fig. 4, which mainly comprises microphones 1, 2, ..., m, n, voice signals are acquired from the microphones by the acquisition modules (111, 112, ..., 114). Here m and n are only the sequence numbers of the microphones, and a number of identical modules between acquisition modules 112 and 114 are omitted; the figure should therefore not be construed as limiting the number of microphones, acquisition modules or conversion modules.

The plurality of conversion modules (121, 122, 123, 124) are connected to the plurality of acquisition modules (111, 112, 113, 114) in one-to-one correspondence, and each converts the two voice signals in the group obtained by its acquisition module to obtain digital signals respectively corresponding to the two voice signals. When the conversion modules (121, 122, 123, 124) perform analog-to-digital conversion, a unified sampling signal may be provided to every conversion module to control the conversion process. In one embodiment, referring to fig. 4, each conversion module comprises two analog-to-digital conversion circuits, a data read-write control unit and a clock generation unit. The two analog-to-digital conversion circuits are each connected to the acquisition module associated with the conversion module, and the data read-write control unit and the clock generation unit are both connected to the two analog-to-digital conversion circuits. For the two voice signals in the group obtained by each acquisition module, the data read-write control unit and the clock generation unit control the two analog-to-digital conversion circuits to perform synchronous analog-to-digital conversion according to a preset number of sampling bits and a preset sampling rate, yielding digital signals respectively corresponding to the two voice signals.

In a specific embodiment, referring to fig. 4, the conversion module 121, for example, includes two analog-to-digital conversion circuits (1211, 1212), a data read-write control unit 1213 and a clock generation unit 1214. The analog-to-digital conversion circuits (1211, 1212) are common analog-to-digital conversion chips such as the WM8978, whose number of sampling bits and sampling rate are adjustable (the WM8978 supports a standard audio clock configuration in which an MCLK of 256fs is supplied to the DAC and ADC). The data read-write control unit 1213 controls the number of sampling bits of the analog-to-digital conversion circuits (1211, 1212) and reads the digital signals obtained by analog-to-digital conversion. The core processor of the data read-write control unit 1213 has an IIS interface dedicated to reading and writing digital voice signals; the interface includes a clock signal interface for data reading, an analog-to-digital data read-out interface and a digital audio output interface. The clock generation unit 1214 is a common clock chip that supplies the analog-to-digital conversion circuits (1211, 1212) with the frequency signal for the conversion process, and this frequency signal controls the sampling rate of the analog-to-digital conversion. Normally, the data converted by the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 is read automatically into two data buffers.

It should be noted that the number of sampling bits for analog-to-digital conversion may be selected from 8, 16 and 24 bits, and the sampling rate from 8 kHz, 11.025 kHz, 16 kHz, 22.05 kHz, 32 kHz, 44.1 kHz, 48 kHz, 88.2 kHz, 96 kHz, 176.4 kHz and 192 kHz; both are configured by programming. In this embodiment, to ensure consistent and synchronous sampling, every conversion module may use the same number of sampling bits and the same sampling rate.
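As a minimal sketch of how such a configuration might be checked in software, the following validates a bit depth and sampling rate against the lists given above and derives the master clock from the 256fs ratio mentioned for the WM8978. The function names and the idea of a shared configuration check are assumptions for illustration, not part of the patent.

```python
# Supported values as listed in the text.
BIT_DEPTHS = (8, 16, 24)
SAMPLE_RATES_HZ = (8000, 11025, 16000, 22050, 32000, 44100,
                   48000, 88200, 96000, 176400, 192000)
MCLK_RATIO = 256  # MCLK = 256 * fs, the ratio quoted for the WM8978


def check_config(bits, sample_rate_hz):
    """Raise ValueError if the pair is not in the lists given in the text."""
    if bits not in BIT_DEPTHS:
        raise ValueError(f"unsupported bit depth: {bits}")
    if sample_rate_hz not in SAMPLE_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {sample_rate_hz}")


def mclk_hz(sample_rate_hz):
    """Master clock frequency for a given sampling rate at 256fs."""
    return MCLK_RATIO * sample_rate_hz


def pair_bit_rate(bits, sample_rate_hz):
    """Raw bit rate of one conversion module (two synchronous channels)."""
    check_config(bits, sample_rate_hz)
    return 2 * bits * sample_rate_hz


print(mclk_hz(48000), pair_bit_rate(16, 48000))
```

Applying the same validated pair to every conversion module mirrors the stated preference for identical sampling bits and rates across modules.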

It should be noted that the STM32F429 may be used as the core processor to perform the data read-write and control operations. The data read-write control unit 1213 then reads out the voice data at high speed in a ping-pong manner, so that the analog-to-digital conversion circuits (1211, 1212) can operate at a high sampling rate with good stability. Because the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 use the same clock, the two channels of analog-to-digital conversion in the conversion module run in parallel and synchronously; to ensure the stability of the clock signal, the clock generation unit 1214 preferably uses a high-precision phase-locked loop circuit.
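The ping-pong read-out can be illustrated with a small software model. This is a sketch under assumptions: on real hardware the buffers would typically be filled by a DMA or interrupt routine, whereas here a plain class with hypothetical names (`PingPongBuffer`, `write`, `read`) stands in for that mechanism.

```python
class PingPongBuffer:
    """Two fixed-size buffers: one is filled while the other is read out."""

    def __init__(self, size):
        self.size = size
        self.buffers = [[0] * size, [0] * size]
        self.full = [False, False]   # which buffers hold unread data
        self.write_index = 0         # buffer currently being filled
        self.position = 0            # next free slot in that buffer

    def write(self, sample):
        """Store one sample; swap buffers when the current one is full."""
        if self.full[self.write_index]:
            raise OverflowError("reader did not drain the buffer in time")
        self.buffers[self.write_index][self.position] = sample
        self.position += 1
        if self.position == self.size:
            self.full[self.write_index] = True
            self.write_index ^= 1
            self.position = 0

    def read(self):
        """Return a copy of the oldest full buffer, or None if none is ready."""
        # If both buffers are full, the one at write_index was filled first.
        for index in (self.write_index, self.write_index ^ 1):
            if self.full[index]:
                self.full[index] = False
                return list(self.buffers[index])
        return None


pp = PingPongBuffer(4)
for sample in range(6):
    pp.write(sample)
print(pp.read())   # first full block
print(pp.read())   # second block is not complete yet
```

The writer never waits for the reader as long as each full block is drained before the other block fills, which is what allows continuous high-rate sampling.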

It should be noted that the multi-bit oversampling ADC technology used in the analog-to-digital conversion circuits (1211, 1212) of this embodiment reduces the impact of pulse jitter and high-frequency noise through multi-bit feedback and a high oversampling rate, and can filter out common-mode noise contained in the signal. For example, an optional high-pass filter is provided, together with an adjustable notch filter whose center frequency and bandwidth are variable. The notch filter is adjusted by two coefficients, $a_0$ and $a_1$, which are set by registers NFA0[13:0] and NFA1[13:0] respectively, such that $a_0 = \frac{1 - \tan(w_b/2)}{1 + \tan(w_b/2)}$ and $a_1 = -(1 + a_0)\cos(w_0)$, where $w_0 = 2\pi f_c/f_s$, $w_b = 2\pi f_b/f_s$, $f_c$ is the center frequency, $f_b$ is the -3 dB bandwidth and $f_s$ is the sampling frequency. The actual value of register NFA0[13:0] is $-a_0 \times 2^{13}$, and the actual value of register NFA1[13:0] is $-a_1 \times 2^{12}$.
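The coefficient formulas above can be evaluated with a short sketch. The formulas and scale factors are taken from the text; rounding to the nearest integer and packing the result as a 14-bit two's complement value are assumptions made here for illustration, and the example frequencies are arbitrary.

```python
import math


def notch_coefficients(fc_hz, fb_hz, fs_hz):
    """Return (a0, a1) for center frequency fc, -3 dB bandwidth fb, sample rate fs."""
    w0 = 2 * math.pi * fc_hz / fs_hz
    wb = 2 * math.pi * fb_hz / fs_hz
    t = math.tan(wb / 2)
    a0 = (1 - t) / (1 + t)
    a1 = -(1 + a0) * math.cos(w0)
    return a0, a1


def to_register(coefficient, scale_bits, width=14):
    """Scale -coefficient by 2**scale_bits and pack it into `width` bits.

    Rounding and two's complement packing are assumptions of this sketch.
    """
    code = round(-coefficient * (1 << scale_bits))
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= code <= high:
        raise ValueError("coefficient does not fit in the register")
    return code & ((1 << width) - 1)


# Arbitrary example: 1 kHz notch, 100 Hz wide, at a 48 kHz sampling rate.
a0, a1 = notch_coefficients(1000, 100, 48000)
nfa0 = to_register(a0, 13)   # NFA0 = -a0 * 2^13
nfa1 = to_register(a1, 12)   # NFA1 = -a1 * 2^12
print(a0, a1, hex(nfa0), hex(nfa1))
```

A narrower bandwidth pushes $a_0$ toward 1, so the scaled NFA0 value approaches the negative limit of the 14-bit range; the range check in `to_register` guards against values that would not fit.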

It should be noted that the structures and functions of the other conversion modules (122, ..., 124) are the same as those of the conversion module 121; they can be understood by reference to the conversion module 121 and are not described again here.

Further, referring to fig. 4, the voice signal acquisition device further includes a plurality of signal amplifying modules (141, 142, ..., 144), each arranged between an acquisition module and the conversion module connected to that acquisition module. The signal amplifying modules (141, 142, ..., 144) amplify each voice signal according to a preset amplification gain, so that analog-to-digital conversion is performed on the amplified voice signal. In a specific embodiment, as shown in fig. 4, the signal amplifying module 141 includes a signal amplifying circuit 1411 and a signal amplifying circuit 1412, which respectively amplify the two voice signals acquired by the acquisition module 111 and transmit the amplified voice signals to the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 respectively.

It should be noted that the signal amplifying modules (141, 142, ..., 144) may employ a common signal amplifying chip whose gain can be set by programming. The gain of the signal amplifying circuit is adjustable from -12 dB to +35.25 dB in steps of 0.75 dB; specifically, the gain coefficient may be set through the IIC interface of the STM32F429. In this embodiment, to ensure consistent amplification of the voice signals, the signal amplifying modules (141, 142, ..., 144) are preferably set to the same gain coefficient.
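The gain range above spans 47.25 dB in 0.75 dB steps, that is, 64 settings. A minimal sketch of converting between a gain in dB and a step index follows; treating step 0 as -12 dB is an assumption for illustration, and the actual register encoding of any particular chip is not specified in the text.

```python
GAIN_MIN_DB = -12.0
GAIN_MAX_DB = 35.25
GAIN_STEP_DB = 0.75


def gain_to_step(gain_db):
    """Nearest step index for a requested gain (assumes step 0 is -12 dB)."""
    if not GAIN_MIN_DB <= gain_db <= GAIN_MAX_DB:
        raise ValueError("gain outside the adjustable range")
    return round((gain_db - GAIN_MIN_DB) / GAIN_STEP_DB)


def step_to_gain(step):
    """Gain in dB for a step index."""
    return GAIN_MIN_DB + step * GAIN_STEP_DB


def db_to_linear(gain_db):
    """Voltage amplification factor corresponding to a gain in dB."""
    return 10 ** (gain_db / 20)


step = gain_to_step(20.25)
print(step, step_to_gain(step), db_to_linear(step_to_gain(step)))
```

Writing the same step index to every amplifying circuit corresponds to the stated preference for identical gain coefficients across modules.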

Those skilled in the art will understand that the clock frequencies of the clock generation units may differ slightly and that the gain coefficients of the signal amplifying circuits cannot be absolutely identical. The invention therefore samples the voice signals in groups using the cross-injection acquisition mode, which avoids inconsistency among the voice signals during analog-to-digital conversion to the greatest extent.

Further, the voice signal acquisition device also comprises a plurality of storage units (151, 152, ..., 154), which are respectively connected to the plurality of conversion modules (121, 122, ..., 124). In an embodiment, the storage units (151, 152, ..., 154) respectively store the digital signals obtained by converting each group of voice signals, so that the processing module 13 obtains the synchronously stored digital signals from the storage units (151, 152, ..., 154) and synthesizes them into the multi-channel stereo signal.

The storage units (151, 152, ..., 154) may use TF cards for data storage. When the number of microphones in the array increases and, because of limited data transmission bandwidth, it becomes difficult to read all the digital signals from the data read-write control units in real time, the invention first caches the data in the TF cards. When data acquisition is finished, the data can either be read directly from the TF cards, or be read out sequentially by the data read-write control unit and uploaded to the processing module 13 through a USB port, so that the processing module obtains the desired digital voice signals.

Referring to fig. 3 and 4, the processing module 13 is connected to the plurality of conversion modules (121, 122, 123, 124) and is configured to synthesize a multi-channel stereo signal from the digital signals (i.e., digital voice signals) that are read.

It should be noted that the obtained digital signals may be superimposed and mixed to synthesize them, so that each synthesized audio signal forms one channel; multiple channels are thus formed from the synthesized audio signals, and a multi-channel stereo signal is finally obtained. Amplifying and playing back the channels separately produces a distinguishable stereo effect. A listener who hears stereo sound directly in a stereo space perceives the direction and layering of the sounds in addition to their loudness, pitch and timbre.
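The text does not specify the mixing rule, so the following is only a sketch under assumptions: two digital signals are superimposed sample by sample with saturation to the signed sample range, and the resulting channels are interleaved into multi-channel frames. The function names and sample values are hypothetical.

```python
def mix_saturating(signal_a, signal_b, bits=16):
    """Superimpose two signals sample by sample, clipping to the signed range."""
    high = (1 << (bits - 1)) - 1
    low = -(1 << (bits - 1))
    return [max(low, min(high, a + b)) for a, b in zip(signal_a, signal_b)]


def interleave(channels):
    """Turn equal-length channels into a list of multi-channel frames."""
    return [list(frame) for frame in zip(*channels)]


# Hypothetical digital signals from two conversion modules (16-bit samples).
group_1 = ([1000, 30000, -30000], [10, 20, 30])
group_2 = ([2000, 10000, -10000], [40, 50, 60])

channel_a = mix_saturating(group_1[0], group_2[0])
channel_b = mix_saturating(group_1[1], group_2[1])
frames = interleave([channel_a, channel_b])
print(frames)
```

Saturating addition keeps an over-range sum from wrapping around, at the cost of clipping; averaging the inputs instead would avoid clipping but lower the level.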

In another embodiment, as can be seen in fig. 5, the voice signal acquisition device disclosed in the present application is used to acquire voice signals from a two-dimensional cross microphone array, where the cross microphone array includes not only microphones M1-M5 in vertical columns, but also microphones in horizontal columns (e.g., M6, M7, M3, M8, M9). The plurality of acquisition modules and the plurality of conversion modules are respectively connected with the microphones at intervals and adjacently, so that each acquisition module has the characteristic of cross injection of two voice signals. Moreover, a unified sampling signal can be set for the clock generation unit in each conversion module, so that each conversion module has a consistent sampling rate. Of course, the microphone array may have other modes of composition, but may be connected by using the voice signal acquisition device disclosed in the present application, which will not be described in detail herein.

Embodiment two:

Referring to fig. 6, the present application further discloses a voice signal acquisition method for a microphone array, based on the voice signal acquisition device described in embodiment one. The method mainly includes steps S210 to S230, which are described below.

In step S210, a plurality of sets of voice signals are obtained, each set of voice signals including voice signals of two microphones that are spaced apart or adjacent.

In an embodiment, such as that of fig. 3, an acquisition module (111, 112, 113, 114) acquires one voice signal from one microphone and another voice signal from another microphone spaced apart from that microphone, so that one group of voice signals includes the voice signals of two spaced-apart microphones; and/or an acquisition module (111, 112, 113, 114) acquires one voice signal from one microphone and another voice signal from another microphone adjacent to that microphone, so that one group of voice signals includes the voice signals of two adjacent microphones.

Step S220, the two paths of voice signals in each group are respectively converted to obtain digital signals corresponding to the two paths of voice signals.

In an embodiment, for the two voice signals in each group, synchronous analog-to-digital conversion is performed on the two voice signals according to a preset number of sampling bits and a preset sampling rate, so as to obtain digital signals respectively corresponding to the two voice signals. For example, in fig. 4, the data read-write control unit 1213 and the clock generation unit 1214 control the two analog-to-digital conversion circuits (1211, 1212) to perform synchronous analog-to-digital conversion on the two voice signals according to the preset number of sampling bits and the preset sampling rate, so as to obtain digital signals respectively corresponding to the two voice signals.

In another embodiment, the method further comprises an amplifying step after the step S210 and before the step S220, wherein the amplifying step includes amplifying each path of voice signal according to a preset amplifying gain to perform analog-to-digital conversion on the amplified voice signal. For example, in fig. 2, the signal amplifying module 141 includes a signal amplifying circuit 1411 and a signal amplifying circuit 1412, performs signal amplifying processing on the two voice signals acquired by the acquiring module 111, and transmits the amplified voice signals to the analog-to-digital converting circuit 1211 and the analog-to-digital converting circuit 1212, respectively.

It should be noted that the signal amplifying module may employ a common signal amplifying chip in which the gain of the signal amplifying circuit can be set by programming. The gain of the signal amplifying circuit may be adjusted in a range from -12 dB to +35.25 dB, and the adjustment step may be set to 0.75 dB. In this embodiment, to ensure consistent amplification of the voice signals, the signal amplification modules (141, 142, ..., 144) are preferably set to the same gain factor.
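The stated range and step imply a fixed number of gain settings, which can be checked with a short calculation. The mapping from an integer code to a gain (code 0 at the minimum gain) is an assumption for illustration; the application does not specify a register layout.

```python
GAIN_MIN_DB = -12.0
GAIN_MAX_DB = 35.25
GAIN_STEP_DB = 0.75


def number_of_gain_codes():
    """Count the settings from minimum to maximum gain, both ends included."""
    return int(round((GAIN_MAX_DB - GAIN_MIN_DB) / GAIN_STEP_DB)) + 1


def gain_code_to_db(code):
    """Convert an integer code to a gain in dB (assumed: code 0 = minimum)."""
    if not 0 <= code < number_of_gain_codes():
        raise ValueError("gain code out of range")
    return GAIN_MIN_DB + code * GAIN_STEP_DB


def db_to_amplitude_ratio(db):
    """Convert a gain in dB to a linear amplitude (voltage) ratio."""
    return 10 ** (db / 20)
```

Under this assumption there are 64 settings, which would fit in a 6-bit field, and the maximum gain of +35.25 dB corresponds to an amplitude ratio of roughly 58.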

Step S230, the digital signals obtained in step S220 are synthesized to obtain the multi-channel stereo signal.

In one embodiment, the step S230 specifically includes storing the digital signals after each group of voice signal conversion processing, and synthesizing the digital signals according to the synchronously stored digital signals to obtain multi-channel stereo signals. For example, in fig. 2, a plurality of storage units (151, 152, 154) store the digital signals after the conversion processing of each group of voice signals, respectively, so that the processing module 13 acquires the digital signals stored synchronously from the storage units (151, 152, 154) and synthesizes the multi-channel stereo signals.
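One possible reading of this synthesis is sketched below: the two digital signals of each group are superposed into one channel, and the channels are then interleaved frame by frame. Averaging the two samples (rather than summing them) is an assumption made here to keep the result within the original sample range, and the names `mix_group` and `synthesize` are hypothetical.

```python
def mix_group(sig_a, sig_b):
    """Superpose two equal-length digital signals by averaging sample pairs."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signals in a group must have the same length")
    return [(a + b) // 2 for a, b in zip(sig_a, sig_b)]


def synthesize(groups):
    """Mix each group into one channel, then interleave into frames.

    Returns a list of frames; each frame holds one sample per channel.
    """
    channels = [mix_group(a, b) for a, b in groups]
    return [list(frame) for frame in zip(*channels)]


# Hypothetical stored data for two groups of two signals each.
stored = [([2, 4], [4, 8]), ([10, 20], [30, 40])]
frames = synthesize(stored)
```

Each frame in the result carries samples taken at the same index from every channel, so the channel alignment depends on the groups having been stored synchronously.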

To enable those skilled in the art to clearly and accurately understand the technical scheme of the present application, the technical principle and technical effect of the voice signal acquisition method will be described in detail with reference to fig. 7 to 9.

Referring to fig. 7, so that signal acquisition does not delay the system's processing of signals, and to improve the real-time performance of signal acquisition and processing, the invention adopts a double-buffer FIFO structure. The AD-converted data stream is first stored into M0AR (memory 1) through DMA. When memory 1 is full, the hardware automatically switches the receiving buffer to the next buffer, M1AR (memory 2); meanwhile, the program writes the data in M0AR (memory 1) to the SD card through DMA and stores it as a wav file. When M1AR (memory 2) is full, reception automatically switches back to M0AR (memory 1). Thus, whenever one buffer is full, the buffer pointer switches automatically to the other buffer and the full buffer's data is stored, thereby realizing uninterrupted data acquisition and wav file storage.
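The double-buffer behaviour can be modelled in software as follows. This is a simplified, sequential sketch: the class name `DoubleBuffer`, the callback `on_full` and the buffer size are assumptions, and the callback merely stands in for the DMA write to the SD card, which in hardware runs concurrently with reception into the other buffer.

```python
class DoubleBuffer:
    """Two fixed-size buffers that alternate as the receive target."""

    def __init__(self, capacity, on_full):
        self.buffers = [[], []]   # stand-ins for memory 1 and memory 2
        self.active = 0           # index of the buffer currently receiving
        self.capacity = capacity
        self.on_full = on_full    # called with the contents of a full buffer

    def write(self, sample):
        buf = self.buffers[self.active]
        buf.append(sample)
        if len(buf) == self.capacity:
            # Switch the receive target first, then drain the full buffer.
            self.active = 1 - self.active
            self.on_full(list(buf))
            buf.clear()


saved_blocks = []                 # stand-in for blocks written to storage
recorder = DoubleBuffer(4, saved_blocks.append)
for sample in range(10):
    recorder.write(sample)
```

After ten samples with a capacity of four, two full blocks have been handed to storage and the remaining two samples sit in the first buffer, waiting for it to fill again.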

Please refer to fig. 8, which shows waveforms of a voice signal collected by an existing general-purpose data collector, where x1 is the waveform of the signal collected from a target microphone sensor through the first channel of the existing dual-channel synchronous collector, and x2 is the waveform of the signal collected through the second channel from the same microphone that is connected to the first channel. Because the signal amplifying part at the voice signal input of the existing general-purpose data collector has no automatic gain control circuit in hardware, the amplifying circuit cannot adjust its gain coefficient according to the amplitude of the input signal. When the amplitude of the external voice signal is large, or when the microphone is close to the sound source, saturation distortion easily occurs because the amplification factor is too large.
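The saturation effect described here can be reproduced with a simple model in which a fixed gain is applied and the result is limited to the converter's full scale. The function name, the 20 dB gain and the sample values are assumptions for illustration and are not measurements from fig. 8.

```python
def amplify_and_clip(samples, gain_db, full_scale=1.0):
    """Apply a fixed gain, limit to full scale, and count clipped samples."""
    ratio = 10 ** (gain_db / 20)
    output, clipped = [], 0
    for s in samples:
        v = s * ratio
        if abs(v) > full_scale:
            clipped += 1
            v = full_scale if v > 0 else -full_scale
        output.append(v)
    return output, clipped


# Hypothetical inputs: a quiet source and a loud (or nearby) source.
quiet = [0.01, -0.02, 0.015]
loud = [0.05, -0.2, 0.15]
quiet_out, quiet_clipped = amplify_and_clip(quiet, 20.0)
loud_out, loud_clipped = amplify_and_clip(loud, 20.0)
```

With the same fixed gain, the quiet input passes through unclipped while two of the three loud samples are flattened at full scale, which is the kind of distortion a gain that cannot follow the input amplitude produces.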

Referring to fig. 9, in order to verify the synchronization characteristic of the cross injection structure for voice signal acquisition provided by the technical scheme of the present application, a cross-correlation operation is performed on two acquired digital voice signals. Whether the maximum point of the cross-correlation function lies at the zero point, together with the left-right symmetry index of the cross-correlation function, serves as the basis for evaluating the synchronous acquisition characteristic of the collector and whether the hardware design is symmetrical. The peak of the cross-correlation curve of the two collected signals is located at the zero point, which indicates that the voice signals collected by the two data acquisition channels have good synchronism.
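A minimal sketch of such a check is given below. The lag sign convention, the `asymmetry` measure and the test signal are assumptions chosen for illustration; the application does not define the symmetry index in this passage.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum of x[n] * y[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                total += x[n] * y[m]
        result[lag] = total
    return result


def peak_lag(corr):
    """Lag at which the cross-correlation is largest."""
    return max(corr, key=lambda lag: corr[lag])


def asymmetry(corr):
    """Largest |R(lag) - R(-lag)| relative to the peak; 0 means symmetric."""
    peak = max(abs(v) for v in corr.values())
    return max(abs(corr[lag] - corr[-lag]) for lag in corr if lag > 0) / peak


# Hypothetical data: one signal, a synchronous copy, and a copy delayed by 2.
x = [0, 1, 3, -2, 4, -1, 2, 0]
y_sync = list(x)
y_late = [0, 0] + x[:-2]

sync_corr = cross_correlation(x, y_sync, 3)
late_corr = cross_correlation(x, y_late, 3)
```

For the synchronous copy the peak lies at lag 0 and the curve is symmetric, whereas the delayed copy moves the peak to a nonzero lag, which is how a timing offset between two acquisition channels would show up.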

In summary, the technical scheme provided by the application reduces the difficulty of signal acquisition for the microphone array and improves the synchronism of the acquisition process, so that the design difficulty of the voice signal acquisition device is reduced and its design is no longer limited by the bottleneck of signal acquisition synchronization. The signal-to-noise ratio of the voice signal acquired by the device is greatly improved, and the bandwidth limitations of the prior art in data transmission and data storage are overcome. In addition, the voice signal acquisition device adopts an orderly data buffer mechanism in the data read-write control unit, so that the device is well optimized in terms of electronic circuit wiring.

The foregoing is a further detailed description of the application in connection with specific embodiments, and it is not intended that the application be limited to such description. It will be apparent to those skilled in the art that several simple deductions or substitutions can be made without departing from the inventive concept.

Claims

1. A method for collecting voice signals for a microphone array, comprising the steps of:

synthesizing and processing according to the digital signals to obtain multi-channel stereo signals;

the method comprises the steps of obtaining a plurality of groups of voice signals, wherein the step of obtaining one voice signal from one microphone, the step of obtaining the other voice signal from the other microphone which is arranged at intervals with the microphone, so that one group of voice signals comprises voice signals of two microphones which are arranged at intervals;

The method comprises the steps of respectively converting two paths of voice signals in each group to obtain digital signals respectively corresponding to the two paths of voice signals, wherein for the two paths of voice signals in each group, synchronous analog-to-digital conversion is carried out on the two paths of voice signals according to a preset sampling bit number and a preset sampling rate to obtain digital signals respectively corresponding to the two paths of voice signals;

The voice signals in the same group are configured to be driven by the same clock generator for analog-to-digital conversion, and the voice signals in different groups are configured to be driven by different clock generators for analog-to-digital conversion;

the method comprises the steps of obtaining multichannel stereo signals according to digital signal synthesis processing, storing digital signals after conversion processing of each group of voice signals respectively, carrying out synthesis processing of superposing and mixing on one group of digital signals or a plurality of groups of digital signals to achieve the digital signals, enabling each synthesized audio signal to form one channel, and forming a plurality of channels according to the plurality of synthesized audio signals to obtain multichannel stereo signals.

2. The voice signal collecting method according to claim 1, further comprising a signal amplifying step before the two voice signals in each group are respectively converted, said signal amplifying step comprising:

3. A speech signal acquisition device for a microphone array, comprising:

The processing module is connected with the plurality of acquisition modules and is used for synthesizing and processing according to the digital signals to obtain multichannel stereo signals;

The acquisition module acquires one path of voice signals from one microphone and acquires the other path of voice signals from the other microphone which is arranged at intervals with the microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are arranged at intervals; and/or the acquisition module acquires one voice signal from one microphone and acquires the other voice signal from the other microphone adjacent to the microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are adjacently arranged;

For each conversion module, the device comprises two analog-to-digital conversion circuits, a data read-write control unit and a clock generation unit, wherein the two analog-to-digital conversion circuits are respectively connected with an acquisition module connected with the conversion module, and the data read-write control unit and the clock generation unit are both connected with the two analog-to-digital conversion circuits; for two paths of voice signals in a group of voice signals obtained by each obtaining module, the data read-write control unit and the clock generation unit control two analog-to-digital conversion circuits to synchronously perform analog-to-digital conversion on the two paths of voice signals according to a preset sampling bit number and a preset sampling rate to obtain digital signals corresponding to the two paths of voice signals respectively;

The processing module is used for superposing and mixing the obtained digital signals or a plurality of groups of digital signals to achieve the synthesis processing of the digital signals, so that each synthesized audio signal forms a sound channel, and a plurality of sound channels are formed according to the synthesized audio signals to obtain a multi-channel stereo signal.

4. The voice signal acquisition device of claim 3, further comprising a plurality of signal amplification modules, each of the signal amplification modules being disposed between each of the acquisition modules and a conversion module connected to the acquisition module, the plurality of signal amplification modules respectively amplifying each of the voice signals according to a preset amplification gain to perform analog-to-digital conversion on the amplified voice signals.