Disclosure of Invention
The technical problem addressed by the application is how to acquire the voice signals of a microphone array over a plurality of channels with a better acquisition effect. To solve this problem, the present application provides a voice signal acquisition method and device for a microphone array.
According to a first aspect, an embodiment of the present application provides a method for collecting a voice signal for a microphone array, including the steps of:
acquiring multiple groups of voice signals, wherein each group of voice signals comprises the voice signals of two microphones that are spaced apart or adjacent;
converting the two voice signals in each group respectively to obtain digital signals respectively corresponding to the two voice signals; and
synthesizing the digital signals to obtain a multi-channel stereo signal.
Acquiring the multiple groups of voice signals includes:
One voice signal is obtained from one microphone, another voice signal is obtained from another microphone spaced from the one microphone, such that a group of voice signals includes the voice signals of the two microphones spaced apart, and/or,
One voice signal is acquired from one microphone, and the other voice signal is acquired from the other microphone adjacent to the one microphone, so that one group of voice signals comprises the voice signals of the two microphones which are adjacently arranged.
Converting the two voice signals in each group respectively to obtain digital signals respectively corresponding to the two voice signals includes:
for the two voice signals in each group, performing synchronous analog-to-digital conversion on the two voice signals according to a preset sampling bit number and a preset sampling rate to obtain digital signals respectively corresponding to the two voice signals.
The method further comprises a signal amplification step before the two voice signals in each group are converted, wherein the signal amplification step comprises:
amplifying each voice signal according to a preset amplification gain, so that analog-to-digital conversion is performed on the amplified voice signal.
Synthesizing the digital signals to obtain the multi-channel stereo signal includes:
storing the digital signals obtained by converting each group of voice signals respectively; and
synthesizing the synchronously stored digital signals to obtain the multi-channel stereo signal.
According to a second aspect, an embodiment of the present application provides a voice signal acquisition device for a microphone array, including:
a plurality of acquisition modules, each configured to acquire one group of voice signals, wherein each group of voice signals comprises the voice signals of two microphones that are spaced apart or adjacent;
a plurality of conversion modules, connected with the acquisition modules in one-to-one correspondence and configured to respectively convert the two voice signals in the group of voice signals obtained by each acquisition module into digital signals respectively corresponding to the two voice signals; and
a processing module, connected with the plurality of acquisition modules and configured to synthesize the digital signals to obtain a multi-channel stereo signal.
The acquisition module acquires one voice signal from one microphone and acquires the other voice signal from the other microphone which is arranged at intervals with the microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are arranged at intervals, and/or,
The acquisition module acquires one voice signal from one microphone and acquires the other voice signal from another microphone adjacent to the one microphone, so that the acquired group of voice signals comprises the voice signals of the two microphones which are adjacently arranged.
Each conversion module comprises two analog-to-digital conversion circuits, a data read-write control unit and a clock generation unit, wherein the two analog-to-digital conversion circuits are respectively connected with the acquisition module connected with the conversion module, and the data read-write control unit and the clock generation unit are both connected with the two analog-to-digital conversion circuits;
For two paths of voice signals in a group of voice signals obtained by each obtaining module, the data read-write control unit and the clock generation unit control the two analog-to-digital conversion circuits so as to perform synchronous analog-to-digital conversion on the two paths of voice signals according to a preset sampling bit number and a preset sampling rate, and obtain digital signals corresponding to the two paths of voice signals respectively.
The voice signal acquisition device also comprises a plurality of signal amplification modules, wherein each signal amplification module is arranged between each acquisition module and a conversion module connected with the acquisition module;
the plurality of signal amplifying modules amplify the voice signals of each path according to preset amplifying gains respectively so as to carry out analog-to-digital conversion on the amplified voice signals.
The voice signal acquisition device further comprises a plurality of storage units, and the storage units are respectively connected with the conversion modules;
The storage units respectively store the digital signals after each group of voice signal conversion processing, so that the processing module obtains the digital signals stored synchronously from the storage units and synthesizes the digital signals to obtain the multichannel stereo signals.
The beneficial effects of the present application are as follows:
According to the voice signal acquisition method and device for a microphone array of the above embodiments, the method acquires multiple groups of voice signals, each group comprising the voice signals of two microphones that are spaced apart or adjacent, converts the two voice signals in each group into corresponding digital signals, and synthesizes the obtained digital signals into a multi-channel stereo signal. In a first aspect, the microphone voice signals are acquired in a cross-injection manner, so that the voice signals of adjacent or spaced-apart microphones can serve as sampling references and each group's acquisition channel can acquire independently. This meets the synchronous acquisition requirements of different acquisition positions, improves the mixed acquisition effect of the voice signals, reduces the design difficulty of synchronous acquisition for a multi-channel microphone array, and improves the expandability of voice signal acquisition for the microphone array. In a second aspect, a symmetrical amplification and conversion structure performs the amplification and analog-to-digital conversion of each group of voice signals synchronously, which helps guarantee the consistency and synchronism of the amplification and conversion processes and avoids, as far as possible, noise and distortion caused by asynchrony. In a third aspect, the acquisition device built on the voice signal acquisition method helps improve the signal-to-noise ratio of the signals and avoids inconsistent signal routing lengths and difficult maintenance; because a data read-write control unit controls the conversion process of the voice signals, the converted digital signals can be stored and transmitted through a double-buffer mechanism. In a fourth aspect, the voice signal acquisition device uses a plurality of acquisition modules, a plurality of conversion modules and a plurality of storage units to jointly form the acquisition channels for multiple voice signals, which reduces the design difficulty of synchronous voice signal acquisition for a large microphone array, greatly improves the bandwidth of the voice signals, and better optimizes the wiring structure of the microphone array.
Detailed Description
The application will be described in further detail below with reference to the drawings by means of specific embodiments.
Embodiment one:
Referring to fig. 3, a voice signal acquisition device for a microphone array includes a plurality of acquisition modules (e.g. 111, 112, 113, 114), a plurality of conversion modules (e.g. 121, 122, 123, 124) and a processing module 13, which are described below.
Here, the microphone array of the present application is a one-dimensional array, a two-dimensional array, or even a multi-dimensional array, formed by arranging a plurality of microphones in rows and columns. This embodiment takes a one-dimensional array of a plurality of microphones (such as M1, M2, M3, M4, and M5) as an example. The microphones receive the original voice signal and convert it into an analog signal for output. The microphones may be high-sensitivity MEMS microphone sensors, or electret or capacitive microphone sensors; the specific type and number of microphones are not limited.
The plurality of acquisition modules (111, 112, 113, 114) are each configured to acquire one group of voice signals, each group comprising the voice signals of two microphones that are spaced apart or adjacent. In one embodiment, an acquisition module acquires one voice signal from one microphone and the other voice signal from another microphone spaced apart from it, so that the acquired group comprises the voice signals of two spaced-apart microphones (see the corresponding acquisition module in the drawings). The acquisition module may of course acquire the voice signals in another way: it acquires one voice signal from one microphone and the other voice signal from a microphone adjacent to it, so that the acquired group comprises the voice signals of two adjacent microphones (see acquisition module 111). In this embodiment, the spaced-apart and adjacent acquisition modes are preferably used together: the adjacent mode may be used for microphones at the edge of the array, and the spaced-apart mode for microphones in the middle.
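The pairing rule described above (adjacent pairs at the edges, spaced-apart pairs in the middle) can be illustrated with a minimal sketch. The function name `cross_injection_pairs` and the exact layout it produces are assumptions made for illustration; the application does not fix one particular wiring, and the number of acquisition modules in the figures may differ.

```python
from typing import List, Tuple


def cross_injection_pairs(num_mics: int) -> List[Tuple[int, int]]:
    """Return 0-based microphone index pairs, one pair per acquisition module.

    Hypothetical layout: the two edge pairs use adjacent microphones, and
    every middle pair skips one microphone (indices i and i + 2).
    """
    if num_mics < 2:
        raise ValueError("at least two microphones are required")
    if num_mics == 2:
        return [(0, 1)]
    pairs = [(0, 1)]                                     # adjacent pair at the first edge
    pairs += [(i, i + 2) for i in range(num_mics - 2)]   # spaced-apart pairs in the middle
    pairs.append((num_mics - 2, num_mics - 1))           # adjacent pair at the last edge
    return pairs


if __name__ == "__main__":
    # Five microphones M1..M5 are indices 0..4 here.
    for first, second in cross_injection_pairs(5):
        print(f"M{first + 1} + M{second + 1}")
```

With this assumed layout and three or more microphones, every microphone feeds exactly two acquisition modules, which is one way to read the "cross injection" idea of using neighbouring signals as mutual references.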
The acquisition modules (111, 112, 113, 114) may be ports of communication lines; for example, the acquisition module 111 may be a communication port that acquires the voice signals from the microphone M1 and the microphone M2 respectively. Furthermore, in a one-dimensional microphone array such as that shown in fig. 4, which mainly comprises microphones 1, 2, ..., m, n, the voice signals of the microphones are acquired by the acquisition modules (111, 112, ..., 114). Here m and n are merely the arrangement numbers of the microphones, and several identical modules between acquisition modules 112 and 114 are omitted, so nothing here should be construed as limiting the number of microphones, the number of acquisition modules, or the number of conversion modules.
The plurality of conversion modules (121, 122, 123, 124) are connected with the plurality of acquisition modules (111, 112, 113, 114) in one-to-one correspondence, and are configured to respectively convert the two voice signals in the group obtained by each acquisition module into digital signals respectively corresponding to the two voice signals. When the conversion modules (121, 122, 123, 124) perform analog-to-digital conversion, a unified sampling signal can be provided to each conversion module to control the conversion process. In an embodiment, see fig. 4, each conversion module includes two analog-to-digital conversion circuits, a data read-write control unit and a clock generation unit. The two analog-to-digital conversion circuits are respectively connected with the acquisition module connected with the conversion module, and the data read-write control unit and the clock generation unit are both connected with the two analog-to-digital conversion circuits. For the two voice signals in the group obtained by each acquisition module, the data read-write control unit and the clock generation unit control the two analog-to-digital conversion circuits to perform synchronous analog-to-digital conversion according to a preset sampling bit number and a preset sampling rate, so as to obtain digital signals respectively corresponding to the two voice signals.
In a specific embodiment, see fig. 4, the conversion module 121, for example, includes two analog-to-digital conversion circuits (1211, 1212), a data read-write control unit 1213 and a clock generation unit 1214. The analog-to-digital conversion circuits (1211, 1212) are common analog-to-digital conversion chips, such as the WM8978 (the WM8978 supports a standard audio clock configuration in which an MCLK of 256fs is supplied to the DAC and ADC), which has an adjustable sampling bit number and sampling rate. The data read-write control unit 1213 controls the sampling bit number of the analog-to-digital conversion circuits (1211, 1212) and reads the digital signals obtained by the conversion. The core processor of the data read-write control unit 1213 has an IIS interface dedicated to reading and writing digital voice signals; the interface includes a clock signal interface for data reading, an analog-to-digital data read-out interface, and a digital audio output interface. The clock generation unit 1214 is a common clock chip that provides the analog-to-digital conversion circuits (1211, 1212) with the frequency signal for the conversion process and controls the sampling rate of the conversion through that signal. Normally, the data output by the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 is read automatically into two data buffers.
It should be noted that the number of sampling bits in the analog-to-digital conversion process may be selected from 8 bits, 16 bits, and 24 bits, and the sampling rate may be selected from 8 kHz, 11.025 kHz, 16 kHz, 22.05 kHz, 32 kHz, 44.1 kHz, 48 kHz, 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz; both are configured by programming. In this embodiment, to ensure the consistency and synchronism of sampling, each conversion module may use the same sampling bit number and sampling rate.
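To make the effect of these settings concrete, the following minimal sketch checks a configuration against the listed options and computes the raw data rate of one conversion module (two channels). The function name `pair_data_rate_bytes` is hypothetical, and the calculation assumes uncompressed PCM samples with no framing overhead.

```python
SUPPORTED_BITS = (8, 16, 24)
SUPPORTED_RATES_HZ = (
    8000, 11025, 16000, 22050, 32000, 44100,
    48000, 88200, 96000, 176400, 192000,
)


def pair_data_rate_bytes(bits: int, rate_hz: int, channels: int = 2) -> int:
    """Raw PCM bytes per second produced by one conversion module."""
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"unsupported sampling bit number: {bits}")
    if rate_hz not in SUPPORTED_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {rate_hz}")
    return (bits // 8) * rate_hz * channels


if __name__ == "__main__":
    # 16-bit samples at 48 kHz on two synchronous channels.
    print(pair_data_rate_bytes(16, 48000))
```

Multiplying the result by the number of conversion modules gives a rough figure for the total rate the storage and transfer path has to sustain.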
It should be noted that the STM32F429 may be used as the core processor to perform the data reading, writing and control operations. The data read-write control unit 1213 then reads out the voice data at high speed in a ping-pong manner, so that the analog-to-digital conversion circuits (1211, 1212) achieve a high sampling rate and stability. It should also be noted that the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 use the same clock, so that the two channels of analog-to-digital conversion in the conversion module can be performed in parallel and synchronously. To ensure the stability of the clock signal, the clock generation unit 1214 preferably uses a high-precision phase-locked loop circuit.
It should be noted that the multi-bit oversampling ADC technology used in the analog-to-digital conversion circuits (1211, 1212) of this embodiment can reduce the impact of pulse jitter and high-frequency noise through multi-bit feedback and a high oversampling rate, and can filter out common-mode noise contained in the signal. For example, an optional high-pass filter and an adjustable notch filter are provided. The notch filter has a variable center frequency and bandwidth, which are adjusted by two coefficients, $a_0$ and $a_1$, set by registers NFA0[13:0] and NFA1[13:0] respectively:

$$a_0 = \frac{1 - \tan(w_b/2)}{1 + \tan(w_b/2)}, \qquad a_1 = -(1 + a_0)\cos(w_0),$$

where $w_0 = 2\pi f_c / f_s$, $w_b = 2\pi f_b / f_s$, $f_c$ is the center frequency, $f_b$ is the -3 dB bandwidth, and $f_s$ is the sampling frequency. The actual value of register NFA0[13:0] is $-a_0 \times 2^{13}$, and the actual value of register NFA1[13:0] is $-a_1 \times 2^{12}$.
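As a worked example of the notch filter formulas above, the sketch below computes a0 and a1 and the corresponding register values. The function names are hypothetical, and the 14-bit two's-complement encoding with clamping is an assumption about how the signed values map onto the NFA0[13:0] and NFA1[13:0] fields; the chip documentation should be consulted for the actual encoding.

```python
import math
from typing import Tuple


def notch_coefficients(fc_hz: float, fb_hz: float, fs_hz: float) -> Tuple[float, float]:
    """Compute (a0, a1) from center frequency, -3 dB bandwidth and sample rate."""
    w0 = 2 * math.pi * fc_hz / fs_hz
    wb = 2 * math.pi * fb_hz / fs_hz
    t = math.tan(wb / 2)
    a0 = (1 - t) / (1 + t)
    a1 = -(1 + a0) * math.cos(w0)
    return a0, a1


def to_14bit_field(value: int) -> int:
    """Clamp to the signed 14-bit range and encode as two's complement (assumed)."""
    value = max(-8192, min(8191, value))
    return value & 0x3FFF


def notch_registers(fc_hz: float, fb_hz: float, fs_hz: float) -> Tuple[int, int]:
    """Return the (NFA0, NFA1) field values: -a0 * 2^13 and -a1 * 2^12."""
    a0, a1 = notch_coefficients(fc_hz, fb_hz, fs_hz)
    nfa0 = round(-a0 * 2 ** 13)
    nfa1 = round(-a1 * 2 ** 12)
    return to_14bit_field(nfa0), to_14bit_field(nfa1)


if __name__ == "__main__":
    # Example: notch at 1 kHz with 100 Hz bandwidth, sampled at 48 kHz.
    print(notch_coefficients(1000.0, 100.0, 48000.0))
    print([hex(v) for v in notch_registers(1000.0, 100.0, 48000.0)])
```

A narrow bandwidth pushes a0 close to 1, so the scaled value approaches the edge of the 14-bit range; this is why the sketch clamps before encoding.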
It should be noted that the structures and functions of the other conversion modules (122, ..., 124) are the same as those of the conversion module 121, so they can be understood by reference to the conversion module 121 and are not described again here.
Further, referring to fig. 4, the voice signal acquisition device also includes a plurality of signal amplifying modules (141, 142, ..., 144), each of which is arranged between an acquisition module and the conversion module connected to that acquisition module. The signal amplifying modules (141, 142, ..., 144) amplify each voice signal according to a preset amplification gain, so that analog-to-digital conversion is performed on the amplified voice signal. In a specific embodiment, as shown in fig. 4, the signal amplifying module 141 includes a signal amplifying circuit 1411 and a signal amplifying circuit 1412, which respectively amplify the two voice signals acquired by the acquisition module 111 and transmit the amplified voice signals to the analog-to-digital conversion circuit 1211 and the analog-to-digital conversion circuit 1212 respectively.
It should be noted that the signal amplifying modules (141, 142, ..., 144) may use a common signal amplifying chip whose gain can be set by programming. The gain of the signal amplifying circuit can be adjusted in a range from -12 dB to +35.25 dB in steps of 0.75 dB; specifically, the gain coefficient can be set through the IIC interface of the STM32F429. In this embodiment, to ensure consistent amplification of the voice signals, the signal amplifying modules (141, 142, ..., 144) are preferably set to the same gain coefficient.
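The gain range from -12 dB to +35.25 dB in 0.75 dB steps gives 64 settings, which fits a 6-bit code. The following sketch converts between a gain in dB and such a code. The linear mapping (code 0 for -12 dB, code 63 for +35.25 dB) and the function names are assumptions for illustration; the actual register layout depends on the amplifier chip used.

```python
GAIN_MIN_DB = -12.0
GAIN_MAX_DB = 35.25
GAIN_STEP_DB = 0.75
MAX_CODE = round((GAIN_MAX_DB - GAIN_MIN_DB) / GAIN_STEP_DB)  # 63


def gain_to_code(gain_db: float) -> int:
    """Map a gain in dB to the nearest step code (assumed linear mapping)."""
    if not GAIN_MIN_DB <= gain_db <= GAIN_MAX_DB:
        raise ValueError(f"gain {gain_db} dB is outside the adjustable range")
    return round((gain_db - GAIN_MIN_DB) / GAIN_STEP_DB)


def code_to_gain(code: int) -> float:
    """Map a step code back to the gain in dB."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError(f"code {code} is outside 0..{MAX_CODE}")
    return GAIN_MIN_DB + code * GAIN_STEP_DB


if __name__ == "__main__":
    # Setting every amplifying circuit to the same code keeps the gains consistent.
    code = gain_to_code(6.0)
    print(code, code_to_gain(code))
```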
Those skilled in the art will understand that the clock frequencies of the clock generation units may differ slightly and that the gain coefficients of the signal amplifying circuits cannot be absolutely identical. The invention therefore samples the voice signals in groups using the cross-injection acquisition mode, which avoids inconsistency of the voice signals in the analog-to-digital conversion process to the greatest extent.
Further, the voice signal acquisition device also comprises a plurality of storage units (151, 152, ..., 154), which are respectively connected with the plurality of conversion modules (121, 122, ..., 124). In an embodiment, the plurality of storage units (151, 152, ..., 154) respectively store the digital signals obtained by converting each group of voice signals, so that the processing module 13 obtains the synchronously stored digital signals from the storage units (151, 152, ..., 154) and synthesizes them into the multi-channel stereo signal.
The storage units (151, 152, ..., 154) may use TF cards for data storage. When the number of microphones in the microphone array increases and, because of the limited data transmission bandwidth, it becomes difficult to read all digital signals from the data read-write control unit in real time, the invention first caches the data in the TF card. When data acquisition is finished, the data can either be read directly from the TF card, or be read out sequentially through the data read-write control unit and then uploaded to the processing module 13 through a USB port, so that the processing module obtains the expected digital voice signals.
Referring to fig. 3 and 4, the processing module 13 is connected to the plurality of conversion modules (121, 122, 123, 124) and is configured to synthesize the read digital signals (i.e., digital voice signals) into a multi-channel stereo signal.
It should be noted that the obtained digital signals may be superimposed and mixed in order to synthesize them, so that each synthesized audio signal forms one channel; a plurality of channels can thus be formed from the synthesized audio signals, finally yielding a multi-channel stereo signal. Amplifying and playing the stereo signals separately produces distinguishable stereo sound. When a listener hears stereo sound directly in a stereo space, the listener can perceive the direction and layering of the sounds in addition to their loudness, pitch, and timbre.
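The superimpose-and-mix step can be sketched as follows. This is only one possible reading: it assumes that the two digital signals of each group are averaged into one channel and that the channels are then interleaved into multi-channel frames. The function names and the averaging rule are assumptions, not the mixing method specified by the application.

```python
from typing import List, Sequence, Tuple


def mix_pair(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Superimpose two equally long sample streams by averaging.

    Averaging (floor division by 2) keeps the result inside the original
    sample range, so the mix cannot overflow the sample width.
    """
    if len(a) != len(b):
        raise ValueError("both signals of a group must have the same length")
    return [(x + y) // 2 for x, y in zip(a, b)]


def synthesize_multichannel(
    groups: Sequence[Tuple[Sequence[int], Sequence[int]]]
) -> List[Tuple[int, ...]]:
    """Mix each group into one channel, then interleave channels into frames.

    zip() stops at the shortest channel, so groups are expected to cover
    the same number of samples.
    """
    channels = [mix_pair(a, b) for a, b in groups]
    return list(zip(*channels))


if __name__ == "__main__":
    groups = [([0, 100], [100, 300]), ([10, -10], [30, -30])]
    print(synthesize_multichannel(groups))
```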
In another embodiment, as can be seen in fig. 5, the voice signal acquisition device disclosed in the present application is used to acquire voice signals from a two-dimensional cross microphone array, where the cross microphone array includes not only microphones M1-M5 in vertical columns, but also microphones in horizontal columns (e.g., M6, M7, M3, M8, M9). The plurality of acquisition modules and the plurality of conversion modules are respectively connected with the microphones at intervals and adjacently, so that each acquisition module has the characteristic of cross injection of two voice signals. Moreover, a unified sampling signal can be set for the clock generation unit in each conversion module, so that each conversion module has a consistent sampling rate. Of course, the microphone array may have other modes of composition, but may be connected by using the voice signal acquisition device disclosed in the present application, which will not be described in detail herein.
Embodiment two:
Referring to fig. 6, the present application further discloses a voice signal acquisition method for a microphone array, based on the voice signal acquisition device described in embodiment one. The method mainly includes steps S210 to S230, which are described below.
In step S210, a plurality of sets of voice signals are obtained, each set of voice signals including voice signals of two microphones that are spaced apart or adjacent.
In an embodiment, such as that of fig. 2, the acquisition module (111, 112, 113, 114) acquires one voice signal from one microphone, acquires another voice signal from another microphone spaced apart from the microphone such that one set of voice signals includes voice signals of two microphones spaced apart, and/or the acquisition module (111, 112, 113, 114) acquires one voice signal from one microphone and acquires another voice signal from another microphone adjacent to the microphone such that one set of voice signals includes voice signals of two microphones disposed adjacent to each other.
Step S220, the two paths of voice signals in each group are respectively converted to obtain digital signals corresponding to the two paths of voice signals.
In an embodiment, for two paths of voice signals in each group, synchronous analog-to-digital conversion is performed on the two paths of voice signals according to a preset sampling bit number and a preset sampling rate, so as to obtain digital signals corresponding to the two paths of voice signals respectively. For example, in fig. 2, the data read/write control unit 1213 and the clock generation unit 1214 control two analog-to-digital conversion circuits (1211, 1212) to perform synchronous analog-to-digital conversion on two voice signals according to a preset sampling bit number and a preset sampling rate, so as to obtain digital signals corresponding to the two voice signals respectively.
It should be noted that the number of sampling bits in the analog-to-digital conversion process may be selected from 8 bits, 16 bits, and 24 bits, and the sampling rate may be selected from 8 kHz, 11.025 kHz, 16 kHz, 22.05 kHz, 32 kHz, 44.1 kHz, 48 kHz, 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz; both are configured by programming. In this embodiment, to ensure the consistency and synchronism of sampling, each conversion module may use the same sampling bit number and sampling rate.
In another embodiment, the method further comprises an amplifying step after the step S210 and before the step S220, wherein the amplifying step includes amplifying each path of voice signal according to a preset amplifying gain to perform analog-to-digital conversion on the amplified voice signal. For example, in fig. 2, the signal amplifying module 141 includes a signal amplifying circuit 1411 and a signal amplifying circuit 1412, performs signal amplifying processing on the two voice signals acquired by the acquiring module 111, and transmits the amplified voice signals to the analog-to-digital converting circuit 1211 and the analog-to-digital converting circuit 1212, respectively.
It should be noted that the signal amplifying module may use a common signal amplifying chip whose gain can be set by programming; the gain of the signal amplifying circuit can be adjusted in a range from -12 dB to +35.25 dB in steps of 0.75 dB. In this embodiment, to ensure consistent amplification of the voice signals, the signal amplifying modules (141, 142, ..., 144) are preferably set to the same gain coefficient.
Step S230, obtaining the multi-channel stereo signal according to the digital signal synthesis processing obtained in step S220.
In one embodiment, the step S230 specifically includes storing the digital signals after each group of voice signal conversion processing, and synthesizing the digital signals according to the synchronously stored digital signals to obtain multi-channel stereo signals. For example, in fig. 2, a plurality of storage units (151, 152, 154) store the digital signals after the conversion processing of each group of voice signals, respectively, so that the processing module 13 acquires the digital signals stored synchronously from the storage units (151, 152, 154) and synthesizes the multi-channel stereo signals.
It should be noted that the obtained digital signals may be superimposed and mixed in order to synthesize them, so that each synthesized audio signal forms one channel; a plurality of channels can thus be formed from the synthesized audio signals, finally yielding a multi-channel stereo signal. Amplifying and playing the stereo signals separately produces distinguishable stereo sound. When a listener hears stereo sound directly in a stereo space, the listener can perceive the direction and layering of the sounds in addition to their loudness, pitch, and timbre.
To enable those skilled in the art to understand the technical scheme of the present application clearly and accurately, its technical principle and technical effect are described in detail below with reference to fig. 7 to 9.
Referring to fig. 7, to acquire signals without delaying the system's processing of them, and thereby improve the real-time performance of signal acquisition and processing, the invention adopts a double-buffer FIFO structure. The AD-converted data stream is first stored into M0AR (memory 1) through DMA. When memory 1 is full, the hardware automatically switches the receiving buffer to the next buffer, M1AR (memory 2); meanwhile, the program writes the data in M0AR (memory 1) to the SD card through DMA and stores it as a wav file. When M1AR (memory 2) is full, reception automatically switches back to M0AR (memory 1). In this way, whenever one buffer is full, the buffer pointer switches automatically to the other buffer and the full buffer's data is stored, which realizes uninterrupted data acquisition and storage of the wav file.
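The double-buffer switching described above can be simulated in software with a minimal sketch. The class `DoubleBuffer` and its `flush` callback are hypothetical names; on the real hardware the switch is performed by the DMA controller and the flush to the SD card runs concurrently, whereas this simulation runs the two steps one after the other.

```python
from typing import Callable, List


class DoubleBuffer:
    """Software model of a two-buffer (ping-pong) receive scheme."""

    def __init__(self, capacity: int, flush: Callable[[List[int]], None]) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._buffers: List[List[int]] = [[], []]  # stand-ins for memory 1 and memory 2
        self._active = 0                           # index of the buffer receiving data
        self._capacity = capacity
        self._flush = flush                        # e.g. a routine that appends to a file

    def write(self, sample: int) -> None:
        """Store one converted sample; swap buffers when the active one fills."""
        buf = self._buffers[self._active]
        buf.append(sample)
        if len(buf) == self._capacity:
            full = self._active
            self._active ^= 1                      # new samples now go to the other buffer
            self._flush(list(self._buffers[full])) # hand the full block to storage
            self._buffers[full].clear()

    @property
    def pending(self) -> List[int]:
        """Samples received but not yet flushed."""
        return list(self._buffers[self._active])


if __name__ == "__main__":
    stored: List[List[int]] = []
    db = DoubleBuffer(capacity=4, flush=stored.append)
    for sample in range(10):
        db.write(sample)
    print(stored, db.pending)
```

Because reception moves to the second buffer before the first one is flushed, no incoming sample has to wait for the storage write, which is the point of the scheme.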
Fig. 8 shows the waveform of a voice signal collected by an existing general-purpose data collector, where x1 is the waveform of the signal collected from a target microphone sensor through the first channel of the existing dual-channel synchronous collector, and x2 is the waveform of the signal collected from the same microphone through the second channel of that collector. The signal amplifying part at the voice signal input of the existing general-purpose data collector has no automatic gain control circuit in hardware, so the amplifying circuit cannot adjust its gain coefficient according to the amplitude of the input signal. When the amplitude of the external voice signal is large, or when the microphone is close to the sound source, saturation distortion easily occurs because the amplification factor is too large.
Referring to fig. 9, to verify the synchronization characteristic of the cross-injection structure for voice signal acquisition provided by the technical scheme of the present application, a cross-correlation operation is performed on the two acquired digital voice signals. Whether the maximum of the cross-correlation function lies at the zero point, together with the left-right symmetry of the cross-correlation function, serves as the basis for evaluating the synchronous acquisition characteristic of the collector and whether the hardware design is symmetrical. The peak of the function curve obtained by cross-correlating the two acquired signals lies at the zero point, which shows that the voice signals acquired by the two data acquisition channels are well synchronized.
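The synchronization check described above can be sketched in plain Python. The function names are hypothetical, and the asymmetry measure (the summed difference between positive and negative lags, normalized by the total magnitude) is an assumed definition of the symmetry index, which the application does not spell out.

```python
from typing import Dict, Sequence


def cross_correlation(x: Sequence[float], y: Sequence[float], max_lag: int) -> Dict[int, float]:
    """Return r[lag] = sum over i of x[i] * y[i + lag] for lags in [-max_lag, max_lag]."""
    n = min(len(x), len(y))
    result: Dict[int, float] = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += x[i] * y[j]
        result[lag] = total
    return result


def peak_lag(x: Sequence[float], y: Sequence[float], max_lag: int) -> int:
    """Lag at which the cross-correlation is largest; 0 indicates synchronous channels."""
    r = cross_correlation(x, y, max_lag)
    return max(r, key=lambda lag: r[lag])


def asymmetry(x: Sequence[float], y: Sequence[float], max_lag: int) -> float:
    """Assumed symmetry index: 0.0 means r[lag] == r[-lag] for every lag."""
    r = cross_correlation(x, y, max_lag)
    diff = sum(abs(r[k] - r[-k]) for k in range(1, max_lag + 1))
    scale = sum(abs(v) for v in r.values())
    return diff / scale if scale else 0.0


if __name__ == "__main__":
    x = [0, 0, 1, 3, 1, 0, 0, 0]
    y = [0, 0, 0, 0, 1, 3, 1, 0]   # x delayed by two samples
    print(peak_lag(x, x, 3), peak_lag(x, y, 3), asymmetry(x, x, 3))
```

With this sign convention, a positive peak lag means the second signal lags the first by that many samples.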
In summary, the technical scheme provided by the application reduces the difficulty of signal acquisition for a microphone array and improves the synchronism of the acquisition process. The design difficulty of the voice signal acquisition device is therefore reduced, and its design is no longer limited by the bottleneck of acquisition synchronization. The signal-to-noise ratio of the voice signals acquired by the device is greatly improved, and the bandwidth limitation of the prior art is overcome in data transmission and data storage. In addition, the device adopts an orderly data buffer mechanism in the data read-write control unit, so that it is well optimized in terms of electronic circuit wiring.
The foregoing is a further detailed description of the application in connection with specific embodiments, and it is not intended that the application be limited to such description. It will be apparent to those skilled in the art that several simple deductions or substitutions can be made without departing from the inventive concept.