CN110610717B - Separation method of mixed signals in complex frequency spectrum environment - Google Patents
Separation method of mixed signals in complex frequency spectrum environment
- Publication number
- CN110610717B (application CN201910810854.9A)
- Authority
- CN
- China
- Prior art keywords
- sampling
- layer
- data
- signal
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G10L21/0224—Speech enhancement; noise filtering characterised by the method used for estimating noise; processing in the time domain
- G10L21/0232—Speech enhancement; noise filtering characterised by the method used for estimating noise; processing in the frequency domain
- G10L21/0272—Speech enhancement; voice signal separating
- G10L25/27—Speech or voice analysis techniques characterised by the analysis technique
- G10L25/45—Speech or voice analysis techniques characterised by the type of analysis window
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a method for separating mixed signals in a complex frequency spectrum environment, which solves the problem of separating multiple highly overlapped mixed signals in such an environment. The method has the following technical features: based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; data features are compressed by the down-sampling coding network and the data size is restored by the up-sampling decoding network. The receiver performs time-domain windowing and spectrum reconstruction on the IQ two-path data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, completing the spectrum reconstruction. The IQ two-path data and the magnitude spectrum then form tensor data that serve as the network input, from which the IQ two-path and magnitude-spectrum data of the target signal are obtained. In a Gaussian noise environment, the semantic segmentation network improved from U-Net learns the signal features in the received data and separates the complex-baseband IQ two paths and the magnitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.
Description
Technical Field
The invention belongs to the field of spectrum sensing in wireless communication, and relates to a mixed-signal separation method based on a semantic segmentation network.
Background
Semantic segmentation is an important research branch in the field of computer vision: given a picture, every pixel in the picture is classified, so that the segmented picture contains several colors, each color representing one class. Image semantic segmentation is an important branch of AI and a key part of image understanding in machine vision. Unlike classification, semantic segmentation must judge the category of every pixel of an image and segment it accurately. Deep learning is a branch of machine learning that mainly refers to deep neural network algorithms. A neural network is an artificial neuron system built by imitating human neurons; each neuron has multiple inputs and a single output, and that output serves as the input of the next neuron. Many individual neurons organized together form a neural network. Deep neural networks are generally applied to the separation or enhancement of sparse speech signals, or to the separation of signals with a certain periodicity. A semantic segmentation neural network can return pixel-level labels and segment the targets in the input data according to those labels. With the development of deep-learning semantic segmentation technology, semantic segmentation networks have also been applied to mixed-signal separation.
In the non-cooperative reception of communication signals, single-channel mixed signals widely exist in short-wave, ultra-short-wave and satellite channels for various reasons: a communication system may adopt frequency reuse, the electromagnetic environment may be complex, there may be intentional or unintentional interference from other systems, or a third-party receiver may be limited in receiving region and prior knowledge. Because the signals are aliased in both the time domain and the frequency domain, traditional time-domain or frequency-domain filtering cannot effectively separate the source signals from the mixture, which hampers signal analysis and information extraction. Separating multiple time-frequency aliased signal components is inherently a hard problem of estimating more quantities from fewer observations. In a complex electromagnetic environment, the signal received by a sensor is very complex, consisting mainly of echo signals, interference signals, clutter and internal noise. The wide spectrum, unknown characteristics and complex, changeable waveforms of these signals bring practical difficulties to signal processing. For example, the signal received by a passive sonar may be a mixture of several completely unknown signals whose transmission channel is also unknown, or time-varying with temperature and ocean currents (as in a marine environment). Since the spectra of the received signals are aliased, they are difficult to separate in the frequency domain.
The main current approach to multi-signal separation is to transform the signal from the time domain to the frequency domain and separate and identify it there, or to use a time-frequency analysis tool such as the wavelet transform to detect and analyze the signal. These methods exploit the differences between the signals in the time and frequency domains, i.e. they assume that in a relatively ideal signal environment the signal spectra do not alias. When the spectra are aliased, the original signals cannot be separated in the frequency domain, and signal separation, sorting and identification by these methods become very difficult. For a mixed signal whose source signals satisfy the independent identically distributed condition, when the probability density function of the source signals is severely smeared, the line traced by the mixed signal on the probability density contour map may pass through two branches of the joint probability density contour of the source signals; in this case the source signals cannot be separated from the single-path mixture even if the mixing coefficients are known.
For the problem of signal separation in a complex frequency spectrum environment where several signals mix with heavy overlap, the prior art has the following shortcomings: there is no effective means of separating mixed signals with high time-frequency aliasing, and the signals targeted for separation must be sparse or periodic; signals in a practical communication system that are neither sparse nor periodic cannot be separated effectively. Mixed-signal separation is usually attempted by separating the signal sources, relying on multiple receiving antennas and applying clustering, matching and similar algorithms to the multi-path received data to find each source. However, when there are many sources and the data received by a single antenna overlap heavily in the time-frequency domain, neither the traditional methods nor existing deep neural network methods achieve effective separation.
Disclosure of Invention
The invention aims to provide a method for separating mixed signals in a complex frequency spectrum environment that has good separation performance and effectively suppresses noise, addressing the problem of separating signals under heavy overlap (severe time-frequency aliasing) of several signals and the shortcomings of existing mixed-signal separation techniques.
The above object of the invention is achieved by the following measures. A method for separating mixed signals in a complex frequency spectrum environment has the following technical features: based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; data features are compressed by the down-sampling coding network and the data size is restored by the up-sampling decoding network. The receiver performs time-domain windowing and spectrum reconstruction on the IQ two-path data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, completing the spectrum reconstruction. The IQ two-path data and the magnitude spectrum then form tensor data that serve as the network input, from which the IQ two-path and magnitude-spectrum data of the target signal are obtained. In a Gaussian noise environment, the semantic segmentation network improved from U-Net learns the signal features in the received data and separates the complex-baseband IQ two paths and the magnitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.
Compared with the prior art, the invention has the following beneficial effects.
For the problem of signal separation under heavy overlap (severe time-frequency aliasing) of several signals, the semantic segmentation network outputs the IQ two-path and magnitude-spectrum data of the target signal through its output network. Time-domain windowing and spectrum reconstruction are applied to the IQ two-path data of the received mixed signal, and the windowing suppresses spectral leakage during spectrum reconstruction. The semantic segmentation network improved from U-Net separates the baseband mixed signal in a Gaussian noise environment, solving the difficulty of separating mixtures that overlap in both the time and frequency domains. Even when the mixing mode is unknown, the features of the mixed signal can be extracted while training the semantic segmentation network of a target signal, and time-frequency separation of the mixture is achieved by traversing the semantic segmentation network of each target signal. The output can locate the position of the target category, achieves good separation performance in the time-frequency domain, and effectively suppresses the influence of noise. Simulation results show that the invention has strong tracking capability and a wide application range, and obtains good separation performance at low signal-to-noise ratio in a complex frequency spectrum environment.
The semantic segmentation network is constructed from U-Net, the classical semantic segmentation structure, by improving its convolution and pooling kernel sizes, loss function and output size. U-Net adopts an encoding-decoding network structure and promotes multi-scale feature fusion through channel concatenation: each time data pass through an up-sampling layer, they are feature-fused with the down-sampling layer of the same data size. This yields better signal separation performance in a low signal-to-noise, complex frequency spectrum environment. U-Net is well suited to small-sample, large-scale data, so it can be applied to the signal separation problem and can process longer time-frequency sampling sequences. The method improves the U-Net semantic segmentation model by replacing the pixel-label cross entropy with the mean square error of the time-frequency waveform as the loss function, and by adjusting the convolution and pooling kernel sizes to suit the one-dimensional time-frequency sampling sequence of a signal; the network extracts time-frequency features during training, realizes the separation of mixed signals, and thereby alleviates the problems of the prior art to a certain extent.
Drawings
FIG. 1 is a diagram of a semantic segmentation network for separating mixed signals in a Gaussian noise environment according to an embodiment of the present invention;
fig. 2 is a block diagram of a down-sampling module according to an embodiment of the present invention;
FIG. 3 is a block diagram of an upsampling module provided by an embodiment of the present invention;
FIG. 4 is a complex baseband I waveform of a mixed signal according to an embodiment of the present invention;
FIG. 5 is a schematic comparison between the complex baseband I waveform of the time-domain Gaussian pulse signal separated by the present invention and that of the original time-domain Gaussian pulse signal;
FIG. 6 is a schematic diagram of a mixed signal magnitude spectrum provided by an embodiment of the invention;
FIG. 7 is a schematic comparison between the magnitude spectrum of the multi-tone signal separated from the mixed signal by the present invention and the magnitude spectrum of the original multi-tone signal;
fig. 8 is a time-frequency domain mean square error plot of the multi-tone signal separated from the mixed signal according to the present invention.
In order to make the objects, technical means and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples.
Detailed Description
Referring to fig. 1, the semantic segmentation network designed by the invention is based on U-Net, the classical semantic segmentation structure: a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules, and input data pass first through the down-sampling coding network and then through the up-sampling decoding network. Signal features are learned from the received data during training. The receiver performs time-domain windowing and spectrum reconstruction on the IQ two-path data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, completing the spectrum reconstruction. The IQ two-path data and the magnitude spectrum of the mixed signal then form tensor data that serve as the input of the semantic segmentation network, from which the IQ two-path and magnitude-spectrum data of the target signal are obtained. In a Gaussian noise environment, the semantic segmentation network improved from U-Net learns the signal features in the received data and separates the complex-baseband IQ two paths and the magnitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.
The first layer of the semantic segmentation network is the input layer, with input size (512,1024,1,3); the input layer is followed by the down-sampling coding network, the fourth down-sampling module is followed by the up-sampling decoding network, and the fourth up-sampling module is followed by the output network. The feature concatenations are: the first down-sampling module with the output network (Conv2D, convolution-ReLU activation layer, Conv2D); the second down-sampling module with the fourth up-sampling module; the third down-sampling module with the third up-sampling module; and the fourth down-sampling module with the second up-sampling module. Data pass sequentially through the first, second, third and fourth down-sampling modules and then through the first, second, third and fourth up-sampling modules, followed by the Conv2D and convolution-ReLU activation layers. The input tensor [X_I, X_Q, Y] flows through the first down-sampling module and onward; the second up-sampling module concatenates the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module; the third up-sampling module concatenates the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module; the fourth up-sampling module concatenates the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module. After the output of the up-sampling layer of the fourth up-sampling module is concatenated with the input of the down-sampling layer of the first down-sampling module, the result passes through a convolution-ReLU activation layer with kernel size (3,1) and a convolution layer with kernel size (1,1) and 3 channels, where X_I is the I-path data stream, X_Q the Q-path data stream, and Y the magnitude-spectrum data stream.
The specific structure of the down-sampling coding network is as follows:
the first down-sampling module uses the basic structure of a down-sampling coding network, and the number of channels of the convolutional layer and the pooling layer is 32.
The second down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 64;
the third down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 128;
the fourth down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 256;
the fourth down-sampling module is followed by an up-sampling decoding network, and the specific structure of the up-sampling decoding network is as follows:
the first up-sampling module uses the basic structure of an up-sampling decoding network, the number of channels of a convolution kernel is 512, and the number of channels of an up-sampling layer is 256.
The second up-sampling module splices the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 256, and the number of channels of an up-sampling layer is 128;
the third up-sampling module splices the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 128, and the number of channels of an up-sampling layer is 64;
and the fourth up-sampling module splices the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 64, and the number of channels of an up-sampling layer is 32;
The output network after the fourth up-sampling module concatenates its input with the input of the down-sampling layer of the first down-sampling module, then passes it through a convolution-ReLU activation layer with kernel size (3,1) and 32 channels; finally a convolution layer with kernel size (1,1) and 3 channels outputs, on its three channels respectively, the IQ two-path and magnitude-spectrum data of the separated target signal. The output size is (512,1024,1,3). The loss function of the network is the mean square error, the optimizer is Adam, and the learning rate is 0.001.
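To make the wiring above concrete, the following is a minimal Keras sketch of the described architecture. It assumes TensorFlow 2.x; the helper names down_block/up_block are ours, and placing a channel-reducing Conv2D after each UpSampling2D is our reading of the stated "up-sampling layer" channel counts, not a definitive implementation.

```python
# Minimal sketch of the described U-Net variant (assumption: TensorFlow 2.x).
import tensorflow as tf
from tensorflow.keras import layers, Model

def down_block(x, channels):
    # two Conv2D-ReLU layers (kernel (3,1), zero-padded), then (2,1) max pooling
    x = layers.Conv2D(channels, (3, 1), padding="same", activation="relu")(x)
    x = layers.Conv2D(channels, (3, 1), padding="same", activation="relu")(x)
    skip = x                                  # pre-pooling features, used for concatenation
    x = layers.MaxPooling2D(pool_size=(2, 1))(x)
    return x, skip

def up_block(x, skip, conv_channels, up_channels):
    x = layers.Conv2D(conv_channels, (3, 1), padding="same", activation="relu")(x)
    x = layers.Conv2D(conv_channels, (3, 1), padding="same", activation="relu")(x)
    x = layers.UpSampling2D(size=(2, 1))(x)   # doubles the sample axis
    x = layers.Conv2D(up_channels, (3, 1), padding="same", activation="relu")(x)
    return layers.Concatenate(axis=-1)([x, skip])

inp = layers.Input(shape=(1024, 1, 3))        # (N, 1, 3): I path, Q path, magnitude spectrum
x, s1 = down_block(inp, 32)
x, s2 = down_block(x, 64)
x, s3 = down_block(x, 128)
x, s4 = down_block(x, 256)
x = up_block(x, s4, 512, 256)                 # first up-sampling module
x = up_block(x, s3, 256, 128)                 # second
x = up_block(x, s2, 128, 64)                  # third
x = up_block(x, s1, 64, 32)                   # fourth
x = layers.Conv2D(32, (3, 1), padding="same", activation="relu")(x)
out = layers.Conv2D(3, (1, 1))(x)             # 3 channels: target I, Q, magnitude spectrum
model = Model(inp, out)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
```

Concatenating the pre-pooling encoder features with each decoder stage reproduces the skip connections described above; the channel counts follow the module descriptions (32/64/128/256 on the encoder side, 512/256 down to 64/32 on the decoder side).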
The basic structure of the down-sampling coding network consists of two convolution layers, an activation layer and a strided-pooling down-sampling layer; stacking this basic structure 4 times forms the down-sampling coding network. The basic structure of the up-sampling decoding network consists of two convolution layers, an activation layer and an up-sampling layer; stacking it 4 times forms the up-sampling decoding network, and after each up-sampling the features are concatenated with the down-sampling coding layer of the same size. The concatenated data pass through a three-channel convolution output layer whose channels respectively output the IQ two-path and magnitude-spectrum data of the separated target signal [X_outputI, X_outputQ, Y_output], where X_outputI denotes the complex-baseband I-path data of the separated target signal, X_outputQ the complex-baseband Q-path data, and Y_output the magnitude-spectrum data.
The training set of the semantic segmentation network takes as input a sample set of tensors composed of the IQ two-path and magnitude-spectrum data of the mixed signal; the training labels are tensors [X_labelI, X_labelQ, Y_label] composed of the IQ two-path and magnitude-spectrum data of the target signal, where X_labelI denotes the complex-baseband I-path data of the target-signal label, X_labelQ the complex-baseband Q-path data, and Y_label the magnitude-spectrum data. The validation data set is generated in the same way as the training data set.
See fig. 2. The down-sampling coding network is composed of down-sampling modules, each comprising: two Conv2D-ReLU activated convolution layers and one MaxPooling2D max-pooling down-sampling layer. The convolution layers keep their input and output sizes equal by zero-padding, and the MaxPooling2D layer halves its input size as output. In this example the Conv2D-ReLU convolution kernel size is (3,1) and the down-sampling stride of the max-pooling layer is (2,1). In the down-sampling module structure shown in fig. 2, the module first extracts features through the Conv2D-ReLU activated convolution layers and then halves the data size through the max-pooling layer with pooling stride (2,1).
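As a quick shape check of this behaviour (a sketch assuming TensorFlow/Keras; the random input is illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((8, 1024, 1, 32))                                # (batch, N, 1, channels)
x = layers.Conv2D(32, (3, 1), padding="same", activation="relu")(x)   # size preserved by padding
x = layers.MaxPooling2D(pool_size=(2, 1))(x)                          # halves the sample axis
print(x.shape)                                                        # (8, 512, 1, 32)
```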
See fig. 3. The up-sampling decoding network is composed of up-sampling modules, each comprising: two Conv2D-ReLU activated convolution layers and one up-sampling layer. The convolution layers keep their input and output sizes equal by zero-padding, and the up-sampling layer doubles its input size as output. In this example the convolution kernel size is (3,1) and the up-sampling step of the up-sampling layer is (2,1). In the up-sampling module structure illustrated in fig. 3, the module first extracts features through the Conv2D-ReLU convolution layers and then doubles the data size through the up-sampling layer with step (2,1). Specifically, the ReLU activation function, the up-sampling operation and the mean square error function used in the semantic segmentation network are as follows. ReLU is the rectified linear unit activation function commonly used in convolutional neural networks, expressed as ReLU(x) = max(0, x). The up-sampling operation in the up-sampling module is parameterized by a step size (size(0), size(1)): it repeats the input size(0) times along its rows and size(1) times along its columns. Taking data of size (B,1024,1,3) as an example, up-sampling with step size (2,1) yields data of size (B,2048,1,3). The mean square error loss function is defined as MSE = (1/N) Σ_{i=1}^{N} ω_i (y_i − ŷ_i)², where N is the data length, ω_i is the probability that y = y_i, y_i is the label data, and ŷ_i is the network output.
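These three operations can be written out directly; a NumPy sketch under the definitions above, where defaulting the weights ω_i to uniform values is our assumption for the unweighted case:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x)
    return np.maximum(0.0, x)

def upsample(x, step=(2, 1)):
    # repeat step[0] times along rows and step[1] times along columns
    return np.repeat(np.repeat(x, step[0], axis=1), step[1], axis=2)

def mse(y_label, y_output, w=None):
    # weighted mean square error; w defaults to uniform weights (assumption)
    w = np.ones_like(y_label) if w is None else w
    return np.mean(w * (y_label - y_output) ** 2)

x = np.random.randn(8, 1024, 1, 3)   # (B, N, 1, 3)
print(upsample(x).shape)             # (8, 2048, 1, 3)
```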
The specific implementation comprises the following steps:
step 1: and the receiver performs time domain windowing and frequency spectrum reconstruction processing on the IQ two-path data of the received mixed signal. IQ two-path data of the mixed signal and the amplitude spectrum data form tensor data as network input; IQ two-path data of the target signal and the amplitude spectrum form tensor data as a network tag.
In an alternative embodiment, taking the I path as an example: let the I-path data stream received by the receiver be the mixed signal X_I(i), i = 1,2,...,N. A time-domain window function W_I(i), whose window length equals the number of sampling points N, is applied for time-domain windowing, giving the windowed result: X_windowedI(i) = X_I(i)·W_I(i), i = 1,2,...,N
Spectrum reconstruction is then performed on this basis: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, which completes the spectrum reconstruction.
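A NumPy sketch of this preprocessing step, assuming the Hanning window and N = 1024 given in the embodiment below; taking the FFT over the windowed complex baseband I + jQ is our reading of the magnitude computation:

```python
import numpy as np

N = 1024
w = np.hanning(N)                       # window length matches the N sampling points

def reconstruct_spectrum(x_i, x_q):
    # time-domain windowing of the I and Q streams
    xw_i = x_i * w
    xw_q = x_q * w
    # length-N FFT of the windowed complex baseband signal (I + jQ is an
    # assumption; the patent windows I and Q and then takes the FFT magnitude)
    spectrum = np.fft.fft(xw_i + 1j * xw_q, n=N)
    return np.abs(spectrum)             # magnitude completes the reconstruction

x_i = np.random.randn(N)                # placeholder mixed-signal I path
x_q = np.random.randn(N)                # placeholder mixed-signal Q path
mag = reconstruct_spectrum(x_i, x_q)    # shape (1024,)
```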
Step 2: the receiver constructs a semantic segmentation network and a sample set used for training, specifies a target signal and trains the network. In a gaussian noise environment, the present embodiment employs a semantic division network, which forms a tensor by using IQ two-way data and amplitude spectrum data, which are input as mixed signalsThe semantic segmentation network input layer takes N as the sample length, B is the number of samples input each time, the data are input in a sample format of (B, N,1,3), input data are compressed through a down-sampling coding network, then the data size is recovered through an up-sampling decoding network, complex baseband IQ two-path data and amplitude spectrum data of a target signal are separated from a mixed signal in parallel, and finally the baseband IQ two-path data and the amplitude spectrum data of the target signal are output by an output layer.
Step 3: for a mixed signal that does not contain the target signal, the semantic segmentation network forms the IQ two-path and magnitude-spectrum data of the retained mixed signal into a tensor [X_I, X_Q, Y]. The tensor composed of the I path, Q path and magnitude spectrum of the mixed signal is fed to the input layer of the semantic segmentation network, yielding the I-path, Q-path and magnitude-spectrum tensor of the target signal [X_outputI, X_outputQ, Y_output].
With the trained semantic segmentation network, the receiver performs time-domain windowing and spectrum reconstruction on the IQ two-path data of the received mixed signal, concatenates the IQ two-path data with the magnitude spectrum, and feeds the result into the semantic segmentation network. One network model corresponds to exactly one target signal: a model trained for a specified target signal separates only that target signal from the mixed signal. For a mixed signal containing the target signal, the semantic segmentation network separates out the tensor [X_outputI, X_outputQ, Y_output] consisting of the IQ two-path and magnitude-spectrum data of the target signal.
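Since one model corresponds to exactly one target signal, traversing the per-target models could be sketched as below; `models` and `separate_all` are hypothetical names, and each model is assumed to be a trained network like the Keras sketch above:

```python
# Hypothetical inference loop: `models` maps each target-signal name to its
# trained network; `samples` is the (B, N, 1, 3) tensor built from received data.
def separate_all(models, samples):
    out = {}
    for name, model in models.items():
        est = model.predict(samples)      # (B, N, 1, 3) estimate of the target signal
        # channel 0: I path, channel 1: Q path, channel 2: magnitude spectrum
        out[name] = (est[..., 0], est[..., 1], est[..., 2])
    return out
```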
In an alternative embodiment, the parameters and implementation steps of each type of parameter signal are as follows:
step 1: and the receiver performs time domain windowing and spectrum reconstruction processing on the IQ two-path data of the received mixed signal. Here, a hanning window is used, where the length N is 1024 and the FFT length N is 1024. In this embodiment, a mixed signal of a time domain gaussian pulse signal, a polyphonic signal, a linear sweep signal, and a noise blocking signal is used as an embodiment.
Step 2: and the training set specifies a target signal and constructs a semantic segmentation network. The overall structure of the network is shown in fig. 1. And the semantic segmentation network takes tensor data formed by the processed I path, Q path and magnitude spectrum as a training sample to train the network.
Fig. 4 shows the complex-baseband I-path waveform of the mixed signal of the embodiment, with each component at a signal-to-noise ratio of -5 dB; it shows visually how severely the signals overlap in the time domain.
Fig. 5 compares the complex-baseband I-path waveform of the time-domain Gaussian pulse signal separated from the mixed signal with that of the original time-domain Gaussian pulse signal, showing the separation performance in the time domain; the small waveform difference shows that the invention separates well in the time domain.
Fig. 6 shows the magnitude spectrum of the mixed signal of the embodiment, with the same signal components and parameters as in fig. 4; it shows visually how severely the signals overlap in the frequency domain.
Fig. 7 compares the magnitude spectrum of the multi-tone signal separated from the mixed signal with that of the original multi-tone signal, showing the separation performance in the frequency domain; the small spectrum difference shows that the invention separates well in the frequency domain.
Fig. 8 plots the mean square error of the multi-tone signal separated from the mixed signal, computed as the sum of its time-domain and frequency-domain mean square errors, quantifying the separation performance of the semantic segmentation network. The mean square error is below -20 dB at the lowest signal-to-noise ratio and below -40 dB at the highest, indicating good separation performance.
The parameters of the mixed signal used in the examples are shown in table 1.
Table 1. Mixed-signal parameter settings of the embodiment
The foregoing is directed to the preferred embodiment of the present invention and it is noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.
Claims (10)
1. A separation method of mixed signals in a complex frequency spectrum environment, having the following technical features: based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; data features are compressed by the down-sampling coding network and the data size is restored by the up-sampling decoding network; the receiver performs time-domain windowing and spectrum reconstruction on the IQ two-path data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, completing the spectrum reconstruction; after the receiver completes the time-domain windowing and spectrum reconstruction of the mixed signal, the IQ two-path data and the magnitude spectrum form tensor data that serve as the network input, from which the IQ two-path and magnitude-spectrum data of the target signal are obtained; in a Gaussian noise environment, the semantic segmentation network improved from U-Net, taking the mean square error function as its loss function, learns the signal features in the received data and separates the complex-baseband IQ two paths and the magnitude-spectrum data of the target signal from the mixed signal in parallel to recover the source signal.
2. The method for separating mixed signals in a complex frequency spectrum environment according to claim 1, wherein: the first layer of the semantic segmentation network is the input layer with input size (512,1024,1,3); the input layer is followed by the down-sampling coding network, the fourth down-sampling module is followed by the up-sampling decoding network, and the fourth up-sampling module is followed by the output network.
3. The method for separating mixed signals in a complex frequency spectrum environment according to claim 1, wherein the down-sampling coding network comprises the following feature concatenations: the first down-sampling module with the output network (Conv2D, convolution-ReLU activation layer, Conv2D); the second down-sampling module with the fourth up-sampling module; the third down-sampling module with the third up-sampling module; and the fourth down-sampling module with the second up-sampling module.
4. The method for separating mixed signals in a complex frequency spectrum environment according to claim 3, wherein: data pass sequentially through the first, second, third and fourth down-sampling modules and then through the first, second, third and fourth up-sampling modules, followed by the Conv2D and convolution-ReLU activation layers; the input tensor [X_I, X_Q, Y] flows through the first down-sampling module and onward; the second up-sampling module concatenates the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module; the third up-sampling module concatenates the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module; the fourth up-sampling module concatenates the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module; after the output of the up-sampling layer of the fourth up-sampling module is concatenated with the input of the down-sampling layer of the first down-sampling module, the result passes through a convolution-ReLU activation layer with kernel size (3,1) and a convolution layer with kernel size (1,1) and 3 channels, where X_I is the I-path data stream, X_Q the Q-path data stream, and Y the magnitude-spectrum data stream.
5. The method for separating mixed signals in a complex frequency spectrum environment according to claim 3, wherein: the basic structure of the down-sampling coding network consists of two convolution layers, an activation layer and a strided-pooling down-sampling layer, and this basic structure is stacked 4 times to form the down-sampling coding network.
6. The method for separating mixed signals in a complex frequency spectrum environment according to claim 1, wherein: the basic structure of the up-sampling decoding network consists of two convolution layers, an activation layer and an up-sampling layer, and this basic structure is stacked 4 times to form the up-sampling decoding network; after each up-sampling the features are concatenated with the down-sampling coding layer of the same size, the concatenated data pass through a three-channel convolution output layer, and the three channels respectively output the IQ two-path and magnitude-spectrum data of the separated target signal [X_outputI, X_outputQ, Y_output], where X_outputI denotes the complex-baseband I-path data of the separated target signal, X_outputQ the complex-baseband Q-path data, and Y_output the magnitude-spectrum data.
7. The method for separating mixed signals in a complex frequency spectrum environment according to claim 3, wherein: the down-sampling coding network is composed of down-sampling modules, each comprising two Conv2D-ReLU activated convolution layers and one MaxPooling2D max-pooling down-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the MaxPooling2D layer halves its input size as output.
8. The method for separating mixed signals in a complex frequency spectrum environment according to claim 7, wherein: the down-sampling module first extracts features through the Conv2D-ReLU activated convolution layers and then halves the data size through the max-pooling layer with pooling stride (2,1); the up-sampling module first extracts features through the Conv2D-ReLU convolution layers and then doubles the data size through the up-sampling layer with up-sampling step (2,1).
9. The method for separating mixed signals in a complex frequency spectrum environment according to claim 6, wherein: the up-sampling decoding network is composed of up-sampling modules, each comprising two Conv2D-ReLU activated convolution layers and one up-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the up-sampling layer doubles its input size as output.
10. The method for separating mixed signals in a complex frequency spectrum environment according to claim 1, wherein: the I-path data stream received by the receiver is the mixed signal X_I(i), i = 1,2,...,N; a time-domain window function W_I(i), whose window length equals the number of sampling points N, is applied for time-domain windowing, giving the windowed result X_windowedI(i) = X_I(i)·W_I(i), i = 1,2,...,N; spectrum reconstruction is then performed on this basis: a fast Fourier transform (FFT) of length N is applied to the time-domain windowed signal and its magnitude is computed, completing the spectrum reconstruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910810854.9A CN110610717B (en) | 2019-08-30 | 2019-08-30 | Separation method of mixed signals in complex frequency spectrum environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910810854.9A CN110610717B (en) | 2019-08-30 | 2019-08-30 | Separation method of mixed signals in complex frequency spectrum environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110610717A CN110610717A (en) | 2019-12-24 |
CN110610717B true CN110610717B (en) | 2021-10-15 |
Family
ID=68890685
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910810854.9A Active CN110610717B (en) | 2019-08-30 | 2019-08-30 | Separation method of mixed signals in complex frequency spectrum environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110610717B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111126332B (en) * | 2019-12-31 | 2022-04-22 | 桂林电子科技大学 | Frequency hopping signal classification method based on contour features |
CN112272066B (en) * | 2020-09-15 | 2022-08-26 | 中国民用航空飞行学院 | Frequency spectrum data cleaning method used in airport terminal area very high frequency communication |
CN112420065B (en) * | 2020-11-05 | 2024-01-05 | 北京中科思创云智能科技有限公司 | Audio noise reduction processing method, device and equipment |
CN112434415B (en) * | 2020-11-19 | 2023-03-14 | 中国电子科技集团公司第二十九研究所 | Method for implementing heterogeneous radio frequency front end model for microwave photonic array system |
CN113707164A (en) * | 2021-09-02 | 2021-11-26 | 哈尔滨理工大学 | Voice enhancement method for improving multi-resolution residual error U-shaped network |
CN113782043B (en) * | 2021-09-06 | 2024-06-14 | 北京捷通华声科技股份有限公司 | Voice acquisition method, voice acquisition device, electronic equipment and computer readable storage medium |
CN117935826B (en) * | 2024-03-22 | 2024-07-05 | 深圳市东微智能科技股份有限公司 | Audio up-sampling method, device, equipment and storage medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6463406B1 (en) * | 1994-03-25 | 2002-10-08 | Texas Instruments Incorporated | Fractional pitch method |
EP1008984A3 (en) * | 1998-12-11 | 2000-08-02 | Sony Corporation | Windband speech synthesis from a narrowband speech signal |
CN106537499A (en) * | 2014-07-28 | 2017-03-22 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for generating an enhanced signal using independent noise-filling |
Non-Patent Citations (4)
Title |
---|
End-to-end Sound Source Separation Conditioned on Instrument Labels;Olga Slizovskaia et al.;《 ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)》;20190417;第306-310页 * |
Online Singing Voice Separation Using a Recurrent One-dimensional U-NET Trained with Deep Feature Losses;Clement S. J. Doire et al.;《ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)》;20190417;第3752-3754页 * |
Simultaneous Arteriole and Venule Segmentation of Dual-Modal Fundus Images Using a Multi-Task Cascade Network;Zhang, Shulin et al.;《IEEE ACCESS》;20190514;第57561-57565页 * |
Mixed-signal spectrum separation with a semantic segmentation network (语义分割网络下的混合信号频谱分离); Ma Song et al.; Telecommunication Engineering (《电讯技术》); 20200428; pp. 413-420 *
Also Published As
Publication number | Publication date |
---|---|
CN110610717A (en) | 2019-12-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110610717B (en) | Separation method of mixed signals in complex frequency spectrum environment | |
CN110927706B (en) | Convolutional neural network-based radar interference detection and identification method | |
CN109307862B (en) | Target radiation source individual identification method | |
CN111783558A (en) | A kind of intelligent identification method and system of satellite navigation jamming signal type | |
Wang et al. | Radar emitter recognition based on the short time Fourier transform and convolutional neural networks | |
CN110532932B (en) | A method for identifying intra-pulse modulation of multi-component radar signals | |
CN111610488B (en) | Random array angle of arrival estimation method based on deep learning | |
CN114580476B (en) | A method for constructing a recognition model of drone signals and a corresponding recognition method and system | |
CN108764077A (en) | A kind of digital signal modulated sorting technique based on convolutional neural networks | |
CN113312996B (en) | Detection and identification method for aliasing short-wave communication signals | |
Chen et al. | Automatic modulation classification of radar signals utilizing X-net | |
CN111222442A (en) | Electromagnetic signal classification method and device | |
CN112686297B (en) | A method and system for classifying the motion state of a radar target | |
CN112764003A (en) | Radar radiation source signal time-frequency feature identification method and device and storage medium | |
CN114282576B (en) | Radar signal modulation format recognition method and device based on time-frequency analysis and denoising | |
CN111181574A (en) | A method, device and device for endpoint detection based on multi-layer feature fusion | |
CN106971392A (en) | A kind of combination DT CWT and MRF method for detecting change of remote sensing image and device | |
Li et al. | Deep Learning and Time-Frequency Analysis Based Automatic Low Probability of Intercept Radar Waveform Recognition Method | |
US10885928B1 (en) | Mixed domain blind source separation for sensor array processing | |
Hinderer | Blind source separation of radar signals in time domain using deep learning | |
CN114841195B (en) | Avionics space signal modeling method and system | |
Huynh-The et al. | WaveNet: Towards Waveform Classification in Integrated Radar-Communication Systems with Improved Accuracy and Reduced Complexity | |
CN115345216A (en) | FMCW radar interference elimination method fusing prior information | |
Chen et al. | Radar intra-pulse modulation signal classification using CNN embedding and relation network under small sample set | |
CN116340807B (en) | Broadband Spectrum Signal Detection and Classification Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |