CN105590633A - Method and device for generation of labeled melody for song scoring - Google Patents
- Publication number: CN105590633A
- Application number: CN201510784342.1A
- Authority: CN (China)
- Prior art keywords: track, energy, energy distribution, accompaniment, distribution spectrum
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G10L25/81—Detection of presence or absence of voice signals for discriminating voice from music
- G10H1/00—Details of electrophonic musical instruments
- G10L25/18—Speech or voice analysis techniques characterised by the extracted parameters being spectral information of each sub-band
- G10L25/21—Speech or voice analysis techniques characterised by the extracted parameters being power information
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/091—Musical analysis for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance
Abstract
The present invention provides a method and device for generating a labeled melody for song scoring, and relates to information extraction from audio data, in particular the extraction of a labeled melody from songs. The method comprises the following steps: S010, obtaining a segment of real signal X0 from the original audio track and the corresponding segment of real signal X1 from the accompaniment audio track; S020, applying a windowed discrete Fourier transform to X0 and X1 to obtain an energy distribution spectrum X0' for the original track and an energy distribution spectrum X1' for the accompaniment track; and S030, calculating the energy difference between the original track and the accompaniment track in each frequency band, and deriving a vocal energy distribution spectrum Xmag_diff from that difference. The present invention thereby provides a method for generating the pitch content of a music score.
Description
Technical field
The present invention relates to information extraction from audio data, and in particular to a method for generating a music score from a song.
Background art
Music is a major product of human civilization; it is not only an art form but also a social culture. Different music has different social effects, and outstanding music can cultivate taste and elevate the spirit. The music industry occupies a huge share of the global entertainment industry and is closely tied to the film, television, game, and animation industries.

Music comes in many forms, of which songs are by far the most common. In terms of content, a song consists largely of three parts: lyrics, melody, and arrangement. The melody is the most distinctive feature of a song and the main point of difference between songs. A song's audio is composed of an accompaniment spectrum and a vocal spectrum, and the vocal part is the most critical element of a song.

As the most critical element of a song, the vocal spectrum is the foundation of various content-based music information retrieval and comparison functions, such as singing search, originality comparison of musical works, and recommendation algorithms based on music similarity. It is also important material in music education and musical composition.

In realizing the present invention, the inventor found three existing methods for obtaining the vocal spectrum of a song. The first is to obtain it directly from the record company that owns the song; however, in most cases record companies do not release the original vocal spectrum, so this method is usually unavailable.

The second is to have musically trained staff transcribe it by ear. Although this yields the highest accuracy, it is primitive and inefficient: it cannot be performed quickly or automatically, its labor cost is high, and it is especially unsuited to processing songs in large batches.

The third is to extract the vocal spectrum from the perspective of audio signal processing, based on the acoustic features of voices and of various instruments, or using supervised or unsupervised machine learning. However, in current music production, the individual vocal and instrumental tracks may each be processed with various effects before mixdown, and further unknown effects may be applied during mixing and mastering, so the problem becomes semi-blind or fully blind source separation. This makes the approach difficult, and the resulting vocal spectrum is not very accurate.

None of these three methods can automatically and efficiently compute the vocal spectra of a massive number of songs in batches.
Summary of the invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an exhaustive overview of all contemplated aspects, and is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that follows.

To this end, there is a need for a method and device that can automatically and efficiently compute the vocal spectra of a massive number of songs in batches.

To achieve the above object, the inventor provides a music score generation method for song scoring, characterized by comprising the steps of: S010, obtaining a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track; S020, applying a windowed discrete Fourier transform to X0 and X1 to obtain an energy distribution spectrum X0' for the original track and an energy distribution spectrum X1' for the accompaniment track; S030, calculating from X0' and X1' the energy difference between the original track and the accompaniment track in each frequency band, and deriving a vocal energy distribution spectrum Xmag_diff from that difference; and S040, calculating a fundamental frequency from Xmag_diff. The song is divided into segments, steps S010 to S040 are performed on each segment to obtain its fundamental frequency, and the per-segment fundamental frequencies are spliced in time order to obtain a music score for song scoring.

Unlike the prior art, the above scheme computes the vocal energy component from the real signal X0 of the original track and the corresponding real signal X1 of the accompaniment track, and then determines the frequency (i.e. the pitch) of the voice from that vocal energy. This cancels the influence of the various voices, instruments, and effects mixed into the accompaniment and increases the accuracy of voice identification. The method can process songs automatically and efficiently in batches to obtain the score of the vocal part, which can in turn be used by a singing scoring system. To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative aspects of the one or more aspects; these are indicative, however, of but a few of the various ways in which the principles of the various aspects may be employed, and this description is intended to cover all such aspects and their equivalents.
Brief description of the drawings
Describe disclosed aspect below with reference to accompanying drawing, it is non-limiting disclosed side in order to illustrate that accompanying drawing is providedFace, in accompanying drawing, similar label indicates similar key element, and therein:
Fig. 1 is a kind of implementation method of the present invention;
Fig. 2 is original singer's track and the accompaniment track schematic diagram of a certain first song;
Fig. 3 obtains the Energy distribution spectrum X0 ' of corresponding original singer's track and the Energy distribution spectrum X1 ' of corresponding accompaniment track;
Fig. 4 is the voice Energy distribution spectrum X obtainingmag_diff;
Fig. 5 is the music score of Chinese operas for song scoring obtaining;
Fig. 6 is module map corresponding to one embodiment of the present invention.
Description of reference numerals:
10, preprocessing module;

20, real signal acquisition module;

30, energy calculation module;

40, fundamental frequency calculation module;

50, score synthesis module.
Detailed description of the invention
To explain in detail the technical content, structural features, objects, and effects of the technical scheme, a detailed description is given below with reference to specific embodiments and the accompanying drawings. In the following description, numerous specific details are set forth for purposes of explanation in order to provide a thorough understanding of one or more aspects; it will be evident, however, that such aspects can be practiced without these specific details.

The present invention provides a music score generation method for song scoring. Referring to Fig. 1, the steps are as follows:

S010: obtain a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track;

S020: apply a windowed Fourier transform to the real signals X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track;

S030: from the energy distribution spectra X0' and X1', calculate the energy difference between the original track and the accompaniment track in each frequency band, and derive the vocal energy distribution spectrum Xmag_diff from the difference;

S040: calculate the fundamental frequency from the vocal energy distribution spectrum Xmag_diff.

The song is divided into segments and steps S010 to S040 above are performed on each segment, yielding a fundamental frequency for each segment; the per-segment fundamental frequencies are spliced in time order to obtain the music score for song scoring.
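The S010-S040 flow above can be sketched end to end in Python with NumPy. Everything below is illustrative rather than the patent's reference implementation: the function name, the sample rate, and in particular the simple spectral-peak pick, which stands in for the harmonic search that S040 actually describes.

```python
import numpy as np

def score_melody(original, accompaniment, sr=44100, seg_len=4096, hop=256):
    """End-to-end sketch of S010-S040: per windowed segment, subtract the
    accompaniment's magnitude spectrum from the original track's, pick a
    pitch from the residual, and splice the estimates in time order."""
    window = np.hamming(seg_len)
    pitches = []
    for start in range(0, len(original) - seg_len + 1, hop):
        x0 = original[start:start + seg_len]       # S010: original-track segment
        x1 = accompaniment[start:start + seg_len]  # S010: matching accompaniment segment
        X0 = np.abs(np.fft.rfft(x0 * window))      # S020: windowed DFT -> energy spectrum
        X1 = np.abs(np.fft.rfft(x1 * window))
        xmag_diff = np.maximum(X0 - X1, 0.0)       # S030: vocal energy spectrum
        k = int(np.argmax(xmag_diff[1:])) + 1      # S040: crude peak pick (stand-in for
        pitches.append(k * sr / seg_len)           #       the harmonic search), bin -> Hz
    return pitches                                 # spliced in time order

# toy check: a tone present only in the "original" track is recovered
sr, n = 44100, 16384
t = np.arange(n) / sr
voice = np.sin(2 * np.pi * 440.0 * t)
accomp = 0.5 * np.sin(2 * np.pi * 110.0 * t)
melody = score_melody(voice + accomp, accomp, sr=sr)
```

On this synthetic input, the shared 110 Hz accompaniment energy cancels in the subtraction and each segment's estimate lands near the 440 Hz "vocal" tone (up to the resolution of one FFT bin).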
The vocal energy distribution spectrum Xmag_diff is also called the vocal magnitude spectrum.
In some embodiments, the method is as follows. The real signal of the original track and the real signal of the accompaniment track of a song are obtained, and a windowed Fourier transform is applied to each, computing the spectrum of the short signal inside the window; the Fourier transform thus yields the frequency-domain distribution (i.e. the energy spectrum) over a period of time. The preferred analysis window length is 4096 samples, with a hop length of 256 samples. For example, Fig. 2 shows, for a certain song, the real signal X0 of the original track and the real signal X1 of the accompaniment track used in the windowed Fourier transform. X0 and X1 are short signals of 4096 samples (corresponding to the 1:26.600-1:26.685 portion of the song). After obtaining X0 and X1, a Hamming-windowed Fourier transform is applied to each, yielding the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track. The spectra X0' and X1' obtained by Fourier-transforming 4096 consecutive samples of the song are shown in Fig. 3 (the upper curve is X0', the lower curve is X1').
The Fourier transform applied to the real signals X0 and X1 may be:
X0’=fft(x0·w)
X1’=fft(x1·w)
It will be understood that other Fourier transform implementations, or improved variants of them, may be used to obtain the energy distribution from the real signals. With a different algorithm, the resulting X0' and X1' may differ from those shown in the figure.
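Concretely, the X0' = fft(x0·w) step with the Hamming window named in the description could be written as follows in NumPy; the random segments here merely stand in for real track data:

```python
import numpy as np

N = 4096                      # preferred analysis window length from the text
w = np.hamming(N)             # the window w

rng = np.random.default_rng(0)
x0 = rng.standard_normal(N)   # stand-in for an original-track segment
x1 = rng.standard_normal(N)   # stand-in for an accompaniment segment

X0p = np.fft.fft(x0 * w)      # X0' = fft(x0 . w)
X1p = np.fft.fft(x1 * w)      # X1' = fft(x1 . w)

# the "energy distribution spectrum" compared in S030 is the per-bin
# magnitude of these complex spectra
mag0 = np.abs(X0p)
mag1 = np.abs(X1p)
```

With NumPy's unnormalized FFT convention, the windowed signal's energy equals the spectrum's energy divided by N (Parseval), which is a quick sanity check on the "energy distribution" reading.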
The vocal energy distribution spectrum Xmag_diff of the original track and the accompaniment track is calculated from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track. The preferred calculation is:

Formula 1: Xmag_diff(i) = |X0'(i)| - |X1'(i)|, where i = 1, 2, ..., N.

It will be understood that the right-hand side of the equation may be multiplied by an arbitrary constant; all such forms are variants of this method. For example, a variant may be:

Formula 2: Xmag_diff(i) = c·(|X0'(i)| - |X1'(i)|), where i = 1, 2, ..., N and c is an arbitrary constant.
The vocal energy distribution spectrum Xmag_diff computed from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, for a certain segment of a certain song, is shown in Fig. 4. It will be understood that with different calculation methods the resulting spectrum may differ.
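A minimal NumPy sketch of this per-band subtraction follows. The zero floor and the scale constant `c` are illustrative choices consistent with the surrounding text (a plain magnitude difference times an arbitrary constant), not the patent's exact published formula:

```python
import numpy as np

def vocal_magnitude_spectrum(X0p, X1p, c=1.0):
    """Per-bin difference of the two magnitude spectra, floored at zero and
    optionally scaled by an arbitrary constant c (the text notes that any
    constant multiple is a variant of the method)."""
    return c * np.maximum(np.abs(X0p) - np.abs(X1p), 0.0)

# bins where the original track carries extra (vocal) energy survive;
# bins dominated by shared accompaniment energy cancel out
X0p = np.array([3.0 + 0j, 1.0 + 0j, 5.0j])  # toy original-track spectrum
X1p = np.array([3.0 + 0j, 2.0 + 0j, 1.0j])  # toy accompaniment spectrum
diff = vocal_magnitude_spectrum(X0p, X1p)
```

In the toy example the first bin cancels exactly, the second is clipped to zero rather than going negative, and only the third bin (where the original track has extra energy) survives.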
Optionally, calculating the fundamental frequency from the vocal energy distribution spectrum Xmag_diff comprises the following specific steps:

For each sampled frequency in the vocal range, combine the vocal energy distribution spectrum Xmag_diff to calculate the energy-weighted average sum maxAvgDb for that sampled frequency. Then find the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled frequencies; the harmonic series corresponding to this maximum is bestOfBestFreq, and the frequency corresponding to bestOfBestFreq is the fundamental frequency.

Calculating the energy-weighted average sum for a sampled frequency comprises: enumerating the possible harmonic series of that sampled frequency and calculating the energy-weighted average sum avgDb for each, and finding the maximum maxAvgDb among the avgDb values of the harmonic series; the harmonic series corresponding to this maximum is bestFreq, whose frequency is the most probable fundamental for that sampled frequency. In other embodiments, if the maximum maxOfMaxAvgDbs is less than a set value, no tone is generated for the segment, and the period is taken to contain no voice. The set value may differ with the calculation methods used: it is affected by the method used to compute the energy distribution spectra X0' and X1', the method used to compute the vocal energy distribution spectrum Xmag_diff, and the method used to calculate the fundamental frequency from Xmag_diff.
The above method of calculating the fundamental frequency from the vocal energy distribution spectrum Xmag_diff can also be expressed in pseudocode.
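The published pseudocode figure does not survive in this text. A Python sketch of the harmonic-average search described above, under illustrative assumptions (candidate range 80-1000 Hz, 1 Hz step, five harmonics, a plain mean standing in for the energy-weighted average), might read:

```python
import numpy as np

def estimate_f0(xmag_diff, sr, n_fft, f_lo=80.0, f_hi=1000.0, step=1.0,
                n_harmonics=5, min_avg=0.0):
    """For each candidate fundamental in the vocal range, average the
    vocal-spectrum energy over its first harmonics (avgDb); the candidate
    with the largest average (maxOfMaxAvgDbs) wins.  Below min_avg the
    segment is treated as voiceless and yields no tone (None)."""
    best_f0, best_avg = None, -np.inf
    for f0 in np.arange(f_lo, f_hi, step):
        # frequency bins of the candidate's harmonic series
        bins = [int(round(h * f0 * n_fft / sr)) for h in range(1, n_harmonics + 1)]
        bins = [b for b in bins if b < len(xmag_diff)]
        avg = float(np.mean(xmag_diff[bins]))
        if avg > best_avg:
            best_avg, best_f0 = avg, f0
    if best_avg < min_avg:
        return None   # maxOfMaxAvgDbs below the set value: no voice here
    return float(best_f0)
```

Scoring whole harmonic series rather than single peaks is what lets this search prefer the true fundamental over a strong individual overtone; the `min_avg` threshold mirrors the "no tone below a set value" rule in the text.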
The music score for song scoring obtained by applying the above method to a certain song is shown in Fig. 5 (the figure shows only a fragment of the score).
By the above method, the vocal energy component is computed from the real signal X0 of the original track and the real signal X1 of the accompaniment track, and the frequency (i.e. the pitch) of the voice is determined from the vocal energy. This cancels the influence of the various voices, instruments, and effects mixed into the accompaniment and increases the accuracy of voice identification. The method can process songs automatically and efficiently in batches to obtain the score of the vocal part, which can in turn be used by a singing scoring system.
Before the step of obtaining a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track, the method further comprises a step of separating the original track and the accompaniment track from an MPG-format music video (MV).

The original track and the accompaniment track are separated from the MPG-format MV. In addition, each two-channel track is reduced to a monaural original track and a monaural accompaniment track by extracting its principal component with PCA (principal component analysis). As shown in Fig. 2, the one-dimensional real signal in the upper half is the extracted original-track mono signal, and the one-dimensional real signal in the lower half is the extracted accompaniment mono signal.
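The PCA reduction mentioned here can be sketched as follows; `stereo_to_mono_pca` is an illustrative name, and a 2xN array of channel samples is assumed:

```python
import numpy as np

def stereo_to_mono_pca(stereo):
    """Reduce a two-channel track (shape 2 x N) to one channel by projecting
    onto its first principal component, as the preprocessing step suggests.
    A plain channel average is a simpler but lossier alternative."""
    x = stereo - stereo.mean(axis=1, keepdims=True)   # center each channel
    cov = x @ x.T / x.shape[1]                        # 2x2 channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    principal = eigvecs[:, np.argmax(eigvals)]        # dominant channel direction
    return principal @ x                              # project to mono

# sanity run: two channels carrying the same signal at different gains
rng = np.random.default_rng(1)
common = rng.standard_normal(1000)
stereo = np.vstack([0.8 * common, 0.6 * common]) + 0.01 * rng.standard_normal((2, 1000))
mono = stereo_to_mono_pca(stereo)
```

Because the two channels are dominated by the same underlying signal, the principal direction aligns with it and the projected mono signal tracks it almost exactly (up to an arbitrary sign from the eigenvector).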
The inventor also provides an electronic device for generating a music score for song scoring, characterized by comprising a real signal acquisition module, an energy calculation module, a fundamental frequency calculation module, and a score synthesis module.

The real signal acquisition module is configured to obtain a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track.

The energy calculation module is configured to apply a Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, to calculate from X0' and X1' the energy difference between the original track and the accompaniment track in each frequency band, and to derive the vocal energy distribution spectrum Xmag_diff from the difference.

The fundamental frequency calculation module is configured to calculate the fundamental frequency from the vocal energy distribution spectrum Xmag_diff.

The score synthesis module is configured to combine the fundamental frequencies calculated by the fundamental frequency calculation module into a music score for song scoring.
In other embodiments, the energy calculation module preferably obtains the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track by a Fourier transform method.

In other embodiments, the energy calculation module preferably calculates the vocal energy distribution spectrum from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track by the calculation formula Xmag_diff(i) = |X0'(i)| - |X1'(i)|, where i = 1, 2, ..., N.

In other embodiments, the fundamental frequency calculation module preferably, for each sampled frequency in the vocal range, combines the vocal energy distribution spectrum Xmag_diff to calculate the energy-weighted average sum maxAvgDb for that sampled frequency, and finds the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled frequencies; the harmonic series corresponding to this maximum is bestOfBestDiv, and the frequency corresponding to bestOfBestDiv is the fundamental frequency. Calculating the energy-weighted average sum for a sampled frequency comprises: enumerating the possible harmonic series of that sampled frequency and calculating the energy-weighted average sum avgDb for each, and finding the maximum maxAvgDb among the avgDb values; the harmonic series corresponding to this maximum is bestDiv, and the frequency of bestDiv is the most probable fundamental for that sampled frequency. If the maximum maxOfMaxAvgDbs is less than a set value, the corresponding segment contains no vocal component.
In other embodiments, the device preferably further comprises a preprocessing module configured to separate the original track and the accompaniment track in a song file.

It will be understood that the song file may be a video file or an audio file. It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. The terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Absent further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises it. In addition, herein, "greater than", "less than", "exceeds", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it.
Those skilled in the art will understand that the above embodiments may be provided as a method, a device, or a computer program product, and may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. All or part of the steps of the methods in the above embodiments may be completed by hardware under the instruction of a program, which may be stored in a storage medium readable by a computer device and used to carry out all or part of the steps of the above embodiments. The computer device includes, but is not limited to: a personal computer, a server, a general-purpose computer, a special-purpose computer, a network device, an embedded device, a programmable device, an intelligent mobile terminal, a smart home device, a wearable smart device, an in-vehicle smart device, etc. The storage medium includes, but is not limited to: RAM, ROM, magnetic disk, magnetic tape, optical disk, flash memory, USB drive, portable hard disk, memory card, memory stick, network server storage, network cloud storage, etc.

The above embodiments are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer device to produce a machine, such that the instructions executed by the processor create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-device-readable memory capable of directing a computer device to work in a particular manner, such that the instructions stored in that memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer device, so that a series of operational steps are performed on the computer device to produce computer-implemented processing, whereby the instructions executed on the computer device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although the above embodiments have been described, once those skilled in the art have learned the basic inventive concept, they may make other changes and modifications to these embodiments. The foregoing is therefore only an embodiment of the present invention and does not limit its scope of patent protection; every equivalent structure or equivalent process transformation made using the description and drawings of the present invention, whether used directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (11)
1. A music score generation method for song scoring, characterized by comprising the steps of:

S010, obtaining a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track;

S020, applying a windowed discrete Fourier transform to the real signals X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track;

S030, calculating, from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, the energy difference between the original track and the accompaniment track in each frequency band, and deriving the vocal energy distribution spectrum Xmag_diff from the difference;

S040, calculating a fundamental frequency from the vocal energy distribution spectrum Xmag_diff;

dividing the song into segments and performing the above steps S010 to S040 on each segment to obtain a fundamental frequency for each segment, and splicing the per-segment fundamental frequencies in time order to obtain a music score for song scoring.
2. a kind of music score of Chinese operas generation method for song scoring as claimed in claim 1, is characterized in that, described according to original singerThe Energy distribution spectrum X1 ' of the Energy distribution spectrum X0 ' of track and accompaniment track calculates people's acoustic energy of original singer's track and accompaniment trackDistribution profile Xmag_diff, be specially:
Xmag_diff(i) = |X0′(i)| − |X1′(i)|,
wherein i = 1, 2, …, N.
3. The labeled-melody generation method for song scoring according to claim 1, characterized in that applying the windowed Fourier transform to the real signals X0 and X1 to obtain the energy distribution spectrum X0′ of the original track and the energy distribution spectrum X1′ of the accompaniment track is specifically:
X0′ = fft(x0 · w)
X1′ = fft(x1 · w).
4. The labeled-melody generation method for song scoring according to claim 1, characterized in that calculating the fundamental frequency from the vocal energy distribution spectrum Xmag_diff comprises the steps of:
for each sampled frequency in the vocal frequency range, calculating, in combination with the vocal energy distribution spectrum Xmag_diff, the energy weighted-average sum maxAvgDb of the corresponding sampled frequency band; and calculating the maximum maxOfMaxAvgDbs among the energy weighted-average sums maxAvgDb of all sampled frequency bands, the harmonic corresponding to this maximum being the harmonic bestOfBestFreq, and the frequency corresponding to the harmonic bestOfBestFreq being the fundamental frequency;
wherein calculating the energy weighted-average sum of a sampled frequency band comprises the steps of: calculating the possible harmonics of the sampled frequency band and the energy weighted-average sum avgDb corresponding to each harmonic; and calculating the maximum maxAvgDb among the energy weighted-average sums avgDb of the harmonics, the harmonic corresponding to this maximum maxAvgDb being bestFreq, and the frequency corresponding to the harmonic bestFreq being the most probable fundamental frequency of the sampled frequency band.
5. The labeled-melody generation method for song scoring according to claim 4, characterized in that if the maximum maxOfMaxAvgDbs is less than a set value, no tone is generated for the corresponding period.
6. The labeled-melody generation method for song scoring according to claim 1, characterized in that, before the step of obtaining a segment of real signal X0 from the original track and the corresponding segment of real signal X1 from the accompaniment track, the method further comprises the step of separating the original track and the accompaniment track of the song file.
7. An electronic device for generating a labeled melody for song scoring, characterized in that it comprises a real-signal acquisition module, an energy calculation module, a fundamental-frequency calculation module, and a melody synthesis module;
the real-signal acquisition module is configured to obtain a segment of real signal X0 from the original track and the segment of real signal X1 from the accompaniment track that corresponds to X0;
the energy calculation module is configured to apply a Fourier transform to the real signals X0 and X1 to obtain the energy distribution spectrum X0′ of the original track and the energy distribution spectrum X1′ of the accompaniment track, to calculate from X0′ and X1′ the difference between the energies of the original track and of the accompaniment track in each frequency band, and to obtain the vocal energy distribution spectrum Xmag_diff from this difference;
the fundamental-frequency calculation module is configured to calculate the fundamental frequency from the vocal energy distribution spectrum Xmag_diff;
and the melody synthesis module is configured to combine the fundamental frequencies calculated by the fundamental-frequency calculation module into the labeled melody for song scoring.
8. The electronic device according to claim 7, characterized in that the energy calculation module is configured to obtain, by the Fourier transform method, the energy distribution spectrum X0′ of the original track and the energy distribution spectrum X1′ of the accompaniment track.
9. The electronic device according to claim 7, characterized in that the energy calculation module is configured to calculate the vocal energy distribution spectrum from the energy distribution spectrum X0′ and the energy distribution spectrum X1′ of the accompaniment track, the calculation formula being:
Xmag_diff(i) = |X0′(i)| − |X1′(i)|, wherein i = 1, 2, …, N.
10. The electronic device according to claim 7, characterized in that the fundamental-frequency calculation module is configured to: for each sampled frequency band in the vocal frequency range, calculate, in combination with the vocal energy distribution spectrum Xmag_diff, the energy weighted-average sum maxAvgDb of the corresponding sampled frequency band; and calculate the maximum maxOfMaxAvgDbs among the energy weighted-average sums maxAvgDb of all sampled frequency bands, the harmonic corresponding to this maximum being the harmonic bestOfBestFreq, and the frequency corresponding to the harmonic bestOfBestFreq being the fundamental frequency; wherein calculating the energy weighted-average sum of a sampled frequency band comprises the steps of: calculating the possible harmonics of the sampled frequency band and the energy weighted-average sum avgDb corresponding to each harmonic, and calculating the maximum maxAvgDb among the energy weighted-average sums avgDb of the harmonics, the harmonic corresponding to this maximum maxAvgDb being bestFreq, and the frequency corresponding to the harmonic bestFreq being the most probable fundamental frequency of the sampled frequency band; and wherein, if the maximum maxOfMaxAvgDbs is less than a set value, the corresponding segment contains no vocal component.
11. The electronic device according to claim 7, characterized in that it further comprises a preprocessing module configured to separate the original track and the accompaniment track of the song file.
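The segment-wise flow of steps S010 to S040 in claim 1 can be sketched as follows. This is a minimal NumPy sketch, not the patented implementation: the function name `vocal_f0_track`, the fixed segment length, the Hann window, the 80–1000 Hz vocal range, and the strongest-bin picker (standing in for the harmonic search of claims 4 and 10) are all illustrative assumptions.

```python
import numpy as np

def vocal_f0_track(x0, x1, sr, seg_len=2048):
    """Sketch of S010-S040: per-segment fundamental frequency from the
    spectral difference between the original track x0 (vocals plus
    accompaniment) and the time-aligned accompaniment track x1."""
    w = np.hanning(seg_len)                        # S020: analysis window
    freqs = np.fft.rfftfreq(seg_len, 1.0 / sr)
    band = (freqs >= 80.0) & (freqs <= 1000.0)     # assumed vocal range
    f0s = []
    for start in range(0, len(x0) - seg_len + 1, seg_len):
        X0 = np.abs(np.fft.rfft(x0[start:start + seg_len] * w))  # original
        X1 = np.abs(np.fft.rfft(x1[start:start + seg_len] * w))  # accompaniment
        x_mag_diff = X0 - X1          # S030: per-bin energy difference
        # S040 (simplified): pick the strongest bin in the vocal range
        f0s.append(freqs[band][np.argmax(x_mag_diff[band])])
    # fundamentals are already in time order, i.e. spliced per claim 1
    return np.array(f0s)
```

The returned per-segment fundamentals, taken in time order, correspond to the labeled melody of claim 1; a fuller implementation would replace the strongest-bin picker with the harmonic weighted-average search of claims 4 and 10.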
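Claim 3's windowed transform (X0′ = fft(x0 · w), X1′ = fft(x1 · w)) can be illustrated directly. The helper name `windowed_fft` and the Hann default are assumptions; the claims fix only the multiply-then-FFT structure, not the window shape.

```python
import numpy as np

def windowed_fft(x, w=None):
    """Windowed DFT of claim 3: X' = fft(x * w), defaulting to a Hann window."""
    if w is None:
        w = np.hanning(len(x))
    return np.fft.fft(x * w)
```

For a sinusoid centred on a DFT bin, the windowed spectrum still peaks at that bin; the window merely suppresses leakage into distant bins.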
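The harmonic search of claims 4 and 10 can be sketched as below. The function name `best_f0`, the 1 Hz candidate step, the five-harmonic limit, and the unweighted mean (standing in for the claims' energy weighted-average sum avgDb) are illustrative assumptions; the returned score plays the role of maxOfMaxAvgDbs, which claim 5 compares against a set value to decide whether a tone is generated.

```python
import numpy as np

def best_f0(x_mag_diff, freqs, f_lo=80.0, f_hi=1000.0, n_harm=5, step=1.0):
    """For each candidate fundamental in the vocal range, average the vocal
    spectrum's energy over its first n_harm harmonics and keep the best."""
    best_f, best_avg = None, -np.inf
    for f in np.arange(f_lo, f_hi, step):
        harm = f * np.arange(1, n_harm + 1)
        harm = harm[harm <= freqs[-1]]         # drop harmonics beyond the grid
        idx = np.searchsorted(freqs, harm)     # first bin >= each harmonic
        avg = x_mag_diff[np.clip(idx, 0, len(freqs) - 1)].mean()  # ~avgDb
        if avg > best_avg:
            best_f, best_avg = f, avg
    return best_f, best_avg   # best_avg ~ maxOfMaxAvgDbs; threshold per claim 5
```

On a toy spectrum with peaks at 200, 400 and 600 Hz, the 200 Hz candidate wins because three of its five harmonics land on peaks, whereas 100 Hz hits only two and 400 or 600 Hz only one.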
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510784342.1A CN105590633A (en) | 2015-11-16 | 2015-11-16 | Method and device for generation of labeled melody for song scoring |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105590633A true CN105590633A (en) | 2016-05-18 |
Family
ID=55930155
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510784342.1A Pending CN105590633A (en) | 2015-11-16 | 2015-11-16 | Method and device for generation of labeled melody for song scoring |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105590633A (en) |
Citations (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1148230A (*) | 1995-04-18 | 1997-04-23 | Texas Instruments Incorporated | Method and system for karaoke scoring
US6057502A (*) | 1999-03-30 | 2000-05-02 | Yamaha Corporation | Apparatus and method for recognizing musical chords
US20030106413A1 (*) | 2001-12-06 | 2003-06-12 | Ramin Samadani | System and method for music identification
US20050065781A1 (*) | 2001-07-24 | 2005-03-24 | Andreas Tell | Method for analysing audio signals
CN1607575A (*) | 2003-10-16 | 2005-04-20 | ALi Corporation | Humming arrangement system and method
CN1924992A (*) | 2006-09-12 | 2007-03-07 | Dongguan BBK Audio-Visual Electronics Co., Ltd. | Method for playing karaoke vocals
CN1945689A (*) | 2006-10-24 | 2007-04-11 | Beijing Vimicro Corporation | Method and device for extracting accompaniment music from songs
CN101238511A (*) | 2005-08-11 | 2008-08-06 | Asahi Kasei Corporation | Sound source separation device, speech recognition device, mobile telephone, sound source separation method, and program
CN101894552A (*) | 2010-07-16 | 2010-11-24 | Anhui USTC iFlytek Co., Ltd. | Singing evaluation system based on speech-spectrum segmentation
CN101944355A (*) | 2009-07-03 | 2011-01-12 | Shenzhen TCL New Technology Co., Ltd. | Obbligato music generation device and realization method thereof
CN102054480A (*) | 2009-10-29 | 2011-05-11 | Beijing Institute of Technology | Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)
CN102682762A (*) | 2011-03-15 | 2012-09-19 | Agency for Science, Technology and Research (A*STAR) | Harmony synthesizer and method for harmonizing vocal signals
CN103426433A (*) | 2012-05-14 | 2013-12-04 | HTC Corporation | Noise elimination method
US20140039891A1 (*) | 2007-10-16 | 2014-02-06 | Adobe Systems Incorporated | Automatic separation of audio data
CN103680517A (*) | 2013-11-20 | 2014-03-26 | Huawei Technologies Co., Ltd. | Method, device and equipment for processing audio signals
CN104134444A (*) | 2014-07-11 | 2014-11-05 | Fujian Star-net eVideo Information System Co., Ltd. | Song accompaniment removing method and device based on MMSE
CN104219556A (*) | 2014-09-12 | 2014-12-17 | Beijing Yangguang Shihan Technology Co., Ltd. | Method of using a four-track karaoke identification playing system
CN104538011A (*) | 2014-10-30 | 2015-04-22 | Huawei Technologies Co., Ltd. | Tone adjusting method and device and terminal device
CN104683933A (*) | 2013-11-29 | 2015-06-03 | Dolby Laboratories Licensing Corporation | Audio object extraction method
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107910019A (*) | 2017-11-30 | 2018-04-13 | Institute of Microelectronics, Chinese Academy of Sciences | Human body sound signal processing and analyzing method |
WO2020015411A1 (*) | 2018-07-18 | 2020-01-23 | Alibaba Group Holding Limited | Method and device for training adaptation level evaluation model, and method and device for evaluating adaptation level |
US11074897B2 | 2018-07-18 | 2021-07-27 | Advanced New Technologies Co., Ltd. | Method and apparatus for training adaptation quality evaluation model, and method and apparatus for evaluating adaptation quality |
US11367424B2 | 2018-07-18 | 2022-06-21 | Advanced New Technologies Co., Ltd. | Method and apparatus for training adaptation quality evaluation model, and method and apparatus for evaluating adaptation quality |
CN109300485A (*) | 2018-11-19 | 2019-02-01 | Beijing Dajia Internet Information Technology Co., Ltd. | Scoring method and device for audio signal, electronic device, and computer storage medium |
CN109300485B (*) | 2018-11-19 | 2022-06-10 | Beijing Dajia Internet Information Technology Co., Ltd. | Scoring method and device for audio signal, electronic equipment and computer storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111400543B (en) | Audio fragment matching method, device, equipment and storage medium | |
Nanni et al. | Combining visual and acoustic features for audio classification tasks | |
Lehner et al. | On the reduction of false positives in singing voice detection | |
Schlüter | Learning to Pinpoint Singing Voice from Weakly Labeled Examples. | |
Lu et al. | Fog computing approach for music cognition system based on machine learning algorithm | |
Wu et al. | Combining visual and acoustic features for music genre classification | |
CN1979491A (en) | Method for music mood classification and system thereof | |
CN103871426A (en) | Method and system for comparing similarity between user audio frequency and original audio frequency | |
CN105679324A (en) | Voiceprint identification similarity scoring method and apparatus | |
CN111309965A (en) | Audio matching method and device, computer equipment and storage medium | |
US20240177697A1 (en) | Audio data processing method and apparatus, computer device, and storage medium | |
CN104282316A (en) | Karaoke scoring method based on voice matching, and device thereof | |
CN111445922B (en) | Audio matching method, device, computer equipment and storage medium | |
CN105590633A (en) | Method and device for generation of labeled melody for song scoring | |
CN107767850A (en) | A kind of singing marking method and system | |
Dhall et al. | Music genre classification with convolutional neural networks and comparison with f, q, and mel spectrogram-based images | |
Dandawate et al. | Indian instrumental music: Raga analysis and classification | |
Mutiara et al. | Musical genre classification using SVM and audio features | |
Felipe et al. | Acoustic scene classification using spectrograms | |
Smaragdis | Polyphonic pitch tracking by example | |
Yang et al. | [Retracted] Research Based on the Application and Exploration of Artificial Intelligence in the Field of Traditional Music | |
MX2022006798A (en) | Method for music generation, electronic device, and storage medium | |
Wu et al. | Gabor-lbp features and combined classifiers for music genre classification | |
Shirali-Shahreza et al. | Fast and scalable system for automatic artist identification | |
Kiran | Indian Music Classification using Neural network based Dragon fly algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20160518 |