CN102543126A - Pyramid-based multi-resolution audio waveform drawing method

Info

Publication number: CN102543126A
Application number: CN2010106103959A
Authority: CN
Inventors: 韩秀丽; 郑鹏程; 刘铁华; 见良
Original assignee: China Digital Video Beijing Ltd
Current assignee: China Digital Video Beijing Ltd
Priority date: 2010-12-28
Filing date: 2010-12-28
Publication date: 2012-07-04

Abstract

The invention discloses a pyramid-based multi-resolution audio waveform drawing method. The method comprises the following steps: first, creating a pyramid model whose levels store the audio amplitude values of the sampling points at different resolutions and the number of samples read per second at each level; creating an audio file in a given format to store the audio amplitude values of the sampling points of all levels of the pyramid model; then calculating the optimal number of samples to be read per second of the audio file and determining the level of the pyramid model to be read; and finally reading the data of the corresponding level from the audio file and drawing the audio waveform. With this method, the audio amplitude values of the sampling points at every resolution are stored in the pyramid model, so when the resolution changes the values can be taken directly from the audio file and repeated calculation is avoided. In addition, the values of each pyramid level are not calculated separately from the original audio data but are derived from the preceding level, which avoids comparing most values multiple times and greatly improves efficiency.

Description

A pyramid-based multi-resolution audio waveform drawing method

Technical field

The present invention relates to the field of audio data processing, and in particular to a pyramid-based multi-resolution audio waveform drawing method.

Background technology

Audio editing plays an important role in the digital infrastructure of radio stations, television stations and similar organizations, and is an important component of media management systems. In radio and television stations, for example, captured raw data must be edited quickly before it is archived, for purposes such as extracting the content that users are interested in. If the waveform of the corresponding audio data can be displayed accurately and quickly while the original media material is being edited, features of the audio data (such as silent regions) can be located quickly, which shortens the editing time.

In the field of audio data processing, waveforms are currently drawn mainly by the following two methods:

1. Unbuffered method

This method reads the data corresponding to the display area from disk and draws it as a waveform. Because every operation on the audio data goes to disk, the user's operations cause data to be read from disk repeatedly; since disk reads are relatively slow, the time needed to draw the waveform is seriously affected.

2. Full-buffer method

This method first stores all of the audio data in memory, reads the data corresponding to the display area from memory, and draws it as a waveform. When the user operates on the data in the display area (or adjusts the display area), the data is read from memory again and the result of the operation is drawn as a waveform.

Because all of the audio data is held in memory, this method draws waveforms faster than the first. However, memory capacity is much smaller than disk capacity, so large audio data cannot be held in memory in its entirety, and the method is unsuitable for large audio files.

It can be seen that existing methods are slow, take a long time to run and work inefficiently when drawing the waveform of a large audio file. This is particularly true when drawing multi-resolution audio waveforms: whenever the resolution changes, the values must be recalculated for the new resolution, which wastes time.

Summary of the invention

In view of the defects of the prior art, the object of the present invention is to provide a pyramid-based multi-resolution audio waveform drawing method that greatly reduces repeated calculation when drawing audio waveforms and thereby improves drawing efficiency.

To achieve this object, the present invention adopts the following technical solution:

A pyramid-based multi-resolution audio waveform drawing method, comprising the following steps:

(1) creating a pyramid model according to the different resolutions, each level of which stores the audio amplitude values of the sampling points at one resolution and the number of sampling points read per second at that level;

(2) creating an audio file in a given format to store the audio amplitude values of the sampling points of every level of the pyramid model;

(3) calculating, according to the resolution, the optimal number of samples that should be read per second of the audio file, and determining from this number the level of the pyramid model to be read;

(4) reading the data of the corresponding pyramid level from the audio file and drawing the audio waveform.

Further, in the audio waveform drawing method described above, in step (1) the sampling points of each level of the pyramid model are derived from the preceding level: each level takes one sampling point for every two sampling points of the preceding level.

Further, in the audio waveform drawing method described above, for the sampling points of step (1): for mono audio, a sampling point comprises the minimum and maximum extrema of the audio waveform; for two-channel audio, a sampling point comprises the minimum and maximum extrema of the left channel and the minimum and maximum extrema of the right channel of the audio waveform.

Further, in the audio waveform drawing method described above, the order in which the extrema are arranged is fixed when the sampling points are saved in the pyramid model.

Further, in the audio waveform drawing method described above, in step (2) the audio file is created and the audio amplitude values of the sampling points of each pyramid level are saved as follows:

First, the corresponding audio file is generated according to the audio file name;

Second, the calculated audio amplitude values of the sampling points of each pyramid level are saved in the audio file.

Further, in the audio waveform drawing method described above, if the audio file corresponding to the audio file name already exists, the audio amplitude values of the corresponding sampling points are read directly from the audio file.

Further, in the audio waveform drawing method described above, the file header of the audio file stores the audio name, the file attributes, the number of pyramid levels, the start position of each level's values, and the number of sampling points each level contains.

Still further, in the audio waveform drawing method described above, in step (3) the optimal number of samples that should be read per second is calculated as follows:

optimal number of samples read per second = optimal number of samples / playback time

In the above formula, the optimal number of samples is the number of samples at which the audio file is displayed with one sampling point per pixel, and the playback time is the total playback time of the audio file.

Further, in the audio waveform drawing method described above, in step (4) the data of the corresponding pyramid level is read from the audio file and the audio waveform is drawn as follows:

The start positions stored for each level are found in the file header of the audio file;

The start position of the data of the required level is located, and the audio amplitude values of the sampling points of that level are read and drawn as a waveform.

The effects of the present invention are as follows. The method stores the values for every resolution in the pyramid model, so when the resolution changes the values can be taken directly from the pyramid model in the audio file, which avoids repeated calculation. Moreover, the values of each pyramid level are not each computed from the original audio data but are derived from the audio amplitude values of the sampling points of the preceding level, which avoids comparing the same values many times and greatly improves efficiency.

Description of drawings

Fig. 1 is a flow chart of the audio waveform drawing method of the present invention;

Fig. 2 is a schematic diagram of the pyramid model of the present invention;

Fig. 3 is a schematic diagram of the structure of the audio file of the present invention.

Embodiment

The present invention is described in further detail below with reference to the accompanying drawings and embodiments.

Fig. 1 shows the flow chart of the audio waveform drawing method of the present invention. The method comprises the following steps:

Step S11: create the pyramid model and store the audio amplitude values of the sampling points;

A pyramid model is created according to the different resolutions. Each level stores the audio amplitude values of the sampling points at one resolution and the number of sampling points taken per second at that level. The sampling points of each level are derived from the preceding level: each level takes one sampling point for every two sampling points of the preceding level.

The specific embodiment is described using a 10-level pyramid model as an example. As shown in Fig. 2, the first level stores the audio amplitude values of sampling points taken at an average rate of 1000 sampling points per second. The second level takes one sampling point for every two of the first level, so it stores the values obtained at an average of 500 sampling points per second, and so on; the tenth level takes on average 1.953125 sampling points per second (1000/2^9).
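
The halving rule can be checked numerically. The following is a minimal sketch in Python; the function name `level_rates` and its parameters are illustrative and do not come from the patent. It assumes strict halving from 1000 sampling points per second at the first level.

```python
def level_rates(first_level_rate=1000.0, num_levels=10):
    """Average number of sampling points per second kept at each pyramid level."""
    # Each level keeps one point for every two points of the preceding level.
    return [first_level_rate / (2 ** k) for k in range(num_levels)]


rates = level_rates()
for level, rate in enumerate(rates, start=1):
    print(f"level {level}: {rate} sampling points per second")
```

Under this rule the sixth level averages 31.25 sampling points per second, the figure used in step S13, and the tenth level averages 1000/2^9 = 1.953125.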

A sampling point of the audio waveform is not a single number. For mono audio, a sampling point comprises a minimum extremum and a maximum extremum; for two-channel audio, it comprises the minimum and maximum extrema of the left channel and the minimum and maximum extrema of the right channel. To make reading convenient, the method fixes the order of the extrema when sampling points are saved in the pyramid model. Tables 1 and 2 show the storage format of the extrema for mono and two-channel audio respectively, where underlined and non-underlined entries distinguish adjacent sampling points. For mono audio, the maximum of a sampling point is stored first, followed by its minimum, and then the values of the next sampling point are stored in the same order; for two-channel audio, the extrema of the left channel of a sampling point are stored first, followed by the extrema of the right channel.

Table 1

Table 2
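
The storage order described above can be sketched as follows. This is an illustration only: the function names are hypothetical, sampling points are represented as `(min, max)` tuples, and for two-channel audio the order "maximum before minimum" within each channel is an assumption carried over from the mono case, since the text only states that the left channel comes before the right.

```python
def flatten_mono(points):
    """Flatten mono sampling points (min, max) into max, min, max, min, ..."""
    values = []
    for low, high in points:
        values.extend([high, low])  # maximum first, then minimum
    return values


def flatten_stereo(points):
    """Flatten stereo points ((l_min, l_max), (r_min, r_max)), left channel first."""
    values = []
    for (l_low, l_high), (r_low, r_high) in points:
        # Assumption: within each channel the maximum precedes the minimum.
        values.extend([l_high, l_low, r_high, r_low])
    return values


mono_layout = flatten_mono([(-3, 5), (-1, 2)])
stereo_layout = flatten_stereo([((-3, 5), (-4, 6))])
```

A fixed order like this lets a reader recover each sampling point from the flat sequence by position alone, without any per-value tags.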

For the first level of the pyramid, 1000 sampling points are taken per second regardless of the sampling rate. For original audio with a sampling rate of 48 kHz, obtaining 1000 sampling points per second means taking 1 out of every 48 samples of the original audio; if the sampling rate is 44.1 kHz, 10 must be taken out of every 441 samples.

Taking a sampling point does not mean picking out an arbitrary value; it is a process of finding extrema. To take 1 sampling point out of 48 samples, the 48 samples are compared to find their maximum and minimum, and this maximum extremum and minimum extremum form the value of the resulting sampling point.
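
The extremum extraction for the first level can be sketched as below. This is a minimal sketch, not the patented implementation: `first_level` and `block_size` are illustrative names, and only the case where the sampling rate is an integer multiple of 1000 (such as 48 kHz, block size 48) is handled. The 44.1 kHz case, where 10 points are taken from every 441 samples, would need a rule for splitting the 441 samples that the text does not specify.

```python
def first_level(samples, block_size=48):
    """Reduce raw mono samples to (min, max) sampling points, one per block."""
    points = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        # The sampling point is the pair of extrema of the block,
        # not an arbitrarily picked sample.
        points.append((min(block), max(block)))
    return points


# One second of synthetic 48 kHz audio yields 1000 sampling points.
one_second = [(i * 37) % 201 - 100 for i in range(48000)]
level1 = first_level(one_second)
```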

Once the first-level data has been obtained, the data of the other levels need not be computed from the raw data; taking one sampling point for every two of the preceding level is enough. For example, the second level must provide 500 sampling points per second, which for original audio sampled at 48 kHz means taking 1 out of every 96 samples, and doing so wastes a large number of extremum comparisons. To avoid this, the invention computes the second level by taking one sampling point for every two of the first level, which reduces the amount of computation considerably. If the second-level data is sampled from mono raw data, 1 second of data requires 96 × 2 × 500 = 96000 extremum comparisons; if it is sampled from the first-level data, 1 second of data requires 2 × 2 × 500 = 2000 extremum comparisons, a reduction by a factor of 48.
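
Deriving one level from the preceding one, together with the comparison counts quoted above, can be sketched as follows. The function name `next_level` is hypothetical; sampling points are `(min, max)` tuples.

```python
def next_level(points):
    """Merge each pair of adjacent (min, max) points into one point."""
    merged = []
    # Stop before an unpaired trailing point, which is simply dropped.
    for i in range(0, len(points) - 1, 2):
        (low_a, high_a), (low_b, high_b) = points[i], points[i + 1]
        merged.append((min(low_a, low_b), max(high_a, high_b)))
    return merged


# Comparison counts for one second of mono audio, using the figures in the text.
from_raw = 96 * 2 * 500          # second level computed from 48 kHz raw data
from_first_level = 2 * 2 * 500   # second level computed from the first level
reduction = from_raw // from_first_level
```

The merged point covers the same time span as its two parents, so its extrema equal those that would have been found in the raw data for that span.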

To avoid fractional counts when computing the sampling points of the pyramid levels, the method samples level by level in groups: 1024 first-level sampling points are taken as one group at a time, and the audio amplitude values of the sampling points of the remaining 9 levels are computed from that group. If the last group contains fewer than 1024 values, it is still processed as a group in the same way; if the number of values to be paired is odd, the unpaired extremum is discarded and not processed.
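
The group-wise construction can be sketched as below. This is one reading of the paragraph above, stated as an assumption: the "odd" rule is taken to mean that an unpaired trailing point at any level is dropped. The names `merge_pairs` and `build_pyramid` are illustrative.

```python
def merge_pairs(points):
    """Merge adjacent (min, max) points two at a time; zip drops an unpaired last point."""
    return [(min(a[0], b[0]), max(a[1], b[1]))
            for a, b in zip(points[0::2], points[1::2])]


def build_pyramid(level1, num_levels=10, group_size=1024):
    """Build all levels from first-level points, one group of 1024 points at a time."""
    levels = [[] for _ in range(num_levels)]
    for start in range(0, len(level1), group_size):
        current = level1[start:start + group_size]  # the last group may be shorter
        levels[0].extend(current)
        for k in range(1, num_levels):
            current = merge_pairs(current)
            levels[k].extend(current)
    return levels


# A full group of 1024 points plus a short final group of 5 points.
level1 = [(-i, i) for i in range(1024 + 5)]
levels = build_pyramid(level1)
sizes = [len(level) for level in levels]
```

A full group of 1024 points halves cleanly nine times (1024, 512, ..., 2), which is why no fractional counts arise within a group.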

Step S12: create the audio file and store the audio amplitude values of the sampling points of each pyramid level;

An audio file in a given format is created to store the audio amplitude values of the sampling points of every level of the pyramid model.

When the audio file is created, the corresponding file is first generated according to the audio file name. The calculated audio amplitude values of the sampling points of each pyramid level are then saved in the file, and the file header stores the audio name, the file attributes, the number of levels, the start position of each level's values, the number of sampling points each level contains, and so on. The structure of the audio file is shown in Fig. 3. If the file corresponding to the current audio file name already exists, the audio amplitude values of the sampling points need not be calculated again and are simply read from the file.
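
The "compute once, then reuse" behaviour can be sketched as follows. Everything here is hypothetical: the `.pyramid` extension, the function names, and the use of JSON as a stand-in for the file format of Fig. 3 are assumptions made only to keep the example short.

```python
import json
import os
import tempfile


def pyramid_file_for(audio_path, extension=".pyramid"):
    """Derive the pyramid file name from the audio file name (hypothetical rule)."""
    return os.path.splitext(audio_path)[0] + extension


def get_pyramid(audio_path, compute):
    """Return the pyramid levels, computing them only if no saved file exists."""
    path = pyramid_file_for(audio_path)
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)       # reuse the saved values
    levels = compute(audio_path)           # expensive extremum calculation
    with open(path, "w") as handle:
        json.dump(levels, handle)
    return levels


compute_calls = []

def fake_compute(audio_path):
    """Stand-in for the real calculation; records how often it is called."""
    compute_calls.append(audio_path)
    return [[[-3, 5], [-1, 2]], [[-3, 5]]]


with tempfile.TemporaryDirectory() as folder:
    audio = os.path.join(folder, "clip.wav")
    first = get_pyramid(audio, fake_compute)
    second = get_pyramid(audio, fake_compute)
```

The second call finds the saved file and skips the calculation, which is the saving the paragraph above describes.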

Step S13: determine the level of the pyramid model to be read for the audio file;

According to the resolution, the optimal number of samples that should be read per second of the audio file is calculated, and the level of the pyramid model to be read is determined from this number.

The resolution is determined by the display length of the audio file on the track, measured in pixels. For the best view of the audio waveform, one pixel should display one sampling point. The total playback time of the audio file being displayed can be obtained from the total number of samples and the sampling frequency. Therefore, to satisfy the optimal condition of one sampling point per pixel, the optimal number of samples that should be read per second for a given audio file can be obtained by the following formula:

optimal number of samples read per second = optimal number of samples / playback time

After the number of samples to read per second has been obtained, it is compared with the per-second sample counts of the pyramid levels and the closest level is selected; the audio amplitude values of the sampling points of that level are the set of data closest to the optimal display. For example, if the optimal number of samples to read per second is calculated as 27.38, the closest value in the pyramid model is that of the sixth level, which averages 31.25 sampling points per second, so the audio amplitude values of the sampling points of the sixth level are read.
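
The level selection can be sketched as below. This is a sketch under stated assumptions: the per-second rate is taken to be the display length in pixels divided by the total playback time, which is what the one-sampling-point-per-pixel condition implies, and the pyramid is assumed to halve strictly from 1000 points per second. The names `optimal_rate` and `choose_level` are illustrative.

```python
def optimal_rate(display_pixels, total_samples, sample_rate_hz):
    """Sampling points to read per second so that one pixel shows one point."""
    play_time = total_samples / sample_rate_hz  # total playback time in seconds
    return display_pixels / play_time


def choose_level(rate, first_level_rate=1000.0, num_levels=10):
    """Return the 1-based pyramid level whose per-second rate is closest to `rate`."""
    rates = [first_level_rate / (2 ** k) for k in range(num_levels)]
    best = min(range(num_levels), key=lambda k: abs(rates[k] - rate))
    return best + 1


# A 50-second clip at 48 kHz displayed across 1369 pixels.
rate = optimal_rate(display_pixels=1369, total_samples=2_400_000, sample_rate_hz=48_000)
level = choose_level(rate)
```

With these hypothetical inputs the rate is 27.38 points per second, which is nearer to the sixth level (31.25) than to the seventh (15.625), matching the example in the text.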

Step S14: read the data of the corresponding level from the audio file and draw the waveform.

The data of the corresponding pyramid level is read from the audio file and the audio waveform is drawn.

Once the pyramid level to read has been determined from the resolution, the data of that level is read from the audio file: the start positions stored for each level are first found in the file header, the start position of the data of the required level is then located, and finally the audio amplitude values of the sampling points are read and drawn as a waveform.
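
Reading one level through the offsets stored in the header can be sketched as follows. The byte layout is entirely hypothetical (the actual structure is the one shown in Fig. 3, which is not reproduced here): a length-prefixed title, a 32-bit attribute field, a level count, then one `(start offset, point count)` entry per level, followed by mono data stored as 16-bit `max, min` pairs.

```python
import io
import struct


def write_pyramid(stream, title, attributes, levels):
    """Write a header followed by the data of every level (mono, int16 values)."""
    title_bytes = title.encode("utf-8")
    header_size = 2 + len(title_bytes) + 4 + 2 + 12 * len(levels)
    stream.write(struct.pack("<H", len(title_bytes)) + title_bytes)
    stream.write(struct.pack("<IH", attributes, len(levels)))
    offset = header_size
    for points in levels:
        stream.write(struct.pack("<QI", offset, len(points)))  # start, count
        offset += 4 * len(points)                               # two int16 per point
    for points in levels:
        for low, high in points:
            stream.write(struct.pack("<hh", high, low))         # maximum first


def read_level(stream, level):
    """Read the (min, max) points of one 1-based level using the header offsets."""
    stream.seek(0)
    (title_len,) = struct.unpack("<H", stream.read(2))
    stream.seek(title_len, 1)                                   # skip the title
    _attributes, level_count = struct.unpack("<IH", stream.read(6))
    if not 1 <= level <= level_count:
        raise ValueError("no such level")
    stream.seek(12 * (level - 1), 1)                            # skip earlier entries
    start, count = struct.unpack("<QI", stream.read(12))
    stream.seek(start)
    values = struct.unpack(f"<{2 * count}h", stream.read(4 * count))
    return [(values[i + 1], values[i]) for i in range(0, len(values), 2)]


buffer = io.BytesIO()
demo_levels = [[(-3, 5), (-1, 2), (-7, 4), (0, 9)], [(-3, 5), (-7, 9)], [(-7, 9)]]
write_pyramid(buffer, "clip", 1, demo_levels)
second_level = read_level(buffer, 2)
```

Because the header records where each level starts, only the bytes of the selected level are read; the other levels are never touched. Each returned point can then be drawn as one vertical line from its minimum to its maximum at the corresponding pixel column.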

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims

1. A pyramid-based multi-resolution audio waveform drawing method, comprising the following steps:

2. The audio waveform drawing method according to claim 1, characterized in that: in step (1), the sampling points of each level of the pyramid model are derived from the preceding level, each level taking one sampling point for every two sampling points of the preceding level.

3. The audio waveform drawing method according to claim 1 or 2, characterized in that: for the sampling points of step (1), for mono audio a sampling point comprises the minimum and maximum extrema of the audio waveform; for two-channel audio a sampling point comprises the minimum and maximum extrema of the left channel and the minimum and maximum extrema of the right channel of the audio waveform.

4. The audio waveform drawing method according to claim 3, characterized in that: the order in which the extrema are arranged is fixed when the sampling points are saved in the pyramid model.

5. The audio waveform drawing method according to claim 1, characterized in that: in step (2), the audio file in a given format is created and the audio amplitude values of the sampling points of each pyramid level are saved as follows:

First, the corresponding audio file is generated according to the audio file name;

6. The audio waveform drawing method according to claim 5, characterized in that: if the audio file corresponding to the audio file name already exists, the audio amplitude values of the corresponding sampling points are read directly from the audio file.

7. The audio waveform drawing method according to claim 5 or 6, characterized in that: the file header of the audio file stores the audio name, the file attributes, the number of pyramid levels, the start position of each level's values, and the number of sampling points each level contains.

8. The audio waveform drawing method according to claim 1, characterized in that: in step (3), the optimal number of samples that should be read per second is calculated as follows:

9. The audio waveform drawing method according to claim 1, characterized in that: in step (4), the data of the corresponding pyramid level is read from the audio file and the audio waveform is drawn as follows: