CN1189278A - Method and apparatus for time-scaling in communication products

Info

Publication number: CN1189278A
Application number: CN96195089A
Authority: CN
Inventors: Sunil Satyamurti; Salvador Sibecas
Original assignee: Motorola Inc
Current assignee: Motorola Solutions Inc
Priority date: 1995-06-26
Filing date: 1996-03-29
Publication date: 1998-07-29

Abstract

A method and apparatus (206) for time-scale modification of an input signal comprises the steps and a device that performs the steps of digitizing the input signal for analysis, storing the input signal in a memory and then dividing the input signal into time frames (250). The method and device then determines a difference in the short-term frequency spectrum between adjacent time frames (252 and 254) to provide a first speed factor (S1) and the input signal is time-scaled (208) in accordance with the first speed factor. Optionally, a second speed factor can be determined from the percentage of silence (262) observed within the input signal and the input signal can then be time-scaled in accordance with the first and second speed factors to provide a given quality of reconstructed speech.

Description

Method and apparatus for time-scaling in communication products

The present invention relates generally to time-scaling technology, and more particularly to a method and apparatus that improve time-scaling by using speed factors derived from the input signal.

Time-scaling methods are used to compress and expand input signals, in particular speech input signals. When used in communication systems, time-scaling techniques are also better suited than other techniques to systems with higher channel efficiency. Time-scaling of a speech signal usually involves a trade-off between the compression ratio and the quality of the reconstructed speech. Existing time-scaling techniques do not adequately account for the fact that different speakers talk at different rates. With existing techniques, the speech of a fast talker may be compressed so much that the quality of the reconstructed speech degrades, while the speech of a slow talker may be compressed too little, resulting in inefficient compression. Furthermore, some applications, such as dictation and sound-track editing, require all information to be played at the same rate regardless of how fast the person spoke. Such devices would benefit from deriving a speed factor from the speech, supplying it as an input to the time-scale modification algorithm, and using it to equalize the speech rate.

For paging and for other applications involving the time-scaling of speech signals, such as dictation and voice mail, existing time-scaling methods fail to provide the desired combination of adequate speech quality and the flexibility that lets a designer optimize the application within given constraints. A need therefore exists for a voice communication system that is economically feasible and flexible enough to be optimized within a given architecture and, more specifically for paging applications, that retains many of the advantages of Motorola's high-speed paging protocol.

In a first aspect of the present invention, a method of time-scale modification of an input signal comprises the steps of digitizing the input signal for analysis, storing the input signal in a memory, and then dividing the input signal into a number of time frames. The method then determines the difference in the short-term spectrum between adjacent time frames to provide a first speed factor, and time-scales the input signal in accordance with this first speed factor.

In a second aspect of the present invention, a communication system using voice compression and having at least one transmitter base station and a plurality of selective call receivers comprises, at the transmitter base station: an input device for receiving an audio signal; a processing device for digitizing the audio signal to provide an input signal for analysis, for dividing the input signal into equal time frames, for determining the difference in the short-term spectrum between adjacent time frames to provide a first speed factor, and for time-scaling the input signal in accordance with the first speed factor to provide a time-scaled signal; a memory for storing the audio signal; and a transmitter for transmitting the time-scaled signal. Each of the plurality of selective call receivers of the communication system comprises a selective call receiver module for receiving the time-scaled signal; a processing device for demodulating and expanding the time-scaled signal to provide a reconstructed signal; and an amplifier for amplifying the reconstructed signal into a reconstructed audio signal.

In a third aspect of the present invention, an electronic device for time-scale modification of speech comprises an input device for receiving an audio signal; a processing device for digitizing the audio signal to provide an input signal for analysis, for dividing the input signal into equal time frames, for determining the difference in the short-term spectrum between adjacent time frames to provide a first speed factor, and for time-scaling the input signal in accordance with the first speed factor to provide a time-scaled signal; and a memory for storing the audio signal.

Fig. 1 is a block diagram of a voice communication system according to the present invention.

Fig. 2 is a block diagram of a base station transmitter according to the present invention.

Fig. 3 is an expanded electrical block diagram of a time-scaling analyzer according to the present invention.

Fig. 4 is an expanded electrical block diagram of a base station transmitter according to the present invention.

Fig. 5 is an expanded electrical block diagram of another base station transmitter according to the present invention.

The invention provides a method that detects whether a speaker talks fast or slowly and, when used with a time-scaling technique, yields the optimum compression factor for a given quality of reconstructed speech. Several parameters or factors can be used to determine how fast or how slowly a speaker talks. The first parameter is the rate of change of the short-term spectrum of the speech. The second parameter or factor is the length of the pauses between words. These two factors are used to determine a speed factor for the speaker's rate of speech. Optionally, a third factor, an approximate pitch determined for the particular talker, can be used to improve the accuracy of the speed factor.

The system digitizes the speech and then preferably divides it into 20-millisecond frames. Linear prediction (LPC) analysis is performed on each frame of speech, and the difference in the short-term spectrum between adjacent frames is determined with the ITAKURA distance metric D, which is given by:

D = \log \left( \frac{a^{T} R \, a}{b^{T} R \, b} \right)

where "R" is the autocorrelation matrix associated with frame n, "a" is the LPC vector associated with frame n, and "b" is the LPC vector associated with frame n+1. D is the distance metric between frame n and frame n+1. The larger the distance D, the larger the spectral difference between the adjacent frames. The overall average of the distance metric is then computed over the whole speech utterance. This gives one measure of the speed factor.
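The distance computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LPC order of 10, the autocorrelation method with a Levinson-Durbin recursion, and all function names are choices made here, not details given in the text. The LPC vectors are taken in prediction-error form [1, a1, ..., ap], and frames are assumed to contain some signal energy (all-zero frames would need to be skipped).

```python
import math

FRAME_LEN = 160  # 20 ms at the assumed 8 kHz sampling rate


def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def lpc(r, order):
    """Levinson-Durbin recursion; returns the LPC vector [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame, stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def quad_form(v, r):
    """v^T R v, where R is the Toeplitz matrix built from the lags r."""
    size = len(v)
    return sum(v[i] * v[j] * r[abs(i - j)]
               for i in range(size) for j in range(size))


def itakura_distance(frame_n, frame_n1, order=10):
    """D = log((a^T R a) / (b^T R b)); R and a from frame n, b from frame n+1."""
    r_n = autocorr(frame_n, order)
    a = lpc(r_n, order)
    b = lpc(autocorr(frame_n1, order), order)
    return math.log(quad_form(a, r_n) / quad_form(b, r_n))


def average_distance(signal, order=10):
    """Mean of D over all pairs of adjacent 20 ms frames in the signal."""
    frames = [signal[i:i + FRAME_LEN]
              for i in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN)]
    d = [itakura_distance(frames[i], frames[i + 1], order)
         for i in range(len(frames) - 1)]
    return sum(d) / len(d)
```

Because "a" minimizes the quadratic form for frame n, D computed this way is zero for identical frames and negative otherwise, with larger magnitudes indicating faster spectral change.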

Next, the energy and zero crossings are measured for each frame, and silence periods are detected. The percentage of silence in the speech utterance provides another measure of the speed factor. These two parameters (and, optionally, a pitch estimate) are used to determine the correct compression factor, which is used together with the time-scaling technique so that the reconstructed speech reaches a given quality.
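A minimal sketch of the silence measurement follows. The text does not say how energy and zero crossings are combined, so the rule used here (a frame is silent when its energy is well below the average and its zero-crossing count is not high) and the parameters `energy_ratio` and `max_zc` are assumptions for illustration only.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))


def silence_percentage(signal, frame_len=160, energy_ratio=0.1, max_zc=80):
    """Percentage of frames classified as silence (assumed decision rule)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energies = [frame_energy(f) for f in frames]
    threshold = energy_ratio * sum(energies) / len(energies)
    silent = sum(1 for f, e in zip(frames, energies)
                 if e < threshold and zero_crossings(f) <= max_zc)
    return 100.0 * silent / len(frames)
```

The zero-crossing check is there so that low-energy but noisy-sounding frames, such as unvoiced consonants, are less likely to be counted as pauses.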

A sample pitch estimation procedure is briefly described as follows.

1) Input the speech in blocks of 20 milliseconds.

2) Compute the energy in each block.

3) Compute the average energy per block.

4) Determine an energy threshold for detecting voiced speech as a function of the average energy per block.

5) Using this energy threshold, find runs of contiguous voiced-speech blocks that are at least 5 blocks long.

6) Perform pitch analysis on each block of the contiguous voiced speech found in step 5. This can be done with various methods, including the modified autocorrelation method, AMDF, or the clipped autocorrelation method.

7) Smooth the pitch values with a median filter to remove errors in the estimate.

8) Average all the smoothed pitch values to obtain an approximate estimate of the speaker's pitch.

A sampling rate of 8 kHz is assumed in all of the above, although other sampling rates and other pitch estimation methods are also contemplated by the invention.
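The eight steps above can be sketched as follows. This is a hedged illustration, not the procedure as implemented in the patent: a plain autocorrelation peak search stands in for the pitch analysis of step 6, the median filter has a window of three, and `threshold_ratio`, `min_lag` and `max_lag` are hypothetical parameters. The pitch is returned as a period in samples at the assumed 8 kHz rate.

```python
import statistics

BLOCK = 160  # 20 ms at 8 kHz


def block_pitch(block, min_lag=20, max_lag=120):
    """Lag (in samples) of the autocorrelation peak within [min_lag, max_lag]."""
    def r(lag):
        return sum(block[i] * block[i + lag] for i in range(len(block) - lag))
    return max(range(min_lag, max_lag + 1), key=r)


def estimate_pitch(signal, threshold_ratio=0.5, min_run=5):
    """Approximate pitch period in samples, or None if no voiced run is found."""
    blocks = [signal[i:i + BLOCK]
              for i in range(0, len(signal) - BLOCK + 1, BLOCK)]   # step 1
    energies = [sum(x * x for x in b) for b in blocks]             # step 2
    threshold = threshold_ratio * sum(energies) / len(energies)    # steps 3-4
    voiced = [e > threshold for e in energies]

    pitches = []
    start = 0
    while start < len(blocks):                                     # step 5
        if not voiced[start]:
            start += 1
            continue
        end = start
        while end < len(blocks) and voiced[end]:
            end += 1
        if end - start >= min_run:
            pitches.extend(block_pitch(b) for b in blocks[start:end])  # step 6
        start = end
    if not pitches:
        return None

    smoothed = [statistics.median(pitches[max(0, i - 1):i + 2])    # step 7
                for i in range(len(pitches))]
    return sum(smoothed) / len(smoothed)                           # step 8
```

A short run of voiced blocks (fewer than five) is ignored, which keeps isolated clicks or bursts from skewing the average.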

As shown in Fig. 3, S1 and S2 are the speed factors determined from the ITAKURA distance metric D and the silence periods, respectively. They are used (optionally together with the pitch estimate) to determine the final compression and expansion factor α. Thus, the ITAKURA distance metric and the silence periods are used to determine the speed factors for a given speech utterance, which are in turn used to calculate the optimum compression/expansion factor for a given quality of reconstructed speech.

Referring to Fig. 1, a communication system illustrating the voice compression and expansion techniques of the present invention is shown in the block diagram of selective call system 100. The selective call system includes an input device for receiving an audio signal, such as telephone 114, from which voice-based selective calls are initiated for transmission to selective call receivers in system 100. Each selective call entered through telephone 114 (or another input device, such as a computer) typically comprises (a) the receiver address of at least one selective call receiver in the system and (b) a voice message. The initiated selective calls are typically supplied to a transmitter base station or selective call terminal 113 for formatting and queuing. The voice compression circuit 101 of terminal 113 compresses the supplied voice message (the detailed operation of voice compression circuit 101 is discussed further below in the description of Figs. 2-5). Preferably, voice compression circuit 101 includes a processing device that compresses the audio signal using a time-scaling technique and a single-sideband modulation technique to provide a processed signal, although the invention also contemplates other modulation techniques. The selective call is then input to selective call transmitter 102, where it is modulated onto a radio-frequency signal, and the modulated radio-frequency signal is transmitted, together with the time-scaling factor, through antenna 103. Preferably, the transmitter is a quadrature amplitude modulation transmitter that transmits the processed signal, although the invention also contemplates other transmitters.

Antenna 104 in selective call receiver 112 receives the modulated, transmitted radio-frequency signal and inputs it to a selective call receiver module or radio-frequency receiver module 105 for receiving the processed signal or radio-frequency signal, where the radio-frequency signal is demodulated and the receiver address and the single-sideband (QAM) signal are recovered. The signal is then supplied to an analog-to-digital converter (A/D) 115. Preferably, selective call receiver 112 includes a processor that demodulates the received processed signal using a single-sideband demodulation technique and a time-scale expansion technique to provide a reconstructed signal. The compressed voice message is then supplied to voice expansion circuit 106, where the time length of the voice message is preferably expanded to the desired value. The voice message is then supplied to a digital-to-analog converter (not shown) and to an amplifier, such as audio amplifier 108, which amplifies it into a reconstructed audio signal.

The demodulated receiver address and the time-scaling factor are supplied from radio-frequency receiver 105 to decoder 107. If the receiver address matches any one of the receiver addresses stored in decoder 107, an alert 111 is optionally activated to give the user of selective call receiver 112 a brief sensory indication that a selective call has been received. This brief sensory indication may comprise an audible signal, a tactile signal such as a vibration, a visual signal such as a light, or a combination of these. The amplified voice message is then supplied from audio amplifier 108 to an audio speaker within alert 111 so that the user can hear the message.

Decoder 107 may include a memory in which the received voice message can be stored and from which it can be recalled repeatedly for review by operating one or more controls 110.

In another aspect of the invention, part of Fig. 1 can equally be interpreted as, for example, part of a dictation machine, a voice mail system, an answering machine, or a sound-track editing device. By removing the wireless aspects of system 100, namely selective call transmitter 102 and radio-frequency receiver 105, the system can optionally be wired from voice compression circuit 101 through A/D 115 to voice expansion circuit 106, as shown by the dashed line. Thus, in a voice mail system, an answering machine, a sound-track editing device or a dictation system, input device 114 supplies an audio input signal, such as a speech signal, to terminal 113 with voice compression circuit 101. Voice expansion circuit 106 and controls 110 provide the means for listening to and manipulating the output speech signal in the voice mail system, answering machine, dictation system, sound-track editing device or other applicable system. The invention clearly contemplates that the time-scaling techniques claimed here have many applications besides paging. The paging example disclosed here is merely illustrative of one such application.

Referring to Fig. 2, the base station and transmitter portion 200 of the present invention is shown. In operation, incoming voice messages are received through telephone line inputs 203, 205 and 207, processed by a standard telephone line interface circuit 202 (preferably including a PBX system), and then temporarily recorded either on tape or in an electronic memory, that is, a buffer memory. The standard telephone interface and buffer storage portion in block 202 is referred to as a voice store-and-forward system and is described in detail in Motorola instruction manual No. 68P81105C25, entitled "Voice Store and Forward MODAXPLUS/meteorology-paging".

The message is then retrieved from the buffer memory in block 202 and processed by speech rate detector block 206, which preferably includes an LPC analysis circuit that analyzes the speech signal and produces digital LPC parameters corresponding to the input speech signal.

Speech rate detector 206 calculates a time-scaling factor α and supplies it to time-scale compression block 208, which compresses the input speech signal in accordance with this factor. Block 208 is connected to controller 210 and may preferably include memory block 209.

In the voice system of the present invention, the aim is to provide a technique by which several voice messages can be transmitted in the time currently required to transmit one message, while keeping the effect on speech quality to a minimum.

As mentioned above, the speech signals from telephone line inputs 203, 205 and 207 are temporarily stored in the memory in block 202. Doing so makes it possible to handle peak input loads without requiring a large number of expensive analyzers. The stored speech samples are retrieved from memory under the direction of controller 210, which in practice can be a microcomputer.

Fig. 3 is a more detailed block diagram illustrating the speech rate detector 206 of Fig. 2. Speech rate detector 206 preferably comprises several analysis steps that ultimately lead to more efficient compression or expansion while maintaining the desired optimum level of speech quality. Detector 206 preferably includes LPC analyzer 250, which performs LPC analysis on incoming frames of the input signal. Preferably, as discussed above, the input signal is digitized, divided into 20-millisecond frames, and stored in a memory (see memory 209 of Fig. 2). In block 252, the ITAKURA distance metric D is determined between frames. In block 254, the distance D from each frame is preferably used to compute the average distance over the whole speech signal from one source or speaker (for example, over a whole utterance containing a message). As a result of computing this average distance and determining the difference in the short-term spectrum between adjacent time frames in block 254, an overall distance metric and the resulting first speed factor S1 are supplied to final decision block 258. Final decision block 258 can then provide a time-scaling factor α so that the input speech signal is compressed (or expanded) appropriately in accordance with that factor. Depending on the speaker's pitch estimate, the ITAKURA distance can be used to compress the speech input signal according to the following table:

ITAKURA distance range (*pitch <40 samples) | ITAKURA distance range (*pitch <40 samples) | Compression ratio: -0.4 to -0.55 -0.4 to -0.45 3.25 <-0.55 <-0.45 to -0.5 3 to -0.45 <-0.3 4 -0.45 to -0.5 -0.35 to -0.4 3.5 -0.5 <-0.5 2.5

* The above values, given in samples at the 8 kHz sampling rate, may vary in different systems for optimization. Factors such as the language, the number of users, and the ratio between the sexes among the users of a system may be some of the many factors that cause the numbers in this table to be changed in order to provide an optimized system.

The method can be further refined in silence analysis block 260 by using a second speed factor S2. That is, the energy and zero crossings are measured for each frame and used to detect silence periods. In block 262, the percentage of silence in the speech input signal provides another measure, from which the second speed factor S2 is computed in block 264. Referring to the table above, if the silence is less than 10%, the compression ratio can be reduced by 0.25. In practice, the value obtained from block 264 can be adjusted according to the desired level of speech quality. In this case, the two speed factors S1 and S2 are supplied to decision block 258 and are then used to determine the correct overall compression (or expansion) factor, which is used together with the time-scaling technique to achieve a given quality of reconstructed speech. In addition, the invention can optionally determine an average pitch estimate in block 256 to supply a further factor to decision block 258, allowing the compression decision to be refined further without noticeably affecting the desired optimum quality of the resulting speech. In this case, the average pitch estimate helps to determine whether a male or a female speaker is providing the input signal. This knowledge, combined with the previously determined speed factors, yields a time-scaling factor for obtaining reconstructed speech of optimum quality.
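The decision logic of block 258 might be sketched as below. The threshold values in `PLACEHOLDER_TABLE` are illustrative placeholders chosen here, not the values of the patent's table; only the range of ratios (2.5 to 4) and the 0.25 reduction for less than 10% silence come from the text. A different table could be passed in for each pitch class.

```python
# Illustrative placeholder thresholds only: (lower bound on average D, ratio).
# These are NOT the values from the patent's table.
PLACEHOLDER_TABLE = [(-0.30, 4.0), (-0.40, 3.5), (-0.45, 3.25), (-0.50, 3.0)]
MIN_RATIO = 2.5


def compression_ratio(avg_distance, silence_pct, table=PLACEHOLDER_TABLE):
    """Pick a ratio from the average ITAKURA distance, then adjust for silence."""
    ratio = MIN_RATIO
    for lower_bound, candidate in table:
        if avg_distance > lower_bound:
            ratio = candidate
            break
    # Little silence suggests a fast talker, so compress slightly less.
    if silence_pct < 10.0:
        ratio -= 0.25
    return ratio
```

For example, with these placeholder thresholds an average distance of -0.2 and 20% silence gives a ratio of 4.0, while the same distance with 5% silence gives 3.75.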

Fig. 4 is a block diagram of a first embodiment of a transmitter 300 according to the present invention. An analog speech signal is input to anti-aliasing low-pass filter 301, which strongly attenuates all frequencies above one half of the sampling rate of the analog-to-digital converter (ADC) 303 connected after filter 301. ADC 303 preferably converts the analog speech signal into a digital signal so that further signal processing can be carried out with digital processing techniques. Digital processing is the preferred method, but the same functions can also be implemented with analog techniques or with a combination of analog and digital techniques.

Band-pass filter 305, connected to ADC 303, strongly attenuates frequencies below and above its cut-off frequencies. The lower cut-off frequency is preferably 300 Hz, which passes the important speech frequencies but attenuates lower frequencies that would interfere with the pilot carrier. The upper cut-off frequency is preferably 2800 Hz, which passes the important speech frequencies but attenuates higher frequencies that would interfere with adjacent transmission channels. Automatic gain control (AGC) block 307, preferably connected to filter 305, equalizes the volume levels of different voices. Block 307 then supplies a signal to speech rate detector block 206, which operates as described above for Fig. 3 and supplies a time-scaling factor to time compression block 309.

Time compression block 309, preferably connected to AGC block 307, shortens the time required to transmit the speech signal while maintaining essentially the same signal spectrum as at the output of band-pass filter 305. The time compression method is preferably the WSOLA-SD method (as described in U.S. Patent Application 08/395,739), but other methods can also be used. Amplitude compression block 311 and a corresponding amplitude expansion block in the receiver (not shown) form a companding device, which is well known to increase the apparent signal-to-noise ratio of the received speech. The companding ratio is preferably 2 to 1 in decibels, but other ratios can also be used according to the invention. In a specific example of a communication system such as a paging system, devices 301-309 may be included in a paging terminal (113 of Fig. 1), and the remaining components of Fig. 4 may form a paging transmitter (102 of Fig. 1). In this case, there is typically a digital link between the paging terminal and the paging transmitter. For example, the signal after block 309 can be encoded using a pulse code modulation (PCM) technique and subsequently PCM-decoded, in order to reduce the number of bits transferred between the paging terminal and the paging transmitter.
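As a small worked illustration of 2-to-1 companding in decibels: halving a level in dB corresponds to taking the square root of the amplitude, and the expander squares it again. The sketch below assumes samples normalized so that full scale is 1.0 (0 dB); the function names are hypothetical.

```python
import math


def compress_amplitude(x):
    """2:1 compression in dB: the magnitude becomes its square root."""
    return math.copysign(math.sqrt(abs(x)), x)


def expand_amplitude(y):
    """Matching 1:2 expansion at the receiver: the magnitude is squared."""
    return math.copysign(y * y, y)


def level_db(x):
    """Level of a nonzero sample relative to full scale (1.0), in dB."""
    return 20.0 * math.log10(abs(x))
```

With this mapping a sample at -40 dB is transmitted at -20 dB, so quiet passages sit further above the channel noise, and the expander restores the original level on reception.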

In any case, a second band-pass filter 308, connected to amplitude compression block 311, strongly attenuates frequencies below and above its cut-off frequencies in order to remove any spurious frequency components produced by AGC 307, time compression block 309 or amplitude compression block 311. The lower cut-off frequency is preferably 300 Hz, which passes the important speech frequencies but attenuates lower frequencies that would interfere with the pilot carrier. The upper cut-off frequency is preferably 2800 Hz, which passes the important speech frequencies but attenuates higher frequencies that would interfere with adjacent transmission channels.

The time-compressed speech samples are preferably stored in buffer 313 until the whole voice message has been processed. This allows the time-compressed voice message to be transmitted as a whole. This buffering method is preferably used for paging service (which is typically a non-real-time service). Other buffering methods may be preferred for other applications. For example, for applications involving two-way real-time conversation, the delay caused by this buffering may be intolerable. In that case, it is preferable to interleave short segments of several conversations. For example, if the time compression ratio is 3:1, three real-time speech signals can be transmitted over one channel. The three transmissions can be interleaved on the channel in 150-millisecond bursts without causing undue delay. The time-compressed speech signal from buffer 313 is applied to both Hilbert transform filter 323 and time delay block 315, which has the same delay as the Hilbert transform filter but otherwise has no effect on the signal.
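The interleaving arithmetic in the example above can be checked with a short sketch. The round-robin ordering and the function names are assumptions made for illustration: each 150 ms burst at 3:1 compression carries 450 ms of original speech, and with three conversations each one gets a burst every 450 ms, so every conversation keeps pace with real time.

```python
def burst_schedule(num_conversations=3, compression_ratio=3.0,
                   burst_ms=150, num_bursts=6):
    """Round-robin bursts: (start time in ms, conversation index,
    ms of original speech carried by the burst)."""
    speech_ms = burst_ms * compression_ratio
    return [(k * burst_ms, k % num_conversations, speech_ms)
            for k in range(num_bursts)]


def keeps_up(num_conversations, compression_ratio):
    """True if each conversation's bursts carry speech as fast as it is spoken."""
    return compression_ratio >= num_conversations
```

The same check shows that a fourth conversation would not fit on the channel at a 3:1 ratio.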

The output of time delay block 315 (through summing circuit 317) and the output of Hilbert transform filter 323 form the in-phase (I) and quadrature (Q) components, respectively, of an upper-sideband (USB) single-sideband (SSB) signal. The output of the time delay circuit and the negated output of the Hilbert transform filter form the in-phase (I) and quadrature (Q) components, respectively, of a lower-sideband (LSB) single-sideband signal. The transmission can thus be on either the upper or the lower sideband, as indicated by the dashed-line connections.

While one time-compressed speech signal is transmitted on the upper sideband, the lower sideband can be used to transmit a second time-compressed speech signal simultaneously, sent by another, similar transmitter operating on the lower sideband. SSB is the preferred modulation method because it uses the transmission bandwidth efficiently and suppresses crosstalk. Double-sideband amplitude modulation (AM) or frequency modulation (FM) could be used, but they require at least twice the bandwidth for transmission. It is also possible to transmit one time-compressed speech signal directly on the I component and a second time-compressed speech signal directly on the Q component; however, in this embodiment that method suffers from crosstalk between the two signals when multipath reception occurs at the receiver.
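The way a Hilbert transform yields the I and Q components of a single-sideband signal can be illustrated with a small sketch. It is an assumption-laden illustration: a block DFT is used here for clarity, whereas the transmitter described above uses a Hilbert transform filter with a matching time delay; the function names are hypothetical, and the naive DFT is only suitable for short blocks.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)), adequate for short blocks."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def hilbert(x):
    """Hilbert transform: -90 degree shift of positive frequencies,
    +90 degree shift of negative frequencies."""
    n = len(x)
    spec = dft(x)
    for k in range(n):
        if k == 0 or 2 * k == n:
            spec[k] = 0          # DC and Nyquist bins carry no quadrature part
        elif k < n / 2:
            spec[k] *= -1j       # positive frequencies
        else:
            spec[k] *= 1j        # negative frequencies
    return [v.real for v in dft(spec, inverse=True)]


def ssb_components(x, upper=True):
    """I and Q baseband components: Q = H{x} for USB, Q = -H{x} for LSB."""
    q = hilbert(x)
    return list(x), (q if upper else [-v for v in q])
```

Treating I + jQ as a complex baseband signal, the upper-sideband choice leaves energy only at positive frequencies and the lower-sideband choice only at negative frequencies, which is why two different signals can share one carrier on opposite sidebands.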

A direct-current (DC) signal is added to the I component of the signal to produce a pilot carrier, which is transmitted together with the signal and is used by the receiver to substantially cancel the effects of gain and phase variations or fading in the transmission channel. The I and Q components of the signal are converted to analog form by digital-to-analog converters (DACs) 319 and 327, respectively. The two signals are then filtered by low-pass reconstruction filters 321 and 329, respectively, to remove spurious frequency components produced by the digital-to-analog conversion. Quadrature amplitude modulation (QAM) modulator 333 modulates the I and Q signals onto a radio-frequency (RF) carrier at a low power level. Other modulation methods, such as direct digital synthesis of the modulated signal, would serve the same purpose as the DACs (319 and 327), reconstruction filters (321 and 329) and QAM modulator 333. Finally, linear RF power amplifier 335 amplifies the modulated RF signal to the desired power level, typically 50 watts or more. The output of RF power amplifier 335 is then connected to the transmitting antenna. Other variations can produce substantially the same result. For example, amplitude compression can be performed before time compression, or omitted altogether, and the device would still perform substantially the same function.

Fig. 5 is a block diagram of a second embodiment of a transmitter 400 according to the present invention. In Fig. 5, both the upper and lower sidebands are used to transmit different parts of the same time-compressed signal simultaneously. Transmitter 400 preferably includes anti-aliasing filter 404, ADC 403, band-pass filter 405, AGC 407, time compression block 409, amplitude compression block 411 and band-pass filter 408, connected and configured as in Fig. 4. Block 407 supplies a signal to speech rate detection block 206, which operates as described above for Fig. 3 and supplies a time-scaling factor to time compression block 409. Up to the point where the whole voice message has been processed and stored in buffer 413, the operation of the transmitter of Fig. 5 is the same as that of Fig. 4. The time-compressed speech samples stored in buffer 413 are then divided for transmission on either the upper or the lower sideband. Preferably, the first half of the time-compressed speech signal is transmitted on one sideband and the second half on the other sideband (or, optionally, directly on each of the I and Q components).

The first part of the time-compressed speech signal from buffer 413 is applied to both first Hilbert transform filter 423 and first time delay block 415. Time delay block 415 has the same delay as Hilbert transform filter 423 but does not otherwise affect the signal. The output of the first time delay (through summing circuit 417) and the output of first Hilbert transform filter 423 (through summing circuit 465) are the in-phase (I) and quadrature (Q) signal components which, when connected to the I and Q inputs of the QAM modulator, produce an upper-sideband signal carrying only the information from the first part of the time-compressed speech samples. The second part of the time-compressed speech signal from buffer 413 is applied to both second Hilbert transform filter 461 and second time delay block 457. Second time delay block 457 has the same delay as Hilbert transform filter 461 but does not otherwise affect the signal. The output of the second time delay (through summing circuits 459 and 417) and the negative (463) of the output of second Hilbert transform filter 461 (again through summing circuit 465) are the in-phase (I) and quadrature (Q) signal components which, when connected to the I and Q inputs of the QAM modulator, produce a lower-sideband signal carrying only the information from the second part of the time-compressed speech samples. The I components of the upper and lower sideband signals are added to the DC pilot carrier component (by summing circuit 459) to form the composite I component for transmission. The Q components of the upper and lower sideband signals are added (by summing circuit 465) to form the composite Q component for transmission. It should be understood that components 415, 423, 457, 461, 417, 459, 463, 465, 419, 427, 421 and 429 form a preprocessor that produces preprocessed I and Q signal components which, when connected to QAM modulator 453, produce a low-level subchannel signal with subcarrier FA having two single-sideband signals that carry independent information on each sideband.

Transmitter 400 further includes DACs 419 and 427, reconstruction filters 421 and 429, QAM modulator 433 and RF power amplifier 455, arranged and configured as shown in Fig. 5. The remaining parts of the Fig. 5 transmitter operate in the same way as those of Fig. 4.

Preferably, in both transmitters 300 and 400 of Fig. 4 and Fig. 5, only the anti-aliasing filters, the reconstruction filters, the RF power amplifier and, optionally, the analog-to-digital and digital-to-analog converters are separate hardware components. The remainder of the device can preferably be implemented in software running on a processor, preferably a digital signal processor.

Claims

1. A method of time-scale modification of an input signal, comprising the steps of:

Digitizing the input signal for analysis;

Storing the input signal in a memory;

Dividing the input signal into a plurality of time frames;

Determining a difference in the short-term spectrum between adjacent time frames to provide a first speed factor; and

Time-scaling the input signal in accordance with the first speed factor.

2. the method for claim 1, wherein this method further comprises the step of determining the percentage of silence period in input signal, in order to analyze so that second velocity factor is provided, and then according to first velocity factor and second this input signal of velocity factor time scale.

3. each that the process of claim 1 wherein that the step of determining the difference in the short-term spectrum further is included in many time frames goes up the step of enforcement linear prediction analysis (LPC).

4. the process of claim 1 wherein and determine difference in the short-term spectrum with ITAKURA distance metric D.

5. the process of claim 1 wherein that the step of the percentage of determining silence period comprises the energy of measuring each time frame and the step of zero crossing.

6. The method of claim 1, further comprising the step of determining an average pitch estimate, and then time-scaling the input signal in accordance with the first speed factor and the average pitch estimate.

7. A method of time-scale modification of an input signal, comprising the steps of:

A) digitizing the input signal for analysis;

A) storing the input signal in a memory;

B) dividing the input signal into a plurality of equal time frames to provide a divided input signal;

C) converting the input signal into an LPC-coded digital speech signal;

D) determining a difference in the short-term spectrum between adjacent time frames of the LPC-coded digital speech signal to provide a first speed factor;

E) determining a percentage of silence periods in the LPC-coded digital speech signal by measuring the energy and the zero crossings of each time frame, to provide a second speed factor; and

F) time-scaling the input signal in accordance with the first and second speed factors.

8. A communication system using voice compression and having at least one transmitter base station and a plurality of selective call receivers, comprising:

At the transmitter base station:

An input device for receiving an audio signal;

A memory for storing the audio signal;

A processing device for digitizing the audio signal to provide an input signal for analysis, for determining a difference in the short-term spectrum between adjacent time frames to provide a first speed factor used to determine a time-scaling factor, and for time-scaling the input signal in accordance with the time-scaling factor to provide a time-scaled signal;

A transmitter for transmitting the time-scaled signal;

At each of the plurality of selective call receivers:

A selective call receiver for receiving the time-scaled signal;

A processing device for demodulating and expanding the time-scaled signal in accordance with the time-scaling factor to provide a reconstructed signal; and

An amplifier for amplifying the reconstructed signal into a reconstructed audio signal.

9. The communication system of claim 8, wherein the processing device for digitizing the audio signal further determines, for analysis, a percentage of silence periods in the input signal to provide a second speed factor, and further time-scales the input signal in accordance with the first speed factor and the second speed factor.

10. The communication system of claim 8, wherein the processing device at the transmitter base station determines the difference in the short-term spectrum by performing linear predictive analysis on each of a plurality of equal time frames.

11. The communication system of claim 8, wherein the processor at the transmitter base station determines the difference in the short-term spectrum using an Itakura distance measure D.

12. The communication system of claim 8, wherein the step of determining the percentage of silence periods comprises the step of measuring the energy and the zero crossings of each frame.

13. A transmitting base station, comprising:

An input device for receiving an audio signal;

A memory for storing the audio signal;

A processing device for digitizing the audio signal to provide an input signal for analysis, for dividing the input signal into a plurality of equal time frames, for determining a difference in the short-term spectrum between adjacent time frames to provide a first speed factor, and for time-scaling the input signal in accordance with the first speed factor to provide a time-scaled signal; and

A transmitter for transmitting the time-scaled signal.

14. The transmitting base station of claim 13, wherein the processing device for digitizing the audio signal further determines, for analysis, a percentage of silence periods in the input signal to provide a second speed factor, and further time-scales the input signal in accordance with the first speed factor and the second speed factor.

15. The transmitting base station of claim 13, wherein the processor determines the difference in the short-term spectrum using an Itakura distance measure D.

16. The transmitting base station of claim 13, wherein the step of determining the percentage of silence periods comprises the step of measuring the energy and the zero crossings of each time frame.

17. An electronic device for time-scale modification of speech, comprising:

An input device for receiving an audio signal;

A memory for storing the audio signal;

A processing device for digitizing the audio signal to provide an input signal for analysis, for dividing the input signal into a plurality of equal time frames, for determining a difference in the short-term spectrum between adjacent time frames to provide a first speed factor, and for time-scaling the input signal in accordance with the first speed factor to provide a time-scaled signal.

18. The electronic device of claim 17, wherein the processing device for digitizing the audio signal further determines, for analysis, a percentage of silence periods in the input signal to provide a second speed factor, and further time-scales the input signal in accordance with the first speed factor and the second speed factor.

19. The electronic device of claim 17, wherein the electronic device comprises an answering machine.

20. The electronic device of claim 17, wherein the electronic device comprises a voice mail system.
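The analysis steps recited in the claims above (framing, LPC analysis, the Itakura distance between adjacent frames, and the percentage of silence from energy and zero crossings) can be sketched as follows. This is a minimal Python/NumPy sketch under stated assumptions, not the claimed implementation: the LPC order, the regularization constants, the thresholds, and the rule that a frame counts as silent when both its energy and its zero-crossing count are low are illustrative choices, and all function names are hypothetical. The mapping from these measurements to the speed factors is not sketched here.

```python
import numpy as np

def split_frames(signal, frame_len):
    """Divide a 1-D signal into equal-length frames, dropping any remainder."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    return np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))

def autocorrelation(frame, order):
    """Autocorrelation lags 0..order, lightly regularized (assumed constants)."""
    frame = np.asarray(frame, dtype=float)
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    r[0] = r[0] * (1.0 + 1e-6) + 1e-9  # keeps the normal equations solvable
    return r

def toeplitz_matrix(r, size):
    """Symmetric Toeplitz matrix with entries r[|i - j|]."""
    lags = np.abs(np.subtract.outer(np.arange(size), np.arange(size)))
    return r[lags]

def lpc_coefficients(r):
    """Autocorrelation-method LPC; returns the inverse filter [1, a_1, ..., a_p]."""
    order = len(r) - 1
    a = np.linalg.solve(toeplitz_matrix(r, order), -r[1:])
    return np.concatenate(([1.0], a))

def itakura_distance(frame_a, frame_b, order=10):
    """Itakura distance log((b^T R_a b) / (a^T R_a a)) between two frames."""
    r_a = autocorrelation(frame_a, order)
    a = lpc_coefficients(r_a)
    b = lpc_coefficients(autocorrelation(frame_b, order))
    R_a = toeplitz_matrix(r_a, order + 1)
    return float(np.log((b @ R_a @ b) / (a @ R_a @ a)))

def silence_percentage(frames, energy_thresh, zc_thresh):
    """Percentage of frames with low mean energy and few zero crossings."""
    frames = np.asarray(frames, dtype=float)
    energy = np.mean(frames ** 2, axis=1)
    signs = np.signbit(frames)
    zero_crossings = np.sum(signs[:, 1:] != signs[:, :-1], axis=1)
    silent = (energy < energy_thresh) & (zero_crossings < zc_thresh)
    return 100.0 * float(np.mean(silent))
```

A frame compared with itself gives a distance of zero, and the distance grows as the short-term spectra of adjacent frames diverge, so a large value indicates a rapidly changing spectrum between frames.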