
CN109496333A - A kind of frame loss compensation method and device - Google Patents

A kind of frame loss compensation method and device Download PDF

Info

Publication number
CN109496333A
Authority
CN
China
Prior art keywords
frame
historical
information
future
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201780046044.XA
Other languages
Chinese (zh)
Inventor
高振东
肖建良
刘泽新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN109496333A
Legal status: Pending


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A frame loss compensation method and device, comprising: receiving a voice code stream sequence; obtaining historical frame information and future frame information from the voice code stream sequence, where the sequence includes frame information of a plurality of voice frames comprising at least one historical frame, at least one current frame, and at least one future frame, the at least one historical frame preceding the at least one current frame in the time domain and the at least one future frame following it, the historical frame information being the frame information of the at least one historical frame and the future frame information being the frame information of the at least one future frame; and estimating the frame information of the at least one current frame according to the historical frame information and the future frame information, thereby improving the accuracy of frame loss compensation.

Description

PCT national-phase application; the description has been published.

Claims (23)

  1. A method for frame loss compensation, the method comprising:
    receiving a voice code stream sequence;
    acquiring historical frame information and future frame information in the voice code stream sequence, wherein the voice code stream sequence comprises frame information of a plurality of voice frames, the plurality of voice frames comprise at least one historical frame, at least one current frame and at least one future frame, the at least one historical frame is positioned in front of the at least one current frame in a time domain, the at least one future frame is positioned behind the at least one current frame in the time domain, the historical frame information is frame information of the at least one historical frame, and the future frame information is frame information of the at least one future frame;
    estimating the frame information of the at least one current frame according to the historical frame information and the future frame information.
  2. The method of claim 1, further comprising: storing the voice code stream sequence in a buffer area;
    the acquiring of the historical frame information and the future frame information in the voice code stream sequence comprises:
    decoding frame information of a plurality of voice frames of the voice code stream sequence in the buffer area to obtain decoded historical frame information;
    obtaining, from the buffer area, the future frame information that has not yet been decoded.
  3. The method of claim 1 or 2, wherein the historical frame information comprises formant spectrum information of the at least one historical frame, the future frame information comprises formant spectrum information of the at least one future frame;
    estimating the frame information of the at least one current frame according to the historical frame information and the future frame information comprises:
    determining formant spectrum information of the at least one current frame based on the formant spectrum information of the at least one historical frame and the formant spectrum information of the at least one future frame.
  4. A method according to any one of claims 1 to 3, wherein said historical frame information comprises a pitch value of said at least one historical frame, and said future frame information comprises a pitch value of said at least one future frame;
    estimating the frame information of the at least one current frame according to the historical frame information and the future frame information comprises:
    determining a pitch value of the at least one current frame based on the pitch value of the at least one historical frame and the pitch value of the at least one future frame.
  5. The method of any of claims 1-4, wherein the historical frame information comprises an energy of the at least one historical frame, the future frame information comprises an energy of the at least one future frame;
    estimating the frame information of the at least one current frame according to the historical frame information and the future frame information comprises:
    determining the energy of the at least one current frame according to the energy of the at least one historical frame and the energy of the at least one future frame.
  6. The method of any of claims 1-5, wherein said estimating frame information for the at least one current frame based on the historical frame information and the future frame information comprises:
    determining a frame type of the at least one current frame, the frame type comprising an unvoiced sound or a voiced sound;
    determining at least one of an adaptive codebook gain and a fixed codebook gain of the at least one current frame according to the frame type.
  7. The method of claim 6, wherein said determining the frame type of the at least one current frame comprises:
    determining a magnitude of spectral tilt of the at least one current frame;
    determining the frame type of the at least one current frame according to the spectral tilt of the at least one current frame.
  8. The method of claim 6, wherein said determining the frame type of the at least one current frame comprises:
    obtaining pitch change states of a plurality of subframes in the at least one current frame;
    determining the frame type of the at least one current frame according to the pitch change states of the plurality of subframes.
  9. The method of any of claims 6-8, wherein said determining at least one of an adaptive codebook gain and a fixed codebook gain for the at least one current frame based on the frame type comprises:
    if the frame type is voiced, determining the adaptive codebook gain of the at least one current frame according to the adaptive codebook gain and pitch period of one historical frame and the energy gain of the at least one current frame, and taking the average of the fixed codebook gains of a plurality of historical frames as the fixed codebook gain of the at least one current frame.
  10. The method of any of claims 6-9, wherein said determining at least one of an adaptive codebook gain and a fixed codebook gain for the at least one current frame based on the frame type comprises:
    if the frame type is unvoiced, determining the fixed codebook gain of the at least one current frame according to the fixed codebook gain and pitch period of one historical frame and the energy gain of the at least one current frame, and taking the average of the adaptive codebook gains of a plurality of historical frames as the adaptive codebook gain of the at least one current frame.
  11. The method of claim 9 or 10, wherein the method further comprises:
    determining the energy gain of the at least one current frame according to the time-domain signal magnitude in the decoded historical frame information and the length of each subframe in the historical frame.
  12. A frame loss compensation apparatus, the apparatus comprising:
    a receiving module, configured to receive the voice code stream sequence;
    an obtaining module, configured to obtain historical frame information and future frame information in the speech code stream sequence, where the speech code stream sequence includes frame information of multiple speech frames, the multiple speech frames include at least one historical frame, at least one current frame, and at least one future frame, the at least one historical frame is located before the at least one current frame in a time domain, the at least one future frame is located after the at least one current frame in the time domain, the historical frame information is frame information of the at least one historical frame, and the future frame information is frame information of the at least one future frame;
    a processing module, configured to estimate the frame information of the at least one current frame according to the historical frame information and the future frame information.
  13. The apparatus of claim 12, wherein the sequence of speech codestreams is stored in a buffer;
    the acquisition module is specifically configured to:
    decoding frame information of a plurality of voice frames of the voice code stream sequence in the buffer area to obtain decoded historical frame information;
    obtaining, from the buffer area, the future frame information that has not yet been decoded.
  14. The apparatus of claim 12 or 13, wherein the historical frame information comprises formant spectrum information for the at least one historical frame, the future frame information comprising formant spectrum information for the at least one future frame;
    the processing module is specifically configured to:
    determining the formant spectrum information of the at least one current frame according to the formant spectrum information of the at least one historical frame and the formant spectrum information of the at least one future frame.
  15. The apparatus according to any one of claims 12 to 14, wherein said historical frame information comprises a pitch value of said at least one historical frame, and said future frame information comprises a pitch value of said at least one future frame;
    the processing module is specifically configured to:
    determining a pitch value of the at least one current frame based on the pitch value of the at least one historical frame and the pitch value of the at least one future frame.
  16. The apparatus of any of claims 12 to 15, wherein the historical frame information comprises an energy of the at least one historical frame, the future frame information comprises an energy of the at least one future frame;
    the processing module is specifically configured to:
    determining the energy of the at least one current frame according to the energy of the at least one historical frame and the energy of the at least one future frame.
  17. The apparatus according to any one of claims 12 to 16, wherein the processing module is specifically configured to:
    determining a frame type of the at least one current frame, the frame type comprising an unvoiced sound or a voiced sound;
    determining at least one of an adaptive codebook gain and a fixed codebook gain of the at least one current frame according to the frame type.
  18. The apparatus of claim 17, wherein the processing module is further configured to determine a magnitude of spectral tilt of the at least one current frame;
    and to determine the frame type of the at least one current frame according to the spectral tilt of the at least one current frame.
  19. The apparatus of claim 17, wherein the processing module is further configured to obtain pitch change states of a plurality of subframes in the at least one current frame;
    and to determine the frame type of the at least one current frame according to the pitch change states of the plurality of subframes.
  20. The apparatus according to any one of claims 17 to 19, wherein the processing module is specifically configured to:
    if the frame type is voiced, determining the adaptive codebook gain of the at least one current frame according to the adaptive codebook gain and pitch period of one historical frame and the energy gain of the at least one current frame, and taking the average of the fixed codebook gains of a plurality of historical frames as the fixed codebook gain of the at least one current frame.
  21. The apparatus according to any one of claims 17 to 19, wherein the processing module is specifically configured to:
    if the frame type is unvoiced, determining the fixed codebook gain of the at least one current frame according to the fixed codebook gain and pitch period of one historical frame and the energy gain of the at least one current frame, and taking the average of the adaptive codebook gains of a plurality of historical frames as the adaptive codebook gain of the at least one current frame.
  22. The apparatus of claim 20 or 21, wherein the processing module is further configured to determine the energy gain of the at least one current frame according to the time-domain signal magnitude in the decoded historical frame information and the length of each subframe in the historical frame.
  23. A frame loss compensation apparatus, comprising: a memory, a communication bus, and a vocoder, the memory coupled to the vocoder through the communication bus; wherein the memory is configured to store program code, and the vocoder is configured to call the program code to:
    receiving a voice code stream sequence;
    acquiring historical frame information and future frame information in the voice code stream sequence, wherein the voice code stream sequence comprises frame information of a plurality of voice frames, the plurality of voice frames comprise at least one historical frame, at least one current frame and at least one future frame, the at least one historical frame is positioned in front of the at least one current frame in a time domain, the at least one future frame is positioned behind the at least one current frame in the time domain, the historical frame information is frame information of the at least one historical frame, and the future frame information is frame information of the at least one future frame;
    estimating the frame information of the at least one current frame according to the historical frame information and the future frame information.
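The core idea of claim 1, estimating a lost current frame from both past (historical) and future frame information, can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the `FrameInfo` fields, the single-history/single-future case, and the linear interpolation weight are all assumptions.

```python
# Sketch of the method of claim 1: when a current frame is lost,
# estimate its parameters by interpolating between the frame
# information of a decoded historical frame and a buffered future
# frame, instead of extrapolating from history alone.

from dataclasses import dataclass

@dataclass
class FrameInfo:
    pitch: float   # pitch (lag) value
    energy: float  # frame energy
    lsf: tuple     # formant spectrum information, e.g. LSF coefficients

def compensate_lost_frame(history: FrameInfo, future: FrameInfo,
                          weight: float = 0.5) -> FrameInfo:
    """Estimate the lost current frame from one historical and one
    future frame by weighted interpolation of each parameter."""
    lerp = lambda a, b: (1.0 - weight) * a + weight * b
    return FrameInfo(
        pitch=lerp(history.pitch, future.pitch),
        energy=lerp(history.energy, future.energy),
        lsf=tuple(lerp(h, f) for h, f in zip(history.lsf, future.lsf)),
    )

hist = FrameInfo(pitch=60.0, energy=1.0, lsf=(0.2, 0.5, 0.8))
fut = FrameInfo(pitch=64.0, energy=0.8, lsf=(0.3, 0.6, 0.9))
est = compensate_lost_frame(hist, fut)  # pitch 62.0, energy 0.9
```

Because the future frame is taken from the jitter buffer before it is decoded (claim 2), this two-sided estimate is what distinguishes the approach from purely history-based concealment.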
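Claim 3 determines the formant spectrum information of the lost frame from that of the historical and future frames. In CELP-family codecs formant spectrum information is commonly represented as line spectral frequencies (LSFs); the sketch below interpolates LSFs and enforces the ascending-order, minimum-gap property needed for a stable synthesis filter. The LSF representation, the equal weighting, and the gap value are assumptions, not details taken from the patent.

```python
# Sketch of claim 3: interpolate the LSF vectors of the historical and
# future frames, then repair ordering so the implied LPC filter stays
# stable.

def interpolate_lsf(hist_lsf, future_lsf, min_gap=0.01):
    """Estimate the lost frame's LSFs as the mean of its neighbours."""
    est = [0.5 * h + 0.5 * f for h, f in zip(hist_lsf, future_lsf)]
    est.sort()                    # LSFs must be strictly increasing
    for i in range(1, len(est)):  # enforce a minimum separation
        if est[i] - est[i - 1] < min_gap:
            est[i] = est[i - 1] + min_gap
    return est
```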
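Claims 4 and 5 derive the pitch value and the energy of the lost frame from the corresponding values of the historical and future frames. The claims do not specify how the values are combined; the weighted average below, biased toward the historical frame as many concealment schemes are, is purely an assumed illustration.

```python
# Sketch of claims 4 and 5: pitch and energy of the lost frame as
# history-weighted means of the neighbouring frames' values.

def estimate_pitch(hist_pitch: float, future_pitch: float,
                   w_hist: float = 0.7) -> float:
    """Pitch of the lost frame, weighted toward the historical frame."""
    return w_hist * hist_pitch + (1.0 - w_hist) * future_pitch

def estimate_energy(hist_energy: float, future_energy: float,
                    w_hist: float = 0.7) -> float:
    """Energy of the lost frame, same weighting as the pitch."""
    return w_hist * hist_energy + (1.0 - w_hist) * future_energy
```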
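Claims 6 to 8 classify the lost frame as voiced or unvoiced before choosing gains. Claim 7 bases the decision on spectral tilt (approximated here by the normalized first autocorrelation coefficient of recently decoded samples: positive for low-frequency-dominated voiced speech, negative for noise-like unvoiced speech) and claim 8 on the pitch change across subframes (stable pitch suggests voiced speech). The thresholds and the combination of the two cues are illustrative assumptions.

```python
# Sketch of claims 6-8: voiced/unvoiced decision from spectral tilt
# and subframe pitch stability.

def spectral_tilt(samples):
    """Normalized first autocorrelation coefficient r(1)/r(0)."""
    r0 = sum(x * x for x in samples)
    r1 = sum(a * b for a, b in zip(samples, samples[1:]))
    return r1 / r0 if r0 > 0 else 0.0

def frame_type(samples, subframe_pitches,
               tilt_threshold=0.2, max_pitch_jitter=0.1):
    """Classify the frame; both cues must indicate voiced speech."""
    tilt = spectral_tilt(samples)
    mean_pitch = sum(subframe_pitches) / len(subframe_pitches)
    jitter = max(abs(p - mean_pitch) for p in subframe_pitches) / mean_pitch
    if tilt > tilt_threshold and jitter < max_pitch_jitter:
        return "voiced"
    return "unvoiced"
```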
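Claims 9 to 11 estimate the excitation gains of the lost frame: an energy gain derived from the decoded time-domain signal and the subframe length (claim 11), a scaled gain for the codebook matching the frame type, and a historical average for the other codebook. The sketch below illustrates the voiced case of claim 9. The claims do not spell out how the historical gain, pitch period, and energy gain are combined, so the RMS-based energy gain and the multiplicative scaling (which omits the pitch period) are assumptions.

```python
# Sketch of claims 9 and 11, voiced case: scale the historical
# adaptive codebook gain by an energy gain computed from the decoded
# signal, and average recent fixed codebook gains.

def energy_gain(decoded_samples, subframe_len):
    """Average per-subframe RMS of the last decoded historical frame."""
    n = len(decoded_samples) // subframe_len
    rms = []
    for i in range(n):
        sf = decoded_samples[i * subframe_len:(i + 1) * subframe_len]
        rms.append((sum(x * x for x in sf) / subframe_len) ** 0.5)
    return sum(rms) / len(rms)

def voiced_gains(hist_adaptive_gain, e_gain, hist_fixed_gains):
    """Adaptive gain scaled by the energy gain; fixed gain averaged."""
    adaptive = hist_adaptive_gain * e_gain
    fixed = sum(hist_fixed_gains) / len(hist_fixed_gains)
    return adaptive, fixed
```

The unvoiced case of claim 10 is symmetric: the fixed codebook gain is scaled and the adaptive codebook gain is averaged over historical frames.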
CN201780046044.XA 2017-06-26 2017-06-26 A kind of frame loss compensation method and device Pending CN109496333A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/090035 WO2019000178A1 (en) 2017-06-26 2017-06-26 Frame loss compensation method and device

Publications (1)

Publication Number Publication Date
CN109496333A true CN109496333A (en) 2019-03-19

Family

ID=64740767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780046044.XA Pending CN109496333A (en) 2017-06-26 2017-06-26 A kind of frame loss compensation method and device

Country Status (2)

Country Link
CN (1) CN109496333A (en)
WO (1) WO2019000178A1 (en)


Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004239930A (en) * 2003-02-03 2004-08-26 Iwatsu Electric Co Ltd Pitch detection method and apparatus for packet loss compensation
KR20050024651A (en) * 2003-09-01 2005-03-11 한국전자통신연구원 Method and apparatus for frame loss concealment for packet network
CN1659625A (en) * 2002-05-31 2005-08-24 沃伊斯亚吉公司 Method and device for efficient frame erasure concealment in linear predictive based speech codecs
CN101009098A (en) * 2007-01-26 2007-08-01 清华大学 Sound coder gain parameter division-mode anti-channel error code method
CN101147190A (en) * 2005-01-31 2008-03-19 高通股份有限公司 Frame erasure concealment in voice communications
CN101379551A (en) * 2005-12-28 2009-03-04 沃伊斯亚吉公司 Method and device for efficient frame erasure concealment in speech codecs
US20090240490A1 (en) * 2008-03-20 2009-09-24 Gwangju Institute Of Science And Technology Method and apparatus for concealing packet loss, and apparatus for transmitting and receiving speech signal
CN101630242A (en) * 2009-07-28 2010-01-20 苏州国芯科技有限公司 Contribution module for rapidly computing self-adaptive code book by G723.1 coder
CN101894558A (en) * 2010-08-04 2010-11-24 华为技术有限公司 Lost frame recovering method and equipment as well as speech enhancing method, equipment and system
CN102449690A (en) * 2009-06-04 2012-05-09 高通股份有限公司 Systems and methods for reconstructing an erased speech frame
CN103325375A (en) * 2013-06-05 2013-09-25 上海交通大学 Coding and decoding device and method of ultralow-bit-rate speech
CN103714820A (en) * 2013-12-27 2014-04-09 广州华多网络科技有限公司 Packet loss hiding method and device of parameter domain
CN106251875A (en) * 2016-08-12 2016-12-21 广州市百果园网络科技有限公司 The method of a kind of frame losing compensation and terminal


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111836117A (en) * 2019-04-15 2020-10-27 深信服科技股份有限公司 Method and device for sending supplementary frame data and related components
CN111836117B (en) * 2019-04-15 2022-08-09 深信服科技股份有限公司 Method and device for sending supplementary frame data and related components
CN111554308A (en) * 2020-05-15 2020-08-18 腾讯科技(深圳)有限公司 Voice processing method, device, equipment and storage medium
CN111711992A (en) * 2020-06-23 2020-09-25 瓴盛科技有限公司 Calibration method for CS voice downlink jitter
CN111711992B (en) * 2020-06-23 2023-05-02 瓴盛科技有限公司 CS voice downlink jitter calibration method
CN112489665A (en) * 2020-11-11 2021-03-12 北京融讯科创技术有限公司 Voice processing method and device and electronic equipment
CN112489665B (en) * 2020-11-11 2024-02-23 北京融讯科创技术有限公司 Voice processing method and device and electronic equipment
CN112634912A (en) * 2020-12-18 2021-04-09 北京猿力未来科技有限公司 Packet loss compensation method and device
CN112634912B (en) * 2020-12-18 2024-04-09 北京猿力未来科技有限公司 Packet loss compensation method and device

Also Published As

Publication number Publication date
WO2019000178A1 (en) 2019-01-03

Similar Documents

Publication Publication Date Title
CN109496333A (en) A kind of frame loss compensation method and device
JP5934259B2 (en) Noise generation in audio codecs
KR20230043250A (en) Synthesis of speech from text in a voice of a target speaker using neural networks
US7877253B2 (en) Systems, methods, and apparatus for frame erasure recovery
EP3611729B1 (en) Bandwidth extension method and apparatus
US8655656B2 (en) Method and system for assessing intelligibility of speech represented by a speech signal
EP2954524B1 (en) Systems and methods of performing gain control
KR102007972B1 (en) Unvoiced/voiced decision for speech processing
JP6616470B2 (en) Encoding method, decoding method, encoding device, and decoding device
JP6086999B2 (en) Apparatus and method for selecting one of first encoding algorithm and second encoding algorithm using harmonic reduction
CN104937662B (en) System, method, equipment and the computer-readable media that adaptive resonance peak in being decoded for linear prediction sharpens
JP2018528480A (en) Encoder and method for encoding an audio signal with reduced background noise using linear predictive coding
JP2016504637A5 (en)
JP2016539355A5 (en)
WO2008072671A1 (en) Audio decoding device and power adjusting method
CN105336336B (en) The temporal envelope processing method and processing device of a kind of audio signal, encoder
JP6626123B2 (en) Audio encoder and method for encoding audio signals
WO2014130083A1 (en) Systems and methods for determining pitch pulse period signal boundaries
CN111862967B (en) Voice recognition method and device, electronic equipment and storage medium
CN113409792A (en) Voice recognition method and related equipment thereof
JPWO2008072732A1 (en) Speech coding apparatus and speech coding method
CN101266798B (en) A method and device for gain smoothing in voice decoder
JP2006039559A (en) Voice coding apparatus and method using PLP of mobile communication terminal
JP2004061558A (en) Method and device for code conversion between speed encoding and decoding systems and storage medium therefor
JPH07135490A (en) Speech detector and speech encoder having speech detector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190319
