
WO2007037641A1 - Optional encoding system and method for operating the system - Google Patents


Info

Publication number
WO2007037641A1
WO2007037641A1 PCT/KR2006/003903 KR2006003903W WO2007037641A1 WO 2007037641 A1 WO2007037641 A1 WO 2007037641A1 KR 2006003903 W KR2006003903 W KR 2006003903W WO 2007037641 A1 WO2007037641 A1 WO 2007037641A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio data
user
data
user terminal
terminal
Prior art date
Application number
PCT/KR2006/003903
Other languages
French (fr)
Inventor
Yun Ho Jeon
Original Assignee
Realnetworks Asia Pacific Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Realnetworks Asia Pacific Co., Ltd.
Priority to CN2006800359075A (patent CN101273405B)
Publication of WO2007037641A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90 Pitch determination of speech signals

Definitions

  • the present invention relates to a method and system for receiving audio data from a predetermined server, encoding the audio data via a predetermined encoder and providing a user terminal with the audio data.
  • the encoder may be variably set based on a characteristic of the audio data; when the audio data includes voice data at more than a predetermined ratio, the encoder includes Qualcomm code excited linear prediction (QCELP), enhanced variable rate codec (EVRC), adaptive multi-rate (AMR), and the like.
  • QCELP Qualcomm code excited linear prediction
  • EVRC enhanced variable rate codec
  • AMR adaptive multi-rate
  • the audio contents are initially required to be downloaded to a computer terminal.
  • the audio contents downloaded to the computer terminal are encoded with an audio compression technology such as the MP3 method, the advanced audio coding (AAC) method, and the like, and are transmitted in that encoded form to a mobile terminal such as a Moving Picture Experts Group Audio Layer 3 (MP3) player, a mobile phone, and the like.
  • MP3 Moving Picture Experts Group Audio Layer 3
  • AAC advanced audio coding
  • the mobile terminal may replay compressed audio contents by decoding the compressed audio contents.
  • the computer terminal may download audio contents, such as news broadcasts and the like, at a predetermined interval from a server providing the audio data, encode the audio contents, and provide the mobile terminal with the audio contents.
  • the mobile terminal further includes a memory device in which the audio contents may be recorded.
  • mobile terminals in wide use at present generally have a memory capacity of tens or hundreds of megabytes (MB).
  • the memory capacity may be insufficient for recording audio contents that are encoded at a high bit rate.
  • a technology that maximally compresses or encodes the audio contents recorded in the memory device is required.
  • the audio data received from a predetermined server is already encoded in a particular method when receiving the audio data.
  • a technology that can increase memory efficiency of the mobile terminal and reduce a load of a transmission channel is required.
  • audio data such as music, and the like
  • any voice-centered contents where sound quality is generally not a concern require a bit rate of at least 32 Kbps
  • a vocoder that is optimized for human voice, for example, the enhanced variable rate codec (EVRC)
  • according to the conventional art, a sound source in a rich site summary (RSS) feed or a podcast is generally provided in an encoding method such as the MP3 method, the AAC method, and the like, regardless of whether it is voice data or music data. Accordingly, the mobile terminal often stores voice data compressed at a higher bit rate than necessary. Thus, there is a problem that the memory of the mobile terminal is used inefficiently.
  • An aspect of the present invention provides a method and system for increasing usage efficiency of a memory device of a mobile terminal.
  • An aspect of the present invention also provides a method and system for reducing a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of data, and transmitting the audio data to a second user terminal via the wireless communication network.
  • a method of variably encoding audio data including: receiving the audio data from a predetermined server; determining whether voice data is contained in the audio data by analyzing a data format of the audio data; generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
  • a system for variably encoding audio data including: a receiver receiving the audio data from a predetermined server; a converter determining whether voice data is contained in the audio data by analyzing a data format of the audio data, and generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and a transmitter transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on conversion information.
  • FIG. 1 is a diagram illustrating a network including a variable encoding system, a server and a second user terminal according to the present invention
  • FIG. 2 is a flowchart illustrating an operation based on a method of variably encoding audio data according to the present invention
  • FIG. 3 and FIG. 4 are diagrams illustrating examples of networks including a variable encoding system, a server and a second user terminal according to the present invention
  • FIG. 5 is a diagram illustrating data formats of audio data and second audio data according to an exemplary embodiment of the present invention
  • FIG. 6 is a block diagram illustrating an internal configuration of a variable encoding system according to an exemplary embodiment of the present invention
  • FIG. 7 is an internal block diagram of a general-purpose computer apparatus which can be adopted in implementing a variable encoding method according to the present invention.
  • Best Mode for Carrying Out the Invention
  • FIG. 1 is a diagram illustrating a network including a variable encoding system, a web server, and a second user terminal according to the present invention.
  • a variable encoding system 100 receives predetermined audio data from a web server 110.
  • the web server 110 according to an embodiment of the present invention provides a podcasting service or a rich site summary (RSS) service. Accordingly, the variable encoding system 100 receives the audio data from the web server 110 in a predetermined cycle.
  • the audio data may include music data, voice data, or broadcasting data.
  • the variable encoding system 100 that receives the audio data analyzes the audio data and identifies whether voice data is contained in the audio data. Identifying whether the voice data is contained in the audio data by analyzing the audio data may use a conventional art.
  • a method of determining whether the sound is cut off at a ratio greater than a predetermined ratio can be used.
  • whether the voice data is contained in the audio data may be determined by checking whether a predetermined pitch is detected from the audio data, or by identifying the frequency of the audio data and checking whether it is concentrated in a predetermined band.
  • a current mobile communication terminal controls a transmission band in real-time via a function such as a voice activity detector (VAD), discontinuous transmission (DTX), or variable rate codec (VRC) and the like.
  • VAD voice activity detector
  • DTX discontinuous transmission
  • VRC variable rate codec
  • the variable encoding system 100 has more time available for analyzing the audio data, and may therefore determine in comparatively greater detail whether the voice data is contained in the audio data.
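The "sound is cut off" criterion described above can be illustrated with a minimal sketch. It assumes the audio is already decoded to floating-point samples, and the frame length, energy threshold, and decision ratio are hypothetical values chosen for illustration; the source does not specify them, and a real detector would likely combine this with pitch and frequency-band checks.

```python
from typing import Sequence


def silence_ratio(samples: Sequence[float], frame_len: int = 160,
                  energy_threshold: float = 1e-3) -> float:
    """Return the fraction of whole frames whose mean energy is below the threshold.

    frame_len and energy_threshold are illustrative assumptions, not values
    taken from the patent text.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    quiet = sum(1 for f in frames
                if sum(x * x for x in f) / len(f) < energy_threshold)
    return quiet / len(frames)


def looks_like_voice(samples: Sequence[float],
                     min_silence_ratio: float = 0.2) -> bool:
    """Treat audio with frequent pauses as voice (hypothetical decision rule)."""
    return silence_ratio(samples) > min_silence_ratio
```

Speech tends to contain pauses between phrases while music is usually continuous, which is why the ratio of quiet frames can serve as a rough cue.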
  • the variable encoding system 100 that receives the audio data from the server 110 determines whether the voice data is contained in the audio data, and encodes the voice data via a predetermined vocoder when the voice data is contained in the audio data.
  • the variable encoding system 100 may use a vocoder such as Qualcomm code excited linear prediction (QCELP), enhanced variable rate codec (EVRC), adaptive multi-rate (AMR), and the like.
  • the second audio data is generated from the audio data after encoding via the vocoder.
  • the second audio data may be encoded at a bit rate corresponding to about 8 Kbps when the EVRC is used for the audio data including the voice data.
  • the variable encoding system 100 does not encode the audio data again.
  • a second user terminal 120 receives the second audio data from the variable encoding system 100.
  • the variable encoding system 100 is a computer terminal that receives the audio data from a service where audio contents are provided in the podcasting service or a similar method. Accordingly, the variable encoding system 100 receives the audio data from the server 110 via a wired/wireless Internet communication network. Also, the variable encoding system 100 variably encodes the second audio data, or transmits the audio data as is to the second user terminal 120.
  • the second user terminal 120 is a mobile terminal such as a mobile communication terminal, a Moving Picture Experts Group Audio Layer 3 (MP3) player, a PlayStation Portable (PSP), a portable multimedia player (PMP), a personal digital assistant (PDA), or an electronic notebook and the like, and the computer terminal transmits the second audio data by connecting with the second user terminal 120.
  • the variable encoding system 100 according to an exemplary embodiment of the present invention is a predetermined independent server. Accordingly, the variable encoding system 100 receives the audio data from the server 110 via the wired/wireless communication network, variably generates the second audio data from the audio data, or transmits the audio data as is to the second user terminal 120.
  • the second user terminal 120 is the mobile communication terminal, and the variable encoding system 100 wirelessly transmits the second audio data to the mobile communication terminal via a data channel.
  • the variable encoding system 100 according to the present invention may have an effect such as an increase of memory efficiency of the second user terminal 120, load reduction of a transmission channel, and the like.
  • the variable encoding system 100 according to the present invention may reduce the total volume of the audio data by re-encoding only the voice data at a lower bit rate when the voice data is partially or fully contained in the audio data.
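To make the saving concrete, the following sketch works through the arithmetic with the two bit rates mentioned in the text (about 8 Kbps for EVRC-encoded voice, 128 Kbps for music). The programme length and the voice/music split are hypothetical, and constant bit rates with 1 Kbps = 1000 bits per second are assumed.

```python
def size_bytes(duration_s: float, bitrate_kbps: float) -> float:
    """Size of a constant-bit-rate stream, taking 1 Kbps as 1000 bits per second."""
    return duration_s * bitrate_kbps * 1000 / 8


# Hypothetical 30-minute programme: 20 minutes of narration, 10 minutes of music.
voice_s, music_s = 20 * 60, 10 * 60

# Everything kept at 128 Kbps, as delivered by the server.
original = size_bytes(voice_s + music_s, 128)

# Only the voice portion re-encoded at 8 Kbps; music left at 128 Kbps.
mixed = size_bytes(voice_s, 8) + size_bytes(music_s, 128)

print(f"original: {original / 1e6:.1f} MB, variably encoded: {mixed / 1e6:.1f} MB")
```

Under these assumptions the variably encoded file is well under half the size of the original, which matters on a terminal with only tens or hundreds of megabytes of memory.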
  • FIG. 2 is a flowchart illustrating an operation based on a method of variably encoding audio data according to the present invention.
  • a server transmits predetermined audio data to the variable encoding system according to an embodiment of the present invention.
  • the server is a system that provides a podcasting service or an RSS service. Accordingly, the variable encoding system checks the server at a predetermined interval to identify an updated audio data list, and requests transmission of the audio data when there is updated audio data.
  • the variable encoding system receives the audio data from the server and analyzes a data format.
  • the audio data includes data such as broadcasting, music, a song, a voice, and the like.
  • the audio data has particular properties depending on the data format, and a characteristic of the audio data may be determined from those properties by analyzing the frequency band, detecting pitch, checking whether the sound is cut off, and the like.
  • a characteristic of the audio data is determined by using the conventional art as is.
  • the variable encoding system determines whether the voice data is contained in the audio data based on an analysis result of the data format.
  • the variable encoding system determines whether the voice data is contained in the audio data by analyzing the frequency band, pitch detection, whether the sound is cut off, and the like.
  • the variable encoding system according to an embodiment of the present invention divides one item of audio data into predetermined portions and identifies each portion of the audio data that contains the voice data.
  • each portion is given an index, and whether the indexed portion includes the voice data is recorded in a predetermined memory device.
  • the variable encoding system transmits the audio data as is to the second user terminal.
  • the variable encoding system encodes only a portion corresponding to the voice data among the audio data via a predetermined vocoder in operation 204 and generates the second audio data in operation 205.
  • the variable encoding system according to an exemplary embodiment of the present invention encodes a predetermined portion corresponding to the voice data among the audio data via the vocoder.
  • the variable encoding system encodes only the middle portion via the vocoder, and generates the second audio data by inserting identification information such as a predetermined flag or index information and the like into a beginning location of the middle portion or recombining conversion information such as vocoder information and the like.
  • the second audio data has a different bit rate for each partial interval.
  • the audio data may be encoded in the portion corresponding to the voice data at an 8 Kbps bit rate and be encoded in the portion corresponding to the music data at a 128 Kbps bit rate.
  • the variable encoding system may encode the entire audio data at a bit rate corresponding to the voice data when the proportion of voice data in the audio data exceeds a predetermined ratio.
  • the predetermined ratio may be set by a developer or an operator of the variable encoding system.
  • the variable encoding system transmits the generated second audio data to the second user terminal.
  • the variable encoding system may be embodied in a user's computer terminal and the second user terminal may be a mobile terminal such as a mobile phone, a PDA, an electronic notebook, a PMP, a PSP, an MP3 player, and the like.
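The per-portion decision described for FIG. 2 can be sketched as follows. This is an illustrative outline only: the `Portion` type, the `plan_bitrates` function, and the 0.9 default for the "predetermined ratio" are hypothetical names and values, and the voice/music flag is assumed to come from an earlier analysis step. The 8 Kbps and 128 Kbps figures are the examples given in the text.

```python
from dataclasses import dataclass
from typing import List

VOICE_KBPS = 8    # example from the text: voice portion via a vocoder such as EVRC
MUSIC_KBPS = 128  # example from the text: music portion


@dataclass
class Portion:
    index: int         # position of the portion within the audio data
    duration_s: float  # length of the portion in seconds
    is_voice: bool     # result of the (assumed) voice analysis step


def plan_bitrates(portions: List[Portion],
                  whole_file_ratio: float = 0.9) -> List[int]:
    """Pick a bit rate per portion.

    If voice exceeds whole_file_ratio of the total duration (a hypothetical
    value for the operator-set ratio), the whole file is encoded at the voice
    bit rate; otherwise only the voice portions are.
    """
    total = sum(p.duration_s for p in portions)
    voice = sum(p.duration_s for p in portions if p.is_voice)
    if total > 0 and voice / total > whole_file_ratio:
        return [VOICE_KBPS] * len(portions)
    return [VOICE_KBPS if p.is_voice else MUSIC_KBPS for p in portions]
```

For example, a music / narration / music sequence keeps its music portions at 128 Kbps, while a file that is almost entirely narration is planned at 8 Kbps throughout.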
  • the exemplary embodiment of the present invention is described in detail with reference to FIG. 3.
  • FIG. 3 is a diagram illustrating an example of a network including a variable encoding system, a server and a second user terminal according to the present invention.
  • variable encoding system 300 may be embodied on a computer terminal 310.
  • the variable encoding system 300 is a predetermined application program or hardware located in the computer terminal 310.
  • a server 301 transmits the audio data to the computer terminal 310 via a network 302 in a predetermined cycle based on the podcasting service or the RSS service.
  • the network 302 may be considered as a wired/wireless network to provide the computer terminal 310 with Internet communication capacity.
  • the computer terminal 310 that receives the audio data via the network 302 determines whether the voice data is contained in the audio data in the variable encoding system 300.
  • When the voice data is contained in the audio data, the variable encoding system 300 generates the second audio data after encoding the audio data via the vocoder.
  • the computer terminal 310 transmits the second audio data that the variable encoding system 300 generates to the second user terminal.
  • the second user terminal is a mobile terminal, such as an MP3 player 304, a mobile communication terminal 305, a PlayStation 306, and the like, having a predetermined memory device.
  • the second user terminal connects with the variable encoding system 300 via a short-distance communication module such as a universal serial bus (USB) module, a recommended standard-232C (RS-232C) module, a Bluetooth module, and the like, and the variable encoding system 300 transmits the second audio data to the second user terminal by identifying a connection of the second user terminal.
  • variable encoding system is a predetermined independent server and the second user terminal is the mobile communication terminal.
  • the exemplary embodiment of the present invention is described in detail with reference to FIG. 4.
  • FIG. 4 is a diagram illustrating an example of a network including a variable encoding system, a server and a second user terminal according to an embodiment of the present invention.
  • variable encoding system 400 receives predetermined audio data from a server 401 via a network 402.
  • the network 402 may be interpreted broadly to include all wired/wireless communication networks.
  • variable encoding system 400 that receives the audio data identifies whether the voice data is contained in the audio data, and generates the second audio data after encoding the audio data via the predetermined vocoder when the voice data is contained in the audio data. Also, the generated second audio data is transmitted to the second user terminal via the network 403.
  • the second user terminal is a mobile communication terminal 404 and the network 403 includes a wireless communication network including a predetermined communication provider system.
  • the variable encoding system 400 requests the communication provider system to establish a channel with the mobile communication terminal 404.
  • the communication provider system sets up a wireless channel between the variable encoding system 400 and the mobile communication terminal 404, and the variable encoding system 400 wirelessly transmits the second audio data to the mobile communication terminal 404 via the wireless channel.
  • the mobile communication terminal 404 queries the variable encoding system 400 at a predetermined interval as to whether there is second audio data to be transmitted, and requests the variable encoding system 400 to transmit the second audio data when there is.
  • the variable encoding system 400 may reduce memory usage of the mobile communication terminal 404 by efficiently reducing a volume of the audio data, and reduce the load of the transmission channel based on the mobile communication network.
  • the second user terminal decodes the second audio data based on the conversion information and provides the user with the second audio data via a predetermined speaker device.
  • the variable encoding system maintains a user database recording user information about at least one user.
  • the user information includes identification information of the second user terminal corresponding to the user, and telephone number information may be used as an example of the identification information.
  • the variable encoding system reads and extracts the user information corresponding to the second user terminal by referring to the user database to transmit the generated second audio data to the second user terminal, and wirelessly transmits the second audio data to the second user terminal based on the identification information corresponding to the user information.
  • the second user terminal is the mobile communication terminal such as the mobile phone.
  • FIG. 5 is a diagram illustrating data formats of audio data and second audio data according to an exemplary embodiment of the present invention.
  • audio data according to an exemplary embodiment of the present invention is 'A.MP3'.
  • 'A.MP3' includes a plurality of playlists, and the variable encoding system identifies whether the voice data is contained in the audio data by analyzing each playlist.
  • 'A.MP3' is radio broadcasting and may include narration data of an announcer and music data.
  • the variable encoding system determines that 'A1' and 'A3' are the music data, and 'A2' and 'A4' are the narration data of the announcer.
  • the variable encoding system encodes 'A1' and 'A3', which are determined to be the music data, and encodes 'A2' and 'A4' by using the predetermined vocoder. Specifically, the variable encoding system analyzes the audio data of each playlist and, based on the result of the analysis, applies heterogeneous encoding to each playlist. In this instance, the second user terminal is required to have a function to replay each list based on the playlist. As with reference numeral 501, analyzing each playlist separately may prevent audio data that consists largely of voice data from being determined as music data or song data merely because music data or song data appears at the beginning of the audio data.
  • variable encoding system deletes the playlist from the reference numeral 501, inserts conversion information related to encoding in each playlist, and recombines the playlist into one audio data.
  • predetermined software that can decode audio data encoded via a plurality of encoders is required. Since such software is a well-known and commonly used technology, a detailed description is omitted.
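One way to picture the "conversion information" that is inserted ahead of each portion before the portions are recombined into one audio data is a small per-portion header. The layout below (a 1-byte codec identifier followed by a 4-byte big-endian payload length) and the codec identifiers are purely hypothetical; the source does not define a concrete byte format.

```python
import struct
from typing import List, Tuple

# Hypothetical codec identifiers; the patent does not define concrete values.
CODEC_MUSIC = 0
CODEC_VOCODER = 1

# Assumed header layout: 1-byte codec id, 4-byte big-endian payload length.
_HEADER = struct.Struct(">BI")


def pack_portions(portions: List[Tuple[int, bytes]]) -> bytes:
    """Recombine (codec_id, payload) portions into one byte stream."""
    out = bytearray()
    for codec_id, payload in portions:
        out += _HEADER.pack(codec_id, len(payload))
        out += payload
    return bytes(out)


def unpack_portions(blob: bytes) -> List[Tuple[int, bytes]]:
    """Split the stream back into portions so each can go to the right decoder."""
    portions = []
    offset = 0
    while offset < len(blob):
        codec_id, length = _HEADER.unpack_from(blob, offset)
        offset += _HEADER.size
        portions.append((codec_id, blob[offset:offset + length]))
        offset += length
    return portions
```

With such a header, the second user terminal can read the codec identifier of each portion and hand the payload to the matching decoder, which is the role the text assigns to the conversion information.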
  • FIG. 6 is a block diagram illustrating an internal configuration of a variable encoding system according to an exemplary embodiment of the present invention
  • the variable encoding system 600 includes a receiver 601, a converter 602, and a transmitter 603.
  • the receiver 601 receives audio data from a predetermined server.
  • the server is a general server that provides audio data such as voice, music, songs, broadcasts, and the like.
  • the audio data may be either encoded data or unprocessed data.
  • the converter 602 determines whether voice data is contained in the audio data by analyzing a data format of the audio data that is received from the receiver 601, and generates second audio data by encoding the audio data via a predetermined vocoder when the voice data is contained in the audio data.
  • the converter 602 according to an exemplary embodiment of the present invention determines, for each of a plurality of data items into which the received audio data is divided based on a predetermined playlist, whether the item is voice data. Accordingly, each of the data items is encoded separately and differently, and the plurality of data items are combined to generate the second audio data.
  • the second audio data includes conversion information about the vocoder and the encoding.
  • the converter 602 generates the second audio data from the audio data via a particular encoder in response to a command of a user.
  • the user may specify the particular encoder with which the audio data is encoded into the second audio data, based on the user's taste or on encoding errors. For example, the user may set music data or song data to be encoded via the vocoder, according to the memory capacity of the second user terminal.
  • the transmitter 603 transmits the generated second audio data to the user terminal.
  • the variable encoding system 600 according to an exemplary embodiment of the present invention is included in a predetermined computer terminal in the form of an application program or hardware.
  • the receiver 601 receives the audio data from a predetermined server via an Internet communication network in a wired/wireless form, and the converter 602 determines whether the voice data is contained in the audio data and generates the second audio data by encoding the audio data via the vocoder when the voice data is contained in the audio data.
  • the transmitter 603 transmits the second audio data to the second user terminal.
  • a short-distance communication module such as a USB module, an RS-232C module, an ultra wideband (UWB) module, a Bluetooth module, a wireless local area network (LAN), and the like
  • the variable encoding system 600 is a predetermined independent server.
  • the receiver 601 receives the audio data from the server via a wired/wireless communication network, and the converter 602 generates the second audio data according to whether the voice data is contained in the audio data.
  • the transmitter 603 wirelessly transmits the second audio data to the second user terminal.
  • the second user terminal includes a mobile communication terminal, a public switched telephone network (PSTN) terminal, voice over Internet protocol (VoIP), session initiation protocol (SIP), a media gateway controller (Megaco), a personal digital assistant (PDA), a cellular phone, a personal communication service (PCS) phone, a hand-held personal computer (PC), a code division multiple access (CDMA)-2000 (1X, 3X) phone, a wideband CDMA (WCDMA) phone, a dual band/dual mode phone, a global system for mobile communication (GSM) phone, a mobile broadband system (MBS) phone, a satellite/terrestrial digital multimedia broadcasting (DMB) phone, and the like, as a predetermined communication terminal.
  • PSTN public switched telephone network
  • VoIP voice over Internet protocol
  • SIP session initiation protocol
  • Megaco media gateway controller
  • PDA personal digital assistant
  • CDMA code division multiple access
  • WCDMA wideband CDMA
  • GSM global system for mobile communication
  • MBS mobile broadband system
  • variable encoding system 600 further includes a user database 604 and a database management unit 605.
  • the user database 604 maintains user information about at least one user.
  • the user information includes identification information of the second user terminal corresponding to the user.
  • the database management unit 605 reads and extracts the user information corresponding to the second user terminal by referring to the user database 604, controls the transmitter 603, and wirelessly transmits the second audio data to the second user terminal, based on the identification information corresponding to the user information.
  • the transmitter 603 parses the user database 604, and reads and extracts predetermined user information, in order to wirelessly transmit the second audio data to the second user terminal.
  • the user information includes the identification information, such as telephone number information of the second user terminal and the like, and the transmitter 603 transmits the second audio data to the second user terminal based on the identification information such as the telephone number information and the like.
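The relationship between the components of FIG. 6 can be sketched as a small composition. This is an illustrative outline, not the patent's implementation: the class name, the callables standing in for the receiver 601, converter 602, and transmitter 603, and the dictionary used as the user database 604 are all assumptions, and the telephone number in the usage note is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VariableEncodingSystem:
    fetch: Callable[[], bytes]          # stands in for the receiver 601
    convert: Callable[[bytes], bytes]   # stands in for the converter 602
    send: Callable[[str, bytes], None]  # stands in for the transmitter 603
    # Stands in for the user database 604: user name -> terminal identification
    # information (for example, a telephone number).
    user_db: Dict[str, str] = field(default_factory=dict)

    def deliver(self, user: str) -> None:
        """Look up the user's terminal, then receive, convert, and transmit."""
        # Lookup step corresponding to the database management unit 605.
        terminal_id = self.user_db[user]
        second_audio = self.convert(self.fetch())
        self.send(terminal_id, second_audio)
```

A usage example with stub callables: constructing the system with `fetch=lambda: b"audio"`, a `convert` function that re-encodes the bytes, a `send` function that records its arguments, and `user_db={"alice": "010-0000-0000"}`, then calling `deliver("alice")`, sends the converted data to that placeholder number.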
  • FIG. 7 is an internal block diagram of a general-purpose computer apparatus which can be adopted in implementing a variable encoding method according to the present invention.
  • a computer apparatus 700 includes at least one processor 710 connected to a main memory device including a RAM (Random Access Memory) 720 and a ROM (Read Only Memory) 730.
  • the processor 710 is also known as a central processing unit (CPU).
  • the ROM 730 unidirectionally transmits data and instructions to the CPU, and the RAM 720 is generally used for bidirectionally transmitting data and instructions.
  • the RAM 720 and the ROM 730 may include any suitable form of computer-readable recording medium.
  • a mass storage device 740 is bidirectionally connected to the processor 710 to provide additional data storage capacity and may be one of a number of computer-readable recording mediums.
  • the mass storage device 740 is used for storing programs, data, and the like, and is an auxiliary memory device such as a hard disc that is generally slower than the main memory device.
  • a particular mass storage device such as a CD ROM 760 may be used.
  • the processor 710 is connected to at least one input/output interface 750 such as a video monitor, a track ball, a mouse, a keyboard, a microphone, a touch-screen type display, a card reader, a magnetic or paper tape reader, a voice or handwriting recognizer, a joystick, or other known computer input/output unit.
  • the processor 710 may be connected to a wired or wireless communication network via a network interface 770. The procedure of the described method can be performed via the network connection.
  • the described devices and tools are well-known to those skilled in the art of computer hardware and software.
  • the described hardware devices may be configured to act as one or more software modules in order to perform the operations of the present invention.
  • An aspect of the present invention provides a method and system for increasing usage efficiency of a memory device of a mobile terminal recording audio data.
  • An aspect of the present invention also provides a method and system for reducing a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of the audio data, and transmitting the audio data to a second user terminal via the wireless communication network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

A variable encoding system and method for operating the system, the method including: receiving audio data from a predetermined server, encoding the audio data via a predetermined encoder and providing a user terminal with the audio data. The variable encoding system and method for operating the system can increase usage efficiency of a memory device of a mobile terminal recording audio data and reduce a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of data, and transmitting the audio data to a second user terminal via the wireless communication network.

Description

OPTIONAL ENCODING SYSTEM AND METHOD FOR OPERATING THE SYSTEM
Technical Field
The present invention relates to a method and system for receiving audio data from a predetermined server, encoding the audio data via a predetermined encoder, and providing a user terminal with the audio data. In this instance, the encoder may be variably set based on a characteristic of the audio data, and when the audio data includes voice data at more than a predetermined ratio, the encoder includes Qualcomm code excited linear prediction (QCELP), enhanced variable rate codec (EVRC), adaptive multi-rate (AMR), and the like.
Background Art
As the Internet has developed, mobile terminals that store audio contents and replay them on demand have come into wide use. For example, to download audio contents to a mobile terminal and use them, as in a podcasting service, the audio contents are initially required to be downloaded to a computer terminal. The audio contents downloaded to the computer terminal are transmitted to the mobile terminal, such as a Moving Picture Experts Group Audio Layer 3 (MP3) player, a mobile phone, and the like, in a form encoded with an audio compression technology such as the MP3 method, an advanced audio coding (AAC) method, and the like. Thus, the mobile terminal may replay the compressed audio contents by decoding them. Also, the computer terminal may download audio contents, such as news broadcasting and the like, from a server providing the audio data in a predetermined cycle, encode the audio contents, and provide the mobile terminal with the audio contents.
In this instance, the mobile terminal further includes a memory device in which the audio contents may be recorded. However, mobile terminals currently in wide use generally have a memory capacity of tens or hundreds of megabytes (MB). The memory capacity may be insufficient for recording audio contents that are encoded at a high bit rate. Thus, for actual use, a technology that maximally compresses or encodes the audio contents recorded in the memory device is required.
Specifically, the audio data received from a predetermined server is already encoded in a particular method when it is received. Thus, a technology is required that can increase memory efficiency of the mobile terminal and reduce a load of a transmission channel by transcoding the audio data that is encoded in the particular method, based on characteristics of the audio data, before transmitting the audio data to the mobile terminal.
For example, audio data such as music and the like is generally compressed at a bit rate greater than 128 Kbps, and voice-centered contents, where sound quality is generally not a concern, require a bit rate of at least 32 Kbps. However, a vocoder that is optimized for human voice, for example, an enhanced variable rate codec (EVRC), may compress voice-centered contents at a low bit rate of 8 Kbps.
However, according to the conventional art, a sound source in a rich site summary (RSS) feed or a podcast is generally provided as a whole in an encoding method such as the MP3 method, the AAC method, and the like, regardless of whether it is voice data or music data. Accordingly, the mobile terminal often stores voice data that is compressed at a higher bit rate than necessary. Thus, there is a problem that the memory of the mobile terminal is inefficiently used.
Disclosure of Invention
Technical Goals
An aspect of the present invention provides a method and system for increasing usage efficiency of a memory device of a mobile terminal.
An aspect of the present invention also provides a method and system for reducing a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of data, and transmitting the audio data to a second user terminal via the wireless communication network.
Technical Solutions
According to an aspect of the present invention, there is provided a method of variably encoding audio data including: receiving the audio data from a predetermined server; determining whether voice data is contained in the audio data by analyzing a data format of the audio data; generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
According to another aspect of the present invention, there is provided a system for variably encoding audio data including: a receiver receiving the audio data from a predetermined server; a converter determining whether voice data is contained in the audio data by analyzing a data format of the audio data, and generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and a transmitter transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on conversion information.
Brief Description of Drawings
FIG. 1 is a diagram illustrating a network including a variable encoding system, a server and a second user terminal according to the present invention;
FIG. 2 is a flowchart illustrating an operation based on a method of variably encoding audio data according to the present invention;
FIG. 3 and FIG. 4 are diagrams illustrating examples of networks including a variable encoding system, a server and a second user terminal according to the present invention;
FIG. 5 is a diagram illustrating data formats of audio data and second audio data according to an exemplary embodiment of the present invention;
FIG. 6 is a block diagram illustrating an internal configuration of a variable encoding system according to an exemplary embodiment of the present invention; and
FIG. 7 is an internal block diagram of a general-purpose computer apparatus which can be adopted in implementing a variable encoding method according to the present invention.
Best Mode for Carrying Out the Invention
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
FIG. 1 is a diagram illustrating a network including a variable encoding system, a web server, and a second user terminal according to the present invention.
Referring to FIG. 1, a variable encoding system 100 according to the present invention receives predetermined audio data from a web server 110. The web server 110 according to an embodiment of the present invention provides a podcasting service or a rich site summary (RSS) service. Accordingly, the variable encoding system 100 receives the audio data from the web server 110 in a predetermined cycle. Also, the audio data may include music data, voice data, or broadcasting data. The variable encoding system 100 that receives the audio data analyzes the audio data and identifies whether voice data is contained in the audio data. Conventional techniques may be used to identify whether the voice data is contained in the audio data. For example, in order to identify whether the audio data is mainly made up of a human voice, a method of determining whether the sound is cut off at a ratio greater than a predetermined ratio can be used. Also, whether the voice data is contained in the audio data may be determined by checking whether a predetermined pitch is detected from the audio data, or by identifying the frequency of the audio data and checking whether the frequency is concentrated in a predetermined band. Also, a current mobile communication terminal controls a transmission band in real-time via a function such as a voice activity detector (VAD), discontinuous transmission (DTX), a variable rate codec (VRC), and the like. Unlike the mobile communication terminal, which identifies in real-time whether the voice data is contained in the audio data, the variable encoding system 100 according to the present invention has more time available for analyzing the audio data and may therefore determine in comparatively greater detail whether the voice data is contained in the audio data.
The variable encoding system 100 that receives the audio data from the server 110 determines whether the voice data is contained in the audio data, and encodes the voice data via a predetermined vocoder when the voice data is contained in the audio data. The variable encoding system 100 according to an embodiment of the present invention may use a vocoder such as Qualcomm code excited linear prediction (QCELP), enhanced variable rate codec (EVRC), adaptive multi-rate (AMR), and the like. The second audio data is generated from the audio data by encoding via the vocoder. The second audio data may be encoded at a bit rate of about 8 Kbps when the EVRC is used for the audio data including the voice data. Also, when the audio data contains music data or song data but no voice data, the variable encoding system 100 does not encode the audio data again.
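As an illustration of the cut-off ratio check mentioned above, the following is a minimal sketch only, not the method of the invention. It assumes the audio data has already been decoded into floating-point samples in the range -1.0 to 1.0, and the function names, the frame length, and both thresholds are hypothetical values chosen for the example.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS energy of each full, non-overlapping frame."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
            for i in starts]

def pause_ratio(samples, frame_len=160, silence_rms=0.02):
    """Fraction of frames in which the sound is cut off (energy below silence_rms)."""
    energies = frame_rms(samples, frame_len)
    if not energies:
        return 0.0
    return sum(1 for e in energies if e < silence_rms) / len(energies)

def looks_like_voice(samples, min_pause_ratio=0.2):
    """Hypothetical decision rule: frequent pauses suggest speech rather than music."""
    return pause_ratio(samples) >= min_pause_ratio

# Synthetic test signals at an assumed 8 kHz sampling rate.
def tone_frame(start):
    return [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(start, start + 160)]

continuous = [s for k in range(50) for s in tone_frame(k * 160)]        # no pauses
bursty = [s for k in range(50)
          for s in (tone_frame(k * 160) if k % 2 == 0 else [0.0] * 160)]  # every other frame silent
```

A practical detector would combine this with the pitch and frequency band checks described above; the sketch covers the pause criterion only.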
A second user terminal 120 receives the second audio data from the variable encoding system 100.
The variable encoding system 100 according to an exemplary embodiment of the present invention is a computer terminal that receives the audio data from a service where audio contents are provided in the podcasting service or a similar method. Accordingly, the variable encoding system 100 receives the audio data from the server 110 via a wired/wireless Internet communication network. Also, the variable encoding system 100 variably encodes the audio data into the second audio data, or transmits the audio data as is to the second user terminal 120. In this instance, the second user terminal 120 is a mobile terminal such as a mobile communication terminal, a Moving Picture Experts Group Audio Layer 3 (MP3) player, a PlayStation Portable (PSP), a portable multimedia player (PMP), a personal digital assistant (PDA), an electronic notebook, and the like, and the computer terminal transmits the second audio data by connecting with the second user terminal 120.
The variable encoding system 100 according to another exemplary embodiment of the present invention is a predetermined independent server. Accordingly, the variable encoding system 100 receives the audio data from the server 110 via the wired/wireless communication network, and variably generates the second audio data from the audio data or transmits the audio data as is to the second user terminal 120. In this instance, the second user terminal 120 is the mobile communication terminal, and the variable encoding system 100 wirelessly transmits the second audio data to the mobile communication terminal via a data channel.
Thus, the variable encoding system 100 according to the present invention may have effects such as an increase in memory efficiency of the second user terminal 120, a reduction in the load of a transmission channel, and the like.
Specifically, when the voice data is partially or fully contained in the audio data, the variable encoding system 100 according to the present invention may reduce the total volume of the audio data by encoding only the voice data again at a lower bit rate.
FIG. 2 is a flowchart illustrating an operation based on a method of variably encoding audio data according to the present invention.
In operation 201, a server transmits predetermined audio data to the variable encoding system according to an embodiment of the present invention. The server is a system that provides a podcasting service or an RSS service. Accordingly, the variable encoding system checks the server in a predetermined cycle to identify an updated audio data list, and requests transmission of the audio data when there is updated audio data. In operation 202, the variable encoding system receives the audio data from the server and analyzes a data format. The audio data includes data such as broadcasting, music, a song, a voice, and the like. Accordingly, the audio data has a particular nature based on the data format, and a characteristic of the audio data may be determined from the particular nature by analyzing a frequency band, pitch detection, whether the sound is cut off, and the like. The characteristic of the audio data is determined by using the conventional art as is.
In operation 203, the variable encoding system determines whether the voice data is contained in the audio data based on an analysis result of the data format. The variable encoding system determines whether the voice data is contained in the audio data by analyzing the frequency band, pitch detection, whether the sound is cut off, and the like. The variable encoding system according to an embodiment of the present invention separates one piece of audio data into predetermined portions and identifies each portion of the audio data which contains the voice data. Here, each portion is given an index, and whether the portion at each index contains the voice data is recorded in a predetermined memory device. Also, in operation 203, when the voice data is not contained in the audio data as a result of analyzing the data format, the variable encoding system branches to operation 206 and transmits the audio data as is to the second user terminal. When the voice data is partially or fully contained in the audio data, the variable encoding system encodes only a portion corresponding to the voice data among the audio data via a predetermined vocoder in operation 204 and generates the second audio data in operation 205. The variable encoding system according to an exemplary embodiment of the present invention encodes a predetermined portion corresponding to the voice data among the audio data via the vocoder. For example, to encode audio data in which a middle portion corresponds to the voice data, the variable encoding system encodes only the middle portion via the vocoder, and generates the second audio data by inserting identification information, such as a predetermined flag, index information, and the like, at the beginning of the middle portion, or by recombining the portions together with conversion information such as vocoder information and the like.
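The periodic check for an updated audio data list in operation 201 can be sketched as follows. This is an assumption-laden illustration: the entry fields `id` and `url`, the function names, and the callables `fetch_feed` and `fetch_audio` are hypothetical stand-ins for whatever the podcasting or RSS service actually exposes.

```python
def find_updated_items(known_ids, feed_entries):
    """Return the entries of the current list that have not been received yet."""
    return [entry for entry in feed_entries if entry["id"] not in known_ids]

def poll_once(known_ids, fetch_feed, fetch_audio):
    """One polling cycle: read the list, then request each updated item."""
    downloaded = []
    for entry in find_updated_items(known_ids, fetch_feed()):
        downloaded.append((entry["id"], fetch_audio(entry["url"])))
        known_ids.add(entry["id"])  # remember it so later cycles skip it
    return downloaded

# In-memory stand-ins for the server (hypothetical data).
feed = [{"id": "ep1", "url": "http://example.invalid/ep1.mp3"},
        {"id": "ep2", "url": "http://example.invalid/ep2.mp3"}]
seen = {"ep1"}
first = poll_once(seen, lambda: feed, lambda url: b"audio:" + url.encode())
second = poll_once(seen, lambda: feed, lambda url: b"")
```

In a running system, `poll_once` would be called on a timer that matches the predetermined cycle.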
Specifically, when the voice data is partially contained in the audio data and the music data is partially contained in the audio data, the second audio data has a different bit rate classified by each partial interval. For example, the audio data may be encoded in the portion corresponding to the voice data at an 8 Kbps bit rate and be encoded in the portion corresponding to the music data at a 128 Kbps bit rate.
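To make the saving concrete, the short calculation below compares a uniformly encoded program with one encoded per interval. The program layout (10 minutes of music, 20 minutes of voice) is an assumed example, 1 Kbps is taken as 1000 bits per second, and constant bit rates are assumed throughout.

```python
def encoded_size_bytes(duration_s, bitrate_kbps):
    """Size of a constant-bit-rate stream, taking 1 Kbps as 1000 bits per second."""
    return duration_s * bitrate_kbps * 1000 // 8

MUSIC_KBPS, VOICE_KBPS = 128, 8

# Assumed program: (kind, duration in seconds).
program = [("music", 600), ("voice", 1200)]

# Everything kept at the music bit rate.
uniform = sum(encoded_size_bytes(d, MUSIC_KBPS) for _, d in program)

# Voice intervals re-encoded at the vocoder bit rate, music left as is.
mixed = sum(encoded_size_bytes(d, VOICE_KBPS if kind == "voice" else MUSIC_KBPS)
            for kind, d in program)

saving = 1 - mixed / uniform
```

Under these assumptions the program shrinks from 28,800,000 bytes to 10,800,000 bytes, a reduction of 62.5 percent.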
The variable encoding system according to an exemplary embodiment of the present invention may encode the total audio data at a bit rate corresponding to the voice data, when the voice data is contained in the audio data at a ratio corresponding to more than a predetermined ratio. In this instance, the predetermined ratio may be set by a developer or an operator of the variable encoding system.
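The choice between transmitting as is, encoding only the voice portions, and encoding the total audio data can be sketched as a simple rule. The mode names and the default threshold of 0.8 are hypothetical; the text leaves the predetermined ratio to the developer or operator.

```python
def choose_mode(segments, voice_ratio_threshold=0.8):
    """Pick an encoding mode from (kind, duration_s) segments.

    Returns "passthrough" when no voice is present, "vocoder_all" when the
    voice share exceeds the threshold, and "vocoder_voice_only" otherwise.
    """
    total = sum(duration for _, duration in segments)
    voice = sum(duration for kind, duration in segments if kind == "voice")
    if voice == 0:
        return "passthrough"
    if voice / total > voice_ratio_threshold:
        return "vocoder_all"
    return "vocoder_voice_only"

mostly_voice = choose_mode([("music", 60), ("voice", 540)])   # voice share 0.9
half_voice = choose_mode([("music", 300), ("voice", 300)])    # voice share 0.5
music_only = choose_mode([("music", 300)])
```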
In operation 206, the variable encoding system transmits the generated second audio data to the second user terminal. The variable encoding system according to an exemplary embodiment of the present invention may be embodied in a user's computer terminal and the second user terminal may be a mobile terminal such as a mobile phone, a PDA, an electronic notebook, a PMP, a PSP, an MP3 player, and the like. The exemplary embodiment of the present invention is described in detail with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of a network including a variable encoding system, a server and a second user terminal according to the present invention.
Referring to FIG. 3, the variable encoding system 300 may be embodied on a computer terminal 310. Specifically, the variable encoding system 300 is a predetermined application program or hardware located in the computer terminal 310. A server 301 transmits the audio data to the computer terminal 310 via a network 302 in a predetermined cycle based on the podcasting service or the RSS service. The network 302 may be considered as a wired/wireless network to provide the computer terminal 310 with Internet communication capacity. The computer terminal 310 that receives the audio data via the network 302 determines whether the voice data is contained in the audio data in the variable encoding system 300. When the voice data is contained in the audio data, the variable encoding system 300 generates the second audio data after encoding the audio data via the vocoder. When the second user terminal connects with the computer terminal 310, the computer terminal 310 transmits the second audio data that the variable encoding system 300 generates to the second user terminal. The second user terminal is a mobile terminal, such as an MP3 player 304, a mobile communication terminal 305, a PlayStation 306, and the like, having a predetermined memory device.
The second user terminal connects with the variable encoding system 300 via a short-distance communication module such as a universal serial bus (USB) module, a recommended standard-232C (RS-232C) module, a Bluetooth module, and the like, and the variable encoding system 300 transmits the second audio data to the second user terminal by identifying a connection of the second user terminal.
The variable encoding system according to another exemplary embodiment of the present invention is a predetermined independent server and the second user terminal is the mobile communication terminal. This exemplary embodiment of the present invention is described in detail with reference to FIG. 4. FIG. 4 is a diagram illustrating an example of a network including a variable encoding system, a server and a second user terminal according to an embodiment of the present invention.
Referring to FIG. 4, the variable encoding system 400 receives predetermined audio data from a server 401 via a network 402. In this instance, the network 402 may be interpreted broadly as including all wired/wireless communication networks.
Similar to the exemplary embodiment of FIG. 3, the variable encoding system 400 that receives the audio data identifies whether the voice data is contained in the audio data, and generates the second audio data after encoding the audio data via the predetermined vocoder when the voice data is contained in the audio data. Also, the generated second audio data is transmitted to the second user terminal via the network 403. The second user terminal is a mobile communication terminal 404 and the network 403 includes a wireless communication network including a predetermined communication provider system.
Specifically, the variable encoding system 400 requests the communication provider system to establish a channel with a mobile communication terminal 404. Thus, the communication provider system sets a wireless channel between the variable encoding system 400 and the mobile communication terminal 404, and the variable encoding system 400 wirelessly transmits the second audio data to the mobile communication terminal 404 via the wireless channel. Also, the mobile communication terminal 404 according to an exemplary embodiment of the present invention queries the variable encoding system 400 in a predetermined cycle as to whether there is second audio data to be transmitted, and requests the variable encoding system 400 to transmit the second audio data when there is the second audio data. Finally, the variable encoding system 400 according to the present invention may reduce memory usage of the mobile communication terminal 404 by efficiently reducing a volume of the audio data, and reduce the load of the transmission channel based on the mobile communication network.
Referring to FIG. 2 again, in operation 207, the second user terminal decodes the second audio data based on the conversion information and provides the user with the second audio data via a predetermined speaker device.
The variable encoding system according to an exemplary embodiment of the present invention maintains a user database recording user information about at least one user. The user information includes identification information of the second user terminal corresponding to the user, and telephone number information may be used as an example of the identification information. Specifically, the variable encoding system reads and extracts the user information corresponding to the second user terminal by referring to the user database to transmit the generated second audio data to the second user terminal, and wirelessly transmits the second audio data to the second user terminal based on the identification information corresponding to the user information. In this instance, the second user terminal is the mobile communication terminal such as the mobile phone.
FIG. 5 is a diagram illustrating data formats of audio data and second audio data according to an exemplary embodiment of the present invention. Referring to reference numeral 501 in FIG. 5, audio data according to an exemplary embodiment of the present invention is 'A.MP3'. The 'A.MP3' includes a plurality of playlists, and the variable encoding system identifies whether the voice data is contained in the audio data by analyzing each playlist. For example, 'A.MP3' is radio broadcasting and may include narration data of an announcer and music data. As a result of analyzing the playlist, the variable encoding system determines that 'A1' and 'A3' are the music data, and 'A2' and 'A4' are the narration data of the announcer. Also, the variable encoding system encodes 'A1' and 'A3', which are determined as the music data, and encodes 'A2' and 'A4' by using the predetermined vocoder. Specifically, the variable encoding system analyzes each audio data classified by each playlist and, as a result of analysis, implements a heterogeneous encoding on each playlist. In this instance, the second user terminal is required to have a function to replay each list based on the playlist. As in reference numeral 501, analyzing each playlist may prevent a problem in which audio data that largely contains the voice data is determined as the music data or the song data merely because the music data or the song data is at the beginning of the audio data.
Referring to reference numeral 502 in FIG. 5, the variable encoding system deletes the playlist of reference numeral 501, inserts conversion information related to encoding in each playlist item, and recombines the playlist items into one audio data. In the case of reference numeral 502, predetermined software that may decode the audio data encoded via a plurality of encoders is required. Since the software is a well-known and commonly used technology, a detailed description is omitted.
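One way to picture the recombined audio data of reference numeral 502 is a byte stream in which each portion is preceded by a small header carrying its conversion information. The layout below (a one-byte codec identifier followed by a four-byte payload length) and the codec table are purely hypothetical; the text does not specify the actual format.

```python
import struct

# Hypothetical codec identifiers used as conversion information.
CODEC_IDS = {"mp3": 0, "evrc": 1}
CODEC_NAMES = {value: key for key, value in CODEC_IDS.items()}

# Per-portion header: codec id (1 byte), payload length (4 bytes), big-endian.
HEADER = struct.Struct(">BI")

def pack_segments(segments):
    """Recombine (codec, payload) portions into one byte stream."""
    out = bytearray()
    for codec, payload in segments:
        out += HEADER.pack(CODEC_IDS[codec], len(payload))
        out += payload
    return bytes(out)

def unpack_segments(blob):
    """Split the stream back into (codec, payload) portions for decoding."""
    segments, offset = [], 0
    while offset < len(blob):
        codec_id, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        segments.append((CODEC_NAMES[codec_id], blob[offset:offset + length]))
        offset += length
    return segments

# 'A1' and 'A3' stand for music portions, 'A2' and 'A4' for narration portions.
portions = [("mp3", b"A1-music"), ("evrc", b"A2-voice"),
            ("mp3", b"A3-music"), ("evrc", b"A4-voice")]
blob = pack_segments(portions)
restored = unpack_segments(blob)
```

A player on the second user terminal would read each header and hand the payload to the matching decoder, which is the role of the multi-encoder software mentioned above.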
FIG. 6 is a block diagram illustrating an internal configuration of a variable encoding system according to an exemplary embodiment of the present invention.
Referring to FIG. 6, the variable encoding system 600 according to the present invention includes a receiver 601, a converter 602, and a transmitter 603.
The receiver 601 receives audio data from a predetermined server. The server is a general server that provides audio data such as voice, music, a song, broadcasting, and the like. Also, the audio data may be either encoded data or unprocessed data.
The converter 602 determines whether voice data is contained in the audio data by analyzing a data format of the audio data that is received from the receiver 601, and generates second audio data by encoding the audio data via a predetermined vocoder when the voice data is contained in the audio data. The converter 602 according to an exemplary embodiment of the present invention divides the received audio data into a plurality of data based on a predetermined playlist and determines whether each of the plurality of data is voice data. Accordingly, discriminative encoding is separately implemented on each of the plurality of data, and the plurality of data is generated into the second audio data. In this instance, the second audio data includes conversion information about the vocoder and the encoding.
The converter 602 according to an exemplary embodiment of the present invention generates the audio data into the second audio data via a particular encoder by a command of a user. The user may set the audio data to be encoded into the second audio data via the particular encoder, based on the user's taste or on encoding errors. For example, the user may set music data or song data to be encoded via the vocoder, according to memory capacity of the second user terminal.
The transmitter 603 transmits the generated second audio data to the second user terminal. The variable encoding system 600 according to an exemplary embodiment of the present invention is included in a predetermined computer terminal in the form of an application program or hardware. Specifically, the receiver 601 receives the audio data from a predetermined server via an Internet communication network in a wired/wireless form, and the converter 602 determines whether the voice data is contained in the audio data and generates the second audio data by encoding the audio data via the vocoder when the voice data is contained in the audio data. Thus, when the second user terminal is connected via a short-distance communication module, such as a USB module, an RS-232C module, an ultra wideband (UWB) module, a Bluetooth module, a wireless local area network (LAN), and the like, the transmitter 603 transmits the second audio data to the second user terminal.
The variable encoding system 600 according to another exemplary embodiment of the present invention is a predetermined independent server. Thus, the receiver 601 receives the audio data from the server via a wired/wireless communication network, and the converter 602 generates the second audio data according to whether the voice data is contained in the audio data. Thus, the transmitter 603 wirelessly transmits the second audio data to the second user terminal. The second user terminal includes a mobile communication terminal, a public switched telephone network (PSTN) terminal, voice over Internet protocol (VoIP), session initiation protocol (SIP), a media gateway controller (Megaco), a personal digital assistant (PDA), a cellular phone, a personal communication service (PCS) phone, a hand-held personal computer (PC), a code division multiple access (CDMA)-2000 (1X, 3X) phone, a wideband CDMA (WCDMA) phone, a dual band/dual mode phone, a global system for mobile communication (GSM) phone, a mobile broadband system (MBS) phone, a satellite/terrestrial digital multimedia broadcasting (DMB) phone, and the like, as a predetermined communication terminal.
The variable encoding system 600 according to an exemplary embodiment of the present invention further includes a user database 604 and a database management unit 605.
The user database 604 maintains user information about at least one user. The user information includes identification information of the second user terminal corresponding to the user. Also, the database management unit 605 reads and extracts the user information corresponding to the second user terminal by referring to the user database 604, controls the transmitter 603, and wirelessly transmits the second audio data to the second user terminal, based on the identification information corresponding to the user information.
For example, to wirelessly transmit the second audio data to the second user terminal, the transmitter 603 parses the user database 604 and reads and extracts the predetermined user information. The user information includes the identification information, such as telephone number information of the second user terminal and the like, and the transmitter 603 transmits the second audio data to the second user terminal based on the identification information such as the telephone number information and the like.
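The lookup and transmission flow described for the user database 604 and the transmitter 603 can be sketched as below. The class and function names, the record fields, and the `send` callable are hypothetical; in particular, `send` stands in for the wireless delivery path, which the text does not detail, and the telephone number is a placeholder.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserInfo:
    user_id: str
    phone_number: str  # identification information of the second user terminal

class UserDatabase:
    """In-memory stand-in for the user database 604."""
    def __init__(self):
        self._users = {}

    def add(self, info):
        self._users[info.user_id] = info

    def lookup(self, user_id):
        return self._users.get(user_id)

def transmit_to_user(db, user_id, second_audio, send):
    """Read the user information, then send the data to the identified terminal."""
    info = db.lookup(user_id)
    if info is None:
        return False  # no such user, nothing is transmitted
    send(info.phone_number, second_audio)
    return True

db = UserDatabase()
db.add(UserInfo("alice", "+82-10-0000-0000"))
sent = []
record = lambda number, data: sent.append((number, data))
ok = transmit_to_user(db, "alice", b"\x01\x02", record)
missing = transmit_to_user(db, "bob", b"\x01\x02", record)
```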
FIG. 7 is an internal block diagram of a general-purpose computer apparatus which can be adopted in implementing a variable encoding method according to the present invention.
A computer apparatus 700 includes at least one processor 710 connected to a main memory device including a RAM (Random Access Memory) 720 and a ROM (Read Only Memory) 730. The processor 710 is also known as a central processing unit (CPU). As is well known in the art, the ROM 730 unidirectionally transmits data and instructions to the CPU, and the RAM 720 is generally used for bidirectionally transmitting data and instructions. The RAM 720 and the ROM 730 may include any suitable form of a computer-readable recording medium. A mass storage device 740 is bidirectionally connected to the processor 710 to provide additional data storage capacity and may be one of a number of computer-readable recording media. The mass storage device 740 is used for storing programs, data, and the like, and is an auxiliary memory device, such as a hard disc, that is generally slower than the main memory device. A particular mass storage device such as a CD ROM 760 may be used. The processor 710 is connected to at least one input/output interface 750 such as a video monitor, a track ball, a mouse, a keyboard, a microphone, a touch-screen type display, a card reader, a magnetic or paper tape reader, a voice or handwriting recognizer, a joystick, or other known computer input/output unit. The processor 710 may be connected to a wired or wireless communication network via a network interface 770. The procedure of the described method can be performed via the network connection. The described devices and tools are well-known to those skilled in the art of computer hardware and software.
The described hardware devices may be configured to act as one or more software modules in order to perform the operations of the present invention.
Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Industrial Applicability
An aspect of the present invention provides a method and system for increasing usage efficiency of a memory device of a mobile terminal recording audio data. An aspect of the present invention also provides a method and system for reducing a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of the audio data, and transmitting the audio data to a second user terminal via the wireless communication network.
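As a rough illustration of the selective encoding summarized above, the sketch below encodes only the segments marked as voice and tags every segment with conversion information so that the receiving terminal knows how to decode it. Several things here are assumptions rather than part of the specification: the segments are taken to be already labeled as voice or non-voice, non-voice segments are passed through unchanged, and `zlib` is used purely as a reversible placeholder where a real vocoder such as QCELP, EVRC, or AMR would sit. All names (`Segment`, `EncodedSegment`, `CODECS`, `encode_selectively`, `decode`) are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Segment:
    payload: bytes
    is_voice: bool  # assumed to be known already from the format analysis


@dataclass(frozen=True)
class EncodedSegment:
    payload: bytes
    codec: str  # conversion information: which codec produced the payload


# (encode, decode) pairs keyed by codec name. zlib is NOT a vocoder; it is a
# lossless placeholder so the sketch runs without a speech codec library.
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]
CODECS: Dict[str, Codec] = {
    "passthrough": (lambda b: b, lambda b: b),
    "placeholder-vocoder": (zlib.compress, zlib.decompress),
}


def encode_selectively(
    segments: List[Segment], vocoder: str = "placeholder-vocoder"
) -> List[EncodedSegment]:
    """Encode only the voice segments and record the codec used for each."""
    encode_voice = CODECS[vocoder][0]
    encoded: List[EncodedSegment] = []
    for seg in segments:
        if seg.is_voice:
            encoded.append(EncodedSegment(encode_voice(seg.payload), vocoder))
        else:
            # Assumption: non-voice data is forwarded unchanged.
            encoded.append(EncodedSegment(seg.payload, "passthrough"))
    return encoded


def decode(encoded: List[EncodedSegment]) -> List[bytes]:
    """Receiver side: choose the decoder from the conversion information."""
    return [CODECS[e.codec][1](e.payload) for e in encoded]
```

Because each segment carries its own codec tag, the same structure would also accommodate several different vocoders within one stream.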

Claims

1. A method of selectively encoding audio data, the method comprising: receiving the audio data from a predetermined server; determining whether voice data is contained in the audio data by analyzing a data format of the audio data; generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
2. The method of claim 1, wherein a selective encoding system comprises a computer terminal, and when the second user terminal connects with the computer terminal, the computer terminal transmits the second audio data to the second user terminal.
3. The method of claim 1, further comprising: maintaining a user database which records user information about at least one user, the user information comprising identification information of the second user terminal corresponding to the user, wherein the transmitting comprises: reading and extracting user information corresponding to the second user terminal by referring to the user database; and wirelessly transmitting the second audio data to the second user terminal, based on identification information corresponding to the user information.
4. The method of claim 1, wherein the vocoder comprises at least one of Qualcomm code excited linear prediction (QCELP), enhanced variable rate codec (EVRC), and adaptive multi-rate (AMR).
5. The method of claim 1, wherein the audio data is received from the server in a rich site summary (RSS) method.
6. The method of claim 1, wherein the second audio data is generated by dividing and encoding the audio data into a plurality of audio data via a heterogeneous vocoder.
7. A computer-readable recording medium storing a program for implementing the method according to any one of claims 1 through 6.
8. A system for selectively encoding audio data, the system comprising: a receiver receiving the audio data from a predetermined server; a converter determining whether voice data is contained in the audio data by analyzing a data format of the audio data, and generating second audio data by encoding only a portion corresponding to the voice data among the audio data via a predetermined vocoder when the voice data is contained in the audio data, the second audio data comprising conversion information about the vocoder and the encoding; and a transmitter transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
9. The system of claim 8, further comprising: a user database recording user information about at least one user, the user information comprising identification information of the second user terminal corresponding to the user; and a database management unit reading and extracting the user information corresponding to the second user terminal by referring to the user database, and controlling the transmitter to wirelessly transmit the second audio data to the second user terminal, based on the identification information corresponding to the user information.
PCT/KR2006/003903 2005-09-30 2006-09-28 Optional encoding system and method for operating the system WO2007037641A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2006800359075A CN101273405B (en) 2005-09-30 2006-09-28 Optional encoding system and method for operating the system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2005-0091846 2005-09-30
KR1020050091846A KR100757858B1 (en) 2005-09-30 2005-09-30 Selective Encoding System and Operation Method of the Selective Encoding System

Publications (1)

Publication Number Publication Date
WO2007037641A1 (en) 2007-04-05

Family

ID=37900009

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/003903 WO2007037641A1 (en) 2005-09-30 2006-09-28 Optional encoding system and method for operating the system

Country Status (3)

Country Link
KR (1) KR100757858B1 (en)
CN (1) CN101273405B (en)
WO (1) WO2007037641A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967735B (en) * 2021-02-23 2024-09-20 北京达佳互联信息技术有限公司 Training method of voice quality detection model and voice quality detection method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742734A (en) * 1994-08-10 1998-04-21 Qualcomm Incorporated Encoding rate selection in a variable rate vocoder
US5940796A (en) * 1991-11-12 1999-08-17 Fujitsu Limited Speech synthesis client/server system employing client determined destination control
US6490553B2 (en) * 2000-05-22 2002-12-03 Compaq Information Technologies Group, L.P. Apparatus and method for controlling rate of playback of audio data
KR20040064064A (en) * 2003-01-09 2004-07-16 와이더덴닷컴 주식회사 Preprocessing of Digital Audio data for Improving Perceptual Sound Quality on a Mobile Phone

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030057494A (en) * 2003-01-16 2003-07-04 (주)유토포스 The advanced digital audio contents service system and its implementation method for mobile wireless device on wireless and wired internet communication network
KR100597964B1 (en) * 2003-01-16 2006-08-21 (주)유토포스 The advanced digital audio contents service system and its implementation method for mobile wireless device on wireless and wired internet communication network
KR20060027246A (en) * 2004-09-22 2006-03-27 (주)믹스크리에이티브 Audio streaming service method for mobile phone using wireless communication network and service system for implementation thereof

Also Published As

Publication number Publication date
CN101273405A (en) 2008-09-24
KR100757858B1 (en) 2007-09-11
CN101273405B (en) 2011-12-21
KR20070036870A (en) 2007-04-04

Similar Documents

Publication Publication Date Title
JP4724452B2 (en) Digital media general-purpose basic stream
RU2434333C2 (en) Apparatus and method of transmitting sequence of data packets and decoder and apparatus for recognising sequence of data packets
US20070112571A1 (en) Speech recognition at a mobile terminal
US20070280209A1 (en) Combining selected audio data with a voip stream for communication over a network
KR20030074161A (en) Data stream-distribution system and method therefor
CN102045553A (en) Multimedia transcoding device and method and multimedia player
CN111078930A (en) Audio file data processing method and device
CN103181184A (en) Media file caching for an electronic device to conserve resources
US20040024592A1 (en) Audio data processing apparatus and audio data distributing apparatus
KR101280224B1 (en) System and Method for providing contents through network of impossible apparatus to connect network
US20100104267A1 (en) System and method for playing media file
CN118210470B (en) Audio playing method and device, electronic equipment and storage medium
CN102047338B (en) Optimizing seek functionality in media content
US20040192358A1 (en) Method of instantly receiving and playing back audio data from wireless network by wireless terminal
WO2007037641A1 (en) Optional encoding system and method for operating the system
KR101055714B1 (en) Method for playing audio files on portable electronic devices
US20070294723A1 (en) Method and system for dynamically inserting media into a podcast
CN102693728A (en) Cross-platform speech transmission/decoding method for mobile phones
CN109150400B (en) Data transmission method, apparatus, electronic device and computer readable medium
KR101403719B1 (en) System and method for audio available of file conversion
US20070083608A1 (en) Delivering a data stream with instructions for playback
US20140142955A1 (en) Encoding Digital Media for Fast Start on Digital Media Players
CN101207500B (en) Method for acoustic frequency data inflexion
US20080165896A1 (en) Self-configuring media devices and methods
JP4603006B2 (en) Information processing device

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase (Ref document number: 200680035907.5; Country of ref document: CN)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase (Ref document number: 2573/DELNP/2008; Country of ref document: IN)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 06798986; Country of ref document: EP; Kind code of ref document: A1)