CN113079267B - Audio conferencing in a room - Google Patents
- Publication number
- CN113079267B (application CN202110012253.0A)
- Authority
- CN
- China
- Prior art keywords
- computer system
- data
- microphone
- audio
- remote
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M7/00—Arrangements for interconnection between switching centres
- H04M7/006—Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Telephonic Communication Services (AREA)
Abstract
The application discloses audio conferences within a room. The first computer system and the second computer system and their respective first microphone and second microphone receive respective portions of the same audio input signal. The audio buffers received from the first computer system and the second computer system, respectively, include data encoded from respective microphone inputs of the first computer system and the second computer system. The received audio buffers are synchronized and corrected for gain differences between the received audio buffers to produce corrected audio buffers. The corrected audio buffers are mixed into an output buffer. Synchronization reduces echo when the output buffer is played at the remote peer computer system.
Description
Background
1. Technical field
The present invention relates to improving audio quality during audio conferences and audio-video conferences.
2. Description of related Art
Voice over internet protocol (VoIP) communications include encoding voice into digital data, encapsulating the digital data into data packets, and transmitting the data packets over a data network. A conference call is a telephone call between two or more participants at geographically dispersed locations that enables each participant to speak to and listen to the other participants simultaneously. A conference call between participants may be conducted via a voice conference bridge or a centralized server, which connects a plurality of endpoint devices (VoIP devices or computer systems) associated with the participants using an appropriate web conference communication protocol. Alternatively, the conference call may be mediated peer-to-peer, where audio is streamed directly between the participants' computer systems without an intermediate server.
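The encode, encapsulate, and transmit sequence described above can be illustrated with a minimal sketch. The 8 kHz sample rate, the 20 millisecond frame size, and the two-field header (sequence number and sample timestamp) are assumptions chosen for illustration only; they are not taken from this specification and do not reproduce any particular VoIP protocol.

```python
import struct

SAMPLE_RATE_HZ = 8000   # assumed sample rate (not from the specification)
FRAME_MS = 20           # assumed frame duration (not from the specification)
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000
HEADER = struct.Struct("!HI")  # hypothetical header: sequence number, timestamp


def packetize(samples):
    """Split 16-bit PCM samples into one packet per frame."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), SAMPLES_PER_FRAME)):
        frame = samples[start:start + SAMPLES_PER_FRAME]
        payload = struct.pack(f"!{len(frame)}h", *frame)
        packets.append(HEADER.pack(seq & 0xFFFF, start) + payload)
    return packets


def depacketize(packets):
    """Order packets by timestamp and recover the PCM samples."""
    decoded = []
    for packet in packets:
        _seq, timestamp = HEADER.unpack_from(packet)
        count = (len(packet) - HEADER.size) // 2
        frame = struct.unpack_from(f"!{count}h", packet, HEADER.size)
        decoded.append((timestamp, frame))
    samples = []
    for _timestamp, frame in sorted(decoded):
        samples.extend(frame)
    return samples
```

Because each packet carries the index of its first sample, the receiver in this sketch can restore the original order even when packets arrive out of sequence.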
Brief summary of the invention
Various systems and methods are disclosed herein for use in a network that includes a first computer system and a second computer system whose respective first microphone and second microphone are in the same acoustic environment. The first microphone and the second microphone receive respective portions of the same audio input signal. Audio buffers received from the first computer system and the second computer system, respectively, include data encoded from the respective microphone inputs of the first computer system and the second computer system. The received audio buffers are synchronized and corrected for gain differences between them to produce corrected audio buffers. The corrected audio buffers are mixed into an output buffer. Synchronization reduces echo when the output buffer is played on a remote peer computer system. Mixing the corrected audio buffers may include emphasizing the audio buffer from a computer system currently being used for audio input and reducing the audio input from microphones attached to computer systems not currently being used for audio input. The first computer system and/or the second computer system may perform the synchronization and the mixing of the corrected audio buffers. Alternatively, the synchronization and the mixing of the corrected audio buffers may be performed by a server in the network. Prior to synchronization and mixing, the system or method may identify which of the first computer system and the second computer system have microphones that may receive the same audio input signal. An audio buffer may be received from a remote peer computer system of the network that is external to the acoustic environment. The received audio buffer may be transmitted to the first computer system and the second computer system with a corresponding delay such that the received audio buffer is played synchronously on the first computer system and the second computer system. Alternatively, the received audio buffer may be sent to one of the first computer system and the second computer system.
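The emphasis described in this summary can be sketched as a weighted mix. The sketch below assumes that the buffer with the highest RMS level belongs to the computer system currently being used for audio input, and the `boost` share of 0.8 is an arbitrary illustrative value; neither the detection rule nor the weights are specified in this document.

```python
import math


def rms(buffer):
    """Root-mean-square level of a buffer of float samples."""
    if not buffer:
        return 0.0
    return math.sqrt(sum(x * x for x in buffer) / len(buffer))


def emphasis_weights(buffers, boost=0.8):
    """Give the loudest (assumed active) buffer `boost` of the mix.

    The remaining share is split equally among the other buffers.
    """
    if len(buffers) == 1:
        return [1.0]
    levels = [rms(b) for b in buffers]
    active = levels.index(max(levels))
    rest = (1.0 - boost) / (len(buffers) - 1)
    return [boost if i == active else rest for i in range(len(buffers))]


def mix_with_emphasis(buffers, boost=0.8):
    """Weighted sum of already-synchronized buffers, sample by sample."""
    weights = emphasis_weights(buffers, boost)
    length = min(len(b) for b in buffers)
    return [sum(w * b[i] for w, b in zip(weights, buffers))
            for i in range(length)]
```

In this sketch the weights sum to one, so the mixed output cannot exceed the largest input sample in magnitude.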
Various computer-readable media are disclosed that store instructions which, when executed by a processor, cause the processor to perform the methods disclosed herein.
Brief Description of Drawings
The invention is described herein, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 illustrates a scenario in accordance with features of the present invention;
FIG. 1A schematically illustrates a conventional network connection between computer systems participating in an audio conference;
FIG. 1B illustrates a conventional audio stream during an audio conference;
FIG. 2 schematically illustrates audio streaming during an audio conference according to features of the present invention;
FIG. 3 schematically illustrates audio streaming during an audio conference according to another feature of the present invention;
FIG. 4A illustrates an embodiment of audio streaming for audio received during an audio conference in accordance with features of the present invention;
FIG. 4B illustrates an alternative embodiment of audio streaming for audio received during an audio conference in accordance with features of the present invention;
FIG. 5 illustrates a method in accordance with features of the present invention; and
FIG. 6 schematically illustrates a simplified computer system according to conventional technology.
The foregoing and/or other aspects will become apparent from the following detailed description when considered in conjunction with the accompanying drawings.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
By way of introduction, aspects of the present invention are directed to systems and methods for reducing audio echo or unwanted reverberation in an audio conference or audio-video conference implemented over a computer network. In particular, for example, during an audio conference using voice over internet protocol (VoIP), a participant may use a computer workstation equipped with a microphone to participate in the conference. Various embodiments of the present invention may be implemented in a VoIP audio conference implemented peer-to-peer, by a VoIP server, or by a mixture thereof.
Referring now to the drawings, FIG. 1 shows a scenario in accordance with features of the present invention. FIG. 1 shows three participants operating three workstations or computer systems 10A, 10B, and 10C, respectively, which include microphones 2A, 2B, and 2C and speakers 3A, 3B, and 3C, respectively. The workstations 10A and 10B are located in a single room. Workstation 10C is a remote peer computer system operating in another room, another city, or another continent. When there are two participants in a single location (e.g., a single room), a participant's voice may be received both by microphone 2A of the participant's own workstation and by microphone 2B of the workstation of the participant sharing the room. Both workstations 10A and 10B transmit parallel audio streams to remote participants over a network, and when both audio streams of the participant's voice are played, the remote participants of the conference hear echoes of the same voice. Conventionally, participants in a conference who share a room may be required to ensure that only one microphone is unmuted in order to ensure sound quality.
Referring now also to FIG. 1A, there is schematically shown a network connection between computer systems 10A, 10B, and 10C and a voice over internet protocol (VoIP) server 13. Computer systems 10A and 10B may be conventionally interconnected by a local area network (LAN), which may be implemented by a wired network (e.g., IEEE 802.3 Ethernet) or a wireless network (e.g., IEEE 802.11 Wi-Fi). Reference is now also made to FIG. 1B, which illustrates, by way of example, conventional peer-to-peer audio streaming during an audio conference. Specifically, computer system 10A communicates audio buffer A to computer systems 10B and 10C, and similarly computer system 10B communicates audio buffer B to computer systems 10A and 10C. In the scenario shown in FIG. 1, the same speech from a participant is encoded into both audio buffers A and B. If the two buffers are offset by a sufficiently long delay (greater than 30 milliseconds) when they are mixed and played at computer system 10C, the speech may be heard with an echo or unwanted reverberation.
Reference is now also made to FIG. 2, which schematically illustrates audio streaming during an audio conference according to features of the present invention. In this configuration, audio buffer B may be transmitted from computer system 10B and received by computer system 10A. At computer system 10A, audio buffers A and B may be synchronized (e.g., to within 30 milliseconds), mixed, and transmitted to VoIP server 13. VoIP server 13 may transmit the synchronized and mixed audio buffer to remote computer system 10C, so that sound is played at remote computer system 10C without echo.
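This specification does not state how the offset between audio buffers A and B is measured. As one possible approach, offered as an assumption and not as the claimed method, the offset can be estimated by brute-force cross-correlation and then removed by shifting one buffer. The function names and the `max_lag` search window below are hypothetical, and both buffers are assumed to have the same length.

```python
def estimate_lag(reference, other, max_lag):
    """Return the number of samples by which `other` trails `reference`.

    Tries every lag in [-max_lag, max_lag] and keeps the one with the
    largest cross-correlation score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += x * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(reference, other, max_lag):
    """Shift `other` to line up with `reference`, padding with silence."""
    lag = estimate_lag(reference, other, max_lag)
    if lag >= 0:
        shifted = other[lag:] + [0.0] * lag
    else:
        shifted = [0.0] * (-lag) + other[:lag]
    return shifted, lag
```

A lag measured in samples converts to milliseconds as `1000 * lag / sample_rate`, which allows it to be compared against the synchronization tolerance mentioned above. The search cost grows with the product of the buffer length and the search window, so a practical implementation would likely use a faster correlation method.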
Reference is now also made to FIG. 3, which schematically illustrates audio streaming during an audio conference according to another feature of the present invention. Computer systems 10A and 10B separately transmit audio buffers A and B, respectively, to VoIP server 13. VoIP server 13 includes a module 14, which can synchronize and mix audio buffers A and B into a synchronized and mixed audio buffer such that audio is played at computer system 10C without echo.
Referring now also to FIG. 5, a method 50 in accordance with features of the present invention is shown. In step 51, the conferencing application may identify whether two or more computer systems 10 participating in the audio conference have microphones 2 that may receive the same audio input signal. The identification may be performed by prompting a participant to indicate whether another participant of the audio conference is sharing a room with the participant (step 51). In step 52, the corresponding audio buffers may be received from computer systems 10A and 10B, and audio buffers A and B are synchronized (step 53). In step 54, the gain difference between the received audio buffers A and B may be corrected. The microphone 2A may be less sensitive, and/or the signal from the microphone 2A may be streamed at a lower level, than the other microphone 2B, so that gain may be added to the signal from microphone 2A to balance the levels at playback. It may also be desirable to increase the gain of the microphone being used by the participant currently speaking relative to the other unmuted microphones of participants in the conference. In step 55, the audio buffers are mixed into an output buffer, and the output buffer is sent (step 56) to the remote peer computer system 10C. In step 57, the output buffer is played at the remote computer system 10C with reduced echo.
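Steps 54 and 55 can be sketched as follows. The sketch assumes that the buffers have already been synchronized (step 53), that gain correction means matching each buffer's RMS level to that of the loudest buffer, and that mixing is a plain average. These are illustrative choices; the specification does not prescribe a particular gain measure or mixing formula.

```python
import math


def rms(buffer):
    """Root-mean-square level of a buffer of float samples."""
    if not buffer:
        return 0.0
    return math.sqrt(sum(x * x for x in buffer) / len(buffer))


def correct_gain(buffers):
    """Step 54 (sketch): scale each buffer to the RMS of the loudest one."""
    target = max(rms(b) for b in buffers)
    corrected = []
    for b in buffers:
        level = rms(b)
        gain = target / level if level > 0 else 1.0  # leave silence untouched
        corrected.append([x * gain for x in b])
    return corrected


def mix(buffers):
    """Step 55 (sketch): average synchronized buffers sample by sample."""
    length = min(len(b) for b in buffers)
    return [sum(b[i] for b in buffers) / len(buffers) for i in range(length)]
```

For example, if microphone 2A delivers the same waveform as microphone 2B at one fifth of the level, `correct_gain` multiplies buffer A by five, and `mix` then returns a buffer at the level of buffer B.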
Referring now to FIG. 4A, there is illustrated audio streaming for audio received during an audio conference in accordance with features of the present invention. Computer system 10A may receive an audio buffer from VoIP server 13, the audio buffer comprising combined audio from a remote peer computer system (not shown). The computer system 10A may transmit the audio locally to the computer system 10B so that all computer systems 10 in the same room play the audio synchronously. Referring now also to FIG. 4B, there is shown audio streaming for audio received during an audio conference in another configuration. The synchronized audio buffers are sent directly from the VoIP server 13 to the computer systems 10A and 10B. The received audio buffers may be sent to the first computer system 10A and the second computer system 10B with corresponding delays such that the received audio buffers are played synchronously at the first computer system 10A and the second computer system 10B. Alternatively, only one speaker 3 in the room may play the audio.
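The "corresponding delays" can be illustrated with simple arithmetic. The sketch below assumes that the sender knows a one-way latency, in milliseconds, to each computer system in the room; how such latencies would be measured is not described in this document, and the function name is hypothetical.

```python
def playback_delays(latency_ms):
    """Return the extra delay per system that equalizes playback time.

    `latency_ms` maps a system name to its assumed one-way latency.
    The slowest path gets no extra delay; faster paths wait for it.
    """
    slowest = max(latency_ms.values())
    return {name: slowest - latency for name, latency in latency_ms.items()}
```

With assumed latencies of 12 milliseconds to computer system 10A and 30 milliseconds to computer system 10B, the buffer sent to 10A would be held back by 18 milliseconds so that both systems start playback at the same time.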
Referring now to FIG. 6, a simplified computer system 60 is schematically illustrated in accordance with conventional techniques. The computer system 10 includes a processor 601, a storage mechanism including a memory bus 607 for storing information in a memory 609, and a network interface 605 operatively connected to the processor 601 through a peripheral bus 603. The computer system 10 also includes a data input mechanism 611 (e.g., a disk drive) for a computer readable medium 613 (e.g., an optical disk). The data input mechanism 611 is operatively coupled to the processor 601 using the peripheral bus 603. A sound card 614 is operatively connected to the peripheral bus 603. The input of the sound card 614 is operatively connected to the output of the microphone 2, and the output of the sound card 614 is operatively connected to the input of the speaker 3.
In this specification and in the following claims, a "computer system" is defined as one or more software modules, one or more hardware modules, or a combination thereof that work together to perform operations on electronic data. For example, the definition of computer system includes the hardware components of a personal computer as well as software modules, such as the operating system of a personal computer. The physical layout of the modules is not important. The computer system may include one or more computers coupled via a computer network. Likewise, a computer system may include a single physical device (e.g., a mobile phone, a laptop computer, or a tablet computer) with internal modules (e.g., memory and a processor) working together to perform operations on electronic data.
In this specification and in the following claims, a "network" is defined as any architecture in which two or more computer systems may exchange data. The data exchanged may be in the form of electrical signals that are meaningful to two or more computer systems. When data is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer system or computer device, the connection is properly viewed as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of computer-readable media. Computer-executable instructions comprise, for example, instructions and data which cause a general purpose computer system or special purpose computer system to perform a certain function or group of functions. The described embodiments may also be embodied as computer readable code on a non-transitory computer readable medium. A non-transitory computer readable medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the non-transitory computer readable medium include read-only memory, random-access memory, CD-ROM, HDD, DVD, magnetic tape, and optical data storage devices. The non-transitory computer readable medium can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
The various aspects, embodiments, implementations, or features of the described embodiments may be used alone or in any combination. The various aspects of the described embodiments may be implemented in software, hardware, or a combination of hardware and software.
The terms "device," "workstation," and "computer system" are used interchangeably herein.
The term "connected" as used herein refers to both wired and wireless computer connections.
The term "emphasis" as used herein refers to a relative increase in audio gain or audio level.
The term "echo" as used herein refers to what is heard when two audio signals having similar or identical audio inputs are played asynchronously with a time delay of greater than about 10-50 milliseconds.
The term "synchronized" or "synchronization" as used herein refers to a relative time delay of less than about 50 milliseconds between audio signals. In some cases where the participants are in different locations in a large room, there will be some reverberation depending on the room size. In such cases, the term "synchronized" or "synchronization" may refer to a delay of less than about 30 milliseconds. Alternatively, in some embodiments of the invention, it may be desirable to reduce reverberation even further, so that synchronization to within less than about 20 milliseconds or less than 10 milliseconds may be effective.
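The tolerances in this definition can be expressed as sample counts for a given sample rate. The following sketch is illustrative arithmetic only; the sample rate is an assumed parameter supplied by the caller, and the function names are hypothetical.

```python
THRESHOLDS_MS = (50, 30, 20, 10)  # tolerances named in the definition


def ms_to_samples(ms, sample_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return ms * sample_rate_hz // 1000


def is_synchronized(offset_samples, sample_rate_hz, threshold_ms=50):
    """True if the offset between two buffers is below the tolerance."""
    return abs(offset_samples) < ms_to_samples(threshold_ms, sample_rate_hz)
```

At an assumed rate of 48,000 samples per second, the 50 millisecond tolerance corresponds to 2,400 samples and the 10 millisecond tolerance to 480 samples.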
The transitional term "comprising" as used herein is synonymous with "including" and is broad or open-ended and does not exclude additional, unrecited elements or method steps. The articles "a", "an" (such as "a computer system", "an audio buffer") as used herein have the meaning of "one or more", i.e. "one or more computer systems", "one or more audio buffers".
All optional and preferred features and modifications of the described embodiments and the dependent claims may be used in all aspects of the invention taught herein. Furthermore, the various features of the dependent claims, as well as all optional and preferred features and modifications of the described embodiments, are combinable and interchangeable with each other.
While selected features of the invention have been illustrated and described, it should be understood that the invention is not limited to the described features.
While selected embodiments of the present invention have been shown and described, it should be understood that the invention is not limited to the described embodiments. Rather, it should be understood that changes can be made in these embodiments without departing from the scope of the invention as defined in the following claims and their equivalents.
Claims (17)
1. A system operable in a network comprising a first computer system and a second computer system, wherein the first computer system and the second computer system and their respective first microphone and second microphone are in an acoustic environment, wherein the first microphone and the second microphone receive respective portions of the same audio input signal, the system configured to:
receive data from respective audio buffers of the first computer system and the second computer system, wherein the data is encoded from respective microphone inputs of the first computer system and the second computer system;
synchronize the received data from the respective audio buffers and correct a gain difference between said received data of the first microphone input and the second microphone input, thereby producing corrected data; and
mix the corrected data into an output buffer;
wherein the synchronization reduces echo when the corrected data is played at the remote peer computer system.
2. The system of claim 1, wherein mixing the corrected data includes emphasizing data from a computer system currently being used for audio input and reducing input from a microphone attached to a computer system not currently being used for audio input.
3. The system of claim 1, wherein synchronizing and mixing are performed by a computer system selected from the group consisting of the first computer system and the second computer system.
4. The system of claim 1, wherein synchronizing and mixing are performed by a server in the network.
5. The system of claim 1, further configured to:
identify which of the first computer system and the second computer system have microphones that receive the same audio input signal.
6. The system of claim 1, further configured to:
receive remote data from a remote peer computer system of the network, wherein the remote peer computer system is external to the acoustic environment; and
transmit the remote data to the first computer system and the second computer system with a corresponding delay such that the remote data is played synchronously at the first computer system and the second computer system, or transmit the remote data to one of the first computer system and the second computer system.
7. A computerized method executable in a network, the network comprising a first computer system and a second computer system, wherein the first computer system and the second computer system and their respective first microphone and second microphone are in an acoustic environment, wherein the first microphone and the second microphone receive portions of the same audio input signal, the method comprising:
receiving data from respective audio buffers of the first computer system and the second computer system, wherein the data is encoded from respective microphone inputs of the first computer system and the second computer system;
synchronizing the received data and correcting a gain difference between said received data of the first microphone input and the second microphone input, thereby producing corrected data; and
mixing the corrected data into an output buffer;
wherein the synchronization reduces echo when the corrected data is played at the remote peer computer system.
8. The computerized method of claim 7, further comprising:
transmitting the corrected data to a remote peer computer system of the network external to the acoustic environment.
9. The computerized method of claim 7, wherein the mixing the corrected data includes emphasizing data from a computer system currently being used for audio input and reducing input from a microphone attached to a computer system not currently being used for audio input.
10. The computerized method of claim 7, wherein synchronizing and mixing are performed by a computer system selected from the group consisting of the first computer system and the second computer system.
11. The computerized method of claim 7, wherein synchronizing and mixing are performed by a server in the network.
12. The computerized method of claim 7, further comprising:
identifying which of the first computer system and the second computer system have microphones that receive the same audio input signal.
13. The computerized method of claim 7, further comprising:
receiving remote data from a remote peer computer system of the network, wherein the remote peer computer system is external to the acoustic environment; and
transmitting the remote data to the first computer system and the second computer system with a corresponding delay such that the remote data is played synchronously at the first computer system and the second computer system, or transmitting the remote data for playing to one of the first computer system and the second computer system.
14. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform a method in a network comprising first and second computer systems, wherein the first and second computer systems and their respective first and second microphones are in an acoustic environment, wherein the first and second microphones receive portions of a same audio input signal, the method comprising:
receiving data from respective audio buffers of the first computer system and the second computer system, wherein the data is encoded from respective microphone inputs of the first computer system and the second computer system;
synchronizing the received data and correcting a gain difference between said received data of the first microphone input and the second microphone input, thereby producing corrected data; and
mixing the corrected data into an output buffer;
wherein the synchronization reduces echo when the corrected data is played at the remote peer computer system.
15. The non-transitory computer readable storage medium of claim 14, wherein the mixing includes enhancing data from a computer system currently being used for audio input and reducing input from a microphone attached to a computer system not currently being used for audio input.
16. The non-transitory computer readable storage medium of claim 14, further storing instructions that, when executed by a processor, cause the processor to perform:
identifying which of the first computer system and the second computer system have microphones that receive the same audio input signal.
17. The non-transitory computer readable storage medium of claim 14, further storing instructions that, when executed by a processor, cause the processor to perform:
receiving remote data from a remote peer computer system of the network, wherein the remote peer computer system is external to the acoustic environment; and
transmitting the remote data to the first computer system and the second computer system with a corresponding delay such that the remote data is played synchronously at the first computer system and the second computer system, or transmitting the remote data for playing to one of the first computer system and the second computer system.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062957372P | 2020-01-06 | 2020-01-06 | |
US62/957,372 | 2020-01-06 | ||
US17/092,339 US11425258B2 (en) | 2020-01-06 | 2020-11-09 | Audio conferencing in a room |
US17/092,339 | 2020-11-09 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113079267A (en) | 2021-07-06 |
CN113079267B (en) | 2023-05-05 |
Family
ID=76609309
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110012253.0A (Active, CN113079267B) | 2020-01-06 | 2021-01-06 | Audio conferencing in a room |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113079267B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102461140A (en) * | 2009-04-14 | 2012-05-16 | 思杰系统有限公司 | System and method for transmitting computer and voice conference audio via a VoIP device during a teleconference |
CN102625006A (en) * | 2011-01-31 | 2012-08-01 | 深圳三石科技有限公司 | Method and system for synchronization and alignment of echo cancellation data and audio communication equipment |
US8406415B1 (en) * | 2007-03-14 | 2013-03-26 | Clearone Communications, Inc. | Privacy modes in an open-air multi-port conferencing device |
CN103583032A (en) * | 2011-05-11 | 2014-02-12 | 锐德世加拿大公司 | Resource efficient acoustic echo cancellation in IP networks |
CN107408395A (en) * | 2015-04-05 | 2017-11-28 | 高通股份有限公司 | Conference audio management |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8243631B2 (en) * | 2006-12-27 | 2012-08-14 | Nokia Corporation | Detecting devices in overlapping audio space |
US8560331B1 (en) * | 2010-08-02 | 2013-10-15 | Sony Computer Entertainment America Llc | Audio acceleration |
US9767784B2 (en) * | 2014-07-09 | 2017-09-19 | 2236008 Ontario Inc. | System and method for acoustic management |
GB201414352D0 (en) * | 2014-08-13 | 2014-09-24 | Microsoft Corp | Reversed echo canceller |
Also Published As
Publication number | Publication date |
---|---|
CN113079267A (en) | 2021-07-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11386912B1 (en) | Method and computer program product for allowing a plurality of musicians who are in physically separate locations to create a single musical performance using a teleconferencing platform provided by a host server | |
CN113273153B (en) | System and method for distributed call processing and audio enhancement in a conference environment | |
US11910344B2 (en) | Conference audio management | |
US8606249B1 (en) | Methods and systems for enhancing audio quality during teleconferencing | |
US10732924B2 (en) | Teleconference recording management system | |
CN100446529C (en) | Teleconference Arrangement | |
US11710488B2 (en) | Transcription of communications using multiple speech recognition systems | |
US8700720B2 (en) | System architecture for linking packet-switched and circuit-switched clients | |
WO2024159973A1 (en) | Video conference implementation method and apparatus, device, and storage medium | |
US11521636B1 (en) | Method and apparatus for using a test audio pattern to generate an audio signal transform for use in performing acoustic echo cancellation | |
US11985173B2 (en) | Method and electronic device for Bluetooth audio multi-streaming | |
US11089164B2 (en) | Teleconference recording management system | |
JP2006018809A (en) | Efficient routing of real-time multimedia information | |
WO2012055291A1 (en) | Method and system for transmitting audio data | |
US11425258B2 (en) | Audio conferencing in a room | |
CN113079267B (en) | Audio conferencing in a room | |
JP4531013B2 (en) | Audiovisual conference system and terminal device | |
JP2011087074A (en) | Output controller of remote conversation system, method thereof, and computer executable program | |
JP5210788B2 (en) | Speech signal communication system, speech synthesizer, speech synthesis processing method, speech synthesis processing program, and recording medium storing the program | |
EP4300918A1 (en) | A method for managing sound in a virtual conferencing system, a related system, a related acoustic management module, a related client device | |
JP4522332B2 (en) | Audiovisual distribution system, method and program | |
JP2022108957A (en) | DATA PROCESSING DEVICE, DATA PROCESSING SYSTEM, SOUND PROCESSING METHOD | |
JP2016201739A (en) | Voice conference system, voice conference device, method therefor, and program | |
JPH02150153A (en) | Voice conference system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||