CN1618233A - Video conferencing system and method of operation - Google Patents
Video conferencing system and method of operation
- Publication number
- CN1618233A
- Authority
- CN
- China
- Prior art keywords
- video
- active
- video images
- multimedia
- speakers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234327—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/24—Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
- H04N21/2402—Monitoring of the downstream path of the transmission network, e.g. bandwidth available
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/258—Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
- H04N21/25808—Management of client data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/262—Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
- H04N21/26208—Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints
- H04N21/26216—Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints involving the channel capacity, e.g. network bandwidth
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/2662—Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440227—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by decomposing into layers, e.g. base layer and one or more enhancement layers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/462—Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
- H04N21/4621—Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/147—Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
- H04N7/152—Multipoint control units therefor
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Computer Graphics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Telephonic Communication Services (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Description
Technical Field
The present invention relates to video conferencing. It is applicable to, but not limited to, a video switching mechanism that uses layered video coding in H.323-based and/or SIP-based centralized video conferences.
Background Art
As the pace of business accelerates and commercial relationships extend worldwide, the need to bridge communication distances quickly and economically has become a major challenge. To succeed in an increasingly competitive marketplace, bringing customers and staff together effectively is key. Businesses are looking for flexible solutions that support real-time information sharing across countries and continents using various communication methods such as voice, video, image data and any combination thereof.
In particular, multinational organizations increasingly want to eliminate costly travel and to link multiple locations so that groups within the organization can communicate more effectively. Multipoint conferencing systems operating over Internet Protocol (IP) networks attempt to address this need. In the field of this invention, it is known that terminals in a multipoint video conference can exchange audio and video streams in real time. The existing way to set up a multipoint conference over an IP network is to use a multipoint control unit (MCU). An MCU is an endpoint on the network that provides the capability for three or more terminals and/or communication gateways to participate in a multipoint conference. An MCU can also connect two terminals in a point-to-point conference so that the conference has the ability to develop into a multipoint conference.
Referring first to FIG. 1, a known centralized conferencing model 100 is shown. Centralized conferencing uses an MCU-based conference bridge. All terminals (endpoints) 120, 122, 125 send media information, in the form of audio, video and/or data signals, and control information flows 140 to the MCU 110 and receive them from it. These transmissions may be made in a point-to-point fashion, as shown in FIG. 1.
The MCU 110 consists of a multipoint controller (MC) and zero or more multipoint processors (MPs). The MC handles call setup and call-signaling negotiation between all terminals to determine common capabilities for audio and video processing. The MC does not deal directly with any of the media streams. This is left to the MPs, which mix, switch and process the audio, video and/or data bits.
In this manner, MCUs provide the capability to hold multi-location meetings, sales meetings, group meetings and other 'face-to-face' communications. Multipoint conferencing is known to be used in a variety of applications, for example:
(i) Executives and managers at multiple locations can meet 'face-to-face', share real-time information and make decisions more quickly, without the lost time, expense and demands of travel;
(ii) Project teams and knowledge workers can coordinate their individual tasks in real time, and view and revise shared documents, presentations, designs and files; and
(iii) Students, trainees and employees at remote locations can access shared educational/training resources across any distance or time zone.
It is therefore envisaged that MCU-based systems will play an important role in future multimedia communications over IP-based networks.
Such multimedia communications often involve video transmission. In such transmission, a sequence of images, commonly referred to as frames, is conveyed between a transmitting unit and a receiving unit. A multipoint multimedia conferencing system can be set up using various methods, for example those specified by the H.323 and SIP session-layer protocol standards. References for SIP can be found at:
http://www.ietf.org/rfc/rfc2543.txt, and
http://www.cs.columbia.edu/~hgs/sip
Furthermore, in systems using, for example, ITU H.263 video compression [ITU-T Recommendation H.263, 'Video Coding for Low Bit Rate Communication'], the first frame of a video sequence contains a substantial amount of image data, generally referred to as intra-coded information. Because the intra-coded frame is the first frame, it provides a substantial portion of the image to be displayed. The intra-coded frame is followed by inter-coded (predicted) information, which generally comprises data relating to changes in the image being transmitted. Predicted inter-coded information therefore contains far less data than intra-coded information.
In conventional multimedia conferencing systems, users need to identify themselves when they speak so that the receiving terminals know who is speaking. Clearly, if the sending terminal does not identify itself, the listening users have to guess who is speaking.
One known technique addresses this problem by analyzing the audio streams and forwarding the name and video stream of the active speaker to all participants. In a centralized conferencing system, the MCU typically performs this function. The MCU can then send the speaker's name and the corresponding video and audio streams to all participants by switching the appropriate input multimedia stream to the output ports/paths.
Video switching is a well-known technique that aims to deliver a single video stream to each endpoint, equivalent to arranging multiple point-to-point sessions. Video switching can be:
(i) Voice-activated switching, in which the MCU sends the video of the active speaker;
(ii) Timed-activated switching, in which the video of each participant is sent in turn at predetermined time intervals; or
(iii) Individual video selection switching, in which each endpoint can request the participant video stream that he/she wishes to receive.
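The three switching modes listed above can be sketched as a single selection function. This is a minimal illustration only: the names `SwitchMode` and `select_stream`, the 10-second slot length and the fall-back to the first participant are assumptions made for the example and do not come from the patent.

```python
from enum import Enum


class SwitchMode(Enum):
    VOICE_ACTIVATED = "voice"
    TIMED = "timed"
    PERSONAL = "personal"


def select_stream(mode, participants, *, active_speaker=None,
                  elapsed_s=0.0, slot_s=10.0, requested=None):
    """Return the participant whose video is forwarded to an endpoint."""
    if not participants:
        raise ValueError("at least one participant is required")
    if mode is SwitchMode.VOICE_ACTIVATED:
        # Forward the active speaker; fall back to the first participant
        # (an assumption) when nobody is speaking.
        return active_speaker if active_speaker in participants else participants[0]
    if mode is SwitchMode.TIMED:
        # Rotate through the participants, one per time slot.
        return participants[int(elapsed_s // slot_s) % len(participants)]
    if mode is SwitchMode.PERSONAL:
        # Each endpoint names the stream it wants to receive.
        return requested if requested in participants else participants[0]
    raise ValueError(f"unknown mode: {mode}")
```

In the timed mode, for example, three participants with a 10-second slot are shown in turn, so the third participant is selected 25 seconds into the conference.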
Referring now to FIG. 2, a functional block diagram of a conventional video switching mechanism 200 is shown. In a conventional centralized conferencing system, video switching is performed as follows. An MCU 220, for example one located within an Internet Protocol (IP) based network 210, contains a switch 230. The MCU 220 receives the video streams 255, 265, 275, 285 of all participants (user equipment) 250, 260, 270, 280. The MCU may also separately receive a combined (multiplexed) audio stream 290 from the participants who are speaking. The MCU 220 then selects one of the video streams and sends this video stream 240 to all participants 250, 260, 270, 280.
A drawback of such conventional systems is that they send only the video stream of the active speaker. Users may also find it difficult to identify the speaker shown in the video stream if several speakers talk at the same time or if the active speaker changes constantly. This is a particular problem in large video conferences.
Alternatively, the video of every participant may be sent to all participants. However, in wireless-based conferences this approach suffers from bandwidth limitations.
In the field of video technology, it is known that video is transmitted as a series of still images/pictures. Because the quality of a video signal can be degraded during encoding and compression, it is known to include additional 'layers' of information based on the difference between the video signal and the encoded video bitstream. Including the additional layers allows the quality of the received signal to be enhanced on decoding and/or decompression. A layered video bitstream is thus produced using a hierarchy of pictures and enhancement pictures divided into one or more layers.
In a layered (scalable) video bitstream, the video signal can be enhanced beyond the base layer by one of the following:
(i) Increasing the resolution of the picture (spatial scalability);
(ii) Including error information to improve the signal-to-noise ratio of the picture (SNR scalability); or
(iii) Including extra pictures to increase the frame rate (temporal scalability).
Such enhancements may be applied to the whole picture or to arbitrarily shaped objects within the picture, which is termed object-based scalability. To preserve the disposable nature of the temporal enhancement layer, the H.263+ standard specifies that pictures included in the temporal scalability mode should be bidirectionally predicted (B) pictures, as shown in the video stream of FIG. 3.
FIG. 3 shows a schematic illustration of a scalable video arrangement 300, illustrating B-picture prediction dependencies that are known in the field of video coding. An initial intra-coded frame (I1) 310 is followed by a bidirectionally predicted frame (B2) 320. This is followed by a (unidirectionally) predicted frame (P3) 330, and then by a second bidirectionally predicted frame (B4) 340. This in turn is followed by a (unidirectionally) predicted frame (P5) 350, and so on.
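The prediction dependencies of FIG. 3 have two practical consequences that a short sketch can make concrete: a B picture can only be decoded after the reference picture that follows it in display order, and B pictures can be discarded without affecting the remaining pictures. The sketch assumes that each B picture is predicted from the nearest I or P picture on either side; the function names are illustrative only.

```python
def decode_order(display_order):
    """Reorder pictures so that every B picture follows both of its references."""
    order = []
    pending_b = []
    for name in display_order:
        if name.startswith("B"):
            # A B picture must wait for the next I/P picture in display order.
            pending_b.append(name)
        else:
            order.append(name)
            order.extend(pending_b)
            pending_b.clear()
    # Trailing B pictures have no following reference yet; keep them last.
    order.extend(pending_b)
    return order


def drop_b_pictures(display_order):
    """Remove the disposable temporal enhancement (B) pictures."""
    return [name for name in display_order if not name.startswith("B")]
```

For the sequence I1, B2, P3, B4, P5 the decoder would process I1, P3, B2, P5, B4, while a receiver that drops the temporal enhancement layer keeps I1, P3, P5 at a reduced frame rate.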
FIG. 4 is a schematic illustration of a layered video arrangement known in the field of video coding. The layered video bitstream comprises a base layer 405 and one or more enhancement layers 435.
The base layer (layer 1) includes one or more intra-coded pictures (I pictures) 410 that are sampled, coded and/or compressed from the original video signal pictures. In addition, the base layer includes a number of predicted inter-coded pictures (P pictures) 420, 430 predicted from the intra-coded picture 410.
In the enhancement layers (layer 2 or 3 or higher) 435, three types of picture can be used:
(i) Bidirectionally predicted (B) pictures (not shown);
(ii) Enhanced intra (EI) pictures 440, based on the intra-coded picture 410 of the base layer 405; and
(iii) Enhanced predicted (EP) pictures 450, 460, based on the inter-coded pictures 420, 430 of the base layer.
The vertical arrows pointing up from the lower layer illustrate that a picture in the enhancement layer is predicted from a reconstructed approximation of the picture in the reference (lower) layer.
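The layer dependencies of FIG. 4 can be written down as a small reference graph, which shows which base layer pictures a receiver needs before it can decode a given enhancement picture. This is a sketch under stated assumptions: the picture labels combine the picture type with the reference numeral from the figure, picture 430 is assumed to be predicted from picture 420 (forming a chain back to 410), and any temporal prediction within the enhancement layer is deliberately left out because the text describes only the vertical arrows.

```python
# Direct references for each picture (assumed chain 410 -> 420 -> 430).
REFERENCES = {
    "I410": [],
    "P420": ["I410"],
    "P430": ["P420"],
    "EI440": ["I410"],
    "EP450": ["P420"],
    "EP460": ["P430"],
}


def required_pictures(picture, references=REFERENCES):
    """Return every picture that must be decoded before `picture`."""
    needed = set()
    stack = list(references[picture])
    while stack:
        ref = stack.pop()
        if ref not in needed:
            needed.add(ref)
            stack.extend(references[ref])
    return needed
```

Under these assumptions, the EI picture 440 needs only the intra-coded picture 410, whereas the EP picture 460 needs the whole base layer chain 410, 420, 430.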
In summary, scalable video coding is known to be used in multicast multimedia conferences, and only in the context of point-to-point or multicast video communication. However, current wireless networks do not support multicast. Furthermore, with multicast, each layer is sent in a separate multicast session, and the receiver itself decides whether to register for one or more of the sessions.
A need therefore exists for an improved video conferencing arrangement and method of operation in which the above-mentioned disadvantages may be alleviated.
Summary of the Invention
In accordance with the present invention, there are provided a method of relaying video images in a multimedia video conference as claimed in claim 1, a video conferencing arrangement for relaying video images as claimed in claim 7, a wireless device for participating in a video conference as claimed in claim 11, a multipoint processor as claimed in claim 12, a video communication system as claimed in claim 16, a media resource function as claimed in claim 18, a video communication unit as claimed in claim 19 or claim 20, and a storage medium as claimed in claim 23. Further aspects of the invention are as claimed in the dependent claims.
In summary, the inventive concept addresses the shortcomings of the prior-art arrangements by providing a video switching method that improves the identification of participants and speakers in a video conference. The invention uses layered video coding to make better use of the bandwidth available to each user.
Brief Description of the Drawings
FIG. 1 shows a known centralized conferencing model.
FIG. 2 shows a functional block diagram of a conventional video switching mechanism.
FIG. 3 is a schematic illustration of a video arrangement showing picture prediction dependencies known in the field of video coding.
FIG. 4 is a schematic illustration of a layered video arrangement known in the field of video coding.
Exemplary embodiments of the invention will now be described with reference to the accompanying drawings, in which:
FIG. 5 shows a functional block diagram of a video switching mechanism in accordance with the preferred embodiment of the invention.
FIG. 6 shows a functional block diagram/flowchart of a multipoint processing unit in accordance with the preferred embodiment of the invention.
FIG. 7 shows a video display of a wireless device participating in a video conference using the preferred embodiment of the invention.
FIG. 8 shows a UMTS (3GPP) communication system employed in accordance with the preferred embodiment of the invention.
Detailed Description
In overview, the preferred embodiment of the present invention proposes a new video switching mechanism for multimedia conferencing that uses layered video coding. Previously, layered video coding has been used only to split a single video bitstream into more than one layer: a base layer and one or several enhancement layers, as described above with reference to FIG. 4. These known techniques for scalable video communication are described in detail in standards such as H.263 and MPEG-4.
However, the inventors of the present invention have recognized that benefits can be gained by adapting the concept of layered video coding and applying the adapted concept to multimedia video conferencing applications. In this way, the invention defines a use of scalable video coding for multimedia conferencing that differs from its use in point-to-point or multicast video communication.
Referring now to FIG. 5, a functional block diagram 500 of a video switching mechanism in accordance with the preferred embodiment of the invention is shown. In contrast to the conventional centralized conferencing system, video switching is performed as follows. An MCU 520, for example one located within an Internet Protocol (IP) based network 510, contains a switch 530.
Notably, the MCU 520 receives 'layered' video streams, comprising a base layer stream 552, 562, 572, 582 and one or more enhancement layer streams 555, 565, 575, 585 from each of the participants (user equipment) 550, 560, 570, 580. For clarity, only one enhancement layer video stream is shown for each participant.
The MCU 520 may also separately receive a combined (multiplexed) audio stream 590 from the participants. The MCU 520 then uses the switch 530 to select the base layer video streams 535 of a number of active speakers and the enhancement layer 540 of the most active speaker. The MCU 520 then sends these video streams 535, 540 to all participants 550, 560, 570, 580.
Preferably, the selection process that determines the most active speaker is performed by the MCU 520 analyzing the audio stream 590, in order first to determine who all the active speakers are. The most active speaker is then preferably determined in the multipoint processor unit, as described with reference to FIG. 6. Preferably, one or more base layers and one enhancement layer are sent to the participants according to a priority based on each participant's activity.
To implement the improved, albeit more complex, video switching mechanism of FIG. 5, a multipoint processing unit (MP) 600 is adapted to support the new video switching mechanism in accordance with the preferred embodiment, as shown in FIG. 6.
The MP 600 still receives the audio stream 590 from the participants' video/multimedia communication units via a packet filtering module 610 and routes the audio stream to a packet routing module 630. However, the audio stream is now also routed to a speaker identification module 620, which analyzes the audio stream 590 to determine who the active speakers are. The speaker identification module 620 assigns a priority based on each participant's activity and determines:
(i) the most active speaker 622;
(ii) any other active speakers 625; and, by default,
(iii) any remaining inactive speakers.
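The three-way classification performed by the speaker identification module can be sketched as follows. The patent does not say how activity is measured, so this example assumes a hypothetical per-participant audio energy value and an arbitrary activity threshold; `classify_speakers` and its parameters are illustrative names, not part of the described system.

```python
def classify_speakers(audio_energy, threshold=0.1):
    """Split participants into most active, other active and inactive.

    `audio_energy` maps a participant id to a recent activity measure
    (a hypothetical metric; higher means more active).
    """
    active = [p for p, energy in audio_energy.items() if energy >= threshold]
    # Highest activity first; ties keep the original participant order.
    active.sort(key=lambda p: audio_energy[p], reverse=True)
    most_active = active[0] if active else None
    other_active = active[1:]
    inactive = [p for p in audio_energy if p not in active]
    return most_active, other_active, inactive
```

With four participants of whom two exceed the threshold, the louder one becomes the most active speaker, the other becomes a secondary active speaker, and the remaining two are inactive by default.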
In accordance with the preferred embodiment, the speaker identification module 620 then forwards the priority information to a switching module 640, which is adapted to handle the speaker priorities. In addition, the switching module 640 is adapted to receive, via the packet filtering module 610, the layered video streams from the participants' video communication units, comprising the video base layer streams 552, 562, 572 and 582 and the video enhancement layer streams 555, 565, 575 and 585. The switching module 640 uses the speaker information to send the video base layers of the secondary active speakers and of the most active speaker, together with the video enhancement layer of the most active speaker, to all participants via the packet routing module 630.
One or more receive ports of the multipoint processor are therefore adapted to receive layered video streams from a group of user equipment 550, 560, 570 and 580, the layered video streams comprising the base layer video streams 552, 562, 572 and 582 and the enhancement layer video streams 555, 565, 575 and 585. It is within the contemplation of the invention that, if only one active speaker is determined, the switching module 640 may select only one base layer video image and the corresponding one or more enhancement layers. That speaker is then automatically designated the most active speaker, for transmission to one or a group of the user equipment 550, 560, 570 and 580.
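The selection made by the switching module can be sketched as a function that turns the speaker priorities into the set of layers to forward. The cap on the number of secondary speakers (`max_secondary`) is an assumption added for the example, since the patent leaves the number of forwarded base layers open; the function name and return format are likewise illustrative.

```python
def select_streams(most_active, other_active, max_secondary=3):
    """Choose which layers to forward to every participant.

    Returns the participants whose base layer is forwarded and the single
    participant whose enhancement layer is forwarded.
    """
    if most_active is None:
        # Nobody is speaking, so nothing is selected in this sketch.
        return {"base": [], "enhancement": None}
    base = [most_active] + list(other_active)[:max_secondary]
    return {"base": base, "enhancement": most_active}
```

When there is only one active speaker, the result contains a single base layer and the enhancement layer of that same speaker, which matches the single-speaker case described above.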
如在视频会议中发生的,当最活跃发言人经常改变时,将不断交换增强层。本发明的发明人已经认识到这样经常并快速交换所具有的潜在问题。在这种情况下,如果第一帧实际上是来自之前只是第二活跃发言人的预测帧(EP),那么该帧需要被转换为内插帧(EI)。As happens in video conferencing, when the most active speaker changes frequently, enhancement layers will be constantly swapped. The inventors of the present invention have recognized potential problems with such frequent and rapid exchanges. In this case, if the first frame is actually a predicted frame (EP) from the previous only second active speaker, then that frame needs to be converted to an interpolated frame (EI).
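To address this potential problem, the video base layer streams 552, 562, 572 and 582 and the video enhancement layer streams 555, 565, 575 and 585 from the packet filtering module 610 are preferably input to a depacketization function 680. The depacketization function 680 demultiplexes the video streams and provides the demultiplexed video streams to a video decoder and buffer function 670.
To synchronize and coordinate the video decoding, the video decoder and buffer function 670 receives an indication of the most active speaker 622. After extracting the video stream information of the most active speaker, the video decoder and buffer function 670 provides the bidirectionally predicted (BP) 675 and/or predicted (EP) video stream data of the most active speaker 622 to an 'EP frame to EI frame transcoding module' 660. The 'EP frame to EI frame transcoding module' 660 processes the input video stream to provide the primary speaker's enhancement layer video stream as intra-coded (EI) frames.
The primary speaker's enhancement layer video stream is then input to a packetization function 650, where it is packetized and input to the switching module 640. The switching module 640 then combines the primary speaker's enhancement layer video stream with the video base layer streams 552, 562, 572 and 582 of the secondary active speakers and routes the combined multimedia stream to the packet routing module 630. The packet routing module then routes this information to the participants according to the method of FIG. 5.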
In the preferred embodiment of the present invention, the switching module 640 uses the output of the 'EP frame to EI frame transcoding module' 660 when it determines that the primary speaker has changed.
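The switching behaviour on a change of primary speaker can be sketched as below. This is a simplified model under stated assumptions: frames are represented only by their type, and `transcode_to_ei` is a placeholder that relabels the frame, whereas a real transcoder would decode the EP frame against buffered references and re-encode it as an intra-coded frame. The names `EnhFrame`, `EnhancementSwitch` and `transcode_to_ei` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EnhFrame:
    speaker: str
    kind: str  # "EI" (intra-coded) or "EP" (predicted)
    seq: int


def transcode_to_ei(frame: EnhFrame) -> EnhFrame:
    # Placeholder for the EP-to-EI transcoding module: only the label changes.
    return replace(frame, kind="EI")


class EnhancementSwitch:
    """Forwards the primary speaker's enhancement frames, starting with an EI frame."""

    def __init__(self) -> None:
        self.current: Optional[str] = None
        self.need_intra = False

    def forward(self, primary: str, frame: EnhFrame) -> Optional[EnhFrame]:
        if frame.speaker != primary:
            return None  # enhancement layers of other speakers are not forwarded
        if primary != self.current:
            self.current = primary
            self.need_intra = True  # receivers hold no reference for this speaker
        if self.need_intra:
            self.need_intra = False
            if frame.kind != "EI":
                return transcode_to_ei(frame)
        return frame
```

Only the first frame after a change is transcoded; later EP frames of the same speaker pass through unchanged because the receivers now hold a reference.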
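It is within the contemplation of the invention that one or more modules similar to module 660 may also be included in the MP 600, to perform the same function for the secondary speakers when they are deemed to have changed. Otherwise, in an embodiment that uses a single 'EP frame to EI frame transcoding module' 660 to transcode the primary speaker's video stream, when a previously inactive speaker becomes a secondary active speaker, the speaker identification module 620 (or the switching module 640) may request a new intra frame. Alternatively, the switching module 640 may wait for a new intra frame from the new secondary active speaker before sending the corresponding video base layer stream to all participants.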
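The "wait for a new intra frame" alternative can be pictured as a simple gate on each base layer stream. This is a hedged sketch, not the patent's implementation: the class `BaseLayerGate`, its `admit` method and the frame labels "I" and "P" are assumptions made for illustration.

```python
from typing import Set


class BaseLayerGate:
    """Holds back a newly active speaker's base layer until an intra frame arrives."""

    def __init__(self) -> None:
        self.open: Set[str] = set()  # speakers whose base layer is being forwarded

    def admit(self, speaker: str, frame_kind: str, active: Set[str]) -> bool:
        # Forget speakers that are no longer active, so that they must
        # restart from an intra frame if they become active again.
        self.open &= set(active)
        if speaker not in active:
            return False
        if speaker in self.open:
            return True
        if frame_kind == "I":
            self.open.add(speaker)
            return True
        return False  # predicted frame with no reference at the receivers
```

The gate trades a short delay before the new speaker appears for not having to transcode or request anything from the sender.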
Beyond the preferred embodiment, it is also within the contemplation of the invention to use multiple classes of speaker where more than one enhancement layer is available. Using multiple classes of speaker gives finer scalability of the multimedia messages, through improved speaker identification, especially for large video conferences.
It is also within the contemplation of the invention to add predicted-frame to intra-frame conversion for one or more of the base layer streams. In this way, the switching module 640 can switch quickly between base layers without waiting for a new intra frame.
FIG. 7 shows a video display 710 of a wireless device 700 participating in a video conference using the preferred embodiment of the present invention. Implementing the inventive concepts described above yields improved video communication. Specifically, for a given bandwidth, participants can now receive better video quality for the most active speaker 720, because the video quality of the secondary (second) active speaker 730 is reduced and no video is provided for inactive speakers. To provide this improved video conferencing, the video communication device receives the enhancement layer and base layer of the most active speaker 720 and the base layer of the secondary active speaker 730, and receives no video from inactive speakers.
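The bandwidth trade-off can be made concrete with a small calculation. The bit rates used here are purely hypothetical (the patent gives none), and `enhancement_budget` is a name introduced for this sketch.

```python
def enhancement_budget(total_kbps: float, base_kbps: float, n_active: int) -> float:
    """Bandwidth left for the most active speaker's enhancement layer(s).

    `n_active` counts every speaker whose base layer is forwarded,
    including the most active speaker.
    """
    remaining = total_kbps - base_kbps * n_active
    if remaining < 0:
        raise ValueError("downlink cannot carry the requested base layers")
    return remaining
```

For example, on an assumed 128 kbit/s downlink with 32 kbit/s base layers and two active speakers, 64 kbit/s remains for the enhancement layer of the most active speaker. Forwarding a base layer for every one of four participants would leave nothing for enhancement.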
In this way, the video communication unit can present a continuously updated video image of the most active speaker on a larger, higher-resolution display, while a smaller display shows the secondary (second) active speaker.
Preferably, the wireless device 700 has a primary video display 710 for displaying a higher-quality video image of the most active speaker, and one or more second, distinct displays for displaying the respective secondary active speakers. The routing of each video image to its display is preferably performed by a processor (not shown) operably coupled to the video displays. The processor receives indications of the most active speaker 720 and of the secondary active speakers, and determines which received video image should be shown on the first display and which image received from a secondary active speaker 730 should be shown on the second display. Advantageously, the second display may be configured to provide a lower-quality video image of the secondary active speaker, thereby saving cost.
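The display routing performed by the device's processor might look like the following sketch. The function `assign_displays`, the display labels and the idea of passing the received speakers as an ordered list are assumptions for illustration only.

```python
from typing import Dict, List, Optional


def assign_displays(
    received: List[str], most_active: Optional[str], n_secondary_displays: int = 1
) -> Dict[str, str]:
    """Map display names to the speakers whose video they should show.

    `received` lists the speakers whose video is being received, in
    priority order; `most_active` is the indicated most active speaker.
    """
    layout: Dict[str, str] = {}
    if most_active is not None and most_active in received:
        layout["primary"] = most_active  # base plus enhancement layer
    secondaries = [s for s in received if s != most_active]
    for index, speaker in enumerate(secondaries[:n_secondary_displays]):
        layout[f"secondary{index}"] = speaker  # base layer only
    return layout
```

When the indication of the most active speaker changes, calling the function again with the new indication swaps the images between the displays.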
It is anticipated that, in the future, MCU-based systems will facilitate multimedia communication over IP-based networks. The inventors therefore envisage that the techniques described herein may be incorporated into any H.323/SIP-based multipoint multimedia conference or system that uses an MCU.
A preferred application of the above is in the Third Generation Partnership Project (3GPP) specifications for the Wideband Code Division Multiple Access (WCDMA) standard. In particular, the invention can be applied to the IP multimedia domain (described in the 3G TS 25.xxx series of specifications), which is planned to incorporate an H.323/SIP MCU into the 3GPP network. Referring to FIG. 8, the MCU will be supported by the Media Resource Function (MRF) 890A.
FIG. 8 shows, in hierarchical form, a 3GPP (UMTS) communication system/network 800 that can be adapted in accordance with the preferred embodiment of the present invention. The communication system 800 is compliant with, and contains network elements capable of operating over, a UMTS and/or GPRS air interface.
The network is conventionally considered to comprise:
(i) a user equipment domain 810, made up of:
(a) a user SIM (USIM) domain 820, and
(b) a mobile equipment domain 830; and
(ii) an infrastructure domain 840, made up of:
(c) an access network domain 850, and
(d) a core network domain 860, which is in turn made up of (at least):
(di) a serving network domain 870,
(dii) a transit network domain 880, and
(diii) an IP multimedia domain 890, with multimedia provided by SIP (IETF RFC 2543).
In the mobile equipment domain 830, the UE 830A receives data from a user SIM 820A in the USIM domain 820 via the wired Cu interface. The UE 830A communicates data with a Node B 850A in the access network domain 850 via the wireless Uu interface. Within the access network domain 850, the Node B 850A contains one or more transceiver units and communicates with the rest of the cell-based system infrastructure, for example the RNC 850B, via an Iub interface as defined in the UMTS specification.
The RNC 850B communicates with other RNCs (not shown) via the Iur interface. The RNC 850B communicates with the SGSN 870A in the serving network domain 870 via the Iu interface. Within the serving network domain 870, the SGSN 870A communicates with the GGSN 870B via the Gn interface, and with the VLR server 870C via the Gs interface. According to the preferred embodiment of the present invention, the SGSN 870A communicates with the MCU (not shown), which resides within the Media Resource Function (890A) of the IP multimedia domain 890. This communication is performed via the Gi interface.
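As a reading aid, the element-to-element interfaces named in the two preceding paragraphs can be collected into a lookup table. The dictionary layout and the helper `interface_between` are illustrative assumptions; only the element names and interface labels come from the text, and the RNC-to-RNC link is left out.

```python
from typing import Dict, FrozenSet, Optional

# Each key is an unordered pair of network elements; the value is the
# interface that the description names between them.
INTERFACES: Dict[FrozenSet[str], str] = {
    frozenset({"USIM 820A", "UE 830A"}): "Cu",
    frozenset({"UE 830A", "Node B 850A"}): "Uu",
    frozenset({"Node B 850A", "RNC 850B"}): "Iub",
    frozenset({"RNC 850B", "SGSN 870A"}): "Iu",
    frozenset({"SGSN 870A", "GGSN 870B"}): "Gn",
    frozenset({"SGSN 870A", "VLR 870C"}): "Gs",
    frozenset({"SGSN 870A", "MCU (in MRF 890A)"}): "Gi",
}


def interface_between(a: str, b: str) -> Optional[str]:
    """Return the named interface between two elements, or None if not adjacent."""
    return INTERFACES.get(frozenset({a, b}))
```

Because the keys are unordered pairs, the lookup gives the same answer whichever element is named first.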
The GGSN 870B (and/or SGSN) is responsible for interfacing UMTS (or GPRS) with a Public Switched Data Network (PSDN) 880A, such as the Internet, or with a Public Switched Telephone Network (PSTN). The SGSN 870A performs routing and tunneling functions for traffic within the UMTS core network, while the GGSN 870B links to external packet networks, in this case those accessing the UMTS mode of the system.
The RNC 850B is the UTRAN element responsible for the control and allocation of resources for numerous Node Bs; typically, one RNC 850B controls 50 to 100 Node Bs. The RNC 850B also provides reliable delivery of user traffic over the air interface. RNCs communicate with each other (via the Iur interface) to support handover and macrodiversity.
The SGSN 870A is the UMTS core network element responsible for session control and for the interface to the location registers (HLR and VLR). The SGSN is a large centralized controller for many RNCs.
The GGSN 870B is the UMTS core network element responsible for concentrating user data within the core packet network and tunneling it to its ultimate destination (for example, an Internet service provider (ISP)). Such user data includes multimedia and associated signaling data to and from the IP multimedia domain 890. Within the IP multimedia domain 890, the MRF is split into a Multimedia Resource Function Controller (MRFC) 892A and a Multimedia Resource Function Processor (MRFP) 891A. As described above, the MRFC 892A provides the multipoint controller (MC) functionality, while the MRFP 891A provides the multipoint processor (MP) functionality.
The protocol used across the Mr reference point/interface 893A is SIP (as defined in RFC 2543). The Call State Control Function (CSCF) 895A acts as a call server and handles multimedia call signaling.
Thus, according to the preferred embodiment of the present invention, the SGSN 870A, the GGSN 870B and the parts of the MRF 890A are adapted to facilitate multimedia messaging as described above. Furthermore, the UE 830A, the Node B 850A and the RNC 850B are also adapted to facilitate the improved multimedia messaging, as described above.
More generally, this adaptation may be implemented in the respective communication units in any suitable manner. For example, new apparatus may be added to a conventional communication unit, or alternatively existing parts of a conventional communication unit may be adapted, for example by reprogramming one or more processors therein. The required adaptation may thus be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM or RAM, or any combination of these or other storage media.
Alternatively, it is also within the contemplation of the invention that this adaptation of the multimedia messaging may be controlled, implemented in full or implemented in part by adapting any other suitable part of the communication system 800.
Although the above elements are typically provided as discrete units (on their own respective software/hardware platforms), divided across the mobile equipment domain 830, the access network domain 850 and the serving network domain 870, it is envisaged that other configurations may also be used.
Furthermore, in the case of other network infrastructures, such as a GSM network, the processing operations may be performed at any appropriate node, such as any other appropriate type of base station, base station controller, mobile switching center or operations and management controller. Alternatively, the steps described above may be carried out by various components distributed at different locations or entities within any suitable network.
The video conferencing method using layered video coding, as described above, preferably provides the following advantages when applied to centralized video conferencing:
(i) Speaker identification is greatly improved compared with conventional systems, because the shared bandwidth allows one or more enhancement layers and several base layers to be sent instead of a single full-quality video stream.
(ii) Video switching using the inventive concepts described herein is smoother when the active speaker changes, because several states are defined: most active speaker, secondary active speaker and inactive speaker.
(iii) The video quality of the most active speaker is improved.
(iv) The improved video communication unit can display several speakers, with each displayed image depending on the priority associated with the transmission from the respective video communication unit.
A method of relaying video images in a multimedia video conference between a plurality of multimedia user equipment has been described. The method comprises the steps of transmitting layered video images, each comprising a base layer and one or more enhancement layers, from two or more of the plurality of user equipment, and receiving the transmitted layered video images at a multipoint control unit. The base layer images of a number of active speakers, and the one or more enhancement layers of the most active speaker, are selected. The multipoint control unit transmits the base layer video images of the active speakers and the one or more enhancement layers of the most active speaker to one or more of the plurality of multimedia user equipment.
In addition, a video conferencing apparatus for relaying video images between a plurality of user equipment has been described, as has a wireless device for participating in a video conference in which a number of participants transmit video images.
Claims (23)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0202101.2 | 2002-01-30 | ||
GB0202101A GB2384932B (en) | 2002-01-30 | 2002-01-30 | Video conferencing system and method of operation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1618233A true CN1618233A (en) | 2005-05-18 |
Family
ID=9930013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA028277430A Pending CN1618233A (en) | 2002-01-30 | 2002-12-16 | Video conferencing system and method of operation |
Country Status (7)
Country | Link |
---|---|
JP (1) | JP2005516557A (en) |
KR (1) | KR20040079973A (en) |
CN (1) | CN1618233A (en) |
FI (1) | FI20041039L (en) |
GB (1) | GB2384932B (en) |
HK (1) | HK1058450A1 (en) |
WO (1) | WO2003065720A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103023666A (en) * | 2005-09-07 | 2013-04-03 | 维德约股份有限公司 | System and method for a conference server architecture for low delay and distributed conferencing applications |
CN101467423B (en) * | 2006-06-16 | 2013-06-19 | 微软公司 | Performance enhancements for video conferencing |
WO2014005488A1 (en) * | 2012-07-03 | 2014-01-09 | 中国移动通信集团公司 | Video data flow transmission method, terminal and system |
CN106572320A (en) * | 2016-11-11 | 2017-04-19 | 上海斐讯数据通信技术有限公司 | Multiparty video conversation method and system |
CN107968768A (en) * | 2016-10-19 | 2018-04-27 | 中兴通讯股份有限公司 | Sending, receiving method and device, system, the video relaying of Media Stream |
CN108134915A (en) * | 2014-03-31 | 2018-06-08 | 宝利通公司 | For the method and system of mixed topology media conference system |
CN109076128A (en) * | 2016-02-29 | 2018-12-21 | 铁三角有限公司 | conference system |
CN109845151A (en) * | 2016-09-23 | 2019-06-04 | 高通股份有限公司 | Adaptive Modulation order for multi-user's superposed transmission with non-alignment resource |
CN111314738A (en) * | 2018-12-12 | 2020-06-19 | 阿里巴巴集团控股有限公司 | Data transmission method and device |
Families Citing this family (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8659636B2 (en) | 2003-10-08 | 2014-02-25 | Cisco Technology, Inc. | System and method for performing distributed video conferencing |
AU2004306740B2 (en) * | 2003-10-08 | 2010-11-25 | Cisco Technology, Inc. | System and method for performing distributed video conferencing |
JP2006067124A (en) * | 2004-08-25 | 2006-03-09 | Nec Corp | Method and device for switching image encoded data, system, and program |
CN100417220C (en) * | 2004-09-28 | 2008-09-03 | 中兴通讯股份有限公司 | Method for holding multi-point video conference by terminal dialing |
FR2875665A1 (en) * | 2005-01-04 | 2006-03-24 | France Telecom | Video bit stream highlighting method for transmitting stream to videoconference participants, involves adjusting value of encoding quality parameter of video bit stream based on measured value of predefined parameter of audio bit stream |
US7535484B2 (en) * | 2005-03-14 | 2009-05-19 | Sony Ericsson Mobile Communications Ab | Communication terminals that vary a video stream based on how it is displayed |
CN100401765C (en) * | 2005-03-24 | 2008-07-09 | 华为技术有限公司 | Video conference controlling method |
US20060244813A1 (en) * | 2005-04-29 | 2006-11-02 | Relan Sandeep K | System and method for video teleconferencing via a video bridge |
CA2616266A1 (en) | 2005-09-07 | 2007-07-05 | Vidyo, Inc. | System and method for a high reliability base layer trunk |
KR100695206B1 (en) | 2005-09-12 | 2007-03-14 | 엘지전자 주식회사 | Mobile communication terminal sharing device buffer and buffer sharing method using same |
JP2007158410A (en) | 2005-11-30 | 2007-06-21 | Sony Computer Entertainment Inc | Image encoder, image decoder, and image processing system |
US8436889B2 (en) | 2005-12-22 | 2013-05-07 | Vidyo, Inc. | System and method for videoconferencing using scalable video coding and compositing scalable video conferencing servers |
KR100666995B1 (en) * | 2006-01-16 | 2007-01-10 | 삼성전자주식회사 | Selective media data provision method and system in multimedia conference service |
WO2008042852A2 (en) | 2006-09-29 | 2008-04-10 | Vidyo, Inc. | System and method for multipoint conferencing with scalable video coding servers and multicast |
US8773494B2 (en) * | 2006-08-29 | 2014-07-08 | Microsoft Corporation | Techniques for managing visual compositions for a multimedia conference call |
US8334891B2 (en) | 2007-03-05 | 2012-12-18 | Cisco Technology, Inc. | Multipoint conference video switching |
US8264521B2 (en) | 2007-04-30 | 2012-09-11 | Cisco Technology, Inc. | Media detection and packet distribution in a multipoint conference |
KR100874024B1 (en) * | 2007-09-18 | 2008-12-17 | 주식회사 온게임네트워크 | Repeaters, methods and recording media for relaying interactive content |
EP2046041A1 (en) * | 2007-10-02 | 2009-04-08 | Alcatel Lucent | Multicast router, distribution system,network and method of a content distribution |
US7869705B2 (en) | 2008-01-21 | 2011-01-11 | Microsoft Corporation | Lighting array control |
US8421840B2 (en) * | 2008-06-09 | 2013-04-16 | Vidyo, Inc. | System and method for improved view layout management in scalable video and audio communication systems |
US8319820B2 (en) | 2008-06-23 | 2012-11-27 | Radvision, Ltd. | Systems, methods, and media for providing cascaded multi-point video conferencing units |
US8130257B2 (en) | 2008-06-27 | 2012-03-06 | Microsoft Corporation | Speaker and person backlighting for improved AEC and AGC |
KR101234495B1 (en) * | 2009-10-19 | 2013-02-18 | 한국전자통신연구원 | Terminal, node device and method for processing stream in video conference system |
US8780978B2 (en) * | 2009-11-04 | 2014-07-15 | Qualcomm Incorporated | Controlling video encoding using audio information |
KR101636716B1 (en) | 2009-12-24 | 2016-07-06 | 삼성전자주식회사 | Apparatus of video conference for distinguish speaker from participants and method of the same |
JP5999873B2 (en) * | 2010-02-24 | 2016-09-28 | 株式会社リコー | Transmission system, transmission method, and program |
US20110276894A1 (en) * | 2010-05-07 | 2011-11-10 | Audrey Younkin | System, method, and computer program product for multi-user feedback to influence audiovisual quality |
US8553068B2 (en) * | 2010-07-15 | 2013-10-08 | Cisco Technology, Inc. | Switched multipoint conference using layered codecs |
GB201017382D0 (en) | 2010-10-14 | 2010-11-24 | Skype Ltd | Auto focus |
WO2012072276A1 (en) * | 2010-11-30 | 2012-06-07 | Telefonaktiebolaget L M Ericsson (Publ) | Transport bit-rate adaptation in a multi-user multi-media conference system |
WO2012100410A1 (en) * | 2011-01-26 | 2012-08-02 | 青岛海信信芯科技有限公司 | Method, video terminal and system for enabling multi-party video calling |
US8848025B2 (en) | 2011-04-21 | 2014-09-30 | Shah Talukder | Flow-control based switched group video chat and real-time interactive broadcast |
GB2491852A (en) * | 2011-06-13 | 2012-12-19 | Thales Holdings Uk Plc | Rendering Active Speaker Image at Higher Resolution than Non-active Speakers at a Video Conference Terminal |
KR101183864B1 (en) | 2012-01-04 | 2012-09-19 | 휴롭 주식회사 | Hub system for supporting voice/data share among wireless communication stations and method thereof |
JP6174501B2 (en) * | 2014-02-17 | 2017-08-02 | 日本電信電話株式会社 | Video conference server, video conference system, and video conference method |
CN105450976B (en) * | 2014-08-28 | 2018-08-07 | 南宁富桂精密工业有限公司 | video conference processing method and system |
EP3270371B1 (en) * | 2016-07-12 | 2022-09-07 | NXP USA, Inc. | Method and apparatus for managing graphics layers within a graphics display component |
JP6535431B2 (en) | 2017-07-21 | 2019-06-26 | レノボ・シンガポール・プライベート・リミテッド | Conference system, display method for shared display device, and switching device |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0654322A (en) * | 1992-07-28 | 1994-02-25 | Fujitsu Ltd | System for controlling picture data adaption in tv conference using multi-spot controller |
US5629736A (en) * | 1994-11-01 | 1997-05-13 | Lucent Technologies Inc. | Coded domain picture composition for multimedia communications systems |
DE69515838T2 (en) * | 1995-01-30 | 2000-10-12 | International Business Machines Corp., Armonk | Priority-controlled transmission of multimedia data streams via a telecommunication line |
US6314302B1 (en) * | 1996-12-09 | 2001-11-06 | Siemens Aktiengesellschaft | Method and telecommunication system for supporting multimedia services via an interface and a correspondingly configured subscriber terminal |
WO1998042132A1 (en) * | 1997-03-17 | 1998-09-24 | Matsushita Electric Industrial Co., Ltd. | Method of processing, transmitting and receiving dynamic image data and apparatus therefor |
US6798838B1 (en) * | 2000-03-02 | 2004-09-28 | Koninklijke Philips Electronics N.V. | System and method for improving video transmission over a wireless network |
US20020093531A1 (en) * | 2001-01-17 | 2002-07-18 | John Barile | Adaptive display for video conferences |
2002
- 2002-01-30 GB GB0202101A patent/GB2384932B/en not_active Expired - Fee Related
- 2002-12-16 WO PCT/EP2002/014337 patent/WO2003065720A1/en active Application Filing
- 2002-12-16 CN CNA028277430A patent/CN1618233A/en active Pending
- 2002-12-16 JP JP2003565169A patent/JP2005516557A/en not_active Withdrawn
- 2002-12-16 KR KR10-2004-7011846A patent/KR20040079973A/en not_active Ceased
2004
- 2004-01-30 HK HK04100650A patent/HK1058450A1/en not_active IP Right Cessation
- 2004-07-29 FI FI20041039A patent/FI20041039L/en unknown
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103023666A (en) * | 2005-09-07 | 2013-04-03 | 维德约股份有限公司 | System and method for a conference server architecture for low delay and distributed conferencing applications |
CN101467423B (en) * | 2006-06-16 | 2013-06-19 | 微软公司 | Performance enhancements for video conferencing |
WO2014005488A1 (en) * | 2012-07-03 | 2014-01-09 | 中国移动通信集团公司 | Video data flow transmission method, terminal and system |
CN103533294A (en) * | 2012-07-03 | 2014-01-22 | 中国移动通信集团公司 | Video data flow transmission method, terminal and system |
CN108134915A (en) * | 2014-03-31 | 2018-06-08 | 宝利通公司 | For the method and system of mixed topology media conference system |
CN108134915B (en) * | 2014-03-31 | 2020-07-28 | 宝利通公司 | Method and system for a hybrid topology media conferencing system |
CN109076128A (en) * | 2016-02-29 | 2018-12-21 | 铁三角有限公司 | conference system |
US10791156B2 (en) | 2016-02-29 | 2020-09-29 | Audio-Technica Corporation | Conference system |
CN109076128B (en) * | 2016-02-29 | 2020-11-27 | 铁三角有限公司 | conference system |
CN109845151A (en) * | 2016-09-23 | 2019-06-04 | 高通股份有限公司 | Adaptive Modulation order for multi-user's superposed transmission with non-alignment resource |
CN107968768A (en) * | 2016-10-19 | 2018-04-27 | 中兴通讯股份有限公司 | Sending, receiving method and device, system, the video relaying of Media Stream |
CN106572320A (en) * | 2016-11-11 | 2017-04-19 | 上海斐讯数据通信技术有限公司 | Multiparty video conversation method and system |
CN111314738A (en) * | 2018-12-12 | 2020-06-19 | 阿里巴巴集团控股有限公司 | Data transmission method and device |
Also Published As
Publication number | Publication date |
---|---|
FI20041039L (en) | 2004-09-29 |
GB0202101D0 (en) | 2002-03-13 |
HK1058450A1 (en) | 2004-05-14 |
KR20040079973A (en) | 2004-09-16 |
GB2384932B (en) | 2004-02-25 |
GB2384932A (en) | 2003-08-06 |
WO2003065720A1 (en) | 2003-08-07 |
JP2005516557A (en) | 2005-06-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1618233A (en) | Video conferencing system and method of operation | |
US11503250B2 (en) | Method and system for conducting video conferences of diverse participating devices | |
CN100344097C (en) | Videoconference call set up | |
EP1496700B1 (en) | Apparatus, method and computer program for supporting video conferencing in a communication system | |
CN1575565A (en) | Method and device for multicasting in a UMTS network | |
CN1741610A (en) | The conversion method of image coded data and device, the system and program | |
JP2019530996A (en) | Method and apparatus for use of compact parallel codec in multimedia communications | |
CN1849824A (en) | System and method for performing distributed video conferencing | |
CN1745551A (en) | Communication control device, communication terminal device, server device, and communication control method | |
US9743043B2 (en) | Method and system for handling content in videoconferencing | |
US20140002584A1 (en) | Method of selecting conference processing device and video conference system using the method | |
JP2007096974A (en) | Video conference terminal and display position determining method | |
CN1253000C (en) | Method and system for realizing meeting appointment business | |
CN106254354A (en) | A kind of SDP negotiation method of asymmetric media parameter | |
JP2004304410A (en) | Communication processing apparatus, communication processing method, and computer program | |
Johanson | Multimedia communication, collaboration and conferencing using Alkit Confero | |
GB2378601A (en) | Replacing intra-coded frame(s) with frame(s) predicted from the first intra-coded frame | |
Mankin et al. | The design of a digital amphitheater | |
Jia et al. | Efficient 3G324M protocol Implementation for Low Bit Rate Multipoint Video Conferencing. | |
Gharai et al. | High Definition Conferencing: Present, Past and Future | |
Li et al. | A distributed multimedia conferencing based on SIP |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | ||
C20 | Patent right or utility model deemed to be abandoned or is abandoned |