
WO2007026269A1 - Communication system with landscape viewing mode - Google Patents

Communication system with landscape viewing mode

Info

Publication number
WO2007026269A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
bandwidth
terminal
link
display screen
Prior art date
Application number
PCT/IB2006/052504
Other languages
French (fr)
Inventor
Stijn De Waele
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Publication of WO2007026269A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N2007/145 Handheld terminals

Definitions

  • the invention relates to a communication system with a terminal that contains a display screen and a camera unit.
  • the invention also relates to a method of operating a communication system, to terminals for such a system and a communication network for use in such a system.
  • a mobile video telephone comprises a display screen and a camera unit that is directed at a region from which the display screen will be viewed.
  • a user performs a call with such a mobile video telephone by looking at the display screen while the camera unit captures and transmits images of the user.
  • Such mobile video telephones are also frequently used to show the "landscape" around the user, by directing the camera away from the user towards the "landscape" (i.e. any scene, natural or man-made around the user).
  • the display screen is also turned away from the user, so that the user cannot view it.
  • Video telephony requires a considerable amount of transmission bandwidth in the network. It is desirable to reduce the bandwidth use. It is in particular desirable to reduce bandwidth use in the wireless transmission parts of networks.
  • In video conferencing systems it is known to allocate bandwidth use dependent on user interest. As is well known, videoconferences typically involve more than two persons talking with each other, a video conferencing system allowing each participant to view one or more of the other participants. In simple video conferencing systems only the image of a current speaker is shown. In more advanced video conferencing systems images of other participants are shown as well. This requires considerable transmission bandwidth. However, it has been known to reduce the overall bandwidth by using relatively less bandwidth for images of non-speakers.
  • a video conferencing system is described in an article titled "User-interest Driven Video Adaptation for Collaborative Workspace Applications" by Jeremiah Scholl, Stefan Elf and Peter Parnes and published in Networked Group Communication 2003 3-12. Scholl et al. describe the allocation of bandwidth for transmission from terminals on the basis of user interest from other terminals.
  • a first terminal of a participant is made to transmit images of its participant using a larger bandwidth if it is detected that at least one participant displays these images in a focus window (and not in a secondary, non-focus window) at a second terminal.
  • a handheld terminal that comprises a display screen and a camera unit, such as a mobile video telephone.
  • the bandwidth used for transmission of image data to a target terminal for display on a display screen of the target terminal is increased and decreased dependent on whether a face has been detected or not by a camera unit of the target terminal respectively, the camera unit being directed at a region from which the display screen can be viewed.
  • a typical application arises when a user turns a handheld target terminal to capture a "landscape view". This action is detected from the fact that no face is visible in the image from the camera unit. Thereupon bandwidth via the link to the terminal is saved by reducing the bandwidth of image data for display on the display screen.
  • This feature is primarily intended for use other than videoconferencing situations, although it can of course also be used by a participant of a video conference for showing a landscape for example.
  • the target face (viewing) dependent bandwidth adjustment feature should be contrasted with the videoconferencing feature wherein bandwidth is allocated merely based on the presence of a speaker at the source of the image data. It should also be contrasted with voting techniques wherein bandwidth is increased for video conferencing participants that are displayed predominantly at terminals where participants are present.
  • a method of operating a communication system comprising a terminal with a display screen coupled to a network interface for displaying images received from the network interface and a camera unit directed at a region from which the display screen can be viewed.
  • the system comprises a terminal with a display screen coupled to a network interface for displaying images received from the network interface and a camera unit directed at a region from which the display screen can be viewed.
  • it is detected whether a face is visible in an image from the camera unit, and the bandwidth that is used to transmit image data for display on said display screen via the link is reduced when no face is detected in the image and increased when a face is detected.
  • a typical application is that when a user turns the terminal to display a "landscape view" this is detected because no face is visible in the image from the camera unit.
  • bandwidth via the link to the terminal is saved by reducing the bandwidth of image data for display on the display screen.
  • the terminal is a handheld mobile telephone with a camera unit and a display screen.
  • Face detection may be performed in the network so that a known terminal can be used.
  • face detection may be performed in the terminal, a signal indicative of the face detection being sent via the link.
  • the bandwidth adjustment may be performed in the network, for example by means of transcoding of image data from a further terminal dependent on face detection so that a standard terminal, like another mobile telephone can be used to produce the original image data.
  • the bandwidth may be adjusted in the further terminal dependent on face detection for images from the target of the image data, for example by increasing the compression rate for compression of the image data or during transcoding.
  • Bandwidth adjustment in the further terminal also reduces bandwidth use for communication with the further terminal.
  • Methods of increasing and decreasing compression rate are known per se and include for example alternatives like converting to a lower spatial resolution, reducing temporal resolution of a stream of images, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques.
  • image data, be it at a lowered bandwidth, is supplied even if no face is detected.
  • the image data can be made available for time-delayed viewing at the terminal after showing a "landscape", or for display on a further display screen at the rear of the terminal.
  • a size of a detected face in said image is measured and the bandwidth is progressively decreased with decreasing measured size.
  • less bandwidth is used when the image data is viewed from a greater distance.
  • face recognition is performed and the bandwidth is decreased if the face of the user is not detected.
  • Fig. 1 shows a communication system
  • Fig. 1a shows a communication system
  • Figs. 2-3 show further communication systems
  • Figure 1 shows a communication system with a first terminal 10, a radio interface 12, a communication network 14 and a second terminal 16.
  • First and second terminal 10, 16 are coupled to communication network 14, the former via radio interface 12.
  • first and second terminal are mobile video telephones.
  • the system is, or comprises, a mobile telephone system, first terminal 10 being a handheld mobile telephone with a wireless radio link to radio interface 12, which may be a mobile telephone base station for example.
  • First terminal 10 comprises a local radio interface 100, a decompression unit 102, a display screen 104, a camera unit 106 and a face detector 108.
  • Display screen 104 is coupled to local radio interface 100 via decompression unit 102.
  • Camera unit 106 is coupled to local radio interface 100 via face detector 108. Camera unit 106 is directed so that it images a region of space in front of display screen 104 where a human user that views display screen 104 will be located.
  • a separate decompression unit 102 and face detector 108 are shown, it should be understood that in practice the functions of these units may be performed by executing different programs on the same processing circuit. Alternatively separate circuits may be used, which are programmed to perform these functions or hardwired to do so.
  • a connection of camera unit 106 to local radio interface 100 via face detector is shown, it should be appreciated that usually camera unit 106 is also coupled to local radio interface 100 for transmitting image data from camera unit 106, typically after compression.
  • Second terminal 16 comprises a camera unit 160 coupled to a compression unit 162 that is in turn coupled to a network interface 164.
  • Network interface 164 has an output coupled to a bandwidth control input of compression unit 162.
  • second terminal 16 is similar to first terminal 10, network interface 164 being a radio interface and second terminal 16 comprising a display screen (not shown) also coupled to network interface 164.
  • a camera unit 160 is used as a source device for image data, it should be appreciated that alternatively other types of source apparatus, such as a graphics image generator, may be used.
  • camera unit 160 of second terminal 16 captures a stream of images and sends electronic data that represent the stream of images to compression unit 162.
  • Compression unit 162 compresses the electronic data and sends compressed data to network interface 164.
  • Network interface 164 transmits the compressed data through communication network 14 to radio interface 12, which in turn transmits the compressed data to first terminal 10.
  • Local radio interface 100 passes the compressed data to decompression unit 102, which decompresses the compressed data and uses decompressed data to control image display on display screen 104.
  • Camera unit 106 captures images of the region of space in front of display screen 104 and feeds resulting image data to face detector 108.
  • Face detector 108 detects from the image data whether a face is present in the images and transmits control data that is indicative of a result of said detection to second terminal 16 via communication network 14.
  • An SMS message may be used, for example, to transmit the control data if a telephone network is used. This has the advantage that no adaptations of the network are needed, but any other type of messaging may be used.
  • the invention does not depend on the specific face detection technique that is used. Any technique may be used, for example any of the techniques that are described in the publications mentioned in the introduction.
  • Network interface 164 receives the control data and feeds the control data to the control input of compression unit 162.
  • Compression unit 162 adapts the compression rate dependent on the control data.
  • when the control data indicates that a face has been detected in the image from camera unit 106, compression unit 162 uses a first compression rate, and when the control data indicates that no face has been detected in the image from camera unit 106, compression unit 162 uses a second, higher compression rate (so that less bandwidth is needed when no face has been detected than when a face has been detected).
  • Methods of increasing and decreasing compression rate are known per se and include for example alternatives like converting the image from camera unit 160 to a lower spatial resolution, reducing temporal resolution of the stream of images from camera unit 160, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques.
  • compression unit 162 operates on a mode-switching basis, by supporting different modes of compression, with different compression rates (compressed data bandwidth) and switching from a first mode with a first compression rate to a second mode with a second compression rate, which is higher than the first compression rate (produces image data that requires less bandwidth) upon receiving information indicating that no face has been detected.
  • Compression unit 162 switches back from the second mode to the first mode upon receiving information that a face has been detected.
  • automatic switching to the second mode may be used when no information that indicates that a face has been detected has been received during a predetermined time interval. This provides for minimal bandwidth.
  • automatic switching to the first mode may be used when no information that indicates that no face has been detected has been received during a predetermined time interval. This provides for mixed use of first terminals that do and do not support face detection.
  • Network interface 164 receives the compressed data and transmits it through communication network 14.
  • Radio interface 12 receives the compressed data from communication network 14 and transmits it to first terminal 10.
  • Local radio interface 100 of first terminal 10 feeds the compressed signal to decompression unit 102, which decompresses the signal and uses it to control image display on display screen 104.
  • transmission of image data may be omitted entirely when no face is detected. However, preferably, at least some image data is transmitted. This data may be used for storage in a memory (not shown) in first terminal 10 for time-delayed viewing of an image from second terminal 16. In an embodiment a reduction of temporal resolution (fewer images per second) is used to increase the compression rate. The resulting loss of quality is more acceptable for time delayed viewing (which does not require a realtime experience) than if only a reduction of spatial resolution were used to realize the same compression rate.
  • Figure 1a shows an embodiment wherein first terminal 10 comprises a first and second display screen 104, 104a facing mutually opposite directions from first terminal 10 (that is the region from which second display screen 104a will be viewed is not within the field of view of camera unit 106).
  • Second display screen 104a has a lower resolution than first display screen 104 and is used during "landscape viewing".
  • the reduction in bandwidth is at least partly realized by decreasing spatial resolution of the transmitted images.
  • first terminal 10 switches off image display on second display screen 104a when a face is detected in the images from camera unit 106.
  • first terminal 10 switches on image display on second display screen 104a when no face is detected in the images from camera unit 106.
  • power consumption by first terminal 10 can be reduced.
  • face detector 108 is arranged to measure one or more properties of the detected face, such as a size of the face (e.g. from a distance between detected eyes or the surface area of the face).
  • face detector 108 includes information on this or these properties in the control data.
  • compression unit 162 is arranged to adapt the compression rate on the detected properties, for example by increasing the compression rate (reducing the required bandwidth) with decreasing size of the detected face.
  • first terminal 10 transmits information to reduce the bandwidth also if a face is detected when the measured properties of the face differ by more than a threshold amount from the reference properties. This reduces the probability that appearance of a face other than that of the user in a landscape causes a return to high bandwidth transmission.
  • Reference face properties may be stored for example after a learning phase in which a user shows his or her face to camera unit 106 and face detector 108 extracts reference properties in the way it extracts the properties during normal use.
  • the learning phase may be part of normal use while camera unit 106 faces the user, before first terminal 10 is turned for "landscape viewing". Alternatively a separate learning phase may be provided. It should be noted that perfect recognition is not needed. Any recognition may reduce bandwidth use.
  • Properties that are useful for face recognition include image data for pattern matching, ratios between eye distance and eye to mouth distance, face shape properties etc.
  • first terminal 10 may transmit information to reduce the bandwidth even if a face is detected.
  • Motion can be detected from the images from camera unit 106 or by a mechanical motion detector (not shown) in first terminal 10.
  • local radio interface 100 transmits image data that represents a stream of images, including the images that are used for face detection, from camera unit 106 to network 14, for display at second terminal 16.
  • the landscape images are transmitted as well as used to decide whether the bandwidth for images in the opposite direction should be reduced.
  • the images that are used for face detection may not be transmitted from first terminal 10, for example if it has been indicated that these images are not needed at second terminal 16.
  • bandwidth demand is reduced (preferably in a network/link that is arranged to use the freed bandwidth for other calls unrelated to the video call that is performed using the first and second terminal, or for data transmission).
  • This bandwidth reduction technique may also be applied to videoconferencing applications, to save transmission bandwidth towards a user.
  • this bandwidth reduction technique by itself wherein the face recognition is used to select image streams of one or more participant that will be displayed with maximum resolution (bandwidth) while reducing transmission of data from other participants.
  • FIG. 2 shows an embodiment wherein face detection and the increase of compression rate are performed in an interface 20 between network 14 and the radio link to first terminal 10.
  • a transcoder 22 and a face detector 24 are used in interface 20.
  • Transcoder 22 is coupled between network 14 and radio interface 12.
  • Face detector 24 receives image signals from radio interface 12 and controls transcoder 22.
  • Transcoder 22 passes signals between network 14 and radio interface 12, replacing image data by transcoded image data when face detector 24 signals that no face has been detected.
  • the transcoded image data has a higher compression rate (lower bandwidth) than the image data from network 14.
  • transcoding may be used, for example alternatives like converting the image to a lower spatial resolution, reducing temporal resolution of the stream of images, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques.
  • the embodiment of figure 2 saves bandwidth on the radio link to first terminal 10, where bandwidth reduction is most critical.
  • no additional provisions for face detection and adaptive image coding need to be included in the terminals.
  • Figure 3 shows an embodiment wherein face detection and the increase of compression rate are performed in network interfaces (not in terminals 10, 16), but on different sides of network 14.
  • a face detector 32 and a multiplexer 30 are included between network 14 and radio interface 12 to detect a face from image data from first terminal 10 and to add information about face detection to data sent through network 14.
  • a network interface 34 and a transcoder 36 are provided between network 14 and second terminal 16 to control the compression rate dependent on whether a face has been detected.
  • the embodiment of figure 3 saves bandwidth in network 14 as well as the radio link to first terminal 10. Moreover, no additional provisions for face detection and adaptive image coding need to be included in the terminals. It will be appreciated that a similar result can be obtained in an alternative embodiment wherein both face detection and transcoding are performed between network 14 and second terminal 16.
  • face-detection-adaptive compression are preferably performed in second terminal 16. This saves bandwidth in the link between network 14 and second terminal 16.
  • face detection involves an image recognition process that may occasionally produce erroneous results (detecting a face when there is none, or detecting no face when there is one, even leaving aside the possibility that a user "plays" the system by disguising him- or herself etc.). Typically this is not a serious problem, since it merely leads to occasionally increased bandwidth use (detection when no face is present) or occasional loss of image quality (no detection when a face is present).
  • first terminal 10 provides for user control of the activation and de-activation of the adaptive compression mechanism.
  • first terminal 10 has a first and second user selectable mode, the first mode supporting automatic adaptation of the compression rate based on face detection and the second mode disabling the automatic adaptation (which corresponds to permanent face detection).
  • first terminal 10 provides for the transmission of a mode control signal to the network interface on the side of network 14.
  • the network interface is arranged to support a first and second user selectable mode dependent on the control signal, the first mode supporting automatic adaptation of the compression rate based on face detection and the second mode disabling the automatic adaptation (which corresponds to permanent face detection).
  • first terminal 10 is a handheld mobile telephone
  • embodiments using other communication equipment with a display screen and a camera unit are also possible.
  • by using face detection on images from first terminal 10 to control the bandwidth of images that are sent to first terminal 10, reducing bandwidth if no face is detected, overall bandwidth use in the communication system is reduced.
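
The mode-switching behaviour that the list above attributes to compression unit 162 (a first mode, a second mode with a higher compression rate, and automatic fallback after a predetermined time interval without reports) can be illustrated with a minimal sketch. The class name, the mode labels and the explicit timestamps are assumptions made for illustration only; the source does not specify an implementation. As a simplification, any received report resets the timer, and the fallback mode is a parameter covering both fallback variants mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the two modes described for compression unit 162.
HIGH_QUALITY = "first_mode"    # lower compression rate, more bandwidth
LOW_BANDWIDTH = "second_mode"  # higher compression rate, less bandwidth


@dataclass
class CompressionModeController:
    """Illustrative stand-in for the mode logic of a compression unit."""
    timeout_s: float                    # assumed "predetermined time interval"
    fallback_mode: str = HIGH_QUALITY   # mode used when reports stop arriving
    mode: str = HIGH_QUALITY
    last_report_s: Optional[float] = None

    def on_control_data(self, face_detected: bool, now_s: float) -> str:
        """Apply a face-detection report received from the viewing terminal."""
        self.last_report_s = now_s
        self.mode = HIGH_QUALITY if face_detected else LOW_BANDWIDTH
        return self.mode

    def tick(self, now_s: float) -> str:
        """Switch to the fallback mode if no report arrived within the timeout."""
        if self.last_report_s is None or now_s - self.last_report_s > self.timeout_s:
            self.mode = self.fallback_mode
        return self.mode
```

With `fallback_mode=HIGH_QUALITY` the sketch tolerates viewing terminals that never send reports (mixed use); with `fallback_mode=LOW_BANDWIDTH` it favours minimal bandwidth.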

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A communication system comprises a terminal (10) with a network interface (100) for communicating with a communication network (14) via a link. The terminal (10) comprises a display screen (104) coupled to the network interface (100) for displaying images received from the network interface (100) and a camera unit (106) directed at a region from which the display screen (104) can be viewed. It is detected whether a face is visible in an image from the camera unit (106). The bandwidth that is used to transmit image data via the link for controlling image content on the display screen (104) is reduced when no face is detected in the image and increased when a face is detected. Thus, more bandwidth is used for displaying images when it is detected that someone looks at the screen. In a mobile telephone for example this can be used to decrease bandwidth use towards the mobile telephone when the user uses the mobile telephone to show a 'landscape' rather than his or her face.

Description

Communication system with landscape viewing mode
The invention relates to a communication system with a terminal that contains a display screen and a camera unit. The invention also relates to a method of operating a communication system, to terminals for such a system and a communication network for use in such a system.
In recent years mobile video telephony has become increasingly popular. A mobile video telephone comprises a display screen and a camera unit that is directed at a region from which the display screen will be viewed. A user performs a call with such a mobile video telephone by looking at the display screen while the camera unit captures and transmits images of the user. Such mobile video telephones are also frequently used to show the "landscape" around the user, by directing the camera away from the user towards the "landscape" (i.e. any scene, natural or man-made around the user). As a side effect the display screen is also turned away from the user, so that the user cannot view it. Some mobile video telephones overcome this problem by using an additional display screen on the mobile video telephone on the side opposite the camera unit.
Video telephony requires a considerable amount of transmission bandwidth in the network. It is desirable to reduce the bandwidth use. It is in particular desirable to reduce bandwidth use in the wireless transmission parts of networks. In video conferencing systems it is known to allocate bandwidth use dependent on user interest. As is well-known videoconferences typically involve more than two persons talking with each other, a video conferencing system allowing each participant to view one or more of the other participants. In simple video conferencing systems only the image of a current speaker is shown. In more advanced video conferencing systems images of other participants are shown as well. This requires considerable transmission bandwidth. However, it has been known to reduce the overall bandwidth by using relatively less bandwidth for images of non-speakers.
A video conferencing system is described in an article titled "User-interest Driven Video Adaptation for Collaborative Workspace Applications" by Jeremiah Scholl, Stefan Elf and Peter Parnes and published in Networked Group Communication 2003 3-12. Scholl et al. describe the allocation of bandwidth for transmission from terminals on the basis of user interest from other terminals. A first terminal of a participant is made to transmit images of its participant using a larger bandwidth if it is detected that at least one participant displays these images in a focus window (and not in a secondary, non-focus window) at a second terminal.
Thus, effectively transmission bandwidth is allocated to a participant on the basis of votes from other participants. However, a vote on the basis of the mode of display on a second terminal means very little if the participant of the second terminal isn't looking at the videoconference at all. To avoid assigning undue importance to the mode of display, Scholl et al. propose to use "hints" to determine whether the participant is active, before assigning weight to the mode of display at the second terminal. One hint is generated by detecting motion in front of the camera of the second terminal. If there is no such motion, no weight is assigned to the mode of display at the second terminal; as a result, less transmission bandwidth may be allocated to the first terminal.
From US patent No. 5,774,591 it is known to use face detection in connection with video conferencing (and many other applications). US 5,774,591 elaborately describes techniques for face detection and the measurement of properties of faces. Face detection is used to open a channel to a camera that produces an image wherein a face is detected, or wherein it is detected that the face looks at the camera. Another application involves a data processing system wherein more processing resources are allocated to processes that produce image data in screen areas to which the gaze of a user is directed.
Face detection in general has been described in many publications, such as "Detecting Faces in Images: A Survey", by Ming-Hsuan Yang, David J. Kriegman and Narendra Ahuja, published in the IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002, pages 34-58.
Among others it is an object to reduce bandwidth use in a system with a terminal that comprises a display screen and a camera unit.
Among others it is an object to reduce bandwidth use in a system with a handheld terminal that comprises a display screen and a camera unit, such as a mobile video telephone. Among others it is an object to reduce bandwidth use in a mobile telephone network that supports mobile video telephony.
Among others it is an object to reduce bandwidth use during "landscape viewing" with a terminal that comprises a display screen and a camera unit. According to one feature the bandwidth used for transmission of image data to a target terminal for display on a display screen of the target terminal is increased and decreased dependent on whether a face has been detected or not by a camera unit of the target terminal respectively, the camera unit being directed at a region from which the display screen can be viewed. A typical application arises when a user turns a handheld target terminal to capture a "landscape view". This action is detected from the fact that no face is visible in the image from the camera unit. Thereupon bandwidth via the link to the terminal is saved by reducing the bandwidth of image data for display on the display screen.
This feature is primarily intended for use other than videoconferencing situations, although it can of course also be used by a participant of a video conference for showing a landscape for example. In this case the target face (viewing) dependent bandwidth adjustment feature should be contrasted with the videoconferencing feature wherein bandwidth is allocated merely based on the presence of a speaker at the source of the image data. It should also be contrasted with voting techniques wherein bandwidth is increased for video conferencing participants that are displayed predominantly at terminals where participants are present.
A method of operating a communication system is provided wherein the system comprises a terminal with a display screen coupled to a network interface for displaying images received from the network interface and a camera unit directed at a region from which the display screen can be viewed. During operation it is detected whether a face is visible in an image from the camera unit, and the bandwidth that is used to transmit image data for display on said display screen via the link is reduced when no face is detected in the image and increased when a face is detected. A typical application is that when a user turns the terminal to display a "landscape view" this is detected because no face is visible in the image from the camera unit. Thereupon bandwidth via the link to the terminal is saved by reducing the bandwidth of image data for display on the display screen.
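
As a rough illustration of the method in the preceding paragraph, the following sketch maps a sequence of per-interval face detection results to the bandwidth used towards the terminal. The function names and the bit rates (384 and 64 kbit/s) are hypothetical values chosen for the example; the source does not give concrete rates.

```python
from typing import Iterable, List


def bitrate_schedule(face_flags: Iterable[bool],
                     full_kbps: int = 384,
                     reduced_kbps: int = 64) -> List[int]:
    """Return the bit rate per interval: full when a face is seen, else reduced."""
    return [full_kbps if face_seen else reduced_kbps for face_seen in face_flags]


def average_kbps(schedule: List[int]) -> float:
    """Average bit rate over the call, useful for estimating the saving."""
    return sum(schedule) / len(schedule)


# Example: the user looks at the screen, shows a landscape, then looks back.
schedule = bitrate_schedule([True, False, False, True])
```

In this example the two "landscape" intervals run at the reduced rate, so the average drops below the full rate without interrupting the image stream.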
In an embodiment the terminal is a handheld mobile telephone with a camera unit and a display screen. Face detection may be performed in the network so that a known terminal can be used. Alternatively face detection may be performed in the terminal, a signal indicative of the face detection being sent via the link. Thus no extra face detection function is needed in the network. Similarly the bandwidth adjustment may be performed in the network, for example by means of transcoding of image data from a further terminal dependent on face detection, so that a standard terminal, like another mobile telephone, can be used to produce the original image data. Alternatively, the bandwidth may be adjusted in the further terminal dependent on face detection for images from the target of the image data, for example by increasing the compression rate for compression of the image data or during transcoding. Bandwidth adjustment in the further terminal also reduces bandwidth use for communication with the further terminal. Methods of increasing and decreasing compression rate are known per se and include for example alternatives like converting to a lower spatial resolution, reducing temporal resolution of a stream of images, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques.
Preferably some image data, be it at a lowered bandwidth, is supplied even if no face is detected. Thus, the image data can be made available for time-delayed viewing at the terminal after showing a "landscape", or for display on a further display screen at the rear of the terminal.
In a further embodiment a size of a detected face in said image is measured and the bandwidth is progressively decreased with decreasing measured size. Thus, less bandwidth is used when the image data is viewed from a greater distance. In a further embodiment face recognition is performed and the bandwidth is decreased if the face of the user is not detected. Thus faces that are accidentally captured during landscape viewing will lead to less increase in bandwidth.
These and other objects and advantages of the invention will be described using non-limitative embodiments that are shown in the accompanying figures.
Fig. 1 shows a communication system;
Fig. 1a shows a communication system;
Figs. 2-3 show further communication systems.
Figure 1 shows a communication system with a first terminal 10, a radio interface 12, a communication network 14 and a second terminal 16. First and second terminal 10, 16 are coupled to communication network 14, the former via radio interface 12. In one embodiment first and second terminal are mobile video telephones. Preferably the system is, or comprises, a mobile telephone system, first terminal 10 being a handheld mobile telephone with a wireless radio link to radio interface 12, which may be a mobile telephone base station for example.
First terminal 10 comprises a local radio interface 100, a decompression unit 102, a display screen 104, a camera unit 106 and a face detector 108. Display screen 104 is coupled to local radio interface 100 via decompression unit 102. Camera unit 106 is coupled to local radio interface 100 via face detector 108. Camera unit 106 is directed so that it images a region of space in front of display screen 104 where a human user that views display screen 104 will be located. Although a separate decompression unit 102 and face detector 108 are shown, it should be understood that in practice the functions of these units may be performed by executing different programs on the same processing circuit. Alternatively separate circuits may be used, which are programmed to perform these functions or hardwired to do so. Moreover, although only a connection of camera unit 106 to local radio interface 100 via face detector 108 is shown, it should be appreciated that usually camera unit 106 is also coupled to local radio interface 100 for transmitting image data from camera unit 106, typically after compression.
Second terminal 16 comprises a camera unit 160 coupled to a compression unit 162 that is in turn coupled to a network interface 164. Network interface 164 has an output coupled to a bandwidth control input of compression unit 162. In one embodiment second terminal 16 is similar to first terminal 10, network interface 164 being a radio interface and second terminal 16 comprising a display screen (not shown) also coupled to network interface 164. Although preferably a camera unit 160 is used as a source device for image data, it should be appreciated that alternatively other types of source apparatus, such as a graphics image generator, may be used.
In operation, when a communications link through communication network 14 has been established between first and second terminal 10, 16, camera unit 160 of second terminal 16 captures a stream of images and sends electronic data that represent the stream of images to compression unit 162. Compression unit 162 compresses the electronic data and sends compressed data to network interface 164. Network interface 164 transmits the compressed data through communication network 14 to radio interface 12, which in turn transmits the compressed data to first terminal 10. Local radio interface 100 passes the compressed data to decompression unit 102, which decompresses the compressed data and uses decompressed data to control image display on display screen 104.
Camera unit 106 captures images of the region of space in front of display screen 104 and feeds resulting image data to face detector 108. Face detector 108 detects from the image data whether a face is present in the images and transmits control data that is indicative of a result of said detection to second terminal 16 via communication network 14. An SMS message may be used for example to transmit the control data if a telephone network is used. This has the advantage that no adaptations of the network are needed, but any other type of messaging may be used. The invention does not depend on the specific face detection technique that is used. Any technique may be used, for example any of the techniques that are described in the publications mentioned in the introduction. Network interface 164 receives the control data and feeds the control data to the control input of compression unit 162.
Compression unit 162 adapts the compression rate dependent on the control data. When the control data indicates that a face has been detected in the image from camera unit 106 of first terminal 10, compression unit 162 uses a first compression rate and when the control data indicates that no face has been detected in the image from camera unit 106, compression unit 162 uses a second, higher compression rate (so that less bandwidth is needed when no face has been detected than when a face has been detected). Methods of increasing and decreasing compression rate are known per se and include for example alternatives like converting the image from camera unit 160 to a lower spatial resolution, reducing temporal resolution of the stream of images from camera unit 160, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques. Preferably, compression unit 162 operates on a mode-switching basis, by supporting different modes of compression with different compression rates (compressed data bandwidth) and switching from a first mode with a first compression rate to a second mode with a second compression rate, which is higher than the first compression rate (produces image data that requires less bandwidth), upon receiving information indicating that no face has been detected. Compression unit 162 switches back from the second mode to the first mode upon receiving information that a face has been detected. In a first alternative, automatic switching to the second mode may be used when no information that indicates that a face has been detected has been received during a predetermined time interval. This provides for minimal bandwidth. In a second alternative, automatic switching to the first mode may be used when no information that indicates that no face has been detected has been received during a predetermined time interval. This provides for mixed use of first terminals that do and do not support face detection.
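By way of illustration only, the mode switching and the two automatic fallback alternatives described above may be sketched as follows. The class name `ModeSwitch`, its fields and the use of seconds as time unit are hypothetical and are not taken from the embodiment. The sketch times out on the absence of any report, which yields the same mode as the behaviour described above, since a report of the opposite kind already selects the fallback mode.

```python
from dataclasses import dataclass
from typing import Optional

FIRST_MODE = "first"    # lower compression rate, more bandwidth (face detected)
SECOND_MODE = "second"  # higher compression rate, less bandwidth (no face detected)


@dataclass
class ModeSwitch:
    """Hypothetical mode controller for a compression unit such as unit 162."""
    fallback_mode: str   # SECOND_MODE for the first alternative, FIRST_MODE for the second
    timeout_s: float     # the "predetermined time interval" (value is an assumption)
    mode: str = FIRST_MODE
    last_report_s: Optional[float] = None

    def on_report(self, face_detected: bool, now_s: float) -> str:
        """Select the mode indicated by control data from the face detector."""
        self.mode = FIRST_MODE if face_detected else SECOND_MODE
        self.last_report_s = now_s
        return self.mode

    def on_tick(self, now_s: float) -> str:
        """Fall back automatically when no control data arrived within the interval."""
        if self.last_report_s is None or now_s - self.last_report_s > self.timeout_s:
            self.mode = self.fallback_mode
        return self.mode
```

With `fallback_mode=SECOND_MODE` a terminal that stops reporting a face is served at minimal bandwidth; with `fallback_mode=FIRST_MODE` a terminal that never sends control data is served at the first compression rate.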
Network interface 164 receives the compressed data and transmits it through communication network 14. Radio interface 12 receives the compressed data from communication network 14 and transmits it to first terminal 10. Local radio interface 100 of first terminal 10 feeds the compressed signal to decompression unit 102, which decompresses the signal and uses it to control image display on display screen 104.
As can be appreciated, this has the effect that when no face is detected a reduction is made in the transmission bandwidth that is needed in the links between the terminals and network 14, as well as in the bandwidth that is needed within network 14. The reduction results in reduced quality of the displayed image (or stream of images), which is expected to be acceptable when the image is not viewed. A typical application of this is when first terminal 10 is used for "landscape viewing" by turning camera unit 106 and display screen 104 away from the user.
In one embodiment transmission of image data may be omitted entirely when no face is detected. However, preferably, at least some image data is transmitted. This data may be used for storage in a memory (not shown) in first terminal 10 for time-delayed viewing of an image from second terminal 16. In an embodiment a reduction of temporal resolution (fewer images per second) is used to increase the compression rate. The resulting loss of quality is more acceptable for time delayed viewing (which does not require a realtime experience) than if only a reduction of spatial resolution were used to realize the same compression rate.
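As a non-limitative sketch of the reduction of temporal resolution mentioned above, the following keeps only every k-th image of a stream. The function name `decimate` and the parameter `keep_every` are hypothetical; an actual compression unit would typically realize the reduction inside the encoder.

```python
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def decimate(frames: Iterable[Frame], keep_every: int) -> Iterator[Frame]:
    """Yield every keep_every-th frame, reducing the number of images per second."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for index, frame in enumerate(frames):
        if index % keep_every == 0:
            yield frame
```

With `keep_every=1` the stream is passed unchanged, which would correspond to the case wherein a face is detected; a larger value could be selected when no face is detected.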
Figure 1a shows an embodiment wherein first terminal 10 comprises a first and second display screen 104, 104a facing mutually opposite directions from first terminal 10 (that is, the region from which second display screen 104a will be viewed is not within the field of view of camera unit 106). Second display screen 104a has a lower resolution than first display screen 104 and is used during "landscape viewing". By using transmission with increased compression rate when no face is detected it is ensured that updated images can be displayed on second display screen 104a. Preferably, in this embodiment the increase in compression rate is at least partly realized by decreasing spatial resolution of the transmitted images. In a further embodiment first terminal 10 switches off image display on second display screen 104a when a face is detected in the images from camera unit 106. In another embodiment first terminal 10 switches on image display on second display screen 104a when a face is detected in the images from camera unit 106. Thus power consumption by first terminal 10 can be reduced.
In another embodiment face detector 108 is arranged to measure one or more properties of the detected face, such as a size of the face (e.g. from a distance between detected eyes or the surface area of the face). In this other embodiment face detector 108 includes information on this or these properties in the control data. In this other embodiment compression unit 162 is arranged to adapt the compression rate dependent on the detected properties, for example by increasing the compression rate (reducing the required bandwidth) with decreasing size of the detected face. Thus a size indication which can take any of N non-zero size values can be transmitted (N being an integer greater than one) and N (or N' with 1<N'<N) different compression rates may be used dependent on the size. This is useful for example when the user of first terminal 10 views display screen 104 from a greater distance. In this other embodiment preferably a reduction of spatial resolution is used with decreasing detected face size. In a further embodiment the properties are compared with reference properties for a user of the first terminal 10. In this embodiment first terminal 10 transmits information to reduce the bandwidth also if a face is detected when the measured properties of the face differ by more than a threshold amount from the reference properties. This reduces the probability that appearance of a face other than that of the user in a landscape causes a return to high bandwidth transmission. Reference face properties may be stored for example after a learning phase in which a user shows his or her face to camera unit 106 and face detection unit 108 extracts reference properties in the way it extracts the properties during normal use. The learning phase may be part of normal use while camera unit 106 faces the user, before first terminal 10 is turned for "landscape viewing". Alternatively a separate learning phase may be provided. It should be noted that perfect recognition is not needed. Any recognition may reduce bandwidth use. Properties that are useful for face recognition include image data for pattern matching, ratios between eye distance and eye to mouth distance, face shape properties etc.
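The size-dependent selection of one of N compression rates and the comparison with reference properties may, purely as an example, be sketched as follows. The function names, the use of an eye distance in pixels as the size measure and the single shared threshold are assumptions made for illustration; they are not prescribed by the embodiment.

```python
import math
from typing import Mapping


def size_indication(eye_distance_px: float, max_distance_px: float, n_levels: int) -> int:
    """Quantize a measured eye distance to 1..N; 0 is used when no face is detected."""
    if eye_distance_px <= 0:
        return 0
    fraction = min(eye_distance_px / max_distance_px, 1.0)
    return max(1, math.ceil(fraction * n_levels))


def compression_level(size: int, n_levels: int) -> int:
    """Map a size indication to a level: 0 is the lowest compression rate,
    n_levels the highest (no face), so the rate rises as the face gets smaller."""
    return n_levels - size


def matches_reference(measured: Mapping[str, float],
                      reference: Mapping[str, float],
                      threshold: float) -> bool:
    """True if no measured property differs from its reference by more than threshold."""
    return all(abs(measured[name] - value) <= threshold
               for name, value in reference.items())
```

A terminal using this sketch would send the size indication in the control data, and would request reduced bandwidth whenever `matches_reference` returns false for a detected face.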
As an alternative to face recognition, other solutions may be used to reduce the probability that appearance of a face other than that of the user in a landscape causes a return to high bandwidth transmission. For example if motion of first terminal 10 is detected, first terminal 10 may transmit information to reduce the bandwidth even if a face is detected. Motion can be detected from the images from camera unit 106 or by a mechanical motion detector (not shown) in first terminal 10. Preferably, local radio interface 100 transmits image data that represents a stream of images, including the images that are used for face detection, from camera unit 106 to network 14, for display at second terminal 16. Thus, during "landscape viewing" the landscape images are transmitted as well as used to decide whether the bandwidth for images in the opposite direction should be reduced. However, as an alternative the images that are used for face detection may not be transmitted from first terminal 10, for example if it has been indicated that these images are not needed at second terminal 16.
An embodiment has been described for use in landscape viewing with a mobile telephone terminal, wherein the bandwidth demand is reduced (preferably in a network/link that is arranged to use the freed bandwidth for other calls unrelated to the video call that is performed using the first and second terminal, or for data transmission). This bandwidth reduction technique may also be applied to videoconferencing applications, to save transmission bandwidth towards a user. However, it may be noted that this bandwidth reduction technique by itself differs from techniques wherein face recognition is used to select image streams of one or more participants that will be displayed with maximum resolution (bandwidth) while reducing transmission of data from other participants.
Although an embodiment has been shown wherein face detection and compression are performed in different terminals, it should be appreciated that these functions may be performed elsewhere. Figure 2 shows an embodiment wherein face detection and the increase of compression rate are performed in an interface 20 between network 14 and the radio link to first terminal 10. In this embodiment a transcoder 22 and a face detector 24 are used in interface 20. Transcoder 22 is coupled between network 14 and radio interface 12. Face detector 24 receives image signals from radio interface 12 and controls transcoder 22. Transcoder 22 passes signals from network 14 to radio interface 12, replacing image data by transcoded image data when face detector 24 signals that no face has been detected. In this case the transcoded image data has a higher compression rate (lower bandwidth) than the image data from network 14. Any form of transcoding may be used, for example alternatives like converting the image to a lower spatial resolution, reducing temporal resolution of the stream of images, more coarsely quantizing data from the stream, switching to a different compression standard, omitting residue data and/or combinations of these techniques. The embodiment of figure 2 saves bandwidth on the radio link to first terminal 10, where bandwidth reduction is most critical. Moreover, no additional provisions for face detection and adaptive image coding need to be included in the terminals.
Figure 3 shows an embodiment wherein face detection and the increase of compression rate are performed in network interfaces (not in terminals 10, 16), but on different sides of network 14. A face detector 32 and a multiplexer 30 are included between network 14 and radio interface 12 to detect a face from image data from first terminal 10 and to add information about face detection to data sent through network 14. A network interface 34 and a transcoder 36 are provided between network 14 and second terminal 16 to control the compression rate dependent on whether a face has been detected. The embodiment of figure 3 saves bandwidth in network 14 as well as the radio link to first terminal 10. Moreover, no additional provisions for face detection and adaptive image coding need to be included in the terminals. It will be appreciated that a similar result can be obtained in an alternative embodiment wherein both face detection and transcoding are performed between network 14 and second terminal 16.
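The pass-through behaviour of a transcoder located in a network interface may be illustrated by the following sketch, in which an image is modelled as a list of pixel rows and transcoding is represented by a simple halving of the spatial resolution. The names `halve_resolution` and `interface_forward` are hypothetical, and a real transcoder would operate on compressed data rather than on raw pixels.

```python
from typing import Callable, List

Image = List[List[int]]  # rows of pixel values (illustrative representation)


def halve_resolution(image: Image) -> Image:
    """Toy transcoding step: keep every second row and every second pixel."""
    return [row[::2] for row in image[::2]]


def interface_forward(image: Image,
                      face_detected: bool,
                      transcode: Callable[[Image], Image] = halve_resolution) -> Image:
    """Pass the image unchanged when a face is detected, else replace it by transcoded data."""
    return image if face_detected else transcode(image)
```

Because the decision is taken in the interface, neither terminal needs to be aware of the face detection in this sketch.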
When the link between second terminal 16 and network 14 is also a radio link, face-detection-adaptive compression (and/or transcoding) are preferably performed in second terminal 16. This saves bandwidth in the link between network 14 and second terminal 16. It should be realized that face detection involves an image recognition process that may occasionally produce erroneous results (detecting a face when there is none, or detecting no face when there is one, even leaving aside the possibility that a user "plays" the system by disguising him- or herself etc.). Typically this is not a serious problem, since it merely leads to occasionally increased bandwidth use (detection when no face is present) or occasional loss of image quality (no detection when a face is present).
In a further embodiment first terminal 10 provides for user control of the activation and de-activation of the adaptive compression mechanism. In one embodiment pursuant to figure 1 first terminal 10 has a first and second user selectable mode, the first mode supporting automatic adaptation of the compression rate based on face detection and the second mode disabling the automatic adaptation (which corresponds to permanent face detection). In an embodiment pursuant to figures 2 and 3 first terminal 10 provides for the transmission of a mode control signal to the network interface on the side of network 14. The network interface is arranged to support a first and second user selectable mode dependent on the control signal, the first mode supporting automatic adaptation of the compression rate based on face detection and the second mode disabling the automatic adaptation (which corresponds to permanent face detection).
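The decision whether reduced bandwidth is requested may combine the user selectable mode with the optional refinements described earlier (recognition of the user's face and detection of terminal motion). The following sketch shows one possible combination; the function name and the way the conditions are combined are assumptions, since the embodiments describe these options separately.

```python
def request_reduced_bandwidth(face_detected: bool,
                              adaptive_enabled: bool = True,
                              face_matches_user: bool = True,
                              terminal_moving: bool = False) -> bool:
    """Return True when the control data should ask for the higher compression rate."""
    if not adaptive_enabled:
        # Second mode: adaptation disabled, equivalent to permanent face detection.
        return False
    if not face_detected:
        return True
    if not face_matches_user:
        # A face other than the user's does not restore high bandwidth.
        return True
    # Motion of the terminal overrides a detected face.
    return terminal_moving
```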
Although an embodiment has been described wherein first terminal 10 is a handheld mobile telephone, it should be appreciated that embodiments using other communication equipment with a display screen and a camera unit are also possible. By using face detection on images from first terminal 10 to control bandwidth of images that are sent to first terminal 10, reducing bandwidth if no face is detected, overall bandwidth use in the communication system is reduced.

Claims

1. A communication system comprising: a communication network (14); a terminal (10) with a network interface (100) for communicating with the communication network (14) via a link, the terminal (10) comprising a display screen (104) coupled to the network interface (100) for displaying images received from the network interface (14) and a camera unit (106) directed at a region from which the display screen (104) can be viewed; a bandwidth adjustment unit (162), arranged to adjust a bandwidth of image data that is transmitted through the link for controlling image content on the display screen (104), the bandwidth adjustment unit (162) having a bandwidth control input; a face detection unit (108) arranged to detect a face in an image from the camera unit (106), the face detection unit (108) having an output suitable for coupling to the bandwidth control input, and arranged to make the bandwidth adjustment unit (162) adjust to a relatively higher and lower bandwidth when a face is detected and not detected respectively.
2. A communication system according to Claim 1, wherein said link is a wireless communication link.
3. A communication system according to Claim 1, the communication network (14) being or comprising a mobile telephone network, the terminal (10) being a handheld mobile telephone, said link being a link between a mobile telephone base station (12) and the terminal.
4. A communication system according to Claim 1, wherein the bandwidth adjustment unit (162) is located in an interface between a source of the image data and the communication network (14).
5. A communication system according to Claim 4, comprising a further terminal (16) with a network interface (164) for communicating with the communication network via a wireless link, the bandwidth adjustment unit (162) and the source of the image data (160) being located in the further terminal.
6. A communication system according to Claim 1, comprising a further terminal (16) with a source of the image data (160) and a network interface (164) for communicating with the communication network (14) via a further link, the bandwidth adjustment unit being coupled between said link and the further link.
7. A communication system according to Claim 1, wherein the face detection unit (108) is located in the terminal (16) between the camera unit (106) and the network interface (100) of the terminal (16).
8. A communication system according to Claim 1, wherein the face detection unit (108) is provided coupled between said link and the bandwidth adjustment unit (162).
9. A communication system according to Claim 1, wherein the face detection unit (108) is arranged to measure a size or area related parameter of a detected face in said image, and to cause the network interface (100) to transmit an indication of the measured parameter to the bandwidth adjustment unit (162), the bandwidth adjustment unit (162) being arranged to progressively decrease the bandwidth with decreasing measured size.
10. A communication system according to Claim 1, wherein the face detection unit (108) is arranged to perform face recognition and to cause the bandwidth adjustment unit (162) to decrease the bandwidth if no recognized face is detected.
11. A communication system according to Claim 1, wherein the terminal (14) comprises a further display screen(104a), the display screen (104) and the further display screen (104a) facing in mutually opposite directions from the terminal (14), the further display screen (104a) having a lower resolution than the display screen (104), the display screen (104) and the further display screen (104a) both being arranged to display images based on the image data.
12. A terminal apparatus (10) for use in communication with a further apparatus via a link, the terminal apparatus (10) comprising: a display screen (104); a camera unit (106) directed at a region from which the display screen (104) can be viewed; a face detection unit (108) coupled to the camera unit (106) and arranged to detect a face in an image from the camera unit (106); a communication interface (100) coupled between the link and the camera unit (106) and the display screen (104), and arranged to send a signal to cause reduction and increase of a transmission bandwidth of an image stream from the further apparatus through the link for controlling image content on the display screen (104), upon detection that a face has not been and has been detected in an image from the camera unit (106) respectively.
13. A terminal apparatus according to Claim 12, wherein said link is a wireless communication link.
14. A terminal apparatus according to Claim 13, wherein the terminal apparatus (10) is a handheld mobile telephone.
15. A terminal apparatus according to Claim 12, wherein the face detection unit (108) is arranged to measure a size or area parameter of a detected face in said image, and to transmit a control signal indicative of the parameter to control a progressive decrease of the bandwidth with decreasing measured size.
16. A terminal apparatus according to Claim 12, wherein the face detection unit (108) is arranged to perform face recognition and to transmit a control signal for causing a decrease of the bandwidth if a recognized face is not detected.
17. A terminal apparatus according to Claim 12, arranged to support user controlled switching between a first mode wherein said increasing and decreasing is active dependent on dynamic detection of the face and a second mode wherein face detection dependent bandwidth control is disabled.
18. A terminal apparatus (16) for use in communication with a further apparatus (10) via a link, the terminal apparatus (16) comprising: an interface (164) for communicating with the further terminal apparatus (10) via the link; a bandwidth adjustment unit (162) having a bandwidth control input coupled to the network interface (164), the bandwidth adjustment unit (162) being arranged to decrease and increase a bandwidth used for image data for controlling image content on a display screen (104) of the further apparatus (10), upon reception of signals that are indicative of no-detection and detection of a face in an image from the further apparatus (10) respectively.
19. A communication network for communicating with a terminal (10) with a display screen (104) coupled and a camera unit (106) directed at a region from which the display screen (104) can be viewed, comprising: a network interface (20) for communication with the terminal (10) via a link; a bandwidth adjustment unit (22) arranged to adjust a bandwidth of image data that is transmitted through the link for controlling image content on the display screen (104), the bandwidth adjustment unit (22) having a bandwidth control input responsive to a face detection signal indicative of detection of a face in an image from the camera unit (106), to adjust to a relatively higher and lower bandwidth when a face is detected and not detected respectively.
20. A communication network according to claim 19, comprising a face detection unit (24) coupled to the link, for receiving further image data from the terminal from the camera unit (106) and to perform face detection on said further image data to form the face detection signal.
21. A communication network according to claim 19, wherein said link is a wireless communication link.
22. A communication network according to claim 21, comprising a mobile telephone interface (20) coupled to said link.
23. A method of operating a communication system that comprises a terminal (10) with a network interface (100) for communicating with a communication network (14) via a link, the terminal (100) comprising a display screen (104) coupled to the network interface (100) for displaying images received from the network interface (100) and a camera unit (106) directed at a region from which the display screen (104) can be viewed, the method comprising: detecting whether a face is visible in an image from the camera unit (106); causing a reduction and an increase of a bandwidth that is used to transmit image data via the link for controlling image content on the display screen (104), when a face is not detected and detected in the image respectively.
24. A method according to Claim 23, comprising: measuring a size or area parameter of a detected face in said image; further causing a progressive decrease of the bandwidth with decreasing measured size or area.
25. A method according to Claim 23, comprising: performing face recognition and decreasing the bandwidth if no recognized face is detected.
PCT/IB2006/052504 2005-08-29 2006-07-21 Communication system with landscape viewing mode WO2007026269A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05107897.0 2005-08-29
EP05107897 2005-08-29

Publications (1)

Publication Number Publication Date
WO2007026269A1 true WO2007026269A1 (en) 2007-03-08

Family

ID=37564231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/052504 WO2007026269A1 (en) 2005-08-29 2006-07-21 Communication system with landscape viewing mode

Country Status (1)

Country Link
WO (1) WO2007026269A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008066705A1 (en) * 2006-11-27 2008-06-05 Eastman Kodak Company Image capture apparatus with indicator
US8266314B2 (en) 2009-12-16 2012-09-11 International Business Machines Corporation Automated audio or video subset network load reduction
US20160007047A1 (en) * 2014-07-04 2016-01-07 Magor Communications Corporation Method of controlling bandwidth in an always on video conferencing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0579319A2 (en) * 1992-07-16 1994-01-19 Philips Electronics Uk Limited Tracking moving objects
EP0676899A2 (en) * 1994-04-06 1995-10-11 AT&T Corp. Audio-visual communication system having integrated perceptual speech and video coding
EP1158786A2 (en) * 2000-05-24 2001-11-28 Sony Corporation Transmission of the region of interest of an image
US6593955B1 (en) * 1998-05-26 2003-07-15 Microsoft Corporation Video telephony system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008066705A1 (en) * 2006-11-27 2008-06-05 Eastman Kodak Company Image capture apparatus with indicator
US7986336B2 (en) 2006-11-27 2011-07-26 Eastman Kodak Company Image capture apparatus with indicator
US8266314B2 (en) 2009-12-16 2012-09-11 International Business Machines Corporation Automated audio or video subset network load reduction
US20160007047A1 (en) * 2014-07-04 2016-01-07 Magor Communications Corporation Method of controlling bandwidth in an always on video conferencing system

Similar Documents

Publication Publication Date Title
US6947601B2 (en) Data transmission method, apparatus using same, and data transmission system
US6611281B2 (en) System and method for providing an awareness of remote people in the room during a videoconference
US6744460B1 (en) Video display mode automatic switching system and method
US7982762B2 (en) System and method for combining local and remote images such that images of participants appear overlaid on another in substanial alignment
US8379074B2 (en) Method and system of tracking and stabilizing an image transmitted using video telephony
US20020093531A1 (en) Adaptive display for video conferences
CN107147927B (en) Live broadcast method and device based on live broadcast wheat connection
US7508413B2 (en) Video conference data transmission device and data transmission method adapted for small display of mobile terminals
JP3487280B2 (en) Mobile phone terminal with image transmission function
CN105516638B (en) A kind of video call method, device and system
WO2007026269A1 (en) Communication system with landscape viewing mode
KR100735290B1 (en) Image Data Control Method in Video Call Mode of Mobile Device
KR100976361B1 (en) How to automatically switch mobile communication mode between video call and voice call
CN112153404B (en) Code rate adjusting method, code rate detecting method, code rate adjusting device, code rate detecting device, code rate adjusting equipment and storage medium
JP2002290808A (en) Mobile radio terminal
JP2013046319A (en) Image processing apparatus and image processing method
JP2001111976A (en) Video photographing device and communication terminal equipment
JP2001320707A (en) Image transmission system
KR100606067B1 (en) Mobile communication terminal with camera function and its method
US20040121763A1 (en) Method for providing television coverage, associated television camera station, receiving station and system
KR20060035064A (en) Display method of 3D image data in mobile terminal
JPH06253305A (en) Video conference system
JPH05328328A (en) Image transmission system
JP4087230B2 (en) Image transmission device
KR100770927B1 (en) How to shoot video on a mobile device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06780161

Country of ref document: EP

Kind code of ref document: A1