JP2000156849A - Portable information terminal equipment

Info

Publication number: JP2000156849A
Application number: JP11016598A
Authority: JP
Inventors: Kensuke Uehara; 堅助上原; Michiyo Morimoto; 美智代森本
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-06-29
Filing date: 1999-01-26
Publication date: 2000-06-06

Abstract

PROBLEM TO BE SOLVED: To obtain a portable information terminal equipment that displays an image stably, without blur, even when the user's hands move; lets the user recognize an unsuitable photographing angle; automatically switches the display to the image of the intended target; and saves power. SOLUTION: A subject image whose area is extracted by an area extraction part 21 is moved to a prescribed position in the display frame and sent to the other party, so that a stable, blur-free picture is displayed. When the area-extracted subject image touches the edge of the display frame or protrudes from the frame, the photographed image is displayed on the user side so that the user notices. Because a camera direction sensor part 28 detects a change in the direction of a camera part 4 and the display is automatically switched to the image of the intended target, operability is improved.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

Field of the Invention

The present invention relates to a portable information terminal device that can be carried and used to access information at any location.

[0002]

Description of the Related Art

Conventionally, one use of a portable information terminal device is a videophone function. This function lets the user converse with a party at a remote location, using images and voice, over a communication line. Typically, a person carrying a portable information terminal device is outdoors and, over a wireless link, displays the other party's face or upper body on the device's display screen, while the user's face or upper body is displayed on the display screen of the other party's information terminal device, and the two converse. Because the other party's face can be seen, the videophone function allows a conversation with a greater sense of presence than a conventional mobile phone. In addition, the user can photograph the surrounding scenery or a specific object with the camera and send it to the other party, which makes a more concrete and substantive conversation possible.

[0003]

Problems to be Solved by the Invention

In a conventional portable information terminal device, the user holds the device in the hand while talking with the other party. The camera attached to the device photographs the user as the subject, and an image such as that of FIG. 82 is sent to the other party's information terminal device and shown on its display unit. Meanwhile, an image such as that of FIG. 83, captured by the other party's information terminal device, is received and shown on the display unit of the portable information terminal device. Because only the other party's image is displayed during the conversation, the user cannot tell how he or she is being photographed and sent to the other party. Moreover, because the device is hand-held, hand shake or movement of the user's body may cause a shaken image to be sent to the other party. Since there is no means of monitoring the image being sent, the user is unaware of this, the shaking picture is sent as is, and the other party may find it unpleasant. One conceivable remedy is to use image processing so that, even if the captured image shakes, the target object (that is, the user's face) is always moved to the center of the screen in the image sent to the other party. For example, consider the case shown in FIG. 84, in which hand shake or body movement has shifted the face toward the left side of the screen.

[0004] As shown in FIG. 85, the image of the user (the subject) and the background 100 passes through a lens 70 and enters a camera IF unit (camera interface unit) 25. The camera IF unit 25 digitizes the incoming image and stores it frame by frame in an image memory 45. The data stored in the image memory 45 is output externally as video information 160. Meanwhile, luminance information and hue information 161 are extracted from the data stored in the image memory 45 and supplied to a shape extraction unit 20. The shape extraction unit 20 extracts the user's face shape from the luminance information and hue information 161, using skin-color information, hair information, or the like, and obtains shape information 162. A panning/tilting drive unit 71 mechanically changes the optical axis of the lens 70 so that the face shape comes to the center of the display screen. Through this loop, stable video information 160 is obtained even if the user and background 100 shake. However, this method drives the optical axis of the lens 70 mechanically, and its use of mechanical parts has the drawback of raising cost or lowering reliability. In another method, shown in FIG. 86, the image of the user and background 100 passes through the lens 70, enters the camera IF unit 25, and is digitized. The digitized video signal is stored frame by frame in the image memory 45. The luminance information and hue information 161 are extracted from the digital image stored in the image memory 45 and supplied to the shape extraction unit 20, which extracts the user's face shape using skin-color information, hair information, or the like. An address control unit 72 then changes the addresses of the image memory 45 to move the entire digital image so that the face shape comes to the center of the screen. The controlled digital image is output externally as video information 160.

[0005] However, if the digital image stored in the image memory 45 covers only the display frame, moving the face shape to the center as shown in FIG. 87 moves the background with it, and a portion 123 where the image is missing appears at the edge of the screen. A partially missing image is therefore sent to the other party. To remedy this drawback, as shown in FIG. 88, the digital image stored in the image memory 45 can be captured over an area larger than the display frame 111, namely a shooting frame 120. The shape extraction unit 20 extracts the face shape 112, and the image is cut out of the shooting frame 120 so that the face shape 112 comes to the center of the display frame 111. However, this method requires capturing with the shooting frame 120, which is larger than the display frame 111, so the image memory 45 must have a larger capacity than an ordinary display frame requires. Moreover, as shown in FIG. 89, if the face shape 112 moves far toward the edge of the shooting frame 120, the display frame 111 protrudes from the shooting frame 120 and the image in the protruding region 124 is lost. Furthermore, when the user is shooting in a place that bears on the user's privacy, the conventional methods send the user and the background together, so a background that the user would rather not reveal is also sent to the other party.
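By way of illustration only, the cut-out of the display frame 111 from the larger shooting frame 120, and the overflow case of FIG. 89, might be sketched as follows. The function name, the pixel-coordinate convention, and the frame sizes in the usage note are assumptions introduced here and do not appear in the specification.

```python
def display_window(face_center, display_size, capture_size):
    """Place a display window so that the face lies at its centre.

    All arguments are (x, y) or (width, height) tuples in pixels; this
    convention is an assumption made for the sketch.
    Returns (left, top, overflow), where overflow is True when part of the
    window falls outside the capture (shooting) frame.
    """
    cx, cy = face_center
    dw, dh = display_size
    cw, ch = capture_size
    left = cx - dw // 2
    top = cy - dh // 2
    overflow = left < 0 or top < 0 or left + dw > cw or top + dh > ch
    return left, top, overflow
```

With a hypothetical 320x240 shooting frame and a 160x120 display frame, a face at (160, 120) gives a window that lies wholly inside the shooting frame, whereas a face at (20, 120) gives a negative left coordinate, which corresponds to the protruding region 124.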

[0006] In addition, when the user photographs the surrounding scenery or an object during a conversation and sends it to the other party, the user needs to monitor on the portable information terminal device how the captured image is being sent. However, because the display unit shows the other party, the user must first switch the display from the other party to the image of the surrounding scenery, which is cumbersome. Conventionally, moreover, even while the display screen temporarily shows the surrounding scenery, the video sent from the other party continues to be decoded, and power is wasted in the decoding unit. The present invention has been made in view of these problems. One object is to provide a portable information terminal device with which the image is displayed stably on the other party's side, without shaking, even when the housing shakes because of hand shake or the like. Another object is to provide a portable information terminal device that lets the user recognize when the shooting angle on the user side is not appropriate. A further object is to provide a portable information terminal device that, when the camera is turned toward an object such as scenery, automatically switches the image displayed on the user side to the image of that object.

[0007] Still another object of the present invention is to provide a portable information terminal device capable of saving power.

[0008]

Means for Solving the Problems

A portable information terminal device according to the invention of claim 1 comprises: photographing means for photographing the user as a subject, together with the background, with a camera while the housing is held by the user; region extraction means for extracting, by predetermined processing, the region of the subject photographed by the photographing means to obtain a region-extracted subject image; detection means for detecting the direction and distance by which the subject image obtained by the region extraction means has moved from a predetermined position within a predetermined range; means for moving the subject image to the predetermined position using the detection result of the detection means; communication means for communicating with a counterpart information terminal device over a predetermined communication line; and video transmission control means for transmitting the subject image moved to the predetermined position to the counterpart information terminal device via the communication means. With this configuration, even if the image captured by the camera shakes because of hand shake or movement of the user's body, the predetermined region is extracted and moved to the predetermined position before being sent, so the image is displayed stably on the other party's side without large shaking. Because the background is not sent to the other party, no image is missing at the edge of the screen. And because the transmitted image is region-extracted, the amount of transmitted information is small.
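As a minimal sketch of the detection and moving means of claim 1, the offset of the extracted subject from a target position can be measured and the subject pixels shifted by that offset, with the background discarded. The use of a binary mask, the centroid as the reference point, and the names centroid and recenter are assumptions made for illustration; the specification does not prescribe them.

```python
def centroid(mask):
    """Mean (x, y) of the pixels flagged as subject in a binary mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("no subject pixels in mask")
    return (sum(x for x, _ in pts) // len(pts), sum(y for _, y in pts) // len(pts))


def recenter(frame, mask, target):
    """Shift the masked subject so that its centroid lands on target.

    frame and mask are equally sized lists of rows. Returns the new frame
    (background set to 0) and the applied shift (dx, dy), i.e. the direction
    and distance needed to bring the subject back to the target position.
    """
    h, w = len(frame), len(frame[0])
    cx, cy = centroid(mask)
    dx, dy = target[0] - cx, target[1] - cy
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h:  # drop pixels shifted off-frame
                    out[ny][nx] = frame[y][x]
    return out, (dx, dy)
```

Because only masked pixels are copied, the output contains no background, which is consistent with the stated effect that no image is missing at the screen edge after the shift.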

[0009] The invention of claim 2 is the portable information terminal device of claim 1, further comprising means for displaying the user (the subject) and the background on the user side when the detection means detects that the subject image has touched the edge of the predetermined range or that at least part of the subject image protrudes from the predetermined range. With this configuration, when hand shake or movement of the user's body becomes severe and the subject image touches the edge of the predetermined range (for example, the display screen) or protrudes from it, the user automatically sees the image being captured of himself or herself and can change how the device is held so that shooting resumes at a normal angle. The invention of claim 3 is the portable information terminal device of claim 1, further comprising means for transmitting an image including the subject and the background to the counterpart information terminal device when the subject image moved to the predetermined position within the predetermined range is enlarged and protrudes from the predetermined range. With this configuration, when the subject image protrudes from the predetermined range (for example, the display screen), the area occupied by the background becomes extremely small and there is no longer any need to remove it, so the raw image as captured, including the background, is sent to the other party. The region extraction means can therefore omit the region extraction processing.
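The edge-contact test of claim 2 might be sketched as below, assuming the subject image is summarized by an inclusive bounding box in pixel coordinates. The bounding-box representation, the function names, and the "self"/"remote" labels are hypothetical and serve only to illustrate the switching rule.

```python
def touches_or_exceeds(bbox, frame_size):
    """True when the subject's bounding box touches or crosses the frame edge.

    bbox is (left, top, right, bottom) with inclusive pixel coordinates;
    frame_size is (width, height).
    """
    left, top, right, bottom = bbox
    w, h = frame_size
    return left <= 0 or top <= 0 or right >= w - 1 or bottom >= h - 1


def local_display_source(bbox, frame_size):
    """Choose what the user's own screen shows: own camera image or remote party."""
    return "self" if touches_or_exceeds(bbox, frame_size) else "remote"
```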

[0010] The invention of claim 4 is the portable information terminal device of claim 1 or claim 2, further comprising: means for storing the subject image from immediately before the subject image protrudes from the predetermined range; and means for, when the subject image protrudes from the predetermined range, reading out the stored subject image as a still image and transmitting it to the counterpart information terminal device over the communication line for as long as the subject image remains outside the predetermined range. With this configuration, while the subject image protrudes from the predetermined range (for example, beyond the edge of the display screen) and the user is changing how the device is held, the most recent normal subject image is sent to the other party. The badly shaken subject image captured while the user corrects the orientation of the device therefore need not be sent, and the other party always sees a stable image of the user. A portable information terminal device according to the invention of claim 5 comprises: photographing means for photographing the user as a subject, together with the background, with a camera over a shooting area wider than the display area while the housing is held by the user; region extraction means for extracting, by predetermined processing, the region of the subject within the shooting area photographed by the photographing means to obtain a region-extracted subject image; means for moving the display area so that the subject image fits within the display area; detection means for detecting the direction and distance by which the subject image obtained by the region extraction means has moved from a predetermined position in the display area; means for moving the subject image to the predetermined position using the detection result of the detection means; communication means for communicating with a counterpart information terminal device over a predetermined communication line; and video transmission control means for transmitting the subject image moved to the predetermined position to the counterpart information terminal device via the communication means.
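The still-image fallback of claim 4 can be illustrated with a small holder that remembers the last in-range subject image and repeats it while the subject is out of range. The class name StillFallback and the boolean out_of_range flag are assumptions for this sketch; how the flag is produced is left to the detection means.

```python
class StillFallback:
    """Keeps the most recent in-range subject image for use as a still."""

    def __init__(self):
        self.last_good = None

    def select(self, frame, out_of_range):
        """Return the frame to transmit for this capture cycle."""
        if not out_of_range:
            self.last_good = frame  # remember the latest normal image
            return frame
        # Out of range: repeat the stored still; if nothing has been stored
        # yet, fall back to the live frame (an assumption of this sketch).
        return self.last_good if self.last_good is not None else frame
```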

[0011] With this configuration, extracting the region of the user within the shooting area to obtain the subject image enlarges the range over which the subject image can move. The invention of claim 6 is the portable information terminal device of claim 5, further comprising means for displaying the user (the subject) and the background on the user side when it is detected that the subject image has touched the edge of the shooting area or that at least part of the subject image protrudes from the shooting area. With this configuration, when hand shake or movement of the user's body becomes severe and the subject image touches the edge of the shooting area or protrudes from it, the user automatically sees the image being captured of himself or herself and can change how the device is held so that shooting resumes at a normal angle. The invention of claim 7 is the portable information terminal device of claim 5 or claim 6, further comprising: means for storing the subject image from immediately before the subject image protrudes from the shooting area; and means for, when the subject image protrudes from the shooting area, reading out the stored subject image as a still image and transmitting it to the counterpart information terminal device over the communication line for as long as the subject image remains outside the shooting area.

[0012] With this configuration, while the subject image protrudes from the shooting area and the user is changing how the device is held, the most recent normal subject image is sent to the other party. The badly shaken subject image captured while the user corrects the orientation of the device therefore need not be sent, and the other party always sees a stable image of the user. A portable information terminal device according to the invention of claim 8 comprises: photographing means for photographing the user as a subject, together with the background, with a camera while the housing is held by the user; region extraction means for extracting, by predetermined processing, the region of the subject photographed by the photographing means to obtain a region-extracted subject image; means for detecting the mounting direction of the camera on the housing; means for displaying, on the user side, the image captured by the camera when the mounting direction of the camera is detected to deviate from the direction for photographing the subject; communication means for communicating with a counterpart information terminal device over a predetermined communication line; and video transmission control means for transmitting to the counterpart information terminal device, via the communication means, the region-extracted subject image when the mounting direction of the camera is the direction for photographing the subject, and the image captured by the camera when the mounting direction is detected to deviate from that direction.
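The routing implied by claim 8 reduces to a two-way selection driven by the detected camera direction. In the sketch below, the function name, the boolean input, and the string labels are hypothetical; showing the remote party's video on the user's display in the normal case is taken from the description of the videophone function rather than from the claim itself.

```python
def route_video(camera_faces_user):
    """Select the local display source and the transmitted source.

    camera_faces_user is True when the detected camera mounting direction
    is the direction for photographing the user.
    """
    if camera_faces_user:
        # Normal videophone use: show the remote party, send the extracted subject.
        return {"display": "remote_video", "transmit": "extracted_subject"}
    # Camera turned toward scenery or an object: show and send the raw camera image.
    return {"display": "camera_video", "transmit": "camera_video"}
```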

[0013] With this configuration, when the user is talking with the other party and wants to show, for example, the surrounding scenery, changing the direction of the camera automatically switches the image displayed on the user's portable information terminal device from the other party's image to the image of the scenery. The user can therefore operate the camera and adjust it to a suitable angle while watching the image of the scenery, which improves operability. The invention of claim 9 is the portable information terminal device of claim 1, further comprising: infrared emitting means and an infrared camera provided on the housing; means for extracting shape information of the subject by photographing, with the infrared camera, the reflection of the infrared light emitted from the infrared emitting means; and shape control means for controlling the shape information so that the ratio of the subject image to the predetermined range becomes a predetermined ratio; wherein the region extraction means extracts the region of the subject photographed by the photographing means using the shape information controlled by the shape control means, and obtains the region-extracted subject image. With this configuration, even if the image captured by the camera shakes because of hand shake or movement of the user's body, the subject image is region-extracted, brought to a constant size, and moved to the predetermined position before being sent to the other party, so the image is displayed stably on the other party's side without large shaking.
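The size normalization performed by the shape control means can be illustrated by the scale factor that brings the subject to a fixed fraction of the frame. Measuring the ratio along the height, and the name scale_factor, are assumptions of this sketch; the specification states only that the ratio of the subject image to the predetermined range is made a predetermined ratio.

```python
def scale_factor(subject_height, frame_height, target_ratio):
    """Factor by which to scale the subject so it spans target_ratio of the frame.

    Heights are in pixels; target_ratio is a fraction between 0 and 1.
    """
    if subject_height <= 0:
        raise ValueError("subject height must be positive")
    return target_ratio * frame_height / subject_height
```

For example, a subject 60 pixels tall in a 120-pixel frame with a target ratio of 0.75 would be enlarged by a factor of 1.5.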

[0014] The invention of claim 10 is the portable information terminal device of claim 1, further comprising: an ultrasonic transmitter and an ultrasonic receiver provided on the housing; means for extracting shape information of the subject by receiving, with the ultrasonic receiver, the ultrasonic pulses transmitted from the ultrasonic transmitter; and shape control means for controlling the shape information so that the ratio of the subject image to the predetermined range becomes a predetermined ratio; wherein the region extraction means extracts the region of the subject photographed by the photographing means using the shape information controlled by the shape control means, and obtains the region-extracted subject image. With this configuration, even if the image captured by the camera shakes because of hand shake or movement of the user's body, the subject image is region-extracted, brought to a constant size, and moved to the predetermined position before being sent to the other party, so the image is displayed stably on the other party's side without large shaking. A portable information terminal device according to the invention of claim 11 comprises: photographing means for photographing the user as a subject, together with the background, with a camera while the housing is held by the user; means for detecting the displacement of the shake when the housing shakes; means for displaying the user (the subject) and the background on the user side for a predetermined period when the displacement detected by that means exceeds a predetermined threshold; communication means for communicating with a counterpart information terminal device over a predetermined communication line; and video transmission control means for transmitting an image including the user to the counterpart information terminal device via the communication means.
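The threshold-and-hold behavior of claim 11 might be sketched as a small state machine. Expressing the predetermined period as a number of frames, and the names ShakeMonitor, threshold, and hold_frames, are assumptions introduced here; the specification does not say how the displacement is measured or how the period is timed.

```python
class ShakeMonitor:
    """Shows the user's own image for a fixed number of frames after a large shake."""

    def __init__(self, threshold, hold_frames):
        self.threshold = threshold      # displacement above which the display switches
        self.hold_frames = hold_frames  # length of the self-view period, in frames
        self.remaining = 0

    def update(self, displacement):
        """Feed one displacement sample; return True while self-view is active."""
        if abs(displacement) > self.threshold:
            self.remaining = self.hold_frames  # (re)start the self-view period
        show_self = self.remaining > 0
        if show_self:
            self.remaining -= 1
        return show_self
```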

[0015] With this configuration, when the user is holding the portable information terminal device, displaying the other party's image on the display means, and talking with the other party, and the housing begins to shake badly because of hand shake or the like, the content shown on the display means is switched to the user (the subject) and the background. The user can thereby check how far his or her image has shifted from the display frame and correct the orientation of the device so that the subject image returns to the normal position. A portable information terminal device according to the invention of claim 12 comprises: photographing means for photographing the user as a subject, together with the background, with a camera while the housing is held by the user; means for displaying the user (the subject) and the background on the user side for a predetermined period at a predetermined cycle; communication means for communicating with a counterpart information terminal device over a predetermined communication line; and video transmission control means for transmitting an image including the user to the counterpart information terminal device via the communication means. With this configuration, while the other party's image is being displayed, the user and the background are displayed periodically so that the user can check the displayed image from time to time. Whenever the subject image has drifted out of the display frame, the user can return it to the normal position by changing the orientation of the device.

【００１６】請求項１３に記載の本発明に係る携帯情報
端末装置は、筐体がユーザにより保持された状態でカメ
ラにより前記ユーザの被写体及び背景を撮影する撮影手
段と、所定の通信回線を介して相手側情報端末装置と通
信を行う通信手段と、ユーザの被写体を含む映像を通信
手段により相手側情報端末装置に送信する映像伝送制御
手段と、ユーザの発声した音声の無音区間を検出して、
無音区間から所定の長さを越えた休止期間を検出する第
１の検出手段と、相手側情報端末装置から通信手段を介
して伝送された音声を受信して音声の無音区間を検出
し、無音区間から所定の長さを越えた休止期間を検出す
る第２の検出手段と、第１及び第２の検出手段により検
出された休止期間に、ユーザ側においてユーザの被写体
及び背景を表示する手段とを具備したことを特徴とす
る。このような構成により、ユーザあるいは相手の発声
している音声の無音区間で、所定の長さ以上の場合、す
なわち会話が一息ついた段階でユーザ側の映像の表示に
切り替えるので、時間的に余裕を持って切り替えること
ができる。請求項１４に記載の本発明に係る携帯情報端
末装置は、筐体がユーザにより保持された状態でカメラ
によりユーザの被写体及び背景を撮影する撮影手段と、
所定の通信回線を介して相手側情報端末装置と通信を行
う通信手段と、ユーザの被写体を含む映像を通信手段に
より相手側情報端末装置に送信する映像伝送制御手段
と、ユーザの発声した音声のパワーが所定の閾値を越え
て、所定の時間継続した場合における発声期間を検出す
る手段と、発声期間にユーザ側においてユーザの被写体
及び背景を表示する手段とを具備したことを特徴とす
る。The portable information terminal device according to claim 13 of the present invention comprises: photographing means for photographing the user's subject and background with a camera while the housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting video including the user's subject to the counterpart information terminal device through the communication means; first detection means for detecting silent sections in the voice uttered by the user and detecting, from those silent sections, pause periods exceeding a predetermined length; second detection means for receiving the voice transmitted from the counterpart information terminal device via the communication means, detecting silent sections in that voice, and detecting, from those silent sections, pause periods exceeding a predetermined length; and means for displaying the user's subject and background on the user side during the pause periods detected by the first and second detection means. With this configuration, the display switches to the user-side video only when a silent section in the voice of the user or the other party lasts longer than a predetermined length, that is, when the conversation pauses, so the switch can be made with time to spare.
The portable information terminal device according to claim 14 of the present invention comprises: photographing means for photographing the user's subject and background with a camera while the housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting video including the user's subject to the counterpart information terminal device through the communication means; means for detecting an utterance period in which the power of the voice uttered by the user exceeds a predetermined threshold and continues to do so for a predetermined time; and means for displaying the user's subject and background on the user side during the utterance period.
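
The pause detection of claim 13 and the utterance detection of claim 14 both amount to finding runs of audio frames that stay on one side of a power threshold for at least a given length. The following Python sketch illustrates that idea only. The per-frame power list, the frame-count unit, and the names `long_runs`, `utterance_periods`, and `pause_periods` are assumptions made for illustration and do not come from the patent.

```python
def long_runs(flags, min_len):
    """Return (start, end) index pairs (end exclusive) of runs of True
    values in `flags` that are at least `min_len` frames long."""
    runs, start = [], None
    # A trailing False closes any run that reaches the end of the input.
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def utterance_periods(powers, threshold, min_frames):
    """Claim 14 style (assumed framing): power above the threshold,
    sustained for at least `min_frames` frames."""
    return long_runs([p > threshold for p in powers], min_frames)


def pause_periods(powers, threshold, min_frames):
    """Claim 13 style (assumed framing): silent sections, i.e. power at or
    below the threshold, lasting at least `min_frames` frames."""
    return long_runs([p <= threshold for p in powers], min_frames)
```

For example, with the hypothetical frame powers `[0, 5, 5, 5, 0, 5, 0, 0, 0, 0, 5]`, a threshold of 1, and a minimum length of 3 frames, `utterance_periods` reports the single run `(1, 4)` and `pause_periods` reports `(6, 10)`. In the device, the self-view would be displayed during whichever kind of period the claim in question uses.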

【００１７】このような構成により、ユーザが発声して
いる期間にユーザの被写体及び背景の表示に切り替える
ことで、発声しているユーザの被写体を見ながらポーズ
を取ることができる利点がある。請求項１５に記載の本
発明に係る携帯情報端末装置は、筐体がユーザにより保
持された状態でカメラによりユーザの被写体及び背景を
撮影する撮影手段と、所定の通信回線を介して相手側情
報端末装置と通信を行う通信手段と、ユーザの被写体を
含む映像を通信手段により相手側情報端末装置に送信す
る映像伝送制御手段と、通信手段により相手情報端末装
置と通信を開始する際にユーザ側においてユーザの被写
体及び背景を表示させる手段とを具備したことを特徴と
する。このような構成により、通信を開始する最初の過
程で、ユーザの被写体が表示枠に正常に収まっているこ
とを確認してから、相手と会話を開始するので、いきな
り相手の情報端末装置に対して表示枠に収まっていない
ユーザの被写体を含む映像を送信することが生じない。
請求項１６に記載の本発明に係る携帯情報端末装置は、
筐体がユーザにより保持された状態でカメラによりユー
ザの被写体及び背景を撮影する撮影手段と、所定の通信
回線を介して相手側情報端末装置と通信を行う通信手段
と、ユーザの被写体を含む映像を通信手段により相手側
情報端末装置に送信する映像伝送制御手段と、ユーザ側
においてユーザ側の映像を表示している期間を検出する
手段と、前記期間は通信手段により相手側情報端末装置
から伝送されてきた映像情報についての復号化手段に対
する電力の供給を停止させる手段とを具備したことを特
徴とする。With this configuration, the display switches to the user's subject and background while the user is speaking, which has the advantage that the user can strike a pose while watching his or her own image as the speaker. The portable information terminal device according to claim 15 of the present invention comprises: photographing means for photographing the user's subject and background with a camera while the housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting video including the user's subject to the counterpart information terminal device through the communication means; and means for displaying the user's subject and background on the user side when communication with the counterpart information terminal device is started through the communication means. With this configuration, at the very start of communication the user confirms that his or her subject fits properly within the display frame before beginning the conversation, so video in which the user's subject does not fit within the display frame is never suddenly sent to the counterpart information terminal device.
The portable information terminal device according to claim 16 of the present invention comprises: photographing means for photographing the user's subject and background with a camera while the housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting video including the user's subject to the counterpart information terminal device through the communication means; means for detecting the period during which the user-side video is displayed on the user side; and means for stopping, during that period, the supply of power to the decoding means for the video information transmitted from the counterpart information terminal device through the communication means.

【００１８】このような構成により、ユーザ側の映像を
表示している期間は相手側情報端末装置から伝送されて
きた映像情報についての復号化手段に対する電力の供給
を停止させることで、携帯情報端末装置の消費電力の節
約を図ることができる。請求項１７に記載の本発明は、
請求項２、請求項６、請求項１１乃至請求項１５のいず
れかに記載の携帯情報端末装置において、ユーザ側にお
いてユーザの被写体及び背景を表示する際に、相手側情
報端末装置から伝送されてきた映像と共にマルチウイン
ドウで表示する手段を具備したことを特徴とする。この
ような構成により、ユーザ側の映像と相手側の映像をユ
ーザの携帯情報端末装置の表示手段に一緒にマルチウイ
ンドウで表示させることで、相手側の映像が途切れるこ
となく、同時にユーザの被写体も表示させ、ユーザが、
ユーザの被写体がどの程度表示枠から外れているか確認
でき、正常な位置に被写体の映像が表示されるように携
帯情報端末装置の向きを修正することができる。請求項
１８に記載の本発明は、請求項１１に記載の携帯情報端
末装置において、ユーザ側においてユーザの被写体及び
背景を表示する際に、相手側情報端末装置から伝送され
てきた映像と共にマルチウインドウで表示する手段と、
閾値を複数の等級に分けて、揺れの変位を前記等級に当
てはめる手段と、前記等級に応じてユーザの被写体と背
景の表示の大きさを変える手段とを具備したことを特徴
とする。With this configuration, the supply of power to the decoding means for the video information transmitted from the counterpart information terminal device is stopped while the user-side video is being displayed, which reduces the power consumption of the portable information terminal device. The invention according to claim 17 is the portable information terminal device according to any one of claims 2, 6, and 11 to 15, further comprising means for displaying the user's subject and background, when they are displayed on the user side, in a multi-window together with the video transmitted from the counterpart information terminal device. With this configuration, the user-side video and the counterpart's video are displayed together in a multi-window on the display means of the user's portable information terminal device, so the counterpart's video is not interrupted while the user's subject is shown at the same time. The user can check how far his or her subject is off the display frame and correct the orientation of the portable information terminal device so that the subject appears at the normal position.
The invention according to claim 18 is the portable information terminal device according to claim 11, further comprising: means for displaying the user's subject and background, when they are displayed on the user side, in a multi-window together with the video transmitted from the counterpart information terminal device; means for dividing the threshold into a plurality of grades and assigning the shake displacement to one of those grades; and means for changing the display size of the user's subject and background according to the grade.
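
The grading in claim 18 can be pictured as a lookup of the shake displacement against an ascending list of thresholds, followed by a lookup of the display size for the resulting grade. The sketch below is one possible reading, not the patent's implementation. The threshold values, the scale table, and the names `shake_grade` and `self_view_scale` are hypothetical.

```python
from bisect import bisect_right


def shake_grade(displacement, thresholds):
    """Map a shake displacement to a grade.

    `thresholds` must be sorted in ascending order. Grade 0 means the
    displacement is below the first threshold; grade k means it has
    reached the k-th threshold.
    """
    return bisect_right(thresholds, abs(displacement))


def self_view_scale(grade, scales):
    """Pick the size of the user-side window for a grade.

    `scales` is an assumed table of relative window sizes, one per grade;
    grades beyond the table reuse the last (largest) entry.
    """
    return scales[min(grade, len(scales) - 1)]
```

With assumed thresholds `[2.0, 5.0, 10.0]` and scales `[0.0, 0.25, 0.5, 1.0]`, a displacement of 7 falls in grade 2 and the user-side window is drawn at half size. Larger shake therefore yields a larger self-view, which matches the effect the description attributes to this claim.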

【００１９】このような構成により、マルチウインドウ
に表示する際に、手揺れ等による筐体の揺れの大きさに
応じてユーザ側の映像のマルチウインドウ表示の大きさ
を変えることにより、揺れが大きいときは表示が大きく
なり、正常な位置にユーザの被写体が表示されるように
携帯情報端末装置の向きを調整させることが容易にな
る。請求項１９に記載の本発明は、請求項１２に記載の
携帯情報端末装置において、前記期間及び周期を設定す
る手段を具備したことを特徴とする。このような構成に
より、ユーザの被写体及び背景を表示させる期間及び周
期をユーザが自由に設定することができる。請求項２０
に記載の本発明は、請求項１２に記載の携帯情報端末装
置において、筐体が揺れたときに揺れの変位を検出する
手段と、この手段で検出された揺れの変位に応じて前記
周期を変える手段とを具備したことを特徴とする。この
ような構成により、ユーザの被写体及び背景を表示させ
る周期を揺れの大きさに適応して変えることができ、揺
れが大きいときは周期を短くして、ユーザの被写体及び
背景を頻繁に表示させることにより、正常な位置にユー
ザの被写体が表示されるように携帯情報端末装置の向き
を調整させることが容易になる。With this configuration, the size of the user-side video within the multi-window display changes with the magnitude of the housing's shake caused by hand shake or the like. When the shake is large, the user-side video is displayed larger, which makes it easier for the user to adjust the orientation of the portable information terminal device so that the user's subject appears at the normal position. The invention according to claim 19 is the portable information terminal device according to claim 12, further comprising means for setting the period and the cycle. With this configuration, the user can freely set the period for which, and the cycle at which, the user's subject and background are displayed.
The invention according to claim 20 is the portable information terminal device according to claim 12, further comprising: means for detecting the displacement of the shake when the housing shakes; and means for changing the cycle according to the shake displacement so detected. With this configuration, the cycle at which the user's subject and background are displayed adapts to the magnitude of the shake. When the shake is large, the cycle is shortened so that the user's subject and background are displayed more often, which makes it easier for the user to adjust the orientation of the portable information terminal device so that the user's subject appears at the normal position.
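
As a rough illustration of the claim 20 behavior, the following Python sketch shortens the self-view display cycle as the shake displacement grows. The patent only states that the cycle is shortened when the shake is large; the linear interpolation, the default values, and the name `self_view_cycle` are assumptions chosen for this example.

```python
def self_view_cycle(displacement, base_cycle_s=10.0, min_cycle_s=2.0,
                    ref_displacement=5.0):
    """Return the interval, in seconds, between self-view insertions.

    Assumed rule: interpolate linearly from `base_cycle_s` (no shake) down
    to `min_cycle_s` (shake at or above `ref_displacement`).
    """
    ratio = min(abs(displacement) / ref_displacement, 1.0)
    return base_cycle_s - ratio * (base_cycle_s - min_cycle_s)
```

Under these assumed defaults, no shake gives a 10 second cycle, a displacement of 2.5 gives 6 seconds, and any displacement of 5 or more gives the 2 second minimum.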

【００２０】請求項２１に記載の本発明は、請求項１１
又は請求項１２に記載の携帯情報端末装置において、ユ
ーザ側においてユーザの被写体及び背景を表示する際
に、筐体が揺れたときの揺れの変位が、ユーザが手揺れ
を修正する動作を行っているために所定の閾値を越えて
いる期間はユーザの被写体及び背景の表示を延長させる
手段を具備することを特徴とする。このような構成によ
り、相手側の映像を表示させている過程で、ユーザ側の
映像を挿入する際に、ユーザの被写体が表示枠から外れ
ている場合、ユーザが携帯情報端末装置の向きを変えて
表示枠に収まるように調整している間、ユーザの被写体
及び背景を表示手段に表示させることで、ユーザは時間
的に余裕を持って、正確に携帯情報端末装置の向きを変
えることができる。請求項２２に記載の本発明に係る携
帯情報端末装置は、筐体がユーザにより保持された状態
でカメラによりユーザの被写体及び背景を撮影する撮影
手段と、所定の通信回線を介して相手側情報端末装置と
通信を行う通信手段と、ユーザの被写体を含む映像を通
信手段により相手側情報端末装置に送信する映像伝送制
御手段と、手揺れで筐体が揺れたときに手揺れの変位を
検出する検出手段と、カメラで撮影したユーザの被写体
及び背景の映像に対して検出手段で検出した手揺れの変
位と逆方向の揺れの成分を加えることにより映像におけ
る手揺れの成分を打ち消す手段と、この手段により手揺
れの成分が打ち消された映像からユーザの被写体の揺れ
成分を検出する手段と、この手段により検出された被写
体の揺れ成分の大きさが所定の閾値を越えたときに、ユ
ーザ側においてユーザの被写体及び背景を表示させる手
段とを具備したことを特徴とする。The invention according to claim 21 is the portable information terminal device according to claim 11 or claim 12, further comprising means for extending the display of the user's subject and background, when they are displayed on the user side, for as long as the displacement of the housing's shake exceeds a predetermined threshold because the user is performing an action to correct the hand shake. With this configuration, when the user-side video is inserted while the counterpart's video is being displayed and the user's subject is off the display frame, the user's subject and background remain on the display means while the user reorients the portable information terminal device to bring the subject back into the frame, so the user can reorient the device accurately and without haste.
The portable information terminal device according to claim 22 of the present invention comprises: photographing means for photographing the user's subject and background with a camera while the housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting video including the user's subject to the counterpart information terminal device through the communication means; detection means for detecting the hand-shake displacement when the housing shakes because of hand shake; means for canceling the hand-shake component in the video by adding, to the video of the user's subject and background captured by the camera, a shake component in the direction opposite to the hand-shake displacement detected by the detection means; means for detecting the shake component of the user's subject from the video in which the hand-shake component has been canceled; and means for displaying the user's subject and background on the user side when the magnitude of the detected subject shake component exceeds a predetermined threshold.

【００２１】このような構成により、手揺れ映像成分と
被写体の揺れ成分の合成された揺れ成分から、手揺れ成
分を例えばカメラのレンズの光軸を適応的に調整するこ
とで打ち消し、ユーザの被写体の揺れ成分を抽出するこ
とにより、被写体の揺れ成分が所定の閾値を越えた場
合、携帯情報端末装置の表示手段にユーザの被写体及び
背景を表示させることで、ユーザの被写体が表示枠から
どの程度外れているか、ユーザに注意を喚起することが
できる。With this configuration, the hand-shake component is canceled from the combined shake (the hand-shake video component plus the subject's own shake component), for example by adaptively adjusting the optical axis of the camera lens, and the shake component of the user's subject is extracted. When that subject shake component exceeds a predetermined threshold, the user's subject and background are shown on the display means of the portable information terminal device, alerting the user to how far his or her subject has strayed from the display frame.
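
The separation of subject shake from hand shake described for claim 22 can be sketched numerically: the motion observed in the video is treated as the sum of the two, and the sensed hand-shake displacement is removed by adding its opposite. The Python below is a simplified model under that assumption. The per-frame `(dx, dy)` representation and the names `subject_shake` and `should_show_self_view` are hypothetical, and the optical-axis adjustment mentioned in the text is not modeled.

```python
import math


def subject_shake(observed, hand):
    """Cancel hand shake from the observed per-frame motion.

    `observed` and `hand` are equal-length lists of (dx, dy) tuples.
    Adding the opposite of the hand-shake displacement leaves the motion
    attributed to the subject alone.
    """
    return [(ox + (-hx), oy + (-hy))
            for (ox, oy), (hx, hy) in zip(observed, hand)]


def should_show_self_view(observed, hand, threshold):
    """True if any frame's residual subject shake exceeds `threshold`."""
    return any(math.hypot(dx, dy) > threshold
               for dx, dy in subject_shake(observed, hand))
```

If the observed motion is `[(3, 4), (1, 1)]` and the sensed hand shake is `[(3, 4), (1, 0)]`, the residual is `[(0, 0), (0, 1)]`, so a threshold of 0.5 triggers the self-view while a threshold of 2 does not.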

【００２２】[0022]

【発明の実施の形態】以下、図面を参照して本発明の実
施形態について詳細に説明する。なお、以下の図におい
て、同符号は同一部分又は対応部分を示す。（実施形態１）まず、本発明に係る携帯情報端末装置の
実施形態１について説明する。図１は、実施形態１に係
る携帯情報端末装置の構成を示すブロック図である。こ
の図において、符号１で示すものが端末本体であり、３
はイヤホンマイク、そして４はカメラ部である。端末本
体１は、主制御部１１、映像デコーダ１２、映像用ＬＣ
Ｄ制御回路部１３、映像用ＬＣＤ１４、テキスト用ＬＣ
Ｄ制御回路部１５、テキスト用ＬＣＤ１６、多重分離部
１７、ＰＨＳ回線ＩＦ部（ＰＨＳ回線インタフェース
部）１８、アンテナ１９、領域抽出部２１、音声コーデ
ック２３、イヤホンマイク用端子２４、カメラＩＦ部
（カメラインタフェース部）２５、カメラ用端子２６、
映像エンコーダ２７、カメラ向きセンサ部２８、操作入
力制御回路部２９、タッチパネル３０、スクロールダイ
ヤル３１、操作ボタン３２、電源ボタン３４及び電源部
３６を有する。このうち、主制御部１１、映像デコーダ
１２、映像用ＬＣＤ制御回路部１３、テキスト用ＬＣＤ
制御回路部１５、多重分離部１７、ＰＨＳ回線ＩＦ部１
８、領域抽出部２１、音声コーデック２３、カメラＩＦ
部２５、映像エンコーダ２７、カメラ向きセンサ部２
８、操作入力制御回路部２９、及び電源部３６は、主バ
ス３７を介して互いに接続されている。Embodiments of the present invention are described in detail below with reference to the drawings. In the drawings, the same reference numerals denote the same or corresponding parts. (Embodiment 1) First, Embodiment 1 of the portable information terminal device according to the present invention is described. FIG. 1 is a block diagram showing the configuration of the portable information terminal device according to Embodiment 1. In this figure, reference numeral 1 denotes the terminal body, 3 denotes an earphone microphone, and 4 denotes a camera unit. The terminal body 1 has a main control unit 11, a video decoder 12, a video LCD control circuit unit 13, a video LCD 14, a text LCD control circuit unit 15, a text LCD 16, a multiplexing/demultiplexing unit 17, a PHS line IF unit (PHS line interface unit) 18, an antenna 19, an area extraction unit 21, an audio codec 23, an earphone microphone terminal 24, a camera IF unit (camera interface unit) 25, a camera terminal 26, a video encoder 27, a camera orientation sensor unit 28, an operation input control circuit unit 29, a touch panel 30, a scroll dial 31, an operation button 32, a power button 34, and a power supply unit 36. Of these, the main control unit 11, the video decoder 12, the video LCD control circuit unit 13, the text LCD control circuit unit 15, the multiplexing/demultiplexing unit 17, the PHS line IF unit 18, the area extraction unit 21, the audio codec 23, the camera IF unit 25, the video encoder 27, the camera orientation sensor unit 28, the operation input control circuit unit 29, and the power supply unit 36 are connected to one another via a main bus 37.

【００２３】また、映像デコーダ１２、多重分離部１
７、音声コーデック２３、及び映像エンコーダ２７は、
同期バス３８を介して互いに接続されている。主制御部
１１は、ＣＰＵ、ＲＯＭ、及びＲＡＭなどを有してなる
ものであり、端末本体１の各部を総括制御することで携
帯情報端末装置としての動作を実現するものである。映
像デコーダ１２は、符号化映像データのデコードを行
い、再生した映像データを映像用ＬＣＤ制御回路部１３
へと与える。映像用ＬＣＤ制御回路部１３は、映像デコ
ーダ１２から与えられる映像データが示す映像を表示す
るべく映像用ＬＣＤ１４を制御する。映像用ＬＣＤ１４
は、映像（例えばＭＰＥＧ４映像）を表示するのに十分
な解像度を有したカラーＬＣＤであり、映像用ＬＣＤ制
御回路部１３の制御の下に映像を表示する。テキスト用
ＬＣＤ制御回路部１５は、主制御部１１から与えられる
テキストデータが示すテキスト画像を表示するようにテ
キスト用ＬＣＤ１６を制御する。テキスト用ＬＣＤ１６
は、映像用ＬＣＤ１４よりも面積が広く、かつ解像度が
低い白黒ＬＣＤであり、テキスト用ＬＣＤ制御回路部１
５の制御の下にテキスト画像を表示する。The video decoder 12, the multiplexing/demultiplexing unit 17, the audio codec 23, and the video encoder 27 are also connected to one another via a synchronous bus 38. The main control unit 11 comprises a CPU, ROM, RAM, and the like, and realizes the operation of the portable information terminal device by exercising overall control over each part of the terminal body 1. The video decoder 12 decodes encoded video data and supplies the reproduced video data to the video LCD control circuit unit 13. The video LCD control circuit unit 13 controls the video LCD 14 so as to display the video represented by the video data supplied from the video decoder 12. The video LCD 14 is a color LCD with sufficient resolution to display video (for example, MPEG-4 video), and displays video under the control of the video LCD control circuit unit 13. The text LCD control circuit unit 15 controls the text LCD 16 so as to display the text image represented by the text data supplied from the main control unit 11. The text LCD 16 is a monochrome LCD with a larger area and a lower resolution than the video LCD 14, and displays text images under the control of the text LCD control circuit unit 15.

【００２４】多重分離部１７は、マルチメディア通信モ
ードと音声通話モードとの２つの動作モードを有してお
り、主制御部１１により指定されたモードで動作する。
マルチメディア通信モードのとき多重分離部１７は、映
像エンコーダ２７から同期バス３８を介して与えられる
符号化映像データ、音声コーデック２３から同期バス３
８を介して与えられる符号化音声データ、及び主制御部
１１から与えられる他データを所定の多重化方式（例え
ば、ＩＴＵ−Ｔ勧告のＨ．２２１、又はＩＴＵ−Ｔ勧告
のＨ．２２１を変形したもの）で多重化し、これにより
得られる伝送データをＰＨＳ回線ＩＦ部１８へと与え
る。またマルチメディア通信モードのとき多重分離部１
７は、ＰＨＳ回線ＩＦ部１８から与えられる伝送データ
から符号化映像データ、符号化音声データ、及び他デー
タをそれぞれ分離し、これらの各データを映像デコーダ
１２、音声コーデック２３、及び主制御部１１のそれぞ
れへと与える。これに対して、音声通話モードのとき多
重分離部１７は、音声コーデック２３から同期バス３８
を介して与えられる符号化音声データをそのままＰＨＳ
回線ＩＦ部１８へと与える。また音声通話モードのとき
多重分離部１７は、ＰＨＳ回線ＩＦ部１８から与えられ
る伝送データ（符号化音声データ）をそのまま音声コー
デック２３へと与える。The multiplexing/demultiplexing unit 17 has two operation modes, a multimedia communication mode and a voice call mode, and operates in the mode specified by the main control unit 11. In the multimedia communication mode, the multiplexing/demultiplexing unit 17 multiplexes the encoded video data supplied from the video encoder 27 via the synchronous bus 38, the encoded audio data supplied from the audio codec 23 via the synchronous bus 38, and other data supplied from the main control unit 11, using a predetermined multiplexing scheme (for example, ITU-T Recommendation H.221 or a modified form of it), and supplies the resulting transmission data to the PHS line IF unit 18. Also in the multimedia communication mode, the multiplexing/demultiplexing unit 17 separates the encoded video data, the encoded audio data, and the other data from the transmission data supplied from the PHS line IF unit 18, and supplies them to the video decoder 12, the audio codec 23, and the main control unit 11, respectively. In the voice call mode, by contrast, the multiplexing/demultiplexing unit 17 passes the encoded audio data supplied from the audio codec 23 via the synchronous bus 38 unchanged to the PHS line IF unit 18, and passes the transmission data (encoded audio data) supplied from the PHS line IF unit 18 unchanged to the audio codec 23.
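
The two-mode behavior of unit 17 described above can be sketched as follows. This is a toy illustration only: the tag-and-length framing used here is an invented stand-in and is not ITU-T H.221, and the names `Mode`, `TAGS`, `multiplex`, and `demultiplex` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    MULTIMEDIA = 1  # video + audio + other data share one stream
    VOICE = 2       # encoded audio passes through unchanged


# Assumed one-byte tags for the toy framing (not part of any standard).
TAGS = {"video": b"V", "audio": b"A", "data": b"D"}


def multiplex(mode, audio=b"", video=b"", data=b""):
    """Build the transmission data handed to the line interface."""
    if mode is Mode.VOICE:
        return audio
    out = bytearray()
    for kind, payload in (("video", video), ("audio", audio), ("data", data)):
        # Toy frame: tag, 2-byte big-endian length, payload (max 65535 bytes).
        out += TAGS[kind] + len(payload).to_bytes(2, "big") + payload
    return bytes(out)


def demultiplex(mode, stream):
    """Split received transmission data back into its streams."""
    if mode is Mode.VOICE:
        return {"audio": stream}
    kinds = {tag: kind for kind, tag in TAGS.items()}
    result, pos = {}, 0
    while pos < len(stream):
        kind = kinds[stream[pos:pos + 1]]
        size = int.from_bytes(stream[pos + 1:pos + 3], "big")
        result[kind] = stream[pos + 3:pos + 3 + size]
        pos += 3 + size
    return result
```

In this sketch, the video entry of the demultiplexed result would go to the video decoder 12, the audio entry to the audio codec 23, and the data entry to the main control unit 11, mirroring the routing in the description.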

【００２５】ＰＨＳ回線ＩＦ部１８は、アンテナ１９を
介して無線によりＰＨＳ（Ｐｅｒｓｏｎａｌ　Ｈａｎｄ
ｙｐｈｏｎｅ　Ｓｙｓｔｅｍ）網に接続可能で、ＰＨＳ
網を介しての通信を行うための各種の呼処理を行うとと
もに、ＰＨＳ網を介して設定された通信バスを介して伝
送データの送受信を行う。領域抽出部２１はマルチメデ
ィア通信モードで動作し、カメラＩＦ部２５から与えら
れる映像からフレーム単位で頭部、即ち顔部分の映像を
切り出す。音声コーデック２３は、マルチメディア通信
モードと音声通話モードとの２つの動作モードを有して
おり、主制御部１１により指定されたモードで動作す
る。マルチメディア通信モードのとき音声コーデック２
３は、イヤホンマイク用端子２４を介して接続されたイ
ヤホンマイク３から出力される音声信号をデジタル化す
るとともに所定の低レート音声符号化方式（例えば、Ｉ
ＴＵ−Ｔ勧告のＧ７２９）でエンコードして符号化音声
データを得る。音声コーデック２３は、この符号化音声
データについて同期バス３８を介して多重分離部１７へ
と与える。またマルチメディア通信モードのとき音声コ
ーデック２３は、多重分離部１７から与えられる符号化
音声データにおける低レート音声符号をデコードすると
ともにアナログ化して音声信号を得る。音声コーデック
２３は、この音声信号をイヤホンマイク３へと与える。The PHS line IF unit 18 can connect wirelessly to a PHS (Personal Handyphone System) network via the antenna 19. It performs the various call processing needed to communicate over the PHS network and sends and receives transmission data over a communication path set up through the PHS network. The area extraction unit 21 operates in the multimedia communication mode and cuts out, frame by frame, the image of the head, that is, the face portion, from the video supplied by the camera IF unit 25. The audio codec 23 has two operation modes, a multimedia communication mode and a voice call mode, and operates in the mode specified by the main control unit 11. In the multimedia communication mode, the audio codec 23 digitizes the audio signal output from the earphone microphone 3, which is connected via the earphone microphone terminal 24, and encodes it with a predetermined low-rate speech coding scheme (for example, ITU-T Recommendation G.729) to obtain encoded audio data, which it supplies to the multiplexing/demultiplexing unit 17 via the synchronous bus 38. Also in the multimedia communication mode, the audio codec 23 decodes the low-rate speech code in the encoded audio data supplied from the multiplexing/demultiplexing unit 17, converts the result to analog to obtain an audio signal, and supplies this audio signal to the earphone microphone 3.

【００２６】これに対して、音声通話モードのとき音声
コーデック２３は、イヤホンマイク用端子２４を介して
接続されたイヤホンマイク３から出力される音声信号を
デジタル化するとともに３２ｋｂｐｓのＡＤＰＣＭ方式
（ＩＴＵ−Ｔ勧告のＧ７２６）でエンコードして符号化
音声データを得る。音声コーデック２３は、この符号化
音声データについて同期バス３８を介して多重分離部１
７へと与える。また音声通話モードのとき音声コーデッ
ク２３は、多重分離部１７から与えられる符号化音声デ
ータにおけるＡＤＰＣＭ符号をデコードするとともにア
ナログ化して音声信号を得る。音声コーデック２３は、
この音声信号をイヤホンマイク３へと与える。なおイヤ
ホンマイク３は、周囲の音声を音声信号に変換して音声
コーデック２３に与えるとともに、音声コーデック２３
から与えられる音声信号を音声として出力する。このイ
ヤホンマイク３は、端末本体１に対して着脱自在となっ
ている。カメラＩＦ部２５は、カメラ用端子２６を介し
て接続されたカメラ部４から出力される映像信号を取込
み、デジタル化して映像データを得る。カメラＩＦ部２
５は、映像データを映像エンコーダ２７と領域抽出部２
１へ与える。In the voice call mode, by contrast, the audio codec 23 digitizes the audio signal output from the earphone microphone 3 connected via the earphone microphone terminal 24 and encodes it with 32 kbps ADPCM (ITU-T Recommendation G.726) to obtain encoded audio data, which it supplies to the multiplexing/demultiplexing unit 17 via the synchronous bus 38. Also in the voice call mode, the audio codec 23 decodes the ADPCM code in the encoded audio data supplied from the multiplexing/demultiplexing unit 17, converts the result to analog to obtain an audio signal, and supplies this audio signal to the earphone microphone 3. The earphone microphone 3 converts ambient sound into an audio signal and supplies it to the audio codec 23, and outputs the audio signal supplied from the audio codec 23 as sound. The earphone microphone 3 is detachable from the terminal body 1. The camera IF unit 25 captures the video signal output from the camera unit 4 connected via the camera terminal 26 and digitizes it to obtain video data, which it supplies to the video encoder 27 and the area extraction unit 21.

【００２７】映像エンコーダ２７は、カメラＩＦ部２５
又は領域抽出部２１から与えられる映像データをエンコ
ードして符号化映像データを得る。映像エンコーダ２７
は、符号化映像データを映像デコーダ１２や多重分離部
１７へと与える。領域抽出部２１は入力された映像から
被写体部分の領域映像を切り出す。切り出した映像は映
像エンコーダ２７に送られる。カメラ部４は、ＣＣＤカ
メラなどを用いたものである。このカメラ部４は、端末
本体１に対して着脱自在となっている。かつ、カメラ部
４は、映像用ＬＣＤ１４及びテキスト用ＬＣＤ１６の表
示面が設けられた側と同じ方向を撮影する状態と、映像
用ＬＣＤ１４及びテキスト用ＬＣＤ１６の表示面が設け
られた側に対する背面側の方向を撮影する状態との２つ
の状態での装着が可能である。カメラ向きセンサ部２８
は、カメラ部４の装着の有無、ならびにカメラ部４が上
記２つの装着状態のいずれで装着されているかを検出す
る。操作入力制御回路部２９には、タッチパネル３０、
スクロールダイヤル３１、操作ボタン３２、及び電源ボ
タン３４がそれぞれ接続されている。操作入力制御回路
部２９は、これらタッチパネル３０、スクロールダイヤ
ル３１、操作ボタン３２、及び電源ボタン３４でのユー
ザの指示操作を受付け、その指示操作の内容を主制御部
１１に通知する。The video encoder 27 encodes the video data supplied from the camera IF unit 25 or the area extraction unit 21 to obtain encoded video data, and supplies the encoded video data to the video decoder 12 and the multiplexing/demultiplexing unit 17. The area extraction unit 21 cuts out the region of the subject from the input video and sends the cut-out video to the video encoder 27. The camera unit 4 uses a CCD camera or the like and is detachable from the terminal body 1. The camera unit 4 can be mounted in either of two states: one in which it shoots in the same direction as the side on which the display surfaces of the video LCD 14 and the text LCD 16 are provided, and one in which it shoots toward the rear, opposite that side. The camera orientation sensor unit 28 detects whether the camera unit 4 is mounted and, if so, in which of the two mounting states. The touch panel 30, the scroll dial 31, the operation button 32, and the power button 34 are each connected to the operation input control circuit unit 29, which accepts the user's operations on them and notifies the main control unit 11 of their content.

【００２８】タッチパネル３０は、テキスト用ＬＣＤ１
６の表示面に重ねて配置されており、テキスト用ＬＣＤ
１６の表示内容に対応した各種の入力を受けるためのも
のである。スクロールダイヤル３１は、カーソル移動や
表示画面のスクロールなどの指示を受けるためのもので
ある。操作ボタン３２は、決定指示や取消指示の入力を
受けるためのものである。電源ボタン３４は、この携帯
情報端末装置の動作をＯＮ／ＯＦＦするための指示を受
付るためのものである。電源部３６は、例えばバッテリ
を電力源として有し、端末本体１の各部に電力供給を行
う。電源部３６は、主制御部１１の制御の下に各部への
電力供給をＯＮ／ＯＦＦする。ただし電源部３６は、少
なくとも主制御部１１及び操作入力制御回路部２９への
電力供給は常時行う。図２は端末本体１及びカメラ部４
の外観を示す図である。この図の（ａ）は正面図、
（ｂ）は左側面図、（ｃ）は右側面図、そして（ｄ）は
上面図である。なお、図１と同一部分には同一符号を付
している。この図に示すように端末本体１は、箱型の筐
体６０を有し、この筐体６０の内部に、前述した端末本
体１の各構成要素が収容されている。The touch panel 30 is overlaid on the display surface of the text LCD 16 and receives various inputs corresponding to what the text LCD 16 displays. The scroll dial 31 receives instructions such as cursor movement and scrolling of the display screen. The operation button 32 receives decision and cancel instructions. The power button 34 receives instructions to turn the operation of the portable information terminal device ON and OFF. The power supply unit 36 has, for example, a battery as its power source and supplies power to each part of the terminal body 1. The power supply unit 36 switches the power supply to each part ON and OFF under the control of the main control unit 11, but it always supplies power to at least the main control unit 11 and the operation input control circuit unit 29. FIG. 2 shows the external appearance of the terminal body 1 and the camera unit 4: (a) is a front view, (b) is a left side view, (c) is a right side view, and (d) is a top view. Parts identical to those in FIG. 1 carry the same reference numerals. As shown in the figure, the terminal body 1 has a box-shaped housing 60, which houses the components of the terminal body 1 described above.

【００２９】映像用ＬＣＤ１４及びテキスト用ＬＣＤ１
６は、筐体６０の一面からその表示面を筐体６０の外部
に露出させた状態で設けられている。スクロールダイヤ
ル３１は、映像用ＬＣＤ１４及びテキスト用ＬＣＤ１６
がそれぞれ設けられている面（以下、筐体前面と称す
る。）と交差する４つの面（以下、筐体側面と称す
る。）のうちの１つ（ここでは図２（ａ）において右側
に示された筐体側面）に設けられている。また操作ボタ
ン３２及び電源ボタン３４は、筐体６０におけるスクロ
ールダイヤル３１が設けられた面に隣り合う筐体側面
（ここでは図２（ａ）において上側に示された筐体側
面）に設けられている。なおスクロールダイヤル３１と
操作ボタン３２は、人間の手の大きさを考慮し、筐体６
０の端部を手のひらに載せた状態で、同じ手の親指でス
クロールダイヤル３１を操作しつつ、同じ手の残りの指
で操作ボタン３２を操作可能なように相対的な位置が決
められている。イヤホンマイク用端子２４は、スクロー
ルダイヤル３１が設けられているのと同じ筐体側面に設
けられている。このイヤホンマイク用端子２４の位置
は、イヤホンマイク３（図２では図示せず）を装着した
状態でも、そのイヤホンマイク３がスクロールダイヤル
３１の操作の妨げとならないように決められている。The video LCD 14 and the text LCD 16 are mounted with their display surfaces exposed to the outside through one face of the housing 60. The scroll dial 31 is provided on one of the four faces (hereinafter, housing side faces) that intersect the face on which the video LCD 14 and the text LCD 16 are provided (hereinafter, the housing front face); here it is the housing side face shown on the right in FIG. 2(a). The operation button 32 and the power button 34 are provided on a housing side face adjacent to the one carrying the scroll dial 31; here it is the housing side face shown at the top in FIG. 2(a). The relative positions of the scroll dial 31 and the operation button 32 are chosen with the size of the human hand in mind, so that with the end of the housing 60 resting on the palm, the user can operate the scroll dial 31 with the thumb and the operation button 32 with the remaining fingers of the same hand. The earphone microphone terminal 24 is provided on the same housing side face as the scroll dial 31, at a position chosen so that the earphone microphone 3 (not shown in FIG. 2), even when attached, does not get in the way of operating the scroll dial 31.

【００３０】カメラ用端子２６は、スクロールダイヤル
３１が設けられている筐体側面とは反対側の筐体側面
（ここでは図２（ａ）において左側に示された筐体側
面）に設けられている。カメラ部４は、カメラ部本体４
ａと支持部４ｂとをヒンジ部４ｃで連結してなり、操作
ボタン３２及び電源ボタン３４が設けられているのと同
じ筐体側面に設けられた凹部６１に支持部４ｂを挿入す
ることで端末本体１に装着される。さらにカメラ部４
は、図示しない接続線の先端に設けたプラグをカメラ用
端子２６に挿入することで端末本体１に対して電気的に
接続される。凹部６１は、筐体前寄りの凹部６１−１
と、筐体６０にて筐体前面とは反対側の面（以下、筐体
背面と称する）寄りの凹部６１−２との２つが形成され
ている。これにより、図２に示すように筐体前面寄りの
凹部６１−１に支持部４ｂを挿入することでカメラ部４
の撮影方向を筐体前面側とするほかに、筐体背面寄りの
凹部６１−２に、カメラ部４の向きを図２とは逆にして
支持部４ｂを挿入することでカメラ部４の撮影方向を筐
体背面側とすることもできる。またカメラ部本体４ａ
は、ヒンジ部４ｃを中心として回動可能であり、撮影角
度を変えることもできる。The camera terminal 26 is provided on the housing side face opposite the one carrying the scroll dial 31 (here, the housing side face shown on the left in FIG. 2(a)). The camera unit 4 consists of a camera unit body 4a and a support portion 4b joined by a hinge portion 4c, and is attached to the terminal body 1 by inserting the support portion 4b into a recess 61 formed in the same housing side face as the operation button 32 and the power button 34. The camera unit 4 is electrically connected to the terminal body 1 by inserting a plug at the end of a connecting cable (not shown) into the camera terminal 26. Two recesses 61 are formed: a recess 61-1 near the housing front face, and a recess 61-2 near the face of the housing 60 opposite the front face (hereinafter, the housing rear face). Thus, besides inserting the support portion 4b into the recess 61-1 near the housing front face as shown in FIG. 2, so that the camera unit 4 shoots toward the front of the housing, the user can insert the support portion 4b into the recess 61-2 near the housing rear face with the camera unit 4 turned the opposite way from FIG. 2, so that it shoots toward the rear of the housing. The camera unit body 4a can also be rotated about the hinge portion 4c to change the shooting angle.

【００３１】次に、以上のように構成されたこの実施形
態の携帯情報端末装置の動作について説明する。まず、
主制御部１１は電源ＯＦＦの状態では、電源ボタン３４
が押下されるのを待ち受けている。そして主制御部１１
は、電源ボタン３４が押下されたことに応じて電源部３
６から各部への電力供給を開始させ、電源ＯＮ状態に移
行させる。この実施形態の携帯情報端末装置は、主な動
作モードとして、電話モードとテレビ電話モードを有し
ている。上述のように電源ＯＮ状態に移行した直後に主
制御部１１は、待機状態となる。待機状態において主制
御部１１は、電話モードとテレビ電話モードのうちのい
ずれかを選択するためのメインメニュー画面をテキスト
用ＬＣＤ１６に表示させるべくテキスト用ＬＣＤ制御回
路部１５を制御する。またメインメニュー画面には、現
在選択候補となっている動作モード名（初期状態では所
定の動作モード名）に重ねてカーソルを表示する。そし
てこのようなメインメニュー画面を表示させた状態で、
いずれかのモードによる選択操作（選択候補の変更指示
及び決定指示）がなされるのを待つ。なおメインメニュ
ー画面は、テキスト用ＬＣＤ１６よりも大きなサイズで
あっても良く、表示領域を後述のスクロールダイヤル３
１の操作に応じて変化させるようにしてもよい。この状
態でスクロールダイヤル３１が操作されると、主制御部
１１は選択候補の変更指示がなされたと判定する。そし
てこのときに主制御部１１は、スクロールダイヤル３１
の回転方向と回転量との情報を操作入力制御回路部２９
から受取り、一定量の回転毎に回転方向に応じた順序で
選択候補を変更するとともに、常に選択候補のモード名
に重ねてカーソルを表示するようにカーソルを移動させ
る。Next, the operation of the portable information terminal device of this embodiment, configured as described above, is explained. While the power is OFF, the main control unit 11 waits for the power button 34 to be pressed. When the power button 34 is pressed, the main control unit 11 causes the power supply unit 36 to start supplying power to each part and shifts the device to the power-ON state. The portable information terminal device of this embodiment has two main operation modes: a telephone mode and a videophone mode. Immediately after the shift to the power-ON state, the main control unit 11 enters a standby state. In the standby state, the main control unit 11 controls the text LCD control circuit unit 15 so that the text LCD 16 displays a main menu screen for selecting either the telephone mode or the videophone mode. On the main menu screen, a cursor is displayed over the name of the operation mode that is currently the selection candidate (in the initial state, a predetermined operation mode name). With the main menu screen displayed, the main control unit 11 waits for a selection operation (an instruction to change the selection candidate and a decision instruction) that picks one of the modes. The main menu screen may be larger than the text LCD 16, in which case the displayed region may be changed according to operation of the scroll dial 31. When the scroll dial 31 is operated in this state, the main control unit 11 determines that an instruction to change the selection candidate has been given. The main control unit 11 then receives the rotation direction and rotation amount of the scroll dial 31 from the operation input control circuit unit 29, changes the selection candidate once per fixed amount of rotation in an order that depends on the rotation direction, and moves the cursor so that it is always displayed over the mode name of the current selection candidate.
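
The scroll-dial handling described above (one change of selection candidate per fixed amount of rotation, in an order set by the rotation direction) can be sketched as a small state machine. The Python below is illustrative only. The class name `ModeMenu`, the signed integer rotation units, and the wrap-around at the ends of the list are assumptions that the description does not specify.

```python
class ModeMenu:
    """Tracks the selection candidate on a menu driven by a scroll dial."""

    def __init__(self, modes, units_per_step=1):
        self.modes = list(modes)
        self.index = 0                 # initial candidate: a predetermined mode
        self.units_per_step = units_per_step
        self._accumulated = 0          # rotation not yet turned into a step

    def rotate(self, amount):
        """Apply a signed rotation amount; positive means forward.

        The candidate changes once for every `units_per_step` units of
        accumulated rotation. Returns the current candidate's name.
        """
        self._accumulated += amount
        steps = int(self._accumulated / self.units_per_step)  # toward zero
        self._accumulated -= steps * self.units_per_step
        # Wrapping past either end of the list is an assumption.
        self.index = (self.index + steps) % len(self.modes)
        return self.modes[self.index]

    def selected(self):
        """Name of the mode the cursor is currently displayed over."""
        return self.modes[self.index]
```

For instance, with the two modes of this embodiment and an assumed two units of rotation per step, a single unit of rotation leaves the cursor on the telephone mode and a second unit moves it to the videophone mode.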

【００３２】そして、操作ボタン３２がダブルクリック
されたら、主制御部１１は決定指示がなされたと判定す
る。そして、このときに主制御部１１は、現在選択候補
となっている動作モードの処理ルーチンに移行する。な
お主制御部１１は、選択候補の変更指示及び決定指示は
タッチパネル３０での入力状況に基づいても行うことが
できる。以下、各動作モードでの動作を、主制御部１１
の処理手順に従って順次説明する。［電話モード］この動作モードでは、ＰＨＳ端末として
音声通話を行うことを可能とする。この動作モードで主
制御部１１は、多重分離部１７及び音声コーデック２３
の動作モードを音声通話モードに設定している。そして
主制御部１１は、所定の電話番号指定方法メニュー画面
をテキスト用ＬＣＤ１６に表示させるべくテキスト用Ｌ
ＣＤ制御回路部１５を制御する。電話番号指定方法に
は、「番号入力」、「カナ検索」、「英字検索」、及び
「番号検索」がある。「番号入力」は数字の一覧をテキ
スト用ＬＣＤ１６に表示させ、スクロールダイヤル３１
を操作させて選択し、操作ボタン３２をクリックするこ
とにより電話番号を入力する。「カナ検索」、「英字検
索」、及び「数字検索」はカナ文字、英字、及び数字の
一覧を示した入力画面をテキスト用ＬＣＤ１６に表示
し、スクロールダイヤル３１の操作と操作ボタン３２を
クリックして希望の相手先名を入力する。そして、主制
御部１１内のＲＡＭ（図示せず）に格納されている電話
帳情報（電話番号と名称とが対応付けられた宛先情報が
複数登録されている）により、前記入力した文字列から
候補となる宛先情報を表示させ、その中から希望する宛
先情報を選択する。When the operation button 32 is double-clicked, the main control unit 11 determines that a confirmation instruction has been issued, and shifts to the processing routine of the operation mode that is currently the selection candidate. The main control unit 11 can also accept the instruction to change the selection candidate and the confirmation instruction based on input on the touch panel 30. The operation in each operation mode is described below in order, following the processing procedure of the main control unit 11. [Telephone Mode] In this operation mode, voice calls can be made as a PHS terminal. In this operation mode, the main control unit 11 sets the operation mode of the multiplexing/demultiplexing unit 17 and the audio codec 23 to the voice call mode. The main control unit 11 then controls the text LCD control circuit unit 15 so that a predetermined telephone number designation method menu screen is displayed on the text LCD 16. The telephone number designation methods are "number input", "kana search", "alphabet search", and "number search". In "number input", a list of digits is displayed on the text LCD 16, and the telephone number is entered by selecting digits with the scroll dial 31 and clicking the operation button 32. In "kana search", "alphabet search", and "number search", an input screen listing kana characters, alphabetic characters, or digits is displayed on the text LCD 16, and the name of the desired destination is entered by operating the scroll dial 31 and clicking the operation button 32. Then, using the telephone directory information stored in a RAM (not shown) in the main control unit 11 (in which a plurality of destination entries, each associating a telephone number with a name, are registered), candidate destination entries matching the entered character string are displayed, and the desired destination entry is selected from among them.
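The directory lookup described above can be illustrated with a short sketch. The text does not state how the entered character string is matched against the registered names, so the prefix match used here is an assumption, and the entries in `PHONE_BOOK` are made-up placeholder data.

```python
from typing import List, NamedTuple


class Destination(NamedTuple):
    name: str    # destination name registered in the directory
    number: str  # telephone number associated with that name


# Placeholder directory contents, for illustration only.
PHONE_BOOK = [
    Destination("ABE", "000-0001"),
    Destination("AOKI", "000-0002"),
    Destination("BABA", "000-0003"),
]


def find_candidates(phone_book: List[Destination], typed: str) -> List[Destination]:
    """Return the entries whose name starts with the typed string (assumed matching rule)."""
    return [entry for entry in phone_book if entry.name.startswith(typed)]
```

The user would then pick one of the returned entries, and its `number` would be used for the call.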

【００３３】主制御部１１は、以上のようにして発信電
話番号を確定すると、その発信電話番号による発信処理
を行うようにＰＨＳ回線ＩＦ部１８を制御する。この発
信処理に応じてＰＨＳ網によって発信先の端末（相手側
情報端末装置）との間の通信パスが形成されたならば、
ユーザはイヤホンマイク３を用いての通話を行うことが
できる。すなわち、ＰＨＳ網を介して相手側情報端末装
置から送られてきた符号化音声データ（ＡＤＰＣＭ方
式）は、アンテナ１９及びＰＨＳ回線ＩＦ部１８によっ
て受信され、多重分離部１７及び同期バス３８を介して
音声コーデック２３に与えられる。そして符号化音声デ
ータは、音声コーデック２３によってＡＤＰＣＭ符号が
デコードされるとともにアナログ化されて音声信号に再
生される。そしてイヤホンマイク３に与えられること
で、イヤホンマイク３から受話音声として出力される。
一方、ユーザが発した送話音声は、イヤホンマイク３に
よって音声信号に変換されて音声コーデック２３に与え
られる。そして音声信号は音声コーデック２３によって
デジタル化及びＡＤＰＣＭ符号化がなされて符号化デー
タとされたのち、同期バス３８及び多重分離部１７を介
してＰＨＳ回線ＩＦ部１８へと与えられ、ＰＨＳ回線Ｉ
Ｆ部１８によってアンテナ１９及びＰＨＳ網を介して相
手側情報端末装置へと送られる。When the outgoing telephone number has been determined as described above, the main control unit 11 controls the PHS line IF unit 18 to place a call to that number. Once the PHS network has established a communication path to the called terminal (the partner information terminal device) in response to this call processing, the user can talk using the earphone microphone 3. That is, encoded audio data (ADPCM) sent from the partner information terminal device via the PHS network is received by the antenna 19 and the PHS line IF unit 18 and passed to the audio codec 23 via the multiplexing/demultiplexing unit 17 and the synchronous bus 38. The audio codec 23 decodes the ADPCM code and converts the result to analog form to reproduce the audio signal, which is supplied to the earphone microphone 3 and output as received voice. Conversely, the voice uttered by the user is converted into an audio signal by the earphone microphone 3 and supplied to the audio codec 23. The audio codec 23 digitizes and ADPCM-encodes the audio signal into encoded data, which is supplied to the PHS line IF unit 18 via the synchronous bus 38 and the multiplexing/demultiplexing unit 17, and the PHS line IF unit 18 sends it to the partner information terminal device via the antenna 19 and the PHS network.

【００３４】［テレビ電話モード］この動作モードで
は、音声通話を行いながら、映像の送受信が行える。こ
の動作モードで、主制御部１１は、初期状態では多重分
離部１７及び音声コーデック２３の動作モードを音声通
話モードに設定している。そして主制御部１１は、前述
した電話モードの場合と同様にして発信電話番号の指定
受付けと発信処理を行う。そしてＰＨＳ網によって発信
先の端末（相手側情報端末装置）との間の通信パスが形
成されたならば、主制御部１１は所定の手順（例えば、
ＩＴＵ−Ｔ勧告のＨ．２４５）で相手側情報端末装置と
ネゴシエーションを行い、相手側情報端末装置が映像・
音声多重通信を行うことができるか否か、及び映像・音
声多重通信の実施を許容するか否かを確認する。そして
相手側情報端末装置が、映像・音声多重通信を行うこと
ができないか、映像・音声多重通信の実施を拒否してい
る場合には、主制御部１１は以降、電話モードに移行
し、音声通話のみを可能とする。一方、相手側情報端末
装置が、映像・音声多重通信を行うことが可能で、かつ
映像・音声多重通信の実施を許可した場合には、主制御
部１１は多重分離部１７及び音声コーデック２３の動作
モードをマルチメディア通信モードに切替える。[Videophone Mode] In this operation mode, video can be transmitted and received during a voice call. In this operation mode, the main control unit 11 initially sets the operation mode of the multiplexing/demultiplexing unit 17 and the audio codec 23 to the voice call mode. The main control unit 11 then accepts the designation of the outgoing telephone number and performs call processing in the same manner as in the telephone mode described above. Once the PHS network has established a communication path to the called terminal (the partner information terminal device), the main control unit 11 negotiates with the partner information terminal device using a predetermined procedure (for example, ITU-T Recommendation H.245) to confirm whether the partner information terminal device is capable of multiplexed video/audio communication and whether it permits such communication. If the partner information terminal device cannot perform multiplexed video/audio communication or refuses it, the main control unit 11 thereafter shifts to the telephone mode and enables voice calls only. If, on the other hand, the partner information terminal device is capable of multiplexed video/audio communication and permits it, the main control unit 11 switches the operation mode of the multiplexing/demultiplexing unit 17 and the audio codec 23 to the multimedia communication mode.
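The outcome of the negotiation reduces to a two-input decision, sketched below. The function and enum names are hypothetical, and the sketch models only the decision itself, not the H.245 message exchange.

```python
from enum import Enum


class Mode(Enum):
    VOICE_CALL = "voice call mode"
    MULTIMEDIA = "multimedia communication mode"


def mode_after_negotiation(peer_capable: bool, peer_permits: bool) -> Mode:
    """Pick the operation mode for unit 17 and codec 23 once negotiation has finished."""
    # Multiplexed video/audio is used only if the partner both can and will do it;
    # in every other case the call continues as a voice-only call.
    if peer_capable and peer_permits:
        return Mode.MULTIMEDIA
    return Mode.VOICE_CALL
```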

【００３５】このようにして、イヤホンマイク３で生成
された音声信号を音声コーデック２３で低レート音声符
号化方式によりエンコードして得られた符号化音声デー
タと、カメラ部４で生成された映像信号あるいは領域抽
出部２１から送られてきた映像データを映像エンコーダ
２７でエンコードして得られた符号化映像データと、主
制御部１１から出力される他データとが多重分離部１７
にて多重化される。そして、これにより得られた伝送デ
ータが、相手側情報端末装置へと送られる。また相手側
情報端末装置から送信された伝送データからは、符号化
音声データ、符号化映像データ、及び他データが多重分
離部１７にてそれぞれ分離される。そして符号化音声デ
ータは、音声コーデック２３で音声信号に戻され、イヤ
ホンマイク３から音声として出力される。符号化映像デ
ータは、映像デコーダ１２で映像データに戻され、この
映像データに基づいて映像用ＬＣＤ制御回路部１３の制
御の下に、映像用ＬＣＤ１４にて映像表示が行われる。
これにより、イヤホンマイク３を用いての通話を行いつ
つ、カメラ部４により撮影した映像を相手側に送信し、
また相手側から送られた映像を映像用ＬＣＤ１４に表示
して見ることができる。In this way, the multiplexing/demultiplexing unit 17 multiplexes three streams: the encoded audio data obtained when the audio codec 23 encodes the audio signal generated by the earphone microphone 3 using a low-rate audio coding scheme, the encoded video data obtained when the video encoder 27 encodes the video signal generated by the camera unit 4 or the video data sent from the area extraction unit 21, and other data output from the main control unit 11. The resulting transmission data is sent to the partner information terminal device. Conversely, the multiplexing/demultiplexing unit 17 separates the transmission data received from the partner information terminal device into encoded audio data, encoded video data, and other data. The encoded audio data is converted back into an audio signal by the audio codec 23 and output as sound from the earphone microphone 3. The encoded video data is converted back into video data by the video decoder 12, and video is displayed on the video LCD 14 under the control of the video LCD control circuit unit 13 based on this video data. As a result, while talking through the earphone microphone 3, the user can send video captured by the camera unit 4 to the other party and view video sent from the other party on the video LCD 14.
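The multiplexing and separation of audio, video, and other data can be illustrated with a simple tagged framing. The frame layout below (a one-byte kind followed by a two-byte length) is an assumption chosen for brevity; the specification does not describe the actual multiplex format used by the multiplexing/demultiplexing unit 17.

```python
import struct

AUDIO, VIDEO, OTHER = 0, 1, 2  # hypothetical stream identifiers
HEADER = ">BH"                 # 1-byte kind + 2-byte big-endian payload length


def multiplex(chunks):
    """Combine (kind, payload) pairs into one transmission byte stream."""
    out = bytearray()
    for kind, payload in chunks:
        out += struct.pack(HEADER, kind, len(payload)) + payload
    return bytes(out)


def demultiplex(stream):
    """Split a transmission byte stream back into per-kind lists of payloads."""
    result = {AUDIO: [], VIDEO: [], OTHER: []}
    pos = 0
    header_size = struct.calcsize(HEADER)
    while pos < len(stream):
        kind, length = struct.unpack_from(HEADER, stream, pos)
        pos += header_size
        result[kind].append(stream[pos:pos + length])
        pos += length
    return result
```

On the receiving side, the `AUDIO` payloads would go to the audio codec and the `VIDEO` payloads to the video decoder, mirroring the separation described above.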

【００３６】上記のテレビ電話モードの際に主制御部１
１はカメラ向きセンサ部２８からの検出信号を調べ、カ
メラ部４が端末本体１に装着され、映像用ＬＣＤ１４及
びテキスト用ＬＣＤ１６の表示面が設けられた側と同じ
方向を撮影している状態ならば、以下の処理を行う。カ
メラ部４は端末本体１を持っているユーザ側のユーザの
被写体を撮影している状態であり、映像用ＬＣＤ１４に
相手側の被写体を表示させている。カメラ部４は、図８
２のような映像を撮影して、カメラＩＦ部２５を通って
領域抽出部２１で図４のように被写体部分の映像を切り
出し、映像エンコーダ２７でエンコードして得られた符
号化映像を多重分離部１７にて多重化して相手側の情報
端末装置に送る。被写体の映像切り出し方法は従来から
色々提案されている。例えば、特開平６−３０３１８号
の方法によって顔形状を切り出すことができる。この方
法は、画面に表示された映像について色相信号から肌色
領域、輝度信号から髪の毛領域を抽出し、両者を統合す
ることにより人間の顔モデルに適合した顔領域を得てい
る。尚、前記の方法は、顔映像だけでも抽出可能なた
め、被写体の頭部に髪の毛がない場合、すなわち禿げて
いる頭部でも抽出可能である。また、黒人の顔及び白髪
などに対しては切り出す対象範囲としての色相情報及び
輝度情報の抽出範囲を広げることにより適用することが
できる。In the videophone mode described above, the main control unit 11 checks the detection signal from the camera direction sensor unit 28 and, if the camera unit 4 is attached to the terminal body 1 and is shooting in the same direction as the side on which the display surfaces of the video LCD 14 and the text LCD 16 are provided, performs the following processing. In this state, the camera unit 4 is shooting the user who is holding the terminal body 1 as the subject, and the other party is displayed on the video LCD 14. The camera unit 4 captures video such as that shown in FIG. 82; the video passes through the camera IF unit 25, and the area extraction unit 21 cuts out the subject portion as shown in FIG. 4. The encoded video obtained by encoding this with the video encoder 27 is multiplexed by the multiplexing/demultiplexing unit 17 and sent to the partner information terminal device. Various methods of cutting out the image of a subject have been proposed. For example, a face shape can be cut out by the method of Japanese Patent Application Laid-Open No. 6-30318. This method extracts a skin-color region from the hue signal and a hair region from the luminance signal of the displayed video, and integrates the two to obtain a face region that fits a human face model. Because this method can extract the face from the face image alone, it also works when the subject has no hair on the head, that is, for a bald head. It can also be applied to the faces of Black people, white hair, and the like by widening the ranges of hue and luminance information that are treated as extraction targets.
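The idea of taking a skin-color region from hue and a hair region from luminance, then combining them, can be sketched as below. This is not the method of the cited publication: the thresholds `SKIN_HUE`, `MIN_SATURATION`, and `HAIR_LUMA_MAX` are hypothetical, and the two regions are combined by a plain union without fitting any face model.

```python
import colorsys

SKIN_HUE = (0.0, 0.14)  # hypothetical hue window, as a fraction of the color wheel
MIN_SATURATION = 0.2    # hypothetical: excludes grey and white pixels from "skin"
HAIR_LUMA_MAX = 0.2     # hypothetical: pixels darker than this are treated as hair


def classify(pixel):
    """Label one (r, g, b) pixel, components in 0..1, as 'hair', 'skin' or 'background'."""
    r, g, b = pixel
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    if lightness <= HAIR_LUMA_MAX:
        return "hair"
    if SKIN_HUE[0] <= hue <= SKIN_HUE[1] and saturation > MIN_SATURATION:
        return "skin"
    return "background"


def face_mask(frame):
    """Return a boolean mask that is True where a pixel is skin or hair."""
    return [[classify(p) != "background" for p in row] for row in frame]
```

Widening `SKIN_HUE` or raising `HAIR_LUMA_MAX` corresponds to the widening of the hue and luminance extraction ranges mentioned in the text.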

【００３７】ここで、従来、静止した背景の映像と、揺
れている被写体映像から動き部分を識別することにより
被写体部分を抽出する方法がある。しかし、携帯情報端
末装置をユーザが手で持っているため、手ぶれが起きた
際に、被写体映像とバックの映像は両方とも揺れてしま
うため、被写体部分の映像を区別することは困難であっ
た。しかし、前記方式は顔の動きベクトルを用いず、画
面上で座標の変化の影響を受けない人間の顔モデルを使
っているため手ぶれが起きても顔領域を抽出することが
できる特徴がある。次に、従来の被写体の動きベクトル
を抽出する方式を改良することで、被写体の領域を抽出
する実施形態について、図３により説明する。図３は被
写体の領域抽出に必要となる構成要素として、図１の領
域抽出部２１に代えて、方位センサ４１、変換部４２、
及び人物認識部４３を設けている。方位センサ４１は、
端末本体１が上下方向及び左右方向に移動する際の回転
角速度を検出して、その検出した回転角速度の情報を含
む電気信号を出力するセンサである。方位センサ４１
は、例えば圧電振動ジャイロが適用される。その場合、
方位センサ４１には、例えば圧電振動ジャイロを振動さ
せるためのパルス信号を生成する発振回路、及び、その
振動の変移状態を表す信号を増幅する増幅回路等の駆動
回路が備えられる。方位センサ４１は変換した電気信号
を変換部４２に出力する。There is a conventional method that extracts the subject portion by distinguishing the moving portion of a moving subject image from a stationary background image. However, because the user holds the portable information terminal device by hand, both the subject image and the background image move when camera shake occurs, which made it difficult to distinguish the subject portion. The method described above, in contrast, does not use the motion vector of the face but a human face model that is unaffected by changes in on-screen coordinates, so it can extract the face region even when camera shake occurs. Next, an embodiment that extracts the subject region by improving the conventional method of extracting the motion vector of the subject will be described with reference to FIG. 3. In FIG. 3, an azimuth sensor 41, a conversion unit 42, and a person recognition unit 43 are provided in place of the area extraction unit 21 of FIG. 1 as the components needed to extract the subject region. The azimuth sensor 41 detects the rotational angular velocity when the terminal body 1 moves in the vertical and horizontal directions, and outputs an electric signal containing information on the detected rotational angular velocity. A piezoelectric vibrating gyroscope, for example, can be used as the azimuth sensor 41. In that case, the azimuth sensor 41 is provided with drive circuitry such as an oscillation circuit that generates a pulse signal for vibrating the piezoelectric vibrating gyroscope and an amplifier circuit that amplifies a signal representing the displacement of the vibration. The azimuth sensor 41 outputs the resulting electric signal to the conversion unit 42.

【００３８】変換部４２は、方位センサ４１から出力さ
れた電気信号を端末本体１のぶれ状態に対応するぶれ信
号に変換する回路である。詳しくは、変換部４２は、方
位センサ４１から供給される電気信号を入力して、入力
された電気信号から端末本体１が回転する際の角速度を
表わす移動量データと、上下方向及び左右方向のそれぞ
れの移動方向を表わす方向データからなるぶれ信号を生
成する。一方、カメラ部４で撮影された被写体映像はカ
メラＩＦ部２５を介して人物認識部４３に入力する。前
記変換部４２で検出したぶれ方向及び、その方向の移動
量が生成するベクトルに対して前記ベクトルを打ち消す
方向に被写体映像を移動させて、ぶれ成分を取り除く。
そして、前記ぶれ成分が削除された被写体映像は人物認
識部４３で被写体の領域を特定して切り出し、映像エン
コーダ２７に入力してエンコードされる。被写体の領域
を抽出する方式として例えば、特開平７−７３２９８号
の方法により実現できる。この方法は被写体の動きを検
出して、検出した動き領域があらかじめ設定されている
顔エリアに存在するか、また前記エリア内で映像のエッ
ジを検出して顔面特徴を有するか判定することにより人
物領域を抽出することを特徴としている。The conversion unit 42 is a circuit that converts the electric signal output from the azimuth sensor 41 into a shake signal corresponding to the shake state of the terminal body 1. Specifically, the conversion unit 42 receives the electric signal supplied from the azimuth sensor 41 and generates from it a shake signal consisting of movement amount data, which represents the angular velocity at which the terminal body 1 rotates, and direction data, which represents the direction of movement in each of the vertical and horizontal directions. Meanwhile, the subject image captured by the camera unit 4 is input to the person recognition unit 43 via the camera IF unit 25. The shake direction detected by the conversion unit 42 and the movement amount in that direction define a vector, and the subject image is moved in the direction that cancels this vector to remove the shake component. The person recognition unit 43 then identifies and cuts out the subject region from the subject image from which the shake component has been removed, and the result is input to the video encoder 27 and encoded. The subject region can be extracted by, for example, the method of Japanese Patent Application Laid-Open No. 7-73298. This method detects the motion of the subject and extracts the person region by determining whether the detected motion region lies within a preset face area and, by detecting image edges within that area, whether it has facial features.
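The shake cancellation described above amounts to shifting each frame by the negative of the shake vector. The sketch below assumes that the shake signal has already been turned into a whole-pixel displacement; the conversion factor `px_per_rad` and both function names are hypothetical.

```python
def shake_vector(omega_x, omega_y, dt, px_per_rad):
    """Convert angular velocities (rad/s) over one frame time into a pixel displacement."""
    return round(omega_x * dt * px_per_rad), round(omega_y * dt * px_per_rad)


def cancel_shake(frame, shake_dx, shake_dy, fill=None):
    """Shift the frame by (-shake_dx, -shake_dy) so that the shake component is removed."""
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nx, ny = x - shake_dx, y - shake_dy
            # Pixels pushed outside the frame are dropped; vacated pixels keep `fill`.
            if 0 <= nx < width and 0 <= ny < height:
                out[ny][nx] = frame[y][x]
    return out
```

The stabilized frame would then be handed to the person recognition step, which can apply motion-based extraction without the shake being mistaken for subject motion.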

【００３９】また、本方法は被写体の動きベクトルを抽
出する方法であるため、前記の人間の顔モデルによる領
域抽出方式のように顔色を抽出する際に色の適用範囲に
制限がなく、抽出範囲も顔領域だけでなく被写体の上半
身も抽出可能である。上記顔領域が抽出されたら、その
領域に相当する顔映像を全体の画像から切り出すことに
より、図４のように背景の映像が消去された顔映像１０
１が得られる。ここで、図４で、顔映像１０１の輪郭に
接する矩形１０２で囲い、矩形の中心１０３を顔の部分
即ち顔映像１０１の中心としている。また、相手側から
送られてきて、映像用ＬＣＤ１４に表示される相手側の
顔映像は、図８３のように背景を含んだ映像か、あるい
は図５のように携帯情報端末装置と同様に相手側の情報
端末装置で相手側の顔映像１０５を抽出し、相手側の顔
映像１０５のみを送り、表示してもよい。ここで、カメ
ラ部４で撮影して、領域抽出部２１で顔映像を抽出した
結果、図６で示すように相手側に送ろうとしている顔映
像１０６は手ぶれにより画面の表示枠１１１の中央より
端に寄ってしまうことがある。そこで主制御部１１は、
領域抽出部２１に対して、顔映像１０６の中心１０４が
表示枠１１１の中央に位置する矩形の中心１０３から移
動した方向と移動距離を検出させ、その検出結果を用い
て、顔映像１０６を表示枠１１１の中央に、すなわち顔
映像１０１の位置に移動させるように制御する。表示枠
１１１の中央に戻された顔映像１０６は映像エンコーダ
２７でエンコードされ、符号化映像データは相手側情報
端末装置に送られ、相手側情報端末装置で表示される。Moreover, because this method extracts the motion vector of the subject, it is not subject to the limits on the applicable color range that arise when face color is extracted, as in the area extraction method based on the human face model described above, and it can extract not only the face region but also the upper body of the subject. Once the face region has been extracted, the face image corresponding to that region is cut out from the whole image, yielding a face image 101 from which the background has been removed, as shown in FIG. 4. In FIG. 4, the face image 101 is enclosed by a rectangle 102 that touches its outline, and the center 103 of the rectangle is taken as the center of the face portion, that is, of the face image 101. The face image of the other party that is sent from the other party and displayed on the video LCD 14 may be an image that includes the background, as shown in FIG. 83; alternatively, as shown in FIG. 5, the partner's information terminal device may extract the partner's face image 105 in the same way as the portable information terminal device and send only the face image 105 for display. As a result of shooting with the camera unit 4 and extracting the face image with the area extraction unit 21, the face image 106 to be sent to the other party may, because of camera shake, lie toward an edge of the display frame 111 of the screen rather than at its center, as shown in FIG. 6. The main control unit 11 therefore causes the area extraction unit 21 to detect the direction and distance by which the center 104 of the face image 106 has moved from the center 103 of the rectangle located at the center of the display frame 111, and, using the detection result, controls it to move the face image 106 to the center of the display frame 111, that is, to the position of the face image 101. The face image 106 returned to the center of the display frame 111 is encoded by the video encoder 27, and the encoded video data is sent to the partner information terminal device and displayed there.
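The recentering step can be sketched as follows: the rectangle touching the outline of the face region is found, its center is compared with the center of the display frame, and the difference is the displacement to apply. The function names are hypothetical, and the face region is assumed to be available as a boolean mask with at least one True pixel.

```python
def bounding_rect(mask):
    """Return (left, top, right, bottom) of the rectangle touching the face outline."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, inside in enumerate(row) if inside]
    return min(xs), min(ys), max(xs), max(ys)


def recenter_offset(mask):
    """Return the (dx, dy) that moves the face center onto the display-frame center."""
    height, width = len(mask), len(mask[0])
    left, top, right, bottom = bounding_rect(mask)
    face_cx, face_cy = (left + right) / 2, (top + bottom) / 2
    frame_cx, frame_cy = (width - 1) / 2, (height - 1) / 2
    return frame_cx - face_cx, frame_cy - face_cy
```

Because only the extracted face is moved and no background is sent, the shift does not expose a missing strip at the edge of the picture.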

【００４０】カメラ部４で被写体映像を撮影して、領域
抽出部２１で顔映像１０６を抽出し、顔映像１０６を元
の位置に移動させ、相手側情報端末装置に送信する処理
は、処理がシリーズなために映像をカメラ部４で撮影す
る周期すなわち１フレーム内の時間で済ませる必要があ
る。なお、相手側端末では顔映像１０６が送られてきた
ら、背景の景色の代わりとして、単一色の背景に前記顔
映像１０６を重ねて表示する。また、背景として壁紙な
どのパターンを表示させ、その上に顔映像１０６を重ね
て表示させてもよい。前述のように、従来は、図８４の
ような生の映像について手ぶれを修正すると図８７のよ
うに背景の端が欠落して映像が欠けた部分１２３が生じ
る。ところが、この実施形態によると、背景の映像は相
手に送らず、顔映像のみを対象としていて、送り側で手
ぶれが起きても顔画像を画面の中央に移動させてから相
手側に送るために、相手側では安定した映像を見ること
ができる。そして、相手側に送る映像は顔映像のみなの
でバックを含めた映像を送る方式よりデータの伝送量が
少なくて済む利点がある。ここで、相手側に顔映像を送
るには、顔映像を、例えば１６ビット１６ビットのブロ
ックに分解して、ブロック単位で相手側に送るとよい。
その際、個々のブロックには画面の表示枠１１１におい
て、そのブロックが位置しているアドレスを付加して送
り、相手側では指定されたアドレスにそのブロックの内
容を表示することで、全体として顔映像を表示する。上
記の通り、顔領域を抽出する際に、既に抽出された領域
はブロック単位に分割されているので、これらのブロッ
クを順番に相手側に送信するとよい。The processing of capturing the subject image with the camera unit 4, extracting the face image 106 with the area extraction unit 21, moving the face image 106 back to its original position, and sending it to the partner information terminal device is performed serially, so it must be completed within the period at which the camera unit 4 captures video, that is, within one frame time. When the face image 106 arrives, the partner terminal displays it superimposed on a single-color background in place of the background scenery. Alternatively, a pattern such as wallpaper may be displayed as the background with the face image 106 superimposed on it. As described earlier, when camera shake is corrected conventionally on a raw image such as that of FIG. 84, the edge of the background is lost and a portion 123 with missing video appears, as shown in FIG. 87. In this embodiment, by contrast, the background video is not sent to the other party and only the face image is handled; even if camera shake occurs on the sending side, the face image is moved to the center of the screen before being sent, so the other party sees a stable image. In addition, because only the face image is sent, less data needs to be transmitted than in a scheme that sends video including the background. To send the face image, it is advisable to divide it into blocks of, for example, 16 bits × 16 bits and send it block by block. Each block is sent together with the address at which it is located within the display frame 111 of the screen, and the partner side displays the contents of each block at the specified address, thereby displaying the face image as a whole. As noted above, the extracted region has already been divided into blocks when the face region is extracted, so these blocks can simply be sent to the other party in order.
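The block-wise transfer with per-block addresses can be sketched as follows. The sketch treats a block as a `size` × `size` square of pixels and uses the block's top-left pixel coordinate as its address; both are assumptions, since the text only speaks of 16-bit × 16-bit blocks and of an address within the display frame.

```python
def to_blocks(image, mask, size=16):
    """Return [((bx, by), block), ...] for every block that contains face pixels."""
    height, width = len(image), len(image[0])
    blocks = []
    for by in range(0, height, size):
        for bx in range(0, width, size):
            rows = range(by, min(by + size, height))
            cols = range(bx, min(bx + size, width))
            # Blocks made up only of background are never transmitted.
            if any(mask[y][x] for y in rows for x in cols):
                blocks.append(((bx, by), [[image[y][x] for x in cols] for y in rows]))
    return blocks


def paint(blocks, width, height, background=0):
    """Receiver side: draw each block at its address over a single-color background."""
    canvas = [[background] * width for _ in range(height)]
    for (bx, by), block in blocks:
        for dy, row in enumerate(block):
            canvas[by + dy][bx:bx + len(row)] = row
    return canvas
```

In this simplified form a transmitted block carries every pixel inside it, including any background pixels that share the block with the face.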

【００４１】また、この実施形態では、ユーザ側の携帯
情報端末装置と相手側の情報端末装置の通信回線は一部
無線回線で接続されているが、この実施形態は通信回線
の実施形態が有線でも無線でも区別することなく実現可
能である。（実施形態２）次に、本発明に係る携帯情報端末装置の
実施形態２について説明する。図７は、実施形態２の携
帯情報端末装置の構成を示すブロック図である。この図
７に示すものは、図１に示す実施形態１の構成に対し
て、端末本体１に、画像メモリ４５、顔映像メモリ部４
６、ズーミングボタン４７、及びズーミング駆動部４８
を設けている点が異なる。画像メモリ４５は、カメラＩ
Ｆ部２５から与えられた映像データをフレーム単位で、
蓄積するメモリである。形状抽出部２０は画像メモリ４
５から与えられた映像データからフレーム単位で顔形状
を抽出する。顔映像メモリ部４６は、形状抽出部２０と
映像エンコーダ２７からなる領域抽出部５１で生成され
た顔映像を所定周期で記憶させておくメモリである。こ
の実施形態２においては、カメラＩＦ部２５は、カメラ
用端子２６を介して接続されたカメラ部４から出力され
る映像信号を取込み、デジタル化して映像データを得
て、この映像データをフレーム単位で画像メモリ４５に
与える。画像メモリ４５に蓄積された映像データは映像
エンコーダ２７と映像用ＬＣＤ制御回路部１３に与えら
れる。In this embodiment, part of the communication line between the user's portable information terminal device and the partner's information terminal device is a wireless line, but the embodiment can be implemented regardless of whether the communication line is wired or wireless. (Embodiment 2) Next, Embodiment 2 of the portable information terminal device according to the present invention will be described. FIG. 7 is a block diagram showing the configuration of the portable information terminal device of Embodiment 2. The configuration shown in FIG. 7 differs from that of Embodiment 1 shown in FIG. 1 in that the terminal body 1 is provided with an image memory 45, a face image memory unit 46, a zooming button 47, and a zooming drive unit 48. The image memory 45 stores the video data supplied from the camera IF unit 25 frame by frame. The shape extraction unit 20 extracts a face shape, frame by frame, from the video data supplied from the image memory 45. The face image memory unit 46 stores, at a predetermined cycle, the face image generated by the area extraction unit 51, which consists of the shape extraction unit 20 and the video encoder 27. In Embodiment 2, the camera IF unit 25 takes in the video signal output from the camera unit 4 connected via the camera terminal 26, digitizes it to obtain video data, and supplies this video data to the image memory 45 frame by frame. The video data stored in the image memory 45 is supplied to the video encoder 27 and the video LCD control circuit unit 13.

【００４２】一方、画像メモリ４５から輝度信号及び色
相信号５０が形状抽出部２０に与えられる。形状抽出部
２０は入力した輝度信号及び色相信号５０を用いて人間
顔モデルから顔部分の形状を切り出す。映像エンコーダ
２７は、画像メモリ４５から与えられるデジタル映像５
４と形状抽出部２０から与えられる顔形状１１２とを用
いてＭＰＥＧ４方式によるオブジェクト符号化映像デー
タを得る。映像エンコーダ２７は、符号化映像データを
映像デコーダ１２や多重分離部１７へと与える。領域抽
出部５１で生成された顔映像は所定周期で更新しながら
顔映像メモリ部４６に蓄積される。また、この実施形態
２においては、操作入力制御回路部２９には、ズーミン
グボタン４７、タッチパネル３０、スクロールダイヤル
３１、操作ボタン３２及び電源ボタン３４がそれぞれ接
続されている。操作入力制御回路部２９は、これらズー
ミングボタン４７、タッチパネル３０、スクロールダイ
ヤル３１、操作ボタン３２及び電源ボタン３４でのユー
ザの指示操作を受付け、その指示操作の内容を主制御部
１１に通知する。ズーミングボタン４７はユーザの指示
操作を受け、ズーミング駆動部４８を制御してカメラ部
４のレンズ部分をズーミングする。Meanwhile, the luminance signal and hue signal 50 are supplied from the image memory 45 to the shape extraction unit 20. Using the input luminance signal and hue signal 50, the shape extraction unit 20 cuts out the shape of the face portion based on the human face model. The video encoder 27 uses the digital video 54 supplied from the image memory 45 and the face shape 112 supplied from the shape extraction unit 20 to obtain object-coded video data in the MPEG-4 format, and supplies the encoded video data to the video decoder 12 and the multiplexing/demultiplexing unit 17. The face image generated by the area extraction unit 51 is stored in the face image memory unit 46 and updated at a predetermined cycle. In Embodiment 2, the zooming button 47, the touch panel 30, the scroll dial 31, the operation button 32, and the power button 34 are each connected to the operation input control circuit unit 29. The operation input control circuit unit 29 accepts the user's instruction operations on the zooming button 47, the touch panel 30, the scroll dial 31, the operation button 32, and the power button 34, and notifies the main control unit 11 of their content. When the zooming button 47 is operated by the user, the zooming drive unit 48 is controlled to zoom the lens portion of the camera unit 4.

【００４３】この実施形態２における、端末本体１及び
カメラ部４の外観を図８に示す。この図８に示すもの
は、図２に示すものに対して、ズーミングボタン４７が
設けられている点が異なる。この実施形態２において
も、実施形態１の場合と同様に、動作モードとして、電
話モードと、テレビ電話モードの２つのモードがある
が、テレビ電話モードの際に主制御部１１はカメラ向き
センサ部２８からの検出信号を調べ、カメラ部４が端末
本体１に装着され、映像用ＬＣＤ１４及びテキスト用Ｌ
ＣＤ１６の表示面が設けられた側と同じ方向を撮影して
いる状態ならば、以下の処理を行う。カメラ部４は端末
本体１を持っているユーザの顔を撮影している状態であ
り、映像用ＬＣＤ１４に相手側の顔を表示させている。
カメラ部４は図８２のような映像を撮影して、画像メモ
リ４５に蓄積した後、形状抽出部２０で図９のような顔
形状１１２が抽出される。顔形状１１２とデジタル映像
５４とを用いて映像エンコーダ２７でＭＰＥＧ４による
オブジェクト符号化データすなわち図１０（ａ）に示す
ような顔映像１２１が生成される。ここで、顔形状の切
り出し方法は従来から色々提案されている。例えば、実
施形態１で説明したような特開平６−３０３１８の方法
によって顔形状を切り出すことができる。FIG. 8 shows the external appearance of the terminal body 1 and the camera unit 4 in Embodiment 2. The device shown in FIG. 8 differs from that shown in FIG. 2 in that the zooming button 47 is provided. As in Embodiment 1, Embodiment 2 has two operation modes, the telephone mode and the videophone mode. In the videophone mode, the main control unit 11 checks the detection signal from the camera direction sensor unit 28 and, if the camera unit 4 is attached to the terminal body 1 and is shooting in the same direction as the side on which the display surfaces of the video LCD 14 and the text LCD 16 are provided, performs the following processing. In this state, the camera unit 4 is shooting the face of the user holding the terminal body 1, and the face of the other party is displayed on the video LCD 14. The camera unit 4 captures video such as that shown in FIG. 82, which is stored in the image memory 45, after which the shape extraction unit 20 extracts a face shape 112 such as that shown in FIG. 9. Using the face shape 112 and the digital video 54, the video encoder 27 generates MPEG-4 object-coded data, that is, a face image 121 such as that shown in FIG. 10(a). Various methods of cutting out a face shape have been proposed. For example, the face shape can be cut out by the method of Japanese Patent Application Laid-Open No. 6-30318 described in Embodiment 1.

【００４４】また、この実施形態２においても、実施形
態１の場合と同様にして、従来の被写体の動きベクトル
を抽出する方式を改良することで、被写体の領域を抽出
することができる。この場合の構成を図１１に示す。即
ち、図１１は被写体の領域抽出に必要となる構成要素と
して、図７の形状抽出部２０に代えて、方位センサ４
１、変換部４２、及び人物認識部４３を設けたものであ
るが、被写体の領域を抽出する動作は実施形態１の場合
と同様である。ここで、図９で顔形状１１２の輪郭に接
する矩形１１３で囲い、矩形の中心を顔映像１１２の中
心１１４としている。また、相手側から送られてきて、
映像用ＬＣＤ１４に表示される相手側の顔映像は、図８
３のように背景を含んだ映像か、あるいは図５のように
相手側の情報端末で顔映像１０５を抽出し、相手側の顔
映像１０５のみを送り、表示してもよい。ここで、カメ
ラ部４で撮影して領域抽出部５１で顔映像を抽出した結
果、図１２で示すように相手側に送ろうとしている顔映
像が手ぶれあるいはユーザの体の揺れにより画面の表示
枠１１１の中心より端に寄ってしまい顔映像１２２のよ
うになってしまうことがある。そこで主制御部１１は領
域抽出部５１に対して顔映像１２２を表示枠１１１の中
心に、すなわち顔映像１２１の位置に移動させる。すな
わち顔映像１２２の中心が１１７で顔映像１２１の中心
が１１４なので、顔映像１２２は中心１１７から中心１
１４に引かれたベクトルの方向に移動したことになる。
顔映像１２１は相手側端末に送られ、相手側端末で表示
される。In Embodiment 2 as well, as in Embodiment 1, the subject region can be extracted by improving the conventional method of extracting the motion vector of the subject. FIG. 11 shows the configuration in this case. In FIG. 11, an azimuth sensor 41, a conversion unit 42, and a person recognition unit 43 are provided in place of the shape extraction unit 20 of FIG. 7 as the components needed to extract the subject region; the operation of extracting the subject region is the same as in Embodiment 1. In FIG. 9, the face shape 112 is enclosed by a rectangle 113 that touches its outline, and the center of the rectangle is taken as the center 114 of the face image 112. The face image of the other party that is sent from the other party and displayed on the video LCD 14 may be an image that includes the background, as shown in FIG. 83; alternatively, as shown in FIG. 5, the partner's information terminal may extract the face image 105 and send only the face image 105 for display. As a result of shooting with the camera unit 4 and extracting the face image with the area extraction unit 51, the face image to be sent to the other party may, because of camera shake or swaying of the user's body, lie toward an edge of the display frame 111 of the screen rather than at its center, as with the face image 122 shown in FIG. 12. The main control unit 11 therefore causes the area extraction unit 51 to move the face image 122 to the center of the display frame 111, that is, to the position of the face image 121. Since the center of the face image 122 is 117 and the center of the face image 121 is 114, the face image 122 is moved in the direction of the vector drawn from the center 117 to the center 114. The face image 121 is sent to the partner terminal and displayed there.

【００４５】以上述べたように、図８４のような生の映
像について表示枠１１１の範囲だけの映像を撮影してい
ると、顔映像が画面の中心に位置していない場合、顔映
像を画面の中心に移動すると必ず図８７のように背景の
端が欠落してしまう。しかし、本方式によると撮影範囲
が表示枠１１１と等しい場合でも、図１３のように顔映
像１２２は表示枠１１１の内部に接する範囲１２２−１
〜１２２−６で動いても顔映像１２２は欠落することが
ない。すなわち、表示枠１１１内部では、顔映像１２２
は自由に動かすことができるといえる。言い換えれば従
来の背景画像を付加する方式より撮影画像の揺れに強い
といえる。ここで、本方式は相手側に送る映像は顔映像
のみなので背景を含めた映像を送る方式よりデータの伝
送量が少なくて済む利点がある。背景の映像に配分して
いた伝送レートを顔映像の伝送にまわすことができ、映
像エンコーダ２７で顔映像をより精細な映像でエンコー
ドすることができる。また、背景として配分していた伝
送レートを音声コーデック２３の伝送レートを上げるこ
とに振り向ければ、より高音質な音声でエンコードする
ことが可能である。また、上記背景を相手に送らなくて
済むため、伝送容量に余裕がある場合、図１４のよう
に、ユーザ側から定型のテキスト文１３０も顔映像１２
２と一緒に相手に送り、相手の情報端末装置で表示して
もよい。この場合、顔映像１２２を領域抽出部５１で生
成した後、画面の中央でなく所定の位置（図１４では右
側）に移動して、左側の空いた領域にテキスト文１３０
を表示させてもよい。As described above, when a raw image such as that of FIG. 84 is captured only over the range of the display frame 111 and the face image is not at the center of the screen, moving the face image to the center of the screen always causes the edge of the background to be lost, as shown in FIG. 87. With the present method, however, even when the shooting range equals the display frame 111, the face image 122 is never cut off as long as it moves within the range 122-1 to 122-6 in which it touches the inside of the display frame 111, as shown in FIG. 13. In other words, the face image 122 can be moved freely within the display frame 111, which makes this method more robust to shaking of the captured image than the conventional method that includes the background image. Because only the face image is sent to the other party, this method also requires less data to be transmitted than a scheme that sends video including the background. The transmission rate that would have been allocated to the background video can be used for the face image, allowing the video encoder 27 to encode the face image in finer detail. Alternatively, if that transmission rate is used to raise the transmission rate of the audio codec 23, the audio can be encoded at higher quality. Furthermore, since the background need not be sent, when there is spare transmission capacity a fixed-form text message 130 may be sent from the user side together with the face image 122 and displayed on the partner's information terminal device, as shown in FIG. 14. In this case, after the area extraction unit 51 generates the face image 122, the face image may be moved to a predetermined position other than the center of the screen (the right side in FIG. 14), and the text message 130 may be displayed in the vacant area on the left.

【００４６】空いた領域は背景がないためバックの映像
に影響されることなく、テキスト文１３０は鮮明に表示
される。図１４はテキスト文１３０を表示させていた
が、任意の図形、アイコンあるいはこのシステムを運営
しているインフラ業者の宣伝文なども表示させてもよ
い。他に、ユーザが携帯型情報端末装置をプライバシー
に関わる場所で使用する場合、相手に背景画像を送りた
くないことがある。この実施形態の方式は背景映像を削
除するので、携帯型情報端末装置の使用場所を気にする
必要がなく、自由にどのような場所でも使用できる。ま
た従来の方式のように、交通の激しい場所などで顔映像
と背景を一緒に相手に送ると、相手が顔映像を確認しよ
うとしても、背景の激しい動きに影響されて、顔映像が
見づらくなることがある。しかし、本方式は背景を相手
に送らないため、顔映像のみを鮮明に確認することがで
きる。また、ズーミングボタン４７を押下すると映像用
ＬＣＤ１４に、図１０（ａ）のように顔映像１２１が表
示される。そして、ズーミングボタン４７を更に押下す
ると、ズーミング駆動部２３が動作してカメラ部４のレ
ンズ７０をズーミングして図１０（ｂ）のように顔映像
１２１が縮小される。また、図１０（ｃ）のように拡大
することもでき、顔映像１２１をユーザの希望する大き
さに設定できる。そして、ズーミング動作が終了する
と、設定した顔映像１２１が相手に送られ、映像用ＬＣ
Ｄ１４に相手側の顔映像１０５が表示される。また、相
手と通信を行っていない間は、映像用ＬＣＤ１４には常
時、ユーザの顔映像１２１を表示させておいて、上記の
ようにズーミング動作を行えるようにしておいてもよ
い。Since the vacant area has no background, the text 130 is displayed clearly without being affected by a background image. Although FIG. 14 shows the text 130 being displayed, any graphic, icon, or advertisement of the infrastructure operator running this system may be displayed instead. In addition, when the user uses the portable information terminal device in a place where privacy is a concern, the user may not want to send the background image to the other party. Because the method of this embodiment removes the background video, the user need not worry about where the portable information terminal device is used and can use it freely in any place. Furthermore, with the conventional method, when the face image and the background are sent together from a place with heavy traffic, the face image may be hard to see because of the intense motion in the background, even when the other party tries to check it. With this method, however, the background is not sent to the other party, so the face image alone can be seen clearly. When the zooming button 47 is pressed, the face image 121 is displayed on the video LCD 14 as shown in FIG. 10(a). When the zooming button 47 is pressed further, the zooming drive unit 23 operates to zoom the lens 70 of the camera unit 4, and the face image 121 is reduced as shown in FIG. 10(b). The image can also be enlarged as shown in FIG. 10(c), so that the face image 121 can be set to the size the user desires. When the zooming operation ends, the set face image 121 is sent to the other party, and the other party's face image 105 is displayed on the video LCD 14. While no communication with the other party is in progress, the user's face image 121 may be displayed on the video LCD 14 at all times so that the zooming operation described above can be performed.

【００４７】（実施形態３）次に、本発明に係る携帯情
報端末装置の実施形態３について説明する。図１５は、
実施形態３の携帯情報端末装置の構成を示すブロック図
である。この図１５に示すものは、図７に示す実施形態
２の構成に対して、端末本体１において、形状抽出部２
０及び顔映像メモリ部４６の代わりに、トリミング形状
蓄積部５５及びトリミング映像メモリ部５６を設け、更
に、カメラ向きセンサ部２８の代わりに、赤外線ＬＥＤ
５７及び瞳孔抽出部５８を設けるとともに、赤外線カメ
ラ用端子２２を介して赤外線カメラ５を接続したもので
ある。トリミング形状蓄積部５５は複数のトリミング形
状を蓄積していて、その中から所定の形状を選択して読
み出して映像エンコーダ２７に印加する。トリミング映
像メモリ部５６は領域抽出部５１で生成されたトリミン
グ映像を、所定周期で記憶させておくメモリである。こ
の実施形態３においては、カメラＩＦ部２５は、カメラ
用端子２６を介して接続されたカメラ部４から出力され
る映像信号を取込み、デジタル化して映像データを得
て、この映像データをフレーム単位で画像メモリ４５に
与える。画像メモリ４５に蓄積されたデジタル映像５４
は映像エンコーダ２７と映像用ＬＣＤ制御回路部１３に
与えられる。映像エンコーダ２７は、画像メモリ４５か
ら与えられるデジタル映像５４とトリミング形状蓄積部
５５から与えられるトリミング形状２００を用いてＭＰ
ＥＧ４方式によるオブジェクト符号化映像データを得
る。(Embodiment 3) Next, Embodiment 3 of the portable information terminal device according to the present invention will be described. FIG. 15 is a block diagram showing the configuration of the portable information terminal device of Embodiment 3. Compared with the configuration of Embodiment 2 shown in FIG. 7, the configuration shown in FIG. 15 provides, in the terminal body 1, a trimming shape storage unit 55 and a trimming video memory unit 56 in place of the shape extraction unit 20 and the face image memory unit 46, and further provides an infrared LED 57 and a pupil extraction unit 58 in place of the camera direction sensor unit 28, with an infrared camera 5 connected via an infrared camera terminal 22. The trimming shape storage unit 55 stores a plurality of trimming shapes, from which a predetermined shape is selected, read out, and supplied to the video encoder 27. The trimming video memory unit 56 is a memory that stores, at a predetermined cycle, the trimmed video generated by the area extraction unit 51. In Embodiment 3, the camera IF unit 25 takes in the video signal output from the camera unit 4 connected via the camera terminal 26, digitizes it to obtain video data, and supplies the video data to the image memory 45 frame by frame. The digital video 54 stored in the image memory 45 is supplied to the video encoder 27 and the video LCD control circuit unit 13. The video encoder 27 uses the digital video 54 supplied from the image memory 45 and the trimming shape 200 supplied from the trimming shape storage unit 55 to obtain object-coded video data according to the MPEG-4 scheme.

【００４８】尚トリミング形状蓄積部５５と映像エンコ
ーダ２７を含めて領域抽出部５１と称する。映像エンコ
ーダ２７は、符号化映像データを映像デコーダ１２や多
重分離部１７へと与える。領域抽出部５１で生成された
トリミング映像は所定周期で更新しながらトリミング映
像メモリ部５６に蓄積される。赤外線ＬＥＤ５７はユー
ザの顔部に赤外線を照射するためのＬＥＤである。赤外
線カメラ５はユーザの顔部から反射してきた赤外線を入
力するカメラである。瞳孔抽出部５８は赤外線カメラ５
で撮影されたユーザの顔部から瞳孔部分を抽出する。こ
の実施形態３においては、ズーミング駆動部４８は主制
御部１１からの制御によりカメラ部４及び赤外線カメラ
５を同時にズーミングする機構である。また、ズーミン
グボタン４７は手動によりカメラ部４及び赤外線カメラ
５を同時にズーミングの指示を受けるためのものであ
る。この実施形態３における、端末本体１、カメラ部４
及び赤外線カメラ５の外観を図１６に示す。この図１６
に示すものは、図８に示すものに対して、赤外線ＬＥＤ
５７及び赤外線カメラ５が設けられている点が異なる。
図１６に示すように、赤外線カメラ５は、カメラ部４と
ともに、操作ボタン３２、電源ボタン３４及びズーミン
グボタン４７が設けられているのと同じ筐体側面に取り
付けられている。赤外線ＬＥＤ５７は筐体前面で映像用
ＬＣＤ１４の上部に取り付けられている。The trimming shape storage unit 55 and the video encoder 27 are collectively referred to as the area extraction unit 51. The video encoder 27 supplies the coded video data to the video decoder 12 and the demultiplexer 17. The trimmed video generated by the area extraction unit 51 is stored in the trimming video memory unit 56 while being updated at a predetermined cycle. The infrared LED 57 is an LED for irradiating the user's face with infrared light. The infrared camera 5 is a camera that receives the infrared light reflected from the user's face. The pupil extraction unit 58 extracts the pupil portions from the image of the user's face captured by the infrared camera 5. In Embodiment 3, the zooming drive unit 48 is a mechanism that zooms the camera unit 4 and the infrared camera 5 simultaneously under the control of the main control unit 11. The zooming button 47 accepts a manual instruction to zoom the camera unit 4 and the infrared camera 5 simultaneously. FIG. 16 shows the external appearance of the terminal body 1, the camera unit 4, and the infrared camera 5 in Embodiment 3. The configuration shown in FIG. 16 differs from that shown in FIG. 8 in that the infrared LED 57 and the infrared camera 5 are provided. As shown in FIG. 16, the infrared camera 5 is mounted, together with the camera unit 4, on the same side face of the housing as the operation buttons 32, the power button 34, and the zooming button 47. The infrared LED 57 is mounted on the front of the housing, above the video LCD 14.

【００４９】次にこの動作について説明する。カメラ部
４は端末本体１を持っている人の顔を撮影している状態
であり、映像用ＬＣＤ１４に相手側の顔を表示させてい
るものとする。カメラ部４は図１７（ａ）のように表示
枠１１１内でユーザの顔と背景を含んだ映像を撮影し、
画像メモリ４５はデジタル映像５４を出力している。図
１７（ａ）はユーザの顔部分が中央より左下に寄ってい
る。ここで、ユーザの顔部分で特徴点を抽出する。この
実施形態３では特徴点としてユーザの眼球部分を抽出し
ている（詳細は後述）。そして、トリミング形状蓄積部
５５からトリミング形状２００を読み出す。トリミング
形状２００を、前記特徴点を手がかりにして顔部分がは
み出さないように、図１７（ｂ）のようにデジタル映像
５４の上に配置する。そして、映像エンコーダ２７では
トリミング形状２００とデジタル映像５４を使用し、Ｍ
ＰＥＧ４によるオブジェクト符号化データすなわち図１
８（ａ）のトリミング映像２０１を生成する。そして、
更に映像エンコーダ２７では図１８（ｂ）のようにトリ
ミング映像２０１を表示枠１１１の中央に移動して、中
央に移動したトリミング映像２０３のみを相手側に伝送
する。Next, the operation will be described. Assume that the camera unit 4 is photographing the face of the person holding the terminal body 1 and that the other party's face is displayed on the video LCD 14. The camera unit 4 captures an image containing the user's face and the background within the display frame 111 as shown in FIG. 17(a), and the image memory 45 outputs the digital video 54. In FIG. 17(a), the user's face is shifted toward the lower left of center. Feature points are then extracted from the user's face. In Embodiment 3, the user's eyeballs are extracted as the feature points (details are given later). Next, the trimming shape 200 is read from the trimming shape storage unit 55. Using the feature points as a guide, the trimming shape 200 is placed on the digital video 54 as shown in FIG. 17(b) so that the face does not protrude from it. The video encoder 27 then uses the trimming shape 200 and the digital video 54 to generate MPEG-4 object-coded data, that is, the trimmed video 201 of FIG. 18(a). The video encoder 27 further moves the trimmed video 201 to the center of the display frame 111 as shown in FIG. 18(b), and transmits only the centered trimmed video 203 to the other party.
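
The cut-out-and-center step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a small 2D list of pixel values, the trimming shape is an axis-aligned rectangle, and the helper names `crop` and `paste_centered` are hypothetical. It does not model the MPEG-4 object coding performed by the video encoder 27.

```python
def crop(image, left, top, right, bottom):
    """Cut the region [top:bottom) x [left:right) out of a 2D list image."""
    return [row[left:right] for row in image[top:bottom]]


def paste_centered(patch, frame_w, frame_h, fill=0):
    """Place patch in the middle of a frame filled with a single value."""
    patch_h, patch_w = len(patch), len(patch[0])
    x0 = (frame_w - patch_w) // 2
    y0 = (frame_h - patch_h) // 2
    frame = [[fill] * frame_w for _ in range(frame_h)]
    for r, row in enumerate(patch):
        frame[y0 + r][x0:x0 + patch_w] = row
    return frame


# Hypothetical 6x4 frame with the "face" (value 9) in the lower-left corner.
image = [[0] * 6 for _ in range(4)]
for y in (2, 3):
    for x in (0, 1):
        image[y][x] = 9

trimmed = crop(image, left=0, top=2, right=2, bottom=4)
centered = paste_centered(trimmed, frame_w=6, frame_h=4)
```

The `fill` value stands in for the single-color background that the receiving terminal is described as displaying behind the trimmed video.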

【００５０】相手側端末ではトリミング映像２０３が送
られてきたら、背景の景色の代わりとして単一色の背景
にトリミング映像２０３を重ねて表示する。また、背景
として壁紙などのパターンを表示させ、その上にトリミ
ング映像２０３を重ねて表示させてもよい。また、別個
に携帯情報端末装置に蓄積してある背景画像を相手側に
送り、相手側で送られてきた顔映像と背景画像を重ねて
表示してもよい。尚、トリミング形状蓄積部５５に格納
されているトリミング形状２００は、図１９（ａ）のよ
うに矩形、同図（ｂ）のようにひし形など複数の形状が
格納されている。そして、予め映像用ＬＣＤ１４に複数
のトリミング形状２００を表示させて、タッチパネル３
０の操作により、その中から希望のトリミング形状２０
０を選択しておくようにすればよい。図２０はユーザの
眼球の位置を検出して、検出した位置からトリミング形
状２００を確定する方法を説明するための図である。ユ
ーザの顔面に対して赤外線ＬＥＤ５７から赤外線が照射
される。赤外線はユーザの顔面から反射して赤外線カメ
ラ５で取り入れられる。赤外線はユーザの眼底の網膜で
は可視光を吸収して赤外線を反射するため、赤外線カメ
ラ５の映像信号２１０は、瞳孔部分２０４、２０５だけ
が非常に高輝度で撮像される。When the trimmed video 203 arrives, the other party's terminal displays it superimposed on a single-color background in place of the background scenery. Alternatively, a pattern such as wallpaper may be displayed as the background with the trimmed video 203 superimposed on it. As a further alternative, a background image stored separately in the portable information terminal device may be sent to the other party, and the other party's terminal may display the received face image superimposed on that background image. The trimming shape storage unit 55 stores a plurality of trimming shapes 200, such as a rectangle as shown in FIG. 19(a) and a rhombus as shown in FIG. 19(b). A plurality of trimming shapes 200 are displayed on the video LCD 14 in advance, and the desired trimming shape 200 is selected from among them by operating the touch panel 30. FIG. 20 is a diagram for explaining a method of detecting the positions of the user's eyeballs and determining the trimming shape 200 from the detected positions. Infrared light is emitted from the infrared LED 57 toward the user's face. The infrared light is reflected from the user's face and captured by the infrared camera 5. Because the retina at the fundus of the user's eye absorbs visible light and reflects infrared light, only the pupil portions 204 and 205 are imaged with very high luminance in the video signal 210 of the infrared camera 5.
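
A minimal sketch of how such high-luminance pupil regions might be isolated by thresholding is given below. It assumes the infrared frame is a 2D list of luminance values and that the two pupils lie on either side of the mean column of the bright pixels; the function names, the threshold of 200, and the left/right split are illustrative assumptions rather than the method of the pupil extraction unit 58.

```python
def find_bright_pixels(frame, threshold):
    """Return (row, col) of every pixel whose luminance exceeds threshold."""
    return [(r, c)
            for r, row in enumerate(frame)
            for c, value in enumerate(row)
            if value > threshold]


def centroid(pixels):
    """Mean (row, col) of a non-empty list of pixel coordinates."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def locate_pupils(frame, threshold):
    """Split bright pixels into left/right groups and return their centroids.

    Returns None when two groups cannot be formed, for example when
    nothing in the frame exceeds the threshold.
    """
    bright = find_bright_pixels(frame, threshold)
    if len(bright) < 2:
        return None
    mid_col = sum(c for _, c in bright) / len(bright)
    left = [p for p in bright if p[1] < mid_col]
    right = [p for p in bright if p[1] >= mid_col]
    if not left or not right:
        return None
    return centroid(left), centroid(right)


# Hypothetical 7x3 infrared frame: skin is dim, the two pupils are bright.
frame = [
    [10, 12, 11, 10, 12, 11, 10],
    [11, 250, 12, 10, 11, 245, 12],
    [10, 12, 11, 10, 12, 11, 10],
]
pupils = locate_pupils(frame, threshold=200)
```

The split at the mean column is a deliberate simplification; a single wide bright blob would be misread as two pupils, so a real detector would need connected-component grouping.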

【００５１】瞳孔部分２０４、２０５の輝度は、周囲の
顔の皮膚部分の反射光量とは比較にならないほど大き
く、適切な閾値を設定することにより瞳孔部分２０４、
２０５を２値化して切り出すことができる。図２０にお
いて、ｇとｈの間すなわちａはユーザの両眼の間隔にな
る。そして、ｇとｈの中間２０９が鼻の位置に相当す
る。中間２０９から左右にｂだけトリミング幅を設定す
る。ｂはマージンを持って統計的に人間の顔について平
均的な値を決めておく。また、両眼の高さの位置２０８
から上方向に統計的な値ｄ（頭頂を越えた値）、下方向
にｃ（顎を越えた値）を設定して、トリミング形状２０
０が決定される。トリミング形状２００は周囲の背景と
一緒にユーザの顔部分を含んでいることになる。そし
て、トリミング形状２００（内部の映像を含んでいる）
の重心２０６を表示枠１１１の中心２０７に移動させる
ことによりトリミング映像２０３が表示枠１１１の中心
に移動したことになる。図２１は、２人の人物が並んだ
場合に、眼球の位置を検出する方法を説明するための図
である。前記と同様に眼球から反射した赤外線を閾値で
２値化して眼球の位置を決定する。ｇ、ｈ、ｉ、及びｊ
が眼球の位置になる。そして最も左側の眼球の位置ｇか
ら左方向に統計的な設定値ａをとる。同様に最も右側の
眼球の位置ｊから右方向に統計的な設定値ａをとる。ま
た、上側にある眼球ｉ、ｊから上方向にｃ、下側にある
眼球ｇ、ｈから下方向にｅを取ることで、トリミング形
状２００が決定される。The luminance of the pupil portions 204 and 205 is incomparably greater than the amount of light reflected from the surrounding facial skin, so the pupil portions 204 and 205 can be binarized and cut out by setting an appropriate threshold. In FIG. 20, the interval between g and h, that is, a, is the distance between the user's eyes. The midpoint 209 between g and h corresponds to the position of the nose. A trimming width of b is set to the left and to the right of the midpoint 209. The value b is determined in advance as a statistically average value for human faces, with a margin. In addition, a statistical value d (a value reaching beyond the top of the head) is set upward from the eye-height position 208, and a value c (a value reaching beyond the chin) is set downward, which determines the trimming shape 200. The trimming shape 200 thus contains the user's face together with some of the surrounding background. Moving the center of gravity 206 of the trimming shape 200 (including the image inside it) to the center 207 of the display frame 111 moves the trimmed video 203 to the center of the display frame 111. FIG. 21 is a diagram for explaining a method of detecting eyeball positions when two persons are side by side. As before, the infrared light reflected from the eyeballs is binarized with a threshold to determine the eyeball positions; g, h, i, and j are the eyeball positions. A statistical set value a is taken leftward from the leftmost eyeball position g. Similarly, the statistical set value a is taken rightward from the rightmost eyeball position j. The trimming shape 200 is then determined by taking c upward from the upper eyeballs i and j and e downward from the lower eyeballs g and h.
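
The rectangle construction described for FIG. 20 and FIG. 21 can be written out as a short sketch. The symbols g, h, b, c, and d follow the text; the `Rect` type, the function names, the pixel values, and the 176x144 frame size are assumptions made for illustration. Image coordinates are taken as (x, y) with y increasing downward.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    right: float
    bottom: float

    @property
    def center(self):
        """Center of gravity of the rectangle."""
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


def trimming_rect(g, h, b, c, d):
    """Rectangle around one face from the pupil positions g and h.

    b: width taken to each side of the midpoint between the eyes
    d: distance upward from eye height (past the top of the head)
    c: distance downward from eye height (past the chin)
    """
    mid_x = (g[0] + h[0]) / 2
    eye_y = (g[1] + h[1]) / 2
    return Rect(mid_x - b, eye_y - d, mid_x + b, eye_y + c)


def group_trimming_rect(eyes, side, up, down):
    """Rectangle around several faces from all detected pupil positions."""
    xs = [x for x, _ in eyes]
    ys = [y for _, y in eyes]
    return Rect(min(xs) - side, min(ys) - up, max(xs) + side, max(ys) + down)


def centering_offset(rect, frame_w, frame_h):
    """Translation that moves the rectangle's center to the frame center."""
    cx, cy = rect.center
    return (frame_w / 2 - cx, frame_h / 2 - cy)


rect = trimming_rect(g=(40, 70), h=(70, 70), b=35, c=50, d=40)
offset = centering_offset(rect, frame_w=176, frame_h=144)
pair = group_trimming_rect([(30, 80), (50, 80), (100, 60), (120, 60)],
                           side=20, up=40, down=50)
```

Here `offset` is the shift applied to the trimmed video so that its center of gravity 206 lands on the center 207 of the display frame.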

【００５２】トリミング形状２００は、２人の人物の顔
面部分を含んでいることになる。そして、トリミング形
状２００の重心２０６を表示枠１１１の中心２０７に移
動させることによりトリミング映像２０３が表示枠１１
１の中心に移動したことになる。ここで、上記の方法に
よってもユーザが横を向いたり、瞬きしたりして片目だ
けの位置しか検出できないことがある。また、両目の位
置も検出できないこともある。そこで、眼球の位置の検
出については、時間的に経過している間に、片目だけの
検出位置、あるいは急に検出位置が極端に外れた場合
は、眼球位置の検出結果を削除する必要がある。例え
ば、半分だけ横を向いた場合、図２０におけるａの値が
急に小さくなるので分かる。完全に横を向けば両眼も検
出されなくなるので分かる。瞬きの場合、瞬間的に片目
あるいは両眼が検出されなくなるので分かる。また、図
２０で説明した方法は、両眼の間隔ａを検出すること
で、トリミング形状２００が一意的に決まってしまう。
そこで、トリミング形状蓄積部５５からトリミング形状
２００を読み出して、前記両眼の間隔ａから前記読み出
したトリミング形状２００に比例して拡大あるいは縮小
する必要がある。すなわち、トリミング形状蓄積部５５
に格納されているトリミング形状２００はこれからトリ
ミングすべき、元のトリミング形状が格納されているこ
とになり、前記両眼の間隔によってトリミング形状２０
０を比例して大きさを変えればよい。The trimming shape 200 thus contains the faces of the two persons. Moving the center of gravity 206 of the trimming shape 200 to the center 207 of the display frame 111 moves the trimmed video 203 to the center of the display frame 111. Even with the above method, however, only one eye may be detected when the user turns sideways or blinks, and in some cases neither eye can be detected. Therefore, if over time only one eye is detected, or the detected position suddenly deviates to an extreme degree, the eyeball detection result must be discarded. For example, when the user turns halfway to the side, this can be recognized because the value of a in FIG. 20 suddenly becomes small. When the user turns fully sideways, this can be recognized because neither eye is detected. A blink can be recognized because one or both eyes momentarily cease to be detected. Furthermore, in the method described with reference to FIG. 20, detecting the distance a between the eyes determines the trimming shape 200 uniquely. It is therefore necessary to read the trimming shape 200 from the trimming shape storage unit 55 and enlarge or reduce the read trimming shape 200 in proportion to the distance a between the eyes. That is, the trimming shape 200 stored in the trimming shape storage unit 55 is the original shape to be used for trimming, and its size is changed in proportion to the distance between the eyes.
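
The two ideas in this paragraph, discarding unreliable eye detections and scaling the stored trimming shape in proportion to the eye distance a, might be sketched as follows. The 30 percent tolerance, the function names, and the notion of a reference eye distance stored with the shape are assumptions for illustration; the text does not specify them.

```python
import math


def eye_distance(g, h):
    """Distance a between two pupil positions given as (x, y)."""
    return math.hypot(h[0] - g[0], h[1] - g[1])


def accept_detection(pupils, previous_distance, max_ratio_change=0.3):
    """Decide whether a detection result should be used.

    Rejects results with fewer or more than two pupils (blink, face turned
    fully sideways) and results whose eye distance jumps abruptly relative
    to the previous accepted value (face turned halfway).
    """
    if pupils is None or len(pupils) != 2:
        return False
    if previous_distance is None:
        return True
    change = abs(eye_distance(*pupils) - previous_distance) / previous_distance
    return change <= max_ratio_change


def scale_shape(base_width, base_height, base_eye_distance, measured_eye_distance):
    """Enlarge or reduce the stored shape in proportion to the eye distance."""
    k = measured_eye_distance / base_eye_distance
    return base_width * k, base_height * k


ok = accept_detection([(40, 70), (70, 70)], previous_distance=30.0)
one_eye = accept_detection([(40, 70)], previous_distance=30.0)
half_turn = accept_detection([(40, 70), (55, 70)], previous_distance=30.0)
size = scale_shape(70, 90, base_eye_distance=30, measured_eye_distance=45)
```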

【００５３】前述でトリミング形状２００の大きさを決
めるために両眼の間隔を検出することで、ユーザの顔の
大きさを推定していた。しかし、従来の顔領域を抽出し
て、顔の輪郭線を求める方法によって、トリミング形状
２００を決めてもよい。図２２によると、従来の方法に
よりユーザの顔形状１１２を決定し、顔形状１１２を含
むトリミング形状２００を設定してもよい。この場合、
顔形状１１２の重心とトリミング形状２００の重心２０
６を一致させるとよい。そして、本方式は顔形状を正確
に求めなくても大まかな、顔形状を求めて、その顔形状
を抱合するトリミング形状２００を重ねれば済む利点が
ある。ここで、顔形状を検出する方法として、実施形態
１及び実施形態２の場合と同様に、例えば、特開平６−
３０３１８の方法によって顔形状を切り出すことができ
る。この方法は、画面に表示された映像について色相信
号から肌色領域、輝度信号から髪の毛領域を抽出し、両
者を統合することにより人間の顔モデルに適合した顔領
域を得ている。従来静止した背景の映像と、揺れている
顔映像から動き部分を識別することにより顔部分を抽出
する方法があった。ところが、携帯型情報端末装置は手
で持っていて、手ぶれが起きた際に、顔映像と背景の映
像は両方とも揺れてしまうため、従来の方法では顔部分
の映像を区別することは困難であった。しかし、この方
式は顔の動きベクトルを用いず、画面上で座標の変化の
影響を受けない人間の顔モデルを使っているため手ぶれ
が起きても顔領域を抽出することで外郭情報として顔形
状が抽出される。In the foregoing, the size of the user's face was estimated by detecting the distance between the eyes in order to determine the size of the trimming shape 200. However, the trimming shape 200 may instead be determined by a conventional method that extracts the face region and obtains the outline of the face. As shown in FIG. 22, the user's face shape 112 may be determined by a conventional method, and a trimming shape 200 that contains the face shape 112 may be set. In this case, the center of gravity of the face shape 112 is preferably made to coincide with the center of gravity 206 of the trimming shape 200. This approach has the advantage that the face shape need not be obtained accurately: it suffices to obtain a rough face shape and overlay a trimming shape 200 that encloses it. As in Embodiments 1 and 2, the face shape can be cut out by, for example, the method of JP-A-6-30318. That method extracts a skin-color region from the hue signal and a hair region from the luminance signal of the image displayed on the screen, and integrates the two to obtain a face region that fits a human face model. Conventionally, there has been a method of extracting the face portion by distinguishing the moving portion in an image consisting of a stationary background and a moving face. With a portable information terminal device held in the hand, however, both the face image and the background image shake when camera shake occurs, so it has been difficult to distinguish the face portion with that conventional method. In contrast, this method does not use face motion vectors but uses a human face model that is unaffected by coordinate changes on the screen, so the face region can be extracted even when camera shake occurs, and the face shape is obtained as outline information.

【００５４】また、図２０ではトリミング形状２００と
して矩形による動作で説明していた。しかし、トリミン
グ形状２００として、図２３のように楕円でもかまわな
い。要するに、ユーザの顔が抱合するような形状ならど
のような形状でもかまわない。図２３で示した楕円の場
合、図２０で確定した縦方向の長さｄ＋ｃを長径に、横
方向の長さ２ｂを短径とすればよい。また、トリミング
映像２０３を相手側の情報端末装置で表示する際に、ト
リミング映像２０３の輪郭をバックの表示色と類似した
色でぼかして表示すると相手側がトリミング映像２０３
を長時間見ていても疲れが少なくなる。以上、説明した
ようにこの実施形態３の方式ではユーザの両眼の間隔を
測定して、その測定値からユーザの顔の大きさを推定し
ていた。しかし、ユーザが相手側と会話を行っていると
きは上述したように、ユーザの顔が横を向いたりする
と、両眼の間隔が変わってしまう。また、ユーザが携帯
情報端末装置自体を顔方向に近づけたり、遠ざけたりし
ても両眼の間隔が変わってしまい、どちらの動作により
間隔が変わったか区別できない。そこで、瞳孔抽出部５
８で検出した両眼の位置から、その位置の中心をユーザ
の顔の重心とし、この重心を、トリミング映像２０１が
表示画面上を移動するための基準座標として使用する。
そして、携帯情報端末装置自体が遠ざかったり近づいた
りして変化するユーザと携帯情報端末装置との距離は別
な方法で測定し、この距離に応じてトリミング映像２０
３の大きさを変えることにより、本方式はより手ぶれに
強い安定した方式となる。The description of FIG. 20 used a rectangle as the trimming shape 200. However, the trimming shape 200 may also be an ellipse, as shown in FIG. 23. In short, any shape that encloses the user's face may be used. For the ellipse shown in FIG. 23, the vertical length d + c determined in FIG. 20 may be used as the major axis and the horizontal length 2b as the minor axis. In addition, when the trimmed video 203 is displayed on the other party's information terminal device, blurring the outline of the trimmed video 203 with a color similar to the background display color reduces fatigue even when the other party watches the trimmed video 203 for a long time. As described above, the method of Embodiment 3 measures the distance between the user's eyes and estimates the size of the user's face from the measured value. However, as noted above, while the user is talking with the other party, the distance between the eyes changes when the user's face turns sideways. The distance between the eyes also changes when the user moves the portable information terminal device itself toward or away from the face, and the two causes cannot be distinguished. Therefore, the center of the positions of the two eyes detected by the pupil extraction unit 58 is taken as the center of gravity of the user's face, and this center of gravity is used as the reference coordinate for moving the trimmed video 201 on the display screen. The distance between the user and the portable information terminal device, which changes as the device moves closer or farther away, is measured by a separate method, and the size of the trimmed video 203 is changed according to this distance. This makes the method more stable and more resistant to camera shake.
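
For the elliptical trimming shape, the axes stated in the text (d + c vertically, 2b horizontally) are enough to test whether a pixel lies inside the shape. The sketch below assumes the ellipse is centered on the rectangle of FIG. 20, so its vertical center sits (c - d) / 2 below eye height; that placement and the function names are assumptions, as the text gives only the axis lengths.

```python
def ellipse_from_trimming(mid_x, eye_y, b, c, d):
    """Return (center, semi_x, semi_y) of the ellipse inscribed in the
    rectangle that spans b to each side, d above and c below eye height."""
    center = (mid_x, eye_y + (c - d) / 2)
    return center, b, (d + c) / 2


def inside_ellipse(x, y, center, semi_x, semi_y):
    """True when (x, y) lies inside or on the ellipse."""
    nx = (x - center[0]) / semi_x
    ny = (y - center[1]) / semi_y
    return nx * nx + ny * ny <= 1.0


center, semi_x, semi_y = ellipse_from_trimming(mid_x=55, eye_y=70, b=35, c=50, d=40)
at_center = inside_ellipse(55, 75, center, semi_x, semi_y)
at_corner = inside_ellipse(20, 30, center, semi_x, semi_y)  # rectangle corner
```

Pixels for which `inside_ellipse` is false would be treated as background and left out of the trimmed video.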

【００５５】図２４は、上記で述べた、携帯情報端末装
置とユーザの間の距離を測定する機能を含めたものにつ
いて、図１５より必要な構成部分のみを抜き出した構成
図である。即ち、図２４に示すものは、図１５に示すも
のの構成要素に、超音波送信部６２、及び超音波受信部
６３、トリミング形状可変ボタン６４が追加されてい
る。まず、相手と通信を行う前に、ユーザの顔をカメラ
部４で撮影して領域抽出部５１で領域抽出を行いトリミ
ング映像２０３を映像用ＬＣＤ１４に表示しておく。そ
して、ズーミングボタン４７を操作してトリミング形状
２００内部のユーザの顔を、希望する大きさに設定して
おく。同時に、トリミング形状可変ボタン６４でトリミ
ング形状蓄積部５５から読み出したトリミング形状２０
０の大きさを比例して変えて、希望の大きさに設定す
る。この際、トリミング形状２００の大きさにより、ユ
ーザの顔を抱合して、周囲の背景の面積を大きくした
り、小さくしたりして、その時、通信する相手に応じて
トリミング形状２００の大きさを変えることができる。
次に、上記でズーミングボタン４７とトリミング形状可
変ボタン６４を操作している間に、超音波送信部６２か
ら超音波をユーザの顔部分に照射し、反射した超音波を
超音波受信部６３で受信する。図２５は、超音波受信部
６３で受信した超音波を図示したものである。ａは最も
早く受信された反射波で、携帯情報端末装置から最も近
いユーザの顔からの反射波である。そこで、送信してか
ら受信されるまでの時間ｔ１により携帯情報端末装置か
らユーザまでの距離が分かる。FIG. 24 is a configuration diagram showing only the components of FIG. 15 that are needed, with the above-described function of measuring the distance between the portable information terminal device and the user added. That is, the configuration shown in FIG. 24 adds an ultrasonic transmission unit 62, an ultrasonic reception unit 63, and a trimming shape variable button 64 to the components shown in FIG. 15. First, before communicating with the other party, the user's face is photographed by the camera unit 4, region extraction is performed by the area extraction unit 51, and the trimmed video 203 is displayed on the video LCD 14. The zooming button 47 is then operated to set the user's face inside the trimming shape 200 to the desired size. At the same time, the trimming shape variable button 64 is used to proportionally resize the trimming shape 200 read from the trimming shape storage unit 55 to the desired size. Because the size of the trimming shape 200 determines how much of the surrounding background is included around the user's face, the size of the trimming shape 200 can be changed to suit the party being communicated with at the time. Next, while the zooming button 47 and the trimming shape variable button 64 are being operated as described above, the ultrasonic transmission unit 62 emits ultrasonic waves toward the user's face, and the reflected ultrasonic waves are received by the ultrasonic reception unit 63. FIG. 25 illustrates the ultrasonic waves received by the ultrasonic reception unit 63. Wave a is the reflected wave received earliest, which is the reflection from the user's face, the object closest to the portable information terminal device. The distance from the portable information terminal device to the user is therefore obtained from the time t1 between transmission and reception.
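
The distance calculation from the echo time t1 can be illustrated with a short sketch. The text states only that the distance follows from t1; the round-trip halving and the speed of sound of roughly 343 m/s in air at about 20 °C are standard physics supplied here as assumptions, and the function name and echo times are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value in air at about 20 degrees C


def distance_to_user(echo_times_s):
    """Distance in meters to the nearest reflector.

    echo_times_s holds the delays between transmission and each received
    echo. The earliest echo (wave a, time t1) is taken as the user's face.
    The pulse travels to the face and back, hence the division by two.
    """
    t1 = min(echo_times_s)
    return SPEED_OF_SOUND_M_PER_S * t1 / 2


# Hypothetical echoes: face first, background reflections later.
distance = distance_to_user([0.002, 0.009, 0.015])
```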

【００５６】そして、ｂ、ｃ以下の超音波は背景によっ
て反射された超音波で、ａより遅れて受信される。以
下、ユーザの顔までの距離を上記の方法で常時測定し
て、距離に応じてズーミング駆動部４８を駆動すること
で、カメラ部４で撮影したユーザの顔の大きさがトリミ
ング形状２００内部で、等しくなるように制御する。す
なわち、映像用ＬＣＤ１４に表示されるトリミング映像
２０３は、ユーザの顔と携帯情報端末装置の距離が変わ
っても、一定の大きさに維持される。また、トリミング
形状２００と内部の顔映像を同時に比例して変えること
で、携帯情報端末装置がユーザの顔に近づいたらトリミ
ング映像２０３を小さくし、遠ざかったらトリミング映
像２０３を大きくすることができ、見る距離に関わらず
トリミング映像２０３を良好に判別することが可能にな
る。また、上記ではズーミング駆動部４８を制御するこ
とにより、ズーミングを行っていたが、カメラ部４で撮
影して、画像メモリ４５に蓄積したデジタルデータにつ
いて、拡大あるいは縮小することでズーミングと等価な
動作を実現してもよい。尚、トリミング形状２００は通
信を行う相手によって変えてもよい。例えば、通信を開
始する際に相手の電話番号によって、トリミング形状蓄
積部５５から選択して、そのトリミング形状２００で切
り出したトリミング映像２０３を相手に送ってもよい。
相手が親しくしている間柄の場合、多少大き目のトリミ
ング形状２００にして、背景を相手に見せてもかまわな
い。しかし、たまにかける相手などに対してはプライバ
シーを知られたくない。そこで、ユーザの顔部分ぎりぎ
りのトリミング形状２００にして、背景部分をできるだ
け相手に見せないようにすることもできる。場合によっ
ては、トリミング形状２００をユーザの顔領域より小さ
くして、背景を完全に隠してもよい。The ultrasonic waves b, c, and later are waves reflected by the background and are received later than a. Thereafter, the distance to the user's face is measured continuously by the above method, and the zooming drive unit 48 is driven according to the distance so that the size of the user's face captured by the camera unit 4 remains the same inside the trimming shape 200. That is, the trimmed video 203 displayed on the video LCD 14 is kept at a constant size even when the distance between the user's face and the portable information terminal device changes. Alternatively, by changing the trimming shape 200 and the face image inside it proportionally at the same time, the trimmed video 203 can be made smaller when the portable information terminal device approaches the user's face and larger when it moves away, so that the trimmed video 203 can be discerned well regardless of the viewing distance. Although zooming is performed above by controlling the zooming drive unit 48, an operation equivalent to zooming may instead be achieved by enlarging or reducing the digital data captured by the camera unit 4 and stored in the image memory 45. The trimming shape 200 may also be changed depending on the party being communicated with. For example, when communication starts, a trimming shape 200 may be selected from the trimming shape storage unit 55 according to the other party's telephone number, and the trimmed video 203 cut out with that trimming shape 200 may be sent to the other party. For a party with whom the user is on close terms, a somewhat larger trimming shape 200 may be used so that the background is shown. For a party the user calls only occasionally, however, the user may not want private details to be known. In that case, a trimming shape 200 that barely fits the user's face can be used so that as little of the background as possible is shown. In some cases, the trimming shape 200 may be made smaller than the user's face region to hide the background completely.
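
One way to keep the face at a constant size as the distance changes is sketched below. It assumes a simple pinhole-camera model in which the apparent size of the face is inversely proportional to its distance, so the required zoom grows linearly with distance. The model, the zoom limits, and the function name are assumptions; the text says only that the zooming drive unit 48 is driven according to the measured distance.

```python
def compensating_zoom(reference_zoom, reference_distance, current_distance,
                      min_zoom=1.0, max_zoom=4.0):
    """Zoom setting that keeps the face the size it had at the reference.

    reference_zoom and reference_distance are the values in effect when the
    user set the desired face size. The result is clamped to a hypothetical
    lens range [min_zoom, max_zoom].
    """
    if reference_distance <= 0 or current_distance <= 0:
        raise ValueError("distances must be positive")
    zoom = reference_zoom * current_distance / reference_distance
    return max(min_zoom, min(max_zoom, zoom))


farther = compensating_zoom(2.0, reference_distance=0.25, current_distance=0.5)
closer = compensating_zoom(2.0, reference_distance=0.5, current_distance=0.25)
clamped = compensating_zoom(2.0, reference_distance=0.25, current_distance=1.0)
```

The same factor could be applied as a digital enlargement or reduction of the data in the image memory 45 instead of driving the lens.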

【００５７】以上述べたように図８４のような生の映像
について、表示枠１１１の範囲だけの映像を撮影してい
ると、顔映像が画面の中心に位置していない場合、顔映
像を画面の中心に移動すると必ず図８７のように背景の
端が欠落してしまう。しかし、この実施形態３の方式に
よると撮影範囲が表示枠１１１と等しい場合でも、図２
６のようにトリミング形状２００は表示枠１１１の内部
に接する範囲２００−１〜２００−６で動いてもトリミ
ング映像２０１は欠落することがない。すなわち、表示
枠１１１内部では、トリミング映像２０１は自由に動か
すことができるといえる。言い換えれば従来の背景画像
を付加する方式より撮影画像の揺れに強いといえる。こ
こで、本方式は相手側に送る映像はトリミング映像２０
３のみなので背景を含めた映像を送る方式よりデータの
伝送量が少なくて済む利点がある。背景の映像に配分し
ていた伝送レートをトリミング映像２０３の伝送にまわ
すことができ、映像エンコーダ２７でトリミング映像２
０３をより精細な映像でエンコードすることができる。
また、背景として配分していた伝送レートを音声コーデ
ック２３の伝送レートを上げることに振り向ければ、よ
り高音質な音声でエンコードすることが可能である。As described above, when only the range of the display frame 111 is captured from a raw image such as that shown in FIG. 84 and the face image is not located at the center of the screen, moving the face image to the center of the screen always causes the edge of the background to be lost, as shown in FIG. 87. According to the method of Embodiment 3, however, even when the shooting range is equal to the display frame 111, the trimmed video 201 is not lost even if the trimming shape 200 moves within the range 200-1 to 200-6 in contact with the inside of the display frame 111, as shown in FIG. 26. That is, the trimmed video 201 can be moved freely inside the display frame 111. In other words, this method is more robust against shaking of the captured image than the conventional method of adding a background image. Moreover, because the only video sent to the other party is the trimmed video 203, this method has the advantage of requiring less data transmission than a method that sends video including the background. The transmission rate that would have been allocated to the background video can be reallocated to the trimmed video 203, so that the video encoder 27 can encode the trimmed video 203 at a finer quality. Alternatively, if the transmission rate that would have been allocated to the background is used to raise the transmission rate of the audio codec 23, the audio can be encoded at a higher quality.

【００５８】また、上記背景を相手に送らなくて済むた
め、伝送容量に余裕がある場合、図２７のように、ユー
ザ側から定型のテキスト文１３０もトリミング映像２０
３と一緒に相手に送り、相手の情報端末装置で表示して
もよい。この場合、トリミング映像２０３を領域抽出部
５１で生成した後、画面の中央でなく所定の位置（図２
７では右側）に移動して、左側の空いた領域にテキスト
文１３０を表示させてもよい。空いた領域は背景がない
ためバックの映像に影響されることなく、テキスト文１
３０は鮮明に表示される。図２７はテキスト文１３０を
表示させていたが、任意の図形、アイコンあるいはシス
テムを運営しているインフラ業者の宣伝文なども表示さ
せてもよい。（実施形態４）次に、実施形態４について説明する。上
述の実施形態１乃至実施形態３により、手ぶれあるいは
ユーザの体が揺れた映像から、顔映像のみを切り出して
相手側に送り、相手側の端末では安定した顔映像を見な
がら相互に会話を行うことができる。しかし、実施形態
１又は実施形態２において、ユーザ側で、カメラ部４で
撮影した映像の揺れがひどくなり、ユーザの顔が図２８
のように表示枠１１１に接触したり、あるいは図２９の
ように表示枠１１１の範囲を超えてしまうことがある。In addition, since the background need not be sent to the other party, when there is spare transmission capacity, a fixed-form text 130 may also be sent from the user side to the other party together with the trimmed video 203 and displayed on the other party's information terminal device, as shown in FIG. 27. In this case, after the trimmed video 203 is generated by the area extraction unit 51, it may be moved to a predetermined position (the right side in FIG. 27) rather than the center of the screen, and the text 130 may be displayed in the vacant area on the left. Since the vacant area has no background, the text 130 is displayed clearly without being affected by a background image. Although FIG. 27 shows the text 130 being displayed, any graphic, icon, or advertisement of the infrastructure operator running the system may be displayed instead. (Embodiment 4) Next, Embodiment 4 will be described. According to Embodiments 1 to 3 described above, only the face image is cut out from video that shakes because of camera shake or movement of the user's body and is sent to the other party, so that the parties can converse while a stable face image is viewed on the other party's terminal. In Embodiment 1 or Embodiment 2, however, the video captured by the camera unit 4 on the user side may shake so severely that the user's face touches the display frame 111 as shown in FIG. 28, or goes beyond the range of the display frame 111 as shown in FIG. 29.

【００５９】図２８の場合は、顔映像１２２は中央に移
動させ顔映像１２１として相手側に送られ、問題が起き
ない。しかし、図２９の場合は、顔映像１２２を切り出
すと、顔映像１２２の一部が欠落するか、あるいはカメ
ラ部４で撮影した映像の揺れがひどく、全く表示枠１１
１から消えてしまうこともある。そして、欠落した顔映
像１２２については、そのまま、表示枠１１１の中央に
移動して顔映像１２１として相手端末に送られる。しか
し、ユーザは映像用ＬＣＤ１４で相手の顔映像を見てい
るために撮影された自分の顔が画面の端からはみ出して
いることが分からない。そこで、この実施形態では、切
り出した顔映像１２２が図２８のように表示枠１１１の
端に接触したり、あるいは図２９のように端からはみ出
したら映像用ＬＣＤ１４に相手側の顔映像１０５（例え
ば図５）の代わりに、そのときカメラ部４で撮影してい
た図３０あるいは図３１のような生の映像を表示させ
る。ユーザは自分の顔が表示されている画面を直接見る
ことにより、カメラ部４の撮影方向が狂ったことを知
り、手元を動かし直ちにカメラ部４の方向を直すことが
できる。再びユーザの顔映像１２２が画面の端から離れ
たら、また映像用ＬＣＤ１４に相手側の顔映像１０５
（例えば図５）を表示させる。In the case of FIG. 28, the face image 122 is moved to the center and sent to the other party as the face image 121, and no problem arises. In the case of FIG. 29, however, when the face image 122 is cut out, part of it is missing, or, if the video captured by the camera unit 4 shakes severely, the face may disappear from the display frame 111 altogether. The face image 122 with the missing part is moved as it is to the center of the display frame 111 and sent to the other party's terminal as the face image 121. Yet because the user is watching the other party's face image on the video LCD 14, the user does not notice that his or her own captured face extends beyond the edge of the screen. In this embodiment, therefore, when the cut-out face image 122 touches the edge of the display frame 111 as shown in FIG. 28 or extends beyond the edge as shown in FIG. 29, the raw video being captured by the camera unit 4 at that time, as in FIG. 30 or FIG. 31, is displayed on the video LCD 14 in place of the other party's face image 105 (for example, FIG. 5). By looking directly at the screen showing his or her own face, the user learns that the shooting direction of the camera unit 4 has drifted and can immediately correct the direction of the camera unit 4 by moving the hand. When the user's face image 122 moves away from the edge of the screen again, the other party's face image 105 (for example, FIG. 5) is displayed on the video LCD 14 once more.

【００６０】ここで、顔映像１２２が表示枠１１１に接
触した状態を調べるには、図２９に示すように、矩形１
１８の４辺のうち、一つの辺が表示枠１１１に接触しこ
とにより、調べればよい。また、顔映像１２２が表示枠
１１１を越えた状態を調べるには矩形１１８に４辺のう
ち２辺（図２９では横の２辺すなわち長さａ）が短くな
った状態で、顔映像１２２が表示枠１１１を越えたとす
ればよい。また、矩形の面積ａ×ｂが小さくなった状態
を顔映像１２２が表示枠１１１を越えた状態としてもよ
い。ここで、相手側の顔映像１０５（例えば図５）から
ユーザの生の映像（例えば図３０あるいは図３１）に切
り替える場合に、ユーザの映像の色を変えたり、薄く表
示させたりすると、相手側の映像か、ユーザの映像か区
別がつき分かりやすい。また、図２８又は図２９で示し
たように顔映像１２２が表示枠１１１の端に接触したり
端からはみ出したら直ちに映像用ＬＣＤ１４にユーザの
生の映像を表示させるのではなく、手ぶれあるいはユー
ザの体の揺れにより顔映像１２２が一瞬表示枠１１１の
端からはみ出すこともあるので、所定の時間例えば１秒
位表示枠１１１の端からはみ出したら画面を切り替える
ようにしてもよい。また、手ぶれあるいはユーザの体の
揺れにより顔映像１２２が表示枠１１１の端から複数回
はみ出すことを考慮して、顔映像１２２が表示枠１１１
の端から所定の回数例えば３回はみ出したら、このこと
を検出して、画面の表示を切り替えるようにしてもかま
わない。
Here, whether the face image 122 is in contact with the display frame 111 can be determined, as shown in FIG. 29, by checking whether one of the four sides of the rectangle 118 is in contact with the display frame 111. Whether the face image 122 has gone beyond the display frame 111 can be determined by treating a state in which two of the four sides of the rectangle 118 (in FIG. 29 the two horizontal sides, that is, the length a) have become shorter as a state in which the face image 122 has gone beyond the display frame 111. Alternatively, a state in which the area a × b of the rectangle has become smaller may be treated as the face image 122 having gone beyond the display frame 111. When the display is switched from the other party's face image 105 (for example, FIG. 5) to the user's raw image (for example, FIG. 30 or FIG. 31), changing the color of the user's image or displaying it faintly makes it easy to tell whether the image is the other party's or the user's. Furthermore, the user's raw image need not be displayed on the video LCD 14 the instant the face image 122 touches or protrudes from the edge of the display frame 111 as shown in FIG. 28 or FIG. 29. Because camera shake or movement of the user's body can make the face image 122 protrude from the edge of the display frame 111 for just a moment, the screen may instead be switched only after the face image 122 has protruded from the edge for a predetermined time, for example about one second. Likewise, considering that camera shake or movement of the user's body can make the face image 122 protrude from the edge of the display frame 111 several times, the device may detect that the face image 122 has protruded a predetermined number of times, for example three times, and switch the display at that point.
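
The two delayed-switching rules above (switch after the overflow has lasted about one second, or after it has occurred about three times) can be sketched as follows. This is only an illustrative sketch: the class name `OverflowDebouncer`, its parameters, the explicit `reset` method, and the choice to combine both rules with a logical OR are assumptions, not details given in the specification.

```python
class OverflowDebouncer:
    """Decides when to switch the LCD to the user's raw image (hypothetical helper)."""

    def __init__(self, hold_seconds: float = 1.0, max_events: int = 3):
        self.hold_seconds = hold_seconds  # "about one second" in the text
        self.max_events = max_events      # "three times" in the text
        self._since = None                # time at which the current overflow began
        self._events = 0                  # number of separate overflow episodes so far

    def update(self, overflowing: bool, now: float) -> bool:
        """Feed one observation; return True if the display should switch."""
        if overflowing:
            if self._since is None:       # rising edge: a new overflow episode starts
                self._since = now
                self._events += 1
            held_long_enough = now - self._since >= self.hold_seconds
            return held_long_enough or self._events >= self.max_events
        self._since = None                # face is back inside the frame
        return False

    def reset(self) -> None:
        """Forget past episodes (assumed to be called once the display switches back)."""
        self._since = None
        self._events = 0
```

With the default settings, a brief overflow caused by hand shake does not switch the display, while a sustained overflow or a third repeated overflow does.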

【００６１】また、実施形態３においても、同様に、ユ
ーザ側で、カメラ部４で撮影した映像の揺れがひどくな
り、図３２のようにトリミング映像２１７が表示枠１１
１に接触したりあるいは表示枠１１１の範囲を越えた場
合も、トリミング映像２１７を切り出し、表示枠１１１
の中央に移動すると、トリミング映像２１８の内部の映
像に欠落部分２１９が発生することがある。この場合
も、一部欠落したトリミング映像２１８については、そ
のまま、相手端末に送られる。しかし、ユーザは映像用
ＬＣＤ１４で相手の顔映像を見ているために撮影された
自分の顔が画面の端からはみ出していることが分からな
い。そこで、図３２のように切り出したトリミング映像
２１７が表示枠１１１の端に接触したり、あるいは端か
らはみ出したら映像用ＬＣＤ１４に相手側の顔映像１０
５（例えば図５）の代わりに、そのときカメラ部４で撮
影していた図３０又は図３１のような生の映像を表示さ
せる。ユーザは自分の顔が表示されている画面を直接見
ることにより、カメラ部４の撮影方向が狂ったことを知
り、手元を動かし直ちにカメラ部４の方向を直すことが
できる。ここで、トリミング映像２１７が表示枠１１１
に接触した状態を調べるには、トリミング映像２１７の
面積を調べ、面積が小さくなったらトリミング映像２１
７が表示枠１１１を越えた状態とすればよい。
Similarly, in Embodiment 3, when the image captured by the camera unit 4 on the user side shakes badly and the trimmed image 217 touches the display frame 111 or goes beyond its range as shown in FIG. 32, cutting out the trimmed image 217 and moving it to the center of the display frame 111 can leave a missing portion 219 inside the trimmed image 218. In this case too, the partially missing trimmed image 218 is sent to the other party's terminal as it is. The user, however, is watching the other party's face image on the video LCD 14 and therefore does not notice that his or her own face protrudes from the edge of the captured picture. Therefore, when the cut-out trimmed image 217 touches or protrudes from the edge of the display frame 111 as shown in FIG. 32, the video LCD 14 displays, instead of the other party's face image 105 (for example, FIG. 5), the raw image that the camera unit 4 is capturing at that moment, as shown in FIG. 30 or FIG. 31. By looking directly at the screen showing his or her own face, the user sees that the shooting direction of the camera unit 4 has drifted and can immediately correct it by moving the hand. Here, to determine whether the trimmed image 217 is in contact with the display frame 111, the area of the trimmed image 217 is checked, and a reduced area is treated as the trimmed image 217 having gone beyond the display frame 111.
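
The checks described in the two preceding paragraphs (a side of the bounding rectangle touching the display frame, a side becoming shorter, and the area becoming smaller) can be sketched as follows. The `Rect` type, the function names, and the 0.9 shrink threshold are assumptions made for illustration; the specification does not state how much shrinkage counts as an overflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in pixel coordinates (right and bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top

    @property
    def area(self) -> int:
        return self.width * self.height


def touches_frame(box: Rect, frame: Rect) -> bool:
    """True if at least one side of the bounding box lies on the frame boundary."""
    return (box.left <= frame.left or box.top <= frame.top
            or box.right >= frame.right or box.bottom >= frame.bottom)


def side_shrunk(box: Rect, reference: Rect, ratio: float = 0.9) -> bool:
    """True if the box is clearly narrower or shorter than the reference box."""
    return box.width < ratio * reference.width or box.height < ratio * reference.height


def area_shrunk(box: Rect, reference: Rect, ratio: float = 0.9) -> bool:
    """True if the box area (a x b) is clearly smaller than the reference area."""
    return box.area < ratio * reference.area
```

Here `reference` stands for the bounding rectangle measured while the face or trimmed image was still fully inside the frame, and `box` for the rectangle of the part that is currently visible.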

【００６２】ここで、相手側の顔映像１０５から、ユー
ザの生の映像に切り替える場合に、ユーザの映像の色を
変えたり、薄く表示させたりすると、相手側の映像か、
ユーザの映像か区別がつき分かりやすい。また、図３２
で示したようにトリミング映像２１７が表示枠１１１の
端からはみ出したら直ちに映像用ＬＣＤ１４にユーザの
映像を表示させるのではなく、手ぶれあるいはユーザの
体の揺れによりトリミング映像２１７が一瞬表示枠１１
１の端からはみ出すこともあるので、所定の時間例えば
１秒位表示枠１１１の端からはみ出したら画面を切り替
えるようにしてもよい。また、手ぶれあるいはユーザの
体の揺れによりトリミング映像２１７が表示枠１１１の
端から複数回はみ出すことを考慮して、トリミング映像
２１７が表示枠１１１の端から所定の回数例えば３回は
み出したら画面の表示を切り替えるようにしてもよい。
更に、この実施形態において、相手側の顔映像１０５の
表示からユーザの生の映像の表示に切り替える際に、相
手の顔映像１０５とユーザの生の映像とをマルチウイン
ドウで一緒に表示させてもよい。（実施形態５）次に、実施形態５について説明する。
When the display is switched from the other party's face image 105 to the user's raw image, changing the color of the user's image or displaying it faintly makes it easy to tell whether the image is the other party's or the user's. Furthermore, the user's image need not be displayed on the video LCD 14 the instant the trimmed image 217 protrudes from the edge of the display frame 111 as shown in FIG. 32. Because camera shake or movement of the user's body can make the trimmed image 217 protrude from the edge of the display frame 111 for just a moment, the screen may instead be switched only after the trimmed image 217 has protruded from the edge for a predetermined time, for example about one second. Likewise, considering that camera shake or movement of the user's body can make the trimmed image 217 protrude from the edge of the display frame 111 several times, the display may be switched when the trimmed image 217 has protruded a predetermined number of times, for example three times. Further, in this embodiment, when the display of the other party's face image 105 is switched to the display of the user's raw image, the other party's face image 105 and the user's raw image may be displayed together in multiple windows.
(Embodiment 5) Next, Embodiment 5 will be described.

【００６３】上述の実施形態４においては、切り出した
顔映像１２２、又はトリミング映像２１７が表示枠１１
１の端に接触したり、端からはみ出したら映像用ＬＣＤ
１４に相手側の顔映像１０５の代わりに、そのときカメ
ラ部４で撮影していた生の映像を表示させるようにした
が、この実施形態５においては、顔映像１２２、又はト
リミング映像２１７が表示枠１１１からはみ出している
間は、別な画面を相手に送ることで、相手に対して見苦
しい映像を送らないようにすることを目的としている。
即ち、この実施形態５においては、実施形態２のように
切り出した顔映像１２１を送出する場合は、領域抽出部
５１で正常に顔映像１２１を抽出して、相手に顔映像１
２１を送るのと同時に、顔映像メモリ部４６に顔映像１
２１を蓄積しておく。そして、顔映像メモリ部４６に蓄
積される顔映像１２１はフレーム毎あるいは所定のフレ
ーム周期で更新されている。そして、実施形態４のよう
に顔映像１２１が表示枠１１１からはみ出したら、前記
顔映像メモリ部４６に顔映像１２１の蓄積を停止する。
そして、実施形態４のように映像用ＬＣＤ１４にユーザ
を撮影した映像（図３１）を表示させる。
In Embodiment 4 described above, when the cut-out face image 122 or the trimmed image 217 touches or protrudes from the edge of the display frame 111, the video LCD 14 displays the raw image being captured by the camera unit 4 at that moment instead of the other party's face image 105. Embodiment 5, by contrast, aims to avoid sending an unsightly image to the other party by sending a different picture while the face image 122 or the trimmed image 217 protrudes from the display frame 111. That is, in Embodiment 5, when the cut-out face image 121 is transmitted as in Embodiment 2, the region extraction unit 51 extracts the face image 121 normally and, at the same time as the face image 121 is sent to the other party, the face image 121 is stored in the face image memory unit 46. The face image 121 stored in the face image memory unit 46 is updated every frame or at a predetermined frame interval. When the face image 121 protrudes from the display frame 111 as in Embodiment 4, storing of the face image 121 in the face image memory unit 46 is stopped. Then, as in Embodiment 4, the video LCD 14 displays the captured image of the user (FIG. 31).

【００６４】顔映像メモリ部４６に蓄積された顔映像１
２１を相手に静止画として一回だけ送り、相手の端末装
置の表示部に表示させる。相手の端末装置の表示部には
前記静止画の表示をそのまま維持しておく。ここで、静
止画はカメラ部４で撮影され揺れがひどくなり、表示枠
１１１からはみ出す直前で欠落部分のない正常な顔映像
１２１である。そして、ユーザが携帯情報端末装置を持
ち替えてユーザの顔が表示枠１１１に収まり、顔映像１
２１に欠落部分のない状態で正常に領域抽出部５１で抽
出可能になったら、実施形態２のように顔映像１２１の
伝送を再開する。また、実施形態３のようにトリミング
映像を送出する場合も、領域抽出部５１で正常にトリミ
ング映像２０３を抽出して、相手にトリミング映像２０
３を送るのと同時に、トリミング映像メモリ部５６にト
リミング映像２０３を蓄積しておく。そして、トリミン
グ映像メモリ部５６に蓄積されるトリミング映像２０３
はフレーム毎あるいは所定のフレーム周期で更新されて
いる。そして、実施形態４のようにトリミング映像２１
７が表示枠１１１からはみ出したら、前記トリミング映
像メモリ部５６にトリミング映像２０３の蓄積を停止す
る。そして、実施形態４のように映像用ＬＣＤ１４にユ
ーザを撮影した映像（図３１）を表示させる。
The face image 121 stored in the face image memory unit 46 is sent to the other party just once as a still image and displayed on the display unit of the other party's terminal device, which keeps showing that still image. The still image is a normal face image 121 with no missing portion, captured by the camera unit 4 immediately before the shaking became severe and the face protruded from the display frame 111. When the user adjusts his or her grip on the portable information terminal device so that the user's face fits within the display frame 111 and the region extraction unit 51 can again extract the face image 121 normally with no missing portion, transmission of the face image 121 is resumed as in Embodiment 2. Likewise, when a trimmed image is transmitted as in Embodiment 3, the region extraction unit 51 extracts the trimmed image 203 normally and, at the same time as the trimmed image 203 is sent to the other party, the trimmed image 203 is stored in the trimmed image memory unit 56. The trimmed image 203 stored in the trimmed image memory unit 56 is updated every frame or at a predetermined frame interval. When the trimmed image 217 protrudes from the display frame 111 as in Embodiment 4, storing of the trimmed image 203 in the trimmed image memory unit 56 is stopped. Then, as in Embodiment 4, the video LCD 14 displays the captured image of the user (FIG. 31).
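
The behavior of Embodiment 5 described above (store the last normal image while sending it, stop storing on overflow, send the stored image once as a still, and resume normal transmission on recovery) can be sketched as follows. The class name `StillFallbackSender`, the `send` callback, and the `"video"` / `"still"` labels are hypothetical; the sketch refreshes the stored image every frame, which is one of the two update options the text mentions.

```python
class StillFallbackSender:
    """Plays the role of the face (or trimmed) image memory unit plus the send logic."""

    def __init__(self, send):
        self._send = send         # hypothetical transport hook: send(kind, image)
        self._memory = None       # last image extracted without missing portions
        self._still_sent = False  # whether the still has already been sent once

    def on_frame(self, image, overflowing: bool) -> None:
        if not overflowing:
            # Normal case: transmit the image and refresh the stored copy.
            self._memory = image
            self._still_sent = False
            self._send("video", image)
        elif not self._still_sent and self._memory is not None:
            # Overflow: stop refreshing the memory and send the stored image once.
            self._send("still", self._memory)
            self._still_sent = True
        # Further overflowing frames send nothing; the far side keeps the still.
```

For example, feeding the frames `f1`, `f2` (normal), two overflowing frames, and then `f5` (normal) results in `f1` and `f2` being sent as video, `f2` being sent once as a still, and `f5` being sent as video again.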

【００６５】トリミング映像メモリ部５６に蓄積された
トリミング映像２０３を相手に静止画として一回だけ送
り、相手の端末の表示部に表示させる。相手の表示部に
は前記静止画の表示をそのまま維持しておく。ここで、
静止画はカメラ部４で撮影され揺れがひどくなり、表示
枠１１１からはみ出す直前で欠落部分のないの正常なト
リミング映像２０３である。そして、ユーザが携帯情報
端末装置を持ち替えてユーザの顔が表示枠１１１に収ま
り、トリミング映像２０３が欠落部分がなく正常に領域
抽出部５１で抽出可能になったら、実施形態３のように
トリミング映像２０３の伝送を再開する。（実施形態６）次に、実施形態６について説明する。こ
の実施形態は、実施形態２又は実施形態３において、切
り出した顔映像１２２、又はトリミング映像２１７を映
像用ＬＣＤ１４に表示しズーミング動作を行なっている
場合に、切り出した顔映像１２２、又はトリミング映像
２１７が拡大されて、表示枠１１１からはみ出した場合
に、カメラ部４で撮影した生の映像を表示させ、その映
像を相手に送るようにしたものである。即ち、実施形態
２において、ズーミング動作を行なうためズーミングボ
タン４７を押下すると映像用ＬＣＤ１４に顔映像１２１
が表示されるが、図３３（ａ）に示すような顔映像１２
１が表示されている映像用ＬＣＤ１４についてズーミン
グボタン４７を押下していくと、同図（ｂ）のように顔
映像１２１は拡大されていく。そして、同図（ｃ）のよ
うに顔映像１２１が表示枠１１１からはみ出してしまう
ことがある。そこで、このような場合、同図（ｄ）のよ
うに映像用ＬＣＤ１４にカメラ部４で撮影した生の映像
を表示させる。そしてこの映像を相手にそのまま送るこ
とを特徴としている。
The trimmed image 203 stored in the trimmed image memory unit 56 is sent to the other party just once as a still image and displayed on the display unit of the other party's terminal, which keeps showing that still image. The still image is a normal trimmed image 203 with no missing portion, captured by the camera unit 4 immediately before the shaking became severe and the image protruded from the display frame 111. When the user adjusts his or her grip on the portable information terminal device so that the user's face fits within the display frame 111 and the region extraction unit 51 can again extract the trimmed image 203 normally with no missing portion, transmission of the trimmed image 203 is resumed as in Embodiment 3.
(Embodiment 6) Next, Embodiment 6 will be described. In this embodiment, when the cut-out face image 122 or the trimmed image 217 of Embodiment 2 or Embodiment 3 is displayed on the video LCD 14 and a zooming operation is performed, and the cut-out face image 122 or the trimmed image 217 is enlarged until it protrudes from the display frame 111, the raw image captured by the camera unit 4 is displayed and that image is sent to the other party. That is, in Embodiment 2, pressing the zooming button 47 to perform a zooming operation displays the face image 121 on the video LCD 14. As the zooming button 47 is pressed further while the video LCD 14 shows the face image 121 as in FIG. 33(a), the face image 121 is enlarged as in FIG. 33(b), and it may eventually protrude from the display frame 111 as in FIG. 33(c). In that case, the video LCD 14 displays the raw image captured by the camera unit 4 as in FIG. 33(d), and this image is sent to the other party as it is.

【００６６】すなわち、顔映像１２１が大きくなると、
領域抽出部５１でユーザの顔を領域抽出しても背景部分
が小さいため、相手に対して背景が含んだ映像を送って
も、ユーザのプライバシーを相手に知られることはない
ので、わざわざ背景を削除する必要が無い。そして、領
域抽出部５１での処理が簡単になって、通常の矩形の映
像を送る処理で済ますことができ、消費電力の節約につ
ながる。また、実施形態５のように顔映像１２１が表示
枠１１１からはみ出したら、顔映像メモリ部４６に蓄積
されている顔映像１２１（表示枠１１１からはみ出して
いない顔映像１２１）を静止画として相手に送ってもよ
い。また、実施形態３のようにトリミングを行なう場合
についても、同様に行なうことができる。即ち、図３４
（ａ）に示すように表示枠１１１の内部の映像につい
て、トリミング形状２０２で切り抜き、領域抽出部５１
で同図（ｂ）のようなトリミング映像２０３が生成され
ているが、ユーザはトリミング映像２０３が表示されて
いる映像用ＬＣＤ１４を見ながらズーミングボタン４７
を押下していくと同図（ｃ）のようにトリミング映像２
０３は拡大されていく。そして、同図（ｄ）のようにト
リミング映像２０３が表示枠１１１からはみ出してしま
うことがある。そこで、このような場合、同図（ｅ）の
ように映像用ＬＣＤ１４にカメラ部４で撮影した生の映
像を表示させる。そしてこの映像を相手にそのまま送る
ことを特徴としている。
That is, when the face image 121 becomes large, the background portion is small even if the region extraction unit 51 extracts the region of the user's face, so sending an image that includes the background does not expose the user's privacy to the other party, and there is no need to go to the trouble of removing the background. The processing in the region extraction unit 51 is also simplified, because sending an ordinary rectangular image suffices, which saves power. Alternatively, when the face image 121 protrudes from the display frame 111, the face image 121 stored in the face image memory unit 46 (a face image 121 that does not protrude from the display frame 111) may be sent to the other party as a still image, as in Embodiment 5. The same applies when trimming is performed as in Embodiment 3. That is, as shown in FIG. 34(a), the image inside the display frame 111 is cut out with the trimming shape 202, and the region extraction unit 51 generates the trimmed image 203 shown in FIG. 34(b). As the user keeps pressing the zooming button 47 while watching the video LCD 14 on which the trimmed image 203 is displayed, the trimmed image 203 is enlarged as in FIG. 34(c), and it may eventually protrude from the display frame 111 as in FIG. 34(d). In that case, the video LCD 14 displays the raw image captured by the camera unit 4 as in FIG. 34(e), and this image is sent to the other party as it is.
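
The zoom behavior of Embodiment 6 can be sketched as a simple size test: once the enlarged face or trimmed image no longer fits the display frame, the raw camera image is selected instead. The function name, the `"extracted"` / `"raw"` labels, and the assumption that the image is centered and scaled uniformly by a zoom factor are illustrative only.

```python
def select_output(image_w: float, image_h: float, zoom: float,
                  frame_w: float, frame_h: float) -> str:
    """Return which picture to display and send for a given zoom factor.

    Assumes the extracted image is centered in the display frame and is
    scaled uniformly by `zoom` (an assumption, not stated in the text).
    """
    zoomed_w = image_w * zoom
    zoomed_h = image_h * zoom
    if zoomed_w > frame_w or zoomed_h > frame_h:
        return "raw"        # enlarged image protrudes: send the raw camera image
    return "extracted"      # still fits: keep sending the extracted image
```

For a 100 x 120 face image in a 320 x 240 display frame, a zoom factor of 1.5 still fits, while a zoom factor of 2.5 makes the height exceed the frame and selects the raw image.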

【００６７】すなわち、トリミング映像２０３が大きく
なると、領域抽出部５１でユーザの顔を領域抽出しても
背景部分が小さく、中途半端な欠落部分２１５及び２１
６が発生してみにくくなる。そこで、相手に対して背景
が含んだ同図（ｅ）のような映像を送っても、ユーザの
プライバシーを相手に知られることはないので、わざわ
ざ背景を削除する必要が無い。そして、領域抽出部５１
での処理が簡単になって、通常の矩形の映像を送る処理
で済ますことができ、消費電力の節約につながる。ま
た、実施形態５のようにトリミング映像２０３が表示枠
１１１からはみ出した期間、トリミング映像メモリ部５
６に蓄積されているトリミング映像２０３（表示枠１１
１からはみ出していないトリミング映像２０３）を静止
画として相手に送ってもよい。（実施形態７）次に、実施形態７について説明する。こ
の実施形態７は、表示枠を超えて大き目の領域を撮影枠
として撮像し、これを画像メモリ４５に蓄積して、実施
形態２又は実施形態３のような処理を行うこととしたも
のである。まず、実施形態２のように顔領域を切り出し
て顔映像を抽出する場合に、表示枠を超えて大き目の領
域を撮影枠として撮像する方式について説明する。
That is, when the trimmed image 203 becomes large, the background portion is small even if the region extraction unit 51 extracts the region of the user's face, and awkward missing portions 215 and 216 appear, making the image unsightly. Sending an image that includes the background, as in FIG. 34(e), does not expose the user's privacy to the other party, so there is no need to go to the trouble of removing the background. The processing in the region extraction unit 51 is also simplified, because sending an ordinary rectangular image suffices, which saves power. Alternatively, as in Embodiment 5, while the trimmed image 203 protrudes from the display frame 111, the trimmed image 203 stored in the trimmed image memory unit 56 (a trimmed image 203 that does not protrude from the display frame 111) may be sent to the other party as a still image.
(Embodiment 7) Next, Embodiment 7 will be described. In Embodiment 7, an area larger than the display frame is captured as a shooting frame and stored in the image memory 45, and processing like that of Embodiment 2 or Embodiment 3 is then performed. First, the method of capturing an area larger than the display frame as the shooting frame is described for the case where the face region is cut out to extract the face image, as in Embodiment 2.

【００６８】図３５は、図８６の方式について、表示枠
を超えて大き目の領域を撮影枠として撮像する方式を適
用した場合の構成図である。尚、図３５は顔映像の移動
動作について、必要となる構成要素のみを記載してい
る。そして、図８６の方式で必要としていた、アドレス
制御部７２はこの実施形態では必要としていないので削
除してある。図３５に示すものは、図７に示すものと比
較すると、画像メモリ４５が表示枠１１１の範囲でな
く、撮影枠の範囲のメモリを有していることが異なる。
そして、この実施形態の方式は図３７のように、撮影枠
１２０の範囲で移動している顔形状１１２の映像を領域
抽出部５１で抽出すればよい。従って、本方式は表示枠
１１１から外れた顔形状１１２の映像も抽出できること
もあり、撮影枠１２０が表示枠１１１に等しい場合よ
り、より撮影映像の揺れに対して強くなる。図３６は従
来の方式について顔形状１１２の移動可能領域を説明し
た図である。すなわち、従来の方式は背景と顔領域が一
緒になっていて、顔形状１１２が表示枠１１１の中心に
来るように、表示枠１１１を、撮影枠１２０内の内部を
動かすことで、表示枠１１１の内部の映像を切り出して
いる。顔形状１１２が表示枠１１１の中心に位置すると
いう条件の元で、顔形状１１２の移動可能範囲を描いて
みると１１２−１〜１１２−６の範囲である。すなわ
ち、矩形１１９の内部で顔形状１１２を移動すれば、顔
形状１１２は表示枠１１１の中心に移動させることがで
きる。
FIG. 35 is a configuration diagram for the case where the method of capturing an area larger than the display frame as the shooting frame is applied to the method of FIG. 86. FIG. 35 shows only the components needed for the operation of moving the face image. The address control unit 72, which the method of FIG. 86 required, is not needed in this embodiment and has been omitted. The configuration of FIG. 35 differs from that of FIG. 7 in that the image memory 45 holds the range of the shooting frame rather than the range of the display frame 111. In this embodiment, as shown in FIG. 37, the region extraction unit 51 only has to extract the image of the face shape 112 as it moves within the shooting frame 120. The method can therefore, in some cases, also extract the image of a face shape 112 that has strayed outside the display frame 111, which makes it more robust against shaking of the captured image than when the shooting frame 120 equals the display frame 111. FIG. 36 illustrates the movable area of the face shape 112 in the conventional method. In the conventional method the background and the face region are handled together, and the image inside the display frame 111 is cut out by moving the display frame 111 within the shooting frame 120 so that the face shape 112 comes to the center of the display frame 111. Under the condition that the face shape 112 is at the center of the display frame 111, the movable range of the face shape 112 is the range 112-1 to 112-6. That is, as long as the face shape 112 moves inside the rectangle 119, it can be brought to the center of the display frame 111.

【００６９】一方、本方式は背景を削除しているため撮
影枠１２０から顔形状１１２の映像を切り出せればよ
い。すなわち、図３７のように、顔形状１１２は１１２
−１〜１１２−８の範囲まで移動可能である。すなわち
撮影枠１２０の内部で移動可能となる。図３６と図３７
とを比較すると、本方式は顔形状１１２の移動可能領域
が矩形１１９から撮影枠１２０に拡大したことになり、
より撮影映像の揺れに強くなったことを示している。次
に、図３８のように顔形状１１２が撮影枠１２０からは
み出した場合、上記と同様に、映像用ＬＣＤ１４に表示
枠１１１の内部の映像をそのまま表示（または相手側の
顔映像１０５と表示枠１１１の内部の映像とを映像用Ｌ
ＣＤ１４にマルチウインドウで一緒に表示）して、顔形
状１１２が表示枠１１１からはみ出している間は、顔映
像メモリ部４６に蓄積されている欠けていない正常な顔
映像１２１を相手に送ればよい。次に、図８５の方式に
ついて、表示枠を超えて大き目の領域を撮影枠として撮
像する方式を適用した場合について述べる。図３９は、
図８５の方式に対して、この表示枠１１１を超えて大き
目の領域を撮影枠１２０として撮像する方式を適用した
場合の、必要となる構成要素のみを記載した構成図であ
る。
In this method, by contrast, the background is removed, so it suffices that the image of the face shape 112 can be cut out of the shooting frame 120. That is, as shown in FIG. 37, the face shape 112 can move over the range 112-1 to 112-8, in other words anywhere inside the shooting frame 120. Comparing FIG. 36 with FIG. 37 shows that the movable area of the face shape 112 has expanded from the rectangle 119 to the shooting frame 120, which makes the method more robust against shaking of the captured image. Next, when the face shape 112 protrudes from the shooting frame 120 as shown in FIG. 38, the image inside the display frame 111 is displayed as it is on the video LCD 14, as described above (or the other party's face image 105 and the image inside the display frame 111 are displayed together on the video LCD 14 in multiple windows), and, while the face shape 112 protrudes from the display frame 111, the normal face image 121 with no missing portion stored in the face image memory unit 46 is sent to the other party. Next, the case where the method of capturing an area larger than the display frame as the shooting frame is applied to the method of FIG. 85 is described. FIG. 39 is a configuration diagram showing only the components needed when the method of capturing an area larger than the display frame 111 as the shooting frame 120 is applied to the method of FIG. 85.
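
The comparison between the conventional movable range (rectangle 119) and the movable range of this method (the whole shooting frame 120) can be illustrated along one axis. In the conventional case the display frame, with the face at its center, must stay inside the shooting frame; in this method only the face itself must stay inside. The function names and the example sizes are assumptions chosen for illustration.

```python
def center_range(outer: float, inner: float) -> tuple:
    """Positions the center of a span of length `inner` may take while the
    span stays inside [0, outer]."""
    half = inner / 2
    return (half, outer - half)


def movable_ranges(shooting: float, display: float, face: float) -> tuple:
    """Return (conventional, proposed) ranges of the face center on one axis."""
    # Conventional: the display frame slides inside the shooting frame with
    # the face fixed at its center, so the face center is limited by the
    # display frame size.
    conventional = center_range(shooting, display)
    # Proposed: the background is removed, so only the face must stay inside
    # the shooting frame.
    proposed = center_range(shooting, face)
    return conventional, proposed
```

With a shooting frame 640 pixels wide, a display frame 320 pixels wide, and a face 100 pixels wide, the face center can range over 160 to 480 in the conventional case and over 50 to 590 in this method. The same calculation applies to the vertical axis.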

【００７０】領域抽出部５１でパンニング・チルティン
グ駆動部７１に必要とする位置情報を、主バス３７を介
してパンニング・チルティング駆動部７１に供給してい
る。図４０は従来の方式、すなわち顔映像と背景が一緒
になった映像について、顔形状１１２を表示枠１１１の
中心に移動させるために顔形状１１２が移動できる範囲
を説明した図である。パンニング・チルティング駆動部
７１は機械的な駆動部のためパンニング及びチルティン
グが可能な範囲が決まっている。図４０でａはパンニン
グできる範囲、ｂはチルティングできる範囲である。す
なわち撮影枠１２０の範囲内のみレンズ７０の光軸が動
いて撮影できることになる。そして、撮影枠１２０の範
囲で顔形状１１２を表示枠１１１の中央に位置するよう
にパンニング・チルティング駆動部７１を制御するとい
う条件で、顔形状１１２が移動できる範囲は１１２−１
〜１１２−６の範囲、すなわち矩形１１９の範囲であ
る。一方、本方式は画像メモリ４５に蓄積された映像デ
ータに顔形状１１２が入るように大まかにパンニング・
チルティング駆動部７１を制御すればよい。すなわち、
本方式はパンニング・チルティング駆動部７１を精密に
正確に制御しなくてもよい利点がある。そして、形状抽
出部２０で顔形状１１２を抽出して、顔形状１１２とデ
ジタル映像５４とを用いて、領域抽出部５１で顔映像１
２２を生成し、顔映像１２２は表示枠１１１の中心に移
動させて、相手側に伝送する。
The region extraction unit 51 supplies the position information needed by the panning/tilting drive unit 71 to that unit via the main bus 37. FIG. 40 illustrates, for the conventional method, that is, for an image in which the face image and the background are combined, the range over which the face shape 112 can move while still being brought to the center of the display frame 111. Because the panning/tilting drive unit 71 is a mechanical drive, the range over which it can pan and tilt is fixed. In FIG. 40, a is the panning range and b is the tilting range. In other words, the optical axis of the lens 70 can move, and images can be captured, only within the shooting frame 120. Under the condition that the panning/tilting drive unit 71 is controlled so that the face shape 112 is positioned at the center of the display frame 111 within the shooting frame 120, the face shape 112 can move over the range 112-1 to 112-6, that is, the rectangle 119. In this method, by contrast, the panning/tilting drive unit 71 only needs to be controlled roughly so that the face shape 112 falls within the image data stored in the image memory 45. The method thus has the advantage that the panning/tilting drive unit 71 need not be controlled with high precision. The shape extraction unit 20 extracts the face shape 112, the region extraction unit 51 generates the face image 122 from the face shape 112 and the digital image 54, and the face image 122 is moved to the center of the display frame 111 and transmitted to the other party.
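
The division of labor described above, rough mechanical panning and tilting followed by exact centering of the extracted face image in the display frame, can be sketched as follows. The function names, the 20-pixel margin, and the sign convention (-1 toward the left or top edge, +1 toward the right or bottom edge, 0 for no correction) are assumptions; the specification does not define the control law of the panning/tilting drive unit 71.

```python
def coarse_pan_tilt(bbox, shooting_w: int, shooting_h: int, margin: int = 20):
    """Return (pan, tilt) correction directions for a face bounding box.

    bbox is (left, top, right, bottom) in shooting-frame pixels. A correction
    is requested only when the face comes within `margin` pixels of an edge,
    so the drive is controlled roughly rather than precisely.
    """
    left, top, right, bottom = bbox
    pan = -1 if left < margin else (1 if right > shooting_w - margin else 0)
    tilt = -1 if top < margin else (1 if bottom > shooting_h - margin else 0)
    return pan, tilt


def placement_in_display(face_w: int, face_h: int, display_w: int, display_h: int):
    """Top-left position that centers the extracted face image in the display frame."""
    return (display_w - face_w) // 2, (display_h - face_h) // 2
```

In this sketch the drive is left alone while the face is well inside the shooting frame, and the centering that the conventional method achieves mechanically is done by placing the extracted image digitally.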

【００７１】図４１は顔形状１１２について移動可能領
域を示した図である。すなわち、背景がないため顔形状
１１２は撮影枠１２０に接するまで移動できる。すなわ
ち１１２−１〜１１２−８が移動可能範囲である。移動
可能範囲１１２−１〜１１２−８は撮影枠１２０の範囲
と同等となる。図４０と図４１とを比較すると、従来方
式における顔形状１１２の移動可能範囲は矩形１１９
で、本方式の移動可能範囲は撮影枠１２０で、本方式の
方が移動可能範囲が広く、それだけ撮影した映像の揺れ
に対して強いといえる。次に、図４２のように顔形状１
１２が撮影枠１２０からはみ出した場合、上記と同様
に、映像用ＬＣＤ１４に表示枠１１１の内部の映像をそ
のまま表示して、顔形状１１２が表示枠１１１からはみ
出している間は、顔映像メモリ部４６に蓄積されている
欠けていない正常な顔映像１１２を相手に送ればよい。
また、上述のように図３５又は図３９のように構成した
方式に対して、図１４のように顔映像１２２と一緒にテ
キスト文１３０等を表示させることもできる。この場合
も、上記と同様に、顔映像１２２を右側に寄せて、左側
にテキスト文１３０の表示領域を確保すればよい。
FIG. 41 shows the movable area of the face shape 112. Because there is no background, the face shape 112 can move until it touches the shooting frame 120; that is, 112-1 to 112-8 is the movable range, which is equivalent to the range of the shooting frame 120. Comparing FIG. 40 with FIG. 41, the movable range of the face shape 112 is the rectangle 119 in the conventional method and the shooting frame 120 in this method. This method has the wider movable range and is correspondingly more robust against shaking of the captured image. Next, when the face shape 112 protrudes from the shooting frame 120 as shown in FIG. 42, the image inside the display frame 111 is displayed as it is on the video LCD 14, as described above, and, while the face shape 112 protrudes from the display frame 111, the normal face image 112 with no missing portion stored in the face image memory unit 46 is sent to the other party. In addition, with the method configured as in FIG. 35 or FIG. 39, a text message 130 or the like can be displayed together with the face image 122 as shown in FIG. 14. In this case too, as described above, the face image 122 is shifted to the right and a display area for the text 130 is secured on the left.

【００７２】例えば、図８６で背景を一緒に表示させる
従来の方式においては、図４３のように顔形状１１２を
表示枠１１１の右側に寄せ、左側にテキスト文１３０の
表示領域を確保するために顔形状１１２と表示枠１１１
を撮影枠１２０の内部を移動させている。そして、図４
４のように顔形状１１２が右側ぎりぎりまで寄った場合
でも、左側の領域にテキスト文１３０を確保できる。し
かし、図４５のように顔形状１１２が左側に寄った場
合、右側にしかテキスト文１３０を表示させる領域を確
保できない。すなわち、従来の方式で、テキスト文１３
０を表示させようとすると顔形状１１２の位置によって
はテキスト文１３０の表示領域が変わってしまい、相手
側で表示される表示画面が非常に見難くなってしまう。
また、背景の上にテキスト文１３０を重ねているため背
景の映像によってはテキスト文１３０の内容が見難くな
ってしまう欠点がある。しかし、本方式は、顔形状１１
２が撮影枠１２０の内部を動く限り、表示枠１１１内部
で顔映像１２２とテキスト文１３０は、図１４のように
所定の位置に安定して表示される。そして、テキスト文
１３０のバックは背景が無いため、テキスト文１３０は
鮮明で見易くなる。
For example, in the conventional method of FIG. 86, in which the background is displayed together with the face, the face shape 112 and the display frame 111 are moved within the shooting frame 120 so that the face shape 112 is shifted to the right of the display frame 111 and a display area for the text 130 is secured on the left, as shown in FIG. 43. Even when the face shape 112 is at the very right edge, as in FIG. 44, the text 130 can still be placed in the left area. However, when the face shape 112 is shifted to the left, as in FIG. 45, an area for displaying the text 130 can be secured only on the right. In other words, with the conventional method the display area of the text 130 changes with the position of the face shape 112, and the screen shown on the other party's side becomes very hard to view. In addition, because the text 130 is superimposed on the background, the text can be hard to read depending on the background image. In this method, by contrast, as long as the face shape 112 moves inside the shooting frame 120, the face image 122 and the text 130 are displayed stably at predetermined positions within the display frame 111, as in FIG. 14. And because there is no background behind the text 130, the text 130 is sharp and easy to read.
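
The stable layout described above can be sketched as a function that depends only on the sizes of the display frame and the extracted face image, not on where the face was found in the shooting frame. The function name, the 8-pixel gap, and the exact placement rule (face at the right, vertically centered, text area filling the remaining left part) are assumptions for illustration.

```python
def fixed_layout(display_w: int, display_h: int,
                 face_w: int, face_h: int, gap: int = 8):
    """Return (face_position, text_area) inside the display frame.

    face_position is the top-left corner of the face image; text_area is
    (left, top, right, bottom). Neither depends on the face position in the
    shooting frame, so the layout does not jump when the camera shakes.
    """
    face_x = display_w - face_w - gap    # face image shifted to the right
    face_y = (display_h - face_h) // 2   # vertically centered
    text_area = (gap, gap, face_x - gap, display_h - gap)
    return (face_x, face_y), text_area
```

For a 320 x 240 display frame and a 100 x 120 face image, the face is always placed at (212, 60) and the text area is always (8, 8, 204, 232).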

【００７３】次に、実施形態３のようにトリミングによ
り顔映像を抽出する場合に、表示枠を超えて大き目の領
域を撮影枠として撮像する方式について説明する。図４
６は、図８６の方式について、表示枠を超えて大き目の
領域を撮影枠として撮像する方式を適用した場合の動作
に関係ある構成要素を抜き出した構成図である。この場
合、アドレス制御部７２の動作はこの実施形態では関係
ない。撮影枠１２０の範囲で移動しているトリミング形
状２００を領域抽出部５１で抽出すればよい。表示枠１
１１から外れたトリミング形状２００も抽出できること
もあり、上述のように、撮影枠１２０が表示枠１１１に
等しい場合より、より撮影映像の揺れに対して強くな
る。図４７は従来の方式について顔形状１１２の移動可
能領域を説明した図である。すなわち、従来の方式は背
景と顔領域が一緒になっていて、顔形状１１２が表示枠
１１１の中心に来るように、表示枠１１１を撮影枠１２
０内の内部を動かすことで、表示枠１１１の内部の映像
を切り出している。前記顔形状１１２が表示枠１１１の
中心に位置するという条件の元で、顔形状１１２の移動
可能範囲を描いてみると１１２−１〜１１２−６の範囲
である。すなわち、矩形１１９の内部で顔形状１１２を
移動すれば、顔形状１１２は表示枠１１１の中心に移動
させることができる。
Next, the method of capturing an area larger than the display frame as the shooting frame is described for the case where the face image is extracted by trimming, as in Embodiment 3. FIG. 46 is a configuration diagram showing the components relevant to the operation when the method of capturing an area larger than the display frame as the shooting frame is applied to the method of FIG. 86. The operation of the address control unit 72 is not relevant in this embodiment. The region extraction unit 51 only has to extract the trimming shape 200 as it moves within the shooting frame 120. Because a trimming shape 200 that has strayed outside the display frame 111 can in some cases also be extracted, the method is, as described above, more robust against shaking of the captured image than when the shooting frame 120 equals the display frame 111. FIG. 47 illustrates the movable area of the face shape 112 in the conventional method. In the conventional method the background and the face region are handled together, and the image inside the display frame 111 is cut out by moving the display frame 111 within the shooting frame 120 so that the face shape 112 comes to the center of the display frame 111. Under the condition that the face shape 112 is at the center of the display frame 111, the movable range of the face shape 112 is the range 112-1 to 112-6. That is, as long as the face shape 112 moves inside the rectangle 119, it can be brought to the center of the display frame 111.

【００７４】一方、本方式は背景を削除しているため撮
影枠１２０からトリミング形状２００を切り出せればよ
い。すなわち、図４８のようにトリミング形状２００は
２００−１〜２００−８の範囲まで移動可能である。す
なわち撮影枠１２０の内部で移動可能となる。図４７と
図４８とを比較すると、本方式はトリミング形状２００
の移動可能領域が矩形１１９から撮影枠１２０に拡大し
たことになり、より撮影映像の揺れに強くなったことを
示している。次に、図４９のようにトリミング映像２１
７が撮影枠１２０からはみ出した場合、上記と同様に、
映像用ＬＣＤ１４に表示枠１１１の内部の映像をそのま
ま表示して、トリミング映像２１７が表示枠１１１から
はみ出している間は、トリミング映像メモリ部５６に蓄
積されている欠けていない正常なトリミング映像２０３
を相手に送ればよい。次に、図８５の方式に対して表示
枠を超えて大き目の領域を撮影枠として撮像する方式を
適用した場合について述べる。図５０は図８５の方式に
ついて本方式を適用するために必要となる構成要素を抜
き出して図示した構成図である。図５１は従来の方式、
すなわち顔映像と背景が一緒になった映像について、顔
形状１１２を表示枠１１１の中心に移動させるために顔
形状１１２が移動できる範囲を説明した図である。パン
ニング・チルティング駆動部７１は機械的な駆動部のた
めパンニング及びチルティングが可能な範囲が決まって
いる。図５１でａはパンニングできる範囲、ｂはチルテ
ィングできる範囲である。すなわち撮影枠１２０の範囲
内のみレンズ７０の光軸が動いて撮影できることにな
る。そして、撮影枠１２０の範囲で顔形状１１２を表示
枠１１１の中央に位置するようにパンニング・チルティ
ング駆動部７１を制御するという条件で、顔形状１１２
が移動できる範囲は１１２−１〜１１２−６の範囲、す
なわち矩形１１９の範囲である。
In this method, by contrast, the background is removed, so it suffices that the trimming shape 200 can be cut out of the shooting frame 120. That is, as shown in FIG. 48, the trimming shape 200 can move over the range 200-1 to 200-8, in other words anywhere inside the shooting frame 120. Comparing FIG. 47 with FIG. 48 shows that the movable area of the trimming shape 200 has expanded from the rectangle 119 to the shooting frame 120, which makes the method more robust against shaking of the captured image. Next, when the trimmed image 217 protrudes from the shooting frame 120 as shown in FIG. 49, the image inside the display frame 111 is displayed as it is on the video LCD 14, as described above, and, while the trimmed image 217 protrudes from the display frame 111, the normal trimmed image 203 with no missing portion stored in the trimmed image memory unit 56 is sent to the other party. Next, the case where the method of capturing an area larger than the display frame as the shooting frame is applied to the method of FIG. 85 is described. FIG. 50 is a configuration diagram showing the components needed to apply this method to the method of FIG. 85. FIG. 51 illustrates, for the conventional method, that is, for an image in which the face image and the background are combined, the range over which the face shape 112 can move while still being brought to the center of the display frame 111. Because the panning/tilting drive unit 71 is a mechanical drive, the range over which it can pan and tilt is fixed. In FIG. 51, a is the panning range and b is the tilting range. In other words, the optical axis of the lens 70 can move, and images can be captured, only within the shooting frame 120. Under the condition that the panning/tilting drive unit 71 is controlled so that the face shape 112 is positioned at the center of the display frame 111 within the shooting frame 120, the face shape 112 can move over the range 112-1 to 112-6, that is, the rectangle 119.

【００７５】一方、図５０の方式について、画像メモリ
４５に蓄積された映像データにトリミング形状２００が
入るように大まかにパンニング・チルティング駆動部７
１を制御すればよい。すなわち、本方式はパンニング・
チルティング駆動部７１を精密に正確に制御しなくても
よい利点がある。そして、領域抽出部５１でトリミング
形状２００を抽出して、トリミング形状２００とデジタ
ル映像５４を用いて、領域抽出部５１でトリミング映像
２０１を生成し、トリミング映像２０１は表示枠１１１
の中心に移動させて、相手側に伝送する。図５２はトリ
ミング形状２００について移動可能領域を示した図であ
る。すなわち、背景がないためトリミング形状２００は
撮影枠１２０に接するまで移動できる。すなわちトリミ
ング形状２００−１〜２００−８が移動可能範囲であ
る。移動可能範囲は撮影枠１２０の範囲と同等となる。
図５１と図５２とを比較すると、従来方式における顔形
状１１２の移動可能範囲は矩形１１９で、本方式の移動
可能範囲は撮影枠１２０であるので、本方式の方が、移
動可能範囲が広く、それだけ撮影した映像の揺れに対し
て強いといえる。次に、図５３のようにトリミング映像
２１７が撮影枠１２０からはみ出した場合、上記と同様
に、映像用ＬＣＤ１４に表示枠１１１の内部の映像をそ
のまま表示して、トリミング映像２１７が表示枠１１１
からはみ出している間は、トリミング映像メモリ部５６
に蓄積されている欠けていない正常なトリミング映像２
０３を相手に送ればよい。In the method of FIG. 50, on the other hand, the panning/tilting drive unit 71 only needs to be controlled roughly, so that the trimming shape 200 falls within the video data stored in the image memory 45. In other words, this method has the advantage that the panning/tilting drive unit 71 does not have to be controlled precisely. The region extraction unit 51 then extracts the trimming shape 200 and, using the trimming shape 200 and the digital video 54, generates the trimmed image 201, which is moved to the center of the display frame 111 and transmitted to the other party. FIG. 52 shows the movable area of the trimming shape 200. Because there is no background, the trimming shape 200 can move until it touches the shooting frame 120; that is, the trimming shapes 200-1 to 200-8 mark the movable range, which is equivalent to the range of the shooting frame 120. Comparing FIG. 51 with FIG. 52, the movable range of the face shape 112 in the conventional method is the rectangle 119, whereas the movable range in this method is the shooting frame 120. This method therefore has the wider movable range and is correspondingly more robust against shaking of the captured image. Next, when the trimmed image 217 protrudes from the shooting frame 120 as shown in FIG. 53, then, as above, the image inside the display frame 111 is displayed as is on the video LCD 14, and while the trimmed image 217 protrudes from the display frame 111, the normal, complete trimmed image 203 stored in the trimmed image memory unit 56 is sent to the other party.
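The difference between the two movable ranges can be illustrated numerically. The following is a minimal sketch, not part of the disclosure: it assumes that in the conventional method the display frame, centered on the subject, must stay entirely inside the shooting frame, whereas in this method only the trimming shape itself must stay inside it. The frame sizes and the names `Size`, `conventional_range` and `trimmed_range` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    width: int
    height: int


def conventional_range(shooting: Size, display: Size, subject: Size) -> Size:
    """Area the subject can occupy when the display frame, centered on the
    subject, must remain inside the shooting frame (the rectangle 119 case)."""
    return Size(
        shooting.width - display.width + subject.width,
        shooting.height - display.height + subject.height,
    )


def trimmed_range(shooting: Size) -> Size:
    """Area the trimming shape can occupy when only the shape itself must
    remain inside the shooting frame (no background is transmitted)."""
    return Size(shooting.width, shooting.height)


# Hypothetical pixel sizes, chosen only for illustration.
shooting_frame = Size(640, 480)  # shooting frame 120
display_frame = Size(320, 240)   # display frame 111
subject_shape = Size(120, 160)   # face shape 112 / trimming shape 200

print(conventional_range(shooting_frame, display_frame, subject_shape))
print(trimmed_range(shooting_frame))
```

Under these assumed sizes the conventional range is smaller than the shooting frame in both directions, which is the sense in which the text calls this method more robust against shaking.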

【００７６】また、上述のように図４６又は図５０のよ
うに構成した方式に対して、図２７のようにトリミング
映像２０３と一緒にテキスト文１３０等を表示させるこ
ともできる。この場合も、上記と同様に、トリミング映
像２０３を右側に寄せて、左側にテキスト文１３０の表
示領域を確保すればよい。例えば、図８６の方式で、背
景を一緒に表示させる従来の方式においては、図５４の
ように顔形状１１２の領域を表示枠１１１の右側に、そ
して左側にテキスト文１３０の表示領域を確保するため
に顔形状１１２と表示枠１１１を撮影枠１２０の内部を
移動させている。そして、図５５のように顔形状１１２
が右側ぎりぎりまで寄った場合でも、左側の領域にテキ
スト文１３０を確保できる。しかし、図５６のように顔
形状１１２が左側に寄った場合、右側にしかテキスト文
１３０を表示させる領域を確保できない。すなわち、従
来の方式で、テキスト文１３０を表示させようとすると
顔形状１１２の位置によってはテキスト文１３０の表示
領域が変わってしまい、相手側で表示される表示画面が
非常に見難くなってしまう。また、背景の上にテキスト
文１３０を重ねているため背景の映像によってはテキス
ト文１３０の内容が見難くなってしまう欠点がある。In addition, with the method configured as in FIG. 46 or FIG. 50 described above, a text 130 or the like can also be displayed together with the trimmed image 203, as shown in FIG. 27. In this case too, as above, the trimmed image 203 is shifted to the right and a display area for the text 130 is secured on the left. For example, in the conventional method of FIG. 86, which displays the background as well, the face shape 112 and the display frame 111 are moved inside the shooting frame 120 so that, as shown in FIG. 54, the face shape 112 occupies the right side of the display frame 111 and a display area for the text 130 is secured on the left. Even when the face shape 112 is pushed to the very right edge as shown in FIG. 55, the text 130 can still be placed in the left area. However, when the face shape 112 is shifted to the left as shown in FIG. 56, an area for the text 130 can be secured only on the right. In other words, with the conventional method the display area of the text 130 changes depending on the position of the face shape 112, and the screen displayed on the other party's side becomes very hard to read. A further drawback is that, because the text 130 is superimposed on the background, its content can be hard to read depending on the background image.

【００７７】しかし、本方式はトリミング形状２００が
撮影枠１２０の内部を動く限り、図２７のように、表示
枠１１１内部でトリミング映像２０３とテキスト文１３
０は所定の位置に安定して表示される。そして、テキス
ト文１３０のバックは背景が無いため、テキスト文１３
０は鮮明で見易くなる。（実施形態８）次に、実施形態８について説明する。ユ
ーザの携帯情報端末装置に相手側の被写体映像を表示さ
せて、相互に会話を行っている過程で、ユーザ側から、
カメラ部４で周囲の景色あるいは特定の対象物を撮影
し、撮影した映像を相手側に伝送する用途が考えられ
る。この場合、ユーザ側も、相手側もこの撮影した映像
を見ながら相互に会話を行うのが一般的な使い道であ
る。ユーザ側は、対象物を撮影して、図５７に示すよう
な景色を相手側に送るのと同時に、映像用ＬＣＤ１４で
モニタし、カメラアングルなどを調整して最適な状態に
する必要がある。そこで、この実施形態では、カメラ向
きセンサ部２８の出力が、カメラ部４が端末本体１から
外されてカメラ部４を自由に色々な方向に向けることが
できる状態と、カメラ部４が映像用ＬＣＤ１４及びテキ
スト用ＬＣＤ１６の表示面に対して反対側の方向に切り
替えられた状態とを検知した場合に、ユーザ側が対象物
を撮影する状態と見倣して、自動的に主制御部１１は映
像用ＬＣＤ１４に表示されている相手側の顔映像１０５
から、カメラ部４で撮影した映像すなわち図５７に示す
ような景色に切り替えて表示する。With this method, however, as long as the trimming shape 200 moves inside the shooting frame 120, the trimmed image 203 and the text 130 are displayed stably at predetermined positions inside the display frame 111, as shown in FIG. 27. Moreover, because there is no background behind the text 130, the text 130 is sharp and easy to read.
(Embodiment 8) Next, Embodiment 8 will be described. While the other party's subject image is displayed on the user's portable information terminal device and the two are conversing, the user may photograph the surrounding scenery or a specific object with the camera unit 4 and transmit the captured video to the other party. In this case, the user and the other party typically converse while both watch the captured video. The user must photograph the object and send a scene such as that shown in FIG. 57 to the other party while at the same time monitoring it on the video LCD 14 and adjusting the camera angle and so on to the optimum state. In this embodiment, therefore, when the output of the camera direction sensor unit 28 indicates a state in which the camera unit 4 has been detached from the terminal body 1 and can be pointed freely in any direction, or a state in which the camera unit 4 has been switched to face away from the display surfaces of the video LCD 14 and the text LCD 16, the main control unit 11 regards the user as photographing an object and automatically switches the video LCD 14 from the other party's face image 105 to the video captured by the camera unit 4, that is, the scene shown in FIG. 57.
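The display switching of Embodiment 8 amounts to a two-input decision. The sketch below is illustrative only: it assumes the output of the camera direction sensor unit 28 can be reduced to two booleans, and the names `Display` and `select_display` are hypothetical. The return to the face image when the camera is mounted and faces the display side follows the paragraph after this one.

```python
from enum import Enum


class Display(Enum):
    PARTNER_FACE = "face image 105 of the other party"
    CAMERA_VIEW = "video captured by camera unit 4"


def select_display(camera_attached: bool, camera_faces_display_side: bool) -> Display:
    """Choose what the video LCD 14 shows from the camera state.

    The user is treated as photographing an object when the camera is detached
    or faces away from the display surfaces; otherwise the other party's face
    image is shown.
    """
    if camera_attached and camera_faces_display_side:
        return Display.PARTNER_FACE
    return Display.CAMERA_VIEW


for attached in (True, False):
    for faces_display in (True, False):
        print(attached, faces_display, select_display(attached, faces_display).value)
```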

【００７８】そして、カメラ向きセンサ部２８の出力
が、カメラ部４が端末本体１に装着され、かつカメラ部
４が映像用ＬＣＤ１４及びテキスト用ＬＣＤ１６の表示
面の方向に切り替えられた状態を検知した場合は、再び
映像用ＬＣＤ１４に相手側の顔映像１０５を表示する。
以上のように、この実施形態はカメラの装着状態を検知
するカメラ向きセンサ部２８の出力から、ユーザ側が、
周囲の景色あるいは特定の対象物を撮影しているか、ユ
ーザ自身の顔を撮影しているかを検知する。そして、映
像用ＬＣＤ１４の表示画像を、対象物そのものについて
撮影した映像、あるいは相手側の顔映像１０５のどちら
かに自動的に切りかえることを目的としていて、映像用
ＬＣＤ１４の画面内容を切り替えるための専用のスイッ
チ類を必要としないのが利点である。（実施形態９）次に、実施形態９について説明する。こ
の実施形態は、実施形態２において、端末本体１に赤外
線ＬＥＤと赤外線カメラを装着し、被写体の領域を赤外
線カメラが撮影した画像を用いて抽出するものである。
図５８に、この実施形態９の携帯情報端末装置の構成を
示す。同図において、６５は形状制御部、６６は形状可
変ボタン、６７は映像メモリ部である。When the output of the camera direction sensor unit 28 indicates that the camera unit 4 is mounted on the terminal body 1 and has been switched to face the same side as the display surfaces of the video LCD 14 and the text LCD 16, the face image 105 of the other party is displayed on the video LCD 14 again. As described above, this embodiment uses the output of the camera direction sensor unit 28, which detects the mounting state of the camera, to determine whether the user is photographing the surrounding scenery or a specific object, or photographing the user's own face. Its purpose is to switch the image on the video LCD 14 automatically between the video of the object and the face image 105 of the other party, and its advantage is that no dedicated switches are needed for changing the screen content of the video LCD 14.
(Embodiment 9) Next, Embodiment 9 will be described. This embodiment adds an infrared LED and an infrared camera to the terminal body 1 of Embodiment 2 and extracts the subject region using the image captured by the infrared camera. FIG. 58 shows the configuration of the portable information terminal device of Embodiment 9. In the figure, 65 is a shape control unit, 66 is a shape variable button, and 67 is a video memory unit.

【００７９】図５９に端末本体１、カメラ部４及び赤外
線カメラ５の外観を示す。形状制御部６５は、赤外線カ
メラ５で撮影された画像からフレーム単位で被写体形状
を抽出し、形状情報を拡大あるいは縮小して、表示枠１
１１に対する被写体形状が所定の比率になるように制御
する。映像エンコーダ２７は画像メモリ４５から与えら
れるデジタル映像５４と形状制御部６５から与えられる
形状情報２３０を用いてＭＰＥＧ４方式によるオブジェ
クト符号化映像データを得る。瞳孔抽出部５８は赤外線
カメラ５で撮影された画像から瞳孔部分を抽出する。形
状可変ボタン６６は、表示枠に１１１対する被写体形状
の比率を変更する。赤外線カメラ５が端末本体１にカメ
ラ部４の撮影方向と同じ方向に装着されている状態で、
以下の処理を行う。実施形態３で説明したのと同様に、
赤外線カメラ５で被写体を撮影し、まず、瞳孔抽出部５
８で瞳孔部分を抽出する。ここで、瞳孔を抽出する方法
は例えば特開平９−１７５２２４によって抽出すること
ができる。この方法は眼底の網膜では可視光を吸収して
赤外光を反射するため、図２０に示すように被写体の瞳
孔部分２０４、２０５だけが非常に高輝度で撮影され
る。FIG. 59 shows the appearance of the terminal body 1, the camera unit 4, and the infrared camera 5. The shape control unit 65 extracts the subject shape frame by frame from the image captured by the infrared camera 5 and enlarges or reduces the shape information so that the subject shape occupies a predetermined ratio of the display frame 111. The video encoder 27 produces object-coded video data in the MPEG-4 format from the digital video 54 supplied by the image memory 45 and the shape information 230 supplied by the shape control unit 65. The pupil extraction unit 58 extracts the pupil portions from the image captured by the infrared camera 5. The shape variable button 66 changes the ratio of the subject shape to the display frame 111. The following processing is performed with the infrared camera 5 mounted on the terminal body 1 so that it faces the same direction as the shooting direction of the camera unit 4. As described in Embodiment 3, the subject is photographed by the infrared camera 5 and the pupil extraction unit 58 first extracts the pupil portions. The pupils can be extracted, for example, by the method of Japanese Patent Application Laid-Open No. 9-175224. Because the retina at the fundus absorbs visible light and reflects infrared light, with this method only the pupil portions 204 and 205 of the subject are captured at very high brightness, as shown in FIG. 20.

【００８０】瞳孔部分２０４、２０５は周囲の顔の皮膚
の部分の反射光量とは比較にならないほど大きく、適切
な閾値を設定することにより、瞳孔部分ｇ、ｈを抽出す
ることができる。この瞳孔部分ｇとｈの中心が表示枠の
真ん中となるように移動させる。次に、形状制御部６５
で被写体の形状情報２３０を抽出する。被写体は背景と
比較して赤外線カメラ５からの距離が近く、赤外光の反
射光量が大きくなる為、適切な閾値を設定し、入力され
た画像を２値化回路で２値化することにより、被写体の
形状情報２３０を抽出することができる。ここで、この
形状情報２３０から、表示枠１１１に対する被写体形状
の比率を計算し、一定の値、例えば４０％とすると、こ
の計算した比率が常に４０％になるように形状情報２３
０を制御する。また、被写体が複数の場合、図６０のよ
うに、瞳孔抽出部５８により瞳孔部分ｇ、ｈ及び瞳孔部
分ｇ１、ｈ１が抽出される。この場合は瞳孔部分ｇとｈ
の中心ｉと、瞳孔部分ｇ１、ｈ１の中心ｉ１より、その
中点ｊを抽出し、ｊが表示枠の真ん中になるように移動
させる。この場合、表示枠に対する被写体形状の比率は
例えば５０％と大きく設定しておけばよい。The amount of light reflected from the pupil portions 204 and 205 is incomparably larger than that reflected from the surrounding facial skin, so the pupil portions g and h can be extracted by setting an appropriate threshold. The image is moved so that the center of the pupil portions g and h lies at the middle of the display frame. Next, the shape control unit 65 extracts the shape information 230 of the subject. Because the subject is closer to the infrared camera 5 than the background and therefore reflects more infrared light, the shape information 230 of the subject can be extracted by setting an appropriate threshold and binarizing the input image with a binarization circuit. From this shape information 230, the ratio of the subject shape to the display frame 111 is calculated; if the target is a fixed value, for example 40%, the shape information 230 is controlled so that the calculated ratio is always 40%. When there are several subjects, the pupil extraction unit 58 extracts the pupil portions g and h and the pupil portions g1 and h1, as shown in FIG. 60. In this case, the midpoint j between the center i of the pupil portions g and h and the center i1 of the pupil portions g1 and h1 is found, and the image is moved so that j lies at the middle of the display frame. Here the ratio of the subject shape to the display frame should be set larger, for example 50%.
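The centering step can be sketched as follows. This is a minimal illustration that assumes the pupil portions have already been reduced to pixel coordinates; the coordinate values, the frame size and the names `midpoint` and `centering_shift` are hypothetical. For one subject the reference point is the center i of g and h; for two subjects it is the midpoint j of i and i1.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def midpoint(p: Point, q: Point) -> Point:
    """Midpoint of two image points."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def centering_shift(pupil_pairs: List[Tuple[Point, Point]],
                    frame_width: float, frame_height: float) -> Point:
    """Return the (dx, dy) translation that brings the reference point
    (i for one subject, j for two subjects) to the middle of the display frame."""
    if not 1 <= len(pupil_pairs) <= 2:
        raise ValueError("this sketch handles one or two subjects only")
    centers = [midpoint(g, h) for g, h in pupil_pairs]
    reference = centers[0] if len(centers) == 1 else midpoint(centers[0], centers[1])
    return (frame_width / 2 - reference[0], frame_height / 2 - reference[1])


# Hypothetical pupil coordinates in a 320 x 240 display frame.
one_subject = [((100.0, 80.0), (140.0, 80.0))]
two_subjects = one_subject + [((200.0, 100.0), (240.0, 100.0))]

print(centering_shift(one_subject, 320, 240))
print(centering_shift(two_subjects, 320, 240))
```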

【００８１】カメラ部４が図８２のような映像を撮影し
ている場合、形状制御部６５では図６１（ａ）のような
被写体形状が抽出される。図６１（ａ）より形状の面積
を計算すると３０％であるから、これを図６１（ｂ）の
ように４０％に拡大し、形状情報２３０とする。図６１
（ｃ）のような場合は形状面積が４５％であるから縮小
し、形状情報２３０とする。この比率は形状可変ボタン
６６により、変化させることが可能である。図６２
（ａ）のような場合は、瞳孔抽出部５８では瞳孔部分が
片方しか、抽出されないので、被写体が表示枠１１１か
ら外れていることが検出できる。また、図６２（ｂ）及
び（ｃ）の場合は瞳孔部分は抽出できるが、被写体が表
示枠１１１から外れてしまっている。このような場合
は、被写体映像が表示枠１１１に接触した状態を調べ、
図２０の瞳孔部分ｇ、ｈから推定できるトリミング形状
２００の４辺のうち一つの辺が表示枠１１１に接触した
ことにより、検出することができる。このような場合
は、実施形態３の場合のように、領域抽出部５１で正常
に被写体映像を抽出したとき、相手に被写体映像を送る
のと同時に映像メモリ部６７に蓄積しておいた被写体映
像を静止画として送る。When the camera unit 4 is capturing an image such as that in FIG. 82, the shape control unit 65 extracts a subject shape such as that in FIG. 61(a). The area of the shape calculated from FIG. 61(a) is 30%, so it is enlarged to 40% as shown in FIG. 61(b) and used as the shape information 230. In a case such as FIG. 61(c), the shape area is 45%, so it is reduced and used as the shape information 230. This ratio can be changed with the shape variable button 66. In a case such as FIG. 62(a), the pupil extraction unit 58 extracts only one pupil portion, so it can be detected that the subject is outside the display frame 111. In the cases of FIGS. 62(b) and 62(c), the pupil portions can be extracted but the subject has nevertheless moved out of the display frame 111. These cases can be detected by checking whether the subject image touches the display frame 111, specifically whether one of the four sides of the trimming shape 200, which can be estimated from the pupil portions g and h of FIG. 20, touches the display frame 111. In such cases, as in Embodiment 3, the subject image that was stored in the video memory unit 67 while the region extraction unit 51 was extracting the subject image normally and sending it to the other party is sent as a still image.
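The enlargement from 30% to 40% and the reduction from 45% can be worked through as a scale factor. The sketch below assumes that the ratio is an area ratio, as the text's reference to the "area of the shape" suggests, and that the shape is scaled uniformly in both directions, so the linear factor is the square root of the ratio of areas. The names `area_ratio` and `linear_scale` and the toy mask are hypothetical.

```python
import math
from typing import List


def area_ratio(mask: List[List[int]]) -> float:
    """Fraction of display-frame pixels covered by the binarized subject shape."""
    total = sum(len(row) for row in mask)
    covered = sum(sum(row) for row in mask)
    return covered / total


def linear_scale(current_ratio: float, target_ratio: float) -> float:
    """Uniform scale factor that changes the area ratio from current to target."""
    if current_ratio <= 0 or target_ratio <= 0:
        raise ValueError("ratios must be positive")
    return math.sqrt(target_ratio / current_ratio)


# Toy 2 x 5 mask with 3 of 10 pixels set, i.e. a 30% area ratio.
toy_mask = [[0, 1, 1, 0, 0],
            [0, 1, 0, 0, 0]]

print(area_ratio(toy_mask))
print(round(linear_scale(0.30, 0.40), 4))  # enlargement, as in FIG. 61(a) to (b)
print(round(linear_scale(0.45, 0.40), 4))  # reduction, as in FIG. 61(c)
```

A factor above 1 corresponds to enlargement and a factor below 1 to reduction; changing the target ratio models the effect of the shape variable button 66.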

【００８２】この映像メモリ部６７に蓄積される被写体
映像はフレーム毎あるいは所定のフレーム周期で更新さ
れるが、被写体が表示枠１１１から外れていることを検
出したら映像メモリ部６７の被写体映像の蓄積を停止す
る。（実施形態１０）実施形態９においては、被写体の領域
を、赤外線カメラ５を用いて抽出したが、この実施形態
１０は赤外線の代わりに超音波を用いて抽出するもので
ある。この実施形態の説明に必要な構成部分のみを抜き
出した構成を図６３に示す。超音波送信部６２より超音
波パルスを送信すると、被写体及び背景により超音波パ
ルスが反射される為、超音波受信部６３では、前述の実
施形態３において図２５で説明したように超音波パルス
の応答時間を測定することができる。被写体は端末本体
１に最も距離が近いので、応答時間が最も短いｔ１とな
る点の形状を調べることにより、実施形態９の場合と同
様にして被写体の形状を抽出することができる。（実施形態１１）次に、実施形態１１について説明す
る。図６４は、実施形態１１に係る携帯情報端末装置の
構成を示すブロック図である。この図において、端末本
体１中の３３は画面切替ボタン、３９は手揺れセンサで
ある。また、カメラ部４中の６８はＣＣＤ、６９は信号
処理プロセッサである。The subject image stored in the video memory unit 67 is updated every frame or at a predetermined frame period, but when it is detected that the subject has moved out of the display frame 111, storage of the subject image in the video memory unit 67 is stopped.
(Embodiment 10) In Embodiment 9 the subject region was extracted using the infrared camera 5; Embodiment 10 extracts it using ultrasonic waves instead of infrared light. FIG. 63 shows only the components needed to describe this embodiment. When the ultrasonic transmitting unit 62 transmits an ultrasonic pulse, the pulse is reflected by the subject and the background, so the ultrasonic receiving unit 63 can measure the response time of the ultrasonic pulse as described with reference to FIG. 25 in Embodiment 3. Because the subject is closest to the terminal body 1, the shape of the subject can be extracted in the same way as in Embodiment 9 by examining the shape formed by the points with the shortest response time t1.
(Embodiment 11) Next, Embodiment 11 will be described. FIG. 64 is a block diagram showing the configuration of the portable information terminal device according to Embodiment 11. In the figure, 33 in the terminal body 1 is a screen switching button and 39 is a hand shake sensor; 68 in the camera unit 4 is a CCD and 69 is a signal processor.
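The nearest-echo rule of Embodiment 10 can be sketched as a simple mask over a grid of response times. This is an illustrative assumption rather than the disclosed circuit: it supposes that a response time is available per image position and that points within a small tolerance of the shortest time t1 belong to the subject. The tolerance, the sample grid and the name `nearest_mask` are hypothetical.

```python
from typing import List


def nearest_mask(response_times: List[List[float]], tolerance: float) -> List[List[int]]:
    """Mark positions whose echo response time is within `tolerance` of the
    shortest response time t1 (1 = subject, 0 = background)."""
    t1 = min(min(row) for row in response_times)
    return [[1 if t - t1 <= tolerance else 0 for t in row] for row in response_times]


# Hypothetical response times in milliseconds: the subject is near, the background far.
times_ms = [
    [9.0, 9.0, 9.0],
    [9.0, 3.0, 3.1],
    [9.0, 3.0, 9.0],
]

for row in nearest_mask(times_ms, tolerance=0.2):
    print(row)
```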

【００８３】画面切替ボタン３３は映像用ＬＣＤ１４に
対して、相手から送られてきた映像を表示する場合とカ
メラ部４で撮像したユーザの上半身映像を表示する場合
とで切替を行なうためのものである。なお、カメラ部４
は、レンズ７０を通して入射した光をＣＣＤ６８により
撮像し、信号処理プロセッサ６９でデジタル映像データ
に変換して、カメラＩＦ部２５に印加する。図６５は、
この実施形態における端末本体１及びカメラ部４の外観
を示す図である。なお、スクロールダイヤル３１と操作
ボタン３２と画面切替ボタン３３は、人間の手の大きさ
を考慮し、筐体６０の端部を手のひらに載せた状態で、
同じ手の親指でスクロールダイヤル３１を操作しつつ、
同じ手の残りの指で操作ボタン３２と画面切替ボタン３
３とを操作可能なように相対的な位置が決められてい
る。この動作について説明する。図８３に示すように、
相手と通話中にユーザの持っている端末本体１の映像用
ＬＣＤ１４に相手の上半身映像が表示され、また、図８
２に示すように、カメラ部４で撮影された上半身映像
が、映像エンコーダ２７で符号化して、相手に送られ、
相手の携帯情報端間装置の表示部に表示されているもの
とする。The screen switching button 33 switches the video LCD 14 between displaying the video sent from the other party and displaying the upper body image of the user captured by the camera unit 4. In the camera unit 4, light entering through the lens 70 is captured by the CCD 68, converted into digital video data by the signal processor 69, and applied to the camera IF unit 25. FIG. 65 shows the appearance of the terminal body 1 and the camera unit 4 in this embodiment. The relative positions of the scroll dial 31, the operation button 32, and the screen switching button 33 are chosen with the size of the human hand in mind, so that with the end of the housing 60 resting on the palm, the scroll dial 31 can be operated with the thumb while the operation button 32 and the screen switching button 33 are operated with the remaining fingers of the same hand. The operation will now be described. Assume that, as shown in FIG. 83, the upper body image of the other party is displayed on the video LCD 14 of the terminal body 1 held by the user during a call, and that, as shown in FIG. 82, the upper body image captured by the camera unit 4 is encoded by the video encoder 27, sent to the other party, and displayed on the display unit of the other party's portable information terminal device.

【００８４】ここで、ユーザが端末本体１を手で持って
いる過程で、手揺れが起きると手揺れセンサ３９が手揺
れの大きさを検知して主制御部１１がその大きさを知
る。ここで、手揺れセンサ３９はジャイロセンサや加速
度センサ、回転角速度センサなどが使用される。そし
て、手揺れセンサ３９から得られる揺れ検出出力はＸ軸
（水平軸）成分とＹ軸成分（垂直軸）に分けて出力され
る。主制御部１１はこれらＸ軸成分とＹ軸成分により合
成されたベクトルの絶対値（揺れ変位）を計算する。そ
して、主制御部１１は図６６のように前記揺れ変位が予
め決められていた閾値ｅを越えた時間ａから所定の時間
ｂの間、例えば図３０のようにユーザの上半身が表示枠
１１１の端に寄ってしまった映像を映像用ＬＣＤ１４に
表示している。そして、ユーザはこのような映像を見る
ことで、手揺れがひどくなったことが分かり、前記ユー
ザの映像が表示している期間ｂ内に、ユーザは端末本体
１を持ち替えて、図８２のように上半身映像が表示枠１
１１の中心に来るように修正する。ここで、端末本体１
の使い方として、端末本体１をユーザ前方の机などの上
に固定して置くことがある。この場合、相手に送られる
ユーザの上半身映像は手揺れなど起きないため安定する
が、カメラ部４の向きがユーザの方向を向いているとは
限らないため、撮影したユーザの上半身が表示枠１１１
に収まらないこともある。そこで、カメラ部４の向きを
指で操作して、ユーザの上半身映像が映像用ＬＣＤ１４
の表示枠１１１に収まるように調整することがある。If hand shake occurs while the user is holding the terminal body 1, the hand shake sensor 39 detects its magnitude and reports it to the main control unit 11. A gyro sensor, an acceleration sensor, a rotational angular velocity sensor, or the like is used as the hand shake sensor 39. The shake detection output of the hand shake sensor 39 is delivered as separate X-axis (horizontal) and Y-axis (vertical) components, and the main control unit 11 calculates the absolute value of the vector formed by these two components (the shake displacement). As shown in FIG. 66, for a predetermined time b starting at time a, when the shake displacement exceeds a predetermined threshold e, the main control unit 11 displays on the video LCD 14 the user's own image, for example one in which the user's upper body has drifted to the edge of the display frame 111 as in FIG. 30. Seeing this image, the user realizes that the hand shake has become severe and, within the period b during which the user's image is displayed, adjusts his or her grip on the terminal body 1 so that the upper body image returns to the center of the display frame 111 as in FIG. 82. The terminal body 1 may also be used while placed in a fixed position on a desk or the like in front of the user. In this case, the upper body image sent to the other party is stable because there is no hand shake, but the camera unit 4 does not necessarily point toward the user, so the captured upper body may not fit within the display frame 111. The user may therefore adjust the direction of the camera unit 4 with a finger so that the upper body image fits within the display frame 111 of the video LCD 14.
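The threshold test of Embodiment 11 can be sketched as follows. The sketch assumes sampled sensor output as (time, X, Y) tuples and assumes that a crossing during an active display period does not extend that period; both are illustrative choices, and the names `shake_displacement` and `self_view_flags` are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def shake_displacement(x: float, y: float) -> float:
    """Absolute value of the vector formed by the X-axis and Y-axis components."""
    return math.hypot(x, y)


def self_view_flags(samples: List[Tuple[float, float, float]],
                    e: float, b: float) -> List[bool]:
    """For each (t, x, y) sample, report whether the user's own image is shown.

    The user's image is shown for a period b starting at the sample where the
    shake displacement exceeds the threshold e.
    """
    flags = []
    show_until: Optional[float] = None
    for t, x, y in samples:
        expired = show_until is None or t >= show_until
        if shake_displacement(x, y) > e and expired:
            show_until = t + b
        flags.append(show_until is not None and t < show_until)
    return flags


# Hypothetical samples: one strong shake at t = 1.
samples = [(0, 0.0, 0.0), (1, 3.0, 4.0), (2, 0.0, 0.0), (3, 0.0, 0.0), (4, 0.0, 0.0)]
print(self_view_flags(samples, e=4.0, b=2.0))
```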

【００８５】そこで、図６７に示すように、手揺れセン
サ３９をカメラ部４の内部に取り付けてカメラ部４の動
きだけを手揺れセンサ３９で検出してもよい。カメラ部
４に手揺れセンサー３９を取り付けておけば、端末本体
１が揺れても一緒にカメラ部４も揺れるので、端末本体
１の揺れを検出することができる利点がある。（実施形態１２）次に、実施形態１２について説明す
る。実施形態１１では、ユーザが端末本体１を手で持っ
ている際に、手揺れがひどくなり手揺れの変位が所定の
大きさを越えたら、映像用ＬＣＤ１４に相手の上半身映
像の代わりにユーザの上半身映像を表示させていた。し
かし、短期間とはいえ映像用ＬＣＤ１４に相手の上半身
が映らなくなり、その期間相手を観察できない欠点があ
る。そこで、この実施形態は図６８（ａ）のように手揺
れの変位が所定の閾値ｅを越えたら、相手の上半身映像
１４１と一緒にユーザの上半身映像１４０をマルチウイ
ンドウで表示させることを特徴としている。この実施形
態では、相手の上半身映像１４１にユーザの上半身映像
１４０が重なり見難くなるが、現在注目すべく映像はユ
ーザの上半身映像１４０がどのように映っているかが問
題のため相手の上半身映像１４１に注目することが少な
く、またユーザが正しいポーズを取るための時間も短時
間で済むため問題が起きない。また、図６８（ｂ）のよ
うに手揺れの変位が所定の大きさを越えたとき、ユーザ
の上半身映像１４０を大きく、そのユーザの上半身映像
１４０に重ねて相手の上半身映像１４１を小さくして重
ねて表示してもよい。Accordingly, as shown in FIG. 67, the hand shake sensor 39 may be mounted inside the camera unit 4 so that it detects only the movement of the camera unit 4. With the hand shake sensor 39 mounted in the camera unit 4, the camera unit 4 shakes together with the terminal body 1 whenever the terminal body 1 shakes, so this arrangement has the advantage that shaking of the terminal body 1 can also be detected.
(Embodiment 12) Next, Embodiment 12 will be described. In Embodiment 11, when hand shake became severe while the user was holding the terminal body 1 and the shake displacement exceeded a predetermined magnitude, the user's upper body image was displayed on the video LCD 14 in place of the other party's upper body image. The drawback is that, even if only briefly, the other party's upper body disappears from the video LCD 14 and the other party cannot be observed during that period. This embodiment is therefore characterized in that, when the shake displacement exceeds the predetermined threshold e, the user's upper body image 140 is displayed in a multi-window together with the other party's upper body image 141, as shown in FIG. 68(a). In this embodiment the user's upper body image 140 overlaps the other party's upper body image 141 and partly obscures it, but this causes no problem: what matters at that moment is how the user's upper body image 140 appears, so the user pays little attention to the other party's upper body image 141, and the user needs only a short time to correct his or her pose. Alternatively, as shown in FIG. 68(b), when the shake displacement exceeds the predetermined magnitude, the user's upper body image 140 may be displayed large with a reduced version of the other party's upper body image 141 superimposed on it.

【００８６】（実施形態１３）次に、実施形態１３につ
いて説明する。実施形態１２において、映像用ＬＣＤ１
４に相手の上半身映像１４１とユーザの上半身映像１４
０をマルチウインドウで表示する場合、ユーザの上半身
映像１４０の大きさは固定された大きさだったが、この
実施形態はユーザの上半身映像１４０を手揺れ変位の大
きさに応じて可変にすることを特徴としている。図６９
のように、揺れ変位の閾値を複数の等級に分けて、例え
ばｅ１、ｅ２及びｅ３と３段階に設定しておく。そし
て、揺れの変位の立ち上がりピークすなわちａ点が前記
閾値に収まっているか調べる。前記ピーク値がｅ１から
ｅ２の間にあれば、変位が比較的小さいとみて、図７０
（ａ）のように小さ目なユーザの上半身映像１４０を相
手の上半身映像１４１の上に重ねる。前記ピーク値がｅ
２からｅ３の間にあれば、図７０（ｂ）のように中程度
のユーザの上半身映像１４０を相手の上半身映像１４１
に重ねる。また、前記ピーク値がｅ３以上の場合、図７
１のように相手の上半身映像１４１の代わりに完全にユ
ーザの上半身映像１４０に置き換える。すなわち、この
実施形態は、揺れの変位が小さいときは相手の上半身映
像１４１の上に比較的小さなユーザの上半身映像１４０
を重ねることで、相手の上半身映像１４１をマルチウイ
ンドウによって重ねた際に起きる隠蔽した表示を少なく
することができる。また、前記変位が大きくなるとユー
ザの上半身映像１４０を大きく表示することにより、ユ
ーザが端末本体１の持つ方向を調整しやすくすることを
目的としている。(Embodiment 13) Next, Embodiment 13 will be described. In Embodiment 12, when the other party's upper body image 141 and the user's upper body image 140 were displayed in a multi-window on the video LCD 14, the size of the user's upper body image 140 was fixed; this embodiment is characterized in that the size of the user's upper body image 140 varies with the magnitude of the shake displacement. As shown in FIG. 69, the shake displacement threshold is divided into several grades, for example three levels e1, e2, and e3. The rising peak of the shake displacement, that is, point a, is then checked against these thresholds. If the peak value lies between e1 and e2, the displacement is regarded as relatively small and a small user upper body image 140 is overlaid on the other party's upper body image 141, as shown in FIG. 70(a). If the peak value lies between e2 and e3, a medium-sized user upper body image 140 is overlaid on the other party's upper body image 141, as shown in FIG. 70(b). If the peak value is e3 or greater, the other party's upper body image 141 is replaced entirely by the user's upper body image 140, as shown in FIG. 71. In other words, when the shake displacement is small, overlaying a relatively small user upper body image 140 reduces how much of the other party's upper body image 141 is hidden by the multi-window overlay. When the displacement is large, displaying the user's upper body image 140 larger makes it easier for the user to adjust the direction in which the terminal body 1 is held.
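The graded thresholds can be written as a simple classification of the peak value. In the sketch below, the behavior below e1 (the other party's image only) and the treatment of values exactly equal to e1 or e2 are assumptions, since the text specifies only the ranges "between e1 and e2", "between e2 and e3" and "e3 or greater". The name `overlay_mode` and the threshold values are hypothetical.

```python
def overlay_mode(peak: float, e1: float, e2: float, e3: float) -> str:
    """Map the peak shake displacement to a display mode for the video LCD 14."""
    if not e1 < e2 < e3:
        raise ValueError("thresholds must satisfy e1 < e2 < e3")
    if peak < e1:
        return "partner only"         # assumed: no overlay below the lowest threshold
    if peak < e2:
        return "small self overlay"   # FIG. 70(a)
    if peak < e3:
        return "medium self overlay"  # FIG. 70(b)
    return "self only"                # FIG. 71


# Hypothetical thresholds, in arbitrary displacement units.
for peak in (0.5, 1.5, 2.5, 3.0):
    print(peak, overlay_mode(peak, e1=1.0, e2=2.0, e3=3.0))
```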

【００８７】（実施形態１４）次に、実施形態１４につ
いて説明する。実施形態１１〜実施形態１３で、揺れ変
位を調べる際に図６６（又は図６９）のように揺れ変位
が所定の閾値ｅ（又はｅ１）より大きくなってから所定
の期間ｂの間、ユーザの上半身映像１４０を映像用ＬＣ
Ｄ１４に表示させている。そして、ユーザはこの期間に
映像用ＬＣＤ１４に表示されたユーザの上半身映像１４
０を映像用ＬＣＤ１４の表示枠１１１に収まるように端
末本体１の持ち方を変えたりしている。しかし、前記期
間ｂが短いと端末本体１を動かしているうちに期間ｂが
過ぎてしまうことがある。また、期間が長すぎるとユー
ザは端末本体１を動かしてとっくにユーザの上半身映像
１４０が映像用ＬＣＤ１４の表示枠１１１に収まって
も、まだ相手の上半身映像１４１に切り替わらないの
で、いらいらしてくる場合がある。そこで、この実施形
態は、この切り替わった期間にユーザが端末本体１を動
かしている期間だけユーザの上半身映像１４０を映像用
ＬＣＤ１４に表示することを目的としている。図７２の
ように、揺れの変位１３１のピークを結んだ包絡線１３
２を描く。そして、包絡線１３２が閾値ｅより大きくな
った時点から時間ａを過ぎてから映像をユーザの上半身
映像１４０に切り替える。時間ａはこの期間に包絡線１
３２が閾値ｅより下がった場合は、一時的な揺れのピー
ク値と見て無視している。(Embodiment 14) Next, Embodiment 14 will be described. In Embodiments 11 to 13, the user's upper body image 140 is displayed on the video LCD 14 for a predetermined period b after the shake displacement exceeds the predetermined threshold e (or e1), as shown in FIG. 66 (or FIG. 69). During this period the user changes how the terminal body 1 is held so that the user's upper body image 140 shown on the video LCD 14 fits within the display frame 111. However, if the period b is short, it may expire while the user is still moving the terminal body 1. Conversely, if the period is too long, the display still does not switch back to the other party's upper body image 141 even though the user has long since moved the terminal body 1 and brought the upper body image 140 within the display frame 111 of the video LCD 14, which can be irritating. This embodiment therefore aims to display the user's upper body image 140 on the video LCD 14 only while the user is actually moving the terminal body 1. As shown in FIG. 72, an envelope 132 connecting the peaks of the shake displacement 131 is drawn. The display is switched to the user's upper body image 140 once time a has elapsed after the envelope 132 rises above the threshold e. If the envelope 132 falls below the threshold e within the time a, the excursion is regarded as a momentary shake peak and ignored.

【００８８】そして、包絡線１３２が閾値ｅより下回っ
てから時間ａを過ぎてもまだ下回っている場合、映像を
相手の上半身映像１４１に切り替える。ここで、包絡線
１３２が閾値ｅを下回ってから時間ａ以内にまた、包絡
線１３２が閾値ｅを上回った場合、包絡線１３２が一時
的な谷間を生じたとして無視している。尚、この実施形
態では、相手の上半身映像１４１の表示からユーザの上
半身映像１４０の表示に切り替える際に、実施形態１２
及び実施形態１３のように相手の上半身映像１４１とユ
ーザの上半身映像１４０とをマルチウインドウで一緒に
表示させてもよい。（実施形態１５）次に、実施形態１５について説明す
る。この実施形態は、ユーザの上半身映像１４０と相手
の上半身映像１４１とを周期的に切り替えることを特徴
としている。すなわち、図７３のように、ユーザの上半
身映像１４０の表示時間をａ、表示間隔ｂとし、ユーザ
の上半身映像１４０の表示時間以外は相手の上半身映像
１４１を表示させようにして周期的に繰り返す処理を行
う。この実施形態は揺れ変位１３１など調べないで、周
期的にユーザの上半身映像１４０を表示させているの
で、処理が簡単になる利点がある。If the envelope 132 is still below the threshold e when time a has elapsed after it fell below the threshold, the display is switched to the other party's upper body image 141. If the envelope 132 rises above the threshold e again within the time a after falling below it, the dip is regarded as a momentary valley in the envelope 132 and ignored. In this embodiment, when switching from the other party's upper body image 141 to the user's upper body image 140, the two images may instead be displayed together in a multi-window as in Embodiments 12 and 13.
(Embodiment 15) Next, Embodiment 15 will be described. This embodiment is characterized in that the display alternates periodically between the user's upper body image 140 and the other party's upper body image 141. That is, as shown in FIG. 73, with a display time a and a display interval b for the user's upper body image 140, the other party's upper body image 141 is displayed at all times other than the display time of the user's upper body image 140, and this cycle is repeated. Because the user's upper body image 140 is displayed periodically without examining the shake displacement 131 or the like, this embodiment has the advantage of simpler processing.
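The switching rule of Embodiment 14, in which a crossing of the threshold e takes effect only if it persists for the time a, can be sketched as a small state machine. The sketch assumes the envelope 132 is already available as sampled (time, value) pairs; how the envelope is computed is not modeled. The name `switch_with_hold` and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def switch_with_hold(envelope: List[Tuple[float, float]],
                     e: float, a: float) -> List[bool]:
    """Return, per sample, True while the user's own image is shown.

    The display changes only after the envelope has stayed on the other side
    of the threshold e for at least the time a, so momentary peaks and
    momentary valleys are ignored.
    """
    showing_self = False
    pending_since: Optional[float] = None  # when the envelope crossed to the other side
    result = []
    for t, value in envelope:
        above = value > e
        if above != showing_self:
            if pending_since is None:
                pending_since = t
            if t - pending_since >= a:
                showing_self = above
                pending_since = None
        else:
            pending_since = None  # crossing did not persist; ignore it
        result.append(showing_self)
    return result


# Hypothetical envelope samples: a short peak at t=1 and a short valley at t=6.
envelope_132 = [(0, 0.5), (1, 1.5), (2, 0.5), (3, 1.5), (4, 1.6), (5, 1.7),
                (6, 0.4), (7, 1.2), (8, 0.3), (9, 0.3), (10, 0.3)]
print(switch_with_hold(envelope_132, e=1.0, a=2))
```

In this example the peak at t=1 and the valley at t=6 both last less than a and leave the display unchanged, which is the behavior the text describes for momentary peaks and valleys.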

【００８９】上記説明では、図７３のように、映像用Ｌ
ＣＤ１４に相手の上半身映像１４１を表示している期間
にユーザの上半身映像１４０を表示する時間を挿入して
いる。しかし、図７４のように大半の時間は映像用ＬＣ
Ｄ１４にユーザの上半身映像１４０を表示していて、合
間に、相手の上半身映像１４１を挿入するようにしても
よい。特に、相手が目上の人などでは表示枠１１１に収
まったキチンとしたユーザの上半身映像１４０を送る場
合に、常にユーザの上半身映像１４０を確認することが
できる。また、ユーザの上半身映像１４０の表示時間
ａ、相手の上半身映像１４１の表示時間ｂについてタッ
チパネル３０を介して任意に設定できるようにしてもよ
い。尚、この実施形態では、相手の上半身映像１４１の
表示からユーザの上半身映像１４０の表示に切り替える
際に、実施形態１２及び実施形態１３のように、相手の
上半身映像１４１とユーザの上半身映像１４０とをマル
チウインドウで一緒に表示させてもよい。（実施形態１６）次に、実施形態１６について説明す
る。実施形態１５は相手の上半身映像１４１を表示して
いる間に周期的にユーザの上半身映像１４０を切り替え
る期間を挿入していた。In the above description, periods for displaying the user's upper body image 140 are inserted while the other party's upper body image 141 is displayed on the video LCD 14, as shown in FIG. 73. Alternatively, as shown in FIG. 74, the user's upper body image 140 may be displayed on the video LCD 14 most of the time, with the other party's upper body image 141 inserted at intervals. This is particularly useful when the other party is, for example, a superior and the user wants to send a properly framed upper body image 140 that stays within the display frame 111, because the user can then check his or her own upper body image 140 at all times. The display time a of the user's upper body image 140 and the display time b of the other party's upper body image 141 may also be made freely settable via the touch panel 30. In this embodiment, when switching from the other party's upper body image 141 to the user's upper body image 140, the two images may be displayed together in a multi-window as in Embodiments 12 and 13.
(Embodiment 16) Next, Embodiment 16 will be described. In Embodiment 15, periods of switching to the user's upper body image 140 were inserted periodically while the other party's upper body image 141 was displayed.
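The periodic alternation of Embodiment 15 can be sketched with a modulo test. The sketch reads a as the display time of the user's image and b as the display time of the other party's image, so that one cycle lasts a + b, and it assumes each cycle starts with the user's image; both readings are assumptions, and the name `shows_self` is hypothetical.

```python
def shows_self(t: float, a: float, b: float) -> bool:
    """True if the user's upper body image 140 is displayed at time t.

    a: display time of the user's image, b: display time of the other
    party's image; the cycle of length a + b repeats indefinitely.
    """
    if a <= 0 or b <= 0:
        raise ValueError("display times must be positive")
    return (t % (a + b)) < a


# FIG. 73 style: mostly the other party's image (a short, b long).
print([shows_self(t, a=1, b=4) for t in range(7)])
# FIG. 74 style: mostly the user's own image (a long, b short).
print([shows_self(t, a=4, b=1) for t in range(7)])
```

Letting the user set a and b, as the text suggests for the touch panel 30, only changes the two arguments.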

[0090] This embodiment is characterized in that, when the line to the other party is connected and a call begins, the user's upper-body image 140 is first displayed on the user's video LCD 14 for a predetermined period. When the user places a call or receives one, key operations on the terminal body 1 are required. These key operations move the terminal body 1, so the user may no longer know whether the camera unit 4 is pointed at the user correctly. If the other party's upper-body image 141 is received in this state, the user's upper-body image 140 is sent to the other party without the user having checked it, and an upper-body image that falls outside the display frame 111 may be sent. This embodiment therefore aims to ensure that, whenever communication starts, the user confirms that the upper-body image 140 fits within the display frame 111 before talking with the other party. That is, the user can strike a pose at the start of communication so that a proper upper-body image is obtained, and then proceed with the call. In this embodiment, when the user's upper-body image 140 is displayed, the other party's upper-body image 141 and the user's upper-body image 140 may be displayed together in a multi-window, as in Embodiments 12 and 13.

[0091] (Embodiment 17) Next, Embodiment 17 will be described. In Embodiment 15, the cycle at which the user's upper-body image is displayed was fixed; however, the shake displacement may be monitored continuously and the cycle varied according to it. FIG. 75 shows an example in which an envelope 136 is drawn over the peak values of the shake displacement 135. The display interval is then obtained from the graph of FIG. 76 according to the envelope 136. While the displacement is between the first value 139 and the second value 137, the display interval is shortened as the displacement increases, as shown by the straight line 138. That is, when the displacement is large the shaking is severe, so the display interval is shortened and the user's upper-body image 140 is shown to the user frequently. When the displacement is small there is little shaking, so the display interval is lengthened. When the displacement is at or below the first value 139, the display interval is fixed at a constant value; likewise, when the displacement is at or above the second value 137, the display interval is fixed at a constant value. FIG. 77 shows the screen switching state: it depicts how the display interval, that is, the interval between the user's upper-body image display periods 144, changes over time. Further, as in FIG. 78(a), when the envelope 143 of the shake displacement 142 is at or below the threshold e, the period during which the user's upper-body image 140 is displayed is fixed at a. In FIG. 78(b), however, after the period b during which the user's upper-body image 140 is displayed has elapsed, the user notices that the upper-body image 140 is outside the display frame 111 and begins to change the direction in which the terminal body 1 is held. The hand shake of the terminal body 1 increases, and the envelope 143 exceeds the threshold e at time g.
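The displacement-to-interval relationship of FIG. 76 described above can be illustrated with a minimal Python sketch. It assumes that the straight line 138 joins the two constant segments continuously; the function name and the parameters `long_interval` and `short_interval` are hypothetical, since the specification gives no numerical values.

```python
def display_interval(envelope, first_value, second_value,
                     long_interval, short_interval):
    """Map the shake-displacement envelope to a self-view display interval.

    Assumption (not stated in the specification): the interval equals
    long_interval at first_value and short_interval at second_value,
    with linear interpolation in between.
    """
    if envelope <= first_value:
        # Little shaking: the interval is fixed at its longest value.
        return long_interval
    if envelope >= second_value:
        # Severe shaking: the interval is fixed at its shortest value.
        return short_interval
    # Between the two values the interval shrinks linearly with displacement.
    fraction = (envelope - first_value) / (second_value - first_value)
    return long_interval + fraction * (short_interval - long_interval)
```

With `first_value=1.0`, `second_value=3.0`, `long_interval=10.0`, and `short_interval=2.0`, an envelope halfway between the two values gives an interval halfway between the two limits.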

[0092] The terminal body 1 is then moved during period c. At time h, the envelope 143 of the shake displacement 142 falls below the threshold e. If the shake displacement 142 does not exceed the threshold e again within period d, the display is switched to the other party's upper-body image 141 at time i. In other words, this embodiment aims to keep displaying the user's upper-body image 140 for as long as the user is moving the terminal body 1 by hand while that image is displayed. In the case of FIG. 78(a), the period for displaying the user's upper-body image 140 is constant. In the case of FIG. 78(b), a time b is first secured for the user to check the degree of shaking, the time the user needs to correct the hand shake is then taken adaptively, and only afterward is the display switched to the other party's upper-body image 141. The switching scheme of FIG. 78(b) can also be applied to Embodiment 15: even with the display interval b held constant, the display time a may be extended adaptively to match the user's operation time, after the user's confirmation time has been secured, before switching to the other party's image. Similarly, the switching scheme of FIG. 78(b) can be applied to Embodiment 16: after communication with the other party starts, the user's upper-body image 140 is displayed on the video LCD 14 and, once the user's confirmation time has been secured, the timing of switching the display away from the user's upper-body image 140 may be extended adaptively while the user is changing how the terminal body 1 is held.
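The adaptive extension of FIG. 78(b) can be sketched as follows in Python. This is only an illustration under stated assumptions: time is counted in envelope samples, the quiet period d is counted only after the confirmation period b has ended, and the function and parameter names are hypothetical. The specification does not state how the fixed period a of FIG. 78(a) relates to b and d; in this sketch a calm envelope simply yields a self-view period of b + d samples.

```python
def self_view_end(envelope, threshold_e, confirm_period, settle_period):
    """Return the sample index at which the display returns to the other party.

    envelope       -- sequence of shake-envelope samples taken while the
                      user's own image is displayed
    confirm_period -- period b (in samples) always reserved for checking
    settle_period  -- period d (in samples) the envelope must stay at or
                      below threshold_e before switching back
    Returns None if the envelope never settles within the given samples.
    """
    quiet = 0
    for t, value in enumerate(envelope):
        if t < confirm_period:
            continue  # period b: keep showing the user's image regardless
        if value > threshold_e:
            quiet = 0  # the user is still moving the terminal: restart period d
        else:
            quiet += 1
            if quiet >= settle_period:
                return t + 1  # time i: switch to the other party's image
    return None
```

When the user starts re-aiming the terminal after the confirmation period, the returned index moves later by however long the envelope stays above the threshold, which is the adaptive behavior the paragraph describes.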

[0093] In this embodiment, when the display is switched from the other party's upper-body image 141 to the user's upper-body image 140, the two images may be displayed together in a multi-window, as in Embodiments 12 and 13.

(Embodiment 18) Next, Embodiment 18 will be described. When the user or the other party has been speaking and then falls silent for a predetermined period or longer, the conversation has reached a natural break and there is time to spare. During a pause that exceeds this predetermined period, the user's upper-body image 140 may be displayed on the video LCD 14. In FIG. 79, the power 150 of the user's or the other party's voice is monitored; if it falls below a predetermined threshold e and does not rise above the threshold e before a predetermined time a has elapsed, the display of the video LCD 14 is switched to the user's upper-body image 140. Then, if the voice of the user or the other party again exceeds the threshold e and the power 150 does not fall below the threshold e before the time a has elapsed, the display of the video LCD 14 is switched to the other party's upper-body image 141. In this embodiment too, when the display is switched from the other party's upper-body image 141 to the user's upper-body image 140, the two images may be displayed together in a multi-window, as in Embodiments 12 and 13.
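The pause-triggered switching of Embodiment 18 amounts to a threshold test with a hold time in both directions. The following Python sketch illustrates it under assumptions that are not in the specification: the voice power 150 is available as a sequence of samples, the time a is expressed as a sample count `hold_a`, and the view labels `"other"` and `"self"` are hypothetical names.

```python
def select_display(power, threshold_e, hold_a):
    """Return the view shown at each sample: "other" or "self".

    The display starts on the other party's image. It flips to the user's
    image after hold_a consecutive samples below threshold_e, and flips
    back after hold_a consecutive samples above threshold_e.
    """
    view = "other"
    run = 0  # consecutive samples that argue for flipping the current view
    views = []
    for p in power:
        if view == "other":
            wants_flip = p < threshold_e   # silence while showing the other party
        else:
            wants_flip = p > threshold_e   # speech while showing the user
        run = run + 1 if wants_flip else 0
        if run >= hold_a:
            view = "self" if view == "other" else "other"
            run = 0
        views.append(view)
    return views
```

A pause shorter than `hold_a` samples leaves the display unchanged, so brief gaps between words do not trigger a switch.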

[0094] (Embodiment 19) Next, Embodiment 19 will be described. While the user is conversing with the other party, the user's upper-body image 140 may be displayed on the video LCD 14 during periods in which the user is speaking, and the other party's upper-body image 141 may be displayed on the video LCD 14 during periods in which the other party is speaking. That is, by holding a proper pose while speaking, the user is less likely to make the other party uncomfortable. FIG. 80 shows the timing relationship between the user's speech and the other party's speech. When the user starts speaking and the voice power 151 exceeds a predetermined threshold e, and the voice power 151 does not fall below the threshold e within a predetermined time c, the display of the video LCD 14 is switched to the user's upper-body image 140. Even when the user stops speaking and a silent interval begins (period f), the display is left unchanged and continues to show the user's upper-body image 140. If the other party speaks during the silent interval, so that the other party's voice power 152 exceeds the threshold e and does not fall below it within the predetermined time c, the display of the video LCD 14 is switched to the other party's upper-body image 141. If the user speaks while the other party is speaking, so that the two voices overlap, the user's speech takes priority: as before, once the time c has elapsed from the start of the user's speech, the display of the video LCD 14 is switched to the user's upper-body image 140 (period d).
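The speaker-following rule of Embodiment 19, including the priority given to the user when both parties speak, can be sketched in Python as below. The sketch assumes sampled voice powers for each party, expresses the time c as a sample count `hold_c`, and starts on the other party's image; the function name, the view labels, and the `initial` parameter are hypothetical.

```python
def select_by_speaker(user_power, other_power, threshold_e, hold_c,
                      initial="other"):
    """Return the view ("self" or "other") shown at each sample."""
    view = initial
    user_run = 0   # consecutive samples with the user's voice above threshold
    other_run = 0  # consecutive samples with the other party's voice above threshold
    views = []
    for u, o in zip(user_power, other_power):
        user_run = user_run + 1 if u > threshold_e else 0
        other_run = other_run + 1 if o > threshold_e else 0
        if user_run >= hold_c:
            view = "self"    # the user's speech wins, even when voices overlap
        elif other_run >= hold_c:
            view = "other"
        # When neither run has reached hold_c (for example, during silence),
        # the previous view is kept.
        views.append(view)
    return views
```

Checking the user's run first is what implements the priority rule; swapping the two branches would give the alternative in which the other party's image takes priority.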

[0095] In the description above, the user's upper-body image 140 is displayed on the video LCD 14 both when the user and the other party are silent and when they speak at the same time; however, the other party's upper-body image 141 may instead be given priority on the video LCD 14 during those periods. In this embodiment too, when the display is switched from the other party's upper-body image 141 to the user's upper-body image 140, the two images may be displayed together in a multi-window, as in Embodiments 12 and 13.

(Embodiment 20) Next, Embodiment 20 will be described. In Embodiments 11 to 14 and Embodiment 17, the display was switched from the other party's upper-body image 141 to the user's upper-body image 140 when the shake component detected by the hand shake sensor 39 exceeded a predetermined threshold. However, the hand shake sensor 39 detects only the shaking of the terminal body 1; it cannot detect the movement of the user (mainly of the user's upper body). That is, even when there is little hand shake, the user's upper-body image 140 sent to the other party will protrude from the display frame 111 if the user's upper body moves a great deal.

[0096] This embodiment therefore aims, as shown in FIG. 67, to extract only the movement of the user's upper body by changing the optical axis of the lens 70 according to the hand shake component and thereby canceling that component. A hand shake sensor 39 is attached to the camera unit 4. The hand shake detection output from the hand shake sensor 39 is applied to the hand shake correction signal generation means 73, which generates a predetermined hand shake correction signal and supplies it to the hand shake correction means 74. This loop makes it possible to capture a stable image in which the hand shake component is canceled even when hand shake occurs. As the hand shake correction means 74, it is possible to use, for example, a variable-apex-angle prism that changes its optical axis by adjusting the angle of a lens with liquid sealed inside, or a variable-apex-angle prism in which simply a pair of cylindrical lenses (or spherical lenses) are arranged facing each other so that their cylindrical surfaces can rotate, and the cylindrical surfaces are rotated by a predetermined amount according to the amount of hand shake to vary the apex angle of the prism and thereby adjust the optical axis. As described above, the user's upper-body image 140 output from the camera unit 4 is applied to the video encoder 27 via the camera IF unit 25.

[0097] The video encoder 27 supplies motion vectors spanning multiple frames to the motion monitoring unit 75. The motion vectors may be those of all the blocks making up a frame or those of some blocks at predetermined positions within the frame. Based on an evaluation value such as the sum (or the average) of the absolute values of the motion vectors over the multiple frames supplied from the video encoder 27, the motion monitoring unit 75 determines the size (for example, a standard face size) and the frequency of an object within the imaging field of view of the camera unit 4. The determination result is compared with a preset threshold, and when it is at or above the threshold, it is output to the main control unit 11 as the output of the motion monitoring unit 75. Through this processing, the movement of the user's upper body alone is evaluated from an upper-body image in which the hand shake component has been canceled. In place of the hand shake detection of the terminal body 1 in Embodiments 11 to 14 and Embodiment 17, the method of this embodiment for detecting the movement of the user's upper body may be applied, and the processing specified in Embodiments 11 to 14 and Embodiment 17 may be carried out when the detected movement of the face exceeds a predetermined threshold. In this embodiment too, when the display is switched from the other party's upper-body image 141 to the user's upper-body image 140, the two images may be displayed together in a multi-window, as in Embodiments 12 and 13.
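The evaluation performed by the motion monitoring unit 75 can be illustrated with a short Python sketch. It covers only the sum-or-average evaluation value and the threshold comparison; the size and frequency determination mentioned in the text is not modeled. The data layout (a list of frames, each a list of `(dx, dy)` block vectors) and all names are assumptions made for illustration.

```python
import math


def upper_body_moved(motion_vectors_per_frame, threshold, use_average=False):
    """Decide whether the upper body moved, from encoder motion vectors.

    motion_vectors_per_frame -- list of frames; each frame is a list of
                                (dx, dy) motion vectors, one per block
    threshold                -- preset threshold for the evaluation value
    use_average              -- evaluate the average magnitude instead of the sum
    """
    magnitudes = [
        math.hypot(dx, dy)
        for frame in motion_vectors_per_frame
        for dx, dy in frame
    ]
    if not magnitudes:
        return False  # no vectors supplied: nothing to evaluate
    score = sum(magnitudes)
    if use_average:
        score /= len(magnitudes)
    # The result is reported when the evaluation value is at or above the threshold.
    return score >= threshold
```

Because the vectors come from an image whose hand shake component has already been canceled optically, a large score here is attributed to the user's own movement rather than to movement of the terminal.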

[0098] (Embodiment 21) Next, Embodiment 21 will be described. In each of the embodiments above, the received upper-body image 141 of the other party is decoded continuously by the video decoder 12, and the display of the video LCD 14 is switched as needed to show the user's upper-body image 140. In general, however, the video decoder 12 involves a large amount of processing and is implemented on a DSP or the like, so it consumes considerable power. This embodiment therefore aims to reduce the power consumption of the terminal body 1 as a whole by having the power supply unit 36 stop the power supply to the video decoder 12 during the periods, described in the embodiments above, in which the video LCD 14 is switched from the other party's upper-body image 141 to the user's upper-body image 140. During those periods, the image captured by the camera unit 4 is supplied directly to the video LCD control circuit unit 13 via the camera IF unit 25 and is then displayed on the video LCD 14. Also, when the camera unit 4 is pointed away from the user to capture scenery such as that in FIG. 57 and send it to the other party, it is more convenient for the user to shoot while monitoring the scenery on the video LCD 14. Accordingly, the power supply to the video decoder 12 may be stopped when the camera direction sensor unit 28 detects that the camera unit 4 is facing away from the user. The power supply to the video decoder 12 may likewise be stopped when the user operates the screen switching button 33 to force the video LCD 14 to display the user's upper-body image 140.

[0099] Consider now the case in which the video LCD 14 goes from displaying the other party's upper-body image 141 to displaying the user's upper-body image 140 or scenery, and then back to displaying the other party's upper-body image 141. FIG. 81 shows the relationship between the received frames and the video switching periods. Frames from the other party continue to arrive without interruption. Among the received frames, those marked I are intra pictures. First, during the period in which the user's upper-body image 140 or scenery is displayed, that is, for frames a to c, the video decoder 12 performs no decoding. If the display is to switch back to the other party's upper-body image 141 at a frame other than an intra picture, that is, at a P or B picture, starting to decode from such a frame may fail to produce an image because these pictures depend on one another for prediction. In this embodiment, therefore, the received frames are only checked to see whether each is an I, P, or B frame, and decoding is not performed until the next intra picture I arrives, that is, for frames e to d as well. During period f the display of the user's upper-body image or scenery is thus extended; decoding of the other party's upper-body image 141 starts from frame b, which is an intra picture I, and the display of the video LCD 14 switches to the other party's upper-body image 141.
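The rule of resuming decoding only at the next intra picture can be sketched in Python as follows. The representation of the received stream as a list of frame-type letters and the per-frame flag saying whether the user's own image (or scenery) was requested are assumptions made for illustration; the function name is hypothetical.

```python
def frames_to_decode(frame_types, self_view_requested):
    """Return, per received frame, whether the video decoder should decode it.

    frame_types         -- sequence of "I", "P", or "B", one per received frame
    self_view_requested -- sequence of booleans; True while the user's own
                           image or scenery is to be displayed
    """
    decode = []
    waiting_for_intra = False
    for frame_type, self_view in zip(frame_types, self_view_requested):
        if self_view:
            # Own image or scenery is displayed: the decoder stays idle.
            waiting_for_intra = True
            decode.append(False)
        elif waiting_for_intra and frame_type != "I":
            # P and B pictures depend on frames that were never decoded,
            # so the own-image display is extended instead.
            decode.append(False)
        else:
            # An intra picture (or an uninterrupted stream): decoding proceeds.
            waiting_for_intra = False
            decode.append(True)
    return decode
```

Only the frame type is inspected while waiting, which matches the text's point that the skipped frames are checked but never decoded.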

【０１００】[0100]

[Effects of the Invention] As described above, according to the portable information terminal device of the present invention, even when the housing shakes because of hand shake or the like, the image is displayed stably on the other party's side without blurring, and an image requiring only a small amount of transmission can be sent. In addition, when the shooting angle on the user's side is not appropriate, the user can be made aware of this and can change how the housing is held so that the user's own image comes to the predetermined position on the screen. Furthermore, when the camera is turned toward an object such as scenery, the image displayed on the user's side can be switched automatically to the image of that object, which improves operability. Finally, by stopping the decoding of the video transmitted from the other party while the user's own image is being displayed on the user's side, power consumption can be reduced.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the portable information terminal device according to Embodiment 1.

FIG. 2 is a diagram showing the external appearance of the portable information terminal device according to Embodiment 1.

FIG. 3 is a block diagram showing another configuration example of the portable information terminal device according to Embodiment 1.

FIG. 4 is a display diagram showing the user's face image displayed on the other party's side.

FIG. 5 is a display diagram showing the other party's face image displayed on the user's side.

FIG. 6 is an explanatory diagram showing the placement of the user's face image displayed on the other party's side.

FIG. 7 is a block diagram showing the configuration of the portable information terminal device according to Embodiment 2.

FIG. 8 is a diagram showing the external appearance of the portable information terminal device according to Embodiment 2.

FIG. 9 is a display diagram showing the user's face shape displayed on the other party's side.

FIG. 10 is a display diagram for explaining the change in face shape caused by a zooming operation.

FIG. 11 is a block diagram showing another configuration example of the portable information terminal device according to Embodiment 2.

FIG. 12 is an explanatory diagram showing the placement of the user's face image displayed on the other party's side.

FIG. 13 is a diagram showing the movement range of the user's face image displayed on the other party's side.

FIG. 14 is a diagram showing an example in which a face image and text are displayed together in the display frame.

FIG. 15 is a block diagram showing the configuration of the portable information terminal device according to Embodiment 3.

FIG. 16 is a diagram showing the external appearance of the portable information terminal device according to Embodiment 3.

FIG. 17 is a display diagram for explaining the placement of the trimming shape.

FIG. 18 is a display diagram for explaining the placement of the trimmed image.

FIG. 19 is a diagram showing examples of trimming shapes.

FIG. 20 is an explanatory diagram of a method of determining the trimming shape from the positions of the user's eyes.

FIG. 21 is an explanatory diagram of a method of determining the trimming shape from the eye positions of two people.

FIG. 22 is an explanatory diagram of a method of determining the trimming shape by a conventional face-region extraction method.

FIG. 23 is a diagram for explaining the size and placement when the trimming shape is an ellipse.

FIG. 24 is a configuration diagram showing the main part of another example of the portable information terminal device according to Embodiment 3.

FIG. 25 is a diagram showing the timing relationship of reflected ultrasonic waves.

FIG. 26 is a diagram showing the movement range of the trimmed image.

FIG. 27 is a diagram showing an example in which a trimmed image and text are displayed together.

FIG. 28 is an explanatory diagram showing a state in which the face image touches the edge of the display frame.

FIG. 29 is an explanatory diagram showing a state in which the face image protrudes from the display frame.

FIG. 30 is a display diagram showing the raw image when the face image touches the edge of the display frame.

FIG. 31 is a display diagram showing the raw image when the face image protrudes from the display frame.

FIG. 32 is an explanatory diagram showing a state in which the trimmed image protrudes from the display frame.

FIG. 33 is a display diagram showing the face image and the raw image as they change with a zooming operation.

FIG. 34 is a display diagram showing the trimmed image and the raw image as they change with a zooming operation.

FIG. 35 is a block diagram showing the main-part configuration of the portable information terminal device according to Embodiment 7.

FIG. 36 is a diagram showing the movement range of the face shape in the conventional method shown in FIG. 86.

FIG. 37 is a diagram showing the movement range of the face shape in the method shown in FIG. 35.

FIG. 38 is a display diagram showing the raw image when the face shape protrudes from the imaging frame.

FIG. 39 is a configuration diagram showing the main part of another example of the portable information terminal device according to Embodiment 7.

FIG. 40 is a diagram showing the movement range of the face shape in the conventional method shown in FIG. 85.

FIG. 41 is a diagram showing the movement range of the face shape in the method shown in FIG. 39.

FIG. 42 is a display diagram showing the raw image when the face shape protrudes from the imaging frame.

FIG. 43 is a diagram showing an example in which an image and text are displayed together in the conventional method.

FIG. 44 is a diagram showing another example in which an image and text are displayed together in the conventional method.

FIG. 45 is a diagram showing another example in which an image and text are displayed together in the conventional method.

FIG. 46 is a configuration diagram showing the main part of another example of the portable information terminal device according to Embodiment 7.

FIG. 47 is a diagram showing the movement range of the face shape in the conventional method shown in FIG. 86.

FIG. 48 is a diagram showing the movement range of the trimming shape in the method shown in FIG. 46.

FIG. 49 is a diagram showing the raw image when the trimmed image protrudes from the imaging frame.

FIG. 50 is a block diagram showing another configuration example of the portable information terminal device according to Embodiment 7.

FIG. 51 is a diagram showing the movement range of the face shape in the conventional method shown in FIG. 85.

FIG. 52 is a diagram showing the movement range of the trimming shape in the method shown in FIG. 50.

FIG. 53 is a diagram showing the raw image when the trimmed image protrudes from the imaging frame.

FIG. 54 is a diagram showing an example in which an image and text are displayed together in the conventional method.

FIG. 55 is a diagram showing another example in which an image and text are displayed together in the conventional method.

FIG. 56 is a diagram showing another example in which an image and text are displayed together in the conventional method.

【図５７】実施形態８でカメラの向きを変えて撮影し
た景色を示す表示図。FIG. 57 is a display diagram showing a scene captured by changing the direction of a camera in the eighth embodiment.

【図５８】実施形態９に係る携帯情報端末装置の構成
を示すブロック図。FIG. 58 is a block diagram showing a configuration of a portable information terminal device according to a ninth embodiment.

【図５９】実施形態９に係る携帯情報端末装置の外観
を示す図。FIG. 59 is a diagram showing an appearance of a portable information terminal device according to a ninth embodiment.

FIG. 60 is an explanatory diagram of a method of obtaining shape information of a subject from the positions of two eyes.

FIG. 61 is a display diagram for explaining changes in the subject shape information.

FIG. 62 is a display diagram showing the shape information when the subject is outside the display frame.

FIG. 63 is a block diagram showing the main configuration of a portable information terminal device according to the tenth embodiment.

FIG. 64 is a block diagram showing the configuration of a portable information terminal device according to the eleventh embodiment.

FIG. 65 is a view showing the appearance of the portable information terminal device according to the eleventh embodiment.

FIG. 66 is a view showing the relationship between the hand shake displacement and the video switching period.

FIG. 67 is a configuration diagram showing another example of the portable information terminal device according to the eleventh embodiment.

FIG. 68 is a display diagram in which the video of the other party and the video of the user are displayed in a multi-window.

FIG. 69 is a view showing the relationship between the graded hand shake displacement and the video switching period.

FIG. 70 is a view showing the change in the multi-window display according to the hand shake displacement.

FIG. 71 is a display diagram showing the upper body image of the user when the hand shake displacement is large.

FIG. 72 is a view showing the relationship between the hand shake displacement and the video switching period.

FIG. 73 is a view showing the display time and the display interval of the upper body image of the user.

FIG. 74 is a view showing the display time of the upper body images of the user and the other party.

FIG. 75 is a diagram plotting the envelope of the peak values of the hand shake displacement.

FIG. 76 is a view showing the relationship between the hand shake displacement and the display interval of the upper body image of the user.

FIG. 77 is an explanatory view showing how the display interval of the upper body image of the user changes.

FIG. 78 is a view for explaining the video display period according to the shake displacement and the extension of that period.

FIG. 79 is a diagram showing the time relationship for displaying the upper body image of the user during an audio pause period.

FIG. 80 is a diagram showing the time relationship for displaying the upper body image of the user during a user utterance period.

FIG. 81 is a view showing the time relationship of the pause period of the decoding unit with respect to received frames.

FIG. 82 is a display diagram showing the user-side subject (center position) displayed on the other party's side.

FIG. 83 is a display diagram showing the other party's subject displayed on the user side.

FIG. 84 is a display diagram showing the user-side subject (shifted to the left) displayed on the other party's side.

FIG. 85 is a configuration diagram illustrating an example of a conventional camera shake prevention mechanism.

FIG. 86 is a configuration diagram illustrating another example of a conventional camera shake prevention mechanism.

FIG. 87 is a display diagram showing deformation of the user-side subject displayed on the other party's side.

FIG. 88 is a diagram showing the relationship between the shooting frame and the display frame in a conventional example.

FIG. 89 is a diagram showing the relationship between the shooting frame and the display frame in a conventional example (when the image protrudes).

[Explanation of symbols]

1 ... terminal main body
3 ... earphone microphone
4 ... camera unit
4b ... support part
4a ... camera unit main body
4c ... hinge part
5 ... infrared camera
11 ... main control unit
12 ... video decoder
13 ... video LCD control circuit unit
14 ... video LCD
15 ... text LCD control circuit unit
16 ... text LCD
17 ... demultiplexing unit
18 ... PHS line IF unit (PHS line interface unit)
19 ... antenna
20 ... shape extraction unit
21 ... area extraction unit
22 ... infrared camera terminal
23 ... audio codec
24 ... earphone microphone terminal
25 ... camera IF unit (camera interface unit)
26 ... camera terminal
27 ... video encoder
28 ... camera direction sensor unit
29 ... operation input control circuit unit
30 ... touch panel
31 ... scroll dial
32 ... operation buttons
34 ... power button
36 ... power supply unit
37 ... main bus
38 ... synchronous bus
39 ... hand shake sensor
41 ... azimuth sensor
42 ... conversion unit
43 ... person recognition unit
45 ... image memory
46 ... face image memory unit
47 ... zooming button
48 ... zooming drive unit
50 ... luminance signal and hue signal
51 ... area extraction unit
54 ... digital video
55 ... trimming shape storage unit
56 ... trimming image memory unit
57 ... infrared LED
58 ... pupil extraction unit
60 ... housing
61-1, 61-2 ... recesses
62 ... ultrasonic transmission unit
63 ... ultrasonic reception unit
64 ... trimming shape variable button
65 ... shape control unit
66 ... shape variable button
67 ... video memory unit
68 ... CCD
69 ... signal processor
70 ... lens
71 ... panning/tilting drive unit
73 ... hand shake correction signal generation means
74 ... hand shake correction means
75 ... motion monitoring unit
100 ... subject and background of the user
101, 106, 108, 121, 122 ... face images
102, 107, 113, 118, 119 ... rectangles
103 ... center of rectangle
104, 114, 117 ... center of face image
105 ... face image of the other party
111 ... display frame
112, 116 ... face shapes
113, 119 ... rectangles
114, 117 ... center of face image
120 ... shooting frame
130 ... text
131, 135, 142 ... shake displacement
132, 136, 143 ... envelope
137 ... second value
138 ... straight line
139 ... first value
140 ... upper body image of the user
141 ... upper body image of the other party
144 ... display period of the upper body image of the user
151, 152, 153 ... audio power
200, 202 ... trimming shapes
201, 203, 217, 218 ... trimming images
g, h, 204, 205 ... pupil portions
206 ... center of gravity of trimming shape
207 ... center of display frame
208 ... position at the height of both eyes
209 ... midpoint
210 ... video signal
215, 216, 219 ... missing portions
230 ... shape information

Claims

[Claims]

1. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; area extracting means for extracting, by predetermined processing, an area of the subject photographed by the photographing means to obtain an area-extracted subject image; detecting means for detecting a direction and a distance by which the subject image obtained by the area extracting means has moved from a predetermined position within a predetermined range; means for moving the subject image to the predetermined position using a detection result of the detecting means; communication means for communicating with a counterpart information terminal device via a predetermined communication line; and video transmission control means for transmitting the subject image moved to the predetermined position to the counterpart information terminal device by the communication means.
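
The detect-and-move steps recited in claim 1 can be illustrated with a minimal Python sketch. It assumes the area-extracted subject is available as a binary mask and that the "predetermined position" is the center of the frame; the function names `subject_offset` and `recenter` are hypothetical and do not come from the specification.

```python
def subject_offset(mask):
    """Return (dx, dy) from the frame center to the subject centroid.

    `mask` is a list of rows; non-zero cells belong to the extracted subject.
    Returns None when no subject pixel is present.
    """
    height, width = len(mask), len(mask[0])
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Assumption: the predetermined position is the geometric frame center.
    return (round(cx - (width - 1) / 2), round(cy - (height - 1) / 2))


def recenter(mask, offset):
    """Shift the subject by -offset so that its centroid lands on the center."""
    dx, dy = offset
    height, width = len(mask), len(mask[0])
    moved = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                nx, ny = x - dx, y - dy
                # Pixels shifted outside the frame are dropped.
                if 0 <= nx < width and 0 <= ny < height:
                    moved[ny][nx] = mask[y][x]
    return moved
```

In this sketch the offset plays the role of the detected direction and moving distance, and `recenter` corresponds to moving the subject image back to the predetermined position before transmission.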

2. The portable information terminal device according to claim 1, further comprising means for displaying the subject and the background of the user on the user side when the detecting means detects that the subject image contacts an end of the predetermined range or that at least a part of the subject image protrudes from the predetermined range.
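
As a rough illustration of the contact and protrusion condition in claim 2, the following Python sketch tests a subject bounding box against the frame. The bounding-box representation and the name `touches_or_leaves_frame` are assumptions made for this example only.

```python
def touches_or_leaves_frame(bbox, frame_width, frame_height):
    """Return True if the subject touches a frame edge or extends past it.

    `bbox` is (left, top, right, bottom) in pixel coordinates, inclusive.
    """
    left, top, right, bottom = bbox
    return (left <= 0 or top <= 0
            or right >= frame_width - 1 or bottom >= frame_height - 1)
```

When this returns True, a device following the claim would show the raw camera view to the user so that the framing can be corrected.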

3. The portable information terminal device according to claim 1, further comprising means for transmitting an image including the subject and the background to the counterpart information terminal device when the subject image moved to the predetermined position in the predetermined range is enlarged and protrudes from the predetermined range.

4. The portable information terminal device according to claim 1, further comprising: means for storing the subject image immediately before the subject image protrudes from the predetermined range; and means for reading out the stored subject image as a still image when the subject image protrudes from the predetermined range and transmitting the still image to the counterpart information terminal device via the communication line during the period in which the subject image protrudes from the predetermined range.

5. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera, over a photographing area wider than a display area, while a housing is held by the user; area extracting means for extracting, by predetermined processing, an area of the subject within the photographing area photographed by the photographing means to obtain an area-extracted subject image; means for moving the display area so that the subject image fits within the display area; detecting means for detecting a direction and a distance by which the subject image has moved from a predetermined position in the display area; means for moving the subject image to the predetermined position using a detection result of the detecting means; communication means for communicating with a counterpart information terminal device via a predetermined communication line; and video transmission control means for transmitting the subject image moved to the predetermined position to the counterpart information terminal device by the communication means.

6. The portable information terminal device according to claim 5, further comprising means for displaying the subject and the background of the user on the user side when it is detected that the subject image contacts an edge of the photographing area or that at least a part of the subject image protrudes from the photographing area.

7. The portable information terminal device according to claim 5, further comprising: means for storing the subject image immediately before the subject image protrudes from the photographing area; and means for reading out the stored subject image as a still image when the subject image protrudes from the photographing area and transmitting the still image to the counterpart information terminal device via the communication line during the period in which the subject image protrudes from the photographing area.

8. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; area extracting means for extracting, by predetermined processing, an area of the subject photographed by the photographing means to obtain an area-extracted subject image; means for detecting the mounting direction of the camera on the housing; means for displaying an image captured by the camera on the user side when that means detects that the mounting direction of the camera deviates from the shooting direction of the subject; communication means for communicating with a counterpart information terminal device via a predetermined communication line; and video transmission control means for transmitting to the counterpart information terminal device, by the communication means, the area-extracted subject image when the mounting direction of the camera is in the shooting direction of the subject, and the image captured by the camera when it is detected that the mounting direction of the camera deviates from the shooting direction of the subject.

9. The portable information terminal device according to claim 1, further comprising: infrared irradiating means and an infrared camera provided on the housing; means for extracting shape information of the subject by capturing, with the infrared camera, the reflection of infrared light emitted by the infrared irradiating means; and shape control means for controlling the shape information so that the ratio of the subject image to the predetermined range is a predetermined ratio, wherein the area extracting means extracts the area of the subject photographed by the photographing means using the shape information controlled by the shape control means, and obtains the area-extracted subject image.

10. The portable information terminal device according to claim 1, further comprising: an ultrasonic transmitter and an ultrasonic receiver provided in the housing; means for extracting shape information of the subject by receiving, with the ultrasonic receiver, an ultrasonic pulse transmitted from the ultrasonic transmitter; and shape control means for controlling the shape information so that the ratio of the subject image to the predetermined range is a predetermined ratio, wherein the area extracting means extracts the area of the subject photographed by the photographing means using the shape information controlled by the shape control means, and obtains the area-extracted subject image.

11. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; means for detecting a displacement of shaking when the housing is shaken; means for displaying the subject and the background of the user on the user side for a predetermined period when the detected displacement of the shaking exceeds a predetermined threshold value; communication means for communicating with a counterpart information terminal device via a predetermined communication line; and video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means.
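
The threshold behavior in claim 11 can be sketched as follows. This is an illustrative Python example only: it assumes the shake displacement is sampled at a fixed rate and that the "predetermined period" is expressed as a sample count; `self_view_schedule` and its parameters are hypothetical names.

```python
def self_view_schedule(displacements, threshold, hold_samples):
    """For each displacement sample, report whether the user's own view is shown.

    The self view is switched on when |displacement| exceeds `threshold` and
    stays on for `hold_samples` samples. A new exceedance restarts the hold,
    which is a design choice of this sketch rather than something the claim
    requires.
    """
    remaining = 0
    shown = []
    for displacement in displacements:
        if abs(displacement) > threshold:
            remaining = hold_samples
        shown.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return shown
```

For example, with a threshold of 3 and a hold of two samples, a single spike in the displacement keeps the user's own image on screen for two samples and then returns to the other party's video.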

12. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; means for displaying the subject and the background of the user on the user side at a predetermined cycle and for a predetermined period; communication means for communicating with a counterpart information terminal device via a predetermined communication line; and video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means.

13. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means; first detecting means for detecting a silent section of the voice uttered by the user and detecting, from the silent section, a pause period exceeding a predetermined length; second detecting means for receiving a voice transmitted from the counterpart information terminal device via the communication means, detecting a silent section of that voice, and detecting, from the silent section, a pause period exceeding a predetermined length; and means for displaying the subject and the background of the user on the user side during the pause period detected by the first and second detecting means.

14. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means; means for detecting, as an utterance period, a case where the power of the voice uttered by the user exceeds a predetermined threshold and remains above it for a predetermined time; and means for displaying the subject and the background of the user on the user side during the utterance period.
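
The utterance-period condition in claim 14 (voice power above a threshold, sustained for a predetermined time) might be sketched as below. The frame-based power values, the `min_frames` parameter, and the function name are assumptions for illustration and are not taken from the specification.

```python
def utterance_flags(powers, threshold, min_frames):
    """Mark frames that belong to a detected utterance period.

    A frame is flagged once the voice power has exceeded `threshold` for at
    least `min_frames` consecutive frames, including the current one.
    """
    run_length = 0
    flags = []
    for power in powers:
        run_length = run_length + 1 if power > threshold else 0
        flags.append(run_length >= min_frames)
    return flags
```

A short burst that does not last `min_frames` frames is ignored, which keeps brief noises from switching the display to the user's own image.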

15. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means; and means for displaying the subject and the background of the user on the user side when the communication means starts communication with the counterpart information terminal device.

16. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means; means for detecting a period during which the user-side video is displayed on the user side; and means for stopping, during that period, the supply of power to decoding means for decoding video information transmitted from the counterpart information terminal device via the communication means.

17. The portable information terminal device according to claim 2, claim 6, or any one of claims 11 to 15, further comprising means for displaying, on the user side, the subject and the background of the user in a multi-window together with a video transmitted from the counterpart information terminal device.

18. The portable information terminal device according to claim 11, further comprising: means for displaying, on the user side, the subject and the background of the user in a multi-window together with a video transmitted from the counterpart information terminal device; means for dividing the threshold value into a plurality of classes; means for assigning the displacement of the shaking to one of the classes; and means for changing the display size of the subject and the background of the user according to the class.
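
The class-based sizing in claim 18 can be illustrated with a small Python sketch. It assumes the classes are defined by an ascending list of threshold values and that each class maps to a window scale; the specific thresholds, the scale values, and the choice that larger shaking gives a larger self-view window are assumptions of this example, not statements from the claims.

```python
from bisect import bisect_left


def shake_class(displacement, class_thresholds):
    """Return the class index: the number of thresholds strictly exceeded.

    `class_thresholds` must be sorted in ascending order.
    """
    return bisect_left(class_thresholds, abs(displacement))


def self_view_scale(displacement, class_thresholds, scales):
    """Pick the self-view window scale for the class of this displacement.

    `scales` needs one more entry than `class_thresholds` (class 0 included).
    """
    return scales[shake_class(displacement, class_thresholds)]
```

With thresholds `[2, 5]` and scales `[0.0, 0.25, 0.5]`, a displacement of 3 falls in class 1 and would select a quarter-size self-view window in this sketch.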

19. The portable information terminal device according to claim 12, further comprising means for setting the cycle and the period.

20. The portable information terminal device according to claim 12, further comprising: means for detecting a displacement of shaking when the housing is shaken; and means for changing the period in accordance with the displacement of the shaking detected by that means.

21. The portable information terminal device according to claim 11, further comprising means for extending the display of the subject and the background of the user when, while the subject and the background of the user are displayed on the user side, the displacement of the shaking of the housing exceeds the predetermined threshold value because the user is performing an operation to correct the hand shake, the display being extended during the period in which the threshold value is exceeded.

22. A portable information terminal device comprising: photographing means for photographing a subject and a background of a user with a camera while a housing is held by the user; communication means for communicating with a counterpart information terminal device via a predetermined communication line; video transmission control means for transmitting a video including the subject of the user to the counterpart information terminal device by the communication means; detection means for detecting a hand shake displacement when the housing is shaken by hand shake; means for canceling a hand shake component in the video by adding, to the video of the subject and the background of the user captured by the camera, a component based on the hand shake displacement detected by the detection means; means for detecting a shaking component of the subject of the user from the video in which the hand shake component has been canceled; and means for displaying the subject and the background of the user on the user side when the magnitude of the shaking component of the subject detected by that means exceeds a predetermined threshold value.