JPH0568241A

JPH0568241A - CIF image transforming system for video telephone

Info

Publication number: JPH0568241A
Application number: JP3227626A
Authority: JP
Inventors: Yuichi Fujino; 雄一藤野; Mamoru Nakanishi; 衛中西
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1991-09-09
Filing date: 1991-09-09
Publication date: 1993-03-19

Abstract

PURPOSE: To enable good conversation by capturing the subject person at a wide angle in advance, detecting the person's face area by image processing, and cutting that area out at CIF or QCIF pixel dimensions. CONSTITUTION: To generate a CIF or QCIF picture, the subject person is captured at a wide angle in advance, the person's face area is detected by image processing in a moving area detecting means 103, center coordinates for the cutout are calculated by a cutout address calculating means 104, and a rectangular range including the face area is cut out at CIF or QCIF pixel dimensions. If the face moves, the cutout range is shifted within the captured range in accordance with the movement of the face area, so that the displayed image follows the face. Thus the face area does not leave the picture frame of the camera when the subject's face moves, the face area can always be transmitted to the other party, and as a result good conversation can be realized through a video telephone.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an NTSC-to-CIF image conversion processing system for a videophone device conforming to the CCITT standard.

[0002]

[Prior Art] Before describing the conventional NTSC-to-CIF image conversion processing system for a videophone device conforming to the CCITT standard, the CIF image standard is described. To enable interconnection between videophones, CCITT recommended in December 1990 a video coding method for audiovisual services at p × 64 kbit/s (CCITT Rec. H.261, "Video Codec for Audiovisual Services at p × 64 kbit/s", 1990). To absorb the differences between the television signal systems of different countries, this Recommendation defines a Common Intermediate Format (CIF) image as follows.

[0003] In the first format (CIF), luminance samples are arranged in an orthogonal grid of 352 pixels per line and 288 lines per frame. Samples of each of the two color difference components are arranged in an orthogonal grid of 176 pixels per line and 144 lines per frame. The pixel block boundaries of the color difference signals are placed so as to coincide with those of the luminance signal. The image area enclosed by these pixel counts has an aspect ratio of 4:3 and coincides with the active picture area of a standard television signal.

[0004] The second format (QCIF) halves both the number of pixels per line and the number of lines of the CIF described above. (The above is excerpted from TTC standard JT-H221.) Videophone devices conforming to this standard can communicate with each other regardless of differences between standard television systems. The NTSC/CIF conversion processing method in a conventional videophone device conforming to this standard is described below.
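The two picture formats described above can be written down as a small data structure. The following Python sketch is illustrative only: the class name `PictureFormat` and its field names are assumptions, not part of the specification, and the chroma dimensions are simply derived as half the luminance dimensions in each direction, as stated for CIF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PictureFormat:
    """Luminance grid of a picture format (hypothetical helper type)."""
    name: str
    luma_width: int   # luminance pixels per line
    luma_height: int  # luminance lines per frame

    @property
    def chroma_width(self) -> int:
        # Each color difference component has half the luminance pixels per line.
        return self.luma_width // 2

    @property
    def chroma_height(self) -> int:
        # Each color difference component has half the luminance lines per frame.
        return self.luma_height // 2


CIF = PictureFormat("CIF", 352, 288)
# QCIF halves both the pixel count and the line count of CIF.
QCIF = PictureFormat("QCIF", CIF.luma_width // 2, CIF.luma_height // 2)

for fmt in (CIF, QCIF):
    print(fmt.name, fmt.luma_width, fmt.luma_height,
          fmt.chroma_width, fmt.chroma_height)
```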

[0005] In conventional videophone devices, the NTSC/CIF conversion processing method for creating a CIF image from NTSC, the standard television system in Japan, used reduction filters that reduce the number of pixels in the horizontal direction and the number of lines in the vertical direction.

[0006] FIG. 7 shows an example of the conventional method, in which an NTSC signal is sampled at 13.5 MHz, digital color separation is performed, an image of 704 pixels horizontally by 480 lines vertically is created, and a CIF image is created from that image. The image of 704 pixels by 480 lines conforms to the standard recommended by CCIR in 1982 as "Encoding Parameters of Digital Television for Studios" (Rec. 601). In the figure, 1 is an imaging unit, 2 is a band-limiting low-pass filter (LPF) unit, 3 is an A/D conversion unit, 4 is a digital Y/C separation unit, 5 is a horizontal reduction filter unit, 6 is a non-interlace conversion unit, 7 is a vertical reduction filter unit, 8 is an encoding processing unit, and 9 is a network interface processing unit.

[0007] The NTSC signal captured by the imaging unit 1 is band-limited by the LPF unit 2, then sampled by the A/D conversion unit 3 at a sampling frequency of 13.5 MHz and quantized with 9 bits. This gives 858 samples per line in the horizontal direction, of which 704 are effective pixels. The digitized NTSC signal is separated by the digital Y/C separation unit 4 into component signals: a luminance signal Y and color difference signals C₁ and C₂. Each color-separated component signal is input to the horizontal reduction filter unit 5. Because the number of effective pixels in the horizontal direction is 704, the signal is subsampled by 1/2 here. The component signals reduced to 1/2 horizontally are input to the non-interlace conversion unit 6, where the odd-field and even-field signals are each stored once in a field memory. On readout, the lines of the odd and even fields are read alternately, converting the interlaced signal into a non-interlaced signal. The non-interlaced component signals are input to the vertical reduction filter unit 7 and reduced by a factor of 3/5 in the vertical direction. Possible reduction methods include simply decimating each group of five lines down to three, or multiplying the signals of the five lines by weighting coefficients to convert them into three lines; here the former, simpler method is used.
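The conventional chain just described (horizontal 1/2 subsampling, field interleaving, vertical 3/5 decimation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, no band-limiting filter is applied before subsampling, and the choice of which three lines to keep out of each five (`keep=(0, 2, 4)`) is an assumption, since the text does not specify it.

```python
def subsample_horizontal(field, step=2):
    """Keep every `step`-th pixel of each line (no prefilter in this sketch)."""
    return [line[::step] for line in field]


def weave_fields(odd_field, even_field):
    """Interleave odd-field and even-field lines into one progressive frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)
        frame.append(even_line)
    return frame


def decimate_vertical(frame, keep=(0, 2, 4), group=5):
    """Keep three lines out of every five; which three is an assumption."""
    return [line for i, line in enumerate(frame) if i % group in keep]


# Dummy 704-pixel fields of 240 lines each stand in for one component signal.
WIDTH, FIELD_LINES = 704, 240
odd = [[0] * WIDTH for _ in range(FIELD_LINES)]
even = [[1] * WIDTH for _ in range(FIELD_LINES)]

frame = weave_fields(subsample_horizontal(odd), subsample_horizontal(even))
cif = decimate_vertical(frame)
print(len(cif[0]), len(cif))
```

With 704-pixel lines and two 240-line fields, the result has 352 pixels per line and 288 lines, which is the CIF luminance grid.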

[0008] As a result, the component signals reduced by 1/2 horizontally and 3/5 vertically have 352 pixels and 288 lines, forming a CIF image. The CIF image is input to the encoding processing unit 8, encoded, and sent out to the transmission line via the network interface processing unit 9.

[0009] The above describes how a CIF image is created; a QCIF image can be created in the same way by setting the horizontal subsampling to 1/4 and the vertical decimation to 3/10.
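The reduction ratios for CIF and QCIF can be checked with exact fractions. The sketch below assumes the 704 × 480 source image of the conventional method; the function name `reduced_size` is hypothetical.

```python
from fractions import Fraction

SOURCE_WIDTH, SOURCE_LINES = 704, 480  # effective image of the conventional method


def reduced_size(h_ratio, v_ratio):
    """Return (pixels per line, lines) after applying the reduction ratios."""
    width = SOURCE_WIDTH * h_ratio
    lines = SOURCE_LINES * v_ratio
    # Both results must be whole numbers for the ratios to be usable.
    if width.denominator != 1 or lines.denominator != 1:
        raise ValueError("ratios do not give an integer picture size")
    return int(width), int(lines)


cif_size = reduced_size(Fraction(1, 2), Fraction(3, 5))
qcif_size = reduced_size(Fraction(1, 4), Fraction(3, 10))
print(cif_size, qcif_size)
```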

[0010]

[Problems to Be Solved by the Invention] In the conventional method described above, because CIF and QCIF have few pixels, the camera must zoom in to some degree so that the other party's facial expression can be clearly recognized over a videophone. In that case, if the subject's face moves even slightly, it leaves the camera's picture frame, and a proper image of the face is not transmitted.

[0011] The present invention solves the problem of the face area leaving the camera's picture frame when the subject's face moves. Its object is to keep the person's face area within the picture frame at all times and thereby enable good conversation.

[0012]

[Means for Solving the Problems] FIG. 1 shows the principle configuration of the present invention. In the figure, 101 is an A/D conversion processing means; 102 is a frame memory that stores the A/D-converted television signal; 103 is a moving area detecting means that detects a moving area in the television signal; and 104 is a cutout address calculating means that calculates the cutout address.

[0013]

[Operation] To create a CIF or QCIF image, the subject person is captured at a wide angle in advance, the person's face area is detected by image processing, and that area is cut out at CIF or QCIF pixel dimensions. When the face moves, the cutout range is changed within the captured range according to the movement of the face area, so that the face area can always be transmitted to the other party. Furthermore, to avoid distortion of the cut-out CIF image, a sampling frequency based on the horizontal-to-vertical aspect ratio is determined for the CIF cutout, and the signal is sampled at that frequency.

[0014] Whereas the prior art creates a CIF or QCIF image by reduction conversion using horizontal and vertical reduction filters, the present invention detects the face area by image processing in an image captured at a wide angle in advance, and creates the image by cutting out a rectangular area including the detected face area at CIF or QCIF pixel dimensions. In addition, whereas the prior art performs A/D conversion at 13.5 MHz or at three or four times the color subcarrier frequency f_SC (3.58 MHz) for ease of digital color separation, the present invention uses analog color separation, so it does not depend on those frequencies and selects a frequency suited to eliminating distortion of the CIF image.

[0015]

[Embodiments] (Embodiment 1) FIG. 2 illustrates a first embodiment of the present invention, a method of creating a CIF image from an NTSC signal. Reference numerals 1, 2, 3, 8, and 9 correspond to those in FIG. 7; in addition, 10 is an analog Y/C separation unit, 11 is a moving area detection unit, 12 is a CIF/QCIF image cutout address calculation unit, and 13 is a frame memory unit.

[0016] The NTSC television signal captured by the imaging unit 1 is input to the analog Y/C separation unit 10 and separated into component signals: a luminance signal Y and color difference signals C₁ and C₂. The separated Y, C₁, and C₂ signals pass through the LPF unit 2 and are band-limited before A/D conversion. The band-limited Y, C₁, and C₂ signals are then A/D converted by the A/D conversion unit 3. The sampling frequency of the A/D conversion is chosen so that the image cut out at CIF or QCIF pixel dimensions has a physical width-to-height ratio of 4:3. In other words, the effective image, obtained by removing the horizontal and vertical blanking intervals from the sampled image, must be geometrically similar to the image formed by the CIF or QCIF pixels. Specifically, if the CIF image has 352 pixels horizontally and 288 lines vertically, and the sampled video signal has p effective pixels horizontally and 480 effective lines vertically, the following relationship holds.

[0017] p : 480 = 352 : 288 ∴ p ≈ 587

Therefore the number of effective pixels in the horizontal direction is 587, and if 83% of the horizontal scanning period is taken as the effective period, the horizontal scanning period contains 707 pixels. From this, the sampling frequency of the A/D conversion can be determined to be 11.1 MHz (707 samples per line multiplied by the NTSC line frequency of about 15.734 kHz). The color difference signals C₁ and C₂ are A/D converted at half the rate of the luminance signal Y, that is, at 5.56 MHz.
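The derivation of the sampling frequency can be reproduced numerically. The following Python sketch is an illustration, not the patent's own procedure: the function names are hypothetical, and the NTSC line frequency (4.5 MHz / 286, about 15.734 kHz) is a standard value that the text does not state explicitly.

```python
NTSC_LINE_RATE_HZ = 4_500_000 / 286  # about 15 734.27 Hz (standard NTSC value)


def effective_width(effective_lines, target_width=352, target_lines=288):
    """Solve p : effective_lines = target_width : target_lines for p."""
    return round(effective_lines * target_width / target_lines)


def sampling_frequency_hz(effective_pixels, active_fraction, line_rate_hz):
    """Samples per full scan line and the resulting sampling frequency."""
    samples_per_line = round(effective_pixels / active_fraction)
    return samples_per_line, samples_per_line * line_rate_hz


p = effective_width(480)                       # 586.67 rounds to 587
samples_per_line, f_luma = sampling_frequency_hz(p, 0.83, NTSC_LINE_RATE_HZ)
f_chroma = f_luma / 2                          # color difference at half rate

print(p, samples_per_line)
print(f"{f_luma / 1e6:.2f} MHz, {f_chroma / 1e6:.2f} MHz")
```

The computed luminance rate comes out slightly above 11.1 MHz because the sketch keeps more digits of the line frequency; the text rounds to three significant figures.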

[0018] The Y signal, A/D converted at a sampling frequency of 11.1 MHz or 5.56 MHz with 8-bit quantization, is used by the moving area detection unit 11 to determine the center coordinates of the rectangular CIF or QCIF area that includes the face area. At the same time, the Y, C₁, and C₂ signals are stored in the frame memory unit 13 every two fields. Non-interlace conversion is performed by reading out the odd and even fields stored in the frame memory together. From the determined center coordinates of the CIF or QCIF image, the CIF/QCIF image cutout address calculation unit 12 calculates the addresses at which to cut out from each of the Y, C₁, and C₂ frame images stored in the frame memory unit, and the CIF or QCIF image is read out based on those addresses. The CIF or QCIF image read out is transmitted to the other party via the encoding processing unit 8 and the network interface processing unit 9.
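One way to turn center coordinates into a cutout address is sketched below in Python. The function names are hypothetical, and clamping the window so that it stays inside the captured frame is an assumption: the text only says that the cutout range is changed within the captured range. The sketch handles a single plane (for example Y).

```python
def cutout_origin(center_x, center_y, frame_w=587, frame_h=480,
                  out_w=352, out_h=288):
    """Top-left address of the cutout window, clamped to the stored frame."""
    left = center_x - out_w // 2
    top = center_y - out_h // 2
    # Assumption: slide the window back inside the frame near the borders.
    left = max(0, min(left, frame_w - out_w))
    top = max(0, min(top, frame_h - out_h))
    return left, top


def cut_out(frame, left, top, out_w=352, out_h=288):
    """Read the out_w x out_h window starting at (left, top) from the frame."""
    return [row[left:left + out_w] for row in frame[top:top + out_h]]


# A synthetic 587 x 480 plane stands in for the frame memory contents.
frame = [[(x + y) % 256 for x in range(587)] for y in range(480)]
left, top = cutout_origin(293, 240)
cif = cut_out(frame, left, top)
print(left, top, len(cif[0]), len(cif))
```

Passing `out_w=176, out_h=144` to both functions gives the QCIF case.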

[0019] FIG. 3 shows the moving area detection unit 11 of FIG. 2 in detail, using as an example the background subtraction method with a background reference image. Here, 14 is a background difference calculation unit, 15 is a background reference image memory unit, 16 is a blocking processing unit, 17 is a binarization processing unit, 18 is a head-top (parietal) coordinate calculation unit, and 19 is a center coordinate calculation unit for CIF/QCIF image cutout.

[0020] A background reference image containing no person or moving object is captured in advance and stored in the background reference image memory unit 15. Next, an ordinary videophone image containing the person is captured and input to the background difference calculation unit 14. For each frame of the input person image, the background difference calculation unit 14 calculates the difference signal from the background reference image. To reduce the amount of computation, the resulting background difference signal is input to the blocking processing unit 16 and grouped into blocks of, for example, 16 × 16 pixels. The blocked image is input to the binarization processing unit 17 and binarized with a suitable threshold, extracting the difference between the background reference image and the person image, that is, the region where the person is present.
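The background difference, blocking, and binarization steps can be sketched as follows. This Python example makes several assumptions that the text leaves open: each block is summarized by its mean absolute difference, the threshold value of 20 is arbitrary, and partial blocks at the right and bottom edges are ignored. The function name is hypothetical.

```python
def block_difference_mask(frame, background, block=16, threshold=20.0):
    """Binary mask with one entry per block: 1 where the frame differs
    from the background reference image, 0 elsewhere."""
    height, width = len(frame), len(frame[0])
    mask = []
    for by in range(0, height - block + 1, block):
        row = []
        for bx in range(0, width - block + 1, block):
            total = 0
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    total += abs(frame[y][x] - background[y][x])
            # Assumption: compare the mean absolute difference to the threshold.
            row.append(1 if total / (block * block) > threshold else 0)
        mask.append(row)
    return mask


# 64 x 64 test images: flat background, plus a bright patch in the frame.
background = [[100] * 64 for _ in range(64)]
frame = [row[:] for row in background]
for y in range(16, 32):
    for x in range(32, 64):
        frame[y][x] = 200

mask = block_difference_mask(frame, background)
for row in mask:
    print(row)
```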

[0021] FIG. 4 shows an example of a blocked and binarized difference image. Here, point p (x_C, y_C) is the head-top coordinate of the person and point q (X_C, Y_C) is the cutout center coordinate. In this way, only the person region is extracted as the difference.

[0022] The difference image binarized by the binarization processing unit 17 is input to the head-top coordinate calculation unit 18, which calculates the head-top coordinates. Many calculation methods are conceivable; the simplest, which determines the coordinates from the top of the person's head, is described here. For example, to detect the top of the head in the block difference image of FIG. 4, the image is scanned horizontally starting from the upper left, and the position where a difference is first detected is taken as the head-top coordinate (x_C, y_C). If the top of the head occupies one block in the horizontal direction, the center point of that block is taken as the horizontal head-top coordinate x_C; if it occupies two or more blocks, the average coordinate position of those blocks is taken as x_C. The coordinates obtained by the head-top coordinate calculation unit 18 are input to the center coordinate calculation unit 19 for the CIF/QCIF image cutout range, which calculates the center point used when cutting out the CIF or QCIF image.
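The raster scan for the top of the head can be sketched in Python as below. The function name is hypothetical, and two details are assumptions: coordinates are returned in pixels, and y_C is taken as the top edge of the first block row that contains a difference (the text does not say which point of the block is used vertically). All set blocks in that row are averaged, so isolated noise blocks are not handled.

```python
def head_top(mask, block=16):
    """Scan the block mask row by row from the upper left and return
    (x_C, y_C) in pixels for the first row that contains a difference.
    Returns None if the mask is empty."""
    for row_index, row in enumerate(mask):
        hits = [col for col, value in enumerate(row) if value]
        if hits:
            # One block: its center. Several blocks: the mean of their centers.
            centers = [col * block + block / 2 for col in hits]
            x_c = sum(centers) / len(centers)
            y_c = row_index * block  # assumption: top edge of the block row
            return x_c, y_c
    return None


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
print(head_top(mask))
```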

[0023] Specifically, a position a suitable number of blocks above the obtained vertical head-top coordinate y_C, for example two blocks above, is taken as the upper edge of the cutout frame, and the vertical cutout center coordinate Y_C is obtained from the position of this edge. The horizontal head-top coordinate is used directly as the horizontal cutout center coordinate X_C. This gives the center coordinates (X_C, Y_C) for the CIF or QCIF image cutout range. The center coordinates obtained are input to the CIF/QCIF image cutout address calculation unit 12 of FIG. 2.
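A possible reading of this step is sketched below in Python. The text says only that Y_C is obtained from the position of the upper edge; taking Y_C as that edge plus half the cutout height is an assumption, as are the function name, the pixel units, and a y axis that increases downward.

```python
def cutout_center(x_c, y_c, block=16, margin_blocks=2, out_h=288):
    """Derive the cutout center (X_C, Y_C) from the head-top point (x_C, y_C)."""
    # Upper edge of the cutout frame: a few blocks above the top of the head.
    top_edge = y_c - margin_blocks * block
    # Assumption: the vertical center lies half a cutout height below that edge.
    center_y = top_edge + out_h / 2
    # The horizontal head-top coordinate is used as X_C unchanged.
    return x_c, center_y


print(cutout_center(300, 112))              # CIF cutout, 288 lines
print(cutout_center(300, 112, out_h=144))   # QCIF cutout, 144 lines
```

If the head is near the top of the captured image, the upper edge can fall outside the frame; this sketch leaves that case to the address calculation stage.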

[0024] FIG. 5 shows an example in which 352 pixels and 288 lines are cut out from 587 pixels horizontally and 480 lines vertically.

(Embodiment 2) FIG. 6 illustrates a second embodiment of the present invention, a method of creating a CIF image from a PAL or SECAM signal. Reference numerals 1, 2, 3, 4, 8, 9, 11, 12, and 13 correspond to those in FIGS. 2 and 7. The PAL or SECAM television signal captured by the imaging unit 1 is band-limited by the LPF unit 2 and input to the A/D conversion unit 3. The sampling frequency is determined in the same way as for NTSC. For PAL or SECAM the number of effective lines in the vertical direction is 576, so if the number of effective pixels in the horizontal direction is p, the following relationship holds.

[0025] p : 576 = 352 : 288 ∴ p = 704

This number of horizontal pixels matches the number of effective pixels obtained when sampling at 13.5 MHz. Therefore, for PAL and SECAM the sampling frequency is set to 13.5 MHz, and the subsequent processing is the same as for NTSC described above.
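The PAL/SECAM case can be checked with integer arithmetic. In the Python sketch below, the line frequency of 15 625 Hz for 625-line systems is a standard value not stated in the text, and the variable names are illustrative.

```python
PAL_LINE_RATE_HZ = 15_625      # 625-line systems (standard value)
SAMPLING_HZ = 13_500_000       # sampling frequency chosen in the text

# Solve p : 576 = 352 : 288; the division is exact for these numbers.
effective_pixels = 576 * 352 // 288

# Total samples per scan line at 13.5 MHz, and the active share of the line.
samples_per_line = SAMPLING_HZ // PAL_LINE_RATE_HZ
active_fraction = effective_pixels / samples_per_line

print(effective_pixels, samples_per_line, f"{active_fraction:.3f}")
```

At 13.5 MHz a scan line holds 864 samples, so 704 effective pixels occupy roughly 81% of the line, comparable to the 83% effective period assumed for NTSC.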

[0026]

[Effects of the Invention] As described above, according to the present invention, to create a CIF or QCIF image the subject person is captured at a wide angle in advance, the person's face area is detected by image processing, center coordinates for the cutout are calculated, and a rectangular range including the face area is cut out at CIF or QCIF pixel dimensions. When the face moves, the cutout range is changed within the captured range according to the movement of the face area, so that the displayed image follows the face. As a result, the face area does not leave the camera's picture frame and can always be transmitted to the other party, which has the advantage of enabling good conversation over a videophone.

[Brief description of drawings]

FIG. 1 shows the principle configuration of the present invention.

FIG. 2 illustrates the first embodiment of the present invention, a method of creating a CIF image from an NTSC signal.

FIG. 3 shows the moving area detection unit of FIG. 2 in detail.

FIG. 4 shows an example of a blocked and binarized difference image.

FIG. 5 shows an example of the cutout.

FIG. 6 illustrates the second embodiment of the present invention, a method of creating a CIF image from a PAL or SECAM signal.

FIG. 7 shows an example of the conventional method, in which an NTSC signal is sampled at 13.5 MHz, digital color separation is performed, an image of 704 pixels horizontally by 480 lines vertically is created, and a CIF image is created from that image.

[Explanation of symbols]

1 Imaging unit
2 Band-limiting low-pass filter (LPF) unit
3 A/D conversion unit
4 Digital Y/C separation unit
5 Horizontal reduction filter unit
6 Non-interlace conversion unit
7 Vertical reduction filter unit
8 Encoding processing unit
9 Network interface processing unit
10 Analog Y/C separation unit
11 Moving area detection unit
12 CIF/QCIF image cutout address calculation unit
13 Frame memory unit
14 Background difference calculation unit
15 Background reference image memory unit
16 Blocking processing unit
17 Binarization processing unit
18 Head-top (parietal) coordinate calculation unit
19 Center coordinate calculation unit for CIF/QCIF image cutout

Claims

[Claims]

1. A CIF image conversion system for a videophone that obtains a CIF image from a television signal, comprising: processing means for A/D converting the television signal; a frame memory for storing the A/D-converted television signal; means for detecting a moving area in the television signal; and means for calculating an address for cutting out, from the signal stored in the frame memory, a rectangular area including the detected moving area.

2. The CIF image conversion system for a videophone according to claim 1, wherein, when the television signal is of the NTSC system, the sampling frequency for the A/D conversion is 11.1 MHz.

3. The CIF image conversion system for a videophone according to claim 1, wherein, when the television signal is of the PAL or SECAM system, the sampling frequency for the A/D conversion is 13.5 MHz.