KR100289698B1

KR100289698B1 - Method and apparatus for coding object information of image object plane

Info

Publication number: KR100289698B1
Application number: KR1019970071297A
Authority: KR
Inventors: 한석원
Original assignee: 전주범; 대우전자주식회사
Priority date: 1997-12-20
Filing date: 1997-12-20
Publication date: 2001-05-15
Anticipated expiration: 2017-12-20
Also published as: KR19990051878A

Abstract

PURPOSE: A method for coding object information of a VOP (video object plane) is provided that adds each object symbol to a background pixel included in a boundary block to generate processing blocks, and codes the processing blocks, which include object blocks, one or more processed boundary blocks containing object symbols, and the remaining unprocessed boundary blocks, to generate a coded video signal, thereby improving the coding efficiency of video signals carrying a large amount of object information. CONSTITUTION: A block detection unit (332) divides a video image into a plurality of blocks and classifies each block as a background block, a boundary block, or an object block. A background block includes background pixels only. A boundary block includes both object and background pixels. An object block includes object pixels only. A randomized object data generator (336) generates a set of binary object symbols representing the object information. A processing block generator (338) combines the object symbols with the object and boundary blocks to generate processing blocks. A conversion unit (334) performs transform coding on the processing blocks to generate coded object information.

Description

Object information encoding method and apparatus of an image object plane {METHOD AND APPARATUS FOR CODING OBJECT INFORMATION OF IMAGE OBJECT PLANE}

The present invention relates to a method and apparatus for encoding object information of a video object plane and, more particularly, to a method and apparatus that encode such object information with improved coding efficiency.

In digital television systems such as video telephone, teleconference, and high-definition television systems, a large amount of digital data is needed to define each video frame signal, because a video line signal in the video frame signal comprises a sequence of digital data referred to as pixel values. However, since the available frequency bandwidth of a conventional transmission channel is limited, it is inevitable to compress or reduce the volume of data through various data compression techniques in order to transmit the large amount of digital data through it, especially in the case of low bit-rate video signal encoders such as video telephone and teleconference systems.

One of the techniques for encoding video signals in such a low bit-rate encoding system is the so-called object-oriented analysis-synthesis coding technique (see MPEG-4 Video Verification Model Version 7.0, International Organization for Standardization, Coding of Moving Pictures and Associated Audio Information, ISO/IEC JTC1/SC29/WG11 MPEG97/N1642, Bristol, April 1997). According to this technique, an input video image is divided into objects, i.e., video object planes (VOPs), where a VOP corresponds to an entity in a bitstream that a user can access and manipulate; and three sets of parameters defining the motion, shape, and texture information of each object are processed through different encoding channels.

A VOP, which may be referred to as an object, is represented by a bounding rectangle surrounding each object whose width and height are the smallest multiples of 16 pixels, so that the encoder processes the input video image on a VOP-by-VOP basis, i.e., an object-by-object basis. The VOP includes texture information representing the luminance and chrominance of the object, and shape information representing the shape and position of the object.

In the texture information, a pixel is represented by one of the values ranging from, e.g., 1 to 255; in the shape information, one binary value, e.g., 0, designates a pixel located outside the object in the VOP, i.e., a background pixel, and another binary value, e.g., 255, designates a pixel inside the object, i.e., an object pixel.

Referring to FIG. 1, there is shown a conventional encoding apparatus 100 for encoding the texture information of a VOP. A converter 105 converts the object pixel values in the shape information of the VOP to 1 and provides the converted shape information, in which object pixels have the value 1 and background pixels the value 0, to a multiplier 110 and a block selection unit 125. The multiplier 110 multiplies the texture information of the VOP by the converted shape information to generate converted texture information, which is provided to a padding unit 120. In the converted texture information, the object pixels retain their original values, while the pixels outside the object have the value 0.
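
The converter and multiplier stages described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mask_texture` and the nested-list image representation are assumptions, not part of the specification.

```python
def mask_texture(texture, shape, object_value=255):
    """Return (converted shape, converted texture) for one VOP.

    texture and shape are equally sized 2-D lists; object_value is the
    value marking object pixels in the shape information (e.g., 255).
    """
    # Converter: object pixels become 1, background pixels become 0.
    converted = [[1 if s == object_value else 0 for s in row] for row in shape]
    # Multiplier: object pixels keep their value, background pixels become 0.
    masked = [[t * c for t, c in zip(t_row, c_row)]
              for t_row, c_row in zip(texture, converted)]
    return converted, masked


texture = [[12, 40, 200], [35, 90, 180]]
shape = [[0, 255, 255], [0, 0, 255]]
converted, masked = mask_texture(texture, shape)
print(converted)  # [[0, 1, 1], [0, 0, 1]]
print(masked)     # [[0, 40, 200], [0, 0, 180]]
```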

To improve data compression efficiency, the padding unit 120 pads the background pixels in the converted texture information by using a conventional repetitive padding technique, in which the background pixels are filled with new values derived from the boundary pixel values of the object. Referring to FIG. 2, there is shown a VOP 10 represented by the padded texture information; the VOP 10 includes an object 15, shown as the hatched region, and a background 20, shown as the unhatched region. The block selection unit 125 detects the object 15 and the background 20 in the VOP 10 based on the converted shape information and the padded texture information, and divides the VOP 10 into a plurality of DCT (discrete cosine transform) blocks of, e.g., 8 × 8 pixels, such as DCT blocks 1 to 12. The block selection unit 125 then selects the DCT blocks that overlap the object 15 and provides them to a DCT unit 130 as processing blocks.

In the example shown in FIG. 2, DCT block 1 does not overlap the object 15; therefore, only DCT blocks 2 to 12 are selected as processing blocks. In the DCT unit 130, each processing block is converted into a set of DCT coefficients, which is provided to a quantization unit 140. In the quantization unit 140, the set of DCT coefficients is quantized and sent to a transmitter (not shown) for transmission.

Meanwhile, object information, e.g., the index, title, author, and user-editability of an object, is encoded separately from the motion, shape, and texture information of the object and transmitted as a header of the encoded sequence of the object. Consequently, when many objects are to be transmitted, the amount of object information grows and the coding efficiency is degraded.

It is, therefore, an object of the present invention to provide a method and apparatus for efficiently encoding object information of a video object plane (VOP).

To achieve this object, according to one aspect of the present invention, there is provided a method for encoding a video signal including a video object plane (VOP) and object information thereof, the VOP containing an object, the method comprising the steps of: (a) dividing the VOP into a plurality of blocks and detecting boundary blocks and object blocks, wherein each boundary block includes both background pixels and object pixels, each object block includes only object pixels, and the background and object pixels are pixels located outside and inside the object, respectively; (b) converting the object information, represented by binary numbers, into a set of object symbols; (c) adding each object symbol to a background pixel included in a boundary block to generate processing blocks, wherein the processing blocks include the object blocks, one or more processed boundary blocks containing the object symbols, and the remaining unprocessed boundary blocks containing no object symbols; and (d) encoding the processing blocks to generate encoded object information.

According to another aspect of the present invention, there is provided an apparatus for encoding a video signal including a video image that contains an object and object information, the video image consisting of object pixels located inside the object and background pixels located outside the object, the apparatus comprising: video image dividing means for dividing the video image into a plurality of blocks and classifying each block as a background block, a boundary block, or an object block, wherein a background block includes only background pixels, a boundary block includes both object and background pixels, and an object block includes only object pixels; object symbol set generating means for generating a set of binary object symbols representing the object information; processing block generating means for combining the object symbols with the object blocks and the boundary blocks to generate processing blocks; and means for transform-coding the processing blocks to generate encoded object information.

FIG. 1 is a block diagram of a conventional encoding apparatus for encoding information of a video object plane (VOP);

FIG. 2 illustrates a VOP divided into a plurality of DCT blocks;

FIG. 3 is a block diagram of an apparatus for encoding object information in accordance with a preferred embodiment of the present invention; and

FIG. 4 is a detailed block diagram of the object information insertion unit shown in FIG. 3.

<Description of reference numerals for the principal parts of the drawings>

1: background block
2-6, 8-12: boundary blocks
7: object block
10: VOP
15: object
20: background
305: converter
310: multiplier
320: padding unit
330: object information insertion unit
332: block detection unit
334: conversion unit
336: randomized object data generation unit
338: processing block generation unit
340: transform unit
350: quantization unit
360: modulus unit
370: VLC unit

Referring to FIG. 3, there is shown a block diagram of an apparatus 300 for encoding object information in accordance with a preferred embodiment of the present invention. The functions and features of a converter 305, a multiplier 310, and a padding unit 320 are identical to those of their counterparts 105, 110, and 120 shown in FIG. 1 and, for simplicity, are not described again. The padded texture information from the padding unit 320 and the converted shape information from the converter 305 are applied to an object information insertion unit 330, which also receives the object information. The detailed process by which the object information insertion unit 330 encodes the object information is described with reference to FIGS. 2 and 4, FIG. 4 being a detailed block diagram of the object information insertion unit 330.

Similarly to the block selection unit 125 shown in FIG. 1, a block detection unit 332 detects the DCT blocks that overlap the object 15. Among the DCT blocks in the VOP 10, DCT block 1 does not overlap the object 15 and consists only of background pixels, so it is determined to be a background DCT block; DCT block 7 consists only of object pixels and is determined to be an object DCT block; and the remaining DCT blocks 2 to 6 and 8 to 12, which include both object and background pixels, are boundary DCT blocks. After detecting the object and boundary DCT blocks, the block detection unit 332 attaches an identification flag to each background pixel, the flag indicating that the flagged pixel belongs to the background 20. The object and boundary DCT blocks are provided to a processing block generation unit 338 as processing DCT blocks.
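
The three-way block classification can be illustrated with the following Python sketch. It assumes a 0/1 shape mask whose dimensions are exact multiples of the block size; the function name `classify_blocks` and the string labels are hypothetical, chosen only for this example.

```python
def classify_blocks(mask, block=8):
    """Label each block x block tile of a 0/1 mask, in raster order.

    Assumes the mask height and width are exact multiples of `block`.
    """
    labels = []
    for top in range(0, len(mask), block):
        for left in range(0, len(mask[0]), block):
            pixels = [mask[y][x]
                      for y in range(top, top + block)
                      for x in range(left, left + block)]
            if all(pixels):
                labels.append("object")      # object pixels only
            elif any(pixels):
                labels.append("boundary")    # object and background pixels
            else:
                labels.append("background")  # background pixels only
    return labels


# A 2 x 6 mask split into three 2 x 2 blocks for brevity.
mask = [[0, 0, 0, 1, 1, 1],
        [0, 0, 1, 1, 1, 1]]
print(classify_blocks(mask, block=2))  # ['background', 'boundary', 'object']
```

In this sketch the mask itself plays the role of the identification flag: a 0 entry marks a background pixel.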

Meanwhile, the object information is input to a conversion unit 334, where each character, number, symbol, or the like representing the object information is converted into a binary symbol represented by a binary number of, e.g., 8 bits. The set of binary symbols generated by the conversion unit 334 is applied to a randomized object data generation unit 336. The set of binary symbols forms a regular pattern that differs considerably from the pattern of the object pixels and would therefore lower the coding efficiency of the subsequent transform process, e.g., DCT. For this reason, the randomized object data generation unit 336 generates a set of randomized object symbols by multiplying each binary symbol by a random number.

The random number is selected from a preset set of random numbers in a predetermined manner. The random numbers should be large enough that the object information is not corrupted in the subsequent quantization process. The set of randomized object symbols from the randomized object data generation unit 336 is applied to the processing block generation unit 338, which detects the background pixels in the processing DCT blocks based on the attached identification flags.
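
The symbol conversion and randomization steps can be sketched as follows. Several details here are assumptions made only for illustration: ASCII is used as the 8-bit code, the preset set `PRESET` contains arbitrary small values, and the "predetermined manner" of selection is taken to be cycling through the set in order.

```python
PRESET = [3, 5, 7]  # hypothetical preset random-number set


def to_binary_symbols(info):
    """Convert each character into one 8-bit value (ASCII assumed)."""
    return list(info.encode("ascii"))


def randomize(symbols, random_numbers):
    """Multiply each symbol by a number picked cyclically from the set."""
    return [s * random_numbers[i % len(random_numbers)]
            for i, s in enumerate(symbols)]


symbols = to_binary_symbols("VOP")
randomized = randomize(symbols, PRESET)
print(symbols)     # [86, 79, 80]
print(randomized)  # [258, 395, 560]
```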

In general, blocks are processed in raster scanning order, from left to right and from top to bottom. The processing DCT blocks shown in FIG. 2 are therefore processed in the order of DCT blocks 2 to 12. The processing block generation unit 338 finds the first processing DCT block, i.e., DCT block 2, and adds the randomized object symbols, one by one, to the padded background pixel values selected in raster scanning order within DCT block 2. If randomized object symbols remain that have not yet been added, they are added in the same manner to the padded background pixels of the subsequent boundary DCT blocks until all randomized object symbols have been processed. Once all randomized object symbols have been processed, one or more processed boundary DCT blocks contain the randomized object symbol information, while the remaining unprocessed boundary DCT blocks and all object DCT blocks are left unchanged.
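
A minimal sketch of this insertion step is given below. It assumes the processing blocks are already listed in processing order and that the identification flags are available as per-block boolean arrays; the function name `embed_symbols` and the error raised when the symbols do not fit are illustrative choices, not taken from the specification.

```python
def embed_symbols(blocks, flags, symbols):
    """Add symbols to flagged background pixels in raster order.

    blocks:  list of 2-D pixel blocks, already in processing order
    flags:   same layout as blocks; True marks a background pixel
    symbols: randomized object symbols to insert, in order
    """
    out = [[row[:] for row in block] for block in blocks]  # leave input intact
    k = 0  # index of the next symbol to place
    for block, flag in zip(out, flags):
        for y in range(len(block)):
            for x in range(len(block[y])):
                if k < len(symbols) and flag[y][x]:
                    block[y][x] += symbols[k]
                    k += 1
    if k < len(symbols):
        raise ValueError("not enough background pixels for all symbols")
    return out


blocks = [[[10, 50], [10, 60]], [[70, 20], [80, 20]]]
flags = [[[True, False], [True, False]], [[False, True], [False, True]]]
print(embed_symbols(blocks, flags, [258, 395, 560]))
# [[[268, 50], [405, 60]], [[70, 580], [80, 20]]]
```

An object block has no flagged pixels, so it passes through unchanged, as does any boundary block reached after the last symbol has been placed.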

Referring back to FIG. 3, a transform unit 340 converts each processing block into a set of transform coefficients by using, e.g., a conventional discrete cosine transform (DCT) technique and provides it to a quantization unit 350, where the set of transform coefficients is quantized into a set of quantized transform coefficients and provided to a modulus unit 360.
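
As a rough illustration of the transform and quantization stage, the following Python sketch applies an orthonormal 2-D DCT-II to a square block and then a uniform quantizer. The function names, the orthonormal scaling, and the step size of 16 are assumptions for illustration; the specification states only that a conventional DCT and quantization are used.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct, unoptimized form)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    return [[alpha(u) * alpha(v) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for y in range(n) for x in range(n))
             for v in range(n)]
            for u in range(n)]


def quantize(coeffs, step=16):
    """Uniform quantization with an assumed step size."""
    return [[round(c / step) for c in row] for row in coeffs]


flat = [[100] * 8 for _ in range(8)]   # constant 8 x 8 block
q = quantize(dct_2d(flat))
print(q[0][0])  # 50: only the DC coefficient is non-zero for a flat block
```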

In the modulus unit 360, a modulus process based on a conventional modulus technique is performed on values exceeding a preset maximum value. The range of the quantized transform coefficient values is limited to the preset maximum value by subtracting an integer multiple of the preset maximum value, i.e., the modulus, which is, e.g., 255. For example, if the maximum value in the subsequent statistical coding process based on, e.g., a VLC (variable length coding) technique is 255 and the quantized transform coefficient corresponding to a background pixel containing a randomized object symbol has the value 520, the quantized transform coefficient value is reduced to 10 (= 520 - 255 × 2) and modulus information is attached, the modulus information indicating the number of modulus operations performed (here, 2).

Likewise, if the quantized transform coefficient corresponding to a background pixel containing a randomized object symbol has the value 270, the quantized transform coefficient value becomes 15 (= 270 - 255 × 1) and the modulus information is 1. In a VLC unit 370, the modulus-processed data is VLC coded and sent to a transmitter (not shown) for transmission.
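
The two worked examples above can be reproduced with a short Python sketch. The function names are hypothetical, and the handling is restricted to non-negative values, since the text does not say how negative coefficients are treated.

```python
def modulus(value, maximum=255):
    """Subtract `maximum` until the value no longer exceeds it.

    Returns (reduced value, modulus information), where the modulus
    information is the number of subtractions performed.
    Only non-negative inputs are considered in this sketch.
    """
    count = 0
    while value > maximum:
        value -= maximum
        count += 1
    return value, count


def inverse_modulus(reduced, count, maximum=255):
    """Decoder-side inverse: restore the original coefficient value."""
    return reduced + count * maximum


print(modulus(520))           # (10, 2)
print(modulus(270))           # (15, 1)
print(inverse_modulus(10, 2)) # 520
```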

At a decoder on the receiving end, the transmitted VLC-coded data is reconstructed through a sequence of variable length decoding, inverse modulus, inverse quantization, and inverse transform processes. The texture information of the object is then obtained from the reconstructed processing blocks and the shape information of the object transmitted from the transmitter. Next, an inverse padding process is performed based on the boundary pixels of the reconstructed texture information, whereby the padded pixel values of the background pixels are removed. The values that remain at these pixels are then divided by the preset random numbers stored in advance in the decoder to produce the binary symbols corresponding to the object information. These binary symbols are translated into the object information.
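
The final division step at the decoder can be sketched as below. The preset set `PRESET`, the cyclic selection order, the ASCII decoding, and in particular the rounding to the nearest integer are assumptions of this example; the text states only that the remaining values are divided by the stored random numbers.

```python
PRESET = [3, 5, 7]  # hypothetical set, identical to the one at the encoder


def recover_info(values, random_numbers):
    """Divide each remaining value by its random number and decode.

    Rounding to the nearest integer absorbs small reconstruction errors;
    this is an illustrative choice, not stated in the specification.
    """
    symbols = [round(v / random_numbers[i % len(random_numbers)])
               for i, v in enumerate(values)]
    return bytes(symbols).decode("ascii")


print(recover_info([258, 395, 560], PRESET))  # VOP (exact values)
print(recover_info([259, 393, 562], PRESET))  # VOP (slightly perturbed values)
```

The second call hints at why larger random numbers help: the larger the multiplier, the larger the reconstruction error that can be tolerated before a symbol rounds to the wrong value.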

While the present invention has been described with respect to particular embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the scope of the invention as defined in the claims.

As described above, according to the present invention, when a VOP contains an object, the VOP is divided into a plurality of blocks; background blocks containing only background pixels, object blocks containing only object pixels, and boundary blocks containing both are detected, where background pixels lie outside the object and object pixels lie inside it; the object information is converted into a set of object symbols; each object symbol is added to a background pixel included in a boundary block to generate processing blocks; and the processing blocks, which include the object blocks, one or more processed boundary blocks containing the object symbols, and the remaining unprocessed boundary blocks, are encoded to generate an encoded video signal. The coding efficiency of a video signal carrying a large amount of object information can thereby be improved.

Claims

A method for encoding a video signal including a video object plane (VOP) and object information thereof, the VOP containing an object, the method comprising the steps of:

(a) dividing the VOP into a plurality of blocks and detecting boundary blocks and object blocks, wherein each boundary block includes both background pixels and object pixels, each object block includes only object pixels, and the background and object pixels are pixels located outside and inside the object, respectively;

(b) converting the object information, represented by binary numbers, into a set of object symbols;

(c) adding each object symbol to a background pixel included in a boundary block to generate processing blocks, wherein the processing blocks include the object blocks, one or more processed boundary blocks containing the object symbols, and the remaining unprocessed boundary blocks containing no object symbols; and

(d) encoding the processing blocks to generate an encoded object information signal.

The method of claim 1, wherein the converting step (b) includes the steps of:

(b1) converting the object information into a set of binary symbols, wherein each binary symbol is represented by P bits, P being a positive integer; and

(b2) generating the set of object symbols by multiplying each binary symbol by a random number selected from a preset set of random numbers.

The method of claim 1, wherein the encoding step (d) includes the steps of:

(d1) transforming each processing block to provide a set of transform coefficients;

(d2) quantizing the set of transform coefficients to generate a set of quantized coefficients; and

(d3) encoding the set of quantized coefficients based on a statistical coding technique to generate an encoded video signal.

The method of claim 1, wherein the adding step (c) includes the steps of:

(c1) detecting one or more boundary blocks with which the object symbols are to be combined; and

(c2) generating the processed boundary blocks by adding each object symbol to a background pixel included in the detected boundary blocks.

An apparatus for encoding a video signal including a video image that contains an object and object information, the video image consisting of object pixels located inside the object and background pixels located outside the object, the apparatus comprising:

video image dividing means for dividing the video image into a plurality of blocks and classifying each block as a background block, a boundary block, or an object block, wherein a background block includes only background pixels, a boundary block includes both object and background pixels, and an object block includes only object pixels;

object symbol set generating means for generating a set of binary object symbols representing the object information;

processing block generating means for combining the object symbols with the object blocks and the boundary blocks to generate processing blocks; and

means for transform-coding the processing blocks to generate encoded object information.

The apparatus of claim 5, wherein the object symbol set generating means includes:

means for converting the object information into a set of binary symbols, wherein each binary symbol is represented by a predetermined number of binary digits; and

means for obtaining the set of object symbols by multiplying each binary symbol by a random number selected in a predetermined manner from a preset set of random numbers.

The apparatus of claim 5, wherein the processing blocks include the object blocks, one or more processed boundary blocks containing the object symbols, and the remaining boundary blocks containing no object information.

The apparatus of claim 5, wherein the processing block generating means includes:

means for finding one or more boundary blocks with which the object symbols are to be combined; and

means for generating the processed boundary blocks by adding each object symbol to a background pixel included in the found boundary blocks.

The apparatus of claim 5, wherein the video image dividing means attaches an identification flag to the background pixels included in the boundary blocks, the identification flag attached to a pixel indicating that the pixel is a background pixel.

The apparatus of claim 9, wherein the one or more boundary blocks and the background pixels to which each object symbol is added are detected by means of the identification flag.