KR100780124B1

KR100780124B1 - Encoding and Decoding of Images

Info

Publication number: KR100780124B1
Application number: KR1020067011139A
Authority: KR
Inventors: 라즈 쿠마르; 토마스 펀; 히-정 우; 크리스챤 드바이비에르
Original assignee: 애플 인크.
Priority date: 2004-06-27
Filing date: 2005-06-24
Publication date: 2007-11-27
Anticipated expiration: 2025-06-24
Also published as: KR20070031850A

Abstract

일부 실시예들은 비디오 시퀀스의 제 2 이미지를 참조하여 제 1 이미지의 제 1 세트 픽셀들을 인코딩하기 위한 방법을 제공한다. 제 2 이미지 내의 제 1 서치 윈도에서, 본 방법은 제 1 이미지의 제 1 세트 픽셀들과 최선으로 매칭되는 제 2 이미지의 제 1 특정 부분을 식별하기 위해 서치한다. 제 2 이미지 내의 제 1 서치 윈도에서, 본 방법은, 제 1 특정 부분에 대응되는 제 1 위치를 식별한다. 제 2 이미지 내의 제 2 서치 윈도에서, 본 방법은 다음으로, 제 1 이미지의 제 1 세트 픽셀들에 최선으로 매칭되는 제 2 이미지의 제 2 특정 부분을 식별하기 위해 서치하는데, 여기에서, 제 2 서치 윈도는 제 1 위치 부근에 정의된다. Some embodiments provide a method for encoding first set pixels of a first image with reference to a second image of a video sequence. In a first search window in the second image, the method searches to identify a first specific portion of the second image that best matches the first set of pixels of the first image. In a first search window in the second image, the method identifies a first location corresponding to the first specific portion. In a second search window in the second image, the method then searches to identify a second specific portion of the second image that best matches the first set of pixels of the first image, wherein the second The search window is defined near the first position.

비디오 시퀀스, 이미지, 픽셀, 인터블록, 인코딩, 디코딩, 매칭, 서치 윈도, 위치 Video sequence, image, pixel, interblock, encoding, decoding, matching, search window, position

Description

Encoding and decoding of images {ENCODING AND DECODING IMAGES}

본 발명은 이미지들을 인코딩 및 디코딩하는 방법에 관한 것이다. The present invention relates to a method of encoding and decoding images.

비디오 코덱들은 좀더 빠른 전송과 좀더 작은 저장 공간을 위해 비디오 데이터 스트림들을 인코딩(즉, 압축) 및 디코딩(즉, 압축 해제)하여 스트림들의 사이즈를 감소시키도록 설계된 압축 알고리즘들이다. 유손실이긴 하지만, 현재의 비디오 코덱들은 비디오 스트림의 2진 데이터를 압축하는 것과 동시에 비디오 품질을 유지하고자 한다. Video codecs are compression algorithms designed to reduce the size of streams by encoding (ie, compressing) and decoding (ie, decompressing) video data streams for faster transmission and smaller storage space. Although lossy, current video codecs seek to maintain video quality while compressing the binary data of the video stream.

비디오 스트림은 통상적으로 비디오 프레임들의 시퀀스에 의해 형성된다. 비디오 인코더들은 대개 각 프레임을 몇 개의 매크로블록들로 분할하는데, 각각의 매크로블록은 16×16 픽셀들의 세트이다. 비디오 인코더들은 통상적으로 프레임내 인코딩(intraframe encoding) 또는 프레임간 인코딩(interframe encoding)을 사용해 비디오 프레임들 내의 비디오 프레임들 또는 매크로블록들을 인코딩한다. 프레임내 인코딩형 프레임 또는 매크로블록은 다른 프레임들 또는 다른 프레임들의 매크로블록들과는 무관하게 인코딩된 프레임 또는 매크로블록이다. A video stream is typically formed by a sequence of video frames. Video encoders usually divide each frame into several macroblocks, where each macroblock is a set of 16 × 16 pixels. Video encoders typically encode video frames or macroblocks within video frames using intraframe encoding or interframe encoding. An intra frame encoded frame or macroblock is an encoded frame or macroblock independent of other frames or macroblocks of other frames.

프레임간 인코딩형 프레임 또는 매크로블록은 하나 이상의 다른 프레임들 또는 다른 프레임들의 매크로블록들을 참조하여 인코딩되는 프레임 또는 매크로블록 이다. 인터블록 인코딩은 통상적으로 시간 소모적인데, 이 인코딩은 특정 프레임의 매크로블록들 또는 매크로블록들 내의 파티션들을 다른 기준 프레임의 매크로블록들 또는 매크로블록들 내의 파티션들과 비교해야 하기 때문이다. 따라서, 업계에서는 좀더 효율적인 인터블록 인코딩 방법들을 필요로 한다. 이상적으로, 이러한 인코딩 방법들은 인코딩 및 디코딩 연산들을 가속할 것이다. An interframe encoded frame or macroblock is a frame or macroblock that is encoded with reference to one or more other frames or macroblocks of other frames. Interblock encoding is typically time consuming because the encoding must compare macroblocks or partitions within macroblocks of a particular frame to macroblocks or partitions within macroblocks of another reference frame. Therefore, the industry needs more efficient interblock encoding methods. Ideally, these encoding methods would accelerate encoding and decoding operations.

일부 실시예들은 제 1 이미지의 제 1 세트 픽셀들을 비디오 시퀀스의 제 2 이미지를 참조하여 인코딩하기 위한 방법을 제공한다. 제 2 이미지 내의 제 1 서치 윈도(search window)에서, 본 방법은 제 1 이미지의 제 1 세트 픽셀들과 최선으로 매칭되는 제 2 이미지의 특정한 제 1 부분을 식별하기 위해 서치한다. 제 2 이미지 내의 제 1 서치 윈도에서, 본 방법은 특정한 제 1 부분에 대응되는 제 1 위치를 식별한다. 제 1 위치 부근에서 정의되는 제 2 이미지의 제 2 서치 윈도에서, 본 방법은 그 다음으로, 제 1 이미지의 제 1 세트 픽셀들과 최선으로 매칭되는 제 2 이미지의 특정한 제 2 부분을 식별하기 위해 서치한다. Some embodiments provide a method for encoding a first set of pixels of a first image with reference to a second image of a video sequence. In a first search window in the second image, the method searches to identify a particular first portion of the second image that best matches the first set of pixels of the first image. In a first search window in the second image, the method identifies a first location corresponding to the particular first portion. In a second search window of the second image defined near the first position, the method then determines to identify a particular second portion of the second image that best matches the first set of pixels of the first image. Search.

일부 실시예들은 비디오 시퀀스의 이미지들을 인터블록 인코딩하기 위한 방법을 제공한다. 비디오 시퀀스의 각 이미지는 몇 개의 정수 픽셀 위치들(integer pixel locations)을 갖는데, 각각의 정수 픽셀 위치는 하나 이상의 이미지 값(예를 들어, 휘도 값)을 가진다. 본 방법은 제 2 이미지를 참조하여 인코딩하기 위한 제 1 이미지를 선택한다. 그 다음, 본 방법은 제 1 이미지의 픽셀들의 세트와 매칭되는 제 2 이미지의 비-정수 픽셀 위치들의 세트를 식별한다. 이러한 식별은 제 2 이미지의 몇 가지 정수 픽셀 위치들에 대한 이미지 값들로부터 제 2 이미지의 비-정수 픽셀 위치들과 연관된 이미지 값들을 보간하는 단계를 수반한다. 본 방법은 비-정수 픽셀 위치들의 보간된 이미지 값들을, 제 2 이미지를 참조하여 제 3 이미지를 인코딩하는 동안의 후속 사용을 위해, 저장한다.Some embodiments provide a method for interblock encoding of images of a video sequence. Each image of the video sequence has several integer pixel locations, each integer pixel location having one or more image values (eg, luminance values). The method selects a first image for encoding with reference to the second image. The method then identifies a set of non-integer pixel locations of the second image that match the set of pixels of the first image. This identification involves interpolating image values associated with non-integer pixel positions of the second image from image values for some integer pixel positions of the second image. The method stores interpolated image values of non-integer pixel positions for subsequent use while encoding the third image with reference to the second image.

일부 실시예들은 비디오 시퀀스의 이미지들을 인터블록 디코딩하기 위한 방법을 제공한다. 비디오 시퀀스의 각 이미지는 몇 가지 정수 픽셀 위치들을 갖는데, 각각의 정수 픽셀 위치는 하나 이상의 이미지 값(예를 들어, 휘도 값)을 가진다. 본 방법은 제 2 이미지를 참조하여 디코딩하기 위한 제 1 이미지를 선택한다. 그 다음, 본 방법은 제 1 이미지를 식별한다. 그 다음, 본 방법은 제 2 이미지의 몇 가지 정수 픽셀 위치들에 대한 이미지 값들로부터 제 2 이미지의 비-정수 픽셀 위치들과 연관된 이미지 값들을 보간한다. 본 방법은 비-정수 픽셀 위치들의 보간된 이미지 값들을, 제 2 이미지를 참조하여 제 3 이미지를 디코딩하는 동안의 후속 사용을 위해, 저장한다.Some embodiments provide a method for interblock decoding images of a video sequence. Each image of the video sequence has several integer pixel positions, each integer pixel position having one or more image values (eg, luminance values). The method selects a first image for decoding with reference to the second image. The method then identifies the first image. The method then interpolates the image values associated with the non-integer pixel positions of the second image from the image values for some integer pixel positions of the second image. The method stores interpolated image values of non-integer pixel positions for subsequent use while decoding the third image with reference to the second image.

일부 실시예들은 제 1 이미지의 제 1 부분을 비디오 이미지들의 시퀀스에서의 제 2 이미지를 참조하여 인터블록 프로세싱하기 위한 방법을 제공한다. 본 방법은 제 2 이미지를 한 세트의 타일들로 분할하고 타일들을 제 1의 비-캐시 메모리 스토리지에 저장한다. 제 1 이미지의 제 1 부분을 제 2 이미지 부분과 매칭하기 위해 타일들의 서브-세트가 필요할 때마다, 본 방법은 제 1의 비-캐시 메모리 스토리지로부터 타일들의 서브-세트를 검색하고, 제 1 부분과 타일들의 검색된 서브-세트의 일부인 제 2 이미지 부분들간의 빠른 비교들을 위해, 타일들의 검색된 서브-세트를 제 2의 캐시 메모리 스토리지에 저장한다. 타일들의 검색된 서브-세트는 타일들의 전체 세트보다 작다. Some embodiments provide a method for interblock processing a first portion of a first image with reference to a second image in a sequence of video images. The method divides the second image into a set of tiles and stores the tiles in the first non-cache memory storage. Each time a sub-set of tiles is needed to match the first portion of the first image with the second image portion, the method retrieves the sub-set of tiles from the first non-cache memory storage, and the first portion. And for a quick comparison between the second image portions that are part of the retrieved subset of tiles, store the retrieved subset of tiles in a second cache memory storage. The retrieved sub-set of tiles is smaller than the full set of tiles.

일부 실시예들에서, 본 방법은, 본 방법이 제 1 부분에 매칭되는 제 2 이미지 부분을 식별하기 위해 서치할, 타일들의 서브-세트에 대응되는, 제 2 이미지의 위치를 식별할 때, 타일들의 서브-세트가 검색되어 제 2 캐시 메모리 스토리지에 저장될 필요가 있는지를 판정한다. 일부 실시예들에서, 캐시 메모리 스토리지는 컴퓨터의 RAM인 한편, 비-캐시 메모리 스토리지는 컴퓨터의 비-휘발성 스토리지이다. 또한, 인터블록 프로세싱 방법이 일부 실시예들에서는 인터블록 인코딩 방법인 한편, 다른 실시예들에서는 인터블록 디코딩 방법이다. 또한, 일부 실시예들에서의 타일들의 세트는 2 이상의 수평으로 인접한 타일들 및 2 이상의 수직으로 인접한 타일들을 포함한다. In some embodiments, the method identifies a tile when identifying the location of the second image, corresponding to a sub-set of tiles, to be searched to identify a second image portion that matches the first portion. To determine if a subset of these needs to be retrieved and stored in the second cache memory storage. In some embodiments, cache memory storage is a computer's RAM, while non-cache memory storage is a computer's non-volatile storage. Also, while the interblock processing method is an interblock encoding method in some embodiments, it is an interblock decoding method in other embodiments. Further, the set of tiles in some embodiments includes two or more horizontally adjacent tiles and two or more vertically adjacent tiles.

일부 실시예들은, 제 1 세트 픽셀들과 매칭될 수 있는 제 2 이미지 부분들을 검사하기 위한 패턴을 각각 정의하는 한 세트의 서치 패턴들로부터 제 1 서치 패턴을 선택하는 것에 의해, 제 1 비디오 이미지의 제 1 세트 픽셀들을 인코딩하는 인터블록 인코딩 방법을 제공한다. 이러한 인코딩 방법은, 한 세트의 기준들에 기초해, 서치 패턴들의 세트에서 제 1 서치 패턴을 적응적으로 선택한다. 일부 실시예들에서의 기준들의 세트는 비디오 이미지의 미디어 유형을 포함한다.Some embodiments provide a method of selecting a first search pattern from a set of search patterns that each define a pattern for inspecting second image portions that may match the first set of pixels, thereby selecting a first search pattern of the first video image. An interblock encoding method for encoding first set of pixels is provided. This encoding method adaptively selects a first search pattern in a set of search patterns based on a set of criteria. The set of criteria in some embodiments includes the media type of the video image.

본 발명의 신규한 사양들이, 첨부된 청구항들에서 기술된다. 그러나, 설명을 위해, 다음 도면들에서는 본 발명의 몇 가지 실시예들이 기술된다. The novel features of the invention are set forth in the appended claims. However, for illustrative purposes, some embodiments of the invention are described in the following figures.

도 1은 신규한 여러 전정 기술들(pruning techniques)을 사용해 그것의 인코딩 프로세스를 단순화하는 인코더의 흐름을 개념적으로 도시하는 프로세스를 제시한다. 1 presents a process conceptually illustrating the flow of an encoder using several novel pruning techniques to simplify its encoding process.

도 2는 하나 또는 2개의 기준 프레임들과 현재 프레임 사이에서 매크로블록의 모션을 특정하는 모션 벡터를 식별하기 위해 2-단계의 모션-예측 연산을 수행하는 프로세스를 도시한다.2 shows a process for performing a two-step motion-prediction operation to identify a motion vector specifying the motion of a macroblock between one or two reference frames and the current frame.

도 3은, 일부 실시예들이 현재 프레임에서의 매크로블록 위치에 대응되는 기준 프레임의 위치 부근에 제 1 서치 윈도를 배치하는 방법을 도시한다. 3 illustrates a method in which some embodiments arrange a first search window near a position of a reference frame corresponding to a macroblock position in a current frame.

도 4는 현재-프레임의 매크로블록과 연관되어 있는 예측된 모션 벡터에 기초해 제 1 서치 윈도 위치를 식별하는 일 방식을 도시한다.4 illustrates one way of identifying a first search window position based on a predicted motion vector associated with a macroblock of the current-frame.

도 5는 제 1 서치 윈도 내의 다수 시작점의 일례를 도시한다.5 shows an example of multiple starting points in the first search window.

도 6은 제 2 단계 서치 윈도의 일례를 도시한다.6 shows an example of a second stage search window.

도 7은 현재-프레임 매크로블록의 픽셀들의 파티션들 세트와 최선으로 매칭되는 기준-프레임에서의 픽셀들의 파티션들 세트를 식별하기 위해 수행되는 세분된 모션 예측 프로세스(refined motion estimation process)를 도시한다. 7 shows a refined motion estimation process performed to identify a set of partitions of pixels in a reference-frame that best matches a set of partitions of pixels of the current-frame macroblock.

도 8은 몇 가지 위치점들을 갖춘 서치 윈도를 개념적으로 도시한다.8 conceptually illustrates a search window with several location points.

도 9는 다수 픽셀 레벨에서의 기준-프레임 매크로블록을 위해 픽셀들의 파티션을 서치하기 위한 프로세스를 개념적으로 도시한다. 9 conceptually illustrates a process for searching for a partition of pixels for a reference-frame macroblock at the multiple pixel level.

도 10은 몇 가지 가능한 파티션(즉, 블록) 사이즈들을 개념적으로 도시한다. 10 conceptually illustrates some possible partition (ie, block) sizes.

도 11은 상이한 픽셀 레벨들을 위한 몇 가지 서치 위치들을 개념적으로 도시 한다. 11 conceptually illustrates several search locations for different pixel levels.

도 12는 기준 프레임에서의 서브-픽셀 위치들에 맞추어 정렬되어 있는 현재-프레임의 매크로블록을 개념적으로 도시한다.12 conceptually illustrates a macroblock of the current-frame that is aligned with sub-pixel positions in the reference frame.

도 13은 동일한 프레임을 포인팅하는 모션 벡터들을 포함하는 몇 가지 프레임들을 개념적으로 도시한다.13 conceptually illustrates several frames that include motion vectors pointing to the same frame.

도 14는 한 세트의 픽셀들에 관한 데이터(예를 들어, 정수, 비-정수)가 캐시에 저장될 수 있는 방법을 개념적으로 도시한다.14 conceptually illustrates how data (eg, integers, non-integers) about a set of pixels may be stored in a cache.

도 15는 서치 윈도 내의 저-밀도 서치 패턴을 도시한다.15 shows a low-density search pattern in a search window.

도 16은 서치 윈도 내의 고-밀도 서치 패턴을 도시한다.16 shows a high-density search pattern in a search window.

도 17은 수직 방향으로 바이어싱 되어 있는 검색 패턴의 일례를 도시한다.17 shows an example of a search pattern biased in the vertical direction.

도 18은 수평 방향으로 바이어싱 되어 있는 검색 패턴의 일례를 도시한다.18 shows an example of a search pattern biased in the horizontal direction.

도 19는 RD 비용을 계산할 필요가 있는 모션-예측 솔루션들을 식별하기 위해 모션-예측 솔루션들의 서브-세트를 선택적으로 점검하는 프로세스를 도시한다.19 shows a process for selectively checking a sub-set of motion-prediction solutions to identify motion-prediction solutions that need to calculate an RD cost.

도 20은 본 발명의 일부 실시예들이 구현되는 컴퓨터 시스템을 도시한다.20 illustrates a computer system in which some embodiments of the present invention are implemented.

본 발명에 대한 다음의 상세한 설명에서는, 본 발명에 대한 여러 세부 사항들, 일례들 및 실시예들이 기술되고 설명된다. 그러나, 당업자라면, 본 발명이, 기술된 실시예들로 제한되지 않으며, 논의된 구체적 세부 사항들과 일례들 중 일부가 없이도, 본 발명이 실시될 수 있다는 것을 분명히 알 수 있을 것이다. In the following detailed description of the invention, several details, examples and embodiments of the invention are described and described. However, it will be apparent to one skilled in the art that the present invention is not limited to the described embodiments, and that the present invention may be practiced without some of the specific details and examples discussed.

1. 개요1. Overview

본 발명의 일부 실시예들은 신규한 인터블록 인코딩 및 디코딩 프로세스를 제공한다. 이러한 신규한 프로세스들은 (1) 다단식 모션 예측 프로세스(multi-stage motion estimation process), (2) 기준 프레임의 비-정수 픽셀 위치 값들을 캐싱하기 위한 보간 캐싱 프로세스, (3) 기준 프레임의 타일들의 서브-세트를 캐싱하기 위한 타일 캐싱 프로세스, 및 (4) 기준 프레임에서의 서치에 사용하기 위한 서치 패턴을 적응적으로 선택하는 모션 예측 프로세스를 포함한다. Some embodiments of the present invention provide a novel interblock encoding and decoding process. These novel processes include (1) a multi-stage motion estimation process, (2) an interpolation caching process for caching non-integer pixel position values of a reference frame, and (3) a subset of tiles of the reference frame. A tile caching process for caching the set, and (4) a motion prediction process for adaptively selecting a search pattern for use in the search in the reference frame.

A. 다단식 A. Multistage 모션motion 예측 prediction

일부 실시예들의 다단식 모션 예측 프로세스는 비디오 시퀀스의 제 2 이미지를 참조하여 제 1 이미지의 제 1 세트 픽셀들을 인코딩한다. 제 2 이미지 내의 제 1 서치 윈도에서, 모션 예측 프로세스는 제 1 이미지의 제 1 세트 픽셀들과 최선으로 매칭되는 제 2 이미지의 특정한 제 1 부분을 식별하기 위해 서치한다. 제 2 이미지 내의 제 1 서치 윈도에서, 모션 예측 프로세스는 특정한 제 1 부분에 대응되는 제 1 위치를 식별한다. 제 2 이미지 내의 제 2 서치 윈도에서, 모션 예측 프로세스는 그 다음으로 제 1 이미지의 제 1 세트 픽셀들과 최선으로 매칭되는 제 2 이미지의 특정한 제 2 부분을 식별하기 위해 서치하는데, 이 경우, 제 2 서치 윈도는 제 1 위치 부근에 정의된다. 일부 실시예들에서, 제 1 서치는 대략적 모션 예측 프로세스인 한편, 제 2 서치는 세분된 모션 예측 프로세스이다. 또한, 일부 실시예들에서는, 세분된 모션 예측 프로세스가 가변 블록 사이즈들을 서치한다. The multi-stage motion prediction process of some embodiments encodes the first set of pixels of the first image with reference to the second image of the video sequence. In the first search window in the second image, the motion prediction process searches to identify a particular first portion of the second image that best matches the first set of pixels of the first image. In the first search window in the second image, the motion prediction process identifies a first location corresponding to the particular first portion. In a second search window in the second image, the motion prediction process then searches to identify a particular second portion of the second image that best matches the first set of pixels of the first image, in which case the first Two search windows are defined near the first position. In some embodiments, the first search is a coarse motion prediction process, while the second search is a refined motion prediction process. Further, in some embodiments, the granular motion prediction process searches for variable block sizes.

B. 보간 캐싱B. Interpolation Caching

본 발명의 일부 실시예들의 인코더는 비디오 시퀀스의 이미지들을 인터블록 인코딩한다. 비디오 시퀀스의 각 이미지는 몇 가지 정수 픽셀 위치들을 갖는데, 각각의 정수 픽셀 위치는 하나 이상의 이미지 값(예를 들어, 휘도 값)을 가진다. 인코더는 제 2 이미지를 참조해 인코딩하기 위한 제 1 이미지를 선택한다. 그 다음, 인코더는 제 1 이미지의 픽셀들의 세트와 매칭되는 제 2 이미지의 비-정수 픽셀 위치들의 세트를 식별한다. 이 식별은 제 2 이미지의 몇 가지 정수 픽셀 위치들에 대한 이미지 값들로부터 제 2 이미지의 비-정수 픽셀 위치들과 연관된 이미지 값들을 보간하는 단계를 수반한다. 인코더는 비-정수 픽셀 위치들의 보간된 이미지 값들을, 제 2 이미지를 참조하여 제 3 이미지를 인코딩하는 동안의 후속 사용을 위해, 보간 캐시에 저장한다. The encoder of some embodiments of the present invention interblock encodes images of a video sequence. Each image of the video sequence has several integer pixel positions, each integer pixel position having one or more image values (eg, luminance values). The encoder refers to the second image to select the first image for encoding. The encoder then identifies a set of non-integer pixel locations of the second image that match the set of pixels of the first image. This identification involves interpolating image values associated with non-integer pixel positions of the second image from image values for some integer pixel positions of the second image. The encoder stores interpolated image values of non-integer pixel locations in an interpolation cache for subsequent use while encoding a third image with reference to the second image.

본 발명의 일부 실시예들의 디코더는 유사한 보간 캐시를 사용한다. 구체적으로, 디코더는 제 2 이미지를 참조해 디코딩하기 위한 제 1 이미지를 선택한다. 다음으로, 디코더는 제 1 이미지의 픽셀들의 세트에 대응되는 제 2 이미지의 비-정수 픽셀 위치들의 세트를 식별한다. 그 다음, 디코더는 제 2 이미지의 몇 가지 정수 픽셀 위치들에 대한 이미지 값들로부터 제 2 이미지의 비-정수 픽셀 위치들과 연관된 이미지 값들을 보간한다. 디코더는 비-정수 픽셀 위치들의 보간된 이미지 값들을, 제 2 이미지를 참조하여 제 3 이미지를 디코딩하는 동안의 후속 사용을 위해, 저장한다.The decoder of some embodiments of the present invention uses a similar interpolation cache. In detail, the decoder selects a first image for decoding with reference to the second image. Next, the decoder identifies a set of non-integer pixel positions of the second image corresponding to the set of pixels of the first image. The decoder then interpolates the image values associated with the non-integer pixel positions of the second image from the image values for some integer pixel positions of the second image. The decoder stores interpolated image values of non-integer pixel positions for subsequent use while decoding the third image with reference to the second image.

C. 타일 캐싱 프로세스C. Tile Caching Process

일부 실시예들은 비디오 이미지들의 시퀀스에서의 제 2 이미지를 참조하여 제 1 이미지의 제 1 부분을 프로세싱하는 인터블록 프로세스들에 타일 캐싱 프로세 스를 사용한다. 이 캐싱 프로세스는 제 2 이미지를 타일들의 세트로 분할하고 타일들을 제 1의 비-캐시 메모리 스토리지에 저장한다. 제 1 이미지의 제 1 부분을 제 2 이미지 부분과 매칭하기 위해 타일들의 서브-세트가 필요할 때마다, 캐싱 프로세스는 제 1의 비-캐시 메모리 스토리지로부터 타일들의 서브-세트를 검색하고, 제 1 부분과 타일들의 검색된 서브-세트의 일부인 제 2 이미지 부분들간의 빠른 비교들을 위해 타일들의 검색된 서브-세트를 제 2의 캐시 메모리 스토리지에 저장한다. 타일들의 검색된 서브-세트는 타일들의 전체 세트보다 작다.Some embodiments use a tile caching process for interblock processes that process a first portion of a first image with reference to a second image in a sequence of video images. This caching process divides the second image into a set of tiles and stores the tiles in the first non-cache memory storage. Each time a sub-set of tiles is needed to match the first portion of the first image to the second image portion, the caching process retrieves the sub-set of tiles from the first non-cache memory storage, and the first portion. And store the retrieved sub-set of tiles in a second cache memory storage for quick comparisons between the second image portions that are part of the retrieved subset of tiles. The retrieved sub-set of tiles is smaller than the full set of tiles.

일부 실시예들에서, 캐싱 프로세스는, 캐싱 프로세스가 제 1 부분과 매칭되는 제 2 이미지 부분을 식별하기 위해 서치할 제 2 이미지의 위치를 식별할 때, 타일들의 서브-세트가 검색되어 제 2의 캐시 메모리 스토리지에 저장될 필요가 있는지를 판정하는데, 여기에서, 이렇게 식별된 위치는 타일들의 서브-세트에 대응된다. 일부 실시예들에서, 캐시 메모리 스토리지는 컴퓨터의 RAM인 한편, 비-캐시 메모리 스토리지는 컴퓨터의 비-휘발성 스토리지이다. 또한, 인터블록 프로세스가 일부 실시예들에서는 인터블록의 인코딩 프로세스인 한편, 다른 실시예들에서는 인터블록의 디코딩 프로세스이다. 또한, 일부 실시예들에서의 타일들의 세트는 2 이상의 수평으로 인접한 타일들 및 2 이상의 수직으로 인접한 타일들을 포함한다. In some embodiments, the caching process, when the caching process identifies a location of a second image to search to identify a second image portion that matches the first portion, a sub-set of tiles is retrieved to generate a second Determine if it needs to be stored in cache memory storage, where the location so identified corresponds to a subset of tiles. In some embodiments, cache memory storage is a computer's RAM, while non-cache memory storage is a computer's non-volatile storage. Further, the interblock process is in some embodiments the encoding process of the interblock, while in other embodiments the interblock decoding process. Further, the set of tiles in some embodiments includes two or more horizontally adjacent tiles and two or more vertically adjacent tiles.

D. D. 적응적Adaptive 서치search 패턴 pattern

본 발명의 일부 실시예들의 모션 예측 프로세스는, 제 1 세트 픽셀들과 매칭될 수 있는 제 2 이미지 부분들을 검사하기 위한 패턴을 각각 정의하는 한 세트의 서치 패턴들로부터 제 1 서치 패턴을 선택하는 것에 의해, 제 1 비디오 이미지의 제 1 세트 픽셀들을 인코딩한다. 모션 예측 프로세스는, 한 세트의 기준들에 기초해, 서치 패턴들의 세트에서 제 1 서치 패턴을 적응적으로 선택한다. 일부 실시예들에서의 기준들의 세트는 비디오 이미지의 미디어 유형을 포함한다.The motion prediction process of some embodiments of the present invention is directed to selecting a first search pattern from a set of search patterns that each define a pattern for inspecting second image portions that may match the first set of pixels. By encoding the first set of pixels of the first video image. The motion prediction process adaptively selects a first search pattern in a set of search patterns based on a set of criteria. The set of criteria in some embodiments includes the media type of the video image.

앞서 언급된 신규한 인터블록 인코딩 및 디코딩 프로세스들을 설명하기 전에, 다음에서는 먼저, 본 발명의 인터블록 인코딩 프로세스를 포함하는 인코딩 프로세스의 전반적인 흐름을 설명할 것이다. Before describing the novel interblock encoding and decoding processes mentioned above, the following will first describe the overall flow of the encoding process including the interblock encoding process of the present invention.

II. 전반적 흐름II. Overall flow

도 1은 신규한 여러 전정 기술들(pruning techniques)을 사용해 그것의 인코딩 프로세스를 단순화하는 인코더의 흐름을 개념적으로 도시하는 프로세스(100)를 제시한다. 일부 실시예들은 이 섹션에서 설명되는 전정 기술들 모두를 사용하지는 않는다. 또한, 일부 실시예들은 섹션 II에서 후술될 다단식 모션 예측 연산과 함께 이러한 전정 기술들을 사용한다. FIG. 1 presents a process 100 conceptually illustrating the flow of an encoder using several novel pruning techniques to simplify its encoding process. Some embodiments do not use all of the pruning techniques described in this section. In addition, some embodiments use these vestibular techniques in conjunction with the multi-stage motion prediction operations described below in section II.

도 1에 나타낸 바와 같이, 프로세스(100)는 (단계 105에서) 매크로블록을 인터블록으로 인코딩하는 것을 무시할 것인지를 판정하는 것에 의해 시작한다. 일부 실시예들에서, 프로세스는 소정 환경들하에서 인터블록 인코딩을 무시한다. 이러한 환경들로는, 인코더를 각각의 프레임을 인트라블록으로서 코딩할 것을 요하는 디버깅 모드로 배치하는 것, 몇 가지 매크로블록들을 인트라블록들로서 코딩할 것을 요하는 인트라블록 리프레시의 지시, 인트라블록 인코딩이 최종적으로 선택될 실현, 너무 적은 매크로블록들이 인트라블록 인코딩된 실현, 또는 매크로블록이 인트라블록으로 코딩될 것을 요하는 소정의 다른 지시를 들 수 있다. As shown in FIG. 1, process 100 begins by determining whether to ignore encoding (in step 105) a macroblock into an interblock. In some embodiments, the process ignores interblock encoding under certain circumstances. These circumstances include placing the encoder in a debugging mode that requires coding each frame as an intrablock, indicating an intrablock refresh requiring coding some macroblocks as intrablocks, and intrablock encoding finally An implementation to be selected, an implementation in which too few macroblocks are intrablock encoded, or some other indication that requires a macroblock to be coded into an intrablock.

프로세스(100)가 매크로블록을 인터블록으로 인코딩할 필요가 없다고 판정하면, 프로세스(100)는 단계 110으로 천이한다. 단계 110에서, 프로세스는 매크로블록을 인트라블록으로 인코딩한다. Attorney Docket APLE.P0078("Intrablock Encoding Application")을 가진 "Selecting Encoding Types and Predictive Modes for Encoding Video Data"라는 명칭의 미국 특허출원에 인트라블록 인코딩을 수행하기 위한 신규한 여러 방식들이 설명되어 있다. 이러한 미국 특허출원은 여기에 참조로써 포함되어 있다. If process 100 determines that the macroblock does not need to be encoded as an interblock, process 100 transitions to step 110. In step 110, the process encodes the macroblock into an intrablock. A number of new ways to perform intrablock encoding are described in a US patent application entitled "Selecting Encoding Types and Predictive Modes for Encoding Video Data" with Attorney Docket APLE.P0078 ("Intrablock Encoding Application"). Such US patent applications are incorporated herein by reference.

프로세스가 (단계 110에서) 매크로블록을 인트라블록으로 인코딩하고 나면, 프로세스는 인코딩 솔루션을 지시하기 위해 단계 150으로 천이한다. 이 경우, 프로세스는 단계 110에서의 프로세스의 인트라코딩 결과를 지시하는데, 이것이, 프로세스(100)가 흐름을 통한 이 경로에서 검사한 유일한 인코딩이기 때문이다. 단계 150 이후에, 프로세스(100)는 종료한다.After the process encodes the macroblock into an intrablock (in step 110), the process transitions to step 150 to indicate an encoding solution. In this case, the process indicates the intracoding result of the process in step 110, because this is the only encoding that process 100 checked on this path through the flow. After step 150, process 100 ends.

다른 방법으로, 프로세스(100)가 (단계 105에서), 프로세스(100)가 인터블록 인코딩을 무시(즉, 전정)해서는 안된다고 판정할 경우, 프로세스는 (단계 115에서) 매크로블록의 스킵 모드 인코딩을 수행하고, 필요하다면, 매크로블록의 직접 모드 인코딩을 수행한다. 스킵 모드 인코딩에서, 매크로블록은 스킵형 매크로블록으로 코딩되고; 디코더 측에서, 이 매크로블록은 주변 매크로블록들 및/또는 주변 매크로블록들 내의 파티션들의 모션 벡터들을 참조하여 디코딩될 것이다. 스킵 모드 인코딩은, Attorney Docket APLE.P0073("Pruning Application")을 가진, "Pruning During Video Encoding"이라는 명칭으로 동시 출원된 미국 특허출원에서 부연된다. 이 출원은 여기에 참조로써 포함되어 있다. 직접 모드 인코딩에서는, 매크로블록의 텍스처 데이터 중 일부가 양자화되어 인코딩된 비트 스트림과 함께 송신된다는 것을 제외하면, 직접 모드 인코딩은 스킵 모드 인코딩과 유사하다. 일부 실시예들에서, 직접 모드 인코딩은 매크로블록의 B-모드 인코딩을 위해 수행된다. 일부 실시예들은 P-모드 인코딩 동안에도 직접 모드 인코딩을 수행할 수 있다. 단계 115 이후에, 프로세스(100)는 (단계 120에서), 스킵 모드 인코딩이 단계 115에서 최상의 인코딩 솔루션을 발생시켰는지의 여부를 판정한다. 이것은 명백하게, 단계 115에서 직접 모드 인코딩이 수행되지 않는 경우일 것이다. 한편, 단계 115에서 직접 모드 인코딩이 수행되고, 이러한 인코딩이 스킵 모드 인코딩보다 양호한 솔루션을 발생시켰다면, 프로세스는, 후술될 인터코딩(intercoding)을 수행하기 위해 단계 135로 천이한다. Alternatively, if process 100 determines (in step 105) that process 100 should not ignore (i.e. pruning) the interblock encoding, then the process (in step 115) skips the skip mode encoding of the macroblock. And if necessary, direct mode encoding of the macroblock. In skip mode encoding, the macroblock is coded as a skipped macroblock; On the decoder side, this macroblock will be decoded with reference to the motion macros of the neighboring macroblocks and / or partitions within the neighboring macroblocks. Skip mode encoding is further described in a U.S. patent application filed under the name "Pruning During Video Encoding" with Attorney Docket APLE.P0073 ("Pruning Application"). This application is incorporated herein by reference. In direct mode encoding, direct mode encoding is similar to skip mode encoding except that some of the macroblock's texture data is quantized and transmitted with the encoded bit stream. In some embodiments, direct mode encoding is performed for B-mode encoding of a macroblock. Some embodiments may perform direct mode encoding even during P-mode encoding. After step 115, process 100 determines (at step 120) whether skip mode encoding has generated the best encoding solution at step 115. This would obviously be the case when direct mode encoding is not performed in step 115. On the other hand, if direct mode encoding is performed in step 115 and this encoding resulted in a better solution than skip mode encoding, the process transitions to step 135 to perform intercoding, which will be described later.

그러나, 프로세스가 (단계 120에서), 스킵 모드 인코딩이 단계 115에서 최선의 결과를 발생시켰다고 판정할 경우, 프로세스는 (단계 125에서), 스킵 모드 인코딩이 인코딩을 종결하기에 충분히 양호하였는지를 판정한다. 이러한 판정을 내리는 일 방법이 앞서 언급된 Pruning Application에 설명되어 있다.However, if the process (in step 120) determines that the skip mode encoding produced the best result in step 115, the process determines (in step 125) whether the skip mode encoding was good enough to terminate the encoding. One way to make this determination is described in the aforementioned Pruning Application.

프로세스가 (단계 125에서), 스킵 모드 인코딩이 충분히 양호하였다고 판정하면, 프로세스(100)는 단계 130으로 천이하는데, 여기에서, 프로세스는 스킵 모드 인코딩 솔루션이 파기되어야 하는지를 판정한다. 일부 실시예들은 RD 비용(rate-distortion cost)이라고 하는 인코딩 비용에 기초해 솔루션들을 판정한다. 섹션 II에서 후술되는 바와 같이, 인코딩 솔루션의 RD 비용은 흔히, 인코딩된 매크로블 록에서의 왜곡을 설명하며 인코딩 솔루션을 위해 발생될 실제 비트들을 카운팅한다. 스킵 모드 솔루션들은 간혹, 상당한 RD 비용들을 가짐에도 불구하고 여전히 형편없는 솔루션들일 수 있다. 이것은, 이러한 솔루션들이 아주 작은 레이트 비용들(very small rate costs)을 가지며 이러한 레이트 비용들은 때때로, 불량한 솔루션이 최선 솔루션처럼 보이도록 하기에 충분한 크기만큼 총 RD 비용들을 빗나가게 하기 때문이다.If the process determines (at step 125) that the skip mode encoding was good enough, process 100 transitions to step 130, where the process determines whether the skip mode encoding solution should be discarded. Some embodiments determine solutions based on an encoding cost called a rate-distortion cost. As described below in section II, the RD cost of the encoding solution often accounts for distortion in the encoded macroblock and counts the actual bits that will be generated for the encoding solution. Skip mode solutions sometimes can still be poor solutions, despite having significant RD costs. This is because these solutions have very small rate costs and these rate costs sometimes deviate from the total RD costs by a size sufficient to make the bad solution appear to be the best solution.

따라서, 단계 125에서 스킵 모드 인코딩 솔루션을 선택한 이후라 하더라도, 프로세스(100)는 (단계 125에서), 스킵 모드 솔루션을 제거해야 하는지를 판정한다. 일부 실시예들에서, 이러한 판정을 내리기 위한 기준은, 현재 매크로블록의 스킵 모드 인코딩을 위한 왜곡이 현재 매크로블록의 인접한 이웃 매크로블록들에 대한 최대 왜곡의 2배보다 큰지의 여부이다. Thus, even after selecting the skip mode encoding solution in step 125, process 100 determines (step 125) whether the skip mode solution should be removed. In some embodiments, the criterion for making this determination is whether the distortion for skip mode encoding of the current macroblock is greater than twice the maximum distortion for adjacent neighboring macroblocks of the current macroblock.

프로세스가 (단계 130에서), 스킵-모드 솔루션이 제거되지 않아야 한다고 판정하면, 프로세스는 인코딩 솔루션을 지시하기 위해 천이한다. 이 경우, 프로세스는 스킵-모드 인코딩의 결과를 지시한다. 단계 150 이후에, 프로세스(100)는 종료한다. 한편, 프로세스(100)가 (단계 130에서), 스킵-모드 인코딩 솔루션이 제거되어야 한다고 판정하면, 프로세스는 단계 135로 천이한다. 프로세스는, 프로세스가 (단계 125에서) 스킵 모드 솔루션이 인코딩을 종결하기에 충분히 양호하지 않다고 판정할 경우에도 단계 135로 천이한다. If the process determines (at step 130) that the skip-mode solution should not be removed, the process transitions to indicate the encoding solution. In this case, the process indicates the result of the skip-mode encoding. After step 150, process 100 ends. On the other hand, if process 100 determines (at step 130) that the skip-mode encoding solution should be removed, the process transitions to step 135. The process transitions to step 135 even if the process determines (at step 125) that the skip mode solution is not good enough to terminate the encoding.

단계 135에서, 프로세스는 다양한 인터블록 인코딩들을 점검한다. 일부 실시예들에서, 프로세스(100)는, 섹션 II에서 부연되는, 다양한 매크로블록 및 서브- 매크로블록 인코딩들(예를 들어, 16×16, 8×16, 16×8, 8×8, 8×4, 4×8, 및 4×4 B-모드 및 P-모드 인코딩들)을 검사할 수 있다. 그러나, 앞서 언급한 Pruning Application에서 설명된 바와 같이, 일부 실시예들은 매크로블록 또는 서브-매크로블록 인코딩 모드들 중 일부에 대한 검사 및/또는 분석을 전정(즉, 무시)하는 것에 의해 인터블록 인코딩 프로세스를 가속한다.In step 135, the process checks various interblock encodings. In some embodiments, process 100 may include various macroblock and sub-macroblock encodings (eg, 16 × 16, 8 × 16, 16 × 8, 8 × 8, 8, as discussed in section II). X4, 4x8, and 4x4 B-mode and P-mode encodings). However, as described in the aforementioned Pruning Application, some embodiments may interleave (i.e. ignore) the inspection and / or analysis of some of the macroblock or sub-macroblock encoding modes. To accelerate.

단계 135에서 인터블록 인코딩을 수행한 후, 프로세스는 (단계 140에서), 매크로블록의 인터블록 인코딩이 프로세스가 매크로블록의 인트라블록 인코딩을 무시하기에 충분할 정도로 양호한지를 판정한다. 상이한 실시예들은 이 판정을 상이하게 수행한다. 이러한 접근 방법들 중 일부는 다음의 섹션 II에서 부연된다. After performing interblock encoding in step 135, the process (in step 140) determines whether the interblock encoding of the macroblock is good enough to ignore the intrablock encoding of the macroblock. Different embodiments perform this determination differently. Some of these approaches are discussed in Section II below.

프로세스(100)가 (단계 140에서), 인트라블록 인코딩이 수행되어야 한다고 판정하면, 프로세스는 단계 145로 천이하는데, 여기에서, 프로세스는 이러한 인코딩을 수행한다. 상술된 바와 같이, 이 프로세스의 인트라블록 인코딩에 대한 몇 가지 신규한 사양들이 앞서 언급된 Intrablock Encoding Application에 설명되어 있다. 단계 145 이후에, 프로세스는 단계 150으로 천이한다. 프로세스는, (단계 140에서) 프로세스가 인트라블록을 무시해야 한다고 판정할 경우에도 단계 150으로 천이한다. If process 100 determines (in step 140) that intrablock encoding should be performed, the process transitions to step 145, where the process performs this encoding. As mentioned above, several new specifications for intrablock encoding of this process are described in the aforementioned Intrablock Encoding Application. After step 145, the process transitions to step 150. The process transitions to step 150 even if the process determines (in step 140) that the intrablock should be ignored.

상술된 바와 같이, 프로세스는 (단계 150에서) 매크로블록을 위한 인코딩 솔루션을 지시한다. 프로세스(100)가 단계 150 이전의 처리 동작 동안 다수의 인코딩 솔루션들을 식별한다면, 프로세스는 (단계 150에서) 이러한 솔루션들 중 하나를 선택한다. 일부 실시예들에서, 프로세스(100)는 최상의 RD 비용을 가진 솔루션을 선택한다. RD 비용의 몇 가지 일례들이 다음에서 제공된다. 단계 150 이후에, 프로세스는 종료한다. As described above, the process directs (at step 150) an encoding solution for the macroblock. If process 100 identifies multiple encoding solutions during the processing operation prior to step 150, the process selects one of these solutions (at step 150). In some embodiments, process 100 selects the solution with the best RD cost. Some examples of RD costs are provided below. After step 150, the process ends.

Ⅲ. III. 인터블록Interblock 인코딩 Encoding

A. 다단식 A. Multistage 모션motion 예측 prediction

상술된 바와 같이, 일부 실시예들은 도 1에 도시된 프로세스(100)와 함께 다단식 모션 예측 연산을 사용한다. 일부 실시예들에서, 다단식 모션 예측 연산은, 매크로블록이 인터블록 인코딩될 때 수행된다. 후술되는 바와 같이, 일부의 다단식 모션 예측 연산들은 대략적 모션 예측 및 세분된 모션 예측을 포함한다. 일부 실시예들에서, 프로세스(100)는 초기의 대략적 모션 예측 연산 이후에 수행된다. 그러나, 당업자라면, 초기의 대략적 모션 예측 연산이 프로세스(100) 동안에 (예를 들어, 단계들(105 및 115) 사이의 단계 140에서) 수행될 수도 있다는 것을 알 수 있을 것이다.As described above, some embodiments use multistage motion prediction computation in conjunction with the process 100 shown in FIG. In some embodiments, multistage motion prediction operations are performed when the macroblocks are interblock encoded. As described below, some multistage motion prediction operations include coarse motion prediction and granular motion prediction. In some embodiments, process 100 is performed after an initial coarse motion prediction operation. However, one skilled in the art will appreciate that an initial coarse motion prediction operation may be performed during the process 100 (eg, at step 140 between steps 105 and 115).

1. 전반적 흐름1. Overall flow

도 2는 하나 또는 2개의 기준 프레임들과 현재 프레임 사이의 매크로블록 모션을 특정하는 모션 벡터를 식별하기 위해 다단식 모션-예측 연산을 수행하는 프로세스를 도시한다. 본 발명의 다단식 모션 예측 연산에 대한 논의를 명확하게 하기 위해, 프로세스(200)는, 하나의 기준 프레임에서 현재-프레임 매크로블록의 위치를 찾는다는 관점에서 후술된다. 그러나, 당업자라면, 이 프로세스가 대개는, 매크로블록에 대한 최상의 모션-예측 인코딩을 식별하기 위해 2개의 기준 프레임들을 검사한다는 것을 알 수 있을 것이다. 2 shows a process for performing a multi-stage motion-prediction operation to identify a motion vector specifying a macroblock motion between one or two reference frames and a current frame. To clarify the discussion of the multi-stage motion prediction operation of the present invention, process 200 is described below in terms of finding the location of the current-frame macroblock in one reference frame. However, one skilled in the art will appreciate that this process usually examines two reference frames to identify the best motion-prediction encoding for the macroblock.

이 프로세스의 제 1 단계는, 기준 프레임에서 현재-프레임 매크로블록의 위치에 대한 대략적 근사를 식별하는 대략적 서치(예를 들어, 대략적 모션 예측)인 한편, 제 2 단계는, 기준 프레임에서 현재-프레임 매크로블록의 위치에 대한 좀더 정확한 근사를 식별하는 좀더 세분된 서치(예를 들어, 세분된 모션 예측)이다. The first step of this process is an approximate search (eg, approximate motion prediction) that identifies an approximate approximation to the location of the current-frame macroblock in the reference frame, while the second step is a current-frame in the reference frame A more refined search (eg, refined motion prediction) that identifies a more accurate approximation of the location of the macroblock.

프로세스는 먼저 (단계 210에서) 현재-프레임 매크로블록에 최선으로 매칭되는 매크로블록을 위해 기준 프레임의 제 1 서치를 수행한다. 제 1 서치는 기준 프레임 내의 제 1 서치 윈도 내에서 수행된다. 상이한 실시예들은 제 1 서치 윈도의 위치를 상이하게 식별한다. 예를 들어, 도 3에 나타낸 바와 같이, 일부 실시예들은 제 1 서치 윈도(300)를 현재 프레임의 매크로블록(330) 위치(320)에 대응되는 기준 프레임의 위치(310) 부근에 배치한다. The process first performs a first search of the reference frame for the macroblock that best matches the current-frame macroblock (in step 210). The first search is performed in the first search window in the reference frame. Different embodiments identify the location of the first search window differently. For example, as shown in FIG. 3, some embodiments place the first search window 300 near the location 310 of the reference frame corresponding to the location 320 of the macroblock 330 of the current frame.

다른 실시예들은 기준 프레임에서 현재-프레임 매크로블록에 대해 예측된 위치에 제 1 서치 윈도를 배치한다. 도 4는 현재-프레임 매크로블록과 연관된 예측 모션 벡터에 기초해 제 1 서치 윈도 위치를 식별하는 일 방식을 도시한다. 도 4는 현재 프레임(400)의 현재-프레임 매크로블록(410)을 도시한다. 이 도면은 또한 현재-프레임 매크로블록(410)과 연관된 예측 벡터(420)를 도시한다. 이러한 예측 모션 벡터(420)는, 현재 프레임의 현재-프레임 매크로블록(410)과 인접한 매크로블록들의 모션 벡터들에 기초해 계산될 수 있다. 도 4에 나타낸 바와 같이, 예측 모션 벡터(420)는, 기준 프레임(430)의 위치(440)에 대응되는 현재 프레임(400)의 위치점(460)을 포인팅한다. 따라서, 도 4에 추가적으로 도시되어 있는 바와 같이, 일부 실시예들은 기준 프레임(430)의 제 1 서치 윈도(450)를 위치점(440) 부근에 배 치한다.Other embodiments place the first search window at the location predicted for the current-frame macroblock in the reference frame. 4 illustrates one way of identifying a first search window location based on a predictive motion vector associated with a current-frame macroblock. 4 shows a current-frame macroblock 410 of the current frame 400. This figure also shows a prediction vector 420 associated with the current-frame macroblock 410. This predictive motion vector 420 may be calculated based on the motion vectors of the macroblocks adjacent to the current-frame macroblock 410 of the current frame. As shown in FIG. 4, the predictive motion vector 420 points to the location point 460 of the current frame 400 corresponding to the location 440 of the reference frame 430. Thus, as further shown in FIG. 4, some embodiments place the first search window 450 of the reference frame 430 near the location point 440.

프로세스(200)는 (단계 210에서), 현재-프레임 매크로블록이 기준 프레임에 등장한 이후로 현재-프레임 매크로블록이 얼마나 많이 이동되었는지를 특정하는 모션 벡터를 식별하기 위해, 제 1 서치 윈도 내에서 대략적 서치를 수행한다. 프로세스는 현재-프레임 매크로블록에 가장 근접하게 매칭되는 제 1 서치 윈도 내의 기준-프레임 매크로블록을 서치하는 것에 의해, 이 모션 벡터를 식별할 수 있다. 프로세스가 서치 윈도 내의 모든 기준-프레임 매크로블록들을 조사할 필요는 없으며, 사전-정의된 소정 파라미터들 내에 해당되는 것을 판정하는 것으로 충분하다. The process 200 (at step 210) may roughly identify within the first search window to identify a motion vector that specifies how much the current-frame macroblock has moved since the current-frame macroblock appeared in the reference frame. Perform a search. The process can identify this motion vector by searching for a reference-frame macroblock in the first search window that most closely matches the current-frame macroblock. It is not necessary for the process to examine all the reference-frame macroblocks in the search window, but it is sufficient to determine that it falls within certain predefined parameters.

프로세스가 충분한 기준-프레임 매크로블록들을 식별하고 나면, 프로세스는 (단계 210에서), 프로세스가 이러한 대략적 서치 동안에 발견하게 되는 최선의 기준-프레임 매크로블록을 식별한다. 그 다음, 프로세스는 (단계 210에서) 식별된 최선의 기준-프레임 매크로블록을 사용해, 기준 프레임에서 현재-프레임 매크로블록의 위치에 대한 대략적 근사를 지시하는 모션 벡터를 특정한다. After the process has identified enough reference-frame macroblocks, the process (in step 210) identifies the best reference-frame macroblock that the process will find during this approximate search. The process then uses the best reference-frame macroblock identified (in step 210) to specify a motion vector that indicates an approximate approximation to the position of the current-frame macroblock in the reference frame.

단계 210 이후에, 프로세스는 (단계 220에서), 프로세스가 제 1 서치 윈도에서 대략적 서치(coarse search)를 충분히 반복했는지의 여부를 판정한다. 일부 실시예들은 이 윈도 내에서 단 한 번의 서치만을 수행한다. 이런 실시예들에서는, 프로세스(200)가 단계 220에서 판정을 내릴 필요가 없는 대신, 단계 210에서 단계 230으로 바로 진행한다. 다른 방법으로, 다른 실시예들은, 이 윈도 내의 상이한 다수 포인트들에서 시작하는 다수 서치들을 수행한다. After step 210, the process (in step 220) determines whether the process has sufficiently repeated the coarse search in the first search window. Some embodiments perform only one search within this window. In such embodiments, the process 200 does not need to make a determination at step 220, but instead proceeds directly from step 210 to step 230. Alternatively, other embodiments perform multiple searches starting at different multiple points in this window.

프로세스(200)가 (단계 220에서), 프로세스가 제 1 서치 윈도 내에서 또 한 번의 대략적 서치를 수행해야 한다고 판정하면, 프로세스는 매크로블록을 위해 단계 210에서 수행된 이전의 대략적 서치들과는 상이한 위치에서 시작하는 또 한 번의 서치를 (이 윈도 내에서) 수행하기 위해 단계 210으로 되돌아 간다. If the process 200 determines (at step 220) that the process should perform another coarse search within the first search window, the process is at a different location than the previous coarse searches performed at step 210 for the macroblock. Return to step 210 to perform another search (within this window) to begin.

도 5는 제 1 서치 윈도 내의 다수 시작점의 일례를 도시한다. 구체적으로, 본 도면은 제 1 서치 윈도(500) 내의 4개 시작점들(510-540)을 도시한다. 각 시작점(510-540)은 상이한 기준-프레임 매크로블록들을 식별하는 서치를 초래한다. 일부 실시예들에서는, 상이한 시작점들이 동일한 기준-프레임 매크로블록들을 식별할 수도 있다. 5 shows an example of multiple starting points in the first search window. Specifically, the figure shows four starting points 510-540 in the first search window 500. Each starting point 510-540 results in a search identifying different reference-frame macroblocks. In some embodiments, different starting points may identify the same reference-frame macroblocks.

일단 프로세스가 (단계 220에서), 프로세스가 제 1 서치 윈도에서의 대략적 서치를 충분히 반복했다고 판정하고 나면, 프로세스는 (단계 230에서) 단계 210을 통한 1회 이상의 프로세스 반복들을 통해 식별한, 가능한 최선의 대략적-단계 솔루션을 식별한다. 이 솔루션은, 도 6에 나타낸 바와 같이, 현재 프레임에서, 기준 프레임의 위치(630)에 대응되는, 매크로블록(410)을 위한 위치(610)를 식별하는 모션 벡터(620)를 식별한다. Once the process has determined (in step 220) that the process has sufficiently repeated the approximate search in the first search window, the process (in step 230) identified through one or more process iterations through step 210, the best possible Identify an approximate step solution. This solution identifies, as shown in FIG. 6, the motion vector 620 that identifies the position 610 for the macroblock 410, corresponding to the position 630 of the reference frame, in the current frame.

다음으로, 프로세스는 (단계 240에서) 현재-프레임 매크로블록에 매칭되는 기준-프레임 매크로블록을 위해 제 2의 세분된 모션-예측 서치(refined motion-estimation search)를 수행한다. 제 2 서치는 기준 프레임의 제 2 서치 윈도 내에서 수행된다. 일부 실시예들에서, 이러한 제 2 서치 윈도는, 단계 210에서의 대략적 제 1-단계 서치 동안에 사용되는 제 1 서치 윈도보다 작다. 또한, 일부 실시예들에서, 제 2 서치 윈도는 제 1 단계 서치에 의해 (즉, 단계 230에서 선택된 모션 벡터에 의해) 발생된 모션 벡터에 의해 식별되는, 기준 프레임에서의 위치 부근에 정의된다. 도 6은 이러한 제 2 단계 서치 윈도의 일례를 도시한다. 구체적으로, 이 도면은 제 1 서치에서 특정된 위치점(630) 부근의 제 2 서치 윈도(640)를 도시한다. Next, the process performs a second refined motion-estimation search for the reference-frame macroblock that matches the current-frame macroblock (at step 240). The second search is performed in the second search window of the reference frame. In some embodiments, this second search window is smaller than the first search window used during the approximate first-stage search in step 210. Further, in some embodiments, the second search window is defined near a position in the reference frame, identified by the motion vector generated by the first stage search (ie, by the motion vector selected in step 230). 6 shows an example of such a second stage search window. Specifically, this figure shows a second search window 640 near the location point 630 specified in the first search.

일부 실시예들에서, (단계 240에서의) 제 2 서치 단계 동안에 사용되는 서치 프로세스는 제 1 서치 단계 동안에 사용되는 서치 프로세스보다 좀더 철저하다. 예를 들어, 일부 실시예들이 제 2 단계 동안에는 레이트 왜곡 최적화(rate distortion optimization)를 사용하는 철저한 서브-매크로블록 서치(exhaustive sub-macroblock search)를 사용하는 한편, 제 1 서치 단계 동안에는 좀더 단순한 3-단계 서치(simpler three-step search)를 사용한다. In some embodiments, the search process used during the second search phase (at step 240) is more thorough than the search process used during the first search phase. For example, some embodiments use exhaustive sub-macroblock search using rate distortion optimization during the second stage, while simpler 3- during the first search stage. Use simpler three-step search.

단계 240에서의 제 2 서치 단계 끝에서, 프로세스(200)는, 현재-프레임 매크로블록이 기준 프레임에 등장한 이후로 얼마나 많이 이동되었는지를 특정하는 모션 벡터를 제공한다. 단계 240 이후에, 프로세스는 종료한다. At the end of the second search step in step 240, process 200 provides a motion vector specifying how much the current-frame macroblock has moved since appearing in the reference frame. After step 240, the process ends.

2. 세분된 2. Subdivided 모션motion 예측 prediction

도 7은, 현재-프레임 매크로블록의 픽셀들의 파티션들 세트에 최선으로 매칭되는 기준-프레임의 픽셀들의 파티션들 세트를 식별하기 위해 수행되는 세분된 모션 예측 프로세스(700)를 도시한다. 일부 실시예들에서, 프로세스(700)는 프로세스(200)의 (단계 240에서의) 제 2 서치 동안에 구현된다. 7 shows a granular motion prediction process 700 performed to identify a set of partitions of pixels of a reference-frame that best matches a set of partitions of pixels of the current-frame macroblock. In some embodiments, process 700 is implemented during a second search (in step 240) of process 200.

이 도면에 도시되어 있는 바와 같이, 프로세스(700)는 (단계 705에서) 서치 윈도 내의 위치점을 선택한다. 일부 실시예들에서, 서치 윈도는 처음에, 프로세 스(200)의 단계 230에서 식별되는 기준-프레임 매크로블록 부근에 정의된다.As shown in this figure, process 700 selects a location point within the search window (at step 705). In some embodiments, the search window is initially defined near the reference-frame macroblock identified in step 230 of process 200.

도 8은 몇 개의 위치점들을 갖춘 서치 윈도(800)를 개념적으로 도시한다. 이 도면에 도시되어 있는 바와 같이, 서치 윈도(800)는 9개 위치점들(805-845)을 포함한다. 일부 실시예들에서, 이러한 위치점들(805-845)은 무작위로 발생될 수도 있다. 다른 실시예들에서는, 이러한 위치점들(805-845)이 한 세트의 기준들에 의해 미리 판정된다. 이러한 위치점들(805-845) 각각은 정수 픽셀 레벨의 기준-프레임 매크로블록에 대응된다. 또한, 도 8은, 1/2 및 1/4 픽셀 레벨들의 위치점들과 같은, 비-정수 픽셀 레벨들(즉, 서브 픽셀 레벨)의 위치점들도 도시한다. 이러한 서브 픽셀 레벨 위치점들의 용도는 도 9를 참조하여 부연될 것이다. 8 conceptually illustrates a search window 800 with several location points. As shown in this figure, the search window 800 includes nine location points 805-845. In some embodiments, these location points 805-845 may be generated randomly. In other embodiments, these location points 805-845 are pre-determined by a set of criteria. Each of these location points 805-845 corresponds to an integer pixel level reference-frame macroblock. 8 also shows location points of non-integer pixel levels (ie, sub pixel level), such as location points of 1/2 and 1/4 pixel levels. The use of these sub pixel level location points will be discussed with reference to FIG. 9.

다음으로, 현재-프레임 매크로블록 내의 픽셀들의 가능한 파티션 각각에 대해, 프로세스(700)는 (단계 710에서), 선택된 위치점에서의 픽셀들의 특정 파티션이 현재-프레임 매크로블록의 픽셀들의 파티션에 얼마나 근접하게 매칭되는지를 검사한다. 도 10은 몇 가지 가능한 파티션(즉, 블록) 사이즈들을 개념적으로 도시한다. 구체적으로, 이 도면은 9개의 가능한 블록 사이즈들을 도시하는데, 여기에서, 각각의 블록 사이즈는 픽셀들의 특정 블록을 표현한다. 예를 들어, 블록 사이즈 1은 픽셀들의 16×16 어레이를 포함하는 픽셀들의 블록을 표현한다. 블록 사이즈 2는, 픽셀들의 16×8 어레이를 포함하는 픽셀들의 블록을 표현한다. 이 도면이 9개의 블록 사이즈들만을 도시하고 있지만, 프로세스(700)는 다른 픽셀 구성들을 갖춘 블록 사이즈들을 서치할 수도 있다. 일부 실시예들은 이러한 블록 사이즈들 모두를 서치하지만, 다른 실시예들은 이러한 블록 사이즈들 중 일부만을 서치할 수도 있다. Next, for each possible partition of pixels in the current-frame macroblock, process 700 (in step 710) determines how close a particular partition of pixels at the selected location point is to the partition of pixels of the current-frame macroblock. Is matched correctly. 10 conceptually illustrates some possible partition (ie, block) sizes. Specifically, this figure shows nine possible block sizes, where each block size represents a particular block of pixels. For example, block size 1 represents a block of pixels that contains a 16x16 array of pixels. Block size 2 represents a block of pixels comprising a 16x8 array of pixels. Although this figure shows only nine block sizes, process 700 may search for block sizes with other pixel configurations. Some embodiments search for all of these block sizes, while other embodiments may search for only some of these block sizes.

일단 (710에서) 검사가 수행되고 나면, 프로세스(700)는 (단계 715에서) 각각의 블록 사이즈를 위한 기준-프레임 매크로블록을 최선 위치로 업데이트한다. 프로세스(700)는 (단계 720에서), 다른 위치점이 존재하는지의 여부를 판정한다. 그렇다면, 프로세스(700)는 다른 위치점을 선택하기 위해 단계 705로 진행하고 단계들(710-720)에 대한 또 한 번의 반복을 수행한다. Once the check is performed (at 710), process 700 updates the reference-frame macroblock for each block size (in step 715) to the best location. Process 700 determines (at 720) whether another location point exists. If so, process 700 proceeds to step 705 to select another location point and performs another iteration of steps 710-720.

일단 프로세스(700)가 (단계 720에서), 더 이상의 위치점들이 존재하지 않는다고 판정하고 나면, 프로세스(700)는 (단계 725에서), 소정 블록 사이즈들을 위한 서치 결과들이 충분히 양호한지의 여부를 판정한다. 일부 실시예들에서는, 업데이트된 위치의 블록 사이즈가 소정 기준(예를 들어, 소정 임계치 미만의 SAD)을 충족시키면, 서치 결과는 충분히 양호한 것이다. 일부 실시예들에서는, 특정 블록 사이즈와 연관된 비용과 최저 비용을 가진 블록 사이즈와 연관된 비용 간의 차이가 임계치보다 크면, 서치 결과는 충분히 양호하지 못한 것이다. 일부 실시예들에서는, 임계치가 서치 동안에 동적으로 정의된다. 프로세스(700)가 (단계 725에서), 소정 블록 사이즈들을 위한 서치 결과들이 충분히 양호하지 않다고 판정하면, 프로세스(700)는 (단계 730에서) 임의의 후속 서치들에서 이러한 블록 사이즈들을 제외시킨다. Once the process 700 determines (at step 720) that there are no more location points, the process 700 determines (at step 725) whether the search results for certain block sizes are good enough. . In some embodiments, if the block size of the updated location meets certain criteria (eg, SAD below a certain threshold), the search result is good enough. In some embodiments, if the difference between the cost associated with the particular block size and the cost associated with the lowest cost is greater than the threshold, the search result is not good enough. In some embodiments, the threshold is dynamically defined during the search. If process 700 determines (at step 725) that the search results for certain block sizes are not good enough, process 700 excludes these block sizes from any subsequent searches (at step 730).

(단계 730에서) 이러한 블록 사이즈들을 제외한 이후에 또는 (단계 725에서) 모든 서치 결과들이 충분히 양호하다고 판정한 이후에, 프로세스(700)는 (단계 735에서) 또 한 번의 서치를 수행한다. 이러한 서치 동안, 각각의 블록 사이즈에 대 해, 프로세스(700)는 기준 프레임에서, 현재-프레임 매크로블록의 파티션과 최선으로 매칭되는 픽셀들의 파티션을 서치한다. 이러한 서치는 서브-픽셀 레벨에서 픽셀들의 파티션을 서치하는 단계를 포함한다. 이러한 서브-픽셀 레벨의 서치는 다음에서 부연될 것이다. (단계 735에서의) 서치 단계 이후에, 프로세스(700)는 종료한다. After excluding these block sizes (at step 730) or after determining that all the search results are good enough (at step 725), the process 700 performs another search (at step 735). During this search, for each block size, process 700 searches in the reference frame a partition of pixels that best matches the partition of the current-frame macroblock. This search includes searching for a partition of pixels at the sub-pixel level. This sub-pixel level search will be discussed further in the following. After the search phase (in step 735), the process 700 ends.

3. 서브 픽셀 레벨에서의 3. At the subpixel level 서칭Search

도 9는 다수 픽셀 레벨에서 기준-프레임 매크로블록을 위한 픽셀들의 파티션을 서치하기 위한 프로세스(900)를 개념적으로 도시한다. 일부 실시예들에서, 프로세스(900)는 프로세스(700)의 서치(단계 735) 동안에 수행된다. 이 도면에 도시되어 있는 바와 같이, 프로세스(900)는 (단계 905에서), 현재-프레임 매크로블록을 위한 픽셀들의 파티션(즉, 블록 사이즈)을 선택한다. 프로세스(900)는 단계 905를 통해 수차례 반복한다. 단계 905를 통한 프로세스의 반복들에서, 일부 실시예들의 프로세스는, 도 10에 도시되어 있는 파티션들의 수치적 지시들(numerical designations)에 순차적으로 기초해, 단계 730에서 파기되지 않은 파티션들(즉, 블록들)을 반복적으로 선택한다. 예를 들어, 파티션들 중 어떤 것도 단계 730에서 파기되지 않았을 경우, 프로세스(900)는 블록들(1 내지 9)을 순차적으로 선택한다. 9 conceptually illustrates a process 900 for searching for a partition of pixels for a reference-frame macroblock at the multiple pixel level. In some embodiments, process 900 is performed during a search of process 700 (step 735). As shown in this figure, process 900 selects (in step 905) a partition of pixels (ie, block size) for the current-frame macroblock. Process 900 repeats several times through step 905. In the iterations of the process through step 905, the process of some embodiments is based on the numerical designations of the partitions shown in FIG. 10 in sequence, so that partitions that were not destroyed in step 730 (ie, Blocks are selected repeatedly. For example, if none of the partitions were destroyed at step 730, process 900 selects blocks 1-9 sequentially.

단계 905 이후에, 프로세스(900)는 (단계 910에서) 서치의 초기 픽셀 해상도(예를 들어, 픽셀 레벨)를 정의한다(즉, 서치 세분성을 정의한다). 예를 들어, 프로세스(900)가 처음에는, 픽셀 해상도가 정수 픽셀에서의 격 위치(즉, 정수 픽셀 레벨 해상도의 1/2 해상도)이도록 정의할 수도 있다. 다음으로, 프로세스(900)는 (단계 915에서) 서치 위치를, 현재-프레임 매크로블록의 선택된 파티션을 위해 지금까지 식별된 최선의 위치이도록 정의한다. 이러한 최선의 식별 위치는 도 7의 프로세스(700)에 대한 픽셀 레벨 서치 동안에 식별될 수 있거나, 다음에서 부연되는 바와 같이, 도 9의 프로세스(900)에 대한 픽셀 해상도 서치들 중 어느 하나 동안에 식별될 수 있다. After step 905, process 900 defines (ie, defines the search granularity) the initial pixel resolution (eg, pixel level) of the search (at step 910). For example, process 900 may initially define that the pixel resolution is at every other location in the integer pixel (ie, half resolution of integer pixel level resolution). Next, process 900 defines the search location (at step 915) to be the best location identified so far for the selected partition of the current-frame macroblock. This best identification location may be identified during the pixel level search for process 700 of FIG. 7, or may be identified during any one of the pixel resolution searches for process 900 of FIG. 9, as further described below. Can be.

단계 730에서 파기되지 않은 소정의 현재-프레임 파티션 각각에 대해, 프로세스(900)는 (단계 920에서) (1) 정의된 픽셀 레벨 해상도(즉, 서치 세분성)로 단계 915에서 식별된 서치 위치 부근의 기준-프레임 파티션들을 검사하고, (2) 현재-프레임 파티션과 최선으로 매칭되는, 검사된 특정 기준-프레임 파티션을 식별한다. For each given current-frame partition not discarded in step 730, process 900 (in step 920) includes (1) near the search location identified in step 915 with a defined pixel level resolution (i.e., search granularity). Examine the reference-frame partitions, and (2) identify the particular reference-frame partition examined that best matches the current-frame partition.

다음으로는, 단계 730에서 파기되지 않은 소정의 현재-프레임 파티션 각각에 대해, 프로세스(900)는 (단계 925에서), 소정의 현재-프레임 파티션을 위해 단계 920에서 식별된 소정의 기준-프레임 파티션이 소정의 현재-프레임 파티션을 위해 앞서 식별된 최선 매칭보다 양호한 매칭인지의 여부를 판정한다. 그렇다면, 프로세스는 (단계 925에서), 단계 920에서 특정한 현재-프레임 파티션을 위한 최선 위치로서 식별된 특정한 기준-프레임 파티션의 위치를 정의한다. Next, for each of the current-frame partitions not destroyed in step 730, process 900 (in step 925), the predetermined reference-frame partition identified in step 920 for the given current-frame partition. It is determined whether this is a better match than the best match previously identified for this given current-frame partition. If so, the process (in step 925) defines the location of the particular reference-frame partition identified in step 920 as the best location for the particular current-frame partition.

다음으로, 프로세스(900)는 (단계 930에서), 프로세스가 선택된 파티션을 위한 최대 픽셀 레벨 해상도에서 기준 프레임을 검사했는지의 여부를 판정한다. 그렇지 않다면, 프로세스(900)는 (단계 935에서), 픽셀 레벨 해상도를 후속의 픽셀 레벨 해상도(예를 들어, 1/2, 1/4)로 증가시키고, 상술된 단계 915로 되돌아 간다. 따라서, 단계들(915-935)의 후속 반복들에서, 프로세스(900)는 현재-프레임 매크로 블록의 파티션들을 서브 픽셀 레벨(예를 들어, 1/2, 1/4)에서 검사한다. Next, process 900 determines (at step 930) whether the process has examined the reference frame at the maximum pixel level resolution for the selected partition. If not, process 900 (in step 935) increases the pixel level resolution to the subsequent pixel level resolution (eg, 1/2, 1/4) and returns to step 915 described above. Thus, in subsequent iterations of steps 915-935, process 900 checks the partitions of the current-frame macro block at the subpixel level (eg, 1/2, 1/4).

프로세스(900)가 (단계 930에서), 프로세스가 선택된 파티션을 위한 최대 픽셀 레벨 해상도에서 기준 프레임을 검사했다고 판정하면, 프로세스(900)는 (단계 940에서), 프로세스가 단계 730에서 파기되지 않은 현재-프레임 파티션들 모두를 검사했는지의 여부를 판정한다. 그렇지 않다면, 프로세스(900)는 후속의 현재-프레임 파티션을 선택하기 위해 단계 905로 복귀한 다음 이 파티션을 위해 단계 910-935를 반복한다. 프로세스(900)가 (단계 940에서), 프로세스가 단계 730에서 파기되지 않은 픽셀들의 파티션들 모두를 검사했다고 판정하면, 프로세스(900)는 종료한다. If process 900 determines (in step 930) that the process has examined the reference frame at the maximum pixel level resolution for the selected partition, process 900 (in step 940) indicates that the process has not been destroyed in step 730 Determine whether all frame partitions have been checked. If not, process 900 returns to step 905 to select a subsequent current-frame partition and then repeats steps 910-935 for this partition. If process 900 determines (at step 940) that the process has examined all of the partitions of pixels that have not been discarded at step 730, process 900 ends.

도 11은 상이한 픽셀 레벨들을 위한 몇 가지 서치 위치들을 개념적으로 도시한다. 구체적으로, 이 도면은, 4개의 정수 픽셀 레벨 위치들(825-830 및 840-845)에 의해 경계가 정해지는 서치 영역(860)을 도시한다. 일부 실시예들에서, 이러한 경계형 서치 영역(860; bounded search area)은, 도 8에 도시되어 있는 바와 같이, 서치 윈도(800) 내에 배치된다. 11 conceptually illustrates several search locations for different pixel levels. Specifically, this figure shows the search area 860 bounded by four integer pixel level positions 825-830 and 840-845. In some embodiments, this bounded search area 860 is disposed within the search window 800, as shown in FIG. 8.

경계형 서치 영역(860) 내에는 5개의 1/2 픽셀 레벨 위치들이 존재한다. 또한, 이러한 경계형 서치 영역(860) 내에는 16개의 1/4 픽셀 레벨 위치들이 존재한다. 상이한 실시예들은 더 많거나 적은 정수 및 비-정수 위치들을 포함하는 상이한 경계형 서치 영역들을 특정할 수 있다. 프로세스(900)가 (단계 915에서) 서치 위치가 위치(850)인 것으로 정의할 경우, 일부 실시예들은, 단계 920에서의 서치 동안, 이러한 경계형 영역(860)에서 그리고 주위에서 서치할 수도 있다. There are five half pixel level locations within the boundary search area 860. In addition, there are sixteen quarter pixel level locations within this bounded search area 860. Different embodiments may specify different boundary search areas that include more or less integer and non-integer positions. If process 900 defines (at step 915) the search location to be location 850, some embodiments may search at and around this bordered area 860 during the search at step 920. .

일부 실시예들에서는, 상술된 단계들이 수차례 반복된다. 상술된 바와 같이, 일부 실시예들은 각 픽셀 레벨을 위해 별도의 서치들을 수행한다. 그러나, 당업자라면, 일부 실시예들은, 각각의 서치 위치를 위해 상이한 픽셀 레벨들에서 동시에 몇 가지 블록 사이즈들을 서치할 수도 있다는 것(즉, 각각의 위치에 대해, 모든 블록 사이즈들을 위해 정수, 1/2 및 1/4 픽셀 레벨들에서 동시에 서치할 수도 있다는 것)을 알 수 있을 것이다. 서브 픽셀 레벨들이 1/2 및 1/4 픽셀 레벨들로서 설명되지만, 당업자라면, 서브 픽셀 레벨이 임의의 비-정수 픽셀 레벨일 수 있다는 것을 알 수 있을 것이다. In some embodiments, the above-described steps are repeated several times. As mentioned above, some embodiments perform separate searches for each pixel level. However, one of ordinary skill in the art would appreciate that some embodiments may search for several block sizes at the same time at different pixel levels for each search location (ie, for each location an integer, 1 / for all block sizes). One may search simultaneously at 2 and 1/4 pixel levels. Although the sub pixel levels are described as half and quarter pixel levels, those skilled in the art will appreciate that the sub pixel level may be any non-integer pixel level.

또한, 프로세스(700)는, 소정 블록 사이즈(들)를 위한 서치 결과들이 충분히 양호한지의 여부를 판정하는 (단계 725에서의) 단계를 기술한다. 일부 실시예들에서는, 이러한 판정(725)이 프로세스(900) 동안에도 수행될 수 있다. 또한, 당업자라면, 이러한 판정(725)이 프로세스들(700 및 900)의 상이한 단계들 동안에 수행될 수도 있다는 것을 알 수 있을 것이다. 예를 들어, 이러한 판정 프로세스(725)는, 각 블록 사이즈의 최선 위치를 찾아낸 이후에 수행될 수도 있다. Process 700 also describes (at step 725) determining whether the search results for a given block size (s) are good enough. In some embodiments, this determination 725 may also be performed during process 900. Those skilled in the art will also appreciate that such a determination 725 may be performed during different steps of processes 700 and 900. For example, this determination process 725 may be performed after finding the best location of each block size.

또한, 일부 실시예들은 프로세스(700) 동안 단계 735에서 서치를 수행하지 않을 수도 있다. 또한, 상기 프로세스들(700 및 900)은 기준-프레임 매크로블록을 위해 서치들을 수행하는 단계를 기술하지만, 당업자라면, 프로세스들(700 및 900)이 픽셀 어레이의 다른 유형들(예를 들어, 16×8 서브-매크로블록들)을 서치하는데 사용될 수도 있다는 것을 알 수 있을 것이다. In addition, some embodiments may not perform a search at step 735 during process 700. In addition, although the processes 700 and 900 describe performing searches for a reference-frame macroblock, those skilled in the art will appreciate that processes 700 and 900 may be implemented in other types of pixel arrays (eg, 16). It will be appreciated that it may be used to search for x8 sub-macroblocks).

B. B. 보간값들의Of interpolation values 캐싱 Caching

도 12는 기준 프레임에서의 몇 가지 픽셀 및 서브-픽셀 위치들을 개념적으로 도시한다. 이들 서브-픽셀 위치들은 1/2 및 1/4 픽셀 위치들을 포함한다. 이 도면에 추가적으로 도시되어 있는 바와 같이, 현재-프레임의 매크로블록(1200)은 1/2 서브-픽셀 위치들(1205)로써 정렬되어 있다(즉, 현재 프레임 매크로블록의 픽셀 위치들은 기준 프레임의 1/2 서브-픽셀 위치들과 일열로 정렬되어 있다). 12 conceptually illustrates several pixel and sub-pixel positions in a reference frame. These sub-pixel positions include half and quarter pixel positions. As further shown in this figure, the macroblock 1200 of the current-frame is aligned with ½ sub-pixel positions 1205 (ie, the pixel positions of the current frame macroblock are 1 in the reference frame). / 2 sub-pixel positions and aligned in a row).

앞서 언급된 바와 같이, 인코더는, 일부 실시예들의 모션 예측 연산 동안 기준 프레임의 서브-픽셀 위치들로써 정렬되어 있는(즉, 픽셀 위치들로써 정렬되어 있지 않은) 매크로블록들 또는 매크로블록 파티션들을 검사한다. 기준 프레임으로부터, 일부 실시예들의 디코더는 일부 경우들에서, 서브-픽셀 위치들로써 정렬되어 있는(즉, 픽셀 위치들로써 정렬되어 있지 않은) 매크로블록들 또는 매크로블록 파티션들을 검색해야 할 수도 있다. As mentioned above, the encoder checks macroblocks or macroblock partitions that are aligned with sub-pixel positions of the reference frame (ie, not aligned with pixel positions) during the motion prediction operation of some embodiments. From the reference frame, the decoder of some embodiments may in some cases need to search for macroblocks or macroblock partitions that are aligned by sub-pixel positions (ie, not aligned by pixel positions).

서브-픽셀 위치들로써 정렬되어 있는 매크로블록들 또는 매크로블록 파티션들의 검사 및 검색은, 인코더 또는 디코더가, 디코딩 연산 동안 현재 프레임의 픽셀 위치들에 대응되고 인코딩 연산 동안 현재 프레임의 픽셀 위치들에 비교되어야 하는 서브-픽셀 위치들에서 기준 프레임을 위한 이미지 값들(예를 들어, 휘도 값들)을 발생시킬 것을 요한다. Inspection and retrieval of macroblocks or macroblock partitions that are aligned with sub-pixel positions requires that the encoder or decoder correspond to the pixel positions of the current frame during the decoding operation and to the pixel positions of the current frame during the encoding operation. Generating image values (eg, luminance values) for the reference frame at the sub-pixel locations.

일부 실시예들에서, 서브-픽셀 위치들에 대응되는 이미지 값들을 발생시키는 단계는 이웃한 픽셀 위치들의 이미지 값들로부터 이미지 값들을 보간하는 단계(즉, 픽셀 위치들의 이미지 값들로부터 서브-픽셀 위치를 위한 이미지 값을 유도하는 단계)를 수반한다. 많은 경우들에서, 서브-픽셀 위치를 위한 이미지 값을 보간하는 단계는, 2개의 가장 인접한 픽셀 위치들의 이미지 값들의 단순한 평균 이상을 수반하는 어려운 연산(예를 들어, 계산 집약적인 연산)이다. 따라서, 일부 실시예들은 서브-픽셀 위치를 위해 보간된 이미지 값을, 다른 현재 프레임 파티션의 후속 서치가 서브-픽셀 위치를 위해 앞서 언급된 보간된 이미지 값을 검사하고자 할 경우, 용이하게 검색될 수 있는 캐시에 저장한다. 일부 실시예들은 보간된 모든 값들을 캐시에 저장하는 한편, 다른 실시예들은 보간된 값들 중 일부만을 캐시에 저장한다.In some embodiments, generating image values corresponding to the sub-pixel positions may comprise interpolating the image values from the image values of neighboring pixel positions (ie, for the sub-pixel position from the image values of the pixel positions). Deriving an image value). In many cases, interpolating the image value for the sub-pixel position is a difficult operation (eg, computationally intensive operation) involving more than just a mean of the image values of the two nearest pixel positions. Thus, some embodiments may be easily retrieved if the interpolated image value for the sub-pixel position is to be examined by a subsequent search of another current frame partition to check the interpolated image value mentioned above for the sub-pixel position. Store it in a cache. Some embodiments store all interpolated values in the cache, while other embodiments store only some of the interpolated values in the cache.

인코딩 및/또는 디코딩 연산들 동안, 현재-프레임 매크로블록들의 세트를 위한 다수의 모션 벡터들은 동일한 기준 프레임을 포인팅할 것이다. 예를 들어, 도 13에 나타낸 바와 같이, 프레임(1310)은 프레임들(1305 및 1325)을 참조하여 정의되는 모션 벡터들을 가진다. 또한, 프레임들(1315 및 1320) 모두는 프레임(1305)을 참조하여 정의되는 모션 벡터들을 가진다. 따라서, 일부 경우들에서는, 기준 프레임이 하나 이상의 다른 프레임을 인코딩 또는 디코딩하는데 사용될 수도 있다. 따라서, 기준 프레임을 위해 보간된 서브-픽셀 값들이 다른 프레임들의 인코딩에 사용될 수도 있으므로, 그들 모두를 또는 그들 중 일부를 캐싱하는 것이 바람직하다. During encoding and / or decoding operations, multiple motion vectors for the set of current-frame macroblocks will point to the same reference frame. For example, as shown in FIG. 13, frame 1310 has motion vectors defined with reference to frames 1305 and 1325. Also, both frames 1315 and 1320 have motion vectors defined with reference to frame 1305. Thus, in some cases, a reference frame may be used to encode or decode one or more other frames. Thus, since interpolated sub-pixel values for a reference frame may be used for encoding of other frames, it is desirable to cache all or some of them.

C. 캐시 C. cache 타일링Tiling (Cache Tiling)(Cache Tiling)

도 14는 기준 프레임을 캐시에 저장하기 위한 방법을 개념적으로 도시한다. 일부 실시예들에서, 이 방법은 상술된 보간 연산과 함께 구현된다. 이 도면에 도시되어 있는 바와 같이, 기준 프레임(1305)은 몇 개의 타일들(1430)로 분할된다. 일부 실시예들에서, 프레임(1305)은 타일들의 2 이상의 컬럼들 및 타일들의 2 이상의 로우들을 포함하는 방식으로 분할된다. 14 conceptually illustrates a method for storing a reference frame in a cache. In some embodiments, this method is implemented with the interpolation operation described above. As shown in this figure, the reference frame 1305 is divided into several tiles 1430. In some embodiments, frame 1305 is divided in a manner that includes two or more columns of tiles and two or more rows of tiles.

또한, 도 14는, 기준 프레임의 픽셀 위치들로써 정렬되어 있을 수도 그렇지 않을 수도 있는 픽셀 블록(1450)을 도시한다. 픽셀 블록(1450)은, 인코딩 연산 동안(즉, 모션 예측 동안) 검사되거나 디코딩 연산 동안 검색될 기준 프레임 부분을 표현한다. 14 also shows pixel block 1450, which may or may not be aligned with pixel positions of the reference frame. Pixel block 1450 represents the portion of the reference frame to be examined during the encoding operation (ie, during motion prediction) or to be retrieved during the decoding operation.

도 14에 도시되어 있는 바와 같이, 타일들의 부분들(1430a-1430d)은 픽셀 블록(1450)을 검사하거나 검색하는데 필요하다. 따라서, 인코딩 또는 디코딩 연산 동안 (픽셀 블록(1450)과 같은) 픽셀 블록들의 검사를 용이하게 하기 위해, 일부 실시예들은 기준 프레임(1305)을 기준 프레임의 타일들의 견지에서 캐싱한다. 다시 말해, 기준 프레임(1305)에 걸쳐 확장하는 픽셀들의 로우들(예를 들어, 픽셀 블록(1450)을 포함하는 픽셀 로우들(1401-1425))을 캐싱하는 대신에, 일부 실시예들은 기준 프레임 내의 타일들만을 캐싱한다. As shown in FIG. 14, portions of the tiles 1430a-1430d are needed to inspect or retrieve the pixel block 1450. Thus, to facilitate inspection of pixel blocks (such as pixel block 1450) during an encoding or decoding operation, some embodiments cache the reference frame 1305 in terms of the tiles of the reference frame. In other words, instead of caching rows of pixels that extend across reference frame 1305 (eg, pixel rows 1401-1425 that include pixel block 1450), some embodiments provide a reference frame. Cache only tiles within.

특정 픽셀 블록의 분석을 위해 한 세트의 타일들이 필요할 경우, 이러한 실시예들의 인코더 또는 디코더는, 특정 픽셀 블록이 중첩하는 타일들 모두가 캐시에 존재하는지의 여부를 판정한다. 그렇다면, 인코더 또는 디코더는 캐싱된 타일들을 사용해 특정 픽셀 블록을 프로세싱한다. 그렇지 않다면, 인코더 또는 디코더는 (1) 비-캐시 스토리지로부터 소정 타일들(즉, 특정 픽셀 블록과 중첩하지만 현재적으로 캐시에 존재하지 않는 타일들)을 검색하고, (2) 이러한 타일들을 캐시에 저장한 다음, (3) 이 타일들을 사용해 특정 픽셀 블록을 프로세싱한다. 예를 들어, 픽 셀 블록(1450)을 프로세싱하려 할 때, 인코더 또는 디코더는, 이 블록이 타일들(143Oa-143Od)과 중첩한다고 판정한다. 따라서, 인코더 또는 디코더는 (이 타일들이 이미 캐시에 존재하고 있지 않다면) 이 타일들(143Oa-143Od)을 캐시로 인출한 다음, 블록(1450)을 프로세싱하는데 이 타일들을 사용한다. If a set of tiles is needed for the analysis of a particular pixel block, the encoder or decoder of these embodiments determines whether all of the tiles that the particular pixel block overlaps are in the cache. If so, the encoder or decoder uses cached tiles to process the particular pixel block. Otherwise, the encoder or decoder retrieves (1) certain tiles from the non-cache storage (ie, tiles that overlap a particular pixel block but are not currently in the cache), and (2) retrieve these tiles into the cache. After saving, (3) use these tiles to process a specific block of pixels. For example, when attempting to process pixel block 1450, the encoder or decoder determines that the block overlaps tiles 1380a-143Od. Thus, the encoder or decoder fetches these tiles 143Oa-143Od into the cache (if these tiles are not already in the cache) and then uses these tiles to process block 1450.

일부 실시예들에서, 캐시 스토리지는 인코딩 또는 디코딩 연산들을 수행하는데 사용되는 컴퓨터 시스템 프로세서의 캐시이다. 다른 실시예들에서, 캐시 스토리지는 인코딩 또는 디코딩 연산들을 수행하는데 사용되는 컴퓨터 시스템의 비휘발성 메모리(예를 들어, RAM(random access memory))의 전용 섹션이다. 또한, 도 14가 캐싱을 위한 1/4형 타일들을 도시하고 있기는 하지만, 일부 실시예들은 그들의 타일들을 위해, 직사각형들과 같은, 다른 형태들을 사용할 수도 있다. In some embodiments, cache storage is a cache of a computer system processor used to perform encoding or decoding operations. In other embodiments, cache storage is a dedicated section of non-volatile memory (eg, random access memory) of a computer system used to perform encoding or decoding operations. In addition, although FIG. 14 shows quarter-shaped tiles for caching, some embodiments may use other shapes, such as rectangles, for their tiles.

D. D. 모션motion 예측을 위한 For prediction 적응적Adaptive 서치search 패턴 pattern

일부 실시예들은 상술된 다단식 모션 예측 연산 동안 상이한 서치 기준들을 사용해 서치들을 수행한다. 일부 실시예들은, 서치들을 수행할 때 고정 서치 패턴을 사용한다. 다른 실시예들은 상이한 서치 패턴들을 사용할 수 있다. 예를 들어, 일부 실시예들은 소정 기준들에 기초해 서치 패턴을 적응적으로 선택한다. Some embodiments perform searches using different search criteria during the multi-stage motion prediction operation described above. Some embodiments use a fixed search pattern when performing searches. Other embodiments may use different search patterns. For example, some embodiments adaptively select a search pattern based on certain criteria.

일례는 저밀도와 고밀도 서치 패턴(low-density and high-density search pattern) 사이에서 선택하는 것이다. 도 15는 서치 윈도(1500) 내의 저밀도 서치 패턴을 도시한다. 이 도면은, 패턴이 서치를 위해 특정하는 위치들을 표현하는 블랙 서클들의 견지에서 서치 패턴을 도시한다. 도 15에 도시되어 있는 바와 같이, 본 서치 패턴은, 검사될 수 있는 (블랙 및 화이트 서클들에 의해 식별되는) 49개의 잠재적 매크로블록 위치들 중에서, 서치를 위해 16개의 위치들만을 특정한다. 도 16은 서치 윈도(1500) 내의 고밀도 서치 패턴을 도시한다. 이 도면의 서치 패턴은, 검사될 수 있는 (블랙 및 화이트 서클들에 의해 식별되는) 49개의 잠재적 매크로블록 위치들 중에서, 서치를 위해 25개 위치들을 특정한다. One example is to choose between a low-density and high-density search pattern. 15 illustrates a low density search pattern in search window 1500. This figure shows the search pattern in terms of black circles representing the locations that the pattern specifies for the search. As shown in FIG. 15, this search pattern specifies only 16 positions for the search, out of the 49 potential macroblock positions (identified by black and white circles) that can be examined. 16 illustrates a high density search pattern in search window 1500. The search pattern in this figure specifies 25 positions for search, out of 49 potential macroblock positions (identified by black and white circles) that can be examined.

일부 실시예들은 소정 인코딩 결과들에 기초해 도 15 및 도 16에 도시되어 있는 서치 패턴들 사이에서 적응적으로 선택할 수도 있다. 예를 들어, 일부 실시예들은 고해상도 인코딩들(예를 들어, HD TV 인코딩)을 위해 도 16에 도시되어 있는 고밀도 패턴을 사용할 수 있는 한편, 다른 실시예들은 네트워크를 통해 전달되는 실시간 비디오를 스트리밍하기 위해 도 15에 도시되어 있는 저밀도 패턴을 사용할 수 있다. Some embodiments may adaptively select between the search patterns shown in FIGS. 15 and 16 based on certain encoding results. For example, some embodiments may use the high density pattern shown in FIG. 16 for high resolution encodings (eg, HD TV encoding), while other embodiments may stream live video delivered over a network. For example, the low density pattern shown in FIG. 15 may be used.

다른 방법으로, 일부 실시예들은 수직 서치 이동들을 강조하는 서치 패턴을 사용하는 한편, 다른 실시예들은 수평 서치 이동들을 강조하는 서치 패턴을 사용한다. 도 17은, 예측되는 매크로블록 위치 부근에 중심이 위치하는 서치 윈도에서의 서치 패턴 일례를 도시한다. 이 서치 패턴은 수직 방향으로 바이어스되어 있다. 이러한 서치 패턴이 검사할 수 있는 위치들의 수가 한정되어 있을 경우, 도 17에 도시된 패턴은 인코더의 한정된 서치 예산을 서치 윈도(1500) 중앙의 예측된 매크로블록 위치 부근의 수직 컬럼들에 존재하는 위치들을 검사하는데 사용한다. Alternatively, some embodiments use a search pattern that emphasizes vertical search movements, while other embodiments use a search pattern that emphasizes horizontal search movements. 17 shows an example of a search pattern in a search window in which the center is located near the predicted macroblock position. This search pattern is biased in the vertical direction. If this search pattern is limited in the number of locations that can be inspected, the pattern shown in FIG. 17 is the location of the encoder's limited search budget present in vertical columns near the predicted macroblock location in the center of the search window 1500. To test them.

도 18은 예측된 매크로블록 위치 부근에 중심이 위치하는 서치 윈도에서의 서치 패턴 일례를 도시한다. 이 서치 패턴은 수평 방향으로 바이어스되어 있다. 이러한 서치 패턴이 검사할 수 있는 위치들의 수가 한정되어 있을 경우, 도 18에 도시된 패턴은 인코더의 한정된 서치 예산을 서치 윈도(1500) 중앙의 예측된 매크로블록 위치 부근의 수평 로우들에 존재하는 위치들을 검사하는데 사용한다. 18 shows an example of a search pattern in a search window centered near the predicted macroblock position. This search pattern is biased in the horizontal direction. If this search pattern is limited in the number of locations that can be inspected, the pattern shown in FIG. 18 is a location where the limited search budget of the encoder exists in horizontal rows near the predicted macroblock location in the center of the search window 1500. To test them.

일부 실시예들은 인접한 매크로블록들의 벡터들에 기초해 도 17 및 도 18에 도시되어 있는 2개 패턴 사이에서 적응적으로 선택한다. 그들 중 대부분 또는 그들 모두가 특정 방향(예를 들어, 수직 또는 수평 방향)을 포인팅한다면, 이 실시예들은 도 17 또는 도 18에 도시되어 있는 패턴들을 선택한다. 일부 실시예들은, 일 방향(예를 들어, y-축)에 따른 모션 벡터의 절대값이 나머지 방향(예를 들어, x-축)에 따른 모션 벡터의 절대값보다 큰지의 여부를 판정하는 것에 의해, 인접한 매크로블록들의 모션 벡터들이 특정 방향을 포인팅하는지의 여부를 판정한다. 일부 실시예들은 인접한 매크로블록들에 대한 모션 벡터들의 방향들을 고려할 뿐만 아니라 이들 벡터들의 크기들도 고려한다. 일부 실시예들은 서치 패턴을 적응적으로 선택함에 있어서 이미지들의 세트에 대한 모션 필드도 고려한다(예를 들어, 이미지들의 세트가 특정 방향의 이동을 묘사하는지의 여부를 고려한다). Some embodiments adaptively select between the two patterns shown in FIGS. 17 and 18 based on vectors of adjacent macroblocks. If most or all of them point to a particular direction (eg vertical or horizontal direction), these embodiments select the patterns shown in FIG. 17 or FIG. 18. Some embodiments provide for determining whether the absolute value of a motion vector along one direction (eg, the y-axis) is greater than the absolute value of the motion vector along the other direction (eg, the x-axis). By determining whether the motion vectors of adjacent macroblocks point in a particular direction. Some embodiments take into account the directions of motion vectors for adjacent macroblocks as well as the magnitudes of these vectors. Some embodiments also consider the motion field for a set of images in adaptively selecting a search pattern (eg, consider whether the set of images depicts movement in a particular direction).

E. RD 비용 계산들E. RD Cost Calculations

앞서 언급된 바와 같이, 본 발명의 일부 실시예들은 모션 예측 연산 동안, RD(rate distortion) 비용과 같은, 특정 매크로블록을 위한 비용을 계산한다. 모션 예측 동안 가능한 모든 모드들에 대한 RD 비용을 발생시키는 것은 계산 집약적이다. 이 비용이 대부분 왜곡을 측정하는 단계 및 발생될 실제 비트들을 카운팅하는 단계를 수반한다는 것을 감안할 때, 특히 그러하다. 따라서, 일부 실시예들은 가능한 모든 모드들에 대해 RD 비용을 계산하지 않는다. 대신에, 이 실시예들은, 모션-예측 솔루션들의 순위를 정하고, 상부의 N개 모션-예측 솔루션들을 선택한 다음, 선택된 솔루션들에 대한 RD 비용을 계산하는 것에 의해, 가능한 모드들의 수를 감소시킨다. As mentioned above, some embodiments of the present invention calculate the cost for a particular macroblock, such as a rate distortion (RD) cost, during a motion prediction operation. It is computationally intensive to induce RD costs for all possible modes during motion prediction. This is especially true given that this cost mostly involves measuring distortion and counting the actual bits to be generated. Thus, some embodiments do not calculate the RD cost for all possible modes. Instead, these embodiments reduce the number of possible modes by ranking motion-prediction solutions, selecting the top N motion-prediction solutions, and then calculating the RD cost for the selected solutions.

도 19는 본 발명의 일부 실시예들에 대한 프로세스(1900)를 도시한다. 이 프로세스는, 프로세스가 RD 비용을 계산할 필요가 있는 모션-예측 솔루션들을 식별하기 위해, 모션-예측 솔루션들의 서브-세트를 선택적으로 검사한다. 일부 실시예들에서는, 이 프로세스가 시작하기 전에, 이미 다수의 인코딩 솔루션들이 계산되어 있는 상태이다. 다른 실시예들은 이 프로세스를 인코딩 솔루션들과 함께 수행한다. 19 shows a process 1900 for some embodiments of the present invention. This process optionally checks a sub-set of motion-prediction solutions to identify the motion-prediction solutions for which the process needs to calculate the RD cost. In some embodiments, before this process begins, a number of encoding solutions have already been calculated. Other embodiments perform this process with encoding solutions.

본 프로세스(1900)는 처음에 (단계 1910에서), 최저에서 최고 예측 오류들에 기초해 인코딩 솔루션을 정렬한다. 일부 실시예들에서, 각각의 인코딩 솔루션은 모션 벡터를 발생시킬 뿐만 아니라 예측되는 오류도 발생시킨다. 상이한 실시예들은 상이한 메트릭 계산들을 사용해 오류를 정량화한다. 예를 들어, 일부 실시예들은 MAD(mean absolute difference) 메트릭 스코어를 사용하는 한편, 다른 실시예들은 앞서 언급된 출원에 설명되어 있는 SAD(sum of absolute differences) 메트릭 스코어를 사용한다. 또 다른 실시예들은 2 이상의 메트릭 스코어들의 조합을 사용한다. The process 1900 initially sorts (at step 1910) the encoding solution based on the lowest to highest prediction errors. In some embodiments, each encoding solution generates a motion vector as well as a predicted error. Different embodiments use different metric calculations to quantify the error. For example, some embodiments use mean absolute difference (MAD) metric scores, while others use the sum of absolute differences (SAD) metric score described in the above-mentioned application. Still other embodiments use a combination of two or more metric scores.

다음으로, 본 프로세스는 (단계 1920에서), 정렬된 리스트로부터 상부의 N개 인코딩 솔루션들을 선택한다. 일부 실시예들에서, N의 값은 미리 정해진 수인 한편, 다른 실시예들에서, N의 값은 동적으로 발생되는 수이다. 다음으로, 본 프로 세스는 (단계 1930에서), 선택된 상부의 N개 결과들을 위한 RD 비용을 계산하고, (단계 1940에서) 최저 RD 비용의 인코딩 솔루션을 선택한 다음, 종결한다. Next, the process selects (at 1920) the top N encoding solutions from the sorted list. In some embodiments, the value of N is a predetermined number, while in other embodiments, the value of N is a dynamically generated number. Next, the process calculates (in step 1930) the RD cost for the selected top N results, selects the lowest RD cost encoding solution (in step 1940), and then terminates.

일부 실시예들은 인코딩 솔루션의 RD 비용을 다음의 수학식 1로서 표현하는데, Some embodiments express the RD cost of the encoding solution as:

여기에서, λ는 가중화 팩터이고, NB는 인코딩으로 인해 발생되는 비트들의 수이다. 이러한 RdCost는, 전송되어야 하는 데이터량 및 그 데이터와 연관된 왜곡량을 정량화한다. Where λ is the weighting factor and NB is the number of bits generated due to encoding. This RdCost quantifies the amount of data to be transmitted and the amount of distortion associated with that data.

단순한 RD 비용을 계산하는 대신에, 일부 실시예들은 (단계 1930에서), RD 비용을 고려할 뿐만 아니라 인코딩 솔루션이 발생된 소정 모드를 디코딩하는 복잡도도 고려하는 비용을 계산한다. 이 비용은 다음의 수학식 2로서 표현될 수 있는데, Instead of calculating a simple RD cost, some embodiments (at step 1930) calculate a cost that not only takes into account the RD cost but also the complexity of decoding a given mode in which an encoding solution is generated. This cost can be expressed as:

,

여기에서, RdCost는 앞서 특정된 수학식 1에서와 같이 계산되고, α는 디코딩 복잡도와 연관된 중요도 팩터이며, cf는 데이터에 대해 수행되는 디코딩의 양을 정량화하는 복잡도 팩터이다. Here, RdCost is calculated as in Equation 1 specified above, α is an importance factor associated with decoding complexity, and cf is a complexity factor that quantifies the amount of decoding performed on the data.

단계 1930 이후에, 본 프로세스는 (단계 1940에서), 단계 1930에서 계산된 최저 비용을 초래한 모션-예측 솔루션을 선택한 다음, 종료한다. 초기 메트릭 스코어에 의해 모션 예측 연산을 초기에 정렬하고 최저의 초기 메트릭 스코어를 가진 인코딩 솔루션들에 대한 비용 메트릭만을 정량화하는 것에, 프로세스(1900)는, 프로세스가 가능한 가장 빠른 방법으로 수용 가능한 결과를 찾아낸다는 것을 보장한다. After step 1930, the process selects (in step 1940) the motion-prediction solution that resulted in the lowest cost calculated in step 1930, and then ends. In initially aligning the motion prediction operation by the initial metric score and quantifying only the cost metric for the encoding solutions with the lowest initial metric score, the process 1900 finds an acceptable result in the fastest way the process can find. Guaranteed to pay.

IV. 컴퓨터 시스템IV. Computer systems

도 20은, 본 발명의 일부 실시예들이 구현되는 컴퓨터 시스템을 개념적으로 도시한다. 컴퓨터 시스템(2000)은 버스(2005), 프로세서(2010), 시스템 메모리(2015), ROM(read-only memory; 2020), 영구 저장 장치(2025), 입력 장치들(2030), 및 출력 장치들(2035)을 포함한다.20 conceptually illustrates a computer system in which some embodiments of the present invention are implemented. Computer system 2000 includes bus 2005, processor 2010, system memory 2015, read-only memory 2020, persistent storage 2025, input devices 2030, and output devices. (2035).

버스(2005)는, 컴퓨터 시스템(2000)의 내부 장치들 사이의 통신을 지원하는 시스템, 주변 장치, 및 칩셋 버스들을 집합적으로 표현한다. 예를 들어, 버스(2005)는 프로세서(2010)를 ROM(2020), 시스템 메모리(2015), 및 영구 저장 장치(2025)와 통신 접속한다. The bus 2005 collectively represents a system, peripheral, and chipset buses that support communication between internal devices of the computer system 2000. For example, the bus 2005 connects the processor 2010 in communication with a ROM 2020, a system memory 2015, and a persistent storage 2025.

이렇게 다양한 메모리 유닛들로부터, 프로세서(2010)는 본 발명의 프로세스들을 실행하기 위해 실행할 명령어들 및 프로세싱할 데이터를 검색한다. ROM(2020)은 프로세서(2010) 및 컴퓨터 시스템의 다른 모듈들에 의해 필요한 정적 데이터 및 명령어들을 저장한다. 한편, 영구 저장 장치(2025)는 판독 및 기입 메모리 장치이다. 이 장치는, 컴퓨터 시스템(2000)의 전원이 차단될 경우에도 명령어 및 데이터를 저장하는 비-휘발성 메모리이다. 본 발명의 일부 실시예들은 (자 기 또는 광 디스크 및 그것의 대응되는 디스크 드라이브와 같은) 대용량 저장 장치를 영구 저장 장치(2025)로서 사용한다. 다른 실시예들은 (플로피 디스크 또는 zip® 디스크, 및 그것의 대응되는 디스크 드라이브와 같은) 분리형 저장 장치를 영구 저장 장치로서 사용한다. From these various memory units, processor 2010 retrieves instructions to execute and data to process to execute the processes of the present invention. ROM 2020 stores static data and instructions needed by processor 2010 and other modules of a computer system. On the other hand, persistent storage 2025 is a read and write memory device. This device is a non-volatile memory that stores instructions and data even when the computer system 2000 is powered off. Some embodiments of the present invention use mass storage (such as magnetic or optical disks and their corresponding disk drives) as permanent storage 2025. Other embodiments use removable storage devices (such as floppy disks or zip® disks, and their corresponding disk drives) as permanent storage devices.

영구 저장 장치(2025)와 마찬가지로, 시스템 메모리(2015)는 판독 및 기입 메모리 장치이다. 그러나, 저장 장치(2025)와 달리, 시스템 메모리는, RAM(random access memory)과 같은, 휘발성의 판독 및 기입 메모리이다. 시스템 메모리는, 프로세서가 실행시에 필요로 하는 명령어들 및 데이터 중 일부를 저장한다. 일부 실시예들에서, 본 발명의 프로세스들은 시스템 메모리(2015), 영구 저장 장치(2025), 및/또는 ROM(2020)에 저장된다. As with persistent storage 2025, system memory 2015 is a read and write memory device. However, unlike storage device 2025, system memory is volatile read and write memory, such as random access memory (RAM). System memory stores some of the instructions and data that the processor needs at run time. In some embodiments, the processes of the present invention are stored in system memory 2015, persistent storage 2025, and / or ROM 2020.

또한, 버스(2005)는 입력 및 출력 장치들(2030 및 2035)에 접속한다. 입력 장치들은, 사용자로 하여금 컴퓨터 시스템에 정보를 전달할 수 있게 하고 컴퓨터 시스템을 위한 명령들을 선택할 수 있게 한다. 입력 장치들(2030)은 영숫자 키보드들 및 커서-제어기들을 포함한다. 출력 장치들(2035)은 컴퓨터 시스템에 의해 발생된 이미지들을 디스플레이한다. 출력 장치들은 프린터들 및, CRT(cathode ray tubes) 또는 LCD(liquid crystal displays)와 같은, 디스플레이 장치들을 포함한다. In addition, bus 2005 connects to input and output devices 2030 and 2035. Input devices allow a user to communicate information to a computer system and to select instructions for the computer system. Input devices 2030 include alphanumeric keyboards and cursor-controllers. Output devices 2035 display images generated by the computer system. Output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD).

마지막으로, 도 20에 도시되어 있는 바와 같이, 버스(2005)는 컴퓨터(2000)를 (나타내지 않은) 네트워크 어댑터를 통해 네트워크(2065)에도 커플링한다. 이런 식으로, 컴퓨터는 (LAN(local area network), WAN(wide area network), 또는 인 트라넷과 같은) 컴퓨터들의 네트워크 일부 또는 (인터넷과 같은) 네트워크들의 네트워크일 수 있다. 컴퓨터 시스템(2000)의 컴포넌트들 중 어느 하나 또는 모두가 본 발명과 관련하여 사용될 수 있다. 그러나, 당업자라면, 임의의 다른 시스템 구성이 본 발명과 관련하여 사용될 수도 있다는 것을 알 수 있을 것이다. Finally, as shown in FIG. 20, bus 2005 couples computer 2000 to network 2065 through a network adapter (not shown). In this way, a computer may be part of a network of computers (such as a local area network, a wide area network, or an intranet) or a network of networks (such as the Internet). Any or all of the components of computer system 2000 may be used in connection with the present invention. However, it will be apparent to one skilled in the art that any other system configuration may be used in connection with the present invention.

다양한 구체적 세부 사항들을 참조하여 본 발명이 설명되었지만, 당업자라면, 본 발명이 본 발명의 정신을 벗어나지 않으면서 다른 특정 형태들로 구현될 수도 있다는 것을 알 수 있을 것이다. 예를 들어, 본 발명의 다수 실시예들은 매크로블록들을 참조하여 상술되었다. 당업자는, 이들 실시예들이 픽셀 값들의 다른 임의 어레이와 관련하여 사용될 수도 있다는 것을 알 수 있을 것이다. While the invention has been described with reference to various specific details, those skilled in the art will recognize that the invention may be embodied in other specific forms without departing from the spirit of the invention. For example, many embodiments of the invention have been described above with reference to macroblocks. Those skilled in the art will appreciate that these embodiments may be used in conjunction with any other array of pixel values.

Claims

A method for encoding a first set of pixels of a first image with reference to a second image of a video sequence, the method comprising:

a) in the first search window in the second image,

Searching to identify a first specific portion of the second image that best matches the first set of pixels of the first image;

Identifying a first location corresponding to the first specific portion; And

b) in a second search window in the second image,

Searching to identify a second specific portion of the second image that best matches the first set of pixels of the first image, the second search window being defined near the first location

How to include.

The method of claim 1,

And the second search window in the second image is smaller than the first search window of the second image.

The method of claim 1,

Searching for identifying the first specific portion comprises searching from one or more start locations.

The method of claim 1,

Searching in the first search window includes a coarse search,

The searching in the second search window includes a refined search.

The method of claim 1,

Searching in the second search window,

a) identifying a plurality of search points within the second search window;

b) for each particular search point, repeatedly

i. Identifying a plurality of second pixel groups;

ii. Calculating a motion vector metric for each identified second group of pixels;

iii. Designating a best second pixel group for each of the first pixel groups; And

iv. If the criteria are met, further comprising ignoring the remaining search points.

The method of claim 5,

a) determining whether the designated second pixel group associated with the particular pixel group has a calculated motion vector metric that exceeds a threshold; And

b) after determining that the designated second pixel group associated with the particular first pixel group exceeds the threshold, excluding the particular first pixel group from subsequent searches.

The method of claim 6,

The threshold is dynamically defined while searching in the second search window.

The method of claim 5,

Searching in the second search window comprises searching at a first pixel level.

The method of claim 8,

Searching in the second search window further includes searching at a second pixel level,

Wherein the first pixel level is an integer pixel level and the second pixel level is a half pixel level.

A method for interblock encoding images of a video sequence, each image of the video sequence having a plurality of integer pixel positions, each integer pixel position having at least one image value,

a) selecting a first image for encoding with reference to a second image;

b) identifying a first set of non-integer pixel positions of the second image that match the pixel set of the first image, wherein the identification is performed from image values for a plurality of integer pixel positions of the second image. Interpolating image values associated with the non-integer pixel locations of a second image; And

c) storing the interpolated image values of the non-integer pixel positions for subsequent use during third image encoding referencing the second image.

The method of claim 10,

After identifying the set of non-integer pixel positions of the second image, interpolating image values for another group of non-integer pixel positions of the second image.

The method of claim 11,

And the group of non-integer pixel positions is located near the first set of non-integer pixel positions.

A method for interblock decoding images of a video sequence, each image of the video sequence having a plurality of integer pixel positions, each of the integer pixel positions having at least one image value,

a) selecting a first image for decoding with reference to the second image;

b) identifying a set of non-integer pixel locations of the second image corresponding to the pixel set of the first image;

c) interpolating image values associated with non-integer pixel positions of the second image from image values for a plurality of integer pixel positions of the second image; And

d) storing the interpolated image values of the non-integer pixel positions for subsequent use during third image decoding referencing the second image.

The method of claim 13,

After interpolating image values associated with the non-integer pixel position, further comprising interpolating image values for a group of other non-integer pixel positions of the second image.

The method of claim 14,

And the group of non-integer pixel positions is located near the first non-integer pixel position.

A method for interblock processing a first portion of a first image with reference to a second image in a sequence of video images,

a) dividing the second image into a set of tiles;

b) storing the tiles in a first non-cache memory storage;

c) retrieving the sub-set of tiles from the first non-cache memory storage whenever a sub-set of tiles is needed; And

d) storing the subset of the retrieved tiles in a second cache memory storage between the portions of the second image that are part of the subset of the retrieved tiles and the first portion, the sub-set of the retrieved tiles Set is smaller than the full set of tiles-

How to include.

The method of claim 16,

The method includes, when identifying a location of the second image to search to identify the second image portion that matches the first portion, a sub-set of tiles is retrieved and stored in the second cache memory storage. Determining if there is a need to be made, and the location so identified corresponds to the sub-set of tiles.

The method of claim 16,

Said cache memory storage is a random access memory (RAM) of a computer.

The method of claim 16,

And the cache memory storage is a non-volatile storage device of a computer.

The method of claim 16,

The interblock processing method is an interblock encoding method.

The method of claim 16,

The interblock processing method is an interblock decoding method.

The method of claim 16,

Wherein the set of tiles comprises at least two horizontally adjacent tiles and at least two vertically adjacent tiles.

The method of claim 16,

The tiles are sequentially stored in the cache memory storage.

An interblock encoding method for encoding a first set of pixels of a first video image, the method comprising:

a) selecting a first search pattern from a set of search patterns each defining a pattern for inspecting second image portions that may match the first pixel set; And

b) adaptively selecting the first search pattern in the set of search patterns based on the set of criteria.

The method of claim 24,

Wherein the set of criteria comprises a resolution encoding of the sequence of images of media.

The method of claim 24,

Wherein the set of criteria comprises motion vectors of adjacent motion vectors.

The method of claim 24,

Wherein the set of criteria comprises a motion field for the set of video images.

A method for encoding a first set of pixels of a first image with reference to a second image in a sequence of images, the method comprising:

a) identifying a plurality of second pixel sets in a second image;

b) calculating a first metric score for each of the second set of pixels;

c) based on the first metric score, identifying a subset of second pixel sets;

d) from the subset of the identified second pixel sets,

i. Calculating a second metric score for each of the identified second pixel sets; And

ii. Selecting the identified second set of pixels having the best second metric score, wherein the selected and identified second set of pixels best matches the first set of pixels.

How to include.

The method of claim 28,

Each of the second pixel sets includes a plurality of second grouping of pixels, and each of the second pixel groupings includes a plurality of second group of pixels. Way.

The method of claim 29,

Computing the first metric score,

a) calculating a first metric score for each of the second pixel groupings; And

b) calculating a first metric score for each of the second group of pixels.

The method of claim 30,

Identifying the subset of second pixel sets comprises identifying a subset of second pixel groupings and a subset of the second pixel groups.

The method of claim 31, wherein

Calculating the second metric score comprises calculating a second metric score for each second pixel grouping and each second pixel group.

The method of claim 32,

Wherein the first metric score is a sum absolute difference (SAD) metric score.

The method of claim 28,

Identifying the subset of second pixel sets comprises selecting an upper N second set of pixels with the lowest first metric score.

The method of claim 34, wherein

The second rate metric score is a rate distortion (RD) cost that quantifies the amount of data to be transmitted and the amount of distortion associated with the transmitted data.

The method of claim 28,

a) calculating a third metric score for the top N second set of pixels with the lowest second metric score; And

b) selecting the identified second set of pixels having the best third score, wherein the selected and identified second set of pixels best matches the first set of pixels.

How to include more.