KR100984650B1

KR100984650B1 - Video encoding method enabling highly efficient partial decoding of H.264 and other transform coded information

Info

Publication number: KR100984650B1
Application number: KR1020087010065A
Authority: KR
Inventors: Peisong Chen; Seyfullah Halit Oguz
Original assignee: Qualcomm Incorporated
Priority date: 2005-09-27
Filing date: 2006-09-27
Publication date: 2010-10-01
Anticipated expiration: 2026-09-27
Also published as: TW200719726A; CN101310536B; CN101310536A; JP2012231505A; WO2007038727A2; KR20080066714A; EP1941742A2; AR055185A1; WO2007038727A3; JP2009510938A

Abstract

A method and apparatus for processing multimedia data that enable efficient partial decoding of transform coded data are disclosed. A decoder device receives transform coefficients, where the transform coefficients are associated with multimedia data. The decoder device determines a set of multimedia samples to be reconstructed. In one aspect, the set of samples to be reconstructed is a subset of a matrix of transformed multimedia samples. The decoder device determines a set of transform coefficients to be used to reconstruct the multimedia samples. In one aspect, the transform coefficients are used to scale partial basis images associated with the encoding method used to generate the transform coefficients, resulting in the reconstructed multimedia samples.

Multimedia data, multimedia samples, transform coefficients, dequantization, decoder device

Description

VIDEO ENCODING METHOD ENABLING HIGHLY EFFICIENT PARTIAL DECODING OF H.264 AND OTHER TRANSFORM CODED INFORMATION

Cross Reference to Related Application

Claim of Priority under 35 U.S.C. §119

This patent application claims priority to Provisional Application No. 60/721,377, entitled "ERROR CONCEALMENT", filed September 27, 2005, assigned to the assignee hereof and expressly incorporated herein by reference.

Background

Field of the Invention

The present invention relates to multimedia signal processing, and more particularly to video encoding and decoding.

Description of the Related Technology

Multimedia signal processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the MPEG-x and H.26x standards. Such encoding methods are generally directed to compressing the multimedia data for transmission and/or storage. Compression is broadly the process of removing redundancy from the data.

A video signal may be described in terms of a sequence of pictures, which include frames (an entire picture) or fields (for example, an interlaced video signal comprises fields of alternating odd or even lines of a picture). As used herein, the term "frame" refers to a picture, a frame, or a field. Video encoding methods compress video signals by using lossless or lossy compression algorithms to compress each frame. Intra-frame coding (herein referred to as intra-coding) means encoding a frame using that frame itself. Inter-frame coding (herein referred to as inter-coding) means encoding a frame based on other "reference" frames. For example, video signals often exhibit spatial redundancy, in which portions of video frame samples near each other in the same frame match, or at least approximately match, each other.

A multimedia processor, such as a video encoder, may encode a frame by partitioning it into blocks or "macroblocks" of, for example, 16×16 pixels. The encoder may further partition each macroblock into subblocks. Each subblock may in turn comprise additional subblocks. For example, the subblocks of a macroblock may include 16×8 and 8×16 subblocks. The subblocks of the 8×16 subblocks may include 8×8 subblocks, and so forth. As used herein, the term "block" refers to either a macroblock or a subblock.
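The partitioning described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `macroblock_origins` and `split_block` are hypothetical, and the frame dimensions are assumed to be exact multiples of the block size.

```python
def macroblock_origins(width, height, mb_size=16):
    """Yield the (x, y) top-left corner of each macroblock in raster order.

    Assumption: width and height are multiples of mb_size.
    """
    for y in range(0, height, mb_size):
        for x in range(0, width, mb_size):
            yield (x, y)


def split_block(x, y, w, h, sub_w, sub_h):
    """Split the block at (x, y) of size w x h into sub_w x sub_h subblocks.

    Each subblock is returned as an (x, y, width, height) tuple.
    """
    return [(x + dx, y + dy, sub_w, sub_h)
            for dy in range(0, h, sub_h)
            for dx in range(0, w, sub_w)]


# A 16x16 macroblock split into two 8x16 subblocks,
# then the second of those split into two 8x8 subblocks.
halves = split_block(0, 0, 16, 16, 8, 16)
quarters = split_block(*halves[1][:2], 8, 16, 8, 8)
```

Applying `split_block` repeatedly gives the nested subblock hierarchy that the paragraph describes.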

One compression technology based on evolving industry standards is commonly referred to as "H.264" video compression. The H.264 technology defines the syntax of an encoded video bitstream together with the method of decoding this bitstream. In one aspect of an H.264 encoding process, an input video frame is presented for encoding. The frame is processed in units of macroblocks corresponding to the original image. Each macroblock can be encoded in intra or inter mode. A predicted macroblock is formed based on an already reconstructed frame or on some of the already reconstructed neighboring blocks in the same frame, which are known as causal neighbors. In intra mode, the predicted macroblock is formed from causal samples in the current frame that have previously been encoded, decoded, and reconstructed. The multimedia samples of the one or more causal neighboring macroblocks are subtracted from the current macroblock being encoded to produce a residual or difference macroblock D. This residual block D is transformed using a block transform and quantized to produce a set X of quantized transform coefficients. These transform coefficients are reordered and entropy encoded. The entropy encoded coefficients, together with other information for decoding the macroblock, become part of a compressed bitstream that is transmitted to a receiving device.
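As a rough illustration of the residual, transform, and quantization steps, the sketch below works on a single 4×4 block in Python. `CF` is the 4×4 forward core transform matrix used by H.264. The quantizer is a deliberately simplified assumption: a plain uniform quantizer with a hypothetical step size `qstep`, whereas the standard folds transform scaling into its quantization tables. The helper names are illustrative.

```python
# 4x4 forward core transform matrix of H.264.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def residual(current, predicted):
    """D = current block minus predicted block, sample by sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def forward_core_transform(d):
    """W = CF * D * CF^T (unscaled transform coefficients)."""
    return matmul(matmul(CF, d), transpose(CF))


def quantize(w, qstep):
    """Simplified uniform quantizer (assumption): divide by qstep and
    truncate toward zero. Not the H.264 quantization tables."""
    return [[int(v / qstep) for v in row] for row in w]


current = [[10] * 4 for _ in range(4)]
predicted = [[8] * 4 for _ in range(4)]
D = residual(current, predicted)
W = forward_core_transform(D)
X = quantize(W, 4)
```

For a flat residual like this one, all of the energy lands in the single DC coefficient `W[0][0]`, which is why many quantized coefficients in practice are zero.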

Unfortunately, during the transmission process, errors may be introduced in one or more macroblocks. For example, one or more degrading transmission effects, such as signal fading, may cause the loss of data in one or more macroblocks. As a result, error concealment becomes important when delivering multimedia content over error prone networks such as wireless channels. Error concealment schemes make use of the spatial and temporal correlation that exists in the video signal. When errors are encountered, recovery may occur during entropy decoding. For example, when packet errors are encountered, all or part of the data belonging to one or more macroblocks or video slices (usually groups of neighboring macroblocks) may be lost. When the video data of a slice is lost, resynchronization of decoding can take place at the next slice, and the missing blocks of the lost slice can be concealed using spatial concealment.

Because the decoded data available to a decoder device includes the causal neighbors that have already been decoded and reconstructed, spatial concealment typically uses causal neighbors to conceal the missing blocks. One reason for using only causal neighbors to conceal lost blocks is that concealing the lost portion of the current slice after out-of-order reconstruction of the next slice can be very inefficient, especially when a highly pipelined video hardware decoder core is used. Non-causal neighbors, however, can provide valuable information for improved spatial concealment. What is needed is an efficient method for providing out-of-order reconstruction of non-causal neighboring multimedia samples.

Summary

Each of the systems, methods, and devices of the invention has several aspects, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of the invention as expressed by the claims that follow, its more prominent features will now be described briefly. After considering this description, and particularly after reading the section entitled "Detailed Description of Specific Aspects", one will understand how sample features of the invention provide advantages for multimedia encoding and decoding, including improved error concealment and improved efficiency.

A method of processing multimedia data is provided. The method includes receiving transform coefficients, where the transform coefficients are associated with the multimedia data. The method further includes determining a set of multimedia samples to be reconstructed, determining a set of the received transform coefficients based on the multimedia samples to be reconstructed, and processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples.

A multimedia data processor is provided. The processor is configured to receive transform coefficients, where the transform coefficients are associated with multimedia data. The processor is further configured to determine a set of multimedia samples to be reconstructed, to determine a set of the received transform coefficients based on the multimedia samples to be reconstructed, and to process the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples.

An apparatus for processing multimedia data is provided. The apparatus includes a receiver to receive transform coefficients, where the transform coefficients are associated with the multimedia data. The apparatus further includes a first determiner to determine a set of multimedia samples to be reconstructed, a second determiner to determine a set of the received transform coefficients based on the multimedia samples to be reconstructed, and a generator to process the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples.

A machine readable medium is provided that includes instructions which, upon execution, cause a machine to process multimedia data. The instructions cause the machine to receive transform coefficients, where the transform coefficients are associated with the multimedia data. The instructions further cause the machine to determine a set of multimedia samples to be reconstructed, to determine a set of the received transform coefficients based on the multimedia samples to be reconstructed, and to process the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples.

Brief Description of the Drawings

FIG. 1 is a block diagram illustrating a multimedia communication system according to one aspect.

FIG. 2A is a block diagram illustrating an aspect of a decoder device that may be used in a system such as that shown in FIG. 1.

FIG. 2B is a block diagram illustrating an example of a computer processor system of a decoder device that may be used in a system such as that shown in FIG. 1.

FIG. 3 is a flowchart illustrating an example of a method of decoding a portion of a video stream in a system such as that shown in FIG. 1.

FIG. 4 is a flowchart illustrating in more detail another example of a method of decoding a portion of a video stream in a system such as that shown in FIG. 1.

FIG. 5 is a detailed view of a 4×4 block and its surrounding causal neighboring pixels.

FIG. 6 is a directional mode diagram illustrating the nine (0 to 8) directional modes used in H.264 to describe the directivity characteristic of a block.

FIG. 7 is a diagram illustrating an example of an intra-coded 4×4 pixel block located directly below and to the right of one or more slice boundaries.

FIG. 8 is a chart of the neighboring pixels and the pixels within an intra-coded 4×4 pixel block.

FIG. 9 is a diagram illustrating an example of an intra-coded 16×16 luma macroblock located directly below and to the right of a slice boundary.

FIG. 10 is a diagram illustrating an example of an intra-coded 8×8 chroma block located directly below and to the right of a slice boundary.

FIG. 11 is a diagram illustrating a portion of the multimedia samples located directly below a slice boundary.

FIG. 12 is a block diagram illustrating another example of a decoder device that may be used in a system such as that shown in FIG. 1.

FIG. 13 is a block diagram illustrating another example of a decoder device 150 that may be used in a system such as that shown in FIG. 1.

Detailed Description of Specific Aspects

The following detailed description is directed to certain specific sample aspects of the invention. However, the invention can be embodied in a multitude of different ways as defined and covered by the claims. The description refers to the drawings, in which like parts are designated with like numerals throughout.

Video signals may be characterized in terms of a series of pictures, frames, or fields. As used herein, the term "frame" is a broad term that may encompass either frames of a progressive video signal or frames or fields of an interlaced video signal.

Aspects include systems and methods of improving processing in an encoder and a decoder in a multimedia transmission system. Multimedia data may include one or more of motion video, audio, still images, or any other suitable type of audio-visual data. Aspects include an apparatus and method of encoding video data in an efficient manner that provides improved error concealment by reconstructing non-causal multimedia samples and using the reconstructed samples to perform spatial concealment of lost or erroneously encoded multimedia data. For example, according to one aspect, it has been found that generating reconstructed causal and/or non-causal neighboring samples before estimating multimedia concealment data for the lost or erroneous data can improve the quality of the spatial concealment. In some examples, the reconstructed multimedia samples and the directivity indicator with which the reconstructed samples were originally encoded are used in estimating the multimedia concealment data. In another aspect, it has been found that reconstructing only a subset of the matrix of multimedia samples that is used for spatial error concealment can further improve processing efficiency. In some examples, the reconstruction of the multimedia samples and the estimation of the multimedia concealment data are performed in a pre-processor. The multimedia concealment data can then be conveyed, together with the originally encoded non-causal multimedia data, to be decoded in an efficient video core processor, further improving processing efficiency.

Multimedia Communication System

FIG. 1 is a functional block diagram illustrating a multimedia communication system 100 according to one aspect. The multimedia communication system 100 includes an encoder device 110 in communication with a decoder device 150 via a network 140. In one example, the encoder device 110 receives a multimedia signal from an external source 102 and encodes that signal for transmission over the network 140.

In this example, the encoder device 110 includes a processor 112 coupled to a memory 114 and a transceiver 116. The processor 112 encodes data from the multimedia data source and provides it to the transceiver 116 for communication over the network 140.

In this example, the decoder device 150 includes a processor 152 coupled to a memory 154 and a transceiver 156. The processor 152 may include one or more of a general purpose processor, a digital signal processor, and/or an application specific hardware processor. The memory 154 may include one or more of solid state or disk based storage or any readable and writable random access memory device. The transceiver 156 is configured to receive multimedia data over the network 140 and make it available to the processor 152 for decoding. In one example, the transceiver 156 includes a wireless transceiver. The network 140 may comprise one or more of a wired or wireless communication system, including one or more of Ethernet, telephone (e.g., POTS), cable, power-line, and fiber optic systems, and/or a wireless system comprising one or more of a code division multiple access (CDMA or CDMA2000) communication system, a frequency division multiple access (FDMA) system, a time division multiple access (TDMA) system such as GSM/GPRS (General Packet Radio Service)/EDGE (Enhanced Data GSM Environment), a TETRA (Terrestrial Trunked Radio) mobile telephone system, a wideband code division multiple access (WCDMA) system, a high data rate (1xEV-DO or 1xEV-DO Gold Multicast) system, an IEEE 802.11 system, a MediaFLO system, a DMB system, an orthogonal frequency division multiple access (OFDM) system, or a DVB-H system.

FIG. 2A is a functional block diagram illustrating an aspect of the decoder device 150 that may be used in a system such as the system 100 shown in FIG. 1. In this aspect, the decoder 150 includes a receiver element 202, a multimedia sample determiner element 204, a transform coefficient determiner element 206, a reconstructed sample generator element 208, and a multimedia concealment estimator element 210.

The receiver 202 receives encoded video data (e.g., data encoded by the encoder 110 of FIG. 1). The receiver 202 may receive the encoded data over a wired or wireless network such as the network 140 of FIG. 1. In one aspect, the received data includes transform coefficients representing source multimedia data. The transform coefficients are transformed into a domain in which the correlation of neighboring samples is significantly reduced. For example, images typically exhibit a high degree of spatial correlation in the spatial domain. The transform coefficients, on the other hand, are typically orthogonal to each other and exhibit zero correlation. Some examples of transforms that can be used for multimedia data include, but are not limited to, the DCT (Discrete Cosine Transform), the DFT (Discrete Fourier Transform), the Hadamard (or Walsh-Hadamard) transform, discrete wavelet transforms, the DST (Discrete Sine Transform), the Haar transform, the Slant transform, the KL (Karhunen-Loeve) transform, and integer transforms such as the one used in H.264. The transforms are used to transform matrices or arrays of multimedia samples. Two-dimensional matrices are commonly used, but one-dimensional arrays may also be used. The received data also includes information indicating how the encoded blocks were encoded. Such information may include inter-coding reference information such as motion vectors and frame sequence numbers, and intra-coding reference information including block sizes and spatial prediction directivity indicators, among others. Some received data includes quantization parameters indicating how each transform coefficient was rounded, non-zero indicators indicating how many transform coefficients in the transformed matrix are non-zero, and the like.

The multimedia sample determiner 204 determines which multimedia samples are to be reconstructed. In one aspect, the multimedia sample determiner 204 determines neighboring multimedia samples or pixels that are close to, and/or in a border region of, multimedia data that has been lost and may be concealed. In one example, the multimedia sample determiner identifies pixels adjacent to the border of a slice or other group of blocks in which some of the data has been lost due to errors or channel loss. In some examples, the multimedia sample determiner 204 identifies the smallest number of pixels associated with reconstructing neighboring blocks that are spatially predicted from the determined pixels. For example, compressed multimedia data may include blocks of transform coefficients resulting from the transformation of individual blocks (e.g., 8×8 pixel blocks and/or 4×4 pixel blocks) or matrices. The multimedia sample determiner 204 can identify a specific subset of the multimedia samples of a transformed block to be reconstructed, so that they can be used to conceal the lost data or to reconstruct other encoded multimedia samples in other blocks that are predicted from those samples. The determined multimedia samples may include non-causal samples and/or causal samples.
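One simple way to picture the sample determination step is to select only the row or column of a block that touches the lost region. The Python sketch below is an assumption made for illustration, not the selection logic of the determiner 204; the function name `border_samples` and the `lost_side` labels are hypothetical.

```python
def border_samples(n, lost_side):
    """Return the (row, col) positions in an n x n block that lie on the
    edge facing the lost region.

    lost_side says where the lost data sits relative to this block:
    "above", "below", "left", or "right".
    """
    if lost_side == "above":
        return [(0, c) for c in range(n)]
    if lost_side == "below":
        return [(n - 1, c) for c in range(n)]
    if lost_side == "left":
        return [(r, 0) for r in range(n)]
    if lost_side == "right":
        return [(r, n - 1) for r in range(n)]
    raise ValueError("unknown lost_side: %r" % (lost_side,))


# A non-causal 4x4 block directly below a lost slice: only its top row
# borders the lost data, so 4 of its 16 samples are candidates.
targets = border_samples(4, "above")
```

Restricting reconstruction to such a subset is what makes the later steps cheaper than a full inverse transform of every block.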

The transform coefficient determiner 206 determines the set of transform coefficients to be used to reconstruct some or all of the multimedia samples that the multimedia sample determiner 204 determined are to be reconstructed. The determination of which transform coefficients to use depends on the encoding method that was used to generate the transform coefficients. The determination also depends on which multimedia samples are being reconstructed and on whether there are transform coefficients with zero values (which removes any need to use them). Details of which transform coefficients may be sufficient to reconstruct the multimedia samples are described below.
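The zero-value case mentioned above can be sketched as follows in Python. This covers only that one criterion, under the assumption that a zero coefficient contributes nothing to any sample; the dependence on the encoding method and on the target samples is not modeled, and `nonzero_coefficients` is a hypothetical name.

```python
def nonzero_coefficients(coeffs):
    """Return (i, j, value) for every non-zero entry of a coefficient block.

    Zero-valued coefficients scale their basis image by zero, so they can
    be skipped when reconstructing samples.
    """
    return [(i, j, v)
            for i, row in enumerate(coeffs)
            for j, v in enumerate(row)
            if v != 0]


# A typical quantized 4x4 block: a few low-frequency coefficients survive.
X = [[8, -3, 0, 0],
     [2, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
needed = nonzero_coefficients(X)
```

Here only 3 of the 16 coefficients need to be processed, regardless of which samples are being reconstructed.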

The reconstructed sample generator 208 reconstructs multimedia samples based on the samples determined by the multimedia sample determiner 204. The set of reconstructed samples can be a full set, such as an entire N×N matrix of samples (where N is an integer). The set of samples can also be a subset of samples from an N×N matrix, such as a row, a column, part of a row or column, a diagonal, and so on. The reconstructed sample generator 208 uses the transform coefficients determined by the transform coefficient determiner 206 in reconstructing the samples. The reconstructed sample generator 208 also uses information based on the encoding method that was used to encode the transform coefficients in reconstructing the multimedia samples. Details of the operations performed by the reconstructed sample generator 208 are described below.
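The idea of reconstructing only a subset of an N×N block can be illustrated with a separable orthonormal transform. The Python sketch below uses a floating-point DCT-II as a stand-in, which is an assumption made for clarity; it is not the H.264 integer transform and it omits dequantization. Each requested sample is computed as a sum of non-zero coefficients, each scaling the value of its basis image at that position. The function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix A, so that X = A * x * A^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows


def reconstruct_samples(coeffs, positions):
    """Reconstruct only the samples at the given (row, col) positions.

    x[r][c] = sum over (i, j) of X[i][j] * A[i][r] * A[j][c], where
    A[i][r] * A[j][c] is the value of basis image (i, j) at (r, c).
    Zero-valued coefficients are skipped.
    """
    n = len(coeffs)
    a = dct_matrix(n)
    nonzero = [(i, j, coeffs[i][j])
               for i in range(n) for j in range(n) if coeffs[i][j] != 0]
    return {(r, c): sum(v * a[i][r] * a[j][c] for i, j, v in nonzero)
            for (r, c) in positions}


# Only the DC coefficient is non-zero; reconstruct just the top row.
X = [[8, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
top_row = reconstruct_samples(X, [(0, c) for c in range(4)])
```

The cost scales with the number of requested samples times the number of non-zero coefficients, rather than with the full N×N inverse transform.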

The multimedia concealment estimator 210 uses the reconstructed samples calculated by the reconstructed sample generator 208 to form concealment multimedia samples that replace or conceal regions of multimedia data that were lost or altered by errors during transmission or reception. In one aspect, the multimedia concealment estimator 210 uses the reconstructed sample values to form the concealment multimedia data. In another aspect, the multimedia concealment estimator 210 uses the reconstructed sample values and a received spatial prediction directivity mode indicator in estimating the multimedia concealment data. Further details of spatial error concealment can be found in Application No. 11/182,621 (U.S. Patent Publication No. 2006/0013320), "METHODS AND APPARATUS FOR SPATIAL ERROR CONCEALMENT", which is assigned to the assignee hereof.

In some aspects, one or more of the elements of the decoder 150 of FIG. 2A may be rearranged and/or combined. The elements may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. Details of the operations performed by the elements of the decoder 150 are described below with reference to the methods shown in FIGS. 3 and 4.

FIG. 2B is a block diagram illustrating an example of a computer processor system of a decoder device that may be used in a system such as that shown in FIG. 1. The decoder device 150 of this example includes a pre-processor element 220, a RAM element 222, a digital signal processor (DSP) element 224, and a video core element 226.

In one aspect, the pre-processor 220 is used to perform one or more of the operations performed by the various elements of FIG. 2A. The pre-processor parses the video bitstream and writes the data to the RAM 222. In one aspect, the pre-processor 220 also implements the operations of the multimedia sample determiner 204, the transform coefficient determiner 206, the reconstructed sample generator 208, and the multimedia concealment estimator 210. By performing these more efficient, less computationally intensive operations in the pre-processor 220, the more computationally intensive video decoding can be done in causal order in the highly efficient video core 226.

The DSP 224 retrieves the parsed video data stored in the RAM 222 and reorganizes it for processing by the video core 226. The video core 226 performs inverse quantization (also known as rescaling or scaling), inverse transform, and deblocking functions, as well as other video decompression functions. The video core is typically implemented in a highly optimized and pipelined fashion. Because of this, video data is decoded fastest when it is decoded in causal order. Performing the out-of-order reconstruction of multimedia samples and the subsequent spatial concealment in the pre-processor preserves the causal order for decoding in the video core, which allows improved overall decoding performance.

FIG. 3 is a flowchart illustrating an example of a method of decoding a portion of a video stream in a system such as that shown in FIG. 1. Method 300 may be performed by a decoder device such as the examples shown in FIGS. 2A and 2B. Method 300 enables reconstruction of selected multimedia samples. Method 300 may be used to reconstruct multimedia samples in causal order, where other encoded multimedia data is predicted from the causal data and may require reconstruction of the causal data before being reconstructed itself. Method 300 may also be used to reconstruct multimedia samples in non-causal order. In one aspect, the non-causal data is reconstructed in a way that allows subsequent reconstruction of all of the multimedia data (both causal and non-causal) in a more efficient and timely manner.

Method 300 begins at block 305, where the decoder device receives transform coefficients associated with a multimedia data bitstream. The decoder device may receive the transform coefficients over a wired and/or wireless network, such as the network 140 shown in FIG. 1. The transform coefficients may represent multimedia samples that include color and/or brightness parameters, such as chrominance and luminance, respectively. The transforms used to generate the transform coefficients may include, but are not limited to, the DCT (discrete cosine transform), the DFT (discrete Fourier transform), the Hadamard (or Walsh-Hadamard) transform, discrete wavelet transforms, the DST (discrete sine transform), the Haar transform, the slant transform, the KL (Karhunen-Loève) transform, and integer transforms such as the one used in H.264. The multimedia samples may have been transformed in groups, such as one-dimensional arrays and/or two-dimensional matrices, when the transform coefficients were generated during encoding. The transform coefficients may be intra-coded, with or without spatial prediction. Where spatial prediction was used in generating the transform coefficients, the transform coefficients may represent a residual value, that is, the error of the predictor supplied by a reference value. The transform coefficients may be quantized. The transform coefficients may be entropy encoded. The receiver element 202 of FIG. 2A may perform the acts at block 305.

After receiving the transform coefficients, method 300 proceeds to block 310, where the decoder device determines a set of multimedia samples to be reconstructed. The multimedia samples to be reconstructed may include luminance (luma) and chrominance (chroma) samples. In some examples, the set of multimedia samples to be reconstructed is determined in response to a loss of synchronization while decoding the multimedia bitstream received at block 305. The loss of synchronization may be caused by the loss or erroneous reception of some or all of the encoded data corresponding to multimedia samples contained in a first slice of macroblocks. The multimedia samples determined to be reconstructed may be contained in a second slice of macroblocks. The second slice of macroblocks borders at least part of the lost portion of the first slice of macroblocks. As described above, the determined multimedia samples may be causal or non-causal with respect to the lost portion of the multimedia samples.

In one aspect, the multimedia samples determined to be reconstructed at block 310 may enable reconstruction of other multimedia samples that border the lost portion of the multimedia data being concealed. For example, intra-coded macroblocks at the bottom of another slice of macroblocks may be spatially predicted with respect to the set of multimedia samples determined to be reconstructed at block 310. Thus, by reconstructing the determined set of multimedia samples, which correlates strongly with the intra-coded blocks, the intra-coded blocks themselves can be reconstructed through the concealment process. In another aspect, the multimedia samples determined to be reconstructed at block 310 may include samples located at or near a slice border. The samples to be reconstructed may include an entire matrix of associated multimedia samples that were transformed as a group during encoding. The samples to be reconstructed may also include part of a matrix of associated multimedia samples, such as a row, a column, a diagonal, or portions and/or combinations thereof. The multimedia sample determiner 204 of FIG. 2A may perform the acts at block 310. Details of the subsets of multimedia samples that may be reconstructed are described below.

Method 300 proceeds to block 315, where the decoder device determines a set of transform coefficients associated with the multimedia samples determined to be reconstructed at block 310. The determination of which transform coefficients to use for reconstruction depends on the encoding method that was used to generate the transform coefficients. It also depends on which multimedia samples are being reconstructed. For example, it may be determined that the entire set of multimedia samples determined at block 310 is to be reconstructed, or alternatively that only a subset is to be reconstructed. The determination at block 315 further depends on whether any of the transform coefficients have a value of zero, since a zero-valued coefficient does not need to be used. Details of which transform coefficients may suffice to reconstruct the multimedia samples are described below. The transform coefficient determiner 206 of FIG. 2A may perform the acts at block 315.

After determining the set of multimedia samples to be reconstructed at block 310, and after determining the set of transform coefficients associated with those samples at block 315, method 300 proceeds to block 320. At block 320, the decoder device processes the determined set of transform coefficients to generate reconstructed multimedia samples. The processing performed depends on the encoding methods that were used to generate the transform coefficients. The processing includes an inverse transform of the transform coefficients, and may further include other acts including, but not limited to, entropy decoding and inverse quantization (also called rescaling or scaling). Details of examples of the processing performed at block 320 are described below with reference to FIG. 4.

In some example systems, some or all of the acts of method 300 are performed in a pre-processor such as the pre-processor 220 shown in FIG. 2B. It should be noted that some of the blocks of method 300 may be combined, omitted, rearranged, or any combination thereof.

FIG. 4 is a flowchart illustrating in more detail another example of a method of decoding a portion of a video stream in a system such as that shown in FIG. 1. Example method 400 includes all of the operations performed at blocks 305 through 320 of method 300. Blocks 305, 310, and 315 remain unchanged from the examples shown in FIG. 3 and described above. Block 320 of method 300, where the transform coefficients are processed to generate reconstructed samples, is shown in more detail in method 400, where it comprises four blocks 405, 410, 420, and 425. Method 400 also includes additional blocks: block 430, where concealment multimedia samples are estimated, and block 435, where transform coefficients are generated based on the estimated concealment multimedia samples.

The decoder device performs the operations at blocks 305, 310, and 315 in a manner similar to that described above. The detailed example of block 320 shows that the transform coefficients are associated with basis images in order to reconstruct the multimedia samples efficiently. At block 405, the decoder device partitions the transform coefficients into groups, where the groups of transform coefficients are associated with the multimedia samples determined to be reconstructed at block 310. In one aspect, each group comprises the transform coefficients that modify (or weight) a common basis image during the inverse transform process of the reconstruction. Details of how the transform coefficients are partitioned into groups are described below with respect to an example using H.264.

At block 410, the decoder device calculates a weight value associated with each partitioned group, based on the encoding method that generated the coefficients. In one aspect, the weight is the sum of the scaled transform coefficients of the group. The scaling replicates the inverse transform characteristics of the encoding method. Examples of scaling and of calculating the weight values are described below with respect to the H.264 example.

At block 420, a basis image is determined for each of the groups based on the transform method of the encoding. The basis images are typically two-dimensional orthogonal matrices, although one-dimensional arrays may also be used. Only portions of the two-dimensional basis images are used, and which portion is used depends on which multimedia samples are being reconstructed (as determined at block 310). The values calculated for each group at block 410 are used to modify (or weight) the associated basis images at block 425. By combining all of the weighted basis images, the multimedia samples are reconstructed at block 425. Details of blocks 420 and 425 are described below with respect to the H.264 example.
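The grouping, weighting, and partial-basis-image steps of blocks 405 through 425 can be sketched as follows. This is a minimal illustration under stated assumptions, not the H.264 procedure itself: it assumes an orthonormal 4×4 DCT as the transform, and the names `dct_matrix`, `full_inverse`, and `reconstruct_column` are hypothetical. Coefficients in the same row of the coefficient matrix are treated as one group because, for a single output column, they all weight the same one-dimensional slice of a basis image.

```python
import math

N = 4  # block size assumed for this sketch


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix; row u holds the u-th 1-D basis vector."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                     for i in range(n)])
    return rows


A = dct_matrix()


def full_inverse(coeffs):
    """Reference 2-D inverse: x[i][j] = sum_u sum_v Y[u][v] * A[u][i] * A[v][j]."""
    return [[sum(coeffs[u][v] * A[u][i] * A[v][j]
                 for u in range(N) for v in range(N))
             for j in range(N)]
            for i in range(N)]


def reconstruct_column(coeffs, col):
    """Rebuild only column `col` of the block from weighted partial basis images."""
    out = [0.0] * N
    for u in range(N):                      # one group per coefficient row
        group = coeffs[u]
        if not any(group):
            continue                        # all-zero group: nothing to add
        # Weight of the group: sum of its coefficients, each scaled by the
        # basis value at the wanted column.
        weight = sum(group[v] * A[v][col] for v in range(N))
        # Partial basis image: the 1-D slice A[u][.] that the group shares.
        for i in range(N):
            out[i] += weight * A[u][i]
    return out
```

Groups whose coefficients are all zero are skipped, so the work for one column shrinks with the number of nonzero coefficient rows rather than staying at the cost of a full inverse transform.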

After generating the reconstructed multimedia samples, method 400 proceeds to block 430, where, in some examples, the decoder device estimates concealment multimedia samples based on the reconstructed samples. In one aspect, the reconstructed sample values of the multimedia samples are used to form the concealment multimedia data. In another aspect, the reconstructed sample values and a received spatial prediction directivity mode indicator are used to form the multimedia concealment data. Further details on spatial error concealment can be found in Application No. 11/182,621 (U.S. Patent Publication No. 2006/0013320), "METHODS AND APPARATUS FOR SPATIAL ERROR CONCEALMENT", assigned to the assignee hereof.

In some examples, the estimated concealment multimedia samples are used directly and inserted into a frame buffer that contains reconstructed data of the same frame, to be displayed later. In other examples, the estimated concealment multimedia samples are transformed at block 435, in a manner that replicates the encoding process, to generate transform coefficients representing the estimated concealment multimedia samples. These transform coefficients are then inserted into the undecoded (still encoded) bitstream as if they were normally encoded samples. The entire bitstream can then be forwarded to a video decoder core, such as the video core 226 of FIG. 2B, to be decoded. In these examples, all or part of method 400 may be performed in a pre-processor such as the pre-processor 220 of FIG. 2B. This way of performing reconstruction and concealment estimation is particularly useful for reconstructing non-causal portions that are later used to conceal other portions of multimedia data lost to channel errors. Details of the methods used to improve the efficiency of reconstructing multimedia samples will now be described with respect to an H.264 encoded multimedia bitstream.

High-Efficiency Partial Intra Decoding in an H.264 Bitstream

H.264 uses spatial prediction, which exploits the spatial correlation between neighboring blocks of pixels. The spatial prediction modes use the causal neighbors to the left of and above a 4×4, 8×8, or 16×16 pixel block for spatial prediction. H.264 provides two modes of spatial prediction for luminance values, one for 4×4 pixel blocks (referred to herein as intra-4×4 coding) and the other for 16×16 pixel macroblocks (referred to herein as intra-16×16 coding). Note that other causal and non-causal neighboring samples may also be used for spatial prediction.

FIG. 5 shows a detailed view of a 4×4 pixel block 502 and the causal neighboring pixels 504 to the left of and above it. During the H.264 encoding process, for example, the causal neighboring pixels 504 are used to generate various predictors, values, and/or parameters that describe the pixels of block 502. Block 502 contains pixels p0 through p15, and the causal neighboring pixels 504 are identified using the reference indicators n3, n7, n11, n12, n13, n14, and n15, where the numbers correspond to the similar positions of the pixels of block 502.

The spatial prediction modes provided in H.264 use various directivity modes to spatially predict block 502 from the various causal neighboring pixels 504. FIG. 6 shows a directivity mode diagram 600 illustrating the nine directivity modes (0 through 8) used to describe the directivity characteristic of an intra-coded block in H.264. The nine directivity modes (or indicators) are used to describe the directivity characteristic of the spatial prediction of block 502. For example, mode 0 describes a vertical directivity characteristic, mode 1 describes a horizontal directivity characteristic, and mode 2 describes a DC characteristic, in which the average value of the available causal neighboring pixels is used as the reference for prediction. In DC mode, the causal neighboring pixels that lie within the same slice (immediately above and to the left of the 4×4, 8×8, or 16×16 pixel block) are used in calculating the average. For example, if the block being encoded borders another slice above it, only the pixels to its left are averaged. If the block being encoded borders other slices both to the left and above, a value of 128 (half of the 8-bit range of values provided in H.264) is used as the DC average. The modes shown in the directivity mode diagram 600 are used in the H.264 encoding process to generate the prediction values for block 502.
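The DC-mode rule described above, for a 4×4 block of 8-bit samples, might be sketched as follows. The function name `dc_predictor_4x4` and its argument convention (passing `None` for neighbors that lie in another slice) are assumptions made for illustration; the rounding offsets implement an ordinary rounded integer average.

```python
def dc_predictor_4x4(top=None, left=None):
    """Return the DC prediction value for a 4x4 block of 8-bit samples.

    `top` and `left` are lists of the four neighboring pixel values, or None
    when those neighbors belong to another slice (or lie outside the frame).
    """
    if top is not None and left is not None:
        return (sum(top) + sum(left) + 4) >> 3   # rounded mean of 8 pixels
    if left is not None:
        return (sum(left) + 2) >> 2              # rounded mean of 4 left pixels
    if top is not None:
        return (sum(top) + 2) >> 2               # rounded mean of 4 top pixels
    return 128                                   # mid-point of the 8-bit range


# A block just below a slice boundary sees only its left neighbors.
left_only = dc_predictor_4x4(left=[100, 102, 104, 106])
```

Every pixel of the predicted block takes this single value; the block-specific detail comes entirely from the residual.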

In the intra-4×4 coding of H.264, luminance values can be encoded with respect to the pixels to the left of and above the 4×4 block using any of the nine directivity modes. In intra-16×16 coding, luminance values can be encoded with respect to the pixels to the left of and above the entire 16×16 pixel block using four modes: i) vertical (mode 0), ii) horizontal (mode 1), iii) DC (mode 2), and iv) planar (mode 3). In planar prediction mode, the luminance values are assumed to vary spatially and smoothly across the macroblock, and the reference is formed based on a plane equation. For chrominance there is one prediction mode, namely 8×8. In intra-8×8 chrominance coding, the 8×8 block can be predicted with the same modes used in intra-16×16 coding: i) vertical (mode 0), ii) horizontal (mode 1), iii) DC (mode 2), and iv) planar (mode 3). Details of reconstructing predicted blocks encoded in H.264 will now be described.

The reconstructed signal in a predictive (intra or inter) coded 4×4 (luminance or chrominance) block can be expressed as

$$r = p + \tilde{\Delta}$$

where $r$, $p$, and $\tilde{\Delta}$ denote, respectively, the reconstructed signal (an approximation of the original uncompressed signal $s$), the prediction signal, and the compressed residual signal (an approximation of the original uncompressed residual signal $\Delta = s - p$, where $s$ is the original signal), all of which are integer-valued 4×4 matrices in this example. The residual values $\tilde{\Delta}$ can be reconstructed by an inverse transform of the transform coefficients. The prediction values $p$ are obtained from the causal neighboring pixels according to the spatial prediction mode used to encode them.
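As a small worked illustration of adding a prediction signal and a decoded residual, the sketch below forms a horizontal (mode 1) prediction from four left neighbors and adds a 4×4 residual. The helper names are hypothetical, the residual values are made up for the example, and the clipping to the 8-bit range is an assumption about the final pixel format rather than something stated in the text above.

```python
def clip_pixel(value, bit_depth=8):
    """Clamp a value to the valid sample range (assumed 8-bit here)."""
    return max(0, min((1 << bit_depth) - 1, value))


def horizontal_prediction(left_neighbors):
    """Mode 1: every pixel in a row copies that row's left neighbor."""
    size = len(left_neighbors)
    return [[value] * size for value in left_neighbors]


def reconstruct_block(prediction, residual):
    """Element-wise sum of prediction and residual, followed by clipping."""
    return [[clip_pixel(p + d) for p, d in zip(p_row, d_row)]
            for p_row, d_row in zip(prediction, residual)]


prediction = horizontal_prediction([100, 110, 120, 250])
residual = [[1, -1, 0, 2],
            [0, 0, 0, 0],
            [-5, 5, -5, 5],
            [10, 10, 10, 10]]      # illustrative values only
block = reconstruct_block(prediction, residual)
```

The last row shows why clipping matters in this sketch: 250 + 10 would otherwise exceed the 8-bit maximum.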

The following observations bear on the reconstruction of pixels in intra-4×4 coded macroblocks located immediately below a slice boundary (non-causal neighbors in H.264). In a 16×16 macroblock, the blocks of interest are the uppermost four 4×4 blocks, which lie immediately below the slice boundary. For example, the blocks with indices b0, b1, b4, and b5 in the 16×16 pixel macroblock shown in FIG. 9 are the blocks immediately below slice boundary AA'.

FIG. 7 shows one aspect of an intra-4×4 coded block immediately below a slice boundary. Line AA' represents the slice boundary mentioned above, and the 4×4 block 702 is the current block being reconstructed. The nine neighboring pixels 704 above slice boundary line AA', which would normally be usable for spatial prediction in intra-4×4 coding, are not available because they lie on the other side of the slice boundary and therefore belong to another slice. Spatial prediction, like any other predictive coding dependency across slice boundaries, is not permitted in H.264 because slices serve as resynchronization points.

FIG. 8 shows the labeling of the pixels and neighboring pixels of an intra-4×4 coded block. Because the pixels above slice boundary AA' are not available for spatial prediction, the neighboring pixels of block 702 that are available for prediction are the pixels {I, J, K, L}. This means that the intra-4×4 prediction modes permitted for the 4×4 block 702 are i) mode 1 (horizontal), ii) mode 2 (DC), and iii) mode 8 (horizontal-up). If line BB' of FIG. 7 represents another slice boundary, neither the pixels {I, J, K, L} nor the pixels {M, A, B, C, D, E, F, G, H} are available for spatial prediction. In that case, the only permitted intra-4×4 prediction mode is mode 2 (DC), with a reference value of 128 for all of the pixels of block 702.
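The dependence of the permitted intra-4×4 modes on neighbor availability can be summarized in a short sketch. The function name `allowed_intra4x4_modes` is hypothetical. The left-only and no-neighbor cases come from the text above; the top-only and both-available cases are filled in from the general H.264 mode definitions and assume that the top-left pixel is available whenever both the top and left neighbors are.

```python
INTRA4X4_MODE_NAMES = {
    0: "vertical", 1: "horizontal", 2: "DC",
    3: "diagonal down-left", 4: "diagonal down-right",
    5: "vertical-right", 6: "horizontal-down",
    7: "vertical-left", 8: "horizontal-up",
}


def allowed_intra4x4_modes(left_available, top_available):
    """Return the sorted list of intra-4x4 modes usable with the given neighbors."""
    modes = {2}                       # DC is always usable (falls back to 128)
    if left_available:
        modes |= {1, 8}               # modes that read only the left column
    if top_available:
        modes |= {0, 3, 7}            # modes that read only the top row(s)
    if left_available and top_available:
        modes |= {4, 5, 6}            # modes that read top, left and top-left
    return sorted(modes)


# Block 702 directly below slice boundary AA', left neighbors I..L present:
below_boundary = allowed_intra4x4_modes(left_available=True, top_available=False)
```

A decoder that knows a block sits just under a slice boundary can use such a table to sanity-check a received mode indicator before attempting partial reconstruction.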

Thus, in the most general case, the information needed to decode and reconstruct some or all of the pixels of an intra-4×4 coded block located immediately below a slice boundary includes:

1. the intra-4×4 prediction mode indicator;

2. the residual information (quantized transform coefficients); and

3. the values of the four neighboring pixels {I, J, K, L in FIG. 8} located immediately to the left of the 4×4 block.

This sufficient data set enables reconstruction of all the pixel values {a, b, c, ..., n, o, p in FIG. 8} of the current 4×4 block. The same data set is also sufficient to reconstruct just the values of the pixel subset {d, h, l, p}, which may in turn be used to reconstruct the next 4×4 block located immediately to the right.
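The chaining of the rightmost column from one 4×4 block to the next can be sketched as follows. This is an illustration only: it assumes every block in the run uses horizontal prediction (mode 1), that the rightmost residual column of each block has already been obtained (for example by a partial inverse transform), and that samples are 8-bit. The function name `propagate_right_columns` is hypothetical.

```python
def clip8(value):
    """Clamp to the 8-bit sample range assumed in this sketch."""
    return max(0, min(255, value))


def propagate_right_columns(left_neighbors, residual_right_columns):
    """Reconstruct only the rightmost column {d, h, l, p} of successive blocks.

    left_neighbors: the four pixels {I, J, K, L} left of the first block.
    residual_right_columns: one 4-element residual column per block,
    ordered left to right.
    """
    current = list(left_neighbors)
    columns = []
    for residual_column in residual_right_columns:
        # Under horizontal prediction each row is predicted by its left
        # neighbor, so the rightmost pixel is that neighbor plus the residual.
        current = [clip8(p + d) for p, d in zip(current, residual_column)]
        columns.append(current)    # becomes {I, J, K, L} of the next block
    return columns


columns = propagate_right_columns(
    [100, 110, 120, 130],
    [[1, 2, 3, 4], [-1, -2, -3, -4], [200, 0, 0, 0]],
)
```

Only four pixels per block are carried forward, which is why the rest of each block need not be reconstructed to keep the chain going.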

The following observations bear on the reconstruction of intra-16×16 coded macroblocks located immediately below a slice boundary (non-causal neighbors in H.264). Here again, the blocks of interest are the uppermost four 4×4 blocks (that is, those with block indices b0, b1, b4, and b5 in FIG. 9) of an intra-16×16 coded macroblock located immediately below the slice boundary.

FIG. 9 shows one aspect of an intra-16×16 coded macroblock located below a slice boundary. Line AA' represents the slice boundary mentioned above, and the four 4×4 blocks labeled b0, b1, b4, and b5 make up the portion of the 16×16 macroblock under consideration for reconstruction. The 17 neighboring pixels above line AA', which would normally be usable for intra-16×16 spatial prediction, are not available because they lie on the other side of the slice boundary and therefore belong to another slice. In this example, the potential availability of the 16 neighboring pixels located immediately to the left of line BB' means that the intra-16×16 spatial prediction modes permitted for the current macroblock are i) mode 1 (horizontal) and ii) mode 2 (DC). If neither the 16 neighboring pixels immediately to the left of line BB' nor the 17 pixels above line AA' are available, for example when line BB' represents another slice boundary (or the left boundary of the video frame), the only permitted intra-16×16 prediction mode is mode 2 (DC).

If the current macroblock is encoded using intra-16×16 prediction mode 1 (horizontal), the topmost four neighboring pixels located immediately to the left of line BB' and immediately below line AA' are sufficient to decode and reconstruct the topmost four 4×4 blocks in the current 16×16 macroblock. This is consistent with the framework described above, which enables decoding of the topmost four 4×4 blocks in an intra-4×4 coded macroblock.

However, if the current macroblock is encoded using intra-16×16 spatial prediction mode 2 (DC), and it is neither immediately to the right of a slice boundary nor at the left frame boundary, all 16 neighboring pixels immediately to the left of line BB' are used to decode and reconstruct the topmost four 4×4 blocks in the current MB (as well as all the others). This is an undesirable situation. In one aspect, it is beneficial to avoid encoding with intra-16×16 spatial prediction mode 2 (DC) immediately below a slice boundary. It is preferable that the topmost four neighboring pixels can be used for reconstructing the pixels below the slice boundary (for example, pixels I, J, K, and L of FIG. 8).

In one aspect, intra-16×16 coding of macroblocks located immediately below a slice boundary should be restricted to spatial prediction mode 1 (horizontal), unless they are located immediately to the right of a slice boundary or at the left frame boundary. This allows computationally efficient reconstruction of the rightmost four pixels of all the topmost 4×4 blocks in the row. This in turn allows computationally efficient reconstruction of the topmost four pixels of all the topmost 4×4 blocks in the row.
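As a non-limiting illustration, the mode restriction described above may be sketched as a small encoder-side helper. The function name and the `restrict_dc` flag are hypothetical and are not taken from the disclosure; the mode numbers follow the text (1 = horizontal, 2 = DC).

```python
HORIZONTAL, DC = 1, 2  # intra-16x16 mode numbers as used in the text


def allowed_intra16_modes(left_available, restrict_dc=True):
    """Candidate intra-16x16 modes for a macroblock directly below a slice boundary.

    The pixels above are assumed unavailable (they belong to another slice).
    left_available: True when the 16 pixels to the left of the macroblock exist.
    restrict_dc:    True applies the restriction of this aspect (horizontal only).
    """
    if not left_available:
        # Immediately right of a slice boundary or at the left frame boundary:
        # DC is the only mode that needs no neighbors.
        return [DC]
    if restrict_dc:
        return [HORIZONTAL]
    # Without the restriction, both modes are permitted.
    return [HORIZONTAL, DC]
```

With the restriction enabled, a decoder needs only the four topmost left neighbors to reconstruct the top row of 4×4 blocks, which is the property the paragraph above relies on.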

FIG. 10 shows an aspect of an 8×8 chroma block located immediately below a slice boundary. Line AA' marks the slice boundary, and the two 4×4 blocks immediately below line AA' and to the right of line BB' constitute data for one of the two chroma channels (Cr and Cb). The nine neighboring pixels above the slice boundary line AA' are not available for spatial prediction in this example, because they lie on the other side of the slice boundary and belong to another slice. The availability of the eight neighboring pixels immediately to the left of line BB' means that the chroma channel intra prediction modes allowed for the current MB are limited to i) mode 0 (DC) and ii) mode 1 (horizontal). When line BB' is also a slice boundary or the left boundary of the video frame, neither the eight neighboring pixels immediately to the left of line BB' nor the nine pixels immediately above line AA' are available for spatial prediction. In this case, the only allowed chroma channel intra prediction mode is mode 0 (DC).

When the chroma channels of the current intra-coded macroblock are encoded using the intra-8×8 chroma horizontal prediction mode, the topmost four neighboring pixels immediately to the left of line BB' may be needed to decode and reconstruct the topmost two 4×4 chroma blocks in the current MB. Note that two 8×8 chroma blocks correspond to one 16×16 luma macroblock.

Likewise, when the chroma channels (Cr and Cb) of the current intra-coded macroblock are encoded using intra-8×8 chroma prediction mode 0 (DC), the availability of the eight neighboring pixels immediately to the left of line BB' is adequate for decoding and reconstructing the topmost two 4×4 blocks. This is again consistent with the framework described above.

In one aspect, intra-8×8 coding of the chroma channels (Cr and Cb) of intra-coded macroblocks located immediately below a slice boundary should be restricted to spatial prediction mode 1 (horizontal), unless they are located immediately to the right of a slice boundary or at the left frame boundary. This allows computationally efficient reconstruction of the rightmost four pixels of all the topmost 4×4 blocks in the row. This in turn allows computationally efficient reconstruction of the topmost four pixels of all the topmost 4×4 blocks in the row. This is consistent with the framework described above, which enables decoding of the topmost 4×4 blocks in the chroma channels of intra-coded macroblocks (both intra-4×4 coded macroblocks with restrictions and intra-16×16 coded macroblocks with the restriction on the intra-16×16 DC spatial prediction mode, as described above).

Efficient partial decoding of intra-coded samples in H.264

Partial decoding of the four rightmost pixels of a 4×4 pixel block has been shown to allow decoding of some and/or all of the pixels of intra-coded blocks to the right of the initial 4×4 block. The problem of efficiently decoding the fourth, that is, the last, column of the residual component of a 4×4 intra-coded block, which contributes to the reconstruction of the final pixel values at positions {d, h, l, p} of FIG. 8, is addressed next. This example uses the basis images of the H.264 integer transform. It should be noted, however, that the basis images of other transforms can be manipulated in a similar manner to allow similarly efficient partial decoding. Other transforms that may be partially decoded using these methods include, but are not limited to, the DCT (discrete cosine transform), DFT (discrete Fourier transform), Hadamard (or Walsh-Hadamard) transform, discrete wavelet transform, DST (discrete sine transform), Haar transform, slant transform, and KL (Karhunen-Loève) transform.

In general, the forward transform of an N×N matrix [Y] of multimedia samples using a transform matrix [T], resulting in a transform coefficient matrix [w], takes the form of equation (3).

The corresponding inverse transform for reconstructing the multimedia sample matrix [Y] takes the form of equation (4).

The transforms expressed by equations (3) and (4) can each be regarded as two one-dimensional (1D) transforms that together produce a two-dimensional (2D) transform. For example, the matrix multiplication [Y][T] can be regarded as a 1D row transform, and the matrix multiplication [T]^T[Y] as a 1D column transform. Their combination forms the 2D transform. Another way to think about the 2D transform of an N×N matrix [Y] is as N² inner products of [Y] with the 2D basis images corresponding to the 2D transform characterized by the transform matrix [T], which yield a set of N² values identical to the set of transform coefficients.
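As a hedged illustration only, the separable structure described above can be written out under the assumption that [T] is orthonormal. The text does not state this, and for the H.264 integer transform it holds only up to scaling. The column vectors t_i of [T] and the basis images [B_ij] are notation introduced here for the sketch.

```latex
% Assumed orthonormal [T]; t_i denotes the i-th column of [T].
\begin{align*}
[w] &= [T]^{T}\,\bigl([Y]\,[T]\bigr)
      && \text{1D row transform, then 1D column transform} \\
[Y] &= [T]\,[w]\,[T]^{T}
      && \text{inverse, valid when } [T]^{T}[T] = I \\
w_{ij} &= t_i^{T}\,[Y]\,t_j
        = \sum_{r,c} Y_{rc}\,\bigl(t_i\,t_j^{T}\bigr)_{rc}
      && \text{inner product of } [Y] \text{ with } [B_{ij}] = t_i\,t_j^{T} \\
[Y] &= \sum_{i,j} w_{ij}\,[B_{ij}]
      && \text{synthesis from weighted basis images}
\end{align*}
```

The last line is the form exploited for partial decoding: a single row or column of [Y] needs only the matching row or column of each basis image.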

The basis images of a given transform [T] can be computed by setting one of the transform coefficients to 1 and all others to 0, and taking the inverse transform of the resulting coefficient matrix. For example, with a 4×4 transform coefficient matrix [w] in which the coefficient w₁₁ is set to 1 and all others are set to 0, applying equation (4) with the H.264 integer transform [T_H] yields the corresponding basis image.
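The procedure above (set one coefficient to 1, inverse-transform the result) may be sketched as follows. This is an illustrative sketch, not the disclosed implementation: the matrix `A` is the H.264 inverse core transform written as an exact rational matrix, whereas the standard realizes it with integer butterflies and shifts; the 0-based `(i, j)` = (row, column) indexing and all function names are assumptions.

```python
from fractions import Fraction as F

# Assumed matrix form of the H.264 4x4 inverse core transform.
A = [
    [F(1), F(1),     F(1),  F(1, 2)],
    [F(1), F(1, 2),  F(-1), F(-1)],
    [F(1), F(-1, 2), F(-1), F(1)],
    [F(1), F(-1),    F(1),  F(-1, 2)],
]


def matmul(x, y):
    """Plain 4x4 matrix product."""
    return [[sum(x[r][k] * y[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(x):
    return [list(col) for col in zip(*x)]


def inverse_transform(w):
    """Return A * w * A^T for a 4x4 coefficient matrix w."""
    return matmul(matmul(A, w), transpose(A))


def basis_image(i, j):
    """Inverse-transform a coefficient matrix whose only non-zero entry is w[i][j] = 1."""
    w = [[F(0)] * 4 for _ in range(4)]
    w[i][j] = F(1)
    return inverse_transform(w)
```

Calling `basis_image(0, 0)` gives the flat (DC) image in which every entry is 1; the other fifteen images follow by varying `i` and `j`.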

The entire reconstructed matrix [Y] can be computed by summing the 16 (N², with N = 4) matrices formed by weighting (scaling) the 16 basis images with the individual transform coefficients (weights) in [w]. This is not efficient compared with fast transform methods for computing the entire matrix. However, reconstruction of a subset, such as a row or a column, can be done far more efficiently with basis images than with a fast transform.

The 16 basis images associated with the H.264 4×4 integer transform process for residual 4×4 blocks are given by equations (6a) through (6p), where s_ij (for i, j ∈ {0, 1, 2, 3}) is the basis image associated with the i-th horizontal and j-th vertical frequency channel.

A careful look at these 16 basis images shows that their last columns in fact contain only four distinct vectors, up to a scale factor. This should be intuitively clear, because the last column, a 4×1 matrix/vector, lies in a four-dimensional vector space and can therefore be expressed with exactly four basis vectors.
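The observation that the last columns reduce to four vectors up to scale can be checked numerically. The sketch below assumes the same rational matrix form of the H.264 inverse core transform; `COLS` holds its columns, each basis image is taken as the outer product of two such columns, and the helper names are hypothetical.

```python
from fractions import Fraction as F

# Columns of the assumed H.264 4x4 inverse core transform matrix.
COLS = [
    [F(1), F(1), F(1), F(1)],
    [F(1), F(1, 2), F(-1, 2), F(-1)],
    [F(1), F(-1), F(-1), F(1)],
    [F(1, 2), F(-1), F(1), F(-1, 2)],
]


def last_column(i, j):
    """Last column of the basis image formed as outer(COLS[i], COLS[j])."""
    return [COLS[i][r] * COLS[j][3] for r in range(4)]


def normalised(vec):
    """Divide by the first entry so that vectors equal up to scale compare equal."""
    return tuple(x / vec[0] for x in vec)


# Collect the distinct directions among the 16 last columns.
distinct = {normalised(last_column(i, j)) for i in range(4) for j in range(4)}
print(len(distinct))
```

Each of the 16 last columns is a scalar multiple of one of the four columns in `COLS`, which is why only four weights are needed to synthesize the last column of the residual.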

When quantized transform coefficients (that is, levels z_ij, i, j ∈ {0, 1, 2, 3}) are received in the bitstream, they are rescaled (dequantized) to produce the coefficients w'_ij, i, j ∈ {0, 1, 2, 3}. These rescaled transform coefficients can then be parsed into groups, combined, and multiplied by the last columns (vectors) of the basis images to emulate the inverse transform process (that is, to generate the weights that weight the basis images in the synthesis process). This observation means that the reconstruction expression for the last column of the 4×4 residual signal, corresponding to positions {d, h, l, p} of FIG. 8, can be written as equation (7).
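A minimal sketch of such a last-column reconstruction follows, under stated assumptions: it uses the exact rational form of the H.264 inverse core transform and ignores the integer rounding of the standard, so it illustrates the grouping idea rather than reproducing equation (7) term by term. Here `w[i][j]` is indexed as (row, column) of the coefficient matrix; with the text's convention, in which the first index is the horizontal frequency, `w[0][2]` would correspond to w'20. All names are hypothetical.

```python
from fractions import Fraction as F

# Columns of the assumed H.264 4x4 inverse core transform matrix; these are
# the four basis vectors that span every last column.
COLS = [
    [F(1), F(1), F(1), F(1)],
    [F(1), F(1, 2), F(-1, 2), F(-1)],
    [F(1), F(-1), F(-1), F(1)],
    [F(1, 2), F(-1), F(1), F(-1, 2)],
]


def last_column_residual(w):
    """Last column of A*w*A^T from four grouped weights (one per basis vector)."""
    out = [F(0)] * 4
    for i in range(4):
        # Grouped combination of the coefficients in row i; the division by 2
        # is the step a fixed-point decoder would realize with a right shift.
        weight = w[i][0] - w[i][1] + w[i][2] - F(w[i][3]) / 2
        for r in range(4):
            out[r] += weight * COLS[i][r]
    return out


def full_inverse(w):
    """Reference: the complete A*w*A^T, used here only for comparison."""
    return [[sum(COLS[i][r] * w[i][j] * COLS[j][c]
                 for i in range(4) for j in range(4))
             for c in range(4)] for r in range(4)]
```

The grouped version touches each coefficient once and scales four vectors, whereas the reference computes all 16 output samples.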

Once the four different combinations of the scalar quantities w'_ij in the four sets of parentheses above have been computed, right shifts and additions/subtractions can be used to complete the scaling of each basis vector. Computing the reconstructed samples is then straightforward. Starting at the far left of the frame or immediately to the right of a slice boundary, spatial prediction mode 2 (DC) may be used, and all pixel values are known to have a reference (prediction) value equal to 128 (see p in equation (1) above). The reconstructed samples [r_d, r_h, r_l, r_p] corresponding to positions {d, h, l, p} of this first, leftmost block can be calculated from this prediction value and the reconstructed residual values, which are computed with equation (7). The 4×4 blocks to the right of this block can then be calculated by using the appropriate reconstructed values from the block to the left to generate the prediction signal component p of equation (1) (the generated prediction signal values depend on which spatial prediction mode was used to encode the 4×4 block being reconstructed). An example of calculating the prediction values for 4×4 blocks located below a slice boundary is now described.

FIG. 11 shows a portion of the multimedia samples located immediately below a slice boundary. The pixels may include luma and chroma values. Pixel positions {q, r, s, t} denote positions that have already been reconstructed, with known pixel values (for example, calculated using equation (7) above). After the residual signal component values for pixel positions {d, h, l, p} have been reconstructed, the prediction signal component values for the same set of positions {d, h, l, p} are generated to complete the reconstruction according to equation (1). Assuming that the intra-4×4 coded 4×4 block containing pixels {d, h, l, p} is immediately below a slice boundary, the intra-4×4 spatial prediction mode that could have been used to generate the prediction signal for this 4×4 block is one of the following:

1. Intra-4×4 spatial prediction mode 1 (horizontal):

Referring to FIG. 11, the prediction signal component values are copies of the neighboring pixel values at positions {q, r, s, t}, which involves 0 additions, 0 arithmetic shifts, and 0 multiplications.

2. Intra-4×4 spatial prediction mode 2 (DC):

If the pixels at positions {q, r, s, t} are not available, the prediction signal component values all equal 128.

If {q, r, s, t} are available, the prediction signal component values all equal the rounded average of the four pixel values, which involves 4 additions, 1 arithmetic shift, and 0 multiplications.

3. Intra-4×4 spatial prediction mode 8 (horizontal-up):

The prediction signal component values involve 6 additions, 4 arithmetic shifts, and 0 multiplications, or alternatively 8 additions, 2 arithmetic shifts, and 0 multiplications.
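The three cases above can be sketched for the last column {d, h, l, p} as follows. The formulas are the standard H.264 intra-4×4 prediction rules specialized to the rightmost column; the assumption that {q, r, s, t} are ordered top to bottom, and the function name, are illustrative and not taken from the figures.

```python
def predict_last_column(mode, left=None):
    """Prediction values for positions d, h, l, p of a 4x4 block.

    mode: 1 (horizontal), 2 (DC), or 8 (horizontal-up).
    left: reconstructed neighbors (q, r, s, t), assumed ordered top to bottom,
          or None when they are unavailable (only meaningful for DC).
    """
    if mode == 1:
        # Horizontal: each row copies its left neighbor; no arithmetic.
        q, r, s, t = left
        return [q, r, s, t]
    if mode == 2:
        if left is None:
            return [128] * 4
        q, r, s, t = left
        dc = (q + r + s + t + 2) >> 2       # 4 additions, 1 shift
        return [dc] * 4
    if mode == 8:
        q, r, s, t = left
        d = (r + (s << 1) + t + 2) >> 2     # 3 additions, 2 shifts
        h = (s + (t << 1) + t + 2) >> 2     # 3 additions, 2 shifts
        return [d, h, t, t]                 # l and p copy t
    raise ValueError("mode not usable immediately below a slice boundary")
```

In the horizontal-up branch, replacing each left shift with an extra addition gives the alternative operation count of 8 additions and 2 shifts.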

One or more observations about the rescaling process (dequantizing z_ij, i, j ∈ {0, 1, 2, 3}, to produce w'_ij, i, j ∈ {0, 1, 2, 3}) may reveal another source of significant computational savings. In addition to its dependence on the quantization parameter, the rescaling factor v_ij, i, j ∈ {0, 1, 2, 3}, used to scale z_ij also has a position-dependent structure within the 4×4 matrix: the three groups of rescaling factors [v00, v20, v02, v22], [v11, v31, v13, v33], and [v10, v30, v01, v21, v12, v32, v03, v23] each share a single value for a given quantization parameter QP_Y.

This can advantageously be used to reduce the number of multiplications associated with generating w'_ij from z_ij, as follows. In the weighted basis vector sum formula given above (equation (7)) for reconstructing the last column of the 4×4 residual signal, note that the first weight, which weights the basis vector [1 1 1 1]^T, contains the sum of w'00 and w'20 rather than their individual values. Therefore, instead of computing the two values w'00 and w'20 separately, which normally involves two integer multiplications, and then adding them, one can first add z00 and z20 and then rescale this sum by v00 = v20, obtaining the same final value for (w'00 + w'20) with a single integer multiplication.
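The saving can be illustrated with a short sketch. The base factors in `V_TABLE` are the per-group values tabulated in H.264 for QP mod 6, and the `base << (qp // 6)` form follows the basic rescaling without scaling matrices; both should be treated as assumptions of this sketch, and the function names are hypothetical.

```python
# Base rescaling factors per (QP % 6) for the three position groups:
# (even, even) positions, (odd, odd) positions, and the remaining eight.
V_TABLE = [
    (10, 16, 13), (11, 18, 14), (13, 20, 16),
    (14, 23, 18), (16, 25, 20), (18, 29, 23),
]


def rescale_factor(i, j, qp):
    """Rescaling factor v_ij for position (i, j) and quantization parameter qp."""
    a, b, c = V_TABLE[qp % 6]
    if i % 2 == 0 and j % 2 == 0:
        base = a            # v00, v20, v02, v22
    elif i % 2 == 1 and j % 2 == 1:
        base = b            # v11, v31, v13, v33
    else:
        base = c            # v10, v30, v01, v21, v12, v32, v03, v23
    return base << (qp // 6)


def weight_separate(z00, z20, qp):
    """w'00 + w'20 with each level rescaled on its own: two multiplications."""
    return z00 * rescale_factor(0, 0, qp) + z20 * rescale_factor(2, 0, qp)


def weight_merged(z00, z20, qp):
    """Same value, adding the levels first: one multiplication."""
    return (z00 + z20) * rescale_factor(0, 0, qp)
```

Because v00 and v20 belong to the same group, the two functions return identical values for every quantization parameter.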

Beyond these direct reductions in the computation required for this partial decoding, fast algorithms for computing the desired last column and first (topmost) row of the 4×4 residual signal can also be designed.

Another practical fact that may lower the computation required for this partial decoding process is that, of the at most 16 quantized coefficients in a residual signal block, usually only a few, typically fewer than 5, are actually non-zero. Combined with the above, this fact can be used to further reduce, nearly halve, the number of multiplications involved.

Those skilled in the art will recognize that formulas similar to equation (7) above can be derived to reconstruct any column, row, diagonal, or any other portion and/or combination thereof. For example, the top-row values of the basis images (equations (6a) through (6p) above) can be combined with the corresponding transform coefficients w'_ij to reconstruct the pixels immediately below a slice boundary (see pixel positions {A, B, C, D} in FIG. 11). This reconstruction depends on the same four pixel positions {d, h, l, p} of the block to the left. Other subsets of multimedia samples that can be reconstructed using these methods will be apparent to those skilled in the art.

FIG. 12 is a functional block diagram illustrating another example of a decoder device 150 that may be used in a system such as the one illustrated in FIG. 1. This aspect includes means for receiving transform coefficients associated with multimedia data, first determiner means for determining a set of multimedia samples to be reconstructed, second determiner means for determining a set of the received transform coefficients based on the multimedia samples to be reconstructed, and generator means for processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples. In some examples of this aspect, the receiving means comprises a receiver 202, the first determiner means comprises a multimedia sample determiner 204, the second determiner means comprises a transform coefficient determiner 206, and the generator means comprises a reconstructed sample generator 208.

FIG. 13 is a functional block diagram illustrating another example of a decoder device 150 that may be used in a system such as the one illustrated in FIG. 1. This aspect includes means for receiving transform coefficients associated with multimedia data, first determiner means for determining a set of multimedia samples to be reconstructed, second determiner means for determining a set of the received transform coefficients based on the multimedia samples to be reconstructed, and generator means for processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples. In some examples of this aspect, the receiving means comprises a module for receiving 1302, the first determiner means comprises a module for determining samples for reconstruction 1304, the second determiner means comprises a module for determining transform coefficients 1306, and the generator means comprises a module for processing transform coefficients 1308.

Those skilled in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, firmware, computer software, middleware, microcode, or combinations thereof. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.

The various illustrative logical blocks, components, modules, and circuits described in connection with the examples disclosed herein may be implemented or performed with a general purpose processor, a DSP (digital signal processor), an ASIC (application specific integrated circuit), an FPGA (field programmable gate array) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core or ASIC core, or any other such configuration.

The steps of a method or algorithm described in connection with the examples disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, an optical storage medium, or any other form of storage medium known in the art. An example storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (ASIC). The ASIC may reside in a wireless modem. In the alternative, the processor and the storage medium may reside as discrete components in the wireless modem.

The previous description of the disclosed examples is provided to enable any person skilled in the art to make or use the disclosed methods and apparatus. Various modifications to these examples will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other examples, and additional elements may be added.

Thus, methods and apparatus for performing highly efficient partial decoding of multimedia data have been described.

Claims

Receiving transform coefficients associated with the multimedia data;

Determining a set of multimedia samples to be reconstructed, wherein the set of multimedia samples to be reconstructed comprises at least one sample adjacent a portion of the lost multimedia;

Determining a set of the received transform coefficients based on the multimedia samples to be reconstructed;

Processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples; and

Estimating a set of concealment multimedia samples for the portion of the lost multimedia based on the reconstructed samples.

The method of claim 1,

wherein said processing comprises scaling said determined set of transform coefficients.

The method of claim 2,

wherein scaling the transform coefficients comprises dequantizing.

The method of claim 1,

wherein said multimedia samples in said determined set include multimedia samples that are referenced when other multimedia samples are encoded.

The method of claim 1,

wherein said multimedia samples of said determined set comprise multimedia samples within a first slice of multimedia data that abuts a second slice of multimedia data.

The method of claim 1,

wherein the received transform coefficients are associated, as a set, with a matrix of transformed multimedia samples, and the reconstructed samples comprise a subset of the matrix of multimedia samples.

The method of claim 1,

wherein said processing includes dividing said determined set of transform coefficients into a plurality of groups.

The method of claim 7, wherein

said processing further includes calculating a value for each group, and

said calculation is based on an encoding method that generated said transform coefficients.

The method of claim 8,

wherein said processing comprises:

determining an array for each group based on the encoding method that generated the transform coefficients; and

generating a set of reconstructed samples of the multimedia data based on the values and the arrays.

(Deleted)

The method of claim 1,

further comprising generating a set of transform coefficients corresponding to the estimated set of concealment multimedia samples.

The method of claim 1,

wherein the reconstructed samples are non-causal with respect to the estimated set of concealment multimedia samples.

The method of claim 1,

further comprising:

receiving a directivity mode indicator associated with each reconstructed sample; and

estimating the set of concealment multimedia samples based on the reconstructed samples and the directivity mode indicators.

A multimedia data processor for multimedia data concealment,

Receive transform coefficients associated with the multimedia data;

Determine a set of multimedia samples to be reconstructed;

Determine the set of received transform coefficients based on the multimedia samples to be reconstructed;

Process the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples; and

Estimate a set of hidden multimedia samples for the portion of the lost multimedia based on the reconstructed samples,

Wherein the set of reconstructed multimedia samples comprises at least one sample adjacent to a portion of the lost multimedia.
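The step of determining which samples to reconstruct can be sketched as selecting only the block positions that border the lost portion. The function name `samples_to_reconstruct`, the `lost_side` labels, and the square-block layout are assumptions made for illustration.

```python
def samples_to_reconstruct(block_size, lost_side):
    """Return (row, col) positions of a block that border the lost region.

    block_size: width and height of the square block (assumed layout).
    lost_side:  where the lost region lies relative to this block.
    """
    last = block_size - 1
    if lost_side == "below":
        return [(last, c) for c in range(block_size)]
    if lost_side == "above":
        return [(0, c) for c in range(block_size)]
    if lost_side == "right":
        return [(r, last) for r in range(block_size)]
    if lost_side == "left":
        return [(r, 0) for r in range(block_size)]
    raise ValueError(f"unknown lost_side: {lost_side!r}")


needed = samples_to_reconstruct(4, "below")
```

Only 4 of the 16 samples in a 4x4 block are requested here, which is the subset that a concealment stage would read as its boundary.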

The multimedia data processor of claim 14,

Wherein the multimedia data processor is further configured to scale the determined set of transform coefficients.

The multimedia data processor of claim 14,

Wherein the multimedia data processor is further configured to dequantize the determined set of transform coefficients.

The multimedia data processor of claim 14,

Wherein the multimedia samples of the determined set include multimedia samples that are referenced when other multimedia samples are encoded.

The multimedia data processor of claim 14,

Wherein the multimedia samples of the determined set include multimedia samples within a first slice of multimedia data abutting a second slice of multimedia data.

The multimedia data processor of claim 14,

Wherein the received transform coefficients are associated, as a set, with a matrix of transformed multimedia samples, and the reconstructed samples comprise a subset of the matrix of multimedia samples.

The multimedia data processor of claim 14,

Wherein the multimedia data processor is further configured to divide the determined set of transform coefficients into a plurality of groups.

The multimedia data processor of claim 20, wherein

The multimedia data processor is further configured to calculate a value for each group,

The multimedia data processor of claim 21,

Wherein the multimedia data processor is further configured to:

Determine an array for each group based on the encoding method that generated the transform coefficients; and

Generate a set of reconstructed samples of the multimedia data based on the values and the arrays.

delete

The multimedia data processor of claim 14,

Wherein the multimedia data processor is further configured to generate a set of transform coefficients corresponding to the estimated set of hidden multimedia samples.

The multimedia data processor of claim 14,

Wherein the reconstructed samples are non-causal with respect to the estimated set of hidden multimedia samples.

The multimedia data processor of claim 14,

Wherein the multimedia data processor is further configured to:

Receive a directional mode indicator associated with each reconstructed sample; and

Estimate the set of hidden multimedia samples based on the reconstructed samples and the directional mode indicators.

A receiver for receiving transform coefficients associated with the multimedia data;

A first determiner for determining a set of multimedia samples to be reconstructed, wherein the set of multimedia samples to be reconstructed comprises at least one sample adjacent a portion of the lost multimedia;

A second determiner for determining the set of received transform coefficients based on the multimedia samples to be reconstructed;

A generator for processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples; and

An estimator for estimating a set of hidden multimedia samples for the portion of the lost multimedia based on the reconstructed samples.

The device of claim 27,

Wherein the generator scales the determined set of transform coefficients.

The device of claim 27,

Wherein the generator dequantizes the determined set of transform coefficients.

The device of claim 27,

Wherein said multimedia samples of said determined set comprise multimedia samples referenced when other multimedia samples are encoded.

The device of claim 27,

Wherein said multimedia samples in said determined set comprise multimedia samples in a first slice of multimedia data abutting a second slice of multimedia data.

The device of claim 27,

Wherein the received transform coefficients are associated, as a set, with a matrix of transformed multimedia samples, and the reconstructed samples comprise a subset of the matrix of multimedia samples.

The device of claim 27,

Wherein the generator divides the determined set of transform coefficients into a plurality of groups.

The device of claim 33, wherein

The generator calculates a value for each group, and

Said calculation is based on an encoding method that generated said transform coefficients.

The device of claim 34, wherein

The generator determines an array for each group based on the encoding method that generated the transform coefficients, and generates a set of reconstructed samples of the multimedia data based on the values and the arrays.

delete

The device of claim 27,

Wherein the estimator generates a set of transform coefficients corresponding to the estimated set of hidden multimedia samples.

The device of claim 27,

Wherein the receiver receives a directional mode indicator associated with each reconstructed sample, and

The estimator estimates the set of hidden multimedia samples based on the reconstructed samples and the directional mode indicators.

Means for receiving transform coefficients associated with the multimedia data;

First determiner means for determining a set of multimedia samples to be reconstructed, the set of multimedia samples to be reconstructed comprising at least one sample adjacent a portion of the lost multimedia;

Second determiner means for determining the set of received transform coefficients based on the multimedia samples to be reconstructed;

Generator means for processing the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples; and

Means for estimating a set of hidden multimedia samples for the portion of the lost multimedia based on the reconstructed samples.

The device of claim 40,

Wherein the generator means scales the determined set of transform coefficients.

The device of claim 40,

Wherein the generator means dequantizes the determined set of transform coefficients.

The device of claim 40,

Wherein the generator means divides the determined set of transform coefficients into a plurality of groups.

The device of claim 46, wherein

The generator means calculates a value for each group,

The device of claim 47,

Wherein the generator means determines an array for each group based on the encoding method that generated the transform coefficients, and generates a set of reconstructed samples of the multimedia data based on the values and the arrays.

delete

The device of claim 40,

Wherein the means for estimating generates a set of transform coefficients corresponding to the estimated set of hidden multimedia samples.

The device of claim 40,

Wherein the receiving means receives a directional mode indicator associated with each reconstructed sample, and

The means for estimating estimates the set of hidden multimedia samples based on the reconstructed samples and the directional mode indicators.

Instructions for concealing multimedia data that, when executed, cause a machine to:

Receive transform coefficients associated with the multimedia data;

Determine a set of multimedia samples to be reconstructed;

Process the determined set of transform coefficients to generate reconstructed samples corresponding to the determined set of multimedia samples; and

Estimate a set of hidden multimedia samples for the portion of the lost multimedia based on the reconstructed samples,

Wherein the set of reconstructed multimedia samples comprises at least one sample adjacent to a portion of the lost multimedia.

The instructions of claim 53,

Wherein the instructions further cause the machine to scale the determined set of transform coefficients.

The instructions of claim 53,

Wherein the instructions further cause the machine to dequantize the determined set of transform coefficients.

The instructions of claim 53,

Wherein said multimedia samples in said determined set comprise multimedia samples referenced when other multimedia samples are encoded.

The instructions of claim 53,

Wherein the multimedia samples of the determined set comprise multimedia samples within a first slice of multimedia data abutting a second slice of multimedia data.

The instructions of claim 53,

Wherein the instructions further cause the machine to divide the determined set of transform coefficients into a plurality of groups.

The instructions of claim 59, wherein

The instructions also cause the machine to calculate a value for each group,

The instructions of claim 60, wherein

The instructions also cause the machine to:

Generate a set of reconstructed samples of the multimedia data based on the values and the arrays.

delete

The instructions of claim 53,

Wherein the instructions further cause the machine to generate a set of transform coefficients corresponding to the estimated set of hidden multimedia samples.

The instructions of claim 53, wherein

The instructions also cause the machine to:

Receive a directional mode indicator associated with each reconstructed sample;