JP2013061851A

JP2013061851A - Memory controller and SIMD type processor

Info

Publication number: JP2013061851A
Application number: JP2011200529A
Authority: JP
Inventors: Takao Katayama; 貴雄片山
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2011-09-14
Filing date: 2011-09-14
Publication date: 2013-04-04

Abstract

PROBLEM TO BE SOLVED: To provide a memory controller capable of reducing the overall processing time of a SIMD type processor, which is used for predetermined processing such as image processing, in comparison with the conventional technique; and a SIMD type processor including the memory controller. SOLUTION: Each time a read buffer RB0 transfers data to registers Rj of processor elements PE0 to PEN, a read buffer counter circuit 51 increments and outputs an address value C51. A loop register 52 stores a predetermined maximum address value C52. A comparator 53 compares the address value C51, which is output from the read buffer counter circuit 51, with the maximum address value C52; generates, when the address value C51 matches the maximum address value C52, a counter reset signal S53 that resets the read buffer counter circuit 51; and outputs the signal to the read buffer counter circuit 51.

Description

The present invention relates to a memory controller for a SIMD (Single Instruction-stream Multiple Data-stream) processor, and to a SIMD type processor including the memory controller.

In recent years, image processing apparatuses such as digital copiers, printers, and cameras have improved in image quality through higher pixel counts and more diverse image processing. Because such apparatuses often apply the same processing to many pieces of pixel data, they more often use a SIMD type microprocessor (hereinafter, a SIMD type processor), which processes multiple pieces of data in parallel with one arithmetic instruction, than a SISD (Single Instruction-stream Single Data-stream) type microprocessor, which processes one piece of data with one arithmetic instruction.

A SIMD type processor comprises a plurality of processor elements, each having an arithmetic logic unit (ALU) and a plurality of general-purpose registers, and a global processor that controls the processor elements. The number of processor elements is determined according to the size of the image data, and a single global processor controls these processor elements so that they perform arithmetic processing simultaneously. Specifically, each processor element performs image processing on the pixel data of one pixel of the image data. Because there are multiple processor elements, each corresponding to one pixel, the pixel data for multiple pixels are processed in parallel. Processing multiple pieces of pixel data simultaneously in this way raises the efficiency of image processing.

Some of the image processing executed by a SIMD type processor, such as dither processing, uses table data. Table data such as dither table data for dither processing is normally stored in advance in a memory external to the SIMD type processor. A conventional memory controller provided outside the SIMD type processor reads the table data from the external memory and transfers it to the general-purpose registers of the processor elements. Meanwhile, the global processor stores pixel data in an arithmetic register of each processor element and controls each processor element to subtract the pixel data stored in the arithmetic register from the table data transferred to the general-purpose register, thereby executing dither processing.

Patent Documents 1 and 2 disclose conventional memory controllers. Patent Document 3 discloses a conventional image generation method that can reduce the area needed to store a dither matrix.

In general, when a conventional SIMD type processor stores table data read from the external memory by the memory controller into the general-purpose registers of the processor elements, so that the data is expanded across the predetermined number of pixel data processed as one processing unit (hereinafter, the SIMD number; the SIMD number equals the number of processor elements) or across one line of pixel data, it either uses shift operations of the processor elements or repeatedly transfers the same data from the global processor to each general-purpose register. In this case, the time needed to transfer the table data to the processor elements increases in proportion to the SIMD number, and the SIMD type processor cannot perform operations during the transfer. Consequently, the overall processing time of the SIMD type processor could not be reduced.

An object of the present invention is to solve the above problem and to provide a memory controller that can reduce, compared with the prior art, the overall processing time of a SIMD type processor used for predetermined processing such as image processing, and a SIMD type processor including the memory controller.

The memory controller according to the present invention comprises:
a read buffer that temporarily stores data from a storage device sequentially at respective predetermined addresses and transfers the data stored at the address given by an input address value to a plurality of processor elements of a SIMD type processor; and
a read buffer controller that generates the address value and outputs it to the read buffer,
wherein the read buffer controller comprises:
a read buffer counter circuit that increments and outputs the address value each time the read buffer transfers the data to the processor elements;
a loop register that stores a predetermined maximum address value; and
a comparator that compares the address value output from the read buffer counter circuit with the maximum address value and, when the address value matches the maximum address value, generates a counter reset signal for resetting the read buffer counter circuit and outputs it to the read buffer counter circuit.

According to the memory controller of the present invention, the read buffer controller comprises a read buffer counter circuit that increments and outputs an address value each time the read buffer transfers data to the processor elements; a loop register that stores a predetermined maximum address value; and a comparator that compares the address value output from the read buffer counter circuit with the maximum address value and, when the two match, generates a counter reset signal for resetting the read buffer counter circuit and outputs it to the read buffer counter circuit. Accordingly, the data stored in the read buffer can be transferred repeatedly to the processor elements, and the processor elements can execute other arithmetic processing during the transfer, so the overall processing time of a SIMD type processor used for predetermined processing such as image processing can be reduced compared with the prior art.

FIG. 1 is a block diagram showing a memory controller 4, a SIMD type processor 1, and a DDR memory 3 according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing the registers Rj to Rj+3 of the processor elements PE0 to PEN of FIG. 1 in more detail.
FIG. 3 is a block diagram showing the read buffer controller 10 of the memory controller 4 of FIG. 1 in more detail.
FIG. 4 is a block diagram showing the configuration of dither table data having a 4×4 matrix size stored in the DDR memory 3 of FIG. 1.
FIG. 5(a) is a block diagram showing an example of the dither table data stored in the DDR memory 3 of FIG. 1; FIG. 5(b) is a block diagram showing an example of image data processed by the SIMD type processor 1 of FIG. 1; FIG. 5(c) is a block diagram showing the binarized data obtained when the image data of FIG. 5(b) is dithered using the dither table data of FIG. 5(a); and FIG. 5(d) is a block diagram showing a print image corresponding to the binarized data of FIG. 5(c).
FIG. 6 is a block diagram showing the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when the SIMD type processor 1 of FIG. 1 performs dither processing using the dither table data of FIG. 4.
FIG. 7 is a block diagram showing the configuration of a memory controller 4A according to a second embodiment of the present invention.
FIG. 8 is a block diagram showing the configuration of a memory controller 4B according to a third embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of a memory controller 4C according to a fourth embodiment of the present invention.
FIG. 10 is a block diagram showing dither table data having an 8×8 matrix size stored in the DDR memory 3 of FIG. 9, and the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when dither processing is performed using that dither table data.
FIG. 11 is a block diagram showing the dither table data stored in the DDR memory 3 of FIG. 10; the storage state of the threshold data in the read buffers RB0 to RB3 of FIG. 9; the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when the offset value C58 is 0 and the maximum address value C52A is 8 (first read transfer); and the threshold data stored in those registers when the offset value C58 is 8 and the maximum address value C52A is 16 (second read transfer).

Embodiments of the present invention are described below with reference to the drawings. In the following embodiments, like components are denoted by the same reference numerals.

First embodiment.
FIG. 1 is a block diagram showing a memory controller 4, a SIMD type processor 1, and a DDR (Double-Data-Rate) memory 3 according to the first embodiment of the present invention. FIG. 2 is a block diagram showing the registers Rj to Rj+3 of the processor elements PE0 to PEN of FIG. 1 in more detail, and FIG. 3 is a block diagram showing the read buffer controller 10 of the memory controller 4 of FIG. 1 in more detail. The SIMD type processor 1 of FIG. 1 is a microprocessor for performing image processing such as dither processing in an image processing apparatus such as a digital copier, a high-performance printer, or a camera.

In FIG. 1, the SIMD type processor 1 comprises a plurality of (N+1) processor elements PE0, PE1, ..., PEN (N is a positive integer) and a global processor 2. Each processor element PEn (n = 0, 1, ..., N) comprises a plurality of (J+1) general-purpose registers (hereinafter, registers) R0, R1, ..., RJ (J is an integer of 3 or more), an arithmetic logic unit 91, and an accumulator 92 that stores data indicating the result of operations by the arithmetic logic unit 91. A unique address is assigned to each processor element PEn. In this embodiment, each of the registers R0 to RJ is 8 bits wide.

As described in detail later with reference to FIG. 3, the memory controller 4 according to this embodiment comprises a read buffer RB0, which temporarily stores data from the DDR memory 3 sequentially at respective predetermined addresses and transfers the data stored at the address given by an input address value to the processor elements PE0 to PEN of the SIMD type processor 1, and a read buffer controller 10, which generates an address value C51 and outputs it to the read buffer RB0.

Here, as described in detail later with reference to FIG. 3, the read buffer controller 10 is characterized by comprising:
(a) a read buffer counter circuit 51 that increments and outputs the address value C51 each time the read buffer RB0 transfers data to the processor elements PE0 to PEN;
(b) a loop register 52 that stores a predetermined maximum address value C52; and
(c) a comparator 53 that compares the address value C51 output from the read buffer counter circuit 51 with the maximum address value C52 and, when the address value C51 matches the maximum address value C52, generates a counter reset signal S53 for resetting the read buffer counter circuit 51 and outputs it to the read buffer counter circuit 51.

Further, the read buffer RB0 sequentially stores the threshold data of each column of the first row of the dither table data from the DDR memory 3 at respective predetermined addresses, and the maximum address value C52 is set to the address of the read buffer RB0 that follows the storage address of the threshold data stored last in the read buffer RB0.

In FIG. 1, the global processor 2 controls the processor elements PE0 to PEN and the memory controller 4, which is provided outside the SIMD type processor 1, by executing a program stored in a program memory within the global processor 2. The global processor 2 controls each processor element PEn (n = 0, 1, ..., N) by specifying the address of the processor element to be controlled. The global processor 2 also controls each processor element PEn to read data from a predetermined one of the registers R0 to RJ, output it to the arithmetic logic unit 91, and output the result of the arithmetic logic unit 91 to the accumulator 92 and to a predetermined one of the registers R0 to RJ. The global processor 2 controls the processor elements PE0 to PEN so that they perform the same processing simultaneously, thereby processing multiple pieces of data in parallel.

As shown in FIG. 2, the four registers Rj, Rj+1, Rj+2, and Rj+3 (j is a predetermined integer from 0 to J−3) of each of the processor elements PE0 to PEN are connected via the memory controller 4 to the DDR memory 3, which is a large-capacity memory such as DDR, DDR2, or DDR3. At a predetermined timing, such as power-on, the DDR memory 3 stores table data for image processing for each type of image data to be processed (text, photographs, or text and photographs) and each kind of image processing (dither processing, blurring, sharpening, and so on).

Dither processing and the dither table data used for it are now described. Dither processing is gradation processing that converts P-level image data into Q-level image data (Q < P). For example, dither processing is performed when multi-valued image data containing 8-bit pixel data for each of cyan, magenta, yellow, and black (32 bits per pixel) is converted into binary, 4-level, or 16-level image data for printing containing 2-bit, 4-bit, or 16-bit pixel data for each of cyan, magenta, yellow, and black (4, 8, or 16 bits per pixel). FIG. 4 is a block diagram showing the configuration of dither table data having a 4×4 matrix size stored in the DDR memory 3 of FIG. 1. The horizontal direction in FIG. 4 corresponds to the main scanning direction of the image data, and the vertical direction corresponds to the sub-scanning direction. Each of the cells A to P of the dither table data of FIG. 4 stores the threshold data used at the corresponding pixel position.

FIG. 5(a) is a block diagram showing an example of the dither table data stored in the DDR memory 3 of FIG. 1. In FIG. 5(a), each cell of the dither table data stores threshold data for binarizing 256-level (that is, 8-bit) pixel data. FIG. 5(b) is a block diagram showing an example of image data processed by the SIMD type processor 1 of FIG. 1; the image data to be processed contains 8-bit pixel data of a predetermined color. When the image data of FIG. 5(b) is binarized using the dither table data of FIG. 5(a), the pixel data value at each pixel position is compared with the threshold data value. If the pixel data value is greater than or equal to the threshold data value, the binarization result at that pixel position is "1"; if it is less than the threshold data value, the result is "0". Binarizing the image data of FIG. 5(b) with the dither table data of FIG. 5(a) therefore yields the binarized data of FIG. 5(c). By leaving pixel positions whose binarization result is "1" without ink (white) and placing ink at pixel positions whose result is "0" (black), a black-and-white print image such as that of FIG. 5(d) is obtained.
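The comparison rule above can be sketched in Python as follows. This is only an illustration: the threshold values and the flat test image are placeholders (they are not the contents of FIG. 5), and the function name `binarize` is hypothetical.

```python
# Placeholder 4x4 threshold table; these values are NOT those of FIG. 5(a).
DITHER_TABLE = [
    [0, 128, 32, 160],
    [192, 64, 224, 96],
    [48, 176, 16, 144],
    [240, 112, 208, 80],
]


def binarize(image, table):
    """Return 1 where pixel >= threshold and 0 otherwise, tiling the table over the image."""
    rows = len(table)
    cols = len(table[0])
    return [
        [1 if pixel >= table[y % rows][x % cols] else 0 for x, pixel in enumerate(line)]
        for y, line in enumerate(image)
    ]


# A flat image with pixel value 100: 2 lines of 8 pixels (illustrative input).
image = [[100] * 8 for _ in range(2)]
result = binarize(image, DITHER_TABLE)
```

Tiling the table with `x % cols` and `y % rows` mirrors the repeated use of the same thresholds along the main and sub-scanning directions.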

FIG. 6 is a block diagram showing the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when the SIMD type processor 1 of FIG. 1 performs dither processing using the dither table data of FIG. 4. When performing dither processing on image data using the dither table data of FIG. 4, the global processor 2 first controls the memory controller 4 so that the threshold data of cells A, B, C, and D in the first row of the dither table data are stored repeatedly in the registers Rj of the processor elements PE0 to PEN; the threshold data of cells E, F, G, and H in the second row are stored repeatedly in the registers Rj+1; the threshold data of cells I, J, K, and L in the third row are stored repeatedly in the registers Rj+2; and the threshold data of cells M, N, O, and P in the fourth row are stored repeatedly in the registers Rj+3. How the global processor 2 controls the memory controller 4 is described later.
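The register layout described above can be sketched as follows, assuming that processor element PEn receives the cell in column n mod 4 of each row (a reading of "stored repeatedly"; the function name `fill_registers` and the choice of 10 processor elements are illustrative).

```python
def fill_registers(table, num_pes):
    """Return registers[r][n]: the cell held in register Rj+r of processor element PEn.

    Assumption: the cells of each table row repeat across PE0, PE1, ... in column order.
    """
    cols = len(table[0])
    return [[row[n % cols] for n in range(num_pes)] for row in table]


# Cell names A to P stand in for the threshold values of FIG. 4.
table = [list("ABCD"), list("EFGH"), list("IJKL"), list("MNOP")]
registers = fill_registers(table, num_pes=10)
```

With 10 processor elements, `registers[0]` holds the pattern A, B, C, D repeated across the Rj registers, and `registers[3]` holds M, N, O, P repeated across the Rj+3 registers.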

Further, in the dither processing, the global processor 2 stores the first line of the image data to be processed in the accumulators 92 of the processor elements PE0 to PEN. The global processor 2 then controls each arithmetic logic unit 91 to subtract the threshold data value stored in the register Rj from the pixel data value stored in the accumulator 92, and to output 1 if the subtraction result is positive and 0 if it is negative. Dither processing is thereby performed on up to as many pieces of pixel data as there are processor elements PE0 to PEN (N+1). Likewise, the threshold data stored in the registers Rj+1, Rj+2, and Rj+3 are used to dither the second, third, and fourth lines of the image data. When the fifth and subsequent lines are processed, the threshold data stored in Rj, Rj+1, Rj+2, and Rj+3 are reused cyclically.
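A minimal sketch of this per-line processing, with each list position standing for one processor element. The text specifies the output only for positive and negative differences; treating a zero difference as 1 is an assumption made here to match the "greater than or equal" rule stated for binarization. The threshold values and the function name `dither_line` are illustrative.

```python
def dither_line(pixels, registers, line_index):
    """Dither one image line; position n models processor element PEn."""
    # Lines 0..3 use Rj..Rj+3; later lines cycle through the same registers.
    thresholds = registers[line_index % len(registers)]
    result = []
    for accumulator, threshold in zip(pixels, thresholds):
        difference = accumulator - threshold  # subtraction performed by the ALU
        result.append(1 if difference >= 0 else 0)  # assumption: zero counts as 1
    return result


# Placeholder register contents for 8 processor elements (rows model Rj..Rj+3).
registers = [
    [0, 128, 32, 160] * 2,
    [192, 64, 224, 96] * 2,
    [48, 176, 16, 144] * 2,
    [240, 112, 208, 80] * 2,
]
line = [100] * 8
first = dither_line(line, registers, 0)  # first line uses Rj
fifth = dither_line(line, registers, 4)  # fifth line wraps back to Rj
```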

Referring again to FIG. 2, the memory controller 4 is a circuit that performs interface processing between the SIMD type processor 1 and the DDR memory 3. It comprises a DDR address controller 41; read buffers RB0, RB1, RB2, and RB3; write buffers WB0, WB1, WB2, and WB3; read buffer controllers 10, 11, 12, and 13, which control the read buffers RB0, RB1, RB2, and RB3, respectively; and a write buffer controller 42, which controls the write buffers WB0, WB1, WB2, and WB3.

In FIG. 2, the read buffer RB0 and the write buffer WB0 are connected to the registers Rj of the processor elements PE0 to PEN via an 8-bit data bus DP0 and to the DDR memory 3 via an 8-bit data bus DM0. The read buffer RB1 and the write buffer WB1 are connected to the registers Rj+1 of the processor elements PE0 to PEN via an 8-bit data bus DP1 and to the DDR memory 3 via an 8-bit data bus DM1. The read buffer RB2 and the write buffer WB2 are connected to the registers Rj+2 of the processor elements PE0 to PEN via an 8-bit data bus DP2 and to the DDR memory 3 via an 8-bit data bus DM2. The read buffer RB3 and the write buffer WB3 are connected to the registers Rj+3 of the processor elements PE0 to PEN via an 8-bit data bus DP3 and to the DDR memory 3 via an 8-bit data bus DM3.

In FIG. 2, when data is transferred from the SIMD type processor 1 to the DDR memory 3 (hereinafter, a write transfer), the global processor 2 outputs a write transfer start command, which contains the write start address in the DDR memory 3 and the number of data transfers (burst count), to the DDR address controller 41 and the write buffer controller 42. In response, the DDR address controller 41 increments the write target address of the DDR memory 3 by 1 at a time, from the write start address up to the address obtained by adding the number of data transfers to that address. Also in response to the write transfer start command, the write buffer controller 42 controls the write buffers WB0 to WB3 so that data is read from the registers Rj to Rj+3 of the processor elements PE0 to PEN and transferred to the write buffers WB0 to WB3, buffered (temporarily stored) there, and then transferred and written to the address of the DDR memory 3 specified by the DDR address controller 41. One write transfer therefore moves 4×(N+1) pieces of data from the SIMD type processor 1 to the DDR memory 3. In this embodiment, each of the registers Rj to Rj+3 is 8 bits wide, so the data transfer width between the SIMD type processor 1 and the DDR memory 3 is 32 bits. The global processor 2 sequentially designates the processor element targeted by the write transfer by outputting to the processor elements PE0 to PEN a control signal indicating which of them is the target.
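The write transfer can be modeled roughly as below. Several details are assumptions not stated in the text: that the four 8-bit values of one processor element form one 32-bit word at one DDR address, that Rj occupies the least significant byte, and that the address advances once per processor element. The function name `write_transfer` is hypothetical.

```python
def write_transfer(pe_registers, start_address):
    """Model one write transfer and return the written words keyed by DDR address.

    pe_registers[n] is the tuple (Rj, Rj+1, Rj+2, Rj+3) of processor element PEn.
    """
    memory = {}
    address = start_address
    for rj, rj1, rj2, rj3 in pe_registers:
        # Assumption: Rj is the least significant byte of the 32-bit word.
        word = (
            (rj & 0xFF)
            | ((rj1 & 0xFF) << 8)
            | ((rj2 & 0xFF) << 16)
            | ((rj3 & 0xFF) << 24)
        )
        memory[address] = word
        address += 1  # the DDR address controller increments the address by 1
    return memory


# Two processor elements, so 4 x 2 = 8 pieces of 8-bit data are transferred.
memory = write_transfer([(1, 2, 3, 4), (5, 6, 7, 8)], start_address=0x100)
```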

Next, the configuration and operation of the read buffer controller 10 are described with reference to FIG. 3. The read buffer controllers 11 to 13 are configured in the same way as the read buffer controller 10. In FIG. 3, the read buffer controller 10 comprises a read buffer counter circuit 51, a loop register 52, and a comparator 53. The loop register 52 stores a predetermined maximum address value C52 in advance and outputs it to the comparator 53. On receiving from the global processor 2 a read transfer start command, which instructs data transfer to the SIMD type processor 1, the read buffer counter circuit 51 resets the address value C51 to 0. Thereafter, each time the read buffer RB0 transfers one piece of data to the SIMD type processor 1, the read buffer counter circuit 51 increments the address value C51 by 1 and outputs it to the read buffer RB0 and the comparator 53.

Further, in FIG. 3, the comparator 53 compares the address value C51 with the maximum address value C52 and, when the two match, generates the counter reset signal S53 and outputs it to the read buffer counter circuit 51. The read buffer counter circuit 51 resets the address value C51 to 0 in response to the counter reset signal S53. The read buffer RB0 outputs the threshold data stored at the address given by the address value C51 to the data bus DP0 at a predetermined transfer timing.
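The interaction of the counter circuit, loop register, and comparator can be modeled as a small wrap-around counter. This is a behavioral sketch, not the circuit: the class and method names are hypothetical, and it assumes the reset takes effect before the next transfer, so the maximum address itself is never used for a read.

```python
class ReadBufferController:
    """Behavioral model of counter circuit 51, loop register 52, and comparator 53."""

    def __init__(self, max_address):
        self.max_address = max_address  # value held in the loop register (C52)
        self.address = 0                # counter output (C51)

    def start(self):
        """Read transfer start command: reset the address value to 0."""
        self.address = 0

    def next_address(self):
        """Return the address used for this transfer, then advance the counter."""
        current = self.address
        self.address += 1
        if self.address == self.max_address:  # comparator match asserts S53
            self.address = 0                  # counter reset
        return current


controller = ReadBufferController(max_address=4)  # 4 is an illustrative value
controller.start()
sequence = [controller.next_address() for _ in range(10)]
```

With a maximum address value of 4, the generated addresses cycle through 0, 1, 2, 3, which is what lets a short buffer be replayed across many processor elements.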

Next, a method of transferring the dither table data of FIG. 6 from the DDR memory 3 to the SIMD type processor 1 during dither processing by the SIMD type processor 1 (hereinafter referred to as read transfer) will be described. In FIG. 2, at the start of dither processing, the global processor 2 controls the memory controller 4 so that the threshold data of the cells A, B, C, and D in the first row of the dither table data stored in the DDR memory 3 are stored in order at addresses 0, 1, 2, and 3 of the read buffer RB0; the threshold data of the cells E, F, G, and H in the second row are stored in order at addresses 0, 1, 2, and 3 of the read buffer RB1; the threshold data of the cells I, J, K, and L in the third row are stored in order at addresses 0, 1, 2, and 3 of the read buffer RB2; and the threshold data of the cells M, N, O, and P in the fourth row are stored in order at addresses 0, 1, 2, and 3 of the read buffer RB3. In FIG. 3, the maximum address value C52 is set to 4, which is the address following the address (3) at which the last threshold data is stored in each of the read buffers RB0 to RB3.

Next, the global processor 2 outputs a read transfer start command to the read buffer counter circuit 51 of each of the read buffer controllers 10 to 13. In FIG. 3, in response to the read transfer start command from the global processor 2, the read buffer counter circuit 51 of the read buffer controller 10 resets the address value C51 to 0 and outputs it to the read buffer RB0 and the comparator 53. At the first transfer timing, the read buffer RB0 outputs the threshold data of the cell A stored at address 0 to the data bus DP0. The global processor 2 performs control so that the threshold data of the cell A output to the data bus DP0 is stored in the register Rj of the processor element PE0. The read buffer counter circuit 51 then increments the address value C51 by 1.

Next, the read buffer counter circuit 51 outputs the address value C51 (now 1) to the read buffer RB0 and the comparator 53. At the second transfer timing, the read buffer RB0 outputs the threshold data of the cell B stored at address 1 to the data bus DP0. The global processor 2 performs control so that the threshold data of the cell B output to the data bus DP0 is stored in the register Rj of the processor element PE1. The read buffer counter circuit 51 then increments the address value C51 by 1.

Next, the read buffer counter circuit 51 outputs the address value C51 (now 2) to the read buffer RB0 and the comparator 53. At the third transfer timing, the read buffer RB0 outputs the threshold data of the cell C stored at address 2 to the data bus DP0. The global processor 2 performs control so that the threshold data of the cell C output to the data bus DP0 is stored in the register Rj of the processor element PE2. The read buffer counter circuit 51 then increments the address value C51 by 1.

Next, the read buffer counter circuit 51 outputs the address value C51 (now 3) to the read buffer RB0 and the comparator 53. At the fourth transfer timing, the read buffer RB0 outputs the threshold data of the cell D stored at address 3 to the data bus DP0. The global processor 2 performs control so that the threshold data of the cell D output to the data bus DP0 is stored in the register Rj of the processor element PE3. The read buffer counter circuit 51 then increments the address value C51 by 1. As a result, the address value C51 becomes 4 and matches the maximum address value C52, so the comparator 53 generates the counter reset signal S53 and outputs it to the read buffer counter circuit 51. In response, the read buffer counter circuit 51 resets the address value C51 to 0.

Next, the read buffer counter circuit 51 outputs the address value C51 (now 0) to the read buffer RB0 and the comparator 53. Accordingly, at the fifth transfer timing, the read buffer RB0 outputs the threshold data of the cell A stored at address 0 to the data bus DP0. In the same manner thereafter, the threshold data of the cells A, B, C, and D stored at addresses 0, 1, 2, and 3 of the read buffer RB0 are output repeatedly to the data bus DP0 in the order A, B, C, D, A, B, C, and so on, and are stored in the registers Rj of the processor elements PE0 to PEN of the SIMD type processor 1 (see FIG. 6).
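The transfer timings walked through above can be tabulated with a short trace. This Python sketch is illustrative only: the function name `trace_transfers` and the tuple layout are hypothetical, and it assumes that the k-th transfer timing targets the register Rj of processor element PE(k-1), as in the first four timings described above.

```python
def trace_transfers(cells, num_transfers):
    """Return (timing, C51, cell, destination PE) for each transfer timing."""
    max_address = len(cells)  # C52: address following the last stored entry
    address = 0               # C51 after the read transfer start command
    rows = []
    for timing in range(1, num_transfers + 1):
        rows.append((timing, address, cells[address], f"PE{timing - 1}"))
        address += 1
        if address == max_address:  # match detected, counter reset to 0
            address = 0
    return rows


trace = trace_transfers("ABCD", 5)
for row in trace:
    print(row)
```

The fifth row of the trace shows address 0 and cell A again, which corresponds to the wrap-around at the fifth transfer timing.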

In FIG. 2, the read buffer controller 11, like the read buffer controller 10, repeatedly transfers the threshold data of the cells E, F, G, and H of the dither table data to the registers Rj+1 of the processor elements PE0 to PEN. Likewise, the read buffer controller 12 repeatedly transfers the threshold data of the cells I, J, K, and L of the dither table data to the registers Rj+2 of the processor elements PE0 to PEN, and the read buffer controller 13 repeatedly transfers the threshold data of the cells M, N, O, and P of the dither table data to the registers Rj+3 of the processor elements PE0 to PEN. Finally, once the threshold data have been stored in the registers Rj to Rj+3 of the processor element PEN, the global processor 2 controls the memory controller 4 to end the read transfer to the SIMD type processor 1.
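The register contents that result from the four parallel read transfers can be sketched as follows. This Python fragment is a simplified illustration, not the disclosed hardware: the names `DITHER_TABLE` and `fill_registers` are hypothetical, and the four data buses are modeled as one list comprehension rather than as concurrent transfers.

```python
# Assumed 4x4 dither table; row r is held in read buffer RBr.
DITHER_TABLE = [
    ["A", "B", "C", "D"],
    ["E", "F", "G", "H"],
    ["I", "J", "K", "L"],
    ["M", "N", "O", "P"],
]


def fill_registers(table, num_pes):
    """Return registers[pe][k], the value stored in register Rj+k of PE number pe."""
    columns = len(table[0])
    # Each read buffer cycles through its row, so PE number pe receives column pe % columns.
    return [[row[pe % columns] for row in table] for pe in range(num_pes)]


registers = fill_registers(DITHER_TABLE, num_pes=352)
print(registers[0], registers[1], registers[351])
```

In this model each processor element ends up holding one full column of the dither table in its registers Rj to Rj+3.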

As described above, according to the present embodiment, the four threshold data of the cells A to D in the first row of the dither table data stored in the DDR memory 3 are transferred to the read buffer RB0. The four threshold data stored in the read buffer RB0 are then output repeatedly, via the data bus DP0, to the registers Rj of the processor elements PE0 to PEN. When the number (N+1) of processor elements PE0 to PEN is 352, the remainder of 352 divided by 4, the number of columns of the dither matrix, is zero, so the threshold data of the cell D is the last data transferred from the read buffer RB0 to the SIMD type processor 1, and the number of sets of threshold data transferred from the read buffer RB0 to the SIMD type processor 1 is 88. Consequently, as shown in FIG. 6, the threshold data of the cells A to P of the dither table data are stored repeatedly in the registers Rj, Rj+1, Rj+2, and Rj+3 of the processor elements PE0 to PEN.
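The arithmetic in the preceding paragraph can be checked with a few lines of Python. The helper name `transfer_summary` is hypothetical and is introduced only for this illustration.

```python
def transfer_summary(num_pes, cells):
    """Return (complete sets, remainder, last cell) for one read transfer."""
    full_sets, remainder = divmod(num_pes, len(cells))
    last_cell = cells[(num_pes - 1) % len(cells)]  # cell sent to the last PE
    return full_sets, remainder, last_cell


summary = transfer_summary(352, "ABCD")
print(summary)
```

For 352 processor elements and four columns, the function yields 88 complete sets, a remainder of zero, and D as the last cell, in agreement with the values stated above.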

As described above, according to the present embodiment, dither table data having a matrix size of 4×4 can be transferred from the memory controller 4 to the SIMD type processor 1 by a single read transfer from the memory controller 4 to the SIMD type processor 1.

The size of each of the read buffers RB0, RB1, RB2, and RB3 need only be at least large enough to store as many threshold data as there are columns in the dither table data stored in the DDR memory 3 (8 bits × 4 in the present embodiment). The number of bits of the address value C51 depends on the size of the read buffers RB0, RB1, RB2, and RB3. For example, when the size of each of the read buffers RB0, RB1, RB2, and RB3 is 8 bits × 352, the address value C51 need only have 9 bits (2^9 = 512 > 352) or more. Further, the number of bits of the loop register 52 may be the same as the number of bits of the read buffer counter circuit 51, or any number of bits capable of storing the maximum address value C52.
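The bit-width reasoning above can be reproduced with integer arithmetic. The following Python sketch assumes plain unsigned binary encoding; the function names `address_bits` and `value_bits` are hypothetical and are not taken from the disclosure.

```python
def address_bits(entries):
    """Smallest bit width able to address `entries` buffer locations (0 .. entries-1)."""
    return max(1, (entries - 1).bit_length())


def value_bits(value):
    """Smallest bit width able to hold the unsigned integer `value` itself."""
    return max(1, value.bit_length())


print(address_bits(352))  # width needed to address a 352-entry read buffer
print(value_bits(4))      # width needed to hold a maximum address value of 4
```

Under this assumption a 352-entry buffer needs 9 address bits, matching the example in the text. The second function indicates why the loop register is sized by the maximum address value it has to hold: the value 4 takes 3 bits even though addresses 0 to 3 fit in 2.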

While the threshold data are being transferred repeatedly from the read buffers RB0 to RB3 to the SIMD type processor 1, data other than the dither table data are transferred from the DDR memory 3 to address 4 and subsequent addresses of each of the read buffers RB0 to RB3. However, this causes no problem, because data are transferred to the SIMD type processor 1 only from addresses 0 to 3 of each of the read buffers RB0 to RB3.

As described above, according to the present embodiment, the threshold data can be transferred from the read buffers RB0 to RB3 to the registers Rj to Rj+3 of the processor elements PE0 to PEN in parallel with operations that the arithmetic logic units 91 of the processor elements PE0 to PEN perform using data stored in registers other than the registers Rj to Rj+3 among the registers R0 to RJ. Therefore, the processing time of the SIMD type processor 1 as a whole can be reduced compared with the prior art.

In addition, since a memory controller according to the prior art generally already includes the read buffer counter circuit 51, the memory controller 4 according to the present embodiment can be realized simply by adding the loop register 52 and the comparator 53 to the prior-art memory controller.

Second embodiment.
FIG. 7 is a block diagram showing the configuration of a memory controller 4A according to the second embodiment of the present invention. In FIG. 7, the memory controller 4A differs from the memory controller 4 of FIG. 3 in that it further includes a DDR controller 45 and includes read buffer controllers 10A, 11A, 12A, and 13A in place of the read buffer controllers 10, 11, 12, and 13. Since the read buffer controllers 11A, 12A, and 13A are configured in the same manner as the read buffer controller 10A, their illustration and description are omitted.

In FIG. 7, the read buffer controller 10A includes a read buffer counter circuit 51, a loop register 52, and a comparator 53. Like the read buffer counter circuit 51 of the memory controller 4 of FIG. 3, the read buffer counter circuit 51 resets the address value C51 to 0 upon receiving a read transfer start command from the global processor 2, and thereafter, each time one data item is transferred from the read buffer RB0 to the SIMD type processor 1, increments the address value C51 by 1 and outputs it to the read buffer RB0 and the comparator 53. Like the loop register 52 of the memory controller 4 of FIG. 3, the loop register 52 stores a predetermined maximum address value C52 in advance and outputs it to the comparator 53. The comparator 53 compares the address value C51 with the maximum address value C52 and, when the two match, generates a counter reset signal S53 and outputs it to both the read buffer counter circuit 51 and the DDR controller 45.

In FIG. 7, in response to the counter reset signal S53, the DDR controller 45 generates a stop signal S45 for stopping data transfer from the DDR memory 3 to the read buffer RB0 and outputs it to the DDR memory 3. In response, the DDR memory 3 stops the data transfer to the read buffer RB0.

In general, the DDR memory 3 performs burst transfer, in which a predetermined group of data is transferred continuously, so it cannot stop a transfer immediately in response to the stop signal S45 (it overruns). Nevertheless, according to the present embodiment, the data transfer to the read buffer RB0 can be stopped even when the DDR memory 3 is performing a data transfer of the maximum number of data items, corresponding to the number (N+1) of processor elements PE0 to PEN. Compared with the first embodiment, the unnecessary transfer of data other than the dither table data from the DDR memory 3 to the memory controller 4A can therefore be greatly reduced. This reduces the number of accesses from the memory controller 4A to the DDR memory 3 and thus the current consumption of an apparatus equipped with the SIMD type processor 1. Moreover, because circuits other than the SIMD type processor 1 can access the DDR memory 3 while a read transfer from the memory controller 4A to the SIMD type processor 1 is in progress, the processing speed of an apparatus equipped with the SIMD type processor 1 can be improved compared with the first embodiment.

Third embodiment.
Each of the embodiments described above dealt with read transfer in the case where the number (N+1) of processor elements PE0 to PEN is divisible by the number of columns of the dither table data stored in the DDR memory 3. However, when the number of pixel data contained in one line of image data is larger than the number (N+1) of processor elements PE0 to PEN and the number (N+1) is not divisible by the number of columns of the dither table data, the following problem arises.

For example, suppose that one line of image data contains 700 pixel data, that there are 350 processor elements PE0 to PEN, and that the dither table data has 4 columns. When dither processing is performed on the first 350 pixel data of the line, the threshold data of the dither table cells are stored in the registers Rj of the processor elements PE0 to PE349 in the order A, B, C, D, A, B, ..., A, B. Then, when dither processing is performed on the remaining 350 pixel data of the line, the threshold data of the dither table cells must be stored in the registers Rj of the processor elements PE0 to PE349 in the order C, D, A, B, C, D, and so on. In the embodiments described above, however, the address of each of the read buffers RB0 to RB3 at the start of a read transfer from the memory controller 4 or 4A to the SIMD type processor 1 is always 0, so only the data of the cell A of the dither table data can be stored in the register Rj of the processor element PE0. Consequently, dither processing cannot be performed on the latter 350 pixel data of the line. The present embodiment aims to solve this problem.
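The mismatch described in this example can be made concrete by computing, for each block of pixels handled in one pass, the read buffer address at which the transfer would have to start. This Python sketch is illustrative; the constant and function names are hypothetical, and it assumes the dither pattern is aligned to pixel 0 of the line.

```python
COLUMNS = 4            # number of columns of the dither table data
NUM_PES = 350          # number of processor elements
PIXELS_PER_LINE = 700  # pixel data in one line of image data
CELLS = "ABCD"         # first row of the dither table, held in read buffer RB0


def required_start_address(first_pixel, columns):
    """Read buffer address whose cell belongs to the pixel at index first_pixel."""
    return first_pixel % columns


starts = [required_start_address(p, COLUMNS)
          for p in range(0, PIXELS_PER_LINE, NUM_PES)]
needed = [CELLS[a] for a in starts]
print(starts, needed)
```

The first pass needs to start at address 0 (cell A), while the second pass needs to start at address 2 (cell C), which a counter that always restarts from 0 cannot provide.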

FIG. 8 is a block diagram showing the configuration of a memory controller 4B according to the third embodiment of the present invention. The memory controller 4B according to the present embodiment differs from the memory controller 4 of FIG. 3 in that it includes read buffer controllers 10B, 11B, 12B, and 13B in place of the read buffer controllers 10, 11, 12, and 13. Since the read buffer controllers 11B, 12B, and 13B are configured in the same manner as the read buffer controller 10B, their illustration and description are omitted.

In FIG. 8, the read buffer controller 10B is characterized in that, compared with the read buffer controller 10, it further includes a reset value setting circuit 56 having a reset value register 54 and a multiplexer 55. The reset value register 54 stores the reset values 0, 1, 2, and 3 in advance and outputs them to the multiplexer 55. At the start of a read transfer from the memory controller 4B to the SIMD type processor 1, the global processor 2 outputs to the multiplexer 55 a read transfer start command that instructs data transfer from the read buffer RB0 to the SIMD type processor 1 and specifies a reset value. In response, the multiplexer 55 selects, from among the reset values 0 to 3 supplied by the reset value register 54, the reset value specified in the read transfer start command and outputs it to the read buffer counter circuit 51 as the reset value C55.

Further, in FIG. 8, at the start of a read transfer from the memory controller 4B to the SIMD type processor 1, the read buffer counter circuit 51 resets the address value C51 to the reset value C55 from the multiplexer 55. Thus, according to the present embodiment, the global processor 2 can reset the address value C51 of the read buffer counter circuit 51 to a desired value at the start of a read transfer from the memory controller 4B to the SIMD type processor 1. For example, when the address value C51 of the read buffer counter circuit 51 is reset to 2 at the start of a read transfer, the threshold data are transferred repeatedly in order starting from the cell C of the dither table data stored at address 2 of the read buffer RB0, followed by the cells D, A, B, C, D, A, and so on. The problem described above can therefore be solved.
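The effect of a selectable reset value can be sketched with a generator. This is an illustration under assumptions rather than the disclosed circuit: the name `read_sequence` is hypothetical, and the sketch assumes that the reset value C55 applies only at the start of a read transfer while the counter reset signal S53 still returns the counter to 0, which is consistent with the order C, D, A, B given above.

```python
from itertools import islice


def read_sequence(read_buffer, max_address, reset_value=0):
    """Yield read buffer entries, starting at reset_value and wrapping at max_address."""
    address = reset_value  # C51 loaded with reset value C55 at transfer start
    while True:
        yield read_buffer[address]
        address += 1
        if address == max_address:  # comparator match, counter returns to 0
            address = 0


second_half = list(islice(read_sequence("ABCD", max_address=4, reset_value=2), 7))
print(second_half)
```

Starting from a reset value of 2, the sequence begins with cell C and then continues cyclically through D, A, B.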

In general, the number (N+1) of processor elements PE0 to PEN of the SIMD type processor 1 is often divisible by the number of columns of the dither table data. However, the larger the matrix size of the dither table data (for example, 64×64), the more likely it is that the number (N+1) of processor elements PE0 to PEN of the SIMD type processor 1 is not divisible by the number of columns of the dither table data. In such cases, the memory controller 4B according to the present embodiment is effective.

Although the reset value setting circuit 56 has the configuration shown in FIG. 8 in the present embodiment, the present invention is not limited to this; any configuration that outputs an arbitrary reset value C55 to the read buffer counter circuit 51 may be used. For example, the global processor 2 may be configured to perform control so that the address value C51 of the read buffer counter circuit 51 at the end of a read transfer is transferred to a reset value register, and the reset value setting circuit may be configured to output, at the start of a read transfer, the reset value stored in the reset value register to the read buffer counter circuit 51 as the reset value C55. Alternatively, the reset value setting circuit may be configured to include an offset register that stores a reset value that the user can set by a program, so that the user sets a desired reset value by a program.

Further, as in the memory controller 4A according to the second embodiment, a stop signal may be generated on the basis of the counter reset signal S53 and output to the DDR memory 3.

Fourth embodiment.
In each of the embodiments described above, the SIMD type processor 1 and the memory controller 4, 4A, or 4B are connected via the four data buses DP0 to DP3. Therefore, when the dither table data has four rows, the threshold data of all the cells A to P of the dither table data can be stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN, as shown in FIG. 6, by a single read transfer from the memory controller 4, 4A, or 4B to the SIMD type processor 1.

In each of the embodiments described above, when dither table data having a matrix size of, for example, 8×8 is used, read transfer is performed between the memory controller 4, 4A, or 4B and the SIMD type processor 1 as follows. FIG. 10 is a block diagram showing the dither table data having an 8×8 matrix size stored in the DDR memory 3 of FIG. 9 and the threshold data stored in the registers Rj to Rj+3 of each of the processor elements PE0 to PEN when dither processing is performed using that dither table data. In the dither table of FIG. 10, the number in each cell indicates the cell number.

For example, in the case of the first embodiment, in FIG. 10, the threshold data of the cells 1 to 8 of the dither table data are first transferred from the DDR memory 3 to the read buffer RB0 of the memory controller 4 (see FIG. 2), the threshold data of the cells 9 to 16 to the read buffer RB1, the threshold data of the cells 17 to 24 to the read buffer RB2, and the threshold data of the cells 25 to 32 to the read buffer RB3. The threshold data stored in the read buffers RB0 to RB3 are then transferred repeatedly to the registers Rj to Rj+3 of the processor elements PE0 to PEN. The global processor 2 then performs dither processing on the first to fourth lines of the image data using the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN.

Next, in FIG. 10, the threshold data of the cells 33 to 40 of the dither table data are transferred from the DDR memory 3 to the read buffer RB0 of the memory controller 4, the threshold data of the cells 41 to 48 to the read buffer RB1, the threshold data of the cells 49 to 56 to the read buffer RB2, and the threshold data of the cells 57 to 64 to the read buffer RB3. The threshold data stored in the read buffers RB0 to RB3 are then transferred repeatedly to the registers Rj to Rj+3 of the processor elements PE0 to PEN. The global processor 2 then performs dither processing on the fifth to eighth lines of the image data using the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN. Thus, the transfer from the DDR memory 3 to the memory controller 4 had to be performed twice. The present embodiment aims to reduce the number of transfers from the DDR memory 3 to the memory controller 4 compared with the embodiments described above.

FIG. 9 is a block diagram showing the configuration of a memory controller 4C according to the fourth embodiment of the present invention. In FIG. 9, the memory controller 4C differs from the memory controller 4 of FIG. 3 in that it includes read buffer controllers 10C, 11C, 12C, and 13C in place of the read buffer controllers 10, 11, 12, and 13. The read buffer controller 10C includes a read buffer counter circuit 51, a loop register 52A, a comparator 53, an offset value setting circuit 60 having an offset value register 57 and a multiplexer 58, and an adder 59. Since the read buffer controllers 11C, 12C, and 13C are configured in the same manner as the read buffer controller 10C, their illustration and description are omitted.

In FIG. 9, at the start of a read transfer from the memory controller 4C to the SIMD type processor 1, the global processor 2 outputs, to the multiplexer 58, the read buffer counter circuit 51, and the loop register 52A, a read transfer start command that instructs data transfer from the read buffer RB0 to the SIMD type processor 1 and specifies an offset value and a maximum address value. The offset value register 57 stores the offset values 0 and 8 in advance and outputs them to the multiplexer 58. The multiplexer 58 selects, from among the offset values supplied by the offset value register 57, the offset value specified in the read transfer start command from the global processor 2 and outputs it to the adder 59 as the offset value C58.

In FIG. 9, the loop register 52A stores the maximum address value specified in the read transfer start command from the global processor 2 and outputs it to the comparator 53 as the maximum address value C52A. The read buffer counter circuit 51 resets the address value C51 to 0 in response to the read transfer start command from the global processor 2 and, each time the read buffer RB0 transfers one data item to the SIMD type processor 1, increments the address value C51 by 1 and outputs it to the adder 59. The adder 59 adds the offset value C58, selected from the offset value register 57, to the address value C51 from the read buffer counter circuit 51 and outputs the sum as the address value C59 to the read buffer RB0 and the comparator 53. In response, the read buffer RB0 outputs the threshold data stored at the address indicated by the address value C59 to the data bus DP0. The comparator 53 compares the address value C59 with the maximum address value C52A and, when the two match, generates the counter reset signal S53 and outputs it to the read buffer counter circuit 51. In response, the read buffer counter circuit 51 resets the address value C51 to 0.
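The address generation with an offset can be sketched as follows. This Python fragment is a behavioral illustration only: the function name `read_with_offset` is hypothetical, and the read buffer is filled with its own addresses (0 to 15) as placeholder data so that the output shows which addresses are visited.

```python
def read_with_offset(read_buffer, offset, max_address, count):
    """Model of counter 51, adder 59 and comparator 53 for `count` transfers."""
    counter = 0                       # C51, reset to 0 by the start command
    out = []
    for _ in range(count):
        address = counter + offset    # adder 59 output C59 = C51 + C58
        out.append(read_buffer[address])
        counter += 1
        if counter + offset == max_address:  # C59 matches C52A, S53 asserted
            counter = 0
    return out


placeholder_buffer = list(range(16))  # hypothetical contents: entry i holds i
first_window = read_with_offset(placeholder_buffer, offset=0, max_address=8, count=10)
second_window = read_with_offset(placeholder_buffer, offset=8, max_address=16, count=10)
print(first_window)
print(second_window)
```

With an offset of 0 and a maximum address value of 8 the model cycles through addresses 0 to 7, and with an offset of 8 and a maximum address value of 16 it cycles through addresses 8 to 15, so two windows of one read buffer can be selected without reloading it.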

Next, with reference to FIG. 11, the operation of the global processor 2 and the memory controller 4C is described for the case where dither processing is performed using the dither table data of FIG. 10, which has an 8 × 8 matrix size. FIG. 11 is a block diagram showing the following: the dither table data stored in the DDR memory 3 of FIG. 10; how the threshold data are stored in the read buffers RB0 to RB3 of FIG. 9; the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when the offset value C58 is 0 and the maximum address value C52A is 8 (the first read transfer); and the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN when the offset value C58 is 8 and the maximum address value C52A is 16 (the second read transfer).

First, in FIG. 11, at the start of dither processing, the global processor 2 controls the memory controller 4C so that the threshold data of the 8 × 8 dither table data stored in the DDR memory 3 are stored as follows. The threshold data of cells 1 to 8 (the columns of the first row) and cells 33 to 40 (the columns of the fifth row) are stored sequentially at addresses 0 to 15 of the read buffer RB0. The threshold data of cells 9 to 16 (the columns of the second row) and cells 41 to 48 (the columns of the sixth row) are stored sequentially at addresses 0 to 15 of the read buffer RB1. The threshold data of cells 17 to 24 (the columns of the third row) and cells 49 to 56 (the columns of the seventh row) are stored sequentially at addresses 0 to 15 of the read buffer RB2. The threshold data of cells 25 to 32 (the columns of the fourth row) and cells 57 to 64 (the columns of the eighth row) are stored sequentially at addresses 0 to 15 of the read buffer RB3. As a result, a single data transfer from the DDR memory 3 to the memory controller 4C stores all of the threshold data in the read buffers RB0 to RB3, as shown in FIG. 11.
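The storage layout of the read buffers can be illustrated with a short Python sketch. The function name `fill_read_buffers` is hypothetical, the table entries are simply the cell numbers 1 to 64 standing in for real threshold values, and buffer addresses are taken as zero-based list indices, matching the address values 0 to 15 used by the read transfers.

```python
def fill_read_buffers(table, num_buffers=4):
    """Return one list per read buffer; the list index is the buffer address.

    Buffer b receives row b, then row b + num_buffers, and so on, so that
    RB0 holds rows 1 and 5 of an 8-row table, RB1 rows 2 and 6, etc.
    """
    rows = len(table)
    if rows % num_buffers != 0:
        raise ValueError("row count must be a multiple of num_buffers")
    rows_per_buffer = rows // num_buffers  # 2 for an 8-row table
    buffers = []
    for b in range(num_buffers):
        contents = []
        for p in range(rows_per_buffer):
            contents.extend(table[b + p * num_buffers])
        buffers.append(contents)
    return buffers


# 8 x 8 table whose entries are the cell numbers 1..64 (stand-ins for thresholds)
table = [[8 * r + c + 1 for c in range(8)] for r in range(8)]
rb0, rb1, rb2, rb3 = fill_read_buffers(table)
print(rb0)  # cells of row 1 followed by cells of row 5
```

Each buffer ends up with 16 entries, so one pass over the table fills all four buffers.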

Next, the global processor 2 outputs a read transfer start command, which instructs data transfer from the read buffer RB0 to the SIMD type processor 1 and specifies the offset value 0 and the maximum address value 8, to the multiplexer 58, the read buffer counter circuit 51, and the loop register 52A. In response, the multiplexer 58 outputs the offset value C58 (which is 0) to the adder 59, the read buffer counter circuit 51 resets the address value C51 to 0, and the loop register 52A outputs the maximum address value C52A (which is 8) to the comparator 53.

Accordingly, the address value C51 from the read buffer counter circuit 51 changes as 0, 1, 2, ..., 7, 8, 0, 1, .... Because the address value C51 is reset to 0 when it reaches 8, the address value C59 from the adder 59 at the data transfer timings from the memory controller 4C to the SIMD type processor 1 changes as 0, 1, 2, ..., 7, 0, 1, 2, .... As a result, the threshold data of cells 1 to 8 stored at addresses 0, 1, 2, ..., 7 of the read buffer RB0 are output to the data bus DP0 sequentially and repeatedly. The global processor 2 performs control so that the threshold data output to the data bus DP0 are stored sequentially in the registers Rj of the processor elements PE0 to PEN.

The global processor 2 also controls the read buffer controllers 11C, 12C, and 13C, which are configured in the same way as the read buffer controller 10C, in the same manner as the read buffer controller 10C. As a result, as shown in FIG. 11, the threshold data of cells 1 to 32 of the dither table data are stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN. The global processor 2 then performs dither processing on the image data of the first to fourth lines using the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN.

Next, without performing any data transfer from the DDR memory 3 to the memory controller 4C, the global processor 2 outputs a read transfer start command, which instructs data transfer from the read buffer RB0 to the SIMD type processor 1 and specifies the offset value 8 and the maximum address value 16, to the multiplexer 58, the read buffer counter circuit 51, and the loop register 52A. In response, the multiplexer 58 outputs the offset value C58 (which is 8) to the adder 59, the read buffer counter circuit 51 resets the address value C51 to 0, and the loop register 52A outputs the maximum address value C52A (which is 16) to the comparator 53.

Accordingly, the address value C51 from the read buffer counter circuit 51 changes as 0, 1, 2, ..., 7, 8, 0, 1, .... Because the address value C51 is reset to 0 when the address value C59 reaches the maximum address value 16 (that is, when C51 reaches 8), the address value C59 from the adder 59 at the data transfer timings from the memory controller 4C to the SIMD type processor 1 changes as 8, 9, 10, ..., 15, 8, 9, 10, .... As a result, the threshold data of cells 33 to 40 stored at addresses 8, 9, 10, ..., 15 of the read buffer RB0 are output to the data bus DP0 sequentially and repeatedly. The global processor 2 performs control so that the threshold data output to the data bus DP0 are stored sequentially in the registers Rj of the processor elements PE0 to PEN.
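A step-by-step trace helps to separate the counter value C51 from the read address C59 in this second transfer. The sketch below is a hypothetical model (the function name `trace` is not from the patent) that assumes, as described for FIG. 9, that the comparator 53 compares C59 with the maximum address value, so the reset occurs in the state where C51 is 8 and C59 is 16.

```python
def trace(offset, max_address, steps):
    """Return (C51, C59, reset) for each successive counter state."""
    counter = 0  # C51 after the read transfer start command
    states = []
    for _ in range(steps):
        address = counter + offset  # C59 from the adder 59
        reset = address == max_address  # comparator 53 asserts S53
        states.append((counter, address, reset))
        counter = 0 if reset else counter + 1
    return states


states = trace(offset=8, max_address=16, steps=11)
for c51, c59, s53 in states:
    print(c51, c59, "S53 (no data read)" if s53 else "")
```

The state with C51 equal to 8 appears in the counter sequence but produces no read, which is why the read addresses run from 8 to 15 only.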

The global processor 2 also controls the read buffer controllers 11C, 12C, and 13C, which are configured in the same way as the read buffer controller 10C, in the same manner as the read buffer controller 10C. As a result, as shown in FIG. 11, the threshold data of cells 33 to 64 of the dither table data are stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN. The global processor 2 then performs dither processing on the image data of the fifth to eighth lines using the threshold data stored in the registers Rj to Rj+3 of the processor elements PE0 to PEN.
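The register contents after each of the two read transfers can be worked out with a small Python sketch. It rests on assumptions that the text implies but does not state in this form: processor element PEi receives the i-th data item of each transfer, the read buffers RB0 to RB3 feed the registers Rj to Rj+3 respectively, and cells are numbered 1 to 64 row by row. The function name `register_contents` is hypothetical.

```python
def register_contents(transfer, num_pe, columns=8, rows_per_transfer=4):
    """Cell numbers held in Rj..Rj+3 of each PE after one read transfer.

    transfer is 0 for the first read transfer (offset 0, maximum address 8)
    and 1 for the second (offset 8, maximum address 16).
    """
    result = []
    for pe in range(num_pe):
        column = pe % columns  # the read address wraps after `columns` items
        registers = [
            (transfer * rows_per_transfer + k) * columns + column + 1
            for k in range(rows_per_transfer)  # k selects RBk and Rj+k
        ]
        result.append(registers)
    return result


first = register_contents(transfer=0, num_pe=10)
second = register_contents(transfer=1, num_pe=10)
print(first[0], first[9])  # PE0 and PE9 after the first read transfer
print(second[0])           # PE0 after the second read transfer
```

Under these assumptions, PE0 holds cells 1, 9, 17, and 25 after the first transfer and cells 33, 41, 49, and 57 after the second, and PE8 holds the same cells as PE0 because the address sequence repeats every eight items.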

As described above, according to the present embodiment, when the data transfer width between the SIMD type processor 1 and the DDR memory 3 is 32 bits, the 8 × 8 dither table data can be transferred to the SIMD type processor 1 by performing the data transfer from the DDR memory 3 to the memory controller 4C only once and the data transfer from the memory controller 4C to the SIMD type processor 1 twice.

In the present embodiment, dither table data having an 8 × 8 matrix size are transferred to the SIMD type processor 1, but the present invention is not limited to this. When dither table data having a K × L matrix size (K and L being positive integers) are transferred to the SIMD type processor 1, the read buffer RB0, for example, sequentially stores the data of each column of a predetermined first row of the dither table data and the data of each column of a predetermined second row of the dither table data at predetermined addresses. When the data of each column of the first row are transferred to the processor elements PE0 to PEN, the offset value C58 is set to the storage address of the first-column data of the first row, and the maximum address value C52A is set to the address of the read buffer RB0 that follows the storage address of the data of the first row stored last in the read buffer RB0. Likewise, when the data of each column of the second row are transferred to the processor elements PE0 to PEN, the offset value C58 is set to the storage address of the first-column data of the second row, and the maximum address value C52A is set to the address of the read buffer RB0 that follows the storage address of the data of the second row stored last in the read buffer RB0.

Also, in the present embodiment, threshold data for two rows of the dither table data are transferred to each of the read buffers RB0 to RB3; however, the present invention is not limited to this, and threshold data for three or more rows may be transferred. For example, when threshold data for a plurality of rows of the dither table data are transferred sequentially to the read buffer RB0, the offset value C58 is set to the storage address of the first-column threshold data of the row to be transferred repeatedly to the SIMD type processor 1, and the maximum address value C52A is set to the address of the read buffer RB0 that follows the storage address of the threshold data of that row stored last in the read buffer RB0.
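The rule for choosing the offset value C58 and the maximum address value C52A can be written as a one-line calculation. The sketch below assumes that the rows held in one read buffer are stored contiguously, each occupying L consecutive addresses, starting at a base address; the names `transfer_window`, `row_slot`, and `base_address` are hypothetical.

```python
def transfer_window(row_slot, columns, base_address=0):
    """Return (offset, max_address) for one row held in a read buffer.

    row_slot is the zero-based position of the row within the buffer
    (0 for the row stored first), and columns is L, the row length.
    """
    if row_slot < 0 or columns <= 0 or base_address < 0:
        raise ValueError("row_slot and base_address must be >= 0, columns > 0")
    offset = base_address + row_slot * columns  # address of the first column
    max_address = offset + columns  # address following the last column
    return offset, max_address


print(transfer_window(0, 8))   # first row of the 8 x 8 example
print(transfer_window(1, 8))   # second row of the 8 x 8 example
print(transfer_window(2, 16))  # third row stored in a buffer, 16 columns
```

For the 8 × 8 example this reproduces the pairs (0, 8) and (8, 16) used in the two read transfers, and the same calculation extends to buffers that hold three or more rows.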

Also, in the present embodiment, the read buffer controller 10C may further include the reset value setting circuit 56 of the third embodiment.

Furthermore, in each of the above embodiments, the matrix size of the dither table data is 4 × 4 or 8 × 8; however, the present invention is not limited to this, and other matrix sizes such as 16 × 16 or 32 × 32 may be used.

Still further, in each of the above embodiments, the memory controllers 4, 4A, 4B, and 4C transfer dither table data used for dither processing to the SIMD type processor 1; however, the present invention is not limited to this, and predetermined data used for predetermined processing such as image processing may be transferred from the DDR memory 3 to the SIMD type processor 1.

In each of the embodiments described above, four registers Rj to Rj+3 of each of the processor elements PE0 to PEN are connected to the DDR memory 3 via the memory controller 4, 4A, 4B, or 4C; however, the present invention is not limited to this, and it suffices to connect at least one register of each of the processor elements PE0 to PEN to the DDR memory 3 via the memory controller 4, 4A, 4B, or 4C. In that case, the number of read buffer controllers 10, 10A, 10B, or 10C provided should equal the number of registers connected to the DDR memory 3 in each of the processor elements PE0 to PEN.

Furthermore, in each of the embodiments described above, the memory controllers 4, 4A, 4B, and 4C are provided outside the SIMD type processor 1; however, the present invention is not limited to this, and the memory controllers 4, 4A, 4B, and 4C may be provided inside the SIMD type processor 1. This makes it possible to provide a SIMD type processor that includes the processor elements PE0 to PEN and the memory controller 4, 4A, 4B, or 4C.

1 ... SIMD type processor,
2 ... global processor,
3 ... DDR memory,
4, 4A, 4B, 4C ... memory controller,
10, 10A, 10B, 10C, 11, 12, 13 ... read buffer controller,
51 ... read buffer counter circuit,
52, 52A ... loop register,
53 ... comparator,
54 ... reset value register,
55 ... multiplexer,
56 ... reset value setting circuit,
57 ... offset value register,
58 ... multiplexer,
59 ... adder,
60 ... offset value register,
PE0 to PEN ... processor elements,
RB0, RB1, RB2, RB3 ... read buffers.

JP 2010-15438 A. JP 2006-509284 A (published Japanese translation of a PCT application). JP 2002-127503 A.

Claims

1. A memory controller comprising:
a read buffer that temporarily stores data from a storage device sequentially at predetermined addresses and transfers the data stored at the address indicated by an input address value to a plurality of processor elements of a SIMD (Single Instruction-stream Multiple Data-stream) type processor; and
a read buffer controller that generates the address value and outputs the address value to the read buffer,
wherein the read buffer controller comprises:
a read buffer counter circuit that increments and outputs the address value each time the read buffer transfers the data to the processor elements;
a loop register that stores a predetermined maximum address value; and
a comparator that compares the address value output from the read buffer counter circuit with the maximum address value and, when the address value matches the maximum address value, generates a counter reset signal for resetting the read buffer counter circuit and outputs the counter reset signal to the read buffer counter circuit.

2. The memory controller according to claim 1, wherein the memory controller generates, based on the counter reset signal, a stop signal for stopping data transfer from the storage device to the memory controller, and outputs the stop signal to the storage device.

3. The memory controller according to claim 1, wherein
the read buffer controller further includes a reset value setting circuit that outputs a predetermined reset value to the read buffer counter circuit at the start of data transfer from the read buffer to each processor element, and
the read buffer counter circuit resets the address value to the reset value at the start of data transfer from the read buffer to each processor element.

4. The memory controller, wherein
the read buffer sequentially stores the data of each column of a predetermined row of predetermined table data from the storage device at predetermined addresses, and
the maximum address value is set to the address of the read buffer that follows the storage address of the data stored last in the read buffer.

5. The memory controller according to any one of claims 1 to 4, wherein
the read buffer controller further comprises:
an offset value setting circuit that outputs a predetermined offset value; and
an adder that adds the offset value to the address value output from the read buffer counter circuit and outputs the resulting address value to the read buffer and the comparator, and
the comparator compares the address value from the adder, instead of the address value output from the read buffer counter circuit, with the maximum address value.

6. The memory controller according to claim 5, wherein
the read buffer sequentially stores, at predetermined addresses, the data of each column of a predetermined first row of predetermined table data from the storage device and the data of each column of a predetermined second row of the table data,
when the data of each column of the first row are transferred to the processor elements, the offset value is set to the storage address of the first-column data of the first row, and the maximum address value is set to the address of the read buffer that follows the storage address of the data of the first row stored last in the read buffer, and
when the data of each column of the second row are transferred to the processor elements, the offset value is set to the storage address of the first-column data of the second row, and the maximum address value is set to the address of the read buffer that follows the storage address of the data of the second row stored last in the read buffer.

7. The memory controller according to claim 4, wherein the table data is dither table data for dither processing.

8. A SIMD type processor comprising:
the plurality of processor elements; and
the memory controller according to claim 1.