[go: up one dir, main page]

JPH06119368A - Relaxation method simultaneous equation analysis computer - Google Patents

Relaxation method simultaneous equation analysis computer

Info

Publication number
JPH06119368A
JPH06119368A JP3248099A JP24809991A JPH06119368A JP H06119368 A JPH06119368 A JP H06119368A JP 3248099 A JP3248099 A JP 3248099A JP 24809991 A JP24809991 A JP 24809991A JP H06119368 A JPH06119368 A JP H06119368A
Authority
JP
Japan
Prior art keywords
calculation
analysis
elements
equations
gauss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP3248099A
Other languages
Japanese (ja)
Inventor
Naohiko Shimizu
尚彦 清水
Mamoru Tanaka
衞 田中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Priority to JP3248099A priority Critical patent/JPH06119368A/en
Publication of JPH06119368A publication Critical patent/JPH06119368A/en
Pending legal-status Critical Current

Links

Landscapes

  • Complex Calculations (AREA)
  • Multi Processors (AREA)

Abstract

(57)【要約】 【目的】 本発明は線形連立方程式の解析に係り、特に
緩和法を用いた大規模方程式の解析に好適な専用計算機
に関し、緩和法の解析アルゴリズムとしてSOR 法やガウ
スザイデル法などの高性能なアルゴリズムを汎用に扱え
る方法の実現を目的とする。 【構成】 計算要素PE1 ,PE2 ,...,PEN をリング状に
接続し各々の計算要素に連立方程式の全ての変数を計算
させ、この計算の過程で各々の計算要素はリング上の隣
接する計算要素に自らが計算した変数の値を転送し、緩
和法の反復演算に次々と新しい値を用いることで、収束
までの反復回数を低減させる。
(57) [Summary] [Object] The present invention relates to the analysis of linear simultaneous equations, and particularly to a dedicated computer suitable for the analysis of large-scale equations using the relaxation method. The SOR method and the Gauss-Seidel method are used as the analysis algorithms of the relaxation method. The purpose is to realize a method that can handle high-performance algorithms such as. [Configuration] Computation elements PE 1 , PE 2 , ..., PE N are connected in a ring shape, and each computation element is made to compute all variables of the simultaneous equations. The number of iterations until convergence is reduced by transferring the value of the variable calculated by itself to the adjacent calculation element of and using new values one after another in the iterative calculation of the relaxation method.

Description

【発明の詳細な説明】Detailed Description of the Invention

【0001】[0001]

【産業上の利用分野】本発明は線形連立方程式の解析に
係り、特に緩和法を用いた大規模方程式の解析に好適な
専用計算機に関する。
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to analysis of linear simultaneous equations, and more particularly to a dedicated computer suitable for analysis of large-scale equations using a relaxation method.

【0002】[0002]

【従来の技術】多くの工学上の解析は線形連立方程式の
解析に帰結するので、大規模な線形連立方程式の解析は
応用上重要である。そのためマルチプロセッサや専用ハ
ードウェアにより解析の高速化をはかる研究が行なわれ
てきた。
2. Description of the Related Art Since many engineering analyzes result in the analysis of linear simultaneous equations, the analysis of large-scale linear simultaneous equations is important for application. For this reason, studies have been conducted to speed up analysis by using a multiprocessor and dedicated hardware.

【0003】線形連立方程式の解法にはLU分解法と緩和
法として知られる方法がある。大規模な線形連立方程式
はその行列表現においてほとんどの行列要素が零である
スパース性を示すことが多い。
Methods for solving linear simultaneous equations include methods known as LU decomposition method and relaxation method. Large linear simultaneous equations often show sparseness in which most matrix elements are zero in their matrix representation.

【0004】LU分解法はこのようなスパースな行列を解
析する場合に解析中に非零要素が増加するフィルインを
起こすことが知られている。緩和法は行列の性質を用い
て演算の反復によって解析を行なうものであり、反復法
とも呼ばれる。緩和法はフィルインを起こさないので特
に大規模な線形連立方程式に有用である。良く知られた
方法にガウスヤコビ法,ガウスザイデル法,SOR 法など
がある。
The LU decomposition method is known to cause fill-in in which non-zero elements increase during analysis when analyzing such a sparse matrix. The relaxation method is an iterative method that performs analysis by iterating operations using the properties of a matrix. The relaxation method is particularly useful for large-scale linear simultaneous equations because it does not cause fill-in. Well-known methods include the Gauss-Jacobi method, the Gauss-Seidel method, and the SOR method.

【0005】ガウスヤコビ法は、連立方程式をAx=b
とすると、 x(k+1)={b−(A−D)x(k) }/D, ただし、DはAの対角要素からなる対角行列、x(k)は
xのk番目の反復による近似解である。ガウスヤコビ法
は簡単だが反復回数が多いので単一プロセッサでの解析
には普通使われない。しかし、xの各要素の解析が独立
して行えるのでマルチプロセッサでの解析には適してい
る。
In the Gauss-Jacobi method, simultaneous equations are expressed as Ax = b
Then, x (k + 1) = {b− (A−D) x (k)} / D, where D is a diagonal matrix of diagonal elements of A, and x (k) is the k-th position of x Is an approximate solution by iterating. The Gauss-Jacobi method is simple but has a large number of iterations and is not commonly used for single processor analysis. However, since each element of x can be analyzed independently, it is suitable for analysis by a multiprocessor.

【0006】ガウスザイデル法は、同様に表すと、 x(k+1)={b−Lx(K+1)−Ux(k)}/D, ただし、LはAの左下半分の要素からなる下三角行列、
UはAの右上半分の要素からなる上三角行列、DはAの
対角要素からなる対角行列である。ガウスザイデル法で
はLに係る項がその反復ステップで計算した値であり、
マルチプロセッサで実現するためにはこの計算値の伝播
が問題である。
The Gauss-Seidel method is similarly expressed as follows: x (k + 1) = {b-Lx (K + 1) -Ux (k)} / D, where L is the lower left half element of A Lower triangular matrix,
U is an upper triangular matrix composed of elements in the upper right half of A, and D is a diagonal matrix composed of diagonal elements of A. In the Gauss-Seidel method, the term related to L is the value calculated in the iterative step,
Propagation of this calculated value is a problem to realize with multiprocessor.

【0007】密行列を仮定するとx(k+1)の各要素を順
次計算する必要があり並列化の効果を出すことは難し
い。SOR 法はガウスザイデル法の反復の各ステップにお
いてガウスザイデル法の収束を過大修正により加速する
ものである。過大修正の修正量が対象とする行列毎に異
なり修正量の決定が難しい。その他の特徴はガウスザイ
デル法に準じているが、適切な修正量を用いると非常に
高速化が成されることが知られている。
If a dense matrix is assumed, it is necessary to sequentially calculate each element of x (k + 1), and it is difficult to achieve the effect of parallelization. The SOR method accelerates the convergence of the Gauss-Seidel method by overcorrection at each step of the Gauss-Seidel method iteration. The correction amount of overcorrection varies depending on the target matrix, and it is difficult to determine the correction amount. Other features are based on the Gauss-Seidel method, but it is known that a very high speed can be achieved by using an appropriate correction amount.

【0008】従来マルチプロセッサで緩和法を実施する
場合は、線形連立方程式が記述するグラフ構造を分割し
て各プロセッサに計算させている。緩和法の計算を分割
して実行するためには分割のために切断したグラフの枝
の数だけの通信が各プロセッサ間で必要となる。
Conventionally, when the relaxation method is carried out by a multiprocessor, the graph structure described by the linear simultaneous equations is divided and each processor is made to calculate. In order to divide and execute the calculation of the relaxation method, it is necessary to communicate between the processors as many as the number of edges of the graph cut for the division.

【0009】ガウスザイデル法では分割したプロセッサ
間で最新の演算結果を通信し合う必要があり、マルチプ
ロセッサでの実現が難しく、また、計算の順序性からマ
ルチプロセッサにしても逐次処理となることも多かっ
た。SOR 法もガウスザイデルと同じ特徴がある。
In the Gauss-Seidel method, it is necessary to communicate the latest calculation results among the divided processors, which is difficult to realize in a multiprocessor. Further, due to the order of calculation, even a multiprocessor may be serialized. There were many. The SOR method has the same characteristics as Gauss-Seidel.

【0010】そこで反復回数が多く性能は劣るが実現が
容易なガウスヤコビ法を用いることが多かった。SOR 法
はLaplace 方程式の離散化解析などの特定の問題向きに
マルチプロセッサで実施された例がある。
Therefore, the Gauss-Jacobi method, which has a large number of iterations and is inferior in performance but is easy to realize, is often used. The SOR method has been implemented by a multiprocessor for specific problems such as discretized analysis of Laplace equation.

【0011】従来の専用ハードウェアは、小池:CADマシ
ン;オーム社, pp.181-187,村上,田中:反復法に基づ
く非線形回路網解析とハード化の検討;昭和59年度電子
通信学会総合全国大会,1702,浦野,田中:緩和法に基づ
く大規模非線形回路網解析用計算機;信学技法,CAS85-3
2 (1985),Ping-Sheng Tseng:A Systolic Array Paralle
lizing Compiler ;Kluwer Academic Publishers(1990)
などに示されている。
Conventional dedicated hardware is Koike: CAD machine; Ohmsha, pp.181-187, Murakami, Tanaka: Nonlinear network analysis based on iterative method and examination of hardware; Tournament, 1702, Urano, Tanaka: Computer for large-scale nonlinear network analysis based on relaxation method; IEEJ, CAS85-3
2 (1985), Ping-Sheng Tseng: A Systolic Array Paralle
lizing Compiler ; Kluwer Academic Publishers (1990)
Etc.

【0012】[0012]

【発明が解決しようとする課題】一般に、マルチプロセ
ッサシステムでは通信の極小化が求められており分割の
最適化問題は困難な課題である。
Generally, in a multiprocessor system, minimization of communication is required, and the optimization problem of division is a difficult problem.

【0013】上記従来技術では、並列性を確保するため
にガウスヤコビ法を用いることが多いが、緩和法の解析
アルゴリズムとしてSOR 法やガウスザイデル法などの高
性能なアルゴリズムを汎用に扱える方法の実現が求めら
れている。
In the above-mentioned conventional technique, the Gauss-Jacobi method is often used in order to secure parallelism. However, it is possible to realize a general-purpose method that can use a high-performance algorithm such as the SOR method or the Gauss-Seidel method as an analysis algorithm for the relaxation method. It has been demanded.

【0014】また性能向上のためにプロセッサ台数を増
やすに当たってはプロセッサ同士の接続形態が簡単なこ
とが要請されている。
When increasing the number of processors to improve performance, it is required that the connection form between the processors is simple.

【0015】[0015]

【課題を解決するための手段】上記従来技術の課題は各
々がm次元の線形連立方程式を緩和法により解析するN
個の計算要素と、各計算要素が互いに特定の1ないし複
数の計算要素と通信する通信手段を有し、緩和法の演算
における特定のステップにおいて、各計算要素はm個の
線形方程式のうちそれぞれ異なる特定の1つの線形方程
式を解析し、解析結果を前記通信手段によってそれぞれ
特定の1ないし複数の計算要素に転送し、各転送を受け
た計算要素は、次のステップにおいてそれぞれ転送を受
けた解析結果を用いてつぎなる解析を行う事を特徴とす
る緩和法による線形連立方程式解析計算機を提供するこ
とにより解決できる。
SUMMARY OF THE INVENTION The above-mentioned problems of the prior art are N for analyzing an m-dimensional linear simultaneous equation by a relaxation method.
Number of calculation elements, and each calculation element has a communication means for communicating with one or more specific calculation elements, and at each specific step in the calculation of the relaxation method, each calculation element has m linear equations. One different specific linear equation is analyzed, and the analysis result is transferred by the communication means to each specific one or a plurality of calculation elements, and each transferred calculation element is analyzed in the next step. This can be solved by providing a linear simultaneous equation analysis computer by the relaxation method, which is characterized by performing the next analysis using the result.

【0016】上記の他の課題は前記通信手段がそれぞれ
が特定の2つの計算要素をつなぐN個の1方向転送経路
よりなることを特徴とする緩和法による線形連立方程式
解析計算機を提供することにより解決できる。
Another object of the present invention is to provide a computer system for analyzing linear simultaneous equations by the relaxation method, characterized in that the communication means is composed of N unidirectional transfer paths each connecting two specific calculation elements. Solvable.

【0017】[0017]

【作用】本発明の線形連立方程式解析計算機はN個の計
算要素それぞれが連立方程式の情報をすべて保持する。
そのため各計算要素においてガウスザイデル法による高
速な緩和法演算が実行可能であり、さらに通信手段によ
り送られた他の計算要素の最新の計算値を使って計算す
るのでガウスザイデル法を超えた高速化が実現される。
In the linear simultaneous equation analysis computer of the present invention, each of N calculation elements holds all information of simultaneous equations.
Therefore, high-speed relaxation method operation by Gauss-Seidel method can be executed in each calculation element, and the latest calculated values of other calculation elements sent by communication means are used for calculation, which is faster than Gauss-Seidel method. Is realized.

【0018】また、本発明の線形連立方程式解析計算機
は従来のマルチプロセッサで必要となっていた方程式の
プロセッサへの分割が不要であり分割の最適化問題も生
じない。また複数の計算要素は各々が計算した結果を特
定の計算要素に対してのみ転送する。そのため通信経路
の簡単化と通信の極小化がはかれる。
Further, the linear simultaneous equation analysis computer of the present invention does not require division of the equations into processors, which is required in the conventional multiprocessor, and the division optimization problem does not occur. In addition, the plurality of calculation elements transfer the result calculated by each of them to only a specific calculation element. Therefore, the communication route can be simplified and the communication can be minimized.

【0019】[0019]

【実施例】図1に本発明の緩和法による線形連立方程式
解析計算機の一実施例の構成図を示す。計算要素(以下
PEと呼ぶ)PE1 ....PEN は単一方向のデータ線によって
リング状に接続し通信する。リングにはホストコンピュ
ータ(Host CPU)との通信を行うゲートウェイプロセッサ
GEがあり各PEのセットアップやコマンドの送出、収束判
定等を行う。
1 is a block diagram of an embodiment of a computer for analyzing linear simultaneous equations according to the relaxation method of the present invention. Computational element (below
Called PE) PE 1 .... PE N are connected and communicate in a ring by a unidirectional data line. A gateway processor that communicates with the host computer (Host CPU) in the ring
There is a GE to set up each PE, send commands, and determine convergence.

【0020】以下説明の簡単のため線形連立方程式を回
路解析における節点方程式とみなし、方程式の行列表現
の中の非零要素を回路における枝コンダクタンスとみな
す。このようにみなしても方程式の一般性は失われな
い。一般にマルチプロセッサによって回路解析を行なう
場合にはプロセッサ間の通信が問題となる。
For simplification of the following description, the linear simultaneous equations are regarded as the nodal equations in the circuit analysis, and the nonzero elements in the matrix expression of the equations are regarded as the branch conductances in the circuit. Even if viewed in this way, the generality of the equation is not lost. Generally, when circuit analysis is performed by a multiprocessor, communication between processors becomes a problem.

【0021】並列性を確保し、かつ通信に投入するハー
ドウェアは少ない方式が望ましい。本実施例の線形連立
方程式解析計算機は1方向のデータ転送線路でお互いに
接続するプロセッサエレメントPEをリング状に配置す
る。各PEはガウス・ザイデル法で1つの節点電圧の計算
を行なうたびに、このデータ転送線路を介して計算した
節点電圧を隣のPEに転送する。
It is desirable to use a system that secures parallelism and requires a small amount of hardware for communication. In the linear simultaneous equation analysis computer of this embodiment, processor elements PE connected to each other by one-way data transfer lines are arranged in a ring shape. Each PE transfers the calculated node voltage to the adjacent PE via this data transfer line every time it calculates one node voltage by the Gauss-Seidel method.

【0022】各PEは節点電圧の計算において自ら計算し
た電圧の他に隣から転送された電圧を使用することによ
り、通常のガウス・ザイデル法よりも更に新しい計算値
を使う計算ができる。
Each PE uses the voltage transferred from the neighbor in addition to the voltage calculated by itself in the calculation of the node voltage, so that it is possible to perform the calculation using a newer calculated value than the usual Gauss-Seidel method.

【0023】各PEが回路グラフのすべての節点電圧の計
算を終了した時点でガウス・ザイデル法の1回の反復が
終了する。このようにガウス・ザイデル法の各ステップ
においてPEの数だけの通信が発生するが、本実施例の線
形連立方程式解析計算機では通信先は特定の1PEに限定
しているので通信のためのハードウェアの負担は少な
い。
One iteration of the Gauss-Seidel method ends when each PE finishes calculating all node voltages of the circuit graph. As described above, communication is performed by the number of PEs at each step of the Gauss-Seidel method, but in the linear simultaneous equation analysis computer of the present embodiment, the communication destination is limited to a specific 1PE, so the communication hardware is used. Is less burdensome.

【0024】また、隣合うPE間の転送しか行なわないの
で、積和演算を含むガウス・ザイデル法の1ステップに
比べて通信の時間は充分短いと考えられる。本実施例の
線形連立方程式解析計算機では、すべてのPEは回路のす
べての節点を計算する。このため、各PEはすべての節点
電圧と枝のコンダクタンスを記憶するローカルメモリを
持つ。
Further, since only the transfer between adjacent PEs is performed, it is considered that the communication time is sufficiently short as compared with one step of the Gauss-Seidel method including the product-sum operation. In the linear simultaneous equation analysis computer of this embodiment, all PEs calculate all nodes of the circuit. Therefore, each PE has a local memory that stores all node voltages and branch conductances.

【0025】本実施例のPEにおけるガウスザイデル法の
実行について説明する。回路の節点方程式をAx=bとし
て、行列Aの要素をaij,ベクトルx,bの要素をxi ,
biのように表す。
The execution of the Gauss-Seidel method in the PE of this embodiment will be described. The node equation of the circuit is Ax = b, the elements of the matrix A are aij, the elements of the vectors x and b are xi,
It is expressed as bi.

【0026】ある節点iの電圧xi の計算にはその節点
と枝によって接続している節点jの電圧xj と枝のコン
ダクタンスaij が必要である。1つの節点の電圧の計算
をガウスザイデル法の1ステップと定義する。節点j,
l,mに接続する節点iの電圧xi の計算のステップは
次の様になる。
Calculation of the voltage xi of a certain node i requires the voltage xj of the node j connected to the node by a branch and the conductance aij of the branch. The calculation of the voltage at one node is defined as one step of the Gauss-Seidel method. Node j,
The steps of calculating the voltage xi of the node i connected to l and m are as follows.

【0027】 xi(k+1)={bi-xj(k)*aij-xl(k+1)*ail-xm(k+1)*aim}/aii ここでj>i,l,m<iと仮定した。PEはあらかじめ
設定した節点から順次ガウス・ザイデル法の各ステップ
を実行する。その際、i−1番目のPEが計算した節点電
圧を用いる。また、自らが計算した節点電圧はi+1番
目のPEに送る。即ち前記計算ステップにおいて用いた他
の節点の電圧は自PEが計算する電圧に限らない。
Xi (k + 1) = {bi-xj (k) * aij-xl (k + 1) * ail-xm (k + 1) * aim} / aii where j> i, l, m < Assume i. The PE sequentially executes each step of the Gauss-Seidel method from preset nodes. At that time, the node voltage calculated by the i−1th PE is used. The node voltage calculated by itself is sent to the i + 1th PE. That is, the voltages at the other nodes used in the calculation step are not limited to the voltages calculated by the self PE.

【0028】それぞれのPEは、同時に同じ節点の計算を
することのないように、またそれぞれのPEは回路グラフ
の枝によって接続する節点同士の計算を同時に実行する
ことのないように計算順序をあらかじめ設定しておく。
この計算順序は解析すべき方程式によって異なるので新
しく解析を行う毎に計算順序の決定を行う。
The respective PEs are preliminarily set in the calculation order so that the same node is not calculated at the same time, and the PEs are not simultaneously executed to calculate the nodes connected by the branches of the circuit graph. Set it.
Since this calculation order differs depending on the equation to be analyzed, the calculation order is determined each time a new analysis is performed.

【0029】本発明の計算機ではこの計算順序の決定が
性能確保の為に重要であるが、必ずしも厳密に順序を保
証しなくても有る程度の性能向上が確保される。計算順
序の決定はたとえば次のアルゴリズムで行う。
In the computer of the present invention, the determination of the calculation order is important for ensuring the performance, but a certain degree of performance improvement is ensured even if the order is not strictly guaranteed. The calculation order is determined by the following algorithm, for example.

【0030】1つの節点を選び、1番とする。番号が付
いていない節点が有る場合次を繰り返す。回路グラフの
枝によって番号が付いた節点に接続する番号が付いてい
ない節点に新しい番号を付ける。
Select one node and call it number 1. If there are unnumbered nodes, repeat the following. Connect unnumbered nodes with new numbers by connecting branches in the circuit graph.

【0031】次に、図2に示す回路の直流解析を例に本
発明の計算機の動作を説明する。図2に示す回路の各枝
はそれぞれ1オームの抵抗である。この回路の節点方程
式を行列で表すと次の式となる。 説明の簡単の為に各PEの計算順序は節点番号に沿って設
定する。
Next, the operation of the computer of the present invention will be described by taking the DC analysis of the circuit shown in FIG. 2 as an example. Each branch of the circuit shown in FIG. 2 is a 1 ohm resistor. The nodal equation of this circuit can be expressed as a matrix as follows. For ease of explanation, the calculation order of each PE is set according to the node number.

【0032】この回路を3台のPE PE1,PE2,PE3で解析を
行う。各PEの計算順序は、 PE1:5,6,1,2,3,4 PE2:3,4,5,6,1,2 PE3:1,2,3,4,5,6 とする。各PEはこの計算順序に従って互いに通信を行い
ながらガウスザイデル法により解析を行う。通信手段に
よって転送を受けた節点の電圧は次のステップにおいて
使用できる。各PEはステップ毎に同期して動作する。
This circuit is analyzed with three PE PE 1 , PE 2 and PE 3 . The calculation order of each PE is PE 1 : 5,6,1,2,3,4 PE 2 : 3,4,5,6,1,2 PE 3 : 1,2,3,4,5,6 To do. Each PE performs analysis by the Gauss-Seidel method while communicating with each other according to this calculation order. The node voltage transferred by the communication means can be used in the next step. Each PE operates synchronously at each step.

【0033】PE1 の計算ステップは次の通りである。 1. x5=x3+x4+x6' 2. x6=x1'+x2+x5 3. x1=1+x2'+x6 4. x2=x1+x3'+x6 5. x3=x2+x4'+x5 6. x4=x3+x5' PE2 の計算ステップは次の通りである。The calculation steps of PE 1 are as follows. 1.x5 = x3 + x4 + x6 '2.x6 = x1' + x2 + x5 3.x1 = 1 + x2 '+ x6 4.x2 = x1 + x3' + x6 5.x3 = x2 + x4 '+ x5 6. The calculation steps of x4 = x3 + x5 'PE 2 are as follows.

【0034】1. x3=x2+x4'+x5 2. x4=x3+x5' 3. x5=x3+x4+x6' 4. x6=x1'+x2+x5 5. x1=1+x2'+x6 6. x2=x1+x3'+x6 PE3 の計算ステップは次の通りである。1.x3 = x2 + x4 '+ x5 2.x4 = x3 + x5' 3.x5 = x3 + x4 + x6 '4.x6 = x1' + x2 + x5 5.x1 = 1 + x2 '+ The calculation steps of x6 6. x2 = x1 + x3 '+ x6 PE 3 are as follows.

【0035】1. x1=1+x2'+x6 2. x2=x1+x3'+x6 3. x3=x2+x4'+x5 4. x4=x3+x5' 5. x5=x3+x4+x6' 6. x6=x1'+x2+x5 各PEは6ステップの実行でガウスザイデルの1回の反復
を行う。
1.x1 = 1 + x2 '+ x6 2.x2 = x1 + x3' + x6 3.x3 = x2 + x4 '+ x5 4.x4 = x3 + x5' 5. x5 = x3 + x4 + x6 '6 x6 = x1' + x2 + x5 Each PE performs one Gauss-Seidel iteration in 6 steps.

【0036】上記ステップにおいてxi'として示した値
は通信手段によって転送された他PEの計算値を用いるこ
とを示す。xi とした値は自PEの計算値であり、ローカ
ルメモリに記憶した値を用いる。計算結果はローカルメ
モリに格納すると共に隣のPEに転送する。
The value shown as xi 'in the above step indicates that the calculated value of another PE transferred by the communication means is used. The value set as xi is the calculated value of the PE itself, and the value stored in the local memory is used. The calculation result is stored in the local memory and transferred to the adjacent PE.

【0037】この計算ステップを計算値が収束するまで
繰り返し実行する。この例題では、PEを1台としたとき
の収束までの反復回数は425 回であったがPE2台では21
5 回、PE3台では218 回といずれも1台の場合より収束
までの反復回数が低減され高速化が果たされた。PE3台
の場合には前記計算順序の条件を満たしていないので2
台の場合に比べて性能向上はなされていない。
This calculation step is repeatedly executed until the calculated values converge. In this example, the number of iterations until convergence was 425 when one PE was used, but it was 21 for two PEs.
The number of iterations until convergence was reduced to 5 times and 218 times with 3 PE units, which was faster than the case with only 1 unit. In the case of 3 PEs, the condition of the above calculation order is not satisfied, so 2
Performance has not been improved compared to the case of a stand.

【0038】Laplace 方程式として知られる偏微分方程
式を30x30に離散化してに本実施例の計算機を適用
した場合には、PEが1台の時の性能を1とするとPEが
4,8,15台の時にそれぞれ3.98倍,7.99倍,14.98 倍
の性能がそれぞれ得られた。
When the computer of this embodiment is applied after discretizing the partial differential equation known as the Laplace equation into 30 × 30, if the performance when one PE is 1, the number of PEs is 4, 8 and 15 At that time, performances of 3.98 times, 7.99 times, and 14.98 times were obtained respectively.

【0039】また、非線形回路シミュレーションに本実
施例の計算機を適用した場合には図3で示すようなPE台
数−性能曲線が得られた。非線形問題はニュートン法と
して知られる方法で線形連立方程式の解を求める問題に
帰結できるので本発明の計算機の対象となる。図3には
スタティックメモリ(SRAM)と加算器(ADDR)を解析した場
合を示す。
When the computer of this embodiment was applied to the non-linear circuit simulation, the PE number-performance curve as shown in FIG. 3 was obtained. The non-linear problem is a target of the computer of the present invention because it can result in a problem for solving a solution of linear simultaneous equations by a method known as Newton's method. FIG. 3 shows a case where the static memory (SRAM) and the adder (ADDR) are analyzed.

【0040】本実施例では通信は1計算ステップあたり
1回行っているが、通信手段の性能に余裕がある場合に
はもっと多くの計算値を転送することが可能である。例
えば、PEはまず自ら計算した値を転送し次に隣から転送
されて来た値をそのまま転送することにより複数の計算
値を次の計算ステップに反映させることが可能である。
In the present embodiment, the communication is performed once for each calculation step, but if the communication means has a margin, it is possible to transfer more calculated values. For example, the PE can reflect a plurality of calculated values in the next calculation step by first transferring the value calculated by itself and then directly transferring the value transferred from the neighbor.

【0041】また本発明の計算機は通信量が限定されて
いることから専用の通信路ではなく共用メモリを用いて
PE間の通信を行うことも容易にできる。また本実施例の
計算要素PEはガウスザイデル法を実行しているがガウス
ザイデル法は本発明の本質を成してはおらずSOR 法やそ
の他の緩和法を用いても本発明の範囲を越えない。
Since the computer of the present invention has a limited communication volume, a shared memory is used instead of a dedicated communication path.
It is also easy to communicate between PEs. Further, the calculation element PE of this embodiment executes the Gauss-Seidel method, but the Gauss-Seidel method does not form the essence of the present invention, and the use of the SOR method or other relaxation methods does not exceed the scope of the present invention. .

【0042】[0042]

【発明の効果】本発明によれば、複数の計算要素からな
る計算機において各々の計算要素が互いに少ない通信量
で高性能の線形方程式解析計算機が実現できる。特にLa
place方程式の解を求める問題では15台のPEで約15倍の
性能向上を実現した。更に非線形問題に対しても本発明
の計算機を用いて高速化が図れることを示した。
According to the present invention, in a computer having a plurality of calculation elements, a high-performance linear equation analysis computer can be realized with a small amount of communication between the respective calculation elements. Especially La
In the problem of finding the solution of the place equation, we achieved about 15 times the performance improvement with 15 PEs. Furthermore, it has been shown that the computer of the present invention can be used to increase the speed even for nonlinear problems.

【図面の簡単な説明】[Brief description of drawings]

【図1】本発明の一実施例の計算機の構成図である。FIG. 1 is a configuration diagram of a computer according to an embodiment of the present invention.

【図2】例題の回路図である。FIG. 2 is a circuit diagram of an example.

【図3】本発明の一実施例の例題の性能曲線をそれぞれ
示す図である。
FIG. 3 is a diagram showing performance curves of an example of an example of the present invention.

【符号の説明】[Explanation of symbols]

PE1 ,PE2 , ...,PEN 計算要素 GE ゲートウェイプロセッサPE 1 , PE 2, ..., PE N Computation element GE gateway processor

Claims (4)

【特許請求の範囲】[Claims] 【請求項1】 各々がm次元の連立方程式を緩和法によ
り解析するN個の計算要素と、前記N個の計算要素が互
いに特定の1ないし複数の計算要素と通信する通信手段
を有し、前記m次元の連立方程式に対応するm個の方程
式を解析する緩和法のm個の演算ステップを前記N個の
計算要素がそれぞれ順次実行し、前記m個のステップの
内特定のステップにおいて各計算要素はm個の方程式の
うちそれぞれ異なる特定の1つの方程式を解析し、各計
算要素が解析結果を前記通信手段によってそれぞれ特定
の1ないし複数の計算要素に転送し、各転送を受けた計
算要素はそれぞれ転送を受けた解析結果を用いて後続の
ステップの実行を行う事を特徴とする緩和法による連立
方程式解析計算機。
1. N-computation elements each of which analyzes an m-dimensional simultaneous equation by a relaxation method, and communication means for communicating the N computation elements with specific one or more computation elements. The N calculation elements sequentially execute m operation steps of the relaxation method for analyzing m equations corresponding to the m-dimensional simultaneous equations, and each calculation is performed in a specific step of the m steps. The element analyzes one different specific equation out of m equations, each calculation element transfers the analysis result to each one or more specific calculation elements by the communication means, and each calculation element receives each transfer. Is a simultaneous equation analysis computer by the relaxation method, characterized in that the subsequent steps are executed using the analysis results received.
【請求項2】 前記通信手段が少なくとも特定の2つの
計算要素に接続する1方向転送経路よりなることを特徴
とする請求項1記載の緩和法による連立方程式解析計算
機。
2. The simultaneous equation analysis computer according to claim 1, wherein the communication means comprises a one-way transfer path connected to at least two specific calculation elements.
【請求項3】 前記通信手段が少なくとも特定の2つの
計算要素に接続するメモリからなることを特徴とする請
求項1記載の緩和法による連立方程式解析計算機。
3. The relaxation equation simultaneous equation analysis computer according to claim 1, wherein said communication means comprises a memory connected to at least two specific calculation elements.
【請求項4】 前記連立方程式が線形連立方程式である
ことを特徴とする請求項1記載の緩和法による連立方程
式解析計算機。
4. The simultaneous equation analysis computer according to claim 1, wherein the simultaneous equations are linear simultaneous equations.
JP3248099A 1991-09-26 1991-09-26 Relaxation method simultaneous equation analysis computer Pending JPH06119368A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP3248099A JPH06119368A (en) 1991-09-26 1991-09-26 Relaxation method simultaneous equation analysis computer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP3248099A JPH06119368A (en) 1991-09-26 1991-09-26 Relaxation method simultaneous equation analysis computer

Publications (1)

Publication Number Publication Date
JPH06119368A true JPH06119368A (en) 1994-04-28

Family

ID=17173206

Family Applications (1)

Application Number Title Priority Date Filing Date
JP3248099A Pending JPH06119368A (en) 1991-09-26 1991-09-26 Relaxation method simultaneous equation analysis computer

Country Status (1)

Country Link
JP (1) JPH06119368A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996013779A1 (en) * 1994-10-31 1996-05-09 Nkk Corporation Multiprocessor system
JP2018206078A (en) * 2017-06-05 2018-12-27 富士通株式会社 Parallel processing apparatus, parallel operation method, and parallel operation program

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996013779A1 (en) * 1994-10-31 1996-05-09 Nkk Corporation Multiprocessor system
US6131153A (en) * 1994-10-31 2000-10-10 Nkk Corporation Multiprocessor system having a plurality of gateway units and wherein each gateway unit controls memory access requests and interferences from one hierchical level to another
JP2018206078A (en) * 2017-06-05 2018-12-27 富士通株式会社 Parallel processing apparatus, parallel operation method, and parallel operation program

Similar Documents

Publication Publication Date Title
JP2956800B2 (en) Computer system for simultaneous linear equations
US20190370664A1 (en) Operation method
CN110275733B (en) GPU Parallel Acceleration Method for Solving Phonon-Boltzmann Equation Based on Finite Volume Method
Gmeiner et al. Parallel multigrid on hierarchical hybrid grids: a performance study on current high performance computing clusters
CN112199636A (en) Fast convolution method and device suitable for microprocessor
JP7401513B2 (en) Sparse matrix multiplication in hardware
US20170140072A1 (en) Method and system for determining a configuration of a model having a collection of entities and satisfying a set of constraints
Zhang et al. Tucker tensor decomposition on FPGA
Qin et al. π-ba: Bundle adjustment acceleration on embedded fpgas with co-observation optimization
CN114004186A (en) Spectrogram sparsification-based on-chip ultra-large-scale power supply network parallel simulation method
Khimich et al. Hybrid algorithms for solving the algebraic eigenvalue problem with sparse matrices
Hung et al. Digital hardware realization of a recurrent neural network for solving the assignment problem
Wu et al. Skeletongcn: a simple yet effective accelerator for gcn training
US20190130274A1 (en) Apparatus and methods for backward propagation in neural networks supporting discrete data
CN114897188B (en) Large-scale data processing method
Niu et al. SPEC2: Spectral sparse CNN accelerator on FPGAs
Peltekis et al. FusedGCN: A systolic three-matrix multiplication architecture for graph convolutional networks
Xu et al. Towards a scalable hierarchical high-order CFD solver
US20190080241A1 (en) Apparatus and methods for backward propagation in neural networks supporting discrete data
Liu et al. Data-transfer-bottleneck-less architecture for FPGA-based quantum annealing simulation
JPH06119368A (en) Relaxation method simultaneous equation analysis computer
Lo et al. Iterative solution of general sparse linear systems on clusters of workstations
Polat et al. GPU‐accelerated and mixed norm regularized online extreme learning machine
CN116932988B (en) Method and device for solving combined optimization problem, storage medium and electronic equipment
CN109101708B (en) Implicit finite element parallel method based on two-stage region decomposition

Legal Events

Date Code Title Description
A711 Notification of change in applicant

Effective date: 20040916

Free format text: JAPANESE INTERMEDIATE CODE: A711

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20040917

A621 Written request for application examination

Effective date: 20060208

Free format text: JAPANESE INTERMEDIATE CODE: A621

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20070417

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20070424

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20070717