TWI854675B

TWI854675B - Method for improving parallelization of an ldpc shuffle decoder by reordering an ldpc code

Info

Publication number: TWI854675B
Application number: TW112120129A
Authority: TW
Inventors: 魏大鈞; 曾戈忠
Original assignee: 睿寬智能科技有限公司
Priority date: 2023-05-30
Filing date: 2023-05-30
Publication date: 2024-09-01
Also published as: TW202448130A

Abstract

LDPC重組解碼器(LDPC shuffle decoder)有兩個核心及兩個記憶體，依序處理一個LDPC矩陣的所有欄位。處理每一欄位時，把它的非零元素的讀(reading)與寫(writing)均分給兩個記憶體。依此分派方式，LDPC重組解碼器可在最少的循環完成此LDPC矩陣的讀及寫。 The LDPC shuffle decoder has two cores and two memories, processing all the fields of an LDPC matrix in sequence. When processing each field, the reading and writing of its non-zero elements are evenly distributed to the two memories. With this distribution method, the LDPC shuffle decoder can complete the reading and writing of the LDPC matrix in the least number of cycles.

Description

Method for improving LDPC recombinant decoder parallelism by rearranging LDPC codes

本發明關於對LDPC重組解碼器，尤其關於重排LDPC碼而改善LDPC重組解碼器並行的方法。 The present invention relates to an LDPC reorganization decoder, and more particularly to a method for improving the parallel operation of an LDPC reorganization decoder by re-arranging LDPC codes.

在LDPC重組解碼(LDPC shuffle decoding)中，解碼器以欄位為單位處理上述矩陣。 In LDPC shuffle decoding, the decoder processes the above matrix in columns.

有鑑於上述習知技藝之問題，本發明之目的是提供一種重排LDPC碼欄位而縮短LDPC重組解碼器延遲的方法。 In view of the above problems of the prior art, the purpose of the present invention is to provide a method for rearranging LDPC code fields to shorten the delay of LDPC reorganization decoder.

為達成上述目的，上述方法包括：為上述LDPC重組解碼器提供兩個記憶體，平均分配上述LDPC碼的矩陣的非零元素給上述兩個記憶體，依上述分配方式，從上述兩個記憶體讀上述非零元素，並寫上述非零元素到上述兩個記憶體，更新分配方式，判斷是否達最大遞迴數。若未達最大遞迴數，則回上述平均分配上述非零元素給上述兩個記憶體的步驟。若達最大遞迴數，則判斷分配是否成功。若分配成功，則產生排程檔案，否則結束。 To achieve the above purpose, the above method includes: providing two memories for the above LDPC recombinant decoder, evenly allocating the non-zero elements of the matrix of the above LDPC code to the above two memories, reading the above non-zero elements from the above two memories according to the above allocation method, and writing the above non-zero elements to the above two memories, updating the allocation method, and judging whether the maximum recursion number is reached. If the maximum recursion number is not reached, returning to the step of evenly allocating the above non-zero elements to the above two memories. If the maximum recursion number is reached, judging whether the allocation is successful. If the allocation is successful, generating a scheduling file, otherwise ending.

S20:分配的第一階段 S20: The first stage of allocation

S21:累計某欄位中分別從兩個記憶體來的非零元素數量 S21: Accumulate the number of non-zero elements in a column from two memories respectively

S23:暫存當今整體分配狀態當最佳分配狀態 S23: Temporarily save the current overall allocation status as the optimal allocation status

S25:從兩個記憶體讀取的數量平衡？ S25: Is the amount read from the two memories balanced?

S27:回溯次數達上限？ S27: Has the number of backtrackings reached the upper limit?

S29:執行回溯 S29: Execution backtracking

S31:回復先前暫存之最佳狀況 S31: Restore the previously saved best state

S33:依平衡寫入條件隨機分配給記憶體 S33: Randomly allocate to memory based on balanced write conditions

S35:已達矩陣的最後欄位？ S35: Reached the last field of the matrix?

S37:處理開頭欄位中未確認的記憶體讀取資訊 S37: Handle unconfirmed memory read information in the beginning field

S39:每一欄位的讀取數量皆平衡或已達最大遞迴數？ S39: Is the read quantity of each field balanced or has the maximum number of repetitions been reached?

S40:分配的第二階段 S40: Second stage of allocation

S41:依記憶體分配方式，從對應的記憶體依序讀取非零元素 S41: Read non-zero elements from the corresponding memory in sequence according to the memory allocation method

S43:讀完此欄位所有的非零元素？ S43: Read all non-zero elements in this column?

S45:目的記憶體滿？ S45: Is the destination memory full?

S47:依記憶體分配方式，把一個非零元素寫入對應的記憶體 S47: According to the memory allocation method, write a non-zero element into the corresponding memory

S49:寫完此欄位的非零元素？ S49: Finish writing the non-zero elements of this column?

S51:達最後欄位？ S51: Reached the last column?

S53:執行環繞 S53: Execute Surround

S55:在此次環繞時有覆蓋發生？ S55: Was there any coverage during this orbit?

S57:達最大環繞次數？ S57: Reached the maximum number of loops?

S59:記錄失敗一次 S59: Record failure once

S60:更新最佳結果 S60: Update best results

S70:判斷是否達最大遞迴數 S70: Determine whether the maximum number of repetitions has been reached

S80:判斷分配是否成功 S80: Determine whether the allocation is successful

S90:產生重組解碼器的指令檔案 S90: Generate command file for reorganizing the decoder

〔圖1〕現一個LDPC碼；〔Figure 1〕A LDPC code is shown;

〔圖2〕呈現一個典型的LDPC並行解碼器； [Figure 2] shows a typical LDPC parallel decoder;

〔圖3〕是本發明的重排LDPC碼而改善LDPC重組解碼器並行的方法的流程圖； [Figure 3] is a flow chart of the method of rearranging LDPC codes to improve the parallel operation of LDPC reorganization decoders of the present invention;

〔圖4〕是圖3所示之重排LDPC碼而改善LDPC重組解碼器並行的方法的一道子程序的流程圖； [Figure 4] is a flow chart of a subroutine of the method for improving the parallel operation of LDPC reorganization decoder by rearranging LDPC codes shown in Figure 3;

〔圖5〕呈現一個被簡化的LDPC碼與其平衡讀寫之分配； [Figure 5] shows a simplified LDPC code and its balanced read and write allocation;

〔圖6〕呈現另一個LDPC碼的一種失衡分配方式； [Figure 6] shows another unbalanced allocation of LDPC codes;

〔圖7〕呈現圖6之LDPC碼回溯後的一種平衡分配； [Figure 7] shows a balanced distribution after the LDPC code in Figure 6 is traced back;

〔圖8〕呈現圖6之LDPC碼回溯後的另一種失衡分配； [Figure 8] shows another unbalanced distribution after LDPC code backtracking in Figure 6;

〔圖9〕呈現圖6之LDPC碼可能產生的巢狀回溯分配； [Figure 9] shows the nested traceback allocation that may be generated by the LDPC code in Figure 6;

〔圖10〕是圖3所示之重排LDPC碼而改善LDPC重組解碼器並行的方法的另一道子程序的流程圖； [Figure 10] is a flow chart of another subroutine of the method for improving the parallel operation of LDPC reorganization decoder by rearranging LDPC codes shown in Figure 3;

〔圖11〕呈現以一種記憶體的容量處理一個LDPC碼； [Figure 11] shows the processing of an LDPC code with a certain memory capacity;

〔圖12〕呈現處理圖11所示的LDPC碼的第一次環繞； [Figure 12] shows the first pass of processing the LDPC code shown in Figure 11;

〔圖13〕呈現處理圖11所示的LDPC碼的第二次環繞；及 [Figure 13] shows the second round of processing the LDPC code shown in Figure 11; and

〔圖14〕呈現以另一種記憶體容量處理圖11的LDPC碼的第一次環繞。 [Figure 14] shows the first pass of the LDPC code in Figure 11 processed with another memory capacity.

以下參考相關圖式進一步說明本發明的較佳實施例。為便於理解本發明，以下用相同符號標示相同元件。 The following reference figures further illustrate the preferred embodiments of the present invention. To facilitate understanding of the present invention, the same symbols are used to indicate the same components.

參考圖1，LDPC(low-density parity check)碼以矩陣的形式存在。此矩陣有M列(row)*N欄位(column)的元素，每一個元素是一個較小的循環矩陣(circulant)。若循環矩陣僅包括數值0，則此元素是「零元素」(圖中，用X表示)。若循環矩陣包含數值1，則此元素是「非零元素」(圖中，用數字表示循環位移值)。實務上，運算LDPC碼時僅須用非零元素。受限於解碼器硬體限制，LDPC重組解碼以「欄」為單位向後進行運算。 Referring to Figure 1, LDPC (low-density parity check) codes exist in the form of a matrix. This matrix has M rows*N columns, and each element is a smaller circulant. If the circulant contains only the value 0, then this element is a "zero element" (in the figure, it is represented by X). If the circulant contains the value 1, then this element is a "non-zero element" (in the figure, the circulant shift value is represented by a number). In practice, only non-zero elements are required when calculating LDPC codes. Due to the hardware limitations of the decoder, LDPC reorganization decoding operates backwards in "column" units.

參考圖2，為增加LDPC解碼效率，可拓展LDPC解碼器，此時解碼器擁有兩個記憶體(A及B)及兩個核心(1及2)。核心1及2都是處理器。理想上，核心1從記憶體A(或B)讀資料時，核心2從記憶體B(或A)讀資料。接著，LDPC解碼器處理這些資料。然後，核心1把被處理的資料寫回記憶體A(或B)時，核心2把被處理的資料寫回記憶體B(或A)。核心1及核心2同時運作被稱為「並行」(parallelization)。然而，同一列之相異欄位的非零元素有相依性(dependency)，就是處理每一個非零元素時所需的資料皆是讀取同列先前欄位所載內容(因記憶體位置須相同)。此外，LDPC矩陣中非零元素的分布是稀疏且不規則，因此運算每一欄時，兩個核心可能須同時讀(或寫)單一記憶體A(或B)。此時，只有一個核心工作，須閒置另一核心，導致整體系統延宕。 Referring to Figure 2, to increase the efficiency of LDPC decoding, the LDPC decoder can be expanded. In this case, the decoder has two memories (A and B) and two cores (1 and 2). Cores 1 and 2 are both processors. Ideally, when core 1 reads data from memory A (or B), core 2 reads data from memory B (or A). Then, the LDPC decoder processes the data. Then, when core 1 writes the processed data back to memory A (or B), core 2 writes the processed data back to memory B (or A). The simultaneous operation of core 1 and core 2 is called "parallelization." However, non-zero elements in different columns of the same row have dependencies, that is, the data required to process each non-zero element is to read the content of the previous column in the same row (because the memory location must be the same). In addition, the distribution of non-zero elements in the LDPC matrix is sparse and irregular, so when calculating each column, two cores may need to read (or write) a single memory A (or B) at the same time. At this time, only one core is working, and the other core must be idle, causing the entire system to delay.

圖3呈現本發明的較佳實施例的重排LDPC碼而改善LDPC重組解碼器的方法。本發明的方法重排上述LDPC碼而改善上述LDPC重組解碼器並行。換言之，本發明的方法避免一個核心(1或2)運作時，另一個核心(2或1)閒置。具體而言，本發明的方法把每一欄位的非零元素的資料讀取與寫入排序(sort)或分配，致平行且平衡處理這些非零元素而最佳化上述LDPC重組解碼器的效能。 FIG3 shows a method of improving the LDPC reorganization decoder by rearranging the LDPC code according to a preferred embodiment of the present invention. The method of the present invention rearranges the above LDPC code to improve the parallelism of the above LDPC reorganization decoder. In other words, the method of the present invention avoids one core (1 or 2) being idle while the other core (2 or 1) is operating. Specifically, the method of the present invention sorts or distributes the data reading and writing of the non-zero elements of each field, so as to parallelize and balance the processing of these non-zero elements and optimize the performance of the above LDPC reorganization decoder.

在步驟S20，進行分配的第一階段。檢視整個矩陣，標示且安排每一欄位的一些非零元素到記憶體A，標示且安排每一欄位的另一些非零元素到記憶體B，並確保兩個記憶體有最適當的平衡狀態。重複此步驟，兩個記憶體平衡或達最大遞迴數(兩個記憶體失衡)時才停。 In step S20, the first phase of allocation is performed. Check the entire matrix, mark and arrange some non-zero elements of each column to memory A, mark and arrange other non-zero elements of each column to memory B, and ensure that the two memories have the most appropriate balance. Repeat this step until the two memories are balanced or the maximum number of recursions is reached (the two memories are unbalanced).

在步驟S40，進行分配的第二階段。用此舉發現記憶體A或B是否因上述分配而導致記憶體溢出(overflow)的錯誤，並傳遞錯誤標記到步驟S60。 In step S40, the second phase of allocation is performed. This is used to detect whether memory A or B has caused a memory overflow error due to the above allocation, and the error flag is passed to step S60.

在步驟S60，依限制及硬體要求，更新最佳結果。只留一個結果當最後結果。為此，把本次遞迴的結果與先前貯存的結果相比。若把本次遞迴的結果優於先前貯存的結果，則用本次遞迴的結果取代先前貯存的結果，否則留先前貯存的結果。 In step S60, the best result is updated according to the restrictions and hardware requirements. Only one result is retained as the final result. To this end, the result of this recursion is compared with the previously stored result. If the result of this recursion is better than the previously stored result, the previously stored result is replaced by the result of this recursion, otherwise the previously stored result is retained.

在步驟S70，判斷是否達最大遞迴數。若是，則到步驟S80，否則回步驟S20。因此，重複步驟S20、S40及S60，達最大遞迴數才停。 In step S70, determine whether the maximum number of loops has been reached. If so, go to step S80, otherwise return to step S20. Therefore, repeat steps S20, S40 and S60 until the maximum number of loops has been reached.

在步驟S80，判斷分配是否成功。若是，則到步驟S90，否則結束。 In step S80, determine whether the allocation is successful. If so, go to step S90, otherwise end.

在步驟S90，產生重組解碼器的指令檔案當最後輸出資料。此指令檔案被稱為「排程檔案」(scheduling files)。 In step S90, a command file for reorganizing the decoder is generated when the data is finally output. This command file is called a "scheduling file".

參考圖4，詳細描述步驟S20。步驟S20的目標是以平衡的方式把非零元素投入記憶體A及記憶體B。步驟S20包括步驟S21、S23、S25、S27、S29、S31、S33、S35、S37、S39。步驟S21、S23、S25、S27、S29及S31屬於讀的階段，步驟S33屬於寫的階段，步驟S35、S37、S39屬於後續處理的階段。 Referring to FIG. 4 , step S20 is described in detail. The goal of step S20 is to put non-zero elements into memory A and memory B in a balanced manner. Step S20 includes steps S21, S23, S25, S27, S29, S31, S33, S35, S37, and S39. Steps S21, S23, S25, S27, S29, and S31 belong to the reading stage, step S33 belongs to the writing stage, and steps S35, S37, and S39 belong to the subsequent processing stage.

在步驟S21，作為讀取的第一步驟，累計某欄位中分別從記憶體A及B來的非零元素的數量。為此，須檢視前一或數個欄位在同一列的非零元素是寫入記憶體A或記憶體B。應注意，此時矩陣最前面的幾個欄位可能無先前記憶體資訊，故暫不累計此欄位的元素數量，但假定其來自兩個記憶體的數量平衡，使在步驟S25滿足條件並進入步驟S33而繼續寫入記憶體的步驟。這些未確認的讀取記憶體資訊，稍後會在步驟S37被處理。 In step S21, as the first step of reading, the number of non-zero elements in a certain column from memory A and B is accumulated. To do this, it is necessary to check whether the non-zero elements in the previous one or several columns in the same column are written into memory A or memory B. It should be noted that the first few columns of the matrix may not have previous memory information at this time, so the number of elements in this column is not accumulated temporarily, but it is assumed that the number from the two memories is balanced, so that the condition is met in step S25 and enters step S33 to continue writing into the memory. These unconfirmed read memory information will be processed later in step S37.

在步驟S23，若當今欄位是新欄位(表示分配已進行到新欄位)，則暫存當今整體分配狀態當最佳分配狀態，並記錄現今最大欄位序號。此最大欄位序號紀錄可用來判斷現今欄位是否為新欄位，若現今欄位序號大於先前最大欄位序號，則此欄位是新欄位。 In step S23, if the current field is a new field (indicating that the allocation has proceeded to the new field), the current overall allocation status is temporarily saved as the optimal allocation status, and the current maximum field sequence number is recorded. This maximum field sequence number record can be used to determine whether the current field is a new field. If the current field sequence number is greater than the previous maximum field sequence number, then this field is a new field.

在步驟S25，取用步驟S21的結果，判斷分別從兩個記憶體A及B讀取的數量是否平衡。換言之，判斷此欄位之非零元素們從記憶體A讀取的量是否等於從記憶體B讀取的量(若非零元素總數為奇數，則滿足平衡之條件為相差1)。若是，則到步驟S33，否則到步驟S27。 In step S25, the result of step S21 is used to determine whether the amount read from the two memories A and B is balanced. In other words, determine whether the amount of non-zero elements in this field read from memory A is equal to the amount read from memory B (if the total number of non-zero elements is an odd number, the condition for balance is a difference of 1). If so, go to step S33, otherwise go to step S27.

在步驟S27，表示從兩記憶體讀取的數量不平衡，須執行回溯(rolling back)予以修正。首先判斷回溯次數是否達上限。若是，則到步驟S31，否則到步驟S29執行回溯。回溯次數為統計在固定欄位範圍內(未往後執行到任何新欄位的狀況下)執行回溯的次數，即是在此範圍裡，每回溯1次，就把回溯次數加1。同時定義回溯次數上限，以避免在找不到任何平衡的分配狀態時，陷於無限回溯之迴路。此次回溯可能在先前的回溯中，即在執行回溯時，可能觸發另一次的回溯，但只要發生在此固定範圍內，每次回溯都應觸發次數加1。 In step S27, it indicates that the amount read from the two memories is unbalanced, and rolling back is required to correct it. First, determine whether the number of rollbacks has reached the upper limit. If so, go to step S31, otherwise go to step S29 to perform rollback. The number of rollbacks is the number of times rollbacks are performed within the fixed field range (without executing to any new field). That is, within this range, each time rollback is performed, the rollback number is increased by 1. At the same time, the upper limit of the rollback number is defined to avoid falling into an infinite rollback loop when no balanced allocation state is found. This backtracking may be in the previous backtracking, that is, when executing the backtracking, it may trigger another backtracking, but as long as it occurs within this fixed range, the triggering count should be increased by 1 for each backtracking.

在步驟S29，實際執行回溯。隨機選一個造成失衡狀況的非零元素，回到同一列中前一個非零元素，並從步驟S21重新開始處理它所在的那一個欄位。重新處理此欄位時，因在步驟S33是依平衡寫入條件隨機分配寫到記憶體A或記憶體B，將有機會把後續欄位(包括上述造成失衡的那一個欄位)的讀取數量重新分配而達到平行狀態。 In step S29, backtracking is actually performed. A non-zero element that causes an imbalance is randomly selected, and the previous non-zero element in the same column is returned, and the column where it is located is processed again from step S21. When reprocessing this column, since writing to memory A or memory B is randomly allocated according to the balanced writing condition in step S33, there will be an opportunity to redistribute the read quantity of subsequent columns (including the above-mentioned column that causes an imbalance) to achieve a parallel state.

在步驟S31，若回溯時未發現更佳組合且回溯次數已達上限，則回復先前暫存之最佳狀況(在步驟S23被貯存)。 In step S31, if no better combination is found during the backtracking and the backtracking times have reached the upper limit, the previously saved best state (stored in step S23) is restored.

在步驟S33，依平衡的寫入條件，隨機分配非零元素的資料寫到記憶體A或記憶體B。 In step S33, based on the balanced writing condition, the data of non-zero elements are randomly allocated to be written to memory A or memory B.

在步驟S35，判斷是否已達矩陣的最後欄位。若是，則到步驟S37，否則回步驟S21繼續處理下一欄位。 In step S35, determine whether the last field of the matrix has been reached. If so, go to step S37, otherwise return to step S21 to continue processing the next field.

在步驟S37，處理開頭欄位中未確認的記憶體讀取資訊。參考最後幾個欄位的寫入資訊，得知資料所存放之記憶體，再回到開頭欄位中填入其相對應讀取的記憶體。換言之，視最後幾個欄位視為在最前面幾個欄位以前的先前記憶體資訊。 In step S37, the unconfirmed memory read information in the beginning field is processed. The memory where the data is stored is known by referring to the write information of the last few fields, and then the corresponding memory read is filled in the beginning field. In other words, the last few fields are regarded as the previous memory information before the first few fields.

在步驟S39，判斷是否每一欄位的讀取數量皆平衡(不必判斷寫入，因寫入時依平衡之數量寫入)或已達最大遞迴數。若是，則到步驟S40，否則回步驟S21。 In step S39, determine whether the read quantity of each field is balanced (no need to determine the write, because the write is based on the balanced quantity) or the maximum number of loops has been reached. If so, go to step S40, otherwise return to step S21.

參考圖5，舉例描述如何執行本發明的方法。 Referring to FIG. 5 , an example is given to describe how to implement the method of the present invention.

在步驟S21，暫不處理欄位1之資料讀取，因不知列2、3、5或6先行資料的寫被分配到記憶體A或記憶體B。在步驟S23，記錄當今狀態，並假設讀取數量平衡而由步驟S25跳至步驟S33。在步驟S33，將非零元素資料依平衡之數量寫入記憶體A與記憶體B。 In step S21, the data reading of column 1 is temporarily not processed because it is unknown whether the writing of the preceding data of columns 2, 3, 5 or 6 is allocated to memory A or memory B. In step S23, the current status is recorded, and it is assumed that the reading quantity is balanced and jumps from step S25 to step S33. In step S33, the non-zero element data is written into memory A and memory B according to the balanced quantity.

在步驟S21，暫不處理欄位2之資料讀取，因不知列1、4先行資料的寫被分配到記憶體A或記憶體B。在步驟S23，記錄當今狀態，並假設讀取數量平衡而由步驟S25跳至步驟S33。在步驟S33，將非零元素資料依平衡之數量寫入記憶體A與記憶體B。 In step S21, the data reading of column 2 is temporarily not processed because it is unknown whether the writing of the preceding data of columns 1 and 4 is allocated to memory A or memory B. In step S23, the current status is recorded, and it is assumed that the read quantity is balanced and jumps from step S25 to step S33. In step S33, the non-zero element data is written into memory A and memory B according to the balanced quantity.

在步驟S21，處理欄位3，可知列1的資料須從記憶體A讀取(由欄位2列1寫入)，列2的資料須從記憶體A讀取(由欄位1列2寫入)，列3的資料須從記憶體B讀取(由欄位2列3寫入)，列4的資料須從記憶體B讀取(由欄位2列4寫入)。 In step S21, processing column 3, it can be seen that the data of row 1 must be read from memory A (written from column 2, row 1), the data of row 2 must be read from memory A (written from column 1, row 2), the data of row 3 must be read from memory B (written from column 2, row 3), and the data of row 4 must be read from memory B (written from column 2, row 4).

在步驟S23，判斷欄位3是新欄位，並因此存當今分配狀態當最佳狀態。 In step S23, it is determined that field 3 is a new field, and therefore the current allocation state is saved as the optimal state.

在步驟S25，從步驟S21的結果，判斷兩個記憶體平衡，並因此到步驟S33。 In step S25, from the result of step S21, it is determined that the two memories are balanced, and thus the process goes to step S33.

在步驟S33，把非零元素資料依平衡之數量寫入記憶體。此時，分配列1及2的寫到記憶體A，並分配列3及4的寫到記憶體B。 In step S33, non-zero element data is written to the memory in a balanced amount. At this time, rows 1 and 2 are allocated to be written to memory A, and rows 3 and 4 are allocated to be written to memory B.

在步驟S35，判斷欄位3非最後欄位，並因此回步驟S21。 In step S35, it is determined that field 3 is not the last field, and therefore the process returns to step S21.

在步驟S21，處理欄位4，可知列2的資料須從記憶體A讀取(由欄位3列2寫入)，列4的資料須從記憶體B讀取(由欄位3列4寫入)，列5的資料須從記憶體A讀取(由欄位2列5寫入)，列6的資料須從記憶體B讀取(由欄位1列6寫入)。 In step S21, column 4 is processed. It can be seen that the data of row 2 must be read from memory A (written from column 3 column 2), the data of row 4 must be read from memory B (written from column 3 column 4), the data of row 5 must be read from memory A (written from column 2 column 5), and the data of row 6 must be read from memory B (written from column 1 column 6).

在步驟S23，判斷欄位4是新欄位，並因此存當今分配狀態當最佳狀態。 In step S23, it is determined that field 4 is a new field, and therefore the current allocation state is saved as the optimal state.

在步驟S33，把非零元素資料依平衡之數量隨機分配寫入記憶體。此時分配列2及5的寫到記憶體A，並分配列4及6的寫到記憶體B。 In step S33, the non-zero element data is randomly allocated and written into the memory according to the balanced quantity. At this time, rows 2 and 5 are allocated to be written into memory A, and rows 4 and 6 are allocated to be written into memory B.

在步驟S35，判斷欄位4非最後欄位，並因此回步驟S21。 In step S35, it is determined that field 4 is not the last field, and therefore the process returns to step S21.

在步驟S21，處理欄位5，可知列2的資料須從記憶體A讀取(由欄位4列2寫入)，列3的資料須從記憶體B讀取(由欄位3列3寫入)，列5的資料須從記憶體A讀取(由欄位4列5寫入)，列6的資料須從記憶體B讀取(由欄位4列6寫入)。 In step S21, column 5 is processed. It can be seen that the data of row 2 must be read from memory A (written from column 4 column 2), the data of row 3 must be read from memory B (written from column 3 column 3), the data of row 5 must be read from memory A (written from column 4 column 5), and the data of row 6 must be read from memory B (written from column 4 column 6).

在步驟S23，判斷欄位5是新欄位，並因此存當今分配狀態當最佳狀態。 In step S23, it is determined that field 5 is a new field, and therefore the current allocation state is saved as the optimal state.

在步驟S33，把非零元素資料依平衡之數量寫入記憶體。此時，分配列2及3的寫到記憶體A，並分配列5及6的寫到記憶體B。 In step S33, non-zero element data is written to the memory in a balanced amount. At this time, rows 2 and 3 are allocated to be written to memory A, and rows 5 and 6 are allocated to be written to memory B.

在步驟S35，判斷欄位5是最後欄位，並因此到步驟S37。 In step S35, it is determined that field 5 is the last field, and therefore the process goes to step S37.

在步驟S37，開始進行後續的環繞分配，由欄位1開始填補缺失的讀取資訊。此時可知列2的資料須從記憶體A讀取(由欄位5列2寫入)，列3的資料須從記憶體A讀取(由欄位5列3寫入)，列5的資料須從記憶體B讀取(由欄位5列5寫入)，列6的資料須從記憶體B讀取(由欄位5列6寫入)。 In step S37, the subsequent wraparound allocation begins, filling in the missing read information starting from field 1. At this point, it can be seen that the data in row 2 must be read from memory A (written from field 5 row 2), the data in row 3 must be read from memory A (written from field 5 row 3), the data in row 5 must be read from memory B (written from field 5 row 5), and the data in row 6 must be read from memory B (written from field 5 row 6).

以此方式繼續填補欄位2缺的讀取資訊。可知列1的資料須從記憶體A讀取(由欄位3列1寫入)，列3的資料須從記憶體A讀取(由欄位1列3寫入)，列4的資料須從記憶體B讀取(由欄位4列4寫入)，列5的資料須從記憶體B讀取(由欄位1列5寫入)。 Continue filling in the missing read information in field 2 in this way. It can be seen that the data in row 1 must be read from memory A (written from field 3 row 1), the data in row 3 must be read from memory A (written from field 1 row 3), the data in row 4 must be read from memory B (written from field 4 row 4), and the data in row 5 must be read from memory B (written from field 1 row 5).

在步驟S39，判斷所有欄位在讀取數量上皆為平衡(從記憶體A的讀與記憶體B的讀平衡)，並因此結束步驟S20。 In step S39, it is determined that all fields are balanced in terms of read quantity (the read from memory A is balanced with the read from memory B), and thus step S20 is terminated.

參考圖6及圖7，描述回溯1次，從一種失衡的分配狀態變成一種平衡的分配狀態。 Refer to Figures 6 and 7 to describe the backtracking once, from an unbalanced distribution state to a balanced distribution state.

在圖6，欄位1到欄位k+2中，每一欄位從兩個記憶體讀取的數量平衡。在欄位k+3讀取資料時，從兩個記憶體讀取的數量失衡，可知列1到列3的讀取是從記憶體A，僅列4的讀取是從記憶體B。 In Figure 6, in columns 1 to k+2, the amount of data read from the two memories for each column is balanced. When reading data from column k+3, the amount of data read from the two memories is unbalanced. It can be seen that the data from columns 1 to 3 are read from memory A, and only column 4 is read from memory B.

在步驟S21，發現欄位k+3有3個非零元素的讀取是從記憶體A，1個非零元素的讀取是從記憶體B。 In step S21, it is found that 3 non-zero elements in column k+3 are read from memory A, and 1 non-zero element is read from memory B.

在步驟S23，判斷欄位k+3是新欄位，並因此存當今分配狀態當最佳狀態。 In step S23, it is determined that field k+3 is a new field, and therefore the current allocation state is considered the optimal state.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的數量失衡，並因此到步驟S27。 In step S25, from the result of step S21, it is determined that the amount read from the two memories is unbalanced, and thus the process goes to step S27.

在步驟S27，判斷回溯次數未達上限，並因此到步驟S29。 In step S27, it is determined that the number of backtracking times has not reached the upper limit, and therefore the process goes to step S29.

在步驟S29，隨機選列1(圖7)，回欄位k+2，並回步驟S21。此時，回溯次數須加1。 In step S29, randomly select row 1 (Figure 7), return to column k+2, and return to step S21. At this time, the number of backtracking times must be increased by 1.

參考圖7，在步驟S21，發現欄位k+2有2個非零元素的讀取是從記憶體A，2個非零元素的讀取是從記憶體B。 Referring to Figure 7, in step S21, it is found that 2 non-zero elements in column k+2 are read from memory A, and 2 non-zero elements are read from memory B.

在步驟S23，判斷欄位k+2是舊欄位，並因此不存當今分配狀態當最佳狀態。 In step S23, it is determined that field k+2 is an old field, and therefore the current allocation state is not the best state.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的數量平衡，並因此到步驟S33。 In step S25, from the result of step S21, it is determined that the amount read from the two memories is balanced, and thus proceeds to step S33.

在步驟S33，把非零元素資料依平衡數量且隨機分配的方式寫入記憶體，此時分配列3及5的寫到記憶體A，並分配列1及4的寫到記憶體B。 In step S33, non-zero element data is written into the memory in a balanced and randomly distributed manner. At this time, rows 3 and 5 are allocated to be written into memory A, and rows 1 and 4 are allocated to be written into memory B.

在步驟S35，判斷欄位k+2非最後欄位，並因此回步驟S21。 In step S35, it is determined that field k+2 is not the last field, and therefore the process returns to step S21.

在步驟S21，發現欄位k+3有2個非零元素的讀被分配到記憶體A，2個非零元素的讀被分配到記憶體B。 In step S21, it is found that the reads of 2 non-zero elements in column k+3 are allocated to memory A, and the reads of 2 non-zero elements are allocated to memory B.

在步驟S23，判斷欄位k+3是舊欄位，並因此不存當今分配狀態當最佳狀態。 In step S23, it is determined that field k+3 is an old field, and therefore the current allocation state is not the best state.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的數量平衡，並因此到步驟S33執行資料寫入步驟。此時，資料讀取已由原本失衡的分配狀態，變成一種平衡的分配狀態。 In step S25, based on the result of step S21, it is determined that the amount of data read from the two memories is balanced, and thus the data writing step is performed in step S33. At this point, the data reading has changed from the original unbalanced distribution state to a balanced distribution state.

參考圖6及圖8，描述回溯1次，從一種失衡的分配狀態到另一種失衡分配狀態的可能狀況及後續處置方式。 Refer to Figures 6 and 8 to describe the possible situations and subsequent treatment methods of backtracking once from one unbalanced allocation state to another unbalanced allocation state.

參考圖6，在步驟S21，發現欄位k+3有3個非零元素的讀取是從記憶體A，1個非零元素的讀取是從記憶體B。 Referring to Figure 6, in step S21, it is found that 3 non-zero elements of column k+3 are read from memory A, and 1 non-zero element is read from memory B.

在步驟S23，判斷欄位k+3是新欄位，並存當今分配狀態當最佳狀態。 In step S23, it is determined that field k+3 is a new field, and the current allocation state is considered the optimal state.

在步驟S29，隨機選列3，回欄位k+2，並回步驟S21。此時回溯次數須加1。 In step S29, randomly select row 3, return to column k+2, and return to step S21. At this time, the number of backtracking must be increased by 1.

參考圖8，在步驟S21，發現欄位k+2有2個非零元素的讀取是從記憶體A，2個非零元素的讀取是從記憶體B。 Referring to Figure 8, in step S21, it is found that 2 non-zero elements in column k+2 are read from memory A, and 2 non-zero elements are read from memory B.

在步驟S33，將非零元素資料依平衡數量且隨機分配的方式寫入記憶體，分配列1及4的寫到記憶體A，並分配列3及5的寫到記憶體B。 In step S33, non-zero element data is written into the memory in a balanced and randomly distributed manner, with rows 1 and 4 being written into memory A, and rows 3 and 5 being written into memory B.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的數量依舊失衡，並因此到步驟S27。在此失衡的狀態，須在步驟S27判斷是否達回溯次數上限，並根據結果再執行回溯(回溯次數須加1)或如後述在S31執行回復。 In step S25, based on the result of step S21, it is determined that the amount read from the two memories is still unbalanced, and therefore the process goes to step S27. In this unbalanced state, it is necessary to determine in step S27 whether the upper limit of the backtracking number has been reached, and perform backtracking again based on the result (the backtracking number must be increased by 1) or perform recovery in S31 as described later.

若一直未能使分配狀態從失衡到平衡，則在某次回溯的步驟S27時，判斷回溯次數已達上限，並因此到步驟S31。在步驟S31，因無法找到更佳分配狀態，所以採用先前儲存之最佳分配狀態作為回復(在此狀況中為欄位k+3)。接著進入步驟S33執行資料寫入部分，並在步驟S35判斷是否處理下一欄位或已處理完畢。 If the allocation state cannot be changed from imbalance to balance, then at step S27 of a certain backtracking, it is determined that the backtracking times have reached the upper limit, and therefore the process goes to step S31. In step S31, since no better allocation state can be found, the previously stored best allocation state is used as a recovery (field k+3 in this case). Then, the process goes to step S33 to execute the data writing part, and in step S35, it is determined whether the next field is processed or the processing has been completed.

參考圖6及圖9，描述回溯1次後，從一種失衡的分配狀態到另一種失衡分配狀態的可能狀況，並須進行第2次且巢狀式(多次涵蓋式)回溯，並敘述相關後續處置。 Refer to Figures 6 and 9 to describe the possible situation from one unbalanced allocation state to another unbalanced allocation state after backtracking once, and the need for a second and nested (multiple covering) backtracking, and describe the relevant subsequent disposal.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的非零元素的數量失衡，並因此到步驟S27。 In step S25, from the result of step S21, it is determined that the number of non-zero elements read from the two memories is unbalanced, and thus the process goes to step S27.

在步驟S29，隨機選列2，回欄位k+2，並回步驟S21。此時，回溯次數須加1。 In step S29, randomly select row 2, return to column k+2, and return to step S21. At this time, the number of backtracking times must be increased by 1.

參考圖9，在步驟S21，發現欄位k+1有2個非零元素的讀取是從記憶體A，2個非零元素的讀取是從記憶體B。 Referring to Figure 9, in step S21, it is found that 2 non-zero elements of column k+1 are read from memory A, and 2 non-zero elements are read from memory B.

在步驟S23，判斷欄位k+1是舊欄位，並因此不存當今分配狀態當最佳狀態。 In step S23, it is determined that field k+1 is an old field, and therefore the current allocation state is not the best state.

在步驟S33，把非零元素資料依平衡數量且隨機分配的方式寫入記憶體，分配列3及5的寫到記憶體A，並分配列2及6的寫到記憶體B。 In step S33, non-zero element data is written into the memory in a balanced and randomly distributed manner, with rows 3 and 5 being written into memory A, and rows 2 and 6 being written into memory B.

在步驟S35，判斷欄位k+1非最後欄位，並因此回步驟S21。 In step S35, it is determined that field k+1 is not the last field, and therefore the process returns to step S21.

在步驟S21，發現欄位k+2有3個非零元素的讀取是從記憶體A，1個非零元素的讀取是從記憶體B。 In step S21, it is found that 3 non-zero elements in column k+2 are read from memory A, and 1 non-zero element is read from memory B.

在步驟S25，從步驟S21的結果，判斷從兩個記憶體讀取的數量依舊失衡，並因此到步驟S27。在此失衡的狀態，須在步驟S27判斷是否達回溯次數上限，並根據結果再執行回溯(回溯次數須加1)，或在S31執行回復。若判斷為可執行回溯，因此時仍在先前未完成之回溯中，同時再執行回溯，就是巢狀回溯。可重複執行回溯，直到達回溯次數上限，或成功向後續欄位推進(在步驟S23判斷現今欄位為新欄位)才結束。 In step S25, from the result of step S21, it is determined that the amount read from the two memories is still unbalanced, and therefore step S27 is reached. In this unbalanced state, it is necessary to determine in step S27 whether the upper limit of the backtracking number has been reached, and perform backtracking again according to the result (the number of backtracking must be increased by 1), or perform recovery in S31. If it is determined that backtracking can be performed, since it is still in the previously unfinished backtracking, performing backtracking at the same time is nested backtracking. Backtracking can be performed repeatedly until the upper limit of the backtracking number is reached, or the subsequent field is successfully advanced (in step S23, the current field is determined to be a new field).

分配的第二階段(步驟S40)是在前一階段已完成的記憶體A(或B)分配情況下，進一步安排所對應之記憶體位址，以檢查記憶體的可用性。參考圖10，步驟S40包括步驟S41、S43、S45、S47、S49、S51、S53、S55、S57、S59。步驟S41、S43屬於讀的階段，步驟S45、S47、S49屬於寫的階段，步驟S53、S55、S57則是後續處理環繞(wrap around)的階段。 The second phase of allocation (step S40) is to further arrange the corresponding memory address to check the availability of the memory when the memory A (or B) allocation has been completed in the previous phase. Referring to Figure 10, step S40 includes steps S41, S43, S45, S47, S49, S51, S53, S55, S57, and S59. Steps S41 and S43 belong to the read phase, steps S45, S47, and S49 belong to the write phase, and steps S53, S55, and S57 are the subsequent wrap around phase.

步驟S41，在某一欄位中，依在步驟S20決定的記憶體分配方式，從對應的記憶體依序讀取一個非零元素的資料，讀取後將此元素資料從記憶體中移除。 Step S41, in a certain field, read the data of a non-zero element from the corresponding memory in sequence according to the memory allocation method determined in step S20, and remove the element data from the memory after reading.

在步驟S43，判斷是否讀完此欄位所有的非零元素。若是，則到步驟S45執行安排寫入的步驟，否則回步驟S41繼續讀取。 In step S43, determine whether all non-zero elements in this column have been read. If so, go to step S45 to execute the step of arranging writing, otherwise return to step S41 to continue reading.

在步驟S45，先判斷目的記憶體是否為滿。若是，則表示記憶體空間不足以實現此分配方法，立即到步驟S59結束第二階段，否則到步驟S47繼續寫入的步驟。 In step S45, first determine whether the destination memory is full. If so, it means that the memory space is insufficient to implement this allocation method, and immediately go to step S59 to end the second stage, otherwise go to step S47 to continue writing.

在步驟S47，依在步驟S20決定的記憶體分配方式，依序把上述欄位中的一個非零元素的資料，寫入對應的記憶體中一個可用的位址(宜從可用的空間隨機選擇位址以增加分配多樣性)。 In step S47, according to the memory allocation method determined in step S20, the data of a non-zero element in the above field is sequentially written into an available address in the corresponding memory (it is advisable to randomly select addresses from the available space to increase allocation diversity).

在步驟S49，判斷是否寫完此欄位的非零元素資料。若是，則到步驟S51，否則回步驟S45繼續寫入。 In step S49, determine whether the non-zero element data of this field has been written. If so, go to step S51, otherwise return to step S45 to continue writing.

在步驟S51，判斷是否達最後欄位。若是，則到步驟S53，否則回步驟S41處理下一欄位。 In step S51, determine whether the last field has been reached. If so, go to step S53, otherwise return to step S41 to process the next field.

在步驟S53，執行環繞(wrap around)，再處理整個矩陣的記憶體讀取與寫入。若寫入時發生衝突(前次指定的寫入位址已被佔)，則從現今可用的記憶體空間，隨機指定一個位址覆蓋(overwrite)之前的分配。此外，此衝突發生時，註記此次環繞曾發生覆蓋。 In step S53, wrap around is executed to process the memory read and write of the entire matrix. If a conflict occurs during writing (the previously specified write address is already occupied), a random address is specified from the currently available memory space to overwrite the previous allocation. In addition, when this conflict occurs, it is noted that this wrap has been overwritten.

在步驟S55，判斷在此次環繞時(步驟S53)是否曾有覆蓋發生。若有，則到步驟S57判斷是否須再環繞，否則表示第二階段完成，分配無誤可直接結束至步驟S60。 In step S55, determine whether there has been any coverage during this wrap (step S53). If so, go to step S57 to determine whether another wrap is required. Otherwise, it means that the second stage is completed and the allocation is correct and can be ended directly to step S60.

在步驟S57，判斷是否達最大環繞次數。若是，則到步驟S59準備結束，否則回步驟S53再執行環繞。 In step S57, determine whether the maximum number of wraps has been reached. If so, proceed to step S59 to prepare for termination, otherwise return to step S53 to execute wrap again.

在步驟S59，表示在第二階段發現記憶體空間無法實現此分配方法，記錄S20與S40兩階段分配的失敗一次，到步驟S60執行後續工作。 In step S59, it is found that the memory space cannot implement this allocation method in the second stage, and the failure of the allocation in the two stages S20 and S40 is recorded once, and the subsequent work is performed in step S60.

參考圖11，舉例描述步驟S40的執行。令記憶體A及B都有3個記憶空間。依序處理欄位1、欄位2、欄位3。 Refer to Figure 11 to describe the execution of step S40. Assume that both memory A and B have 3 memory spaces. Process field 1, field 2, and field 3 in sequence.

處理欄位1時，暫不執行步驟S41及S43，因無任何記憶位址的資訊。 When processing field 1, steps S41 and S43 are not executed because there is no information about the memory address.

在步驟S45，判斷預期寫入之記憶體未滿，因此到步驟S47，把一個非零元素的資料寫入對應的記憶體中一位址。 In step S45, it is determined that the memory to be written is not full, so step S47 is performed to write a non-zero element of data into an address in the corresponding memory.

重複執行步驟S45、S47及S49，直到把欄位1元素對應資料皆寫入記憶體為止。參考圖11下方結果，列1之資料被寫到記憶體A的位址1，列2之資料被寫到記憶體A的位址2，列3之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。 Repeat steps S45, S47 and S49 until all data corresponding to the elements in column 1 are written into the memory. Referring to the result at the bottom of Figure 11, the data in row 1 is written to address 1 of memory A, the data in row 2 is written to address 2 of memory A, the data in row 3 is written to address 1 of memory B, and the data in row 5 is written to address 2 of memory B.

在步驟S51，判斷欄位1非最後欄位，並回步驟S41。 In step S51, it is determined that field 1 is not the last field, and the process returns to step S41.

處理欄位2時，重複執行步驟S41及S43，直到讀完欄位2(列4暫不執行步驟S41及S43，因無任何記憶位址的資訊)。參考圖11下方結果，列1從記憶體A的位址1讀，列3從記憶體B的位址1讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址2仍有資料。 When processing column 2, repeat steps S41 and S43 until column 2 is read (column 4 does not execute steps S41 and S43 temporarily because there is no information about the memory address). Refer to the result at the bottom of Figure 11, column 1 reads from address 1 of memory A, column 3 reads from address 1 of memory B, and column 5 reads from address 2 of memory B. When reading, the data is also removed from the corresponding memory address, and at this time, address 2 of memory A still has data.

在步驟S45，判斷預期寫入之記憶體未滿，並因此到步驟S47，把一個非零元素的資料寫入對應的記憶體中一位址。 In step S45, it is determined that the memory to be written is not full, and therefore the process goes to step S47 to write a non-zero element of data into an address in the corresponding memory.

重複執行步驟S45、S47及S49，直到把欄位2元素對應資料皆寫入記憶體為止。參考圖11下方結果，列1之資料被寫到記憶體A的位址1，列3之資料被寫到記憶體A的位址3(因位址2已有資料)，列4之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。若記憶體A及B都只有2個記憶空間，則在寫列3以前在步驟S45會判斷記憶體A已滿而無法執行寫入，直接步驟S59記錄錯誤與結束排序。 Repeat steps S45, S47 and S49 until all data corresponding to the elements in column 2 are written into the memory. Refer to the result at the bottom of Figure 11, the data in row 1 is written to address 1 of memory A, the data in row 3 is written to address 3 of memory A (because address 2 already has data), the data in row 4 is written to address 1 of memory B, and the data in row 5 is written to address 2 of memory B. If both memory A and B have only 2 memory spaces, then before writing row 3, step S45 will determine that memory A is full and cannot perform writing, and directly go to step S59 to record the error and end the sorting.

在步驟S51，判斷欄位2非最後欄位，並回步驟S41。 In step S51, it is determined that field 2 is not the last field, and the process returns to step S41.

處理欄位3時，重複執行步驟S41及S43，直到讀完欄位2。參考圖11下方結果，列1從記憶體A的位址1讀，列2從記憶體A的位址2讀，列4從記憶體B的位址2讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址3仍有資料。 When processing column 3, repeat steps S41 and S43 until column 2 is read. Refer to the result at the bottom of Figure 11, column 1 reads from address 1 of memory A, column 2 reads from address 2 of memory A, column 4 reads from address 2 of memory B, and column 5 reads from address 2 of memory B. During reading, the data is also removed from the corresponding memory address, and at this time, address 3 of memory A still has data.

重複執行步驟S45、S47及S49，先判斷對應記憶體空間再寫入，直到寫完欄位3資料為止。參考圖11下方結果，列1之資料被寫到記憶體A的位址1，列2之資料被寫到記憶體B的位址1，列4之資料被寫到記憶體A的位址2，列5之資料被寫到記憶體B的位址2。 Repeat steps S45, S47 and S49, first determine the corresponding memory space and then write, until the data in column 3 is written. Refer to the result at the bottom of Figure 11, the data in row 1 is written to address 1 of memory A, the data in row 2 is written to address 1 of memory B, the data in row 4 is written to address 2 of memory A, and the data in row 5 is written to address 2 of memory B.

在步驟S51，判斷欄位3是最後欄位，並因此到步驟S53。 In step S51, it is determined that field 3 is the last field, and therefore the process goes to step S53.

參考圖12，在步驟S53，執行第一次環繞，就是返回開頭重新檢查整個矩陣、填補缺少的資訊、並處理任何覆蓋的情形。 Referring to Figure 12, in step S53, the first loop is performed, which is to go back to the beginning to recheck the entire matrix, fill in the missing information, and handle any overlap situations.

檢驗欄位1的讀取。列1從記憶體A的位址1讀，列2從記憶體B的位址1讀，列3從記憶體A的位址3讀(因圖11中列3最後是寫入記憶體A的位址3)，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址2仍有資料。 Check the reading of column 1. Column 1 reads from address 1 of memory A, column 2 reads from address 1 of memory B, column 3 reads from address 3 of memory A (because column 3 in Figure 11 is the last to be written to address 3 of memory A), and column 5 reads from address 2 of memory B. When reading, the data is also removed from the corresponding memory address, and at this time, address 2 of memory A still has data.

接著，檢驗欄位1的寫入。列1之資料被寫到記憶體A的位址1，列2之資料發生衝突無法寫到記憶體A的位址2，列3之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。注意，因列2之資料原指定之記憶體A位址2被佔，故寫入位址3。發生覆蓋情形，予以紀錄。 Next, check the writing of column 1. The data in row 1 is written to address 1 of memory A. The data in row 2 conflicts and cannot be written to address 2 of memory A. The data in row 3 is written to address 1 of memory B. The data in row 5 is written to address 2 of memory B. Note that because the address 2 of memory A originally specified by the data in row 2 is occupied, it is written to address 3. An overwrite occurs and is recorded.

檢驗欄位2的讀取。列1從記憶體A的位址1讀，列3從記憶體B的位址1讀，列4從記憶體A的位址2讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址3仍有資料。 Check the reading of column 2. Row 1 reads from address 1 of memory A, row 3 reads from address 1 of memory B, row 4 reads from address 2 of memory A, and row 5 reads from address 2 of memory B. During the reading, the data is also removed from the corresponding memory address, and the data is still left at address 3 of memory A.

接著，檢驗欄位2的寫入。列1之資料被寫到記憶體A的位址1，列3之資料被寫到記憶體A的位址2，列4之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。 Next, check the writing of column 2. The data in row 1 is written to address 1 of memory A, the data in row 3 is written to address 2 of memory A, the data in row 4 is written to address 1 of memory B, and the data in row 5 is written to address 2 of memory B.

檢驗欄位3的讀取。列1從記憶體A的位址1讀，列3從記憶體A的位址3讀，列4從記憶體B的位址1讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址2仍有資料。 Check the reading of column 3. Row 1 reads from address 1 of memory A, row 3 reads from address 3 of memory A, row 4 reads from address 1 of memory B, and row 5 reads from address 2 of memory B. During the reading, the data is also removed from the corresponding memory address, and the data is still left at address 2 of memory A.

接著，檢驗欄位3的寫入。列1之資料被寫到記憶體A的位址1，列2之資料被寫到記憶體B的位址1，列4之資料發生衝突而無法寫到記憶體A的位址2，列5之資料被寫到記憶體B的位址2。注意，列4之資料因原指定之記憶體A位址2被佔，故寫入位址3。發生覆蓋情形，予以紀錄。第1次環繞結束，離開步驟S53。 Next, check the writing of column 3. The data in row 1 is written to address 1 of memory A, the data in row 2 is written to address 1 of memory B, the data in row 4 conflicts and cannot be written to address 2 of memory A, and the data in row 5 is written to address 2 of memory B. Note that the data in row 4 is written to address 3 because the originally specified address 2 of memory A is occupied. An overwrite occurs and is recorded. The first loop ends and exits step S53.

在步驟S55，由上述紀錄判斷此次環繞有覆蓋發生，並到步驟S57。 In step S55, it is determined from the above records that this wraparound has coverage, and the process goes to step S57.

在步驟S57，令最大環繞次數為10，當今環繞次數為1，其數目小於10，因此回步驟S53再環繞。 In step S57, the maximum number of loops is set to 10. The current number of loops is 1, which is less than 10, so return to step S53 and loop again.

參考圖13，在步驟S53，執行第二次環繞，就是返回開頭重新檢查整個矩陣、填補缺少的資訊、並處理任何覆蓋的情形。 Referring to Figure 13, in step S53, the second loop is performed, which is to go back to the beginning to recheck the entire matrix, fill in the missing information, and handle any overlap situations.

與第一次環繞作法相同，依序檢驗欄位1的讀跟寫，欄位2的讀跟寫，欄位3的讀跟寫。注意，第1欄列2寫資料時，因原指定之記憶體A位址3被佔，故寫入位址2。發生覆蓋情形，予以紀錄。第3欄列4寫入資料時，因原指定之記憶體A位址3被佔，故寫入位址2。發生覆蓋情形，予以紀錄。第2次環繞結束，離開步驟S53。 Same as the first loop, check the read and write of field 1, field 2, and field 3 in sequence. Note that when writing data in column 1, address 2 is written because the originally specified memory A address 3 is occupied. If an overwrite occurs, it is recorded. When writing data in column 3, address 2 is written because the originally specified memory A address 3 is occupied. If an overwrite occurs, it is recorded. The second loop ends and exits step S53.

在步驟S55，由上述紀錄判斷此次環繞依舊有覆蓋發生，並到步驟S57。 In step S55, it is determined from the above records that this wrap still has coverage, and the process goes to step S57.

在步驟S57，因最大環繞次數為10，當今環繞次數為2，其數目小於10，因此回步驟S53再環繞。 In step S57, since the maximum number of loops is 10 and the current number of loops is 2, which is less than 10, it returns to step S53 and loops again.

在此例，因執行步驟S53環繞時，第1欄列2之寫入與第3欄列4之寫入，會不斷在記憶體A的位址2與位址3切換，故每次環繞時皆會導致覆蓋情況發生，並經步驟S55到步驟S57判斷環繞次數上限，然後回步驟S53執行下次環繞。最後一次環繞在步驟S57因為達到環繞次數10次的上限，到步驟S59記錄錯誤並結束第二階段排序。 In this example, when executing step S53, the writing of column 2 in column 1 and column 4 in column 3 will continuously switch between address 2 and address 3 of memory A, so each time it wraps, an overwrite will occur, and the upper limit of the wrapping times is determined through step S55 to step S57, and then it returns to step S53 to execute the next wrapping. The last wrapping reaches the upper limit of 10 times in step S57, so an error is recorded in step S59 and the second stage sorting ends.

在以上描述，每一個記憶體都有3個記憶空間。在以下描述，每一個記憶體都有4個記憶空間。記憶體的空間愈多，發生覆蓋的可能性愈低，因而更有機會成功找到排序。 In the above description, each memory has 3 memory spaces. In the following description, each memory has 4 memory spaces. The more memory spaces a memory has, the lower the probability of overwriting, and thus the greater the chance of successfully finding a sort.

參考圖14，舉例描述在圖11排列後，使用4個記憶空間的記憶體(記憶體A及B都各有4個記憶空間)來做環繞，並在環繞時執行檢驗與覆蓋，最後成功完成排序的過程。環繞時，依序處理欄位1、欄位2、欄位3。 Refer to Figure 14, which describes an example of using a memory with 4 memory spaces (memory A and B each have 4 memory spaces) to wrap around after the arrangement in Figure 11, and performing verification and overwriting during wrapping, and finally successfully completing the sorting process. During wrapping, field 1, field 2, and field 3 are processed in sequence.

接著，檢驗欄位1的寫入。列1之資料被寫到記憶體A的位址1，列2之資料發生衝突無法寫到記憶體A的位址2，列3之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。注意，列2之資料因原指定之記憶體A位址2被佔，故依隨機方式選擇現今可用之空間，此處假設隨機指定結果為使用記憶體A之位址4。由於列2在此發生覆蓋情形，予以紀錄。 Next, check the writing of column 1. The data in row 1 is written to address 1 of memory A. The data in row 2 conflicts and cannot be written to address 2 of memory A. The data in row 3 is written to address 1 of memory B. The data in row 5 is written to address 2 of memory B. Note that the data in row 2 is randomly selected from the currently available space because the originally specified address 2 of memory A is occupied. Here, it is assumed that the random assignment result is to use address 4 of memory A. Since row 2 is overwritten here, it is recorded.

檢驗欄位2的讀取。列1從記憶體A的位址1讀，列3從記憶體B的位址1讀，列4從記憶體A的位址2讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址4仍有資料。 Check the reading of column 2. Row 1 reads from address 1 of memory A, row 3 reads from address 1 of memory B, row 4 reads from address 2 of memory A, and row 5 reads from address 2 of memory B. During the reading, the data is also removed from the corresponding memory address, and now there is still data at address 4 of memory A.

接著，檢驗欄位2的寫入。列1之資料被寫到記憶體A的位址1，列3之資料被寫到記憶體A的位址3，列4之資料被寫到記憶體B的位址1，列5之資料被寫到記憶體B的位址2。 Next, check the writing of column 2. The data in row 1 is written to address 1 of memory A, the data in row 3 is written to address 3 of memory A, the data in row 4 is written to address 1 of memory B, and the data in row 5 is written to address 2 of memory B.

檢驗欄位3的讀取。列1從記憶體A的位址1讀，列3從記憶體A的位址4讀，列4從記憶體B的位址1讀，列5從記憶體B的位址2讀。讀取時還把資料從相對應的記憶體位址移除，此時剩記憶體A的位址3仍有資料。 Check the reading of column 3. Row 1 reads from address 1 of memory A, row 3 reads from address 4 of memory A, row 4 reads from address 1 of memory B, and row 5 reads from address 2 of memory B. During the reading, the data is also removed from the corresponding memory address, and the data is still left at address 3 of memory A.

接著，檢驗列3的寫入。列1之資料被寫到記憶體A的位址1，列2之資料被寫到記憶體B的位址1，列4之資料被寫到記憶體A的位址2，列5之資料被寫到記憶體B的位址2。第1次環繞結束，離開步驟S53。 Next, check the writing of row 3. The data of row 1 is written to address 1 of memory A, the data of row 2 is written to address 1 of memory B, the data of row 4 is written to address 2 of memory A, and the data of row 5 is written to address 2 of memory B. The first loop ends and exits step S53.

在步驟S57，假定最大環繞次數為10，當今環繞次數為1，其數目小於10，因此回步驟S53再次環繞。 In step S57, assuming that the maximum number of loops is 10, the current number of loops is 1, which is less than 10, so return to step S53 to loop again.

在步驟S53執行第二次環繞，用第一次環繞結果進行分配檢驗，確認分配方式無變動，因此任何覆蓋紀錄。 In step S53, execute the second loop and use the result of the first loop to check the allocation and confirm that the allocation method has not changed, so any overwrite record is not recorded.

在步驟S55判斷無覆蓋，分配成功，並因此離開此第二階段排序。 In step S55, it is determined that there is no overlap, the allocation is successful, and thus the second stage sorting is exited.

以上僅為描述本發明的較佳實施方式，非用以限定本發明的範圍。本技術領域內的一般技術人員根據上述實施例所作的均等變化，以及本領域內技術人員熟知的改變，仍在本發明的範圍內。 The above is only to describe the preferred implementation of the present invention, and is not intended to limit the scope of the present invention. Equal changes made by ordinary technicians in this technical field based on the above embodiments, as well as changes known to technicians in this field, are still within the scope of the present invention.

S20:分配的第一階段 S20: The first stage of allocation

S40:分配的第二階段 S40: Second stage of allocation

S60:更新最佳結果 S60: Update best results

Claims

A method for improving the parallelism of LDPC reorganization decoder by re-arranging LDPC codes comprises the following steps:

Provide two memories for the above LDPC reassembly decoder;

Evenly distribute the non-zero elements of the above LDPC code matrix to the above two memories (S20);

According to the above allocation method, write the above non-zero elements to the above two memories (S40);

Update allocation method (S60);

Determine whether the maximum number of recursions has been reached (S70);

If the maximum number of recursions has not been reached, the process returns to the step of evenly allocating the non-zero elements to the two memories (S20);

If the maximum number of recursions is reached, it is determined whether the allocation is successful (S80);

If the allocation is successful, a scheduling file is generated (S90); and

If the allocation fails, then it ends.

The method for improving the parallelism of LDPC reorganization decoder by rearranging LDPC code as described in claim 1, wherein after reaching the maximum number of recursions or the data reading of each field is balanced, the method further includes the following steps:

According to the above allocation method, in a certain column, a non-zero element is read from the corresponding memory in sequence, and after reading, the element is removed from the memory (S41);

Determine whether all non-zero element data of this field have been read (S43);

If the non-zero element data of this field has not been read, return to the above step of reading a non-zero element data in a field from the corresponding memory (S41);

If all non-zero element data of this field are read, it is determined whether the expected write memory is full (S45);

If any of the above memories is full, the allocation failure is recorded once and the process ends (S59);

If the expected write memory is not full, then according to the above allocation method, write a non-zero element data in the above field into an available address in the corresponding memory in sequence (S47);

Determine whether the non-zero element data of this field has been written (S49);

If all non-zero element data of this field have not been written, return to the above step of determining whether any of the above memories is full (S45);

If all non-zero element data of this field are written, it is determined whether the last field has been reached (S51);

If the last field is not reached, return to the above step of sequentially reading a non-zero element in a field from the corresponding memory (S41);

If the last field is reached, a wrap is performed to check whether the entire matrix has any address conflicts and needs to be reallocated (S53);

Determine whether there is coverage during this orbit (S55);

If there is no coverage during this orbit, it ends;

If there is coverage during this orbit, it is determined whether the number of orbits reaches an upper limit (S57);

If the number of loops does not reach the upper limit, return to the above loop execution step (S53); and

If the number of loops reaches the upper limit, the allocation failure is recorded once and the process ends (S59).

A method for improving the parallelism of an LDPC reorganization decoder comprises the following steps: providing two memories for the LDPC reorganization decoder; calculating the amount of non-zero element data in a field of the matrix of the LDPC code, respectively read from the two memories (S21); if the current field is a new field, storing the current overall allocation state as the best state (S23); judging whether the amount of non-zero element data in the field belonging to the two memories is equal (S24); S25); if the amount of non-zero element data read from the two memories is not equal, determine whether the backtracking times have reached the upper limit (S27); if the backtracking times have not reached the upper limit, randomly select a non-zero element that causes the imbalance, return to the previous non-zero element in the same column, and reprocess the column where it is located (S29); if the backtracking times have reached the upper limit, restore the previous best state (S31); if the amount of non-zero elements belonging to the two memories is equal, then Evenly and randomly distribute the data of the non-zero elements to be written into the two memories (S33); determine whether the last field is reached (S35); if the last field is not reached, return to the above method to calculate the non-zero element data in the next field and read the number from the two memories respectively (S21); if the last field is reached, use the writing information of the last few fields to know the memory where the data is stored, and then return to the beginning field to fill in the corresponding memory read, and record the previous The number of reads from the two memories in each field of the surface (S37); determines whether the maximum recursion number is reached, or the data reading of each field is balanced (S39); if the maximum recursion number is not reached, and the data reading of each field is not balanced, then the above calculation of the number of non-zero element data in a field of the LDPC code matrix read from the two memories (S21); and if the maximum recursion number is reached, or the data reading of each field is balanced, then the process ends.

The method for improving the parallelism of the LDPC recombinant decoder as described in claim 3, wherein after reaching the maximum number of recursions or the data reading of each field is balanced, further comprises the following steps: in a certain field, a non-zero element is sequentially read from the corresponding memory according to the above allocation method, and the element is removed from the memory after reading (S41); it is determined whether all the non-zero element data of this field have been read (S43); if the non-zero element data of this field has not been read, then return to the above step of reading a non-zero element data in a certain field from the corresponding memory (S41 ...1); if the non-zero element data of this field has not been read, then return to the above step of reading a non-zero element data in a certain field from the corresponding memory (S41); if the non-zero element data of this field has not been read, then return to the After all non-zero element data of this field are written, it is determined whether the expected write memory is full (S45); if any of the above memories are full, the allocation failure is recorded once and the process ends (S59); if the expected write memory is not full, a non-zero element data in the above field is sequentially written into an available address in the corresponding memory according to the above allocation method (S47); it is determined whether the non-zero element data of this field is written (S49); if all non-zero element data of this field are not written, it returns to the step of determining whether any of the above memories is full (S45);

If the last field is not reached, return to the above step of sequentially reading a non-zero element in a certain field from the corresponding memory (S41);

Determine whether there is coverage during this orbit (S55);

If there is no coverage during this orbit, it ends;

If the number of loops does not reach the upper limit, the process returns to the above loop execution step (S53); and