CN104617959B

CN104617959B - A LDPC Coding and Decoding Method Based on General Processor

Info

Publication number: CN104617959B
Application number: CN201510026526.1A
Authority: CN
Inventors: 牛凯; 贺志强; 张竟意
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2015-01-20
Filing date: 2015-01-20
Publication date: 2017-09-05
Anticipated expiration: 2035-01-20
Also published as: CN104617959A

Abstract

The present application discloses an LDPC encoding method, which determines the vectors p ₁ and p ₂ and obtains the encoding result vector; when determining the vectors p ₁ and p ₂ , the multiplication process between any matrix and any vector includes: Each row is used as a thread to multiply the corresponding row of the matrix with any vector, and the multiplication results of all rows form a result vector; the multiplication operation of each row of any matrix with any vector includes: determining the matrix The starting position of the vector corresponding to each element j of row i, the data of the length Z-A _i,j in any vector from the starting position is left-shifted by means of single instruction multiple data flow, and the starting position is shifted to the left After moving the data of length A _{i, j} from the start position to the left-shifted data, the vector shift result corresponding to element j is obtained; then add the vector shift results of each element. By the method described above, it is possible to increase the encoding speed in a general-purpose processor by utilizing multi-threading and SIMD processing.

Description

A LDPC Coding and Decoding Method Based on General Processor

技术领域technical field

本申请涉及LDPC编译码技术，特别涉及一种基于通用处理器的LDPC编译码方法。The present application relates to LDPC encoding and decoding technology, in particular to an LDPC encoding and decoding method based on a general-purpose processor.

背景技术Background technique

LDPC码，是一种码长较大的线性分组码。其校验矩阵也较大，并且校验矩阵中的非零元素很少，即“1”的个数很少，故称低密度。LDPC code is a linear block code with a large code length. The check matrix is also relatively large, and there are very few non-zero elements in the check matrix, that is, the number of "1" is very small, so it is called low density.

在实现IEEE 802.11n无线局域网传输协议的过程中，需使用到LDPC编译码技术，依协议要求，其中LDPC PPDU(Presentation Protocol Data Unit，表示层协议数据单元)的生成过程如下，参见图1：In the process of implementing the IEEE 802.11n wireless LAN transmission protocol, LDPC coding and decoding technology is required. According to the protocol requirements, the generation process of LDPC PPDU (Presentation Protocol Data Unit, presentation layer protocol data unit) is as follows, see Figure 1:

(1)计算缩短比特位(1) Calculate the shortened bits

(1a)计算可用比特数N_avbits，公式为：(1a) Calculate the number of available bits N _avbits , the formula is:

N_pld＝length×8+16， N _pld = length × 8 + 16,

其中，如果有STBC(Space-time block code)预编码，则标志位m_STBC为2，否则为1；N_CBPS表示每个符号的编码比特数；length表示PSDU(presentation Service DataUnit)的字节数，即为信息比特位的字节数；N_pld表示PSDU和SERVICE FIELD的总的比特数；R表示编码码率。Among them, if there is STBC (Space-time block code) precoding, the flag bit m _STBC is 2, otherwise it is 1; N _CBPS indicates the number of coded bits per symbol; length indicates the number of bytes of PSDU (presentation Service DataUnit) , which is the number of bytes of information bits; N _pld represents the total number of bits of PSDU and SERVICE FIELD; R represents the coding rate.

(1b)计算LDPC码字个数N_CW和码长L_LDPC (1b) Calculate the number of LDPC codewords N _CW and the code length L _LDPC

当N_avbits≤648时，码字个数N_CW为1，且如果N_avbits≥N_pld+912×(1-R)时，码长L_LDPC为1296，否则码长L_LDPC为648；当648<N_avbits≤1296时，码字个数N_CW为1，且如果N_avbits≥N_pld+1464×(1-R)时，码长L_LDPC为1944，否则码长L_LDPC为1296；当1296<N_avbits≤1944时，码字个数N_CW为1，此时码长L_LDPC为1944；当1944<N_avbits≤2592时，码字个数N_CW为2，且如果N_avbits≥N_pld+2916×(1-R)时，码长L_LDPC为1944，否则码长L_LDPC为1296；当N_avbits>2592时，码字个数N_CW为此时码长L_LDPC为1944；When N _{avbits ≤} 648, the number of code words N _CW is 1, and if N _{avbits ≥} N _pld +912×(1-R), the code length L _LDPC is 1296, otherwise the code length L _LDPC is 648; when 648 When <N _avbits ≤1296, the number of code words N _CW is 1, and if N _avbits ≥N _pld +1464×(1-R), the code length L _LDPC is 1944, otherwise the code length L _LDPC is 1296; when 1296 When <N _avbits ≤1944, the number of codewords N _CW is 1, and the code length L _LDPC is 1944; when 1944<N _avbits ≤2592, the number of codewords N _CW is 2, and if N _avbits ≥N _pld When +2916×(1-R), the code length L _LDPC is 1944, otherwise the code length L _LDPC is 1296; when N _avbits >2592, the number of code words N _CW is At this time, the code length L _LDPC is 1944;

(1c)计算缩短比特位个数N_shrt，缩短比特位在LDPC编码前填充到信息比特位之后：(1c) Calculate the number of shortened bits N _shrt , the shortened bits are filled after the information bits before LDPC encoding:

N_shrt＝max(0,(N_CW×L_LDPC×R)-N_pld)N _shrt ＝max(0,(N _CW ×L _LDPC ×R)-N _pld )

当N_shrt＝0时，不进行补0操作。当N_shrt>0时，缩短比特位在所有N_CW个码字上平均分布，即每个码字分配到的缩短比特位个数为若是N_shrtmodN_CW≠0，，其中mod为取余，即N_shrt对N_CW取余，则第一个码字比其他码字多一个缩短比特位。When N _shrt =0, no 0 complement operation is performed. When N _shrt > 0, the shortened bits are evenly distributed on all N _CW codewords, that is, the number of shortened bits allocated to each codeword is If N _shrt mod N _CW ≠ 0, where mod is the remainder, that is, N _shrt takes the remainder of N _CW , then the first codeword has one more shortened bit than the other codewords.

(2)进行LDPC编码，得到检验比特位。(2) Perform LDPC encoding to obtain check bits.

(3)丢弃缩短比特位(3) Discard the shortened bits

(4)计算打孔比特位个数并丢弃打孔比特位，根据下式计算LDPC编码后打孔比特位个数N_punc：(4) Calculate the number of punched bits and discard the punched bits, calculate the number of punched bits N _punc after LDPC encoding according to the following formula:

N_punc＝max(0,(N_CW×L_LDPC)-N_avbits-N_shrt)N _punc ＝max(0,(N _CW ×L _LDPC )-N _avbits- N _shrt )

如果或者(N_punc>0.3×N_CW×L_LDPC×(1-R))，增大N_avbits然后根据下式重新计算N_punc：if Or (N _punc >0.3×N _CW ×L _LDPC ×(1-R)), increase N _avbits and recalculate N _punc according to the following formula:

N'_avbits＝N_avbits+N_CBPS×m_STBC，N_punc＝max(0,(N_CW×L_LDPC)-N'_avbits-N_shrt)N' _avbits =N _avbits +N _CBPS ×m _STBC , N _punc =max(0,(N _CW ×L _LDPC )-N' _avbits -N _shrt )

打孔比特位在所有N_CW个码字上平均分布，即每个码字分配到的打孔比特位个数为若是N_puncmodN_CW≠0，其中mod为取余，即N_punc对N_CW取余，则第一个码字比其他码字多一个打孔比特位。The punctured bits are evenly distributed on all N _CW codewords, that is, the number of punctured bits allocated to each codeword is If N _punc mod N _CW ≠ 0, where mod is the remainder, that is, N _punc takes the remainder of N _CW , then the first codeword has one more punctured bit than other codewords.

(5)计算重复比特位，根据下式计算重复比特位个数N_rep：(5) Calculate the repeated bits, calculate the number of repeated bits N _rep according to the following formula:

N_rep＝max(0,N'_avbits-N_CW×L_LDPC×(1-R)-N_pld)N _rep ＝max(0,N' _avbits -N _CW ×L _LDPC ×(1-R)-N _pld )

重复比特位在所有N_CW个码字上平均分布，即每个码字分配到的重复比特位个数为若是N_repmodN_CW≠0，，其中mod为取余，即N_rep对N_CW取余，则第一个码字比其他码字多一个重复比特位。重复比特位从信息比特位的第一个比特开始顺序选取，直到满足长度要求，重复比特位是从去掉的缩短比特位后的码字中复制的。选出的重复比特位顺序连接在校验比特位之后。当需要打孔时，检验比特位不需要重复，反之亦然。The repeated bits are evenly distributed on all N _CW codewords, that is, the number of repeated bits allocated to each codeword is If N _rep mod N _CW ≠ 0, where mod is the remainder, that is, N _rep takes the remainder of N _CW , then the first codeword has one more repetition bit than other codewords. The repetition bits are sequentially selected from the first bit of the information bits until the length requirement is satisfied, and the repetition bits are copied from the codeword after the shortened bits are removed. The selected repetition bits are sequentially connected after the parity bits. When puncturing is required, check bits need not be repeated, and vice versa.

在LDPC PPDU生成过程中，LDPC编码方法最为重要，经过LDPC编码后输出的码字向量记为c＝(S,p₁,p₂)，其中S为信息向量，p₁和p₂为码字校验向量，但因LDPC码的校验矩阵H较大，编码过程中的运算将十分繁琐。观察协议中给出的校验矩阵可以看出，不同码率R下的矩阵其行重均为24，其列重均为24×(1-R)，根据校验矩阵H的特性，将其进行如下分块分成矩阵A、矩阵B、矩阵D、矩阵E、矩阵T以及矩阵F六个子矩阵，其中矩阵B、矩阵D、矩阵E以及矩阵T的结构较为特殊，B＝(1--…0-…)^T，D＝(1)，E＝(-…-0)，矩阵A和矩阵F的结构无规律性，参见协议802.11n。另外，由于校验矩阵H规模较大，因此，在表示校验矩阵时，通过一个元素表示实际校验矩阵中的一个子矩阵，具体地，在校验矩阵H和分块矩阵A、B、D、E、T、F的表示方法中，“—”表示该子矩阵为零矩阵，“0”表示该子矩阵为单位矩阵，“常数C”表示该子矩阵为单位矩阵C次循环右移位后的结果矩阵。其中子矩阵的维数为Z*Z，Z可以预先根据码长确定。通过上述方式，可以大大减少校验矩阵和各个分块矩阵的表示大小。In the process of LDPC PPDU generation, the LDPC encoding method is the most important. The code word vector output after LDPC encoding is denoted as c=(S,p ₁ ,p ₂ ), where S is the information vector, p ₁ and p ₂ are the code words Check vector, but because the check matrix H of the LDPC code is relatively large, the operation in the encoding process will be very cumbersome. Observing the parity check matrix given in the protocol, it can be seen that the row weight of the matrix under different code rates R is 24, and the column weight is 24×(1-R). According to the characteristics of the parity check matrix H, its Do the following block It is divided into six sub-matrices: matrix A, matrix B, matrix D, matrix E, matrix T and matrix F, among which the structure of matrix B, matrix D, matrix E and matrix T is relatively special, B=(1--...0-...) ^T , D=(1), E=(-...-0), The structures of matrix A and matrix F are irregular, see protocol 802.11n. In addition, due to the large scale of the check matrix H, when representing the check matrix, one element represents a sub-matrix in the actual check matrix, specifically, the check matrix H and the block matrix A, B, In the expression methods of D, E, T, and F, "—" indicates that the sub-matrix is a zero matrix, "0" indicates that the sub-matrix is an identity matrix, and "constant C" indicates that the sub-matrix is an identity matrix. The resulting matrix after bit. The dimension of the sub-matrix is Z*Z, and Z can be determined in advance according to the code length. Through the above method, the representation size of the parity check matrix and each block matrix can be greatly reduced.

在已知校验矩阵H和信息向量S的前提下，确定码字向量c的具体方式为：根据校验方程Hc^T＝0^T可得分解方程经过优化后可得到求出p₁和p₂后，即可求出码字向量c＝(S,p₁,p₂)。On the premise that the check matrix H and the information vector S are known, the specific way to determine the codeword vector c is: according to the check equation Hc ^T =0 ^T , the decomposition equation can be obtained can be obtained after optimization After calculating p ₁ and p ₂ , the code word vector c=(S,p ₁ ,p ₂ ) can be calculated.

参照上述LDPC编码过程的编码器的组成参见图2，其中含有4种功能模块：编码矩阵生成器、矩阵乘法器、矩阵加法器以及LDPC码字合成器。Refer to Figure 2 for the composition of the encoder referring to the above LDPC encoding process, which contains four functional modules: encoding matrix generator, matrix multiplier, matrix adder and LDPC codeword synthesizer.

该编码器中共有6个编码矩阵生成器，其输入是一个矩阵，分别是矩阵A、矩阵B、矩阵D、矩阵E、矩阵T以及矩阵F六个子矩阵，其输出是一个矩阵，是经过编码矩阵生成器处理后的矩阵，其功能是将输入的矩阵通过压缩存储的方式，即仅存矩阵非零元素，将输入矩阵进行变换，得到输出矩阵。There are 6 encoding matrix generators in the encoder, and its input is a matrix, which are six sub-matrices of matrix A, matrix B, matrix D, matrix E, matrix T and matrix F, and its output is a matrix, which is encoded The function of the matrix processed by the matrix generator is to compress and store the input matrix, that is, only store the non-zero elements of the matrix, and transform the input matrix to obtain the output matrix.

该编码器中共有6个乘法器，其有两个输入端一个输出端，两个输入端分别是信息向量S、矩阵A、通过编码矩阵生成器处理后的矩阵、通过其他乘法器的结果矩阵或者通过加法器后的结果矩阵中的两个，其输出是一个矩阵，是两个输入端进行乘法运算后的结果向量，其功能是将两个输入端进行矩阵乘法运算并输出结果矩阵。There are 6 multipliers in the encoder, which have two input terminals and one output terminal. The two input terminals are the information vector S, the matrix A, the matrix processed by the encoding matrix generator, and the result matrix of other multipliers. Or through two of the result matrices after the adder, the output is a matrix, which is the result vector after the multiplication of the two input terminals, and its function is to perform matrix multiplication on the two input terminals and output the result matrix.

该编码器共有2个矩阵加法器，其输入是两个矩阵，均为编码器中矩阵乘法器输出后矩阵，其输出是一个矩阵，是两个输入矩阵进行矩阵加法后的结果，其功能是将两个输入矩阵进行矩阵加法并输出结果矩阵。The encoder has two matrix adders. The input is two matrices, both of which are the output matrices of the matrix multiplier in the encoder. The output is a matrix, which is the result of matrix addition of two input matrices. Its function is Performs matrix addition of two input matrices and outputs the resulting matrix.

该编码器中有1个LDPC码字合成器,其输入是三个向量，是信息向量S、码字校验向量p₁和码字校验向量p₂，其输出是一个向量，是码字向量c，其功能是将信息向量S、码字校验向量p₁和码字校验向量p₂三个向量合成码字向量c＝(S,p₁,p₂)并码字向量c。There is an LDPC codeword synthesizer in the encoder, and its input is three vectors, which are information vector S, codeword check vector p ₁ and codeword check vector p ₂ , and its output is a vector, which is codeword The function of the vector c is to synthesize the three vectors of the information vector S, the codeword check vector p ₁ and the codeword check vector p ₂ into the codeword vector c=(S,p ₁ ,p ₂ ) and the codeword vector c.

上述即为现有的LDPC编码方法和相应的编码器构成。在接收端，还需要对接收的LDPC码字进行译码，得到重建的信息向量。已有的LDPC译码技术，其主要步骤如下：The above is the existing LDPC encoding method and the corresponding encoder configuration. At the receiving end, it is also necessary to decode the received LDPC codeword to obtain a reconstructed information vector. The main steps of the existing LDPC decoding technology are as follows:

(1)将M个校验点分成M_b层，每层包括T个校验节点。接下来，一层接一层的顺序执行译码过程。在第一层处理过程中计算校验节点和变量节点的信息，第一层译码过程结束之后，第二层使用从第一层得到的变量节点的信息进行初始化，并以此类推；(1) Divide the M checkpoints into M _b layers, and each layer includes T checkpoints. Next, the decoding process is performed sequentially layer by layer. The information of check nodes and variable nodes is calculated during the processing of the first layer. After the decoding process of the first layer is completed, the second layer is initialized with the information of variable nodes obtained from the first layer, and so on;

(2)初始化：用LLRs(log-likelihood ratios，即的信息对变量节点的值进行初始化，并将所有的校验节点信息置为0，译码算法的迭代次数为I，迭代过程按行进行，其中最小和算法中的n∈N_m表示校验矩阵原型H_b中[H_b]_m,n≠'-'的列；(2) Initialization: use LLRs (log-likelihood ratios, ie The information on the variable node The value is initialized, and all check node information Set to 0, the number of iterations of the decoding algorithm is I, and the iterative process is carried out by row, wherein n∈N _m in the minimum sum algorithm represents the column of [H _b ] _{m, n} ≠ '-' in the check matrix prototype H _b ;

(3)求最小值：变量节点向量q_n的循环右移位(移位次数S(m,n)＝[H_b]_m,n)减去校验节点信息将结果存在向量t_n中，根据OMS(offset min-sum，即的值重用特性，只需要计算向量中元素的最小值和次小值；(3) Finding the minimum value: the cyclic right shift of the variable node vector q _n (shift times S(m,n)=[H _b ] _m,n ) minus the check node information Store the result in the vector t _n , according to OMS(offset min-sum, ie The value reuse feature of , only need to calculate the minimum value and the second minimum value of the elements in the vector;

(4)选择最小值：对n∈N_m，计算更新q_n和的值。(4) Choose the minimum value: for n∈N _m , calculate and update q _n and value.

为实现上述译码方法，现有译码器参见图3，由4个部分组成，分别为初始化译码器单元、最小值和次小值选择单元、数据截短单元和循环移位单元。In order to realize the above-mentioned decoding method, the existing decoder is shown in Fig. 3 and consists of 4 parts, namely an initialization decoder unit, a minimum value and next minimum value selection unit, a data truncation unit and a cyclic shift unit.

该译码器中的初始化译码器单元，其输入是一个LDPC检验矩阵，其输出是一个经过初始化译码单元处理后的检验矩阵，其功能是根据译码器输入要求将检验矩阵进行存储转换并输出处理后的检验矩阵。The initialization decoder unit in the decoder, its input is an LDPC inspection matrix, its output is an inspection matrix processed by the initialization decoding unit, and its function is to store and convert the inspection matrix according to the input requirements of the decoder And output the processed test matrix.

该译码器中的最小值和次小值选择单元，其输入是两个矩阵，其中一个是经过初始化译码器单元处理后的检验矩阵，另一个是LDPC码字矩阵c，即经过无线信道传输后的LDPC码字矩阵c，其输出是一个经过最小值和次小值选择单元处理后的矩阵，其功能是计算出变量节点循环右移位与校验节点之间的差值的最小值与次小值并输出结果矩阵提供给数据截短单元和循环移位单元。The minimum value and second minimum value selection unit in the decoder, its input is two matrices, one of which is the inspection matrix after the initial decoder unit processing, and the other is the LDPC code word matrix c, that is, through the wireless channel The output of the transmitted LDPC code word matrix c is a matrix processed by the minimum value and the second minimum value selection unit, and its function is to calculate the minimum value of the difference between the variable node cyclic right shift and the check node and the second smallest value and output the result matrix to provide to the data truncation unit and the cyclic shift unit.

该译码器的数据截短单元，其输入是一个由最小值和次小值选择单元处理后输出的矩阵，其输出是一个经过数据截短单元处理后的矩阵，其功能是为防止校验节点信息的溢出，对其进行数据截短处理，并将输出结果矩阵提供给循环移位单元。The data truncation unit of the decoder, its input is a matrix output after being processed by the minimum value and the second minimum value selection unit, and its output is a matrix processed by the data truncation unit, and its function is to prevent verification The overflow of node information is processed by data truncation, and the output result matrix is provided to the cyclic shift unit.

该译码器的循环移位单元，其输入是两个矩阵，其中一个由数据截短单元处理后输出的矩阵，另一个是由最小值和次小值选择单元处理后输出的矩阵，其输出是一个经过循环移位单元处理后的矩阵，其功能是通过将最小值矩阵与校验节点矩阵进行按位模二加算出变量节点矩阵，并输出变量节点矩阵。The cyclic shift unit of the decoder, its input is two matrices, one of which is the matrix output after being processed by the data truncation unit, and the other is the matrix output after being processed by the minimum value and second minimum value selection unit, and its output It is a matrix processed by the cyclic shift unit, and its function is to calculate the variable node matrix by adding the minimum value matrix and the check node matrix by bitwise modulo, and output the variable node matrix.

如上所述，目前LDPC码的编译码理论较为成熟，但因为LDPC码是一种码长较大的线性分组码，校验矩阵也较大，算法复杂度十分高，传统的LDPC编译码方式不能很好的满足IEEE 802.11n系统的吞吐率要求，很大程度影响到了系统的性能。现有高速无线局域网系统中LDPC码的实现大多基于FPGA(Field－Programmable Gate Array，现场可编程门阵列)芯片与DSP(Digital Signal Processor，数字信号处理)芯片。通过以往方法虽然可以满足现代高速无线局域网协议中处理和时延的要求，但是FPGA编程和专业DSP都比较复杂，缺少丰富的编程环境和调试工具，适用性一般。As mentioned above, the coding and decoding theory of LDPC codes is relatively mature at present, but because LDPC codes are linear block codes with a large code length and a large check matrix, the algorithm complexity is very high, and the traditional LDPC coding and decoding methods cannot It satisfies the throughput requirement of IEEE 802.11n system very well, and affects the performance of the system to a great extent. Most implementations of LDPC codes in existing high-speed wireless local area network systems are based on FPGA (Field-Programmable Gate Array, Field Programmable Gate Array) chips and DSP (Digital Signal Processor, digital signal processing) chips. Although the previous methods can meet the processing and delay requirements of modern high-speed wireless LAN protocols, FPGA programming and professional DSP are relatively complicated, lacking a rich programming environment and debugging tools, and the applicability is average.

发明内容Contents of the invention

本申请给出一种基于通用处理器的LDPC编译码方法，能够在通用处理器上高效地实现LDPC编译码。This application provides an LDPC encoding and decoding method based on a general-purpose processor, which can efficiently implement LDPC encoding and decoding on a general-purpose processor.

为实现上述目的，本申请采用如下的技术方案：In order to achieve the above object, the application adopts the following technical solutions:

一种基于通用处理器的LDPC编码方法，包括：通过信号采集或接收获取待编码的信号向量S，确定校验矩阵H及其分块矩阵A、B、D、E、F和T，并进行保存；根据确定向量p₁和p₂，并得到LDPC的编码结果向量c＝(S,p₁,p₂)；其中，所述确定向量p₁和p₂时进行的任一矩阵与任一向量的相乘处理包括：A general-purpose processor-based LDPC encoding method, comprising: obtaining a signal vector S to be encoded through signal acquisition or reception, determining a parity check matrix H and its block matrices A, B, D, E, F, and T, and performing save; according to Determine the vectors p ₁ and p ₂ , and obtain the LDPC encoding result vector c=(S, p ₁ , p ₂ ); wherein, when determining the vectors p ₁ and p ₂ , the correlation between any matrix and any vector Multiplication processing includes:

将所述任一矩阵的每一行作为一个线程，进行该矩阵的相应行与所述任一向量的相乘操作，并将所有行的相乘结果组合在一起构成结果向量；Taking each row of the arbitrary matrix as a thread, performing the multiplication operation of the corresponding row of the matrix and the arbitrary vector, and combining the multiplication results of all rows to form a result vector;

其中，所述任一矩阵的每一行与所述任一向量的相乘操作包括：确定矩阵当前第i行的每个元素j对应的向量起始位置＝所述任一向量的起始位置+A_i,j+(j-1)*Z，将所述任一向量中从所述起始位置起Z-A_i,j长度的数据通过单指令多数据流SIMD的方式进行左移位，并将所述起始位置开始的前A_i,j长度的数据移至左移位后的数据之后，得到所述元素j对应的向量移位结果；再将每个元素对应的向量移位结果相加，作为所述每一行与所述任一向量的相乘结果；Wherein, the multiplication operation of each row of any matrix and any vector includes: determining the initial position of the vector corresponding to each element j in the current ith row of the matrix=the initial position of any vector+ A _{i, j} + (j-1) * Z, the data of ZA _{i, j} length from the starting position in any vector is left-shifted by means of single instruction multiple data stream SIMD, and After the data of the length of A _{i, j} before the starting position is moved to the left-shifted data, the vector shift result corresponding to the element j is obtained; and then the vector shift result corresponding to each element is added , as the multiplication result of each row and any vector;

所述SIMD的方式中，将从所述起始位置起Z-A_i,j长度的数据以长度W为单位划分成段，对段数据并行进行左移位操作，再将剩余的(Z-A_i,j)modW长度的数据进行左移位操作；In the SIMD method, the data of ZA _i,j length from the starting position is divided into units of length W segment, yes The segment data is shifted to the left in parallel, and then the remaining (ZA _{i, j} ) modW length data is shifted to the left;

Z为所述校验矩阵中的一个元素代表的子矩阵大小。Z is the size of a sub-matrix represented by an element in the parity check matrix.

较佳地，当所述任一矩阵为T^-1时，所述T^-1的每一行与相应向量的相乘操作时，仅进行T^-1取值为0的元素与相应向量的相乘，得到该取值为0元素对应的向量移位结果，将其余元素对应的向量移位结果设置为零向量；再将每个元素对应的向量移位结果相加，作为所述每一行与所述任一向量的相乘结果。Preferably, when the arbitrary matrix is T ^-1 , when each row of T ^-1 is multiplied by the corresponding vector, only the element whose value is 0 in T ^-1 is multiplied by the corresponding vector , get the vector shift result corresponding to the element whose value is 0, and set the vector shift result corresponding to the remaining elements as a zero vector; The multiplication result of any vector described above.

较佳地，对W段数据同时进行左移位操作后取前Z个数据为有效数据。Preferably, the first Z data are taken as valid data after the left shift operation is performed on the W segments of data at the same time.

较佳地，所述将每个元素对应的向量移位结果相加包括：将每个元素对应的向量移位结果以长度W为单位划分成段，通过SIMD对段数据并行进行相加操作，再将剩余的(Z-A_i,j)modW长度的数据进行相加操作。Preferably, adding the vector shift results corresponding to each element includes: dividing the vector shift results corresponding to each element into segment, via SIMD pair The segment data is added in parallel, and then the remaining (ZA _i,j )modW length data is added.

较佳地，所述矩阵A、B、D、E、F和T^-1通过线性查找表进行保存。Preferably, the matrices A, B, D, E, F and T ^-1 are stored through a linear lookup table.

一种基于通用处理器的LDPC译码方法，包括：接收已编码的LDPC码字信号c，确定校验矩阵H；通过多次迭代计算变量节点向量q作为译码结果，每次迭代时，根据当前的变量节点向量q和校验节点向量r计算临时变量向量为并根据所述临时变量向量t更新校验节点向量r，再根据校验节点向量r和临时变量向量t更新变量节点向量q为初次迭代时，将码字信号c作为变量节点向量q，将校验节点向量r设为0；其中，A kind of LDPC decoding method based on a general-purpose processor, comprising: receiving the encoded LDPC code word signal c, determining the parity check matrix H; calculating the variable node vector q as the decoding result through multiple iterations, during each iteration, according to The current variable node vector q and check node vector r calculate the temporary variable vector as And update the check node vector r according to the temporary variable vector t, and then update the variable node vector q according to the check node vector r and the temporary variable vector t as In the first iteration, the codeword signal c is used as the variable node vector q, and the check node vector r is set to 0; where,

每次迭代计算临时变量向量t、校验节点向量r和变量节点向量q时，以校验矩阵的每一行作为一个线程进行运算和更新，得到与每行相对应的向量t、q和r中索引号从到的子向量；其中，i为校验矩阵的行索引，对应所述校验矩阵的第i行计算临时变量向量t、校验节点向量r和变量节点向量q对应的子向量时，根据校验矩阵该行的每个非“-”元素H_i,j对应计算向量t、q和r中与元素H_i,j对应的索引号从到的子向量，再依次进行连接得到与每行相应的子向量，i＝1时，令 When calculating the temporary variable vector t, the check node vector r and the variable node vector q each iteration, each row of the check matrix is used as a thread for operation and update, and the vectors t, q and r corresponding to each row are obtained. index number from arrive where i is the row index of the parity check matrix, corresponding to the i-th row of the parity check matrix when calculating the subvectors corresponding to the temporary variable vector t, the check node vector r and the variable node vector q, according to the check Each non-"-" element H _{i, j} in the row of the matrix corresponds to the index number corresponding to the element H _{i, j} in the calculation vectors t, q and r from arrive The subvectors of , and then sequentially connected to obtain the subvectors corresponding to each row, when i=1, let

计算与H_i,j对应的临时变量向量t子向量的方式为：确定H_i,j对应的向量起始位置Z*(n-1)+H_i,n，将第i行对应的向量q子向量中所述起始位置起长度为或6的数据通过SIMD的方式拷贝到与H_i,j对应的临时变量向量t子向量的开头；在H_i,n≠0、H_i,n≠'-'且(Z-H_i,n)modW≠0时，确定矩阵M_{LdpcAssemble1}中与校验矩阵元素H_i,j对应行中各个元素的取值并将与元素H_i,j对应的当前向量q的子向量中索引号为的各个元素依次拷贝到与H_i,j对应的临时变量向量t子向量的当前位置上；再确定每个元素H_i,j对应的第二向量起始位置M_LdpcOffset2，将所述第二向量起始位置起长度为的数据通过SIMD的方式拷贝到与H_i,j对应的临时变量向量t子向量的当前位置上；取与H_i,j对应的临时变量向量t子向量中的前Z位并取绝对值作为与H_i,j对应的临时变量向量t的有效子向量；The way to calculate the temporary variable vector t sub-vector corresponding to H _i _,j is: determine the vector starting position Z*(n-1)+H _i,n corresponding to H i,j, and set the vector q corresponding to row i The length from the starting position in the subvector is or 6 data is copied to the beginning of the temporary variable vector t subvector corresponding to H _{i, j} by means of SIMD; when H _{i, n} ≠ 0, H _{i, n} ≠ '-' and (ZH _{i, n} ) modW When ≠0, determine the value of each element in the row corresponding to check matrix element H _i,j in matrix M _{LdpcAssemble1} And the index number in the subvector of the current vector q corresponding to the element H _{i, j} is Each element of is copied to the current position of the temporary variable vector t sub-vector corresponding to H _{i, j} in turn; then determine the second vector starting position M _LdpcOffset2 corresponding to each element H _{i, j} , and set the second vector The length from the starting position is The data of is copied to the current position of the temporary variable vector t subvector corresponding to H _i,j by means of SIMD; take the first Z bits in the temporary variable vector t subvector corresponding to H _i,j and take the absolute value as valid subvectors of the temporary variable vector t corresponding to H _i,j ;

当H_i,n≠0、H_i,n≠'-'且(Z-H_i,n)modW≠0时，当H_i,n＝0或H_i,n＝'-'或(Z-H_i,n)modW＝0时，(M_LdpcOffset2)_i,j＝Z*(n-1)； K为通用处理器一次可处理数据量大小，k为SIMD处理的基础单位大小；码长L_LDPC＝648时，LdpcRemain＝11；当码长L_LDPC＝1296时，LdpcRemain＝6；当码长L_LDPC＝1944时，LdpcRemain＝1；j为第i行中的每个非“-”元素在该行所有非“-”元素中的索引，n为第i行第j个非“-”元素在校验矩阵中的列索引。When H _i,n ≠0, H _i,n ≠'-' and (ZH _i,n )modW≠0, When H _i,n =0 or H _i,n ='-' or (ZH _i,n )modW=0, (M _LdpcOffset2 ) _i,j =Z*(n-1); K is the amount of data that can be handled by a general-purpose processor at one time, and k is the basic unit size of SIMD processing; when code length L _LDPC =648, LdpcRemain=11; when code length L _LDPC =1296, LdpcRemain=6; when code length L When _LDPC =1944, LdpcRemain=1; j is the index of each non-"-" element in the i-th row in all non-"-" elements in the row, and n is the j-th non-"-" element in the i-th row Column index in check matrix.

较佳地，计算与校验矩阵的每行对应的校验节点向量r子向量的方式为：Preferably, the method of calculating the check node vector r subvector corresponding to each row of the check matrix is:

将与校验矩阵每行对应的临时变量向量t子向量写成V_{LdpcRowLength}(v)行和列的矩阵T_v，其中，所述矩阵T_v的每一行为所述临时变量向量t子向量中与元素H_i,j对应的子向量，列数不够时进行补位；Write the temporary variable vector t subvector corresponding to each row of the check matrix as V _{LdpcRowLength} (v) rows and A matrix T _v of columns, wherein each row of the matrix T _v is a sub-vector corresponding to the element H _{i, j} in the sub-vector of the temporary variable vector t, and when the number of columns is not enough, a complement is performed;

对所述矩阵T_v进行最值分配，得到最值变量向量m子向量矩阵M_v；Carrying out the most value assignment to the matrix T _v to obtain the most value variable vector m subvector matrix M _v ;

根据所述矩阵T_v计算中间变量向量s子向量矩阵S_v；Calculate the intermediate variable vector s subvector matrix S _{v according to the matrix T v} _;

根据所述矩阵M_v和所述矩阵S_v中索引值相同的元素，确定一中间矩阵R_v'中相应索引值的元素取值；其中，若矩阵S_v中的任一元素小于0，则取该任一元素的补数并与该任一元素相加，将相加结果作为矩阵R_v'中与所述任一元素索引值相同的元素的取值；若矩阵S_v中的任一元素等于0，则将该任一元素与0相加，将相加结果作为矩阵R_v中与所述任一元素索引值相同的元素的取值；若矩阵S_v中的任一元素>0，则在矩阵M_v中取与所述任一元素索引值相同的元素与所述任一元素相加，将相加结果作为矩阵R_v中与所述任一元素索引值相同的元素的取值；所述比较和相加的操作通过SIMD的方式进行；According to the element with the same index value in the matrix _{Mv and the matrix Sv, determine the element value of the corresponding index value in an intermediate matrix Rv} _' _; wherein, if any element in the matrix _Sv is less than 0, then Take the complement of any element and add it to any element, and use the addition result as the value of the element in the matrix R _v ' that is the same as the index value of any element in the matrix; if any of the matrix S _v If the element is equal to 0, add any element to 0, and use the addition result as the value of the element in the matrix R _v that is the same as the index value of any element in the matrix; if any element in the matrix S _v > 0 , then take the same element as the index value of any element in the matrix M _v and add it to any element, and use the result of the addition as the value of the element same as the index value of any element in the matrix R _v value; the operation of comparing and adding is carried out by means of SIMD;

通过SIMD方式将所述矩阵R_v'与矩阵T_v中索引值相同的元素相减，将结果作为校验节点向量r子向量矩阵R_v中相同索引值的元素取值；将所述矩阵R_v中每行的前Z个元素按照行优先的方式依次读出组成校验节点向量r子向量。Subtract the matrix R _v ' from the elements with the same index value in the matrix T _v by means of SIMD, and use the result as the element value of the same index value in the check node vector r subvector matrix R _v ; the matrix R The first Z elements of each row in _v are sequentially read out to form the check node vector r subvector in a row-first manner.

较佳地，所述进行最值分配包括：Preferably, said performing the most value distribution includes:

通过SIMD的方式确定所述矩阵T_v中每一列的最小值和次小值以及最小值对应的行索引；将得到的最小值和次小值进行修正，均减去预设的修正值β，当修正后的最小值和次小值小于0时，将其设置为0，否则保持不变；Determine the minimum value and the second minimum value of each column in the matrix T _v and the row index corresponding to the minimum value by means of SIMD; modify the obtained minimum value and the second minimum value, and subtract the preset correction value β, When the corrected minimum and second minimum values are less than 0, set it to 0, otherwise keep it unchanged;

根据所述矩阵T_v中每一列的当前最小值、次小值和最小值对应的行索引，构造最值变量向量m子向量矩阵M_v中相同索引的列，其中，在M_v的任一列中，将与当前最小值对应相同行索引的元素设置为确定出的最小值，将其余元素设置为次小值。According to the current minimum value of each column in the matrix T _v , the second minimum value and the row index corresponding to the minimum value, construct the column of the same index in the most value variable vector m subvector matrix M _v , wherein, in any column of M _v , set the element corresponding to the same row index as the current minimum value as the determined minimum value, and set the remaining elements as the next smallest value.

较佳地，所述通过SIMD的方式确定每一列的最小值和次小值以及相应的行索引的方式包括：Preferably, the method of determining the minimum value and the second minimum value of each column and the corresponding row index through SIMD includes:

将所述矩阵T_v的每一行元素分成个子块，每个子块包括W个基本单位；在比较所述矩阵T_v中任意两行的元素时，通过SIMD的方式一次性比较W个基本单位。Divide each row element of the matrix T _v into sub-blocks, each sub-block includes W basic units; when comparing the elements of any two rows in the matrix T _v , W basic units are compared at one time by means of SIMD.

较佳地，所述计算中间变量向量s子向量矩阵S_v包括：Preferably, the calculation of the intermediate variable vector s subvector matrix S _v includes:

对于矩阵T_v中的每一列，将该列所有元素进行异或操作，再将结果与第i'行的元素异或后与0x7f进行或操作，将或操作结果作为中间向量矩阵S_v中相同索引列的第i'行元素；其中，将所述矩阵T_v的每一行元素分成个子块，每个子块包括W个基本单位，在进行异或/或操作时，通过SIMD的方式一次性执行W个基本单位的异或/或操作。For each column in the matrix T _v , perform an XOR operation on all the elements in the column, then XOR the result with the elements in the i'th row and then perform an OR operation with 0x7f, and use the OR operation result as the same as in the intermediate vector matrix S _v The i'th row element of the index column; wherein, each row element of the matrix T _v is divided into sub-blocks, each sub-block includes W basic units, and when XOR/OR operations are performed, the XOR/OR operations of W basic units are performed at one time by means of SIMD.

较佳地，计算与H_i,j对应的变量节点向量q子向量包括：Preferably, calculating the variable node vector q subvector corresponding to H _i,j includes:

确定H_i,j对应的向量起始位置Z*(n-1)+H_i,n，通过SIMD方式将H_i,j对应的临时变量向量t子向量与H_i,j对应的校验节点向量r子向量相加，将结果向量中所述起始位置起长度为或5的数据通过SIMD的方式拷贝到与H_i,j对应的变量节点向量q子向量的开头；在H_i,n≠0、H_i,n≠'-'且(Z-H_i,n)modW≠0时，确定矩阵M_{LdpcAssemble1}中与校验矩阵元素H_i,j对应行中各个元素的取值并将与元素H_i,j对应的当前向量q的子向量中索引号为的各个元素依次拷贝到与H_i,j对应的变量节点向量q子向量的当前位置上；Determine the vector starting position Z*(n-1)+H _i,n corresponding to H _i,j _, and use the SIMD method to connect the temporary variable vector t subvector corresponding to H _i,j to the check node corresponding to H i,j The vector r sub-vectors are added, and the length from the starting position in the result vector is or 5 data is copied to the beginning of the variable node vector q subvector corresponding to H _{i, j} by means of SIMD; when H _{i, n} ≠ 0, H _{i, n} ≠ '-' and (ZH _{i, n} ) modW When ≠0, determine the value of each element in the row corresponding to check matrix element H _i,j in matrix M _{LdpcAssemble1} And the index number in the subvector of the current vector q corresponding to the element H _{i, j} is Each element of is copied to the current position of the variable node vector q sub-vector corresponding to H _{i, j} in turn;

确定每个元素H_i,j对应的第二向量起始位置M_LdpcOffset2，将所述第二向量起始位置起长度为0或的数据通过SIMD的方式拷贝到与H_i,j对应的变量节点向量q子向量的当前位置上；Determine the second vector starting position M _LdpcOffset2 corresponding to each element H _i,j , and set the length from the starting position of the second vector to 0 or The data of is copied to the current position of the variable node vector q sub-vector corresponding to H _i,j through SIMD;

根据LdpcRemain指示的补位个数，按照M_{LdpcAssemble1}中与校验矩阵元素H_i,j对应行中元素的取值进行补位。According to the number of supplementary positions indicated by LdpcRemain, supplementary positions are performed according to the value of the element in the row corresponding to the check matrix element H _{i, j} in M _{LdpcAssemble1} .

较佳地，预先计算并保存每个元素H_i,j对应的向量起始位置Z*(n-1)+H_i,n和第二向量起始位置M_LdpcOffset2、矩阵M_{LdpcAssemble1}、校验矩阵中每行非“-”元素的个数构成的向量V_{LdpcRowLength}、M_{LdpcAssemble1}、LdpcRemain。Preferably, the vector start position Z*(n-1)+H _i,n corresponding to each element H _i,j is pre-calculated and stored, and the second vector start position M _LdpcOffset2 , matrix M _{LdpcAssemble1} , parity check matrix The vector V _{LdpcRowLength} , M _{LdpcAssemble1} , and LdpcRemain constituted by the number of non-"-" elements in each row.

由上述技术方案可见，本申请中的LDPC编译码方法，能够通过SIMD指令、多线程和预先存储等方式提高编译码速度。It can be seen from the above technical solutions that the LDPC encoding and decoding method in the present application can improve the encoding and decoding speed by means of SIMD instructions, multi-threading, and pre-storage.

附图说明Description of drawings

图1为LDPC PPDU的生成过程示意图；Fig. 1 is the schematic diagram of the generation process of LDPC PPDU;

图2为LDPC编码过程的编码器组成示意图；Fig. 2 is the encoder composition schematic diagram of LDPC encoding process;

图3为现有LDPC译码器示意图；Fig. 3 is the schematic diagram of existing LDPC decoder;

图4为本申请中编码方法的总体流程图；Fig. 4 is the overall flowchart of the coding method in the present application;

图5为本申请LDPC编码处理中计算码字校验向量p₁的运算示意图；Fig. 5 is the operation schematic diagram of computing code word check vector p ₁ in the LDPC encoding process of the present application;

图6为本申请LDPC编码处理中计算码字校验向量p₂的运算示意图；Fig. 6 is the operation schematic diagram of calculating codeword check vector p ₂ in the LDPC encoding process of the present application;

图7为优化乘法器1的结构示意图；Fig. 7 is the structural representation of optimization multiplier 1;

图8为优化乘法器2的结构示意图；Fig. 8 is the structural representation of optimization multiplier 2;

图9为优化加法器的结构示意图；Fig. 9 is a structural schematic diagram of an optimized adder;

图10为矩阵A中的一个元素与向量S的转置相乘处理的示意图；Fig. 10 is a schematic diagram of an element in the matrix A and the transpose multiplication process of the vector S;

图11为本申请LDPC编码方法中步骤5的处理示意图；Fig. 11 is the processing schematic diagram of step 5 in the LDPC coding method of the present application;

图12为本申请LDPC译码方法中步骤5的处理示意图；Fig. 12 is a schematic diagram of the processing of step 5 in the LDPC decoding method of the present application;

图13为本申请LDPC译码方法的具体流程使用图；Fig. 13 is a specific flow diagram of the application's LDPC decoding method;

图14为本申请LDPC译码方法中针对一个子块进行的最值分配处理的流程示意图；FIG. 14 is a schematic flow chart of the most value allocation process for a sub-block in the LDPC decoding method of the present application;

图15为LDPC译码方法中一次最值分配运算的结构示意图；Fig. 15 is a schematic structural diagram of a maximum value assignment operation in the LDPC decoding method;

图16为LDPC译码方法中针对一个子块计算中间变量向量的流程示意图；16 is a schematic flow chart of calculating an intermediate variable vector for a sub-block in the LDPC decoding method;

图17为LDPC译码方法中一次中间向量计算的结构示意图。Fig. 17 is a schematic structural diagram of an intermediate vector calculation in the LDPC decoding method.

具体实施方式detailed description

为了使本申请的目的、技术手段和优点更加清楚明白，以下结合附图对本申请做进一步详细说明。In order to make the purpose, technical means and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings.

本申请给出适用于通用处理器中实现的LDPC编码方法和译码方法。下面详细描述本申请中的编码方法和译码方法。This application provides an LDPC encoding method and a decoding method suitable for implementation in a general-purpose processor. The encoding method and decoding method in this application are described in detail below.

按照IEEE 802.11n协议中的LDPC PPDU生成方法，将编码后码字向量记为c＝(S,p₁,p₂)，其中S为信息向量，p₁和p₂为码字校验向量，将校验矩阵H简化六个部分分成矩阵A、矩阵B、矩阵D、矩阵E、矩阵F以及矩阵T六个子矩阵。在进行译码前，需外界输入码长L_LDPC、编码码率R以及信息向量S。本申请根据通用处理器(GPP)芯片架构的特性对LDPC编码方法进行如下优化：According to the LDPC PPDU generation method in the IEEE 802.11n protocol, the encoded codeword vector is recorded as c=(S,p ₁ ,p ₂ ), where S is the information vector, p ₁ and p ₂ are the codeword check vectors, Simplify the check matrix H into six parts It is divided into six sub-matrices of matrix A, matrix B, matrix D, matrix E, matrix F and matrix T. Before decoding, the code length L _LDPC , code rate R and information vector S need to be input from outside. The application optimizes the LDPC encoding method as follows according to the characteristics of the general-purpose processor (GPP) chip architecture:

1、采用SIMD(Single Instruction Multiple Data，单指令多数据流)的运算方法对编码方法进行优化，其本质原理是在CPU的一个时钟周期内对多个数据进行处理以获得并行处理的效果，而不像是普通的使用方式——每一个时钟周期只进行一次数据处理操作。其中将涉及所使用通用处理器一次可处理数据量大小，假设该大小为K比特，SIMD处理的基础单位大小为k，则一次运算可处理数据量 1. Using the SIMD (Single Instruction Multiple Data, Single Instruction Multiple Data) algorithm to optimize the encoding method, the essential principle is to process multiple data within one clock cycle of the CPU to obtain the effect of parallel processing, while Unlike normal usage - only one data processing operation per clock cycle. It will involve the amount of data that can be processed at one time by the general-purpose processor used. Assuming that the size is K bits, and the basic unit size of SIMD processing is k, the amount of data that can be processed in one operation

2、采用线性查找表的方法优化校验矩阵的信息存储，将校验矩阵H拆分成六个部分分成矩阵A、矩阵B、矩阵D、矩阵E、矩阵F以及矩阵T六个子矩阵，并以这六个子矩阵生成六个线性查找表，降低计算复杂度。2. Use the linear lookup table method to optimize the information storage of the check matrix, and split the check matrix H into six parts Divide into six sub-matrices of matrix A, matrix B, matrix D, matrix E, matrix F and matrix T, and generate six linear look-up tables with these six sub-matrices, reducing computational complexity.

3、采用多线程的方法，以校验矩阵H的行数为线程数，即以处理校验矩阵H中的一行数据为一线程，在同一时间执行多个线程的处理操作，进而提升系统的整体处理性能。3. Adopt the method of multi-threading, take the number of rows of the check matrix H as the number of threads, that is, process one row of data in the check matrix H as a thread, and execute the processing operations of multiple threads at the same time, thereby improving the performance of the system overall processing performance.

图4为本申请中编码方法的总流程图，其中，该编码方法基于的算法原理与目前LDPC编码方法相同，区别在于在通用处理器中对于编码方法的具体实现。具体流程如下：Fig. 4 is a general flowchart of the encoding method in this application, wherein the algorithm principle based on the encoding method is the same as that of the current LDPC encoding method, and the difference lies in the specific implementation of the encoding method in the general-purpose processor. The specific process is as follows:

1、根据802.11n协议，不同码长L_LDPC以及不同编码码率R对应着不同的校验矩阵H。首先，根据码长L_LDPC以及编码码率R，提取出相对应的校验矩阵H，并对以下各个参数进行初始化：1. According to the 802.11n protocol, different code lengths L _LDPC and different coding rates R correspond to different parity check matrices H. First, according to the code length L _LDPC and the coding rate R, the corresponding check matrix H is extracted, and the following parameters are initialized:

1.1生成矩阵A、矩阵B、矩阵D、矩阵E、矩阵F以及矩阵T^-1六个子矩阵，其中()^-1为矩阵的逆，并将其依次存于线性查找表中，用偏移量的方法标记所需数据的具体位置，用内存换取计算复杂度，提高LDPC码编码方法的数据处理速度。1.1 Generate matrix A, matrix B, matrix D, matrix E, matrix F and matrix T ^-1 six sub-matrices, where () ^-1 is the inverse of the matrix, and store them in the linear lookup table in turn, using the offset The method marks the specific location of the required data, exchanges the memory for computational complexity, and improves the data processing speed of the LDPC code encoding method.

1.2生成所选校验矩阵H中每个元素所代表子矩阵的大小为Z。当码长L_LDPC＝648时，Z＝27；当码长L_LDPC＝1296时，Z＝54；当码长L_LDPC＝1944时，Z＝81。1.2 The size of the sub-matrix represented by each element in the selected parity check matrix H is generated as Z. When the code length L _LDPC =648, Z=27; when the code length L _LDPC =1296, Z=54; when the code length L _LDPC =1944, Z=81.

2、根据GPP芯片特性，采用多线程的方法对编码方法进行优化，以校验矩阵H的行数为线程数，其中校验矩阵H中的一行数据为一线程处理步骤3到步骤4，在同一时间执行多个线程的处理操作，即同一时间进行多个步骤3到步骤4的处理，进而提升系统的整体处理性能。以下的步骤3和步骤4，均为单一线程处理的具体流程。2. According to the characteristics of the GPP chip, the encoding method is optimized by using a multi-thread method. The number of rows of the parity check matrix H is the number of threads, and one row of data in the parity check matrix H is one thread to process steps 3 to 4. Execute the processing operations of multiple threads at the same time, that is, perform the processing of multiple steps 3 to 4 at the same time, thereby improving the overall processing performance of the system. The following steps 3 and 4 are specific processes for single-thread processing.

3、计算码字校验向量p₁，其中的乘法运算以及加法运算均采用SIMD的运算方法进行优化，其运算示意图参见图5，具体流程如下：3. Calculate the codeword check vector p ₁ , in which the multiplication and addition operations are optimized using the SIMD operation method. The operation schematic diagram is shown in Figure 5, and the specific process is as follows:

3.1矩阵A与向量S的转置相乘，结果为向量，向量长度与矩阵A的行数相等。3.1 The matrix A is multiplied by the transpose of the vector S, and the result is a vector whose length is equal to the number of rows of the matrix A.

3.2矩阵T^-1与步骤3.1所得结果向量的转置相乘，结果为向量，向量长度与矩阵T^-1的行数相等。3.2 The matrix T ^-1 is multiplied by the transposition of the result vector obtained in step 3.1, and the result is a vector whose length is equal to the number of rows of the matrix T ^-1 .

3.3矩阵E与步骤3.2所得结果向量的转置相乘，结果为向量，向量长度与矩阵E的行数相等。3.3 The matrix E is multiplied by the transpose of the result vector obtained in step 3.2, and the result is a vector whose length is equal to the number of rows of the matrix E.

3.4矩阵F与向量S的转置相乘，结果为向量，向量长度与矩阵F的行数相等。3.4 The matrix F is multiplied by the transpose of the vector S, and the result is a vector whose length is equal to the number of rows of the matrix F.

3.5步骤3.3所得结果向量与步骤3.4所得结果向量相加，结果即为码字校验向量p₁。3.5 Add the result vector obtained in step 3.3 to the result vector obtained in step 3.4, and the result is the codeword check vector p ₁ .

4、计算码字校验向量p₂，其中的乘法运算以及加法运算均采用SIMD的运算方法进行优化，其运算示意图参见图6，具体流程如下：4. Calculate the codeword check vector p ₂ , in which the multiplication and addition operations are optimized using the SIMD operation method. The schematic diagram of its operation is shown in Figure 6, and the specific process is as follows:

4.1矩阵B与向量p₁的转置相乘，结果为向量，向量长度与矩阵B的行数相等。4.1 Matrix B is multiplied by the transpose of vector p ₁ , and the result is a vector whose length is equal to the number of rows of matrix B.

4.2步骤3.1所得结果向量与步骤4.1所得结果向量相加，结果为向量。4.2 Add the result vector obtained in step 3.1 to the result vector obtained in step 4.1, and the result is a vector.

4.3矩阵T^-1与步骤4.2所得结果向量的转置相乘，结果即为码字校验向量p₂。4.3 The matrix T ^-1 is multiplied by the transpose of the result vector obtained in step 4.2, and the result is the codeword check vector p ₂ .

5、组装LDPC码字向量c：5. Assemble the LDPC codeword vector c:

将所得向量按照S、p₁、p₂的顺序存储，即得LDPC码字向量c＝(S,p₁,p₂)。The obtained vectors are stored in the order of S, p ₁ , p ₂ to obtain the LDPC codeword vector c=(S, p ₁ , p ₂ ).

在上述本申请的编码方法中涉及到两种优化乘法器以及一种优化加法器，分别成为优化乘法器1、优化乘法器2以及优化加法器。根据GPP芯片架构特性，优化乘法器1、优化乘法器2以及优化加法器中涉及到并行操作的部分可用SIMD的运算方法进行优化。下面一一进行介绍。In the encoding method of the present application mentioned above, two kinds of optimized multipliers and one optimized adder are involved, which are called optimized multiplier 1 , optimized multiplier 2 and optimized adder respectively. According to the characteristics of the GPP chip architecture, the optimized multiplier 1, the optimized multiplier 2, and the optimized adder that involve parallel operations can be optimized using the SIMD operation method. Let's introduce them one by one.

优化乘法器1有两个输入端一个输出端，优化乘法器1示意图参见图7，在优化编码方法的步骤3.1、步骤3.3、步骤3.4以及步骤4.1中所涉及到的矩阵与向量相乘运算均使用到优化乘法器1。以步骤3.1为例，优化乘法器1的输入端为矩阵A与向量S的转置，其具体实现流程如下：The optimal multiplier 1 has two input terminals and one output terminal. The schematic diagram of the optimized multiplier 1 is shown in FIG. Use to optimize multiplier 1. Taking step 3.1 as an example, the input terminal of optimization multiplier 1 is the transposition of matrix A and vector S, and its specific implementation process is as follows:

1、判断是否达到矩阵A的最大行数，若达到，则已完成该操作；若没有达到，则进行步骤2。1. Determine whether the maximum number of rows of matrix A has been reached. If so, the operation has been completed; if not, proceed to step 2.

2、将矩阵A中第一个元素所代表的Z*Z的子矩阵与向量S的转置相乘。因为该子矩阵是一个Z*Z的单位矩阵经过A_1,1(A_1,1表示矩阵A第一行第一列的元素)次循环左移位后的结果，所以该子矩阵与向量S的转置相乘相当于对向量S进行A_1,1次循环移位操作。该操作可进行SIMD优化，参见图10，具体操作流程如下：2. Multiply the submatrix of Z*Z represented by the first element in matrix A by the transpose of vector S. Because this sub-matrix is the result of a Z*Z identity matrix after A _1,1 (A _1,1 represents the elements of the first row and first column of matrix A) cycles left, so the sub-matrix and the vector S Multiplying the transpose of is equivalent to performing A _1,1 cyclic shift operation on the vector S. This operation can be SIMD optimized, see Figure 10, the specific operation process is as follows:

2.1计算出部分2数据长度＝(Z-A_1,1)modW(其为Z-A_1,1对W取模)，以及所需数据起始值位置＝信息向量s起始位置+A_1,1。2.1 Calculated Part 2 data length = (ZA _1,1 ) mod W (which is ZA _1,1 modulo W), and required data start value position = information vector s start position + A _1,1 .

2.2将所需数据起始值位置起长度的数据拷贝作为中间数据的起始位置数据；2.2 Start the position of the required data starting value The data copy of the length is used as the starting position data of the intermediate data;

2.3对剩余的(Z-A_1,1)modW个数据进行移位拷贝，并将图10中的“其余部分”以及“补位”拷贝入输出中。为了适应SIMD运算，输入向量长度为Z，输出向量大小将为但在输出向量中，只有其中的前Z个元素为有效数据，并将结果数据存于结果向量寄存器中。2.3 Shift copy the remaining (ZA _1,1 ) modW pieces of data, and copy the "remaining part" and "filling" in Fig. 10 into the output. To accommodate SIMD operations, the input vector length is Z and the output vector size will be But in the output vector, only the first Z elements are valid data, and the result data is stored in the result vector register.

3、判断是否达到矩阵A的最大列数，若达到，则返回步骤1；若没有达到，则进行步骤4。3. Determine whether the maximum number of columns of the matrix A is reached, and if so, return to step 1; if not, proceed to step 4.

4、进行将矩阵A中第下一个元素所代表的Z*Z的子矩阵与向量S的转置相乘，具体步骤同步骤2.1、2.2、2.3，但转置相乘结果不需要存入结果向量寄存器中，并执行步骤5。4. Multiply the sub-matrix of Z*Z represented by the next element in matrix A with the transpose of vector S. The specific steps are the same as steps 2.1, 2.2, and 2.3, but the transposed multiplication result does not need to be stored in the result vector register, and go to step 5.

5、将步骤4计算结果与结果向量寄存器中元素相加，两个二进制数相加相当于两个数进行异或操作，此时可进行SIMD优化，即一次运算可得到W个结果，参见图11，其中(a₁,a₂,…,a_W)表示步骤4计算结果，(b₁,b₂,…,b_W)表示结果向量寄存器中元素，矩形框表示异或运算器，(y₁,y₂,…,y_W)表示运算后的结果，即并存于结果向量寄存器中，返回步骤3。5. Add the calculation result of step 4 to the elements in the result vector register. The addition of two binary numbers is equivalent to the XOR operation of two numbers. At this time, SIMD optimization can be performed, that is, W results can be obtained in one operation, as shown in the figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the calculation result of step 4, (b ₁ ,b ₂ ,…,b _W ) represents the elements in the result vector register, the rectangular box represents the XOR operator, (y ₁ ,y ₂ ,…,y _W ) represent the result after operation, namely Store them in the result vector register and return to step 3.

优化乘法器2有两个输入端一个输出端，优化乘法器2示意图参见图8，在优化编码方法的步骤3.2以及步骤4.3中所涉及到的矩阵与向量相乘运算均使用到优化乘法器2。优化乘法器2是优化乘法器1的特殊情况，优化乘法器2其中一个输入端为矩阵T^-1，不同校验矩阵H下，矩阵矩阵T^-1只有下三角中的元素为有效值。以步骤3.2为例，优化乘法器2的输入端为矩阵T_- ¹与步骤3.1所得结果向量的转置，其具体实现流程如下：The optimized multiplier 2 has two input terminals and one output terminal. The schematic diagram of the optimized multiplier 2 is shown in FIG. . The optimized multiplier 2 is a special case of the optimized multiplier 1. One of the inputs of the optimized multiplier 2 is a matrix T ^-1 . Under different parity check matrices H, the matrix Only the elements in the lower triangle of the matrix T ^-1 are valid values. Taking step 3.2 as an example, the input terminal of optimization multiplier 2 is the transposition of the matrix T _- ¹ and the result vector obtained in step 3.1. The specific implementation process is as follows:

2、有可以看出，矩阵T^-1上三角元素均为0，2. Yes It can be seen that the triangular elements on the matrix T ^-1 are all 0,

元素“0”所代表的Z*Z的子矩阵与图4中步骤3.1所得结果向量的转置相乘后结果仍为后者，所以对矩阵T^-1每行中的元素“0”与图4中步骤3.1所得结果向量的转置相乘，并按照优化乘法器1中的步骤5，将所得结果相加，返回步骤1。The submatrix of Z*Z represented by the element "0" is multiplied with the transposition of the result vector obtained in step 3.1 in Figure 4, and the result is still the latter, so for the element "0" in each row of the matrix T ^-1 and the figure Multiply the transposition of the result vector obtained in step 3.1 in 4, and add the obtained results according to step 5 in optimized multiplier 1, and return to step 1.

优化加法器有两个输入端一个输出端，两个输入端均为向量，优化加法器示意图参见图12，在优化编码方法的步骤3.5以及步骤4.2中所涉及到的向量与向量相加运算均使用到优化加法器。以步骤3.5为例，优化加法器的输入端为步骤3.3所得结果向量与步骤3.4所得结果向量，因为输入端均为二进制数，两个二进制数相加等于两个二进制数做异或操作，此时可进行SIMD优化，即一次运算可得到W个结果，参见图11，其中(a₁,a₂,…,a_W)表示步骤3.3所得结果向量，(b₁,b₂,…,b_W)表示步骤3.4所得结果向量，矩形框表示异或运算器，(y₁,y₂,…,y_W)表示运算后的结果，即 The optimized adder has two input terminals and one output terminal, and both input terminals are vectors. The schematic diagram of the optimized adder is shown in Figure 12. The vector and vector addition operations involved in step 3.5 and step 4.2 of the optimized encoding method are Use an optimized adder. Taking step 3.5 as an example, the input terminal of the optimized adder is the result vector obtained in step 3.3 and the result vector obtained in step 3.4, because the input terminals are all binary numbers, and the addition of two binary numbers is equal to the XOR operation of two binary numbers. SIMD optimization can be performed when , that is, W results can be obtained in one operation, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the result vector obtained in step 3.3, (b ₁ ,b ₂ ,…,b _W ) represents the result vector obtained in step 3.4, the rectangular box represents the XOR operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, namely

上述即为本申请中LDPC编码方法的具体流程。本申请对现有译码器的译码方法进行了优化，优化译码方法具体流程图参见图13。在进行译码前，外界需输入码长L_LDPC、编码码率R、编码后码字向量c、最大迭代次数I以及修正偏移量β。根据GPP芯片架构的特性对LDPC译码方法进行如下优化：The above is the specific flow of the LDPC encoding method in this application. This application optimizes the decoding method of the existing decoder, and the specific flowchart of the optimized decoding method is shown in FIG. 13 . Before decoding, the outside world needs to input code length L _LDPC , code rate R, code word vector c after code, maximum number of iterations I and correction offset β. According to the characteristics of the GPP chip architecture, the LDPC decoding method is optimized as follows:

1、采用SIMD的运算方法对编码方法进行优化，其本质原理是在CPU的一个时钟周期内对多个数据进行处理以获得并行处理的效果，而不像是普通的使用方式——每一个时钟周期只进行一次数据处理操作。其中将涉及所使用通用处理器一次可处理数据量大小，假设该大小为K比特，SIMD处理的基础单位大小为k，则一次运算可处理数据量 1. Using the SIMD operation method to optimize the encoding method, the essential principle is to process multiple data within one clock cycle of the CPU to obtain the effect of parallel processing, unlike the ordinary use method - each clock Only one data processing operation is performed in a cycle. It will involve the amount of data that can be processed at one time by the general-purpose processor used. Assuming that the size is K bits, and the basic unit size of SIMD processing is k, the amount of data that can be processed in one operation

2、优化译码方法中的部分流程采用多线程的方法，以校验矩阵H的行数为线程数，即以处理校验矩阵H中的一行数据为一线程，在同一时间执行多个线程的处理操作，进而提升系统的整体处理性能。2. Part of the process in the optimized decoding method adopts a multi-threaded method. The number of rows of the check matrix H is used as the number of threads, that is, one thread is used to process a row of data in the check matrix H, and multiple threads are executed at the same time processing operations, thereby improving the overall processing performance of the system.

以下介绍本申请中译码方法的具体流程，其中，译码方法的大体框架与目前译码方法相同，具体包括：接收已编码的LDPC码字信号c，确定校验矩阵H；通过多次迭代计算变量节点向量q作为译码结果，每次迭代时，根据当前的变量节点向量q和校验节点向量r计算临时变量向量为并根据临时变量向量t更新校验节点向量r，再根据校验节点向量r和临时变量向量t更新变量节点向量q为本申请提供的译码方法与现有技术的区别在于，在通用处理器中的具体实现不同。具体操作步骤如下：The following describes the specific flow of the decoding method in this application, wherein the general framework of the decoding method is the same as the current decoding method, specifically including: receiving the encoded LDPC codeword signal c, determining the parity check matrix H; through multiple iterations Calculate the variable node vector q as the decoding result, and calculate the temporary variable vector according to the current variable node vector q and check node vector r at each iteration as And update the check node vector r according to the temporary variable vector t, and then update the variable node vector q according to the check node vector r and the temporary variable vector t as The difference between the decoding method provided by the present application and the prior art lies in the specific implementation in the general processor. The specific operation steps are as follows:

1、根据802.11n协议，不同码长L_LDPC以及不同编码码率R对应着不同的校验矩阵H。首先，根据码长L_LDPC以及编码码率R，提取出相对应的校验矩阵H，将各个参数进行初始化，并将其依次存于线性查找表中，用偏移量的方法标记所需数据的具体位置，用内存换取计算复杂度，提高LDPC码译码方法的数据处理速度：1. According to the 802.11n protocol, different code lengths L _LDPC and different coding rates R correspond to different parity check matrices H. First, according to the code length L _LDPC and the encoding code rate R, extract the corresponding parity check matrix H, initialize each parameter, and store them in the linear lookup table in turn, and use the offset method to mark the required data The specific location of the memory is exchanged for computational complexity, and the data processing speed of the LDPC code decoding method is improved:

1.1Z，即所选校验矩阵H中每个元素所代表子矩阵的大小，为变量。当码长L_LDPC＝648时，Z＝27；当码长L_LDPC＝1296时，Z＝54；当码长L_LDPC＝1944时，Z＝81。1.1Z, that is, the size of the sub-matrix represented by each element in the selected parity check matrix H, is a variable. When the code length L _LDPC =648, Z=27; when the code length L _LDPC =1296, Z=54; when the code length L _LDPC =1944, Z=81.

1.2LdpcRowNum，即所选校验矩阵H的行数，为变量。当码率时，LdpcRowNum＝12；当码率时，LdpcRowNum＝8；当码率时，LdpcRowNum＝6；当码率时，LdpcRowNum＝4。1.2 LdpcRowNum, that is, the number of rows of the check matrix H selected, is a variable. bit rate When, LdpcRowNum=12; when the code rate When, LdpcRowNum=8; when the code rate When, LdpcRowNum=6; when the code rate , LdpcRowNum=4.

1.3V_{LdpcRowLength}，即所选校验矩阵H中每行非“—”元素的个数，是长度为1*LdpcRowNum的向量。1.3V _{LdpcRowLength} , that is, the number of non-"—" elements in each row of the selected parity check matrix H, is a vector with a length of 1*LdpcRowNum.

1.4LdpcBufferNum，即存储所选校验矩阵H每个子矩阵数据所需寄存器个数，为变量。其运算公式为 1.4 LdpcBufferNum, that is, the number of registers required to store the data of each sub-matrix of the selected parity check matrix H, is a variable. Its operation formula is

1.5LdpcRemain，即步骤9所需变量之一，为变量。当码长L_LDPC＝648时，LdpcRemain＝11；当码长L_LDPC＝1296时，LdpcRemain＝6；当码长L_LDPC＝1944时，LdpcRemain＝1。1.5LdpcRemain, which is one of the variables required in step 9, is a variable. When the code length L _LDPC =648, LdpcRemain=11; when the code length L _LDPC =1296, LdpcRemain=6; when the code length L _LDPC =1944, LdpcRemain=1.

1.6LdpcRoundNum，即步骤9所需变量之一，为变量。当码长L_LDPC＝648时，LdpcRoundNum＝27；当码长L_LDPC＝1296时，LdpcRoundNum＝22；当码长L_LDPC＝1944时，LdpcRoundNum＝17。1.6 LdpcRoundNum, which is one of the variables required in step 9, is a variable. When the code length L _LDPC =648, LdpcRoundNum=27; when the code length L _LDPC =1296, LdpcRoundNum=22; when the code length L _LDPC =1944, LdpcRoundNum=17.

1.7V_{LdpcRowBuffer}，即存储所选校验矩阵H每行非“—”数据所需寄存器个数，是长度为1*LdpcRowNum的向量。其运算公式为V_{LdpcRowBuffer}(v)＝V_{LdpcRowLength}(v)*LdpcBufferNum(其中V_{LdpcRowBuffer}(v)表示向量LdpcRowBuffer的第v个元素，v对应着选校验矩阵H的行号，如下同理)。1.7V _{LdpcRowBuffer} , which is the number of registers required to store non-"—" data in each row of the selected check matrix H, is a vector with a length of 1*LdpcRowNum. Its operation formula is V _{LdpcRowBuffer} (v)=V _{LdpcRowLength} (v)*LdpcBufferNum (wherein V _{LdpcRowBuffer} (v) represents the vth element of the vector LdpcRowBuffer, and v corresponds to the row number of the check matrix H, as follows).

1.8M_LdpcOffset1，即根据所选校验矩阵H计算出的循环偏移量之一，用于步骤4与步骤9，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵(其中max(V_{LdpcRowLength}(v))表示取向量V_{LdpcRowLength}中元素的最大值，如下同理)。其运算公式为(M_LdpcOffset1)_i,j＝Z*(n-1)+H_i,n，其中n表示所选校验矩阵H中的列数，并且j与n的对应关系为所选校验矩阵H第i行第j个非“—”元素所在该校验矩阵H的位置是第i行第n列，如下同理。1.8M _LdpcOffset1 , which is one of the cyclic offsets calculated according to the selected parity check matrix H, is used in steps 4 and 9, and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)) (wherein max(V _{LdpcRowLength} ( v)) means to take the maximum value of the elements in the vector V _{LdpcRowLength} , the same reason as follows). Its operation formula is (M _LdpcOffset1 ) _i,j =Z*(n-1)+H _i,n , where n represents the number of columns in the selected check matrix H, and the corresponding relationship between j and n is the selected check matrix The position of the jth non-"—" element in the i-th row of the verification matrix H is the i-th row and the nth column, and the same reasoning follows.

1.9M_LdpcRound1，即根据所选校验矩阵H计算出的循环次数之一，用于步骤4，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为当H_i,n≠0且H_i,n≠'-'时，当H_i,n＝0且H_i,n≠'-'时，(M_LdpcRound1)_i,j＝6。1.9M _LdpcRound1 , which is one of the number of rounds calculated according to the selected parity check matrix H, is used in step 4 and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). Its operation formula is when H _i,n ≠0 and H _i,n ≠'-', When H _i,n =0 and H _i,n ≠'-', (M _LdpcRound1 ) _i,j =6.

1.10M_{LdpcAssemble1}，即根据所选校验矩阵H计算出的补足偏移量标志位之一，用于步骤4，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为当H_i,n＝0且H_i,n≠'-'时，(M_{LdpcAssemble1})_i,j＝0；当H_i,n≠0、H_i,n≠'-'且(Z-H_i,n)modW＝0时，(M_{LdpcAssemble1})_i,j＝0；当H_i,n≠0、H_i,n≠'-'且(Z-H_i,n)modW≠0时，(M_{LdpcAssemble1})_i,j＝1。 _{1.10M LdpcAssemble1} , which is one of the complementary offset flag bits calculated according to the selected parity check matrix H, is used in step 4 and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). Its operation formula is when H _i,n =0 and H _i,n ≠'-', (M _{LdpcAssemble1} ) _i,j =0; when H _i,n ≠0, H _i,n ≠'-' and ( ZH _i,n )modW=0, (M _{LdpcAssemble1} ) _i,j =0; when H _i,n ≠0, H _i,n ≠'-' and (ZH _i,n )modW≠0, (M _{LdpcAssemble1} ) _i,j =1.

1.11M_{LdpcAssembleTable1}，即根据所选校验矩阵H计算出循环补足偏移量，用于步骤4和步骤9，为(的矩阵(其中为计算向量LdpcRowLength中所有元素的和)， 1.11M _{LdpcAssembleTable1} , that is, calculate the cyclic complement offset according to the selected parity check matrix H, which is used in step 4 and step 9, as ( matrix (where To calculate the sum of all elements in the vector LdpcRowLength),

1.12M_LdpcOffset2，即根据所选校验矩阵H计算出的循环偏移量之一，用于步骤4与步骤9，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为当(M_{LdpcAssemble1})_i,j＝0时，(M_LdpcOffset2)_i,j＝Z*(n-1)+[W-Z+H_i,n+(M_LdpcRound1)_i,j*W^]；当(M_{LdpcAssemble1})_i,j＝1时，(M_LdpcOffset2)_i,j＝Z*(n-1)。1.12M _LdpcOffset2 , which is one of the cyclic offsets calculated according to the selected parity check matrix H, is used in step 4 and step 9, and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). The operation formula is when (M _{LdpcAssemble1} ) _i,j ＝0, (M _LdpcOffset2 ) _i,j ＝Z*(n-1)+[W-Z+H _i,n +(M _LdpcRound1 ) _i,j * W ^] ; when (M _{LdpcAssemble1} ) _i,j =1, (M _LdpcOffset2 ) _i,j =Z*(n-1).

1.13M_LdpcRound2，即根据所选校验矩阵H计算出的循环次数之一，用于步骤4，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为 _{1.13M LdpcRound2} , which is one of the number of rounds calculated according to the selected parity check matrix H, is used in step 4 and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). Its operation formula is

1.14M_LdpcRound3，即根据所选校验矩阵H计算出的循环次数之一，用于步骤9，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为当H_i,n≠0且H_i,n≠'-'时，当H_i,n＝0且H_i,n≠'-'时，(M_LdpcRound3)_i,j＝5。1.14M _LdpcRound3 , which is one of the number of rounds calculated according to the selected parity check matrix H, is used in step 9 and is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). Its operation formula is when H _i,n ≠0 and H _i,n ≠'-', When H _i,n =0 and H _i,n ≠'-', (M _LdpcRound3 ) _i,j =5.

1.15M_{LdpcAssemble2}，即根据所选校验矩阵H计算出的补足偏移量标志位之一，用于步骤9，同M_{LdpcAssemble1}。 _{1.15M LdpcAssemble2} , which is one of the complementary offset flag bits calculated according to the selected parity check matrix H, is used in step 9, the same as M _{LdpcAssemble1} .

1.16M_LdpcRound4，即根据所选校验矩阵H计算出的循环次数之一，用于步骤9，为LdpcRowNum*max(V_{LdpcRowLength}(v))的矩阵。其运算公式为当i＝0时，(M_LdpcRound4)_i,j＝0；当i≠0， 1.16M _LdpcRound4 , which is one of the round times calculated according to the selected parity check matrix H, used in step 9, is a matrix of LdpcRowNum*max(V _{LdpcRowLength} (v)). Its operation formula is when i=0, (M _LdpcRound4 ) _i,j =0; when i≠0,

2、判断是否达到最大迭代次数I。若没有达到最大迭代次数I，则进行步骤3；若达到最大迭代次数I，则译码结束。2. Determine whether the maximum number of iterations I is reached. If the maximum number of iterations I is not reached, proceed to step 3; if the maximum number of iterations I is reached, the decoding ends.

3、根据GPP芯片特性，采用多线程的方法对译码方法进行优化，以校验矩阵H的行数为线程数，其中校验矩阵H中的一行数据为一线程处理步骤4到步骤9，在同一时间执行多个线程的处理操作，即同一时间进行多个步骤4到步骤9的处理，进而提升系统的整体处理性能。以下的步骤4到步骤9，均为单一线程处理的具体流程。3. According to the characteristics of the GPP chip, the decoding method is optimized by using a multi-threaded method. The number of rows in the parity check matrix H is used as the number of threads, and one row of data in the parity check matrix H is one thread to process steps 4 to 9. Execute the processing operations of multiple threads at the same time, that is, perform the processing of multiple steps 4 to 9 at the same time, thereby improving the overall processing performance of the system. The following steps 4 to 9 are specific processes for single-thread processing.

4、计算临时变量向量t，其是长度为的向量。其中，该临时变量向量中包括与校验矩阵每一行对应的子向量，其索引号为到该子向量又包括与每个非“-”元素H_i,j对应的子向量。(这里，只有校验矩阵中的非“-”元素H_i,j有相应的子向量，校验矩阵中的“-”元素在临时变量向量、校验节点向量r和变量节点向量q中都没有相应的子向量。)具体地，与H_i,j对应的子向量t'的计算公式为即临时变量向量t子向量t'的值为变量节点向量q中与元素H_i,j对应的子向量q'根据校验矩阵H第i行第j列元素值循环移位后与校验节点向量r中与元素H_i,j对应的子向量r'的差值。若此时为第一次迭代运算，变量节点向量q为LDPC编码后的码字向量c，校验节点向量r为初始状态，此时临时变量向量t子向量t'的计算公式为即临时变量向量t子向量t'为变量节点向量q子向量q'根据校验矩阵H第i行第j列元素值循环移位后结果。为了适应SIMD运算，输入变量节点向量q子向量q'与校验节点向量r子向量r'长度为Z，输出临时变量向量t子向量t'大小将为但在输出子向量中，只有其中的前Z个元素为有效数据。临时变量向量t的运算以所选校验矩阵H每行非“-”元素个数进行循环，例如所选校验矩阵H第i行第j个非“-”元素将计算得出临时变量向量t的第到位的元素。其具体计算步骤如下：4. Calculate the temporary variable vector t, whose length is of vectors. Wherein, the temporary variable vector includes a subvector corresponding to each row of the check matrix, and its index number is arrive This subvector in turn includes a subvector corresponding to each non-"-" element H _i,j . (Here, only the non-"-" elements H _{i, j} in the check matrix have corresponding sub-vectors, and the "-" elements in the check matrix are all in the temporary variable vector, check node vector r, and variable node vector q There is no corresponding subvector.) Specifically, the calculation formula of the subvector t' corresponding to H _i,j is That is, the value of the temporary variable vector t sub-vector t' is the sub-vector q' corresponding to the element H _i,j in the variable node vector q, which is cyclically shifted according to the element value of the i-th row and the j-th column of the check matrix H, and the check node Difference of subvector r' corresponding to element H _i,j in vector r. If it is the first iterative operation at this time, the variable node vector q is the codeword vector c after LDPC encoding, and the check node vector r is the initial state, at this time the calculation formula of the temporary variable vector t sub-vector t' is That is, the sub-vector t' of the temporary variable vector t is the result of circular shifting of the element value of the variable node vector q sub-vector q' according to the value of the element in the i-th row and the j-th column of the parity check matrix H. In order to adapt to the SIMD operation, the length of the input variable node vector q subvector q' and the check node vector r subvector r' is Z, and the size of the output temporary variable vector t subvector t' will be But in the output subvector, only the first Z elements are valid data. The operation of the temporary variable vector t is cycled by the number of non-"-" elements in each row of the selected check matrix H, for example, the jth non-"-" element in the ith row of the selected check matrix H will be calculated to obtain a temporary variable vector t's arrive bit elements. The specific calculation steps are as follows:

4.1根据M_LdpcOffset1矩阵中找出所选校验矩阵H第i行第j个非“-”元素对应的循环偏移量，在变量节点向量q中找出所需数据的起始值位置，所需数据的起始值位置＝变量节点向量q子向量q'的起始值位置+对应的循环偏移量。4.1 According to the M _LdpcOffset1 matrix, find the cyclic offset corresponding to the jth non-"-" element in the i-th row of the selected parity check matrix H, and find the starting value position of the required data in the variable node vector q, so The initial value position of the required data = the initial value position of the variable node vector q sub-vector q' + the corresponding loop offset.

4.2根据M_LdpcRound1矩阵找出所选校验矩阵H第i行第j个非“-”元素对应的循环次数，将所需数据的起始值位置后(M_LdpcRound1)_i,j*W个数据拷贝到临时变量向量t子向量t'的当前位置上，这里的当前位置是指子向量中尚未拷贝数据的起始位置。4.2 According to the M _LdpcRound1 matrix, find the number of cycles corresponding to the jth non-"-" element in the i-th row of the selected parity check matrix H, and place (M _LdpcRound1 ) _i,j *W data after the starting value of the required data Copy to the current position of the sub-vector t' of the temporary variable vector t, where the current position refers to the starting position of the uncopied data in the sub-vector.

4.3根据M_{LdpcAssemble1}矩阵，判断是否需要进行补位操作。若(M_{LdpcAssemble1})_i,j＝1，则根据M_{LdpcAssembleTable1}矩阵中所指示的偏移量进行补位操作；若(M_{LdpcAssemble1})_i,j＝0，则不需要补位操作。具体的补位操作包括：确定矩阵M_{LdpcAssemble1}中与校验矩阵元素H_i,j对应行中各个元素的取值并将与元素H_i,j对应的当前向量q的子向量中索引号为的各个元素依次拷贝到与H_i,j对应的临时变量向量t子向量的当前位置上；其中， 4.3 According to the M _{LdpcAssemble1} matrix, it is judged whether a supplementary operation is required. If (M _{LdpcAssemble1} ) _i,j =1, the complement operation is performed according to the offset indicated in the M _{LdpcAssembleTable1} matrix; if (M _{LdpcAssemble1} ) _i,j =0, no complement operation is required. The specific complement operation includes: determining the value of each element in the row corresponding to the check matrix element H _{i, j} in the matrix M _{LdpcAssemble1} And the index number in the subvector of the current vector q corresponding to the element H _{i, j} is Each element of is copied to the current position of the temporary variable vector t subvector corresponding to H _{i, j} in turn; where,

4.4根据M_LdpcOffset2矩阵中找出所选校验矩阵H第i行第j个非“-”元素对应的循环偏移量，在变量节点向量q中找出所需数据的起始值位置，所需数据的起始值位置＝变量节点向量q子向量q'的起始值位置+对应的循环偏移量。4.4 According to the M _LdpcOffset2 matrix, find the cyclic offset corresponding to the jth non-"-" element in the i-th row of the selected parity check matrix H, and find the starting value position of the required data in the variable node vector q, so The initial value position of the required data = the initial value position of the variable node vector q sub-vector q' + the corresponding loop offset.

4.5根据M_LdpcRound2矩阵找出所选校验矩阵H第i行第j个非“-”元素对应的循环次数，将所需数据的起始值位置后(M_LdpcRound2)_i,j*W个数据拷贝到临时变量向量t子向量t'的当前位置上。4.5 According to the M _LdpcRound2 matrix, find out the number of cycles corresponding to the jth non-"-" element in the i-th row of the selected parity check matrix H, and place (M _LdpcRound2 ) _i,j *W data after the starting value of the required data Copy to the current position of the temporary variable vector t sub-vector t'.

4.6若此时非第一次迭代运算，则需进行临时变量向量t子向量t'＝临时变量向量t子向量t'-校验节点向量r子向量r'运算，将临时变量向量t子向量t'以及校验节点向量r子向量r'所有元素分成以W个元素为一组，此时可进行SIMD优化，即一次运算可得到W个临时变量向量t中的元素，参见图11，其中(a₁,a₂,…,a_W)表示一组临时变量向量t中的元素，(b₁,b₂,…,b_W)表示一组校验节点向量r中的元素，矩形框表示减法运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁-b₁,a₂-b₂,…,a_W-b_W)；若此时为第一次迭代运算，则进行步骤5。4.6 If it is not the first iterative operation at this time, the temporary variable vector t subvector t'=temporary variable vector t subvector t'-check node vector r subvector r' operation is required, and the temporary variable vector t subvector All elements of t' and check node vector r sub-vector r' are divided into a group of W elements, and SIMD optimization can be performed at this time, that is, elements in W temporary variable vector t can be obtained in one operation, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represent the elements in a set of temporary variable vector t, (b ₁ ,b ₂ ,…,b _W ) represent the elements in a set of check node vector r, and the rectangular box represents Subtractor, (y ₁ ,y ₂ ,…,y _W ) represents the result after operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ -b ₁ ,a ₂ -b ₂ ,… ,a _W -b _W ); if this is the first iterative operation, go to step 5.

5、计算临时变量向量t子向量t'中所有元素的绝对值，并记为取模后临时变量向量|t|。将临时变量向量t子向量t'分成以W个元素为一组，此时可进行SIMD优化，即一次运算可得到W个临时变量向量t子向量t'中的元素，参见图12，其中(c₁,c₂,…,c_W)表示一组临时变量向量t子向量t'中的元素，矩形框表示取模运算器，(y₁,y₂,…,y_W)表示取模后临时变量向量|t|，即(y₁,y₂,…,y_W)＝(|c₁|,|c₂|,…,|c_W|)。将其作为更新后的临时变量向量t子向量t'。5. Calculate the absolute value of all elements in the subvector t' of the temporary variable vector t, and record it as the modulo temporary variable vector |t|. The temporary variable vector t sub-vector t' is divided into W elements as a group, and SIMD optimization can be performed at this time, that is, elements in the W temporary variable vector t sub-vector t' can be obtained in one operation, see Figure 12, where ( c ₁ ,c ₂ ,…,c _W ) represent the elements in a set of temporary variable vector t sub-vector t', the rectangular box represents the modulo operator, (y ₁ ,y ₂ ,…,y _W ) represents the modulus Temporary variable vector |t|, ie (y ₁ , y ₂ , ..., y _W )=(|c ₁ |, |c ₂ |, ..., |c _W |). Take it as a subvector t' of the updated temporary variable vector t.

6、最值分配运算，并将结果存于最值变量向量m矩阵M中。该过程以所选校验矩阵H行数进行循环，即每次对临时变量向量t中的V_{LdpcRowLength}(v)*W*LdpcBufferNum个元素进行操作，得到长度为V_{LdpcRowLength}(v)*W*LdpcBufferNum的结果存于最值变量向量m中。一次最值分配运算示意图参见图15，将与校验矩阵每行对应的临时变量向量t子向量t'写成V_{LdpcRowLength}(v)行和列的矩阵T_v，其中，矩阵T_v的每一行为临时变量向量t子向量t'中与元素H_i,j对应的子向量，列数不够时进行补位；最值分配运算即计算矩阵T_v中每列元素的最小值以及次小值，进行分配，并将结果存于最值变量向量m子向量矩阵M_v。矩阵T_v每行有LdpcBufferNum个子块，每个子块中有W个基本单位；对比各个行中基本单位的大小，得出其中的最小值以及次小值，并记录该最小值的所在行数的行号，即索引值；根据索引值将最值变量向量m进行填充，若索引值与最值变量向量m的行号不同，则在最值变量向量m填入找出的最小值，若索引值与最值变量向量m的行号相同，则在最值变量向量m填入找出的次小值。以找出一个子块的最小值次小值为例，其流程图参见图14，具体步骤如下：6. The most value distribution operation, and store the result in the matrix M of the most value variable vector m. This process loops with the number of rows of the selected check matrix H, that is, each time the V _{LdpcRowLength} (v)*W*LdpcBufferNum elements in the temporary variable vector t are operated, the length is V _{LdpcRowLength} (v)*W*LdpcBufferNum The result of is stored in the most value variable vector m. Refer to Figure 15 for a schematic diagram of the most value distribution operation. The temporary variable vector t subvector t' corresponding to each row of the check matrix is written as V _{LdpcRowLength} (v) row and The matrix T _v of columns, wherein, each row of the matrix T _v is the sub-vector corresponding to the element H _{i, j} in the temporary variable vector t sub-vector t', and when the number of columns is not enough, the complement is performed; the most value assignment operation is the calculation matrix The minimum value and the second minimum value of each column element in T _v are distributed, and the result is stored in the subvector matrix M _v of the most value variable vector m. There are LdpcBufferNum sub-blocks in each row of the matrix T _v , and there are W basic units in each sub-block; compare the size of the basic units in each row, obtain the minimum value and the second minimum value, and record the number of rows where the minimum value is located Row number, that is, the index value; fill the most value variable vector m according to the index value, if the index value is different from the row number of the most value variable vector m, then fill in the minimum value found in the most value variable vector m, if the index value is the same as the row number of the most value variable vector m, then fill in the second smallest value found in the most value variable vector m. Take finding out the minimum value and second minimum value of a sub-block as an example, its flow chart is shown in Figure 14, and the specific steps are as follows:

6.1比较矩阵T_v的第一行与第二行的第一个子块中对应基本单位的大小，将较小值的行号存入索引值中，并将较小值记录为最小值，较大值记录为次小值，此时可进行SIMD优化，进行两次运算，一次取最大值，得出两者间的较大值，一次去最小值，得出两者间的较小值，参见图11，其中(a₁,a₂,…,a_W)表示第一行第一个子块的元素，(b₁,b₂,…,b_W)表示第二行第一个子块的元素，矩形框表示取最大值运算器或者取最小值运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(max(a₁,b₁),max(a₂,b₂),…,max(a_W,b_W))或者(y₁,y₂,…,y_W)＝(min(a₁,b₁),min(a₂,b₂),…,min(a_W,b_W))。6.1 Compare the size of the corresponding basic unit in the first sub-block of the first row and the second row of the matrix T _v , store the row number of the smaller value in the index value, and record the smaller value as the minimum value, and compare The large value is recorded as the second smallest value. At this time, SIMD optimization can be performed, and two calculations are performed. One takes the maximum value to obtain the larger value between the two, and one time removes the minimum value to obtain the smaller value between the two. See Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the elements of the first sub-block in the first row, and (b ₁ ,b ₂ ,…,b _W ) represents the first sub-block in the second row The elements of , the rectangular box represents the maximum value operator or the minimum value operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, that is, (y ₁ ,y ₂ ,…,y _W )= (max(a ₁ ,b ₁ ),max(a ₂ ,b ₂ ),…,max(a _W ,b _W )) or (y ₁ ,y ₂ ,…,y _W )=(min(a ₁ , b ₁ ),min(a ₂ ,b ₂ ),…,min(a _W ,b _W )).

6.2判断是否已达到最大循环次数V_{LdpcRowLength}(v)，若没有达到，则进行步骤6.3；若达到，则进行步骤6.6。6.2 Determine whether the maximum number of cycles V _{LdpcRowLength} (v) has been reached, if not, proceed to step 6.3; if yes, proceed to step 6.6.

6.3将矩阵T_v的下一行第一子块与先前记录的最小值进行取最大值操作，该操作可进行SIMD优化，同步骤6.1。6.3 Perform a maximum value operation on the first sub-block in the next row of the matrix T _v and the previously recorded minimum value, this operation can be SIMD optimized, the same as step 6.1.

6.4将步骤6.3得到的结果与当前记录的次小值进行取最小值操作，该操作可进行SIMD优化，同步骤6.1，并将结果记为次小值。6.4 Perform the minimum value operation on the result obtained in step 6.3 and the second smallest value currently recorded. This operation can be SIMD optimized, the same as step 6.1, and record the result as the second smallest value.

6.5将步骤6.3得到的结果与当前记录的最小值进行取最小值操作，该操作可进行SIMD优化，同步骤6.1，并将结果记为最小值，同时将该最小值的行号记录为索引值，返回步骤6.2。6.5 Perform the minimum value operation on the result obtained in step 6.3 and the minimum value currently recorded. This operation can be SIMD optimized, the same as step 6.1, and record the result as the minimum value, and record the row number of the minimum value as the index value , return to step 6.2.

6.6将当前记录的最小值和次小值均减去修正值β，该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示先前记录的最小值或者次小值，(b₁,b₂,…,b_W)表示修正值β也可表示为(β,β,…,β)，矩形框表示减法运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁-β,a₂-β,…,a_W-β)，并将结果记录为最小值或次小值。6.6 Subtract the correction value β from the minimum value and the second smallest value of the current record. This operation can be SIMD optimized, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represent the minimum or second smallest value recorded previously Small value, (b ₁ ,b ₂ ,…,b _W ) means the correction value β can also be expressed as (β,β,…,β), the rectangular box means the subtractor, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ -β,a ₂ -β,…,a _W -β), and record the result as the minimum value or second small value.

6.7对当前记录的最小值和次小值进行修正，当前记录的最小值或次小值小于零时，将该值置为零，否则不做操作，该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示当前记录的最小值或者次小值，(b₁,b₂,…,b_W)表示零值也可表示为(0,0,…,0)，矩形框表示修正运算器，(y₁,y₂,…,y_W)表示运算后的结果，即并将结果记录为最小值或次小值。6.7 Correct the minimum value and the second minimum value of the current record. When the minimum value or the second minimum value of the current record is less than zero, set the value to zero, otherwise no operation is performed. This operation can be SIMD optimized. See Figure 11. Where (a ₁ ,a ₂ ,…,a _W ) represents the minimum or second smallest value of the current record, (b ₁ ,b ₂ ,…,b _W ) represents zero value and can also be expressed as (0,0,…, 0), the rectangular frame represents the correction operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, namely And record the result as the smallest or next smallest value.

6.8根据索引值将最值变量向量m进行填充，若索引值与最值变量向量m子向量矩阵M_v的行号不同，则在矩阵M_v的相同位置填入当前记录的最小值，若索引值与最值变量向量m子向量矩阵M_v的行号相同，则在最值变量向量m子向量矩阵M_v的相同位置填入当前记录的次小值。将矩阵M_v中的元素按照行优先的顺序读出构成最值变量向量m子向量。6.8 Fill the most value variable vector m according to the index value. If the index value is different from the row number of the subvector matrix M _v of the most value variable vector m, fill in the minimum value of the current record in the same position of the matrix M _v . If the index value is the same as the row number of the most value variable vector m _subvector matrix _Mv , then fill in the second smallest value of the current record in the same position of the most value variable vector m subvector matrix Mv. The elements in the matrix M _v are read out in row-first order to form the subvector of the most valued variable vector m.

7、计算中间变量向量s。该过程以所选校验矩阵H行数进行循环，即每次对临时变量向量t中的V_{LdpcRowLength}(v)*W*LdpcBufferNum个元素进行操作，得到长度为V_{LdpcRowLength}(v)*W*LdpcBufferNum的结果存于中间变量向量s中。一次中间变量向量s运算示意图参见图17，将临时变量向量t分成V_{LdpcRowLength}(v)行，每行有LdpcBufferNum个子块，每个子块中有W个基本单位。以计算一个子块的中间变量向量s为例，其流程图参见图16，具体操作流程如下：7. Calculate the intermediate variable vector s. This process loops with the number of rows of the selected check matrix H, that is, each time the V _{LdpcRowLength} (v)*W*LdpcBufferNum elements in the temporary variable vector t are operated, the length is V _{LdpcRowLength} (v)*W*LdpcBufferNum The result is stored in the intermediate variable vector s. Refer to Figure 17 for a schematic diagram of an intermediate variable vector s operation. The temporary variable vector t is divided into V _{LdpcRowLength} (v) rows, each row has LdpcBufferNum sub-blocks, and each sub-block has W basic units. Take the calculation of the intermediate variable vector s of a sub-block as an example, its flow chart is shown in Figure 16, and the specific operation process is as follows:

7.1将临时变量向量t的第一行第一子块与第二行第一子块进行异或操作，该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示第一行第一子块，(b₁,b₂,…,b_W)表示第二行第一子块，矩形框表示异或运算器，(y₁,y₂,…,y_W)表示运算后的结果，即 7.1 XOR the first sub-block of the first row and the first sub-block of the second row of the temporary variable vector t, this operation can be SIMD optimized, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the first sub-block in the first row, (b ₁ ,b ₂ ,…,b _W ) represents the first sub-block in the second row, and the rectangular box represents the XOR operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, namely

7.2判断是否已达到最大循环次数V_{LdpcRowLength}(v)，若没有达到，则进行步骤7.3；若达到，则进行步骤7.4，并且从第一行开始执行。7.2 Determine whether the maximum number of cycles V _{LdpcRowLength} (v) has been reached, if not, proceed to step 7.3; if yes, proceed to step 7.4, and execute from the first row.

7.3将临时变量向量t的下一行第一子块与步骤7.1的结果进行异或操作，该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示下一行第一子块，(b₁,b₂,…,b_W)表示步骤7.1的结果，矩形框表示异或运算器，(y₁,y₂,…,y_W)表示运算后的结果，即返回步骤7.2。7.3 XOR the first sub-block of the next row of the temporary variable vector t with the result of step 7.1, this operation can be SIMD optimized, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represent the next row The first sub-block, (b ₁ ,b ₂ ,…,b _W ) represents the result of step 7.1, the rectangular box represents the XOR operator, and (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, namely Return to step 7.2.

7.4判断是否已达到最大循环次数V_{LdpcRowLength}(v)，若没有达到，则进行步骤7.5；若达到，则进行步骤8。7.4 Judging whether the maximum number of cycles V _{LdpcRowLength} (v) has been reached, if not, proceed to step 7.5; if yes, proceed to step 8.

7.5将临时变量向量t的当前行第一子块与步骤7.3的结果进行异或操作，该操作可进行SIMD优化。7.5 Perform an XOR operation on the first sub-block of the current row of the temporary variable vector t and the result of step 7.3, and this operation can be SIMD optimized.

7.6将步骤7.5的结果与进行或操作，该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示步骤7.5的结果，(b₁,b₂,…,b_W)表示矩形框表示或运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁|b₁,a₂|b₂,…,a_W|b_W)，并将结果存入中间变量向量s的第一子块中，返回步骤7.4。7.6 Combine the result of step 7.5 with Perform OR operation, this operation can be SIMD optimized, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the result of step 7.5, (b ₁ ,b ₂ ,…,b _W ) represents The rectangular box represents the OR operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ |b ₁ ,a ₂ |b ₂ ,...,a _W |b _W ), and store the result in the first sub-block of the intermediate variable vector s, and return to step 7.4.

8、计算校验节点向量r，其是长度为的向量。其中，该校验节点向量中包括与校验矩阵每一行对应的子向量，其索引号为到该子向量又包括与每个非“-”元素H_i,j对应的子向量。计算校验节点向量r的过程以所选校验矩阵H行数进行循环，即每次对临时变量向量t中的V_{LdpcRowLength}(v)*W*LdpcBufferNum个元素进行操作，得到长度为V_{LdpcRowLength}(v)*W*LdpcBufferNum的结果存于校验节点向量r中。在计算校验矩阵每行对应的子向量进行计算时，以每个非“-”元素H_i,j对应的子向量为单位进行，计算得到索引为到的向量中的元素。下面以计算一个与非“-”元素H_i,j对应的校验节点向量r子向量r'为例，其具体操作流程如下：8. Calculate the check node vector r, whose length is of vectors. Wherein, the check node vector includes a subvector corresponding to each row of the check matrix, and its index number is arrive This subvector in turn includes a subvector corresponding to each non-"-" element H _i,j . The process of calculating the check node vector r is circulated with the number of rows of the selected check matrix H, that is, the V _{LdpcRowLength} (v)*W*LdpcBufferNum elements in the temporary variable vector t are operated each time to obtain a length of V _{LdpcRowLength} ( v) The result of *W*LdpcBufferNum is stored in the check node vector r. When calculating the subvector corresponding to each row of the parity check matrix, the subvector corresponding to each non-"-" element H _{i, j} is used as the unit, and the calculated index is arrive elements in the vector of . The following takes the calculation of a check node vector r subvector r' corresponding to a non-"-" element H _i,j as an example, and the specific operation process is as follows:

8.1判断是否已达到最大循环次数V_{LdpcRowLength}(v)，若没有达到，则进行步骤8.2；若达到，则进行步骤9。8.1 Determine whether the maximum number of cycles V _{LdpcRowLength} (v) has been reached, if not, proceed to step 8.2; if yes, proceed to step 9.

8.2将最值变量向量m与H_i,j对应的子向量与中间变量向量s与H_i,j对应的子向量进行对比操作，若中间变量向量s小于零，则结果为对最值变量向量m的值求的补数，若中间变量向量s等于零，则结果为零，若中间变量向量s大于零，则结果为对最值变量向量m的值。该操作可进行SIMD优化，参见图12，其中(a₁,a₂,…,a_W)表示最值变量向量m，(b₁,b₂,…,b_W)表示中间变量向量s，矩形框表示对比运算器，(y₁,y₂,…,y_W)表示运算后的结果，即 8.2 Compare the subvector corresponding to the most value variable vector m and H _i,j with the subvector corresponding to the intermediate variable vector s and H _i,j , if the intermediate variable vector s is less than zero, the result is the most value variable vector The complement of the value of m, if the intermediate variable vector s is equal to zero, the result is zero, if the intermediate variable vector s is greater than zero, the result is the value of the most valued variable vector m. This operation can be SIMD optimized, see Figure 12, where (a ₁ , a ₂ ,…,a _W ) represents the most value variable vector m, (b ₁ ,b ₂ ,…,b _W ) represents the intermediate variable vector s, and the rectangle The box represents the comparison operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, namely

8.3将步骤8.2的结果与中间变量向量s相加。该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示步骤8.2的结果，(b₁,b₂,…,b_W)表示中间变量向量s，矩形框表示加法运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁+b₁,a₂+b₂,…,a_W+b_W)。8.3 Add the result of step 8.2 to the intermediate variable vector s. This operation can be SIMD optimized, see Figure 11, where (a ₁ ,a ₂ ,…,a _W ) represents the result of step 8.2, (b ₁ ,b ₂ ,…,b _W ) represents the intermediate variable vector s, and the rectangular box Indicates the addition operator, (y ₁ ,y ₂ ,…,y _W ) indicates the result after operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ +b ₁ ,a ₂ +b ₂ , ..., _aW + _bW ).

8.4将步骤8.3的结果与临时变量向量t子向量t'相减。该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示步骤8.3的结果，(b₁,b₂,…,b_W)表示临时变量向量t，矩形框表示加法运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁-b₁,a₂-b₂,…,a_W-b_W)，并将结果存于校验节点向量r，返回步骤8.1。8.4 Subtract the result of step 8.3 from the temporary variable vector t subvector t'. This operation can be SIMD optimized, see Figure 11, where (a ₁ , a ₂ ,…,a _W ) represents the result of step 8.3, (b ₁ ,b ₂ ,…,b _W ) represents the temporary variable vector t, and the rectangular box Indicates the addition operator, (y ₁ ,y ₂ ,…,y _W ) indicates the result after operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ -b ₁ ,a ₂ -b ₂ , …,a _W -b _W ), and store the result in the check node vector r, return to step 8.1.

9、计算变量节点向量q，其是长度为的向量。其中，该变量节点向量中包括与校验矩阵每一行对应的子向量，其索引号为到该子向量又包括与每个非“-”元素H_i,j对应的子向量q'。变量节点向量q子向量q'的计算公式为即变量节点向量q子向量q'的值为临时变量向量t子向量t'值与校验节点向量r子向量r'值的和，并根据校验矩阵H第i行第j列元素值循环移位后的结果。为了适应SIMD运算，输入临时变量向量t子向量t'与校验节点向量r子向量r'长度为Z，输出变量节点向量q子向量q'大小将为但在输出子向量中，只有其中的前Z个元素为有效数据。变量节点向量q的运算以所选校验矩阵H每行非“-”元素个数进行循环，例如所选校验矩阵H第i行第j个非“-”元素将计算得出变量节点向量q的第到位的元素。其具体计算步骤如下：9. Calculate the variable node vector q, whose length is of vectors. Wherein, the variable node vector includes a subvector corresponding to each row of the parity check matrix, and its index number is arrive This sub-vector in turn includes a sub-vector q' corresponding to each non-"-" element H _i,j . The calculation formula of the variable node vector q sub-vector q' is That is, the value of the variable node vector q sub-vector q' is the sum of the value of the temporary variable vector t sub-vector t' and the value of the check node vector r sub-vector r', and loops according to the element value of the ith row and j column of the check matrix H The shifted result. In order to adapt to the SIMD operation, the length of the input temporary variable vector t subvector t' and the check node vector r subvector r' is Z, and the size of the output variable node vector q subvector q' will be But in the output subvector, only the first Z elements are valid data. The operation of the variable node vector q is cyclically based on the number of non-"-" elements in each row of the selected check matrix H, for example, the jth non-"-" element in the i-th row of the selected check matrix H will be calculated to obtain the variable node vector q's arrive bit elements. The specific calculation steps are as follows:

9.1根据M_LdpcOffset1矩阵中找出所选校验矩阵H第i行第j个非“-”元素对应的循环偏移量，在变量节点向量q中找出所需起始位置，所需起始位置＝变量节点向量q子向量q'的起始值位置+对应的循环偏移量。9.1 According to the M _LdpcOffset1 matrix, find the cyclic offset corresponding to the non-"-" element in the i-th row of the selected check matrix H, and find the required starting position in the variable node vector q. The required starting position Position = initial value position of variable node vector q sub-vector q' + corresponding loop offset.

9.2将临时变量向量t子向量t'与校验节点向量r子向量r'相加。该操作可进行SIMD优化，参见图11，其中(a₁,a₂,…,a_W)表示临时变量向量t，(b₁,b₂,…,b_W)表示校验节点向量r，矩形框表示加法运算器，(y₁,y₂,…,y_W)表示运算后的结果，即(y₁,y₂,…,y_W)＝(a₁+b₁,a₂+b₂,…,a_W+b_W)。9.2 Add the subvector t' of the temporary variable vector t to the subvector r' of the check node vector r. This operation can be SIMD optimized, see Figure 11, where (a ₁ , a ₂ ,…,a _W ) represents the temporary variable vector t, (b ₁ ,b ₂ ,…,b _W ) represents the check node vector r, and the rectangle The box represents the addition operator, (y ₁ ,y ₂ ,…,y _W ) represents the result after the operation, that is, (y ₁ ,y ₂ ,…,y _W )=(a ₁ +b ₁ ,a ₂ +b ₂ ,...,a _W +b _W ).

9.3根据M_LdpcRound3矩阵找出所选校验矩阵H第i行第j个非“-”元素对应的循环次数，将步骤9.2结果数据的起始值位置后(M_LdpcRound1)_i,j*W个数据拷贝到步骤9.1中找出的所需起始位置。9.3 According to the M _LdpcRound3 matrix, find the number of loops corresponding to the jth non-"-" element in the i-th row of the selected check matrix H, and place the starting value of the result data in step 9.2 after (M _LdpcRound1 ) _i,j *W Copy the data to the desired starting location found in step 9.1.

9.4根据M_{LdpcAssemble2}矩阵，判断是否需要进行补位操作。若(M_{LdpcAssemble2})_i,j＝1，则根据M_{LdpcAssembleTable1}矩阵中所指示的偏移量进行补位操作；若(M_{LdpcAssemble2})_i,j＝0，则不需要补位操作。9.4 According to the M _{LdpcAssemble2} matrix, it is judged whether to perform complement operation. If (M _{LdpcAssemble2} ) _i,j =1, the complement operation is performed according to the offset indicated in the M _{LdpcAssembleTable1} matrix; if (M _{LdpcAssemble2} ) _i,j =0, no complement operation is required.

9.5根据M_LdpcOffset2矩阵中找出所选校验矩阵H第i行第j个非“-”元素对应的循环偏移量，在变量节点向量q中找出所需起始位置，所需起始位置＝变量节点向量q的起始值位置+对应的循环偏移量。9.5 According to the M _LdpcOffset2 matrix, find the cyclic offset corresponding to the non-"-" element in the i-th row of the selected parity check matrix H, find the required starting position in the variable node vector q, and the required starting position Position = initial value position of variable node vector q + corresponding loop offset.

9.6根据M_LdpcRound4矩阵找出所选校验矩阵H第i行第j个非“-”元素对应的循环次数，将步骤9.2结果数据的起始值位置后(M_LdpcRound4)_i,j*W个数据拷贝到步骤9.5中找出的所需起始位置。9.6 According to the M _LdpcRound4 matrix, find the number of cycles corresponding to the jth non-"-" element in the i-th row of the selected parity check matrix H, and place the starting value position of the result data in step 9.2 (M _LdpcRound4 ) _i,j *W Copy the data to the desired starting location found in step 9.5.

9.7根据LdpcRemain所指示的所需补位个数，再根据M_{LdpcAssembleTable1}矩阵中所指示的偏移量对变量节点向量q剩余元素进行补充操作，返回步骤2。9.7 According to the required number of complements indicated by LdpcRemain, and then according to the offset indicated in the M _{LdpcAssembleTable1} matrix, supplement the remaining elements of the variable node vector q, and return to step 2.

上述即为本申请中的LDPC编码和译码方法。The above is the LDPC encoding and decoding method in this application.

LDPC码的编译码理论较为成熟，但因为LDPC码是一种码长n较大的线性分组码，校验矩阵H也较大，算法复杂度十分高，传统的LDPC编译码方式不能很好的满足IEEE 802.11n系统的吞吐率要求，很大程度影响到了系统的性能。现有高速无线局域网系统中LDPC码的实现大多基于FPGA芯片与DSP芯片。通过以往方法虽然可以满足现代高速无线局域网协议中处理和时延的要求，但是FPGA编程和专业DSP都比较复杂，缺少丰富的编程环境和调试工具，适用性一般。而基于GPP芯片，开发人员可以使用普通计算机在熟悉的结构和环境下使用丰富的工具进行开发，如C/C++环境。本专利的创新点就是在GPP芯片上对高速无线局域网系统中的LDPC码在使用原有编译码器的情况下，根据GPP芯片的特性对编译码方法进行优化。由于IEEE802.11n的LDPC码是非正则LDPC码，其校验矩阵原型每行的非负值个数不一定相同，所以，相比以往LDPC码编译码实现方法，GPP芯片的灵活性会有很大优势。另外，考虑到CPU(Central Processing Unit，中央处理器)的高速发展，GPP芯片的数据处理能力也会不断提升。The theory of LDPC coding and decoding is relatively mature, but because LDPC code is a linear block code with a large code length n, the check matrix H is also large, and the algorithm complexity is very high. The traditional LDPC coding and decoding methods cannot be very good. Meeting the throughput requirements of the IEEE 802.11n system greatly affects the performance of the system. The implementation of LDPC codes in existing high-speed wireless local area network systems is mostly based on FPGA chips and DSP chips. Although the previous methods can meet the processing and delay requirements of modern high-speed wireless LAN protocols, FPGA programming and professional DSP are relatively complicated, lacking a rich programming environment and debugging tools, and the applicability is average. Based on the GPP chip, developers can use ordinary computers to develop with rich tools in a familiar structure and environment, such as C/C++ environment. The innovation of this patent is to optimize the encoding and decoding method according to the characteristics of the GPP chip under the condition of using the original codec for the LDPC code in the high-speed wireless LAN system on the GPP chip. Since the LDPC code of IEEE802.11n is a non-regular LDPC code, the number of non-negative values in each row of the check matrix prototype is not necessarily the same. Therefore, compared with the previous LDPC code encoding and decoding implementation methods, the flexibility of the GPP chip will be great. Advantage. In addition, considering the rapid development of the CPU (Central Processing Unit, central processing unit), the data processing capability of the GPP chip will also continue to improve.

首先，采用SIMD指令实现数据的并行处理。SIMD指令集，在本专利中所采用的Intel CPU上也可以称为SSE(Streaming SIMD Extensions，指令集)指令集，其本质原理是在CPU的一个时钟周期内对多个数据进行处理以获得并行处理的效果，而不像是普通的使用方式——每一个时钟周期只进行一次数据处理操作。对于Nehalem架构的CPU，其处理位宽为128比特，对于Sandy Bridge架构的CPU，其处理位宽为256比特，即对于8比特定点数而言，前者在一个指令周期内可以对16个数据进行处理，后者在一个指令周期内可以对32个数据进行处理，从理论上而言，并行度分别为16倍并行和32倍并行。然而通过实际的程序仿真结果可以知道，在实际的系统运行过程中往往无法达到理想的并行倍数，一方面是因为程序并非完全由数据操作流程组成，同时还包括了大量的判断语句，而这些判断语句无法进行并行操作。另一方面，如果采用128比特位宽的SIMD指令，针对IEEE 802.11n标准的校验矩阵，每个子矩阵大小并不是16的倍数，因此每个子矩阵大小最后一组数据处理时并行度是小于16的。First, the parallel processing of data is realized by using SIMD instructions. SIMD instruction set, also can be referred to as SSE (Streaming SIMD Extensions, instruction set) instruction set on the Intel CPU adopted in this patent, its essential principle is to process a plurality of data in one clock cycle of CPU to obtain parallel The effect of processing, unlike ordinary usage - only one data processing operation per clock cycle. For the CPU of the Nehalem architecture, its processing bit width is 128 bits, and for the CPU of the Sandy Bridge architecture, its processing bit width is 256 bits, that is, for 8-bit specific points, the former can process 16 data in one instruction cycle Processing, the latter can process 32 data in one instruction cycle, theoretically speaking, the degree of parallelism is 16 times parallelism and 32 times parallelism respectively. However, through the actual program simulation results, it can be known that the ideal parallel multiple is often not achieved during the actual system operation. On the one hand, the program is not completely composed of data operation processes, and also includes a large number of judgment statements, and these judgments Statements cannot be operated in parallel. On the other hand, if a 128-bit wide SIMD instruction is used, for the parity check matrix of the IEEE 802.11n standard, the size of each sub-matrix is not a multiple of 16, so the parallelism of the last group of data processing for each sub-matrix is less than 16 of.

其次，通过采用查找表的方法，即对多个可知参数进行初始化，并用偏移量的方法标记所需数据的具体位置，用内存换取计算复杂度，提高LDPC码编译码方法的数据处理速度。在LDPC码编码器中，可以提前计算不同码率和码长条件下的编码所需的块矩阵，并将其存储在LUTs(Look-Up-Table，查找表)中，只要在程序开始运行时将表读入即可，无需重复计算。Secondly, by using the lookup table method, that is, initializing multiple known parameters, and marking the specific location of the required data with the offset method, the memory is exchanged for the computational complexity, and the data processing speed of the LDPC encoding and decoding method is improved. In the LDPC code encoder, the block matrix required for encoding under different code rates and code lengths can be calculated in advance and stored in LUTs (Look-Up-Table, look-up table), as long as the program starts to run The table can be read in without repeated calculations.

最后，采用了多线程的方法，在同一时间执行多于一个线程，进而提升系统的整体处理性能。在LDPC优化编译码方法中，对其中以校验矩阵一行数据为单位的操作，用多线程的方法进行优化，线程数为校验矩阵的行数。Finally, a multi-threading method is adopted to execute more than one thread at the same time, thereby improving the overall processing performance of the system. In the LDPC optimized encoding and decoding method, the multi-thread method is used to optimize the operation with one line of data in the check matrix as the unit, and the number of threads is the number of rows of the check matrix.

以上所述仅为本发明的较佳实施例而已，并不用以限制本发明，凡在本发明的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本发明保护的范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included in the present invention. within the scope of protection.

Claims

1. a kind of LDPC encoding method based on general processor, comprising: obtain the signal vector S to be encoded by signal acquisition or receiving, determine parity check matrix H and block matrix A, B, D, E, F and T thereof, and save it; according to Determine the vectors p ₁ and p ₂ , and obtain the encoding result vector c=(S, p ₁ , p ₂ ) of LDPC; it is characterized in that any matrix and any vector performed when the vectors p ₁ and p ₂ are determined The multiplication process includes:

Taking each row of the arbitrary matrix as a thread, performing the multiplication operation of the corresponding row of the matrix and the arbitrary vector, and combining the multiplication results of all rows to form a result vector;

Wherein, the multiplication operation of each row of any matrix and any vector includes: determining the initial position of the vector corresponding to each element j in the current ith row of the matrix=the initial position of any vector+ A _{i, j} + (j-1) * Z, the data of ZA _{i, j} length from the starting position in any vector is left-shifted by means of single instruction multiple data stream SIMD, and After the data of the length of A _{i, j} before the starting position is moved to the left-shifted data, the vector shift result corresponding to the element j is obtained; and then the vector shift result corresponding to each element is added , as the multiplication result of each row and any vector;

In the SIMD method, the data of ZA _i,j length from the starting position is divided into units of length W segment, yes Parallel left shift operation is performed on segment data, and then left shift operation is performed on the remaining (ZA _{i, j} ) mod W length data;

Z is the size of a sub-matrix represented by an element in the parity check matrix.

2. The method according to claim 1, characterized in that, when the arbitrary matrix is T ^-1 , when each row of the T ^-1 is multiplied with the corresponding vector, only T ^-1 is taken Multiply the element with a value of 0 by the corresponding vector to obtain the vector shift result corresponding to the element with a value of 0, and set the vector shift result corresponding to the remaining elements to a zero vector; then shift the vector corresponding to each element The results are added as the result of multiplying each row by the either vector.

3. The method according to claim 1 or 2, characterized in that, taking the first Z data as valid data after performing a left shift operation on W segments of data simultaneously.

4. The method according to claim 1 or 2, wherein said adding the vector shift results corresponding to each element comprises: dividing the vector shift results corresponding to each element into segment, via SIMD pair The segment data is added in parallel, and then the remaining (ZA _i,j ) mod W length data is added.

5. The method according to claim 1 or 2, wherein said matrices A, B, D, E, F and T ^-1 are stored by a linear lookup table.

6. A kind of LDPC decoding method based on general-purpose processor, comprising: receive encoded LDPC code word signal c, determine parity check matrix H; Calculate variable node vector q as decoding result by multiple iterations, when each iteration , calculate the temporary variable vector according to the current variable node vector q and check node vector r as And update the check node vector r according to the temporary variable vector t, and then update the variable node vector q according to the check node vector r and the temporary variable vector t as During the first iteration, the codeword signal c is used as the variable node vector q, and the check node vector r is set to 0; it is characterized in that,

When calculating the temporary variable vector t, the check node vector r and the variable node vector q each iteration, each row of the check matrix is used as a thread for operation and update, and the vectors t, q and r corresponding to each row are obtained. index number from where i is the row index of the parity check matrix, corresponding to the i-th row of the parity check matrix when calculating the subvectors corresponding to the temporary variable vector t, the check node vector r and the variable node vector q, according to the check Each non-"-" element H _{i, j} in the row of the matrix corresponds to the index number corresponding to the element H _{i, j} in the calculation vectors t, q and r from The subvectors of , and then sequentially connected to obtain the subvectors corresponding to each row, when i=1, let V _{LdpcRowLength} (v) is the number of each row of non-"-" elements in the check matrix;

The way to calculate the temporary variable vector t sub-vector corresponding to H _i _,j is: determine the vector starting position Z*(n-1)+H _i,n corresponding to H i,j, and set the vector q corresponding to row i The length from the starting position in the subvector is or 6 data is copied to the beginning of the temporary variable vector t subvector corresponding to H _{i, j} by means of SIMD; when H _{i, n} ≠ 0, H _{i, n} ≠ '-' and (ZH _{i, n} )mod When W≠0, determine the value of each element in the row corresponding to check matrix element H _i,j in matrix M _{LdpcAssemble1} And the index number in the subvector of the current vector q corresponding to the element H _{i, j} is Each element of is copied to the current position of the temporary variable vector t sub-vector corresponding to H _{i, j} in turn; then determine the second vector starting position M _LdpcOffset2 corresponding to each element H _{i, j} , and set the second vector The length from the starting position is The data of is copied to the current position of the temporary variable vector t subvector corresponding to H _i,j by means of SIMD; take the first Z bits in the temporary variable vector t subvector corresponding to H _i,j and take the absolute value as An effective subvector of the temporary variable vector t corresponding to H _{i, j} ; wherein, M _{LdpcAssemble1} is one of the complementary offset flag bits calculated according to the check matrix, and M _LdpcOffset2 is calculated according to the check matrix One of the cycle offsets of

When H _i,n ≠0, H _i,n ≠'-' and (ZH _i,n )mod W≠0, When H _i,n =0 or H _i,n ='-' or (ZH _i,n )mod W=0, (M _LdpcOffset2 ) _i,j =Z*(n-1); K is the amount of data that can be handled by a general-purpose processor at one time, and k is the basic unit size of SIMD processing; when code length L _LDPC =648, LdpcRemain=11; when code length L _LDPC =1296, LdpcRemain=6; when code length L When _LDPC =1944, LdpcRemain=1; j is the index of each non-"-" element in the i-th row in all non-"-" elements in the row, and n is the j-th non-"-" element in the i-th row Column index in check matrix.

7. The method according to claim 6, wherein the mode of calculating the check node vector r subvector corresponding to each row of the check matrix is:

Write the temporary variable vector t subvector corresponding to each row of the check matrix as V _{LdpcRowLength} (v) rows and A matrix T _v of columns, wherein each row of the matrix T _v is a sub-vector corresponding to the element H _{i, j} in the sub-vector of the temporary variable vector t, and when the number of columns is not enough, a complement is performed;

Carrying out the most value assignment to the matrix T _v to obtain the most value variable vector m subvector matrix M _v ;

Calculate the intermediate variable vector s subvector matrix S _{v according to the matrix T v} _;

According to the element with the same index value in the matrix _{Mv and the matrix Sv, determine the element value of the corresponding index value in an intermediate matrix Rv} _' _; wherein, if any element in the matrix _Sv is less than 0, then Take the complement of any element and add it to any element, and use the addition result as the value of the element in the matrix R _v ' that is the same as the index value of any element in the matrix; if any of the matrix S _v If the element is equal to 0, add any element to 0, and use the addition result as the value of the element in the matrix R _v that is the same as the index value of any element in the matrix; if any element in the matrix S _v > 0 , then take the same element as the index value of any element in the matrix M _v and add it to any element, and use the result of the addition as the value of the element same as the index value of any element in the matrix R _v Value; the operation of comparing any element in the matrix S _v with 0 and the operation of adding are carried out by means of SIMD;

Subtract the matrix R _v ' from the elements with the same index value in the matrix T _v by means of SIMD, and use the result as the element value of the same index value in the check node vector r subvector matrix R _v ; the matrix R The first Z elements of each row in _v are sequentially read out to form the check node vector r subvector in a row-first manner.

8. The method according to claim 7, wherein said carrying out the most value distribution comprises:

Determine the minimum value and the second minimum value of each column in the matrix T _v and the row index corresponding to the minimum value by means of SIMD; modify the obtained minimum value and the second minimum value, and subtract the preset correction value β, When the corrected minimum and second minimum values are less than 0, set it to 0, otherwise keep it unchanged;

According to the current minimum value of each column in the matrix T _v , the second minimum value and the row index corresponding to the minimum value, construct the column of the same index in the most value variable vector m subvector matrix M _v , wherein, in any column of M _v , set the element corresponding to the same row index as the current minimum value as the determined minimum value, and set the remaining elements as the next smallest value.

9. The method according to claim 8, wherein the method of determining the minimum value and the second minimum value of each column and the corresponding row index by means of SIMD comprises:

Divide each row element of the matrix T _v into sub-blocks, each sub-block includes W basic units; when comparing the elements of any two rows in the matrix T _v , W basic units are compared at one time by means of SIMD.

10. method according to claim 7, is characterized in that, described calculation intermediate variable vector s subvector matrix S _v comprises:

For each column in the matrix T _v , perform an XOR operation on all the elements in the column, then XOR the result with the elements in the i'th row and then perform an OR operation with 0x7f, and use the OR operation result as the same as in the intermediate vector matrix S _v The i'th row element of the index column; wherein, each row element of the matrix T _v is divided into sub-blocks, each sub-block includes W basic units, when performing XOR/OR operations, the XOR/OR operations of W basic units are performed at one time through SIMD, and the i'th row represents the current row.

11. The method according to claim 6, wherein calculating the variable node vector q sub-vector corresponding to H _i,j comprises:

Determine the vector starting position Z*(n-1)+H _i,n corresponding to H _i,j _, and use the SIMD method to connect the temporary variable vector t subvector corresponding to H _i,j to the check node corresponding to H i,j The vector r sub-vectors are added, and the length from the starting position in the result vector is or 5 data is copied to the beginning of the variable node vector q subvector corresponding to H _i,j by means of SIMD; when H _i,n ≠0, H _i,n ≠'-' and (ZH _i,n ) mod When W≠0, determine the value of each element in the row corresponding to check matrix element H _i,j in matrix M _{LdpcAssemble1} And the index number in the subvector of the current vector q corresponding to the element H _{i, j} is Each element of is copied to the current position of the variable node vector q sub-vector corresponding to H _{i, j} in turn;

Determine the second vector starting position M _LdpcOffset2 corresponding to each element H _i,j , and set the length from the starting position of the second vector to 0 or The data of is copied to the current position of the variable node vector q sub-vector corresponding to H _i,j through SIMD;

According to the number of supplementary positions indicated by LdpcRemain, supplementary positions are performed according to the value of the element in the row corresponding to the check matrix element H _{i, j} in M _{LdpcAssemble1} .

12. The method according to any one of claims 6 to 11, characterized in that the vector starting position Z*(n-1)+H _i _{, n} and The starting position of the second vector M _LdpcOffset2 , the matrix M _{LdpcAssemble1} , and the vectors V _{LdpcRowLength} , M _{LdpcAssemble1} , and LdpcRemain constituted by the number of non-"-" elements in each row in the parity check matrix.