CN118508977B - LDPC code deep learning decoding method based on super network - Google Patents
LDPC code deep learning decoding method based on super network
- Publication number
- CN118508977B CN202311361702.8A
- Authority
- CN
- China
- Prior art keywords
- network
- super
- hypernetwork
- learning
- nnms
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/11—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits
- H03M13/1102—Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes
- H03M13/1105—Decoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a super-network-based LDPC code deep learning decoding method. The learning weights of existing channel neural decoders are trained for a specific channel environment, so such decoders have difficulty adapting to changes in the channel environment; in addition, the number of network layers of the channel decoder is difficult to change flexibly according to the application environment. To solve these two problems, a novel channel neural decoder based on a super network is provided. The decoder combines a super network with an NNMS network, where the NNMS network is the channel neural decoding main network and the super network generates learning weights for the main network. The super network consists of a dense neural network, and the inputs of both the super network and the main network are the log-likelihood ratio (LLR) value vector received from the channel. Experimental results show that the proposed decoder outperforms existing channel neural decoding algorithms, and the proposed framework provides flexibility in both the learning weights and the network architecture.
Description
Technical Field
The invention relates to a super-network-based LDPC code deep learning decoding method.
Background
Channel neural decoding can be largely divided into two classes: data-driven deep-learning-based decoders and model-driven deep-learning-based decoders. Data-driven decoders use a general neural network architecture, such as a convolutional neural network (CNN) or a recurrent neural network (RNN), to decode without relying on expert knowledge of communications.
The authors in literature [T.Gruber,S.Cammerer,J.Hoydis,and S.ten Brink,"On deep learning-based channel decoding,"in 2017 51st Annual Conference on Information Sciences and Systems(CISS).IEEE,2017,pp.1–6.] use a generic neural network to decode random codes and achieve Maximum A Posteriori (MAP) decoding performance for very short block codes.
Because a neural network (NN) based decoder is only effective for very short block codes, the authors in document [S.Cammerer,T.Gruber,J.Hoydis,and S.Ten Brink,"Scaling deep learning-based decoding of polar codes via partitioning,"in GLOBECOM 2017-2017 IEEE Global Communications Conference.IEEE,2017,pp.1–6.] propose replacing subcomponents of the polar decoding algorithm with NN-based decoders, thereby achieving scalability of deep-learning-based channel decoding.
The authors of documents [H.Kim,Y.Jiang,R.Rana,S.Kannan,S.Oh,and P.Viswanath,"Communication algorithms via deep learning,"arXiv preprint arXiv:1805.09317,2018.] and [Y.Jiang,S.Kannan,H.Kim,S.Oh,H.Asnani,and P.Viswanath,"Deepturbo:Deep turbo decoder,"in 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications(SPAWC).IEEE,2019,pp.1–5.] propose RNN- and CNN-based decoders to decode convolutional codes and Turbo codes, respectively. The authors in literature [A.Bennatan,Y.Choukroun,and P.Kisilev,"Deep learning for decoding of linear codes-a syndrome-based approach,"in 2018 IEEE International Symposium on Information Theory(ISIT).IEEE,2018,pp.1595–1599.] use syndrome decoding to eliminate the over-fitting problem that often occurs when training on codeword sets. Model-free methods typically employ fully connected neural networks or RNNs to implement the iterative process of a decoding algorithm. However, such methods lack interpretability of the decoding performance, make it difficult to analyze the reliability of the decoding result, and generally require a large number of parameters.
Model-driven deep-learning decoders are obtained by parameterizing conventional decoding algorithms: the Tanner graph is unfolded into a neural network, and a learning weight is attached to each edge. The unfolding method (unfold method) was proposed in document [J.R.Hershey,J.L.Roux,and F.Weninger,"Deep unfolding:Model-based inspiration of novel deep architectures,"arXiv preprint arXiv:1409.2574,2014.] and has become a bridge between traditional iterative communication algorithms and model-driven deep-learning-based communication algorithms. Using the unfolding method, the authors of documents [E.Nachmani,Y.Be'ery,and D.Burshtein,"Learning to decode linear codes using deep learning,"in 2016 54th Annual Allerton Conference on Communication,Control,and Computing(Allerton).IEEE,2016,pp.341–346.] and [E.Nachmani,E.Marciano,L.Lugosch,W.J.Gross,D.Burshtein,and Y.Be'ery,"Deep learning methods for improved decoding of linear codes,"IEEE Journal of Selected Topics in Signal Processing,vol.12,no.1,pp.119–131,2018.] propose the neural belief propagation (NBP) algorithm, which first unfolds the BP algorithm into a neural network and then adds a trainable learning weight to each edge of the Tanner graph. Experiments show that the bit error rate performance of the NBP decoding algorithm on high-density parity-check (HDPC) codes exceeds that of the BP decoding algorithm.
The authors in literature [L.Lugosch and W.J.Gross,"Neural offset min-sum decoding,"in 2017 IEEE International Symposium on Information Theory(ISIT).IEEE,2017,pp.1361–1365.] unfold the offset min-sum (OMS) algorithm into the form of a neural network and propose the neural offset min-sum (NOMS) algorithm, which has lower computational complexity than the NBP algorithm. The authors then apply a syndrome loss to the neural normalized min-sum (NNMS) algorithm (as in document ["Learning from the syndrome,"in 2018 52nd Asilomar Conference on Signals,Systems,and Computers.IEEE,2018,pp.594–598.]), enabling unsupervised training of the channel neural decoding algorithm. The authors in documents [Q.Wang,S.Wang,H.Fang,L.Chen,L.Chen,and Y.Guo,"A model-driven deep learning method for normalized min-sum ldpc decoding,"in 2020 IEEE International Conference on Communications Workshops(ICC Workshops).IEEE,2020,pp.1–6.] and [Q.Wang,Q.Liu,S.Wang,L.Chen,H.Fang,L.Chen,Y.Guo,and Z.Wu,"Normalized min-sum neural network for ldpc decoding,"IEEE Transactions on Cognitive Communications and Networking,2022.] apply the NNMS algorithm to long low-density parity-check (LDPC) codes. Document [J.Dai,K.Tan,Z.Si,K.Niu,M.Chen,H.V.Poor,and S.Cui,"Learning to decode protograph ldpc codes,"IEEE Journal on Selected Areas in Communications,vol.39,no.7,pp.1983–1999,2021.] proposes a high-performance neural min-sum (MS) decoding method that exploits the lifting structure of protograph LDPC (P-LDPC) codes. Channel neural decoding outperforms traditional decoding in terms of decoding performance, but many difficulties remain in highly mobile, time-varying channel environments and under changing noise levels. Once the neural decoder is fixed after training, its internal parameters and number of layers cannot be changed; if the noise level or number of layers changes, retraining is required, and the stability and optimality of the algorithm cannot be guaranteed.
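To make the unfolded NNMS structure described above concrete, the following is a minimal PyTorch sketch (not code from the cited works) of a single NNMS layer: a check-node min-sum update scaled by trainable normalization weights, followed by the variable-node update. The function name, tensor layout and the per-edge weight matrix beta are illustrative assumptions.

```python
import torch

def nnms_layer(H, llr, v2c, beta):
    """One unfolded NNMS layer (sketch).
    H    : (M, N) {0,1} parity-check matrix
    llr  : (N,)   channel LLR vector
    v2c  : (M, N) variable-to-check messages from the previous layer
    beta : (M, N) normalization weights, one per edge (trainable or generated)
    """
    mask = H.bool()
    inf = torch.full_like(v2c, float("inf"))
    mag = torch.where(mask, v2c.abs(), inf)                    # exclude non-edges from the min
    sgn = torch.where(mask & (v2c < 0), -torch.ones_like(v2c), torch.ones_like(v2c))

    # Extrinsic min and sign: leave the target edge itself out.
    min1, idx = mag.min(dim=1, keepdim=True)
    min2, _ = mag.scatter(1, idx, float("inf")).min(dim=1, keepdim=True)
    cols = torch.arange(H.shape[1]).unsqueeze(0)
    ext_min = torch.where(cols == idx, min2, min1)
    ext_sgn = sgn.prod(dim=1, keepdim=True) * sgn              # product over the other edges (signs are +/-1)

    c2v = beta * ext_sgn * ext_min * H                         # weighted check-to-variable messages
    v2c_next = (llr.unsqueeze(0) + c2v.sum(dim=0, keepdim=True) - c2v) * H
    soft = llr + c2v.sum(dim=0)                                # a-posteriori LLRs of this layer
    return v2c_next, soft

# Toy usage with a hypothetical (3 x 7) parity-check matrix.
H = torch.tensor([[1,1,0,1,1,0,0],
                  [1,0,1,1,0,1,0],
                  [0,1,1,1,0,0,1]], dtype=torch.float32)
llr = torch.randn(7) + 2.0                                     # positive mean <-> bit 0 (assumed mapping)
beta = torch.ones_like(H, requires_grad=True)                  # plain NNMS: trained directly; HyperNNMS: generated
v2c, soft = nnms_layer(H, llr, llr.unsqueeze(0) * H, beta)
hard = (soft < 0).int()                                        # hard decisions
```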
A super network uses one network to generate the weight coefficients of another network; it was first proposed in document [D.Ha,A.Dai,and Q.V.Le,"Hypernetworks,"arXiv preprint arXiv:1609.09106,2016]. To improve the adaptability of machine learning (ML) based multiple-input multiple-output (MIMO) detection to different channel environments and noise levels, the authors of documents [M.Goutay,F.A.Aoudia,and J.Hoydis,"Deep hypernetwork-based mimo detection,"in 2020 IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications(SPAWC).IEEE,2020,pp.1–5.] and [J.Zhang,C.-K.Wen,and S.Jin,"Adaptive mimo detector based on hypernetwork:Design,simulation,and experimental test,"IEEE Journal on Selected Areas in Communications,vol.40,no.1,pp.65–81,2021.] introduced a super network into ML-based MIMO detection and proposed HyperMMNet and HyperEPNet, respectively. The authors in literature [Y.Liu and O.Simeone,"Learning how to transfer from uplink to downlink via hyper-recurrent neural network for fdd massive mimo,"IEEE Transactions on Wireless Communications,2022.] propose a HyperRNN architecture for an end-to-end downlink channel estimation scheme for massive MIMO frequency-division duplex (FDD) systems, achieving a lower normalized mean square error (NMSE) in channel estimation and a higher beamforming sum rate. The authors in document [W.-C.Tsai,C.-W.Chen,C.-F.Teng,and A.-Y.Wu,"Low-complexity compressive channel estimation for irs-aided mmwave systems with hypernetwork-assisted lamp network,"IEEE Communications Letters,2022.] apply the super network to channel estimation algorithms in reconfigurable intelligent surface (RIS) aided communication systems.
The authors of document [E.Nachmani and L.Wolf,"Hyper-graph-network decoders for block codes,"Advances in Neural Information Processing Systems,vol.32,2019.] convert the message graph of algebraic block codes into a graph neural network whose learning weights are generated by a super network, thereby solving the difficulty of training a message-passing graph neural network. However, that work focuses only on the BP decoding algorithm and does not involve the more hardware-friendly min-sum algorithm. Compared with the NBP algorithm, the NNMS algorithm has lower computational complexity and is more suitable for hardware implementation. In recent years, documents [Y.Liang,C.-T.Lam,and B.K.Ng,"A low-complexity neural normal-ized min-sum ldpc decoding algorithm using tensor-train decomposi-tion,"IEEE Communications Letters,vol.26,no.12,pp.2914–2918,2022.] and ["Joint-way compression for ldpc neural decoding algorithm with tensor-ring decomposition,"IEEE Access,2023.] propose, on the basis of the NNMS algorithm, a higher-performance NNMS+ algorithm and a lower-complexity TT-NNMS+ algorithm, so that different versions can be selected for different applications as required.
The super network can improve the adaptability of the channel neural decoder and avoid repeated training caused by changes in the channel environment or the number of network layers. However, deploying an additional super network in hardware increases the computational complexity of the neural decoding algorithm, which is a challenge for the limited hardware resources of present communication systems. In recent years, tensor train (TT) and tensor ring (TR) decomposition have found widespread use in model compression, as they can overcome the curse of dimensionality, and researchers have applied them to intelligent communications. The authors of documents [Y.Liang,C.-T.Lam,and B.K.Ng,"A low-complexity neural normal-ized min-sum ldpc decoding algorithm using tensor-train decomposi-tion,"IEEE Communications Letters,vol.26,no.12,pp.2914–2918,2022.] and ["Joint-way compression for ldpc neural decoding algorithm with tensor-ring decomposition,"IEEE Access,2023.] introduce TT decomposition and joint-way compression into channel neural decoding, which greatly reduces the required memory space and computational complexity while maintaining decoding performance.
However, to our knowledge, TT and TR decomposition have not been applied to super networks. Tensor decomposition shares some similarities with neural networks in mathematical principle and structure. The authors of documents [N.Cohen,O.Sharir,and A.Shashua,"On the expressive power of deep learning:A tensor analysis,"in Conference on learning theory.PMLR,2016,pp.698–728.] and [V.Khrulkov,A.Novikov,and I.Oseledets,"Expressive power of recurrent neural networks,"arXiv preprint arXiv:1711.00811,2017.] use hierarchical Tucker (HT) decomposition and TT decomposition, respectively, to explain the expressive capacity of CNNs and RNNs. Documents [M.Astrid and S.-I.Lee,"Cp-decomposition with tensor power method for convolutional neural networks compression,"in 2017 IEEE International Conference on Big Data and Smart Computing(BigComp).IEEE,2017,pp.115–118.] and [V.Lebedev,Y.Ganin,M.Rakhuba,I.Oseledets,and V.Lempit-sky,"Speeding-up convolutional neural networks using fine-tuned cp-decomposition,"arXiv preprint arXiv:1412.6553,2014.] decompose the convolution kernel into the canonical polyadic (CP) format and build a deep bottleneck architecture. The authors in document [Y.Chen,X.Jin,B.Kang,J.Feng,and S.Yan,"Sharing residual units through collective tensor factorization to improve deep neural networks."in IJCAI,2018,pp.635–641.] use block term decomposition (BTD) to build a unified framework that expresses various distinctly different residual functions. Literature [J.-T.Chien and Y.-T.Bao,"Tensor-factorized neural networks,"IEEE transactions on neural networks and learning systems,vol.29,no.5,pp.1998–2011,2017.] regards the whole process of Tucker decomposition as a neural network process and proposes tensor-factorized neural networks (TFNN). Literature [J.Kossaifi,Z.C.Lipton,A.Kolbeinsson,A.Khanna,T.Furlanello,and A.Anandkumar,"Tensor regression networks,"The Journal of Machine Learning Research,vol.21,no.1,pp.4862–4882,2020.] incorporates tensor contraction as a differentiable layer into a neural network and on this basis proposes a tensor contraction layer and a tensor regression layer, respectively.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an LDPC code deep learning decoding method based on a super network, in which the super network is applied to the NNMS algorithm and its low-complexity versions. The super networks designed for channel neural decoders can be divided into two types: static and dynamic. A static super network can provide learning weights for the main network but cannot flexibly change the number of layers of the decoding network. A dynamic super network, on the other hand, can extend the channel neural decoder to any number of layers without retraining.
The invention adopts the following technical scheme for solving the technical problems:
The LDPC code deep learning decoding method based on the super network is characterized in that a super network is combined with an NNMS network, where the NNMS network is the channel neural decoding main network and the super network generates learning weights for the main network;
the super network consists of a dense neural network, and the inputs of both the super network and the main network are the log-likelihood ratio (LLR) value vector received from the channel.
Further, the tanh function is used as the activation function of the super network, and the super network is trained using stochastic gradient descent (SGD) under the guidance of a cross-entropy loss function.
The cross-entropy loss function is defined as:

$$L=-\frac{1}{N}\sum_{i=1}^{N}\left[x_i\log\left(o_i\right)+\left(1-x_i\right)\log\left(1-o_i\right)\right]$$

where $o_i$ is the $i$-th bit estimated by the NNMS network, $x_i$ is the $i$-th bit of the transmitted codeword, and $N$ is the number of bits of the transmitted codeword.
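As a small illustration of this loss, the following self-contained PyTorch snippet computes the same quantity on placeholder tensors; the variable names and the sigmoid mapping from output LLRs to bit estimates are assumptions, not taken from the patent.

```python
import torch
import torch.nn.functional as F

N = 8
out_llr = torch.randn(N, requires_grad=True)     # stand-in for the NNMS network output
x = torch.randint(0, 2, (N,)).float()            # stand-in for the transmitted codeword bits
o = torch.sigmoid(out_llr)                       # per-bit estimates o_i (sign convention assumed)
loss = F.binary_cross_entropy(o, x)              # -(1/N) * sum_i [x_i log o_i + (1-x_i) log(1-o_i)]
loss.backward()                                  # gradients drive the SGD update of the super network
```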
Further, the super network is a static super network, and the learning weights of the main network are generated based on the following expression:

$$u_{c\to v}^{(t)}=\beta_{c\to v}^{(t)}\cdot\prod_{v'\in M(c)\backslash v}\operatorname{sign}\left(l_{v'\to c}^{(t-1)}\right)\cdot\min_{v'\in M(c)\backslash v}\left|l_{v'\to c}^{(t-1)}\right|,\qquad \boldsymbol{\beta}=f\left(\boldsymbol{l}_{ch};W\right)$$

where $W$ is the learning weight of the static super network $f(:,W)$, $\boldsymbol{l}_{ch}$ is the received LLR vector, $\beta_{c\to v}^{(t)}$ is the learning weight generated by the super network for edge $(c,v)$ at layer $t$, $u_{c\to v}^{(t)}$ is the message transmitted from CN node $c$ to VN node $v$ at time $t$, $l_{v'\to c}^{(t-1)}$ is the message transmitted from VN node $v'$ to CN node $c$ at time $t-1$, $M(c)$ is the set of all VN nodes connected to CN node $c$, and $M(c)\backslash v$ is the set of VN nodes in $M(c)$ other than VN node $v$.
Further, the super network is a dynamic super network, and the learning weights of the main network are generated based on the following expression:

$$u_{c\to v}^{(t)}=\beta_{c\to v}^{(t)}\cdot\prod_{v'\in M(c)\backslash v}\operatorname{sign}\left(l_{v'\to c}^{(t-1)}\right)\cdot\min_{v'\in M(c)\backslash v}\left|l_{v'\to c}^{(t-1)}\right|,\qquad \boldsymbol{\beta}^{(t)}=f^{(t-1)}\left(\boldsymbol{l}^{(t-1)};W_\alpha\right)$$

where $W_\alpha$ is the learning weight of the dynamic super network $f^{(t-1)}(:,W_\alpha)$, $\boldsymbol{l}^{(t-1)}$ is the output of layer $t-1$ of the main network, $u_{c\to v}^{(t)}$ is the message transmitted from CN node $c$ to VN node $v$ at time $t$, $l_{v'\to c}^{(t-1)}$ is the message transmitted from VN node $v'$ to CN node $c$ at time $t-1$, $M(c)$ is the set of all VN nodes connected to CN node $c$, and $M(c)\backslash v$ is the set of VN nodes in $M(c)$ other than VN node $v$.
Further, the dynamic super network consists of T super network units in one-to-one correspondence with the layers of the main network; each super network unit generates learning weights for its corresponding layer of the main network. The input of the first super network unit is the log-likelihood ratio (LLR) value vector received from the channel, and the input of the t-th super network unit is the output of the (t-1)-th layer of the main network, where 1 < t ≤ T and T is the number of layers of the main network.
The invention further provides an LDPC code deep learning decoding method based on a super network, characterized in that a super network is combined with a TR-NNMS+ network, where the TR-NNMS+ network is the channel neural decoding main network and the super network generates learning weights for the main network;
the super network consists of a dense neural network, and the inputs of both the super network and the main network are the log-likelihood ratio (LLR) value vector received from the channel.
Further, the super network is a static super network; the number of network layers of the static super network equals the number of factor tensors of the TR decomposition, each layer of the static super network generates a learning weight tensor for the corresponding factor tensor, and the size of each learning weight tensor of the static super network is the same as the size of the corresponding TR-decomposed factor tensor;
the input of the static super network is the LLR received from the channel or the information tensor output by the previous layer of the main network.
Further, an activation function is added between the learning weight tensors to form an m-layer static super network.
Further, the super network is a dynamic super network composed of T super network units in one-to-one correspondence with the layers of the main network; each super network unit generates learning weights for its corresponding layer of the main network. The input of the first super network unit is the log-likelihood ratio (LLR) value vector received from the channel, and the input of the t-th super network unit is the output of the (t-1)-th layer of the main network, where 1 < t ≤ T and T is the number of layers of the main network.
Compared with the prior art, the technical scheme provided by the invention has the following technical effects:
The invention provides a novel channel neural decoder framework based on a super network, which adaptively generates learning weights and adjusts the number of network layers according to the application environment. The addition of a super network, however, results in higher computational complexity and memory requirements. To reduce the complexity of the proposed framework, the memory space for the learning weights is multiplexed between the super network and the channel neural decoder (a scheme referred to as TRHyper), based on tensor-ring (TR) decomposition and the structural similarity between TR decomposition and the super network. By sharing storage space, TRHyper greatly reduces the storage space and computational complexity of the super-network-based algorithm. Numerical results show that, compared with existing channel neural decoding algorithms, the proposed super-network-based decoding algorithms achieve better performance on LDPC, BCH and Hamming codes. Furthermore, the proposed framework provides flexibility in both the learning weights and the network architecture.
Drawings
FIG. 1 is an NNMS-based static super network;
FIG. 2 is a dynamic super network based on NNMS and TR-NNMS+;
FIG. 3 is a schematic diagram of TRHyper-NNMS+ with m equal to 4.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The technical scheme of the invention is further described in detail below with reference to the accompanying drawings:
The channel neural decoding algorithm outperforms traditional decoding algorithms and has been widely studied in recent years. However, current learning weights for channel neural decoding are trained for a specific channel environment, which makes it difficult to adapt to variations in the channel environment. Once the channel neural decoding algorithm is deployed, neither the learning weights nor the number of network layers can be changed.
The invention provides a novel channel neural decoder framework based on a super network, which adaptively generates learning weights and adjusts the number of network layers according to the application environment. However, the addition of a super network results in higher computational complexity and storage requirements.
To reduce the complexity of the proposed framework, a low-complexity super network technique (called TRHyper) is proposed based on tensor-ring (TR) decomposition and on the similarity, in mathematical principle and structure, between TR decomposition and the super network. The storage space of the learning weights of the TR-NNMS+ algorithm proposed in [22] is multiplexed to store the learning weights of the super network, so that the learning weights of the super network do not occupy additional storage resources. The super network in the TRHyper solution generates optimal learning weights for the TR-NNMS+ algorithm, saving the training cost otherwise required to go from NNMS+ to TR-NNMS+.
1. NNMS decoding algorithm based on super network
The NNMS algorithm has two limitations that we aim to address. First, the learning weights of channel neural decoding cannot be updated automatically as the noise level changes, so stability and optimality cannot be guaranteed and frequent retraining is required. Second, the number of layers of channel neural decoding cannot be changed adaptively according to the requirements of different application scenarios. To overcome these problems, we combine the super network with channel neural decoding and propose the following super-network-based channel neural decoding algorithm, in which the super network generates learning weights for the main network.
The super network solution consists of two neural networks: one is the channel neural decoding network, which is the main network, and the other is the super network, which generates learning weights for the main network. The super network consists of a dense neural network of L layers, each layer having 32 neurons. We use the tanh function as the activation function of the super network. The input to the super network is the same as the input to the main network, i.e., the received log-likelihood ratio (LLR) value vector. The super network is trained using stochastic gradient descent (SGD) under the guidance of a cross-entropy loss function. Depending on how information is exchanged between the super network and the main network, super networks can be classified into two types: static and dynamic.
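A minimal PyTorch sketch of such a super network is given below: an L-layer dense network with 32 neurons per layer and tanh activations that maps the received LLR vector to the learning weights of the main network, trained with SGD. The class name, default layer count and output packing are illustrative assumptions rather than the patent's reference implementation.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Dense super network: received LLR vector in, main-network learning weights out."""
    def __init__(self, n_llr, n_weights, n_layers=3, width=32):
        super().__init__()
        layers, dim = [], n_llr
        for _ in range(n_layers):
            layers += [nn.Linear(dim, width), nn.Tanh()]   # tanh activation after each dense layer
            dim = width
        layers.append(nn.Linear(dim, n_weights))           # one output per learning weight
        self.net = nn.Sequential(*layers)

    def forward(self, llr):
        return self.net(llr)

# Trained with SGD under the cross-entropy loss on the decoded bits (see above).
hyper = HyperNet(n_llr=7, n_weights=3 * 7 * 5)   # e.g. one (M x N) weight matrix for each of 5 layers
optimizer = torch.optim.SGD(hyper.parameters(), lr=1e-2)
```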
(One), static HYPERNNMS (S-HYPERNNMS)
Let the super network be represented by a parameterized function $f(:,W)$, where $W$ is the learning weight of the super network. The learning weights for the NNMS algorithm are generated by the static super network and can be expressed as:

$$\boldsymbol{\beta}=f\left(\boldsymbol{l}_{ch};W\right)$$

where $W$ is the learning weight of the static super network $f(:,W)$, $\boldsymbol{l}_{ch}$ is the received LLR vector, and $\boldsymbol{\beta}$ denotes all the learning weights required by the main NNMS network.
As shown in FIG. 1, the static super network generates all the learning weights $\boldsymbol{\beta}$ required by the main NNMS network, which eliminates the need to retrain the NNMS main network when the channel environment changes.
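A sketch of this static data flow, assuming the nnms_layer and HyperNet sketches given earlier (the weight packing and hard-decision convention are assumptions):

```python
import torch

def s_hyper_nnms_decode(H, llr, hyper, nnms_layer, n_layers):
    """Static scheme: one forward pass of the super network yields ALL
    per-layer weights; the unfolded NNMS main network then runs with them."""
    m, n = H.shape
    beta_all = hyper(llr).reshape(n_layers, m, n)   # weights for every layer at once
    v2c = llr.unsqueeze(0) * H                      # initial variable-to-check messages
    soft = llr
    for t in range(n_layers):
        v2c, soft = nnms_layer(H, llr, v2c, beta_all[t])
    return (soft < 0).int()                         # hard decision (bit 0 <-> positive LLR assumed)
```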
(II), dynamic HYPERNNMS (D-HYPERNNMS)
The static super network solution can generate learning weights for the main network, avoiding retraining of the main network when the channel environment or noise level changes. However, the number of learning weight vectors output by the super network is fixed: once the number of layers of the main network changes, the output dimension of the super network must change, which requires retraining the super network. The invention therefore further proposes a dynamic super network to solve this problem. The dynamic super network still uses dense neural networks, for compatibility with the static super network. In the dynamic solution, however, the super network generates learning weights for the main network layer by layer, so the weights can vary online from layer to layer.
The learning weights in the NNMS algorithm are generated by the dynamic super network D-HYPERNNMS and can be expressed as:

$$\boldsymbol{\beta}^{(t)}=f^{(t-1)}\left(\boldsymbol{l}^{(t-1)};W_\alpha\right)$$

where $W_\alpha$ is the learning weight of the dynamic super network $f^{(t-1)}(:,W_\alpha)$, $\boldsymbol{l}^{(t-1)}$ is the output of layer $t-1$ of the main network, and $\boldsymbol{\beta}^{(t)}$ is the learning weight provided to layer $t$ of the main network.
Unlike the S-HYPERNNMS algorithm, in the D-HYPERNNMS algorithm the super network exchanges information with the NNMS network at every layer. As shown in FIG. 2, the super network dynamically provides learning weights to each layer of the NNMS network, and the NNMS network provides the information required for the computation to the super network. Each G is a super network unit consisting of a dense neural network. Unlike the S-HYPERNNMS algorithm, the super network units in the D-HYPERNNMS algorithm share learning weights across layers. Therefore, when the number of NNMS network layers changes, no retraining is required.
The invention proposes static and dynamic HYPERNNMS algorithms based on NNMS. The static HYPERNNMS algorithm consists of a super network and the unfolded NNMS algorithm; the super network generates learning weights for the NNMS algorithm, which improves the adaptability of the NNMS algorithm under different noise levels and avoids the repeated training caused by changes in the channel environment. The dynamic HYPERNNMS algorithm, in which learning weights and messages are exchanged between the super network and every layer of the main network, retains the advantages of the static HYPERNNMS and, in addition, can be unfolded to an arbitrary number of layers without retraining.
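A corresponding sketch of the dynamic data flow, again assuming the nnms_layer sketch above; the single weight-shared unit g_unit stands in for the G blocks of FIG. 2, and its interface is an assumption:

```python
import torch

def d_hyper_nnms_decode(H, llr, g_unit, nnms_layer, n_layers):
    """Dynamic scheme: one weight-shared super-network unit G generates the
    weights of layer t from the output of layer t-1, so n_layers can be
    changed at run time without retraining either network."""
    m, n = H.shape
    v2c = llr.unsqueeze(0) * H
    soft = llr                               # the first G unit sees the channel LLR vector
    for _ in range(n_layers):
        beta_t = g_unit(soft).reshape(m, n)  # weights generated online for this layer
        v2c, soft = nnms_layer(H, llr, v2c, beta_t)
    return (soft < 0).int()
```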
2. The TRHyper scheme
The static and dynamic super networks solve the problem that the channel neural decoding algorithm must be retrained frequently when the noise level or the number of network layers changes. However, the hardware implementation of a super network requires a large amount of memory storage and computing resources, which is a significant challenge for communication hardware. This is especially true for the TR-NNMS+ network: because of its special learning weight structure, several dense networks are needed to generate the learning weights simultaneously, which requires even more memory storage and computational resources while the performance still cannot meet the requirements. To address these challenges, we propose a low-complexity super network technique based on TR decomposition for channel neural decoding, referred to as TRHyper.
(One), super network based on TR decomposition
Let a tensor W be decomposed by TR into m factor tensors W1, W2, ..., Wm. We use these m factor tensors as the learning weights of a neural network, add an activation function f between every two tensors, and finally give an input X to the neural network. We can then build an m-layer dense network as follows:

$$Y=f\left(f\left(\cdots f\left(f\left(XW_1\right)W_2\right)\cdots\right)W_m\right)$$
the above process sets up a bridge between the m tensors and the m-layer neural network. After processing, the m tensors can be used as learning weights of the neural network.
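The relationship described above can be sketched as follows; the factor tensors are flattened to matrices purely for illustration (actual TR cores are three-way tensors), and all sizes are hypothetical.

```python
import torch

def factors_to_network(x, factors, act=torch.tanh):
    """Builds Y = f(...f(f(X*W1)*W2)...*Wm) from the m TR factor tensors:
    one dense layer per factor, with an activation after each product."""
    y = x
    for w in factors:
        y = act(y @ w)
    return y

# m = 3 hypothetical factors reused as the weights of a 3-layer dense network.
factors = [torch.randn(16, 8), torch.randn(8, 8), torch.randn(8, 4)]
y = factors_to_network(torch.randn(2, 16), factors)   # y has shape (2, 4)
```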
The super network is composed of a neural network with multiple dense layers, so an m-layer super network can be built from the m factor tensors obtained after TR decomposition of the learning weights, using the above-mentioned relationship between tensors and neural networks. At the same time, the similarity between them is used to reuse storage space and weights, thereby reducing the memory storage and computing resources required by the super network.
(II), the TRHyper scheme
TRHyper exploits the structural similarity between TR decomposition and a dense neural network so that the storage space required for the learning weights can be reused during training, saving a large amount of storage resources. The inverse computation of the TR decomposition that would otherwise be required during training of the TR-NNMS+ algorithm is avoided, saving a large amount of computational resources. In addition, the factor tensors required by TR-NNMS+ no longer need to be trained separately, saving training cost. According to how information is exchanged between the super network and the main network, TRHyper schemes are likewise divided into static and dynamic schemes.
1. Static TRHyper (S-TRHyper)
The S-TRHyper scheme provided by the invention is a super network consisting of m dense neural network layers, and the size of each layer is specially designed so that the storage space can be reused. TR-NNMS+ is obtained by compressing the NNMS+ algorithm with TR decomposition; the learning weights of each layer of the TR-NNMS+ algorithm therefore consist of m factor tensors. During training, the m factor tensors participate in the computation of the layer only after the inverse operation of the TR decomposition. These learning weights need to be trained for a particular channel environment to be optimal. The sizes of the m layers of the super network are designed to match the sizes of the m factor tensors, so the storage space that stores the m factor tensors can be multiplexed to store the learning weights of the m super-network layers, saving the storage resources for the super-network learning weights.
In addition, in the TRHyper scheme proposed by the invention, the forward computation of the super network is designed to perform the inverse computation of the TR decomposition required in TR-NNMS+ training. The output of the super network, as the result of this inverse operation, participates in the computation of the corresponding layer of the TR-NNMS+ network. This design saves the computational resources required for the TR-decomposition inverse computation. The output of the trained super network is the optimal learning weight of the TR-NNMS+ algorithm, so no separate training of the factor-tensor-format learning weights required to optimize the TR-NNMS+ algorithm is needed. Thus, the TRHyper solution also saves significant training cost.
As shown in fig. 3, the S-TRHyper solution can be divided into the following 5 steps:
Step 1: design a compression scheme based on TR decomposition for the learning weights of the NNMS+ decoding algorithm, obtaining the number m of factor tensors and the sizes $\{D_1,\ldots,D_m\}$ of the factor tensors of the TR decomposition scheme. According to m and $\{D_1,\ldots,D_m\}$, design the memory space O to store the TR-decomposed factor tensors (i.e., the learning weights of the TR-NNMS+ algorithm). The number of network layers of the super network equals the number of factor tensors, i.e., m. The learning weight tensor $W_i$ of the super network, $i=1,\ldots,m$, has the same size $D_i$ as the corresponding factor tensor. Thus, the designed memory space O can also be used to store the learning weights of the designed super network.
Step 2: during training, the super network uses the storage space O to store the learning weights $W_i$, $i=1,\ldots,m$. m tanh activation functions are added between the learning weight tensors to form an m-layer super network. The designed super network has no loss function of its own; it shares the cross-entropy loss function with the main network. The input of the super network corresponds to the input of the main network, i.e., the LLR received from the channel or the information tensor output by the previous layer of the main network.
Step 3: as shown in FIG. 3, train the constructed super network together with the main network, updating the learning weights under the guidance of the loss function and the back-propagation mechanism.
Step 4: after training, the optimized TR-NNMS+ algorithm based on TRHyper is obtained. The learning weights of the m-layer super network use the sizes of the m factor tensors, respectively, and the forward computation of the super network can be regarded as the inverse of the TR decomposition. Thus, the output of the super network is the optimal learning weight of the TR-NNMS+ algorithm; that is, no separate training is needed to go from the NNMS+ algorithm to the TR-NNMS+ algorithm, saving a large amount of training cost.
Step 5: if the channel environment changes, repeat steps 3-4.
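A minimal sketch of steps 1-4 above, with the factor tensors again flattened to matrices for readability and all shapes hypothetical; the point is that the i-th super-network layer weight occupies exactly the storage reserved for the i-th TR factor tensor, and the forward pass plays the role of the TR inverse operation.

```python
import torch
import torch.nn as nn

class STRHyper(nn.Module):
    """S-TRHyper sketch: layer i's weight has the shape of TR factor D_i, so the
    storage reserved for the TR-NNMS+ factors doubles as the super-network storage O."""
    def __init__(self, factor_shapes):
        super().__init__()
        self.w = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(*s)) for s in factor_shapes])  # shared storage O

    def forward(self, x):
        # The chained products stand in for the TR inverse operation, so the
        # output is fed directly to the corresponding TR-NNMS+ layer as its weight.
        for w in self.w:
            x = torch.tanh(x @ w)
        return x

hyper = STRHyper([(7, 8), (8, 8), (8, 21)])   # m = 3 hypothetical factor shapes D_1..D_3
beta = hyper(torch.randn(1, 7))               # weights handed to one TR-NNMS+ layer
```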
2. Dynamic TRHyper (D-TRHyper)
Unlike S-TRHyper, D-TRHyper interacts with the main network at each layer. As shown in FIG. 2, each super network unit G consists of a dense neural network, similar to static TRHyper. Each layer of the main network receives the learning weights generated by the super network to complete the computation of the current layer. The output of each layer is not only provided as input to the next layer but is also fed to the super network for the computation of the next iteration. The remaining steps are consistent with the S-TRHyper scheme.
The S-TRHyper and D-TRHyper schemes designed by the invention make full use of the structural similarity between the factor tensors obtained after tensor decomposition and a dense neural network to reuse storage space. The TRHyper scheme saves a large amount of the storage resources required for the learning weights and saves the training cost required to go from the NNMS+ algorithm to the TR-NNMS+ algorithm.
The TR-NNMS+ algorithm has better performance and lower complexity than the NNMS algorithm. Each learning weight tensor of the TR-NNMS+ algorithm is decomposed by TR into several factor tensors, so neither the static nor the dynamic super network scheme of the NNMS algorithm can be applied directly to the TR-NNMS+ algorithm. For the learning weight structure of TR-NNMS+, we use multiple dense neural networks as the super network, each of which generates learning weights for the factor tensors in TR format. Following this principle, corresponding static and dynamic super network schemes are designed for the TR-NNMS+ algorithm.
To reduce the computational complexity and memory resources required by the super network, we propose the TRHyper solution, which exploits the similarity, in mathematical principle and structure, between TR decomposition and the super network. TRHyper multiplexes the learning weight storage space of TR-NNMS+ to store the learning weights of the super network, so the learning weights of the super network require no extra storage space, saving a large amount of storage resources. In addition, the super network in the TRHyper scheme generates the optimal learning weights for the TR-NNMS+ algorithm, so no training from the NNMS+ algorithm to the TR-NNMS+ algorithm is needed, saving a large amount of training cost.
3. Complexity analysis
Assume that the number of learning weights of the dense network used in the super-network-based channel neural decoding algorithm is W, so that the required memory storage is O(W). Table 1 lists the memory storage required by the super networks used in the HYPERNNMS, HyperTR-NNMS+ and TRHyper solutions.
For the S-HYPERNNMS solution, the super network is a dense neural network, the number of learning weights of the super network is W, and the required memory storage is O(W). For the D-HYPERNNMS scheme, the learning weights and the memory storage they occupy are the same as those of S-HYPERNNMS. The S-HyperTR-NNMS+ algorithm requires d dense networks to generate learning weights for the main network, where d is the number of factor tensors resulting from the TR decomposition; therefore, memory storage d·O(W) is required to store the learning weights of the S-HyperTR-NNMS+ super network. For the D-HyperTR-NNMS+ scheme, the learning weights and the memory storage they occupy are the same as those of S-HyperTR-NNMS+. In the S-TRHyper-NNMS+ and D-TRHyper-NNMS+ solutions, no additional storage space is needed to store the learning weights of the super network, because the memory storage of the learning weights of the TR-NNMS+ algorithm is reused.
TABLE 1

Scheme | Memory storage for super-network learning weights
---|---
S-HYPERNNMS | O(W)
D-HYPERNNMS | O(W)
S-HyperTR-NNMS+ | d·O(W)
D-HyperTR-NNMS+ | d·O(W)
S-TRHyper-NNMS+ | 0 (storage of TR-NNMS+ learning weights reused)
D-TRHyper-NNMS+ | 0 (storage of TR-NNMS+ learning weights reused)
In the above, we only counted the memory storage needed for the learning weights of the super network. It should be noted that, in addition to introducing extra learning weights, the added super network also contains activation functions that must be computed and stored. For the S-HYPERNNMS algorithm, only one dense neural network is used as the super network; we assume that this dense network contains L activation functions and that the computational complexity required for the L activation functions during training is O(L). For D-HYPERNNMS, each super network unit consists of a dense neural network containing L activation functions, so D-HYPERNNMS contains T×L activation functions in total and requires a computational complexity of T×O(L), where T is the number of layers of the main NNMS network. The S-HyperTR-NNMS+ algorithm requires d dense networks, where d is the number of factor tensors resulting from the TR decomposition; it therefore requires d×L activation functions and a computational complexity of d×O(L). Similar to the super network in D-HYPERNNMS, the super network in D-HyperTR-NNMS+ requires d×T×L activation functions and a computational complexity of d×T×O(L). Although the TRHyper solution multiplexes the storage space of the learning weights, the computational complexity of the activation functions is not reduced: similar to HYPERNNMS, S-TRHyper requires L activation functions and a computational complexity of O(L), and D-TRHyper requires T×L activation functions and a computational complexity of T×O(L).
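The counts above can be restated in a small, purely illustrative helper; W, L, T and d are the symbols defined in this section.

```python
def super_net_cost(scheme, W, L, T, d):
    """Extra learning-weight memory and activation-function complexity of the
    super network for each scheme, as tabulated above."""
    memory = {"S-HyperNNMS": W, "D-HyperNNMS": W,
              "S-HyperTR-NNMS+": d * W, "D-HyperTR-NNMS+": d * W,
              "S-TRHyper": 0, "D-TRHyper": 0}          # TRHyper reuses TR-NNMS+ storage
    acts = {"S-HyperNNMS": L, "D-HyperNNMS": T * L,
            "S-HyperTR-NNMS+": d * L, "D-HyperTR-NNMS+": d * T * L,
            "S-TRHyper": L, "D-TRHyper": T * L}
    return memory[scheme], acts[scheme]

print(super_net_cost("D-TRHyper", W=1024, L=3, T=10, d=4))   # -> (0, 30)
```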
The foregoing is merely illustrative of embodiments of the present invention, and the scope of the present invention is not limited thereto; modifications and substitutions that can readily be conceived by any person skilled in the art within the technical scope disclosed herein fall within the scope of the present invention, which is defined by the appended claims.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311361702.8A CN118508977B (en) | 2023-10-19 | 2023-10-19 | LDPC code deep learning decoding method based on super network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311361702.8A CN118508977B (en) | 2023-10-19 | 2023-10-19 | LDPC code deep learning decoding method based on super network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118508977A CN118508977A (en) | 2024-08-16 |
CN118508977B true CN118508977B (en) | 2024-12-17 |
Family
ID=92233909
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311361702.8A Active CN118508977B (en) | 2023-10-19 | 2023-10-19 | LDPC code deep learning decoding method based on super network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118508977B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115053228A (en) * | 2020-02-05 | 2022-09-13 | Meta Platforms, Inc. | Hypergraph network decoder for algebraic block code
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10594541B2 (en) * | 2017-09-04 | 2020-03-17 | Comcast Cable Communications, Llc | Remote evaluation of content delivery service |
WO2022027014A1 (en) * | 2020-07-28 | 2022-02-03 | Intel Corporation | Self-organizing network coordination and energy saving assisted by management data analytics |
US12166504B2 (en) * | 2021-09-28 | 2024-12-10 | New Jersey Institute Of Technology | Systems and methods for decoding of graph-based channel codes via reinforcement learning |
-
2023
- 2023-10-19 CN CN202311361702.8A patent/CN118508977B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115053228A (en) * | 2020-02-05 | 2022-09-13 | Meta Platforms, Inc. | Hypergraph network decoder for algebraic block code
Also Published As
Publication number | Publication date |
---|---|
CN118508977A (en) | 2024-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Nachmani et al. | Learning to decode linear codes using deep learning | |
Doan et al. | Neural successive cancellation decoding of polar codes | |
Teng et al. | Low-complexity recurrent neural network-based polar decoder with weight quantization mechanism | |
KR101492595B1 (en) | Memory-efficient ldpc decoding | |
Xiao et al. | Designing finite alphabet iterative decoders of LDPC codes via recurrent quantized neural networks | |
US20050149840A1 (en) | Apparatus for encoding and decoding of low-density parity-check codes, and method thereof | |
CN110932734B (en) | Deep learning channel decoding method based on alternative direction multiplier method | |
Lugosch et al. | Learning from the syndrome | |
CN110113057B (en) | Polarization code decoder utilizing deep learning | |
Salavati et al. | Nonbinary associative memory with exponential pattern retrieval capacity and iterative learning | |
Sun et al. | Deep learning based joint detection and decoding of non-orthogonal multiple access systems | |
CN115664899A (en) | A channel decoding method and system based on graph neural network | |
Jarollahi et al. | Reduced-complexity binary-weight-coded associative memories | |
Huang et al. | Functional error correction for reliable neural networks | |
Deng et al. | Reduced-complexity deep neural network-aided channel code decoder: A case study for BCH decoder | |
Gong et al. | Graph neural networks for enhanced decoding of quantum LDPC codes | |
CN118508977B (en) | LDPC code deep learning decoding method based on super network | |
CN103973317A (en) | Rapid decoding method of multi-element LDPC code | |
Cai et al. | An improved simplified soft cancellation decoding algorithm for polar codes based on frozen bit check | |
CN113872614B (en) | Deep neural network-based Reed-Solomon code decoding method and system | |
Yang et al. | Efficient hardware architecture of deterministic MPA decoder for SCMA | |
Yan et al. | Error Correction Code Transformer: From Non-Unified to Unified | |
CN117220689B (en) | Non-binary LDPC decoding method based on model-driven deep learning | |
Lyu et al. | Optimization and hardware implementation of learning assisted min-sum decoders for polar codes | |
Kakde et al. | FPGA implementation of decoder architectures for high throughput irregular LDPC codes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||