[go: up one dir, main page]

CN110780813B - A Distributed Storage System Based on Subspace Codes on Binary Fields - Google Patents

A Distributed Storage System Based on Subspace Codes on Binary Fields Download PDF

Info

Publication number
CN110780813B
CN110780813B CN201910952539.XA CN201910952539A CN110780813B CN 110780813 B CN110780813 B CN 110780813B CN 201910952539 A CN201910952539 A CN 201910952539A CN 110780813 B CN110780813 B CN 110780813B
Authority
CN
China
Prior art keywords
node
nodes
storage
repair
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910952539.XA
Other languages
Chinese (zh)
Other versions
CN110780813A (en
Inventor
刘宴涛
秦娜
苏秉华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiaying University
Original Assignee
Jiaying University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiaying University filed Critical Jiaying University
Priority to CN201910952539.XA priority Critical patent/CN110780813B/en
Publication of CN110780813A publication Critical patent/CN110780813A/en
Application granted granted Critical
Publication of CN110780813B publication Critical patent/CN110780813B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626Reducing size or complexity of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065Replication mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Error Detection And Correction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a distributed storage system based on subspace codes on a binary domain, which comprises the following steps of writing m original user data symbols into vectors in the form of x= (x) 1 ,…,x m ) T T represents the matrix transpose, performing the following encoding operation on node i:wherein B is i A coding matrix representing node i; step 2: the generated code word y i1 And y i2 Stored on node i; step 3: step 1 and step 2 are performed on all m nodes until all nodes perform the encoding operation to generate and store the codeword. All operations of the invention are based on the binary domain F2, the constructed distributed storage system has simplicity and light weight, and the maximum distance attribute of the subspace storage code designed by the invention ensures that decoding can be finished by depending on the data stored on the least nodes, namely, the decoding bandwidth is reduced.

Description

一种基于二元域上子空间码的分布式存储系统A distributed storage system based on subspace codes over binary fields

技术领域Technical Field

本发明涉及计算机和通信网络技术领域,特别是涉及一种基于二元域上子空间码的分布式存储系统。The present invention relates to the technical field of computers and communication networks, and in particular to a distributed storage system based on subspace codes on a binary domain.

背景技术Background Art

传统的数据中心把客户数据集中存储在一个房间或一栋建筑内的磁盘阵列中,这种存储方式的优点是便于管理,但一个明显的缺点是随着业务量的增加,数据中心负荷急剧增加,瓶颈效应明显;数据中心的另一个缺点是应对突发事件的能力脆弱,一旦出现地震、火灾等灾害,数据中心损毁,用户数据无法恢复,损失巨大。随着互联网技术的飞速发展,数据存储方式逐渐从集中式转变为分布式。与数据中心不同,分布式存储系统把客户数据分散存储在空间上相分离且相距较远的若干个磁盘驱动器或存储节点内,这些磁盘可能分布在一个城市的各个角落,或者位于各个城市,甚至是各个国家。分布式存储最大的特点就是把通信负荷分散到各个存储节点,从而有效解决了数据传输的瓶颈问题。分布式存储已经取代数据中心成为市场上主流的存储技术,商用的分布式存储产品包括Dropbox,微软OneDrive,亚马逊S3,百度云盘等。Traditional data centers store customer data in a disk array in a room or a building. The advantage of this storage method is that it is easy to manage, but an obvious disadvantage is that as the business volume increases, the data center load increases sharply, and the bottleneck effect is obvious; another disadvantage of the data center is that the ability to respond to emergencies is fragile. Once disasters such as earthquakes and fires occur, the data center is damaged and user data cannot be restored, resulting in huge losses. With the rapid development of Internet technology, data storage methods have gradually changed from centralized to distributed. Unlike data centers, distributed storage systems store customer data in a number of disk drives or storage nodes that are separated and far apart in space. These disks may be distributed in various corners of a city, or located in various cities, or even in various countries. The biggest feature of distributed storage is that the communication load is distributed to various storage nodes, thereby effectively solving the bottleneck problem of data transmission. Distributed storage has replaced data centers as the mainstream storage technology in the market. Commercial distributed storage products include Dropbox, Microsoft OneDrive, Amazon S3, Baidu Cloud Disk, etc.

对于一个数据存储系统而言,客户数据永远是第一位的,系统最重要的任务和目标就是保护并为客户提供其所需要的数据,既不能出现数据错误,又不能出现数据丢失。针对这一目标,对分布式存储系统的一个基本的功能要求是客户原始数据的可恢复性,即通过存储在分布式磁盘驱动器上的若干存储数据必须能够恢复客户的原始数据;另一个功能要求是存储数据的可修复性,即当一部分存储节点损坏,导致其上存储的数据丢失,必须能够根据剩余依然存活的存储节点上的数据修复或重建这部分丢失的存储数据。For a data storage system, customer data is always the first priority. The most important task and goal of the system is to protect and provide customers with the data they need, without data errors or data loss. To achieve this goal, a basic functional requirement for distributed storage systems is the recoverability of customer original data, that is, the customer's original data must be recoverable through the storage data stored on the distributed disk drives; another functional requirement is the repairability of storage data, that is, when a part of the storage nodes is damaged, resulting in the loss of the data stored on them, it must be possible to repair or reconstruct the lost storage data based on the data on the remaining surviving storage nodes.

客户原始数据的可恢复性和系统存储数据的可修复性是两个既相互区别又相互联系的功能要求。实现这两个功能的一个最简单的方法是复制存储,就是把客户原始数据原封不动地在分布式磁盘驱动器上存储多个拷贝,磁盘上存储的数据就是客户的原始数据,当某些磁盘损坏时,依靠其他有效的磁盘上存储的数据拷贝可得损坏磁盘上丢失的数据,因此复制存储保证了客户原始数据的可恢复性和系统存储数据的可修复性。但是上述单纯依靠复制的存储方式的存储效率太低,以图1为例,图1(a)的复制存储把1个二元源数据字符d1拷贝n份,分别存储在n个磁盘上,存储效率是1/n;图1(b)采用奇偶校验存储,客户原始数据包括n-1个二元字符d1,…,dn-1,把这n-1个字符通过奇偶校验生成1个校验字符dn,然后把n-1个原始数据字符和这个校验字符分别存储在n个磁盘上,这种存储方式的存储效率是(n-1)/n,当n个磁盘驱动器中的某一个磁盘损坏导致其上存储的数据丢失时,可以依靠剩余的n-1个字符做奇偶校验运算修复该丢失的字符。The recoverability of customer original data and the repairability of system stored data are two functional requirements that are both different and interrelated. The simplest way to achieve these two functions is to replicate the storage, which is to store multiple copies of the customer original data intact on distributed disk drives. The data stored on the disk is the customer's original data. When some disks are damaged, the data lost on the damaged disk can be obtained by relying on the data copies stored on other valid disks. Therefore, replicated storage ensures the recoverability of customer original data and the repairability of system stored data. However, the storage efficiency of the above storage method that relies solely on replication is too low. Taking Figure 1 as an example, the replication storage in Figure 1(a) copies a binary source data character d 1 into n copies and stores them on n disks respectively, with a storage efficiency of 1/n; Figure 1(b) uses parity storage, and the customer's original data includes n-1 binary characters d 1 ,…,d n-1 . These n-1 characters are used to generate a check character d n through parity check, and then the n-1 original data characters and the check character are stored on n disks respectively. The storage efficiency of this storage method is (n-1)/n. When one of the n disk drives is damaged and the data stored on it is lost, the lost character can be repaired by relying on the parity check operation of the remaining n-1 characters.

无论是在理论上还是在工程应用中都已经证明基于编码的数据存储方式的存储效率远远高于基于复制的存储方式。所谓基于编码的分布式存储是指在磁盘中存储的不是客户的原始数据,而是把原始数据经过某种编码运算后生成的码字数据。It has been proven in theory and in engineering applications that the storage efficiency of data storage based on coding is much higher than that based on replication. The so-called distributed storage based on coding means that what is stored on the disk is not the original data of the customer, but the codeword data generated by the original data after some coding operation.

现有技术中的基于编码的分布式存储系统中,由图2所示,首先把客户的原始数据d1,…,dk送入编码器,编码器的输出为码字数据c1,…,cn,假设每个原始数据或码字数据都由某个有限域Fq上的α个字符组成;其次,把码字数据c1,…,cn分布式地存储在n个磁盘中。当用户想要得到原始数据d1,…,dk时,可以从这n个磁盘中读取一部分节点上存储的码字数据并送入译码器,经译码得到原始数据,用于保证用户原始数据的可恢复性;如果系统中某个磁盘损坏导致其上存储的码字数据丢失,比如cj,可以向某些依然存活的磁盘请求一部分码字数据修复和重建cj,用于保证存储数据的可修复性。进一步,可修复性也可以扩展到多个磁盘节点,也就是同时修复多个失效的磁盘节点。译码器为了译码恢复用户的原始数据,需要从存储系统中读取一部分磁盘存储的码字数据,称这个译码过程中需要下载的数据量为译码带宽或下载带宽。另外,为了修复某个存储节点损坏造成的存储数据丢失,需要从剩余磁盘中读取一部分码字数据加以修复,称这个修复过程中需要下载的数据量为修复带宽。译码带宽和修复带宽是分布式编码存储系统的两个性能评价指标参数,不同的编码存储方法或系统具有不同的译码带宽和修复带宽,一个高效的分布式编码存储系统应该使用比较小的译码带宽和修复带宽。In the prior art, in the distributed storage system based on encoding, as shown in FIG2 , firstly, the original data d 1 ,…,d k of the customer is sent to the encoder, and the output of the encoder is the codeword data c 1 ,…, cn , assuming that each original data or codeword data is composed of α characters on a finite field F q ; secondly, the codeword data c 1 ,…, cn is distributedly stored in n disks. When the user wants to obtain the original data d 1 ,…,d k , the codeword data stored on a part of the nodes can be read from the n disks and sent to the decoder, and the original data can be obtained after decoding to ensure the recoverability of the user's original data; if a disk in the system is damaged and the codeword data stored thereon is lost, such as c j , a part of the codeword data can be requested from some surviving disks to repair and rebuild c j to ensure the recoverability of the stored data. Furthermore, the recoverability can also be extended to multiple disk nodes, that is, multiple failed disk nodes can be repaired at the same time. In order to decode and restore the user's original data, the decoder needs to read a part of the codeword data stored on the disk from the storage system. The amount of data that needs to be downloaded during the decoding process is called the decoding bandwidth or download bandwidth. In addition, in order to repair the storage data loss caused by the damage of a storage node, it is necessary to read a part of the codeword data from the remaining disks for repair. The amount of data that needs to be downloaded during the repair process is called the repair bandwidth. Decoding bandwidth and repair bandwidth are two performance evaluation index parameters of distributed coding storage systems. Different coding storage methods or systems have different decoding bandwidths and repair bandwidths. An efficient distributed coding storage system should use relatively small decoding bandwidths and repair bandwidths.

分布式编码存储系统的一个典型例子是(n,k)删除码,用户原始数据为d1,…,dk,经删除码编码后,生成码字数据c1,…,cn存储在n个磁盘中,每个码字包含α个字符。(n,k)删除码依靠n个码字c1,…,cn中任意的k个码字都可以译码恢复原始数据d1,…,dk,因此,(n,k)删除码的译码带宽等于kα个字符。A typical example of a distributed coding storage system is the (n,k) erasure code. The original user data is d 1 ,…,d k . After being encoded by the erasure code, the generated codeword data c 1 ,…, cn is stored in n disks, and each codeword contains α characters. The (n,k) erasure code relies on any k codewords among the n codewords c 1 ,…,cn to recover the original data d 1 ,…,d k . Therefore, the decoding bandwidth of the (n,k) erasure code is equal to kα characters.

分布式编码存储系统的译码过程和修复过程是两个既相互区别又相互联系的过程。一方面,译码过程可以替代修复过程,以(n,k)删除码为例,为了修复系统中某一个或某几个节点失效造成的码字数据丢失,可以从剩余的存活节点中任意选择k个节点,依靠其上存储的k个码字数据通过译码得到原始数据d1,…,dk,再应用原始数据对这些失效节点重新编码生成丢失的码字数据。以下称这种修复方式为“先译码后编码”的修复方式。显然,这种修复过程的译码带宽和修复带宽是相等的,对于(n,k)删除码来说,都等于kα个字符。然而对于单节点失效问题,上述“先译码后编码”的修复过程使用了kα个字符的修复带宽仅仅为了修复单个失效节点上存储的α个字符,效率是很低的。为了提高修复效率,降低修复带宽,Dimakis等研究者基于网络编码技术提出了“再生码”的概念,其关键点在于允许每个节点对其内部存储的码字数据进行运算生成新的数据用于重建损坏的节点。The decoding process and the repair process of the distributed coding storage system are two processes that are both different and related. On the one hand, the decoding process can replace the repair process. Taking the (n, k) erasure code as an example, in order to repair the codeword data loss caused by the failure of one or several nodes in the system, k nodes can be randomly selected from the remaining surviving nodes, and the k codeword data stored on them can be decoded to obtain the original data d 1 ,…,d k , and then the original data is applied to re-encode these failed nodes to generate the lost codeword data. This repair method is hereinafter referred to as the "decode first and then encode" repair method. Obviously, the decoding bandwidth and repair bandwidth of this repair process are equal. For the (n, k) erasure code, they are both equal to kα characters. However, for the single node failure problem, the above "decode first and then encode" repair process uses the repair bandwidth of kα characters just to repair the α characters stored on a single failed node, which is very inefficient. In order to improve the repair efficiency and reduce the repair bandwidth, researchers such as Dimakis proposed the concept of "regeneration code" based on network coding technology. The key point is to allow each node to operate on the codeword data stored internally to generate new data for reconstructing the damaged node.

图3给出了一种基于再生码的分布式存储系统示意图,该码基于有限域Fq,原始数据字符为(x1,x2,y1,y2),经删除码编码后生成8个码字符号,分别存储在4个节点中,每个节点存储2个码字符号,如图3所示。不难验证,从图3中任意2个节点读取4个码字符号通过求解线性方程都可以恢复出原始数据(x1,x2,y1,y2),因此保证了原始数据的可恢复性。对于单节点损坏的修复问题,不失一般性,假设节点2损坏,数据β1=x1+y1和β2=2x2+y2丢失,如果采用“先译码后编码”的修复方式,需要从节点1、3、4中任选两个,读取其中的4个码符号,译码解析出(x1,x2,y1,y2),再对节点2重新编码即可恢复β12,因此这种方式对应的修复带宽为4个码字符号。但如果使用图3所示的网络编码的方法,把节点1、3、4中各自存储的两个符号做编码运算生成如图3所示的3个新的符号a,b,c,这3个符号经过线性代数运算可以构造两个线性无关的方程组如下:Figure 3 shows a schematic diagram of a distributed storage system based on a regenerating code. The code is based on a finite field Fq . The original data characters are ( x1 , x2 , y1 , y2 ). After being encoded by an erasure code, 8 code characters are generated and stored in 4 nodes respectively. Each node stores 2 code characters, as shown in Figure 3. It is not difficult to verify that the original data ( x1 , x2 , y1 , y2 ) can be restored by solving the linear equation by reading 4 code characters from any 2 nodes in Figure 3, thus ensuring the recoverability of the original data. For the repair problem of single node damage, without loss of generality, assume that node 2 is damaged, data β 1 =x 1 +y 1 and β 2 =2x 2 +y 2 are lost. If the "decode first and then encode" repair method is adopted, it is necessary to select any two nodes 1, 3, and 4, read the four code symbols, decode and parse (x 1 , x 2 , y 1 , y 2 ), and then re-encode node 2 to restore β 1 , β 2. Therefore, the corresponding repair bandwidth of this method is 4 code symbols. However, if the network coding method shown in Figure 3 is used, the two symbols stored in nodes 1, 3, and 4 are encoded to generate three new symbols a, b, and c as shown in Figure 3. These three symbols can be used to construct two linearly independent equations through linear algebra operations as follows:

a+b=β12 a+b=β 12

b+c=2β12 b+c=2β 12

求解该方程组可得β12,因此这种修复方式对应的修复带宽等于3个符号,小于“先译码后编码”的修复带宽。Solving the equations yields β 12 . Therefore, the repair bandwidth corresponding to this repair method is equal to 3 symbols, which is smaller than the repair bandwidth of "decoding first and encoding later".

与删除码类似,子空间码也可以用于分布式存储。Hollmann提出了子空间存储码的理论框架,码的参数组为(m,n,α,k,r,β)。具体的,子空间码U由有限域Fq上的一个m维矢量空间的n个α维的子空间U1,U2,…,Un构成,称U为消息空间,称子空间U1,U2,…,Un为存储空间,称α为存储容量。子空间码U需要满足如下译码要求和修复要求:Similar to erasure codes, subspace codes can also be used for distributed storage. Hollmann proposed a theoretical framework for subspace storage codes, where the code parameter set is (m, n, α, k, r, β). Specifically, the subspace code U consists of an m-dimensional vector space over a finite field F q The code U is composed of n α-dimensional subspaces U 1 ,U 2 ,…,U n , where U is called the message space, subspaces U 1 ,U 2 ,…,U n are called storage spaces, and α is called storage capacity. The subspace code U needs to meet the following decoding and repair requirements:

译码要求:在U1,U2,…,Un中任意选择k个子空间构成集合K,如果K的线性张成都等于消息空间U,则称K为满足译码要求的译码集,满足译码要求的最小的译码集的大小k被称为子空间码U的译码维度,对应的译码带宽等于kα。Decoding requirements: Randomly select k subspaces from U 1 ,U 2 ,…,U n to form a set K. If the linear span of K is equal to the message space U, then K is called a decoding set that meets the decoding requirements. The size k of the smallest decoding set that meets the decoding requirements is called the decoding dimension of the subspace code U, and the corresponding decoding bandwidth is equal to kα.

修复要求:对于任意一个子空间Ui,都能从剩余的n-1个子空间{U1,U2,…,Un}/Ui中找到r个子空间构成的集合Ri={Ui1,Ui2,…,Uir},使得其中每个Uij都存在一个β(β<α)维子空间满足这r个β维子空间的线性张成包含Ui,即称Ri为子空间Ui的修复集,称{Vi1,Vi2,…,Vir}为修复空间,对应的修复带宽等于rβ。Repair requirement: For any subspace U i , a set R i = {U i1 ,U i2 ,…,U ir } consisting of r subspaces can be found from the remaining n-1 subspaces {U 1 ,U 2 ,…,U n }/U i , so that each U ij has a β (β<α)-dimensional subspace The linear span of these r β-dimensional subspaces contains U i , that is, R i is called the repair set of subspace U i , {V i1 ,V i2 ,…,V ir } is called the repair space, and the corresponding repair bandwidth is equal to rβ.

更为直观的,参数为(m,n,α,k,r,β)的子空间存储码也可以描述如下:用户原始数据包括有限域Fq上的m个符号,经编码后得到的码字数据存储在n个节点上,每个节点存储α个符号,依靠任意的k个节点上存储的码字符号都可以译码得到原始数据,对于单节点失效问题,可以向r个有效节点中的每一个请求β个符号修复该失效节点,如图2所示。More intuitively, the subspace storage code with parameters (m, n, α, k, r, β) can also be described as follows: the user's original data includes m symbols on the finite field Fq , and the codeword data obtained after encoding is stored on n nodes. Each node stores α symbols, and the original data can be decoded by relying on the codeword symbols stored on any k nodes. For the single node failure problem, β symbols can be requested from each of the r valid nodes to repair the failed node, as shown in Figure 2.

Hollmann只是给出了子空间存储码的理论框架,并没有给出满足译码要求和修复要求的子空间存储码的显式构造方法。因此目前急需一种简单且高效的编码方法解决上述问题。Hollmann only gave a theoretical framework for subspace storage codes, but did not give an explicit construction method for subspace storage codes that meet the decoding and repair requirements. Therefore, a simple and efficient coding method is urgently needed to solve the above problems.

发明内容Summary of the invention

本发明的目的是提供一种基于二元域上子空间码的分布式存储系统,以解决上述现有技术存在的问题,使编码过程简单高效。The purpose of the present invention is to provide a distributed storage system based on subspace codes on a binary field to solve the problems existing in the above-mentioned prior art and make the encoding process simple and efficient.

为实现上述目的,本发明提供了如下方案:一种基于二元域上子空间码的分布式存储系统,二元域F2上的子空间存储码的参数为其中m等于用户原始数据的符号数,同时也等于存储节点的个数,α取2,表示每个节点只存储2个码数据符号,依靠任意的个节点上存储的码数据符号都能译码得到原始数据符号,编码的过程如下:To achieve the above object, the present invention provides the following solution: a distributed storage system based on subspace codes on a binary field, the parameters of the subspace storage code on the binary field F2 are Where m is equal to the number of symbols of the user's original data, and also equal to the number of storage nodes. α is 2, which means that each node only stores 2 code data symbols. The coded data symbols stored on each node can be decoded to obtain the original data symbols. The encoding process is as follows:

步骤一、设m个原始的用户数据符号为消息矢量x=(x1,…,xm)T,消息空间为从消息空间U的全部2维子空间中选择m个子空间作为存储空间,经编码后得到的码字数据符号存储在m个节点上,每个节点i存储2个码字数据符号,节点i的编码矩阵Bi的公式如下:Step 1: Assume that m original user data symbols are message vectors x = (x 1 , ..., x m ) T , and the message space is Select m subspaces from all 2-dimensional subspaces of the message space U as storage spaces. The encoded codeword data symbols are stored on m nodes. Each node i stores 2 codeword data symbols. The formula of the encoding matrix Bi of node i is as follows:

其中,Bi为(2×m)维的矩阵;bi,1为节点i的编码矩阵Bi的第一行的行向量,bi,2为节点i的编码矩阵Bi的第二行的行向量,m等于用户原始数据的字符数,同时也等于存储节点的个数,m取任意的大于等于3的正整数;Wherein, Bi is a (2×m)-dimensional matrix; bi ,1 is the row vector of the first row of the encoding matrix Bi of node i , bi ,2 is the row vector of the second row of the encoding matrix Bi of node i, m is equal to the number of characters of the user's original data, and is also equal to the number of storage nodes, and m is any positive integer greater than or equal to 3;

当m=3时,b1,1=(1 0 0),b1,2=(0 1 0),得出:When m=3, b1,1 =(1 0 0), b1,2 =(0 1 0), we get:

当m=2i或m=2i+1,i≥2时,第1个存储节点的编码矩阵B1的第一行的行向量b1,1的公式为:When m=2i or m=2i+1, i≥2, the formula of the row vector b1,1 of the first row of the encoding matrix B1 of the first storage node is:

b1,1=(1 … 1 0 0 0 … 0),其中包含i-1个1,m-(i-1)个0;b 1,1 = (1 … 1 0 0 0 … 0), which contains i-1 1s and m-(i-1) 0s;

第二行的行向量b1,2的公式为:The formula for the row vector b1,2 of the second row is:

b1,2=(0 1 … 1 0 … 0 0),其中包含i个1,m-i个0,得出:b 1, 2 = (0 1 ... 1 0 ... 0 0), which contains i 1s and mi 0s, so:

剩余m-1个编码矩阵的两个行向量分别是B1对应的两个行向量的循环右移;The two row vectors of the remaining m-1 encoding matrices are the cyclic right shifts of the two row vectors corresponding to B 1 ;

对节点i进行编码执行以下公式:The following formula is used to encode node i:

其中,yi是一个2维码字列矢量,存储在第i个节点上;yi1,yi2均为基于有限域Fq上的码字数据符号;Wherein, yi is a 2D codeword column vector stored on the i-th node; yi1 and yi2 are both codeword data symbols based on the finite field Fq ;

步骤二:把生成的码字数据符号yi1和yi2存储在节点i上;Step 2: Store the generated codeword data symbols y i1 and y i2 on node i;

步骤三、对全部m个节点执行步骤一和步骤二,直到全部节点都执行编码运算并生成存储码字;Step 3: Execute steps 1 and 2 on all m nodes until all nodes have performed the encoding operation and generated storage codewords;

译码的具体方法如下:The specific method of decoding is as follows:

假设任意选取的个节点的序号为j1,…,jk,这k个节点各自的编码矩阵分别为Bj1,…,Bjk,这k个节点存储的码字矢量分别记为yj1,…,yjk,每个码字矢量都由2个有限域Fq上的二元码字数据符号构成,那么执行如下公式:Assuming that any chosen The serial numbers of the nodes are j 1 ,…,j k , the encoding matrices of the k nodes are B j1 ,…,B jk , and the codeword vectors stored by the k nodes are y j1 ,…,y jk . Each codeword vector is composed of two binary codeword data symbols on two finite fields F q . Then the following formula is executed:

由于编码矩阵B的构造方法能够保证系数矩阵B=(Bj1,…,Bjk)T列满秩,因此能够保证应用本式解析出原始数据符号x=(x1,…,xm)TSince the construction method of the encoding matrix B can ensure that the coefficient matrix B = (B j1 , ..., B jk ) T has full column rank, it can be ensured that the original data symbol x = (x 1 , ..., x m ) T can be parsed using this formula.

优选地,对于单节点失效问题,向r个有效节点中的每一个请求13个符号修复失效节点;修复方法的选择方式和指标参数如下:假设存在m个原始的用户数据符号,当m≥6,该存储系统采用先译码后编码的修复方式,β=2,修复带宽等于当m=3时,r=2,β=1,对应的修复带宽等于2;当m=4或m=5时,r=3,β=1,对应的修复带宽等于3,因此,当3≤m≤5时,修复带宽小于先译码后编码的修复带宽。Preferably, for the single node failure problem, 13 symbols are requested from each of the r valid nodes to repair the failed node; the selection method and indicator parameters of the repair method are as follows: Assuming that there are m original user data symbols, when m≥6, the storage system adopts a repair method of decoding first and then encoding, β=2, the repair bandwidth is equal to When m=3, r=2, β=1, and the corresponding repair bandwidth is equal to 2; when m=4 or m=5, r=3, β=1, and the corresponding repair bandwidth is equal to 3. Therefore, when 3≤m≤5, the repair bandwidth is smaller than the repair bandwidth of decoding first and then encoding.

本发明公开了以下技术效果:The present invention discloses the following technical effects:

1.应用本发明构造的分布式存储系统具有简单性和轻量性,所有运算都基于二元域F2。编码过程执行的是F2上的矩阵乘法,译码过程执行的F2上的线性方程组的高斯消去法求解,数学运算简单,占用内存少。1. The distributed storage system constructed by the present invention is simple and lightweight, and all operations are based on the binary field F2. The encoding process performs matrix multiplication on F2, and the decoding process performs Gaussian elimination of linear equations on F2. The mathematical operation is simple and occupies less memory.

2.通过限制参数α=2,即每个节点只存储2个符号,进一步降低了编译码的复杂度。2. By limiting the parameter α=2, that is, each node only stores 2 symbols, the complexity of encoding and decoding is further reduced.

3.应用本发明构造的参数为的子空间码存储系统,由于消息空间是m维空间为了满足译码要求,即要求依靠任意的k个子空间的线性张成都等于消息空间U,则k的下限值必须等于低于该下限值不可能完成译码。理论上可以证明,本发明在表1中给出的编码矩阵中0和1的特殊的排列方式保证了使用任意的个节点上的存储数据都可以实现译码,即依据了编码矩阵满足最大距离准则,即各个存储节点存储的子空间之间具有最大距离,保证了可以依靠最少的节点上存储的数据完成译码。因此达到了译码带宽的下限值,即具有最小译码带宽。3. The parameters of the invention are: Subspace code storage system, since the message space is m-dimensional space In order to meet the decoding requirements, that is, to rely on the linear expansion of any k subspaces to be equal to the message space U, the lower limit of k must be equal to Theoretically, it can be proved that the special arrangement of 0 and 1 in the coding matrix given in Table 1 of the present invention ensures that the decoding can be completed with any The data stored on each node can be decoded, that is, the coding matrix satisfies the maximum distance criterion, that is, the subspaces stored on each storage node have the maximum distance, which ensures that decoding can be completed with the data stored on the least nodes. Therefore, the lower limit of the decoding bandwidth is reached, that is, the minimum decoding bandwidth is achieved.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for use in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For ordinary technicians in this field, other drawings can be obtained based on these drawings without paying creative work.

图1为复制存储和编码存储两种存储方式的比较示意图;FIG1 is a schematic diagram showing a comparison between two storage methods: copy storage and coded storage;

图2为基于编码的分布式存储系统原理示意图;FIG2 is a schematic diagram of the principle of a distributed storage system based on coding;

图3为一种基于再生码的存储系统示例;FIG3 is an example of a storage system based on regeneration codes;

图4为本发明的存储示意图;FIG4 is a schematic diagram of storage of the present invention;

图5为采用本方法得到的基于二元域上m=3的子空间码存储系统示意图;FIG5 is a schematic diagram of a subspace code storage system based on m=3 on a binary domain obtained by using the present method;

图6为采用本方法得到的基于二元域上m=4的子空间码存储系统示意图;FIG6 is a schematic diagram of a subspace code storage system based on m=4 on a binary domain obtained by using this method;

图7为采用本方法得到的基于二元域上m=5的子空间码存储系统示意图;FIG7 is a schematic diagram of a subspace code storage system based on m=5 on a binary domain obtained by using the present method;

图8为采用本方法得到的基于二元域上m=6的子空间码存储系统示意图。FIG8 is a schematic diagram of a subspace code storage system based on m=6 on a binary field obtained by using this method.

具体实施方式DETAILED DESCRIPTION

下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following will be combined with the drawings in the embodiments of the present invention to clearly and completely describe the technical solutions in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of the present invention.

为使本发明的上述目的、特征和优点能够更加明显易懂,下面结合附图和具体实施方式对本发明作进一步详细的说明。In order to make the above-mentioned objects, features and advantages of the present invention more obvious and easy to understand, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments.

参照图4,本发明提供一种基于二元域上子空间码的分布式存储系统,应用本方法生成的参数为的子空间码的码字在存储节点上的存储情况。二元域F2上的子空间存储码的参数为其中m等于用户原始数据的字符数,同时也等于存储节点的个数,α取2,表示有2个存储符号,表示第个节点,r为有效节点的个数,β为有效节点中的每一个请求符号个数,用户原始数据包括有限域Fq上的m个符号,经编码后得到的码字数据存储在m个节点上,每个节点i存储2个符号,依靠任意的个节点上存储的码字符号都能够译码得到原始数据。4, the present invention provides a distributed storage system based on subspace codes on a binary domain, and the parameters generated by the method are The storage of the codewords of the subspace code on the storage nodes. The parameters of the subspace storage code on the binary field F 2 are Where m is equal to the number of characters in the user's original data, and is also equal to the number of storage nodes. α is 2, indicating that there are 2 storage symbols. Indicates nodes, r is the number of valid nodes, β is the number of each request symbol in the valid node, the user's original data includes m symbols on the finite field Fq , and the codeword data obtained after encoding is stored in m nodes, and each node i stores 2 symbols. The codewords stored on each node can be decoded to get the original data.

编码过程如下:编码器的输入是作为用户原始数据的m个二元符号x=(x1,…,xm)T。系统一共包括m个存储节点,记为节点1至节点m,每个节点对应着一个(2×m)维的编码矩阵,节点i对应的编码矩阵记为Bi子空间码的编码矩阵如表1所示。The encoding process is as follows: The input of the encoder is m binary symbols x = (x 1 , ..., x m ) T as the user's original data. The system includes a total of m storage nodes, denoted as node 1 to node m, each node corresponds to a (2×m)-dimensional encoding matrix, and the encoding matrix corresponding to node i is denoted as Bi , The encoding matrix of the subspace code is shown in Table 1.

表1Table 1

(1)m=3(1) m = 3

(2)m=4(2) m = 4

(3)m=5(3) m = 5

(4)m=6(4) m = 6

(5)m=7(5) m = 7

(6)m=8(6) m = 8

(7)m=9(7) m = 9

以此类推,当m=2i或m=2i+1,(i≥2)时,编码矩阵B1的两个行向量应该分别是:By analogy, when m=2i or m=2i+1, (i≥2), the two row vectors of the encoding matrix B1 should be:

第一行的行向量b1,1=(1 … 1 0 0 0 … 0),其中包含i-1个1,m-(i-1)个0;The first row vector b 1,1 = (1 … 1 0 0 0 … 0) contains i-1 1s and m-(i-1) 0s.

第二行的行向量b1,2=(0 1 … 1 0 0 … 0),其中包含i个1,m-i个0。The row vector b 1,2 =(0 1 … 1 0 0 … 0) of the second row contains i 1s and mi 0s.

剩余m-1个编码矩阵的两个行向量分别是B1行向量的循环右移。The two row vectors of the remaining m-1 encoding matrices are respectively the cyclic right shifts of the B1 row vector.

节点i执行的编码公式如下:The encoding formula executed by node i is as follows:

Bi·x=yi=(yi1,yi2)T (1)B i ·x=y i =(y i1 , y i2 ) T (1)

节点i编码生成的码字是2个二元码符号yi=(yi1,yi2)T,把这2个二元码符号存储在节点i上。全部m个节点一共编码生成2m个码符号,如图4所示分别存储在m个节点上。The codeword generated by encoding node i is 2 binary code symbols yi = ( yi1 , yi2 ) T , and these 2 binary code symbols are stored on node i. All m nodes are encoded to generate 2m code symbols in total, which are stored on m nodes respectively as shown in FIG4.

本方法的译码过程如下:本系统依靠任意的个节点上存储的二元码符号都可以译码恢复原始数据符号x=(x1,…,xm)T。译码器的输入是任意选择的个节点上存储的二元码符号。不失一般性,假设选中的个节点的序号为j1,…,jk,这k个节点的编码矩阵为Bj1,…,Bjk,这k个节点存储的码字矢量分别记为yj1,…,yjk,每个码字矢量都由2个二元码符号构成,则有如下关系:The decoding process of this method is as follows: This system relies on any The binary code symbols stored on each node can be decoded to recover the original data symbol x = (x 1 , ..., x m ) T . The input of the decoder is arbitrarily selected. The binary code symbols stored on the nodes. Without loss of generality, assume that the selected The serial numbers of the nodes are j 1 ,…,j k , the encoding matrices of these k nodes are B j1 ,…,B jk , the codeword vectors stored by these k nodes are y j1 ,…,y jk , and each codeword vector is composed of 2 binary code symbols, then there is the following relationship:

方程(2)可解的充要条件是系数矩阵B=(Bj1,…,Bjk)T列满秩。理论上可以证明对于表1给出的编码矩阵,该充要条件一定是满足的,因此一定可以使用高斯消去法求解方程(2),解析出原始数据符号x=(x1,…,xm)TThe necessary and sufficient condition for equation (2) to be solvable is that the coefficient matrix B = (B j1 , ..., B jk ) T has full column rank. Theoretically, it can be proved that for the coding matrix given in Table 1, this necessary and sufficient condition must be satisfied, so Gaussian elimination can be used to solve equation (2) and resolve the original data symbol x = (x 1 , ..., x m ) T .

本系统可以采用“先译码后编码”的修复过程如下:由于本系统依靠任意的个节点上存储的2k个二元码符号都可以译码恢复原始数据符号x=(x1,…,xm)T,因此系统能够承受并修复最多个节点的同时失效。只要系统还存在个有效节点,就可以先译码恢复原始数据符号x=(x1,…,xm)T,再为那些失效节点重新编码生成丢失的码字符号,该修复过程对应的修复带宽等于个码字符号。然而,从后面的实施例可以看到,对于应用本发明产生的几个特殊参数(m=3,m=4,m=5)的子空间存储码系统,当系统中只有1个节点失效时,不需要采用“先译码后编码”的修复方法,修复带宽可以小于个码字符号,将在具体实施例中说明。This system can adopt the "decode first and then encode" repair process as follows: Since this system relies on any The 2k binary code symbols stored on each node can be decoded to restore the original data symbol x = (x 1 , ..., x m ) T , so the system can withstand and repair up to nodes fail at the same time. As long as the system still exists If there are valid nodes, the original data symbol x = (x 1 , ..., x m ) T can be first decoded and restored, and then the lost codewords can be re-encoded for the failed nodes. The repair bandwidth corresponding to this repair process is equal to However, it can be seen from the following embodiments that for the subspace storage code system using several special parameters (m=3, m=4, m=5) generated by the present invention, when only one node in the system fails, it is not necessary to use the "decoding first and then encoding" repair method, and the repair bandwidth can be less than The code characters will be described in detail in the specific embodiments.

实施例一Embodiment 1

令m=3,则消息空间是基于F2的3维空间用户原始数据矢量是x=(x1,x2,x3)T,xi∈{0,1}。根据表1给出的编码矩阵,3个存储节点上存储的码字的编码计算过程如下。3个存储节点示于图5。Let m = 3, then the message space is a 3D space based on F2 The user original data vector is x=(x 1 , x 2 , x 3 ) T , x i ∈ {0, 1}. According to the coding matrix given in Table 1, the coding calculation process of the codewords stored on the three storage nodes is as follows. The three storage nodes are shown in FIG5 .

节点1: Node 1:

节点2: Node 2:

节点3: Node 3:

首先考察译码问题,任意2个节点包含的4个二元字符都包括了原始数据x1,x2,x3,因此依靠任意2个节点即可译码,译码节点数达到了下限值k的公式如下:First, let's examine the decoding problem. The four binary characters contained in any two nodes include the original data x 1 , x 2 , and x 3 . Therefore, decoding can be done by any two nodes. The formula for the number of decoding nodes reaching the lower limit k is as follows:

其次考察修复问题,本实施例中任意1个节点失效导致其上存储的2个二元字符丢失,都可以从另外的2个节点中的每一个节点请求1个二元字符进行修复即可,因此修复带宽等于2个二元字符。本例构造的是参数为(m,n,α,k,r,β)=(3,3,2,2,2,1)的子空间存储码,通过限制参数α=2,即每个节点只存储2个符号,进一步降低了编译码的复杂度。Next, the repair problem is examined. In this embodiment, if any one node fails and the two binary characters stored on it are lost, one binary character can be requested from each of the other two nodes for repair, so the repair bandwidth is equal to two binary characters. This example constructs a subspace storage code with parameters (m, n, α, k, r, β) = (3, 3, 2, 2, 2, 1). By limiting the parameter α = 2, that is, each node only stores 2 symbols, the complexity of encoding and decoding is further reduced.

实施例二Embodiment 2

令m=4,则消息空间是基于F2的4维空间用户原始数据矢量是x=(x1,x2,x3,x4)T,xi∈{0,1}。根据表1给出的编码矩阵,4个存储节点上存储的码字的编码计算过程如下。4个存储节点如图6所示。Let m = 4, then the message space is a 4-dimensional space based on F2 The user original data vector is x=( x1 , x2 , x3 , x4 ) T , x i∈ {0, 1}. According to the coding matrix given in Table 1, the coding calculation process of the codewords stored on the four storage nodes is as follows. The four storage nodes are shown in FIG6.

节点1: Node 1:

节点2: Node 2:

节点3: Node 3:

节点4: Node 4:

首先考察图6所示的子空间存储码的译码问题。该存储码占用了4个存储节点,一共存储了8个二元字符。本例的译码节点数的下限值k的公式如下:First, let's examine the decoding problem of the subspace storage code shown in Figure 6. This storage code occupies 4 storage nodes and stores a total of 8 binary characters. The formula for the lower limit value k of the number of decoding nodes in this example is as follows:

也就是说应用其中任意的2个存储节点上存储的4个二元字符即可译码解析出原始数据x1,x2,x3,x4。为了验证这一点,任选2个存储节点,假设选择节点2和节点4,则获得的4个二元字符分别是:That is to say, the original data x 1 , x 2 , x 3 , x 4 can be decoded and parsed by using the four binary characters stored on any two storage nodes. To verify this, we select any two storage nodes, assuming that we select node 2 and node 4, and the four binary characters obtained are:

a1=x2,a2=x3+x4,a3=x4,a4=x1+x2 a 1 =x 2 , a 2 =x 3 +x 4 , a 3 =x 4 , a 4 =x 1 +x 2

其中,x2和x4已经自然得到,另外有Among them, x 2 and x 4 are naturally obtained, and there are

a1+a4=x2+(x1+x2)=x1 a 1 + a 4 = x 2 + (x 1 + x 2 ) = x 1

a2+a3=(x3+x4)+x4=x3 a 2 +a 3 =(x 3 +x 4 )+x 4 =x 3

由此就译码得到了全部原始数据x1,x2,x3,x4。如果选择的另外一组的2个节点,也会得到类似的结论。Thus, all the original data x 1 , x 2 , x 3 , x 4 are decoded. If another group of 2 nodes is selected, a similar conclusion can be obtained.

其次考察图6所示的子空间存储码的修复问题。本实施例中,对于任何一个节点失效导致其上存储的2个二元字符丢失,都可以从剩余的3个节点中的每一个节点请求1个二元字符加以修复,因此修复带宽等于3个二元字符。假设节点1损坏,为了恢复其中的数据x1和x2+x3,可以从剩余的3个节点中分别读取x2,x3和x1+x2,通过简单的基于F2上的代数运算即可恢复丢失的数据x1和x2+x3。对于节点2、节点3和节点4的损坏也有类似的结论。所以本例构造的是参数为(m,n,α,k,r,β)=(4,4,2,2,3,1)的子空间存储码。Next, the repair problem of the subspace storage code shown in FIG6 is examined. In this embodiment, if any node fails and the two binary characters stored thereon are lost, one binary character can be requested from each of the remaining three nodes for repair, so the repair bandwidth is equal to three binary characters. Assuming that node 1 is damaged, in order to recover the data x1 and x2 + x3 therein, x2 , x3 and x1 + x2 can be read from the remaining three nodes respectively, and the lost data x1 and x2 + x3 can be recovered by simple algebraic operations based on F2 . Similar conclusions can be drawn for the damage of nodes 2, 3 and 4. Therefore, this example constructs a subspace storage code with parameters (m, n, α, k, r, β) = (4, 4, 2, 2, 3, 1).

实施例三Embodiment 3

令m=5,则消息空间是基于F2的5维空间用户原始数据矢量x=(x1,x2,x3,x4,x5)T,xi∈{0,1}。根据表1给出的编码矩阵,5个存储节点上存储的码字的编码计算过程如下。5个存储节点如图7所示。Let m = 5, then the message space is a 5-dimensional space based on F 2 User original data vector x = (x 1 , x 2 , x 3 , x 4 , x 5 ) T , x i ∈ {0, 1}. According to the coding matrix given in Table 1, the coding calculation process of the codewords stored on the five storage nodes is as follows. The five storage nodes are shown in FIG7 .

节点1: Node 1:

节点2: Node 2:

节点3: Node 3:

节点4: Node 4:

节点5: Node 5:

首先考察图7所示的子空间存储码的译码问题。该存储码占用了5个存储节点,一共存储了10个二元字符。本例的译码节点数的下限值k的公式如下:First, let's examine the decoding problem of the subspace storage code shown in Figure 7. The storage code occupies 5 storage nodes and stores a total of 10 binary characters. The formula for the lower limit k of the number of decoding nodes in this example is as follows:

也就是说应用其中任意的3个存储节点上存储的6个二元字符即可译码解析出原始数据x1,x2,x3,x4,x5。为了验证这一点,任选3个存储节点,假设选择节点1、节点3和节点4,则获得的6个二元字符分别是:That is to say, the original data x 1 , x 2 , x 3 , x 4 , x 5 can be decoded and parsed by using the 6 binary characters stored on any 3 storage nodes. To verify this, select any 3 storage nodes, assuming that node 1, node 3 and node 4 are selected, and the 6 binary characters obtained are:

a1=x1,a2=x2+x3,a3=x3,a4=x4+x5,a5=x4,a6=x5+x1 a 1 =x 1 , a 2 =x 2 +x 3 , a 3 =x 3 , a 4 =x 4 +x 5 , a 5 =x 4 , a 6 =x 5 +x 1

其中,x1,x3和x4已经自然得到,另外有Among them, x 1 , x 3 and x 4 are naturally obtained, and there are

a2+a3=(x2+x3)+x3=x2 a 2 +a 3 =(x 2 +x 3 )+x 3 =x 2

a4+a5=(x4+x5)+x4=x5 a 4 +a 5 =(x 4 +x 5 )+x 4 =x 5

由此就译码得到了全部原始数据x1,x2,x3,x4,x5。如果选择另外一组的3个节点,也会得到类似的结论。Thus, all the original data x 1 , x 2 , x 3 , x 4 , x 5 are decoded. If another group of 3 nodes is selected, a similar conclusion can be obtained.

其次考察图7所示的子空间存储码的修复问题。本实施例中,对于任何1个节点失效导致其上存储的2个二元字符丢失,都可以从剩余4个节点中任选3个节点,从每个节点读取一个字符即可修复。具体的,如果节点1损坏,可以从节点2、3、5中分别读取x2,x3和x1+x2即可修复;如果节点2损坏,可以从节点3、4、1中分别读取x3,x4和x2+x3即可修复;如果节点3损坏,可以从节点4、5、2中分别读取x4,x5和x3+x4即可修复;如果节点4损坏,可以从节点5、1、3中分别读取x5,x1和x4+x5即可修复;如果节点5损坏,可以从节点1、2、4中分别读取x1,x2和x1+x5即可修复。因此,本例构造的是参数为(m,n,α,k,r,β)=(5,5,2,3,3,1)的子空间存储码。Next, the repair problem of the subspace storage code shown in FIG7 is examined. In this embodiment, if any node fails and the two binary characters stored thereon are lost, any three nodes can be selected from the remaining four nodes, and one character can be read from each node to repair it. Specifically, if node 1 is damaged, x 2 , x 3 and x 1 + x 2 can be read from nodes 2, 3 and 5 respectively to repair it; if node 2 is damaged, x 3 , x 4 and x 2 + x 3 can be read from nodes 3, 4 and 1 respectively to repair it; if node 3 is damaged, x 4 , x 5 and x 3 + x 4 can be read from nodes 4, 5 and 2 respectively to repair it; if node 4 is damaged, x 5 , x 1 and x 4 + x 5 can be read from nodes 5, 1 and 3 respectively to repair it; if node 5 is damaged, x 1 , x 2 and x 1 + x 5 can be read from nodes 1, 2 and 4 respectively to repair it. Therefore, this example constructs a subspace storage code with parameters (m, n, α, k, r, β) = (5, 5, 2, 3, 3, 1).

实施例四Embodiment 4

令m=6,则消息空间是基于F2的6维空间用户原始数据矢量x=(x1,x2,x3,x4,x5,x6)T,xi∈{0,1}。根据表1给出的编码矩阵,6个存储节点上存储的码字的编码计算过程如下。6个存储节点如图8所示。Let m = 6, then the message space is a 6-dimensional space based on F 2 User original data vector x = (x 1 , x 2 , x 3 , x 4 , x 5 , x 6 ) T , x i ∈ {0, 1}. According to the coding matrix given in Table 1, the coding calculation process of the codewords stored on the 6 storage nodes is as follows. The 6 storage nodes are shown in FIG8 .

节点1: Node 1:

节点2: Node 2:

节点3: Node 3:

节点4: Node 4:

节点5: Node 5:

节点6: Node 6:

首先考察图8所示的子空间存储码的译码问题。该存储码占用了6个存储节点,一共存储了12个二元字符。本例的译码节点数的下限值k的公式如下:等于First, consider the decoding problem of the subspace storage code shown in Figure 8. The storage code occupies 6 storage nodes and stores a total of 12 binary characters. The formula for the lower limit value k of the number of decoding nodes in this example is as follows:

也就是说应用其中任意的3个存储节点上存储的6个二元字符即可译码解析出原始数据x1,x2,x3,x4,x5,x6。为了验证这一点,任选3个存储节点,假设选择节点1、节点3和节点4,则获得的6个二元字符分别是:That is to say, the original data x 1 , x 2 , x 3 , x 4 , x 5 , x 6 can be decoded and parsed by using the 6 binary characters stored on any 3 storage nodes. To verify this, select any 3 storage nodes, assuming that node 1, node 3 and node 4 are selected, and the 6 binary characters obtained are:

a1=x1+x2,a2=x2+x3+x4,a3=x3+x4,a4=x4+x5+x6a 1 =x 1 +x 2 , a 2 =x 2 +x 3 +x 4 , a 3 =x 3 +x 4 , a 4 =x 4 +x 5 +x 6 ,

a5=x4+x5,a6=x1+x5+x6a 5 =x 4 +x 5 , a 6 =x 1 +x 5 +x 6 .

则译码过程如下:The decoding process is as follows:

a2+a3=x2a 2 +a 3 =x 2 ;

a1+x2=x1a 1 +x 2 =x 1 ;

a4+a5=x6a 4 +a 5 =x 6 ;

a6+x1+x6=x5a 6 +x 1 +x 6 =x 5 ;

a5+x5=x4a 5 +x 5 =x 4 ;

a3+x4=x3a 3 +x 4 =x 3 .

由此就译码得到了全部原始数据x1,x2,x3,x4,x5,x6。如果选择另外一组的3个节点,也会得到类似的结论。Thus, all the original data x 1 , x 2 , x 3 , x 4 , x 5 , x 6 are decoded. If another group of 3 nodes is selected, a similar conclusion will be obtained.

其次考察图8所示的子空间存储码的修复问题。对于本实施例中单节点失效的修复问题,只能使用“先译码后编码”的修复方法,即从5个存活的节点中任选3个,读取其中存储的6个码字数据,译码得到原始数据x1,x2,x3,x4,x5,x6,再利用失效节点的编码矩阵重新编码生成丢失的2个码字数据。因此,本例构造的是参数为(m,n,α,k,r,β)=(6,6,2,3,3,2)的子空间存储码。一般的,当m≥6时,由本方法建立的存储系统也需要使用“先译码后编码”的修复方法,修复带宽等于 Next, the repair problem of the subspace storage code shown in Figure 8 is examined. For the repair problem of single node failure in this embodiment, only the "decoding first and then encoding" repair method can be used, that is, select any 3 nodes from the 5 surviving nodes, read the 6 codeword data stored therein, decode to obtain the original data x1 , x2 , x3 , x4 , x5 , x6 , and then use the encoding matrix of the failed node to re-encode to generate the lost 2 codeword data. Therefore, this example constructs a subspace storage code with parameters (m, n, α, k, r, β) = (6, 6, 2, 3, 3, 2). In general, when m≥6, the storage system established by this method also needs to use the "decoding first and then encoding" repair method, and the repair bandwidth is equal to

在本发明的描述中,需要理解的是,术语“纵向”、“横向”、“上”、“下”、“前”、“后”、“左”、“右”、“竖直”、“水平”、“顶”、“底”、“内”、“外”等指示的方位或位置关系为基于附图所示的方位或位置关系,仅是为了便于描述本发明,而不是指示或暗示所指的装置或元件必须具有特定的方位、以特定的方位构造和操作,因此不能理解为对本发明的限制。In the description of the present invention, it should be understood that the terms "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inside" and "outside" etc., indicating orientations or positional relationships, are based on the orientations or positional relationships shown in the accompanying drawings, and are only for the convenience of describing the present invention, rather than indicating or implying that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be understood as a limitation on the present invention.

以上所述的实施例仅是对本发明的优选方式进行描述,并非对本发明的范围进行限定,在不脱离本发明设计精神的前提下,本领域普通技术人员对本发明的技术方案做出的各种变形和改进,均应落入本发明权利要求书确定的保护范围内。The embodiments described above are only descriptions of the preferred modes of the present invention, and are not intended to limit the scope of the present invention. Without departing from the design spirit of the present invention, various modifications and improvements made to the technical solutions of the present invention by ordinary technicians in this field should all fall within the protection scope determined by the claims of the present invention.

Claims (1)

1.一种基于二元域上子空间码的分布式存储系统,其特征在于:二元域F2上的子空间存储码的参数为(m,m,α,r,β),其中m等于用户原始数据的符号数,同时也等于存储节点的个数,α取2,表示每个节点只存储2个码数据符号,依靠任意的个节点上存储的码数据符号都能译码得到原始数据符号,编码的过程如下:1. A distributed storage system based on subspace codes on a binary field, characterized in that: the parameters of the subspace storage code on the binary field F2 are (m, m, α, r, β), where m is equal to the number of symbols of the user's original data, and also equal to the number of storage nodes. α is 2, which means that each node only stores 2 code data symbols. The coded data symbols stored on each node can be decoded to obtain the original data symbols. The encoding process is as follows: 步骤一、设m个原始的用户数据符号为消息矢量x=(x1,…,xm)T,消息空间为从消息空间U的全部2维子空间中选择m个子空间作为存储空间,经编码后得到的码字数据符号存储在m个节点上,每个节点i存储2个码字数据符号,节点i的编码矩阵Bi的公式如下:Step 1: Assume that m original user data symbols are message vectors x = (x 1 , ..., x m ) T , and the message space is Select m subspaces from all 2-dimensional subspaces of the message space U as storage spaces. The encoded codeword data symbols are stored on m nodes. Each node i stores 2 codeword data symbols. The formula of the encoding matrix Bi of node i is as follows: 其中,Bi为(2×m)维的矩阵;bi,1为节点i的编码矩阵Bi的第一行的行向量,bi,2为节点i的编码矩阵Bi的第二行的行向量,m等于用户原始数据的字符数,同时也等于存储节点的个数,m取任意的大于等于3的正整数;Wherein, Bi is a (2×m)-dimensional matrix; bi ,1 is the row vector of the first row of the encoding matrix Bi of node i , bi ,2 is the row vector of the second row of the encoding matrix Bi of node i, m is equal to the number of characters of the user's original data, and is also equal to the number of storage nodes, and m is any positive integer greater than or equal to 3; 当m=3时,b1,1=(100),b1,2=(010),得出:When m=3, b1,1 =(100), b1,2 =(010), and we get: 当m=2i或m=2i+1,i≥2时,第1个存储节点的编码矩阵B1的第一行的行向量b1,1的公式为:When m=2i or m=2i+1, i≥2, the formula of the row vector b1,1 of the first row of the encoding matrix B1 of the first storage node is: b1,1=(1…1 0 0 0…0),其中包含i-1个1,m-(i-1)个0;b 1,1 = (1…1 0 0 0…0), which contains i-1 1s and m-(i-1) 0s; 第二行的行向量b1,2的公式为:The formula for the row vector b1,2 of the second row is: b1,2=(0 1…1 0…0 0),其中包含i个1,m-i个0,得出:b 1, 2 = (0 1…1 0…0 0), which contains i 1s and mi 0s, so: 剩余m-1个编码矩阵的两个行向量分别是B1对应的两个行向量的循环右移;The two row vectors of the remaining m-1 encoding matrices are the cyclic right shifts of the two row vectors corresponding to B 1 ; 对节点i进行编码执行以下公式:The following formula is used to encode node i: 其中,yi是一个2维码字列矢量,存储在第i个节点上;yi1,yi2均为基于有限域Fq上的码字数据符号;Wherein, yi is a 2D codeword column vector stored on the i-th node; yi1 and yi2 are both codeword data symbols based on the finite field Fq ; 步骤二:把生成的码字数据符号yi1和yi2存储在节点i上;Step 2: Store the generated codeword data symbols y i1 and y i2 on node i; 步骤三、对全部m个节点执行步骤一和步骤二,直到全部节点都执行编码运算并生成存储码字;Step 3: Execute steps 1 and 2 on all m nodes until all nodes have performed encoding operations and generated storage codewords; 译码的具体方法如下:The specific method of decoding is as follows: 假设任意选取的个节点的序号为j1,…,jk,这k个节点各自的编码矩阵分别为Bj1,…,Bjk,这k个节点存储的码字矢量分别记为yj1,…,yjk,每个码字矢量都由2个有限域Fq上的二元码字数据符号构成,那么执行如下公式:Assume that any chosen The serial numbers of the nodes are j 1 ,…,j k , the encoding matrices of the k nodes are B j1 ,…,B jk , and the codeword vectors stored by the k nodes are y j1 ,…,y jk . Each codeword vector is composed of two binary codeword data symbols on two finite fields F q . Then the following formula is executed: 由于编码矩阵B的构造方法能够保证系数矩阵B=(Bj1,…,Bjk)T列满秩,因此能够保证应用本式解析出原始数据符号x=(x1,…,xm)TSince the construction method of the encoding matrix B can ensure that the coefficient matrix B = (B j1 , ..., B jk ) T has full column rank, it can be ensured that the original data symbol x = (x 1 , ..., x m ) T can be parsed by applying this formula; 对于单节点失效问题,向r个有效节点中的每一个请求β个符号修复失效节点,修复方法的选择方式和指标参数如下:假设存在m个原始的用户数据符号,当m≥6,该存储系统采用先译码后编码的修复方式,r=For the single node failure problem, β symbols are requested from each of the r valid nodes to repair the failed node. The selection method and indicator parameters of the repair method are as follows: Assuming that there are m original user data symbols, when m≥6, the storage system adopts the repair method of decoding first and then encoding, r= β=2,修复带宽等于当m=3时,r=2,β=1,对应的修复带宽等于2;当m=4或m=5时,r=3,β=1,对应的修复带宽等于3,因此,当3≤m≤5时,修复带宽小于先译码后编码的修复带宽。 β=2, the repair bandwidth is equal to When m=3, r=2, β=1, and the corresponding repair bandwidth is equal to 2; when m=4 or m=5, r=3, β=1, and the corresponding repair bandwidth is equal to 3. Therefore, when 3≤m≤5, the repair bandwidth is smaller than the repair bandwidth of decoding first and then encoding.
CN201910952539.XA 2019-10-09 2019-10-09 A Distributed Storage System Based on Subspace Codes on Binary Fields Active CN110780813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910952539.XA CN110780813B (en) 2019-10-09 2019-10-09 A Distributed Storage System Based on Subspace Codes on Binary Fields

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910952539.XA CN110780813B (en) 2019-10-09 2019-10-09 A Distributed Storage System Based on Subspace Codes on Binary Fields

Publications (2)

Publication Number Publication Date
CN110780813A CN110780813A (en) 2020-02-11
CN110780813B true CN110780813B (en) 2023-08-08

Family

ID=69385581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910952539.XA Active CN110780813B (en) 2019-10-09 2019-10-09 A Distributed Storage System Based on Subspace Codes on Binary Fields

Country Status (1)

Country Link
CN (1) CN110780813B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111585581B (en) * 2020-05-14 2023-04-07 成都信息工程大学 Coding method based on binary domain operation and supporting any code distance

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106790408A (en) * 2016-11-29 2017-05-31 中国空间技术研究院 A kind of coding method repaired for distributed memory system node
CN109684127A (en) * 2018-12-29 2019-04-26 西安电子科技大学 Locality node restorative procedure based on complete graph minimum bandwidth regeneration code
CN110178122A (en) * 2018-07-10 2019-08-27 深圳花儿数据技术有限公司 The synchronous restorative procedure of the data of distributed memory system and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10193570B2 (en) * 2013-12-03 2019-01-29 Samsung Electronics Co., Ltd Method of and apparatus for generating spatially-coupled low-density parity-check code

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106790408A (en) * 2016-11-29 2017-05-31 中国空间技术研究院 A kind of coding method repaired for distributed memory system node
CN110178122A (en) * 2018-07-10 2019-08-27 深圳花儿数据技术有限公司 The synchronous restorative procedure of the data of distributed memory system and storage medium
CN109684127A (en) * 2018-12-29 2019-04-26 西安电子科技大学 Locality node restorative procedure based on complete graph minimum bandwidth regeneration code

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种基于网络编码的云存储系统;刘宴涛等;《计算机科学》(第12期);全文 *

Also Published As

Publication number Publication date
CN110780813A (en) 2020-02-11

Similar Documents

Publication Publication Date Title
Rashmi et al. Having your cake and eating it too: Jointly optimal erasure codes for {I/O}, storage, and network-bandwidth
US10146618B2 (en) Distributed data storage with reduced storage overhead using reduced-dependency erasure codes
US9356626B2 (en) Data encoding for data storage system based on generalized concatenated codes
US20210273654A1 (en) Erasure code calculation method
US20210271557A1 (en) Data encoding, decoding and recovering method for a distributed storage system
US8775860B2 (en) System and method for exact regeneration of a failed node in a distributed storage system
CN106484559B (en) A kind of building method of check matrix and the building method of horizontal array correcting and eleting codes
US20200081778A1 (en) Distributed storage system, method and apparatus
JP6487931B2 (en) Method and apparatus for reconstructing data blocks
CN103336785A (en) Distributed storage method and distributed storage device based on network coding
CN101339524A (en) Disk Fault Tolerance Method for Large-Scale Disk Array Storage System
CN108132854B (en) Erasure code decoding method capable of simultaneously recovering data elements and redundant elements
TW202001920A (en) Method and apparatus for improved data recovery in data storage systems
CN110764950A (en) Hybrid coding method, data repair method, and system based on RS code and regeneration code
CN102843212B (en) Coding and decoding processing method and device
KR101621752B1 (en) Distributed Storage Apparatus using Locally Repairable Fractional Repetition Codes and Method thereof
CN110780813B (en) A Distributed Storage System Based on Subspace Codes on Binary Fields
CN107665152B (en) Decoding method of erasure code
CN112534724A (en) Decoder and method for decoding polarization code and product code
Bhuvaneshwari et al. Review on LDPC codes for big data storage
US9430443B1 (en) Systematic coding technique
CN104572987B (en) A kind of method and system that simple regeneration code storage efficiency is improved by compressing
Chen et al. A new Zigzag MDS code with optimal encoding and efficient decoding
KR101923116B1 (en) Apparatus for Encoding and Decoding in Distributed Storage System using Locally Repairable Codes and Method thereof
Wei et al. expanCodes: Tailored LDPC codes for big data storage

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20220221

Address after: 514015 Jiaying College, Meisong Road, Meizhou City, Guangdong Province

Applicant after: JIAYING University

Address before: 519000 building T8, North Polytechnic teachers' apartment, No. 6, Jinfeng Road, Tangjiawan, Xiangzhou District, Zhuhai City, Guangdong Province

Applicant before: BEIJING INSTITUTE OF TECHNOLOGY, ZHUHAI

GR01 Patent grant
GR01 Patent grant