CN113347238A

CN113347238A - Message partitioning method, system, device and storage medium based on block chain

Info

Publication number: CN113347238A
Application number: CN202110579854.XA
Authority: CN
Inventors: 王鑫; 马超群; 米先华; 周中定; 李信儒; 兰秋军; 万丽
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2021-05-26
Filing date: 2021-05-26
Publication date: 2021-09-03

Abstract

The invention discloses a block chain-based message partition method, system, equipment and storage medium. After the block chain generates transaction messages and enters the corresponding partition, the method first calculates the resource consumption of the partition. The threshold value means that the partition is full and no new messages can be entered in the future. By calculating the matching degree of other partitions in the topic where the partition is located and the partition, and the current resource consumption of the partition with the largest matching degree value does not exceed At the first threshold, the new messages that are going to enter the partition are allocated to the partition with the largest matching degree value. On the one hand, some messages that exceed the resource carrying capacity of the partition do not need to wait for a long time for the partition to vacate resources, which saves the waiting time. On the other hand, allocating messages to the partition with the largest matching value for consensus processing ensures the reliability of the consensus, strengthens the overall ability of the Kafka cluster to process high concurrent data, and improves the overall consensus efficiency of the blockchain.

Description

Blockchain-based message partitioning method and system, equipment and storage medium

技术领域technical field

本发明涉及区块链技术领域，特别地，涉及一种基于区块链的消息分区方法及系统、设备、计算机可读取的存储介质。The present invention relates to the technical field of blockchain, and in particular, to a method and system, device, and computer-readable storage medium for message partitioning based on blockchain.

背景技术Background technique

区块链技术作为一种分布式账本技术，是由多方共同维护，综合使用分布式账本、链式数据结构、点对点传输、共识机制、密码学原理等融合的一项技术，具有去中心化、开放透明、数据不可篡改、可追溯、隐私保护及高度自治等重要特性，在众多领域中具有广泛的应用。在区块链网络中，不同的参与者发起的交易必须按照产生的顺序被依次写入到账本中。如何在分布式场景下所有节点对同一个交易提案达成一致性，是区块链技术必须考虑且解决的重要问题。要实现这一目标，交易顺序必须被正确的建立，并且必须包含对交易被篡改或者恶意提交交易的处理方法，而共识算法就是保证分布式系统一致性实现的解决方式，共识算法是计算机科学中用于在分布式过程或系统之间实现对单个数据值的一致性的过程，旨在解决分布式网络中多个不可靠节点的可靠性问题。As a distributed ledger technology, blockchain technology is jointly maintained by multiple parties and comprehensively uses a combination of distributed ledger, chain data structure, point-to-point transmission, consensus mechanism, and cryptographic principles. It has important features such as openness and transparency, data immutability, traceability, privacy protection and a high degree of autonomy, and has a wide range of applications in many fields. In a blockchain network, transactions initiated by different participants must be written into the ledger in the order in which they were generated. How to reach consensus on the same transaction proposal for all nodes in a distributed scenario is an important issue that blockchain technology must consider and solve. To achieve this goal, the transaction order must be established correctly, and must include the processing method for tampered transactions or maliciously submitted transactions, and the consensus algorithm is the solution to ensure the consistency of distributed systems. A process for achieving consistency on a single data value between distributed processes or systems, designed to solve the reliability problem of multiple unreliable nodes in a distributed network.

现有的区块链网络采用Kafka的消息消费-订阅模式实现数据的共识过程，相关角色包括：Producer(生产者)、Kafka集群、Topic(主题)、Partition(分区)、Broker(数据存储服务器)、Zookeeper集群、Consumer(消费者)。Producer为消息生产者，在区块链网络中由某些节点充当该角色(这类节点由Zookeeper集群进行指定，主要负责消息的获取和发送，以及打包区块)，采用push方式将接收到的消息(即交易数据)发送给Kafka集群。其中，Kafka集群由一组服务器构成，功能上分为Topic(主题)、Partition(分区)和Broker(数据存储服务器)。Topic是消息主题，一个Kafka集群同时支持多个Topic，消息进入集群后进入相应的Topic中。Partition是Topic下的分区，一个Topic可以同时支持多个Partition分区，消息进入Topic后会被分配到Partition分区中。Topic下的Partition数量由Zookeeper集群进行管理。Broker为数据存储服务器，是一种用于实现数据存储的主机服务器，每个Topic都有一个Broker，存储当前各个Partition的消息数据。Zookeeper集群由一组服务器构成，用于对Kafka集群进行管理，可以指定和注册Producer和Consumer节点、配置Kafka集群的Topic和Partition，以及在系统资源出现压力时进行负载均衡。Consumer为消息消费者，在区块链网络中由某些节点充当该角色，采用pull方式将消息从Broker服务器获取到本地，并打包成区块，将区块分发给区块链网络中的其它节点进行验证。消息是Kafka中最基本的数据单元，在Kafka中，一条消息由key和value两部分构成，在发送一条消息时，可以指定这个key，Producer会根据key来判断当前消息应该发送并存储到哪个Partition中。消息选择Partition分区的策略主要有哈希策略与轮询策略两种，当没有为消息指定key，即key为null时，消息会以轮询的方式发送到各个分区，当key不为null时，Producer节点会使用key的哈希值(采用Murmur2Hash算法)对Partition数量取模，以此来决定把消息发送到哪个Partition上。The existing blockchain network adopts Kafka's message consumption-subscription model to realize the consensus process of data. The related roles include: Producer (producer), Kafka cluster, Topic (topic), Partition (partition), Broker (data storage server) , Zookeeper cluster, Consumer (consumer). Producer is a message producer, which is played by some nodes in the blockchain network (such nodes are designated by the Zookeeper cluster, and are mainly responsible for the acquisition and transmission of messages, as well as packaging blocks). Messages (i.e. transaction data) are sent to the Kafka cluster. Among them, the Kafka cluster consists of a set of servers, which are functionally divided into Topic (topic), Partition (partition) and Broker (data storage server). Topic is a message topic. A Kafka cluster supports multiple topics at the same time. After the message enters the cluster, it enters the corresponding topic. A Partition is a partition under a Topic. A Topic can support multiple Partition partitions at the same time. After the message enters the Topic, it will be allocated to the Partition partition. The number of Partitions under Topic is managed by the Zookeeper cluster. Broker is a data storage server, which is a host server for data storage. Each topic has a Broker, which stores the message data of each current Partition. The Zookeeper cluster consists of a set of servers for managing the Kafka cluster. It can specify and register the Producer and Consumer nodes, configure the Topic and Partition of the Kafka cluster, and perform load balancing when system resources are under pressure. Consumer is a message consumer, which is played by some nodes in the blockchain network. The message is obtained locally from the Broker server by means of pull, packaged into blocks, and distributed to others in the blockchain network. node for verification. A message is the most basic data unit in Kafka. In Kafka, a message consists of two parts: key and value. When sending a message, you can specify this key, and the Producer will determine which Partition the current message should be sent and stored in based on the key. middle. There are two main strategies for selecting Partition partitions for messages: hash strategy and polling strategy. When no key is specified for the message, that is, when the key is null, the message will be sent to each partition in a polling manner. When the key is not null, The Producer node will use the hash value of the key (using the Murmur2Hash algorithm) to modulo the number of Partitions to determine which Partition to send the message to.

但是，在区块链网络中，Kafka集群在提供共识服务的过程中，涉及到消息的分发、存储、消费和订阅等操作，尤其在高并发场景下，短时间内大量消息可能会造成Kafka集群中的Partition分区过度服务，比如在某些大型商业应用场景下，短时间内会出现某个Topic主题下涌入大量新消息，这些消息在向Partition分区进行分发时，分区不仅需要处理消息的排序、存储，还要处理Consumer节点的订阅请求，当消息数量超过分区某个第一阈值时，就会出现Partition分区无法继续服务的问题，导致超出第一阈值部分的那些消息无法及时得到处理，从而影响到整个集群的共识效率。However, in the blockchain network, in the process of providing consensus services, the Kafka cluster involves operations such as message distribution, storage, consumption and subscription. Especially in high concurrency scenarios, a large number of messages in a short period of time may cause the Kafka cluster For example, in some large-scale commercial application scenarios, a large number of new messages will flood into a topic topic in a short period of time. When these messages are distributed to the Partition partition, the partition not only needs to process the ordering of the messages , storage, and also process the subscription request of the Consumer node. When the number of messages exceeds a certain first threshold of the partition, there will be a problem that the partition cannot continue to serve, resulting in those messages exceeding the first threshold cannot be processed in time, thus Affects the consensus efficiency of the entire cluster.

现有技术中针对上述情况的解决方法通常是：1)消息等待分区有剩余资源后再进入分区；2)人工修改Partition分区的配置文件，增加分区数。但是，仍然存在下述问题：The solutions to the above situation in the prior art are usually: 1) the message enters the partition after waiting for the remaining resources in the partition; 2) the configuration file of the Partition is manually modified to increase the number of partitions. However, the following problems still exist:

1.当分区出现消息过多，无法继续服务的问题时，消息无法及时进入该分区得到排序处理，只有等待分区资源空出来后再进入，效率较低；1. When there are too many messages in the partition and the service cannot be continued, the messages cannot enter the partition in time for sorting and processing, and only wait for the partition resources to be vacated before entering, which is inefficient;

2.分区数量的增加目前主要依靠人工经验判断，容易出现判断错误的情况，这种错误会造成分区新增数量过多，资源浪费；或者分区新增数量不够，还是会出现新消息无法及时进入分区得到排序处理的情况；2. The increase of the number of partitions is currently mainly based on manual experience judgment, and it is easy to make mistakes in judgment. Such errors will cause too many new partitions and waste of resources; or if the number of new partitions is not enough, there will still be new messages that cannot be entered in time. The case where the partition is sorted;

3.人工配置分区存在时间滞后，效率较低。3. There is a time lag in manually configuring partitions and the efficiency is low.

发明内容SUMMARY OF THE INVENTION

本发明提供了一种基于区块链的消息分区方法及系统、设备、计算机可读取的存储介质，以解决目前的区块链采用Kafka集群进行共识处理时在高并发场景下存在的消息处理效率低的技术问题。The present invention provides a block chain-based message partitioning method, system, equipment, and computer-readable storage medium, so as to solve the problem of message processing existing in high concurrency scenarios when the current block chain adopts Kafka cluster for consensus processing Inefficient technical issues.

根据本发明的一个方面，提供一种基于区块链的消息分区方法，包括以下内容：According to one aspect of the present invention, a method for partitioning messages based on blockchain is provided, including the following:

步骤S1：在区块链网络中部署Kafka共识集群，初始化Partition文件，配置分区设置参数；Step S1: Deploy the Kafka consensus cluster in the blockchain network, initialize the Partition file, and configure the partition setting parameters;

步骤S2：区块链系统产生交易信息后，消息经Producer节点进入其中分区A，计算分区A的资源消耗量，若计算得到的分区A的资源消耗量超过第一阈值，下一个新消息不再进入分区A，否则下一个新消息仍然可进入分区A；Step S2: After the blockchain system generates the transaction information, the message enters the partition A through the Producer node, and calculates the resource consumption of the partition A. If the calculated resource consumption of the partition A exceeds the first threshold, the next new message will not be sent. Enter partition A, otherwise the next new message can still enter partition A;

步骤S3：当分区A的资源消耗量超过第一阈值时，计算同一Topic下的其他所有分区与分区A的匹配度，形成匹配度序列，从中选择匹配度值最大的分区并计算其当前资源消耗量，若所述匹配度值最大的分区的当前资源消耗量不超过第一阈值，则在下一个新消息准备进入分区A时将其分配至所述匹配度值最大的分区。Step S3: When the resource consumption of partition A exceeds the first threshold, calculate the matching degree of all other partitions under the same topic and partition A, form a matching degree sequence, select the partition with the largest matching degree value and calculate its current resource consumption If the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, when the next new message is ready to enter partition A, it will be allocated to the partition with the largest matching degree value.

进一步地，所述步骤S3还包括以下内容：Further, the step S3 also includes the following content:

若所述匹配度值最大的分区的当前资源消耗量超过第一阈值，则按照匹配度值从大到小的顺序逐一计算每个分区当前的资源消耗量，直至找到当前资源消耗量不超过第一阈值的分区，并将下一个准备进入分区A的新消息分配至该分区。If the current resource consumption of the partition with the largest matching degree value exceeds the first threshold, the current resource consumption of each partition is calculated one by one in descending order of the matching degree value, until it is found that the current resource consumption does not exceed the first threshold. A threshold partition and assign the next new message to partition A to that partition.

进一步地，还包括以下内容：Further, it also includes the following:

步骤S4：若在所述匹配度序列中无法找到当前资源消耗量不超过第一阈值的分区，则自动修改Partition配置文件以在该Topic下新增一个新的分区，并将后续的新消息分配至新分区中。Step S4: If a partition whose current resource consumption does not exceed the first threshold cannot be found in the matching degree sequence, the Partition configuration file is automatically modified to add a new partition under the topic, and subsequent new messages are allocated. to the new partition.

进一步地，采用以下公式来计算每个分区的资源消耗量：Further, the following formula is used to calculate the resource consumption of each partition:

其中，CPC^t表示t时刻分区的资源消耗量，Res^t(x)表示t时刻分区中消息x的资源消耗数；Among them, CPC ^t represents the resource consumption of the partition at time t, and Res ^t (x) represents the resource consumption of the message x in the partition at time t;

Res^t(x)＝α₁ResNum^t(x,M₁)+α₂ResCon^t(x,M₂)+α₃ResDat^t(x,M₃)Res ^t (x)=α ₁ ResNum ^t (x, M ₁ )+α ₂ ResCon ^t (x, M ₂ )+α ₃ ResDat ^t (x, M ₃ )

其中，α₁，α₂，α₃∈(0,1)为加权系数，ResNum^t(x,M₁)表示t时刻消息x的订阅总数，ResCon^t(x,M₂)表示t时刻消息x的访问连接总数，ResDat^t(x,M₃)表示t时刻消息x的数据大小，消息数据大小包括消息的value值对应的字符位数和key值对应的字符位数。Among them, α ₁ , α ₂ , α ₃ ∈(0,1) are weighting coefficients, ResNum ^t (x, M ₁ ) represents the total number of subscriptions of message x at time t, and ResCon ^t (x, M ₂ ) represents message x at time t The total number of access connections, ResDat ^t (x, M ₃ ) represents the data size of the message x at time t, and the message data size includes the number of characters corresponding to the value of the message and the number of characters corresponding to the key value.

进一步地，采用以下公式计算匹配度：Further, the following formula is used to calculate the matching degree:

其中，

表示分区B与分区A的消息序列的相似匹配度，Meg_i表示消息i在序列中的id编号，sequenceA_i表示分区A的消息序列sequenceA的第i个消息，sequenceB_j表示分区B的消息序列sequenceB的第j个消息，β_i,j∈(0,1)为加权系数，p为指数参数，p≥1。in,

Indicates the similarity matching degree of the message sequence of partition B and partition A, Megi represents the id number of message i in the sequence, sequenceA _i represents the _ith message of the message sequence sequenceA of partition A, and sequenceB _j represents the message sequence sequenceB of partition B The jth message of , β _i,j ∈(0,1) is the weighting coefficient, p is the index parameter, p≥1.

进一步地，还包括以下内容：Further, it also includes the following:

步骤S5：监测每个分区的资源消耗量，当至少一个分区的资源消耗量小于第二阈值时，在新分区完成共识处理后删除新分区。Step S5: Monitor the resource consumption of each partition, when the resource consumption of at least one partition is less than the second threshold, delete the new partition after the new partition completes the consensus processing.

另外，本发明的另一实施例还提供一种基于区块链的消息分区系统，包括：In addition, another embodiment of the present invention also provides a block chain-based message partitioning system, including:

配置模块，用于在区块链网络中部署Kafka共识集群，初始化Partition文件，配置分区设置参数；The configuration module is used to deploy the Kafka consensus cluster in the blockchain network, initialize the Partition file, and configure the partition setting parameters;

第一计算模块，用于在消息进入分区A时计算分区A的资源消耗量；The first calculation module is used to calculate the resource consumption of the partition A when the message enters the partition A;

第二计算模块，用于在所述第一计算模块计算得到的分区A的资源消耗量超过第一阈值时计算同一Topic下的其他所有分区与分区A的匹配度，形成匹配度序列；The second calculation module is configured to calculate the matching degree between all other partitions under the same topic and the partition A when the resource consumption of the partition A calculated by the first calculation module exceeds the first threshold, to form a matching degree sequence;

分配模块，用于从所述匹配度序列中选择匹配度值最大的分区并计算其当前资源消耗量，若所述匹配度值最大的分区的当前资源消耗量不超过第一阈值，则在下一个新消息准备进入分区A时将其分配至所述匹配度值最大的分区。The allocation module is used to select the partition with the largest matching degree value from the matching degree sequence and calculate its current resource consumption, if the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, then the next When a new message is ready to enter partition A, it is allocated to the partition with the highest matching degree value.

进一步地，还包括：Further, it also includes:

更新模块，用于在所述匹配度序列中无法找到当前资源消耗量不超过第一阈值的分区时，自动修改Partition配置文件以在该Topic下新增一个新的分区，并将后续的新消息分配至新分区中。The update module is used to automatically modify the Partition configuration file to add a new partition under the topic when the partition whose current resource consumption does not exceed the first threshold cannot be found in the matching degree sequence, and the subsequent new message assigned to the new partition.

另外，本发明的另一实施例还提供一种设备，包括处理器和存储器，所述存储器中存储有计算机程序，所述处理器通过调用所述存储器中存储的所述计算机程序，用于执行如上所述的方法的步骤。In addition, another embodiment of the present invention also provides a device, including a processor and a memory, where a computer program is stored in the memory, and the processor is configured to execute the computer program by invoking the computer program stored in the memory The steps of the method as described above.

另外，本发明的另一实施例还提供一种计算机可读取的存储介质，用于存储基于区块链进行消息分区的计算机程序，该计算机程序在计算机上运行时执行如上所述的方法的步骤。In addition, another embodiment of the present invention also provides a computer-readable storage medium for storing a computer program for message partitioning based on blockchain, the computer program executing the above method when running on a computer step.

本发明具有以下效果：The present invention has the following effects:

本发明的基于区块链的消息分区方法，在区块链产生交易消息进入对应的分区后，先计算该分区的资源消耗量，若超过第一阈值，意味着该分区已满，后续不能再进入新消息，则通过计算该分区所在的Topic中的其它分区与该分区的匹配度，并在匹配度值最大的分区的当前资源消耗量不超过第一阈值时，将后续准备进入该分区的新消息分配至匹配度值最大的分区，一方面，超出分区资源承载能力的部分消息无需长时间等待该分区腾出资源，省去了等待时间，另一方面，将消息分配至匹配度值最大的分区进行共识处理，确保了共识的可靠性，整体强化了Kafka集群处理高并发数据的能力，提升了区块链整体的共识效率。In the method for message partitioning based on the blockchain of the present invention, after the blockchain generates a transaction message and enters the corresponding partition, the resource consumption of the partition is calculated first. If it exceeds the first threshold, it means that the partition is full and cannot be further Enter a new message, calculate the matching degree of other partitions in the topic where the partition is located and the partition, and when the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold New messages are allocated to the partition with the largest matching degree value. On the one hand, some messages that exceed the resource carrying capacity of the partition do not need to wait for a long time for the partition to vacate resources, which saves the waiting time. On the other hand, the message is allocated to the largest matching degree value. Consensus processing is performed on the partitions of Kafka, which ensures the reliability of the consensus, strengthens the ability of the Kafka cluster to process high concurrent data as a whole, and improves the overall consensus efficiency of the blockchain.

另外，本发明的基于区块链的消息分区系统同样具有上述优点。In addition, the blockchain-based message partitioning system of the present invention also has the above advantages.

除了上面所描述的目的、特征和优点之外，本发明还有其它的目的、特征和优点。下面将参照图，对本发明作进一步详细的说明。In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail below with reference to the drawings.

附图说明Description of drawings

构成本申请的一部分的附图用来提供对本发明的进一步理解，本发明的示意性实施例及其说明用于解释本发明，并不构成对本发明的不当限定。在附图中：The accompanying drawings constituting a part of the present application are used to provide further understanding of the present invention, and the exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not constitute an improper limitation of the present invention. In the attached image:

图1是现有采用Kafka集群进行共识处理的区块链网络的架构示意图。Figure 1 is a schematic diagram of the architecture of an existing blockchain network using a Kafka cluster for consensus processing.

图2是本发明优选实施例的基于区块链的消息分区方法的流程示意图。FIG. 2 is a schematic flowchart of a block chain-based message partitioning method according to a preferred embodiment of the present invention.

图3是本发明的基于区块链的消息分区方法的另一实施方式的流程示意图。FIG. 3 is a schematic flowchart of another embodiment of the block chain-based message partitioning method of the present invention.

图4是本发明的基于区块链的消息分区方法的又一实施方式的流程示意图。FIG. 4 is a schematic flowchart of another embodiment of the method for message partitioning based on the blockchain of the present invention.

图5是本发明另一实施例的基于区块链的消息分区系统的模块结构示意图。FIG. 5 is a schematic diagram of a module structure of a block chain-based message partitioning system according to another embodiment of the present invention.

具体实施方式Detailed ways

以下结合附图对本发明的实施例进行详细说明，但是本发明可以由下述所限定和覆盖的多种不同方式实施。The embodiments of the present invention will be described in detail below with reference to the accompanying drawings, but the present invention can be implemented in many different ways as defined and covered below.

如图1所示，目前采用Kafka集群进行共识处理的区块链网络中，具体的共识过程包括以下内容：As shown in Figure 1, in the current blockchain network using Kafka cluster for consensus processing, the specific consensus process includes the following:

客户端向区块链共识模块中的Producer节点发送交易信息(即消息)；The client sends transaction information (ie messages) to the Producer node in the blockchain consensus module;

Producer节点根据消息的key值，按照哈希取模或轮询方式将消息推送到Kafka集群中Topic下的某个Partition分区；According to the key value of the message, the Producer node pushes the message to a Partition partition under the Topic in the Kafka cluster according to the hash modulo or polling method;

消息在分区中进行排序，形成有序的消息队列；Messages are sorted in partitions to form an ordered message queue;

Consumer节点从订阅的Topic下的Partition分区中获取消息队列，并打包成区块；The Consumer node obtains the message queue from the Partition partition under the subscribed Topic and packs it into blocks;

Consumer节点将区块分发给区块链网络中的其他节点进行验证。Consumer nodes distribute blocks to other nodes in the blockchain network for verification.

通过以上共识过程，区块链网络能够使用Kafka集群快速进行数据共识。但在此共识过程中，如果消息瞬时数量过多(比如高并发状态)，很可能出现Partition分区不够的情况，此时多余的消息只能等待分区资源空出来之后，才能进入分区进行排序。因此，如图2所示，本发明的优选实施例提供一种基于区块链的消息分区方法，包括以下内容：Through the above consensus process, the blockchain network can use the Kafka cluster to quickly perform data consensus. However, during this consensus process, if the instantaneous number of messages is too large (such as high concurrency), it is likely that there are not enough Partition partitions. At this time, the redundant messages can only enter the partition for sorting after the partition resources are vacated. Therefore, as shown in FIG. 2, a preferred embodiment of the present invention provides a block chain-based message partitioning method, including the following:

可以理解，在所述步骤S1中，配置的分区设置参数包括分区的数量、每个分区资源消耗量的第一阈值等，另外，Kafka共识集群中包含的Topic数量和每个Topic的共识算法均在初始化阶段进行配置。在分区文件配置好后，每个Topic下都默认生成预设数量的分区。It can be understood that in the step S1, the configured partition setting parameters include the number of partitions, the first threshold of resource consumption of each partition, etc. In addition, the number of topics contained in the Kafka consensus cluster and the consensus algorithm of each topic are both Configured during the initialization phase. After the partition file is configured, a preset number of partitions are generated by default under each topic.

可以理解，在所述步骤S2中，消息按照哈希策略或者轮询策略分配至分区A后，若计算得到的分区A的当前资源消耗量超过第一阈值，意味着分区A已满，后续无法再进入新消息，后续分配至分区A的消息需要进行转移处理。若计算得到分区A的当前资源消耗量不超过第一阈值，意味着分区A尚有剩余的资源处理能力，后续仍然可以分配消息进入而不会出现过载的风险。It can be understood that in the step S2, after the message is allocated to the partition A according to the hash strategy or the polling strategy, if the calculated current resource consumption of the partition A exceeds the first threshold, it means that the partition A is full, and the follow-up cannot be performed. Then enter a new message, and the subsequent messages allocated to partition A need to be transferred. If it is calculated that the current resource consumption of partition A does not exceed the first threshold, it means that partition A still has remaining resource processing capacity, and subsequent messages can still be allocated to enter without the risk of overload.

可以理解，在所述步骤S3中，当分区A的资源消耗量超过第一阈值时，意味着分区A已满。此时，通过计算同一个Topic下的其它分区与分区A的匹配度，筛选出匹配度值最大的分区，意味着其与分区A在消息属性上最匹配，然后将下一个准备进入分区A的新消息分配至匹配度值最大的分区。一方面，省去了新消息的等待时间，提高了消息处理效率，另一方面，由于匹配度值最大的分区在消息属性上与分区A最接近，确保了共识处理的可靠性。It can be understood that in the step S3, when the resource consumption of the partition A exceeds the first threshold, it means that the partition A is full. At this time, by calculating the matching degree between other partitions under the same topic and partition A, the partition with the largest matching degree value is filtered out, which means that it is the best match with partition A in terms of message attributes, and then the next one that is ready to enter partition A is selected. New messages are assigned to the partition with the highest matching value. On the one hand, it saves the waiting time for new messages and improves the efficiency of message processing. On the other hand, since the partition with the largest matching degree value is the closest to partition A in terms of message attributes, the reliability of consensus processing is ensured.

可以理解，本实施例的基于区块链的消息分区方法，在区块链产生交易消息进入对应的分区后，先计算该分区的资源消耗量，若超过第一阈值，意味着该分区已满，后续不能再进入新消息，则通过计算该分区所在的Topic中的其它分区与该分区的匹配度，并在匹配度值最大的分区的当前资源消耗量不超过第一阈值时，将后续准备进入该分区的新消息分配至匹配度值最大的分区，一方面，超出分区资源承载能力的部分消息无需长时间等待该分区腾出资源，省去了等待时间，另一方面，将消息分配至匹配度值最大的分区进行共识处理，确保了共识的可靠性，整体强化了Kafka集群处理高并发数据的能力，提升了区块链整体的共识效率。It can be understood that, in the method for message partitioning based on the blockchain in this embodiment, after the blockchain generates a transaction message and enters the corresponding partition, the resource consumption of the partition is calculated first. If it exceeds the first threshold, it means that the partition is full. , no new messages can be entered in the future, then by calculating the matching degree of other partitions in the topic where the partition is located and the partition, and when the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, the subsequent preparation New messages entering the partition are allocated to the partition with the largest matching degree value. On the one hand, some messages that exceed the resource carrying capacity of the partition do not need to wait for a long time for the partition to vacate resources, which saves the waiting time. On the other hand, the messages are allocated to Consensus processing is performed on the partition with the largest matching degree value, which ensures the reliability of the consensus, strengthens the ability of the Kafka cluster to process high concurrent data as a whole, and improves the overall consensus efficiency of the blockchain.

可以理解，所述步骤S3还包括以下内容：It can be understood that the step S3 also includes the following content:

当匹配度值最大的分区的当前资源消耗量超过第一阈值时，意味着其也处于满载状态，无法再对新消息进行共识处理，此时，按照匹配度值从大到小的顺序进行迭代筛选，直至找到一个当前资源消耗量不超过第一阈值的分区，并将下一个分配至分区A的新消息转移至该分区进行共识处理。When the current resource consumption of the partition with the largest matching degree value exceeds the first threshold, it means that it is also in a fully loaded state and can no longer perform consensus processing on new messages. Screen until a partition whose current resource consumption does not exceed the first threshold is found, and the next new message allocated to partition A is transferred to this partition for consensus processing.

可以理解，如图3所示，所述基于区块链的消息分区方法还包括以下内容：It can be understood that, as shown in Figure 3, the block chain-based message partitioning method further includes the following contents:

若在分区A所在的Topic下无法找到当前资源消耗量不超过第一阈值的分区，意味着目前Topic下的所有分区均已满，不具备处理新消息的能力，此时，通过修改分区配置文件在该Topic下新增一个新的分区，然后将后续的新消息分配至新分区中，新分区则按照预设的共识算法进行共识排序。从而可以根据整个Topic的消息处理能力来自动增加分区的数量，以保证消息的处理效率。If a partition whose current resource consumption does not exceed the first threshold cannot be found under the topic where partition A is located, it means that all partitions under the topic are full and cannot process new messages. In this case, modify the partition configuration file by modifying the partition configuration file. A new partition is added under the topic, and subsequent new messages are allocated to the new partition, and the new partition is sorted by consensus according to the preset consensus algorithm. Therefore, the number of partitions can be automatically increased according to the message processing capability of the entire topic to ensure the message processing efficiency.

可以理解，本发明中采用了资源消耗量和分区匹配度这两个数据指标来进行消息的分区选择。其中，分区的资源消耗量用来描述当前时刻某个分区因处理消息而消耗的资源，分区匹配度用来描述同一时刻同一Topic下不同分区之间的相似匹配程度。具体地，采用以下公式来计算每个分区的资源消耗量：It can be understood that, in the present invention, two data indicators of resource consumption and partition matching degree are used to select a message partition. Among them, the resource consumption of the partition is used to describe the resources consumed by a partition at the current moment for processing messages, and the partition matching degree is used to describe the similarity matching degree between different partitions under the same topic at the same time. Specifically, the following formula is used to calculate the resource consumption of each partition:

其中，CPC^t表示t时刻分区的资源消耗量，Res^t(x)表示t时刻分区中消息x的资源消耗数，则分区当前的资源消耗量等于当前分区中所有消息的资源消耗数之和。Among them, CPC ^t represents the resource consumption of the partition at time ^t , and Rest (x) represents the resource consumption of message x in the partition at time t, then the current resource consumption of the partition is equal to the sum of the resource consumption of all messages in the current partition.

其中，α₁，α₂，α₃∈(0,1)为加权系数，ResNum^t(x,M₁)表示t时刻消息x的订阅总数，ResCon^t(x,M₂)表示t时刻消息x的访问连接总数，ResDat^t(x,M₃)表示t时刻消息x的数据大小，消息数据大小包括消息的value值对应的字符位数和key值对应的字符位数。本发明基于消息的订阅总数、连接总数和数据大小三个维度来综合评估分区的资源消耗量，提高了计算的精准度。Among them, α ₁ , α ₂ , α ₃ ∈(0,1) are weighting coefficients, ResNum ^t (x, M ₁ ) represents the total number of subscriptions of message x at time t, and ResCon ^t (x, M ₂ ) represents message x at time t The total number of access connections, ResDat ^t (x, M ₃ ) represents the data size of the message x at time t, and the message data size includes the number of characters corresponding to the value of the message and the number of characters corresponding to the key value. The present invention comprehensively evaluates the resource consumption of the partition based on the three dimensions of the total number of message subscriptions, the total number of connections and the data size, thereby improving the accuracy of calculation.

另外，具体采用以下公式计算匹配度：In addition, the following formula is used to calculate the matching degree:

假设对于任意Partition分区，在时刻t拥有的消息序列表示为Partition^t＝{Meg₁，Meg₂,...,Meg_n}，其中n为该序列在时刻t的消息总数，Meg_i表示消息i在序列中的id编号，消息插入序列时会生成一个id号。其中，PMD

表示分区B与分区A的消息序列的相似匹配度，Meg_i表示消息i在序列中的id编号，sequenceA_i表示分区A的消息序列sequenceA的第i个消息，sequenceB_j表示分区B的消息序列sequenceB的第j个消息，β_i,j∈(0,1)为加权系数，p为指数参数，p≥1。本发明通过采用闵可夫斯基距离来计算两个分区消息序列的相似匹配度，并结合了加权计算，可以准确地计算出两个分区的匹配度。Assuming that for any Partition partition, the sequence of messages possessed at time t is represented as Partition ^t = {Meg ₁ , Meg ₂ ,..., Meg _n }, where n is the total number of messages in the sequence at time t, and Meg _i represents message i The id number in the sequence, an id number is generated when the message is inserted into the sequence. Among them, PMD

Indicates the similarity matching degree of the message sequence of partition B and partition A, Megi represents the id number of message i in the sequence, sequenceA _i represents the _ith message of the message sequence sequenceA of partition A, and sequenceB _j represents the message sequence sequenceB of partition B The jth message of , β _i,j ∈(0,1) is the weighting coefficient, p is the index parameter, p≥1. The invention calculates the similarity matching degree of the two partition message sequences by adopting the Minkowski distance, and combining with the weighted calculation, the matching degree of the two partitions can be accurately calculated.

另外，当p＝2时，计算得到的结果为曼哈顿距离，即公式可表示为：

In addition, when p=2, the calculated result is the Manhattan distance, that is, the formula can be expressed as:

可以理解，如图4所示，所述基于区块链的消息分区方法还包括以下内容：It can be understood that, as shown in Figure 4, the block chain-based message partitioning method further includes the following contents:

对每个分区的资源消耗量进行实时监测或定时监测，若有至少一个分区的资源消耗量小于第二阈值，意味着有至少一个分区具有充足的资源承载能力，此时，在新分区完成共识任务之后删除该新分区，避免在高并发业务场景结束后造成资源的浪费。其中，所述第二阈值小于第一阈值，第二阈值的具体数值也在步骤S1中初始化时进行配置。Real-time monitoring or regular monitoring of the resource consumption of each partition is performed. If the resource consumption of at least one partition is less than the second threshold, it means that at least one partition has sufficient resource carrying capacity. At this time, the consensus is completed in the new partition Delete the new partition after the task to avoid wasting resources after the high concurrency business scenario ends. The second threshold is smaller than the first threshold, and the specific value of the second threshold is also configured during initialization in step S1.

另外，如图5所示，本发明的另一实施例还提供一种基于区块链的消息分区系统，优选采用如上所述的消息分区方法，该系统包括：In addition, as shown in FIG. 5 , another embodiment of the present invention also provides a block chain-based message partitioning system, preferably using the message partitioning method described above, and the system includes:

可以理解，本实施例的基于区块链的消息分区系统，在区块链产生交易消息进入对应的分区后，先计算该分区的资源消耗量，若超过第一阈值，意味着该分区已满，后续不能再进入新消息，则通过计算该分区所在的Topic中的其它分区与该分区的匹配度，并在匹配度值最大的分区的当前资源消耗量不超过第一阈值时，将后续准备进入该分区的新消息分配至匹配度值最大的分区，一方面，超出分区资源承载能力的部分消息无需长时间等待该分区腾出资源，省去了等待时间，另一方面，将消息分配至匹配度值最大的分区进行共识处理，确保了共识的可靠性，整体强化了Kafka集群处理高并发数据的能力，提升了区块链整体的共识效率。It can be understood that in the blockchain-based message partition system of this embodiment, after the blockchain generates a transaction message and enters the corresponding partition, the resource consumption of the partition is calculated first. If it exceeds the first threshold, it means that the partition is full. , no new messages can be entered in the future, then by calculating the matching degree of other partitions in the topic where the partition is located and the partition, and when the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, the subsequent preparation New messages entering the partition are allocated to the partition with the largest matching degree value. On the one hand, some messages that exceed the resource carrying capacity of the partition do not need to wait for a long time for the partition to vacate resources, which saves the waiting time. On the other hand, the messages are allocated to Consensus processing is performed on the partition with the largest matching degree value, which ensures the reliability of the consensus, strengthens the ability of the Kafka cluster to process high concurrent data as a whole, and improves the overall consensus efficiency of the blockchain.

另外，所述消息分区系统还包括：In addition, the message partition system further includes:

监测模块，用于监测每个分区的资源消耗量，当至少一个分区的资源消耗量小于第二阈值时，在新分区完成共识处理后删除新分区。The monitoring module is used to monitor the resource consumption of each partition, and when the resource consumption of at least one partition is less than the second threshold, delete the new partition after the new partition completes the consensus processing.

可以理解，本实施例的系统中的各个模块与上述方法实施例中的各个步骤相对应，故各个模块的具体工作过程在此不再赘述。It can be understood that each module in the system of this embodiment corresponds to each step in the foregoing method embodiment, so the specific working process of each module is not repeated here.

一般计算机可读取介质的形式包括：软盘(floppy disk)、可挠性盘片(flexibledisk)、硬盘、磁带、任何其与的磁性介质、CD-ROM、任何其余的光学介质、打孔卡片(punchcards)、纸带(paper tape)、任何其余的带有洞的图案的物理介质、随机存取存储器(RAM)、可编程只读存储器(PROM)、可抹除可编程只读存储器(EPROM)、快闪可抹除可编程只读存储器(FLASH-EPROM)、其余任何存储器芯片或卡匣、或任何其余可让计算机读取的介质。指令可进一步被一传输介质所传送或接收。传输介质这一术语可包含任何有形或无形的介质，其可用来存储、编码或承载用来给机器执行的指令，并且包含数字或模拟通信信号或其与促进上述指令的通信的无形介质。传输介质包含同轴电缆、铜线以及光纤，其包含了用来传输一计算机数据信号的总线的导线。Typical forms of computer readable media include: floppy disks, flexible disks, hard disks, magnetic tapes, any other magnetic media, CD-ROMs, any other optical media, punch cards ( punchcards), paper tape, any other physical media with a pattern of holes, random access memory (RAM), programmable read only memory (PROM), erasable programmable read only memory (EPROM) , Flash-Erasable Programmable Read-Only Memory (FLASH-EPROM), any other memory chip or cartridge, or any other computer-readable medium. The instructions may further be transmitted or received by a transmission medium. The term transmission medium can include any tangible or intangible medium that can be used to store, encode, or carry instructions for execution by a machine, and includes digital or analog communication signals or intangible media that facilitate communication of such instructions. Transmission media include coaxial cables, copper wire, and fiber optics, which contain the wires of a bus used to transmit a computer data signal.

以上所述仅为本发明的优选实施例而已，并不用于限制本发明，对于本领域的技术人员来说，本发明可以有各种更改和变化。凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims

1. A block chain-based message partitioning method is characterized in that, comprising the following content:

Step S1: Deploy the Kafka consensus cluster in the blockchain network, initialize the Partition file, and configure the partition setting parameters;

Step S2: After the blockchain system generates the transaction information, the message enters the partition A through the Producer node, and calculates the resource consumption of the partition A. If the calculated resource consumption of the partition A exceeds the first threshold, the next new message will not be sent. Enter partition A, otherwise the next new message can still enter partition A;

Step S3: When the resource consumption of partition A exceeds the first threshold, calculate the matching degree of all other partitions under the same topic and partition A, form a matching degree sequence, select the partition with the largest matching degree value and calculate its current resource consumption If the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, when the next new message is ready to enter partition A, it will be allocated to the partition with the largest matching degree value.

2. The block chain-based message partitioning method according to claim 1, wherein the step S3 further comprises the following content:

If the current resource consumption of the partition with the largest matching degree value exceeds the first threshold, the current resource consumption of each partition is calculated one by one in descending order of the matching degree value, until it is found that the current resource consumption does not exceed the first threshold. A threshold partition and assign the next new message to partition A to that partition.

3. The block chain-based message partitioning method according to claim 2, further comprising the following content:

Step S4: If a partition whose current resource consumption does not exceed the first threshold cannot be found in the matching degree sequence, the Partition configuration file is automatically modified to add a new partition under the topic, and subsequent new messages are allocated. to the new partition.

4. The block chain-based message partitioning method of claim 1, wherein the following formula is used to calculate the resource consumption of each partition:

Among them, CPC ^t represents the resource consumption of the partition at time t, and Res ^t (x) represents the resource consumption of the message x in the partition at time t;

Res ^t (x)=α ₁ ResNum ^t (x, M ₁ )+α ₂ ResCon ^t (x, M ₂ )+α ₃ ResDat ^t (x, M ₃ )

Among them, α ₁ , α ₂ , α ₃ ∈(0,1) are weighting coefficients, ResNum ^t (x, M ₁ ) represents the total number of subscriptions of message x at time t, and ResCon ^t (x, M ₂ ) represents message x at time t The total number of access connections, ResDat ^t (x, M ₃ ) represents the data size of the message x at time t, and the message data size includes the number of characters corresponding to the value of the message and the number of characters corresponding to the key value.

5. The block chain-based message partitioning method as claimed in claim 1, wherein the following formula is used to calculate the matching degree:

in,

6. The block chain-based message partitioning method of claim 3, further comprising the following:

Step S5: Monitor the resource consumption of each partition, when the resource consumption of at least one partition is less than the second threshold, delete the new partition after the new partition completes the consensus processing.

7. A block chain-based message partitioning system, comprising:

The configuration module is used to deploy the Kafka consensus cluster in the blockchain network, initialize the Partition file, and configure the partition setting parameters;

The first calculation module is used to calculate the resource consumption of the partition A when the message enters the partition A;

The second calculation module is configured to calculate the matching degree between all other partitions under the same topic and the partition A when the resource consumption of the partition A calculated by the first calculation module exceeds the first threshold, to form a matching degree sequence;

The allocation module is used to select the partition with the largest matching degree value from the matching degree sequence and calculate its current resource consumption, if the current resource consumption of the partition with the largest matching degree value does not exceed the first threshold, then the next When a new message is ready to enter partition A, it is allocated to the partition with the highest matching degree value.

8. The block chain-based message partitioning system of claim 7, further comprising:

The update module is used to automatically modify the Partition configuration file to add a new partition under the topic when the partition whose current resource consumption does not exceed the first threshold cannot be found in the matching degree sequence, and the subsequent new message assigned to the new partition.

9. A device, characterized by comprising a processor and a memory, wherein a computer program is stored in the memory, and the processor is used to execute the computer program according to claim 1 by calling the computer program stored in the memory. 6 the steps of any one of the methods.

10. A computer-readable storage medium for storing a computer program for message partitioning based on a block chain, characterized in that, when the computer program is run on a computer, the execution of any one of claims 1 to 6 is performed. steps of the method.