CN114693450A

CN114693450A - Calculation, update, reading method and device based on smart contract, electronic equipment

Info

Publication number: CN114693450A
Application number: CN202210332050.4A
Authority: CN
Inventors: 周晨辉; 闫莺
Original assignee: Ant Blockchain Technology Shanghai Co Ltd
Current assignee: Ant Blockchain Technology Shanghai Co Ltd
Priority date: 2022-03-30
Filing date: 2022-03-30
Publication date: 2022-07-01
Also published as: WO2023185052A1

Abstract

A smart contract-based computing method, in which a smart contract for performing approximate calculation is deployed on a blockchain. The method comprises: receiving a smart contract invocation transaction, initiated by a calculation initiator, for the smart contract, where the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation; in response to the smart contract invocation transaction, invoking the sampling logic contained in the smart contract invocation transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset; and invoking the calculation logic contained in the smart contract invocation transaction, performing exact calculation on the outlier data samples in the outlier data subset, performing approximate calculation on the sampled non-outlier data samples, and combining the results of the exact calculation and the approximate calculation.

Description

Calculation, update, reading method and device based on smart contract, electronic equipment

Technical Field

One or more embodiments of this specification relate to the field of blockchain technology, and in particular to a smart contract-based computing apparatus and an electronic device.

Background

Blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked in chronological order into a chained data structure, forming a distributed ledger whose resistance to tampering and forgery is guaranteed cryptographically. Because of its decentralization, tamper resistance, and autonomy, blockchain is receiving growing attention and adoption.

Summary of the Invention

This specification proposes a smart contract-based computing method, applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed. The method includes:

receiving a smart contract invocation transaction, initiated by a calculation initiator, for the smart contract; wherein the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;

in response to the smart contract invocation transaction, invoking the sampling logic contained in the smart contract invocation transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;

further invoking the calculation logic contained in the smart contract invocation transaction, performing exact calculation on the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combining the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

This specification also proposes a smart contract-based computing apparatus, applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed. The apparatus includes:

a receiving module, which receives a smart contract invocation transaction, initiated by a calculation initiator, for the smart contract; wherein the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;

a sampling module, which, in response to the smart contract invocation transaction, invokes the sampling logic contained in the smart contract invocation transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;

a calculation module, which further invokes the calculation logic contained in the smart contract invocation transaction, performs exact calculation on the outlier data samples in the outlier data subset, performs approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

In the above technical solution, when a smart contract is invoked to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation and improves its computational efficiency, without sacrificing the accuracy of the approximate calculation result. Moreover, during the approximate calculation, the outlier data in the data set is not sampled and then approximated; instead, it is calculated exactly without sampling. As a result, when the data set contains outlier data, these outlier data samples are further prevented from affecting the accuracy of the approximate calculation result, so the accuracy of the approximate calculation on the data set is ensured to the greatest extent.

Brief Description of the Drawings

FIG. 1 is a flowchart of a smart contract-based computing method provided by an exemplary embodiment;

FIG. 2 is a flowchart of an optimization solving method provided by an exemplary embodiment;

FIG. 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment;

FIG. 4 is a block diagram of a smart contract-based computing apparatus provided by an exemplary embodiment.

Detailed Description

Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification, as detailed in the appended claims.

Note that in other embodiments the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. In addition, a single step described in this specification may be split into multiple steps in other embodiments, and multiple steps described in this specification may be merged into a single step in other embodiments.

As smart contract technology continues to develop, smart contracts that interface with a business have gradually begun to take on part of the computation related to that business.

For example, in practice, a smart contract deployed on a blockchain to interface with a business may contain, in addition to the business logic, logic for performing calculations on the business data, so that a user can complete business-related calculations on the blockchain by invoking the smart contract.

When a smart contract is used to perform calculation on a business-related data set, the total calculation time usually depends on the time consumed by the I/O operation on each data item and the time consumed by batch calculation over the group of data.

For example, in practice, taking the case where the business-related data set has been stored on the blockchain in advance, the total time the smart contract takes to perform calculation on the data set can usually be expressed by the following formula:

In the above formula, i denotes the i-th data item in the data set; IO_i denotes the time consumed by the I/O operation on the i-th data item; and Operation_i denotes the time consumed by batch calculation over the i data items in the data set.
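The formula itself is not reproduced in this text. As a hedged reconstruction only, one form consistent with the symbol definitions above is a sum of the per-item I/O times plus the batch calculation time. The symbols T (total time) and n (number of data items in the data set) are assumptions introduced here and do not appear in the original.

```latex
% Assumed reconstruction: T and n are illustrative symbols, not from the source.
T = \sum_{i=1}^{n} IO_i + Operation_n
```

Under this reading, the I/O term grows with every item that has to be read, which is why reading fewer items through sampling shortens the total time.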

Note that when a data set is stored on the blockchain, it is usually stored item by item, as key-value pairs, in the storage medium of the blockchain node device. Therefore, such a data set can usually only be read item by item from that storage medium according to the key of each data item.

In some application scenarios with high privacy and security requirements for data calculation, the smart contract may also be deployed in a TEE (Trusted Execution Environment) on the blockchain node device.

In this case, the data in the data set usually needs to be stored in encrypted form. When a smart contract is then used to perform calculation on the business-related data set, the total calculation time usually depends on the time consumed by the I/O operation on each data item, the time consumed by decrypting each data item, and the time consumed by batch calculation over the group of data.

In the above formula, Operation_i denotes the time consumed by decrypting the i-th data item in the data set.

As the above shows, when a smart contract is used to perform calculation on a business-related data set and the data set is large, obtaining an exact result through the smart contract is very time-consuming.

In practice, however, some business scenarios do not require an exact calculation result for the business-related data and can tolerate some loss of calculation precision.

For example, when calculating the average age of users, an exact result is unnecessary in most cases; an approximate calculation that yields an interval for the average age is usually enough.

Based on this, this specification proposes a technical solution that introduces approximate calculation and data sampling mechanisms into a smart contract to improve the efficiency of calculations on business-related data.

In implementation, a smart contract for data calculation can be deployed on the blockchain. The smart contract can contain approximate calculation logic for performing approximate calculation and sampling logic for performing data sampling. A calculation initiator can invoke the smart contract to perform approximate calculation on the participating data set by initiating a smart contract invocation transaction. The smart contract invocation transaction can include calculation parameters corresponding to the approximate calculation, and the calculation parameters can include a data identifier of the data set participating in the approximate calculation.

When a node device in the blockchain receives the smart contract invocation transaction initiated by the calculation initiator, it can, in response, invoke the sampling logic contained in the smart contract invocation transaction, divide the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset. After sampling is complete, it can further invoke the approximate calculation logic contained in the smart contract, perform exact calculation on the outlier data samples in the outlier data subset, perform approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combine the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

In the above technical solution, when a smart contract is invoked to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation and improves its computational efficiency, without sacrificing the accuracy of the approximate calculation result.

Moreover, during the approximate calculation, the outlier data in the data set is not sampled and then approximated; instead, it is calculated exactly without sampling. As a result, when the data set contains outlier data, these outlier data samples are further prevented from affecting the accuracy of the approximate calculation result, so the accuracy of the approximate calculation on the data set is ensured to the greatest extent.
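The combination of exact calculation over the outliers and sampled estimation over the remaining data can be sketched as follows. This is a minimal illustration under stated assumptions, not the contract code of this specification: the function name `approximate_sum`, the `is_outlier` predicate, the `sample_rate` parameter, the fixed random seed, and the choice of a sum as the calculation are all hypothetical.

```python
import random
import statistics


def approximate_sum(values, is_outlier, sample_rate, seed=0):
    """Estimate sum(values): exact over outliers, sampled over the rest.

    All names and parameters here are illustrative assumptions.
    """
    # Partition the data set into outlier and non-outlier subsets.
    outliers = [v for v in values if is_outlier(v)]
    inliers = [v for v in values if not is_outlier(v)]

    # Exact calculation: every outlier is read and summed, none are sampled.
    exact_part = sum(outliers)
    if not inliers:
        return exact_part

    # Approximate calculation: read only a random sample of the non-outliers
    # and scale the sample mean up to the size of the non-outlier subset.
    k = min(len(inliers), max(1, round(len(inliers) * sample_rate)))
    sample = random.Random(seed).sample(inliers, k)
    approx_part = statistics.fmean(sample) * len(inliers)

    # Combine the exact and approximate partial results.
    return exact_part + approx_part
```

Because an extreme value is never left to the chance of being drawn or missed by the sampler, the variance of the sampled part depends only on the comparatively homogeneous non-outlier values.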

Referring to FIG. 1, FIG. 1 is a flowchart of a smart contract-based computing method provided by an exemplary embodiment. The method is applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed, and includes the following steps:

Step 102: receive a smart contract invocation transaction, initiated by a calculation initiator, for the smart contract; wherein the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;

The calculation initiator can be any party that needs data calculation. In one example, the calculation initiator is a user who needs data calculation. In another example, where a smart contract interfaces with a business, the calculation initiator can be an off-chain business system that needs data calculation.

A smart contract for data calculation can be deployed on the blockchain. The execution logic corresponding to the contract code of the smart contract can include approximate calculation logic for performing approximate calculation and sampling logic for performing data sampling. In this way, approximate calculation and data sampling are introduced into the smart contract.

Note that this specification does not limit the sampling method used for the data sampling. For example, random sampling, stratified sampling, and so on can be used.
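The two sampling methods named above can be sketched as follows. This is a hedged illustration only: the function names, the `key` and `fraction` parameters, and the fixed seed are assumptions, and the specification does not prescribe any particular implementation.

```python
import random
from collections import defaultdict


def random_sample(records, k, seed=0):
    """Simple random sampling: draw up to k records without replacement."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def stratified_sample(records, key, fraction, seed=0):
    """Stratified sampling: sample the same fraction from each stratum.

    `key` maps a record to its stratum label (a hypothetical parameter).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    picked = []
    for group in strata.values():
        # Keep at least one record per stratum so that no stratum vanishes.
        k = max(1, round(len(group) * fraction))
        picked.extend(rng.sample(group, k))
    return picked
```

Stratified sampling preserves the proportions of the strata in the sample, which can matter when the groups in a data set differ in size.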

The calculation initiator can invoke the smart contract to perform approximate calculation on the participating data set by initiating a smart contract invocation transaction.

For example, suppose the calculation initiator is a user and the blockchain uses an account model. In this case, the smart contract can be understood as a contract account on the blockchain to which contract code is anchored. The user can register an external account on the blockchain, initiate a smart contract invocation transaction through that external account, and submit the transaction to the blockchain node device it is connected to, thereby invoking the smart contract.

Note that the smart contract invocation transaction can include calculation parameters corresponding to the approximate calculation, and the calculation parameters can include a data identifier of the data set participating in the approximate calculation.

When initiating the smart contract invocation transaction, if the calculation initiator is connected directly to a blockchain node, it can package a smart contract invocation transaction and submit it point-to-point to the connected blockchain node device. If the calculation initiator instead accesses the blockchain through a blockchain access service provided by a platform such as BaaS (Blockchain as a Service), it can generate an invocation request for the smart contract and submit the request to the BaaS platform. The BaaS platform then packages a smart contract invocation transaction based on the invocation parameters carried in the request and submits it to the blockchain node device.

The blockchain node device can receive the smart contract invocation transaction initiated by the calculation initiator and, in response to the transaction, invoke the smart contract on the blockchain to perform approximate calculation on the data set.

Step 104: in response to the smart contract invocation transaction, invoke the sampling logic contained in the smart contract invocation transaction, divide the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset;

After receiving the smart contract invocation transaction initiated by the calculation initiator, the blockchain node device can, in response to the transaction, invoke the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier.

Note that after receiving the smart contract invocation transaction, the blockchain node device usually also needs to perform consensus processing on the transaction and on its execution result, together with the other blockchain nodes participating in consensus, based on the consensus algorithm supported by the blockchain. Because this specification does not involve improvements to the blockchain consensus process, that processing is not described in detail here.

In one illustrated implementation, before invoking the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier, the blockchain node device can first obtain the data identifier contained in the smart contract invocation transaction and read the data set participating in the approximate calculation based on that data identifier.

When the data set participating in the approximate calculation is read based on the data identifier, it can be read either from the blockchain or from off-chain storage; this specification does not limit this.

In one implementation, the data set can be stored on the blockchain in advance as evidence.

For example, an evidence storage contract for storing data can also be deployed on the blockchain. Before invoking the smart contract to perform calculation, the calculation initiator can package an evidence storage transaction to publish the data set that is to participate in the calculation to the evidence storage contract for storage.

As another example, the execution logic corresponding to the contract code of the smart contract can include data evidence storage logic in addition to the approximate calculation logic and the sampling logic. That is, besides performing approximate calculation, the smart contract itself provides an evidence storage function for data. In this case, before invoking the smart contract to perform calculation, the calculation initiator can likewise package an evidence storage transaction to publish the data set to the smart contract in advance for storage. The smart contract can later read the stored data set from its own contract storage space to perform the approximate calculation.

In this case, the blockchain node device can obtain, based on the data identifier, the data set stored on the blockchain that corresponds to the data identifier. For example, the data identifier can be the evidence storage hash returned by the blockchain node after the data set has been successfully stored on the blockchain.

In another implementation, the data set can be stored in advance in an off-chain database connected to the blockchain. In this case, the smart contract can obtain the data set corresponding to the data identifier from the off-chain database through its corresponding oracle program (oracle machine).

The oracle program can be either centralized or decentralized. A centralized oracle program can be an oracle service program deployed on an off-chain service device. A decentralized oracle program can be an oracle contract deployed on the blockchain that interfaces with the smart contract. Because this specification does not involve improvements related to oracle programs, the specific process by which the smart contract obtains the data set corresponding to the data identifier from the off-chain database through its oracle program is not described in detail here.

Besides the data identifier of the data set mentioned above, the calculation parameters contained in the smart contract invocation transaction can, in practice, also include other parameters related to the approximate calculation.

In one illustrated implementation, the calculation parameters can include the parameters shown in the following table:

Parameter type | Parameter meaning
Dataset ID | Identifies the data set participating in the approximate calculation
Calculation type ID | Indicates the type of approximate calculation to be performed
Error value | Indicates the tolerable calculation error of the approximate calculation
Confidence probability | Indicates the expected accuracy of the approximate calculation
Sampling algorithm ID | Indicates the specified type of sampling algorithm

Note that all parameters in the above table except the dataset ID are optional.

For example, if the calculation parameters in the smart contract invocation transaction do not contain a calculation type ID, the smart contract is allowed to use a default calculation type to perform approximate calculation on the data set. If the calculation parameters do not contain an error value, the tolerable calculation error is 0. If the calculation parameters do not contain a confidence probability, the confidence probability is 100% and the expected accuracy of the approximate calculation is 100%; in this case, the smart contract performs exact calculation on the data set instead of approximate calculation.
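The defaulting rules for the optional parameters can be sketched as follows. This is a minimal illustration under stated assumptions: the class `CalcParams`, the function `resolve`, the field names, and the default calculation type `"mean"` are all hypothetical. The specification only states that a default calculation type exists, not what it is, and it does not say how a missing sampling algorithm ID is handled, so that field is simply passed through.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default; the specification does not name the default type.
DEFAULT_CALC_TYPE = "mean"


@dataclass(frozen=True)
class CalcParams:
    dataset_id: str                          # the only required parameter
    calc_type_id: Optional[str] = None
    error: Optional[float] = None
    confidence: Optional[float] = None       # 1.0 stands for 100%
    sampling_algo_id: Optional[str] = None


def resolve(params: CalcParams) -> dict:
    """Fill in the defaults described for omitted optional parameters."""
    # Missing confidence probability means 100%, which implies exact calculation.
    confidence = 1.0 if params.confidence is None else params.confidence
    return {
        "dataset_id": params.dataset_id,
        "calc_type_id": params.calc_type_id or DEFAULT_CALC_TYPE,
        # Missing error value means a tolerable calculation error of 0.
        "error": 0.0 if params.error is None else params.error,
        "confidence": confidence,
        "sampling_algo_id": params.sampling_algo_id,
        "exact": confidence >= 1.0,
    }
```

For instance, `resolve(CalcParams("ds-1"))` selects exact calculation, while `resolve(CalcParams("ds-1", confidence=0.95))` leaves approximate calculation enabled.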

在示出的一种实施方式中，区块链节点设备在调用上述智能合约包含的采样逻辑，对获取到与上述数据标识对应的数据集合进行采样时，为了避免上述数据集合中的离群数据对最终的近似计算结果造成影响，具体可以将上述数据集合划分为由若干离群数据样本构成的离群数据子集，和由若干非离群数据样本构成的非离群数据子集，然后仅针对非离群数据子集中的数据样本进行采样。In the illustrated embodiment, when the blockchain node device calls the sampling logic contained in the smart contract to sample the acquired data set corresponding to the above data identifier, in order to avoid outlier data in the above data set It will affect the final approximate calculation result. Specifically, the above data set can be divided into an outlier data subset composed of several outlier data samples, and a non-outlier data subset composed of several non-outlier data samples, and then only Sampling for data samples in a subset of non-outlier data.

其中，上述数据集合中的离群数据，具体可以由上述智能合约进行计算得出。The outlier data in the above data set can be calculated by the above smart contract.

In the illustrated embodiment, the sampling logic contained in the smart contract may further include logic for performing outlier calculation on the data set. In this case, when the blockchain node device invokes the sampling logic contained in the smart contract to sample the acquired data set corresponding to the data identifier, it may execute the outlier calculation logic on the data samples in the data set to determine which data samples are outlier data samples and which are non-outlier data samples, and then create the outlier data subset from the determined outlier data samples and the non-outlier data subset from the determined non-outlier data samples.

Performing outlier calculation on the data samples in the data set generally refers to using a statistical algorithm to identify the data samples that differ markedly from the other data samples in the data set; the specific statistical method is not particularly limited in this specification. For example, the median of the values of the data samples in the data set may be calculated, and the data samples whose values deviate significantly from that median may then be selected as outlier data samples.
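As a non-limiting illustration of the median-based example above, the following sketch splits a data set into outlier and non-outlier samples. The function name `split_by_median` and the threshold of `k` times the median absolute deviation are assumptions; this specification does not fix any particular statistical rule.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(values: Sequence[float], k: float = 3.0) -> Tuple[List[float], List[float]]:
    """Return (outliers, non_outliers) based on distance from the median.

    The k * MAD threshold is an illustrative assumption.
    """
    center = median(values)
    # Median absolute deviation (MAD): a robust estimate of typical spread.
    mad = median([abs(v - center) for v in values])
    threshold = k * mad
    outliers = [v for v in values if abs(v - center) > threshold]
    non_outliers = [v for v in values if abs(v - center) <= threshold]
    return outliers, non_outliers
```

For the values `[10, 11, 9, 10, 12, 10, 500]`, the median is 10 and the MAD is 1, so only 500 is placed in the outlier subset.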

Of course, in practical applications, the outlier data in the data set may also be labeled manually in advance. In this case, the sampling logic contained in the smart contract may further include logic for screening outlier data in the data set. When the blockchain node device invokes the sampling logic contained in the smart contract to sample the acquired data set corresponding to the data identifier, it may execute the screening logic on the data samples in the data set to determine the outlier data samples and non-outlier data samples contained in the data set, and then create the outlier data subset from the determined outlier data samples and the non-outlier data subset from the determined non-outlier data samples.

In the illustrated embodiment, before the data samples in the non-outlier data subset are sampled, the number of samples to draw from the non-outlier data subset may first be calculated, and the data samples in the non-outlier data subset are then sampled according to the calculated sampling number.

In the illustrated embodiment, Hoeffding's inequality is generally used to describe an upper bound on the probability that a sum of random variables deviates from its expected value. In an approximate calculation scenario, the sampling number can serve as the random variable, the error value of the approximate calculation can serve as the deviation from the expected value, and the confidence probability of the approximate calculation can serve as the probability upper bound. Therefore, Hoeffding's inequality can be used in this specification to describe the mathematical relationship between the sampling number, the error value of the approximate calculation, and the confidence probability of the approximate calculation. In other words, in an approximate calculation scenario, this mathematical relationship can be derived from Hoeffding's inequality.

When Hoeffding's inequality is used to describe the mathematical relationship between the sampling number, the error value of the approximate calculation, and the confidence probability of the approximate calculation, it is expressed as the following formula:

In the above formula, H denotes the mathematical identifier of Hoeffding's inequality; n_g denotes the sampling number; b_g and a_g denote the maximum and minimum values, respectively, of the data samples in the data set; δ denotes the confidence probability; ε_g denotes the error value corresponding to the approximate calculation; and N_g denotes the total number of data samples in the data set.

而基于上述公式推导出的上述采样数量、上述近似计算的误差值和上述近似计算的置信概率之间的数学关系，则可以用如下公式表示：The mathematical relationship between the above-mentioned sampling number, the above-mentioned approximate calculation error value and the above-mentioned approximate calculation confidence probability derived based on the above formula can be expressed by the following formula:
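As a hedged illustration only, a standard two-sided form of Hoeffding's inequality that is consistent with the symbols defined above can be written as follows. It is offered as an assumption and is not necessarily the exact formula of this specification. Here \hat{\mu}_g (the mean of the n_g sampled values) and \mu_g (the true mean) are symbols introduced for this sketch, and δ is read as the confidence probability, so that 1 − δ bounds the probability of exceeding the error.

```latex
\[
\Pr\left( \left| \hat{\mu}_g - \mu_g \right| \ge \varepsilon_g \right)
  \le 2 \exp\left( - \frac{2 n_g \varepsilon_g^{2}}{(b_g - a_g)^{2}} \right)
\]
% Requiring the right-hand side to be at most 1 - \delta and solving for n_g:
\[
n_g \ge \frac{(b_g - a_g)^{2}}{2 \varepsilon_g^{2}} \, \ln \frac{2}{1 - \delta},
\qquad n_g \le N_g
\]
```

Under this reading, as δ approaches 100% the required n_g grows without bound and is capped at N_g, which corresponds to calculating over every data sample, consistent with the statement that a 100% confidence probability leads to an exact calculation.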

This mathematical relationship may be maintained in the smart contract in advance. When the blockchain node device invokes the smart contract to calculate the number of samples required for sampling the data samples in the non-outlier data subset, it may obtain the confidence probability δ and the error value ε_g corresponding to the approximate calculation from the calculation parameters in the smart contract invocation transaction, and input them into the maintained mathematical relationship to obtain the sampling number corresponding to the non-outlier data subset.
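As a non-limiting illustration, the calculation of the sampling number from δ and ε_g can be sketched as follows. The function name `hoeffding_sample_size` is hypothetical, and the sketch assumes the standard two-sided Hoeffding bound for a mean, with δ read as the confidence probability; the relationship actually maintained in the smart contract may differ.

```python
import math


def hoeffding_sample_size(error: float, confidence: float,
                          lo: float, hi: float, population: int) -> int:
    """Number of samples needed so that the sample mean is within `error`
    of the true mean with probability at least `confidence`.

    lo/hi are the minimum and maximum sample values (a_g, b_g) and
    population is the total number of samples (N_g).
    """
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    if confidence >= 1.0 or error <= 0.0:
        # 100% confidence or zero tolerable error: use every sample (exact).
        return population
    needed = (hi - lo) ** 2 * math.log(2.0 / (1.0 - confidence)) / (2.0 * error ** 2)
    # Draw at least one sample and never more than the population holds.
    return max(1, min(math.ceil(needed), population))
```

For example, with values in [0, 10], an error of 1.0 and a confidence probability of 95%, the bound asks for 185 samples regardless of how large the data set is, which is where the time saving over an exact calculation comes from.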

其中，需要说明的是，在针对非离群数据子集中的数据样本进行采样所采用的采样方式，在本说明书中不进行特别限定；例如，可以采用随机采样(Random Sampling)、分层采样(Stratified Sampling)，等等。Among them, it should be noted that the sampling method used for sampling the data samples in the non-outlier data subset is not particularly limited in this specification; for example, random sampling (Random Sampling), stratified sampling ( Stratified Sampling), etc.

以下实施例中将分别以采用随机采样和分层采样为例，来详细描述针对非离群数据子集中的非离群数据样本的采样过程。In the following embodiments, random sampling and stratified sampling are used as examples to describe in detail the sampling process for the non-outlier data samples in the non-outlier data subset.

In the illustrated embodiment, if random sampling is used to sample the data samples in the non-outlier data subset, then when the blockchain node device invokes the smart contract to randomly sample the data set based on the calculated sampling number, it may first obtain a random number for the random sampling, and then randomly sample the non-outlier data samples in the non-outlier data subset based on that random number to obtain the calculated number of data samples.

The random number is used to control the randomness of the non-outlier data samples drawn from the non-outlier data subset. In practical applications, the obtained random number may determine which non-outlier data samples are to be drawn from the non-outlier data subset. For example, the random number may represent the sample identifier of a non-outlier data sample to be sampled; during random sampling, the non-outlier data sample whose sample identifier equals the value of the random number is drawn from the non-outlier data subset.
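As a non-limiting illustration, the use of random numbers as sample identifiers can be sketched as follows. The function name `sample_by_random_ids` is hypothetical, and the sketch assumes that the sample identifier is simply the position of the sample within the non-outlier data subset and that the random numbers come from a seeded pseudo-random generator.

```python
import random
from typing import List, Sequence


def sample_by_random_ids(samples: Sequence[float], n: int, seed: int) -> List[float]:
    """Draw n distinct samples whose identifiers (positions) are random numbers."""
    rng = random.Random(seed)            # seeded so the draw is reproducible
    n = min(n, len(samples))             # cannot draw more than the subset holds
    chosen_ids = rng.sample(range(len(samples)), n)
    return [samples[i] for i in chosen_ids]
```

Because the generator is seeded, repeating the call with the same seed yields the same samples, which matters when several nodes must reach the same result.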

需要说明的是，关于上述随机数具体的获取方式，可以在区块链上生成，也可以从链外获取，在本说明书中不进行特别限定。It should be noted that the specific acquisition method of the above random number can be generated on the blockchain or acquired from outside the chain, which is not particularly limited in this specification.

以下是本说明书示出的几种用于获取随机数的具体方式：The following are several specific methods for obtaining random numbers shown in this specification:

In one illustrated manner, a random function for generating random numbers may be deployed on the blockchain in advance. For example, in practical applications, the random function may be deployed on the blockchain as an independent smart contract, or deployed as part of the execution logic of the smart contract used for approximate calculation. In this case, the random number may be generated on the blockchain by calling the random function.

In another illustrated manner, the blockchain node device may be equipped with a Trusted Execution Environment (TEE). A random number seed for generating random numbers may be maintained in the trusted execution environment in advance. In this case, the random number may be generated in the trusted execution environment based on that random number seed.

In a third illustrated manner, a target data parameter that can serve as a random number seed may be obtained from the data parameters related to the data maintained by the smart contract used for approximate calculation, and the random number may then be generated in the smart contract based on the obtained target data parameter. For example, unique parameters such as the hash value of a historical block or the generation timestamp of a historical block maintained by the smart contract may be used as the random number seed to calculate the random number in the smart contract.

在示出的第四种方式中，上述随机数可以在链外生成。在这种情况下，上述智能合约，也可以通过与该智能合约对应的预言机程序，来获取在链外生成的该随机数。In the illustrated fourth manner, the above random number may be generated off-chain. In this case, the above-mentioned smart contract can also obtain the random number generated outside the chain through the oracle program corresponding to the smart contract.

In a fifth illustrated manner, a random number seed for generating the random number may be generated off-chain. In this case, the smart contract may obtain the off-chain random number seed through the oracle program corresponding to the smart contract, and then generate the random number in the smart contract based on the obtained seed.

在示出的第六种方式中，在链外生成的上述随机数种子，具体也可以作为计算参数携带在上述智能合约调用交易中。在这种情况下，可以获取该智能合约调用交易中包括的在链外生成的随机数种子，然后基于获取到的该随机数种子在该智能合约中生成随机数。In the sixth manner shown, the random number seed generated outside the chain can also be specifically carried in the smart contract invocation transaction as a calculation parameter. In this case, a random number seed generated off-chain included in the smart contract calling transaction can be obtained, and then a random number can be generated in the smart contract based on the obtained random number seed.

Several common ways of obtaining random numbers are listed above. It should be emphasized that, in practical applications, random numbers may also be obtained in ways other than those listed; they are not enumerated one by one in this specification.
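As a non-limiting illustration of the third manner above, a random number seed can be derived from the hash value and generation timestamp of a historical block. The function name `seed_from_block`, the use of SHA-256, and the byte layout are assumptions made for this sketch only.

```python
import hashlib


def seed_from_block(block_hash_hex: str, timestamp: int) -> int:
    """Derive an integer seed from a block hash (hex string) and a timestamp."""
    # Concatenate the raw hash bytes with the timestamp as an 8-byte big-endian integer.
    material = bytes.fromhex(block_hash_hex) + timestamp.to_bytes(8, "big")
    # Hash the combined material and interpret the digest as an integer seed.
    return int.from_bytes(hashlib.sha256(material).digest(), "big")
```

A seed derived this way is deterministic, so every node that re-executes the same transaction against the same historical block obtains the same seed and therefore the same samples.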

In the illustrated embodiment, if stratified sampling is used to sample the data samples in the non-outlier data subset, the non-outlier data subset usually needs to be divided into several buckets, and data is then sampled from each bucket separately. Therefore, when the blockchain node device invokes the smart contract to calculate the sampling numbers required for stratified sampling, it may obtain the confidence probability δ corresponding to the approximate calculation and the error value ε_k corresponding to each bucket from the calculation parameters carried in the smart contract invocation transaction, and input the confidence probability δ and each bucket's error value ε_k into the maintained mathematical relationship to obtain the sampling number corresponding to each bucket into which the non-outlier data subset is divided.

It should be noted that, when stratified sampling is performed on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket may be specified by the calculation initiator and carried in the smart contract invocation transaction as calculation parameters. In this case, for example, in addition to the total error ε_g specified by the calculation initiator for the approximate calculation on the data set, the calculation parameters also need to carry the K error values ε_k corresponding to the respective buckets.

Alternatively, when stratified sampling is performed on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket may be optimal values calculated autonomously on-chain by the smart contract.

在示出的一种实施方式中，需要划分出的bucket的数量K，以及每一个bucket对应的误差值ε_k，可以是由上述智能合约采用最优化求解方法求解出的最优值。In the illustrated embodiment, the number K of buckets to be divided, and the error value ε _k corresponding to each bucket may be the optimal value obtained by the above-mentioned smart contract using the optimal solution method.

In this case, when the blockchain node device invokes the sampling logic contained in the smart contract to perform stratified sampling on the non-outlier data samples in the non-outlier data subset, it may first use an optimization method to solve for the optimal number K of buckets into which the non-outlier data subset is to be divided and the optimal error value ε_k corresponding to each bucket. It may then input the confidence probability δ corresponding to the approximate calculation, taken from the calculation parameters in the smart contract invocation transaction, together with the optimal error value ε_k solved for each bucket, into the maintained mathematical relationship to obtain the optimal sampling number corresponding to each bucket. Stratified sampling may then be performed on the data samples in the non-outlier data subset based on the calculated optimal number K and the optimal sampling numbers.

需要说明的是，上述智能合约所采用的最优化求解方法的具体类型，在本说明书中不进行特别限定，在实际应用中，本领域技术人员可以基于实际的需求，来灵活的采用不同的最优化求解算法。例如，在实际应用中，具体可以采用诸如梯度下降法(GradientDescent)等常用的最优求解算法。It should be noted that the specific type of the optimization solution method adopted by the above-mentioned smart contracts is not particularly limited in this specification. In practical applications, those skilled in the art can flexibly adopt different optimal solutions based on actual needs. Optimize the solution algorithm. For example, in practical applications, a commonly used optimal solution algorithm such as a gradient descent method (Gradient Descent) can be specifically adopted.

其中，对于最优化求解方法而言，通常需要设置一个明确的约束条件。而在实际应用中，通常可以基于具体的求解目标，来设置上述约束条件。Among them, for the optimization solution method, it is usually necessary to set a clear constraint. In practical applications, the above constraints can usually be set based on specific solution objectives.

在对上述非离群数据子集进行分层采样的场景下，最优化的求解目标可以包括求解划分出的bucket的最优数量、求解出各个bucket对应的最优误差值，等等。那么，在实际应用中，就可以基于上述优化目标来为上述最优化求解方法设置上述约束条件。In the scenario of performing stratified sampling on the above non-outlier data subset, the optimization objective may include finding the optimal number of divided buckets, finding the optimal error value corresponding to each bucket, and so on. Then, in practical applications, the above-mentioned constraints can be set for the above-mentioned optimization solution method based on the above-mentioned optimization objective.

在示出的一种实施方式中，基于上述求解目标，为上述最优化求解方法设置约束条件具体可以是：In the illustrated embodiment, based on the above-mentioned solution objective, setting constraints for the above-mentioned optimization solution method may specifically be:

针对各个bucket对应的误差值进行加权平均计算，得到的加权平均误差值最小，并且不大于针对上述非离群数据子集进行近似计算对应的总误差值。The weighted average calculation is performed on the error values corresponding to each bucket, and the obtained weighted average error value is the smallest, and is not greater than the total error value corresponding to the approximate calculation for the above non-outlier data subset.

例如，上述约束条件可以表示成如下的公式：For example, the above constraints can be expressed as the following formula:

In the above formula, ε_g denotes the total error value corresponding to the approximate calculation on the non-outlier data subset; N_k denotes the number of samples drawn from the k-th bucket; and N denotes the total number of samples drawn from the non-outlier data subset.
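As a hedged illustration only, one way of writing a constraint consistent with the description and the symbols above is given below. It assumes that the weight of the k-th bucket is N_k / N and that N is the sum of the N_k over all K buckets; the symbol \bar{\varepsilon} for the weighted average error is introduced for this sketch and is not taken from this specification.

```latex
\[
\min \; \bar{\varepsilon} = \sum_{k=1}^{K} \frac{N_k}{N} \, \varepsilon_k
\quad \text{subject to} \quad
\bar{\varepsilon} \le \varepsilon_g,
\qquad N = \sum_{k=1}^{K} N_k
\]
```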

以下通过附图和具体的实施例来描述采用上述约束条件，来求解分层采样所需的bucket的最优数量和各个bucket的最优误差值的具体算法流程。The following describes a specific algorithm flow for solving the optimal number of buckets required for stratified sampling and the optimal error value of each bucket by using the above constraints by using the accompanying drawings and specific embodiments.

请参见图2，图2为本说明书示出的一种最优化求解方法的流程图，包括以下的执行步骤：Please refer to Fig. 2, Fig. 2 is a flow chart of an optimization solution method shown in this specification, including the following execution steps:

Step 201, initialize the i value, where the i value represents the initially configured number of samples contained in each bucket. Apart from step 201, the following steps are executed iteratively:

步骤202，对初始化的i值对应的数值进行调整；Step 202, adjusting the value corresponding to the initialized i value;

其中，对i值的调整幅度可以灵活设置，在本说明书中不进行特别限定。The adjustment range of the i value can be set flexibly, and is not particularly limited in this specification.

步骤203，将所述非离群数据子集划分为分别包含i个样本数量的若干bucket；Step 203, dividing the non-outlier data subset into several buckets containing i samples respectively;

Step 204, input the confidence probability δ (that is, the confidence probability δ contained in the calculation parameters carried in the smart contract invocation transaction) and the adjusted i value (that is, the number of samples corresponding to each bucket) into the mathematical relationship as calculation parameters to obtain the error value corresponding to each bucket, and perform a weighted average calculation on the error values of the buckets to obtain the weighted average error value;

其中，需要说明的是，如果是第一轮迭代，执行完步骤204，会重新执行步骤202-步骤204，执行第二轮迭代。It should be noted that, if it is the first round of iteration, after step 204 is performed, steps 202 to 204 will be performed again, and the second round of iteration will be performed.

步骤205，确定所述加权平均误差值是否不大于所述总误差值(即智能合约调用交易携带的计算参数中包含的与近似计算对应的误差值)，并且小于上一轮迭代计算出的加权平均误差值(即基于本轮迭代调整之前的i值计算出的加权平均误差值)；如果否，重新执行以上的步骤202-步骤205，继续执行下一轮迭代，并重复以上的迭代过程，直至最优化求解算法收敛，计算出满足上述约束条件的加权平均误差值时停止迭代。Step 205: Determine whether the weighted average error value is not greater than the total error value (that is, the error value corresponding to the approximate calculation contained in the calculation parameters carried by the smart contract invocation transaction), and is smaller than the weighted value calculated by the previous iteration. Average error value (that is, the weighted average error value calculated based on the i value before this round of iterative adjustment); if not, re-execute the above steps 202 to 205, continue to perform the next round of iteration, and repeat the above iterative process, The iteration is stopped until the optimization algorithm converges and the weighted average error value satisfying the above constraints is calculated.

步骤206，在停止迭代后，获取使得计算出的加权平均误差值满足上述约束条件时的最优i值；Step 206, after stopping the iteration, obtain the optimal i value when the calculated weighted average error value satisfies the above constraints;

Step 207, determine, based on the optimal i value, the optimal number of buckets into which the non-outlier data subset is to be divided for stratified sampling, and input the confidence probability and the optimal i value into the mathematical relationship again to obtain the optimal error value corresponding to each bucket.

It should be noted that the above embodiment is one specific way of setting constraints for the optimization method based on the solution objective described above. In practical applications, other forms of constraints may also be set for the optimization method based on that solution objective.

In this specification, after the smart contract has used the optimization method to solve for the optimal number of buckets to be divided and the optimal sampling number corresponding to each bucket, stratified sampling may be performed on the non-outlier data subset based on the optimal number and the optimal error value.
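As a non-limiting illustration, the iterative flow of steps 201 to 207 can be sketched as a simple search over bucket sizes. This is a sketch under stated assumptions and not the optimization method of this specification: the data is sorted and cut into contiguous buckets, the i value is halved in each iteration, the per-bucket error uses a Hoeffding-style bound with the bucket size as the sample count, and the names `weighted_error`, `search_bucket_size` and `min_size` are hypothetical. The sketch returns the weighted average error rather than the individual per-bucket errors.

```python
import math
from typing import Sequence, Tuple


def weighted_error(ordered: Sequence[float], size: int, confidence: float) -> float:
    """Steps 203-204: bucket the sorted data, then weight each bucket's error."""
    log_term = math.log(2.0 / (1.0 - confidence))
    total = 0.0
    for start in range(0, len(ordered), size):
        bucket = ordered[start:start + size]
        spread = bucket[-1] - bucket[0]  # max - min, since the data is sorted
        # Hoeffding-style error with the bucket size used as the sample count.
        err = spread * math.sqrt(log_term / (2.0 * len(bucket)))
        total += len(bucket) / len(ordered) * err
    return total


def search_bucket_size(data: Sequence[float], total_error: float, confidence: float,
                       start_size: int, min_size: int = 2) -> Tuple[int, int, float]:
    """Return (best bucket size i, number of buckets K, weighted average error)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    ordered = sorted(data)
    size = start_size                                   # step 201: initialize i
    best_size, best_err = size, weighted_error(ordered, size, confidence)
    while size > min_size:
        size = max(min_size, size // 2)                 # step 202: adjust i
        err = weighted_error(ordered, size, confidence)
        if err >= best_err:                             # step 205: no improvement, stop
            break
        best_size, best_err = size, err
    if best_err > total_error:
        raise ValueError("no bucket size meets the total error constraint")
    num_buckets = math.ceil(len(ordered) / best_size)   # steps 206-207
    return best_size, num_buckets, best_err
```

Halving is only one possible adjustment; the specification leaves the adjustment range of the i value open, and a different step rule or a gradient-based solver could be substituted.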

In the illustrated embodiment, when the smart contract performs stratified sampling on the non-outlier data subset based on the optimal number and the optimal error value, it may first divide the non-outlier data subset into several buckets according to the optimal number; for example, if the optimal number is K, the non-outlier data subset is divided into K buckets. The data samples in each of the divided buckets may then be sampled according to the optimal sampling number.

其中，按照上述最优采样数量分别对各个bucket中的数据样本进行采样所采用的具体的采样方式，在本说明书中不进行特别限定。Wherein, the specific sampling manner used for sampling the data samples in each bucket according to the above-mentioned optimal sampling quantity is not particularly limited in this specification.

For example, if random sampling is used to sample the data samples in each bucket according to the optimal sampling number, a random number for the random sampling may first be obtained, and the non-outlier data samples in each bucket are then randomly sampled based on that random number to obtain the calculated optimal number of non-outlier data samples. For the specific ways of obtaining the random number, reference may be made to the description of the preceding embodiments; details are not repeated here.
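As a non-limiting illustration, dividing the non-outlier data subset into K buckets and randomly sampling each bucket can be sketched as follows. The function name `stratified_sample` is hypothetical, and the sketch assumes that buckets are contiguous ranges of the sorted values, that every bucket uses the same sampling number, and that the random numbers come from a seeded pseudo-random generator.

```python
import random
from typing import List, Sequence


def stratified_sample(data: Sequence[float], num_buckets: int,
                      per_bucket: int, seed: int) -> List[List[float]]:
    """Split sorted data into num_buckets buckets and sample each one."""
    rng = random.Random(seed)
    ordered = sorted(data)
    n = len(ordered)
    picked = []
    for k in range(num_buckets):
        # Integer arithmetic gives exactly num_buckets near-equal slices.
        bucket = ordered[k * n // num_buckets:(k + 1) * n // num_buckets]
        picked.append(rng.sample(bucket, min(per_bucket, len(bucket))))
    return picked
```

In practice each bucket could be given its own sampling number; a single `per_bucket` value is used here only to keep the sketch short.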

步骤106，进一步调用所述智能合约调用交易包含的计算逻辑，针对所述离群数据子集中的离群数据样本进行精确计算，针对从所述非离群数据子集中采样得到的非离群数据样本进行近似计算，并合并所述精确计算和所述近似计算的结果，以作为针对所述数据集合的近似计算结果。Step 106, further invoking the calculation logic included in the smart contract invocation transaction to perform accurate calculation for the outlier data samples in the outlier data subset, and for the non-outlier data sampled from the non-outlier data subset. An approximate calculation is performed on the sample, and the results of the exact calculation and the approximate calculation are combined as an approximate calculation result for the data set.

In this specification, after the non-outlier data samples in the non-outlier data subset have been sampled, an approximate calculation may be performed on the sampled non-outlier data samples.

When the approximate calculation is performed on the sampled non-outlier data samples, either the calculation type specified by the calculation initiator or the default calculation type supported by the smart contract may be used; this is not particularly limited in this specification.

例如，在示出的一种实施方式中，在上述智能合约调用交易中，还可以包括采样算法ID。该采样算法ID具体可以用于指示计算发起方指定的针对上述数据集合进行近似计算的计算类型。For example, in the illustrated embodiment, the above-mentioned smart contract invocation transaction may further include a sampling algorithm ID. The sampling algorithm ID may be specifically used to indicate the calculation type designated by the calculation initiator to perform approximate calculation on the above-mentioned data set.

在这种情况下，在针对采样得到的非离群数据样本进行近似计算时，可以获取该智能合约调用交易中包括的采样算法ID，然后按照该采样算法ID指示的计算类型针对采集得到的非离群数据样本进行近似计算。In this case, when performing approximate calculation on the sampled non-outlier data samples, the sampling algorithm ID included in the smart contract invocation transaction can be obtained, and then according to the calculation type indicated by the sampling algorithm ID, the collected non-outlier data samples can be obtained. Outlier data samples are approximated.

当然，如果上述智能合约调用交易不包括上述采样算法ID，也即计算发起方并没有指定的针对上述数据集合进行近似计算的计算类型，也可以基于上述智能合约支持的默认计算类型针对采样得到的非离群样本数据进行近似计算。Of course, if the above-mentioned smart contract invocation transaction does not include the above-mentioned sampling algorithm ID, that is, the calculation initiator does not specify the calculation type to perform approximate calculation on the above-mentioned data set, it can also be based on the default calculation type supported by the above-mentioned smart contract. Approximate calculations are performed on non-outlier sample data.

The outlier data samples in the outlier data subset need not be sampled. Moreover, because an approximate calculation on outlier data samples usually skews the approximate calculation result for the data set, the outlier data samples in the outlier data subset may be calculated exactly rather than approximately.

其中，在针对离群数据子集中的离群数据样本进行精确计算时，可以采用计算发起方指定的计算类型进行精确计算，也可以采用上述智能合约支持的默认计算类型进行精确计算，在本说明书中不进行特别限定。Among them, when accurate calculation is performed for the outlier data samples in the outlier data subset, the calculation type specified by the calculation initiator can be used for accurate calculation, or the default calculation type supported by the above smart contract can be used for accurate calculation. is not particularly limited.

For example, when the exact calculation is performed on the outlier data in the outlier data subset, the sampling algorithm ID included in the smart contract invocation transaction may be obtained, and the exact calculation is then performed on the outlier data samples in the outlier data subset according to the calculation type indicated by that sampling algorithm ID. Of course, if the smart contract invocation transaction does not include the sampling algorithm ID, the exact calculation may be performed on the outlier data in the outlier data subset based on the default calculation type supported by the smart contract.

需要说明的是，上述近似计算和精确计算对应的计算类型，在本说明书中不进行特别限定。例如，可以包括求和、求平均值，等等，在本说明书中不再进行一一列举。It should be noted that the calculation types corresponding to the above approximate calculation and exact calculation are not particularly limited in this specification. For example, summation, averaging, etc. may be included, which will not be enumerated in this specification.

当完成针对离群数据子集中的离群数据样本的精确计算，以及针对非离群数据子集中采样得到的非离群数据样本的近似计算之后，可以合并上述精确计算和上述近似计算的结果，以作为针对上述数据集合的最终的近似计算结果。After the accurate calculation for the outlier data samples in the outlier data subset and the approximate calculation for the non-outlier data samples sampled from the non-outlier data subset are completed, the results of the above accurate calculation and the above approximate calculation can be combined, as the final approximate calculation result for the above data set.
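As a non-limiting illustration, merging the exact and approximate results for the sum and average calculation types can be sketched as follows. This specification does not state how the two results are merged; the scale-up estimator used here (sample mean multiplied by the size of the non-outlier data subset) and the function names are assumptions.

```python
from typing import Sequence


def combined_sum(outliers: Sequence[float], sampled: Sequence[float],
                 non_outlier_count: int) -> float:
    """Exact sum of the outliers plus a scaled-up estimate for the non-outliers."""
    exact = sum(outliers)
    if not sampled:
        return float(exact)
    # Estimate the non-outlier total as (sample mean) * (subset size).
    approx = sum(sampled) / len(sampled) * non_outlier_count
    return exact + approx


def combined_mean(outliers: Sequence[float], sampled: Sequence[float],
                  non_outlier_count: int) -> float:
    """Mean over the whole data set, built from the combined sum."""
    total_count = len(outliers) + non_outlier_count
    return combined_sum(outliers, sampled, non_outlier_count) / total_count
```

For one outlier of 1000.0 and a sample `[10.0, 12.0, 8.0]` drawn from nine non-outlier values, the combined sum is 1000.0 + 10.0 × 9 = 1090.0 and the combined mean is 109.0.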

在示出的一种实施方式中，上述用于进行近似计算的智能合约，具体还可以是一个部署在区块链节点设备搭载的可信执行环境中的隐私智能合约。In the illustrated embodiment, the above-mentioned smart contract for performing approximate calculation may specifically be a privacy smart contract deployed in a trusted execution environment carried by a blockchain node device.

在这种场景下，上述智能合约调用交易中的计算参数，以及获取到的上述数据集合中的数据样本，通常都预先进行了加密处理。区块链节点设备在调用上述智能合约包含的采样逻辑，将该数据集合划分为离群数据子集和非离群数据子集之前，还可以在该可信执行环境中，对上述计算参数以及对获取到的上述数据集合中的数据样本分别进行解密。In this scenario, the calculation parameters in the above smart contract invocation transaction and the obtained data samples in the above data set have usually been encrypted in advance. Before the blockchain node device invokes the sampling logic contained in the smart contract to divide the data set into an outlier data subset and a non-outlier data subset, it can decrypt the calculation parameters and the obtained data samples in the data set, respectively, within the trusted execution environment.

例如，在一个例子中，可以为上述可信执行环境分配一对用于对数据进行加解密的非对称密钥对，并将上述非对称密钥的私钥存储在上述可信执行环境中，将上述非对称密钥的公钥发布给上述计算发起方。而上述智能合约调用交易中的计算参数，以及获取到的上述数据集合中的数据样本，都可以预先基于上述公钥进行加密。区块链节点设备在调用上述智能合约包含的采样逻辑，对获取到与上述数据标识对应的数据集合进行随机采样之前，还可以在该可信执行环境中，使用维护的私钥对上述计算参数以及对获取到的上述数据集合中的数据样本分别进行解密。For example, an asymmetric key pair for encrypting and decrypting data may be allocated to the above trusted execution environment, with the private key stored in the trusted execution environment and the public key published to the above calculation initiator. The calculation parameters in the smart contract invocation transaction and the obtained data samples in the data set can then be encrypted in advance with that public key. Before invoking the sampling logic contained in the smart contract to randomly sample the data set corresponding to the above data identifier, the blockchain node device can use the private key it maintains to decrypt the calculation parameters and the obtained data samples, respectively, within the trusted execution environment.

在以上技术方案中，在调用智能合约针对数据集合进行近似计算的场景下，通过在智能合约中引入针对该数据集合的采样机制，可以在不牺牲近似计算结果的准确度的基础上，降低对该数据集合进行近似计算时的耗时，提高针对该数据集合进行近似计算时的计算效率。In the above technical solution, when a smart contract is invoked to perform approximate calculation on a data set, introducing a sampling mechanism for that data set into the smart contract reduces the time consumed by the approximate calculation without sacrificing the accuracy of its result, and thus improves the efficiency of approximate calculation on the data set.

例如，仍以与业务相关的数据集合预先存证在区块链上为例，在智能合约中引入了数据采样机制之后，此时智能合约对与业务相关的数据集合进行计算时的总耗时，通常可以用如下的公式进行表示：For example, again taking the case where the business-related data set is stored as evidence on the blockchain in advance, after the data sampling mechanism is introduced into the smart contract, the total time the smart contract spends calculating over the business-related data set can usually be expressed by the following formula:

其中，在上述公式中，n_g表示从数据集合中采样得到的数据样本的数量。N_g表示数据集合中的数据样本的总数量。由于n_g的数值与N_g的数值相比，通常是数量级的差异，因此智能合约中引入了采样机制之后，通过该智能合约对数据集合进行计算时的耗时，也会数量级的减少。可见，在智能合约中引入了采样机制，可以显著的缩短对数据集合进行计算时的耗时，提高针对该数据集合进行近似计算时的计算效率。In the above formula, n_g represents the number of data samples sampled from the data set, and N_g represents the total number of data samples in the data set. Because n_g is usually orders of magnitude smaller than N_g, once the sampling mechanism is introduced into the smart contract, the time the smart contract spends calculating over the data set is also reduced by orders of magnitude. Introducing a sampling mechanism into the smart contract can therefore significantly shorten the time spent calculating over the data set and improve the efficiency of approximate calculation on it.

而且，由于在对该数据集合进行近似计算的过程中，不对该数据集合中的离群数据进行采样后执行近似计算，而是不进行采样直接进行精确计算，从而可以该数据集合中包括离群数据的情况下，进一步避免这些离群数据样本对针对该数据集合的近似计算结果的准确度造成影响，可以最大程度的确保针对该数据集合进行近似计算的准确度。Moreover, during the approximate calculation on the data set, the outlier data in the data set is not sampled and then approximated; instead, it is calculated exactly without sampling. When the data set contains outlier data, this further prevents the outlier data samples from affecting the accuracy of the approximate calculation result, so the accuracy of the approximate calculation on the data set is ensured to the greatest extent.

与上述方法实施例相对应，本申请还提供了装置的实施例。Corresponding to the above method embodiments, the present application also provides device embodiments.

本说明书的装置的实施例可以应用在电子设备上。装置实施例可以通过软件实现，也可以通过硬件或者软硬件结合的方式实现。以软件实现为例，作为一个逻辑意义上的装置，是通过其所在电子设备的处理器将非易失性存储器中对应的计算机程序指令读取到内存中运行形成的。Embodiments of the apparatus of this specification can be applied to electronic equipment. The apparatus embodiment may be implemented by software, or may be implemented by hardware or a combination of software and hardware. Taking software implementation as an example, a device in a logical sense is formed by reading the corresponding computer program instructions in the non-volatile memory into the memory for operation by the processor of the electronic device where the device is located.

从硬件层面而言，如图3所示，为本说明书的装置所在电子设备的一种硬件结构图，除了图3所示的处理器、内存、网络接口、以及非易失性存储器之外，实施例中装置所在的电子设备通常根据该电子设备的实际功能，还可以包括其他硬件，对此不再赘述。At the hardware level, FIG. 3 shows a hardware structure diagram of the electronic device in which the apparatus of this specification is located. In addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 3, the electronic device in which the apparatus is located may also include other hardware according to its actual functions, which is not described further here.

图4是本说明书一示例性实施例示出的一种基于智能合约的计算装置的框图。FIG. 4 is a block diagram of a smart contract-based computing device shown in an exemplary embodiment of the present specification.

请参考图4，所述基于智能合约的计算装置40可以应用在前述图3所示的电子设备中，所述区块链上部署了用于执行近似计算的智能合约，所述装置40包括：Referring to FIG. 4 , the smart contract-based computing device 40 can be applied to the electronic equipment shown in the aforementioned FIG. 3 , and a smart contract for performing approximate calculations is deployed on the blockchain, and the device 40 includes:

接收模块401，接收计算发起方发起的针对所述智能合约的智能合约调用交易；其中，所述智能合约调用交易包括与所述近似计算对应的计算参数；所述计算参数包括参与近似计算的数据集合的数据标识；A receiving module 401, receiving a smart contract invocation transaction for the smart contract initiated by a computing initiator; wherein, the smart contract invocation transaction includes a calculation parameter corresponding to the approximate calculation; the calculation parameter includes data participating in the approximate calculation The data identifier of the collection;

采样模块402，响应于所述智能合约调用交易，调用所述智能合约调用交易包含的采样逻辑，将与所述数据标识对应的所述数据集合划分为由若干离群数据样本构成的离群数据子集，和由若干非离群数据样本构成的非离群数据子集，并针对所述非离群数据子集中的非离群数据样本进行采样；Sampling module 402, in response to the smart contract invocation transaction, invokes the sampling logic included in the smart contract invocation transaction, and divides the data set corresponding to the data identifier into outlier data consisting of several outlier data samples a subset, and a non-outlier data subset composed of several non-outlier data samples, and sampling is performed for the non-outlier data samples in the non-outlier data subset;

计算模块403，进一步调用所述智能合约调用交易包含的计算逻辑，针对所述离群数据子集中的离群数据样本进行精确计算，针对从所述非离群数据子集中采样得到的非离群数据样本进行近似计算，并合并所述精确计算和所述近似计算的结果，以作为针对所述数据集合的近似计算结果。The calculation module 403 further invokes the calculation logic included in the smart contract invocation transaction to perform accurate calculation for the outlier data samples in the outlier data subset, and for the non-outlier data samples sampled from the non-outlier data subset An approximate calculation is performed on a sample of data, and the results of the exact calculation and the approximate calculation are combined as an approximate calculation result for the data set.

在本实施例中，所述装置40还包括：In this embodiment, the device 40 further includes:

获取模块404(图4中未示出)，在采样模块402将与所述数据标识对应的所述数据集合划分为由若干离群数据样本构成的离群数据子集，和由若干非离群数据样本构成的非离群数据子集之前，获取所述区块链上存证的与所述数据标识对应的数据集合；或者，通过与所述智能合约对应的预言机程序，从与所述区块链对接的链外数据库中获取与所述数据标识对应的数据集合。An acquisition module 404 (not shown in FIG. 4), which, before the sampling module 402 divides the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, obtains the data set corresponding to the data identifier that is stored as evidence on the blockchain; or obtains the data set corresponding to the data identifier from an off-chain database connected to the blockchain, through the oracle program corresponding to the smart contract.

在本实施例中，所述采样模块402：In this embodiment, the sampling module 402:

针对与所述数据标识对应的所述数据集合中的数据样本进行离群数据计算，以确定所述数据集合中包含的离群数据样本和非离群数据样本；Perform outlier data calculation on the data samples in the data set corresponding to the data identifier to determine the outlier data samples and non-outlier data samples included in the data set;

基于所述离群数据样本创建所述离群数据子集，并基于所述非离群数据样本创建非离群数据子集。The outlier data subset is created based on the outlier data samples, and a non-outlier data subset is created based on the non-outlier data samples.
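The partitioning performed by the sampling module can be sketched as below. The passage does not fix the outlier test, so this example assumes the common interquartile-range rule (values more than `k` times the IQR outside the first or third quartile count as outliers); the function name and the parameter `k` are illustrative.

```python
from statistics import quantiles


def split_outliers(samples, k=1.5):
    """Split samples into (outliers, non_outliers) using the IQR rule (an assumption)."""
    if len(samples) < 4:
        # Too few samples to estimate quartiles meaningfully; treat none as outliers.
        return [], list(samples)
    q1, _, q3 = quantiles(samples, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    outliers = [x for x in samples if x < low or x > high]
    non_outliers = [x for x in samples if low <= x <= high]
    return outliers, non_outliers
```

The two returned lists correspond to the outlier data subset and the non-outlier data subset, respectively.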

在本实施例中，所述计算参数包括与所述近似计算对应的置信概率；以及，与所述近似计算对应的总误差值；所述置信概率表征所述近似计算的准确度；所述智能合约维护了基于霍夫丁不等式推导出的，用于描述与所述近似计算对应的置信概率，与所述近似计算对应的误差值，以及与参与所述近似计算的数据集合对应的采样数量三者之间的数学关系；In this embodiment, the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability characterizes the accuracy of the approximate calculation; the smart contract maintains a mathematical relationship, derived from Hoeffding's inequality, that describes the relationship among the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the number of samples corresponding to the data set participating in the approximate calculation;

所述采样模块402进一步：The sampling module 402 further:

在针对所述非离群数据子集中的数据样本进行采样之前，将与所述近似计算对应的所述置信概率以及与所述近似计算对应的所述误差值，输入至所述数学关系中进行计算，得到与所述非离群数据子集对应的采样数量。Before sampling the data samples in the non-outlier data subset, input the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship for calculation, to obtain the number of samples corresponding to the non-outlier data subset.

在本实施例中，所述数据关系利用如下的公式进行表示：In this embodiment, the mathematical relationship is represented by the following formula:

其中，在上述公式中，n_g表示所述采样数量；b_g、a_g分别表示所述数据集合中的数据样本的最大值和最小值；δ表示所述置信概率；ε_g表示所述误差值；N_g表示所述数据集合中的数据样本的总数量。In the above formula, n_g represents the number of samples; b_g and a_g represent the maximum and minimum values of the data samples in the data set, respectively; δ represents the confidence probability; ε_g represents the error value; N_g represents the total number of data samples in the data set.
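The formula of this embodiment is not reproduced in the text above, so the following is only a generic illustration of how a sample count can be derived from Hoeffding's inequality and is not claimed to be the relationship maintained by the smart contract. For samples bounded in [a, b], the classical bound on the sample mean is P(|estimate - mean| >= ε) <= 2·exp(-2·n·ε² / (b - a)²). Requiring the right-hand side to be at most 1 - δ, with δ read as the confidence probability, gives n >= (b - a)²·ln(2 / (1 - δ)) / (2·ε²). The sketch assumes this reading of δ and uses the population size only as an upper cap; the function and parameter names are hypothetical.

```python
import math


def hoeffding_sample_size(a, b, confidence, epsilon, population):
    """Smallest n satisfying the classical Hoeffding bound, capped at the population."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # n >= (b - a)^2 * ln(2 / (1 - confidence)) / (2 * epsilon^2)
    n = math.ceil((b - a) ** 2 * math.log(2 / (1 - confidence)) / (2 * epsilon ** 2))
    # At least one sample, and never more samples than the subset contains.
    return min(max(n, 1), population)
```

For values in [0, 100], a confidence of 0.95 and an error of 5, the bound asks for 738 samples regardless of whether the subset holds ten thousand or ten million records, which is where the saving in calculation time comes from.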

在本实施例中，所述采样模块402:In this embodiment, the sampling module 402:

按照计算出的所述采样数量针对所述非离群数据子集中的非离群数据样本进行采样。The non-outlier data samples in the non-outlier data subset are sampled according to the calculated sampling number.

在本实施例中，针对所述非离群数据子集中的非离群数据样本进行的采样包括随机采样；In this embodiment, the sampling performed for the non-outlier data samples in the non-outlier data subset includes random sampling;

所述采样模块402进一步：The sampling module 402 further:

获取用于进行随机采样的随机数；Get a random number for random sampling;

基于所述随机数针对所述非离群数据子集中的非离群数据样本进行随机采样，得到与计算出的所述采样数量对应的非离群数据样本。Random sampling is performed on the non-outlier data samples in the non-outlier data subset based on the random number to obtain non-outlier data samples corresponding to the calculated sampling number.
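A minimal sketch of random sampling driven by a supplied random number: the random number seeds a generator, so any node that receives the same number draws the same samples. Using the random number as a seed for Python's `random.Random` is an assumption made for this example, not something the specification requires.

```python
import random


def random_sample(non_outliers, sample_count, random_number):
    """Draw sample_count distinct samples, reproducibly, from the non-outlier subset."""
    generator = random.Random(random_number)  # seeded, so the draw is repeatable
    count = min(sample_count, len(non_outliers))
    return generator.sample(non_outliers, count)
```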

在本实施例中，针对所述非离群数据子集中的非离群数据样本进行的采样包括分层采样；In this embodiment, the sampling performed for the non-outlier data samples in the non-outlier data subset includes stratified sampling;

所述采样模块402进一步:The sampling module 402 further:

采用最优化求解方法，求解在针对所述非离群数据子集进行分层采样时所需划分出的数据子集的最优数量，以及与划分出的各个数据子集对应的最优误差值；An optimal solution method is used to obtain the optimal number of data subsets that need to be divided when stratified sampling is performed on the non-outlier data subsets, and the optimal error value corresponding to each divided data subset. ;

将与所述近似计算对应的所述置信概率以及与所述各个数据子集对应的所述最优误差值，输入至所述数学关系中进行计算，得到与所述各个数据子集分别对应的最优采样数量；Input the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to the respective data subsets into the mathematical relationship for calculation, and obtain the corresponding values for the respective data subsets. Optimal sampling number;

按照所述最优数量将所述数据集合划分为若干个数据子集，并按照所述最优采样数据分别对各个数据子集中的非离群数据样本进行采样。Divide the data set into several data subsets according to the optimal number, and sample the non-outlier data samples in each data subset according to the corresponding optimal sampling number.
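The final step of stratified sampling, dividing the data into a given number of subsets and drawing a given number of samples from each, can be sketched as follows. Forming the subsets as contiguous slices of the sorted data is an assumption of this example; the subset count and the per-subset sample counts are taken as inputs because they come from the optimization described above.

```python
import random


def stratify(samples, strata_count):
    """Split the sorted samples into strata_count contiguous, near-equal subsets."""
    if strata_count < 1:
        raise ValueError("strata_count must be at least 1")
    ordered = sorted(samples)
    size, extra = divmod(len(ordered), strata_count)
    strata, start = [], 0
    for k in range(strata_count):
        end = start + size + (1 if k < extra else 0)
        strata.append(ordered[start:end])
        start = end
    return strata


def stratified_sample(samples, strata_count, sample_counts, seed):
    """Sample each stratum with its own sample count, using one seeded generator."""
    generator = random.Random(seed)
    strata = stratify(samples, strata_count)
    return [
        generator.sample(stratum, min(count, len(stratum)))
        for stratum, count in zip(strata, sample_counts)
    ]
```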

在本实施例中，其中，所述最优化求解方法所采用的约束条件包括：针对各个数据子集对应的误差值进行加权平均计算得到的加权平均误差值最小，并且不大于所述总误差值；In this embodiment, the constraints adopted by the optimization solution method include: the weighted average error value obtained by performing the weighted average calculation on the error values corresponding to each data subset is the smallest and not greater than the total error value ;

所述采样模块402进一步执行如下步骤：The sampling module 402 further performs the following steps:

步骤A，对初始化的i值对应的数值进行调整；Step A, adjust the value corresponding to the initialized i value;

步骤B，将所述非离群数据子集划分为分别包含i个样本数量的若干数据子集；Step B, dividing the non-outlier data subset into several data subsets containing i samples respectively;

步骤C，将所述置信概率以及调整之后的i值作为计算参数，输入至所述数学关系中进行计算，得到与所述各个数据子集分别对应的误差值，并对所述各个数据子集对应的误差值进行加权平均计算，得到加权平均误差值；In step C, the confidence probability and the adjusted i value are used as calculation parameters, and are input into the mathematical relationship for calculation, so as to obtain the error values corresponding to the respective data subsets, and for the respective data subsets. The corresponding error value is calculated by weighted average to obtain the weighted average error value;

步骤D，确定所述加权平均误差值是否不大于所述总误差值，并且小于基于本次调整之前的所述i值计算出的所述加权平均误差值；如果否，重新执行以上的步骤A-步骤D，直到计算出的所述加权平均误差值满足所述约束条件时停止迭代，并获取使得所述加权平均误差值满足所述约束条件时的最优i值；Step D, determine whether the weighted average error value is not greater than the total error value, and is smaller than the weighted average error value calculated based on the i value before this adjustment; if not, re-execute the above step A - Step D, stop the iteration until the calculated weighted average error value satisfies the constraint condition, and obtain the optimal i value when the weighted average error value satisfies the constraint condition;

步骤E，基于所述最优i值确定针对所述非离群数据子集进行分层抽样时所需划分出的数据子集的最优数量，并将所述置信概率以及与所述最优i值，输入至所述数学关系中进行计算，得到与所述各个数据子集分别对应的最优误差值。Step E: based on the optimal i value, determine the optimal number of data subsets to be divided when performing stratified sampling on the non-outlier data subset, and input the confidence probability and the optimal i value into the mathematical relationship for calculation, to obtain the optimal error values corresponding to the respective data subsets.
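The check made in steps C and D can be illustrated in isolation. The sketch below computes a weighted average of per-subset error values and tests the stopping condition: not greater than the total error value, and smaller than the value from the previous adjustment. Weighting each subset by its number of samples is an assumption of this example, since the text does not state the weights, and both function names are hypothetical.

```python
def weighted_average_error(errors, subset_sizes):
    """Average the per-subset errors, weighting each by its subset size (assumed)."""
    total = sum(subset_sizes)
    if total == 0:
        raise ValueError("subset sizes must not all be zero")
    return sum(e * n for e, n in zip(errors, subset_sizes)) / total


def satisfies_constraint(current, previous, total_error):
    """Step D: within the total error and an improvement on the previous iteration."""
    return current <= total_error and current < previous
```

On the first iteration there is no earlier value, so a caller could pass `float("inf")` as `previous`.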

在本实施例中，所述采样模块进一步：In this embodiment, the sampling module further:

基于所述随机数分别对各个数据子集中的非离群数据样本进行随机采样，得到与计算出的所述最优采样数量对应的非离群数据样本。The non-outlier data samples in each data subset are randomly sampled based on the random numbers, to obtain non-outlier data samples corresponding to the calculated optimal sampling number.

在本实施例中，所述采样模块进一步执行以下示出的任一：In this embodiment, the sampling module further performs any one of the following:

调用所述区块链上部署的随机函数生成用于进行随机采样的随机树；Call the random function deployed on the blockchain to generate a random number for random sampling;

基于所述节点设备中搭载的可信执行环境中维护的随机数种子，在所述可信执行环境中生成随机数；generating a random number in the trusted execution environment based on the random number seed maintained in the trusted execution environment carried in the node device;

从所述智能合约维护的数据相关的数据参数中，获取作为所述随机数种子的目标数据参数，并基于获取到的目标数据参数在所述智能合约中生成用于进行随机采样的随机数；Obtain a target data parameter as the random number seed from the data parameters related to the data maintained by the smart contract, and generate a random number for random sampling in the smart contract based on the obtained target data parameter;

通过与所述智能合约对应的预言机程序，获取在链外生成的用于进行随机采样的随机数；Obtain random numbers generated outside the chain for random sampling through the oracle program corresponding to the smart contract;

通过与所述智能合约对应的预言机程序，获取在链外生成的用于生成随机数的随机数种子，并基于获取到的目标数据参数在所述智能合约中生成用于进行随机采样的随机数；Obtain, through the oracle program corresponding to the smart contract, a random number seed generated off-chain for generating random numbers, and generate a random number for random sampling in the smart contract based on the obtained random number seed;

获取所述计算参数中包括的在链外生成的随机数种子，基于所述随机数种子在所述智能合约中生成用于进行随机采样的随机数。Obtain the off-chain generated random number seed included in the calculation parameters, and generate a random number for random sampling in the smart contract based on that random number seed.
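Several of the options above generate the random number inside the smart contract from a seed. One way to do this so that every node arrives at the same value is to hash the seed together with a counter. The sketch below assumes SHA-256 and an 8-byte big-endian counter; these choices and the function names are illustrative only and are not taken from this specification.

```python
import hashlib


def random_number_from_seed(seed: bytes, counter: int = 0) -> int:
    """Derive a 256-bit number deterministically from a seed and a counter."""
    digest = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def pick_index(seed: bytes, counter: int, population: int) -> int:
    """Map the derived number to an index in range(population)."""
    if population < 1:
        raise ValueError("population must be at least 1")
    return random_number_from_seed(seed, counter) % population
```

Incrementing the counter yields a fresh number for each draw while keeping the whole sequence reproducible from the seed.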

在本实施例中，所述计算参数还包括算法标识；In this embodiment, the calculation parameter further includes an algorithm identifier;

所述计算模块403：The computing module 403:

按照所述算法标识指示的计算类型，针对所述离群数据子集中的离群数据样本进行精确计算；According to the calculation type indicated by the algorithm identifier, accurately calculate the outlier data samples in the outlier data subset;

针对从所述非离群数据子集中采样得到的非离群数据样本进行近似计算，包括：Approximate calculations are performed on the non-outlier data samples sampled from the non-outlier data subset, including:

按照所述算法标识指示的计算类型，针对从所述非离群数据子集中采样得到的非离群数据样本进行近似计算。According to the calculation type indicated by the algorithm identifier, an approximate calculation is performed on the non-outlier data samples sampled from the non-outlier data subset.

在本实施例中，所述智能合约部署在所述节点设备搭载的可信执行环境中；所述计算参数和所述数据集合中的数据样本预先经过了加密处理；In this embodiment, the smart contract is deployed in a trusted execution environment carried by the node device; the calculation parameters and the data samples in the data set are encrypted in advance;

所述采样模块402进一步：The sampling module 402 further:

将与所述数据标识对应的所述数据集合划分为由若干离群数据样本构成的离群数据子集，和由若干非离群数据样本构成的非离群数据子集之前，在所述可信执行环境中对所述计算参数以及对获取到的所述数据集合中的数据样本分别进行解密。Before dividing the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, decrypt the calculation parameters and the obtained data samples in the data set, respectively, in the trusted execution environment.

上述实施例阐明的系统、装置、模块或单元，具体可以由计算机芯片或实体实现，或者由具有某种功能的产品来实现。一种典型的实现设备为计算机，计算机的具体形式可以是个人计算机、膝上型计算机、蜂窝电话、相机电话、智能电话、个人数字助理、媒体播放器、导航设备、电子邮件收发设备、游戏控制台、平板计算机、可穿戴设备或者这些设备中的任意几种设备的组合。The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email sending and receiving device, game console, tablet computer, wearable device, or a combination of any of these devices.

在一个典型的配置中，计算机包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

内存可能包括计算机可读介质中的非永久性存储器，随机存取存储器(RAM)和/或非易失性内存等形式，如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。Memory may include non-persistent memory in computer readable media, random access memory (RAM) and/or non-volatile memory in the form of, for example, read only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括，但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁盘存储、量子存储器、基于石墨烯的存储介质或其他磁性存储设备或任何其他非传输介质，可用于存储可以被计算设备访问的信息。按照本文中的界定，计算机可读介质不包括暂存电脑可读媒体(transitory media)，如调制的数据信号和载波。Computer-readable media includes both persistent and non-permanent, removable and non-removable media, and storage of information may be implemented by any method or technology. Information may be computer readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Flash Memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cartridges, disk storage, quantum memory, graphene-based storage media or other magnetic storage devices or any other non-transmission media can be used to store information that can be accessed by computing devices. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

还需要说明的是，术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含，从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素，而且还包括没有明确列出的其他要素，或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下，由语句“包括一个……”限定的要素，并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。It should also be noted that the terms "comprising", "including", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article, or device that includes the element.

上述对本说明书特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下，在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外，在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中，多任务处理和并行处理也是可以的或者可能是有利的。The foregoing describes specific embodiments of the present specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. Additionally, the processes depicted in the figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

在本说明书一个或多个实施例使用的术语是仅仅出于描述特定实施例的目的，而非旨在限制本说明书一个或多个实施例。在本说明书一个或多个实施例和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式，除非上下文清楚地表示其他含义。还应当理解，本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of this specification. As used in one or more embodiments of this specification and the appended claims, the singular forms "a", "said", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

应当理解，尽管在本说明书一个或多个实施例可能采用术语第一、第二、第三等来描述各种信息，但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如，在不脱离本说明书一个或多个实施例范围的情况下，第一信息也可以被称为第二信息，类似地，第二信息也可以被称为第一信息。取决于语境，如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”。It will be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, such information should not be limited by these terms. These terms are only used to distinguish the same type of information from each other. For example, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information without departing from the scope of one or more embodiments of the present specification. Depending on the context, the word "if" as used herein can be interpreted as "at the time of" or "when" or "in response to determining."

以上所述仅为本说明书一个或多个实施例的较佳实施例而已，并不用以限制本说明书一个或多个实施例，凡在本说明书一个或多个实施例的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本说明书一个或多个实施例保护的范围之内。The above descriptions are only preferred embodiments of one or more embodiments of this specification, and are not intended to limit one or more embodiments of this specification. All within the spirit and principles of one or more embodiments of this specification, Any modifications, equivalent replacements, improvements, etc. made should be included within the protection scope of one or more embodiments of this specification.

Claims

1. A smart contract-based computing method, applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed, the method comprising:

receiving a smart contract invocation transaction for the smart contract initiated by a calculation initiator; wherein the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters include a data identifier of a data set participating in the approximate calculation;

in response to the smart contract invocation transaction, invoking the sampling logic included in the smart contract invocation transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;

further invoking the calculation logic included in the smart contract invocation transaction, performing exact calculation on the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and merging the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

2. The method of claim 1, further comprising, before dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples:

obtaining the data set corresponding to the data identifier that is stored as evidence on the blockchain; or,

obtaining the data set corresponding to the data identifier from an off-chain database connected to the blockchain, through an oracle program corresponding to the smart contract.

3. The method of claim 2, wherein dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples comprises:

performing outlier data calculation on the data samples in the data set corresponding to the data identifier to determine the outlier data samples and non-outlier data samples contained in the data set;

creating the outlier data subset based on the outlier data samples and creating the non-outlier data subset based on the non-outlier data samples.

4. The method of claim 3, wherein the calculation parameters comprise a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability characterizes the accuracy of the approximate calculation; the smart contract maintains a mathematical relationship, derived from Hoeffding's inequality, that describes the relationship among the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the number of samples corresponding to the data set participating in the approximate calculation;

prior to sampling the data samples in the non-outlier subset of data, further comprising:

inputting the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship for calculation to obtain the number of samples corresponding to the non-outlier data subset.

5. The method of claim 4, the mathematical relationship being represented by the following formula:

wherein, in the above formula, n_g represents the number of samples; b_g and a_g respectively represent the maximum value and the minimum value of the data samples in the data set; δ represents the confidence probability; ε_g represents the error value; N_g represents the total number of data samples in the data set.

6. The method of claim 3, sampling non-outlier data samples of the subset of non-outlier data, comprising:

sampling non-outlier data samples of the subset of non-outlier data according to the calculated number of samples.

7. The method of claim 6, the sampling for non-outlier data samples of the subset of non-outlier data comprising random sampling;

sampling non-outlier data samples of the subset of non-outlier data according to the calculated number of samples, comprising:

acquiring a random number for random sampling;

and randomly sampling non-outlier data samples in the non-outlier data subset based on the random number to obtain non-outlier data samples corresponding to the calculated sampling number.

8. The method of claim 6, the sampling for non-outlier data samples of the subset of non-outlier data comprising stratified sampling;

adopting an optimization solving method to solve for the optimal number of data subsets to be divided when stratified sampling is performed on the non-outlier data subset, and for the optimal error value corresponding to each divided data subset;

inputting the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship for calculation to obtain the optimal sampling number corresponding to each data subset;

and dividing the data set into a plurality of data subsets according to the optimal number, and sampling the non-outlier data samples in each data subset according to the corresponding optimal sampling number.

9. The method of claim 8, wherein the constraints employed by the optimization solution method include: performing weighted average calculation on error values corresponding to the data subsets to obtain a weighted average error value which is the minimum and is not greater than the total error value;

adopting an optimization solving method to solve for the optimal number of data subsets to be divided when stratified sampling is performed on the non-outlier data subset, and for the optimal error value corresponding to each divided data subset, wherein the optimization solving method comprises the following steps:

step A, adjusting a numerical value corresponding to the initialized value i;

step B, dividing the non-outlier data subset into a plurality of data subsets each containing i samples;

step C, inputting the confidence probability and the adjusted i value as calculation parameters into the mathematical relationship for calculation to obtain error values respectively corresponding to the data subsets, and performing weighted average calculation on the error values corresponding to the data subsets to obtain weighted average error values;

step D, determining whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated based on the i value before the current adjustment; if not, re-executing the steps A-D until the calculated weighted average error value meets the constraint condition, stopping iteration, and obtaining an optimal i value when the weighted average error value meets the constraint condition;

and E, determining the optimal number of the data subsets which need to be divided when the non-outlier data subsets are subjected to hierarchical sampling based on the optimal i value, and inputting the confidence probability and the optimal i value into the mathematical relationship for calculation to obtain optimal error values respectively corresponding to the data subsets.
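
Steps A to E above can be sketched in Python as follows, for illustration only. The "mathematical relationship" is not reproduced in this claim, so the sketch substitutes a Hoeffding-style bound, error = R * sqrt(ln(2 / (1 - p)) / (2n)), which is an assumption and not necessarily the relationship used by the method. The names `hoeffding_error`, `solve_optimal_i`, `value_range`, and `step`, the choice to adjust i by incrementing it, and the weighting of each subset by its size are likewise hypothetical.

```python
import math


def hoeffding_error(confidence, n, value_range):
    # ASSUMPTION: a Hoeffding-style bound stands in for the claimed
    # "mathematical relationship" between confidence, sample count and error.
    return value_range * math.sqrt(math.log(2.0 / (1.0 - confidence)) / (2.0 * n))


def split_into_subsets(samples, i):
    # Step B: consecutive subsets of i samples (the last one may be smaller).
    return [samples[k:k + i] for k in range(0, len(samples), i)]


def weighted_average_error(subsets, confidence, value_range):
    # Step C: per-subset error, weighted by each subset's share of the data.
    total = sum(len(s) for s in subsets)
    return sum(
        len(s) / total * hoeffding_error(confidence, len(s), value_range)
        for s in subsets
    )


def solve_optimal_i(samples, confidence, total_error, value_range,
                    initial_i=1, step=1):
    """Return (optimal_i, number_of_subsets, weighted_error), or None."""
    previous = math.inf
    i = initial_i
    while i <= len(samples):
        subsets = split_into_subsets(samples, i)
        current = weighted_average_error(subsets, confidence, value_range)
        # Step D: within the total error and better than the previous i.
        if current <= total_error and current < previous:
            # Step E: the number of subsets follows from the optimal i.
            return i, len(subsets), current
        previous = current
        i += step  # Step A: adjust i and iterate again.
    return None  # No i satisfied the constraint.
```

With 100 samples, a confidence probability of 0.95, a value range of 1.0, and a total error of 0.5, this sketch stops at i = 8, which divides the data into 13 subsets.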

10. The method of claim 8, wherein sampling non-outlier data samples in each data subset according to the corresponding optimal sampling number comprises:

acquiring a random number for random sampling;

and respectively carrying out random sampling on the non-outlier data samples in each data subset based on the random numbers to obtain the non-outlier data samples corresponding to the calculated optimal sampling number.

11. The method of claim 7 or 10, wherein acquiring the random number for random sampling comprises any one of:

calling a random function deployed on the blockchain to generate a random number for random sampling;

generating a random number in a trusted execution environment hosted by the node device, based on a random number seed maintained in the trusted execution environment;

acquiring a target data parameter serving as the random number seed from the data-related parameters maintained by the smart contract, and generating, in the smart contract, a random number for random sampling based on the acquired target data parameter;

acquiring, through an oracle program corresponding to the smart contract, a random number that is generated off-chain and used for random sampling;

acquiring, through an oracle program corresponding to the smart contract, a random number seed that is generated off-chain, and generating, in the smart contract, the random number for random sampling based on the acquired random number seed;

and acquiring a random number seed that is generated off-chain and included in the calculation parameters, and generating, in the smart contract, a random number for random sampling based on that random number seed.
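
As an illustration of the seed-based options above, the following Python sketch derives a seed from one or more parameters and then generates random numbers from it. Hashing the parameters with SHA-256 and using Python's `random.Random` are assumptions of this sketch; the claims do not name a hash function or generator, and `random.Random` is not cryptographically secure. The function names are hypothetical.

```python
import hashlib
import random


def seed_from_parameters(*parameters):
    """Derive an integer seed from contract-maintained or supplied parameters."""
    digest = hashlib.sha256()
    for parameter in parameters:
        encoded = str(parameter).encode("utf-8")
        # A length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(encoded).to_bytes(4, "big"))
        digest.update(encoded)
    return int.from_bytes(digest.digest(), "big")


def random_numbers_from_seed(seed, count, upper_bound):
    """Generate count reproducible integers in [0, upper_bound)."""
    rng = random.Random(seed)
    return [rng.randrange(upper_bound) for _ in range(count)]
```

Because the output depends only on the seed, every node that derives the same seed obtains the same random numbers, whether the seed comes from contract state, an oracle, or the calculation parameters.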

12. The method of claim 1, the calculation parameters further comprising an algorithm identification;

performing accurate calculation on outlier data samples in the outlier data subset, comprising:

accurately calculating the outlier data samples in the outlier data subset according to the calculation type indicated by the algorithm identifier;

performing an approximation calculation on non-outlier data samples sampled from the non-outlier data subset, comprising:

and performing approximate calculation on non-outlier data samples sampled from the non-outlier data subset according to the calculation type indicated by the algorithm identification.
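
A minimal Python sketch of selecting the calculation type from the algorithm identifier is given below for illustration. The identifiers "SUM" and "AVG", the function names, and the scaling of the sampled sum by the ratio of subset size to sample size are assumptions; the claim does not enumerate calculation types or prescribe an estimator.

```python
def approximate_sum(outliers, sampled, non_outlier_total):
    """Exact sum over the outliers plus a scaled sum over the sampled non-outliers."""
    exact_part = sum(outliers)
    # Scale the sampled sum up to the size of the full non-outlier subset.
    scale = non_outlier_total / len(sampled) if sampled else 0.0
    return exact_part + scale * sum(sampled)


def approximate_mean(outliers, sampled, non_outlier_total):
    """Combined sum estimate divided by the total number of data samples."""
    total_count = len(outliers) + non_outlier_total
    if total_count == 0:
        return 0.0
    return approximate_sum(outliers, sampled, non_outlier_total) / total_count


# Hypothetical identifiers mapped to calculation types.
CALCULATIONS = {"SUM": approximate_sum, "AVG": approximate_mean}


def run_calculation(algorithm_id, outliers, sampled, non_outlier_total):
    """Dispatch on the algorithm identifier carried in the calculation parameters."""
    try:
        calculation = CALCULATIONS[algorithm_id]
    except KeyError:
        raise ValueError(f"unsupported algorithm identifier: {algorithm_id!r}")
    return calculation(outliers, sampled, non_outlier_total)
```

For example, with one outlier of 1000 and the samples 1, 2, 3 drawn from six non-outliers, `run_calculation("SUM", [1000], [1, 2, 3], 6)` combines the exact part 1000 with the estimate 2 * 6 = 12.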

13. The method of claim 1, wherein the smart contract is deployed in a trusted execution environment hosted by the node device, and the calculation parameters and the data samples in the data set are encrypted in advance;

before the dividing the data set corresponding to the data identifier into an outlier data subset composed of a plurality of outlier data samples and a non-outlier data subset composed of a plurality of non-outlier data samples, the method further includes:

and respectively decrypting the calculation parameters and the data samples in the acquired data set in the trusted execution environment.

14. A smart contract-based computing apparatus applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed, the apparatus comprising:

a receiving module, configured to receive a smart contract invocation transaction that targets the smart contract and is initiated by a calculation initiator; wherein the smart contract invocation transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters comprise a data identifier of the data set participating in the approximate calculation;

a sampling module, configured to, in response to the smart contract invocation transaction, invoke the sampling logic contained in the smart contract, divide the data set corresponding to the data identifier into an outlier data subset formed by a plurality of outlier data samples and a non-outlier data subset formed by a plurality of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset;

and a calculation module, configured to further invoke the calculation logic contained in the smart contract, perform accurate calculation on the outlier data samples in the outlier data subset, perform approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combine the results of the accurate calculation and the approximate calculation to obtain an approximate calculation result for the data set.

15. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor implements the steps of the method of any one of claims 1-13 by executing the executable instructions.

16. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 13.