CN111797422A - Data privacy protection query method and device, storage medium and electronic equipment

Info

Publication number: CN111797422A
Application number: CN201910282034.7A
Authority: CN
Inventors: 何明; 陈仲铭; 徐鑫; 刘耀勇; 陈岩
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2019-04-09
Filing date: 2019-04-09
Publication date: 2020-10-20
Anticipated expiration: 2039-04-09
Also published as: CN111797422B

Abstract

The embodiments of the present application disclose a data privacy protection query method, device, storage medium and electronic device. The data privacy protection query method includes: obtaining first data, second data and third data from basic data and storing them in a terminal; acquiring basic data information of the terminal; performing distributed storage of the basic data information in the cloud to obtain a distributed storage database; when a user's query instruction is obtained, extracting the basic data information from the distributed storage database; and matching the basic data information on the terminal to obtain target basic data. The embodiments extract and fuse the key features of the basic data by means of three-level storage. When the data is operated on, the first data is not transmitted to the cloud; only the second data and the third data are extracted and transmitted to the cloud. This avoids exposing user privacy data to the cloud and effectively protects the security of system data and user privacy data on both the terminal and the cloud.

Description

Data privacy protection query method, device, storage medium and electronic device

Technical Field

The present application relates to the field of electronic technologies, and in particular to a data privacy protection query method, device, storage medium and electronic device.

Background

With the development of electronic technology, electronic devices such as smartphones are becoming more and more intelligent. Electronic devices can process data through a variety of algorithm models and thereby provide users with various functions. For electronic devices that need to collect a large amount of data, the security of system data and the security of user privacy data are both important.

Summary of the Invention

The embodiments of the present application provide a data privacy protection query method, device, storage medium and electronic device that protect both the security of system data and the security of user privacy data on the terminal and in the cloud.

An embodiment of the present application provides a data privacy protection query method applied to an electronic device, wherein the data privacy protection query method includes:

clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of the second data to obtain third data, and storing the first data, the second data and the third data in a terminal;

acquiring basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

performing distributed storage of the basic data information in the cloud to obtain a distributed storage database, wherein the distributed storage database includes a plurality of distributed storage sub-nodes;

when a user's query instruction is acquired, extracting the basic data information from the distributed storage database according to the query instruction; and

matching the basic data information on the terminal to obtain target basic data.

An embodiment of the present application also provides a data privacy protection query device, including:

a processing module, configured to cluster basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of the second data to obtain third data, and store the first data, the second data and the third data in a terminal;

an acquisition module, configured to acquire basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

a storage module, configured to perform distributed storage of the basic data information in the cloud to obtain a distributed storage database, wherein the distributed storage database includes a plurality of distributed storage sub-nodes;

an extraction module, configured to extract the basic data information from the distributed storage database according to a query instruction when the user's query instruction is acquired; and

a matching module, configured to match the basic data information on the terminal to obtain target basic data.

An embodiment of the present application also provides a storage medium, wherein a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer is caused to perform the following steps:

An embodiment of the present application also provides an electronic device, wherein the electronic device includes a processor and a memory, a computer program is stored in the memory, and the processor, by calling the computer program stored in the memory, performs the following steps:

Description of Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an application scenario of the data privacy protection query method provided by an embodiment of the present application.

FIG. 2 is a first schematic flowchart of the data privacy protection query method provided by an embodiment of the present application.

FIG. 3 is a schematic diagram of another application scenario of the data privacy protection query method provided by an embodiment of the present application.

FIG. 4 is a second schematic flowchart of the data privacy protection query method provided by an embodiment of the present application.

FIG. 5 is a schematic structural diagram of the data privacy protection query device provided by an embodiment of the present application.

FIG. 6 is another schematic structural diagram of the data privacy protection query device provided by an embodiment of the present application.

FIG. 7 is yet another schematic structural diagram of the data privacy protection query device provided by an embodiment of the present application.

FIG. 8 is still another schematic structural diagram of the data privacy protection query device provided by an embodiment of the present application.

FIG. 9 is a first schematic structural diagram of the electronic device provided by an embodiment of the present application.

FIG. 10 is a second schematic structural diagram of the electronic device provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.

Referring to FIG. 1, FIG. 1 is a schematic diagram of an application scenario of the data privacy protection query method provided by an embodiment of the present application. The data privacy protection query method is applied to an electronic device. A panoramic perception architecture is provided in the electronic device. The panoramic perception architecture is the integration of the hardware and software in the electronic device that implement the data privacy protection query method.

The panoramic perception architecture includes an information perception layer, a data processing layer, a feature extraction layer, a scenario modeling layer, and an intelligent service layer.

The information perception layer is used to acquire information about the electronic device itself and/or information from the external environment. The information perception layer may include multiple sensors, for example a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a Hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, and a heart rate sensor.

The distance sensor can be used to detect the distance between the electronic device and an external object. The magnetic field sensor can be used to detect magnetic field information of the environment in which the electronic device is located. The light sensor can be used to detect light information of that environment. The acceleration sensor can be used to detect acceleration data of the electronic device. The fingerprint sensor can be used to collect the user's fingerprint information. The Hall sensor is a magnetic field sensor based on the Hall effect and can be used to implement automatic control of the electronic device. The position sensor can be used to detect the current geographic location of the electronic device. The gyroscope can be used to detect the angular velocity of the electronic device in each direction. The inertial sensor can be used to detect motion data of the electronic device. The attitude sensor can be used to sense the attitude of the electronic device. The barometer can be used to detect the air pressure of the environment in which the electronic device is located. The heart rate sensor can be used to detect the user's heart rate.

The data processing layer is used to process the data acquired by the information perception layer. For example, the data processing layer can perform data cleaning, data integration, data transformation, and data reduction on the data acquired by the information perception layer.

Data cleaning refers to cleaning the large amount of data acquired by the information perception layer in order to remove invalid data and duplicate data. Data integration refers to integrating multiple pieces of single-dimensional data acquired by the information perception layer into a higher or more abstract dimension, so that the single-dimensional data can be processed together. Data transformation refers to converting the data type or format of the data acquired by the information perception layer so that the transformed data meets the processing requirements. Data reduction refers to minimizing the amount of data while preserving the original character of the data as far as possible.
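The four data processing operations can be illustrated with a minimal sketch. The record layout (`t`, `sensor`, `value`), the function names and the window-averaging form of reduction are all assumptions made for illustration; the application does not prescribe a concrete implementation.

```python
from statistics import mean

def clean(readings):
    """Data cleaning: drop invalid (None) values and exact duplicates, keeping order."""
    seen, out = set(), []
    for r in readings:
        key = (r["t"], r["sensor"], r["value"])
        if r["value"] is None or key in seen:
            continue
        seen.add(key)
        out.append(r)
    return out

def integrate(readings):
    """Data integration: merge single-sensor readings into one record per timestamp."""
    merged = {}
    for r in readings:
        merged.setdefault(r["t"], {})[r["sensor"]] = r["value"]
    return merged

def transform(merged):
    """Data transformation: convert every value to float so later stages see one type."""
    return {t: {k: float(v) for k, v in rec.items()} for t, rec in merged.items()}

def reduce_(records, window):
    """Data reduction: average each sensor over fixed windows of `window` seconds."""
    buckets = {}
    for t, rec in records.items():
        for k, v in rec.items():
            buckets.setdefault((t // window, k), []).append(v)
    return {key: mean(vals) for key, vals in buckets.items()}

# Hypothetical raw readings from the information perception layer.
raw = [
    {"t": 0, "sensor": "light", "value": "120"},
    {"t": 0, "sensor": "light", "value": "120"},   # duplicate, removed by clean()
    {"t": 0, "sensor": "pressure", "value": 1013},
    {"t": 1, "sensor": "light", "value": None},    # invalid, removed by clean()
    {"t": 1, "sensor": "pressure", "value": 1015},
]
summary = reduce_(transform(integrate(clean(raw))), window=2)
```

Each stage takes the previous stage's output, so the pipeline can be shortened or reordered for data sources that do not need every step.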

The feature extraction layer is used to perform feature extraction on the data processed by the data processing layer. The extracted features may reflect the state of the electronic device itself, the state of the user, or the state of the environment in which the electronic device is located.

The feature extraction layer can extract features, or process the extracted features, by means of a filter method, a wrapper method, an integration method, and the like.

The filter method filters the extracted features to delete redundant feature data. The wrapper method is used to screen the extracted features. The integration method combines multiple feature extraction methods to build a more efficient and more accurate feature extraction method.
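As one possible reading of the filter method, the sketch below removes feature columns that carry no information (near-constant values) or that exactly duplicate a column already kept. The variance threshold, the column names and the function name `filter_features` are hypothetical.

```python
from statistics import pvariance

def filter_features(columns, min_variance=1e-6):
    """Filter method sketch: drop near-constant and exactly duplicated feature columns."""
    kept, seen = {}, set()
    for name, values in columns.items():
        if pvariance(values) < min_variance:
            continue                      # near-constant column, treated as redundant
        signature = tuple(values)
        if signature in seen:
            continue                      # exact duplicate of a column already kept
        seen.add(signature)
        kept[name] = values
    return kept

# Hypothetical extracted features, one list of values per feature.
features = {
    "accel_x": [0.1, 0.4, 0.2, 0.9],
    "accel_x_copy": [0.1, 0.4, 0.2, 0.9],
    "screen_on": [1, 1, 1, 1],
    "light": [120, 80, 95, 300],
}
selected = filter_features(features)
```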

The scenario modeling layer is used to build a model from the features extracted by the feature extraction layer. The resulting model can represent the state of the electronic device, the state of the user, the state of the environment, and so on. For example, the scenario modeling layer can build a key-value model, a pattern identification model, a graph model, an entity-relationship model, or an object-oriented model from the extracted features.

The intelligent service layer is used to provide users with intelligent services based on the model built by the scenario modeling layer. For example, the intelligent service layer can provide users with basic application services, perform intelligent system optimization for the electronic device, and provide users with personalized intelligent services.

In addition, the panoramic perception architecture may include multiple algorithms, each of which can be used to analyze and process data; together they form an algorithm library. For example, the algorithm library may include a Markov algorithm, latent Dirichlet allocation, a Bayesian classification algorithm, a support vector machine, K-means clustering, the K-nearest neighbor algorithm, conditional random fields, residual networks, long short-term memory networks, convolutional neural networks, and recurrent neural networks.

An embodiment of the present application provides a data privacy protection query method that can be applied to an electronic device. The electronic device may be a smartphone, a tablet computer, a gaming device, an AR (Augmented Reality) device, a car, a data privacy protection query device, an audio playback device, a video playback device, a notebook computer, a desktop computing device, or a wearable device such as a watch, glasses, a helmet, an electronic bracelet, an electronic necklace, or electronic clothing.

The data privacy protection query method provided by the embodiments of the present application can be used within the panoramic perception architecture described above, and involves the information perception layer, the data processing layer, and the feature extraction layer of that architecture.

Referring to FIG. 2, FIG. 2 is a first schematic flowchart of the data privacy protection query method provided by an embodiment of the present application. The data privacy protection query method includes the following steps:

110: Cluster the basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of the second data to obtain third data, and store the first data, the second data and the third data in the terminal.

In this step, the first data may be the clustered basic data, the second data may be panoramic information features, and the third data may be fused feature data.

The basic data may include operation information of the electronic device, configuration information of the electronic device, user information, current environment information, and the like. Specifically, the basic data may be collected through one or more sensors, and may be collected in real time. For example, the current environment information and the information related to the electronic device may be obtained through at least one of a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a Hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, a blood pressure sensor, a pulse sensor, and a heart rate sensor. The current environment information includes the user's physical information, such as blood pressure, pulse, and heart rate. The information related to the electronic device includes operation information of the electronic device, configuration information of the electronic device, and user information stored in the electronic device. The user information includes human-computer interaction information such as the user's identity information, personal hobbies, browsing records, and personal collections. The operation information of the electronic device includes power-on time, power-off time, standby time, memory usage at each point in time, main chip usage at each point in time, information on currently running programs, information on programs running in the background, the running time of each program, the download volume of each program, and so on. In some embodiments, the basic data may also include behavior data of the user operating the terminal, sensor data, and terminal system operation data.

After multiple pieces of basic data are obtained, they can be stored in a first storage module. For example, multiple pieces of panoramic perception information can be stored on a hard disk. Multiple databases can be set up, and the basic data can be stored in the corresponding database according to its category.

The basic data is clustered to obtain the first data. The first data may be the clustered basic data. Basic data of the same kind is aggregated into one data set, yielding multiple data sets for multiple kinds of basic data. The basic data can be classified according to its hardware attributes, such as data related to the main chip, the display screen, the hard disk, the memory, or the various sensors. The basic data can also be classified according to the corresponding application, such as data related to system applications and data related to installed applications; the data related to installed applications can be further classified by specific application, such as data related to instant messaging applications, map applications, or shopping applications. Storing the basic data in the corresponding database by category effectively isolates unrelated data, so that the data can be stored independently. In some embodiments, obtaining a time-series index for each database also makes the basic data easier to index.
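The category-based grouping described above can be sketched as follows. This is a simplified illustration that groups by a known category label rather than running a statistical clustering algorithm; the item fields (`source`, `app`), the category naming scheme and the function names are hypothetical.

```python
from collections import defaultdict

def categorize(item):
    """Return the categories of one basic-data item (hypothetical rules)."""
    categories = [f"hardware:{item['source']}"]       # classification by hardware attribute
    if item.get("app"):
        categories.append(f"app:{item['app']}")       # classification by application
    return categories

def cluster_by_category(items):
    """Aggregate basic data of the same kind into one data set per category."""
    databases = defaultdict(list)
    for item in items:
        for category in categorize(item):
            databases[category].append(item)
    return dict(databases)

# Hypothetical basic data collected on the terminal.
basic_data = [
    {"source": "accelerometer", "value": (0.0, 0.1, 9.8)},
    {"source": "display", "app": "maps", "value": "screen_on"},
    {"source": "memory", "app": "maps", "value": 0.42},
    {"source": "memory", "app": "chat", "value": 0.17},
]
first_data = cluster_by_category(basic_data)
```

Keeping each category in its own container mirrors the isolation of unrelated data into separate databases.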

The data format and data content of each type of first data may differ. For example, the WiFi connection information in the sensor data is very limited, and no WiFi information is stored or recorded when no WiFi signal is connected. By contrast, IMU data is reported continuously at a rate measured in hertz and can accumulate to upwards of a gigabyte per day. Performing feature extraction on the basic data in each database helps reduce redundant information and save storage space on the one hand, and on the other hand effectively extracts the important meaning of the basic data. Take audio information as an example. Audio information is time-series information, and its data volume keeps growing over time, so feature extraction is needed to reduce the amount of data. For audio with two microphone channels, a 32-bit sample width, and a sampling frequency of 44100 Hz, the data generated in 5 minutes is about 1G. After feature extraction, the important features of each time window are obtained; these features can be stored as vectors, and 1G of data can be compressed to a few hundred kilobytes.
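The audio example can be checked with simple arithmetic. The sketch below assumes uncompressed PCM, and the feature parameters (25 ms non-overlapping windows, 13 float32 values per window) are hypothetical choices rather than values from the application. Under the PCM assumption the stated parameters give roughly 106 MB for 5 minutes, so the "about 1G" figure in the text is best read as an order-of-magnitude illustration; at this data rate about 47 minutes of audio would reach 1 GB.

```python
def pcm_bytes(channels, bit_width, sample_rate_hz, seconds):
    """Size of uncompressed PCM audio in bytes."""
    return channels * (bit_width // 8) * sample_rate_hz * seconds

def feature_bytes(seconds, window_ms, values_per_window, bytes_per_value=4):
    """Size of per-window feature vectors stored as fixed-width numbers."""
    windows = seconds * 1000 // window_ms
    return windows * values_per_window * bytes_per_value

raw_size = pcm_bytes(channels=2, bit_width=32, sample_rate_hz=44_100, seconds=5 * 60)
vector_size = feature_bytes(seconds=5 * 60, window_ms=25, values_per_window=13)
ratio = raw_size / vector_size

print(f"raw audio: {raw_size / 1e6:.1f} MB")
print(f"feature vectors: {vector_size / 1e3:.0f} kB")
print(f"reduction factor: about {ratio:.0f}x")
```

Whatever the exact raw size, the point of the passage holds: storing one short vector per time window shrinks the data by two or more orders of magnitude.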

Basic data of the same type is stored in the same database. A piece of basic data may be stored in a single database; for example, acceleration sensor data is stored only in the acceleration sensor database. A piece of basic data may also be stored in multiple databases. For example, when a piece of basic data belongs to two categories, it can be copied, and the copy and the original can be stored in two databases that correspond to the two categories. Note that a database can store not only the currently acquired panoramic perception information but also previously stored panoramic perception information.

Feature extraction is performed on the first data to obtain the second data, which may be panoramic information features. Feature extraction is performed on the basic data in each database to obtain the panoramic information features corresponding to that database, and the panoramic information features are stored as the second level of storage. The second data does not need to contain a large amount of original basic data; it only needs to contain the corresponding panoramic information features. Feature extraction on the first data effectively extracts its important features, reduces the redundant information of the original basic data, and saves storage space. Compared with the original basic data and the first data, the second data is much smaller. Note that extracting features from the first data in each database and storing the resulting second data also avoids storing the data directly in its original format, which keeps information security under strict control and protects user privacy. Feature extraction desensitizes the source data, so that user data desensitized at the feature layer is recorded, data redundancy is reduced, and subsequent use is easier.

Feature extraction is performed separately on the data in each database to obtain the panoramic information features corresponding to that database. A feature extraction layer can be provided to extract features from the basic data in several ways, and different data can use different feature extraction methods.

In some embodiments, feature extraction is performed on the basic data in each database using manually preset rules, in which the important features of each category of basic data are set in advance. The basic data is clustered and stored in the corresponding databases, the same important features are applied to all basic data in the same database, and for each piece of basic data the specific values of the preset important features are extracted as panoramic information features, which are stored as the second level of storage.
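A minimal sketch of preset-rule extraction follows. The category names, field names and the `PRESET_FEATURES` table are hypothetical; the idea illustrated is only that each category has a fixed list of important fields and everything else is dropped before the second level of storage.

```python
# Hypothetical preset table: the important fields for each category of basic data.
PRESET_FEATURES = {
    "app_usage": ("app", "duration_s"),
    "wifi": ("ssid_hash", "rssi"),
}

def extract_preset(category, records):
    """Keep only the preset important fields of each record, as a feature tuple."""
    fields = PRESET_FEATURES[category]
    return [tuple(record[field] for field in fields) for record in records]

# Hypothetical first data for one category; fields outside the preset list are discarded.
app_usage_db = [
    {"app": "maps", "duration_s": 340, "window_title": "Route to home", "user": "u-001"},
    {"app": "chat", "duration_s": 95, "window_title": "Family group", "user": "u-001"},
]
second_data = extract_preset("app_usage", app_usage_db)
```

Because the raw fields are discarded rather than copied, the stored features no longer contain the original record contents.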

In some embodiments, feature extraction is performed on the basic data in each database using a pre-trained machine learning model. The step of performing feature extraction on the basic data in the databases to obtain the panoramic information features corresponding to each database may specifically be: pre-training a machine learning model to obtain a model that matches the basic data; inputting the basic data into the machine learning model; and taking the model output as the panoramic information features.

A plurality of second data are fused to obtain the third data, which may be fused feature data. Specifically, the panoramic information features can be fused by multi-table joins, by time-series alignment, or by a combination of the two. Most of the data on the terminal is time-series data: the user's operations and the terminal's context differ from one point in time to another and change over time. Fusing the second data can therefore further reduce the asymmetry between data and compress the data volume.
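The combination of time-series alignment and multi-table joining can be sketched as follows. The window length, the choice of keeping the latest row per window, the inner-join behavior and all names are assumptions for illustration.

```python
def align(rows, window_s):
    """Time-series alignment: keep the latest feature row seen in each time window."""
    aligned = {}
    for timestamp, row in sorted(rows, key=lambda pair: pair[0]):
        aligned[timestamp // window_s] = row
    return aligned

def fuse(tables, window_s=60):
    """Multi-table join on the aligned window index (inner join across all tables)."""
    aligned = {name: align(rows, window_s) for name, rows in tables.items()}
    common = set.intersection(*(set(a) for a in aligned.values()))
    fused = []
    for window in sorted(common):
        record = {"window": window}
        for name, table in aligned.items():
            for key, value in table[window].items():
                record[f"{name}.{key}"] = value   # prefix avoids name clashes
        fused.append(record)
    return fused

# Hypothetical second data: (timestamp in seconds, feature row) per source.
second_data = {
    "motion": [(5, {"steps": 3}), (70, {"steps": 10})],
    "audio": [(30, {"loudness": 0.2}), (65, {"loudness": 0.7}), (130, {"loudness": 0.1})],
}
third_data = fuse(second_data)
```

Windows that are missing from any source are dropped here; an outer join with default values would be an equally reasonable choice.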

The panoramic information features are fused to obtain fused feature data, and the fused feature data is stored as the third level of storage, for example in a third storage unit. The cascaded storage scheme provides effective disaster-recovery backup of the data and avoids storing and transmitting plaintext data. The dedicated feature extraction step extracts high-dimensional features from the basic data (which is comparable to encrypting the basic data) and so effectively protects the user's private information.

The first data, the second data and the third data are stored in the terminal. The storage may use a triggered data return method, that is, data is returned when triggered. For example, when the network module turns on the WiFi function, it searches for nearby available networks, and the data detected by the network module is then transmitted to the system. When collecting basic data, the system monitors and collects system notification messages.

In some embodiments, a time-series index for each database can also be obtained and stored in a second storage module (such as memory), so that other modules of the system can look up the corresponding basic data in the database according to the time-series index. Time-series clustering of multi-source heterogeneous basic data effectively compresses the original basic data, reduces its redundant information, and enables real-time indexing of and access to the basic data. The computing and storage resources of an electronic device are limited, and accessing and allocating basic data sensibly speeds up the retrieval of panoramic perception information.
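One way to picture a per-database time-series index is a sorted list of timestamps mapped to record positions, held in memory and searched by bisection. The class name `TimeIndex` and its methods are hypothetical.

```python
from bisect import bisect_left, bisect_right

class TimeIndex:
    """In-memory time-series index: sorted timestamps mapped to record positions."""

    def __init__(self):
        self._times = []
        self._positions = []

    def add(self, timestamp, position):
        """Insert while keeping the timestamps sorted."""
        i = bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._positions.insert(i, position)

    def range(self, start, end):
        """Return positions of all records with start <= timestamp <= end."""
        lo = bisect_left(self._times, start)
        hi = bisect_right(self._times, end)
        return self._positions[lo:hi]

# Hypothetical index for one database; positions refer to rows in that database.
index = TimeIndex()
for timestamp, position in [(10, 0), (20, 1), (30, 2), (25, 3)]:
    index.add(timestamp, position)
hits = index.range(15, 27)
```

A range lookup costs a binary search plus the size of the result, which suits the limited computing resources of a terminal.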

120，获取终端的基础数据信息，基础数据信息为第二数据和第三数据的基础数据信息。120. Acquire basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data.

The second data and the third data of the terminal are extracted and then transmitted to the cloud. The intelligent mobile terminal device acts as the client and uploads the second data and the third data to the cloud server cluster. The data may be encapsulated using the TCP/IP protocol and uploaded to the server cluster as streaming data. The cloud server cluster can then index the data at each level.
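Because a TCP stream carries bytes without message boundaries, a streaming upload needs some framing so the server can split the stream back into records. The sketch below shows one common choice, a 4-byte length prefix followed by a JSON payload. The record layout (`device_id`, `level`, `features`) and the function names are assumptions for illustration; the patent does not specify a wire format.

```python
import io
import json
import struct


def encode_frame(record: dict) -> bytes:
    """Serialize one record as a 4-byte big-endian length plus a JSON payload."""
    payload = json.dumps(record).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def read_frames(stream):
    """Yield records from a binary stream until it ends or a frame is truncated."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated frame: stop rather than guess
        yield json.loads(payload.decode("utf-8"))


# Only second-level and third-level data are framed for upload.
records = [
    {"device_id": "dev-001", "level": 2, "features": [0.12, 0.80]},
    {"device_id": "dev-001", "level": 3, "features": [0.47]},
]
wire = b"".join(encode_frame(r) for r in records)

# An in-memory stream stands in for the network connection here.
decoded = list(read_frames(io.BytesIO(wire)))
```

With a real connection, the same `read_frames` function could be given the buffered file object returned by `socket.makefile("rb")` on the server side.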

在云端获取终端的基础数据信息，基础数据信息为第二数据和第三数据的基础数据信息。基础数据信息可以为第二数据和第三数据本身，也可以为第二数据和第三数据的索引表。换句话说，云端的基础数据信息与终端的第二数据和第三数据相对应，可以根据基础数据信息获取终端的第二数据和第三数据。The basic data information of the terminal is acquired in the cloud, and the basic data information is the basic data information of the second data and the third data. The basic data information may be the second data and the third data themselves, or may be an index table of the second data and the third data. In other words, the basic data information of the cloud corresponds to the second data and the third data of the terminal, and the second data and the third data of the terminal can be acquired according to the basic data information.

130，对基础数据信息在云端进行分布式存储，得到分布式存储数据库，分布式存储数据库中包含多个分布式存储子节点。130. Perform distributed storage on the cloud for basic data information to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes.

Distributed storage is a data storage technology that uses the disk space of every machine in an enterprise over the network and combines these scattered storage resources into one virtual storage device, so that the data is stored in a dispersed manner throughout the enterprise. Distributed storage shares the storage load; it improves the reliability, availability and access efficiency of the system and is also easy to scale.

Distributed storage may involve storing a file several times on several machines, which spreads the burden and risk of data storage. The more copies of a file exist, the less likely it is to be lost. However, more copies also mean more places from which the file can be stolen, so sensitive data or environments require an encryption system. In some embodiments, the data stored in a distributed manner in the cloud is the basic data information. The basic data information includes neither the original basic data nor the first data obtained by clustering the original data; it may include the second data obtained by feature extraction from the first data, and may include the third data obtained by fusing the second data. The basic data information is stored in a distributed manner in the cloud to obtain a distributed storage database. Because the cloud is not involved in storing or processing the original basic data, the distributed storage method can provide a high level of security while maintaining performance.

Traditional distributed storage is essentially a centralized system: data is spread across multiple independent devices, using a scalable system structure in which multiple storage servers share the storage load and a location server locates the stored information. Network-based distributed storage, by contrast, is a core technology of blockchain: data is stored in blocks, and a distributed database is built from the storage space of open nodes, which addresses the problems of traditional distributed storage.

节点是区块链分布式系统中的网络节点，是通过网络连接的服务器、计算机、电话等，针对不同性质的区块链，成为节点的方式也会有所不同。以比特币为例，参与交易或挖矿即构成一个节点。A node is a network node in a blockchain distributed system, and is a server, computer, phone, etc. connected through the network. For different types of blockchains, the way to become a node will be different. Taking Bitcoin as an example, participating in transactions or mining constitutes a node.

The distributed storage database contains multiple distributed storage sub-nodes. In the cloud, these sub-nodes can be divided into several clusters according to the category of data they hold, for example a user cluster, a secondary feature cluster and a tertiary feature cluster. The user cluster contains a large amount of user information; the distributed storage sub-nodes in the user cluster can correspond to the user information of each user, such as the unique identifier of the terminal and the physical address at which each terminal user's data is stored on the servers. The secondary feature cluster and the tertiary feature cluster can be searched and queried through the user cluster.

二级特征集群和三级特征集群为对应于第二数据和第三数据的基础数据信息集群。基础数据信息可以为第二数据和第三数据本身，也可以为能够匹配出第二数据和第三数据的信息，例如对应于第二数据和第三数据的索引表。二级特征集群和三级特征集群中的分布式存储子节点可以对应到各个用户的基础数据信息，以根据分布式存储子节点提取基础数据信息。The second-level feature clusters and the third-level feature clusters are basic data information clusters corresponding to the second data and the third data. The basic data information may be the second data and the third data themselves, or may be information that can match the second data and the third data, such as an index table corresponding to the second data and the third data. The distributed storage sub-nodes in the second-level feature cluster and the third-level feature cluster may correspond to the basic data information of each user, so as to extract the basic data information according to the distributed storage sub-nodes.

140，当获取到用户的查询指令时，根据查询指令在分布式存储数据库中提取基础数据信息。140. When the user's query instruction is acquired, extract basic data information from the distributed storage database according to the query instruction.

Obtaining the user's query instruction covers both the case in which the cloud obtains the query instruction directly and the case in which the terminal obtains the query instruction and then transmits it to the cloud. The query instruction carries the user's unique identifier and the data to be queried. When the query instruction is obtained, the distributed storage database corresponding to the user is determined in the cloud according to the unique identifier carried in the instruction; this database contains the distributed storage sub-nodes that hold the user's basic data information.

In some embodiments, extracting the basic data information from the distributed storage database according to the query instruction includes: obtaining the user's query instruction, which carries the user's unique identifier and the data to be queried; determining the target distributed storage database corresponding to the user according to the unique identifier; determining, in the target distributed storage database, the distributed storage sub-node corresponding to the data to be queried; and extracting the basic data information according to that sub-node.

It should be noted that the data to be queried may be first data, second data, third data, user information data, and so on, and different query methods may apply to different kinds of data. For example, when the data to be queried is first data, feature extraction is performed on it to obtain the corresponding second data to be queried; the distributed storage sub-node corresponding to that second data is determined in the target distributed storage database; and the basic data information is extracted according to that sub-node. As another example, when the data to be queried is second data or third data, the distributed storage sub-node corresponding to the data to be queried is determined in the target distributed storage database, and the basic data information is extracted according to that sub-node. As a further example, when the data to be queried is user information data or the like, the distributed storage sub-node corresponding to the user information data is determined in the distributed storage database, and the user information data is extracted according to that sub-node.
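The routing just described (unique identifier, then target database, then sub-node, then basic data information) can be sketched as follows. Everything named here is hypothetical: the dictionary layout, the node names, and `extract_features`, which is a toy stand-in for the feature extraction step. The sketch also assumes that a first-data query is converted on the terminal before anything is sent, which is one reading of the text rather than something it states outright.

```python
def extract_features(first_data):
    # Hypothetical stand-in for feature extraction: a single summary feature.
    return ("mean", round(sum(first_data) / len(first_data), 2))


def prepare_query(user_id, kind, data):
    """Terminal side: turn a first-data query into a second-data query."""
    if kind == "first":
        kind, data = "second", extract_features(data)
    return {"user_id": user_id, "kind": kind, "key": data}


def lookup(cloud, node_store, query):
    """Cloud side: user id -> target database -> sub-node -> basic data info."""
    database = cloud[query["user_id"]]
    node = database[query["kind"]][query["key"]]
    return node_store[node]


# Hypothetical cloud state: one target database per user, keyed by data kind.
cloud = {
    "user-42": {
        "second": {("mean", 2.0): "node-7"},
        "third": {"fused-a": "node-9"},
    }
}
node_store = {
    "node-7": {"index": "second/row-3"},
    "node-9": {"index": "third/row-1"},
}

first_query = prepare_query("user-42", "first", [1.0, 2.0, 3.0])
first_result = lookup(cloud, node_store, first_query)
third_result = lookup(cloud, node_store, prepare_query("user-42", "third", "fused-a"))
```

In this sketch the raw values `[1.0, 2.0, 3.0]` never appear in the query sent to `lookup`; only the derived feature key does.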

本领域的技术人员应知，在根据查询指令在分布式存储数据库中提取基础数据信息的过程中，提取方法可以有多种，以上仅仅是示例性的说明，不应构成对本申请的限制。Those skilled in the art should know that in the process of extracting basic data information in the distributed storage database according to the query instruction, there may be various extraction methods, and the above is only an exemplary description, which should not be construed to limit the present application.

The basic data information extracted in response to the user's query instruction may contain the data to be queried itself, or may contain related information from which the data to be queried can be matched. For example, when the data to be queried is first data, the basic data information contains related information from which that first data can be matched; it may contain the corresponding second data, or information from which that second data can be matched. When the data to be queried is second data, the basic data information may contain the corresponding second data, or information from which that second data can be matched.

It should be noted that the data to be queried may be of one type or of several types; for example, it may include both user information data and second data. It may belong to a single level or to several levels; for example, second data and third data may be queried together, or first data and second data may be queried together. In some embodiments, the user's query instruction carries the user's unique identifier and several items of data to be queried. After the target distributed storage database corresponding to the user is determined from the unique identifier, the types of the items are determined and the query method corresponding to each type is executed. If several items belong to the same type, their query order is determined and they are queried in that order.

150，将基础数据信息在终端进行匹配，得到目标基础数据。150. Match the basic data information on the terminal to obtain the target basic data.

如果说步骤140包含有云端对用户查询指令的反馈，那么步骤150则包含有云端与终端的信息对接。在云端根据待查数据提取出基础数据信息后，将基础数据信息与终端存储的数据进行匹配，得到用户需要的待查询数据。If step 140 includes the feedback from the cloud to the user's query instruction, then step 150 includes the information connection between the cloud and the terminal. After the cloud extracts basic data information according to the data to be queried, the basic data information is matched with the data stored in the terminal to obtain the queried data required by the user.

When the data to be queried is second data, the basic data information is matched with the second data of the terminal to obtain the target second data; when the data to be queried is third data, the basic data information is matched with the second data of the terminal to obtain the target third data. When the data to be queried is first data, the basic data information may be matched with the second data of the terminal to obtain the target second data, and the target first data is then searched for in the terminal according to the target second data. In some embodiments, a matching map, lookup table or the like that relates the basic data information to the first data of the terminal is set up in the cloud in advance; in that case the basic data information can also be matched directly with the first data of the terminal to obtain the target first data.

It should be noted that, in order to identify different terminals in the cloud, the cloud itself may store user information data or data related to it, such as the user's unique identifier. To distinguish different user terminals, the user information data may be extracted at the time the basic data information of the terminal is obtained; when the user queries the data to be queried, the target data found by the query is fed back to the user together with the terminal's user information data.

Referring to FIG. 3, FIG. 3 is a diagram of another application scenario of the data privacy protection query method provided by an embodiment of the present application. User behavior data, sensor data, ..., system operation data and the like are the sources of the basic data; specifically, the basic data may be acquired through sensors and the like. The multiple items of basic data are then clustered to obtain the first data, which is stored. The first data includes basic data such as user behavior data, sensor data, ..., and system operation data.

随后，特征提取模块对第一数据进行特征提取，提取出第一数据的重要特征作为第二数据，进行存储。第二数据包括用户的行为特征、传感器特征、…、系统特征等全景信息特征。Subsequently, the feature extraction module performs feature extraction on the first data, extracts important features of the first data as second data, and stores them. The second data includes panoramic information features such as user behavior features, sensor features, ..., system features, and the like.

第三数据可以为对第二数据的全景信息特征进行融合得到的融合全景特征，将融合全景特征进行存储。The third data may be a fused panorama feature obtained by fusing the panorama information features of the second data, and the fused panorama feature is stored.

需要说明的是，第一数据、第二数据和第三数据均在终端进行了存储，以保证原始数据不会丢失。It should be noted that the first data, the second data and the third data are all stored in the terminal to ensure that the original data will not be lost.

得到第三数据后，可以将第二数据和第三数据上传至云端提供给服务器进行数据分析，也可以将第二数据和第三数据传送给应用服务层或数据处理层，进行计算。此外，还可以对第二数据和第三数据进行冗余备份，增加数据冗余度，有效预防数据丢失。After the third data is obtained, the second data and the third data can be uploaded to the cloud to be provided to the server for data analysis, or the second data and the third data can be transmitted to the application service layer or the data processing layer for calculation. In addition, the second data and the third data can also be backed up redundantly to increase data redundancy and effectively prevent data loss.

In some embodiments, features extracted from the second data and the third data are transmitted to the cloud, where a corresponding secondary feature master and tertiary feature master are set up. A user master is also set up in the cloud; it records a large amount of user information so that user information can be indexed later, and the secondary feature master and tertiary feature master can be searched and queried through the user master. A master may be a single server or a server cluster. Its main responsibility is to maintain the physical address table and logical address table of the storage server cluster and the processing servers, so that incoming data can be assigned to servers for storage and dispatched for processing. Each master maintains an index table but need not store the actual data; the benefit of this design is that the information is stored and maintained in a distributed manner. The advantages of distributed storage are support for high-performance reads and writes, consistency across multiple replicas, convenient disaster recovery and backup, elastic scaling of server nodes, and a standardized storage system.
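The division of labor between masters and storage nodes can be sketched as below. The class names, the round-robin placement rule and the node names are illustrative assumptions; the point shown is only that each master holds an index table (which node holds which user's features) while the feature data itself lives on the nodes.

```python
class FeatureMaster:
    """Holds an index table only: device id -> node name. No feature data."""

    def __init__(self, node_names):
        self.node_names = node_names
        self.index = {}

    def assign(self, device_id):
        # Round-robin placement for new devices (an illustrative policy).
        if device_id not in self.index:
            slot = len(self.index) % len(self.node_names)
            self.index[device_id] = self.node_names[slot]
        return self.index[device_id]


class UserMaster:
    """Records known users and reaches the feature masters on their behalf."""

    def __init__(self, secondary, tertiary):
        self.users = set()
        self.masters = {"secondary": secondary, "tertiary": tertiary}

    def store(self, device_id, level, features, nodes):
        self.users.add(device_id)
        node = self.masters[level].assign(device_id)
        nodes[node][(level, device_id)] = features

    def fetch(self, device_id, level, nodes):
        node = self.masters[level].index[device_id]
        return nodes[node][(level, device_id)]


# Storage nodes hold the actual feature data (hypothetical node names).
nodes = {"node-a": {}, "node-b": {}}
user_master = UserMaster(
    FeatureMaster(["node-a", "node-b"]),
    FeatureMaster(["node-a", "node-b"]),
)
user_master.store("dev-001", "secondary", [0.1, 0.9], nodes)
user_master.store("dev-002", "secondary", [0.3, 0.4], nodes)
user_master.store("dev-001", "tertiary", [0.5], nodes)

secondary_features = user_master.fetch("dev-001", "secondary", nodes)
```

A request for a user's secondary or tertiary features goes to the user master, which consults the relevant feature master's index table and reads the data from the node it names.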

当服务器的全景感知建模层需要使用到二级特征和三级特征的时候，通过对用户master发送请求，可以从分布式数据库的子节点(node)中获得该用户对应的二级特征数据或者三级特征数据。When the panorama perception modeling layer of the server needs to use secondary features and tertiary features, by sending a request to the user master, the secondary feature data corresponding to the user can be obtained from the sub-nodes (nodes) of the distributed database or Tertiary feature data.

In some embodiments, the first data, the second data, the third data and so on may be indexed using CG-Index. The basic idea is that each node maintains a local B+ tree, and a global CG-index index table is also maintained; consulting the CG-index index table determines which nodes' local indexes need to be queried, which supports high-performance random reads and writes. CG-index also treats the index as another form of data stored in tables, and internally uses complementary check tables in place of ordinary tables to provide index fault tolerance and recovery while reducing the storage overhead of the index.
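The two-level lookup idea (consult a global table to pick nodes, then query only those nodes' local indexes) can be sketched as follows. This is not the CG-index implementation: a sorted list searched with `bisect` stands in for each node's local B+ tree, the global table is a plain list of key ranges, and all names and keys are made up for illustration.

```python
from bisect import bisect_left, bisect_right


class LocalIndex:
    """Sorted keys on one node; a simple stand-in for a per-node B+ tree."""

    def __init__(self, items):
        pairs = sorted(items)
        self.keys = [key for key, _ in pairs]
        self.values = [value for _, value in pairs]

    def scan(self, lo, hi):
        # All (key, value) pairs with lo <= key <= hi, in key order.
        i = bisect_left(self.keys, lo)
        j = bisect_right(self.keys, hi)
        return list(zip(self.keys[i:j], self.values[i:j]))


local = {
    "node-a": LocalIndex([(5, "a5"), (1, "a1"), (9, "a9")]),
    "node-b": LocalIndex([(12, "b12"), (20, "b20")]),
}

# Global table: the key range covered by each node's local index.
global_table = [(1, 9, "node-a"), (12, 20, "node-b")]


def range_query(lo, hi):
    """Visit only the nodes whose key range overlaps [lo, hi]."""
    results = []
    for min_key, max_key, node in global_table:
        if min_key <= hi and lo <= max_key:
            results.extend(local[node].scan(lo, hi))
    return results


hits = range_query(4, 12)
```

A query for keys 10 to 11 touches no node at all in this sketch, because the global table shows that neither node covers that range.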

Referring to FIG. 4, FIG. 4 is a second schematic flowchart of the data privacy protection query method provided by an embodiment of the present application. The data privacy protection query method includes the following steps:

210、对基础数据进行聚类得到第一数据，对第一数据进行特征提取得到第二数据，将多个第二数据进行融合得到第三数据，将第一数据、第二数据和第三数据在终端进行存储。210. Cluster the basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse multiple second data to obtain third data, and combine the first data, the second data and the third data Store in the terminal.

基础数据包括但不限于用户行为数据、传感器数据、系统运行数据等，在终端存储有基础数据的第一数据、第二数据和第三数据。The basic data includes but is not limited to user behavior data, sensor data, system operation data, etc. The terminal stores first data, second data and third data of the basic data.

一级存储模块：对第一数据进行存储，作为一级存储数据库中的数据。值得注意的是，该数据库由于涉及到用户核心数据和用户隐私数据，因此只存储备份在终端本地，并不会上传服务器。The first-level storage module: stores the first data as the data in the first-level storage database. It is worth noting that because the database involves user core data and user privacy data, it is only stored and backed up locally on the terminal, and will not be uploaded to the server.

Secondary storage module: the secondary storage extracts high-dimensional features from the independent databases of first data, which effectively reduces data redundancy and desensitizes the source data.

Tertiary storage module: mainly fuses the second data, which consists of independent features. Specifically, it performs table join operations on the secondary storage databases, merges data dimensions that share the same time series, and fuses them into panoramic features.
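The three storage levels above can be sketched as a small pipeline. The sketch makes several simplifying assumptions that are not in the source: fixed 60-second time buckets stand in for time-series clustering, a per-bucket mean stands in for feature extraction, and the sample records are invented. The tertiary step does follow the description in joining second-level features that share the same time bucket.

```python
from collections import defaultdict
from statistics import mean

WINDOW = 60  # seconds per time bucket (an illustrative choice)

# Hypothetical raw basic data: (source, timestamp in seconds, value).
raw = [
    ("sensor", 5, 1.0), ("sensor", 30, 3.0), ("sensor", 70, 5.0),
    ("behavior", 10, 2.0), ("behavior", 65, 4.0),
]


def tier1_cluster(records):
    """First data: group raw values by source and time bucket."""
    groups = defaultdict(list)
    for source, timestamp, value in records:
        groups[(source, timestamp // WINDOW)].append(value)
    return dict(groups)


def tier2_features(first):
    """Second data: one summary feature per (source, bucket)."""
    return {key: mean(values) for key, values in first.items()}


def tier3_fuse(second):
    """Third data: join features from all sources that share a bucket."""
    fused = defaultdict(dict)
    for (source, bucket), feature in second.items():
        fused[bucket][source] = feature
    return dict(fused)


first = tier1_cluster(raw)        # stays on the terminal only
second = tier2_features(first)
third = tier3_fuse(second)

# Only second and third data are candidates for upload.
upload = {"second": second, "third": third}
```

All three levels are kept on the terminal; the `upload` dictionary deliberately omits `first`, mirroring the rule that first data is not transmitted to the cloud.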

220、为不同的终端设置唯一标识，用以区分不同的终端。220. Set unique identifiers for different terminals to distinguish different terminals.

对终端的二级存储和三级存储信息进行云端传送，并标记该用户的设备信息，区分该用户和其他用户。不同的用户在云端存储有不同的数据，通过为终端设置唯一标识，可以区分出不同的终端，在提取和匹配数据的时候，准确找到该用户所对应的部分。The secondary storage and tertiary storage information of the terminal is transmitted to the cloud, and the device information of the user is marked to distinguish the user from other users. Different users store different data in the cloud. By setting a unique identifier for the terminal, different terminals can be distinguished. When extracting and matching data, the part corresponding to the user can be accurately found.

230、获取终端的基础数据信息，基础数据信息为第二数据和第三数据的基础数据信息。230. Acquire basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data.

240、对基础数据信息在云端进行分布式存储，得到分布式存储数据库，分布式存储数据库中包含多个分布式存储子节点。240. Perform distributed storage on the cloud for basic data information to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes.

250、获取用户的查询指令，查询指令携带有用户的唯一标识及待查询数据。250. Obtain a query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried.

获取用户的查询指令，包括在云端直接获取用户的查询指令，也包括在终端获取用户的查询指令后，将查询指令传送给云端。查询指令西带有用户的唯一标识及待查询数据。Obtaining the user's query instruction includes directly obtaining the user's query instruction on the cloud, and also including transmitting the query instruction to the cloud after the terminal obtains the user's query instruction. The query command contains the unique identifier of the user and the data to be queried.

260、根据唯一标识确定出用户对应的目标分布式存储数据库。260. Determine a target distributed storage database corresponding to the user according to the unique identifier.

When the user's query instruction is obtained, the distributed storage database corresponding to the user is determined in the cloud according to the unique identifier carried in the instruction; this database contains the distributed storage sub-nodes that hold the user's basic data information.

270、判断待查询数据是否为第一数据。270. Determine whether the data to be queried is the first data.

需要说明的是，待查询数据可以是第一数据，也可以是第二数据或第三数据，还可以是用户信息数据等等。对应不同的待查询数据，可以有不同的查询方法。由于第一数据涉及底层的原始数据，为了保护用户隐私，没有将第一数据进行云端传送。判断查询数据是否为第一数据，有助于根据不同的情况对查询指令作出反应。It should be noted that the data to be queried may be first data, second data or third data, user information data, and so on. Corresponding to different data to be queried, there can be different query methods. Since the first data involves the underlying original data, in order to protect user privacy, the first data is not transmitted to the cloud. Determining whether the query data is the first data is helpful for responding to the query instruction according to different situations.

The data to be queried may be of one type or of several types; for example, it may include both user information data and second data. It may belong to a single level or to several levels; for example, second data and third data may be queried together, or first data and second data may be queried together. In some embodiments, the user's query instruction carries the user's unique identifier and several items of data to be queried. After the target distributed storage database corresponding to the user is determined from the unique identifier, the types of the items are determined and the query method corresponding to each type is executed. If several items belong to the same type, their query order is determined, and each item is checked in that order to judge whether it is first data. When an item is judged to be first data, the query steps for first data are executed in order; when it is judged not to be first data, the query steps for data other than first data are executed in order.

281、若不是，在目标分布式存储数据库中确定出与待查询数据对应的分布式存储子节点。281. If not, determine a distributed storage sub-node corresponding to the data to be queried in the target distributed storage database.

282、将基础数据信息与终端存储的数据进行匹配，得到目标数据。282. Match the basic data information with the data stored in the terminal to obtain target data.

当待查询数据为第二数据或第三数据时，在目标分布式存储数据库中确定出与第二数据或第三数据对应的分布式存储子节点；根据分布式存储子节点提取基础数据信息。再例如，当待查询数据为用户信息数据等，在分布式存储数据库中确定出与待查询用户信息数据对应的分布式存储子节点，根据分布式存储子节点提取用户信息数据。When the data to be queried is the second data or the third data, the distributed storage sub-node corresponding to the second data or the third data is determined in the target distributed storage database; the basic data information is extracted according to the distributed storage sub-node. For another example, when the data to be queried is user information data, etc., a distributed storage sub-node corresponding to the to-be-queried user information data is determined in the distributed storage database, and the user information data is extracted according to the distributed storage sub-node.

291、若是，将待查询的第一数据进行特征提取，得到与待查询的第一数据对应的待查询第二数据。291. If yes, perform feature extraction on the first data to be queried to obtain second data to be queried corresponding to the first data to be queried.

由于在云端没有存储第一数据以及第一数据对应的全景信息特征，可以先将待查询的第一数据进行特征提取，得到与待查询的第一数据对应的待查询第二数据，对第二数据进行查询，根据第二数据查找到相应的第一数据。Since the first data and the panoramic information features corresponding to the first data are not stored in the cloud, the first data to be queried can be extracted first to obtain the second data to be queried corresponding to the first data to be queried, and the second data to be queried can be obtained. The data is queried, and the corresponding first data is found according to the second data.

292、根据待查询第二数据，在目标分布式存储数据库中确定出与待查询第二数据对应的分布式存储子节点，根据分布式存储子节点提取基础数据信息。292. Determine, in the target distributed storage database, a distributed storage sub-node corresponding to the second data to be queried according to the second data to be queried, and extract basic data information according to the distributed storage sub-node.

It should be noted that third data to be queried may also be obtained by fusing the panoramic features of the second data to be queried. In that case, in step 292, the distributed storage sub-node corresponding to the third data to be queried is determined in the target distributed storage database according to the third data to be queried, and the basic data information is extracted according to that sub-node.

293、将基础数据信息与终端的第二数据进行匹配，得到目标第二数据。293. Match the basic data information with the second data of the terminal to obtain the target second data.

294、根据目标第二数据在终端中搜索出目标第一数据。294. Search the terminal for the target first data according to the target second data.
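Steps 293 and 294 run on the terminal and can be sketched as two small lookups. The store layout is hypothetical: it assumes each second-data entry keeps a local link (`source_row`) back to the first data it was extracted from, and that the basic data information returned by the cloud carries a key (`second_key`) identifying the second data. Neither field name comes from the source.

```python
# Terminal-side stores (hypothetical layout).
first_store = {"row-1": [1.0, 3.0], "row-2": [5.0]}
second_store = {
    "feat-1": {"value": 2.0, "source_row": "row-1"},
    "feat-2": {"value": 5.0, "source_row": "row-2"},
}


def match_second(basic_info):
    """Step 293: match the cloud-returned information against local second data."""
    key = basic_info["second_key"]
    return key if key in second_store else None


def find_first(second_key):
    """Step 294: follow the local link from second data back to first data."""
    return first_store[second_store[second_key]["source_row"]]


basic_info = {"second_key": "feat-1"}  # as if returned by the cloud in step 292
target_second = match_second(basic_info)
target_first = find_first(target_second)
```

The first data is resolved entirely from local stores; the cloud only ever supplies the key used to find it.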

In some embodiments, the data privacy protection query method may specifically include the following. First, the information perception layer acquires information about the user's electronic device (including device operation information, user behavior information, information acquired by the various sensors, device status information, content displayed by the device, information downloaded to the device, and so on). The data processing layer then processes this information (for example, by classification). Next, the feature extraction layer extracts the second data and the third data from the information processed by the data processing layer (for details of the second data and the third data, see the description of the embodiments above). After the second data and the third data are processed, the storage module uploads the target model parameters to the server for storage. To protect the data, the server does not receive the first data. When the first data needs to be queried, the data related to the second data and the third data on the server can be queried and then mapped back to the first data, which avoids operating on plaintext data.

As can be seen from the above, the data privacy protection query method provided by the embodiments of the present application first clusters the basic data to obtain first data, performs feature extraction on the first data to obtain second data, fuses a plurality of second data to obtain third data, and stores the first data, the second data and the third data in the terminal. It then acquires the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data. Next, it stores the basic data information in the cloud in a distributed manner to obtain a distributed storage database that contains a plurality of distributed storage sub-nodes. When a query instruction from the user is acquired, it extracts the basic data information from the distributed storage database according to the query instruction. Finally, it matches the basic data information on the terminal to obtain the target basic data. By means of three-level storage, the key features of the basic data are extracted and fused, so that operations on data avoid touching plaintext data directly, which effectively protects the security of terminal system data and of user privacy data. The first data is not transmitted to the cloud, whereas the second data and the third data are extracted and then transmitted to the cloud. This avoids exposing user privacy data in the cloud and further protects the security of cloud system data and of user privacy data.
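The split between what stays on the terminal and what is sent to the cloud can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the record layout, grouping by source as a stand-in for clustering, and count/mean summaries as a stand-in for feature extraction are not taken from the application.

```python
from collections import defaultdict
from statistics import mean


def cluster_basic_data(records):
    """Group raw records by source (a simple stand-in for clustering)."""
    first = defaultdict(list)
    for rec in records:
        first[rec["source"]].append(rec["value"])
    return dict(first)


def extract_features(first_data):
    """Summarise each cluster; the summaries play the role of the second data."""
    return {
        source: {"count": len(values), "mean": mean(values)}
        for source, values in first_data.items()
    }


def fuse_features(second_data):
    """Merge the per-cluster summaries into one record (the third data)."""
    fused = {}
    for source, feats in second_data.items():
        for name, value in feats.items():
            fused[f"{source}.{name}"] = value
    return fused


def build_upload_payload(terminal_store):
    """Only the second and third data leave the terminal."""
    return {
        "second": terminal_store["second"],
        "third": terminal_store["third"],
    }


# Hypothetical basic data collected on the terminal.
records = [
    {"source": "accelerometer", "value": 0.5},
    {"source": "accelerometer", "value": 1.5},
    {"source": "screen_on_minutes", "value": 30},
]
first = cluster_basic_data(records)
second = extract_features(first)
third = fuse_features(second)

# All three levels are stored on the terminal.
terminal_store = {"first": first, "second": second, "third": third}
# The cloud only ever sees the derived levels.
payload = build_upload_payload(terminal_store)
```

In this sketch the privacy boundary is a single function, `build_upload_payload`, so the rule that the first data is never uploaded is enforced in one place.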

Referring to FIG. 5, FIG. 5 is a schematic structural diagram of the data privacy protection query apparatus provided by an embodiment of the present application. The data privacy protection query apparatus 300 may be integrated into an electronic device and includes a processing module 301, an acquisition module 302, a storage module 303, an extraction module 304 and a matching module 305.

The processing module 301 is configured to cluster basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of second data to obtain third data, and store the first data, the second data and the third data in the terminal.

The acquisition module 302 is configured to acquire the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data.

The storage module 303 is configured to store the basic data information in the cloud in a distributed manner to obtain a distributed storage database, where the distributed storage database contains a plurality of distributed storage sub-nodes.

The extraction module 304 is configured to extract, when a query instruction from the user is acquired, the basic data information from the distributed storage database according to the query instruction.

The matching module 305 is configured to match the basic data information on the terminal to obtain the target basic data.

Referring also to FIG. 6, FIG. 6 is another schematic structural diagram of the data privacy protection query apparatus provided by an embodiment of the present application. In some embodiments, to store the first data, the second data and the third data in the terminal, the processing module 301 may include a first storage unit 3011, a second storage unit 3012 and a third storage unit 3013.

The first storage unit 3011 is configured to cluster the basic data to obtain the first data and to perform a first storage of the first data in the corresponding databases.

The second storage unit 3012 is configured to perform feature extraction on the basic data in the databases to obtain the panoramic information feature corresponding to each database, take the panoramic information features as the second data, and perform a second storage of the second data.

In some embodiments, the second storage unit 3012 is further configured to:

pre-train a machine learning model to obtain a machine learning model that matches the first data; and

input the first data into the machine learning model, obtain the model output, and take the model output as the second data.
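The model-based feature extraction described above can be sketched as follows. The application does not name a model type, so a nearest-centroid classifier is used here purely as an assumed stand-in; the class name, the training samples and the labels `still` and `walking` are hypothetical.

```python
from math import dist


class NearestCentroidModel:
    """Tiny stand-in for the pre-trained machine learning model."""

    def __init__(self):
        self.centroids = {}

    def fit(self, samples, labels):
        # Pre-training: average the training vectors of each label.
        grouped = {}
        for vec, label in zip(samples, labels):
            grouped.setdefault(label, []).append(vec)
        self.centroids = {
            label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for label, vecs in grouped.items()
        }
        return self

    def predict(self, vec):
        # The model output is the label of the closest centroid.
        return min(
            self.centroids,
            key=lambda label: dist(vec, self.centroids[label]),
        )


# Hypothetical training set that resembles the first data.
train_samples = [(0.0, 0.1), (0.1, 0.0), (2.0, 2.1), (2.1, 2.0)]
train_labels = ["still", "still", "walking", "walking"]
model = NearestCentroidModel().fit(train_samples, train_labels)

# First data (raw vectors) goes in; the model output is kept as second data.
first_data = [(0.2, 0.1), (1.9, 2.2)]
second_data = [model.predict(vec) for vec in first_data]
```

The point of the sketch is only the data flow: raw first data is consumed on the terminal, and the stored second data consists of model outputs rather than the raw values.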

The third storage unit 3013 is configured to fuse the panoramic information features to obtain fused feature data, take the fused feature data as the third data, and perform a third storage.

In some embodiments, the third storage unit 3013 may be specifically configured to fuse a plurality of second data by multi-table joining and/or time-series alignment to obtain the third data, and to perform the third storage of the third data.
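As an illustration of fusing by multi-table joining and time-series alignment, the sketch below aligns rows to fixed time windows and then inner-joins the tables on the window start. The 60-second window, the table names `motion` and `screen`, and the row layout are assumptions chosen for the example, not details from the application.

```python
def align(rows, window):
    """Map each (timestamp, features) row to the start of its time window.

    If several rows fall into one window, the last one wins.
    """
    aligned = {}
    for ts, features in rows:
        aligned[ts - ts % window] = features
    return aligned


def fuse(tables, window=60):
    """Inner-join several second-data tables on their aligned time window."""
    aligned = {name: align(rows, window) for name, rows in tables.items()}
    # Keep only the windows that are present in every table.
    shared = set.intersection(*(set(table) for table in aligned.values()))
    fused = []
    for bucket in sorted(shared):
        row = {"window_start": bucket}
        for name, table in aligned.items():
            for key, value in table[bucket].items():
                row[f"{name}.{key}"] = value
        fused.append(row)
    return fused


# Two hypothetical second-data tables with timestamps in seconds.
tables = {
    "motion": [(5, {"state": "still"}), (65, {"state": "walking"})],
    "screen": [(10, {"on": True}), (130, {"on": False})],
}
third_data = fuse(tables)
```

Here only the window starting at 0 seconds appears in both tables, so the fused result contains a single row. An outer join would be an equally plausible reading of "multi-table joining"; the inner join is used only to keep the example short.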

Referring also to FIG. 7, FIG. 7 is yet another schematic structural diagram of the data privacy protection query apparatus provided by an embodiment of the present application. In some embodiments, the extraction module 304 may include a query unit 3041, a determination unit 3042 and an extraction unit 3043.

The query unit 3041 is configured to acquire the query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried.

The determination unit 3042 is configured to determine, according to the unique identifier, the target distributed storage database corresponding to the user.

The extraction unit 3043 is configured to determine, in the target distributed storage database, the distributed storage sub-node corresponding to the data to be queried, and to extract the basic data information from that distributed storage sub-node.

When the data to be queried is the first data, the steps performed by the extraction unit 3043 include: performing feature extraction on the first data to be queried to obtain the second data to be queried that corresponds to it; and determining, according to the second data to be queried, the corresponding distributed storage sub-node in the target distributed storage database.

In some embodiments, the extraction module 304 may further include an acquisition unit 3044, configured to acquire the data level of the data to be queried, where the data level includes at least the first data, the second data and the third data.
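The lookup performed by the extraction module (unique identifier, then target database, then sub-node) can be sketched as below. The `CloudStore` class, the hash-based choice of sub-node and the query dictionary layout are all assumptions for illustration; the application does not specify how sub-nodes are selected.

```python
import hashlib


class CloudStore:
    """Toy distributed storage: one database per user, split into sub-nodes."""

    def __init__(self, num_sub_nodes=4):
        self.num_sub_nodes = num_sub_nodes
        self.databases = {}  # unique_id -> list of sub-node dicts

    def _sub_node_index(self, key):
        # Assumed placement rule: hash the key to pick a sub-node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return digest[0] % self.num_sub_nodes

    def put(self, unique_id, level, key, info):
        if level not in ("second", "third"):
            # The first data is never stored in the cloud.
            raise ValueError("only second and third data may be uploaded")
        db = self.databases.setdefault(
            unique_id, [dict() for _ in range(self.num_sub_nodes)]
        )
        db[self._sub_node_index(key)][(level, key)] = info

    def extract(self, query):
        # 1. The unique identifier selects the target database.
        db = self.databases[query["unique_id"]]
        # 2. The data to be queried selects the sub-node.
        node = db[self._sub_node_index(query["key"])]
        # 3. The basic data information is read from that sub-node.
        return node.get((query["level"], query["key"]))


store = CloudStore()
store.put("device-001", "second", "motion", {"state": "still"})
query = {"unique_id": "device-001", "level": "second", "key": "motion"}
result = store.extract(query)
```

For a query on the first data, the terminal would first derive the corresponding second data locally and send only that as the key, so the `put` guard and the lookup never see plaintext first data.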

Referring also to FIG. 8, FIG. 8 is a further schematic structural diagram of the data privacy protection query apparatus provided by an embodiment of the present application. In some embodiments, the matching module 305 may include a first matching unit 3051, a second matching unit 3052 and a third matching unit 3053.

The first matching unit 3051 is configured, when the data to be queried is the first data, to match the target basic data with the second data of the terminal to obtain target second data, and to search the terminal for the target first data according to the target second data.

The second matching unit 3052 is configured, when the data to be queried is the second data, to match the basic data information with the second data of the terminal to obtain the target second data.

The third matching unit 3053 is configured, when the data to be queried is the third data, to match the basic data information with the second data of the terminal to obtain the target third data.
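The three matching units can be read as one dispatch on the data level, sketched below. The store layout, exact-equality matching and the `sources` field that links a fused record to its second data are assumptions for illustration only.

```python
def match_second(info, second):
    """Return the id of the terminal-side second data equal to the cloud info."""
    for cluster_id, features in second.items():
        if features == info:
            return cluster_id
    return None


def match_on_terminal(level, info, store):
    """Resolve cloud-side basic data information to data held on the terminal."""
    if level not in ("first", "second", "third"):
        raise ValueError(f"unknown data level: {level}")
    # All three levels start by matching against the terminal's second data.
    cluster_id = match_second(info, store["second"])
    if cluster_id is None:
        return None
    if level == "first":
        # The matched second data is used to look up the first data locally.
        return store["first"][cluster_id]
    if level == "second":
        return store["second"][cluster_id]
    # Third data: return the fused record built from the matched second data.
    for fused in store["third"]:
        if cluster_id in fused["sources"]:
            return fused
    return None


# Hypothetical terminal-side store with all three levels.
store = {
    "first": {"motion": [0.5, 1.5]},
    "second": {"motion": {"mean": 1.0}},
    "third": [{"sources": ["motion", "screen"], "summary": "active"}],
}
target_first = match_on_terminal("first", {"mean": 1.0}, store)
target_third = match_on_terminal("third", {"mean": 1.0}, store)
```

In this sketch the plaintext first data is only ever read on the terminal, after the match on derived features has succeeded.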

In some embodiments, the apparatus may further include a backup module and a transmission module. The backup module is configured to back up the fused feature data on the terminal in real time. The transmission module is configured to transmit the fused feature data to the application service layer or the data processing layer, so that the application service layer or the data processing layer can use the fused information features for computation.

As can be seen from the above, the embodiments of the present application provide a data privacy protection query apparatus. First, the processing module 301 clusters the basic data to obtain first data, performs feature extraction on the first data to obtain second data, fuses a plurality of second data to obtain third data, and stores the first data, the second data and the third data in the terminal. The acquisition module 302 then acquires the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data. Next, the storage module 303 stores the basic data information in the cloud in a distributed manner to obtain a distributed storage database that contains a plurality of distributed storage sub-nodes. When a query instruction from the user is acquired, the extraction module 304 extracts the basic data information from the distributed storage database according to the query instruction. Finally, the matching module 305 matches the basic data information on the terminal to obtain the target basic data. By means of three-level storage, the key features of the basic data are extracted and fused, so that operations on data avoid touching plaintext data directly, which effectively protects the security of terminal system data and of user privacy data. The first data is not transmitted to the cloud, whereas the second data and the third data are extracted and then transmitted to the cloud. This avoids exposing user privacy data in the cloud and further protects the security of cloud system data and of user privacy data.

The embodiments of the present application further provide an electronic device. The electronic device may be a smartphone, a tablet computer, a gaming device, an AR (Augmented Reality) device, an automobile, a data privacy protection query apparatus, an audio playback apparatus, a video playback apparatus, a notebook computer, a desktop computing device, or a wearable device such as a watch, glasses, a helmet, an electronic bracelet, an electronic necklace or electronic clothing.

Referring to FIG. 9, FIG. 9 is a first schematic structural diagram of an electronic device 900 provided by an embodiment of the present application. The electronic device 900 includes a processor 901 and a memory 902. The processor 901 is electrically connected to the memory 902.

The processor 901 is the control center of the electronic device 900. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or invoking the computer programs stored in the memory 902 and invoking the data stored in the memory 902, thereby monitoring the electronic device as a whole.

In this embodiment, the processor 901 in the electronic device 900 loads the instructions corresponding to the processes of one or more computer programs into the memory 902 according to the following steps, and runs the computer programs stored in the memory 902 to implement various functions:

clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data in the terminal;

acquiring the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

storing the basic data information in the cloud in a distributed manner to obtain a distributed storage database, where the distributed storage database contains a plurality of distributed storage sub-nodes;

when a query instruction from the user is acquired, extracting the basic data information from the distributed storage database according to the query instruction; and

matching the basic data information on the terminal to obtain the target basic data.

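In some embodiments, when performing feature extraction on the first data to obtain the second data, the processor 901 performs the following steps:

pre-training a machine learning model to obtain a machine learning model that matches the first data; and

inputting the first data into the machine learning model, obtaining the model output, and taking the model output as the second data.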

In some embodiments, when fusing a plurality of second data to obtain the third data, the processor 901 performs the following step:

fusing the plurality of second data by multi-table joining and/or time-series alignment to obtain the third data.

In some embodiments, when extracting the basic data information from the distributed storage database according to an acquired query instruction of the user, the processor 901 performs the following steps:

acquiring the query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried;

determining, according to the unique identifier, the target distributed storage database corresponding to the user; and

determining, in the target distributed storage database, the distributed storage sub-node corresponding to the data to be queried, and extracting the basic data information from that distributed storage sub-node.

In some embodiments, before determining the distributed storage sub-node corresponding to the data to be queried in the target distributed storage database, the processor 901 performs the following step:

acquiring the data level of the data to be queried, where the data level includes the first data, the second data and the third data.

In some embodiments, when the data to be queried is the second data or the third data, the processor 901 performs the following steps:

determining, in the target distributed storage database, the distributed storage sub-node corresponding to the data to be queried;

when the data to be queried is the second data, matching the basic data information with the second data of the terminal to obtain the target second data; and

when the data to be queried is the third data, matching the basic data information with the second data of the terminal to obtain the target third data.

In some embodiments, when the data to be queried is the first data, the processor 901 performs the following steps:

performing feature extraction on the first data to be queried to obtain the second data to be queried that corresponds to it; and

determining, according to the second data to be queried, the corresponding distributed storage sub-node in the target distributed storage database.

In some embodiments, when the data to be queried is the first data and the basic data information is matched on the terminal to obtain the target basic data, the processor 901 performs the following steps:

matching the target basic data with the second data of the terminal to obtain the target second data; and

searching the terminal for the target first data according to the target second data.

In some embodiments, referring to FIG. 10, FIG. 10 is a second schematic structural diagram of the electronic device 900 provided by an embodiment of the present application.

The electronic device 900 further includes a display screen 903, a control circuit 904, an input unit 905, a sensor 906 and a power supply 907. The processor 901 is electrically connected to the display screen 903, the control circuit 904, the input unit 905, the sensor 906 and the power supply 907, respectively.

The display screen 903 may be used to display information entered by the user or provided to the user, as well as the various graphical user interfaces of the electronic device. These graphical user interfaces may be composed of images, text, icons, video and any combination thereof.

The control circuit 904 is electrically connected to the display screen 903 and is used to control the display screen 903 to display information.

The input unit 905 may be used to receive entered digits, character information or user characteristic information (such as fingerprints), and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. The input unit 905 may include a fingerprint recognition module.

The sensor 906 is used to collect information about the electronic device itself, about the user, or about the external environment. For example, the sensor 906 may include a plurality of sensors such as a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a Hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer and a heart rate sensor.

The power supply 907 is used to supply power to the components of the electronic device 900. In some embodiments, the power supply 907 may be logically connected to the processor 901 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system.

Although not shown in FIG. 10, the electronic device 900 may further include a camera, a Bluetooth module and the like, which are not described in detail here.

As can be seen from the above, the embodiments of the present application provide an electronic device whose processor performs the following steps. First, it clusters the basic data to obtain first data, performs feature extraction on the first data to obtain second data, fuses a plurality of second data to obtain third data, and stores the first data, the second data and the third data in the terminal. It then acquires the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data. Next, it stores the basic data information in the cloud in a distributed manner to obtain a distributed storage database that contains a plurality of distributed storage sub-nodes. When a query instruction from the user is acquired, it extracts the basic data information from the distributed storage database according to the query instruction. Finally, it matches the basic data information on the terminal to obtain the target basic data. By means of three-level storage, the key features of the basic data are extracted and fused, so that operations on data avoid touching plaintext data directly, which effectively protects the security of terminal system data and of user privacy data. The first data is not transmitted to the cloud, whereas the second data and the third data are extracted and then transmitted to the cloud. This avoids exposing user privacy data in the cloud and further protects the security of cloud system data and of user privacy data.

The embodiments of the present application further provide a storage medium in which a computer program is stored. When the computer program runs on a computer, the computer executes the data privacy protection query method of any of the foregoing embodiments.

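For example, in some embodiments, when the computer program runs on a computer, the computer performs the following steps:

clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data in the terminal;

acquiring the basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

storing the basic data information in the cloud in a distributed manner to obtain a distributed storage database, where the distributed storage database contains a plurality of distributed storage sub-nodes;

when a query instruction from the user is acquired, extracting the basic data information from the distributed storage database according to the query instruction; and

matching the basic data information on the terminal to obtain the target basic data.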

It should be noted that a person of ordinary skill in the art will understand that all or some of the steps in the methods of the foregoing embodiments may be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, which may include, but is not limited to, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The data privacy protection query method, apparatus, storage medium and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the foregoing embodiments is intended only to help in understanding the method of the present application and its core idea. A person skilled in the art may, in accordance with the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. A data privacy protection query method, wherein the data privacy protection query method comprises:

clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of the second data to obtain third data, and storing the first data, the second data and the third data in a terminal;

acquiring basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

performing distributed storage of the basic data information in the cloud to obtain a distributed storage database, wherein the distributed storage database includes a plurality of distributed storage sub-nodes;

when a query instruction of a user is acquired, extracting the basic data information from the distributed storage database according to the query instruction; and

matching the basic data information on the terminal to obtain target basic data.

2. The data privacy protection query method according to claim 1, wherein, before the acquiring of the basic data information of the terminal, the method further comprises:

setting unique identifiers for different terminals to distinguish the different terminals.

3. The data privacy protection query method according to claim 2, wherein the extracting, when the query instruction of the user is acquired, of the basic data information from the distributed storage database according to the query instruction comprises:

acquiring the query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried;

determining the target distributed storage database corresponding to the user according to the unique identifier; and

determining, in the target distributed storage database, a distributed storage sub-node corresponding to the data to be queried, and extracting the basic data information according to the distributed storage sub-node.

4. The data privacy protection query method according to claim 3, wherein before the distributed storage sub-node corresponding to the data to be queried is determined in the target distributed storage database, the method further comprises:

acquiring the data level of the data to be queried, where the data level includes first data, second data and third data.

5. The data privacy protection query method according to claim 4, wherein the determining of the distributed storage sub-node corresponding to the data to be queried in the target distributed storage database comprises:

when the data to be queried is the second data or the third data, determining a distributed storage sub-node corresponding to the data to be queried in the target distributed storage database;

and the matching of the basic data information on the terminal to obtain the target basic data comprises:

when the data to be queried is the second data, matching the basic data information with the second data of the terminal to obtain the target second data; and

when the data to be queried is the third data, matching the basic data information with the second data of the terminal to obtain the target third data.

6. The data privacy protection query method according to claim 4, wherein the determining of the distributed storage sub-node corresponding to the query instruction in the target distributed storage database comprises:

when the data to be queried is the first data, performing feature extraction on the first data to be queried to obtain second data to be queried corresponding to the first data to be queried; and

determining, according to the second data to be queried, a distributed storage sub-node corresponding to the second data to be queried in the target distributed storage database.

7. The data privacy protection query method according to claim 6, wherein the matching of the basic data information on the terminal to obtain the target basic data comprises:

matching the target basic data with the second data of the terminal to obtain the target second data; and

searching the terminal for the target first data according to the target second data.

8. The data privacy protection query method according to claim 1, wherein the basic data includes at least behavior data of the user operating the terminal, sensor data and terminal system operation data.

9. The data privacy protection query method according to claim 1, wherein the performing of feature extraction on the first data to obtain the second data comprises:

pre-training a machine learning model to obtain a machine learning model matching the first data; and

inputting the first data into the machine learning model, obtaining an output result of the model, and using the output result of the model as the second data.

10. The data privacy protection query method according to claim 1, wherein the fusing of a plurality of the second data to obtain the third data comprises:

fusing the plurality of the second data by multi-table joining and/or time-series alignment to obtain the third data.

11. A data privacy protection query device, wherein the data privacy protection query device comprises:

a processing module, configured to cluster basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of the second data to obtain third data, and store the first data, the second data and the third data in a terminal;

an acquisition module, configured to acquire basic data information of the terminal, where the basic data information is the basic data information of the second data and the third data;

a storage module, configured to perform distributed storage of the basic data information in the cloud to obtain a distributed storage database, wherein the distributed storage database includes a plurality of distributed storage sub-nodes;

an extraction module, configured to extract the basic data information from the distributed storage database according to a query instruction of a user when the query instruction is acquired; and

a matching module, configured to match the basic data information on the terminal to obtain target basic data.

12. A storage medium having a computer program stored thereon, wherein when the computer program is run on a computer, the computer is caused to execute the data privacy protection query method according to any one of claims 1 to 10.

13. An electronic device, wherein the electronic device comprises a processor and a memory, a computer program is stored in the memory, and the processor is configured to execute the data privacy protection query method according to any one of claims 1 to 10 by invoking the computer program stored in the memory.