CN118101651B

CN118101651B - Distributed system for realizing high service availability and low data retention

Info

Publication number: CN118101651B
Application number: CN202410203115.4A
Authority: CN
Inventors: 张炜琛; 倪培峰; 于奕; 赵金科; 严凯; 周百超; 刘威; 袁得嵛; 郑海龙; 王佳怡
Original assignee: Xiamen Dragon Information Technology Co ltd; PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Current assignee: Xiamen Dragon Information Technology Co ltd; PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority date: 2024-02-23
Filing date: 2024-02-23
Publication date: 2024-09-13
Anticipated expiration: 2044-02-23
Also published as: CN118101651A

Abstract

The present application relates to the field of data processing technology, and in particular to a distributed system for realizing high service availability and low data retention. The system comprises a first service platform, a plurality of second service platforms, a scheduling platform, a plurality of initial data access platforms, a processor, and a memory storing a computer program. When the computer program is executed by the processor, the following steps are implemented: when the scheduling platform receives a target query request sent by any second service platform, a target data access platform is obtained from the plurality of initial data access platforms; according to the target data access platform, the scheduling platform sends the target query request to the first service platform; according to the target query request, the first service platform sends the target service associated data set to the second service platform; and the second service platform deletes or stores the target service associated data. The present invention can thus choose whether to retain data according to the data type, thereby improving data security.

Description

A distributed system that achieves high service availability and low data retention

Technical Field

The present invention relates to the field of data processing technology, and in particular to a distributed system that achieves high service availability and low data retention.

Background Art

At present, each of multiple low-level platforms stores the personnel data it collects separately and does not share it, so availability is low. In addition, because low-level platforms are less secure, or because they store a large amount of personnel information (that is, the low-level platforms have high data retention), information may leak, which lowers data security. Furthermore, if the data of multiple low-level platforms is transferred to a high-level platform for unified storage and management, access congestion may occur when each low-level platform queries personnel information, affecting the normal operation of the system.

Summary of the Invention

In view of the above technical problems, the technical solution adopted by the present invention is as follows:

A distributed system for realizing high service availability and low data retention, the system comprising: a first service platform, a plurality of second service platforms, a scheduling platform, a plurality of initial data access platforms, a processor, and a memory storing a computer program, wherein the first service platform is communicatively connected with each second service platform, each second service platform is communicatively connected with the scheduling platform, the scheduling platform is communicatively connected with each initial data access platform, and each initial data access platform is communicatively connected with the first service platform; when the computer program is executed by the processor, the following steps are implemented:

S100: When the scheduling platform receives a target query request sent by any second service platform, it obtains a target data access platform from the plurality of initial data access platforms.

S200: According to the target data access platform, the scheduling platform sends the target query request to the first service platform.

S300: According to the target query request, the first service platform sends the target service associated data set corresponding to the target query request to the second service platform.

S400: When the second service platform receives the target service associated data set, it checks the data tag corresponding to each item of target service associated data in the set.

S500: When the data tag corresponding to the target service associated data is the first data tag, the second service platform deletes the target service associated data.

S600: When the data tag corresponding to the target service associated data is the second data tag, the second service platform stores the target service associated data.

Compared with the prior art, the present invention has clear beneficial effects. By means of the above technical solution, the distributed system for realizing high service availability and low data retention provided by the present invention achieves considerable technical progress and practicality, has broad industrial value, and has at least the following beneficial effects:

The present invention provides a distributed system for realizing high service availability and low data retention. The system comprises: a first service platform, a plurality of second service platforms, a scheduling platform, a plurality of initial data access platforms, a processor, and a memory storing a computer program. The first service platform is communicatively connected with each second service platform, each second service platform is communicatively connected with the scheduling platform, the scheduling platform is communicatively connected with each initial data access platform, and each initial data access platform is communicatively connected with the first service platform. When the computer program is executed by the processor, the following steps are implemented: when the scheduling platform receives a target query request sent by any second service platform, a target data access platform is obtained from the plurality of initial data access platforms, and the target query request is sent to the first service platform through the target data access platform; according to the target query request, the first service platform sends the target service associated data set corresponding to the target query request to the second service platform; when the data tag corresponding to the target service associated data is the first data tag, the second service platform deletes the target service associated data; when the data tag corresponding to the target service associated data is the second data tag, the second service platform stores the target service associated data. On the one hand, obtaining the target data access platform allows the target query request to be dispatched to the target data access platform rather than to the other initial data access platforms, which prevents congestion on those other platforms and helps maintain the normal operation of the system. On the other hand, the second service platform deletes data that must not be retained, according to the type of data tag, which prevents personnel information from leaking from the second service platform and enhances data security.

Brief Description of the Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a flow chart of the computer program executed by a distributed system for realizing high service availability and low data retention, provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", etc. in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that the data so used may be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or server that includes a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such a process, method, product, or device.

The present invention provides a distributed system for realizing high service availability and low data retention, the system comprising: a first service platform, a plurality of second service platforms, a scheduling platform, a plurality of initial data access platforms, a processor, and a memory storing a computer program, wherein the first service platform is communicatively connected with each second service platform, each second service platform is communicatively connected with the scheduling platform, the scheduling platform is communicatively connected with each initial data access platform, and each initial data access platform is communicatively connected with the first service platform; when the computer program is executed by the processor, the following steps are implemented, as shown in FIG. 1:

Specifically, the first service platform, the plurality of second service platforms, the scheduling platform, and the plurality of initial data access platforms are connected in a distributed architecture so that the platforms can communicate with one another.

Specifically, the first service platform is used to monitor each second service platform.

Specifically, the scheduling platform is the platform that, on receiving a query request sent by a second service platform, forwards the query request to one of the initial data access platforms.

Specifically, an initial data access platform is a data access platform that, on receiving a query request sent by the scheduling platform, forwards the query request to the first service platform.

Specifically, the target query request is a request to query the target service associated data set corresponding to a target user.

In a specific embodiment, step S100 further includes the following steps:

S101: When the scheduling platform receives a target query request sent by any second service platform, it obtains the access priority set $E = \{E_1, E_2, \ldots, E_\delta, \ldots, E_d\}$ corresponding to the initial data access platforms, where $E_\delta$ is the access priority of the $\delta$-th initial data access platform, $\delta = 1, 2, \ldots, d$, and $d$ is the number of initial data access platforms;

where $E_\delta$ satisfies the following condition:

$E_\delta = 1/Q_\delta$, where $Q_\delta$ is the number of access requests received by the $\delta$-th initial data access platform during the one second before the current moment.

Specifically, the current moment is the moment at which the scheduling platform receives the target query request sent by the second service platform.

S102: The scheduling platform obtains the target data access platform according to the access priority of each initial data access platform.

Specifically, the target data access platform is the initial data access platform with the highest access priority.

As described above, calculating the access priority of each initial data access platform reveals the access volume of each platform, so the target query request can be dispatched to the initial data access platform with the highest access priority (that is, the lowest recent access volume) rather than to the others. This distributes requests reasonably, prevents congestion on initial data access platforms with low access priority, and helps maintain the normal operation of the system.
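Steps S101 and S102 can be illustrated with a minimal Python sketch. The function names, the dictionary of per-platform visit counts, and the handling of a platform with zero visits (treated here as infinite priority) are assumptions made for illustration; the source does not specify how $Q_\delta = 0$ or ties between equal priorities are handled.

```python
import math


def access_priority(visits_last_second: int) -> float:
    """Access priority E = 1/Q for one initial data access platform.

    Assumption: an idle platform (Q == 0) gets infinite priority,
    since the source formula is undefined for Q == 0.
    """
    if visits_last_second < 0:
        raise ValueError("visit count cannot be negative")
    if visits_last_second == 0:
        return math.inf
    return 1.0 / visits_last_second


def select_target_platform(visits: dict[str, int]) -> str:
    """Return the platform with the highest access priority (S102).

    `visits` maps a hypothetical platform name to the number of requests
    it received in the second before the query arrived. On a tie, the
    first platform in iteration order wins (an assumption).
    """
    if not visits:
        raise ValueError("no initial data access platforms registered")
    return max(visits, key=lambda name: access_priority(visits[name]))
```

For example, with visit counts `{"a": 10, "b": 2, "c": 5}` the sketch selects platform `"b"`, whose priority 1/2 is the largest.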

S200: According to the target data access platform, the scheduling platform sends the target query request to the first service platform. That is, the scheduling platform sends the target query request to the target data access platform, and the target data access platform forwards it to the first service platform.

S300: According to the target query request, the first service platform sends the target service associated data set corresponding to the target query request to the second service platform. That is, the target service associated data set is sent to the second service platform while a copy remains stored in the first service platform.

As described above, when the second service platform needs the target service associated data set, it sends the corresponding target query request to the first service platform, obtains the target service associated data set, and then carries out its subsequent processing.

In a specific embodiment, before step S100, the target service associated data set is obtained through the following steps:

S001: When the first service platform receives a target service name from any second service platform, the first service platform sends the initial service web page information corresponding to the target service name to that second service platform.

Specifically, the first service platform is used to monitor the second service platform. That is, the first service platform has a higher level than the second service platform and can monitor the operation status, storage status, and so on of the second service platform.

Specifically, the initial service web page information includes: an initial service name, the URL corresponding to the initial service name, and the web page information corresponding to the initial service name. For example, the initial service name may be "personnel information collection".

Further, the URL corresponding to the initial service name is the URL used to obtain the web page information corresponding to that initial service name. That is, following the URL leads to the initial service web page, from which the web page information corresponding to the initial service name is obtained.

Further, the web page information corresponding to the initial service name includes a plurality of web page option names, the label of each web page option name, and the rule of each web page option name. For example, a web page option name may be name, urine test, alcohol content, and so on.

Specifically, the label of a web page option name indicates whether the data for that option needs to be re-collected. For example, the label of blood type is "1" and the label of urine test is "0", where "0" means re-collection is required and "1" means re-collection is not required.

Specifically, the rule of a web page option name is the collection requirement for that option. For example, the rule for a urine test is to collect 5-10 ml.

Specifically, the target service name is the service name of the target service, where the target service is the service that the second service platform needs to request.

Specifically, step S001 further includes the following steps:

S0011: Obtain the word vector $A$ corresponding to the target service name and the word vector set $A^0 = \{A^0_1, A^0_2, \ldots, A^0_j, \ldots, A^0_n\}$ corresponding to the initial service names, where $A^0_j$ is the word vector of the $j$-th initial service name, $j = 1, 2, \ldots, n$.

S0012: From $A$ and $A^0$, obtain the word vector similarities $F = \{F_1, F_2, \ldots, F_j, \ldots, F_n\}$ corresponding to $A$, where $F_j$ is the similarity between $A$ and $A^0_j$.

Specifically, $F_j$ satisfies the following condition:

$F_j = \dfrac{A \cdot A^0_j}{\|A\| \times \|A^0_j\|}$.

S0013: When exactly one element of $F$ attains the maximum similarity, the initial service web page information corresponding to that maximum similarity is sent to the second service platform. That is, if the maximum among the $n$ similarities is unique, the initial service web page information corresponding to it is sent to the second service platform.

S0014: When the number of elements of $F$ attaining the maximum similarity is not 1, first prompt information is fed back to the second service platform.

Specifically, the first prompt information indicates to the second service platform that the target service name it sent is incorrect.

As described above, by calculating the similarity between the word vector of the target service name and the word vector of every initial service name in the first service platform, the initial service name most similar to the target service name can be found. Even when the target service name does not exactly match any initial service name, the corresponding initial service web page information can still be obtained, which improves availability. When the number of maximum similarities is not 1, the first prompt information is sent to the second service platform so that its staff can correct the target service name promptly.
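The matching in S0011 to S0014 can be sketched in Python as follows. This is an illustrative sketch only: the function names, the use of plain lists of floats as word vectors, and the floating-point tolerance used to decide whether two similarities are "equal" are assumptions; the source does not say how the word vectors are produced or how equality is tested.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """F_j = (A . A0_j) / (||A|| * ||A0_j||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def match_service(
    target_vec: Sequence[float],
    initial_vecs: dict[str, Sequence[float]],
    tol: float = 1e-9,
) -> Optional[str]:
    """Return the unique best-matching initial service name (S0013).

    Returns None when the maximum similarity is not unique (S0014),
    in which case the caller would send the first prompt information.
    """
    sims = {name: cosine_similarity(target_vec, vec)
            for name, vec in initial_vecs.items()}
    if not sims:
        return None
    best = max(sims.values())
    # Assumption: similarities within `tol` of the maximum count as ties.
    winners = [name for name, s in sims.items()
               if math.isclose(s, best, rel_tol=0.0, abs_tol=tol)]
    return winners[0] if len(winners) == 1 else None
```

A `None` result corresponds to the branch in which the first service platform reports that the target service name is incorrect.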

S002: When the first service platform receives the first service associated data set fed back by the second service platform, the first service platform generates the target service associated data set from the first service associated data set.

Specifically, in S002 the target service associated data set is generated through the following steps:

S0021: When the first service platform receives first feedback information from the second service platform, it queries an intermediate service associated data set from the preset user database of the first service platform according to the first feedback information, and sends the intermediate service associated data set to the second service platform. That is, the second service platform sends the first feedback information to the first service platform after receiving the initial service web page information corresponding to the target service name.

Specifically, the preset user database is a database storing the data already collected for a plurality of users.

In a specific embodiment, the first feedback information is produced through the following steps:

S1: The second service platform traverses the label list of intermediate web page option names. The list contains the labels of a plurality of intermediate web page option names, where the label of an intermediate web page option name is the label of any web page option name corresponding to the initial service name in the initial service web page information corresponding to the target service name.

S2: When the label of an intermediate web page option name is of the first label type, the second service platform obtains the target user's associated data corresponding to that intermediate web page option name. That is, data is collected from the target user for each such intermediate web page option name, and the collected data is transmitted to the second service platform.

Specifically, the target user is the user whose data is collected when the target service corresponding to the target service name is processed.

Further, the first label type is the label type of web page option names whose data must be re-collected for every service request.

Further, the target user's associated data is the data collected from the target user according to the rule of the intermediate web page option name. For example, when the intermediate web page option name is a blood index or a urine test index, the rule is to collect 5-10 ml; when the intermediate web page option name is a face image, the rule is that the image must be bare-headed.

S3: When the label of an intermediate web page option name is of the second label type, the second service platform generates the target user's collection item information and uses it as the first feedback information. That is, the second service platform feeds back the intermediate web page option names of the second label type to the first service platform.

Further, the second label type is the label type of web page option names whose data does not need to be re-collected for every service request.

Further, the intermediate service associated data set includes a plurality of items of intermediate service associated data, where an item of intermediate service associated data is the target user's data for an intermediate web page option name, queried from the preset user database according to the intermediate web page option names in the first feedback information. That is, intermediate service associated data is data that has already been collected and stored in the first service platform.

S0022: The first service platform queries a second service associated data set from the preset user database of the first service platform according to the intermediate service associated data set.

Specifically, the second service associated data set includes a plurality of items of second service associated data, where an item of second service associated data is an item of intermediate service associated data that is not null.

S0023: When the first service platform receives the first service associated data set fed back by the second service platform, it merges the first service associated data set and the second service associated data set into the target service associated data set.

Specifically, the first service associated data set includes a plurality of items of first service associated data, each of which is either first-category or second-category associated data. First-category associated data is the target user's associated data obtained for an intermediate web page option name whose label is of the first label type. Second-category associated data is the target user's data re-collected according to the intermediate web page option name in the first feedback information when the corresponding intermediate service associated data is null. That is, if the second service platform does not obtain a given item of intermediate service associated data from the first service platform, it obtains freshly collected data from the target user for the corresponding intermediate web page option name.

As described above, because the first service platform sends the intermediate service associated data set to the second service platform, the second service platform does not need to re-collect data already present in that set, which saves time and labor. At the same time, the second service associated data set and the first service associated data set sent by the second service platform are merged into the target service associated data set and stored centrally in the first service platform. This extends the data held for the target user, so that every second service platform can retrieve the target user's data when handling a related service, which achieves high service availability.
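The data flow of S1 to S3 and S0021 to S0023 can be summarized in a short Python sketch. It collapses the exchange between the two platforms into a single function for readability, so it shows the logic rather than the message passing. The function name, the `collect` callback, and the representation of one target user's stored records as a dictionary keyed by web page option name are all assumptions; the label strings follow the earlier example in which "0" means re-collection is required and "1" means it is not.

```python
from typing import Any, Callable

RECOLLECT = "0"  # first label type: collect again on every service request
REUSE = "1"      # second label type: reuse previously stored data if present


def build_target_dataset(
    option_labels: dict[str, str],
    user_db: dict[str, Any],
    collect: Callable[[str], Any],
) -> dict[str, Any]:
    """Merge freshly collected and previously stored data for one user."""
    # S1-S3: split the option names by label type.
    always_collect = [n for n, label in option_labels.items() if label == RECOLLECT]
    feedback = [n for n, label in option_labels.items() if label == REUSE]

    # S0021: look up the fed-back option names; missing entries are None.
    intermediate = {name: user_db.get(name) for name in feedback}

    # S0022: the second data set keeps only the non-null lookups.
    second = {name: v for name, v in intermediate.items() if v is not None}

    # First data set: first-category data (always re-collected) plus
    # second-category data (re-collected because the lookup was null).
    first = {name: collect(name) for name in always_collect}
    first.update({name: collect(name)
                  for name, v in intermediate.items() if v is None})

    # S0023: merge both sets into the target service associated data set.
    return {**first, **second}
```

In this sketch an option such as blood type that is already stored is never passed to `collect`, which mirrors the saving in time and labor described above.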

Specifically, each item of target service associated data is either first service associated data or second service associated data.

S500，当所述目标服务关联数据对应的数据标签为第一数据标签时，所述第二服务平台将所述目标服务关联数据删除；可以理解为：所述第二服务平台在使用过所述目标服务关联数据后，将所述目标服务关联数据删除，不对所述目标服务关联数据进行留存。S500, when the data tag corresponding to the target service-related data is the first data tag, the second service platform deletes the target service-related data; it can be understood that: after using the target service-related data, the second service platform deletes the target service-related data and does not retain the target service-related data.

具体的，所述第一数据标签是指不能在所述第二服务平台留存的数据类型，可以理解为：非所述第一数据标签对应的数据是指能够在第二服务平台留存的数据，例如：不可以留存的数据可以是采集的SIM卡号、用户账号等表征用户唯一关联信息的数据。Specifically, the first data tag refers to the type of data that cannot be retained on the second service platform, which can be understood as: data not corresponding to the first data tag refers to data that can be retained on the second service platform. For example, data that cannot be retained may be collected SIM card numbers, user accounts, and other data that represent unique user-related information.

Specifically, the second data tag denotes a type of data that can be retained on the second service platform. For example, data that can be retained may be the primary key of the target service-related data, a desensitized serial number, or other data that does not disclose the user's personal information.

As described above, the type of the data tag corresponding to each item of target service-related data is determined; data that can be retained is stored on the second service platform, and data that cannot be retained is deleted. Because most of the data is deleted and only a small amount of data that does not involve the user's personal privacy remains, low data retention is achieved. Operators therefore cannot later obtain the user's service-related data from the second service platform, which prevents personal information from leaking from the second service platform and enhances data security.
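The tag-based handling above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `DataTag`, `ServiceRecord`, and `apply_retention`, as well as the sample field names and values, are hypothetical, and the sketch assumes each item of target service-related data already carries its tag when it reaches the second service platform.

```python
from dataclasses import dataclass
from enum import Enum


class DataTag(Enum):
    FIRST = "first"    # must not be retained on the second service platform
    SECOND = "second"  # may be retained on the second service platform


@dataclass(frozen=True)
class ServiceRecord:
    name: str
    value: str
    tag: DataTag


def apply_retention(records):
    """Return (retained, deleted_names) once the records have been used."""
    retained = {}
    deleted_names = []
    for record in records:
        if record.tag is DataTag.SECOND:
            retained[record.name] = record.value  # second tag: store
        else:
            deleted_names.append(record.name)     # first tag (S500): delete
    return retained, deleted_names


records = [
    ServiceRecord("sim_card_number", "SIM-EXAMPLE", DataTag.FIRST),
    ServiceRecord("user_account", "account-example", DataTag.FIRST),
    ServiceRecord("record_primary_key", "pk-001", DataTag.SECOND),
]
retained, deleted_names = apply_retention(records)
print(retained)       # {'record_primary_key': 'pk-001'}
print(deleted_names)  # ['sim_card_number', 'user_account']
```

Only the primary key survives on the second service platform in this example, which matches the low-retention goal: the identifying fields are dropped after use.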

In a specific embodiment, the following steps are further included:

S01: At a preset time interval, the first service platform inspects the storage disk of the second service platform. Those skilled in the art may set the time interval according to actual needs, which is not described in detail here.

Specifically, the storage disk of the second service platform refers to the one or more disks used to store all data of the second service platform.

S02: If the target service-related data is stored on the storage disk of the second service platform, the first service platform sends a second prompt message to the second service platform.

Specifically, the second prompt message prompts the second service platform to delete the target service-related data.

As described above, the first service platform inspects the storage disk of each second service platform at regular intervals and, when target service-related data is found stored on a second service platform, sends that platform a prompt message to delete the data. This serves as a warning, discourages the second service platform from storing target service-related data, and protects the security of user data.
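Steps S01 and S02 can be sketched as a periodic check. The sketch below is an illustration under stated assumptions: the storage disk is modeled as a list of stored item names, the prompt message is a plain dictionary, and the names `audit_once`, `run_audits`, `list_stored_items`, and `send_prompt` are hypothetical. The specification does not define a message format or a transport.

```python
import time


def audit_once(stored_item_names, target_item_names):
    """One S01/S02 check: build a prompt message if target data is still stored."""
    leftovers = sorted(set(stored_item_names) & set(target_item_names))
    if not leftovers:
        return None
    return {"kind": "second_prompt", "delete": leftovers}


def run_audits(list_stored_items, target_item_names, send_prompt,
               interval_seconds, rounds):
    """Repeat the check `rounds` times, waiting `interval_seconds` between checks."""
    for round_index in range(rounds):
        prompt = audit_once(list_stored_items(), target_item_names)
        if prompt is not None:
            send_prompt(prompt)
        if round_index + 1 < rounds:
            time.sleep(interval_seconds)


sent = []
run_audits(
    list_stored_items=lambda: ["record_primary_key", "sim_card_number"],
    target_item_names=["sim_card_number", "user_account"],
    send_prompt=sent.append,
    interval_seconds=0,  # a real deployment would use the preset interval
    rounds=2,
)
print(sent[0])  # {'kind': 'second_prompt', 'delete': ['sim_card_number']}
```

Passing the disk listing and the sender as callables keeps the check independent of how a real first service platform would reach a second service platform's storage.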

In summary, the present invention provides a distributed system for realizing high service availability and low data retention. The system includes: a first service platform, several second service platforms, a scheduling platform, several initial data access platforms, a processor, and a memory storing a computer program. The first service platform is communicatively connected to each second service platform, each second service platform is communicatively connected to the scheduling platform, the scheduling platform is communicatively connected to each initial data access platform, and each initial data access platform is communicatively connected to the first service platform. When the computer program is executed by the processor, the following steps are implemented: when the scheduling platform receives a target query request sent by any second service platform, a target data access platform is obtained from the several initial data access platforms, and the target query request is sent to the first service platform through the target data access platform; according to the target query request, the first service platform sends the target service-related data set corresponding to the target query request to the second service platform; when the data tag corresponding to the target service-related data is the first data tag, the second service platform deletes the target service-related data; and when the data tag corresponding to the target service-related data is the second data tag, the second service platform stores the target service-related data.

On the one hand, by obtaining the target data access platform, the target query request can be dispatched to the target data access platform rather than to the other initial data access platforms, which prevents congestion on the other initial data access platforms and helps maintain the normal operation of the system. On the other hand, by deleting data that cannot be retained according to the type of its data tag, the second service platform prevents personal information from leaking from the second service platform, which enhances data security.
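The selection of the target data access platform can be illustrated with a short sketch based on the access priority E_δ = 1/Q_δ used in steps S101 and S102, where Q_δ is the number of visits the δ-th initial data access platform received in the second before the current moment. Two points are assumptions of this sketch and are not stated in the specification: the platform with the highest priority is chosen as the target, and a platform with zero recent visits is given infinite priority to avoid division by zero. The function and platform names are hypothetical.

```python
def access_priorities(visits_last_second):
    """Compute E = 1 / Q for each platform from its visit count Q."""
    return {
        name: (float("inf") if visits == 0 else 1.0 / visits)  # Q == 0: assumption
        for name, visits in visits_last_second.items()
    }


def pick_target_platform(visits_last_second):
    """Assumption: the platform with the highest access priority is the target."""
    priorities = access_priorities(visits_last_second)
    return max(priorities, key=priorities.get)


visits = {"access-a": 40, "access-b": 5, "access-c": 12}
print(pick_target_platform(visits))  # access-b
```

Under these assumptions the least-visited platform receives the request, which is one plausible way to spread load across the initial data access platforms.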

Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present invention. Those skilled in the art should also understand that various modifications may be made to the embodiments without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims.

Claims

1. A distributed system for realizing high service availability and low data retention, characterized in that the system comprises: a first service platform, a plurality of second service platforms, a scheduling platform, a plurality of initial data access platforms, a processor, and a memory storing a computer program, wherein the first service platform is communicatively connected with each of the second service platforms, each of the second service platforms is communicatively connected with the scheduling platform, the scheduling platform is communicatively connected with each initial data access platform, and each initial data access platform is communicatively connected with the first service platform, and when the computer program is executed by the processor, the following steps are implemented:

S100, when the scheduling platform receives a target query request sent by any second service platform, it obtains a target data access platform from a plurality of initial data access platforms;

S200, according to the target data access platform, the scheduling platform sends the target query request to the first service platform;

S300: according to the target query request, the first service platform sends a target service-related data set corresponding to the target query request to the second service platform;

S400: when the second service platform receives the target service-related data set, the data tag corresponding to each item of target service-related data in the target service-related data set is determined;

S500: when the data tag corresponding to the target service-related data is the first data tag, the second service platform deletes the target service-related data;

S600: when the data tag corresponding to the target service-related data is the second data tag, the second service platform stores the target service-related data.

2. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that the first service platform is used to monitor each of the second service platforms.

3. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that step S100 further comprises the following steps:

S101: when the scheduling platform receives a target query request sent by any second service platform, obtaining an access priority set $E=\{E_1, E_2, \ldots, E_\delta, \ldots, E_d\}$ corresponding to the initial data access platforms, where $E_\delta$ is the access priority corresponding to the $\delta$-th initial data access platform, $\delta = 1, 2, \ldots, d$, and $d$ is the number of initial data access platforms;

wherein $E_\delta$ satisfies the following condition:

$E_\delta = 1/Q_\delta$, where $Q_\delta$ is the number of visits received by the $\delta$-th initial data access platform within one second before the current moment;

S102: The scheduling platform obtains a target data access platform according to the access priority corresponding to each of the initial data access platforms.

4. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that the first data tag refers to a type of data that cannot be retained on the second service platform.

5. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that the second data tag refers to a type of data that can be retained on the second service platform.

6. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that the following steps are further implemented:

S01, according to a preset time interval, the first service platform detects the storage disk in the second service platform;

S02: If the target service-related data is stored in the storage disk in the second service platform, the first service platform sends a second prompt message to the second service platform.

7. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that the target query request refers to a request for querying the target service-related data set corresponding to a target user.

8. The distributed system for realizing high service availability and low data retention according to claim 1, characterized in that, before step S100, the target service-related data set is obtained by the following steps:

S001, when the first service platform receives a target service name of any second service platform, the first service platform sends initial service webpage information corresponding to the target service name to the second service platform;

S002: When the first service platform receives the first service-related data set fed back by the second service platform, the first service platform generates a target service-related data set according to the first service-related data set.

9. The distributed system for realizing high service availability and low data retention according to claim 8, characterized in that, in S002, the target service-related data set is further generated by the following steps:

S0021: when the first service platform receives first feedback information from the second service platform, the first service platform queries an intermediate service-related data set from a preset user database of the first service platform according to the first feedback information, and sends the intermediate service-related data set to the second service platform;

S0022, the first service platform queries a second service-related data set from a preset user database of the first service platform according to the intermediate service-related data set;

S0023: When the first service platform receives the first service-related data set fed back by the second service platform, the first service-related data set and the second service-related data set are merged into a target service-related data set.