CN107247725B

CN107247725B - Structural Integrity Detection Optimization Method Based on Metadata Logically Independent Fragmentation

Info

Publication number: CN107247725B
Application number: CN201710290286.5A
Authority: CN
Inventors: 赵晓非; 柴争义; 尤轶; 郭永新
Original assignee: Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Current assignee: Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Priority date: 2017-04-28
Filing date: 2017-04-28
Publication date: 2020-10-23
Anticipated expiration: 2037-04-28
Also published as: CN107247725A

Abstract

The invention relates to a structural integrity detection optimization method based on logically independent fragmentation of metadata, comprising the following steps: formalizing the repository metadata and data into a description logic SHIQ knowledge base; performing logically independent fragmentation on the SHIQ metadata knowledge base; and performing structural integrity detection on the logically independent fragments. The method completely preserves all information relevant to the given metadata, so that detection targeting part of the metadata can be carried out on a smaller data set rather than on the whole repository, and detection of the whole repository can be decomposed across different metadata fragments. Experimental results show that, on average, the metadata fragments generated by the method are significantly smaller than the original metadata, so the efficiency of structural integrity detection performed on them can be significantly improved.

Description

Structural Integrity Detection Optimization Method Based on Metadata Logically Independent Fragmentation

Technical Field

The invention belongs to the technical field of repository systems, and in particular relates to a structural integrity detection optimization method based on logically independent fragmentation of metadata.

Background

The metadata in a repository system is organized in a complex structure that is multi-level, layered, and dynamically changing, so maintaining the consistency of such a system is an important task. Consistency in a repository system includes: (1) Operational consistency, which concerns the interaction between repository applications and is closely related to the concept of repository transactions; it is further divided into cooperative atomicity and concurrent multi-user access. (2) Metadata integrity, which includes structural integrity and well-formedness. Well-formedness ensures the syntactic correctness of element definitions within a meta-level, while structural integrity ensures that the elements of one level conform to the type definitions in the adjacent, higher meta-level.

Structural integrity is an important part of repository system consistency. If structural integrity is not guaranteed, repository applications may modify or create metadata elements in layer Mₙ that conflict with their metaclasses in layer Mₙ₊₁. For example, an operation that reads an attribute of an element whose metaclass does not exist is invalid. A database system contains layers M₀ to M₂, where the content of layer M₂ is fixed. To provide a customizable and extensible system framework, a repository system introduces layer M₃, which allows users to define layer M₂. At runtime, layers M₀, M₁ and M₂ can all be modified dynamically, which may lead to conflicts between adjacent layers, that is, structural integrity problems. Other systems do not face this problem because they assume that the system framework is static at runtime.

However, high computational overhead has made automatic detection of structural integrity an increasingly difficult problem. There are four main reasons: (1) the volume of metadata has grown rapidly in recent years; (2) metadata is updated very frequently; (3) the set of constraints keeps growing; and (4) the internal complexity of the constraints keeps increasing. How to optimize structural integrity detection to improve its efficiency has therefore gradually become a research focus in the field of repository system consistency.

At present, the Meta-Object Facility (MOF) has become the mainstream international specification for metadata repositories. One approach to structural integrity detection for MOF repository systems converts the structural integrity constraints into logical expressions and then turns the constraint detection problem into a logical reasoning problem, as in the work of Duboisset et al. (Duboisset M, et al. Integrating the calculus-based method into OCL: Study of expressiveness and code generation, Proc of the 18th Int Workshop on Database and Expert Systems Applications. Piscataway, NJ: IEEE, 2007: 502-506), Donald et al. (Donald C, et al. Using first-order logic to query heterogeneous internet data sources, Proc of the 2015 Int Conf on Soft Computing and Software Engineering. Holand: Academic Press, Elsevier, 2015: 1-8) and Demuth et al. (Demuth B, et al. OCL as a specification language for business rules in database applications, LNCS 2185: Proc of the 4th Conf on UML. Berlin: Springer, 2006: 104-117). However, because structural integrity constraints involve many complex mechanisms such as recursion, negation, and bag semantics, the proposed conversion algorithms can hardly cover all of them. Some algorithms have been improved to support as many constraint mechanisms as possible, but the high complexity of their processing makes them inefficient.

Summary of the Invention

The purpose of the present invention is to overcome the deficiencies of the prior art and to provide a structural integrity detection optimization method based on logically independent fragmentation of metadata that is reasonably designed, simple, and efficient.

The present invention solves the existing technical problem by adopting the following technical solution:

A structural integrity detection optimization method based on logically independent fragmentation of metadata, comprising the following steps:

Step 1: Formalize the repository metadata and data into a description logic SHIQ metadata knowledge base;

Step 2: Perform logically independent fragmentation on the SHIQ metadata knowledge base;

Step 3: Perform structural integrity detection on the logically independent fragments.

Step 2 is implemented as follows:

(1) According to criterion 1 and criterion 2, compute the attribute deduction fragment […] of a given element a of the SHIQ metadata knowledge base, where […] denotes the SHIQ metadata knowledge base;

(2) Add all class assertions of element a to […];

(3) For each R₀(a, b) in […] and each R satisfying […], determine whether […] is satisfied; if so, add R₀(a, b), the assertions in the argument […] and the assertions in […] to […];

(4) Compute […].

Criterion 1 is: the attribute assertions in the SHIQ metadata knowledge base that have element a as their first or second component.

Criterion 2 is: the attribute assertions in the SHIQ metadata knowledge base that lie on a role path from element a to element b and share the same transitive parent role.

Step 1 includes: formalizing the meta-level layer Mₙ₊₁ and the instance layer Mₙ of the repository separately, where n is 0 or 1.

The meta-level layer Mₙ₊₁ is formalized as follows:

(1) Convert each metaclass in the meta-level into a SHIQ concept, in such a way that two different metaclasses may have attributes with the same name but different types;

(2) Formalize a class C and a type C′ in the meta-level as concepts together with two mutually inverse roles r₁ and r₂;

(3) Generalization: if a metaclass C₁ is a generalization of a metaclass C₂, formalize this as C₂ ⊑ C₁, so that every attribute of C₁ and every aggregation association and general association involving C₁ is inherited by C₂; this also applies to multiple inheritance between metaclasses.

The instance layer Mₙ is formalized as follows:

(1) If element c of layer Mₙ is an instance of metaclass C in its meta-level, it is formalized as C(c);

(2) If element c₁ of layer Mₙ is associated with element c₂, the corresponding metaclass C₁ aggregates C₂ through an aggregation association A, and A is formalized as a role in the TBox, then the relationship is formalized as A(c₁, c₂);

(3) If element c₁ of layer Mₙ is associated with element c₂, the corresponding metaclass C₁ is linked to metaclass C₂ through a general association, and the general association between C₁ and C₂ is formalized as a concept and roles r₁ and r₂, then the relationship between c₁ and c₂ is formalized as three assertions: A(a), r₁(a, c₁) and r₂(a, c₂).

Step 3 is carried out as follows: checking the class membership of a single metadata element a, or the attribute relationship between metadata elements a and b, is performed on the fragment that contains a and b; checking all instance elements of a metaclass, or all instance elements related through an attribute, is performed by running the same query on every fragment in parallel and then merging the results.
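The second half of step 3 (same query on every fragment, then merge) can be sketched as follows. This is a minimal illustration, not the patented implementation: fragments are assumed to be plain sets of `(concept, element)` pairs, the query is a simple lookup of asserted instances rather than full SHIQ reasoning, and the names `instances_of` and `query_all_fragments` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def instances_of(fragment, concept):
    """Return the elements asserted to be instances of `concept` in one fragment."""
    return {element for (c, element) in fragment if c == concept}


def query_all_fragments(fragments, concept):
    """Run the same query on every fragment in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        partial_results = pool.map(lambda f: instances_of(f, concept), fragments)
        return set().union(*partial_results)


# Toy fragments (assumed data): class assertions spread over three fragments.
fragments = [
    {("Dimension", "d1"), ("Level", "l1")},
    {("Dimension", "d2")},
    {("Level", "l2")},
]
result = query_all_fragments(fragments, "Dimension")
```

Because each fragment is queried independently, the per-fragment work can be distributed; the merge step is a plain set union.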

The advantages and positive effects of the present invention are:

In view of the characteristics of MOF repositories, the present invention converts the different levels of metadata into a description logic SHIQ knowledge base, gives on that basis a formal definition of logically independent metadata fragments, and provides a method for extracting them. The method completely preserves all information relevant to the given metadata, which brings two benefits: on the one hand, detection targeting part of the metadata can be carried out on a smaller data set rather than on the whole repository; on the other hand, detection of the whole repository can be decomposed across different metadata fragments. Experimental results show that, on average, the metadata fragments produced by the method are significantly smaller than the original metadata, and the efficiency of structural integrity detection performed on them can be significantly improved.

Brief Description of the Drawings

Fig. 1 shows the experimental results of the effectiveness evaluation of the present invention;

Fig. 2 is an example of a general association in layer Mₙ₊₁ of a MOF repository system;

Fig. 3 is an example of an aggregation association in layer Mₙ₊₁ of a MOF repository system;

Fig. 4 is an example of an attribute type conflict.

Detailed Description

Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Step 1: Formalize the repository metadata and data into a description logic SHIQ knowledge base.

The relationship between adjacent levels in the MOF framework is a type-instance relationship. We therefore formalize the information in the meta-level, that is, layer Mₙ₊₁ (n is 0 or 1), as concept definitions in the SHIQ TBox […], and formalize the elements of layer Mₙ, which act as instances, into the SHIQ ABox […]. Specifically, when n = 1, layers M₂ and M₁ are formalized into the TBox and the ABox respectively, and the consistency between layers M₂ and M₁ is checked; when n = 0, layers M₁ and M₀ are formalized into the TBox and the ABox respectively, and the consistency between layers M₁ and M₀ is checked. The two formalizations are described below.

1. Formalization of layer Mₙ₊₁

(1) Metaclasses and meta-attributes

In the meta-level a metaclass is itself a class, so we do not distinguish between metaclasses and classes. Since metaclasses and SHIQ concepts are both used to describe sets of instances, we convert each metaclass into a SHIQ concept.

An attribute a of type C′ belonging to class C associates each instance of C with instances of C′, so a is a binary relation between instances of C and instances of C′. We therefore formalize attribute a as a SHIQ role, expressed by the assertion C ⊑ ∀a.C′. If a has a multiplicity i..j, the multiplicity is formalized by number restrictions on a that bound the number of a-values of each instance of C from below by i and from above by j.

The assertion above states precisely that, for each instance c of concept C, all objects related to c through a are instances of C′. It also correctly reflects that attribute names are not unique across the meta-level: two different metaclasses may have attributes with the same name but different types.
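As a worked instance of this encoding, consider a hypothetical attribute `name` of type `String` with multiplicity 1..1 on a metaclass `Dimension`. The attribute, its type and its multiplicity are illustrative assumptions and do not come from the source; the use of number restrictions qualified with ⊤ is likewise one common way to write a multiplicity in SHIQ, not necessarily the exact form used by the authors.

```latex
\begin{align*}
\mathit{Dimension} &\sqsubseteq \forall\, \mathit{name}.\mathit{String} \\
\mathit{Dimension} &\sqsubseteq (\geq 1\; \mathit{name}.\top) \sqcap (\leq 1\; \mathit{name}.\top)
\end{align*}
```

A second metaclass could declare its own `name` attribute with a different type through a separate axiom of the first kind, which is how the encoding tolerates attribute names that are not unique across the meta-level.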

(2) General association

A general association in the meta-level, shown in Fig. 2, expresses a binary relation between the instances of two metaclasses. Each general association has two association ends and a corresponding association class, and each association end carries a multiplicity constraint. Unlike attribute names, the names of general associations are unique within the MOF framework.

We formalize a general association between classes C and C′ (with association ends r₁ and r₂) as a concept A together with two mutually inverse roles r₁ and r₂. Role r₁ describes association end r₁ and takes C and C′ as its first and second components respectively, so the restriction on the values of the components of r₁ is formalized as: […]

The relationship between r₁ and r₂ is formalized as r₂ ≡ r₁⁻. The multiplicity i₁..j₁ of r₁ and the multiplicity i₂..j₂ of r₂ are formalized respectively as: […]

(3) Aggregation association

An aggregation association in the meta-level, shown in Fig. 3, expresses a part-whole relationship between the instances of two metaclasses and is a binary relation. For example, the aggregation association between LevelBasedHierachy and HierarchyLevelAssociation states that each instance of LevelBasedHierachy is composed of a set of instances of HierarchyLevelAssociation.

Since an aggregation association is essentially a form of general association, it is converted in the same way as a general association. The distinction between the containing class and the contained class is not lost: by convention, the first component of the role is the containing class.

(4) Generalization

A generalization relationship in the MOF framework states that every instance of a subclass is also an instance of its superclass. Instances of the subclass therefore inherit the attributes of the superclass, and the subclass may also define attributes of its own.

Generalization is supported directly by SHIQ: if a metaclass C₁ is a generalization of a metaclass C₂, we formalize this as C₂ ⊑ C₁. Because the semantics of ⊑ is based on set inclusion, given the assertion C₂ ⊑ C₁ in SHIQ, every tuple of a role that takes C₁ as its i-th component may also take an instance of C₂ as its i-th component, since that instance is also an instance of C₁. In the formalization, therefore, every attribute of C₁ and every aggregation association and general association involving C₁ is inherited by C₂. The same formalization also applies fully to multiple inheritance between metaclasses.
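The inheritance argument can be illustrated with a small sketch that follows told subsumption axioms upward and collects the attributes declared along the way, including under multiple inheritance. The data layout (pairs `(sub, sup)` for C₂ ⊑ C₁ and a dictionary of declared attributes) and the function names are assumptions made for illustration; a real SHIQ reasoner derives subsumptions in a far more general way.

```python
def superclasses(concept, told):
    """Return every concept D with concept ⊑ D, following told axioms (sub, sup)."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        for sub, sup in told:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def inherited_attributes(concept, told, declared):
    """Attributes declared on the concept itself or on any of its superclasses."""
    attrs = set(declared.get(concept, ()))
    for sup in superclasses(concept, told):
        attrs |= set(declared.get(sup, ()))
    return attrs


# Toy hierarchy (assumed): C2 has two direct generalizations, C1 and C0.
told = [("C2", "C1"), ("C2", "C0"), ("C1", "Top")]
declared = {"C1": {"name"}, "C0": {"owner"}, "C2": {"level"}}
attrs = inherited_attributes("C2", told, declared)
```

Here `C2` ends up with its own attribute plus those of both generalizations, which mirrors the statement that the formalization also covers multiple inheritance.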

2. Formalization of layer Mₙ

Each element of layer Mₙ is an instance of the corresponding metaclass in layer Mₙ₊₁, and each relationship between elements is an instance of the corresponding association between metaclasses. The elements of layer Mₙ are therefore converted into the ABox […] of the SHIQ knowledge base. There are three cases:

(1) If element c of layer Mₙ is an instance of metaclass C in its meta-level, it is formalized as C(c);

(2) If element c₁ of layer Mₙ is associated with c₂, the corresponding metaclass C₁ (or one of its ancestors) aggregates C₂ (or one of its ancestors) through an aggregation association A, and A is formalized as role A in the TBox, then the relationship is formalized as A(c₁, c₂);

(3) If element c₁ of layer Mₙ is associated with c₂, the corresponding metaclass C₁ (or one of its ancestors) is linked to metaclass C₂ (or one of its ancestors) through a general association, and that general association is formalized as concept A and roles r₁ and r₂, then the relationship between c₁ and c₂ is formalized as three assertions: A(a), r₁(a, c₁) and r₂(a, c₂).
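The three conversion cases can be sketched as small constructor functions that emit ABox assertions as tuples. The tuple representation, the function names, and the example identifiers (`Assoc`, `a`, `c1`, `c2`) are hypothetical choices for illustration only.

```python
def class_assertion(metaclass, element):
    """Case (1): element is an instance of metaclass, i.e. C(c)."""
    return (metaclass, element)


def aggregation_assertion(role, whole, part):
    """Case (2): aggregation role A links c1 to c2, i.e. A(c1, c2).

    The containing element comes first, following the convention that the
    first component of the role is the containing class.
    """
    return (role, whole, part)


def general_association_assertions(concept, r1, r2, link, c1, c2):
    """Case (3): a link individual a of concept A, i.e. A(a), r1(a, c1), r2(a, c2)."""
    return [(concept, link), (r1, link, c1), (r2, link, c2)]


# Build a tiny ABox covering all three cases.
abox = [class_assertion("C1", "c1"), class_assertion("C2", "c2")]
abox.append(aggregation_assertion("A", "c1", "c2"))
abox += general_association_assertions("Assoc", "r1", "r2", "a", "c1", "c2")
```

Case (3) reifies the association: the extra individual `a` stands for the link itself, which is why one relationship between c₁ and c₂ produces three assertions.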

Step 2: Perform logically independent fragmentation on the SHIQ metadata knowledge base.

1. Basic idea of logically independent fragmentation

The conversion in step 1 yields the metadata knowledge base […]. We now discuss how to fragment the metadata in a logically independent way, starting with some definitions.

Definition 1 (Signature): Given an assertion γ of the ABox […], the set of metadata elements occurring in γ is called the signature of γ, written Sig(γ). The signature of all metadata elements in […] is written […].
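Definition 1 translates almost directly into code. In the sketch below an assertion is assumed to be a tuple whose first entry is the concept or role name and whose remaining entries are metadata elements; `sig` and `sig_all` are hypothetical names.

```python
def sig(assertion):
    """Signature of one assertion: the metadata elements that occur in it."""
    _predicate, *elements = assertion
    return set(elements)


def sig_all(abox):
    """Signature of a whole ABox: the union of the signatures of its assertions."""
    result = set()
    for assertion in abox:
        result |= sig(assertion)
    return result


abox = [("C", "a"), ("R", "a", "b"), ("D", "c")]
elements = sig_all(abox)
```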

Definition 2 (Role path): If, for i = 1, …, n-1, […] contains either a role assertion Rᵢ(aᵢ, aᵢ₊₁) or a role assertion Rᵢ⁻(aᵢ₊₁, aᵢ), then there is a role path between metadata elements a₁ and aₙ.

A role path may contain inverse roles. For example, given R₁(a₁, a₂), R₂(a₃, a₂) and R₃(a₃, a₄), the role path from a₁ to a₄ is {R₁, R₂⁻, R₃}, and conversely the role path from a₄ to a₁ is {R₃⁻, R₂, R₁⁻}.
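The role-path example can be reproduced with a breadth-first search in which every asserted role R(x, y) may be traversed forwards as R or backwards as its inverse. This is an illustrative sketch: the tuple layout, the `"-"` suffix used to mark an inverse role, and the function name `role_path` are assumptions.

```python
from collections import deque


def role_path(assertions, start, goal):
    """Find a role path from start to goal; traversing R(x, y) from y to x uses 'R-'."""
    edges = {}
    for role, x, y in assertions:
        edges.setdefault(x, []).append((role, y))
        edges.setdefault(y, []).append((role + "-", x))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for role, nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [role]))
    return None  # no role path exists


# The example from the text: R1(a1, a2), R2(a3, a2), R3(a3, a4).
abox = [("R1", "a1", "a2"), ("R2", "a3", "a2"), ("R3", "a3", "a4")]
forward = role_path(abox, "a1", "a4")
backward = role_path(abox, "a4", "a1")
```

The forward result is the list R1, R2-, R3 and the backward result is R3-, R2, R1-, matching the two paths given in the example.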

Running a complete query over the whole metadata knowledge base makes reasoning inefficient and hard to manage. Since the metadata in the knowledge base can be decomposed into distinct logically independent fragments, a query can be decomposed and run on the individual fragments, which reduces the amount of metadata to be queried and allows the queries to run in parallel. For example, to query information related to the instances of the metadata element Dimension, we only need to run the query on the metadata fragment that contains the instances of Dimension. To keep query results complete, a metadata fragment must be logically independent, that is, it must be the closure of the logical entailments of the given metadata elements. Based on this analysis, we give the formal definition of a logically independent metadata fragment:

Definition 3 (Logically independent metadata fragment): Let […] be the metadata knowledge base and let the set S be a signature. A subset […] of […] is called a logically independent fragment of signature S if and only if, for any assertion γ (class assertion or attribute assertion) satisfying Sig(γ) […], […] is equivalent to […].

Definition 3 states the necessary and sufficient condition for being a logically independent metadata fragment, and it ensures the completeness of the logical entailments of the metadata elements in signature S. However, by Definition 3 and the monotonicity of SHIQ, any superset of […] is also a logically independent fragment of S (for example, the whole ABox […] is always a logically independent fragment of S), so Definition 3 does not guarantee that the logically independent fragment of signature S is unique.

Our goal is to carve out precise metadata fragments that contain only the assertions indispensable to a given signature, so that the resulting fragments are as small as possible while remaining information-complete. Put simply, for assertions to be indispensable to a given signature S, they must be able to affect the logical consequences of a metadata element in S. To single out such assertions, we give the following definitions:

Definition 4 (Argument): Given a metadata knowledge base […] and an assertion α with […], a fragment […] of […] is called an argument (justification) for α if and only if, for any […], […] and […] hold. The argument for α is written […].

Definition 5 (Key assertion): Given a metadata knowledge base […], a metadata element a and an assertion γ, γ is called a key assertion of {a} if and only if, for any assertion α of a (class assertion or attribute assertion), […] holds.

By the definitions above, the argument […] for an assertion α is in essence a minimal fragment of the metadata knowledge base that entails α; that is, every assertion in […] is a key assertion for α. An assertion γ can affect the logical derivation of a metadata element in signature S if and only if it occurs in the argument for any attribute assertion or class assertion of that element, in which case γ is a key assertion of S. The logically independent fragment […] built from all key assertions of S not only retains all the information about the class assertions and attribute assertions of the elements of S but also has minimal size. Our task therefore becomes computing, for a given signature S, the logically independent fragment that contains only key assertions, that is, the minimal logically independent fragment. Unless stated otherwise, "logically independent fragment" below means the minimal logically independent fragment. It can be shown that the union of the logically independent fragments of the individual elements of S is the logically independent fragment of S, so it suffices to give an algorithm that computes the logically independent fragment of a single metadata element of S.

To compute the logically independent fragment of a given single metadata element a, we must decide whether each assertion in […] is a key assertion of a, that is, test whether each assertion is involved in deriving an attribute assertion or class assertion of a. By the SHIQ reasoning procedure, the class assertions of a metadata element depend on both class assertions and attribute assertions, whereas attribute assertions between different metadata elements are affected only by attribute assertions and are independent of class assertions. The logically independent fragment of a single metadata element a can therefore be computed in three steps. First, compute the set of assertions […] in which every assertion is involved in deriving some attribute assertion R(a, b) of a; we call this set the attribute deduction fragment. Next, compute the set of assertions […] in which every assertion is involved in deriving some class assertion C(a) of a; we call this set the class deduction fragment. Finally, merge the two sets to obtain the logically independent fragment of a.
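The three-step structure, together with the earlier remark that the fragment of a signature is the union of the fragments of its elements, can be sketched as a thin composition layer. The two fragment computations are passed in as functions; the `toy_*` stand-ins below only look at directly asserted facts and are not the criteria or formulas of the method. All names are hypothetical.

```python
def fragment_of_element(abox, a, attribute_fragment, class_fragment):
    """Steps 1-3: attribute deduction fragment, class deduction fragment, then merge."""
    prop = attribute_fragment(abox, a)
    cls = class_fragment(abox, a, prop)  # computed on the basis of the first fragment
    return prop | cls


def fragment_of_signature(abox, signature, attribute_fragment, class_fragment):
    """Fragment of a signature S: union of the fragments of its elements."""
    result = set()
    for a in signature:
        result |= fragment_of_element(abox, a, attribute_fragment, class_fragment)
    return result


# Stand-ins for illustration only (assumed, much weaker than the real computations).
def toy_attribute_fragment(abox, a):
    return {t for t in abox if len(t) == 3 and a in t[1:]}


def toy_class_fragment(abox, a, prop):
    return {t for t in abox if len(t) == 2 and t[1] == a}


abox = {("C", "a"), ("D", "b"), ("R", "a", "b"), ("S", "c", "d")}
frag_a = fragment_of_element(abox, "a", toy_attribute_fragment, toy_class_fragment)
frag_ab = fragment_of_signature(abox, ["a", "b"], toy_attribute_fragment, toy_class_fragment)
```

Passing the attribute fragment into the class-fragment step reflects the dependency described in the text: class assertions depend on attribute assertions, but not the other way round.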

2. Computing the attribute deduction fragment

Every assertion in the attribute deduction fragment […] of metadata element a is involved in deriving some attribute assertion R(a, b) of a; hence […], where γ is an ABox assertion and […] is the argument for R(a, b). Because both the role hierarchy and transitive roles affect attribute assertions, computing […] requires considering two kinds of assertions. The first kind has the form […] or […]. The second kind consists of the assertions on the role path from a to b, all of which have a transitive parent role R₀ and […]; for example R₁(a, a₁), R₂(a₂, a₁), […] with R₁, R₂⁻, […] and R₀ a transitive role. From the first kind we obtain criterion 1: the attribute assertions in […] that have a as their first or second component. From the second kind we obtain criterion 2: the attribute assertions in […] on a role path from a to b that share the same transitive parent role. The set of attribute assertions satisfying criterion 1 and criterion 2 is therefore the attribute deduction fragment […] of metadata element a.
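A rough sketch of the two criteria follows. It rests on several assumptions that are not stated in the source: the role hierarchy is supplied as a precomputed map `super_roles` from a role (or an inverse role, written with a `"-"` suffix) to all of its super-roles; `transitive` is the set of transitive roles; role paths are followed outward from a only; and the final fragment is taken to be the union of the assertions selected by either criterion. The function names are hypothetical.

```python
def criterion1(abox, a):
    """Attribute assertions that have a as their first or second component."""
    return {(r, x, y) for (r, x, y) in abox if a in (x, y)}


def criterion2(abox, a, super_roles, transitive):
    """Attribute assertions on role paths from a whose steps share a transitive parent."""
    selected = set()
    for r0 in transitive:
        # Directed steps x -> y whose role (or inverse role) is a sub-role of r0.
        edges = {}
        for assertion in abox:
            role, x, y = assertion
            if r0 in super_roles.get(role, {role}):
                edges.setdefault(x, []).append((y, assertion))
            if r0 in super_roles.get(role + "-", set()):
                edges.setdefault(y, []).append((x, assertion))
        # Collect every assertion reachable from a along such steps.
        stack, seen = [a], {a}
        while stack:
            node = stack.pop()
            for nxt, assertion in edges.get(node, []):
                selected.add(assertion)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return selected


# Example modelled on the text: R1(a, a1), R2(a2, a1), with R1 and R2- below R0.
abox = {("R1", "a", "a1"), ("R2", "a2", "a1"), ("S", "a3", "a2")}
super_roles = {"R1": {"R1", "R0"}, "R2-": {"R2-", "R0"}}
transitive = {"R0"}
fragment = criterion1(abox, "a") | criterion2(abox, "a", super_roles, transitive)
```

In this toy run criterion 1 alone selects only R1(a, a1), while criterion 2 also pulls in R2(a2, a1), because the transitive parent R₀ lets a reach a₂ through a₁. The unrelated assertion S(a3, a2) is left out.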

3. Computing the class deduction fragment

To compute the logically independent fragment of metadata element a, the class deduction fragment […] must be computed on the basis of the attribute deduction fragment […] already obtained. Every assertion in […] is involved in deriving some class assertion C(a) of a; hence […], where γ is an ABox assertion and […] is the argument for C(a).

As noted above, the derivation of class assertions of a metadata element in SHIQ depends on both class assertions and attribute assertions, so the class assertions of a are an indispensable part of […]. To identify the attribute assertions that affect C(a), each assertion in […] must also be examined. Given a metadata knowledge base […], C(a) holds only if the supporting concept of an assertion of element a is subsumed by C; so, to identify the assertions involved in deriving the class assertions of a, we must determine whether the supporting concept of the assertion is subsumed by some concept. For example, let […]. To decide whether R₀(a, b) is involved in deriving a class assertion of a, we must determine whether the supporting concept of R₀(a, b) is subsumed by some concept, that is, whether

[…] (1)

holds, where […] and b ∈ C₁. It is easy to see that the formula is satisfied once C₁ is replaced by B and C₂ by A, so R₀(a, b) is involved in deriving C₂(a); that is, R₀(a, b) belongs to the argument for C₂(a) and should be added to […]. Moreover, since C₁(b) is also needed to derive C₂(a), the assertions in […] should be added to […] as well.

The example above considers only a single attribute assertion that affects the derivation of a class assertion. In practice, a class assertion is sometimes affected by several assertions. For example, let

[example knowledge base]

In this case R₀(a, b) is still related to the derivation of A(a), but when formula (1) is used to test whether the supporting concept of R₀(a, b) is subsumed by some concept, the test turns out to be unsatisfiable. Therefore, to include all the information about metadata element a, formula (1) should be extended to:

[formula (2)]

where C₃ is the integration of all other information about element a and

[…]

If the number restrictions of SHIQ are taken into account, formula (2) should be further extended to:

[formula (3)]

where

[…]

and

[…]

The corresponding term in formula (2) is merely a special case of ≥nR.C₁, and […] denotes the R-neighbors of element a that belong to C₁. Formula (3) is the most general form for identifying the attribute assertions that affect C(a). It states that, in the metadata knowledge base, for any attribute assertion R(a, b) that affects a class assertion of metadata element a, the corresponding supporting concept must be subsumed by some concept, and for any number restriction, the number of R-neighbors of a must be no less than the restricted number.
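The counting condition on R-neighbors can be illustrated with a small sketch. It is only an approximation under stated assumptions: assertions are plain tuples, membership in C₁ is read from asserted class assertions only (no reasoning, no role hierarchy), and the names `r_neighbors` and `meets_number_restriction` are hypothetical helpers, not part of the patented method.

```python
def r_neighbors(role_assertions, class_assertions, a, role, concept):
    """Return the R-neighbors of `a` that are asserted instances of `concept`.

    role_assertions:  iterable of (role, subject, object) tuples
    class_assertions: iterable of (concept, individual) tuples
    Assumption: only explicitly asserted facts are consulted.
    """
    members = {ind for (c, ind) in class_assertions if c == concept}
    return {
        obj
        for (r, subj, obj) in role_assertions
        if r == role and subj == a and obj in members
    }


def meets_number_restriction(role_assertions, class_assertions, a, role, concept, n):
    """Check the counting side of a restriction of the form >= n R.C1."""
    neighbors = r_neighbors(role_assertions, class_assertions, a, role, concept)
    return len(neighbors) >= n


# Toy data (hypothetical): a has three R-neighbors, two of which are in C1.
roles = {("R", "a", "b1"), ("R", "a", "b2"), ("R", "a", "b3"), ("S", "a", "b4")}
classes = {("C1", "b1"), ("C1", "b2"), ("C1", "b4")}
neighbors_in_c1 = r_neighbors(roles, classes, "a", "R", "C1")
```

In a real SHIQ setting the membership test and the role test would be answered by a description logic reasoner rather than by set lookups.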

4. Execute the metadata logically independent fragmentation algorithm

The class assertions of a metadata element depend on both class assertions and attribute assertions, whereas attribute assertions between different elements are affected only by attribute assertions and are independent of class assertions. The logically independent fragmentation of a single metadata element a should therefore first compute the attribute deduction fragment, in which each assertion is related to the derivation of some attribute assertion R(a, b) of a, and then, on that basis, compute the class deduction fragment, in which each assertion is related to the derivation of some class assertion C(a) of a. The union of the two is the logically independent fragment of a. This reasoning yields the following algorithm.

[Algorithm 1]
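The three stages described above (attribute deduction fragment, class deduction fragment, union) can be sketched roughly as follows. This is an illustrative simplification, not the patent's Algorithm 1: the tuple encoding, the `transitive_parent` map, and the `supports` callback are all assumptions. In particular, `supports` stands in for the reasoner test of formula (3), and adding the class assertions of b is a guess at the set the text leaves unspecified.

```python
from collections import deque


def attribute_fragment(role_assertions, a, transitive_parent):
    """Attribute deduction fragment of `a` (simplified reading of criteria 1 and 2).

    role_assertions:   set of (role, subject, object) tuples
    transitive_parent: dict mapping a role to its transitive parent role (assumed given)
    """
    # Criterion 1: assertions with `a` as first or second element.
    frag = {ra for ra in role_assertions if a in (ra[1], ra[2])}
    # Criterion 2: assertions on role paths starting at `a` whose roles
    # share the same transitive parent role.
    for parent in set(transitive_parent.values()):
        edges = [ra for ra in role_assertions if transitive_parent.get(ra[0]) == parent]
        seen, queue = {a}, deque([a])
        while queue:
            x = queue.popleft()
            for ra in edges:
                if ra[1] == x:
                    frag.add(ra)
                    if ra[2] not in seen:
                        seen.add(ra[2])
                        queue.append(ra[2])
    return frag


def class_fragment(class_assertions, attr_frag, a, supports):
    """Class deduction fragment of `a`; `supports(ra)` is a hypothetical reasoner check."""
    frag = {ca for ca in class_assertions if ca[1] == a}
    for ra in attr_frag:
        if ra[1] == a and supports(ra):
            frag.add(ra)
            # Assumption: the class assertions of b stand in for the unspecified set.
            frag |= {ca for ca in class_assertions if ca[1] == ra[2]}
    return frag


def independent_fragment(role_assertions, class_assertions, a, transitive_parent, supports):
    """Union of the attribute and class deduction fragments."""
    attr = attribute_fragment(role_assertions, a, transitive_parent)
    return attr | class_fragment(class_assertions, attr, a, supports)


# Toy knowledge base (hypothetical names throughout).
roles = {("partOf", "a", "b"), ("partOf", "b", "c"), ("owns", "d", "a"), ("owns", "x", "y")}
classes = {("A", "a"), ("B", "b"), ("C", "c")}
fragment = independent_fragment(
    roles, classes, "a", {"partOf": "partOfT"}, lambda ra: ra[0] == "partOf"
)
```

On the toy data, the unrelated assertion owns(x, y) and the class assertion C(c) stay outside the fragment of a, which is the size reduction the method relies on.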

Step 3: Perform structural integrity detection on the logically independent fragments.

Because each fragment obtained in step 2 is the closure of what a given metadata element logically entails, logically independent fragmentation makes it possible to perform structural integrity detection on a smaller set of metadata, or to perform it in parallel. Depending on the initial size of the metadata knowledge base, it can be partitioned into mutually disjoint subsets of suitable size, and Algorithm 1 then generates the same number of logically independent fragments. Detecting the class membership of a single metadata element a, or the attribute relationship between two metadata elements a and b, can be carried out on the fragment containing a and b rather than on the entire metadata knowledge base. Conversely, detecting all instance elements of a metaclass, or all instance elements associated through an attribute, can be done by executing the same query on every fragment in parallel and then merging the results.
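The "same query on every fragment, then merge" pattern might look like the sketch below. It assumes each fragment is a collection of (concept, individual) class assertions held in memory, and it uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the function names and the toy data are hypothetical, and the patent does not specify how the parallel execution is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def instances_of(fragment, concept):
    """Answer the query atom concept(x) on one fragment."""
    return {ind for (c, ind) in fragment if c == concept}


def parallel_instances(fragments, concept, max_workers=4):
    """Run the same query on every fragment in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(lambda frag: instances_of(frag, concept), fragments)
        # Fragments may overlap, so the merge is a set union.
        return set().union(*partials)


# Two toy fragments that share one assertion.
fragments = [
    {("SimpleType", "int"), ("StructuredType", "Address")},
    {("SimpleType", "string"), ("SimpleType", "int")},
]
simple_types = parallel_instances(fragments, "SimpleType")
```

Because logically independent fragments can share assertions, merging by set union avoids counting an individual twice.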

According to the structural integrity constraints, if an operation changes the type of an attribute in the meta-level, the new type is not a supertype of the original type, and the original type is a metaclass that already exists in the meta-level, then a structural integrity conflict arises unless the elements in the level below are modified as well. This kind of conflict can be detected by the following statement (assuming that the type of the attribute referencedType of the metaclass Property is changed from StructuredType to SimpleType, rather than to a supertype such as DataType, as shown in Figure 4):

[query statement]

In the query for this example, computing both count1 and count2 requires executing the query atom referencedType(property, datatype), which searches the entire metadata knowledge base for all datatypes referenced by the given property. Computing count2 additionally requires the query atom SimpleType(datatype), which searches the entire metadata knowledge base for all datatypes that belong to SimpleType. With the logically independent fragmentation of step 2, both query atoms can be executed in parallel, which improves detection efficiency.
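A rough sketch of how count1 and count2 could be computed from the two query atoms is given below. The original query statement is not available here, so the final comparison (a conflict is flagged when count1 differs from count2) is an assumption, as are the tuple encoding and the function names.

```python
def referenced_type_counts(role_assertions, class_assertions, prop, new_type="SimpleType"):
    """Compute (count1, count2) for one property.

    count1: datatypes referenced by `prop` via referencedType
    count2: those referenced datatypes that also belong to `new_type`
    """
    referenced = {
        y for (r, x, y) in role_assertions if r == "referencedType" and x == prop
    }
    conforming = {d for d in referenced if (new_type, d) in class_assertions}
    return len(referenced), len(conforming)


def has_type_conflict(role_assertions, class_assertions, prop, new_type="SimpleType"):
    """Assumed rule: every referenced datatype must conform to the new type."""
    count1, count2 = referenced_type_counts(role_assertions, class_assertions, prop, new_type)
    return count1 != count2


# Toy instance data (hypothetical).
roles = {
    ("referencedType", "p1", "int"),
    ("referencedType", "p1", "Address"),
    ("referencedType", "p2", "string"),
}
classes = {("SimpleType", "int"), ("SimpleType", "string"), ("StructuredType", "Address")}
```

Here p1 still references the structured type Address, so it would be reported as a conflict, while p2 would not.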

The next example is the detection of aggregation multiplicity conflicts. According to the structural integrity constraints, if an operation changes the multiplicity of an aggregation end in the meta-level and the lower-level elements are not modified accordingly, the number of corresponding instances will conflict with the modified multiplicity. This kind of conflict can be detected by the following statement (assuming that the multiplicity at the AggregationEnd end of the aggregation between Aggregation and AggregationEnd is changed from 1 to 2):

[query statement]

In this example, computing count1 requires executing the query atom Aggregation-AggregationEnd(aggregation, aggregationEnd), which searches the entire metadata knowledge base for all aggregationEnd elements associated with the given aggregation through Aggregation-AggregationEnd. With the logically independent fragmentation of step 2, this query atom can be optimized through parallel execution, which improves detection efficiency.
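The multiplicity check can be approximated as follows. Since the original statement is not reproduced, the rule used here (an aggregation conflicts when its number of associated ends differs from the required multiplicity) is an assumption, and the function name and data are hypothetical.

```python
from collections import Counter


def multiplicity_conflicts(role_assertions, aggregations, required,
                           role="Aggregation-AggregationEnd"):
    """Return {aggregation: count1} for aggregations whose end count differs from `required`.

    role_assertions: iterable of (role, subject, object) tuples
    aggregations:    the known aggregation instances to check
    """
    # Deduplicate first so an assertion repeated across fragments counts once.
    counts = Counter(x for (r, x, _end) in set(role_assertions) if r == role)
    return {agg: counts[agg] for agg in aggregations if counts[agg] != required}


# Toy instance data: the multiplicity was changed from 1 to 2.
role = "Aggregation-AggregationEnd"
assertions = [(role, "agg1", "e1"), (role, "agg1", "e2"), (role, "agg2", "e3")]
conflicts = multiplicity_conflicts(assertions, ["agg1", "agg2", "agg3"], required=2)
```

Passing the list of known aggregations explicitly lets the check also report instances that have no associated end at all.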

To evaluate the effectiveness of the invention, we carried out extensive experiments that focused on how much the optimization method improves the time performance of structural integrity detection. The instance set for the experiments was taken from MBRS, a MOF metadata repository system. The system consists of a repository client, a repository management module, and a data store. The repository client is used to build repository applications on top of the system. The repository management module processes metadata and provides services to the repository client; it implements the logically independent fragmentation of metadata and the parallel processing. The data store consists of the M₀ layer and the metadata of each layer above it. The repository management module in turn includes a set of well-defined MBRS interface APIs, whose implementation is based on an extension of JMI reflection, and a metadata manager, which organizes metadata into a hierarchy and manages the querying and storage of metadata at each layer. MBRS uses an Oracle 11g database to store the M₀ layer data and the metadata. The structural integrity conflicts were obtained in two ways, from system instances and by manual injection. They cover all aspects of structural integrity, including conflicts related to deleting and creating packages, conflicts related to changing the multiplicity of association ends and attributes, and conflicts related to modifying references. The experimental results show that the optimization method improves the detection efficiency for every kind of structural integrity conflict, to varying degrees.

The effectiveness tests were run on an Intel Xeon E7-4830 eight-core CPU with 6 GB of memory. The elapsed time of the categorized tests is measured in milliseconds, and the results are shown in Figure 1. The figure shows the execution time of structural integrity detection for M₀, M₁, and M₂ layer metadata of different sizes. The upper part shows the time taken to check consistency when the M₂ layer metadata is small (24 classes) and the number of corresponding M₁ and M₀ layer elements is varied. The black curve is the execution time without optimization; the blue, orange, and purple curves are the times taken by structural integrity detection when the M₂ layer is divided into 2, 4, and 8 fragments, respectively. The lower part shows the execution time when the M₂ layer metadata is larger and the corresponding M₁ and M₀ layer metadata are larger as well. Both without optimization and with 2, 4, or 8 fragments, the time taken by structural integrity detection is linear in the metadata size, as expected. Doubling the number of fragments does not halve the detection time, most likely because the fragmentation process itself takes a certain amount of time. Even so, every increase in the number of fragments significantly reduces the detection time. The experimental results also show that, on average, the optimization method yields a significant improvement in time efficiency on small and medium-sized metadata sets.

It should be emphasized that the embodiments described in the present invention are illustrative rather than restrictive. The invention therefore includes, but is not limited to, the embodiments described in the detailed description, and any other embodiment derived by a person skilled in the art from the technical solution of the invention likewise falls within the scope of protection of the invention.

Claims

1. A structural integrity detection optimization method based on metadata logically independent fragmentation, characterized by comprising the following steps:

Step 1: Formalize the repository metadata and data into a description logic SHIQ metadata knowledge base;

Step 2: Perform logically independent fragmentation on the SHIQ metadata knowledge base;

Step 3: Perform structural integrity detection on the logically independent fragments;

The specific implementation of step 2 comprises the following steps:

(1) Calculate, according to criterion 1 and criterion 2, the attribute deduction fragment […] of a given element a of the SHIQ metadata knowledge base, where […] is the SHIQ metadata knowledge base;

(2) Add all class assertions of element a to […];

(3) For each R₀(a, b) in […] and each R satisfying […], determine whether […] is satisfied; if so, add R₀(a, b), the assertions in the argument […], and the assertions in […] to […];

(4) Calculate […];

Criterion 1 is: the attribute assertions in the SHIQ metadata knowledge base that have element a as their first or second element;

Criterion 2 is: the attribute assertions on the role path from element a to element b in the SHIQ metadata knowledge base that have the same transitive parent role.

2. The structural integrity detection optimization method based on metadata logically independent fragmentation according to claim 1, characterized in that step 1 comprises: formalizing the meta-level Mₙ₊₁ layer and the instance-level Mₙ layer of the repository separately, where n is 0 or 1.

3. The structural integrity detection optimization method based on metadata logically independent fragmentation according to claim 2, characterized in that the meta-level Mₙ₊₁ layer is formalized as follows:

(1) Convert each metaclass in the meta-level into a SHIQ concept, allowing two different metaclasses to have attributes with the same name but different types;

(2) Formalize a class C and a type C' in the meta-level into concepts and two reciprocal roles r₁ and r₂;

(3) Generalization relationship: if a metaclass C₁ is a generalization of a metaclass C₂, it is formalized as […].

4. The structural integrity detection optimization method based on metadata logically independent fragmentation according to claim 2, characterized in that the instance-level Mₙ layer is formalized as follows:

(1) If an element c of the Mₙ layer is an instance of a metaclass C in its meta-level, it is formalized as: C(c);

(2) If an element c₁ of the Mₙ layer is associated with an element c₂, the corresponding metaclass C₁ aggregates C₂ through an aggregation association A, and the aggregation association A is formalized as a role in the TBox, then the association is formalized as: A(c₁, c₂);

(3) If an element c₁ of the Mₙ layer is associated with an element c₂, the corresponding metaclass C₁ is related to the metaclass C₂ through a general association, and the general association between C₁ and C₂ is formalized as a concept and roles r₁ and r₂, then the relationship between c₁ and c₂ is formalized as three assertions: A(a); r₁(a, c₁); r₂(a, c₂).

5. The structural integrity detection optimization method based on metadata logically independent fragmentation according to claim 1, characterized in that step 3 is carried out as follows: detecting the class membership of a single metadata element a, and detecting the attribute relationship between metadata element a and metadata element b, are performed on the fragment containing metadata element a and metadata element b; detecting all instance elements of a metaclass, or all instance elements associated through an attribute, is performed by executing the same query on every fragment in parallel and merging the results.