FR3135802A1

FR3135802A1 - Method for supervised generation of a virtual semantic graph of specialized knowledge

Info

Publication number: FR3135802A1
Application number: FR2204883A
Authority: FR
Inventors: Christian Iasio
Original assignee: BRGM SA
Current assignee: BRGM SA
Priority date: 2022-05-20
Filing date: 2022-05-20
Publication date: 2023-11-24

Abstract

The invention is a computer-implemented method (100) for generating a knowledge graph, comprising the following steps: - providing (103) several triples, each corresponding to a knowledge assertion; - chaining (104) the triples so as to organize them into a topological graph; - defining (106) types of semantic concepts and defining (107) semantic concepts to describe and qualify the types of entities and attributes recognized among the elements of the topological graph; - instantiating (108) the defined semantic concepts, the instances being generated by classifying the elements that make up the triples and being associated with one another so as to also inherit the relationships of the topological graph, thereby generating a semantic knowledge graph. Figure for the abstract: Figure 3

Description

Method for supervised generation of a virtual semantic graph of specialized knowledge

The invention relates to knowledge management, in particular to the supervised generation and population of virtual semantic graphs of specialized knowledge, that is, knowledge dedicated to a particular application or analysis.

Knowledge graphs aim to structure the knowledge acquired in a predetermined domain according to the relationships between its elements. When they are built virtually by computer means, they allow those means to analyze the knowledge in the graph rapidly or even automatically, in particular to determine hidden links between entities that seemed remote from one another, to detect recurring semantic patterns, and to generalize facts to domains broader than the use case concerned. These graphs thus serve in particular as tools for modeling complex processes of socio-technical systems, for example for analyzing the management of certain accidents, so as to improve the management of the system's processes and even to prevent such accidents.

Several knowledge organization methods are known in the prior art, but each is suited to a particular domain and has its own drawbacks. For example, the FRAM method ("Functional Resonance Analysis Method"), used for accident analysis, requires a large quantity of data to be supplied within a precise and complex framework, resulting in a knowledge graph that obeys a syntax its users must learn and that is difficult to generalize to other domains. The AHP method ("Analytic Hierarchy Process") generates a graph on the basis of initial hypotheses that may later prove false and may result in overly complex links between the entities of the graph. Methods for creating ontologies are also known, such as the "Methontology" method, the "Gospl" method and the "Gellish" method. However, they do not provide a common framework for all domains of knowledge organization.

In general, it is difficult to identify a method for the supervised generation of a specialized knowledge graph that can be generalized to all domains.

It is also difficult to identify a method that aggregates knowledge within a single framework from numerous knowledge sources with varied profiles, and in particular a method that allows a newcomer to the targeted domain to contribute knowledge.

Finally, it is difficult to obtain a virtual graph whose relationships are easy to analyze, since the links between the entities of the graph often become increasingly complex as the graph is enriched.

The invention aims in particular to enable the supervised generation of a specialized knowledge graph by the same method whatever the targeted domain.

It also aims to enable a large quantity of knowledge, coming from varied and heterogeneous sources, to be structured into a graph whose relationships are simple to understand and analyze.

Finally, it aims to enable any newcomer to the domain covered by the graph to contribute knowledge in a simple manner.

To this end, the subject of the invention is a computer-implemented method for generating a virtual semantic graph of specialized knowledge, in which the following steps are carried out:

- providing several triples, each corresponding to a knowledge assertion, each triple comprising two nouns linked by a predicate;

- aggregating the triples according to their nouns so as to link the triples to one another and organize them into a topological graph;

- detecting semantic patterns in the topological graph;

- depending on the semantic patterns detected, defining generic semantic concept types, defining semantic concepts, each semantic concept belonging to a generic semantic concept type, and defining the possible associations between the semantic concepts so as to form a semantic model comprising the concepts associated with one another;

- instantiating the semantic concepts, each instance being associated with at least one of the defined concepts, the instances being generated by classifying the content of the topological graph, so as to generate a knowledge graph.

Thus, the step of providing triples corresponds to the contribution of knowledge. In the context of the invention, a triple (in French, "triplet" or "triplé") is a group of three verbal or numerical elements associated with one another to represent a knowledge assertion. The expression "knowledge assertion" covers the formulation of a testimony, a fact, or a hypothesis to be verified. This contribution is reduced to its simplest expression, since it only involves indicating two nouns, corresponding to two entities or to an entity and an attribute of that entity, and a predicate, corresponding to the link between these two nouns, for example an interaction or a qualification. Thus, any person, not necessarily an expert in the domain covered by the knowledge graph, can contribute knowledge. This providing step is in particular carried out via a software or web application that lets a user fill in fields intended for collecting triples.
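By way of non-limiting illustration, a triple as described above could be represented as follows. This is a minimal sketch only: the names `Triple`, `parse_triple` and the `source` field are assumptions introduced for the example and do not appear in the original text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge assertion: two nouns linked by a predicate."""
    subject: str    # first noun (an entity)
    predicate: str  # link between the nouns (interaction or qualification)
    obj: str        # second noun (an entity, or an attribute of the subject)
    source: str = "unknown"  # hypothetical field: who contributed the assertion


def parse_triple(fields, source="unknown"):
    """Build a Triple from three form fields, rejecting empty entries."""
    if len(fields) != 3:
        raise ValueError("a triple needs exactly three elements")
    cleaned = [str(f).strip() for f in fields]
    if not all(cleaned):
        raise ValueError("triple elements must not be empty")
    return Triple(cleaned[0], cleaned[1], cleaned[2], source)


# Hypothetical assertion entered through an input form
example = parse_triple(["flood", "damages", "hospital"], source="witness-1")
```

Keeping the record this small reflects the stated goal that a contributor only has to supply two nouns and a predicate.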

The aggregation step yields a topological graph of the knowledge provided. After this step, an expert in the domain targeted by the knowledge therefore has a topological graph available for analysis. This graph nevertheless lacks any semantic contribution, since it contains only the triples provided by the various sources, without any generalization having been made or any category having been created to group the knowledge. The aggregation of triples is carried out as the triples are entered, manually or automatically. It relies on computer means that recognize identical or similar nouns (according to spelling or grammatical criteria) already used in other triples, so as to progressively link the triples around the similar nouns they contain. These means display the entered triples on a screen together with the portion of the topological graph that concerns them.
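The aggregation described above can be sketched as follows, purely as an illustration. The `normalize` function is a deliberately crude stand-in for the recognition of "identical or similar" nouns (lowercasing, whitespace collapsing and a naive plural strip); the actual matching criteria are not specified in the text, and all names here are assumptions.

```python
def normalize(noun):
    """Map a noun to a matching key (assumed, simplistic similarity rule)."""
    key = " ".join(noun.lower().split())
    # Naive plural handling: "hospitals" -> "hospital", but keep "process"
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def aggregate(triples):
    """Link triples around shared nouns: node -> list of (predicate, node)."""
    graph = {}
    for subject, predicate, obj in triples:
        s_key, o_key = normalize(subject), normalize(obj)
        graph.setdefault(s_key, []).append((predicate, o_key))
        graph.setdefault(o_key, [])  # make sure every noun becomes a node
    return graph


# Hypothetical triples from two different contributors
triples = [
    ("Flood", "damages", "hospitals"),
    ("hospital", "shelters", "patients"),
]
topological_graph = aggregate(triples)
```

Here the two triples become connected through the shared node "hospital", even though the contributors wrote the noun differently.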

The step of detecting semantic patterns in the topological graph aims to identify recurring links between the elements of the topological graph, such as similar predicates between semantically close entities. It therefore aims to identify generic types of interaction that can be applied to many of the elements forming the nodes of the topological graph, and hence to suggest concepts more general than the case study of the topological graph. In this sense, it prepares the following steps. This step can be carried out manually or using machine learning algorithms that have been trained to detect and recognize semantic patterns and/or links. In the latter case, the training is carried out beforehand on pre-existing ontologies serving as training databases.
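As a simple illustration of what a recurring link may look like, the sketch below counts predicates that appear in several triples. This frequency count is a substitute chosen for the example, not the manual analysis or the trained machine learning algorithms mentioned above; the function name and threshold are assumptions.

```python
from collections import Counter


def recurring_predicates(triples, min_count=2):
    """Return predicates used in at least `min_count` triples."""
    counts = Counter(predicate.lower().strip() for _, predicate, _ in triples)
    return {p: n for p, n in counts.items() if n >= min_count}


# Hypothetical triples from an accident-management case study
triples = [
    ("firefighter", "evacuates", "resident"),
    ("police", "evacuates", "school"),
    ("flood", "damages", "road"),
]
patterns = recurring_predicates(triples)
```

In this example the predicate "evacuates" recurs, which could suggest a generic interaction between an agent-like entity and another entity.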

The next step corresponds to the definition of the generic semantic concept types. "Generic concept type" can in particular designate two general types: generic "entities", such as "Resource" and "Agent" on the one hand or "Process" and "Event" on the other, and the concept types forming generic "attributes", used to "name" an identity, a context, or a qualification. The concepts forming "attributes" are thus intended to qualify the concepts forming "entities". The concept types are the primary components of the semantic data model, that is, of the semantic structure. The choice of these concept types and the choice of the semantic model, that is, of the concepts themselves, are therefore interdependent.

The concept definition step corresponds to the selection of semantic classes, called "concepts" or "semantic concepts", belonging to the generic concept types defined above, and to the definition of their semantic characteristics. "Concept" designates terms useful for classifying elements of the topological graph as entities or as entity attributes, in a manner suited to the objective of the case study. These concepts can be defined from scratch, or alternatively selected by the user from existing generic ontologies ("upper ontologies") provided in a database. The characteristics and properties of these concepts are also defined at this step. Finally, the possible relationships and associations between these concepts are defined, in particular between concepts belonging to the "entity" concept types on the one hand, and between "entity" concepts and "attribute" concepts on the other.
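The semantic model resulting from these definitions can be pictured with the following sketch, given as an assumption-laden illustration. The entity type names "Resource", "Agent", "Process" and "Event" come from the description; the attribute type names "Identity", "Context" and "Qualification", the class `SemanticModel` and the example concepts are hypothetical.

```python
from dataclasses import dataclass, field

# Generic concept types mapped to their general kind.
# The three attribute type names are assumed for illustration.
CONCEPT_TYPES = {
    "Resource": "entity", "Agent": "entity",
    "Process": "entity", "Event": "entity",
    "Identity": "attribute", "Context": "attribute",
    "Qualification": "attribute",
}


@dataclass
class SemanticModel:
    concepts: dict = field(default_factory=dict)    # concept -> concept type
    associations: set = field(default_factory=set)  # (concept, relation, concept)

    def add_concept(self, name, concept_type):
        if concept_type not in CONCEPT_TYPES:
            raise ValueError(f"unknown concept type: {concept_type}")
        self.concepts[name] = concept_type

    def allow(self, source, relation, target):
        """Declare a possible association between two defined concepts."""
        for name in (source, target):
            if name not in self.concepts:
                raise KeyError(name)
        self.associations.add((source, relation, target))

    def is_allowed(self, source, relation, target):
        return (source, relation, target) in self.associations


model = SemanticModel()
model.add_concept("infrastructure", "Resource")
model.add_concept("rescue service", "Agent")
model.allow("rescue service", "operates", "infrastructure")
```

Associations are stored as directed entries, so allowing "rescue service operates infrastructure" does not implicitly allow the reverse.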

Together, these concepts form the reference ontology of the knowledge graph of interest. This ontology is called "hybrid" because its concepts are drawn on the one hand from general ontologies ("upper ontologies"), applicable to many domains, and on the other hand from ontologies of specific domains relevant to the domain of interest. These external concepts are selected, aggregated, and adapted in their definitions as concepts of the semantic data model, taking into account the scope and application of the knowledge graph of interest. Any knowledge provided through the triples fits within this reference ontology.

These definitions provide a very generic framework that can be used in many different use cases. They also simplify as far as possible the knowledge graph that will subsequently be generated, since each entity of the graph belongs to one concept among a small number of concepts.

This step is carried out manually by an expert in the application domain, or semi-automatically using computer means. The concept definition step is carried out concurrently with the step of defining the concept types, but the concepts can subsequently be supplemented as knowledge assertions are discovered or as triples are added to the topological graph.

The concept instantiation step is the step that transforms the topological graph into a semantic knowledge graph proper. It is these concept instances that appear in the final knowledge graph.

This instantiation and association step, which generates the semantic knowledge graph, can be carried out manually by the expert via computer interaction means, such as a software or web application, or automatically by machine learning means that have been trained to produce, from a topological graph, instances conforming to a reference ontology and a reference semantic data model.

The concepts of this ontology may themselves include semantic subclasses. For example, the subclass "hospital" could belong to the concept "infrastructure", which itself belongs to the concept type "Resource". Instantiations of these subclasses can then also be made.
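The hierarchy in this example can be sketched as a simple parent lookup. The mapping below only encodes the "hospital", "infrastructure" and "Resource" example from the text; the names `PARENT` and `lineage` are assumptions made for the illustration.

```python
# Each entry points one level up: subclass -> concept -> concept type
PARENT = {
    "hospital": "infrastructure",
    "infrastructure": "Resource",
}


def lineage(name, parent=PARENT):
    """Walk upward from a subclass to its concept type."""
    chain = [name]
    # Stop at the top of the hierarchy, or if a cycle would be entered
    while chain[-1] in parent and parent[chain[-1]] not in chain:
        chain.append(parent[chain[-1]])
    return chain


hospital_lineage = lineage("hospital")
```

Walking this chain is one way to move between levels of granularity: an instance of "hospital" can also be viewed as an "infrastructure" or, more generally, as a "Resource".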

By classifying in this way, as instances of semantic concepts, all the triples provided in the topological graph, while respecting the logic of the semantic data model defined in the previous step, a knowledge graph is produced in which it is easy to navigate, to deduce hidden links between the elements of the graph, and to generalize the content of the topological graph, in particular by moving between different levels of granularity and by adjusting the resolution of the information extracted from the graph.
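A minimal sketch of this transformation is given below, under the assumption that the classification of each node is already known (for instance supplied by the expert). The function name, the dictionary layout and the example data are hypothetical; the sketch only shows instances inheriting the relations of the topological graph.

```python
def instantiate(topological_graph, classification):
    """Turn classified nodes into concept instances that keep their relations.

    topological_graph: node -> list of (predicate, target node)
    classification:    node -> concept name (assumed to be given)
    Returns (instances, unclassified nodes).
    """
    instances = {}
    unclassified = []
    for node in topological_graph:
        if node in classification:
            instances[node] = {"concept": classification[node], "relations": []}
        else:
            unclassified.append(node)
    # Inherit the relations of the topological graph between instances
    for node, edges in topological_graph.items():
        if node not in instances:
            continue
        for predicate, target in edges:
            if target in instances:
                instances[node]["relations"].append((predicate, target))
    return instances, unclassified


topological_graph = {
    "flood": [("damages", "hospital")],
    "hospital": [],
    "rumor": [("mentions", "flood")],
}
classification = {"flood": "Event", "hospital": "infrastructure"}
knowledge_graph, pending = instantiate(topological_graph, classification)
```

Nodes that cannot be classified are returned separately rather than dropped, so that they can be reviewed later.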

The method allows traceable processing of knowledge that may be very abundant and heterogeneous and that is provided, in the form of simple triples, by very different sources that are not experts in the domain. Finally, by conforming to a very generic semantic model, the method can be used in all relevant domains, without anyone having to learn and apply a complicated syntax or complicated rules in order to contribute knowledge to the graph.

The invention may also include one or more of the following optional features, taken alone or in combination.

Preferably, each knowledge assertion corresponds:

- either to a qualification, in which case the predicate of the triple comprises a linking verb to qualify one of the two nouns of the triple by the other noun of the triple;

- or to an action, in which case the predicate of the triple comprises an action verb to describe an interaction between the two nouns of the triple,

and, preferably, at least one of the nouns includes a temporal or spatial reference so as to position chronologically or geographically the assertion to which the triple corresponds.

The temporal or spatial reference can also make it possible to position chronologically or spatially the assertions corresponding to at least some of the other triples that include the same noun.
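The distinction between qualification and action, and the extraction of a temporal reference from a noun, could be sketched as follows. The list of linking verbs, the ISO-style date pattern and the function names are assumptions made for the example; the text does not specify how these are recognized.

```python
import re

# Assumed, non-exhaustive list of English linking verbs
LINKING_VERBS = {"is", "are", "was", "were", "seems", "becomes", "remains"}

# Assumed date format for temporal references: YYYY-MM-DD
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def assertion_kind(predicate):
    """Classify a predicate as a 'qualification' or an 'action'."""
    words = predicate.lower().split()
    first = words[0] if words else ""
    return "qualification" if first in LINKING_VERBS else "action"


def temporal_reference(noun):
    """Return the date embedded in a noun, or None if there is none."""
    match = DATE_PATTERN.search(noun)
    return match.group(0) if match else None


kind_a = assertion_kind("is located in")
kind_b = assertion_kind("damages")
when = temporal_reference("flood of 2021-07-14")
```

A spatial reference could be handled in the same spirit, for example by matching place names or coordinates inside the noun.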

Advantageously, the providing of the triples and the organization of the triples into a topological graph are carried out via a computer module for editing and structuring text.

Preferably, the generic semantic concept types are generic entities or generic attributes, the attributes corresponding to identification, context, or qualification characteristics associated with the entities.

Advantageously, the definition of the semantic concepts is carried out via a computer module for detecting recurring relational patterns in the topological graph.

Preferably, the instantiations of the concepts and the associations of these instances with one another are carried out via a computer module for transforming the topological graph into a semantic knowledge graph.

Advantageously, the method further implements a step of manipulating the knowledge graph through computer interaction means, so as to navigate through the whole knowledge graph and to edit the knowledge graph.

The invention also provides a computer-implemented method for providing experience feedback, in which the following steps are carried out:

- choosing an existing knowledge graph, the knowledge graph having been generated in accordance with the method described above;

- providing a triple corresponding to an item of experience feedback, the triple comprising two nouns linked by a predicate;

- transforming the triple into an instantiation of one or more concepts and integrating the instance or instances into the knowledge graph.

Advantageously, the transformation is carried out automatically by a computer module that instantiates one or more concepts according to the triple and integrates them into the knowledge graph.
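The feedback method above can be illustrated with the following sketch. It assumes that the knowledge graph is stored as a dictionary of instances and that a `classify` callable returns the concept of a new noun, or None when no concept fits; these names and this layout are hypothetical and do not describe the actual module.

```python
def integrate_feedback(knowledge_graph, triple, classify):
    """Add one feedback triple to an existing knowledge graph.

    knowledge_graph: noun -> {"concept": str, "relations": list}
    classify:        callable mapping a noun to a concept name or None
    """
    subject, predicate, obj = triple
    new_instances = {}
    # Classify unknown nouns first, so the graph is untouched on failure
    for noun in (subject, obj):
        if noun not in knowledge_graph and noun not in new_instances:
            concept = classify(noun)
            if concept is None:
                raise ValueError(f"no concept found for {noun!r}")
            new_instances[noun] = {"concept": concept, "relations": []}
    knowledge_graph.update(new_instances)
    knowledge_graph[subject]["relations"].append((predicate, obj))
    return knowledge_graph


# Hypothetical existing graph and concept lookup
graph = {"hospital": {"concept": "infrastructure", "relations": []}}
lookup = {"generator": "equipment"}.get
integrate_feedback(graph, ("hospital", "lacks", "generator"), lookup)
```

The existing instance "hospital" is reused, while "generator" is created as a new instance before the relation is attached.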

The invention also provides a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps of the method described above.

The invention also provides a computer-readable recording medium comprising instructions which, when executed by a computer, cause the computer to carry out the steps of the method described above.

Brief description of the figures

The invention will be better understood on reading the following description, given solely by way of example and made with reference to the appended drawings, in which:

[Fig. …] is a diagram of means for implementing the invention;

[Fig. …] is a diagram of a computer program of the invention;

[Fig. …] is a diagram of a method according to the invention;

[Fig. …] is a diagram of a step of the method of [Fig. …];

[Fig. …] is a diagram of another method according to the invention.

Detailed description

[Fig. …] illustrates a user 1 and computing means 2 for implementing the methods 100 and 200 described below.

The user 1 is a natural person who wishes either to generate a knowledge graph from a plurality of pieces of knowledge, in the context of the method 100, or to add knowledge to a knowledge graph that has already been generated, in the context of the method 200.

The computing means 2 include interaction means 21 such as a screen and a keyboard, or other human-machine interfaces, computation means 22 such as a processor, and a computer memory 23, which is physical but could be virtual. The memory 23 stores a computer program 24, called “Core”, which, when run by the user 1, implements the steps of the methods 100 and 200.

The architecture of the program 24 is shown schematically in [Fig. …]. Any user interface 25 is connected to the program 24 through a proxy module 26, which is itself connected, directly or indirectly, to all the other modules forming the program 24. Module 27 uses the open-source software TiddlyWiki and its TiddlyMap extension to allow editing of the text data that are to be transformed into a topological graph. It is therefore a computer module for editing and organizing text. The user 1 thus uses, through the program 24, the functions of this module to process the raw data supplied in the form of triples (see below). Module 28 transforms raw data (the triples) into semantic data (the instances of concepts). It uses an Apache Jena Fuseki SPARQL server, which supports the “OWL” language (“Web Ontology Language”). Module 29 includes the PostgreSQL library and records the SPARQL queries performed on the knowledge graph. It is connected to module 30, called “Core Back-end”. Module 31 is the “Core Front-end” module; it allows JavaScript code to be injected into module 27 and queries to be sent to module 28, and is also linked to module 30. The output of the program 24 is produced at module 32, which is connected to module 28. Module 32 uses the Ontotext GraphDB tool, which allows the user to visualize and manipulate the knowledge graph generated at the end of the method 100.

As a variant, tools other than those mentioned may be used for each of these modules. For example, module 27 may rely on human-machine interfaces offering different modes of interaction 21 for entering and editing data, and module 32 may include another tool for visualizing and manipulating the semantic knowledge graph in order to extract and use the results of the analysis.

With reference to [Fig. …], a method 100 for generating a knowledge graph will now be described. It is carried out by the user 1 using the computing means 2, which execute the program 24. It is this program 24 that implements the steps of the method 100 described below. The method 100 draws on category theory, the theory of toposes (or topoi) and of “sketches”, and graph theory, as well as on grounded theory and “multiperspectivity”. In the example described below, it is applied to the management of a natural disaster. The knowledge graph it generates is intended to reveal causalities and hidden relationships between the entities involved in managing the disaster, so as to produce alternative courses of action that improve the management of future crises of the same type, or even of other types. Although the examples given here come from crisis management, the method 100 can be applied to any field, in particular because it includes the procedure for defining ontologies dedicated to different case studies, which allows it to be used in a wide variety of domains and application settings.

Beforehand, in a step 101 called “Scope graph”, the user 1 defines the scope of application of the method 100 that follows. The user defines the knowledge sources that will be used to supply the knowledge assertions, selects the names of targets, subjects or objects that may be proposed, and selects action verbs and attribution verbs that may also be used in step 103 of providing the triples, in order to describe the relationships sought or expected between these elements. Module 27 assists the user in defining this scope, which results in a graph that can be called the graph of initial terms, an example of which is given in [Fig. …].

Once the scope of application of the method 100 has been defined, the user 1 collects knowledge assertions in a step 102. This means gathering documents, testimonies, explanations and any other types of information that will be supplied to the computing means 2 to form the corpus of data available for generating and feeding the semantic knowledge graph. These knowledge assertions, collected in particular from the sources defined in the previous step, can be gathered using feedback-processing solutions ranging from basic approaches that allow text editing to machine learning tools, such as MAXQDA, which help to bring out the essential elements of the knowledge collected. The Atlas.ti software can also be used to process large volumes of textual, graphic, audio and video data and to support collaborative analysis. NLP (“Natural Language Processing”) technologies can also be used. However, these technologies do not produce original data; they merely extract part of the meaning of the data supplied. Alternatively, the data may be supplied by other computing means, such as video recording means and/or augmented reality software.

Once the knowledge assertions have been collected, step 103 begins. This step consists in providing triples, shown schematically in [Fig. …]. A triple comprises two nouns, or noun phrases, linked by a predicate that generally includes a verb. These nouns and verbs can be chosen from those defined in step 101, in which the scope of application of the method is defined. Each triple corresponds either to a qualification, in which case its predicate includes a linking verb that qualifies one of the two nouns of the triple by the other, or to an action, in which case its predicate includes an action verb V_a describing an interaction between the two nouns of the triple. In the case of a qualification, the second noun can be regarded as an attribute of the first noun. [Fig. …] thus illustrates three triples: an interaction triple between the noun N₁ and the noun N₂, with the predicate V_l1; an attribution (or qualification) triple assigning the noun a₁ to the noun N₂ by means of the predicate V_a; and another interaction triple, this time between the noun N₂ and a noun N₃, by means of a predicate V_l2. In the natural disaster management example used here, a triple may for instance comprise the nouns “Hospital” and “helipad”, linked by the predicate “has”. The triple thus conveys the knowledge that, in the context of this crisis, a hospital entity is equipped with a helicopter landing pad. In addition, a noun can also represent a temporal or geographical reference, where relevant to the case study, so as to position in time or space the information to which the triple corresponds.
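As an illustration only, the three triples described above can be sketched as a small Python data structure. The class names `Kind` and `Triple` and the helper `attributes_of` are hypothetical names chosen for this sketch and do not come from the program 24; the values reuse the symbols N1, N2, N3, a1, V_l1, V_l2 and V_a of the description.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The two kinds of triple described in step 103."""
    ACTION = "action"                # predicate describes an interaction
    QUALIFICATION = "qualification"  # second noun is an attribute of the first


@dataclass(frozen=True)
class Triple:
    first: str      # first noun or noun phrase
    predicate: str  # predicate, generally including a verb
    second: str     # second noun, or attribute in a qualification
    kind: Kind


def attributes_of(triples, name):
    """Return the attributes assigned to `name` by qualification triples."""
    return [t.second for t in triples
            if t.kind is Kind.QUALIFICATION and t.first == name]


# The three triples of the schematic example (symbols from the description).
triples = [
    Triple("N1", "V_l1", "N2", Kind.ACTION),
    Triple("N2", "V_a", "a1", Kind.QUALIFICATION),
    Triple("N2", "V_l2", "N3", Kind.ACTION),
]
```

Calling `attributes_of(triples, "N2")` returns the single attribute `a1`, since only the qualification triple attaches an attribute to N2.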

All the knowledge available to the user 1, which may have been collected from numerous and heterogeneous sources, is thus broken down into triples that express this knowledge in its simplest form. The user 1 therefore does not need to be an expert in the target domain. The knowledge may also be supplied, in the form of triples, by each or some of the sources holding it, rather than by the user 1 alone.

This step 103 is carried out through the interface 25, which is organized so that the user can intuitively and simply supply the two nouns and the predicate of each triple. It may for example present fields to be filled in for the three elements of the triple. It may suggest nouns and attributes that have already been supplied, to avoid duplicate names for the same element.

In step 104, the computing means help the user to aggregate the triples so as to form a so-called “topological” graph. This is made possible by the functions of module 27 of the program 24. Considering the full set of triples, nouns that are identical or that refer to the same element are merged so as to connect the triples to one another. For example, all the triples containing the name “Madame X” are connected around the name “Madame X”, so as to create a section of the network of triples around that name. An example of a resulting topological graph is that of [Fig. …], which concerns triples related to the management of a natural disaster.
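The aggregation of step 104 can be sketched as follows. This is a minimal illustration, not the behavior of module 27: the function name `build_topological_graph`, the `aliases` mapping used to merge names that refer to the same element, and the sample triples (other than the name “Madame X”) are assumptions made for the example.

```python
from collections import defaultdict


def build_topological_graph(triples, aliases=None):
    """Merge triples that share a noun into one adjacency structure.

    `aliases` maps alternative spellings to a canonical name, so that
    nouns referring to the same element end up on the same node.
    """
    aliases = aliases or {}
    graph = defaultdict(list)
    for first, predicate, second in triples:
        source = aliases.get(first, first)
        target = aliases.get(second, second)
        graph[source].append((predicate, target))
        graph.setdefault(target, [])  # register nodes with no outgoing edge
    return dict(graph)


# Hypothetical triples; "Mme X" is assumed to denote the same person.
triples = [
    ("Madame X", "works at", "Hospital"),
    ("Mme X", "lives in", "Village A"),
    ("Hospital", "has", "helipad"),
]
graph = build_topological_graph(triples, aliases={"Mme X": "Madame X"})
```

Both triples about the same person now hang off the single node `Madame X`, which is the merging effect the step describes.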

It should be noted that this topological graph is in itself already a useful result of the method 100, since it contains all the knowledge assertions, reorganized so as to avoid redundancy and to group similar elements together, and allows several heterogeneous sources of information to be aggregated coherently.

The next step 105 is a step of detecting semantic patterns in the topological graph. It aims to spot recurring links between the different elements of the topological graph, such as triples that include similar predicates, for nouns that may belong to closely related semantic concepts. It thus allows the user to identify generic types of interaction that can be applied to many of the elements forming the nodes of the topological graph, and therefore suggests concepts that are more general than the case study of the topological graph. This step is carried out manually by the user, assisted by module 27. Alternatively, it could be carried out automatically by a machine learning module trained to detect and recognize semantic patterns in pre-existing ontologies serving as training sets.
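One very simple way to surface candidate patterns, given here purely as an assumption and not as the approach of module 27, is to count how often each predicate recurs across the triples. The function name `recurring_predicates`, the `min_count` threshold and the sample triples are hypothetical.

```python
from collections import Counter


def recurring_predicates(triples, min_count=2):
    """Return predicates that appear in at least `min_count` triples."""
    counts = Counter(predicate for _first, predicate, _second in triples)
    return {p: n for p, n in counts.items() if n >= min_count}


# Hypothetical triples from a disaster-management case study.
triples = [
    ("Individual A", "needs", "food"),
    ("Individual B", "needs", "shelter"),
    ("Hospital", "has", "helipad"),
]
patterns = recurring_predicates(triples)
```

Here the predicate `needs` recurs twice, which could prompt the user to consider a generic interaction type; deciding whether the pattern is meaningful remains the user's task.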

In step 106, the user determines a semantic data structure. This structure is for example that of [Fig. …]. It shows two types of concepts corresponding to “generic entities”: “agents” on the one hand, which may also be called “resources”, and “processes” or “events” on the other. [Fig. …] also shows eight other types of concepts, which are not “generic entities” but “generic attributes”. In this structure, the eight types of concepts forming attributes accompany the two types of concepts forming entities, so that the entities can be semantically identified, characterized and qualified. These eight types of generic attributes form geospatial references, qualifications, conditions, contexts and key parameters to be associated with the entities. The semantic structure also indicates which types of link are permitted between entities on the one hand, and between entities and attributes on the other. As described below, every knowledge triple is intended to be transformed into instances of concepts belonging to these types of entities or attributes.

Step 107 is carried out concurrently with step 106, using the functions of module 28. It consists in creating the semantic data model by means of the structure, that is to say the types of concepts, from step 106. This means defining, or selecting from pre-existing ontologies, concepts, and possibly their subclasses, that belong to the types of concepts of the semantic structure chosen in step 106. For example, the concept “Infrastructure” is defined as belonging to the concept type “Resource”, a generic entity. Subclasses belonging to the concepts can also be defined. For example, the subclass “hospital” of the concept “infrastructure” can be defined.

The definition of a concept includes its properties or characteristics and the definition of its compatibility with concepts belonging to other types of concepts. As regards properties, one can for example define attribute concepts that are specific to a particular type of entity concept, but also generic attributes, in particular “qualification” concepts, which can be applied to all types of concept. Likewise, as regards compatibility between concepts, one can specify that a specific attribute concept is compatible with one or more entity concepts, but not with all the entity concepts of the model. By contrast, a qualification attribute concept may, for example, be associated with concepts of all types. Together, these choices lead to what is called the “semantic data model”. Thus, in the case study concerning the management of a natural disaster, the concept “People” is defined as belonging to the concept type “Agent/Resource”. An instance of this “People” concept may then be linked to:

- another instance of the same concept, the two being identified by two different “social entity” attribute concepts (for example an “Individual” belonging to a “Community”),

- an instance of another concept of the same “Agent/Resource” type (for example an “Individual” requiring “Food resources”),

- a different type of concept, such as the “Process/Event” type (for example an “Individual” suffering an “Impact”), and/or

- an instance of a concept that expresses an attribute, such as the “condition” (for example an “Individual” who is a “Disaster victim”).
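The compatibility rules listed above for the “People” concept can be sketched as a lookup table. This is an illustrative assumption about how such rules might be encoded, not the data model of module 28; the names `CONCEPT_TYPE`, `ALLOWED_TARGET_TYPES` and `link_allowed` are hypothetical, and concepts absent from the table are simply refused in this sketch.

```python
# Concept -> concept type, using the examples given in the description.
CONCEPT_TYPE = {
    "People": "Agent/Resource",
    "Food resources": "Agent/Resource",
    "Impact": "Process/Event",
    "Disaster victim": "condition",  # an attribute concept
}

# For each source concept, the concept types its instances may link to.
ALLOWED_TARGET_TYPES = {
    "People": {"Agent/Resource", "Process/Event", "condition"},
}


def link_allowed(source_concept, target_concept):
    """Check whether an instance of one concept may link to another."""
    target_type = CONCEPT_TYPE.get(target_concept)
    return target_type in ALLOWED_TARGET_TYPES.get(source_concept, set())
```

With these tables, `link_allowed("People", "Impact")` is true, while a link towards a concept that the model does not define is rejected.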

This step is carried out through the user interface 25. Alternatively, it can be carried out with the help of a machine learning module trained to propose semantic concepts from a set of ontologies of potentially relevant domains, on the basis of predefined structures of semantic data models grounded in the theories of toposes and sketches.

Together, these concepts form a reference ontology for the use case. The ontology is said to be hybrid because it is built on specifically selected generic semantic concepts that both come from external ontologies and are adapted as concepts specific to the use case.

These structures for classifying entities and attributes are intrinsically linked to mathematical concepts from category theory such as morphisms, functors and monoids. Describing knowledge by means of these structures therefore yields recurring, repetitive patterns, called universal morphisms. The set of possible concepts is limited. Although the concepts can be defined from scratch by the user, it is advantageous for the user 1 to select them from “upper ontologies” and domain ontologies, grouped into concepts. Optionally, a list of candidate concepts or upper ontologies is presented to the user on screen. These concepts (and, where relevant, their subclasses) are used as described below to define the instances of the semantic knowledge graph.

In step 108, user 1 triggers the generation of the knowledge graph. This involves producing, by means of module 28, the instances of the concepts chosen in the previous step, namely the entities, their attributes, and their relationships, classified on the basis of the semantic data model, so as to transform (or semantically enrich) the content of the topological graph. Generating the semantic instances requires defining “concordance rules” that associate the concepts with the data types of the topological graph. The user, or a semantic classification algorithm, therefore traverses the topological graph and classifies the concept instances in accordance with these rules and with the concepts defined in step 107. In other words, this is the semantification of the supplied triplets: the generation of a knowledge graph organized according to the data that make up its corpus in the form of a topological graph. The semantification of the triplets is based on category theory. It allows this ontology to be reused in other cases in the same domain, and allows simplified functional analyses in the knowledge graphs subsequently generated.
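
As a minimal sketch of what such concordance rules could look like, the following Python maps data types of topological-graph nodes to concepts. The rule table, the data-type labels ("building", "person", "headcount", "event"), the "flood" node and the function name `classify` are illustrative assumptions and are not taken from the description.

```python
# Hypothetical concordance rules: topological data type -> (concept type, concept).
# The data-type labels are illustrative assumptions, not the patent's vocabulary.
CONCORDANCE_RULES = {
    "building": ("entity", "Infrastructure"),
    "person": ("entity", "Individual"),
    "headcount": ("attribute", "key parameter: Number of employees"),
}


def classify(node_label, data_type, rules=CONCORDANCE_RULES):
    """Return a semantic instance for a topological node, or None if no rule applies."""
    rule = rules.get(data_type)
    if rule is None:
        return None  # no rule matched: left for the user to classify manually
    concept_type, concept = rule
    return {"label": node_label, "concept_type": concept_type, "concept": concept}


# Topological nodes as (label, data type) pairs; "flood" is a made-up unmatched node.
nodes = [("Saint Nazaire Hospital", "building"), ("1500", "headcount"), ("flood", "event")]
instances = [classify(label, data_type) for label, data_type in nodes]
```

Nodes for which no rule applies come back as `None`, which mirrors the description's option of letting the user, rather than an algorithm, perform the classification.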

In the case study of natural disaster management, the task is for example to describe semantically, by classification with several concepts, the entity “Saint Nazaire Hospital”. This entity is an instance of the “Infrastructure” concept, an entity of type “Resource”, classified by the specific attribute concept “Contextual Sector: Health”. The properties provided for in the concepts are defined in conformity with the reference ontology and so as to account for the triplets supplied in step 103. Thus, starting from a first triplet comprising the noun “Hospital”, the noun “Saint Nazaire” and a predicate comprising the verb “to be called”, and from a second triplet comprising the noun phrase “Saint Nazaire Hospital”, the noun phrase “1500 people” and the predicate “employs”, the user creates the instance “Saint Nazaire Hospital” of the subclass “hospital” (an “Infrastructure” of type “Resource” with the Contextual Sector attribute “Health”), associates the number “1500” with the attribute “key parameter: Number of employees”, and associates the name “Saint-Nazaire” with the attribute “identity: name”.
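
The hospital example can be sketched in Python as follows. The `Instance` dataclass, its field names and the predicate strings "is called" and "employs" are assumptions made for illustration; only the concept, subclass and attribute labels come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A semantic instance: an entity classified by concepts and carrying attributes."""
    label: str
    concept: str
    subclass: str
    entity_type: str
    attributes: dict = field(default_factory=dict)


# The two triplets of the example, written as (subject, predicate, object).
triplets = [
    ("Hospital", "is called", "Saint Nazaire"),
    ("Saint Nazaire Hospital", "employs", "1500 people"),
]

hospital = Instance(
    label="Saint Nazaire Hospital",
    concept="Infrastructure",
    subclass="hospital",
    entity_type="Resource",
    attributes={"Contextual Sector": "Health"},
)

for subject, predicate, obj in triplets:
    if predicate == "is called":
        hospital.attributes["identity: name"] = obj
    elif predicate == "employs":
        # Keep only the leading number of "1500 people".
        hospital.attributes["key parameter: Number of employees"] = int(obj.split()[0])
```

After the loop, the single instance carries both the name and the headcount, so the two separate triplets are accounted for by one classified entity.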

These instances are associated with one another in accordance with the triplets supplied in step 103 and then form the knowledge graph. Thus, given a triplet comprising the noun phrase “Saint Nazaire Hospital”, the name “Madame X” and the predicate “directs”, user 1 creates the instance “Madame X” of the concept “social entity: Individual”, itself an attribute of the concept “People”, assigns it the subclass “Director” of the concept “condition: role” in the contextual sector “Health”, and links the instance “Saint Nazaire Hospital” to the instance “Madame X” via a link between the two entities in accordance with the chosen semantic model.
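
A possible way to picture this linking step is sketched below in Python. The dictionary layout, the direction chosen for the link and the helper `neighbours` are assumptions for illustration; the description only states that the two instances are linked in accordance with the semantic model.

```python
# Existing instances, keyed by label, each holding its classification.
instances = {
    "Saint Nazaire Hospital": {"concept": "Infrastructure", "contextual sector": "Health"},
}
links = []  # (source label, relation, target label)

subject, predicate, obj = ("Madame X", "directs", "Saint Nazaire Hospital")

# Create the new instance and classify it as in the example.
instances[subject] = {
    "concept": "social entity: Individual",
    "condition: role": "Director",
    "contextual sector": "Health",
}

# The link between the two instances inherits the relation stated by the triplet.
if subject in instances and obj in instances:
    links.append((subject, predicate, obj))


def neighbours(label):
    """Labels of all instances directly linked to `label`, in either direction."""
    outgoing = {target for source, _, target in links if source == label}
    incoming = {source for source, _, target in links if target == label}
    return sorted(outgoing | incoming)
```

Calling `neighbours("Saint Nazaire Hospital")` then returns the instances connected to the hospital, which is the kind of navigation the knowledge graph is meant to support.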

Alternatively, step 108 can be carried out automatically by an instantiation module, also referred to as a “computer module for transforming the topological graph into a knowledge graph”. The module takes the elements of the topological graph and creates instances of an entity concept, characterized by attribute concepts. It may in particular be a machine learning module trained for this purpose.

The knowledge graph is then generated. It makes it possible to visualize links that were not directly identifiable beforehand, and to navigate through the whole body of knowledge in a simple and organized way.

As seen above, user 1 is at least assisted by program 24 at each step of the method, but some of these steps can be carried out automatically by machine learning algorithms. In general, extracting semantic classes from pre-existing ontologies for the concept vocabulary, and applying the set of rules needed to transform a topological instance into a semantic instance, can therefore be carried out by interactive database processing or through a graphical interface. The design of system 24 involves integrating the semi-automatic incremental construction of ontologies, implementing so-called “ontology learning” processes. The algorithm can be trained to choose the semantic classes for defining the concepts on the basis of a semantic structure and a given topological graph. Alternatively, it may have been trained to propose classes on the basis of certain terms of a topological graph and of a given semantic model.

In an optional step, not illustrated, user 1 queries the knowledge graph in order to extract the knowledge of interest, or to identify knowledge that is not directly visible, for example by applying one or more filters. Module 29 is used for this purpose. An example of the use of concept types, concepts and subclasses on a portion of the knowledge graph produced for the analysis of a crisis management case is shown in the figure.
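
A query by filters could be sketched as follows in Python. The function `query`, the flat dictionary representation and the “Bridge A” instance are hypothetical and serve only to show how combining filters narrows the result; they do not describe module 29.

```python
# A small set of instances; "Bridge A" is a made-up entry added for contrast.
instances = [
    {"label": "Saint Nazaire Hospital", "concept": "Infrastructure", "sector": "Health"},
    {"label": "Madame X", "concept": "Individual", "sector": "Health"},
    {"label": "Bridge A", "concept": "Infrastructure", "sector": "Transport"},
]


def query(items, **filters):
    """Return the instances whose fields match every filter given as a keyword argument."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]


# Two filters applied together: concept and contextual sector.
health_infrastructure = query(instances, concept="Infrastructure", sector="Health")
```

Each additional keyword argument acts as one more filter, so the same function covers both broad and narrow extractions.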

When defining a query, user 1 can also define new ontological classes inferred by rules of combination of, or relationships between, concepts defined in the semantic data model.
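
To illustrate what an inferred class might be, the Python sketch below derives a class from a rule that combines two concepts and one relationship. The rule itself (a Health-sector Infrastructure directed by a Director), the data layout and the function name are assumptions chosen for illustration only.

```python
instances = {
    "Saint Nazaire Hospital": {"concept": "Infrastructure", "sector": "Health"},
    "Madame X": {"concept": "Individual", "role": "Director"},
}
links = [("Madame X", "directs", "Saint Nazaire Hospital")]


def directed_health_facilities(instances, links):
    """Labels of Health-sector Infrastructure instances that a Director directs.

    The members of the returned set form a hypothetical inferred class: none of
    them is declared as such, membership follows from the rule alone.
    """
    members = set()
    for source, relation, target in links:
        src = instances.get(source, {})
        tgt = instances.get(target, {})
        if (
            relation == "directs"
            and src.get("role") == "Director"
            and tgt.get("concept") == "Infrastructure"
            and tgt.get("sector") == "Health"
        ):
            members.add(target)
    return members


inferred = directed_health_facilities(instances, links)
```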

The method thus allows all kinds of structured, unstructured or semi-structured data to be organized in a knowledge graph with substantial inference capability. It enables knowledge to be discovered and extracted by means of its conceptual representation at different levels of abstraction. Ultimately, it allows the analysis of complex sociotechnical systems of any kind, so as to improve or even support decision-making.

Method 200 will now be described with reference to the figure. This method aims to contribute feedback from experience, or a corpus of information, to an already existing knowledge graph. It is implemented by user 1 by means of the same modules as those used to implement method 100.

In step 201, user 1 chooses, via interface 25, one knowledge graph from among several, all of the graphs having been generated in accordance with method 100. The chosen graph corresponds to the domain or use case to which the user wishes to contribute.

In step 202, user 1 supplies a triplet corresponding to elements of the feedback, the triplet comprising two nouns linked by a predicate in the same way as the triplets of step 101 of method 100.

In step 203, user 1 triggers the transformation of the triplet into an instantiation of one or more concepts and the integration of the resulting instance or instances into the knowledge graph. Once triggered, this transformation and integration are carried out automatically by a computer module, not illustrated, for instantiating and integrating one or more concepts into the knowledge graph according to the triplet. In particular, this is a machine learning module trained to modify a knowledge graph according to a supplied triplet.
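
Steps 202 and 203 can be sketched as below in Python. The description specifies a trained machine learning module; this sketch swaps in a plain lookup function as a stand-in, and the names `integrate_feedback`, `classify` and `lookup` as well as the graph layout are assumptions.

```python
def integrate_feedback(graph, triplet, classify):
    """Add the instances and the link described by one feedback triplet to a graph.

    `graph` is {"instances": dict, "links": list}. `classify` maps a noun to a
    concept and stands in for the trained module mentioned in the description.
    """
    subject, predicate, obj = triplet
    for noun in (subject, obj):
        # Existing instances are kept as they are; only missing ones are created.
        graph["instances"].setdefault(noun, {"concept": classify(noun)})
    link = (subject, predicate, obj)
    if link not in graph["links"]:
        graph["links"].append(link)
    return graph


# An existing knowledge graph, as chosen in step 201.
graph = {"instances": {"Saint Nazaire Hospital": {"concept": "Infrastructure"}}, "links": []}

# Stand-in classifier: a fixed lookup with a fallback label.
lookup = {"Madame X": "Individual"}
integrate_feedback(
    graph,
    ("Madame X", "directs", "Saint Nazaire Hospital"),
    lambda noun: lookup.get(noun, "Unclassified"),
)
```

Supplying the same triplet a second time leaves the graph unchanged, which is one simple way to keep repeated feedback from duplicating links.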

The method of the invention is applicable to any field and in this sense aims to contribute to the standardization of knowledge management solutions. In addition to crisis management, the targeted fields include all industrial fields, in particular all industrial processes such as mining, as well as resource management in general, including the analysis of primary and secondary raw material flows, and the resilience engineering of sociotechnical systems.

The graph can be used as a basis for other processes. It can make it possible to collect and structure more efficiently the data needed to implement known methods for analysing sociotechnical systems that require a large quantity of input data and the management of complex relationships among those data, such as “FRAM”.

If the knowledge assertions supplied to it include quantitative data, the graph can be used to evaluate the performance of organizations or to carry out analyses suggesting the best options for improving a process, both factually, that is at the operational level, and conceptually, that is at a strategic level or in terms of generic rules. It can thus speed up the updating of learning data in line with the knowledge learned, and its use can be directed toward understanding cascade effects, anticipating cause-and-effect relationships, and defining alternative scenarios.

The invention is not limited to the embodiments presented, and other embodiments will be readily apparent to those skilled in the art.

Claims

Method (100) for generating a virtual semantic graph of specialized knowledge, implemented by computer (2), in which the following steps are implemented:
- supply (103) of several triplets corresponding respectively to knowledge assertions, each triplet comprising two nouns (N₁, N₂, a₁, N₃) connected by a predicate (V_l1, V_a, V_l2);
- aggregation (104) of the triplets according to the nouns so as to connect the triplets together and to organize them into a topological graph;
- detection (105) of semantic patterns in the topological graph;
- depending on the detected semantic patterns, definition (106) of types of generic semantic concepts, definition (107) of semantic concepts, each semantic concept belonging to a type of generic semantic concept, and definition of possible associations between the semantic concepts so as to form a semantic model comprising the concepts associated with each other;
- instantiations (108) of semantic concepts, each instance being associated with at least one of the defined concepts, the instances being generated by classification of the content of the topological graph, so as to generate a knowledge graph.

Method (100) according to the preceding claim, in which each knowledge assertion corresponds:
- either to a qualification, in which case the predicate (V_a) of this triplet includes a linking verb to qualify one of the two nouns (N₂, a₁) of this triplet by the other noun of this triplet;
- or to an action, in which case the predicate (V_l1, V_l2) of this triplet includes an action verb to describe an interaction between the two nouns (N₁, N₂, N₃) of the triplet,
and in which, preferably, at least one of the nouns includes a temporal or spatial reference so as to position chronologically or geographically the assertion to which the triplet corresponds.

Method (100) according to any one of the preceding claims, in which the supply (103) of the triplets and the organization of the triplets into a topological graph (104) are carried out via a computer module (27) for editing and structuring of text.

Method (100) according to any one of the preceding claims, wherein the generic semantic concept types are generic entities or generic attributes, the attributes corresponding to identification, context or qualification characteristics associated with the entities.

Method (100) according to any one of the preceding claims, in which the definition of semantic concepts is carried out via a computer module for detecting recurring relational patterns in the topological graph.

Method (100) according to any one of the preceding claims, in which the instantiations (108) of the concepts and the associations of these instances between them are carried out via a computer module for transforming the topological graph into a semantic knowledge graph.

Method (100) according to any one of the preceding claims, further implementing a step of manipulating the knowledge graph through computer interaction means (2), so as to navigate through the entire knowledge graph and to edit the knowledge graph.

Method (200) for providing feedback, implemented by computer (2), in which the following steps are implemented:
- choice (201) of an existing knowledge graph, the knowledge graph having been generated in accordance with any one of the preceding claims;
- provision (202) of a triplet corresponding to feedback, the triplet comprising two nouns linked by a predicate;
- transformation (203) of the triplet into an instantiation of one or more concepts and integration of the instance(s) into the knowledge graph.

Method (200) according to the preceding claim, in which the transformation (203) is carried out automatically by a computer module for instantiation and integration of one or more concepts in the knowledge graph according to the triplet.

Computer program (24) comprising instructions which, when the program is executed by a computer (2), cause the computer to implement the steps of the method (100) for generating a virtual semantic graph of specialized knowledge according to any one of claims 1 to 7 or the steps of the method (200) for providing feedback according to claim 8 or 9.

A computer-readable recording medium (23) comprising instructions which, when executed by a computer, cause the computer to implement the steps of the method (100) for generating a virtual semantic graph of specialized knowledge according to any one of claims 1 to 7 or the steps of the method (200) for providing feedback according to claim 8 or 9.