CN115827881A - Multi-mode tourism information positioning type retrieval method based on tourism knowledge map - Google Patents
- Publication number
- CN115827881A (application CN202111088382.4A)
- Authority
- CN
- China
- Prior art keywords
- entity
- entities
- travel
- data
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A multi-modal tourism information positioning retrieval method based on a tourism knowledge graph. A weighted tourism knowledge graph is constructed from the multi-modal data in a mixed database of illustrated travel notes and travel videos, and during construction and updating, the semantic position indexes from entities and inter-entity relations to the data sources are saved. When a user performs a text search, search entities and the relations between them are extracted from the text and mapped to a subgraph of the knowledge graph; after the subgraph is enhanced and expanded, retrieval results are returned according to the corresponding indexes. The results returned for the query text are likewise multi-modal and point to the semantically corresponding positions: for travel-note data in the database, the text and pictures corresponding to the enhanced subgraph and the travel notes containing them are returned; for travel video data in the database, the video segments corresponding to the enhanced subgraph and the whole videos are returned. The invention solves the problems that multi-modal data is difficult to manage effectively and that tourism data retrieval is difficult to locate to the target semantic unit.
Description
Technical Field
The invention belongs to the field of multimedia computing, relates to the semantic analysis of text, pictures and videos, and specifically to a multi-modal tourism information positioning retrieval method based on a tourism knowledge graph.
Background
A knowledge graph can describe the concepts, entities and events of the objective world and the relations between them, thereby supporting applications such as information retrieval and intelligent question answering. Tourism big data has complex sources, huge volume and diverse modalities, and is difficult to acquire and manage effectively, so multi-modal tourism data is hard to retrieve accurately. Building a multi-modal knowledge graph from multi-modal tourism big data can effectively enhance the ability to manage and exploit tourism data.
Most current tourism information retrieval applications rely on text lookup and tag matching, and struggle to give accurate results when the retrieval requirements are complex. A tourism knowledge graph can support more complex retrieval requirements, but existing tourism knowledge graphs are single-modality, built mostly from text data with little image or video data. Meanwhile, with the development of mobile terminals, today's Internet is flooded with massive amounts of picture and video data: people take large numbers of pictures and videos while traveling, and illustrated travel notes and travel vlogs have become popular ways of sharing travel experiences. Moreover, text-based retrieval can hardly locate the semantically corresponding position in unlabeled pictures and videos, so users still have to screen and locate results manually a second time, making retrieval laborious. Therefore, traditional retrieval methods and single-modality tourism knowledge graphs cannot support positioning retrieval over today's multi-modal tourism big data.
Summary of the Invention
The problem to be solved by the invention is the retrieval and positioning of multi-modal tourism information: by constructing a multi-modal tourism knowledge graph, the retrieval of multi-modal tourism big data is given semantic positioning, yielding retrieval results that better match the query.
The technical solution of the invention is a multi-modal tourism information positioning retrieval method based on a tourism knowledge graph. From a multi-modal tourism database mixing illustrated travel-note data and travel video data, a weighted tourism knowledge graph is constructed, and during construction and updating the semantic position indexes from entities and inter-entity relations to the data sources are saved. When the user performs a text search, search entities and the relations between them are extracted from the text and mapped to a subgraph of the knowledge graph; after the subgraph is enhanced and expanded, retrieval results are returned according to the corresponding indexes, the returned results being the multi-modal data corresponding to the subgraph in the multi-modal tourism database.
As a preferred embodiment, the weighted tourism knowledge graph is constructed as follows:
1) Build an ontology from vertical tourism websites and define the entity types, including city, scenic spot, place, time, activity, and other entities;
2) Acquire multi-modal data from vertical tourism websites and video websites as the multi-modal tourism database, including semi-structured city and scenic-spot data and unstructured travel-note data from the vertical tourism websites, and unstructured travel videos from the video websites;
3) Preprocess the multi-modal data: perform word segmentation, part-of-speech analysis and dependency analysis on the text in the travel-note data; perform object recognition on the pictures in the travel-note data; perform object tracking and scene text recognition on the videos; and perform word segmentation, part-of-speech analysis and dependency analysis on the scene text;
4) Extract semantic entities, in combination with the semi-structured data, from the analyzed travel-note text, the scene text recognized in the videos, the objects recognized in the travel-note pictures, and the objects tracked in the videos;
5) Mine the relations between the extracted entities to form the knowledge graph, the weight of a relation between entities being computed as:
w(h, r, t) = P((r, t) | h),
where w(h, r, t) is the weight of the relation (h, r, t) between head entity h and tail entity t, and P((r, t) | h) is the probability of an event whose relation is r and whose tail entity is t occurring, given that head entity h occurs.
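The conditional probability P((r, t) | h) can be estimated directly from co-occurrence counts of the extracted triples. A minimal sketch of this estimation (the sample triples and relation names are illustrative, not taken from the patent):

```python
from collections import Counter

def relation_weights(triples):
    """Estimate w(h, r, t) = P((r, t) | h) from extracted (h, r, t) triples:
    the count of (h, r, t) divided by the number of triples headed by h."""
    head_counts = Counter(h for h, _, _ in triples)
    triple_counts = Counter(triples)
    return {
        (h, r, t): n / head_counts[h]
        for (h, r, t), n in triple_counts.items()
    }

triples = [
    ("West Lake", "belongs_to", "Hangzhou"),
    ("West Lake", "belongs_to", "Hangzhou"),
    ("West Lake", "near", "Leifeng Pagoda"),
    ("boating", "occurs_at", "West Lake"),
]
weights = relation_weights(triples)
# w("West Lake", "belongs_to", "Hangzhou") = 2/3, since two of the three
# triples headed by "West Lake" carry that (relation, tail) pair.
```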
As a preferred embodiment, the retrieval method is as follows:
1) Construct a weighted tourism knowledge graph from the multi-modal tourism data;
2) During construction of the tourism knowledge graph, save the semantic-unit position indexes of the data sources corresponding to the entities and entity relations in the graph. The semantic position of an entity in a travel-note text source is represented as <document id, chapter id, section id, paragraph id, sentence id, word id>; of an entity in a travel-note picture source as <document id, chapter id, section id, paragraph id, picture-sentence id, bounding box>; of an entity in the images of a video source as <video id, shot id, 0, 0, frame id, bounding box>; and of an entity in the text recognized from a video source as <video id, shot id, 0, 0, sentence id, word id>. The semantic position of an entity relation in a data source is represented as <head entity position, tail entity position>;
3) Extract entities and entity relations from the input query text;
4) Map the entities and entity relations obtained in step 3) onto the knowledge graph constructed in step 1) to obtain one of its subgraphs;
5) For the subgraph obtained in step 4), expand each entity along its entity relations to associated entities according to the configured expansion threshold, and add the expanded entities and entity relations to the subgraph to obtain the enhanced subgraph;
6) Return the retrieved data according to the semantic positions in the source data corresponding to the entities and entity relations of the enhanced subgraph of step 5).
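The six-field position tuples of step 2) share one shape across all four source types. A sketch of how they might be represented uniformly (the field names and sample values are illustrative assumptions, not from the patent):

```python
from dataclasses import dataclass
from typing import Tuple

BBox = Tuple[int, int, int, int]  # (x, y, width, height) bounding box

@dataclass(frozen=True)
class EntityPosition:
    """Six-field semantic position of an entity in a data source.
    Travel-note text:  (doc id, chapter, section, paragraph, sentence, word)
    Travel-note image: (doc id, chapter, section, paragraph, pic-sentence, bbox)
    Video frame:       (video id, shot, 0, 0, frame, bbox)
    Video text:        (video id, shot, 0, 0, sentence, word)"""
    source_id: str
    level1: int
    level2: int
    level3: int
    level4: int
    locator: object  # a word index or a BBox, depending on the source type

@dataclass(frozen=True)
class RelationPosition:
    """Relation position = <head entity position, tail entity position>."""
    head: EntityPosition
    tail: EntityPosition

pos = EntityPosition("video_042", 3, 0, 0, 75, (120, 40, 200, 160))
rel = RelationPosition(pos, EntityPosition("video_042", 3, 0, 0, 80, 5))
```

A frozen dataclass keeps positions hashable, so they can serve directly as keys in the index from graph elements to source locations.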
The results the invention returns for a query text are likewise multi-modal and point to the semantically corresponding positions. For travel-note data in the database, the text and pictures corresponding to the enhanced subgraph and the travel notes containing them are returned; for travel video data in the database, the video segments corresponding to the enhanced subgraph and the whole videos are returned.
Further, the invention realizes the construction of a weighted multi-modal tourism knowledge graph and query-to-subgraph mapping retrieval. By building a weighted multi-modal knowledge graph over multi-modal tourism big data, the invention provides a solution for the positioning retrieval of multi-modal tourism information: a retrieval subgraph is built from the query text and mapped to a knowledge-graph subgraph, and based on the semantic position indexes from the entities and entity relations of that subgraph to the source data, the source data matching the retrieval requirements and their corresponding semantic positions are returned.
The invention first constructs a weighted multi-modal tourism knowledge graph using text analysis, picture object recognition, video scene text recognition and video object tracking. Unlike a single-text-modality tourism knowledge graph, this graph lets the knowledge extracted from images and videos and the knowledge extracted from text complement and constrain each other, providing richer and more accurate entities and entity relations. The constructed weighted multi-modal tourism knowledge graph supports the positioning retrieval of multi-modal tourism information, effectively solving the problem that traditional text and tag retrieval cannot support complex semantic retrieval requirements, as well as the problem that a single-text-modality knowledge graph cannot semantically retrieve pictures and videos. At the same time, the positioning retrieval helps users find more precisely located targets without manually searching through the returned data again; for long video sources in particular, the reduction in manual effort is even more pronounced.
The benefit of the invention is that it provides a solution for the positioning search of multi-modal tourism information. By constructing a weighted multi-modal tourism knowledge graph, it enhances the ability to retrieve multi-modal tourism big data under complex semantic requirements; through the knowledge graph's semantic position indexes into the data sources, it can return more precise retrieval results, reducing the cost of manual secondary search and comprehension, with good generality and practicality.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the retrieval principle of the invention.
Fig. 2 shows the construction process of the multi-modal tourism knowledge graph of the invention.
Detailed Description
The invention proposes a multi-modal tourism information positioning retrieval method based on a tourism knowledge graph, whose principle is shown in Fig. 1. A multi-modal tourism knowledge graph is constructed from the multi-modal data in a mixed database of illustrated travel-note data and travel video data, and during construction and updating the semantic position indexes from entities and inter-entity relations to the data sources are saved. When a user performs a text search, search entities and the relations between them are extracted from the text and mapped to a subgraph of the knowledge graph; after the subgraph is enhanced and expanded, retrieval results are returned according to the corresponding indexes. The results returned for the query text are likewise multi-modal and point to the semantically corresponding positions. For travel-note data in the database, the text and pictures corresponding to the enhanced subgraph and the travel notes containing them are returned; for travel video data in the database, the video segments corresponding to the enhanced subgraph and the whole videos are returned.
The implementation of the weighted multi-modal tourism knowledge graph construction and of the query-to-subgraph mapping retrieval comprises:
1) As shown in Fig. 2, construct a weighted tourism knowledge graph from the multi-modal tourism data;
1.1) Segment the videos into shots using the shot-segmentation tool ShotDetect;
1.2) Sample a frame every 0.5 seconds from each shot of step 1.1), and recognize the scene text in the frames using the text-recognition tool PaddleOCR;
1.3) Deduplicate the text recognized in step 1.2) within each shot, and save it per shot;
1.4) Perform multi-class multi-object tracking on the videos using the CenterTrack tracker;
1.5) From the tracking results of step 1.4), save the object class and object bounding box of every frame;
1.6) Perform object recognition on the travel-note pictures using Mask R-CNN;
1.7) From the recognition results of step 1.6), save the class and bounding box of every object;
1.8) Split the text of each section of each chapter of the travel notes into sentences;
1.9) Segment the sentence-splitting results of step 1.8) into words;
1.10) Perform part-of-speech analysis on the word-segmentation results of step 1.9);
1.11) Perform named entity recognition on the word-segmentation results of step 1.9);
1.12) Perform dependency parsing on the word-segmentation results of step 1.9);
1.13) Split the text of each video shot into sentences;
1.14) Segment the sentence-splitting results of step 1.13) into words;
1.15) Perform part-of-speech analysis on the word-segmentation results of step 1.14);
1.16) Perform named entity recognition on the word-segmentation results of step 1.14);
1.17) Perform dependency parsing on the word-segmentation results of step 1.14);
1.18) Build a mapping between cities and their corresponding scenic spots;
1.19) Take the pictures and the sentences of the travel-note text as semantic units, in the order in which they appear in the travel notes;
1.20) From the place-name named entities in each sentence of step 1.19), extract those that correspond to the city-to-scenic-spot mapping as city entities and scenic-spot entities, and record them as the nearest city or nearest scenic spot;
1.21) From the place-name named entities in each sentence of step 1.19), extract those that do not correspond to the city-to-scenic-spot mapping as place entities;
1.22) From each sentence of step 1.19), extract combinations of adjacent time words as time entities;
1.23) From each sentence of step 1.19), extract verbs as activity entities;
1.24) From each sentence of step 1.19), extract the non-place nouns that have dependency relations with verbs or prepositions, together with the objects in the pictures, as other entities;
1.25) Take the video shots and the sentences of the text recognized from the videos as semantic units, in shot time order;
1.26) From the place-name named entities in each sentence of step 1.25), extract those that correspond to the city-to-scenic-spot mapping as city entities and scenic-spot entities, and record them as the nearest city or nearest scenic spot;
1.27) From the place-name named entities in each sentence of step 1.25), extract those that do not correspond to the city-to-scenic-spot mapping as place entities;
1.28) From each sentence of step 1.25), extract combinations of adjacent time words as time entities;
1.29) From each sentence of step 1.25), extract verbs as activity entities;
1.30) From each sentence of step 1.25), extract the non-place nouns that have dependency relations with verbs or prepositions, together with the objects tracked in the shot, as other entities;
1.31) Compute the Levenshtein ratio over the entities extracted in steps 1.19) through 1.30) and merge entities of the same class.
1.32) Build belongs-to relations between the extracted scenic-spot entities and their nearest city entities;
1.33) Build belongs-to relations between the extracted place entities and their nearest city entities;
1.34) Build belongs-to relations between the extracted place entities and their nearest scenic-spot entities;
1.35) Build occurs-at relations between the extracted activity entities and their nearest scenic-spot entities;
1.36) Build occurs-at relations between the extracted activity entities and place entities according to the dependency relations;
1.37) Build occurs-when relations between the extracted activity entities and time entities according to the dependency relations;
1.38) Build occurs-when relations between the extracted activity entities and time entities according to the dependency relations;
1.39) Build location-proximity relations between the extracted scenic-spot entities and place entities according to keywords and dependency relations;
1.40) Build arrive-by and depart-by relations between the other extracted entities and the place, scenic-spot and city entities according to keywords and dependency relations;
1.41) Build belongs-to relations among the other extracted entities according to dependency relations or semantic order;
1.42) For the relations between entities extracted in steps 1.32) through 1.41), the relation weight is computed as:
w(h, r, t) = P((r, t) | h),
where w(h, r, t) is the weight of the relation (h, r, t) between head entity h and tail entity t, and P((r, t) | h) is the probability of an event whose relation is r and whose tail entity is t occurring, given that head entity h occurs.
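Step 1.31's merging by Levenshtein ratio can be sketched with the standard-library `difflib.SequenceMatcher`, whose `ratio()` gives a comparable 0-to-1 similarity score (the 0.85 threshold and the sample entity mentions are illustrative assumptions, not values from the patent):

```python
from difflib import SequenceMatcher

def merge_entities(entities, threshold=0.85):
    """Merge same-class entity mentions whose name similarity (a Levenshtein-
    ratio-style score from difflib) reaches the threshold. Each entity is a
    (name, category) pair; the first mention seen becomes the canonical name."""
    canonical = []  # (name, category) pairs kept as canonical entities
    mapping = {}    # (mention, category) -> canonical name
    for name, cat in entities:
        for canon_name, canon_cat in canonical:
            if cat == canon_cat and \
                    SequenceMatcher(None, name, canon_name).ratio() >= threshold:
                mapping[(name, cat)] = canon_name
                break
        else:
            canonical.append((name, cat))
            mapping[(name, cat)] = name
    return mapping

mentions = [("West Lake", "scenic_spot"), ("West  Lake", "scenic_spot"),
            ("Hangzhou", "city")]
merged = merge_entities(mentions)
# "West  Lake" (extraction noise) collapses onto "West Lake";
# "Hangzhou" stays, since merging only happens within the same class.
```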
2) During the construction of the tourism knowledge graph in step 1), save the semantic-unit position indexes of the source data corresponding to the entities and entity relations in the graph. The semantic position of an entity in a travel-note text source is represented as <document id, chapter id, section id, paragraph id, sentence id, word id>; of an entity in a travel-note picture source as <document id, chapter id, section id, paragraph id, picture-sentence id, bounding box>; of an entity in the images of a video source as <video id, shot id, 0, 0, frame id, bounding box>; and of an entity in the text recognized from a video source as <video id, shot id, 0, 0, sentence id, word id>. The semantic position of an entity relation in a data source is represented as <head entity position, tail entity position>;
3) Extract entities and entity relations from the given query text:
3.1) Perform text recognition and analysis on the query text;
3.2) From the analyzed data obtained in step 3.1), extract semantic entities according to part of speech and syntactic dependency relations;
3.3) From the analyzed data obtained in step 3.1), extract the relations between the entities according to the syntactic dependency relations.
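Steps 3.2) and 3.3) can be sketched over the output of any dependency parser. The sketch below assumes each token carries a part of speech, a dependency label, and the index of its head word; the token tuples are a hand-made illustration, not real parser output:

```python
def extract_query_graph(tokens):
    """tokens: list of (text, pos, dep, head_index) tuples.
    Entities are the nouns/proper nouns; a relation (subject, verb, object)
    is built for every verb that governs both a nominal subject and an object."""
    entities = [t for t, pos, _, _ in tokens if pos in ("NOUN", "PROPN")]
    relations = []
    for i, (text, pos, _, _) in enumerate(tokens):
        if pos != "VERB":
            continue
        subj = next((t for t, _, dep, h in tokens if dep == "nsubj" and h == i), None)
        obj = next((t for t, _, dep, h in tokens if dep == "dobj" and h == i), None)
        if subj and obj:
            relations.append((subj, text, obj))
    return entities, relations

# Query "tourists visit West-Lake", parsed by hand:
tokens = [("tourists", "NOUN", "nsubj", 1),
          ("visit", "VERB", "ROOT", 1),
          ("West-Lake", "PROPN", "dobj", 1)]
entities, relations = extract_query_graph(tokens)
# entities -> ["tourists", "West-Lake"]
# relations -> [("tourists", "visit", "West-Lake")]
```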
4) Map the query-text entities and entity relations obtained in step 3) to a subgraph of the knowledge graph constructed in step 1):
4.1) Build a retrieval subgraph from the entities and entity relations in the query text;
4.2) Map the retrieval subgraph built in step 4.1) onto a retrieval subgraph of the weighted tourism knowledge graph built in step 1).
5) For the mapped retrieval subgraph obtained in step 4), expand each entity along its entity relations to associated entities according to the threshold, and add the expanded entities and entity relations to the subgraph to obtain the enhanced subgraph:
5.1) For an edge entity h the subgraph is not extended for the moment; for an edge entity relation (h, r, t), entity t is taken as an edge entity, so that the subgraph now contains only edge entities and no edge entity relations;
5.2) For the subgraph obtained in step 5.1): for an edge entity h and a non-subgraph entity t1, if a non-subgraph relation (h, r1, t1) exists and w(h, r1, t1) is greater than or equal to the threshold α, add the non-subgraph entity t1 and the non-subgraph relation (h, r1, t1) as an extended-subgraph entity and an extended-subgraph entity relation; likewise, if a non-subgraph relation (t1, r1, h) exists and w(t1, r1, h) is greater than or equal to the threshold α, add the non-subgraph entity t1 and the non-subgraph relation (t1, r1, h) as an extended-subgraph entity and an extended-subgraph entity relation;
5.3) For the extended subgraph obtained in step 5.2): for an original edge entity h, an extended entity t1 and a non-subgraph entity t2, if a non-subgraph relation (t1, r2, t2) exists and the weight product w(h, r1, t1)·w(t1, r2, t2) is greater than or equal to the threshold α, add the non-subgraph entity t2 and the non-subgraph relation (t1, r2, t2) as an extended-subgraph entity and an extended-subgraph entity relation; likewise, if a non-subgraph relation (t2, r2, t1) exists and the corresponding weight product is greater than or equal to the threshold α, add t2 and (t2, r2, t1) as an extended-subgraph entity and an extended-subgraph entity relation. This continues until the weight products to all remaining non-subgraph entities are less than the threshold α, at which point the expansion of the original edge entity h ends;
5.4) When the expansion of all edge entities has ended, the expansion of the subgraph is complete.
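The expansion of steps 5.1) through 5.4) amounts to a walk from each edge entity that multiplies relation weights along the path and stops once the product drops below α. A minimal sketch over an adjacency-list graph (the graph data, weights, and α value are illustrative, not from the patent):

```python
def expand_subgraph(graph, edge_entities, alpha):
    """graph: dict mapping entity -> list of (relation, neighbor, weight).
    Starting from every edge entity, follow relations while the product of
    weights along the path stays >= alpha; collect the visited entities and
    relations as the enhancement of the subgraph."""
    entities, relations = set(edge_entities), set()
    stack = [(e, 1.0) for e in edge_entities]  # (entity, weight product so far)
    while stack:
        node, product = stack.pop()
        for rel, nbr, w in graph.get(node, []):
            p = product * w
            if p >= alpha and (node, rel, nbr) not in relations:
                relations.add((node, rel, nbr))
                entities.add(nbr)
                stack.append((nbr, p))
    return entities, relations

graph = {
    "West Lake": [("near", "Leifeng Pagoda", 0.8), ("belongs_to", "Hangzhou", 0.9)],
    "Leifeng Pagoda": [("near", "Broken Bridge", 0.5)],
}
ents, rels = expand_subgraph(graph, {"West Lake"}, alpha=0.6)
# "Leifeng Pagoda" and "Hangzhou" are added (weights 0.8 and 0.9 >= 0.6), but
# "Broken Bridge" is not: the path product 0.8 * 0.5 = 0.4 falls below alpha.
```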
6) According to the entities and entity relations in the enhanced subgraph of step 5), query the corresponding semantic positions of the source data in the multi-modal tourism database and return the retrieved data, specifically:
6.1) For the entities and entity relations in the enhanced subgraph, obtain the source-data mapping indexes;
6.2) Query the multi-modal tourism database according to the mapping indexes; for travel-note data in the database, return the text and pictures corresponding to the indexes and the travel notes containing them; for travel video data in the database, return the video segments corresponding to the indexes and the whole videos.
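For a video source, the stored <video id, shot id, ...> index must be resolved back to a playable time range before it can be returned as a clip. A sketch of this resolution, assuming a per-video table of shot boundaries in seconds (the boundary data and field layout are illustrative):

```python
def resolve_video_result(position, shot_bounds):
    """position: a (video_id, shot_id, _, _, frame_or_sentence, locator) tuple.
    shot_bounds: dict video_id -> list of (start_sec, end_sec) per shot.
    Returns the clip time range of the indexed shot plus the whole-video range,
    mirroring the 'video segment and whole video' result of step 6.2)."""
    video_id, shot_id = position[0], position[1]
    start, end = shot_bounds[video_id][shot_id]
    whole = (shot_bounds[video_id][0][0], shot_bounds[video_id][-1][1])
    return {"video": video_id, "clip": (start, end), "whole": whole}

shot_bounds = {"video_042": [(0.0, 12.5), (12.5, 30.0), (30.0, 55.0)]}
result = resolve_video_result(("video_042", 1, 0, 0, 75, None), shot_bounds)
# Shot 1 of video_042 spans 12.5-30.0 s; the whole video spans 0.0-55.0 s.
```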
Claims (14)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111088382.4A CN115827881A (en) | 2021-09-16 | 2021-09-16 | Multi-mode tourism information positioning type retrieval method based on tourism knowledge map |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115827881A true CN115827881A (en) | 2023-03-21 |
Family
ID=85515088
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111088382.4A Pending CN115827881A (en) | 2021-09-16 | 2021-09-16 | Multi-mode tourism information positioning type retrieval method based on tourism knowledge map |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115827881A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116821692A (en) * | 2023-08-28 | 2023-09-29 | 北京化工大学 | Method, device and storage medium for constructing descriptive text and space scene sample set |
CN118278420A (en) * | 2024-04-18 | 2024-07-02 | 江苏微盛网络科技有限公司 | Cloud computing-based enterprise session data storage analysis method and system |
CN118569366A (en) * | 2024-05-28 | 2024-08-30 | 中国科学院地理科学与资源研究所 | Tourism resource monomer combination method, device, storage medium and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||