CN109582858A - A trusted knowledge ecosystem - Google Patents

A trusted knowledge ecosystem

Info

Publication number
CN109582858A
Authority
CN
China
Prior art keywords
knowledge
personal
text
platform
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811207431.XA
Other languages
Chinese (zh)
Inventor
吴旭
颉夏青
许晋
戴雨伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201811207431.XA
Publication of CN109582858A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a trusted knowledge ecosystem, comprising: an automatic learning-resource collection module, which collects learning materials on a specific topic from open internet resources on demand, performs data cleaning and data reduction, and stores the resulting structured or semi-structured data in a trusted manner in an internal learning resource library; a resource learning module, through which resources are studied to form a personal knowledge base, the personal knowledge base being stored on the platform in a trusted manner; a knowledge management module, which organizes the knowledge elements in the personal knowledge base and merges and updates them into the existing knowledge system; a knowledge sharing module, which shares knowledge in a trusted manner with specific users or groups; a knowledge innovation module, which produces personal achievements by building on the existing knowledge system and forms a personal achievement library; and an achievement submission module, which submits the personal achievements to the platform. In the invention, the flow of knowledge drives the continuous self-renewal of the knowledge base, forming a continuously circulating, self-growing knowledge ecosystem.

Description

A trusted knowledge ecosystem
Technical field
The invention belongs to the field of trusted computing, and in particular relates to a trusted knowledge ecosystem and a method for constructing it.
Background
Research on knowledge management, trusted storage of knowledge, and knowledge sharing is currently in full swing. In an era of information explosion, how to manage knowledge, store it in a trusted manner, and share it is a problem facing every practitioner in the knowledge management industry.
Chinese invention patent publication No. CN107704634A discloses a method for forming knowledge and building knowledge chains. The method studies the knowledge contained in text information, constructs a systematic domain primitive graph to represent that knowledge, and generates knowledge chains through the interconnection of knowledge. The knowledge represented by a primitive graph consists of objects, attributes, and magnitudes; the object is the center and the attributes describe the object. Attributes fall into three classes: index, tendency, and object. The object attributes of an object are the basis for forming a knowledge chain, and the explanation of knowledge is the driving force for generating it. The knowledge in text information and the learned knowledge together constitute a primitive knowledge ecosystem. The method is mainly used for analyzing knowledge in text and building knowledge chains.
However, the above knowledge ecosystem mainly addresses the problem of building management systems that enterprises face in production and business administration. At present there is no trusted knowledge ecosystem that integrates the learning resources available on the public internet.
Summary of the invention
To solve the above problem, the present invention uses a semantic-based topic crawler and incremental crawling to obtain resources related to a specific topic or field from the multi-source, heterogeneous, massive, and complex open resources of the internet. Using techniques such as data deduplication, knowledge management, trusted management, trusted sharing, and trusted storage, it establishes a knowledge ecosystem that integrates resource acquisition, resource learning, knowledge management, achievement output, and knowledge sharing.
According to one aspect of the invention, a trusted knowledge ecosystem is provided, comprising:
an automatic learning-resource collection module, which collects learning materials on a specific topic from open internet resources on demand, performs data cleaning and data reduction, and stores structured or semi-structured data in a trusted manner in an internal learning resource library;
a resource learning module, which supports self-study of the resources to form a personal knowledge base, the personal knowledge base being stored on the platform in a trusted manner;
a knowledge management module, which organizes the knowledge elements in the personal knowledge base and merges and updates them into the existing knowledge base;
a knowledge sharing module, which shares knowledge in a trusted manner with specific users or groups;
a knowledge innovation module, which produces personal achievements by building on the existing knowledge system and forms a personal achievement library;
an achievement submission module, which submits the personal achievements to the platform for other learners to study.
Preferably, the carrier formats of the objects handled by the resource learning module include video, audio, documents, and/or web pages.
Preferably, merging the knowledge elements in the personal knowledge base with the existing knowledge system includes knowledge deduplication, disambiguation, and/or knowledge association.
Preferably, the knowledge management method is as follows: a mind-mapping tool is embedded in the system as a plug-in and used to manage knowledge.
Preferably, the knowledge sharing includes: sharing part of the knowledge in the personal knowledge base to the system's internal communication platform or to an open internet resource platform; sharing from the internal communication platform to an open internet resource platform; and sharing from the personal achievement library to the internal communication platform or to an open internet resource platform.
Preferably, during sharing the user can strictly control the access scope of the content through permission control, and information travels over the transmission channel in ciphertext form.
Preferably, the learning materials are collected with a semantic-based topic crawler or an incremental crawling algorithm.
Preferably, the topic crawler method includes the following steps:
(1), Using the Dewey Decimal Classification, in the web page feature extraction stage, quickly find the keywords in the page text whose topics are close to those of the anchor-text keywords;
(2), Extract the topic candidate link feature text;
(3), Classify the topic edge text of the candidate links with a naive Bayes text classifier to obtain topic-related web pages. If the text belongs to the specific topic, the corresponding candidate link takes its classification weight as its priority value and is inserted into the crawl queue in priority order, so that the crawler visits links with larger classification values first; if the text does not belong to the specific topic, the candidate link is discarded;
(4), For the web link information of the relevant pages, compute the corresponding authority and hub (centrality) scores with the HITS algorithm, and combine the anchor text, the information near the anchor text, the reverse (linking-in) pages, the sibling links of the backlinks, and the URL to predict the relevance of the page to be crawled to the topic.
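Step (3) above can be sketched as a priority queue over candidate links. The following is a minimal illustration only, not the patent's implementation: the class name `CrawlFrontier`, the `threshold` cut-off, and the example URLs are all assumptions, and the classification weight is taken as an input that some naive Bayes classifier would supply.

```python
import heapq
from itertools import count


class CrawlFrontier:
    """Priority queue of candidate links; largest classification weight first."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold   # hypothetical cut-off for "on topic"
        self._heap = []
        self._tie = count()          # preserves insertion order for equal weights
        self._seen = set()

    def offer(self, url, weight):
        """Queue url if its weight marks it as on topic; return True if queued."""
        if weight < self.threshold or url in self._seen:
            return False             # off-topic or duplicate link: discard
        self._seen.add(url)
        # heapq is a min-heap, so the weight is negated to pop the largest first.
        heapq.heappush(self._heap, (-weight, next(self._tie), url))
        return True

    def pop(self):
        """Return (url, weight) for the queued link with the largest weight."""
        neg_weight, _, url = heapq.heappop(self._heap)
        return url, -neg_weight

    def __len__(self):
        return len(self._heap)


frontier = CrawlFrontier(threshold=0.5)
frontier.offer("http://example.org/a", 0.62)
frontier.offer("http://example.org/b", 0.91)
frontier.offer("http://example.org/c", 0.20)   # below the threshold, dropped
order = [frontier.pop()[0] for _ in range(len(frontier))]
```

Negating the weight is one simple way to get "largest first" behaviour from Python's min-heap; a real crawler would also need politeness delays and persistence, which are outside this sketch.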
Preferably, extracting the topic candidate link feature text includes the following steps:
(1) Segment the anchor text and body text of the web page into words and remove stop words to obtain keywords;
(2) Look up the Dewey class number of each keyword;
(3) Extract the topic candidate link feature text using the properties of the Dewey Decimal Classification together with a two-dimensional coordinate system: taking the length of a keyword's class number as the X-axis and the keyword's class number as the Y-axis, plot the point corresponding to each keyword's Dewey decimal class number in the coordinate system;
(4) Extract the keywords corresponding to the anchor-text key points, and to the key points surrounding them in the coordinate system, as the topic candidate link feature text.
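The coordinate mapping in steps (2) to (4) can be sketched as follows. This is an illustration under stated assumptions: the text does not define what "surrounding" means, so the window `max_dx` / `max_dy` is a hypothetical choice, and the `DEWEY` lookup table holds illustrative class numbers rather than an authoritative mapping.

```python
# Hypothetical keyword -> Dewey class number table (illustrative values only).
DEWEY = {
    "algorithm": "005.1",
    "programming": "005.13",
    "database": "005.74",
    "network": "004.6",
    "cooking": "641.5",
}


def to_point(keyword):
    """Map a keyword to (length of class number in digits, class number)."""
    number = DEWEY[keyword]
    return len(number.replace(".", "")), float(number)


def feature_keywords(anchor_keywords, text_keywords, max_dx=1, max_dy=1.0):
    """Keep anchor keywords plus text keywords plotted near any anchor point."""
    anchors = [to_point(k) for k in anchor_keywords if k in DEWEY]
    features = [k for k in anchor_keywords if k in DEWEY]
    for k in text_keywords:
        if k not in DEWEY or k in features:
            continue                 # no class number known, or already kept
        x, y = to_point(k)
        # "Near" is an assumption: within max_dx on X and max_dy on Y.
        if any(abs(x - ax) <= max_dx and abs(y - ay) <= max_dy
               for ax, ay in anchors):
            features.append(k)
    return features


features = feature_keywords(
    ["algorithm"],
    ["programming", "database", "network", "cooking", "banana"],
)
```

Here "cooking" is rejected because its class number lies far from the anchor on the Y-axis, and "banana" is skipped because it has no entry in the table.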
Preferably, the system is trusted in that access control over the personal knowledge base and the personal achievement library is managed by the user, and the data in them is stored on the system platform in a trusted manner.
Preferably, trusted storage means that data is stored in the database as ciphertext, encrypted with the public key of an asymmetric key pair; the private key is held only by the user, so that even the platform administrator cannot obtain it.
The advantage of the invention is that, on this platform, the flow of knowledge drives the continuous self-renewal of the knowledge base, forming a continuously circulating, self-growing knowledge ecosystem. The platform of the invention has the following characteristics:
(1) A micro-cycle of self-renewal of the user's personal knowledge: by repeating the knowledge management process, the user continuously updates the personal knowledge base;
(2) An inner cycle (small cycle) in which the system's internal learning resources are updated automatically: on the learning platform, the three processes of knowledge learning, knowledge innovation, and achievement submission form a small cycle, and updates to users' learning achievements continuously drive the cyclic renewal of the platform's learning resource library;
(3) An outer cycle (large cycle) in which the system's internal learning resources are updated automatically: the continuous updating of the personal knowledge bases and achievement libraries drives the continuous updating of open internet resources through orderly, open knowledge sharing, and the ongoing updating of open internet resources in turn further drives the updating of the system's internal learning resources, forming the outer cycle.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference numbers refer to the same parts. In the drawings:
Fig. 1 is a schematic diagram of the operating principle of the trusted knowledge ecosystem of the invention.
Fig. 2 is a structural diagram of the trusted knowledge ecosystem of the invention.
Detailed description of the embodiments
Exemplary embodiments of the disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood thoroughly and its scope conveyed fully to those skilled in the art.
The concept of "trusted" as used in the invention is explained first. In the invention, "trusted" differs in meaning and scope from the everyday concept and carries a particular meaning in a specific field.
On the technical side, the U.S. Department of Defense formulated the "Trusted Computer System Evaluation Criteria" as early as 1983. The TCPA was established in 1999 and was reorganized as the Trusted Computing Group (TCG) in 2003. The TCPA and TCG have formulated a series of technical specifications covering trusted computing platforms, trusted storage, trusted network connection, and so on. More than 200 enterprises have now joined the TCG, and trusted computers have entered practical use.
On the theoretical side, the IEEE launched the journal IEEE Transactions on Dependable and Secure Computing in 2004, dedicated to trusted computing.
The Trusted Computing Group defines trust in terms of the expected behavior of an entity: an entity is trusted if it always behaves in the expected manner and achieves the expected goal.
The ISO/IEC 15408 standard defines trust as follows: the components, operations, or processes involved in a computation are predictable under any conditions and can resist viruses and physical interference.
Trustworthiness here means that the services provided by a computer system can be shown to be reliable. That is, from the user's point of view the services provided by the computer system are reliable, and this trust is demonstrable. A trusted computing system is a computer system that can provide system reliability, availability, and security of information and behavior.
For the basic concepts of trusted computing, reference may also be made to the book "Trusted Computing" published by China Machine Press in 2009.
As shown in Fig. 1, the technical principle of the invention is as follows:
Automatic collection of learning resources: a semantic-based topic crawler and incremental crawling are used to obtain resources related to a specific topic or field from the multi-source, heterogeneous, massive, and complex open resources of the internet. After these resources have been deduplicated by data cleaning and assessed for quality, they are stored in the system's internal learning resource library as structured and semi-structured data. The topic crawler method includes the following steps:
(1), Using the Dewey Decimal Classification, in the web page feature extraction stage, quickly find the keywords in the page text whose topics are close to those of the anchor-text keywords;
(2), Extract the topic candidate link feature text. This extraction includes the following sub-steps:
(1) Segment the anchor text and body text of the web page into words and remove stop words to obtain keywords;
(2) Look up the Dewey class number of each keyword;
(3) Extract the topic candidate link feature text using the properties of the Dewey Decimal Classification together with a two-dimensional coordinate system: taking the length of a keyword's class number as the X-axis and the keyword's class number as the Y-axis, plot the point corresponding to each keyword's Dewey decimal class number in the coordinate system;
(4) Extract the keywords corresponding to the anchor-text key points, and to the key points surrounding them in the coordinate system, as the topic candidate link feature text.
(3), Classify the topic edge text of the candidate links with a naive Bayes text classifier to obtain topic-related web pages. If the text belongs to the specific topic, the corresponding candidate link takes its classification weight as its priority value and is inserted into the crawl queue in priority order, so that the crawler visits links with larger classification values first; if the text does not belong to the specific topic, the candidate link is discarded;
(4), For the web link information of the relevant pages, compute the corresponding authority and hub (centrality) scores with the HITS algorithm, and combine the anchor text, the information near the anchor text, the reverse (linking-in) pages, the sibling links of the backlinks, and the URL to predict the relevance of the page to be crawled to the topic.
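The HITS computation named in step (4) can be sketched as a short power iteration. This is a minimal, generic version of the algorithm, not the patent's code; the toy link graph, the function name `hits`, and the fixed iteration count are assumptions, and the "hub" score is taken to be what the text calls centrality.

```python
from math import sqrt


def hits(links, iterations=50):
    """links maps a page to the pages it links to; returns (authority, hub)."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    auth = dict.fromkeys(pages, 1.0)
    hub = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of pages linking to it.
        new_auth = dict.fromkeys(pages, 0.0)
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]
        # Hub score of a page: sum of the authority scores of pages it links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, ()))
                   for p in pages}
        # Normalise both vectors to unit length so the scores stay bounded.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}
    return auth, hub


# Toy graph (hypothetical): A and B both link to C, and C links to D.
links = {"A": ["C"], "B": ["C"], "C": ["D"]}
authority, hub = hits(links)
best_authority = max(authority, key=authority.get)
```

In this toy graph page C collects links from two hubs, so it ends up with the highest authority score; how these scores are combined with anchor-text signals to predict relevance is not specified in the text and is left out.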
Resource learning covers learning from different carrier types. Video learning includes both online and offline learning and is implemented with techniques such as video on demand, resumable playback, and streaming data caching. Audio learning and the study of text-type carriers such as documents and articles mainly support online learning; where copyright permits, materials can be downloaded for offline learning.
Knowledge management: the knowledge produced during learning, such as notes and ideas, can be managed in the form of mind maps. One approach embeds a mind-mapping tool such as XMind or FreeMind in the system as a plug-in; another approach implements this function on the basis of an existing system.
Knowledge innovation: based on the system, notes and ideas can be extracted automatically and articles such as scientific reports or academic papers can be typeset automatically, forming a personal achievement library. The access rights of the achievement library can be configured by the author to allow access, download, and so on within a specified scope or by specified users. Personal achievements are stored on the system platform as ciphertext; the user can choose encryption algorithms of different strengths and keeps the key personally, so as to protect the author's intellectual property.
The algorithm used to encrypt personal achievements is a combination of a symmetric and an asymmetric encryption algorithm: for example, the asymmetric algorithm RSA is used to transmit the key, and the AES algorithm is used to encrypt the data.
Achievement submission means that the author submits some personal achievements to the system's internal resource platform for platform users to access and study. Personal achievements formed on the platform can be submitted with one click, and offline personal achievements can also be submitted. A submitted achievement can be published after the platform has reviewed its copyright, content, form, and so on.
Knowledge sharing: a user who holds sharing permission shares knowledge from one access scope to another. When the author shares from the personal knowledge base to the system's internal communication platform, the author can set whether repeated resharing is allowed. When knowledge is shared to an open internet platform, such as social platforms like Weibo or WeChat, it can be reshared repeatedly provided that the copyright is indicated. During sharing, the user can strictly control the access scope of the content through permission control, and information travels over the transmission channel in ciphertext form.
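The sharing rules in this paragraph can be sketched as a small permission check. Everything named here (`Scope`, `SharedItem`, `share`, the notice format) is hypothetical and only illustrates the described behaviour: the author decides whether internal resharing is allowed, and publicly shared items may be reshared but carry a copyright notice. Encryption in transit is not modelled.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    PERSONAL = 0   # personal knowledge base
    INTERNAL = 1   # system-internal communication platform
    PUBLIC = 2     # open internet platform


@dataclass(frozen=True)
class SharedItem:
    owner: str
    title: str
    scope: Scope = Scope.PERSONAL
    allow_reshare: bool = False
    notice: str = ""


def share(item, user, target, allow_reshare=False):
    """Return a copy of item widened to target scope, or raise on a violation."""
    if target <= item.scope:
        raise ValueError("sharing must widen the access scope")
    is_owner = user == item.owner
    if not is_owner and not item.allow_reshare:
        raise PermissionError(f"{user} may not reshare {item.title!r}")
    if target == Scope.PUBLIC:
        # Public items may be reshared repeatedly but must indicate copyright.
        return SharedItem(item.owner, item.title, target, True,
                          f"(c) {item.owner}")
    # Only the owner's choice sets the reshare flag for internal sharing.
    flag = allow_reshare if is_owner else item.allow_reshare
    return SharedItem(item.owner, item.title, target, flag, item.notice)


note = SharedItem("alice", "HITS notes")
internal = share(note, "alice", Scope.INTERNAL, allow_reshare=False)
try:
    share(internal, "bob", Scope.PUBLIC)
    blocked = False
except PermissionError:
    blocked = True
public = share(internal, "alice", Scope.PUBLIC)
```

Returning a new immutable item for each share keeps the original access scope intact, which is one simple way to make every widening of scope an explicit, checkable step.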
The system is trusted in that access control over the personal knowledge base and the personal achievement library is managed by the user, and the data in them is stored on the system platform in a trusted manner.
Trusted storage means that data is stored in the database as ciphertext, encrypted with the public key of an asymmetric key pair; only the user holds the private key and can decrypt, so that even the platform administrator cannot obtain the decryption key.
On this platform, the flow of knowledge drives the continuous self-renewal of the knowledge base, forming a continuously circulating, self-growing knowledge ecosystem. The platform of the invention has the following characteristics:
(1) A micro-cycle of self-renewal of the user's personal knowledge: by repeating the knowledge management process, the user continuously updates the personal knowledge base;
(2) An inner cycle (small cycle) in which the system's internal learning resources are updated automatically: on the learning platform, the three processes of knowledge learning, knowledge innovation, and achievement submission form a small cycle, and updates to users' learning achievements continuously drive the cyclic renewal of the platform's learning resource library;
(3) An outer cycle (large cycle) in which the system's internal learning resources are updated automatically: the continuous updating of the personal knowledge bases and achievement libraries drives the continuous updating of open internet resources through orderly, open knowledge sharing, and the ongoing updating of open internet resources in turn further drives the updating of the system's internal learning resources, forming the outer cycle.
Embodiment 1
Correspondingly, the invention also provides a trusted knowledge ecosystem 10, comprising an automatic learning-resource collection module 11, a resource learning module 12, a knowledge management module 13, a knowledge sharing module 14, a knowledge innovation module 15, and an achievement submission module 16.
The automatic learning-resource collection module 11 collects learning materials on a specific topic from open internet resources on demand, performs data cleaning and data reduction, and stores structured or semi-structured data in a trusted manner in the internal learning resource library.
The resource learning module 12 supports self-study of the resources to form a personal knowledge base, which is stored on the platform in a trusted manner. The carrier formats of the learning objects include, but are not limited to, video, audio, documents, and web pages. During learning the user can take notes, annotate resources, extract knowledge points, and so on; these knowledge elements are the essential content from which the knowledge base is formed.
The knowledge management module 13 handles the process of merging the knowledge elements in the knowledge base with the existing knowledge system: it organizes the knowledge elements in the personal knowledge base and merges and updates them into the existing knowledge base, including but not limited to knowledge deduplication, disambiguation, and knowledge association. Through knowledge management, the personal knowledge base is updated continuously. The personal knowledge base is stored on the platform in a trusted manner. Trusted here means that the content is stored on the platform as ciphertext, so that even the platform administrator cannot decrypt it. As to the specific encryption scheme, one option is public-key encryption with the private key kept by the user.
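As a rough illustration of the deduplication part of this module, the sketch below merges new notes into an existing knowledge base and skips entries that differ only in case or spacing. The normalisation rule and the names `normalise` and `merge_elements` are assumptions; disambiguation and knowledge association would need considerably more than this.

```python
def normalise(text):
    """Collapse whitespace and case so trivially different notes compare equal."""
    return " ".join(text.lower().split())


def merge_elements(knowledge_base, new_elements):
    """Merge notes into knowledge_base (normalised key -> note); return additions."""
    added = []
    for note in new_elements:
        key = normalise(note)
        if key and key not in knowledge_base:   # skip empty notes and duplicates
            knowledge_base[key] = note
            added.append(note)
    return added


kb = {"hits ranks pages": "HITS ranks pages"}
added = merge_elements(
    kb, ["hits  ranks PAGES", "Naive Bayes classifies text", ""]
)
```

Only the second note is added here, because the first matches an existing entry after normalisation and the third is empty.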
The knowledge sharing module 14 shares knowledge in a trusted manner with specific users or groups, and is one means of disseminating knowledge. It is trusted in that content is transmitted as ciphertext and is therefore hard to steal by packet capture, and in that the recipients of a share are strictly controlled through access control, which can restrict the other party's permissions to read, write, copy, and forward. Knowledge sharing includes sharing part of the knowledge in the personal knowledge base to the system's internal communication platform or to an open internet resource platform, sharing from the internal communication platform to an open internet resource platform, and sharing from the personal achievement library to the internal communication platform or to an open internet resource platform. Knowledge sharing is the most important way of promoting the flow of knowledge. During sharing, the user can strictly control the access scope of the content through permission control, and information travels over the transmission channel in ciphertext form.
The knowledge innovation module 15: knowledge innovation is the deep processing of knowledge. Building on the existing knowledge system, the user applies personal intelligence to produce new knowledge products and gradually forms a personal achievement library.
The achievement submission module 16: the user's personal achievements can be submitted to the internal platform, further enriching the system's internal learning resource library. They are shared, with intellectual property protected, for other learners to study.
The system is trusted in that access control over the personal knowledge base and the personal achievement library is managed by the user, and the data in them is stored on the system platform in a trusted manner.
Trusted storage means that data is stored in the database as ciphertext, encrypted with the public key of an asymmetric key pair; only the user holds the private key and can decrypt, so that even the platform administrator cannot obtain the decryption key.
It should be understood that:
The algorithms and displays provided here are not inherently related to any particular computer, virtual device, or other equipment. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is apparent. In addition, the invention is not directed to any particular programming language. It should be understood that the content of the invention described here can be implemented in various programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided here. It should be understood, however, that embodiments of the invention can be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into it, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be changed adaptively and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment can be combined into one module, unit, or component, and can furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, those skilled in the art will understand that although some embodiments described here include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented as software modules running on one or more processors. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) can be used in practice to implement some or all of the functions of some or all of the components of the device for creating a virtual machine according to an embodiment of the invention. The invention may also be implemented as equipment or device programs (for example, computer programs and computer program products) for carrying out part or all of the method described here. Such programs implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals can be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words can be interpreted as names.
The foregoing is only a preferred embodiment of the invention, and the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims (10)

1. A credible knowledge ecosystem, characterized by comprising:
a learning resource automatic collection module, which collects learning materials on specific topics from open Internet resources on demand, performs data cleaning and data reduction, and credibly stores the structured or semi-structured data in an internal learning resource library;
a resource learning module, which learns from the resources to form a personal knowledge base, wherein the personal knowledge base is stored on the platform in a credible manner;
a knowledge management module, which organizes the knowledge elements in the personal knowledge base and merges and updates them into the existing knowledge base;
a knowledge sharing module, which credibly shares knowledge with specific users;
a knowledge innovation module, which produces personal achievements in combination with the existing knowledge system and forms a personal achievement library;
an achievement submission module, which submits the personal achievements to the platform for other learners to study.

2. The system according to claim 1, characterized in that:
the carrier forms of the objects handled by the resource learning module include: video, audio, documents, and/or web pages;
merging the knowledge elements in the personal knowledge base with the existing knowledge system includes: knowledge deduplication, disambiguation, and/or knowledge association.

3. The system according to claim 2, characterized in that:
the knowledge management method is: a mind-mapping tool is embedded in the system by plug-in invocation to manage knowledge.

4. The system according to claim 1, characterized in that:
the knowledge sharing includes: sharing part of the knowledge in the personal knowledge base to the internal communication platform of the system or to an open Internet resource platform, sharing from the internal communication platform to the open Internet resource platform, and sharing from the personal achievement library to the internal communication platform or the open Internet resource platform.

5. The system according to claim 4, characterized in that:
during sharing, the user can strictly control the access scope of the content through permission control, and the information is transmitted over the transmission channel in ciphertext form.

6. The system according to claim 1, characterized in that:
the method for collecting learning materials is a semantics-based topic crawler or an incremental crawler algorithm.

7. The system according to claim 6, characterized in that:
the topic crawler method includes the following steps:
(1) using the Dewey Decimal Classification, in the web page feature extraction stage, quickly finding the keywords in the web page text whose topics are close to those of the anchor text keywords;
(2) extracting the feature text of the topic candidate links;
(3) using a naive Bayes text classifier to classify the edge text of the candidate link topics and obtain topic-relevant web pages; if the text belongs to the specific topic, the corresponding candidate link takes the classification weight as its priority value and is inserted into the crawl queue in order of priority, and the crawler visits links with larger classification values first; if the text does not belong to the specific topic, the candidate link is discarded;
(4) computing, with the HITS algorithm, the authority and hub scores corresponding to the Web link information of the relevant web pages, and combining the anchor text, the information near the anchor text, the back-linking pages, the sibling links of the backlinks, and the URL links to predict the relevance of the web pages to be crawled to the topic.

8. The system according to claim 7, characterized in that:
extracting the feature text of the topic candidate links includes the following steps:
(1) performing word segmentation on the anchor text and body text of the web page and removing stop words to obtain keywords;
(2) looking up the Dewey classification number of each keyword;
(3) using the properties of the Dewey Decimal Classification together with two-dimensional coordinates to extract the feature text of the topic candidate links: taking the length of the keyword classification number as the X axis and the keyword classification number as the Y axis, and plotting the point corresponding to each keyword's Dewey Decimal Classification number in the two-dimensional coordinates;
(4) extracting the keywords corresponding to the anchor text key points and the key points surrounding the anchor text in the two-dimensional coordinates as the feature text of the topic candidate links.

9. The system according to claim 1, characterized in that:
the system is credible, which is specifically embodied in that access permission control for the personal knowledge base and the personal achievement library is managed by the user, and the data therein is credibly stored on the system platform.

10. The system according to claim 9, characterized in that:
the credible storage means that the data is stored in the database in ciphertext form and is encrypted with the public key of an asymmetric key pair, while the private key is held only by the user, so that even the platform administrator cannot obtain the data.
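Step (3) of claim 7 (classify the text around a candidate link, queue on-topic links by classification weight, discard the rest) can be illustrated with a minimal sketch. It is not the applicant's implementation: the class names `TinyNaiveBayes` and `CrawlFrontier`, the labels `on_topic` and `off_topic`, the 0.5 threshold, the whitespace tokenizer, the training snippets, and the example.org URLs are all assumptions introduced for illustration. The claims do not specify how the classification weight is computed; the sketch assumes it is the naive Bayes posterior probability of the topic.

```python
import heapq
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def posterior(self, text, label):
        """Return P(label | text), normalised over all trained labels."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for lab, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[lab].values())
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                count = self.word_counts[lab][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[lab] = score
        top = max(log_scores.values())  # subtract the max for numerical stability
        norm = sum(math.exp(s - top) for s in log_scores.values())
        return math.exp(log_scores[label] - top) / norm


class CrawlFrontier:
    """Crawl queue ordered by classification weight (hypothetical helper)."""

    def __init__(self, classifier, topic_label="on_topic", threshold=0.5):
        self.classifier = classifier
        self.topic_label = topic_label
        self.threshold = threshold  # assumed cut-off for "belongs to the topic"
        self._heap = []
        self._seen = set()

    def offer(self, url, edge_text):
        """Queue the link if its edge text is on-topic; otherwise discard it."""
        if url in self._seen:
            return False
        self._seen.add(url)
        weight = self.classifier.posterior(edge_text, self.topic_label)
        if weight <= self.threshold:
            return False  # off-topic: the candidate link is dropped
        heapq.heappush(self._heap, (-weight, url))  # negate: heapq is a min-heap
        return True

    def next_url(self):
        """Pop the queued link with the largest classification weight."""
        return heapq.heappop(self._heap)[1] if self._heap else None


classifier = TinyNaiveBayes()
classifier.train("machine learning course lecture notes", "on_topic")
classifier.train("deep learning tutorial video", "on_topic")
classifier.train("cheap flights hotel deals", "off_topic")
classifier.train("celebrity gossip news", "off_topic")

frontier = CrawlFrontier(classifier)
frontier.offer("https://example.org/dl-intro", "learning video")
frontier.offer("https://example.org/ml", "machine learning lecture")
frontier.offer("https://example.org/travel", "cheap hotel deals")  # discarded
```

With this toy training set, `frontier.next_url()` returns the machine-learning link before the deep-learning link, because its edge text receives the larger posterior. A real crawler would replace the whitespace tokenizer with the word segmentation and stop-word removal described in claim 8.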
CN201811207431.XA 2018-10-17 2018-10-17 A kind of believable Knowledge Ecosystem Pending CN109582858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811207431.XA CN109582858A (en) 2018-10-17 2018-10-17 A kind of believable Knowledge Ecosystem

Publications (1)

Publication Number Publication Date
CN109582858A true CN109582858A (en) 2019-04-05

Family

ID=65920559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811207431.XA Pending CN109582858A (en) 2018-10-17 2018-10-17 A kind of believable Knowledge Ecosystem

Country Status (1)

Country Link
CN (1) CN109582858A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342966A (en) * 2021-07-08 2021-09-03 北京明略昭辉科技有限公司 Method, system, equipment and storage medium for editing online document based on knowledge base
CN113407805A (en) * 2021-07-16 2021-09-17 山东北斗科技信息咨询有限公司 Big data based policy acquisition, cleaning and automatic accurate pushing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521337A (en) * 2011-12-08 2012-06-27 华中科技大学 Academic community system based on massive knowledge network
WO2015126957A1 (en) * 2014-02-19 2015-08-27 Snowflake Computing Inc. Resource management systems and methods

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
中国通信学会学术工作委员会: "《第九届中国通信学会学术年会论文集》", 31 December 2012, 北京邮电大学出版社 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190405