CN109446247B

CN109446247B - Scientific and technological innovation data visual analysis and display method

Info

Publication number: CN109446247B
Application number: CN201811062412.2A
Authority: CN
Inventors: 范通让; 王建民; 贾红佳; 吕红伟; 孙菲
Original assignee: Shijiazhuang Tiedao University
Current assignee: Shijiazhuang Tiedao University
Priority date: 2018-09-12
Filing date: 2018-09-12
Publication date: 2022-08-30
Anticipated expiration: 2038-09-12
Also published as: CN109446247A

Abstract

The invention discloses a method for visual analysis and display of scientific and technological innovation data. The method realizes the visualization scalability and reusability of scientific and technological innovation data by establishing a visual representation abstract model and an information polyhedron data model. With the help of technological innovation data feature analysis, a nested circle visualization algorithm for hierarchical data and a fisheye moment diagram visualization algorithm for mesh data are designed. This method has certain generality for large-scale data and data with complex relationship.

Description

Scientific and technological innovation data visualization analysis and display method

Technical field

This patent application relates to the field of data visualization, and in particular to a visual analysis and display method for scientific and technological innovation data.

Background

Data visualization is the foundation of data analysis and is becoming increasingly widespread. As the volume of data held by science and technology government departments grows and the associations among the data become more complex, existing visualization systems for scientific and technological innovation data can no longer support the visual display of large-scale data and information polyhedra. Visualization algorithms are highly specialized and must meet individual needs, so the main problem at present is how to develop a visualization system for scientific and technological innovation data that is scalable, reusable, and specialized.

Summary of the invention

The technical problem to be solved by the present invention is to provide a visual analysis and display method for scientific and technological innovation data, so as to address the scalability, reusability, and specialization of visualization systems for such data.

To solve the above problem, the technical solution adopted by the present invention is:

A visual analysis and display method for scientific and technological innovation data, comprising the following steps:

1) Abstract the data visualization process and establish a visual representation abstract model, achieving unified modeling of data components and extensibility of visualization components;

2) Abstract the information polyhedron data in the scientific and technological innovation data and establish an information polyhedron data model oriented to information sides and user tasks, achieving visualization of information sides and user tasks;

3) Combining the fisheye moment diagram interaction technique, apply the fisheye moment diagram layout algorithm to scientific and technological innovation data to visualize data with complex associations; at the same time, building on the Venn visualization algorithm, use an improved Venn visualization algorithm to achieve nested-circle visualization of large-scale scientific and technological innovation data.

Preferably, the data component includes tabular data. In use, all tabular data is ultimately converted to JSON for unified configuration and mapping. Tabular data includes a header, values, and data types; the data types are numeric data, date data, and text data;

The visualization components include the bar chart (BarChart), line chart (LineChart), pie chart (PieChart), and force-directed graph (ForceGraph).

Preferably, the information polyhedron data model includes three parts: a user task model, an information side model, and a user preference model. The user task model includes the identifier of the task created by the user, the task name, the task type, and the importance of the task;

The information side model includes a data item set, a data item association table set, a data item time dimension set, a data item regional dimension set, and a data item category classification. The data item set includes data items and a data item keyword set, where a data item includes the code of the data table, the scientific and technological innovation data category name, and the scientific and technological innovation data category attributes. The data item association table set includes the source table code, the target data table code, the data table association weight, and the data table association type. The data item category classification includes four categories: detailed business data, statistical business data, statistical yearbook data, and detailed yearbook data;

The data item keyword set is the set of ontologies that share the same concept;
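As an illustration only, the information side model described above could be held in plain record types. The following is a minimal sketch, assuming Python dataclasses; every class name, field name, and the helper `related_tables` is hypothetical and chosen for readability, and the types of the attribute and relationship fields are assumptions, since the text does not fix them.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    """The four data item categories named in the text."""
    BUSINESS_DETAIL = "business data, detailed"
    BUSINESS_STATISTICS = "business data, statistical"
    YEARBOOK_STATISTICS = "yearbook data, statistical"
    YEARBOOK_DETAIL = "yearbook data, detailed"


@dataclass
class DataItem:
    di_code: str      # code of the data table
    attri_name: str   # data category name
    meta_data: dict   # data category attributes (dict form is an assumption)


@dataclass
class DataItemRelation:
    source_di_code: str   # source table code
    target_di_code: str   # target data table code
    degree: float         # association weight
    relationship: str     # association type (string form is an assumption)


@dataclass
class STInfoFacet:
    data_items: list[DataItem]
    keywords: set[str]                  # data item keyword set
    relations: list[DataItemRelation]   # data item association table set
    time_dims: list[str]                # time dimension set
    region_dims: list[str]              # regional dimension set
    data_class: DataClass

    def related_tables(self, di_code: str) -> list[str]:
        """Codes of tables associated with di_code, strongest association first."""
        hits = [r for r in self.relations if r.source_di_code == di_code]
        return [r.target_di_code for r in sorted(hits, key=lambda r: -r.degree)]
```

Keeping the association records separate from the data items means a force-directed view can read its edges directly from `relations`, with `degree` available as an edge weight.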

User preference model: the user preference is Preferences = F1(Sum(API), tbi(API)), which evaluates the task importance of the i-th data table. Sum(API) is the sum of the task weights of the tables used by the user, that is, Sum(API) = tb1API + tb2API + tb3API + ... + tbiAPI; tbiAPI is the task weight of the i-th data table, tbiAPI = F2(TaskType, Level), where TaskType is the task type of the data table and Level is the task value of the data table.

Preferably, the information polyhedron data model is established as follows:

From the perspective of the information sides, establish a unified data model for the information sides;

From the perspective of user tasks, establish the association between the user task model and the information sides;

Based on user tasks, establish user preference analysis, enabling data analysis of the user's task history.

Preferably, the fisheye moment diagram layout algorithm is as follows:

Let the coordinates of the focus point to be magnified in the graph be C = (x, y), and its coordinates in the view be C1 = (x1, y1). The correspondence between C1 and C is

$C_1 = F_{fisheye}(C)$ (1)

Let the angle between point C and the positive X axis be [symbol missing in source], and the angle between point C1 and the positive X axis be θ. Formula (1) can then be rewritten as

[formula missing in source]

Assuming that the circular deformation region of the graph has radius R, it can be expressed as

$R = (x^m + y^m)^n$ (3)

The radius of the circular region to be deformed is proportional to the parameters m and n. Based on the above conditions, the final circular magnification formula for the deformation region is given as

[formula missing in source]

Preferably, the steps of the improved Venn visualization algorithm are as follows:

Step 1: First determine the radius of the node represented by each circle. The size of every leaf node in the hierarchical data is set to 1, and the size of an intermediate node is the sum of the numbers of child nodes it contains;

Step 2: Select a center point O in the visualization area as the center of the nested-circle arrangement;

Step 3: Using the improved Venn visualization algorithm, arrange the nodes iteratively in order from the root node to the leaf nodes;

Step 4: End when the last leaf node has been arranged.

A further improvement of the technical solution of the present invention is:

Owing to the above technical solution, the technical progress achieved by the present invention is as follows. Based on the multi-channel, multi-level nature of information dissemination, and drawing on the SIR model commonly used in epidemiology, the invention constructs a two-layer SIR information dissemination model with unidirectional influence. The two-layer model is further refined with subjective heterogeneity and memory-effect heterogeneity. To rank the importance of nodes in the two-layer network, three indicators (degree centrality, betweenness centrality, and closeness centrality) are used to evaluate the nodes comprehensively. Using the TOPSIS multi-attribute decision analysis method together with indicator weights and network-layer weights, a method for selecting important nodes in a two-layer network is given.

The present invention mainly solves three problems:

(1) Based on the multi-path, multi-level nature of information in a network, and drawing on the SIR model commonly used in studying information dissemination, a two-layer SIR information dissemination model with unidirectional influence is given. Each layer of the model has its own dissemination process, and online dissemination exerts a radiating influence on offline dissemination.

(2) The information dissemination process does not exhibit heterogeneity. Therefore, two factors present in the dissemination process, subjective heterogeneity and the memory effect, are incorporated to improve the two-layer SIR information dissemination model, yielding a heterogeneous online-offline information dissemination model.

(3) Using the TOPSIS multi-attribute decision analysis method, the three indicators of degree centrality, betweenness centrality, and closeness centrality are combined to evaluate the importance of each node in its network layer, giving a TOPSIS-based method for selecting important nodes in a two-layer network. The nodes selected by this method serve as immune nodes, forming an immunization strategy for the multi-layer network.

To reflect faithfully how information spreads in real networks, the present invention studies information dissemination in order to predict information flow, control dissemination, and monitor public opinion. It extracts the main factors in the network for study and seeks simple laws that govern information dissemination in the complex system of a two-layer network. This is an effective, convenient, and general way to study node behavior and strategies in different types of environments.

In the present invention, the study of network behavior strategies relates only to the network nodes that produce the behavior, regardless of how the behavior is implemented. Irrelevant low-level details are hidden, which makes the study of network behavior strategies holistic and simple, and gives it adaptability to network evolution, generality, and portability.

Description of the drawings

Figure 1 is a structure diagram of the visualization component;

Figure 2 is a structure diagram of the data component;

Figure 3 is a structure diagram of the information polyhedron;

Figure 4 illustrates the traditional Venn visualization algorithm;

Figure 5 illustrates the nested-circle visualization algorithm;

Figure 6 shows the node arrangement time of the nested-circle visualization algorithm.

Detailed description

The present invention is further described below with reference to the drawings and examples.

I. Definition of basic elements

(1) Visual representation abstract model

The visual representation abstract model describes visual representations at multiple levels. It inherits some features of user interface description languages and describes the interface in a modular way. The description consists of three parts: the visualization component (VCMD), the data component (DMD), and the mapping between data and visualization components (VDMD).

The components are described as follows:

1. The visualization component describes the visual views of STUIMD and allows common visualization types to be configured and displayed. Visualization components and visual views are in a many-to-one relationship: one user function view can be configured with multiple visualization components, and a visualization component can be reused, which improves reusability. The basic visualization components are the bar chart (BarChart), line chart (LineChart), pie chart (PieChart), and force-directed graph (ForceGraph); see Figure 1.

2. The data component is the STUIMD model's description of data; it configures and describes the data to be visualized. Scientific and technological innovation data to be visualized falls into two types: tabular data and network-structure data. Within the data component, network-structure data is converted into tabular form. A formal description of table-structured data is given, comprising three definitions: header, value, and data type. The data types are numeric data, date data, and text data. In use, all tabular data is ultimately converted to JSON for unified configuration and mapping; see Figure 2.

3. The mapping component associates visualization components with data components; it mainly records the configured associations between data table attributes and visualization component attributes.
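The data and mapping components described above can be illustrated with a short sketch. It assumes a JSON layout with `headers`, `types`, and `rows` keys and a mapping expressed as a dictionary from chart attribute to column name; these layouts and the function names `table_to_json` and `apply_mapping` are hypothetical, since the text does not specify the JSON schema.

```python
import json
from datetime import date

# Column types named in the text: numeric, date, text.
COLUMN_TYPES = {"numeric", "date", "text"}


def table_to_json(headers, types, rows):
    """Serialize a table (header, column types, row values) to a JSON string."""
    if len(headers) != len(types):
        raise ValueError("one type per header is required")
    unknown = set(types) - COLUMN_TYPES
    if unknown:
        raise ValueError(f"unsupported column types: {sorted(unknown)}")
    records = []
    for row in rows:
        record = {}
        for name, kind, value in zip(headers, types, row):
            # Dates are not JSON-native, so they are stored as ISO 8601 strings.
            record[name] = value.isoformat() if kind == "date" else value
        records.append(record)
    payload = {"headers": headers, "types": types, "rows": records}
    return json.dumps(payload, ensure_ascii=False)


def apply_mapping(table_json, mapping):
    """Project JSON rows onto chart attributes, e.g. {"x": "year", "y": "patents"}."""
    rows = json.loads(table_json)["rows"]
    return [{attr: row[col] for attr, col in mapping.items()} for row in rows]


# Illustrative values only, not data from the source.
doc = table_to_json(
    ["year", "patents", "filed"],
    ["text", "numeric", "date"],
    [["2017", 120, date(2017, 12, 31)], ["2018", 150, date(2018, 12, 31)]],
)
points = apply_mapping(doc, {"x": "year", "y": "patents"})
```

Because the mapping is data rather than code, the same table could feed a BarChart and a LineChart by swapping only the mapping dictionary, which is the kind of reuse the model aims for.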

(2) Information polyhedron data model

The information polyhedron data model includes three parts: the user task model, the information side model, and the user preference model; see Figure 3.

1. User task model: Task = <taskID, taskName, taskType, level>, where taskID is the identifier of the task created by the user, taskName is the task name (its English name), taskType is the task type, and level is the importance of the task. taskType takes four values: browse, extract, parallel move-up, and print.

2. Scientific and technological innovation data model: STInfoFacet = <DataItemSet, DataItemRelationSet, DataTimeSet, DataRegionSet, Class>. The data item set is DataItemSet = <DataItem, KeyWordSet>, where a data item is DataItem = <DICode, AttriName, MetaData>; DICode is the code of the data table, AttriName is the scientific and technological innovation data category name, and MetaData holds the scientific and technological innovation data category attributes. KeyWordSet is the data item keyword set (the set of ontologies that share the same concept). The data item association table set is DataItemRelationSet = <SourceDICode, TargetDICode, Degree, Relationship>, where SourceDICode is the source table code, TargetDICode is the target data table code, Degree is the data table association weight, and Relationship is the association type between the data tables. DataTimeSet is the data item time dimension set. DataRegionSet is the data item regional dimension set. Class is the data item category classification, with four categories: detailed business data, statistical business data, statistical yearbook data, and detailed yearbook data.

3. User preference: Preferences = F1(Sum(API), tbi(API)), which evaluates the task importance of the i-th data table. Sum(API) is the sum of the task weights of the tables used by the user, that is, Sum(API) = tb1API + tb2API + tb3API + ... + tbiAPI. tbiAPI is the task weight of the i-th data table, tbiAPI = F2(TaskType, Level), where TaskType is the task type of the data table and Level is the task value of the data table.
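The text leaves the functions F1 and F2 unspecified. The sketch below shows one possible instantiation, purely as an assumption: F2 is taken to be a per-type weight multiplied by the level, and F1 is taken to be each table's share of Sum(API). The numeric type weights, the `Task` field names, and the choice of one task per table are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the text names the four task types but gives no numbers.
TYPE_WEIGHT = {"browse": 1.0, "extract": 2.0, "parallel_move_up": 1.5, "print": 0.5}


@dataclass
class Task:
    task_id: str
    task_name: str
    task_type: str   # one of the keys of TYPE_WEIGHT
    level: float     # importance of the task


def task_weight(task):
    """Assumed form of F2(TaskType, Level): type weight times level."""
    return TYPE_WEIGHT[task.task_type] * task.level


def preferences(table_tasks):
    """Assumed form of F1: each table's task weight as a share of Sum(API).

    table_tasks maps a data table code to the Task performed on that table.
    """
    weights = {code: task_weight(task) for code, task in table_tasks.items()}
    total = sum(weights.values())   # Sum(API) = tb1API + tb2API + ...
    if total == 0:
        return {code: 0.0 for code in weights}
    return {code: w / total for code, w in weights.items()}


prefs = preferences({
    "T1": Task("1", "browse_patents", "browse", 2),
    "T2": Task("2", "extract_funds", "extract", 3),
})
```

With these assumed weights, table T2 receives three quarters of the total preference, because an extract task at level 3 outweighs a browse task at level 2. Any monotone choice of F1 and F2 would fit the same interface.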

(3) Nested-circle visualization algorithm

Nested circles use circles or ellipses to represent the nodes of a tree structure, with each child node drawn inside the circle of its parent. Nested circles have advantages both in space utilization and in conveying parent-child relationships.

1. Traditional Venn visualization algorithm, see Figure 4:

Step 1: First select a point O in the visualization interface as the center of the nested-circle arrangement. Circles C1, C2, and C3 are arranged around point O. As shown in Figure 4(a), record the outermost circles after arrangement as {C1, C2, C3}; their order is counterclockwise, {C1->C2->C3->C1}.

Step 2: When a new circle Ci is added, let its radius be Ri, where the radius equals the size of the node. Let Cm be the circle closest to the center O, let the circle to the left of Cm be Cm-1 and the circle to the right of Cm be Cm+1, and let Cj be any circle in the outermost layer.

Step 3: From the condition that Ci is tangent to both Cm and Cm-1, together with the radius of Ci, compute the center of Ci, taking the center position that lies outside Cm as the center of the i-th node circle.

Step 4: Traverse all outer circles in the record and check whether any of them intersects this circle.

Step 5: If no outer circle intersects it, as in Figure 4(b), add the new circle. Update the recorded outer circle set to {C1, C2, C3, Ci} and update the counterclockwise arrangement order to {C1->Ci->C2->C3->C1}.

Step 6: If an outer circle intersects the newly added circle, reset the circles tangent to the new circle to Cm+1 and Cm, and return to Step 3.

Step 7: Suppose Cm+p is the first circle for which the new circle, placed tangent to it, intersects no outer circle; then update the outer circle set and the arrangement order.

Step 8: End when the last node has been arranged.
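Steps 3 and 4 above rest on two geometric primitives: placing a circle tangent to two existing circles, and testing two circles for intersection. The following is a minimal sketch of both, assuming circles are `(x, y, r)` tuples. The function names are hypothetical, and choosing the candidate center farther from O is one reading of "outside" in Step 3, not something the text states explicitly.

```python
import math


def tangent_center(cm, cn, ri, origin=(0.0, 0.0)):
    """Center of a radius-ri circle externally tangent to circles cm and cn.

    Of the two geometric solutions, the one farther from `origin` is returned
    (an assumed reading of "outside" in Step 3).
    """
    (x1, y1, r1), (x2, y2, r2) = cm, cn
    d = math.hypot(x2 - x1, y2 - y1)
    a, b = r1 + ri, r2 + ri   # required distances from the new center to each center
    if d == 0 or d > a + b or d < abs(a - b):
        raise ValueError("no circle of this radius touches both circles")
    # Signed distance from cm's center, along the center line, to the chord midpoint.
    t = (a * a - b * b + d * d) / (2 * d)
    h = math.sqrt(max(a * a - t * t, 0.0))
    ux, uy = (x2 - x1) / d, (y2 - y1) / d   # unit vector from cm to cn
    px, py = x1 + t * ux, y1 + t * uy
    candidates = [(px - h * uy, py + h * ux), (px + h * uy, py - h * ux)]
    return max(candidates, key=lambda c: math.hypot(c[0] - origin[0], c[1] - origin[1]))


def intersects(c1, c2, eps=1e-9):
    """True if two (x, y, r) circles overlap; mere tangency does not count."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1]) < c1[2] + c2[2] - eps


# Two touching unit circles; the layout center is placed below them for the example.
cm, cn = (-1.0, 0.0, 1.0), (1.0, 0.0, 1.0)
cx, cy = tangent_center(cm, cn, 1.0, origin=(0.0, -1.0))
new_circle = (cx, cy, 1.0)
```

The small tolerance `eps` matters here: the new circle is tangent to its two neighbours by construction, so a strict comparison without tolerance could report a false intersection from rounding error.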

5. Improved algorithm pseudocode:

The nested-circle visualization algorithm amounts to an iterative application of the Venn node algorithm. The steps are as follows:

Step 1: First determine the radius of the node represented by each circle. The size of every leaf node in the hierarchical data is set to 1, and the size of an intermediate node is the sum of the numbers of child nodes it contains.

Step 2: Select a center point O in the visualization area as the center of the nested-circle arrangement.

Step 3: Using the improved Venn visualization algorithm above, arrange the nodes iteratively in order from the root node to the leaf nodes.
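Step 1 and the root-to-leaf order of Step 3 can be sketched as follows. The sketch assumes the hierarchy is given as nested dictionaries with `name` and optional `children` keys, and it reads "the sum of the numbers of child nodes" as the sum of the children's sizes, so that an intermediate node's size equals its number of leaf descendants. Both the input layout and that reading are assumptions.

```python
from collections import deque


def node_size(node):
    """Size of a node: 1 for a leaf, otherwise the sum of its children's sizes."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(node_size(child) for child in children)


def layout_order(root):
    """Return (name, size) pairs from the root down, level by level."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append((node["name"], node_size(node)))
        queue.extend(node.get("children", []))
    return order


tree = {
    "name": "root",
    "children": [
        {"name": "A", "children": [{"name": "a1"}, {"name": "a2"}]},
        {"name": "B"},
    ],
}
order = layout_order(tree)
```

For brevity the sketch recomputes sizes on every visit; caching the size on each node during a single post-order pass would avoid the repeated work on large hierarchies.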

(4) Fisheye moment diagram layout algorithm

1. Let the coordinates of the focus point to be magnified in the graph be C = (x, y), and its coordinates in the view be C1 = (x1, y1). The correspondence between C1 and C is

$C_1 = F_{fisheye}(C)$ (1)

Let the angle between point C and the positive X axis be [symbol missing in source]

$R = (x^m + y^m)^n$ (3)
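Because the rewritten form of formula (1) and the final magnification formula did not survive in this text, the sketch below does not reproduce the patented mapping. It substitutes a commonly used radial fisheye function, the Sarkar-Brown distortion g(r) = (d + 1) r / (d r + 1), to show the general shape of such a transform: the polar angle around the focus is kept and only the radius changes. The function names, the `distortion` parameter, and the use of absolute values in formula (3) to keep R real are all assumptions.

```python
def radius(x, y, m=2, n=0.5):
    """Formula (3): R = (x^m + y^m)^n. With m=2, n=0.5 this is the Euclidean norm.

    Absolute values are taken (an assumption) so that R stays real for odd m.
    """
    return (abs(x) ** m + abs(y) ** m) ** n


def fisheye(point, focus, lens_radius, distortion=3.0):
    """Map a layout point to its view position under a circular fisheye lens.

    Points outside the lens are unchanged. Inside it, the direction from the
    focus is kept and the normalized radius r in [0, 1] is replaced by
    g(r) = (d + 1) r / (d r + 1), a substitute for the source's missing formula.
    """
    dx, dy = point[0] - focus[0], point[1] - focus[1]
    dist = radius(dx, dy)
    if dist == 0 or dist >= lens_radius:
        return point
    r = dist / lens_radius
    g = (distortion + 1) * r / (distortion * r + 1)
    scale = g * lens_radius / dist
    return (focus[0] + dx * scale, focus[1] + dy * scale)


inside = fisheye((10.0, 0.0), (0.0, 0.0), lens_radius=20.0)
outside = fisheye((30.0, 0.0), (0.0, 0.0), lens_radius=20.0)
```

Since g(1) = 1, points on the lens boundary stay where they are, so the magnified region joins the undistorted remainder of the graph without a visible seam.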

II. Experimental verification and analysis

(1) Verification and analysis of the nested-circle visualization algorithm

1. Figure 6 shows the results. Using 10,000 to 50,000 nodes of random size as experimental data, the traditional Venn visualization algorithm and the improved algorithm were each used to arrange the nodes, and the time taken by each was recorded. The time spent on the outer-circle arrangement alone could not be measured accurately. However, because the two algorithms use the same experimental data and the same method for finding the node closest to the center, the time from the start of the experiment to the completion of node arrangement is used as the measured time for both.

Experimental analysis:

1. For the traditional Venn visualization algorithm, the growth rate of the running time keeps increasing as the number of nodes grows. As the number of nodes increases, so does the number of outer nodes, and every newly added node must be tested for intersection against all outer circles, so arranging large-scale hierarchical nodes takes more and more time. The improved Venn algorithm instead tests only 20 neighboring nodes for intersection each time; when node sizes differ greatly, this number can be increased appropriately. Because only 20 nodes are tested each time, the improved algorithm takes far less time than the traditional Venn algorithm and exhibits linear complexity. The larger the number of nodes, the more pronounced the advantage, so the improved algorithm is more efficient than the traditional Venn algorithm.
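The difference between the two intersection checks can be sketched as follows. The window size of 20 comes from the text; reading "neighboring nodes" as the nearest entries in the recorded outer-circle chain around a given index is an assumption, as are the function names and the `(x, y, r)` circle representation.

```python
import math


def overlaps(a, b, eps=1e-9):
    """True if two (x, y, r) circles overlap; tangency does not count."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) < a[2] + b[2] - eps


def collides_full(new, outer):
    """Traditional check: test the new circle against every outer circle."""
    return any(overlaps(new, c) for c in outer)


def collides_window(new, outer, anchor, window=20):
    """Bounded check: test only `window` chain neighbours around index `anchor`."""
    n = len(outer)
    if n <= window:
        return collides_full(new, outer)
    half = window // 2
    # The outer circles form a closed chain, so indices wrap around.
    indices = [(anchor + offset) % n for offset in range(-half, window - half)]
    return any(overlaps(new, outer[i]) for i in indices)


# Illustrative chain of 100 unit circles on a line, spaced 3 apart.
outer = [(3.0 * i, 0.0, 1.0) for i in range(100)]
new = (150.5, 0.0, 1.0)   # overlaps the circle at index 50
```

Here `collides_window(new, outer, anchor=50)` finds the overlap with a fixed 20 tests, while the full check may scan all 100 circles. The bounded check is only as good as its anchor: with `anchor=0` the same overlap is missed, which is why the text suggests enlarging the window when node sizes differ greatly.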

(2) Experimental evaluation of the fisheye moment diagram

1. Table 1: participants spent 20 minutes observing and learning how to use the three experimental methods. After mastering them, each participant completed two specific tasks, and the time taken was recorded. Third-party plug-ins and other search tools were not allowed during the experiment. Finally, the mean and standard deviation of the participants' times were computed.

Table 1 Comparison of task completion time

2. Table 2: the three experimental tools were surveyed on four aspects: learnability, ease of use, effectiveness, and reliability. Each survey item was scored out of 10, and the mean and standard deviation of the participants' ratings were computed.

Table 2 Comparison of questionnaire results

Experimental analysis:

1. Participants completed the tasks faster with the fisheye moment diagram than with either of the other two methods. The standard deviation shows little variation in completion time among participants, indicating that the fisheye moment diagram performs stably and reliably. When looking for a specific node, a user first narrows down its approximate region by color, then locates the node by its name, and finally uses the fisheye view to magnify the node's position and to observe and record the number of node associations. The 20 participants also reported that the fisheye view has a clear advantage when recording the number of node associations. In addition, the overall appearance of the moment diagram is unchanged while the fisheye view is in use, so the user can quickly find the next node.

2. The fisheye moment diagram is clearly better than the other two methods in learnability, ease of use, effectiveness, and reliability; the traditional moment diagram comes second, and completing the tasks with Excel tables performs worst. Participants reported that the fisheye moment diagram performs best in ease of use and greatly improves working efficiency. The standard deviations also show that participants' feedback is consistent, with no notable discrepancies, indicating that the fisheye moment diagram has a clear advantage over the other two methods.

Claims

1. A scientific and technological innovation data visualization analysis and display method, characterized in that:

1) Abstract the data visualization process, establish an abstract model of visualization representation, and realize the unified modeling of data components and the expansibility of visualization components;

2) Abstract the information polyhedron data in the scientific and technological innovation data, establish an information polyhedron data model for the information side and user tasks, and realize the visualization of the information side and user tasks;

3) Combining the fisheye moment diagram interaction technique, apply the fisheye moment diagram layout algorithm to scientific and technological innovation data to visualize data with complex associations; at the same time, building on the Venn visualization algorithm, use an improved Venn visualization algorithm to achieve nested-circle visualization of large-scale scientific and technological innovation data;

The data component includes tabular data, which in use is ultimately converted to JSON for unified configuration and mapping: tabular data includes a header, values, and data types, and the data types include numeric data, date data, and text data;

Visual components include bar chart (BarChart), line chart (LineChart), pie chart (PieChart), force-directed chart (ForceGraph);

The information polyhedron data model includes three parts: the user task model, the information side model and the user preference model. The user task model includes the identifier of the user's established task, the task name, the task type, and the importance of the task;

The information side model includes a data item set, a data item association table set, a data item time dimension set, a data item regional dimension set, and a data item category classification; the data item set includes data items and a data item keyword set, wherein a data item includes the code of the data table, the scientific and technological innovation data category name, and the scientific and technological innovation data category attributes; the data item association table set includes the source table code, the target data table code, the data table association weight, and the data table association type; the data item category classification includes four categories: detailed business data, statistical business data, statistical yearbook data, and detailed yearbook data;

The data item keyword set refers to the ontology set with the same concept;

User preference model: the user preference is Preferences = F1(Sum(API), tbi(API)), which evaluates the task importance of the i-th data table, where Sum(API) is the sum of the task weights of the tables used by the user, namely Sum(API) = tb1API + tb2API + tb3API + ... + tbiAPI; and tbiAPI is the task weight of the i-th data table, tbiAPI = F2(TaskType, Level), where TaskType is the task type of the data table and Level is the task value of the data table;

The process of establishing the information polyhedron data model is as follows:

From the perspective of information side, establish a unified data model for information side;

From the perspective of user tasks, establish the relationship between the user task model and the information side;

Combine user tasks, establish user preference analysis, and realize data analysis of user task history;

The fisheye moment diagram layout algorithm is:

Let the coordinates of the focus point to be magnified in the graph be C = (x, y), and its coordinates in the view be C1 = (x1, y1); the correspondence between C1 and C is:

$C_1 = F_{fisheye}(C)$ (1)

Let the angle between point C and the positive X axis be [symbol missing in source], and the angle between point C1 and the positive X axis be θ; formula (1) can then be rewritten as:

[formula missing in source]

Assuming that the circular deformation region of the graph has radius R, the corresponding transformed fisheye radius can be expressed as formula (3):

$R = (x^m + y^m)^n$ (3)

The radius of the circular region to be deformed is proportional to the parameters m and n. Based on the above conditions, the final circular magnification formula for the deformed region is given as:

[formula missing in source]

2. The scientific and technological innovation data visualization analysis and display method according to claim 1, characterized in that the steps of the improved Venn visualization algorithm are as follows:

Step1: First determine the radius of the node represented by each circle, the size of the leaf node in the hierarchical data is uniformly set to 1, and the size of the intermediate node is the sum of the number of sub-nodes contained;

Step2: Select the center point O in the visualization area as the center point of the nested circle arrangement;

Step3: Using the above improved Venn visualization algorithm, arrange the nodes iteratively in order from the root node to the leaf nodes;

Step4: End when the last leaf node is arranged.