WO2023287395A1 - Computer-implemented apparatus and method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection
- Publication number
- WO2023287395A1 (PCT/US2021/041296)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- objects
- topic
- computer
- user
- implemented method
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/904—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2428—Query predicate definition using graphical user interfaces, including menus and forms
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/26—Visual data mining; Browsing structured data
-
- G06T11/26—
Definitions
- the present invention relates to data displays, and more particularly to computer-implemented displays of data objects.
- the invention provides a computer-implemented method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection.
- each of the second set of objects is hierarchically structured.
- the method is implemented by computer processes including: storing the first set of objects and storing the second set of objects; parsing the first set of objects in relation to the second set of objects in order to generate a mapping between the first set of objects and the second set of objects; storing the mapping; causing an interactive graphical display of a representation of the first set of objects as a central node and a representation of the second set of objects as a set of topic nodes surrounding the central node or surrounded by the central node, the displayed representations constituting a sieve diagram; and using the mapping to cause display in the sieve diagram of a set of relationships between the first set of objects and the second set of objects by providing a graphical linkage directly or indirectly between the central node and each node that corresponds to a member of the second set of objects, wherein the graphical linkage has a feature that graphically indicates a quantity associated with the mapping.
- the method further includes, in response to graphical selection, by a user, in the sieve diagram of a subset of topic nodes corresponding to a topic subset of the second set of objects, retrieving an object subset, of the first set of objects, that is mapped to the topic subset.
- the method further includes: receiving a search query; searching, in response to the received search query, the first set of objects, for a search results set of objects having a set of features matching a set of features of the search query; and displaying the search results set of objects.
- the graphical linkage is a distribution line.
- the method further includes, upon receiving a graphical selection by the user of a given one of the topic nodes in the sieve diagram, causing filtering of the first set of objects displayed to include only those objects corresponding to objects in the second set that are represented by the selected topic node.
- the method further includes, upon receiving a graphical selection by the user of one of the members of the second set in the sieve diagram, filtering the first set of objects displayed to include only objects in the first set that are mapped to the selected member of the second set.
- the method further includes, upon receiving a graphical selection by the user of a central node in the sieve diagram, removing any filtering caused by a graphical selection of a topic node or of a distribution line.
- the method includes, upon the graphical selection by the user of the given one of the topic nodes, displaying information pertinent to the selected topic node.
- the method further includes, upon the graphical selection by the user of the member of the second set, causing display of information pertinent to the selected member of the second set.
- the search query includes a Boolean expression.
- the method further includes evaluating the Boolean expression and performing the search using the evaluated expression.
- the search query is defined, at least in part, by a user selection of a set of graphical elements in the sieve diagram.
- the search query includes at least one Boolean operator defined by a user input.
- the method includes the step of displaying, upon receipt of a user command to shrink displayed membership of the first set, only objects having features matching features of the search query.
- the method further includes, upon invocation by a user of a topic map limiter, limiting display of objects in the second set.
- the topic map limiter invokes display of a tier in which the display of objects in the second set is limited.
- the topic map limiter allows selection by the user of which topics to eliminate from the sieve diagram.
- the topic map limiter allows selection by the user of which topics to include in the sieve diagram.
- the computer processes further include, upon a graphical selection by a user of a member of the second set of objects in the sieve diagram, causing display of a derivative sieve diagram in which the selected member is a derivative central node.
- the graphical linkage is a distribution line and the feature is a thickness of the distribution line.
- the distribution line has a secondary feature that indicates the quantity associated with the mapping under conditions wherein the thickness of the distribution line has been graphically constrained.
- the graphical linkage has a secondary feature that graphically indicates a quality associated with the mapping.
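As a rough illustration of how a distribution line might encode a quantity, the sketch below scales line thickness with the share of objects mapped to a topic. When the thickness would exceed an assumed maximum, it is clamped and a color intensity is raised as the secondary feature. The function name `line_style`, the linear scaling, and the pixel limits are illustrative assumptions and are not taken from the specification.

```python
def line_style(count, total, px_per_unit=200.0, max_px=40.0, min_px=1.0):
    """Return (thickness_px, intensity) for one distribution line.

    Hypothetical scheme: thickness grows linearly with the share of objects
    mapped to the topic. If it would exceed max_px, it is clamped and the
    overflow is expressed as a color intensity in [0, 1].
    """
    if count <= 0 or total <= 0:
        return 0.0, 0.0  # nothing mapped: draw no line
    raw = px_per_unit * count / total
    thickness = min(max(raw, min_px), max_px)
    # Secondary feature: used only when the thickness has been constrained.
    intensity = 0.0 if raw <= max_px else min(1.0, (raw - max_px) / raw)
    return thickness, intensity
```

Under these assumptions, a topic holding half of the objects is drawn at the clamped maximum width with a non-zero intensity, while a topic holding a tenth of the objects is drawn at its unclamped width with zero intensity.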
- the invention provides a computer-implemented method for interactive visualization of a set of objects in a hierarchical topic structure.
- the topic structure comprises a first set of topics and a second set of topics.
- the method implemented by computer processes including:
- FIG. 1 is a representation of a display, generated by a computer executing a program in accordance with an embodiment of the present invention, that graphically represents relationships between first and second sets of objects, using, for the first set, a subset of documents from the NCBI PubMed document corpus, in this case illustrating results from a query to a database accessed by the program (wherein the second set is a hierarchy of topic nodes describing the anatomy of the brain);
- FIG. 2 is a further representation of the display of the embodiment depicted in FIG. 1, in which the user has selected the topic node “Telencephalon” in the Fig. 1 display;
- FIG. 3 is a further representation of the display of the embodiment depicted in FIG. 2, in which the user has incremented the number of tiers shown, using 106 in the Fig. 2 display;
- FIG. 4 is a further representation of the display of the embodiment depicted in FIG. 3, in which the user has incremented the number of tiers of the topic nodes displayed in 101 using 106, and also utilized the system’s pan/zoom functionality to better visualize a portion of 101 that is of interest;
- FIG. 5 is a further representation of the display of the embodiment depicted in FIG. 4, in which the user has decremented the number of topic nodes shown in 101 by using 106, and has made new topic node selections of “Midbrain” and “Hindbrain”;
- FIG. 6 is a further representation of the display of the embodiment depicted in FIG. 5, where the user has pressed shift+enter on their keyboard in order to make the document set returned in Fig. 5 become represented by the central node 102;
- FIG. 7 is a further representation of the display of the embodiment depicted in FIG. 6, where the user has made an additional selection of “Forebrain” from Fig 6;
- FIG. 8 is a further representation of the display of the embodiment depicted in FIG. 7, where the user has removed the “Forebrain” selection from Fig 7 and updated the search query from “*” to “glia”;
- FIG. 9 depicts an embodiment of a system similar to Fig 1 - Fig 8, except with additional features to save queries to a project for later use;
- FIG. 10 is a further representation of the display of the embodiment depicted in FIG. 9, where the user is saving the active query to “Project 1”;
- FIG. 11 is a further representation of the display of the embodiment depicted in FIG. 10, where the user has changed the active query using previously depicted methods to retrieve a different set of documents from Fig. 10;
- FIG. 12 is a further representation of the display of the embodiment depicted in FIG. 11, where the user is saving the active query to a project named “Project1”;
- FIG. 13 is a further representation of the display of the embodiment depicted in FIG. 12, where the user is on a default, unmanipulated interface;
- FIG. 14 is a further representation of the display of the embodiment depicted in FIG. 13, where the user is using the “open project” feature to load the collection of documents the system derives from the queries saved to “Project1”;
- FIG. 15 depicts another embodiment of the present invention that features several distinct hierarchical topic maps that the user can choose to view individually or in combination with other hierarchical maps, and in this case the user has chosen to view the distribution of available information across maps of “human neuroanatomy” and “brain functions”;
- FIG. 16 is a further representation of the display of the embodiment depicted in FIG. 15, where the user has elected to view the distribution of available documents across maps of “human neuroanatomy”, “brain functions”, and “research methodologies”;
- FIG. 17 is a logical flow of an embodiment of the present invention, by which the user can utilize Boolean selection criterion for multiple topic node 103 selections to be logically “OR” separated;
- FIG. 18 is a logical flow that can be implemented in an embodiment of the invention and in combination with the logical flow depicted in FIG. 17 to enable additional Boolean selection criterion for multiple topic node 103 selections to be logically “AND” separated;
- FIG. 19 is a logical flow that can be implemented in an embodiment of the invention and in addition to the logical flows depicted in FIG.17 and FIG. 18 for opening a project with multiple saved queries in order to retrieve the cumulative set of documents represented by those queries, allowing users to separate multiple distinct queries with a logical “OR”;
- FIG. 20 is a block diagram showing, at a high-level, a suitable architecture of the infrastructure and modules necessary to set up and prepare the system and data objects for operation in one embodiment of the invention
- FIG. 21 depicts an example input and output of the data normalization process for a collection of document data objects from different sources and with different formatting, in accordance with an embodiment of the present invention
- FIG. 22 depicts an example input and output for the data normalization process, in accordance with an embodiment of the present invention, to produce a hierarchically related set of data objects that represent topics;
- FIG. 23 depicts an example of input and output data objects for the mapping module in an embodiment of the invention that will produce a mapping between a first set of document data objects 2103 of FIG. 21, and a second set of hierarchically related data objects representing topics 2202 of FIG. 22;
- FIG. 24 depicts system architecture, for an embodiment of the invention that is accessible via the internet, using a client-server architecture
- FIG. 25 is a block diagram, of an embodiment of the present invention, showing the modules utilized in implementation of the server application 2402 of FIG. 24, which communicates with the client application 2401, the document database 2008, and graph database 2005;
- FIG. 26 is a visual representation of example inputs and output data of an embodiment of the recommended “Magnitude-of-Mapping Retrieval Module” 2503 of FIG. 25, in accordance with an embodiment of the present invention
- FIG. 27 depicts abstract representations of example inputs and outputs of an embodiment of the “Data Consolidation Module” 2505 of FIG. 25, which combines the magnitudes of mappings output 2601 with the set of hierarchically related topics 2202;
- FIG. 28 is a diagram, in accordance with an embodiment of the present invention, that shows modules for use in implementation of the client application 2401 of FIG. 24;
- FIG. 29 depicts a first manner by which the consolidated data 2701 are rendered by the render module 2801 of Fig. 28 in an embodiment of the present invention;
- FIG. 30 depicts a second manner by which the consolidated data 2701 are rendered by the render module 2801 of FIG. 28 in an embodiment of the present invention;
- FIG. 31 depicts a third manner by which the consolidated data 2701 are rendered by the render module 2801 of FIG. 28 in an embodiment of the present invention;
- FIG. 32 depicts how the document data stored in the system is utilized to render search results in the embodiment of the invention depicted in FIG. 24;
- FIG. 33 depicts an example of a basic interaction that the Interaction Module 2802 of FIG. 28 can be programmed to register and process in an embodiment of the present invention
- FIG. 34 depicts an example of how the Query Generation Module 2803 of FIG. 28 might be implemented to interpret the application state depicted and convert it into a query in an embodiment of the present invention;
- FIG 35 is a further representation of the display associated with the embodying system depicted in FIG. 34, with a more complex example of how a query can be generated given a more complex selection state of the interface;
- FIG. 36 depicts a simplified example of an embodiment of the present invention that features a total of 6 neuroscience related documents and a simplified topic map, describing the human anatomy of the brain;
- FIG. 37 is a further representation of the display associated with the embodying system in FIG. 36, where the user has made a selection on the topic “Midbrain” in Fig 36;
- FIG. 38 is a further representation of the display associated with the embodying system in FIG. 37, where the user has selected “Forebrain” in Fig 37;
- FIG. 39 is a further representation of the display associated with the embodying system in FIG. 38, where the user has selected “Diencephalon” in Fig 38;
- FIG. 40 is a further representation of the display associated with the embodying system in FIG. 39, where the user has selected the topic “Thalamus” in Fig 39;
- FIG. 41 is a further representation of the display associated with the embodying system in FIG. 40, where the user has made selections on “Thalamus” and “Telencephalon”, and has used the pan/zoom feature to focus on the topic “Forebrain” and its transitive subtopics;
- FIG. 42 is an embodiment of the sieve diagram in which the central node is rendered in the periphery. A red rectangular box highlights an area that will be zoomed-into in Fig 43;
- FIG. 43 is a further representation of the embodiment of the sieve diagram depicted in Fig 42 in which the interface uses the pan/zoom feature to enlarge the area highlighted by the red rectangle in Fig 42. The red rectangular box highlights an area that will be enlarged in Fig 44;
- FIG. 44 is a further representation of the embodiment of the sieve diagram depicted in Fig 43 in which the interface has used the pan/zoom feature to enlarge the area highlighted by the red rectangle in Fig 43;
- FIG. 45 is a perspective rendering of a three-dimensional sieve diagram in another embodiment of the present invention in which the distribution of a user’s investments across a variety of funds is depicted.
- FIG. 46 is a perspective rendering of an embodiment of a circular sieve.
- FIG. 47 is a further representation of the embodiment of the circular sieve in FIG.
- FIG. 48 is a perspective rendering of an embodiment of a circular sieve in which neither the parent dataset nor the primary topic node is shown.
- FIG. 49 is a perspective rendering of the invention wherein a filter ring, representing a previously curated Boolean set of topics has been added around parent dataset.
- FIG. 50 is a perspective rendering of the invention wherein topics related to human anatomy are depicted around a central node.
- FIG. 51 is a perspective rendering of the invention shown in FIG 50 after a user has selected a topic.
- FIG. 52 is a further representation of the invention in FIG 50 after a user has selected topic Tissues.
- FIG. 53 is a further representation of the embodiment in FIG. 50 after a user has selected topic Cells.
- FIG. 54 is a perspective rendering of a sieve wherein a compressed distribution line is colored with a higher intensity.
- FIG. 55 is a perspective rendering of a sieve in which the relationships are depicted through a relationship arc.
- FIG. 56 is a further representation of the embodiment in FIG. 55 wherein the user has a cursor hovering over distribution line.
- FIG. 57 is a further representation of the embodiment in FIG. 55 wherein the user’s cursor is hovering over the relationship arc.
- FIG. 58 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the distribution line.
- FIG. 59 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the secondary relationship arc.
- FIG. 60 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the distribution line between topic C1 and topic C13.
- FIG. 61 is a perspective rendering of an embodiment of the invention wherein the distribution lines only connect to the child topic nodes.
- FIG. 62 is a perspective rendering of an embodiment of a sieve from FIG. 61 where the central filling is depicted as a distribution line.
- FIG. 63 is a perspective rendering of an embodiment of a sieve diagram represented as a webbed map.
- FIG. 64 is a perspective rendering of a sieve wherein the distribution lines show the hierarchical pathing between data objects and their children.
- FIG. 65 is a perspective rendering of a sieve wherein the central node is a topic node in the hierarchical set of nodes.
- FIG. 66 is a perspective rendering of a sieve wherein no default search results are displayed.
- FIG. 67 is a further representation of the embodiment of FIG. 66 after the user has selected Topic C3 and documents are displayed.
- FIG. 68 is a further representation of the embodiment of FIG. 67 after the user has chosen to view related documents.
- FIG. 69 is a perspective rendering of an embodiment of the invention where a sieve depicts a user’s investment portfolio.
- FIG. 70 is a further representation of the embodiment in FIG. 69 after the user has selected the stock Netflix.
- FIG. 71 is a further representation of the embodiment of FIG. 50 wherein a user has selected the topic Animal Structures and entered a query pig.
- FIG. 72 is a top down rendering of an embodiment of the invention where the sieve diagram has been rendered in three dimensions.
- FIG. 73 is a perspective view of the embodiment of FIG. 72 wherein the sieve diagram has been rendered in three dimensions and the distribution lines flow from the top down.
- a “set” includes at least one member.
- a “computer process” is the performance of a described function in a computer using computer hardware (such as a processor, field-programmable gate array or other electronic combinatorial logic, or similar device), which may be operating under control of software or firmware or a combination of any of these or operating outside control of any of the foregoing. All or part of the described function may be performed by active or passive electronic components, such as transistors or resistors.
- in using the term “computer process” we do not necessarily require a schedulable entity, or operation of a computer program or a part thereof, although, in some embodiments, a computer process may be implemented by such a schedulable entity, or operation of a computer program or a part thereof.
- a “process” may be implemented using more than one processor or more than one (single- or multi-processor) computer.
- An “object” is a machine-readable encoding of a data item that can be processed and utilized by a computer programming language, operating on a computing device.
- a “set X” is a set of data objects that can be mapped against a hierarchical set of topic nodes, set Y, based on any criteria, including but not limited to input or output source(s), meta-tags or meta-information about the data objects, or the content of the data objects themselves, and for which those mappings can be quantified by some measurable magnitude for each member of the Set Y that Set X can be mapped against.
- a “set Y” is a set of data objects with a defined hierarchical or heterarchical pattern of relationships between each other.
- a “central node” is a display component, representing all members of the set X, and rendered either at the center of a sieve diagram or at the periphery of the sieve diagram. Distribution lines, wherein each distribution line corresponds to a subset of set X, emanate from the central node to connect to topic nodes.
- the central node represents a master document set that is subject to filtering by means of (i) graphically imposed constraints via the sieve diagram or (ii) search terms entered into a query box or (iii) other constraints, such as date range, specified as search facets.
- the master document set is all the documents mapped to a particular member of set Y.
- the central node is such particular member of set Y.
- a central node may be represented as a circle, an arcuate slice, a square, an image, or another suitable visual element or combination of visual elements.
- a “topic node” is a display component representing a member of the set Y.
- a topic node may be represented as a circle, an arcuate slice, a square, an image, or another suitable visual element or combination of visual elements.
- hierarchical topic nodes are rendered radially outward, away from the central node, so that the topmost node in the hierarchy is closest to the central node and the bottom-most nodes of the hierarchy are farthest from the central node.
- graphically selecting a given topic node, associated with a class of members in set Y, will cause filtering of the set X (displayed at the central node) to include only members of set X that correspond to the selected topic node of set Y.
- a “distribution line” is a visual representation portraying that a set of objects mapped to a member of set Y are also mapped to a child of such member of set Y.
- the width of the distribution line, or some other characteristic of the line is utilized to represent a number of the mappings between set X and members of set Y represented by the connected topic node.
- a distribution line is a line that connects a central node and a topic node.
- a line is rendered “indirectly” to a node corresponding to a given member of the second set when the line extends from the central node to a hierarchical parent of the given node and a further line extends from the hierarchical parent to the particular node.
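The distinction between direct and indirect linkage can be sketched as a walk up the topic hierarchy. The example below is a minimal illustration only: the `PARENT` dictionary, the `"CENTRAL"` label, and the function names are hypothetical, and the small brain-anatomy hierarchy is an assumed example using topic names that appear in the figures.

```python
# Hypothetical hierarchy: topic -> parent topic (None for a top-tier topic).
PARENT = {
    "Forebrain": None,
    "Diencephalon": "Forebrain",
    "Thalamus": "Diencephalon",
    "Midbrain": None,
}

def linkage_path(topic, parent, central="CENTRAL"):
    """Return the node sequence a line follows from the central node to topic."""
    path = [topic]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])  # climb to the hierarchical parent
    path.append(central)
    path.reverse()
    return path

def line_segments(topic, parent):
    """Split the path into individual lines (one per parent-child hop)."""
    path = linkage_path(topic, parent)
    return list(zip(path, path[1:]))

def is_indirect(topic, parent):
    """A topic is linked indirectly when it has a hierarchical parent."""
    return parent.get(topic) is not None
```

In this sketch a top-tier topic such as "Midbrain" yields a single segment from the central node, while "Thalamus" is reached through "Forebrain" and "Diencephalon".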
- a “sieve diagram” is a display comprising (i) a set of central nodes associated with a first set of objects, (ii) a set of topic nodes that are included in a second set of objects, wherein the first set of objects has been mapped to the second set of objects, and (iii) a set of distribution lines, each of which connects one of the topic nodes to one of the central nodes.
- Each sieve diagram has its own set of central nodes.
- a “topic map selector” is an interface component that allows the user to select which topic hierarchies to display in the sieve when the system is implemented so that set Y is composed of multiple independent hierarchical topic sets.
- a “topic map limiter” is any interface component or functionality that lets the user limit how much of a topic map hierarchy or heterarchy is to be displayed.
- Results show either a subset or entirety of the results returned by either a traditional search-engine type search or a sieve filtration. If the embodiment implemented uses something other than a document set for set X, then results should be relevant to the set X used in the embodiment.
- a “derivative sieve diagram” is a new sieve diagram, based on a previously established sieve diagram, wherein a user selection has been made of a member of the second set of objects in the previously established sieve diagram in which the selected member is a derivative central node.
- a “derivative central node” of a derivative sieve diagram is a central node of the derivative sieve diagram.
- the visualizations depicted herein are original and improve upon cluster-wheel visualization.
- the embodiments described herein include novel functionality that can be implemented only using methods that leverage modern computer and information technology systems.
- FIG. 1 is a representation of a display, generated by a computer executing a program in accordance with an embodiment of the present invention, that graphically represents relationships between first and second sets of objects, using, for the first set, a subset of documents from the NCBI PubMed document corpus, in this case illustrating results from a query to a database accessed by the program.
- the second set is a hierarchy of topic nodes describing the anatomy of the brain.
- the Sieve interface component 101 is composed of a central node 102, a set of topic nodes 103 like “Forebrain” and “Diencephalon”, and a set of distribution lines 104 that emanate from the central node and connect to the topic nodes either directly or transitively through hierarchical parent topic nodes.
- the central node 102 represents a master set of documents returned from the user-specified query entered using the search bar 105. Since the user-specified query is a “*”, which this particular system interprets as a wildcard meaning “return all documents”, all 78,908 documents in the system are returned. Consequently, the central node 102 indicates that the master set of documents in response to the user-specified query is all 78,908 documents.
- the distribution lines 104 emanating from the central node 102 and connecting to the various topic nodes 103 illustrate, by the width of each distribution line 104, the number of documents returned from the user-specified query in the search bar 105 that are also related to the topic node 103 to which the line connects.
- a thicker distribution line 104 means more documents in the system are related to a topic and a thinner distribution line 104 means fewer documents are thus related.
- No distribution line connecting to a topic node 103 means that there are no documents returned from the user-specified query in the search bar 105 that are related to the topic represented by the topic node 103.
- the distribution lines 104 indicate that there are documents related either directly to the topic represented by the connected topic node 103 or to a subtopic of that topic.
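One way to obtain the per-topic quantities that the distribution lines depict is to count each document once for every topic it is tagged with and once for each ancestor of that topic. The sketch below assumes hypothetical inputs (`doc_topics`, `parent`) and a hypothetical function name; the specification does not state how its "Magnitude-of-Mapping Retrieval Module" computes these values.

```python
from collections import defaultdict

def rollup_counts(doc_topics, parent):
    """Count documents per topic, including documents tagged with subtopics.

    doc_topics: doc id -> set of topics the document is directly tagged with
    parent:     topic -> parent topic, or None for a top-tier topic
    """
    docs_by_topic = defaultdict(set)
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            node = topic
            while node is not None:  # credit the topic and every ancestor
                docs_by_topic[node].add(doc_id)
                node = parent.get(node)
    return {topic: len(ids) for topic, ids in docs_by_topic.items()}

# Hypothetical example data.
parent = {
    "Forebrain": None,
    "Diencephalon": "Forebrain",
    "Telencephalon": "Forebrain",
    "Thalamus": "Diencephalon",
    "Midbrain": None,
    "Hindbrain": None,
}
doc_topics = {
    1: {"Thalamus"},
    2: {"Forebrain"},
    3: {"Midbrain"},
    4: {"Thalamus", "Telencephalon"},
}
counts = rollup_counts(doc_topics, parent)
```

Using sets rather than counters keeps a document tagged with two subtopics of the same parent from being counted twice at the parent. A topic that is absent from `counts` (here "Hindbrain") would receive no distribution line.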
- the topic map limiter 106 in this embodiment consists simply of an increment and decrement functionality that allows users to determine how many tiers of the hierarchy to display on screen.
- the results section 107 includes a results summary 108 and a list of search results composed of the individual documents 109 that match the total user query, that is, the user input in the search bar 105 combined with the query state of the sieve 101.
- Fig 1 represents a default, unmanipulated state of the computer-generated display.
- the display of FIG. 1 is a computer user interface generated by the computer program.
- the computer user interface presents the sieve interface component 101, to graphically represent relationships between the first set of objects (subset of documents from the NCBI PubMed document corpus) and the second set of objects (topic nodes describing the anatomy of the brain), the search components 105, 108, 110, 120, and the topic map limiter 106 to the user via a computing device.
- FIG. 2 is a further representation of the display of the embodiment depicted in FIG. 1, in which the user has selected the topic node 103 representing the topic “Telencephalon” in the Fig. 1 display.
- the combined query defined by the sieve 101 and the query entered in the search bar 105, is updated as shown in the results description 108.
- the documents returned in the search results 107 are now all documents returned from the “*” search query (all documents in the system) that are also tagged with “Telencephalon”: 1,533 of the total 78,908 documents in the system.
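The combination of the search-bar query with a topic selection in the sieve can be illustrated with a small filter. This is a sketch under stated assumptions: the document records, the case-insensitive substring match, and the function names are hypothetical. Only the treatment of “*” as a wildcard that returns every document follows the description.

```python
def matches_text(doc, query):
    """Treat "*" as a wildcard; otherwise use a naive substring match (assumption)."""
    return query == "*" or query.lower() in doc["text"].lower()

def combined_results(docs, query, selected_topic=None):
    """Return ids of documents that satisfy both the text query and the topic selection."""
    hits = []
    for doc in docs:
        in_topic = selected_topic is None or selected_topic in doc["topics"]
        if matches_text(doc, query) and in_topic:
            hits.append(doc["id"])
    return hits

# Hypothetical example documents.
docs = [
    {"id": 1, "text": "Glia in the cortex", "topics": {"Telencephalon"}},
    {"id": 2, "text": "Thalamic relay neurons", "topics": {"Diencephalon"}},
    {"id": 3, "text": "Cortical glia signaling", "topics": {"Telencephalon"}},
]
```

With no topic selected, the wildcard query returns every document; selecting “Telencephalon” narrows the same query to the documents tagged with that topic.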
- FIG. 3 is a further representation of the display of the embodiment depicted in FIG.
- the topic map limiter 106 includes only two buttons that trigger updates to the sieve 101, in order to show greater or fewer tiers of the hierarchical topic map displayed in the sieve 101.
- previously, the topic map limiter was set to “Tier: 2” and the sieve 101 showed only the top two tiers of the topic map of neuroanatomy.
- the user has set the topic map limiter 106 in Fig. 3 to “Tier: 3” and so the sieve 101 shows the top three tiers of the hierarchical topic map of neuroanatomy.
- FIG. 4 is a further representation of the display of the embodiment depicted in FIG. 3, in which the user has incremented the number of tiers of the topic nodes displayed in 101 using topic map limiter 106, and also utilized the system’s pan/zoom functionality to better visualize a portion of 101 that is of interest.
- previously, the topic map limiter was set to “Tier: 3” and the sieve 101 showed only the top three tiers of the topic map of neuroanatomy.
- the user has set the topic map limiter 106 in Fig. 4 to “Tier: 4,” and so the sieve 101 shows the top four tiers of the hierarchical topic map of neuroanatomy.
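The increment and decrement behavior of the topic map limiter can be sketched as a tier counter plus a depth filter over the topic hierarchy. The names `step_tier`, `depth`, and `visible_topics`, the clamping at tier 1, and the example hierarchy are assumptions made for illustration.

```python
# Hypothetical hierarchy: topic -> parent topic (None for a top-tier topic).
PARENT = {
    "Forebrain": None,
    "Diencephalon": "Forebrain",
    "Thalamus": "Diencephalon",
    "Midbrain": None,
}

def depth(topic, parent):
    """Tier of a topic: 1 for a top-tier topic, 2 for its children, and so on."""
    tier = 1
    while parent.get(topic) is not None:
        topic = parent[topic]
        tier += 1
    return tier

def visible_topics(parent, tier):
    """Topics that fall within the first `tier` tiers of the hierarchy."""
    return sorted(t for t in parent if depth(t, parent) <= tier)

def step_tier(tier, delta, max_tier):
    """Apply an increment (+1) or decrement (-1), clamped to [1, max_tier]."""
    return max(1, min(max_tier, tier + delta))
```

In this sketch, moving from tier 2 to tier 3 adds "Thalamus" to the displayed topics, which mirrors the change from “Tier: 2” to “Tier: 3” described above.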
- the pan functionality is utilized by clicking and dragging the sieve diagram and the zoom functionality is utilized by use of the mouse wheel, where scrolling it up zooms in and scrolling it down zooms out.
- FIG. 5 is a further representation of the display of the embodiment depicted in FIG. 4, in which the user has decremented the number of topic nodes in the sieve 101 by using the topic map limiter 106, and has selected a different set of topic nodes: “Midbrain” and “Hindbrain”.
- the combined query from the search bar 105 and the sieve 101 is now all documents returned from the “*” query (all documents) that also contain the topics or subtopics of either “Midbrain” or “Hindbrain”. This update to the query and description of the documents returned is reflected in the results description 108.
- FIG. 6 is a further representation of the display of the embodiment depicted in FIG. 5 and has been manipulated by the user so that the previous document selection represented in the results description from FIG. 5 is now represented by the central node 102 and the distribution lines 104 displayed now describe the distribution of documents from that subset across the hierarchical topic set describing the human brain.
- the central node 102 represented the full set of documents returned from the search query entered into the search bar 105
- this type of sieve 101 manipulation is triggered when the user presses the “shift” key and the “enter” key simultaneously.
- Although this type of “deep dive” feature is implemented in the system in this manner, it can be implemented in other ways as well. For example, it could have been implemented so that the central node 102 would represent the entirety of the last submitted query, including the user submitted query in the search bar 105, and not just the selected topic node portion of the previous query.
- FIG. 7 is a further representation of the display of the embodiment depicted in FIG. 6, and has an additional selection of “Forebrain”, making the active query all documents returned from the query in the search bar 105, “*” that are related to the Midbrain, Hindbrain, or any of their subtopics, and also the “Forebrain”, or any of its subtopics.
- this can be better represented as: (documents returned from “*” query) AND (documents tagged with “Midbrain” OR “Hindbrain”) AND (documents tagged with “Forebrain”). Results of this updated query are reflected in the results description 108.
- FIG. 8 is a further representation of the display of the embodiment depicted in FIG. 7, with the selection of “Forebrain” from FIG. 7 removed by clicking the central node 102, and the search query in the search bar 105 updated to “glia”. Therefore, the document set retrieved from the system and displayed in the results section 107 in Fig. 8 is all documents that mention “glia” that also mention “Midbrain” or “Hindbrain”, or any of their subtopics. This is reflected in the results description 108.
- FIG. 9 is a further representation of the display of the embodiment depicted in FIG. 8, with additional features to save new queries for later use. Specifically, it features additional “Open Project” 901 and a “Save Query” 902 functionalities. Saving queries to a project allows users to combine search results from several different queries in order to create a larger collection of documents that they can then sift through with the sieve and any other filters / search features that an embodiment of the invention is implemented with. It is also one possible approach, though not the only approach, for enabling multiple queries to be combined with a Boolean logical “OR”.
- FIG. 10 is a further representation of the display of the embodiment depicted in FIG. 9, where the user is saving the active query to “Project1”, using the dialog box 1001 that appeared after the user clicked “Save Query” 902 in Fig. 9.
- FIG. 11 is a further representation of the display of the embodiment depicted in FIG. 10, where the user has changed the active query on the sieve 101 to retrieve a different set of documents.
- FIG. 12 is a further representation of the display of the embodiment depicted in FIG. 11, where the user is saving the active query to a project named “Project1” using the dialog box 1001 that appeared as a result of the user clicking “Save Query” 902 in Fig. 11.
- FIG. 13 is a further representation of the display of the embodiment depicted in FIG. 12, where the user is now on an unmanipulated, default interface. The user achieved this view into the data by selecting the central node 102 and removing the existing selections.
- FIG. 14 is a further representation of the display of the embodiment depicted in FIG. 13, where the user clicked “Open Project” 901 in Fig. 13, causing the project dialog box 1401 to appear.
- the system will load the set of documents created by combining the separate document sets from each of the subqueries saved to “Project1” in Fig. 9 - Fig. 10 with Boolean logic “OR”. That is, it combines all of the document sets from each saved query and removes duplicates.
- This new document set will be then represented by the central node 102, and the distribution line 104 that is displayed will represent the distribution of search results from that new document set across the topics represented by the topic nodes 103.
- FIG. 15 depicts an embodiment of the present invention that features several distinct hierarchical topic maps that the user can choose to view individually or in combination with other hierarchical maps.
- the user can double-click a distribution line 104 or topic node 103 to view the body of related documents on a different page.
- the user has selected to view the distribution of available documents across maps of “human neuroanatomy” 1502 and “brain functions” 1503, using the topic map selector 1501. These maps are displayed around the central node 102 as visually separated groups of topic nodes 103, using space and color for distinction.
- FIG. 16 is a further representation of the display of the embodiment depicted in FIG. 15, where the user has elected to view the distribution of available documents across three topic maps simultaneously, by clicking the “Methodology” button 1601 in the topic map selector 1501 in Fig. 15, causing the display of the group of topic nodes in orange 1602 that are depicted at the bottom of the sieve 101 in the interface.
- FIG. 17 is a logical flow of an embodiment of the present invention, by which the user can utilize Boolean selection criteria to enable multiple topic node 103, of FIG. 1, selections to be logically “OR” separated.
- This logical flow is not the only way this interaction can be implemented, but merely represents the implementation used in the embodiment of the invention depicted in FIG. 1 - FIG. 14.
- In process step 1701, the system is fully loaded and ready for the user to interact with.
- Process step 1702 is triggered when a user selects a topic node 103 or a distribution line 104 in the sieve 101, as in FIG. 2. After the selection is registered in step 1702, the system checks if the user was holding down the “shift” key on the keyboard controlling the system in process step 1703.
- If the “shift” key was held, step 1705 adds the selected topic node 103 to the list of existing selected topic nodes 103, if any, as in FIG. 5. If the “shift” key was not held, the system proceeds to step 1704 and removes any existing topic node 103 selections before making the newly selected topic node 103 the only active topic node 103 selection. After either step 1704 or 1705 completes, the system returns to step 1701, where it is ready to record any further interactions.
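The branch between steps 1704 and 1705 can be sketched as a small pure function. This is a minimal sketch under stated assumptions rather than the embodiment's code: the function name `register_selection` is hypothetical, and leaving the list unchanged when an already-selected topic is shift-clicked again is an assumption, since the flow described above does not cover that case.

```python
from typing import List


def register_selection(selected: List[str], topic: str, shift_held: bool) -> List[str]:
    """Return the new list of selected topic nodes after a click (steps 1703-1705)."""
    if shift_held:
        # Step 1705: add the topic to the existing selections (logical "OR").
        # Assumption: re-selecting an already selected topic changes nothing.
        return selected if topic in selected else selected + [topic]
    # Step 1704: discard existing selections; the new topic is the only selection.
    return [topic]
```

For example, clicking “Midbrain” and then shift-clicking “Hindbrain” yields both topics as the active selection, while a later plain click on “Forebrain” replaces them.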
- FIG. 18 is a logical flow diagram that can be implemented with the logical flow depicted in FIG. 17 for an embodiment of the present invention wherein multiple topic node 103 selections can be logically “AND” separated.
- This logical flow is not the only way this interaction can be implemented, but merely represents the implementation used in the embodiment of the invention depicted in Fig. 1 - Fig. 14.
- In process step 1801, the system is ready for interaction and the user has already interacted with the system to register one or more topic node 103 selections.
- Process 1802 takes the currently selected body of documents, makes the central node 102 represent these documents, and clears any of the topic node 103 selections.
- the system then proceeds to process 1803 to retrieve magnitude mappings 2701 for each topic node 103 displayed that has mappings with documents in the selected document set.
- the system then proceeds to process 1804 and uses this new data about magnitude mappings to display new distribution lines 104 to represent the distribution of documents within that document set across the displayed topic nodes 103.
- Process 1804 is completed when the system has re-rendered the interface and the system is ready to receive additional user-interactions. An example of the results of this interaction can be seen in FIG.
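Processes 1802 and 1803 can be sketched as follows. This is an illustrative sketch only: the in-memory `docs` dictionary (document id to set of mapped topics) stands in for the databases, the function name `deep_dive` is hypothetical, and each document's topic set is assumed to already include any parent topics.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def deep_dive(docs: Dict[str, Set[str]], selected: List[str]) -> Tuple[Set[str], Counter]:
    """Promote the selected documents to the central node and recount mappings."""
    wanted = set(selected)
    # Process 1802: documents mapped to any selected topic become the new
    # central document set; the caller then clears the topic selections.
    central = {doc_id for doc_id, topics in docs.items() if topics & wanted}
    # Process 1803: magnitude of mapping for each topic within the new set.
    magnitudes = Counter(t for doc_id in central for t in docs[doc_id])
    return central, magnitudes
```

A subsequent topic selection is evaluated against `central` only, which is what makes the successive selections behave as a logical “AND”.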
- FIG. 19 is a logical flow diagram that can be implemented in addition to the logical flows depicted in FIG. 17 and FIG. 18, for an embodiment that enables opening a project with multiple saved queries in order to retrieve the cumulative set of documents represented by those queries, allowing users to combine the document sets retrieved from multiple distinct queries with a logical “OR”.
- An example of a corresponding interface display for this logical flow can be seen in FIG. 9 - FIG. 14.
- the system receives, processes, and renders the response from a user specified query.
- Process 1902 is triggered once a user indicates that they wish to add the currently displayed query to a project. Since multiple queries can be added to a project, processes 1901 and 1902 can be repeated as many times as desired.
- Process 1903 is triggered when the user indicates a desire to open an existing project.
- Process 1903 in turn triggers process 1904, where a collection of documents is retrieved such that a document is included in the collection if it matches at least one of the queries that were added to the project. At this point any duplicate documents can be removed.
- the system also retrieves the magnitudes of mappings 2701. Once the requisite data is retrieved, the system proceeds to process 1905, where the interface is re-rendered with the new document set and magnitude data, used for the distribution lines 104.
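The combination of saved queries in process 1904 can be sketched with set union, which removes duplicates as a side effect. The `Project` class and its method names are hypothetical, and storing each query's resulting document ids (rather than the query itself, to be re-run on opening) is a simplifying assumption made for this example.

```python
from typing import Iterable, List, Set


class Project:
    """Hypothetical container for the queries saved to a project (process 1902)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.saved_results: List[Set[str]] = []

    def save_query(self, doc_ids: Iterable[str]) -> None:
        # Process 1902: record the document ids returned by the active query.
        self.saved_results.append(set(doc_ids))

    def open(self) -> Set[str]:
        # Process 1904: a document is included if it matched at least one
        # saved query (logical "OR"); the set type drops duplicates.
        return set().union(*self.saved_results)
```

The resulting set would then be handed to the magnitude retrieval and rendering steps in the same way as the result of any single query.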
- FIG. 20 is a block diagram showing, at a high-level, a suitable architecture of the infrastructure and modules necessary for a particular embodiment of the invention to set up and prepare the system and data objects for operation.
- Block 2001 represents the document data objects of the set X, which may come from any variety of sources.
- the purpose of the Data cleansing and normalization module 2003 for set X is to ensure all data has the consistent and desired formatting necessary for the embodying system to operate with it.
- the data is stored for later use, such as by storing it in a document database 2004 of some kind, such as SQL.
- Graph data set 2002 is a raw form of the set Y, related data objects.
- the purpose of the Data Cleansing and Normalization Module 2006 for Set Y is to prepare the set of related objects for use in accordance with operation of the system.
- the exact implementation and functions of this module will vary depending on the data 2002 that is provided as an input. However, the output should be hierarchical or heterarchical data with no loops.
- the module can be either an automated process, manual process, or a combination of both as is necessary to normalize the data inputs.
- the hierarchically related data of set Y 2002, output from the normalization module 2006 is stored for later use. Any suitable data storage mechanism will suffice, though the embodiment depicted utilizes a Neo4j graph database store 2005.
- the mapping module 2007 takes these data sets as inputs for its processes that determine which objects in set X should have mappings to specific objects in the set Y.
- the criteria for determining whether a mapping should be made are arbitrary and will vary depending on the system’s needs.
- a mapping is generated for a document in the set X if a topic from the set Y is mentioned in a document in the set X.
- Once the mappings are generated, they are stored for later access. Any suitable mechanism of data storage and access is sufficient, such as, for example, the Elasticsearch database 2008.
- the documents from the set X are stored in the database 2008 and indexed with their mappings, which in the embodied system are topics from the set Y that the documents were found to be related to in the mapping module.
- FIG. 21 depicts an example input and output of the data normalization process for a collection of document data objects from different sources and with different formatting, in accordance with the embodiment of the present invention in FIG. 20.
- Headers 2101 and 2102 are tabularized headers of sample XML-formatted document-based data from two different sources and with two different kinds of formatting.
- the embodied system in this example utilizes data from both sources, and so the data must be normalized so that both have consistent headers for later access and querying. If a system only uses information from a single source and reformatting of the data is not necessary, this module may be omitted.
- FIG. 22 depicts an example input and output for the data normalization process, in accordance with the embodiment of the present invention in FIG.
- Item 2201 is a connected graph data object representing a variety of relationships between the topics contained within the graph.
- Embodiments of the present invention are well suited for use with hierarchical or heterarchical related data objects.
- the purpose of this module is to ensure that the data consumed by the system is either hierarchically or heterarchically related, and has no direct or transitive loops between data objects. If any of these circumstances is determined to be present, automated and/or manual methods can be used to address them, as suited to the needs and capabilities of the embodying system and environment. If an embodiment of the invention does not utilize graph data inputs, and instead generates hierarchies from scratch, then this module is not necessary.
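One automated check that such a module might perform is detecting direct or transitive loops with a depth-first search. The sketch below is an assumption about how this could be done, not a description of the embodiment: the function name `has_loop` and the parent-to-children dictionary representation are hypothetical.

```python
from typing import Dict, List


def has_loop(graph: Dict[str, List[str]]) -> bool:
    """Return True if the directed parent -> children graph contains a cycle."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}

    def visit(node: str) -> bool:
        state[node] = IN_PROGRESS
        for child in graph.get(node, []):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                return True  # reached a node that is still on the current path
            if child_state == UNSEEN and visit(child):
                return True
        state[node] = DONE
        return False

    return any(state.get(n, UNSEEN) == UNSEEN and visit(n) for n in list(graph))
```

A heterarchy in which one topic has two parents is accepted by this check, since a shared child is not a loop; only a path that returns to one of its own ancestors is reported.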
- FIG. 23 depicts an example of input and output data objects for the mapping module 2007 in FIG. 20, from an embodiment of the invention that will produce a mapping between a first set of document data objects 2103 of FIG. 21, and a second set of hierarchically related data objects representing topics 2202 of FIG. 22.
- the criteria by which a mapping is made between members of the first and second sets of objects 2103 and 2202 are arbitrary but should be consistent and quantifiable.
- the embodiment of FIG. 23 creates a mapping for a document in 2103 if a topic in 2202 is mentioned in the title or text of a document.
- The output of the mapping module 2007 is represented in the tabular data representation 2301, and depicts how the documents of 2103, stored in the SQL database 2004, are updated with a new column, “Mappings”, which contains all the topics from 2202 that were found to be mapped to the documents. Note that 2301 still retains additional fields like “Text” in 2103, but they are simply not depicted in the figure. In the embodiment depicted, the updated Set X 2301 is then stored in a search database like Elasticsearch, though any other sufficiently query-able system is suitable as well. The quantifiable metric in this embodiment will be the number of documents each topic is mentioned in.
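The mention-based mapping criterion can be sketched as a substring test over each document's title and text. This is a simplified sketch: the field names “Title” and “Text”, the function name `map_documents`, and the use of case-insensitive substring matching are assumptions, and a real system would likely need more careful term matching.

```python
from typing import Dict, List


def map_documents(documents: List[Dict[str, str]], topics: List[str]) -> List[Dict[str, object]]:
    """Return copies of the documents with an added "Mappings" field."""
    mapped = []
    for doc in documents:
        # Search the title and the body text together, ignoring case.
        haystack = f"{doc.get('Title', '')} {doc.get('Text', '')}".lower()
        mappings = [t for t in topics if t.lower() in haystack]
        mapped.append({**doc, "Mappings": mappings})
    return mapped
```

Each returned record keeps its original fields and gains the list of topics it mentions, mirroring the added “Mappings” column.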
- FIG. 24 depicts system architecture for an embodiment of the invention that is accessible via the internet, using a client-server architecture. Any suitable architecture that enables the prepared data to be effectively queried and processed to perform the functions and capabilities outlined in this patent is suitable.
- the client web application 2401 is running in a browser on a user’s computer.
- the web application 2401 is in communication over the internet with web server 2402, which may be hosted privately, locally, in the cloud, or any other suitable hosting mechanism that enables communication via networks such as the internet.
- the web server 2402 in turn is coupled to document database 2008, containing documents with mappings 2301, and graph database 2005, which contains the hierarchically or heterarchically related data objects 2202 utilized by the system.
- the Request Routing Modules 2501 are responsible for receiving requests from the client application, orchestrating their processing, and sending responses.
- the Document Retrieval Module 2502 and Graph Retrieval Module 2504 are responsible for communicating with the databases to get the data objects necessary to fulfill the client request.
- the Magnitude-of-Mapping Retrieval Module 2503 is responsible for quantifying the mappings that are found in the retrieved documents and providing a magnitude for each type of mapping, if the capability is not included in the database itself.
- the Data Consolidation Module 2505 is then responsible for integrating the data retrieved from the variety of sources so that the Request Routing Module 2501 can respond with the appropriate data payload.
- FIG. 26 is a visual representation of example inputs and output data of an embodiment of the recommended “Magnitude-of-Mapping Retrieval Module” 2503 of FIG. 25.
- the mappings must be quantified for each topic in the system that had a mapping determination made against the set X.
- An example output is depicted in 2601, where for each possible mapping type, or topic, the number of times the mapping appeared in the result set is totaled. This is not the only way to quantify the mappings, but simply the approach implemented in this particular embodiment of the invention. Any sensible quantification that is suitable to the aims of the embodying system may be implemented.
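The totaling described for output 2601 can be sketched with a counter over the “Mappings” field of each retrieved document. The function name `magnitudes_of_mappings` and the record layout are assumptions for illustration; counting each topic at most once per document is also an assumption, chosen so that the magnitude equals the number of documents.

```python
from collections import Counter
from typing import Dict, Iterable, List


def magnitudes_of_mappings(result_set: Iterable[Dict[str, List[str]]]) -> Counter:
    """Count, for each topic, how many documents in the result set map to it."""
    counts: Counter = Counter()
    for doc in result_set:
        # A set ensures a topic is counted once per document.
        counts.update(set(doc.get("Mappings", [])))
    return counts
```

Topics with no mapped documents are simply absent from the counter, and looking them up returns zero.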
- FIG. 27 depicts abstract representations of example inputs and output of an embodiment of the “Data Consolidation Module” 2505 of FIG. 25, which combines the magnitudes of mappings output 2601 with the set of hierarchically related topics 2202.
- the data need not be integrated in this way, but rather this module is presented and visualized in this manner to better help the reader understand how the system data is combined for later use in rendering.
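One possible way to combine the two inputs is to attach each topic's magnitude to its node in a nested tree. The sketch below is only one option among many: the function name `consolidate`, the dictionary keys, and the parent-to-children representation of the hierarchy are assumptions made for this example.

```python
from typing import Dict, List


def consolidate(hierarchy: Dict[str, List[str]], magnitudes: Dict[str, int], root: str) -> dict:
    """Build a nested tree in which every topic carries its mapping magnitude."""
    return {
        "topic": root,
        # Topics without any mapped documents get a magnitude of zero.
        "magnitude": magnitudes.get(root, 0),
        "children": [consolidate(hierarchy, magnitudes, c) for c in hierarchy.get(root, [])],
    }
```

A structure of this shape can be serialized directly into the response payload and walked by a client when drawing topic nodes and distribution lines.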
- FIG. 28 is a diagram, in accordance with an embodiment of the present invention, which shows modules for use in implementation of the client application 2401 of FIG. 24.
- the Render Module 2801 utilizes system data to build the display that the user sees.
- the Interaction Module 2802 integrates with the rendered display and programs interactive capability so the user can interact with the system.
- the Query Generation Module 2803 maintains and tracks data that is necessary for query generation and passes it to the Request Module 2804, which maintains communication with the application server 2402, when requests are made for new query responses.
- FIG. 29 shows an embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of Fig. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103.
- the number of documents mapped to each topic in 2701 is used to determine the width of the distribution lines 104.
- the central node 102 represents all documents returned from a submitted query, and one or more higher-tier topic nodes in 2701 are excluded from the rendering.
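The relationship between document counts and line width can be sketched as a linear scaling against the largest count. The description does not specify a scaling rule, so the linear rule, the function name `line_widths`, and the `max_width` and `min_width` parameters are all assumptions.

```python
from typing import Dict


def line_widths(magnitudes: Dict[str, int], max_width: float = 20.0,
                min_width: float = 1.0) -> Dict[str, float]:
    """Scale each topic's document count to a distribution line width."""
    peak = max(magnitudes.values(), default=0)
    if peak == 0:
        return {topic: 0.0 for topic in magnitudes}
    return {
        # Topics with no documents get no line; others get at least min_width.
        topic: 0.0 if n == 0 else max(min_width, max_width * n / peak)
        for topic, n in magnitudes.items()
    }
```

The minimum width keeps sparsely mapped topics visible and clickable, at the cost of slightly overstating their share.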
- FIG. 30 depicts a second embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of FIG. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103. Again, the number of documents mapped to each topic in 2701 is used to determine the width of the distribution lines 104 rendered.
- the central node 102 in this embodiment represents a topic node from the topic hierarchy in 2701.
- the master set of documents is the set of documents mapped to “Brain.” No node is rendered to represent the full set of documents returned from an active query.
- another mechanism is provided to select the full document set via the orphaned node 3001.
- FIG. 31 depicts a third embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of FIG. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103. Again, the number of documents mapped to each topic in the abstract representation of the data on the left side of the figure is used to determine the width of the distribution lines 104 rendered.
- the central node 102 represents the set of all documents returned from a query, and the root topic in the abstract data representation of the hierarchically related topics is also rendered.
- FIG. 32 depicts an embodiment of a rendering by the Render Module 2801 of FIG. 28 wherein document data 2301 of FIG. 23, retrieved in response to a user query and stored in the system, is utilized to render search results 107.
- FIG. 33 depicts an example of a basic interaction that the interaction module 2802 of FIG. 28 can be encoded to account for.
- clicking a topic node 103 or a distribution line 104 connecting to a topic node 103 triggers the selection of the topic node 103.
- Selection of a topic node 103 triggers the generation of a new query and retrieving of the subset of results that correspond to the topic node 103 selected.
- See Fig. 17 - Fig. 19 for more sophisticated explanations of suggested interactions.
- FIG. 34 depicts an example of how the Query Generation Module 2803 might be implemented to interpret the application state depicted and convert it into a query.
- the search bar 105 is not required, and the Sieve can be combined with any other search facets desired as well.
- a user query 3401 is entered in the search bar 105, and a selection 3402 has been made on “Topic A”.
- the system retrieves all documents that matched the search query that also are mapped to “Topic A”.
- any document mapped to a subtopic is also considered to be mapped to the parent topic. Therefore, a selection of “Topic A” also retrieves documents that are mapped with “Topic G”, “Topic F”, “Topic E”, “Topic I”, and “Topic H”.
- the combined query is described in 3403.
- FIG. 35 is a further representation of the display associated with the embodying system depicted in FIG. 34, with a more complex example of how a query can be generated given a more complex selection state of the interface.
- this FIG. 35 depicts a system state where a multi-selection 3501 has been made on “Topic I” and “Topic F”. 3502 contains a more sophisticated description of the combined query.
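The conversion of a search bar query plus topic selections into a combined query can be sketched as below. The query string syntax, the function names, and the example hierarchy are assumptions for illustration; the actual shape of the topic hierarchy in FIG. 34 is not reproduced here.

```python
from typing import Dict, List


def descendants(hierarchy: Dict[str, List[str]], topic: str) -> List[str]:
    """Return the topic followed by all of its direct and transitive subtopics."""
    found = [topic]
    for child in hierarchy.get(topic, []):
        found.extend(descendants(hierarchy, child))
    return found


def build_query(user_query: str, selections: List[str],
                hierarchy: Dict[str, List[str]]) -> str:
    """Combine the user query (AND) with the selected topics and subtopics (OR)."""
    if not selections:
        return f"({user_query})"
    clauses: List[str] = []
    for selected in selections:
        # A document mapped to a subtopic also counts as mapped to its parent,
        # so every selection is expanded to include its subtopics.
        for topic in descendants(hierarchy, selected):
            clause = f'(mapping: "{topic}")'
            if clause not in clauses:
                clauses.append(clause)
    return f"(({user_query}) AND ({' OR '.join(clauses)}))"
```

With a hypothetical hierarchy in which “Topic F” has the single child “Topic I”, selecting “Topic F” under the query `glia` produces `((glia) AND ((mapping: "Topic F") OR (mapping: "Topic I")))`.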
- FIG. 36 depicts a simplified example of an embodiment of the present invention, which features a total of 6 neuroscience related documents and a simplified topic map describing the human anatomy of the brain.
- a “*” query meaning return all documents, has been submitted and no selections have been made. Therefore, all documents are returned, regardless of their mappings.
- the query depicted can be stated as: (*).
- FIG. 37 is a further representation of the display associated with the embodying system in FIG. 36, but with a selection made on the topic “Midbrain”, indicating a query of all available documents that are also related to the topic “Midbrain”.
- the query can be stated as: ((*) ∧ (mapping: “Midbrain”)).
- FIG. 38 is a further representation of the display associated with the embodying system in FIG. 37, but with a selection made on the topic “Forebrain”, indicating a query of all available documents that are also related to the topic “Forebrain”, or any of its subtopics.
- the query can be stated as: ((*) ∧ (mapping: “Forebrain”)).
- FIG. 39 is a further representation of the display associated with the embodying system in FIG. 38, but with a selection made on the topic “Diencephalon”, indicating a query of all available documents that are also related to the topic “Diencephalon”, or any of its subtopics. In Boolean logic, the query can be stated as: ((*) ∧ (mapping: “Diencephalon”)).
- FIG. 40 is a further representation of the display associated with the embodying system in FIG. 39, but with a selection made on the topic “Thalamus”, indicating a query of all available documents that are also related to the topic “Thalamus”.
- the query can be stated as: ((*) ∧ (mapping: “Thalamus”)).
- the distribution line 104 from “Forebrain” to the left off the page need not be rendered, and thus the topic node 103 is the central node.
- FIG. 41 is a further representation of the display associated with the embodying system in FIG. 40, but with selections made on the topics “Thalamus” and “Telencephalon”, indicating a query of all available documents that are also related to either the topic “Thalamus” or the topic “Telencephalon”, or any of their subtopics.
- the query can be stated as: ((*) ∧ ((mapping: “Thalamus”) ∨ (mapping: “Telencephalon”))).
- FIG. 42 is an embodiment of the sieve diagram 101 in which the central node 102 is rendered in the periphery.
- Distribution Lines 104 emanate from the central node 102 to connect directly or transitively to topic nodes 103.
- a red rectangular box 4201 is depicted on top of the interface to aid the reader of this document in identifying the region of the diagram that will be zoomed into in Fig 43.
- FIG. 43 is a further representation of the embodiment of the sieve diagram depicted in Fig. 42, in which the user has utilized the pan/zoom functionality to zoom into the area of the diagram identified by the box 4201 in Fig. 42.
- the red rectangular box 4301 of Fig. 43 identifies a region that has been selected for enlargement in Fig. 44.
- FIG. 44 is a further representation of the embodiment of the sieve diagram depicted in Fig 43, in which the user has utilized the pan/zoom feature to zoom into the area identified by the red rectangle 4301 of Fig 43.
- FIG. 45 is a perspective rendering of a three-dimensional sieve diagram in another embodiment of the present invention.
- the distribution of a user’s investments across a variety of funds is depicted.
- the user has selected the “Crypto” fund and is able to see additional details, in this case total amount invested, in the fund’s members: “Bitcoin” and “Ether”.
- a three-dimensional sieve diagram is likely to be more suitable for use with AR (Augmented Reality) and VR (Virtual Reality) systems.
- this embodiment may include a robust “rotation” feature, along with a “zoom”.
- FIG. 46 is a perspective rendering of an embodiment of a circular sieve.
- the parent dataset is the central node 4601.
- the central node 4601 is surrounded by a primary topic node 4602, depicted as a ring, representing the primary topic from which a set of secondary topic nodes 4604 derive. Since the ring represents a singular topic, the presence and magnitude of the distribution between the central node 4601 and the topic node 4602, which is usually represented by a distribution line having a visual characteristic, such as width, is instead represented by a filling 4603.
- the distribution lines 4605 represent the number of documents related to a topic 4604.
- the portion of the line emanating from the parent node has a width indicative of the number of documents mapped to the parent topic and the portion of the line emanating to the child topic represents the number of documents mapped to the child topic.
- the filling 4603 leverages a topic ring 4602 as a container.
- the full height of the ring represents the number of documents mapped to the central node 4601, and the height of the filling 4603 represents the number of documents mapped to the topic 4602 as a percentage of the documents mapped to the parent dataset 4601.
- the filling 4603 indicates that more than 75% of the documents mapped in the central node 4601 are mapped to the topic “Chemicals and Drugs” 4602.
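The height of the filling can be expressed as a simple ratio. The sketch below assumes a linear relationship between the ratio and the rendered height; the function name `filling_fraction` and the document counts used in the usage note are hypothetical and are not taken from the figures.

```python
def filling_fraction(topic_docs: int, parent_docs: int) -> float:
    """Fraction of the ring height to fill: documents mapped to the topic
    divided by documents in the parent dataset, clamped to at most 1.0."""
    if parent_docs <= 0:
        return 0.0
    return min(1.0, topic_docs / parent_docs)
```

For instance, a hypothetical topic with 45,000 mapped documents in a parent dataset of 60,000 would fill three quarters of the ring.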
- FIG. 47 depicts a derivative embodiment of the invention in which the parent dataset is not rendered. Therefore, the topic node 4702 for “Chemicals and Drugs” is the central node. The distribution lines 4705 are still able to indicate a volume of articles in each topic. In some embodiments, rendering the parent dataset provides clarity and usefulness, however, in other embodiments, more experienced or acquainted users may prefer to have the parent dataset not rendered.
- FIG. 49 depicts an embodiment of the invention wherein a filter ring 4911, representing a previously curated Boolean set of topics has been added around parent dataset 4901.
- the filter ring 4911 conceptually represents a filter of the mappings available for representation across the portions of the visualization that are radially more distal, relative to the filtering ring.
- the filter ring 4911 represents all topics a user selected which they believe are related to a particular research assignment.
- the user named the set of topic selections “Research Project A Topics.”
- the filling 4903 is shown about a quarter of the way filled, thus representing about 25% of the topics available in the parent dataset 4901 are mapped to the Boolean set of topics represented by the filter 4911.
- the documents considered for determining mapping magnitudes across the remaining topics, distal to the filter, are now only the subset of documents that matched the query represented by the filter ring.
- the search bar 4960 is used to create a filter ring 4911.
- a user enters a search and the filter ring can depict such search, and the filling 4903 would represent the documents in the parent dataset related to such search.
- FIG. 50 depicts an embodiment of the invention in which topics related to human anatomy are depicted around a central node, representing a 60,000 document test set of the NCBI PubMed corpus.
- the filling 5003 is at a different level, as a different number of documents relate to the topic ring 5002 of Anatomy than to the topic ring 4602 of Chemicals and Drugs.
- the topics 5004 differ from the topics 4604 as they all relate to the topic ring 5002, anatomy.
- the interface only displays up to a certain number of topics at once. If a topic has no children, it is rendered with a slight transparency. If a topic has children it is solid in color.
- a user can view a topic’s hidden children by double-clicking a topic of interest that has hidden children. This will cause all topics except the indicated topic to disappear, reposition the indicated topic, and then display its children and transitive children until there are either no more topics to render or until the number of topics rendered would go over a predetermined render limit.
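The expansion of a double-clicked topic up to a render limit can be sketched as a bounded traversal. Breadth-first order is an assumption (the description only states that children and transitive children are shown until the limit is reached), and the function name `topics_to_render` and the example hierarchy in the usage note are hypothetical.

```python
from collections import deque
from typing import Dict, List


def topics_to_render(hierarchy: Dict[str, List[str]], focus: str,
                     render_limit: int) -> List[str]:
    """Return the focused topic plus descendants, stopping at the render limit."""
    rendered = [focus]
    queue = deque(hierarchy.get(focus, []))
    # Add children, then grandchildren, and so on, until no topics remain
    # or adding another would exceed the render limit.
    while queue and len(rendered) < render_limit:
        topic = queue.popleft()
        rendered.append(topic)
        queue.extend(hierarchy.get(topic, []))
    return rendered
```

With a hypothetical hierarchy where “Nervous System” has two children and one of them has two children of its own, a limit of 4 shows the focused topic, both children, and only the first grandchild.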
- FIG. 51 depicts the embodiment of the invention shown in Figure 50 after a user has selected a topic 5107.
- the user happened to select Nervous System.
- the user may select Nervous system through using a mouse and keyboard, a touchscreen, an algorithm, and other methods of selection.
- Selecting the topic 5107 causes the display of documents 5108 to show document information 5109 mapped to the topic 5107.
- the document information 5109 is the title and the authors.
- the document information 5109 includes a summary, an abstract, a date and other information about and related to the document.
- a search results summary 5110 shows how many documents are returned for the selected topic 5107. In the example of Figure 51, 7,514 documents are associated with the topic Nervous System.
- FIG. 52 depicts the embodiment of the invention in Figure 50 after a user has selected topic 5207 “Tissues,” causing display of documents 5208 to show documents mapped to the topic “Tissues.”
- the search results summary 5210 shows there are 17,744 documents associated with this topic.
- the angular space of the topic Nervous System is larger than that of Tissues. Therefore, in this embodiment the angular spaces are not indicative of a number of documents mapped to the topic.
- the distribution lines 5205 are constrained in thickness to the angular space of the topic 5207.
- FIG. 53 depicts the embodiment of the invention in Figure 50 after a user has selected topic 5307 “Cells,” causing display of documents 5308 to show documents mapped to the topic “Cells.”
- the search results summary 5310 shows there are 19,576 documents associated with this topic.
- In comparison with Figure 51, there are more documents mapped to Cells than to Nervous System; however, due to the number of child topics 5304 mapped to Nervous System, the angular space of the topic Nervous System is larger than that of Cells.
- FIG. 54 depicts a different embodiment of the invention than what is depicted in Figures 50-53.
- the compressed distribution lines 5412 are colored with a higher intensity to visually communicate a higher density of documents being represented in a compressed space.
- the compressed distribution lines 5412 are colored differently, more opaque, less opaque, shaped, patterned, or marked any other various way to depict a higher density of documents.
- FIG. 55 depicts an embodiment of a sieve 5500 in which the relationships are depicted through a relationship arc 5520.
- the relationship arc indicates, for a parent/child relationship, the largest width that distribution lines to children can have, and re-establishes the meaning of a relative numeric quantity associated with a width.
- FIG. 56 depicts an embodiment of the Sieve in figure 55 wherein the user has a cursor hovering over distribution line 5621.
- hovering over the distribution line 5621 causes alt text 5622 to be displayed showing the number of documents associated with the distribution line 5621.
- the number of documents may not be alt text, but another method of causing the number of documents to be displayed on some user interaction such as hovering, selection, or another interaction mechanism.
- FIG. 57 depicts the embodiment of figure 55 wherein the user’s cursor is hovering over the relationship arc 5720. Hovering over the relationship arc 5720 causes a text 5721 to be displayed indicative of a number of documents associated with the relationship arc.
- the thickness of the relationship arc 5720 is indicative of the number of documents, and the distribution lines 5723 to the children are proportional to the thickness of the relationship arc 5720 based on the ratio of documents in the relationship arc to the number of documents related to each Topic C1-C5. This is useful when a large number of documents must be represented in a limited space, as is the case with the children of “Topic C1.”
- FIG. 58 depicts the embodiment of figure 55, wherein the user is hovering a cursor over the distribution line 5824 between topic C and topic C1.
- the text 5821 shows the number of documents related to topic C1.
- the first child distribution line 5824 is 0.9 times (450,000/500,000) as thick as the relationship arc 5820.
- the distribution line 5824 may be proportional to the thickness of the parent distribution line 5821.
- FIG. 59 depicts the embodiment shown in figure 55 wherein the user is hovering a cursor over the secondary relationship arc 5930 of Topic C1’s children.
- the secondary relationship arc 5930 re-establishes the width-quantity relationship between topic C1 and its children.
- the text 5921 shows that for children of topic C1, 450,000 documents will be indicated by a line having the thickness of secondary relationship arc 5930.
- FIG. 60 depicts the embodiment shown in figure 55 wherein the user is hovering a cursor over the distribution line 6032 between topic C1 and topic C13.
- each of the 450,000 documents in topic C1 relates to topic C13, and thus the thickness of the distribution line 6032 is the same as the thickness of the relationship arc 6030.
- FIG. 61 depicts a derivative embodiment of figures 55-60 wherein the distribution lines connect only to the child topic nodes and do not extend through to the topic node’s relationship arc.
- the sieve diagram 6100 is read the same way as in the embodiments depicted in figures 55-60 and the difference is only stylistic.
- FIG. 62 depicts an embodiment of the sieve, similar to the embodiment of figure 61.
- the embodiment of figure 62 is different in that instead of providing a filling, there is a distribution line 6203 which shows the documents from the central node 6201 to Map 1 6204.
- FIG. 63 depicts an embodiment of a sieve diagram 6300 as a webbed map.
- the width of each of the distribution lines 6305 indicates the number of documents associated with the corresponding topic A, B, or C.
- the widths of the borderlines around topics A, B, and C are indicative of the width with which the quantities depicted by the topics’ incoming distribution line will be represented in outgoing distribution lines to the topics’ children. Therefore, one can visually perceive the percentage of documents mapped to, for example, topic C that are also mapped to topic C1 by comparing the width of the borderline around topic C to the width of the distribution line from topic C to topic C1.
- if a node has no children, for example node 6341, then it does not depict a meaningful borderline on the topic node.
- FIG. 64 depicts an embodiment of sieve 6400 in which the lines 6405 connecting the topic nodes, directly and indirectly, to the central node are only used to show the hierarchical pathing between the data objects and their children.
- the width of the border-line 6481 of the topic node is used to communicate the quantity of documents mapped to a given topic.
- different colors or opacities might indicate the amount of documents mapped to a given topic.
- FIG. 65 depicts an embodiment of the sieve similar to the embodiments of figures 55-62; however, the central node is a topic node, Root Topic 6550, of the hierarchical set of nodes.
- This embodiment may be a derivative of another embodiment where, for example, the Root Topic 6550 is selected from a broader selection of topics and the system renders a zoomed version.
- Similar sieve diagrams can be created, for example, by selecting Topic C1 and creating a new sieve diagram with Topic C1 as the central node and the children C11-C17 surrounding Topic C1.
- FIG. 66 depicts an embodiment of the Sieve 6600 in which no default search results are displayed and initially only the sieve diagram is displayed.
- the user is about to select the topic “Topic C3”, as indicated by the user’s cursor 6670 hovering over the topic.
- FIG. 67 depicts the embodiment from Figure 66 after the user has selected topic “Topic C3”.
- the display of documents 6708 first displays information related to Topic C3, and provides the user the option to see documents upon selecting the button labeled “View Related Documents.”
- the user is about to click this button, as indicated by the user’s cursor hovering over the button.
- FIG. 68 depicts the embodiment from Figure 67, after the user selected the button labeled “View Related Documents,” which causes the system to retrieve and display results, which in this embodiment are data objects in the first set of objects.
- an option to view the original topic description is provided by selecting the button labeled “Back to Topic Details”.
- FIG. 69 depicts an embodiment of sieve 6900 used to visualize a user’s investment portfolio.
- Distribution lines 6904 are used to indicate the amount of money invested in various stocks and stock categories, and are either red (loss), green (profit), or grey (no profit or loss).
- the graphical linkage 6904 may have a color and a thickness, such that the color is a secondary feature representing a quality of the data of which the distribution line is indicative.
- the thickness represents a cash-flow while the color represents a positive or negative number, thus allowing the distribution lines to indicate a negative number as a negative thickness is not otherwise graphically conveyable.
- FIG. 70 depicts the embodiment from Figure 69 after the user has selected the stock Netflix, causing the display of additional information 7008 about the selected stock and the performance of their invested funds in the selected stock.
- FIG. 71 depicts the embodiment in Figure 50, in which the user has selected the topic “Animal Structures” 7104.
- with a query of “pig” 7171 entered into the search bar, a display of the documents 7109 is generated, showing those documents mapped to “Animal Structures” that also contain the word “pig”.
- This embodiment also shows how some topics may have multiple parents.
- the topics Cloaca and Nonmammalian are children of both Embryonic Structures and Animal Structures. Therefore, when Animal Structures is highlighted, the children, which are displayed at a different parent topic, are also highlighted.
- the sieve diagram puts the children with the most closely related parent, but highlights the child topic when any of the parent topics are selected.
- FIG. 72 is a top down rendering of an embodiment of the invention where the sieve diagram has been rendered in three dimensions.
- the central node 7201 is marked in a lighter color to denote the height of the node in three-dimensions.
- the distribution lines 7204 flow from the central node 7201 to the topics, either directly or indirectly.
- the central node can be, inter alia, a plurality of topics on the outside, a topic ring, and a master document set; and in even further embodiments the sieve diagram includes a filter ring.
- FIG. 73 is a perspective view of the embodiment of FIG. 72 wherein the three-dimensional sieve diagram 7300 is shown with height.
- the base of the first hierarchical layer 7380 is indicative of a quantity or quality of documents related to the topics of the layer.
- the height of each layer in the wedding cake shaped sieve diagram 7300 would decrease as documents are filtered out for not being associated with a child topic of the previous layer.
- the height of the first hierarchical layer 7380 is not even throughout.
- the layer 7380 is formed as a pie chart with various slices of pie at different heights to represent a quantity of documents in that section of the pie.
- the distribution lines are not necessary to indicate the quantity; however, in some embodiments they may also be included for more visual cues.
- where the height of base layer 7380 is indicative of a quantity of documents, the height would be a distribution line as defined by this application.
- the present invention may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof.
- Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments.
- the source code may define and use various data structures and communication messages.
- the source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
- the computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device.
- the computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies.
- the computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).
- Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL).
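The width arithmetic described above for FIGS. 57-60 can be illustrated with a short sketch. It assumes that a child distribution line is drawn at the thickness of its parent's relationship arc multiplied by the ratio of child documents to parent documents. The function name `child_line_thickness` and the arc thicknesses (10.0 and 8.0 units) are hypothetical; only the document counts 450,000 and 500,000 come from the description.

```python
def child_line_thickness(arc_thickness, parent_docs, child_docs):
    """Thickness of a child distribution line relative to its parent's arc.

    Assumption: the relationship arc stands for all parent_docs documents,
    so a child line is scaled by child_docs / parent_docs.
    """
    if parent_docs <= 0:
        raise ValueError("parent_docs must be positive")
    return arc_thickness * child_docs / parent_docs


# Topic C (500,000 documents) to Topic C1 (450,000 documents):
# the line is 0.9 times as thick as the relationship arc.
line_c_to_c1 = child_line_thickness(10.0, 500_000, 450_000)

# The secondary relationship arc re-establishes the scale for C1's children:
# a child holding all 450,000 documents is drawn as thick as that arc.
line_c1_to_c13 = child_line_thickness(8.0, 450_000, 450_000)
```

Because each relationship arc resets the scale, widths are comparable only among siblings under the same arc, not across tiers.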
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Mathematical Physics (AREA)
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A computer-implemented method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection, wherein each of the second set of objects is hierarchically structured, includes parsing the first set of objects in relation to the second set of objects in order to generate a mapping between the first set of objects and the second set of objects. The method further includes causing display of a representation of the first set as a central node and a representation of the second set as a set of topic nodes surrounding the central node or surrounded by the central node, the displayed representations constituting a sieve diagram. The method also includes causing display in the sieve diagram of a set of relationships between the first set of objects and the second set of objects by providing a graphical linkage directly or indirectly between the central node and each node that corresponds to a member of the second set.
Description
Computer-Implemented Apparatus and Method for Interactive Visualization of a First Set of Objects in Relation to a Second Set of Objects in a Data Collection
Technical Field
[0001] The present invention relates to data displays, and more particularly to computer-implemented displays of data objects.
Priority
[0002] For purposes of the United States, this application is a continuation-in-part of U.S. patent application serial number 16/454,464 which claims the benefit of U.S. provisional patent application serial no. 62/697,785, filed July 13, 2018. Each of these applications is hereby incorporated herein by reference in its entirety.
Background
[0003] Systems employing graphic primitives to represent information are well known and have been employed in a variety of visualization systems. One of the advantages provided by such systems is that, due to the significant visual information processing capabilities of the human brain, it is generally easier for an individual to absorb and/or understand data represented visually than data represented in numerical or textual form. Further, complex and/or dense data which, in a numerical or textual format, would require multiple sheets of paper to be printed out or multiple screen views of a computer monitor to be displayed can be represented on a single computer monitor screen in a well-designed visualization.
[0004] In the search and retrieval context in particular, document results are often presented as a one-dimensional list of results that a user must scroll and click through. With the explosion of data in recent years, retrieval results can include hundreds and thousands of pages of list-based results for users to sift through. This has led to the advent of ever more sophisticated machine learning technologies to sort and rank the documents so that the most useful documents are more likely to be at the top of the “stack”. However, in certain cases such as in research contexts, users urgently need to be aware of the full breadth of data available and to identify, create, and analyze meaningful subsets of the available information. In such situations, existing search and retrieval systems fall short of meeting users’ needs. It is therefore desired to provide more sophisticated and powerful methods for users. Relevant existing systems include search and retrieval technologies, and clustering techniques such as the cluster-wheel (also known as the sunburst or multi-level pie chart).
Summary of the Embodiments
[0005] In one embodiment the invention provides a computer-implemented method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection. In this embodiment, each of the second set of objects is hierarchically structured. The method is implemented by computer processes including: storing the first set of objects and storing the second set of objects; parsing the first set of objects in relation to the second set of objects in order to generate a mapping between the first set of objects and the second set of objects; storing the mapping; causing an interactive graphical display of a representation of the first set of objects as a central node and a representation of the second set of objects as a set of topic nodes surrounding the central node or surrounded by the central node, the displayed representations constituting a sieve diagram; and using the mapping to cause display in the sieve diagram of a set of relationships between the first set of objects and the second set of objects by providing a graphical linkage directly or indirectly between the central node and each node that corresponds to a member of the second set of objects, wherein the graphical linkage has a feature that graphically indicates a quantity associated with the mapping.
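As an illustration of the parsing and mapping steps of paragraph [0005], the following minimal sketch assumes that documents are plain strings, that each topic carries a hypothetical list of keywords, and that a document maps to a topic when any keyword occurs in its text. The names `Topic`, `build_mapping`, and `mapping_quantities` are hypothetical; the application does not prescribe a particular parsing technique.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A member of the second, hierarchically structured set of objects."""
    name: str
    keywords: list[str]
    children: list["Topic"] = field(default_factory=list)


def iter_topics(root):
    """Yield the root topic and all of its transitive children, depth first."""
    yield root
    for child in root.children:
        yield from iter_topics(child)


def build_mapping(documents, root):
    """Map each topic name to the set of document ids that mention it.

    Assumption: case-insensitive substring matching stands in for parsing.
    """
    mapping = {}
    for topic in iter_topics(root):
        mapping[topic.name] = {
            doc_id
            for doc_id, text in documents.items()
            if any(kw.lower() in text.lower() for kw in topic.keywords)
        }
    return mapping


def mapping_quantities(mapping):
    """Per-topic document counts that a graphical linkage could encode."""
    return {name: len(doc_ids) for name, doc_ids in mapping.items()}


documents = {
    1: "Glia in the thalamus",
    2: "Midbrain dopamine neurons",
    3: "Thalamus and midbrain connectivity",
}
root_topic = Topic("Diencephalon", ["diencephalon", "thalamus"], [
    Topic("Thalamus", ["thalamus"]),
])
mapping = build_mapping(documents, root_topic)
quantities = mapping_quantities(mapping)
```

The resulting counts are the kind of quantity that the graphical linkage could indicate, for example through line thickness.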
[0006] Optionally, the method further includes, in response to graphical selection, by a user, in the sieve diagram of a subset of topic nodes corresponding to a topic subset of the second set of objects, retrieving an object subset, of the first set of objects, that is mapped to the topic subset. As another option the method further includes: receiving a search query; searching, in response to the received search query, the first set of objects, for a search results set of objects having a set of features matching a set of features of the search query; and displaying the search results set of objects.
[0007] In a further related embodiment the graphical linkage is a distribution line.
[0008] In a further related embodiment, the method further includes, upon receiving a graphical selection by the user of a given one of the topic nodes in the sieve diagram, causing filtering of the first set of objects displayed to include only those objects corresponding to objects in the second set that are represented by the selected topic node.
[0009] Optionally, the method further includes filtering, upon receiving a graphical selection by the user of one of the members of the second set in the sieve diagram, the first set of objects displayed to include only objects corresponding to objects in the first set that are mapped to the selected member of the second set.
[0010] In a further related embodiment, the method further includes removing, upon receiving a graphical selection by the user of a central node in the sieve diagram, any filtering caused by a graphical selection of a topic node or of a distribution line.
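The selection behavior of paragraphs [0008] to [0010] can be sketched as a small state holder. This is only an illustration: the class name `SieveSelection` and its methods are hypothetical, and the choice to combine multiple selected topics by union is an assumption made for the sketch.

```python
class SieveSelection:
    """Tracks which documents are displayed as topic nodes are selected."""

    def __init__(self, all_doc_ids, mapping):
        self.all_doc_ids = set(all_doc_ids)
        self.mapping = mapping            # topic name -> set of document ids
        self.selected_topics = set()

    def select_topic(self, topic):
        """Selecting a topic node filters the displayed documents."""
        self.selected_topics.add(topic)

    def select_central_node(self):
        """Selecting the central node removes any topic filtering."""
        self.selected_topics.clear()

    def displayed(self):
        """Return the ids of the documents currently displayed."""
        if not self.selected_topics:
            return set(self.all_doc_ids)
        shown = set()
        for topic in self.selected_topics:
            # Assumption: multiple selections are combined by union.
            shown |= self.mapping.get(topic, set())
        return shown & self.all_doc_ids


selection = SieveSelection({1, 2, 3}, {"Midbrain": {1, 2}, "Hindbrain": {3}})
selection.select_topic("Midbrain")
filtered = selection.displayed()
selection.select_central_node()
unfiltered = selection.displayed()
```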
[0011] Optionally, the method includes displaying, upon the graphical selection by the user of the given one of the topic nodes, information pertinent to the selected topic node.
[0012] Also optionally, the method further includes, upon the graphical selection by the user of the member of the second set, causing display of information pertinent to the selected member of the second set.
[0013] In a further related embodiment, the search query includes a Boolean expression, and wherein the method further includes evaluating the Boolean expression and performing the search using the evaluated expression. Optionally, the search query is defined, at least in part, by a user selection of a set of graphical elements in the sieve diagram. Also optionally, the search query includes at least one Boolean operator defined by a user input.
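One way to picture the evaluation of a Boolean expression built from topic selections is the sketch below. It assumes an expression is either a topic name or a nested tuple whose first element is "AND", "OR", or "NOT", and that evaluation yields a set of document ids. This representation and the function name `evaluate` are hypothetical and are not taken from the application.

```python
def evaluate(expr, mapping, universe):
    """Evaluate a Boolean expression over topics to a set of document ids.

    expr is a topic name (str) or a tuple such as ("AND", left, right).
    mapping is topic name -> set of document ids; universe is all ids.
    """
    if isinstance(expr, str):
        return set(mapping.get(expr, set()))
    op, *operands = expr
    results = [evaluate(operand, mapping, universe) for operand in operands]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    if op == "NOT":
        return set(universe) - results[0]
    raise ValueError(f"unknown operator: {op!r}")


topic_docs = {"Midbrain": {1, 2}, "Hindbrain": {2, 3}}
all_docs = {1, 2, 3, 4}

either = evaluate(("OR", "Midbrain", "Hindbrain"), topic_docs, all_docs)
both = evaluate(("AND", "Midbrain", "Hindbrain"), topic_docs, all_docs)
neither = evaluate(("NOT", ("OR", "Midbrain", "Hindbrain")), topic_docs, all_docs)
```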
[0014] Also optionally, the method includes the step of displaying, upon receipt of a user command to shrink displayed membership of the first set, only objects having features matching features of the search query.
[0015] In a further related embodiment, the method further includes, upon invocation by a user of a topic map limiter, limiting display of objects in the second set. Optionally, the topic map limiter invokes display of a tier in which the display of objects in the second set is limited. Also optionally, the topic map limiter allows selection by the user of which topics to eliminate from the sieve diagram. Also optionally, the topic map limiter allows selection by the user of which topics to include in the sieve diagram.
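A topic map limiter of the kind described in paragraph [0015] might be sketched as a pruning pass over the topic hierarchy. The sketch assumes topics are nested dictionaries with "name" and "children" keys, that the central node sits at tier 0, and that excluded topics are removed together with their children; the function name `limit_topics` and its parameters are hypothetical.

```python
def limit_topics(node, max_tier, exclude=frozenset(), tier=0):
    """Return a copy of the topic tree limited to max_tier tiers.

    Children deeper than max_tier, and any topic named in exclude,
    are left out of the copy; the input tree is not modified.
    """
    children = []
    if tier < max_tier:
        children = [
            limit_topics(child, max_tier, exclude, tier + 1)
            for child in node.get("children", [])
            if child["name"] not in exclude
        ]
    return {"name": node["name"], "children": children}


topic_tree = {
    "name": "Brain",
    "children": [
        {"name": "Forebrain", "children": [
            {"name": "Telencephalon", "children": []},
            {"name": "Diencephalon", "children": []},
        ]},
        {"name": "Midbrain", "children": []},
    ],
}

one_tier = limit_topics(topic_tree, max_tier=1)
without_midbrain = limit_topics(topic_tree, max_tier=2, exclude={"Midbrain"})
```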
[0016] In a further related embodiment, the computer processes further include, upon a graphical selection by a user of a member of the second set of objects in the sieve diagram, causing display of a derivative sieve diagram in which the selected member is a derivative central node. Optionally, the graphical linkage is a distribution line and the feature is a thickness of the distribution line. Also optionally, the distribution line has a secondary feature that indicates the quantity associated with the mapping under conditions wherein the thickness of the distribution line has been graphically constrained.
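The interplay of thickness and a secondary feature in paragraph [0016] can be sketched as follows. The sketch assumes that the proportional thickness is clamped to a maximum width and that the secondary feature is a density factor (for example, usable to scale color intensity) equal to how much the line was compressed. The function name `line_style`, its parameters, and the choice of a density ratio are assumptions, not details from the application.

```python
def line_style(quantity, total, full_width, max_width):
    """Return (width, density) for a distribution line.

    width is proportional to quantity / total but never exceeds max_width.
    density is 1.0 for an unconstrained line and grows in proportion to
    the compression applied when the width has been constrained.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    ideal = full_width * quantity / total
    width = min(ideal, max_width)
    density = ideal / width if width > 0 else 1.0
    return width, density


# A line that would ideally be 12 units wide but is constrained to 6 units
# carries twice the document density per unit of width.
constrained = line_style(quantity=300, total=1000, full_width=40, max_width=6)
unconstrained = line_style(quantity=100, total=1000, full_width=40, max_width=6)
```

A renderer could map the density factor to opacity, color intensity, or a pattern, so that a constrained line still conveys the quantity it stands for.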
[0017] In a further related embodiment the graphical linkage has a secondary feature that graphically indicates a quality associated with the mapping.
[0018] In another embodiment the invention provides a computer-implemented method for interactive visualization of a set of objects in a hierarchical topic structure. The topic structure comprises a first set of topics and a second set of topics. In this embodiment, the method is implemented by computer processes including:
(a) causing display of a sieve diagram having: the first set of topics; the second set of topics arranged around the first set of topics, wherein each member in the second set of topics is a child of a corresponding topic in the first set of topics; and a set of distribution lines, each distinct distribution line connecting one of the topics with its corresponding child and having a visual characteristic indicative of a quantity of objects associated with such child; and
(b) upon a user selection, in the sieve diagram, of a topic, causing display, outside of the sieve diagram, of objects mapped to the user selected topic.
Brief Description of the Drawings
[0019] The patent or application file contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
[0020] The foregoing features of embodiments will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawings, in which:
[0021] FIG. 1 is a representation of a display, generated by a computer executing a program in accordance with an embodiment of the present invention, that graphically represents relationships between first and second sets of objects, using, for the first set, a subset of documents from the NCBI PubMed document corpus, in this case illustrating results from a query to a database accessed by the program (wherein the second set is a hierarchy of topic nodes describing the anatomy of the brain);
[0022] FIG. 2 is a further representation of the display of the embodiment depicted in FIG. 1, in which the user has selected the topic node “Telencephalon” in the Fig. 1 display;
[0023] FIG. 3 is a further representation of the display of the embodiment depicted in FIG. 2, in which the user has incremented the number of tiers shown, using 106 in the Fig. 2 display;
[0024] FIG. 4 is a further representation of the display of the embodiment depicted in FIG. 3, in which the user has incremented the number of tiers of the topic nodes displayed in 101 using 106, and also utilized the system’s pan/zoom functionality to better visualize a portion of 101 that is of interest;
[0025] FIG. 5 is a further representation of the display of the embodiment depicted in FIG. 4, in which the user has decremented the number of topic nodes shown in 101 by using 106, and has made new topic node selections of “Midbrain” and “Hindbrain”;
[0026] FIG. 6 is a further representation of the display of the embodiment depicted in FIG. 5, where the user has pressed shift+enter on their keyboard in order to make the document set returned in Fig. 5 become represented by the central node 102;
[0027] FIG. 7 is a further representation of the display of the embodiment depicted in FIG. 6, where the user has made an additional selection of “Forebrain” from Fig 6;
[0028] FIG. 8 is a further representation of the display of the embodiment depicted in FIG. 7, where the user has removed the “Forebrain” selection from Fig 7 and updated the search query from “*” to “glia”;
[0029] FIG. 9 depicts an embodiment of a system similar to Fig 1 - Fig 8, except with additional features to save queries to a project for later use;
[0030] FIG. 10 is a further representation of the display of the embodiment depicted in FIG. 9, where the user is saving the active query to “Project 1”;
[0031] FIG. 11 is a further representation of the display of the embodiment depicted in FIG. 10, where the user has changed the active query using previously depicted methods to retrieve a different set of documents from Fig. 10;
[0032] FIG. 12 is a further representation of the display of the embodiment depicted in FIG. 11, where the user is saving the active query to a project named “Project1”;
[0033] FIG. 13 is a further representation of the display of the embodiment depicted in FIG. 12, where the user is on a default, unmanipulated interface;
[0034] FIG. 14 is a further representation of the display of the embodiment depicted in FIG. 13, where the user is using the “open project” feature to load the collection of documents the system derives from the queries saved to “Project1”;
[0035] FIG. 15 depicts another embodiment of the present invention that features several distinct hierarchical topic maps that the user can choose to view individually or in combination with other hierarchical maps, and in this case the user has chosen to view the distribution of available information across maps of “human neuroanatomy” and “brain functions”;
[0036] FIG. 16 is a further representation of the display of the embodiment depicted in FIG. 15, where the user has elected to view the distribution of available documents across maps of “human neuroanatomy”, “brain functions”, and “research methodologies”;
[0037] FIG. 17 is a logical flow of an embodiment of the present invention, by which the user can utilize Boolean selection criterion for multiple topic node 103 selections to be logically “OR” separated;
[0038] FIG. 18 is a logical flow that can be implemented in an embodiment of the invention and in combination with the logical flow depicted in FIG. 17 to enable additional Boolean selection criterion for multiple topic node 103 selections to be logically “AND” separated;
[0039] FIG. 19 is a logical flow that can be implemented in an embodiment of the invention and in addition to the logical flows depicted in FIG. 17 and FIG. 18 for opening a project with multiple saved queries in order to retrieve the cumulative set of documents represented by those queries, allowing users to separate multiple distinct queries with a logical “OR”;
[0040] FIG. 20 is a block diagram showing, at a high-level, a suitable architecture of the infrastructure and modules necessary to set up and prepare the system and data objects for operation in one embodiment of the invention;
[0041] FIG. 21 depicts an example input and output of the data normalization process for a collection of document data objects from different sources and with different formatting, in accordance with an embodiment of the present invention;
[0042] FIG. 22 depicts an example input and output for the data normalization process, in accordance with an embodiment of the present invention, to produce a hierarchically related set of data objects that represent topics;
[0043] FIG. 23 depicts an example of input and output data objects for the mapping module in an embodiment of the invention that will produce a mapping between a first set of document data objects 2103 of FIG. 21, and a second set of hierarchically related data objects representing topics 2202 of FIG. 22;
[0044] FIG. 24 depicts system architecture, for an embodiment of the invention that is accessible via the internet, using a client-server architecture;
[0045] FIG. 25 is a block diagram, of an embodiment of the present invention, showing the modules utilized in implementation of the server application 2402 of FIG. 24, which communicates with the client application 2401, the document database 2008, and graph database 2005;
[0046] FIG. 26 is a visual representation of example inputs and output data of an embodiment of the recommended “Magnitude-of-Mapping Retrieval Module” 2503 of FIG. 25, in accordance with an embodiment of the present invention;
[0047] FIG. 27 depicts abstract representations of example inputs and outputs of an embodiment of the “Data Consolidation Module” 2505 of FIG. 25, which combines the magnitudes of mappings output 2601 with the set of hierarchically related topics 2202;
[0048] FIG. 28 is a diagram, in accordance with an embodiment of the present invention, that shows modules for use in implementation of the client application 2401 of FIG. 24;
[0049] FIG. 29 depicts a first manner by which the consolidated data 2701 are rendered by the render module 2801 of Fig. 28 in an embodiment of the present invention;
[0050] FIG. 30 depicts a second manner by which the consolidated data 2701 are rendered by the render module 2801 of FIG. 28 in an embodiment of the present invention;
[0051] FIG. 31 depicts a third manner by which the consolidated data 2701 are rendered by the render module 2801 of FIG. 28 in an embodiment of the present invention;
[0052] FIG. 32 depicts how the document data stored in the system is utilized to render search results in the embodiment of the invention depicted in FIG. 24;
[0053] FIG. 33 depicts an example of a basic interaction that the Interaction Module 2802 of FIG. 28 can be programmed to register and process in an embodiment of the present invention;
[0054] FIG. 34 depicts an example of how the Query Generation Module 2803 of FIG. 28 might be implemented to interpret the application state depicted and convert it into a query in an embodiment of the present invention;
[0055] FIG. 35 is a further representation of the display associated with the embodying system depicted in FIG. 34, with a more complex example of how a query can be generated given a more complex selection state of the interface;
[0056] FIG. 36 depicts a simplified example of an embodiment of the present invention that features a total of 6 neuroscience related documents and a simplified topic map, describing the human anatomy of the brain;
[0057] FIG. 37 is a further representation of the display associated with the embodying system in FIG. 36, where the user has made a selection on the topic “Midbrain” in FIG. 36;
[0058] FIG. 38 is a further representation of the display associated with the embodying system in FIG. 37, where the user has selected “Forebrain” in FIG. 37;
[0059] FIG. 39 is a further representation of the display associated with the embodying system in FIG. 38, where the user has selected “Diencephalon” in FIG. 38;
[0060] FIG. 40 is a further representation of the display associated with the embodying system in FIG. 39, where the user has selected the topic “Thalamus” in FIG. 39;
[0061] FIG. 41 is a further representation of the display associated with the embodying system in FIG. 40, where the user has made selections on “Thalamus” and “Telencephalon”, and has used the pan/zoom feature to focus on the topic “Forebrain” and its transitive subtopics;
[0062] FIG. 42 is an embodiment of the sieve diagram in which the central node is rendered in the periphery. A red rectangular box highlights an area that will be zoomed into in FIG. 43;
[0063] FIG. 43 is a further representation of the embodiment of the sieve diagram depicted in FIG. 42 in which the interface uses the pan/zoom feature to enlarge the area highlighted by the red rectangle in FIG. 42. The red rectangular box highlights an area that will be enlarged in FIG. 44;
[0064] FIG. 44 is a further representation of the embodiment of the sieve diagram depicted in FIG. 43 in which the interface has used the pan/zoom feature to enlarge the area highlighted by the red rectangle in FIG. 43;
[0065] FIG. 45 is a perspective rendering of a three-dimensional sieve diagram in another embodiment of the present invention in which the distribution of a user’s investments across a variety of funds is depicted.
[0066] FIG. 46 is a perspective rendering of an embodiment of a circular sieve.
[0067] FIG. 47 is a further representation of the embodiment of the circular sieve in FIG. 46 in which the parent dataset is not shown.
[0068] FIG. 48 is a perspective rendering of an embodiment of a circular sieve in which neither the parent dataset nor the primary topic node is shown.
[0069] FIG. 49 is a perspective rendering of the invention wherein a filter ring, representing a previously curated Boolean set of topics, has been added around the parent dataset.
[0070] FIG. 50 is a perspective rendering of the invention wherein topics related to human anatomy are depicted around a central node.
[0071] FIG. 51 is a perspective rendering of the invention shown in FIG. 50 after a user has selected a topic.
[0072] FIG. 52 is a further representation of the invention in FIG. 50 after a user has selected topic Tissues.
[0073] FIG. 53 is a further representation of the embodiment in FIG. 50 after a user has selected topic Cells.
[0074] FIG. 54 is a perspective rendering of an embodiment wherein a compressed distribution line is colored with a higher intensity.
[0075] FIG. 55 is a perspective rendering of a sieve in which the relationships are depicted through a relationship arc.
[0076] FIG. 56 is a further representation of the embodiment in FIG. 55 wherein the user has a cursor hovering over distribution line.
[0077] FIG. 57 is a further representation of the embodiment in FIG. 55 wherein the user’s cursor is hovering over the relationship arc.
[0078] FIG. 58 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the distribution line.
[0079] FIG. 59 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the secondary relationship arc.
[0080] FIG. 60 is a further representation of the embodiment in FIG. 55 wherein the user is hovering a cursor over the distribution line between topic C1 and topic C13.
[0081] FIG. 61 is a perspective rendering of an embodiment of the invention wherein the distribution lines only connect to the child topic nodes.
[0082] FIG. 62 is a perspective rendering of an embodiment of a sieve from FIG. 61 where the central filling is depicted as a distribution line.
[0083] FIG. 63 is a perspective rendering of an embodiment of a sieve diagram represented as a webbed map.
[0084] FIG. 64 is a perspective rendering of a sieve wherein the distribution lines show the hierarchical pathing between data objects and their children.
[0085] FIG. 65 is a perspective rendering of a sieve wherein the central node is a topic node in the hierarchical set of nodes.
[0086] FIG. 66 is a perspective rendering of a sieve wherein no default search results are displayed.
[0087] FIG. 67 is a further representation of the embodiment of FIG. 66 after the user has selected Topic C3 and documents are displayed.
[0088] FIG. 68 is a further representation of the embodiment of FIG. 67 after the user has chosen to view related documents.
[0089] FIG. 69 is a perspective rendering of an embodiment of the invention where a sieve depicts a user’s investment portfolio.
[0090] FIG. 70 is a further representation of the embodiment in FIG. 69 after the user has selected the stock Netflix.
[0091] FIG. 71 is a further representation of the embodiment of FIG. 50 wherein a user has selected the topic Animal Structures and entered a query pig.
[0092] FIG. 72 is a top down rendering of an embodiment of the invention where the sieve diagram has been rendered in three dimensions.
[0093] FIG. 73 is a perspective view of the embodiment of FIG. 72 wherein the sieve diagram has been rendered in three dimensions and the distribution lines flow from the top down.
Detailed Description of Specific Embodiments
[0094] Definitions.
[0095] As used in this description and the accompanying claims, the following terms shall have the meanings indicated, unless the context otherwise requires:
[0096] A “set” includes at least one member.
[0097] A “computer process” is the performance of a described function in a computer using computer hardware (such as a processor, field-programmable gate array or other electronic combinatorial logic, or similar device), which may be operating under control of software or firmware or a combination of any of these or operating outside control of any of the foregoing. All or part of the described function may be performed by active or passive electronic components, such as transistors or resistors. In using the term “computer process” we do not necessarily require a schedulable entity, or operation of a computer program or a part thereof, although, in some embodiments, a computer process may be implemented by such a schedulable entity, or operation of a computer program or a part thereof. Furthermore, unless the context otherwise requires, a “process” may be implemented using more than one processor or more than one (single- or multi-processor) computer.
[0098] An “object” is a machine-readable encoding of a data item that can be processed and utilized by a computer programming language, operating on a computing device.
[0099] A “set X” is a set of data objects that can be mapped against a hierarchical set of topic nodes, set Y, based on any criteria, including but not limited to input or output source(s), meta-tags or meta-information about the data objects, or the content of the data objects themselves, and for which those mappings can be quantified by some measurable magnitude for each member of the set Y that set X can be mapped against.
[00100] A “set Y” is a set of data objects with a defined hierarchical or heterarchical pattern of relationships between each other.
[00101] A “central node” is a display component, representing all members of the set X, and rendered either at the center of a sieve diagram or at the periphery of the sieve diagram. Distribution lines, wherein each distribution line corresponds to a subset of set X, emanate from the central node to connect to topic nodes. In one embodiment the central node represents a master document set that is subject to filtering by means of (i) graphically imposed constraints via the sieve diagram or (ii) search terms entered into a query box or (iii) other constraints, such as date range, specified as search facets. In some embodiments, the master document set is all the documents mapped to a particular member of set Y. In these embodiments the central node is such particular member of set Y. A central node may be represented as a circle, an arcuate slice, a square, an image, or another suitable visual element or combination of visual elements.
[00102] A “topic node” is a display component representing a member of the set Y. A topic node may be represented as a circle, an arcuate slice, a square, an image, or another suitable visual element or combination of visual elements. In one embodiment hierarchical topic nodes are rendered radially outward, away from the central node, so that the topmost node in the hierarchy is closest to the central node and the bottom-most nodes of the hierarchy are farthest from the central node. In a further related embodiment, graphically selecting a given topic node, associated with a class of members in set Y, will cause filtering of the set X (displayed at the central node) to include only members of set X that correspond to the selected topic node of set Y.
[00103] A “distribution line” is a visual representation portraying that a set of objects mapped to a member of set Y are also mapped to a child of such member of set Y. In some embodiments, the width of the distribution line, or some other characteristic of the line, is utilized to represent a number of the mappings between set X and members of set Y represented by the connected topic node. In one embodiment a distribution line is a line that connects a central node and a topic node. A line is rendered “indirectly” to a node corresponding to a given member of the second set when the line extends from the central node to a hierarchical parent of the given node and a further line extends from the hierarchical parent to the given node.
[00104] A “sieve diagram” is a display comprising (i) a set of central nodes associated with a first set of objects, (ii) a set of topic nodes that are included in a second set of objects, wherein the first set of objects has been mapped to the second set of objects, and (iii) a set of distribution lines, each of which connects one of the topic nodes to one of the central nodes. Each sieve diagram has its own set of central nodes.
[00105] A “topic map selector” is an interface component that allows the user to select which topic hierarchies to display in the sieve when the system is implemented so that set Y is composed of multiple independent hierarchical topic sets.
[00106] A “topic map limiter” is any interface component or functionality that lets the user limit how much of a topic map hierarchy or heterarchy is to be displayed.
[00107] “Results” show either a subset or entirety of the results returned by either a traditional search-engine type search or a sieve filtration. If the embodiment implemented uses something other than a document set for set X, then results should be relevant to the set X used in the embodiment.
[00108] A “derivative sieve diagram” is a new sieve diagram, based on a previously established sieve diagram, wherein a user selection has been made of a member of the second set of objects in the previously established sieve diagram in which the selected member is a derivative central node.
[00109] A “derivative central node” of a derivative sieve diagram is a central node of the derivative sieve diagram.
[00110] The visualizations depicted herein are original and improve upon cluster-wheel visualization. The embodiments described herein include novel functionality that can be implemented only using methods that leverage modern computer and information technology systems.
[00111] FIG. 1 is a representation of a display, generated by a computer executing a program in accordance with an embodiment of the present invention, that graphically represents relationships between first and second sets of objects, using, for the first set, a subset of documents from the NCBI PubMed document corpus, in this case illustrating results from a query to a database accessed by the program. (The second set is a hierarchy of topic nodes describing the anatomy of the brain.) The Sieve interface component 101 is composed of a central node 102, a set of topic nodes 103 like “Forebrain” and “Diencephalon”, and a set of distribution lines 104 that emanate from the central node and connect to the topic nodes either directly or transitively through hierarchical parent topic nodes. In the embodiment depicted and in its current state, the central node 102 represents a master set of documents returned from the user-specified query entered using the search bar 105. Since the user-specified query is a “*”, which this particular system interprets as a wild card, “return all documents”, all 78,908 documents in the system are returned. Consequently, the central node 102 indicates that the master set of documents in response to the user-specified query is all 78,908 documents.
[00112] The distribution lines 104, emanating from the central node 102 and connecting to the various topic nodes 103, illustrate by their width the number of documents returned from the user-specified query in the search bar 105 that are also related to the topic node 103 to which each line connects. A thicker distribution line 104 means more documents in the system are related to a topic and a thinner distribution line 104 means fewer documents are thus related. No distribution line connecting to a topic node 103 means that there are no documents returned from the user-specified query in the search bar 105 that are related to the topic represented by the topic node 103. In the embodiment depicted, a distribution line 104 indicates that there are documents related either directly to the topic represented by the connected topic node 103 or to a subtopic of that topic. The topic map limiter 106 in this embodiment consists simply of an increment and decrement functionality that allows users to determine how many tiers of the hierarchy to display on screen. The results section 107 includes a results summary 108, in addition to a list of search results, which is composed of individual documents 109 that match the total user query, which is composed of the user input in the search bar 105 combined with the query state of the sieve 101. Since, in Fig. 1, the user has not performed any manipulations on the Sieve 101, the query is simply represented by the search input into the search bar 105. Each document returned includes the document’s title 110, as well as a list of tags 120 that represent the topics, from the set Y, that the document was mapped to. Fig. 1 represents a default, unmanipulated state of the computer-generated display.
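The relationship between document counts and line widths described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `related_counts` and `line_width`, the assumption that each topic has a single parent, and the linear width scaling with a minimum visible width are all hypothetical choices made for the example.

```python
from collections import defaultdict

def related_counts(children, doc_topics):
    """Count, per topic, the documents tagged with that topic or any transitive subtopic."""
    # Invert the child lists into a parent lookup (assumes one parent per topic).
    parent = {c: p for p, cs in children.items() for c in cs}
    docs_per_topic = defaultdict(set)
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            node = topic
            while node is not None:  # walk up to the top of the hierarchy
                docs_per_topic[node].add(doc_id)
                node = parent.get(node)
    return {t: len(d) for t, d in docs_per_topic.items()}

def line_width(count, total, max_width=20.0, min_width=1.0):
    """Hypothetical linear scaling: no line for zero documents, otherwise at least min_width."""
    if count == 0 or total == 0:
        return 0.0
    return max(min_width, max_width * count / total)

# Toy hierarchy and tags (illustrative data only).
children = {"Forebrain": ["Diencephalon", "Telencephalon"], "Diencephalon": ["Thalamus"]}
doc_topics = {"d1": {"Thalamus"}, "d2": {"Telencephalon"},
              "d3": {"Forebrain", "Thalamus"}, "d4": set()}
counts = related_counts(children, doc_topics)
widths = {t: line_width(n, len(doc_topics)) for t, n in counts.items()}
```

A topic absent from `counts` has no related documents and therefore receives no distribution line.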
[00113] In an example embodiment, the display of FIG. 1 is a computer user interface generated by the computer program. The computer user interface presents the sieve interface component 101, to graphically represent relationships between the first set of objects (subset of documents from the NCBI PubMed document corpus) and the second set of objects (topic nodes describing the anatomy of the brain), the search components 105, 108, 110, 120, and the topic map limiter 106 to the user via a computing device.
[00114] FIG. 2 is a further representation of the display of the embodiment depicted in FIG. 1, in which the user has selected the topic node 103 representing the topic “Telencephalon” in the Fig. 1 display. As a result, the combined query, defined by the sieve 101 and the query entered in the search bar 105, is updated as shown in the results description 108. The documents returned in the search results 107 are now all documents returned from the “*” search query (all documents in the system) that are also tagged with “Telencephalon”, which, of the total 78,908 documents in the system, is 1,533 documents.
[00115] FIG. 3 is a further representation of the display of the embodiment depicted in FIG. 2, in which the user has incremented the number of tiers shown, using the topic map limiter 106 in the Fig. 2 display. In this particular embodiment, the topic map limiter 106 includes only two buttons that trigger updates to the sieve 101, in order to show greater or fewer tiers of the hierarchical topic map displayed in the sieve 101. Whereas in Fig. 2 the topic map limiter was set to “Tier: 2” and the sieve 101 showed only two tiers deep into the topic map of neuroanatomy, the user has set the topic map limiter 106 in Fig. 3 to “Tier: 3” and so the sieve 101 shows the top three tiers of the hierarchical topic map of neuroanatomy.
[00116] FIG. 4 is a further representation of the display of the embodiment depicted in FIG. 3, in which the user has incremented the number of tiers of the topic nodes displayed in 101 using topic map limiter 106, and also utilized the system’s pan/zoom functionality to better visualize a portion of 101 that is of interest. Whereas in Fig. 3 the topic map limiter was set to “Tier: 3” and the sieve 101 showed only the top three tiers of the topic map of neuroanatomy, the user has set the topic map limiter 106 in Fig. 4 to “Tier: 4,” and so the sieve 101 shows the top four tiers of the hierarchical topic map of neuroanatomy. In this particular embodiment, the pan functionality is utilized by clicking and dragging the sieve diagram and the zoom functionality is utilized by use of the mouse wheel, where scrolling it up zooms in and scrolling it down zooms out.
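One way a tier limit such as “Tier: 3” could be applied to a topic hierarchy is a breadth-first walk that stops at the requested depth. The sketch below is illustrative only; the function name `visible_topics` and the convention that root topics count as tier 1 are assumptions, not details stated for the embodiment.

```python
def visible_topics(children, roots, max_tier):
    """Return {topic: tier} for all topics within the top max_tier tiers (roots are tier 1)."""
    visible = {}
    frontier = list(roots)
    tier = 1
    while frontier and tier <= max_tier:
        next_frontier = []
        for topic in frontier:
            if topic not in visible:  # a heterarchy may reach a topic twice
                visible[topic] = tier
                next_frontier.extend(children.get(topic, []))
        frontier = next_frontier
        tier += 1
    return visible

# Toy hierarchy (illustrative data only).
children = {"Brain": ["Forebrain", "Midbrain"],
            "Forebrain": ["Diencephalon"],
            "Diencephalon": ["Thalamus"]}
two_tiers = visible_topics(children, ["Brain"], 2)
three_tiers = visible_topics(children, ["Brain"], 3)
```

Incrementing or decrementing the limiter then amounts to calling the function again with `max_tier` changed by one and re-rendering.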
[00117] FIG. 5 is a further representation of the display of the embodiment depicted in FIG. 4, in which the user has decremented the number of topic nodes in the sieve 101 by using the topic map limiter 106, and has selected a different set of topic nodes: “Midbrain” and “Hindbrain”. As a result, the combined query from the search bar 105 and the sieve 101 is now all documents returned from the “*” query (all documents) that also contain the topics or subtopics of either “Midbrain” or “Hindbrain”. This update to the query and description of the documents returned is reflected in the results description 108.
[00118] FIG. 6 is a further representation of the display of the embodiment depicted in FIG. 5 and has been manipulated by the user so that the previous document selection represented in the results description from FIG. 5 is now represented by the central node 102 and the distribution lines 104 displayed now describe the distribution of documents from that subset across the hierarchical topic set describing the human brain. In other words, whereas in Fig. 5 the central node 102 represented the full set of documents returned from the search query entered into the search bar 105, it now represents in Fig. 6 the document set returned from the previous sieve 101 selections. In this particular embodiment, this type of sieve 101 manipulation is triggered when the user presses the “shift” key and the “enter” key simultaneously. If this type of “deep dive” feature is implemented in the system it can be implemented in other ways as well. For example, it could have been implemented so that the central node 102 would represent the entirety of the last submitted query, including the user submitted query in the search bar 105, and not just the selected topic node portion of the previous query.
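The “deep dive” manipulation described above can be sketched as a state transition: the documents matching the current topic selections become the new central set, the selections are cleared, and per-topic counts are recomputed for that set. The `SieveState` class and the function names are hypothetical, and the sketch assumes that each document's tag set already includes any ancestor topics.

```python
from dataclasses import dataclass, field

@dataclass
class SieveState:
    central_docs: set                      # documents represented by the central node
    selected_topics: set = field(default_factory=set)

def matching_docs(state, doc_topics):
    """Documents in the central set tagged with any selected topic (all of them if none selected)."""
    if not state.selected_topics:
        return set(state.central_docs)
    return {d for d in state.central_docs
            if doc_topics.get(d, set()) & state.selected_topics}

def deep_dive(state, doc_topics):
    """Re-center the sieve on the current selection and clear the topic selections."""
    return SieveState(central_docs=matching_docs(state, doc_topics))

def magnitudes(state, doc_topics):
    """Per-topic document counts within the central set, used to draw distribution lines."""
    counts = {}
    for d in state.central_docs:
        for t in doc_topics.get(d, set()):
            counts[t] = counts.get(t, 0) + 1
    return counts

# Toy data (illustrative only).
doc_topics = {"d1": {"Midbrain"}, "d2": {"Hindbrain", "Forebrain"}, "d3": {"Forebrain"}}
before = SieveState({"d1", "d2", "d3"}, {"Midbrain", "Hindbrain"})
after = deep_dive(before, doc_topics)
```

After the transition, `magnitudes(after, doc_topics)` describes how the narrowed document set distributes across the displayed topics.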
[00119] FIG. 7 is a further representation of the display of the embodiment depicted in FIG. 6, and has an additional selection of “Forebrain”, making the active query all documents returned from the query in the search bar 105, “*”, that are related to the Midbrain, Hindbrain, or any of their subtopics, and also the “Forebrain”, or any of its subtopics. In terms of Boolean logic, this can be better represented as: (documents returned from “*” query) AND (documents tagged with “Midbrain” OR “Hindbrain”) AND (documents tagged with “Forebrain”). Results of this updated query are reflected in the results description 108.
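The Boolean expression above maps directly onto set operations: each group of OR-separated topics is a union, and the groups are intersected with the search-bar results. The following sketch uses hypothetical names (`tagged_with_any`, `combined_query`) and toy data, and assumes tag sets are already expanded to include ancestor topics.

```python
def tagged_with_any(doc_topics, topics):
    """Documents tagged with at least one of the given topics (logical OR)."""
    wanted = set(topics)
    return {d for d, tags in doc_topics.items() if tags & wanted}

def combined_query(search_hits, or_groups, doc_topics):
    """AND the search-bar hits with each OR-group of topic selections."""
    result = set(search_hits)
    for group in or_groups:
        result &= tagged_with_any(doc_topics, group)
    return result

# Toy data (illustrative only); "*" is modelled as every document id.
doc_topics = {"d1": {"Midbrain"}, "d2": {"Hindbrain", "Forebrain"}, "d3": {"Forebrain"}}
all_docs = set(doc_topics)
hits = combined_query(all_docs, [{"Midbrain", "Hindbrain"}, {"Forebrain"}], doc_topics)
```

Here only `d2` satisfies both groups, since it is tagged with “Hindbrain” and with “Forebrain”.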
[00120] FIG. 8 is a further representation of the display of the embodiment depicted in FIG. 7, with the selection of “Forebrain” from FIG. 7 removed by clicking the central node 102, and the search query in the search bar 105 updated to “glia”. Therefore, the document set retrieved from the system and displayed in the results section 107 in Fig. 8 is all documents that mention “glia” that also mention “Midbrain” or “Hindbrain”, or any of their subtopics. This is reflected in the results description 108.
[00121] FIG. 9 is a further representation of the display of the embodiment depicted in FIG. 8, with additional features to save new queries for later use. Specifically, it features additional “Open Project” 901 and a “Save Query” 902 functionalities. Saving queries to a project allows users to combine search results from several different queries in order to create a larger collection of documents that they can then sift through with the sieve and any other filters / search features that an embodiment of the invention is implemented with. It is also one possible approach, though not the only approach, for enabling multiple queries to be combined with a Boolean logical “OR”.
[00122] FIG. 10 is a further representation of the display of the embodiment depicted in FIG. 9, where the user is saving the active query to “Project1”, using the dialog box 1001 that appeared after the user clicked “Save Query” 902 in Fig. 9.
[00123] FIG. 11 is a further representation of the display of the embodiment depicted in FIG. 10, where the user has changed the active query on the sieve 101 to retrieve a different set of documents.
[00124] FIG. 12 is a further representation of the display of the embodiment depicted in FIG. 11, where the user is saving the active query to a project named “Project1” using the dialog box 1001 that appeared as a result of the user clicking “Save Query” 902 in Fig. 11.
[00125] FIG. 13 is a further representation of the display of the embodiment depicted in FIG. 12, where the user is now on an unmanipulated, default interface. The user achieved this view into the data by selecting the central node 102 and removing the existing selections.
[00126] FIG. 14 is a further representation of the display of the embodiment depicted in FIG. 13, where the user clicked “Open Project” 901 in Fig. 13, causing the project dialog box 1401 to appear. If the user selects “Project1” as shown in Fig. 14, then the system will load the set of documents created by combining the separate document sets from each of the subqueries saved to “Project1” in Fig. 9 - Fig. 10 with Boolean logic “OR”. That is, it combines all of the document sets from each saved query and removes duplicates. This new document set will be then represented by the central node 102, and the distribution line 104 that is displayed will represent the distribution of search results from that new document set across the topics represented by the topic nodes 103.
[00127] FIG. 15 depicts an embodiment of the present invention that features several distinct hierarchical topic maps that the user can choose to view individually or in combination with other hierarchical maps. In this embodiment, the user can double-click a distribution line 104 or topic node 103 to view the body of related documents on a different page. In the system state depicted in Fig. 15, the user has selected to view the distribution of available documents across maps of “human neuroanatomy” 1502 and “brain functions” 1503, using the topic map selector 1501. These maps are displayed around the central node 102 as visually separated groups of topic nodes 103, using space and color for distinction. The group of topic nodes 103 on the left 1504 represents the topic map for neuroanatomy, and the group of topic nodes 103 on the right 1505 represents the topic map for brain functions.
[00128] FIG. 16 is a further representation of the display of the embodiment depicted in FIG. 15, where the user has elected to view the distribution of available documents across three topic maps simultaneously, by clicking the “Methodology” button 1601 in the topic map selector 1501 in Fig. 15, causing the display of the group of topic nodes in orange 1602 that are depicted at the bottom of the sieve 101 in the interface.
[00129] FIG. 17 is a logical flow of an embodiment of the present invention, by which the user can utilize Boolean selection criteria to enable multiple selections of topic nodes 103, of FIG. 1, to be logically “OR” separated. This logical flow is not the only way this interaction can be implemented, but merely represents the implementation used in the embodiment of the invention depicted in FIG. 1 - FIG. 14. In process step 1701, the system is fully loaded and ready for the user to interact with. Process step 1702 is triggered when a user selects a topic node 103 or a distribution line 104 in the sieve 101, as in FIG. 2. After the selection is registered in step 1702, the system checks in process step 1703 whether the user was holding down the “shift” key on the keyboard controlling the system. If the “shift” key was held, the system proceeds to step 1705 and adds the selected topic node 103 to the list of existing selected topic nodes 103, if any, as in FIG. 5. If the “shift” key was not held, the system proceeds to step 1704 and removes any existing topic node 103 selections before making the newly selected topic node 103 the only active topic node 103 selection. After either step 1704 or 1705 completes, the system returns to step 1701, where it is ready to record any further interactions.
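The selection flow of steps 1702 through 1705 can be sketched as a single pure function. The name `register_selection` and the list representation of the selection state are hypothetical; the branch comments refer to the process steps described above.

```python
def register_selection(selected, topic, shift_held):
    """Return the new list of selected topic nodes after a click on `topic`."""
    if shift_held:
        # Step 1705: add the topic to the existing selections (logically OR-separated).
        return selected if topic in selected else selected + [topic]
    # Step 1704: drop existing selections and make this topic the only one.
    return [topic]

# Example interaction sequence (illustrative only).
state = register_selection([], "Midbrain", shift_held=False)
state = register_selection(state, "Hindbrain", shift_held=True)
replaced = register_selection(state, "Forebrain", shift_held=False)
```

Returning a new list rather than mutating the old one keeps each interaction easy to test and to undo.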
[00130] FIG. 18 is a logical flow diagram that can be implemented with the logical flow depicted in FIG. 17 for an embodiment of the present invention wherein multiple topic node 103 selections can be logically “AND” separated. This logical flow is not the only way this interaction can be implemented, but merely represents the implementation used in the embodiment of the invention depicted in Fig. 1 - Fig. 14. In process step 1801, the system is ready for interaction and the user has already interacted with the system to register one or more topic node 103 selections. Then, if the system registers an “enter” key pressed while the “shift” key is also held, the system will proceed to process 1802, taking the currently selected body of documents and making the central node 102 represent these documents and clearing any of the topic node 103 selections. The system then proceeds to process 1803 to retrieve magnitude mappings 2701 for each topic node 103 displayed that has mappings with documents in the selected document set. The system then proceeds to process 1804 and uses this new data about magnitude mappings to display new distribution lines 104 to represent the distribution of documents within that document set across the displayed topic nodes 103. Process 1804 is completed when the system has re-rendered the interface and the system is ready to receive additional user interactions. An example of the results of this interaction can be seen in FIG. 6, relative to FIG. 5. The process steps 1801 through 1804 can be repeated as desired and as additional selections are made to enable users to continuously view the distribution of newly selected or filtered document sets across displayed topic nodes 103.
[00131] FIG. 19 is a logical flow diagram that can be implemented in addition to the logical flows depicted in FIG. 17 and FIG. 18, for an embodiment that enables opening a project with multiple saved queries in order to retrieve the cumulative set of documents represented by those queries, allowing users to combine the document sets retrieved from multiple distinct queries with a logical “OR”. An example of a corresponding interface display for this logical flow can be seen in FIG. 9 - FIG. 14. In process 1901, the system receives, processes, and renders the response from a user-specified query. Process 1902 is triggered once a user indicates that they wish to add the currently displayed query to a project. Since multiple queries can be added to a project, processes 1901 and 1902 can be repeated as many times as desired. Process 1903 is triggered when the user indicates a desire to open an existing project. Process 1903 in turn triggers process 1904, where a collection of documents is retrieved such that a document is included in the collection if it matches at least one of the queries that were added to the project. At this point any duplicate documents can be removed. The system also retrieves the magnitudes of mappings 2701. Once the requisite data is retrieved, the system proceeds to process 1905, where the interface is re-rendered with the new document set and magnitude data, used for the distribution lines 104.
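Process 1904 amounts to a union over the result sets of the saved queries, with duplicates removed. A minimal sketch follows; `open_project` and the `run_query` callback are hypothetical names, and the saved results are stubbed with a dictionary rather than a real database call.

```python
def open_project(saved_queries, run_query):
    """Combine the results of every saved query with a logical OR, removing duplicates."""
    docs = set()
    for query in saved_queries:
        docs |= set(run_query(query))  # set union drops documents seen before
    return docs

# Stubbed query results standing in for a database call (illustrative data only).
fake_results = {
    "glia AND (Midbrain OR Hindbrain)": ["d1", "d2"],
    "* AND Forebrain": ["d2", "d3"],
}
project_docs = open_project(list(fake_results), fake_results.get)
```

The resulting set would then become the document set represented by the central node, and the magnitudes of mappings would be recomputed against it.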
[00132] FIG. 20 is a block diagram showing, at a high level, a suitable architecture of the infrastructure and modules necessary for a particular embodiment of the invention to set up and prepare the system and data objects for operation. Block 2001 represents the document data objects of the set X, which may come from any variety of sources. The purpose of the Data Cleansing and Normalization Module 2003 for set X is to ensure all data has the consistent and desired formatting necessary for the embodying system to operate with it. Once the data is prepared according to the designed specification for the implementing system of the embodiment, the data is stored for later use, for example in a document database 2004 of some kind, such as a SQL database. Graph data set 2002 is a raw form of the set Y of related data objects. The purpose of the Data Cleansing and Normalization Module 2006 for set Y is to prepare the set of related objects for use in accordance with operation of the system. The exact implementation and functions of this module will vary depending on the data 2002 that is provided as an input. However, the output should be hierarchical or heterarchical data with no loops. The module can be an automated process, a manual process, or a combination of both, as is necessary to normalize the data inputs. Once normalized, the hierarchically related data of set Y 2002, output from the normalization module 2006, is stored for later use. Any suitable data storage mechanism will suffice, though the embodiment depicted utilizes a Neo4j graph database store 2005. Once the data sets X and Y are prepared and stored, the mapping module 2007 takes these data sets as inputs for its processes that determine which objects in set X should have mappings to specific objects in the set Y. The criteria for determining whether a mapping should be made are arbitrary and will vary depending on the system’s needs. In the embodiment of the invention depicted, a mapping is generated for a document in the set X if a topic from the set Y is mentioned in that document. Once the mappings are generated, they are stored for later access. Any suitable mechanism of data storage and access is sufficient, such as, for example, the Elasticsearch database 2008. The documents from the set X are stored in the database 2008 and indexed with their mappings, which in the embodied system are topics from the set Y that the documents were found to be related to in the mapping module.
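The mention-based mapping criterion described above can be sketched with a simple text match. This is an illustration under stated assumptions: the function name `map_documents` is hypothetical, and treating a “mention” as a case-insensitive whole-word match on the topic name is a simplification chosen for the example, not a detail given for the embodiment.

```python
import re

def map_documents(documents, topics):
    """Map each document id to the set of topics whose names appear in its text."""
    # Assumption: a mention is a case-insensitive whole-word match of the topic name.
    patterns = {t: re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in topics}
    return {doc_id: {t for t, p in patterns.items() if p.search(text)}
            for doc_id, text in documents.items()}

# Toy documents (illustrative data only).
documents = {
    "d1": "Lesions of the thalamus and midbrain.",
    "d2": "Glia in the hindbrain",
    "d3": "No anatomy here",
}
mappings = map_documents(documents, ["Thalamus", "Midbrain", "Hindbrain"])
```

The resulting per-document topic sets correspond to the tags that would be indexed alongside each document for later retrieval.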
[00133] FIG. 21 depicts an example input and output of the data normalization process for a collection of document data objects from different sources and with different formatting, in accordance with the embodiment of the present invention in FIG. 20. Headers 2101 and 2102 are tabularized headers of sample XML-formatted document-based data from two different sources and with two different kinds of formatting. The embodied system in this example utilizes data from both sources, and so the data must be normalized so that both have consistent headers for later access and querying. If a system only uses information from a single source and reformatting of the data is not necessary, this module may be omitted.
[00134] FIG. 22 depicts an example input and output for the data normalization process, in accordance with the embodiment of the present invention in FIG. 20, to produce a hierarchically related set of data objects that represent topics. Item 2201 is a connected graph data object representing a variety of relationships between the topics contained within the graph. Embodiments of the present invention are well suited for use with hierarchically or heterarchically related data objects. As such, the purpose of this module is to ensure that the data consumed by the system is either hierarchically or heterarchically related, and has no direct or transitive loops between data objects. If a direct or transitive loop is determined to be present, automated and/or manual methods can be used to address it, as suited to the needs and capabilities of the embodying system and environment. If an embodiment of the invention does not utilize graph data inputs, and instead generates hierarchies from scratch, then this module is not necessary.
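By way of illustration only, the loop check described for FIG. 22 could be sketched as a depth-first search over parent-to-child topic edges. The function name `find_cycle`, the edge-list input format, and the sample topic names are assumptions made for this sketch; the specification does not prescribe how loops are detected or resolved.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one loop as a list of topics (first == last), or None if loop-free."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)  # every topic starts as UNVISITED
    path = []                 # topics on the current depth-first path

    def visit(topic):
        state[topic] = IN_PROGRESS
        path.append(topic)
        for child in children.get(topic, []):
            if state[child] == IN_PROGRESS:
                # The child is already on the current path: a loop.
                return path[path.index(child):] + [child]
            if state[child] == UNVISITED:
                loop = visit(child)
                if loop:
                    return loop
        path.pop()
        state[topic] = DONE
        return None

    for topic in list(children):
        if state[topic] == UNVISITED:
            loop = visit(topic)
            if loop:
                return loop
    return None
```

A topic with several parents (a heterarchy) is accepted by this check; only a path that returns to a topic already on the same path is reported, and the reported loop could then be handed to an automated or manual resolution step.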
[00135] FIG. 23 depicts an example of input and output data objects for the mapping module 2007 in FIG. 20, from an embodiment of the invention that will produce a mapping between a first set of document data objects 2103 of FIG. 21, and a second set of hierarchically related data objects representing topics 2202 of FIG. 22. The criteria by which a mapping is made between members of the first and second sets of objects 2103 and 2202 are arbitrary but should be consistent and quantifiable. For example, the embodiment of FIG. 23 creates a mapping for a document in 2103 if a topic in 2202 is mentioned in the title or text of the document. The output of the mapping module 2007 is represented in the tabular data representation 2301, which depicts how the documents of 2103, stored in the SQL database 2004, are updated with a new column, “Mappings”, which contains all the topics from 2202 that were found to be mapped to the documents. Note that 2301 still retains the additional fields of 2103, such as “Text”; they are simply not depicted in the figure. In the embodiment depicted, the updated Set X 2301 is then stored in a search database like Elasticsearch, though any other sufficiently queryable system is suitable as well. The quantifiable metric in this embodiment is the number of documents in which each topic is mentioned.
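The mention-based mapping criterion described for FIG. 23 can be pictured with the following minimal sketch. The function name `map_documents`, the dictionary-based document records, and the choice of case-insensitive whole-word matching are assumptions for illustration; only the “Title”, “Text”, and “Mappings” field names come from the description.

```python
import re


def map_documents(documents, topics):
    """Return copies of the documents with a "Mappings" list added.

    A topic is mapped to a document when the topic name appears as a whole
    word in the document's title or text (case-insensitive; an assumption).
    """
    patterns = {
        topic: re.compile(r"\b" + re.escape(topic) + r"\b", re.IGNORECASE)
        for topic in topics
    }
    mapped = []
    for doc in documents:
        haystack = f'{doc.get("Title", "")} {doc.get("Text", "")}'
        hits = [topic for topic, pattern in patterns.items() if pattern.search(haystack)]
        # Copy the record so the original fields (such as "Text") are retained.
        mapped.append({**doc, "Mappings": hits})
    return mapped
```

The original records are left unmodified, and every existing field is carried over into the output, mirroring the note that 2301 retains the fields of 2103.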
[00136] FIG. 24 depicts system architecture for an embodiment of the invention that is accessible via the internet, using a client-server architecture. Any architecture that enables the prepared data to be effectively queried and processed to perform the functions and capabilities outlined in this patent is suitable. In the embodiment depicted, the client web application 2401 is running in a browser on a user’s computer. The web application 2401 is in communication over the internet with web server 2402, which may be hosted privately, locally, in the cloud, or by any other suitable hosting mechanism that enables communication via networks such as the internet. The web server 2402, in turn, is coupled to document database 2008, containing documents with mappings 2301, and graph database 2005, which contains the hierarchically or heterarchically related data objects 2202 utilized by the system.
[00137] FIG. 25 is a block diagram of an embodiment of the present invention, showing the modules utilized in implementation of the server application 2402 of FIG. 24, which communicates with the client application 2401 and the document database 2008 and graph database 2005. The Request Routing Module 2501 is responsible for receiving requests from the client application, orchestrating their processing, and sending responses. The Document Retrieval Module 2502 and Graph Retrieval Module 2504 are responsible for communicating with the databases to get the data objects necessary to fulfill the client request. The Magnitude-of-Mapping Retrieval Module 2503 is responsible for quantifying the mappings that are found in the retrieved documents and providing a magnitude for each type of mapping, if the capability is not included in the database itself. The Data Consolidation Module 2505 is then responsible for integrating the data retrieved from the variety of sources so that the Request Routing Module 2501 can respond with the appropriate data payload.
[00138] FIG. 26 is a visual representation of example inputs and output data of an embodiment of the recommended “Magnitude-of-Mapping Retrieval Module” 2503 of FIG. 25. When a query for documents is issued to the server and documents are retrieved in process 2301, the mappings must be quantified for each topic in the system that had a mapping determination made against the set X. An example output is depicted in 2601, where for each possible mapping type, or topic, the number of times the mapping appeared in the result set is totaled. This is not the only way to quantify the mappings, but simply the approach implemented in this particular embodiment of the invention. Any sensible quantification that is suitable to the aims of the embodying system may be implemented.
[00139] FIG. 27 depicts abstract representations of example inputs and output of an embodiment of the “Data Consolidation Module” 2505 of FIG. 25, which combines the magnitudes of mappings output 2601 with the set of hierarchically related topics 2202. The data need not be integrated in this way, but rather this module is presented and visualized in this manner to better help the reader understand how the system data is combined for later use in rendering.
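As a rough sketch of the counting and consolidation steps described for FIGS. 26 and 27, the functions below total how many retrieved documents carry each mapping and then join those totals with each topic's parent links. The names `mapping_magnitudes` and `consolidate`, the `parents` dictionary, and the output record layout are hypothetical; the description leaves the data structures open.

```python
from collections import Counter


def mapping_magnitudes(result_docs, topics):
    """Count, for every topic, how many retrieved documents carry that mapping."""
    counts = Counter()
    for doc in result_docs:
        # set() so that a topic listed twice on one document counts once.
        counts.update(set(doc.get("Mappings", [])))
    # Topics that never appear in the result set get a magnitude of 0.
    return {topic: counts[topic] for topic in topics}


def consolidate(parents, magnitudes):
    """Attach each topic's magnitude to its place in the topic hierarchy.

    `parents` maps a topic to a list of parent topics; a list is used so that
    heterarchical (multi-parent) topics can be represented.
    """
    return [
        {"topic": topic, "parents": parents.get(topic, []), "magnitude": magnitude}
        for topic, magnitude in magnitudes.items()
    ]
```

Where the document database can aggregate counts itself, the first function would be unnecessary, consistent with the remark that module 2503 is needed only if the capability is not included in the database.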
[00140] FIG. 28 is a diagram, in accordance with an embodiment of the present invention, which shows modules for use in implementation of the client application 2401 of FIG. 24. The Render Module 2801 utilizes system data to build the display that the user sees. The Interaction Module 2802 integrates with the rendered display and programs interactive capability so the user can interact with the system. The Query Generation Module 2803 maintains and tracks data that is necessary for query generation and passes it to the Request Module 2804, which maintains communication with the application server 2402, when requests are made for new query responses.
[00141] FIG. 29 shows an embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of FIG. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103. The number of documents mapped to each topic in 2701 is used to determine the width of the distribution lines 104. The central node 102 represents all documents returned from a submitted query, and one or more higher-tier topic nodes in 2701 are excluded from the rendering.
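One simple way to derive distribution-line widths from the per-topic document counts is a linear scale against the size of the full result set, sketched below. The function name `line_widths`, the pixel limits `max_width` and `min_width`, and the linear scaling itself are assumptions; the description says only that the counts determine the widths.

```python
def line_widths(magnitudes, total_documents, max_width=40.0, min_width=1.0):
    """Map per-topic document counts to line widths on a linear scale.

    A topic with no mapped documents gets width 0 (no line); any other topic
    gets at least `min_width` so that small counts remain visible.
    """
    widths = {}
    for topic, count in magnitudes.items():
        if count <= 0 or total_documents <= 0:
            widths[topic] = 0.0
        else:
            widths[topic] = max(min_width, max_width * count / total_documents)
    return widths
```

Other scales (for example logarithmic) would fit the description equally well; the linear form is chosen here only because it keeps widths directly comparable.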
[00142] FIG. 30 depicts a second embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of FIG. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103. Again, the number of documents mapped to each topic in 2701 is used to determine the width of the distribution lines 104 rendered. The central node 102 in this embodiment represents a topic node from the topic hierarchy in 2701. In this embodiment, the master set of documents consists of the documents mapped to “Brain.” No node is rendered to represent the full set of documents returned from an active query. Instead, another mechanism is provided to select the full document set, via the orphaned node 3001.
[00143] FIG. 31 depicts a third embodiment of a rendering by the Render Module 2801 of FIG. 28 as a result of processing the consolidated data 2701 of FIG. 27, in which the hierarchically related topics in 2701 are utilized to render topic nodes 103. Again, the number of documents mapped to each topic in the abstract representation of the data on the left side of the figure is used to determine the width of the distribution lines 104 rendered. In this embodiment, the central node 102 represents the set of all documents returned from a query, and the root topic in the abstract data representation of the hierarchically related topics is also rendered.
[00144] FIG. 32 depicts an embodiment of a rendering by the Render Module 2801 of FIG. 28 wherein document data 2301 of FIG. 23, retrieved in response to a user query and stored in the system, is utilized to render search results 107.
[00145] FIG. 33 depicts an example of a basic interaction that the Interaction Module 2802 of FIG. 28 can be programmed to handle. In the embodiment depicted, clicking a topic node 103, or a distribution line 104 connecting to a topic node 103, triggers the selection of the topic node 103. Selection of a topic node 103 triggers the generation of a new query and the retrieval of the subset of results that correspond to the selected topic node 103. Refer to FIGs. 17-19 for more sophisticated explanations of suggested interactions.
[00146] FIG. 34 depicts an example of how the Query Generation Module 2803 might be implemented to interpret the application state depicted and convert it into a query. Note that the search bar 105 is not required, and the Sieve can be combined with any other search facets desired as well. In the current system state, a user query 3401 is entered in the search bar 105, and a selection 3402 has been made on “Topic A”. In the embodiment depicted, the system retrieves all documents that match the search query and are also mapped to “Topic A”. Furthermore, in this embodiment any document mapped to a subtopic is also considered to be mapped to the parent topic. Therefore, a selection of “Topic A” also retrieves documents that are mapped to “Topic G”, “Topic F”, “Topic E”, “Topic I”, and “Topic H”. The combined query is described in 3403.
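The subtopic expansion described for FIG. 34 might be sketched as follows: the selected topic is expanded to include its direct and transitive subtopics, and the result is combined with the user's search text. The function names, the `children` dictionary, the particular arrangement of Topics E through I under Topic A, and the generic `AND`/`OR` query string (which is not the syntax of any particular search engine) are all assumptions for illustration.

```python
from collections import deque


def descendants(children, topic):
    """Return the topic followed by all direct and transitive subtopics (breadth-first)."""
    seen = [topic]
    queue = deque([topic])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guards against multi-parent duplicates
                seen.append(child)
                queue.append(child)
    return seen


def build_query(user_query, selected, children):
    """Combine the search text with the selected topic and its subtopics."""
    clause = " OR ".join(f'mapping:"{t}"' for t in descendants(children, selected))
    return f"({user_query}) AND ({clause})"


# Hypothetical hierarchy; the actual arrangement in FIG. 34 is not reproduced here.
EXAMPLE_CHILDREN = {
    "Topic A": ["Topic E", "Topic F"],
    "Topic E": ["Topic G"],
    "Topic F": ["Topic H", "Topic I"],
}
```

Calling `build_query("neuron", "Topic A", EXAMPLE_CHILDREN)` yields a single Boolean string covering Topic A and all five of its subtopics, which corresponds to the behavior in which a document mapped to a subtopic is treated as mapped to the parent.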
[00147] FIG. 35 is a further representation of the display associated with the embodying system depicted in FIG. 34, with a more complex example of how a query can be generated given a more complex selection state of the interface. In contrast to FIG. 34, FIG. 35 depicts a system state where a multi-selection 3501 has been made on “Topic I” and “Topic F”. Item 3502 contains a more sophisticated description of the combined query.
[00148] FIG. 36 depicts a simplified example of an embodiment of the present invention, which features a total of 6 neuroscience related documents and a simplified topic map describing the human anatomy of the brain. In this FIG. 36, a “*” query, meaning return all documents, has been submitted and no selections have been made. Therefore, all documents are returned, regardless of their mappings. In Boolean logic, the query depicted can be stated as: (*).
[00149] FIG. 37 is a further representation of the display associated with the embodying system in FIG. 36, but with a selection made on the topic “Midbrain”, indicating a query of all available documents that are also related to the topic “Midbrain”. In Boolean logic, the query can be stated as: ((*) ∧ (mapping: “Midbrain”)).
[00150] FIG. 38 is a further representation of the display associated with the embodying system in FIG. 37, but with a selection made on the topic “Forebrain”, indicating a query of all available documents that are also related to the topic “Forebrain”, or any of its subtopics. In Boolean logic, the query can be stated as: ((*) ∧ (mapping: “Forebrain”)).
[00151] FIG. 39 is a further representation of the display associated with the embodying system in FIG. 38, but with a selection made on the topic “Diencephalon”, indicating a query of all available documents that are also related to the topic “Diencephalon”, or any of its subtopics. In Boolean logic, the query can be stated as: ((*) ∧ (mapping: “Diencephalon”)).
[00152] FIG. 40 is a further representation of the display associated with the embodying system in FIG. 39, but with a selection made on the topic “Thalamus”, indicating a query of all available documents that are also related to the topic “Thalamus”. In Boolean logic, the query can be stated as: ((*) ∧ (mapping: “Thalamus”)). In some embodiments, when the pan or zoom features are used, the distribution line 104 that runs from “Forebrain” off the left side of the page need not be rendered, and thus the topic node 103 is the central node.
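To make the selection queries of FIGS. 36-40 concrete, the sketch below evaluates a “return all documents” query against a set of selected topics over in-memory records. The function names, the record layout, and the sample brain hierarchy are assumptions for illustration (the hierarchy follows standard neuroanatomy, not necessarily the exact topic map of the figures); a real embodiment would issue the query to a search database rather than filter in memory.

```python
def expand(selected, children):
    """Return the selected topics together with all of their transitive subtopics."""
    result, stack = set(), list(selected)
    while stack:
        topic = stack.pop()
        if topic not in result:
            result.add(topic)
            stack.extend(children.get(topic, []))
    return result


def run_query(documents, selected, children):
    """Evaluate (*) AND (any selected topic or subtopic is among the mappings).

    With no selection, every document is returned regardless of its mappings.
    """
    if not selected:
        return list(documents)
    wanted = expand(selected, children)
    return [doc for doc in documents if wanted & set(doc["Mappings"])]


# Hypothetical topic map and documents.
BRAIN = {
    "Brain": ["Forebrain", "Midbrain"],
    "Forebrain": ["Diencephalon", "Telencephalon"],
    "Diencephalon": ["Thalamus"],
}
DOCS = [
    {"id": 1, "Mappings": ["Thalamus"]},
    {"id": 2, "Mappings": ["Midbrain"]},
    {"id": 3, "Mappings": ["Telencephalon"]},
    {"id": 4, "Mappings": []},
]
```

With this data, selecting “Forebrain” returns documents 1 and 3 because “Thalamus” and “Telencephalon” are reached through subtopic expansion, while an empty selection returns all four documents.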
[00153] FIG. 41 is a further representation of the display associated with the embodying system in FIG. 40, but with selections made on the topics “Thalamus” and “Telencephalon”, indicating a query of all available documents that are also related to either the topic “Thalamus” or the topic “Telencephalon”, or any of their subtopics. In Boolean logic, the query can be stated as: ((*) ∧ ((mapping: “Thalamus”) ∨ (mapping: “Telencephalon”))).
[00154] FIG. 42 is an embodiment of the sieve diagram 101 in which the central node 102 is rendered in the periphery. Distribution lines 104 emanate from the central node 102 to connect directly or transitively to topic nodes 103. A red rectangular box 4201 is depicted on top of the interface to aid the reader of this document in identifying the region of the diagram that will be zoomed into in FIG. 43.
[00155] FIG. 43 is a further representation of the embodiment of the sieve diagram depicted in FIG. 42, in which the user has utilized the pan/zoom functionality to zoom into the area of the diagram identified by the box 4201 in FIG. 42. The red rectangular box 4301 of FIG. 43 identifies a region that has been selected for enlargement in FIG. 44.
[00156] FIG. 44 is a further representation of the embodiment of the sieve diagram depicted in FIG. 43, in which the user has utilized the pan/zoom feature to zoom into the area identified by the red rectangle 4301 of FIG. 43.
[00157] FIG. 45 is a perspective rendering of a three-dimensional sieve diagram in another embodiment of the present invention. In this embodiment, the distribution of a user’s investments across a variety of funds is depicted. In the image depicted, the user has selected the “Crypto” fund and is able to see additional details, in this case total amount invested, in the fund’s members: “Bitcoin” and “Ether”. A three-dimensional sieve diagram is likely to be more suitable for use with AR (Augmented Reality) and VR (Virtual Reality) systems. In place of a “pan” feature, this embodiment may include a robust “rotation” feature, along with a “zoom”.
[00158] FIG. 46 is a perspective rendering of an embodiment of a circular sieve. In the embodiment of Figure 46 the parent dataset is the central node 4601. The central node 4601 is surrounded by a primary topic node 4602, depicted as a ring, representing the primary topic from which a set of secondary topic nodes 4604 derive. Since the ring represents a single topic, the presence and magnitude of the distribution between the central node 4601 and the topic node 4602, which is usually represented by a distribution line having a visual characteristic, such as width, is instead represented by a filling 4603. In some embodiments the distribution lines 4605 represent the number of documents related to a topic 4604. The portion of the line emanating from the parent node has a width indicative of the number of documents mapped to the parent topic, and the portion of the line arriving at the child topic has a width indicative of the number of documents mapped to the child topic. In some embodiments, the filling 4603 leverages the topic ring 4602 as a container. The full height of the ring represents the number of documents mapped to the central node 4601, and the height of the filling 4603 represents the number of documents mapped to the child topic 4602 as a percentage of the documents mapped to the parent dataset 4601. In the embodiment shown in Figure 46, the filling 4603 indicates that more than 75% of the documents mapped in the central node 4601 are mapped to the topic “Chemicals and Drugs” 4602.
[00159] FIG. 47 depicts a derivative embodiment of the invention in which the parent dataset is not rendered. Therefore, the topic node 4702 for “Chemicals and Drugs” is the central node. The distribution lines 4705 are still able to indicate a volume of articles in each topic. In some embodiments, rendering the parent dataset provides clarity and usefulness; in other embodiments, however, more experienced or acquainted users may prefer to have the parent dataset not rendered.
[00160] In a further derivative embodiment shown in Figure 48, neither the parent dataset nor the primary topic node is rendered. In this embodiment the set of topics 4804 in the inner ring are the central nodes. This embodiment is advantageous to users who prefer less clutter and may be more familiar with the higher-tiered topics. Distribution lines 4805 are varied in thickness at the center of the child topic 4804. Allowing the thickness of the distribution lines 4805 to vary between the two topics creates a more readable and stylistically interesting display, while keeping the thickness proportional at the center of the topic 4804 preserves visual access to the information.
[00161] FIG. 49 depicts an embodiment of the invention wherein a filter ring 4911, representing a previously curated Boolean set of topics, has been added around parent dataset 4901. The filter ring 4911 conceptually represents a filter of the mappings available for representation across the portions of the visualization that are radially more distal relative to the filter ring. In this embodiment, the filter ring 4911 represents all topics that a user selected as related to a particular research assignment. In the representation in Figure 49, the user named the set of topic selections “Research Project A Topics.” The filling 4903 is shown about a quarter of the way filled, indicating that about 25% of the documents in the parent dataset 4901 are mapped to the Boolean set of topics represented by the filter ring 4911. The documents considered for determining mapping magnitudes across the remaining topics, distal to the filter, are now only the subset of documents that matched the query represented by the filter ring.
[00162] In related embodiments, the search bar 4960 is used to create a filter ring 4911. In such an embodiment a user enters a search and the filter ring can depict that search, and the filling 4903 would represent the documents in the parent dataset related to that search.
[00163] FIG. 50 depicts an embodiment of the invention in which topics related to human anatomy are depicted around a central node, representing a 60,000 document test set of the NCBI PubMed corpus. As compared to the embodiment in Figure 46, the filling 5003 is at a different level, as a different number of documents relate to the topic ring 5002 of Anatomy than to the topic ring 4602 of Chemicals and Drugs. Further, the topics 5004 differ from the topics 4604 as they all relate to the topic ring 5002, Anatomy. In this embodiment, the interface only displays up to a certain number of topics at once. If a topic has no children, it is rendered with a slight transparency. If a topic has children, it is solid in color. In this embodiment, a user can view a topic’s hidden children by double-clicking a topic of interest that has hidden children. This will cause all topics except the indicated topic to disappear, reposition the indicated topic, and then display its children and transitive children until either there are no more topics to render or the number of topics rendered would exceed a predetermined render limit.
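The render-limit behavior described for FIG. 50 could be sketched as a breadth-first walk that stops before the limit would be exceeded. The function name `topics_to_render`, the breadth-first ordering, the decision to count the focused topic toward the limit, and the sample topic names are assumptions; the description states only that children and transitive children are displayed until the limit would be exceeded.

```python
from collections import deque


def topics_to_render(children, focus, render_limit):
    """List the focused topic and as many descendants as fit under the limit.

    Descendants are added level by level (an assumed ordering). Rendering stops
    as soon as adding one more topic would exceed `render_limit`.
    """
    rendered = [focus]
    queue = deque([focus])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child in rendered:
                continue  # a multi-parent topic is rendered only once
            if len(rendered) + 1 > render_limit:
                return rendered
            rendered.append(child)
            queue.append(child)
    return rendered


# Hypothetical excerpt of an anatomy topic hierarchy.
ANATOMY = {
    "Nervous System": ["Central Nervous System", "Peripheral Nervous System"],
    "Central Nervous System": ["Brain", "Spinal Cord"],
}
```

With a limit of 4, double-clicking “Nervous System” in this sample would show its two children and “Brain”, leaving “Spinal Cord” hidden until the user drills further.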
[00164] FIG. 51 depicts the embodiment of the invention shown in Figure 50 after a user has selected a topic 5107, in this case Nervous System. In various embodiments the user may select Nervous System using a mouse and keyboard, a touchscreen, an algorithm, or other methods of selection. Selecting the topic 5107 causes the display of documents 5108 to show document information 5109 mapped to the topic 5107. In this embodiment the document information 5109 is the title and the authors. In similar embodiments the document information 5109 includes a summary, an abstract, a date, and other information about and related to the document. In some embodiments a search results summary 5110 shows how many documents are returned for the selected topic 5107. In the example of Figure 51, 7,514 documents are associated with the topic Nervous System.
[00165] FIG. 52 depicts the embodiment of the invention in Figure 50 after a user has selected topic 5207 “Tissues,” causing display of documents 5208 to show documents mapped to the topic “Tissues.” The search results summary 5210 shows there are 17,744 documents associated with this topic. Compared to Figure 51, there are more documents mapped to Tissues than to Nervous System; however, due to the number of child topics 5204 mapped to Nervous System, the angular space of the topic Nervous System is larger than that of Tissues. Therefore, in this embodiment the angular spaces are not indicative of the number of documents mapped to the topic. Further, the distribution lines 5205 are constrained in thickness to the angular space of the topic 5207. Therefore, in this embodiment, if the correct thickness of a distribution line must be constrained due to the angular space of the topic, the distribution lines 5205 may be misleading as to the number of mapped documents.
[00166] FIG. 53 depicts the embodiment of the invention in Figure 50 after a user has selected topic 5307 “Cells,” causing display of documents 5308 to show documents mapped to the topic “Cells.” The search results summary 5310 shows there are 19,576 documents associated with this topic. Compared to Figure 51, there are more documents mapped to Cells than to Nervous System; however, due to the number of child topics 5304 mapped to Nervous System, the angular space of the topic Nervous System is larger than that of Cells. Therefore, in this embodiment the angular spaces are not indicative of the number of documents mapped to the topic. Further, the distribution lines 5305 are constrained in thickness to the angular space of the topic 5307. Therefore, in this embodiment, if the correct thickness of a distribution line must be constrained due to the angular space of the topic, the distribution lines 5305 may be misleading as to the number of mapped documents.
[00167] FIG. 54 depicts a different embodiment of the invention than what is depicted in Figures 50-53. In this embodiment, the compressed distribution lines 5412 are colored with a higher intensity to visually communicate a higher density of documents being represented in a compressed space. In various embodiments the compressed distribution lines 5412 are colored differently, more opaque, less opaque, shaped, patterned, or marked in any of various other ways to depict a higher density of documents.
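The compression idea of FIG. 54 can be illustrated with a small helper that clamps a line to the space available and reports how much denser the clamped line is. The function name `constrained_line`, the returned fields, and the use of the width ratio as the intensity factor are assumptions for this sketch; the description says only that compressed lines are drawn with a higher intensity or other distinguishing mark.

```python
def constrained_line(desired_width, available_width):
    """Clamp a distribution line to the available space and flag compression.

    `intensity` is 1.0 for an unconstrained line and grows with the ratio of
    the desired width to the drawn width (an assumed encoding of density).
    """
    if available_width <= 0:
        raise ValueError("available_width must be positive")
    if desired_width <= available_width:
        return {"width": desired_width, "compressed": False, "intensity": 1.0}
    return {
        "width": available_width,
        "compressed": True,
        "intensity": desired_width / available_width,
    }
```

A renderer could translate the `intensity` value into color saturation, opacity, or a pattern, so that a line drawn narrower than its count warrants is still recognizable as representing more documents.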
[00168] FIG. 55 depicts an embodiment of a sieve 5500 in which the relationships are depicted through a relationship arc 5520. The relationship arc indicates, for a parent/child relationship, the largest width that distribution lines to children can have, and re-establishes the meaning of a relative numeric quantity associated with a width.
[00169] FIG. 56 depicts an embodiment of the Sieve in figure 55 wherein the user has a cursor hovering over distribution line 5621. In this embodiment hovering over the distribution line 5621 causes alt text 5622 to be displayed showing the number of documents associated with the distribution line 5621. In other embodiments the number of documents is not presented as alt text; instead, another method causes the number of documents to be displayed upon some user interaction such as hovering, selection, or another interaction mechanism.
[00170] FIG. 57 depicts the embodiment of figure 55 wherein the user’s cursor is hovering over the relationship arc 5720. Hovering over the relationship arc 5720 causes a text 5721 to be displayed indicative of a number of documents associated with the relationship arc. In some embodiments the thickness of the relationship arc 5720 is indicative of the number of documents, and the distribution lines 5723 to the children are proportional to the thickness of the relationship arc 5720 based on the ratio of documents in the relationship arc to the number of documents related to each Topic C1-C5. This is useful when a large number of documents must be represented in a limited space, as is the case with the children of “Topic C1.”
[00171] FIG. 58 depicts the embodiment of figure 55, wherein the user is hovering a cursor over the distribution line 5824 between topic C and topic C1. The text 5821 shows the number of documents related to topic C1. As described above, the first child distribution line 5824 is 0.9 times (450,000/500,000) as thick as the relationship arc 5820. In other embodiments the distribution line 5824 may be proportional to the thickness of the parent distribution line 5821.
[00172] FIG. 59 depicts the embodiment shown in figure 55 wherein the user is hovering a cursor over the secondary relationship arc 5930 of Topic C1’s children. The secondary relationship arc 5930 re-establishes the width-quantity relationship between topic C1 and its children. The text 5921 shows that for children of topic C1, 450,000 documents will be indicated by a line having the thickness of secondary relationship arc 5930.
[00173] FIG. 60 depicts the embodiment shown in figure 55 wherein the user is hovering a cursor over the distribution line 6032 between topic C1 and topic C13. In this example, each of the 450,000 documents in topic C1 relates to topic C13, and thus the thickness of the distribution line 6032 is the same as the thickness of the relationship arc 6030.
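The width arithmetic of FIGS. 58-60 reduces to scaling the relationship arc's thickness by the share of the arc's documents that the child holds. The function name `child_line_width` and the pixel thicknesses used in the usage note are illustrative assumptions; the document counts are those given in the description.

```python
def child_line_width(arc_width, child_documents, arc_documents):
    """Scale a child's distribution line relative to its relationship arc.

    The arc's thickness stands for `arc_documents`; the child's line is drawn
    at the fraction child_documents / arc_documents of that thickness.
    """
    if arc_documents <= 0:
        raise ValueError("arc_documents must be positive")
    return arc_width * child_documents / arc_documents
```

For example, with an arc drawn 10 units thick to represent 500,000 documents, `child_line_width(10.0, 450_000, 500_000)` gives a line 9 units thick, matching the 0.9 ratio stated for topic C1. At the next level the secondary arc re-establishes the scale at 450,000 documents, so a child holding all 450,000 documents receives the full thickness of that arc.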
[00174] FIG. 61 depicts a derivative embodiment of figures 55-60 wherein the distribution lines connect only to the child topic nodes and do not extend through to the topic node’s relationship arc. The sieve diagram 6100 is read the same way as in the embodiments depicted in figures 55-60, and the difference is only stylistic.
[00175] FIG. 62 depicts an embodiment of the sieve, similar to the embodiment of figure 61. The embodiment of figure 62 is different in that instead of providing a filling, there is a distribution line 6203 which shows the documents from the central node 6201 to Map 1 6204.
[00176] FIG. 63 depicts an embodiment of a sieve diagram 6300 as a webbed map. The widths of the distribution lines 6305 indicate the number of documents associated with each of topics A, B, and C. Further, the widths of the borderlines around topics A, B, and C are indicative of the width with which the quantities depicted by the topics’ incoming distribution lines will be represented in outgoing distribution lines to the topics’ children. Therefore, one can visually perceive the percentage of documents mapped to, for example, topic C that are also mapped to topic C1 by comparing the width of the borderline around topic C to the width of the distribution line from topic C to topic C1. In the embodiment of figure 63, if a node has no children, for example node 6341, then it does not depict a meaningful borderline on the topic node.
[00177] FIG. 64 depicts an embodiment of sieve 6400 in which the lines 6405 connecting the topic nodes, directly and indirectly, to the central node are only used to show the hierarchical pathing between the data objects and their children. In this embodiment, the width of the borderline 6481 of the topic node is used to communicate the quantity of documents mapped to a given topic. In other embodiments different colors or opacities might indicate the number of documents mapped to a given topic.
[00178] FIG. 65 depicts an embodiment of the sieve similar to the embodiments of figures 55-62; however, the central node is a topic node, Root Topic 6550, of the hierarchical set of nodes. This embodiment may be a derivative of another embodiment where, for example, the Root Topic 6550 is selected from a broader selection of topics and the system renders a zoomed version. Similar sieve diagrams can be created, for example, by selecting Topic C1 and creating a new sieve diagram with Topic C1 as the central node and the children C11-C17 surrounding Topic C1.
[00179] FIG. 66 depicts an embodiment of the Sieve 6600 in which no default search results are displayed and initially only the sieve diagram is displayed. In the depiction of this embodiment, the user is about to select the topic “Topic C3”, as indicated by the user’s cursor 6670 hovering over the topic.
[00180] FIG. 67 depicts the embodiment from Figure 66 after the user has selected topic “Topic C3”. In this embodiment, the display of documents 6708 first displays information related to Topic C3, and provides the user the option to see documents upon selecting the button labeled “View Related Documents.” In the depiction of this embodiment, the user is about to click this button, as indicated by the user’s cursor hovering over the button.
[00181] FIG. 68 depicts the embodiment from Figure 67, after the user selected the button labeled “View Related Documents,” which causes the system to retrieve and display results, which in this embodiment are data objects in the first set of objects. In this embodiment, an option to view the original topic description is provided by selecting the button labeled “Back to Topic Details”.
[00182] FIG. 69 depicts an embodiment of sieve 6900 used to visualize a user’s investment portfolio. Distribution lines 6904 are used to indicate the amount of money invested in various stocks and stock categories, and are either red (loss), green (profit), or grey (no profit or loss). In some embodiments, the graphical linkage 6904 may have a color and a thickness, such that the color is a secondary feature representing a quality of the data of which the distribution line is indicative. In such embodiments, the thickness represents a cash-flow while the color represents a positive or negative number, thus allowing the distribution lines to indicate a negative number, since a negative thickness is not otherwise graphically conveyable.
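The split of a signed value into color (sign) and thickness (magnitude) described for FIG. 69 might look like the sketch below. The function name `line_style`, the `units_per_pixel` scale, and the returned dictionary are assumptions for illustration; the red, green, and grey color meanings are those given in the description.

```python
def line_style(cash_flow, units_per_pixel=100.0):
    """Encode a signed cash-flow as a color (sign) and a thickness (magnitude).

    green = profit, red = loss, grey = no profit or loss. Thickness uses the
    absolute value because a negative thickness cannot be drawn.
    """
    if units_per_pixel <= 0:
        raise ValueError("units_per_pixel must be positive")
    if cash_flow > 0:
        color = "green"
    elif cash_flow < 0:
        color = "red"
    else:
        color = "grey"
    return {"color": color, "thickness": abs(cash_flow) / units_per_pixel}
```

Under the assumed scale, a gain of 500 and a loss of 500 are drawn with the same thickness and differ only in color, which is the property the paragraph above relies on.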
[00183] FIG. 70 depicts the embodiment from Figure 69 after the user has selected the stock Netflix, causing the display of additional information 7008 about the selected stock and the performance of their invested funds in the selected stock.
[00184] FIG. 71 depicts the embodiment in Figure 50, in which the user has selected the topic “Animal Structures” 7104. Upon a query of “pig” 7171 being entered into the search bar, a display of the documents 7109 is generated, consisting of those documents mapped to “Animal Structures” that also contain the word “pig”. This embodiment also shows how some topics may have multiple parents. In this embodiment, the topics Cloaca and Nonmammalian are children of both Embryonic Structures and Animal Structures. Therefore, when Animal Structures is highlighted, these children, which are displayed under a different parent topic, are also highlighted. In some embodiments, the sieve diagram places the children with the most closely related parent, but highlights the child topic when any of the parent topics is selected.
[00185] FIG. 72 is a top-down rendering of an embodiment of the invention where the sieve diagram has been rendered in three dimensions. The central node 7201 is marked in a lighter color to denote the height of the node in three dimensions. The distribution lines 7204 flow from the central node 7201 to the topics, either directly or indirectly. Similar to the embodiments shown in FIGs. 47-49, in various embodiments of the three-dimensional sieve diagram, the central node can be, inter alia, a plurality of topics on the outside, a topic ring, or a master document set; and in even further embodiments the sieve diagram includes a filter ring.
[00186] FIG. 73 is a perspective view of the embodiment of FIG. 72 wherein the three-dimensional sieve diagram 7300 is shown with height. In some embodiments, the base of the first hierarchical layer 7380 is indicative of a quantity or quality of documents related to the topics of the layer. In this embodiment, the height of each layer in the wedding-cake-shaped sieve diagram 7300 would decrease as documents are filtered out for not being associated with a child topic of the previous layer. In other embodiments of the sieve diagram 7300, the height of the first hierarchical layer 7380 is not even throughout. Instead, the layer 7380 is formed as a pie chart with various slices of pie at different heights to represent a quantity of documents in that section of the pie. In such embodiments, the distribution lines are not necessary to indicate the quantity; however, in some embodiments they may also be included for more visual cues. In embodiments where the height of base layer 7380 is indicative of a quantity of documents, the height would be a distribution line as defined by this application.
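For illustration only, the decreasing layer heights of the wedding-cake-shaped diagram might be computed as in the following sketch. It assumes, hypothetically, that each document carries a topic path from the first hierarchical layer outward and that a layer retains a document only if the path reaches that layer; the function name `layer_heights` and the `max_height` parameter are likewise assumptions.

```python
from typing import Dict, List


def layer_heights(doc_paths: Dict[str, List[str]], max_height: float = 100.0) -> List[float]:
    """Scale each layer's height to the number of documents that reach that layer."""
    depth = max((len(path) for path in doc_paths.values()), default=0)
    # A document counts toward a layer only if its topic path extends that far.
    counts = [
        sum(1 for path in doc_paths.values() if len(path) > level)
        for level in range(depth)
    ]
    if not counts:
        return []
    # The first layer holds the most documents, so it receives the full height.
    return [max_height * count / counts[0] for count in counts]
```

Because a document that is filtered out at one layer cannot reappear at a later one, the heights produced under these assumptions never increase from one layer to the next.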
[00187] The present invention may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof.
[00188] Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, networker, or locator). Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
[00189] The computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).
[00190] Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL).
[00191] While the invention has been particularly shown and described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. While some of these embodiments have been described in the claims by process steps, an apparatus comprising a computer with associated display capable of executing the process steps in the claims below is also included in the present invention. Likewise, a computer program product including computer executable instructions for executing the process steps in the claims below and stored on a computer readable medium is included within the present invention.
[00192] The embodiments of the invention described above are intended to be merely exemplary; numerous variations and modifications will be apparent to those skilled in the art. All such variations and modifications are intended to be within the scope of the present invention as defined in any appended claims.
Claims
1. A computer-implemented method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection, wherein each of the second set of objects is hierarchically structured, the method implemented by computer processes comprising: storing the first set of objects and storing the second set of objects; parsing the first set of objects in relation to the second set of objects in order to generate a mapping between the first set of objects and the second set of objects; storing the mapping; causing an interactive graphical display of a representation of the first set of objects as a central node and a representation of the second set of objects as a set of topic nodes surrounding the central node or surrounded by the central node, the displayed representations constituting a sieve diagram; and using the mapping to cause display in the sieve diagram of a set of relationships between the first set of objects and the second set of objects by providing a graphical linkage directly or indirectly between the central node and each node that corresponds to a member of the second set of objects, wherein the graphical linkage has a feature that graphically indicates a quantity associated with the mapping.
2. A computer-implemented method according to claim 1, the method further comprising: in response to graphical selection, by a user, in the sieve diagram of a subset of topic nodes corresponding to a topic subset of the second set of objects, retrieving an object subset, of the first set of objects, that is mapped to the topic subset.
3. A computer-implemented method according to claim 1, the method further comprising: receiving a search query; searching, in response to the received search query, the first set of objects, for a search results set of objects having a set of features matching a set of features of the search query; and displaying the search results set of objects.
4. A computer-implemented method according to claim 1, wherein the graphical linkage is a distribution line.
5. A computer-implemented method according to claim 1, the method further comprising: upon receiving a graphical selection by the user of a given one of the topic nodes in the sieve diagram, causing filtering of the first set of objects displayed to include only those objects corresponding to objects in the second set that are represented by the selected topic node.
6. A computer-implemented method according to claim 1, the method further comprising: filtering, upon receiving a graphical selection by the user of one of the members of the second set in the sieve diagram, the first set of objects displayed to include only objects corresponding to objects in the first set that are mapped to the selected member of the second set.
7. A computer-implemented method according to claim 1, the method further comprising: removing, upon receiving a graphical selection by the user of a central node in the sieve diagram, any filtering caused by a graphical selection of a topic node or of a distribution line.
8. A computer-implemented method according to claim 1, further comprising: displaying, upon the graphical selection by the user of the given one of the topic nodes, information pertinent to the selected topic node.
9. A computer-implemented method according to claim 6, further comprising, upon the graphical selection by the user of the member of the second set, causing display of information pertinent to the selected member of the second set.
10. A computer-implemented method according to claim 3, wherein the search query includes a Boolean expression, and wherein the method further includes evaluating the Boolean expression and performing the search using the evaluated expression.
11. A computer-implemented method according to claim 3, wherein the search query is defined, at least in part, by a user selection of a set of graphical elements in the sieve diagram.
12. A computer-implemented method according to claim 3, wherein the search query includes at least one Boolean operator defined by a user input.
13. A computer-implemented method according to claim 3, further comprising the step of displaying, upon receipt of a user command to shrink displayed membership of the first set, only objects having features matching features of the search query.
14. A computer-implemented method according to claim 1, the method further comprising: upon invocation by a user of a topic map limiter, limiting display of objects in the second set.
15. A computer-implemented method according to claim 14, wherein the topic map limiter invokes display of a tier in which the display of objects in the second set is limited.
16. A computer-implemented method according to claim 14, wherein the topic map limiter allows selection by the user of which topics to eliminate from the sieve diagram.
17. A computer-implemented method according to claim 14, wherein the topic map limiter allows selection by the user of which topics to include in the sieve diagram.
18. A computer-implemented method according to claim 1, wherein the computer processes further comprise: upon a graphical selection by a user of a member of the second set of objects in the sieve diagram, causing display of a derivative sieve diagram in which the selected member is a derivative central node.
19. A computer-implemented method according to claim 1, wherein the graphical linkage is a distribution line and the feature is a thickness of the distribution line.
20. A computer-implemented method according to claim 19, wherein the distribution line has a secondary feature that indicates the quantity associated with the mapping under conditions wherein the thickness of the distribution line has been graphically constrained.
21. A computer-implemented method according to claim 1, wherein the graphical linkage has a secondary feature that graphically indicates a quality associated with the mapping.
22. A computer-implemented method for interactive visualization of a set of objects in a hierarchical topic structure, the topic structure comprising a first set of topics and a second set of topics, the method implemented by computer processes comprising:
(a) causing display of a sieve diagram having: the first set of topics; the second set of topics arranged around the first set of topics, wherein each member in the second set of topics is a child of a corresponding topic in the first set of topics; and a set of distribution lines, each distinct distribution line connecting one of the topics with its corresponding child and having a visual characteristic indicative of a quantity of objects associated with such child; and
(b) upon a user selection, in the sieve diagram, of a topic, causing display, outside of the sieve diagram, of objects mapped to the user selected topic.
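By way of a non-limiting sketch, the quantity-indicating feature of claims 1, 19, and 20 might be computed as shown below. The keyword-containment parsing rule, the thickness cap, the pixel scale, and the use of a numeric label as the secondary feature are all assumptions made for this sketch; the names `build_mapping` and `linkage_features` are hypothetical.

```python
from typing import Dict, List, Optional, Set

MAX_THICKNESS = 12.0     # assumed cap on drawable thickness, in pixels
PIXELS_PER_OBJECT = 2.0  # assumed scale from object count to thickness


def build_mapping(objects: Dict[str, str], topics: List[str]) -> Dict[str, Set[str]]:
    """Map each topic to the ids of objects whose text mentions it (assumed parsing rule)."""
    return {
        topic: {oid for oid, text in objects.items() if topic.lower() in text.lower()}
        for topic in topics
    }


def linkage_features(mapping: Dict[str, Set[str]]) -> Dict[str, Dict[str, Optional[object]]]:
    """Derive a thickness per topic, plus a label when the thickness is capped."""
    features: Dict[str, Dict[str, Optional[object]]] = {}
    for topic, ids in mapping.items():
        quantity = len(ids)
        raw = quantity * PIXELS_PER_OBJECT
        constrained = raw > MAX_THICKNESS
        features[topic] = {
            "thickness": min(raw, MAX_THICKNESS),
            # Secondary feature: a numeric label shown only when thickness is capped.
            "label": str(quantity) if constrained else None,
        }
    return features
```

In this sketch the thickness alone conveys the quantity until the cap is reached, after which the label carries the quantity that the constrained thickness can no longer show.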
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2021/041296 WO2023287395A1 (en) | 2021-07-12 | 2021-07-12 | Computer-implemented apparatus and method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023287395A1 (en) | 2023-01-19 |
Family
ID=84920342
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2021/041296 Ceased WO2023287395A1 (en) | 2021-07-12 | 2021-07-12 | Computer-implemented apparatus and method for interactive visualization of a first set of objects in relation to a second set of objects in a data collection |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2023287395A1 (en) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020091679A1 (en) * | 2001-01-09 | 2002-07-11 | Wright James E. | System for searching collections of linked objects |
| US20040177065A1 (en) * | 2003-02-18 | 2004-09-09 | Hermann Tropf | Database and method for organizing data elements |
| US20090182837A1 (en) * | 2008-01-11 | 2009-07-16 | Rogers J Andrew | Spatial Sieve Tree |
| US20100106752A1 (en) * | 2004-05-04 | 2010-04-29 | The Boston Consulting Group, Inc. | Method and apparatus for selecting, analyzing, and visualizing related database records as a network |
| US20130013590A1 (en) * | 2011-02-17 | 2013-01-10 | International Business Machines Corporation | Searching and Displaying Data Objects Residing in Data Management Systems |
| US20150261833A1 (en) * | 2014-03-17 | 2015-09-17 | SynerScope B.V. | Data visualization system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21950319; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21950319; Country of ref document: EP; Kind code of ref document: A1 |