WO2022093003A1 - Method and system to provide visualization interpretation through establishing relationship between internal and external trending data influences - Google Patents

Info

Publication number
WO2022093003A1
Authority
WO
WIPO (PCT)
Prior art keywords
interpretation
trend
keyword
establishing
anomalies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/MY2020/050175
Other languages
French (fr)
Inventor
Suriani RAPA'EE
Fazli MAT NOR
Muhammad Hazwan MOHD FOWZI
Muhammad Amin HAMID
Sharipah Setapa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mimos Bhd
Original Assignee
Mimos Bhd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Bhd
Publication of WO2022093003A1
Ceased

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9537: Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/907: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Definitions

  • the present invention generally relates to the network technologies, and more particularly to a method to provide visualization interpretation through establishing relationship between internal and external trending data influences.
  • visualization is created without any interpretation of the data, and users usually need to think and interpret the visualization themselves, i.e. manually, based on limited knowledge provided by data and the visualization.
  • visualization requires a relevant interpretation based on certain domain and current trending for better understanding of the data.
  • the existing visualization normally has no specific interpretation of relevant data involved, which might cause several issues such as: each user will interpret the visualization differently; interpretation of visualization will be isolated from each other; there is no relationship between external trending data and the visualization; and/or the audience has to interpret by themselves and the interpretation might be interpreted wrongly without any knowledge of the data itself. Hence, incorrect interpretation by the user might lead to wrong decision making.
  • US20110106589A1 discloses a clear and intuitive user interface that can turn on/off a combination of social media measurements, and help a user to drill down to as much details as desired across different timeframes and social media measurements.
  • the various types of superimposed graphs and data described herein may facilitate a user's ability to interpret and understand information associated with social media.
  • US7428545B2 discloses visualization space, data preparation tool, inference engine, and predictor.
  • the user interacts with the system through a visual representation space where various graphical objects are rendered.
  • Graphical objects represent data, knowledge (e.g. as induced rules), and query explanations (decisions on unknown data identifications).
  • the system integrates graphical objects through the use of visually cognitive, human-oriented depictions. A user can also examine non-graphical explanations (i.e. text based) to posed queries.
  • US20150220946A1 discloses a method of automated trend identification, that can include: receiving communication data; receiving at least one modularity selection, the modularity selection defining a plurality of features; identifying instances of the features in the communication data; receiving at least one report selection; producing a statistical measure of the identified instances of the features; evaluating the statistical measure; and identifying a trend of interest from the evaluation of the statistical measure, wherein the trend of interest comprises a report selection and a feature.
  • One aspect of the present invention provides a method of providing visualization interpretation.
  • the method comprises the step of retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
  • the step of establishing profile data trending anomalies (200) comprises establishing domain knowledge, internal and external influencers, where the relationship between domain knowledge and external influencers is already predefined; consolidating internal and external influencers; and establishing anomalies for external influencers, wherein external influence anomalies are established.
  • the step of establishing interpretation relationship comprises getting metadata of dimensions, measurement and values for each selected data attribute; finding the most frequent keywords for the attributes, wherein a frequent keyword of an attribute's metadata is found and searched; finding trends for each metadata keyword, wherein the trend for each keyword is retrieved from the trend knowledge database; calculating the frequency of each metadata trend relationship based on keyword and event date, wherein the metadata trend relationship is retrieved from the trend knowledge database; getting the highest trend of keywords and event date, wherein the highest trend of keywords is selected based on the earlier calculation; finding related news or events based on trend keyword and data attribute, wherein the related news or events are retrieved from the news knowledge database; and tagging related news, trend keyword and event date with report metadata using Named Entity Recognition (NER).
  • the step of establishing visualization interpretation comprises mapping trend keyword and NER tagging with dataset; scoring and ranking the mapped trend keyword to provide accurate interpretation; and generating potential interpretation from metric and visualizing it.
  • the system comprises a client app being installed in a client device; a server including a microprocessor for executing and a computer-readable storage medium for storing a data anomalies detection engine, an interpretation generation engine, and an interpretation relationship engine; a domain knowledge database; a trend knowledge database; and a news knowledge database.
  • the microprocessor executes the data anomalies detection engine, interpretation generation engine, and interpretation relationship engine for retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
  • Another aspect of the present invention provides a non-transitory computer readable medium comprising computer-executable instructions that when executed by a processor of a computing device perform a method comprising retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
  • FIG 1 shows a block diagram of the system for providing interpretation of visualization in accordance with one embodiment of the present invention
  • FIG 2 shows tabular views of an exemplary Domain knowledge database containing predefined relationships among domain knowledge;
  • FIG 3 provides an example of model anomaly;
  • FIG 4 provides an illustration of the exemplary monitoring and processing of data based on haze
  • FIG 5 provides an illustration of matching metadata with data trending from the Trend knowledge database
  • FIG 6 provides an illustration of finding news headlines based on keyword, event date, and location from the news knowledge database
  • FIG 7 provides an example of related news headlines or events based on attribute, trend keyword and event date;
  • FIG 8 provides an example of disease attribute tagging, where NER denotes Named Entity Recognition
  • FIG 9 provides an example of total patient attribute tagging
  • FIG 10 provides an example of mapping attributes and trend keyword with NER prior to interpretation structure;
  • FIG 11 provides an exemplary tabular view showing how the score for each NER is calculated to get the ranking
  • FIG 12 provides an exemplary interpretation that has metrics based on attribute, NER and score
  • FIG 13 provides block diagrams illustrating the generation of potential interpretation
  • FIG 14A and FIG 14B provide examples of visualization interpretation
  • FIG 15 shows a block flowchart of the method of providing visualization interpretation in accordance with one embodiment of the present invention.
  • FIG 16 shows a flowchart of establishing profile data trending anomalies in accordance with one embodiment of the present invention
  • FIG 17 shows a flowchart of establishing interpretation relationship in accordance with one embodiment of the present invention.
  • FIG 18 shows a flowchart of establishing visualization interpretation in accordance with one embodiment of the present invention.
  • the present invention provides method and system that combines external news, trending data and internal data to provide interpretation of visualization and to provide meaningful interpretation for better understanding of the visualization.
  • FIG 1 illustrates a block diagram of the system for providing interpretation of visualization in accordance with one embodiment of the present invention.
  • the system 1 comprises a client app 15 being installed in a client device 10, a server 20 including a microprocessor for executing and a computer-readable storage medium for storing a data anomalies detection engine 30, an interpretation generation engine 40, and an interpretation relationship engine 50, a domain knowledge database 60, a trend knowledge database 70, and a news knowledge database 80.
  • the domain knowledge database 60, the trend knowledge database 70, and the news knowledge database 80 can be stored in separate storage medium as shown in FIG 1, but can also be stored in the computer-readable storage medium of the server 20.
  • the microprocessor and the computer-readable storage medium can be implemented in single or multiple devices.
  • the client device 10 with the installed client app 15 can be any computing device such as a personal computer (PC), a notebook computer, a mobile phone, a workstation, and a personal digital assistant (PDA).
  • the client app 15 enables a user to communicate with the server 20.
  • the computer-readable storage medium of the server 20 can include volatile and non-volatile, removable and non-removable medium implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • the computer-readable storage medium can be implemented as a single storage device but may also be implemented across multiple storage devices or sub-systems. Examples of the computer-readable storage medium include random access memory, read only memory, magnetic discs, optical discs, flash memory, virtual memory, and non-virtual memory, magnetic sets, magnetic tape, magnetic disc storage or other magnetic storage devices, or any other medium which can be used to store the desired information and that may be accessed by an instruction execution system.
  • the computer-readable storage medium can be a non-transitory storage medium.
  • the data anomalies detection engine 30 identifies and collects domain anomalies by establishing attribute dataset for each domain and comparing with internal and external influencers.
  • the interpretation generation engine 40 analyses trend keyword, attribute dataset, and news for obtaining and interpreting visualization pattern with report metadata using Named Entity Recognition (NER) by performing matrix of tagging.
  • the interpretation relationship engine 50 matches and interprets visualization by providing a schema hierarchy priority with score characteristic.
  • the domain knowledge database 60 is a repository for specific knowledge stored related to a particular dataset.
  • An exemplary domain knowledge database 60 with health, weather, finance domain, etc. is shown in FIG 2 (detailed description hereinbelow).
  • the trend knowledge database 70 is a repository for information gathered from external repositories such as Google Trends and social media such as Twitter, YouTube, Facebook and Instagram.
  • the news knowledge database 80 is a repository of news gathered based on the trend keyword.
  • Examples of news repositories include, but are not limited to, Bernama, CNN and Awani.
  • the exemplary Domain knowledge database 60 comprises domains of health, finance, education, tourism, and environment. It is to be noted that the definition and content of each domain and the number of domains in the domain knowledge database 60 are not limited by the examples shown in FIG 2.
  • the exemplary health domain knowledge 61 comprises the categories of disease and internal influencer, where the disease category comprises infectious disease, heart disease, allergies and asthma, and cancer, while the internal influencer category comprises viruses and bacteria, smoking and allergy.
  • the exemplary environment domain knowledge 65 comprises potential external influencers such as haze and weather.
  • the exemplary predefined relationships as denoted by the long arrows are the intra-domain relationship between the allergies and asthma from the disease category and the viruses and bacteria from the internal influencer category and the inter-domain relationship between the allergies and asthma from the health domain knowledge 61 and the haze from the environment domain knowledge 65.
  • the profile data of a disease such as asthma is retrieved from the health domain knowledge 61, where the profile data comprises date/time, gender, state, age, total patient and disease.
  • Once the profile data are retrieved, they are classified as attributes.
  • the attributes are defined in fact table for the specific domain which is in health domain knowledge 61.
  • the attributes are classified as anomalies attribute when the abnormal pattern of attributes is identified in the dataset.
  • the model anomaly is established by relating specific internal symptom(s) (e.g., allergic) with external influencer(s) (e.g., haze).
  • In FIG 4, there is provided an illustration of the exemplary monitoring and processing of data based on haze.
  • the data based on haze are monitored and processed, and the anomalies are detected based on trending news caused by haze through the air pollution index.
  • Metadata describes the content, purpose, source and structure of the dataset and summarizes basic information about the data in the dataset. An example of metadata, shown in FIG 5 for the school attribute, is ‘School will be closed when API exceed 300’. Metadata is already predefined for each attribute in the database.
  • the search keyword for finding news headlines usually comprises an attribute (Att) such as disease, a trend keyword (tk) such as haze, and a location (e.g. Petaling Jaya).
  • a search is carried out in the news knowledge database 80, and relevant news headlines are retrieved and shown as ’Results’.
  • the related news headlines are classified into the same predefined category such as people, organization, disease, etc.
  • the attribute is a data field in the dataset that represents a feature of a data object.
  • For example, patient object attributes can be patient identity, address, date of birth, gender, etc.
  • the trend keyword is based on a trending topic that matches the most frequent keyword in the attribute's metadata for a certain period of time.
  • In FIG 9, there is provided an example of total patient attribute tagging.
  • These two examples in FIG 8 and FIG 9 show the relationship between two attributes (“disease” and “total patient”) found through their external trending data: the search results from the news headlines show similarities based on the frequent words found.
  • In FIG 10, there is provided an example of mapping attributes and trend keyword with NER prior to the interpretation structure.
  • In FIG 12, there is provided an exemplary interpretation that has metrics based on attribute, NER and score. The pattern sentence can be interpreted based on the table shown in FIG 12 as: ‘Disease and total patient impacted by pollution’.
  • In FIG 13, there are provided block diagrams illustrating the generation of potential interpretation.
  • the interpretation is: ‘the combination of disease and total patient has been impacted by pollution’; in the bottom box, the interpretation is: ‘the combination of disease and total student has been impacted by pollution in schoolA in Petaling Jaya area’.
  • the visualization interpretation of the bar diagram can be ‘The combination of diseases attribute and total patient attribute might have been impacted by (total patient) pollution (haze)’.
  • For the pie diagram shown in FIG 14B, the visualization interpretation can be ‘The combination of disease attribute and total student attribute might have been impacted by pollution (haze) in school in Petaling Jaya area’.
  • the method comprises retrieving dataset from various attribute domains including e.g. dimension, measurement and values at step 100; establishing profile data trending anomalies by modeling anomalies profile with specific disease and comparing with various data trending such as external (haze) and internal (virus, smoking) at step 200; establishing interpretation relationship at step 300; and establishing visualization interpretation at step 400.
  • the step 200 of establishing profile data trending anomalies comprises establishing domain knowledge, internal and external influencers at step 210; consolidating internal and external influencers at step 220; and establishing anomalies for external influencers at step 230.
  • the relationship between domain knowledge with external influencers is already predefined as illustrated in FIG 2.
  • the example attributes are gender, date time for illness such as flu/cough, dengue with internal symptom such as virus, and the external influencers can be haze or flood.
  • external influence anomalies can be established as illustrated in FIG 3.
  • the step of establishing interpretation relationship 300 comprises getting metadata of dimensions, measurement, values for each selected data attribute at step 310; finding most frequent keywords for attributes at step 320; finding trends for each metadata’s keyword at step 330; calculating frequency of each metadata trend relationship based on keyword and event date at step 340; getting highest trend of keywords and event date at step 350; finding related news or events based on trend keyword and data attribute at step 360; and tagging related news, trend keyword and event date with report metadata using Named Entity Recognition (NER).
  • the metadata for a specific domain, e.g. health, is selected; then the data attributes with dimension, value, etc. of that specific domain are retrieved for a specific disease.
  • a frequent keyword of attribute of metadata can be identified.
  • trend such as state, disease, student and school is retrieved from a trend knowledge database such as Google Trends, etc.
  • the metadata trend relationship is retrieved from the trend knowledge.
  • the highest trend of keywords is selected in the step 350.
  • the related news or events are retrieved from the news knowledge database (e.g. Google, online newspapers, etc.). These news items are tagged with the metadata accordingly.
  • the step of establishing visualization interpretation 400 comprises mapping trend keyword and NER tagging with dataset at step 410; scoring and ranking the mapped trend keyword to provide accurate interpretation at step 420; and generating potential interpretation at step 430.
  • the trend keyword is mapped with pollution as an exemplary NER. Based on the metrics, i.e. scoring and ranking, of the trend keyword, the interpretation can be presented visually.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method of providing visualization interpretation. In one embodiment, the method comprises the step of: retrieving dataset from a domain knowledge database (60) with various attribute domains comprising dimension, measurement and values (100); establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers (200); establishing interpretation relationship (300); and establishing visualization interpretation (400). A system therefor is also provided.

Description

METHOD AND SYSTEM TO PROVIDE VISUALIZATION INTERPRETATION THROUGH ESTABLISHING RELATIONSHIP BETWEEN INTERNAL AND EXTERNAL TRENDING DATA INFLUENCES
Field of the Invention
[0001] The present invention generally relates to the network technologies, and more particularly to a method to provide visualization interpretation through establishing relationship between internal and external trending data influences.
Background of the Invention
[0002] Normally, visualization is created without any interpretation of the data, and users usually need to think and interpret the visualization themselves, i.e. manually, based on limited knowledge provided by data and the visualization. Thus, visualization requires a relevant interpretation based on certain domain and current trending for better understanding of the data.
[0003] The existing visualization normally has no specific interpretation of relevant data involved, which might cause several issues such as: each user will interpret the visualization differently; interpretation of visualization will be isolated from each other; there is no relationship between external trending data and the visualization; and/or the audience has to interpret by themselves and the interpretation might be interpreted wrongly without any knowledge of the data itself.
[0004] Hence, incorrect interpretation by the user might lead to wrong decision making.
[0005] US20110106589A1 discloses a clear and intuitive user interface that can turn on/off a combination of social media measurements, and help a user to drill down to as much details as desired across different timeframes and social media measurements. The various types of superimposed graphs and data described herein may facilitate a user's ability to interpret and understand information associated with social media.
[0006] US7428545B2 discloses visualization space, data preparation tool, inference engine, and predictor. The user interacts with the system through a visual representation space where various graphical objects are rendered. Graphical objects represent data, knowledge (e.g. as induced rules), and query explanations (decisions on unknown data identifications). The system integrates graphical objects through the use of visually cognitive, human-oriented depictions. A user can also examine non-graphical explanations (i.e. text based) to posed queries.
[0007] US20150220946A1 discloses a method of automated trend identification, that can include: receiving communication data; receiving at least one modularity selection, the modularity selection defining a plurality of features; identifying instances of the features in the communication data; receiving at least one report selection; producing a statistical measure of the identified instances of the features; evaluating the statistical measure; and identifying a trend of interest from the evaluation of the statistical measure, wherein the trend of interest comprises a report selection and a feature.
Summary
[0008] One aspect of the present invention provides a method of providing visualization interpretation. In one embodiment, the method comprises the step of retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
[0009] In one embodiment, the step of establishing profile data trending anomalies (200) comprises establishing domain knowledge, internal and external influencers, where the relationship between domain knowledge and external influencers is already predefined; consolidating internal and external influencers; and establishing anomalies for external influencers, wherein external influence anomalies are established.
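The three sub-steps of paragraph [0009] can be illustrated with a minimal Python sketch. It is not the claimed implementation: the lookup table mirrors the predefined relationships of the FIG 2 example, while the names `PREDEFINED_INFLUENCERS`, `consolidate_influencers` and `flag_anomalies`, and the z-score rule used to flag abnormal readings, are assumptions made for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical predefined relationships, modelled on the FIG 2 example:
# a disease category is linked to internal and external influencers.
PREDEFINED_INFLUENCERS = {
    "allergies and asthma": {
        "internal": ["viruses and bacteria"],
        "external": ["haze"],
    },
}


def consolidate_influencers(disease):
    """Merge internal and external influencers into one tagged list."""
    links = PREDEFINED_INFLUENCERS.get(disease, {})
    return [
        (kind, name)
        for kind in ("internal", "external")
        for name in links.get(kind, [])
    ]


def flag_anomalies(readings, z_threshold=2.0):
    """Return the keys whose value lies more than z_threshold population
    standard deviations above the mean (an assumed rule; the source does
    not specify how an abnormal pattern is detected)."""
    values = list(readings.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [key for key, value in readings.items()
            if (value - mu) / sigma > z_threshold]


# Illustrative air pollution index readings (made-up values).
api_readings = {"day1": 50, "day2": 55, "day3": 52,
                "day4": 48, "day5": 51, "day6": 320}
print(consolidate_influencers("allergies and asthma"))
print(flag_anomalies(api_readings))
```

Any other outlier test could replace the z-score rule; the point of the sketch is only the order of operations: look up the predefined relationships, consolidate the influencers, then flag anomalies in the external influencer data.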
[0010] In another embodiment, the step of establishing interpretation relationship comprises getting metadata of dimensions, measurement and values for each selected data attribute; finding the most frequent keywords for the attributes, wherein a frequent keyword of an attribute's metadata is found and searched; finding trends for each metadata keyword, wherein the trend for each keyword is retrieved from the trend knowledge database; calculating the frequency of each metadata trend relationship based on keyword and event date, wherein the metadata trend relationship is retrieved from the trend knowledge database; getting the highest trend of keywords and event date, wherein the highest trend of keywords is selected based on the earlier calculation; finding related news or events based on trend keyword and data attribute, wherein the related news or events are retrieved from the news knowledge database; and tagging related news, trend keyword and event date with report metadata using Named Entity Recognition (NER).
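The keyword, trend and news steps of paragraph [0010] can be sketched as follows. This is a hedged illustration, not the patented method: the function names, the stopword list, the record layout (`keyword`, `event_date`) and all sample records are hypothetical, and the trend and news knowledge databases are replaced by in-memory lists. Only the first metadata string is taken from the source. The NER tagging step is left out because the source does not specify a tagger.

```python
import re
from collections import Counter

# Assumed stopword list; the source does not define one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "be",
             "will", "when", "to", "by"}


def most_frequent_keywords(metadata_texts, top_n=1):
    """Count non-stopword tokens across the metadata of an attribute."""
    tokens = []
    for text in metadata_texts:
        tokens += [t for t in re.findall(r"[a-z]+", text.lower())
                   if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_n)]


def highest_trend(keywords, trend_records):
    """Count (keyword, event_date) pairs among the trend records and
    return the most frequent pair, or None if nothing matches."""
    wanted = set(keywords)
    counts = Counter((r["keyword"], r["event_date"])
                     for r in trend_records if r["keyword"] in wanted)
    return counts.most_common(1)[0][0] if counts else None


def related_news(headlines, trend_keyword, attribute):
    """Keep headlines that mention both the trend keyword and the attribute."""
    return [h for h in headlines
            if trend_keyword in h.lower() and attribute in h.lower()]


metadata = [
    "School will be closed when API exceed 300",  # example from the source
    "Haze raises API readings",                   # hypothetical
    "API above 200 is very unhealthy",            # hypothetical
]
trends = [  # hypothetical trend knowledge records
    {"keyword": "api", "event_date": "2019-09-18"},
    {"keyword": "api", "event_date": "2019-09-18"},
    {"keyword": "api", "event_date": "2019-09-19"},
    {"keyword": "flood", "event_date": "2019-09-18"},
]
keywords = most_frequent_keywords(metadata)
print(keywords, highest_trend(keywords, trends))
```

In a deployment the two lists would be replaced by queries against the trend knowledge database and the news knowledge database; the frequency counting itself would stay the same.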
[0011] In yet another embodiment, the step of establishing visualization interpretation comprises mapping trend keyword and NER tagging with dataset; scoring and ranking the mapped trend keyword to provide accurate interpretation; and generating potential interpretation from metric and visualizing it.
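The mapping, scoring and generation steps of paragraph [0011] can be sketched in a few lines. The scoring rule is an assumption: the source states that a score is calculated for each NER (FIG 11) but not how, so the sketch simply counts how often each NER label was tagged. The function names and the input layout, a list of `(attribute, ner_label)` pairs, are likewise hypothetical.

```python
from collections import Counter


def score_entities(tagged):
    """Rank NER labels by how many times they were tagged
    (assumed scoring rule, highest score first)."""
    return Counter(label for _, label in tagged).most_common()


def generate_interpretation(tagged):
    """Build a pattern sentence from the top-ranked NER label and the
    attributes mapped to it; return None when nothing was tagged."""
    ranking = score_entities(tagged)
    if not ranking:
        return None
    top_label, _ = ranking[0]
    attributes = sorted({attr for attr, label in tagged if label == top_label})
    return f"{' and '.join(attributes).capitalize()} impacted by {top_label}"


# Hypothetical tagging result for the two attributes of FIG 8 and FIG 9.
tagged_pairs = [
    ("disease", "pollution"),
    ("total patient", "pollution"),
    ("disease", "organization"),
]
print(generate_interpretation(tagged_pairs))
```

With these inputs the sketch reproduces the pattern sentence quoted for FIG 12; a weighted score (for example by recency of the event date) could be substituted without changing the ranking and generation logic.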
[0012] Another aspect of the present invention provides a system for providing visualization interpretation. In one embodiment, the system comprises a client app being installed in a client device; a server including a microprocessor for executing and a computer-readable storage medium for storing a data anomalies detection engine, an interpretation generation engine, and an interpretation relationship engine; a domain knowledge database; a trend knowledge database; and a news knowledge database. The microprocessor executes the data anomalies detection engine, interpretation generation engine, and interpretation relationship engine for retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
[0013] Another aspect of the present invention provides a non-transitory computer readable medium comprising computer-executable instructions that when executed by a processor of a computing device perform a method comprising retrieving dataset from a domain knowledge database with various attribute domains comprising dimension, measurement and values; establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers; establishing interpretation relationship; and establishing visualization interpretation.
Brief Description of the Drawings
[0014] Preferred embodiments according to the present invention will now be described with reference to the Figures, in which like reference numerals denote like elements.
[0015] FIG 1 shows a block diagram of the system for providing interpretation of visualization in accordance with one embodiment of the present invention;
[0016] FIG 2 shows tabular views of an exemplary Domain knowledge database containing predefined relationships among domain knowledge;
[0017] FIG 3 provides an example of model anomaly;
[0018] FIG 4 provides an illustration of the exemplary monitoring and processing of data based on haze;
[0019] FIG 5 provides an illustration of matching metadata with data trending from the Trend knowledge database;
[0020] FIG 6 provides an illustration of finding news headlines based on keyword, event date, and location from the news knowledge database;
[0021] FIG 7 provides an example of related news headlines or events based on attribute, trend keyword and event date;
[0022] FIG 8 provides an example of disease attribute tagging, where NER denotes Named Entity Recognition;
[0023] FIG 9 provides an example of total patient attribute tagging;
[0024] FIG 10 provides an example of mapping attributes and trend keyword with NER prior to interpretation structure;
[0025] FIG 11 provides an exemplary tabular view showing how the score for each NER is calculated to get the ranking;
[0026] FIG 12 provides an exemplary interpretation that has metrics based on attribute, NER and score;
[0027] FIG 13 provides block diagrams illustrating the generation of potential interpretation;
[0028] FIG 14A and FIG 14B provide examples of visualization interpretation;
[0029] FIG 15 shows a block flowchart of the method of providing visualization interpretation in accordance with one embodiment of the present invention;
[0030] FIG 16 shows a flowchart of establishing profile data trending anomalies in accordance with one embodiment of the present invention;
[0031] FIG 17 shows a flowchart of establishing interpretation relationship in accordance with one embodiment of the present invention; and
[0032] FIG 18 shows a flowchart of establishing visualization interpretation in accordance with one embodiment of the present invention.
Detailed Description of the Invention
[0033] The present invention may be understood more readily by reference to the following detailed description of certain embodiments of the invention.
[0034] Throughout this application, where publications are referenced, the disclosures of these publications are hereby incorporated by reference, in their entireties, into this application in order to more fully describe the state of art to which this invention pertains.
[0035] The present invention provides a method and system that combine external news, trending data and internal data to provide a meaningful interpretation of a visualization, for better understanding of the visualization.
[0036] FIG 1 illustrates a block diagram of the system for providing interpretation of visualization in accordance with one embodiment of the present invention. The system 1 comprises a client app 15 being installed in a client device 10, a server 20 including a microprocessor for executing and a computer-readable storage medium for storing a data anomalies detection engine 30, an interpretation generation engine 40, and an interpretation relationship engine 50, a domain knowledge database 60, a trend knowledge database 70, and a news knowledge database 80. The domain knowledge database 60, the trend knowledge database 70, and the news knowledge database 80 can be stored in separate storage media as shown in FIG 1, but can also be stored in the computer-readable storage medium of the server 20. The microprocessor and the computer-readable storage medium can be implemented in single or multiple devices.
[0037] The client device 10 with the installed client app 15 can be any computing device such as a personal computer (PC), a notebook computer, a mobile phone, a workstation, and a personal digital assistant (PDA). The client app 15 enables a user to communicate with the server 20.
[0038] The computer-readable storage medium of the server 20 can include volatile and non-volatile, removable and non-removable medium implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. The computer-readable storage medium can be implemented as a single storage device but may also be implemented across multiple storage devices or sub-systems. Examples of the computer-readable storage medium include random access memory, read only memory, magnetic discs, optical discs, flash memory, virtual memory, and non-virtual memory, magnetic sets, magnetic tape, magnetic disc storage or other magnetic storage devices, or any other medium which can be used to store the desired information and that may be accessed by an instruction execution system. In certain embodiments, the computer-readable storage medium can be a non-transitory storage medium.
[0039] The data anomalies detection engine 30 identifies and collects domain anomalies by establishing attribute dataset for each domain and comparing with internal and external influencers.
[0040] The interpretation generation engine 40 analyses trend keyword, attribute dataset, and news for obtaining and interpreting visualization pattern with report metadata using Named Entity Recognition (NER) by performing matrix of tagging.
[0041] The interpretation relationship engine 50 matches and interprets visualization by providing a schema hierarchy priority with score characteristic.
[0042] The domain knowledge database 60 is a repository storing specific knowledge related to a particular dataset. An exemplary domain knowledge database 60 with domains such as health, weather and finance is shown in FIG 2 (detailed description hereinbelow).
[0043] The trend knowledge database 70 is a repository for information gathered from external repositories such as Google Trends and social media such as Twitter, YouTube, Facebook, Instagram, etc.
[0044] The news knowledge database 80 is a repository of news gathered based on the trend keyword. Examples of news repositories include, but are not limited to, Bernama, CNN and Awani.
[0045] Referring now to FIG 2, there is provided tabular views of an exemplary domain knowledge database 60 containing predefined relationships among domain knowledge. As shown in FIG 2, the exemplary Domain knowledge database 60 comprises domains of health, finance, education, tourism, and environment. It is to be noted that the definition and content of each domain and the number of domains in the domain knowledge database 60 are not limited by the examples shown in FIG 2. The exemplary health domain knowledge 61 comprises the categories of disease and internal influencer, where the disease category comprises infectious disease, heart disease, allergies and asthma, and cancer, while the internal influencer category comprises viruses and bacteria, smoking and allergy. The exemplary environment domain knowledge 65 comprises potential external influencers such as haze and weather. The exemplary predefined relationships as denoted by the long arrows are the intra-domain relationship between the allergies and asthma from the disease category and the viruses and bacteria from the internal influencer category and the inter-domain relationship between the allergies and asthma from the health domain knowledge 61 and the haze from the environment domain knowledge 65.
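The predefined relationships of FIG 2 can be pictured as a small lookup structure. The following is a minimal sketch only: the dictionary layout, the category label "external influencer" and the function name `related_influencers` are assumptions made for illustration, while the domain entries and the two relationships are taken from the description above.

```python
# Hypothetical in-memory layout of part of the domain knowledge database 60.
DOMAIN_KNOWLEDGE = {
    "health": {
        "disease": ["infectious disease", "heart disease",
                    "allergies and asthma", "cancer"],
        "internal influencer": ["viruses and bacteria", "smoking", "allergy"],
    },
    "environment": {
        # Category label assumed; the description only names haze and weather.
        "external influencer": ["haze", "weather"],
    },
}

# Each relationship links (domain, category, item) to (domain, category, item).
RELATIONSHIPS = [
    (("health", "disease", "allergies and asthma"),
     ("health", "internal influencer", "viruses and bacteria")),
    (("health", "disease", "allergies and asthma"),
     ("environment", "external influencer", "haze")),
]


def related_influencers(domain, item):
    """Return (relationship kind, influencer) pairs predefined for an item."""
    results = []
    for source, target in RELATIONSHIPS:
        if source[0] == domain and source[2] == item:
            # Same domain on both ends means intra-domain, otherwise inter-domain.
            kind = "intra-domain" if target[0] == domain else "inter-domain"
            results.append((kind, target[2]))
    return results


links = related_influencers("health", "allergies and asthma")
```

Here `links` holds one intra-domain link to viruses and bacteria and one inter-domain link to haze, mirroring the two long arrows described for FIG 2.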
[0046] Referring now to FIG 3, there is provided an example of model anomalies. The profile data of a disease such as asthma is retrieved from the health domain knowledge 61, where the profile data comprises date/time, gender, state, age, total patient and disease. When the profile data are retrieved, they are classified as attributes. The attributes are defined in a fact table for the specific domain, which here is the health domain knowledge 61. The attributes are classified as anomaly attributes when an abnormal pattern of attributes is identified in the dataset. Then, the model anomaly is established by relating specific internal symptom(s) (e.g., allergy) with external influencer(s) (e.g., haze).
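The description does not state how an abnormal pattern is identified. As one possible illustration, the sketch below flags days whose total patient count deviates strongly from the mean, using a z-score threshold. The z-score rule, the threshold value, the function name `flag_anomalies` and all sample figures are assumptions made for this example and are not taken from the source.

```python
from statistics import mean, pstdev


def flag_anomalies(daily_totals, threshold=2.0):
    """Return the dates whose total deviates from the mean by more than
    `threshold` population standard deviations (an assumed rule)."""
    values = [total for _, total in daily_totals]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [date for date, total in daily_totals
            if abs(total - mu) / sigma > threshold]


# Made-up asthma "total patient" counts per day, for illustration only.
records = [
    ("2019-09-01", 12), ("2019-09-02", 11), ("2019-09-03", 13),
    ("2019-09-04", 12), ("2019-09-05", 10), ("2019-09-06", 12),
    ("2019-09-07", 11), ("2019-09-08", 40),
]
anomalous_dates = flag_anomalies(records)
```

A date flagged this way would then be related to a candidate external influencer such as haze, as described for the model anomaly.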
[0047] Referring now to FIG 4, there is provided an illustration of the exemplary monitoring and processing of data based on haze. As asthma cases increase by the day, the data based on haze are monitored and processed, and the anomalies are detected based on trending news caused by haze through the air pollution index (API).
[0048] Referring now to FIG 5, there is provided an illustration of matching metadata with data trending from the trend knowledge database 70. Metadata describes the content, purpose, source and structure of the dataset and summarizes basic information about the data in the dataset. An example of metadata in FIG 5, with regard to the school attribute, is ‘School will be closed when API exceed 300’. Metadata is predefined for each attribute in the database.
[0049] The most frequent word used in the metadata is matched with the trending data to form the trend keyword. As in FIG 5, ‘Haze’ is the most frequent word found in the metadata’s attributes, and accordingly, ‘Haze’ is searched in the trend knowledge database 70. The results show that the search for the keyword ‘Haze’ returned ‘95%’ trending in Petaling Jaya, when classified by district.
[0050] Referring now to FIG 6, there is provided an illustration of finding news headlines based on keyword, event date, and location from the news knowledge database 80. The search keyword for finding news headlines usually comprises attributes (Att) such as disease, a trend keyword (tk) such as haze, and a location (e.g. Petaling Jaya). With the input keyword (e.g., Disease Haze Petaling Jaya), a search is carried out in the news knowledge database 80, and relevant news headlines are retrieved and shown as ‘Results’.
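Finding the most frequent metadata word can be sketched as a simple word count. This is an illustrative sketch under stated assumptions: the stop-word list, the function name `most_frequent_keyword` and all metadata strings except the first (which is quoted from FIG 5) are invented for the example.

```python
import re
from collections import Counter

# Assumed stop-word list; the source does not say how common words are handled.
STOP_WORDS = {"the", "a", "an", "will", "be", "when", "is", "of",
              "to", "in", "and", "by", "for", "during", "near"}


def most_frequent_keyword(metadata_texts):
    """Return the most frequent non-stop-word across metadata strings."""
    words = []
    for text in metadata_texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOP_WORDS:
                words.append(token)
    if not words:
        return None
    return Counter(words).most_common(1)[0][0]


metadata = [
    "School will be closed when API exceed 300",  # quoted from FIG 5
    "Haze reduces visibility near school",        # invented
    "Asthma cases rise during haze",              # invented
    "Haze advisory issued for outdoor activity",  # invented
]
trend_keyword = most_frequent_keyword(metadata)
```

With these sample strings the keyword is `haze`, which would then be looked up in the trend knowledge database 70.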
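The composition of the search keyword and the headline lookup can be sketched as follows. The `NewsItem` record, the matching rule (same event date, same location, trend keyword contained in the headline) and the sample headlines are assumptions made for illustration; the source does not specify how the news knowledge database 80 is queried.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    headline: str
    date: str       # ISO date string, assumed format
    location: str


def build_search_keyword(attribute, trend_keyword, location):
    """Join attribute (Att), trend keyword (tk) and location into one query."""
    return f"{attribute} {trend_keyword} {location}"


def find_headlines(news_items, trend_keyword, event_date, location):
    """Return headlines on the event date and location that mention the
    trend keyword (case-insensitive). The matching rule is an assumption."""
    keyword = trend_keyword.lower()
    return [item.headline for item in news_items
            if item.date == event_date
            and item.location == location
            and keyword in item.headline.lower()]


# Invented sample entries standing in for the news knowledge database 80.
news = [
    NewsItem("Haze worsens, clinics report more asthma cases", "2019-09-08", "Petaling Jaya"),
    NewsItem("Schools closed as haze thickens", "2019-09-08", "Petaling Jaya"),
    NewsItem("New highway opens", "2019-09-08", "Petaling Jaya"),
    NewsItem("Haze blankets the city", "2019-09-01", "Kuala Lumpur"),
]
query = build_search_keyword("Disease", "Haze", "Petaling Jaya")
results = find_headlines(news, "Haze", "2019-09-08", "Petaling Jaya")
```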
[0051] Referring now to FIG 7, there is provided an example of related news headlines or events based on attribute, trend keyword and event date. Based on the attribute, trend keyword and event date, the related news headlines are classified into the same pre-defined categories such as people, organization, disease, etc. The attribute is a data field in the dataset that represents a feature of a data object. As an example, patient object attributes can be patient identity, address, date of birth, gender, etc. The trend keyword, in contrast, is based on a trending topic that matches the most frequent keyword in the metadata’s attributes for a certain period of time. First, each word of the news headlines is tagged using NER to identify the pre-defined categories; then, the fraction of relevant categories among the identified categories is counted as shown in FIG 17. The category with the highest score from the attribute tagging process is used in generating the interpretation.
[0052] Referring now to FIG 8, there is provided an example of disease attribute tagging, where NER denotes Named Entity Recognition.
[0053] Referring now to FIG 9, there is provided an example of total patient attribute tagging. The two examples in FIG 8 and FIG 9 show the relationship between two attributes (“disease” and “total patient”) established by finding their external trending data, where the search results from the news headlines show similarities based on the frequent words found.
[0054] Referring now to FIG 10, there is provided an example of mapping attributes and trend keyword with NER prior to interpretation structure.
[0055] Referring now to FIG 11, there is provided an exemplary tabular view showing how the score for each NER is calculated to get the ranking. For instance, based on the ‘Disease’ attribute tagging shown in FIG 8, the total number of tags is 7 and the word ‘haze’, which is tagged to pollution, accounts for 2 of them. The pollution score is then calculated as 2 out of 7 (i.e. 2/7=0.29).
[0056] Referring now to FIG 12, there is provided an exemplary interpretation that has metrics based on attribute, NER and score. The pattern sentence can be interpreted based on the table shown in FIG 12 as: ‘Disease and total patient impacted by pollution’.
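The score calculation described for FIG 11 amounts to dividing the number of tags in each NER category by the total number of tags. A minimal sketch follows; only the totals (7 tags, 2 of them pollution) come from the description, while the other five tag labels and the function names are assumed for illustration.

```python
from collections import Counter


def ner_scores(tags):
    """Score each NER category as its share of all tags, rounded to 2 places."""
    if not tags:
        return {}
    counts = Counter(tags)
    total = len(tags)
    return {category: round(count / total, 2)
            for category, count in counts.items()}


def rank_categories(scores):
    """Return categories ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# 7 tags in total, 2 of them "pollution" (from the text); the rest are invented.
disease_tags = ["pollution", "pollution", "location", "organization",
                "people", "disease", "date"]
scores = ner_scores(disease_tags)
ranking = rank_categories(scores)
```

Here `scores["pollution"]` is 0.29, matching the 2/7 figure in the text, and pollution ranks first, so it would be the category carried into the interpretation.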
[0057] Referring now to FIG 13, there is provided block diagrams illustrating generation of potential interpretation. In the top box, the interpretation is: ‘the combination of disease and total patient has been impacted by pollution’; in the bottom box, the interpretation is: ‘the combination of disease and total student has been impacted by pollution in schoolA in Petaling Jaya area’.
[0058] Referring now to FIG 14A and FIG 14B, there are provided examples of visualization interpretation. For the bar diagram shown in FIG 14A, the visualization interpretation can be ‘The combination of diseases attribute and total patient attribute might have been impacted by (total patient) pollution (haze)’. For the pie diagram shown in FIG 14B, the visualization interpretation can be ‘The combination of disease attribute and total student attribute might have been impacted by pollution (haze) in school in Petaling Jaya area’.
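The interpretation sentences quoted above follow a recognizable pattern, which can be sketched as a simple template. The function name `generate_interpretation` and its parameters are hypothetical; the source does not describe how the sentence is assembled, so this is only one way to reproduce the FIG 14B wording.

```python
def generate_interpretation(attributes, ner_category, trend_keyword, place=None):
    """Assemble an interpretation sentence from the attributes, the top-ranked
    NER category, the trend keyword and an optional place description."""
    combined = " and ".join(f"{name} attribute" for name in attributes)
    sentence = (f"The combination of {combined} might have been impacted "
                f"by {ner_category} ({trend_keyword})")
    if place:
        sentence += f" in {place}"
    return sentence


pie_text = generate_interpretation(
    ["disease", "total student"], "pollution", "haze",
    place="school in Petaling Jaya area",
)
```

The hedged wording "might have been impacted" is kept from the source examples, since the method establishes a relationship rather than a proven cause.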
[0059] Referring now to FIG 15, there is provided a block flowchart for the method of providing visualization interpretation in accordance with one embodiment of the present invention. The method comprises retrieving dataset from various attribute domains including e.g. dimension, measurement and values at step 100; establishing profile data trending anomalies by modeling anomalies profile with specific disease and comparing with various data trending such as external (haze) and internal (virus, smoking) at step 200; establishing interpretation relationship at step 300; and establishing visualization interpretation at step 400.
[0060] Referring now to FIG 16, there is provided a flowchart of establishing profile data trending anomalies 200 in accordance with one embodiment of the present invention. The step 200 of establishing profile data trending anomalies comprises establishing domain knowledge, internal and external influencers at step 210; consolidating internal and external influencers at step 220; and establishing anomalies for external influencers at step 230. In the step 210, the relationship between domain knowledge and external influencers is predefined as illustrated in FIG 2. In the step 220, the example attributes are gender and date/time for an illness such as flu/cough or dengue with an internal symptom such as virus, and the external influencers can be haze or flood. In the step 230, external influence anomalies can be established as illustrated in FIG 3.
[0061] Referring now to FIG 17, there is provided a flowchart of establishing interpretation relationship 300 in accordance with one embodiment of the present invention. The step of establishing interpretation relationship 300 comprises getting metadata of dimensions, measurement, values for each selected data attribute at step 310; finding most frequent keywords for attributes at step 320; finding trends for each metadata’s keyword at step 330; calculating frequency of each metadata trend relationship based on keyword and event date at step 340; getting highest trend of keywords and event date at step 350; finding related news or events based on trend keyword and data attribute at step 360; and tagging related news, trend keyword and event date with report metadata using the Named Entity Recognition (NER) at step 370.
[0062] In the step 310, the metadata for a specific domain, e.g. health, is selected; then the data attributes with dimension, value, etc. of that specific domain are retrieved for a specific disease. In the step 320, a frequent keyword of attribute of metadata can be identified. In the step 330, for each metadata’s keyword, trend such as state, disease, student and school is retrieved from a trend knowledge database such as Google Trends, etc. The metadata trend relationship is retrieved from the trend knowledge database. Through the calculation in the step 340, the highest trend of keywords is selected in the step 350. At the step 360, the related news or events are retrieved from the news knowledge database such as Google, online newspaper, etc. These news items are tagged with the metadata accordingly.
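Steps 340 and 350 can be pictured as counting how often each (keyword, event date) pair occurs among the retrieved trend records and keeping the most frequent pair. The record layout, the function name `highest_trend` and the sample data below are assumptions for illustration; the source does not define the frequency calculation in detail.

```python
from collections import Counter


def highest_trend(trend_records):
    """Count (keyword, event_date) pairs (step 340) and return the most
    frequent pair with its count (step 350), or None if there are no records."""
    counts = Counter((r["keyword"], r["event_date"]) for r in trend_records)
    if not counts:
        return None
    (keyword, event_date), frequency = counts.most_common(1)[0]
    return keyword, event_date, frequency


# Invented trend records standing in for results from the trend knowledge database 70.
trend_records = [
    {"keyword": "haze", "event_date": "2019-09-08"},
    {"keyword": "haze", "event_date": "2019-09-08"},
    {"keyword": "haze", "event_date": "2019-09-07"},
    {"keyword": "school", "event_date": "2019-09-08"},
]
top = highest_trend(trend_records)
```

The selected keyword and event date would then drive the news search of step 360.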
[0063] Referring now to FIG 18, there is provided a flowchart of establishing visualization interpretation 400 in accordance with one embodiment of the present invention. The step of establishing visualization interpretation 400 comprises mapping trend keyword and NER tagging with dataset at step 410; scoring and ranking the mapped trend keyword to provide accurate interpretation at step 420; and generating potential interpretation at step 430. At the step 410, the trend keyword is mapped with pollution as an exemplary NER. Based on the metrics, i.e. scoring and ranking, of the trend keyword, the interpretation can be presented visually.
[0064] While the present invention has been described with reference to particular embodiments, it will be understood that the embodiments are illustrative and that the invention scope is not so limited. Alternative embodiments of the present invention will become apparent to those having ordinary skill in the art to which the present invention pertains. Such alternate embodiments are considered to be encompassed within the scope of the present invention. Accordingly, the scope of the present invention is defined by the appended claims and is supported by the foregoing description.

Claims

1. A method of providing visualization interpretation, characterized in that the method comprising step of: retrieving dataset from a domain knowledge database (60) with various attribute domains comprising dimension, measurement and values (100); establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers (200); establishing interpretation relationship (300); and establishing visualization interpretation (400).
2. The method of claim 1, wherein the step of establishing profile data trending anomalies (200) comprises: establishing domain knowledge, internal and external influencers (210), wherein the relationship between domain knowledge with external influencers is predefined; consolidating internal and external influencers (220); and establishing anomalies for external influencers (230), wherein external influence anomalies are established.
3. The method of claim 1, wherein the step of establishing interpretation relationship (300) comprises: getting metadata of dimensions, measurement, values for each selected data attributes (310); finding most frequent keywords for attributes (320), by finding and searching a frequent keyword of attribute of metadata; finding trends for each metadata’s keyword (330), by retrieving trend from the trend knowledge database (70) for each metadata’s keyword; calculating frequency of each metadata trend relationship based on keyword and event date (340), wherein the metadata trend relationship is retrieved from the trend knowledge database (70); getting highest trend of keywords and event date (350), wherein the highest trend of keywords is selected based on the calculation in step (340); finding related news or events based on trend keyword and data attribute (360), wherein the related news or events are retrieved from the News Knowledge Database (80); and tagging related news, trend keyword and event date with report metadata using the Named Entity Recognition, NER (370).
4. The method of claim 1, wherein the step of establishing visualization interpretation (400) comprises: mapping trend keyword and NER tagging with dataset (410); scoring and ranking the mapped trend keyword to provide accurate interpretation (420); and generating potential interpretation from metric and visualizing it (430).
5. A system (1) for providing visualization interpretation, wherein the system (1) is connectable to a domain knowledge database (60), a trend knowledge database (70) and a news knowledge database (80), characterised in that the system (1) comprising: a client app (15) being installed in a client device (10); a server (20) including a microprocessor for executing a data anomalies detection engine (30), an Interpretation generation engine (40), and an interpretation relationship engine (50) stored on a computer-readable storage medium; wherein the microprocessor executes the data anomalies detection engine (30), interpretation generation engine (40), and interpretation relationship engine (50) that are operable for retrieving dataset from a domain knowledge database (60) with various attribute domains comprising dimension, measurement and values (100); establishing profile data trending anomalies by modeling anomalies profile with specific attribute and comparing with various data trending comprising external and internal influencers (200); establishing interpretation relationship (300); and establishing visualization interpretation (400).
6. The system (1) of claim 5, wherein the data anomalies detection engine (30) is operable to establish domain knowledge, internal and external influencers (210), where the relationship between domain knowledge with external influencers is already predefined; consolidate internal and external influencers (220); and establish anomalies for external influencers (230), wherein external influence anomalies are established, in order to establish profile data trending anomalies (200).
7. The system (1) of claim 5, wherein the interpretation generation engine (40) is operable to get metadata of dimensions, measurement, and values for each selected data attributes (310); find most frequent keywords for attributes (320); find trends for each metadata’s keyword (330) from the trend knowledge database (70); calculate frequency of each metadata trend relationship based on keyword and event date (340); get highest trend of keywords and event date (350); then find related news or events based on trend keyword and data attribute (360) from the news knowledge database (80); and tag the related news, trend keyword and event date with report metadata using the named entity recognition, NER, (370), to establish interpretation relationship (300).
8. The system (1) of claim 5, wherein the interpretation relationship engine (50) is operable to map trend keyword and NER tagging with dataset (410); score and rank the mapped trend keyword to provide accurate interpretation (420); and generate potential visual interpretation from metric, to establish the visualization interpretation (400).
9. The system (1) of claim 5, wherein the domain knowledge database (60), the trend knowledge database (70), and the news knowledge database (80) are stored in a separate storage medium.
PCT/MY2020/050175 2020-10-28 2020-11-27 Method and system to provide visualization interpretation through establishing relationship between internal and external trending data influences Ceased WO2022093003A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2020005637 2020-10-28

Publications (1)

Publication Number Publication Date
WO2022093003A1 true WO2022093003A1 (en) 2022-05-05

Family

ID=81383049

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2020/050175 Ceased WO2022093003A1 (en) 2020-10-28 2020-11-27 Method and system to provide visualization interpretation through establishing relationship between internal and external trending data influences

Country Status (1)

Country Link
WO (1) WO2022093003A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119254648A (en) * 2024-12-04 2025-01-03 武汉亚耀科技有限公司 A method and system for visualizing optical module data based on the Internet of Things

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150242384A1 (en) * 2012-08-30 2015-08-27 Arria Data2Text Limited Method and apparatus for annotating a graphical output
JP2016518660A (en) * 2013-04-11 2016-06-23 オラクル・インターナショナル・コーポレイション Predictive diagnosis of SLA violations in cloud services by grasping and forecasting seasonal trends using thread strength analysis
WO2019075478A1 (en) * 2017-10-13 2019-04-18 Kpmg Llp System and method for analysis of structured and unstructured data
US20190332620A1 (en) * 2018-04-26 2019-10-31 Accenture Global Solutions Limited Natural language processing and artificial intelligence based search system
US20200320431A1 (en) * 2018-04-26 2020-10-08 Quickpath Analytics, Inc. System and method for detecting anomalies in prediction generation systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20960060

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20960060

Country of ref document: EP

Kind code of ref document: A1