SG10201909389VA - Data quality analysis - Google Patents
Data quality analysis
- Publication number
- SG10201909389VA
- Authority
- SG
- Singapore
- Prior art keywords
- dataset
- upstream
- datasets
- quality analysis
- data quality
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/197—Version control
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/65—Updates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Stored Programmes (AREA)
- Debugging And Monitoring (AREA)
- User Interface Of Digital Computer (AREA)
- Analysing Materials By The Use Of Radiation (AREA)
- Automatic Analysis And Handling Materials Therefor (AREA)
Abstract
DATA QUALITY ANALYSIS A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; and analyzing one or more of the identified one or more upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset, and based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets. (Fig. 1)
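The flow in the abstract (walk the lineage upstream of an output dataset, apply a profile-deviation rule and a prohibited-value rule to each upstream dataset, then report the datasets that fail) can be illustrated with a minimal Python sketch. This is not the patented implementation. The `Dataset` class, the use of a per-field mean as the "profile", the relative-deviation threshold, and all function names are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical dataset node; `upstream` stands in for lineage information."""
    name: str
    records: list
    upstream: list = field(default_factory=list)
    # Assumed reference profile: expected mean per numeric field.
    reference_mean: dict = field(default_factory=dict)


def upstream_of(output):
    """Return every dataset the output depends on, each listed once."""
    seen, order, stack = set(), [], list(output.upstream)
    while stack:
        ds = stack.pop()
        if ds.name in seen:
            continue
        seen.add(ds.name)
        order.append(ds)
        stack.extend(ds.upstream)
    return order


def profile_deviates(ds, max_relative_deviation):
    """First rule: the observed mean strays too far from the reference mean."""
    for name, ref in ds.reference_mean.items():
        values = [r[name] for r in ds.records if name in r]
        if not values:
            return True  # nothing to profile is treated as a deviation
        mean = sum(values) / len(values)
        # A reference of 0 tolerates no deviation at all in this sketch.
        if abs(mean - ref) > max_relative_deviation * abs(ref):
            return True
    return False


def has_prohibited_value(ds, prohibited):
    """Second rule: some data element holds a value that is not allowed."""
    return any(r.get(name) in bad
               for r in ds.records
               for name, bad in prohibited.items())


def select_suspects(output, max_relative_deviation, prohibited):
    """Names of upstream datasets that fail at least one rule."""
    return [ds.name for ds in upstream_of(output)
            if profile_deviates(ds, max_relative_deviation)
            or has_prohibited_value(ds, prohibited)]


# Illustrative lineage: report <- orders <- customers
customers = Dataset("customers", [{"age": 30}, {"age": 50}],
                    reference_mean={"age": 41.0})
orders = Dataset("orders", [{"amount": 10.0, "currency": "XXX"}],
                 upstream=[customers], reference_mean={"amount": 10.0})
report = Dataset("report", [], upstream=[orders])

suspects = select_suspects(report, 0.10, {"currency": {"XXX"}})
print(suspects)
```

In this toy lineage only `orders` is selected, because it contains the prohibited currency code, while `customers` stays within the assumed 10% tolerance of its reference mean.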
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562174997P | 2015-06-12 | 2015-06-12 | |
US15/175,793 US10409802B2 (en) | 2015-06-12 | 2016-06-07 | Data quality analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
SG10201909389VA (en) | 2019-11-28 |
Family
ID=56178502
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10201909389V SG10201909389VA (en) | 2015-06-12 | 2016-06-10 | Data quality analysis |
Country Status (10)
Country | Link |
---|---|
US (2) | US10409802B2 (en) |
EP (2) | EP3308297B1 (en) |
JP (3) | JP6707564B2 (en) |
KR (1) | KR102033971B1 (en) |
CN (2) | CN117807065A (en) |
AU (2) | AU2016274791B2 (en) |
CA (2) | CA3185178C (en) |
HK (1) | HK1250066A1 (en) |
SG (1) | SG10201909389VA (en) |
WO (1) | WO2016201176A1 (en) |
Families Citing this family (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10409802B2 (en) | 2015-06-12 | 2019-09-10 | Ab Initio Technology Llc | Data quality analysis |
US9734188B1 (en) * | 2016-01-29 | 2017-08-15 | International Business Machines Corporation | Systematic approach to determine source of data quality issue in data flow in an enterprise |
US10776740B2 (en) | 2016-06-07 | 2020-09-15 | International Business Machines Corporation | Detecting potential root causes of data quality issues using data lineage graphs |
US10915508B2 (en) | 2016-06-30 | 2021-02-09 | Global Ids, Inc. | Data linking |
US10915545B2 (en) | 2016-09-29 | 2021-02-09 | Microsoft Technology Licensing, Llc | Systems and methods for dynamically rendering data lineage |
US10657120B2 (en) * | 2016-10-03 | 2020-05-19 | Bank Of America Corporation | Cross-platform digital data movement control utility and method of use thereof |
US11853529B2 (en) | 2016-11-07 | 2023-12-26 | Tableau Software, Inc. | User interface to prepare and curate data for subsequent analysis |
US10885057B2 (en) * | 2016-11-07 | 2021-01-05 | Tableau Software, Inc. | Correlated incremental loading of multiple data sets for an interactive data prep application |
US10242079B2 (en) | 2016-11-07 | 2019-03-26 | Tableau Software, Inc. | Optimizing execution of data transformation flows |
CA2989617A1 (en) | 2016-12-19 | 2018-06-19 | Capital One Services, Llc | Systems and methods for providing data quality management |
US10147040B2 (en) | 2017-01-20 | 2018-12-04 | Alchemy IoT | Device data quality evaluator |
US10855783B2 (en) * | 2017-01-23 | 2020-12-01 | Adobe Inc. | Communication notification trigger modeling preview |
US10298465B2 (en) * | 2017-08-01 | 2019-05-21 | Juniper Networks, Inc. | Using machine learning to monitor link quality and predict link faults |
US10394691B1 (en) * | 2017-10-05 | 2019-08-27 | Tableau Software, Inc. | Resolution of data flow errors using the lineage of detected error conditions |
US10783138B2 (en) * | 2017-10-23 | 2020-09-22 | Google Llc | Verifying structured data |
US10331660B1 (en) * | 2017-12-22 | 2019-06-25 | Capital One Services, Llc | Generating a data lineage record to facilitate source system and destination system mapping |
CN110413632B (en) * | 2018-04-26 | 2023-05-30 | 腾讯科技(深圳)有限公司 | Method, device, computer readable medium and electronic equipment for managing state |
CN119090190A (en) | 2018-06-12 | 2024-12-06 | 鹰图公司 | Artificial Intelligence Application in Computer Aided Dispatch System |
US10678660B2 (en) * | 2018-06-26 | 2020-06-09 | StreamSets, Inc. | Transformation drift detection and remediation |
JP7153500B2 (en) * | 2018-08-09 | 2022-10-14 | 富士通株式会社 | Data management device and data recommendation program |
BR112021006722A2 (en) * | 2018-10-09 | 2021-07-27 | Tableau Software, Inc. | correlated incremental loading of multiple datasets to an interactive data preparation application |
US10691304B1 (en) | 2018-10-22 | 2020-06-23 | Tableau Software, Inc. | Data preparation user interface with conglomerate heterogeneous process flow elements |
US11250032B1 (en) | 2018-10-22 | 2022-02-15 | Tableau Software, Inc. | Data preparation user interface with conditional remapping of data values |
US11704494B2 (en) * | 2019-05-31 | 2023-07-18 | Ab Initio Technology Llc | Discovering a semantic meaning of data fields from profile data of the data fields |
US11157470B2 (en) * | 2019-06-03 | 2021-10-26 | International Business Machines Corporation | Method and system for data quality delta analysis on a dataset |
US11100097B1 (en) | 2019-11-12 | 2021-08-24 | Tableau Software, Inc. | Visually defining multi-row table calculations in a data preparation application |
US11886399B2 (en) | 2020-02-26 | 2024-01-30 | Ab Initio Technology Llc | Generating rules for data processing values of data fields from semantic labels of the data fields |
KR102240496B1 (en) * | 2020-04-17 | 2021-04-15 | 주식회사 한국정보기술단 | Data quality management system and method |
US20220059238A1 (en) * | 2020-08-24 | 2022-02-24 | GE Precision Healthcare LLC | Systems and methods for generating data quality indices for patients |
CN112131303A (en) * | 2020-09-18 | 2020-12-25 | 天津大学 | Large-scale data lineage method based on neural network model |
US11277473B1 (en) * | 2020-12-01 | 2022-03-15 | Adp, Llc | Coordinating breaking changes in automatic data exchange |
US12117978B2 (en) * | 2020-12-09 | 2024-10-15 | Kyndryl, Inc. | Remediation of data quality issues in computer databases |
KR102608736B1 (en) * | 2020-12-15 | 2023-12-01 | 주식회사 포티투마루 | Search method and device for query in document |
US11921698B2 (en) | 2021-04-12 | 2024-03-05 | Torana Inc. | System and method for data quality assessment |
US12032994B1 (en) | 2021-10-18 | 2024-07-09 | Tableau Software, LLC | Linking outputs for automatic execution of tasks |
US20230185786A1 (en) * | 2021-12-13 | 2023-06-15 | International Business Machines Corporation | Detect data standardization gaps |
KR20230138074A (en) | 2022-03-23 | 2023-10-05 | 배재대학교 산학협력단 | Method and apparatus for managing data quality of academic information system using data profiling |
KR102437098B1 (en) * | 2022-04-15 | 2022-08-25 | 이찬영 | Method and apparatus for determining error data based on artificial intelligence |
US12242441B1 (en) * | 2022-07-11 | 2025-03-04 | Databricks, Inc. | Data lineage tracking |
AU2023347913A1 (en) | 2022-09-20 | 2025-03-27 | Ab Initio Technology Llc | Techniques for discovering and updating semantic meaning of data fields |
US12169486B2 (en) * | 2022-10-19 | 2024-12-17 | Snowflake Inc. | File-based error handling during ingestion with transformation |
US11822375B1 (en) * | 2023-04-28 | 2023-11-21 | Infosum Limited | Systems and methods for partially securing data |
EP4510001A1 (en) * | 2023-08-15 | 2025-02-19 | AB Initio Technology LLC | Data set evaluation based on data lineage analysis |
US12204538B1 (en) | 2023-09-06 | 2025-01-21 | Optum, Inc. | Dynamically tailored time intervals for federated query system |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5966072A (en) | 1996-07-02 | 1999-10-12 | Ab Initio Software Corporation | Executing computations expressed as graphs |
CN1853181A (en) * | 2003-09-15 | 2006-10-25 | Ab开元软件公司 | Data profiling |
AU2004275334B9 (en) | 2003-09-15 | 2011-06-16 | Ab Initio Technology Llc. | Data Profiling |
US7328428B2 (en) * | 2003-09-23 | 2008-02-05 | Trivergent Technologies, Inc. | System and method for generating data validation rules |
US7743420B2 (en) * | 2003-12-02 | 2010-06-22 | Imperva, Inc. | Dynamic learning method and adaptive normal behavior profile (NBP) architecture for providing fast protection of enterprise applications |
KR100582896B1 (en) | 2004-01-28 | 2006-05-24 | 삼성전자주식회사 | Software Version Automated Management System and Version Control Method |
US7716630B2 (en) | 2005-06-27 | 2010-05-11 | Ab Initio Technology Llc | Managing parameters for graph-based computations |
US20070174234A1 (en) | 2006-01-24 | 2007-07-26 | International Business Machines Corporation | Data quality and validation within a relational database management system |
JP2008265618A (en) | 2007-04-23 | 2008-11-06 | Toyota Motor Corp | In-vehicle electronic control unit |
CN101971165B (en) * | 2008-02-26 | 2013-07-17 | 起元技术有限责任公司 | Graphic representations of data relationships |
CN101425078A (en) | 2008-11-17 | 2009-05-06 | 阿里巴巴集团控股有限公司 | Software source code updating method and device |
CA2744463C (en) | 2008-12-02 | 2019-05-28 | Erik Bator | Visualizing relationships between data elements |
EP2440882B1 (en) * | 2009-06-10 | 2020-02-12 | Ab Initio Technology LLC | Generating test data |
CN102656554B (en) * | 2009-09-16 | 2019-09-10 | 起元技术有限责任公司 | Map data set element |
JP2011253491A (en) * | 2010-06-04 | 2011-12-15 | Toshiba Corp | Plant abnormality detector, method for the plant abnormality detector, and program |
US8819010B2 (en) | 2010-06-28 | 2014-08-26 | International Business Machines Corporation | Efficient representation of data lineage information |
JP5331774B2 (en) * | 2010-10-22 | 2013-10-30 | 株式会社日立パワーソリューションズ | Equipment state monitoring method and apparatus, and equipment state monitoring program |
JP2012146241A (en) | 2011-01-14 | 2012-08-02 | Canon Inc | Software update method, software update device, and software update program |
US10013439B2 (en) | 2011-06-27 | 2018-07-03 | International Business Machines Corporation | Automatic generation of instantiation rules to determine quality of data migration |
US9330148B2 (en) | 2011-06-30 | 2016-05-03 | International Business Machines Corporation | Adapting data quality rules based upon user application requirements |
US8812411B2 (en) * | 2011-11-03 | 2014-08-19 | Microsoft Corporation | Domains for knowledge-based data quality solution |
US9202174B2 (en) | 2013-01-28 | 2015-12-01 | Daniel A Dooley | Automated tracker and analyzer |
US10489360B2 (en) | 2012-10-17 | 2019-11-26 | Ab Initio Technology Llc | Specifying and applying rules to data |
US9063998B2 (en) | 2012-10-18 | 2015-06-23 | Oracle International Corporation | Associated information propagation system |
US9569342B2 (en) * | 2012-12-20 | 2017-02-14 | Microsoft Technology Licensing, Llc | Test strategy for profile-guided code execution optimizers |
US9558230B2 (en) | 2013-02-12 | 2017-01-31 | International Business Machines Corporation | Data quality assessment |
US9576036B2 (en) * | 2013-03-15 | 2017-02-21 | International Business Machines Corporation | Self-analyzing data processing job to determine data quality issues |
US9256656B2 (en) | 2013-08-20 | 2016-02-09 | International Business Machines Corporation | Determining reliability of data reports |
JP2014006933A (en) | 2013-10-11 | 2014-01-16 | Ricoh Co Ltd | Information processing device, apparatus, information processing system, installing support method, and installing support program |
US10409802B2 (en) | 2015-06-12 | 2019-09-10 | Ab Initio Technology Llc | Data quality analysis |
-
2016
- 2016-06-07 US US15/175,793 patent/US10409802B2/en active Active
- 2016-06-10 CN CN202311630856.2A patent/CN117807065A/en active Pending
- 2016-06-10 WO PCT/US2016/036813 patent/WO2016201176A1/en active Application Filing
- 2016-06-10 SG SG10201909389V patent/SG10201909389VA/en unknown
- 2016-06-10 AU AU2016274791A patent/AU2016274791B2/en active Active
- 2016-06-10 KR KR1020187001032A patent/KR102033971B1/en active Active
- 2016-06-10 CA CA3185178A patent/CA3185178C/en active Active
- 2016-06-10 CA CA2988256A patent/CA2988256A1/en active Pending
- 2016-06-10 JP JP2017559576A patent/JP6707564B2/en active Active
- 2016-06-10 CN CN201680034382.7A patent/CN107810500B/en active Active
- 2016-06-10 EP EP16731466.5A patent/EP3308297B1/en active Active
- 2016-06-10 EP EP21156728.4A patent/EP3839758B1/en active Active
-
2018
- 2018-07-23 HK HK18109490.6A patent/HK1250066A1/en unknown
-
2019
- 2019-07-17 US US16/513,882 patent/US11249981B2/en active Active
- 2019-10-24 AU AU2019253860A patent/AU2019253860B2/en active Active
-
2020
- 2020-05-20 JP JP2020088498A patent/JP7507602B2/en active Active
-
2023
- 2023-02-20 JP JP2023024532A patent/JP7654699B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
JP6707564B2 (en) | 2020-06-10 |
KR102033971B1 (en) | 2019-10-18 |
AU2016274791B2 (en) | 2019-07-25 |
JP7654699B2 (en) | 2025-04-01 |
AU2019253860A1 (en) | 2019-11-14 |
US20200057757A1 (en) | 2020-02-20 |
JP7507602B2 (en) | 2024-06-28 |
CA3185178A1 (en) | 2016-12-15 |
CA2988256A1 (en) | 2016-12-15 |
CN107810500A (en) | 2018-03-16 |
EP3839758B1 (en) | 2022-08-10 |
AU2019253860B2 (en) | 2021-12-09 |
US20160364434A1 (en) | 2016-12-15 |
JP2020161147A (en) | 2020-10-01 |
US10409802B2 (en) | 2019-09-10 |
KR20180030521A (en) | 2018-03-23 |
JP2018523195A (en) | 2018-08-16 |
CN107810500B (en) | 2023-12-08 |
CN117807065A (en) | 2024-04-02 |
HK1250066A1 (en) | 2018-11-23 |
EP3308297A1 (en) | 2018-04-18 |
WO2016201176A1 (en) | 2016-12-15 |
US11249981B2 (en) | 2022-02-15 |
JP2023062126A (en) | 2023-05-02 |
EP3839758A1 (en) | 2021-06-23 |
CA3185178C (en) | 2023-09-26 |
AU2016274791A1 (en) | 2017-11-30 |
EP3308297B1 (en) | 2021-03-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
SG10201909389VA (en) | Data quality analysis | |
AR109632A1 (en) | SYSTEMS FOR DETERMINING AGRONOMIC RESULTS FOR A CULTIVABLE REGION AND RELATED METHODS AND APPLIANCES | |
MX2019013495A (en) | System and method for biometric identification. | |
MX381169B (en) | METHODS AND SYSTEMS FOR THE DETECTION OF COPY NUMBER VARIANTS. | |
WO2018039381A3 (en) | Interface tool for asset fault analysis | |
GB2553451A (en) | Implementing a neural network algorithm on a neurosynaptic substrate based on metadata associated with the neural network algorithm | |
PH12018501016B1 (en) | Information recommendation method and apparatus | |
WO2015048212A3 (en) | Evaluating rules applied to data | |
BR112018077322A2 (en) | systems and methods for identifying match content | |
EP4375952A3 (en) | Systems and methods for reducing data density in large datasets | |
WO2016164680A3 (en) | Automated model development process | |
MX2015000193A (en) | Private information hiding method and device. | |
GB2525719A8 (en) | Method and system for providing a vulnerability management and verification service | |
PH12018550190A1 (en) | Automation of image validation | |
EP4242892A3 (en) | Code pointer authentication for hardware flow control | |
MX368852B (en) | Setting different background model sensitivities by user defined regions and background filters. | |
MX2016006745A (en) | Method and apparatus for determining associated user. | |
SG11201909119YA (en) | Search method and apparatus and non-temporary computer-readable storage medium | |
SG11201909253QA (en) | Method and apparatus of data procrssing for online analytical processing | |
MX2020013214A (en) | Updating executable graphs. | |
EP3779835A4 (en) | Device, method and program for analyzing customer attribute information | |
EP3842944A4 (en) | Information processing device, abnormality analyzing method, and program | |
MX2017009709A (en) | Method for evaluating the authenticity of a painting as well as a corresponding use. | |
SG10201901587VA (en) | Application testing | |
EP3989235A4 (en) | Program, testing device, information processing device, and information processing method |