SG10201909389VA - Data quality analysis - Google Patents

Data quality analysis

Info

Publication number
SG10201909389VA
Authority
SG
Singapore
Prior art keywords
dataset
upstream
datasets
quality analysis
data quality
Prior art date
Application number
Inventor
Chuck Spitz
Joel Gould
Original Assignee
Ab Initio Technology Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ab Initio Technology Llc
Publication of SG10201909389VA

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/197 Version control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)
  • Debugging And Monitoring (AREA)
  • User Interface Of Digital Computer (AREA)
  • Analysing Materials By The Use Of Radiation (AREA)
  • Automatic Analysis And Handling Materials Therefor (AREA)

Abstract

DATA QUALITY ANALYSIS

A method includes receiving information indicative of an output dataset generated by a data processing system; identifying, based on data lineage information relating to the output dataset, one or more upstream datasets on which the output dataset depends; and analyzing one or more of the identified upstream datasets on which the output dataset depends. The analyzing includes, for each particular upstream dataset of the one or more upstream datasets, applying one or more of: (i) a first rule indicative of an allowable deviation between a profile of the particular upstream dataset and a reference profile for the particular upstream dataset, and (ii) a second rule indicative of one or more allowable values or prohibited values for each of one or more data elements in the particular upstream dataset; and, based on the results of applying the one or more rules, selecting one or more of the upstream datasets. The method includes outputting information associated with the selected one or more upstream datasets. (Fig. 1)
SG10201909389V 2015-06-12 2016-06-10 Data quality analysis SG10201909389VA (en)
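
The flow summarized in the abstract (trace lineage upstream from an output dataset, apply a profile-deviation rule and an allowable-values rule to each upstream dataset, then report the datasets that fail) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name here (`Dataset`, `upstream_of`, `select_suspects`, the record-count "profile", and the 10% tolerance) is a hypothetical choice made only for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    records: list[dict]
    upstream: list[str] = field(default_factory=list)  # direct lineage edges (assumed model)


def upstream_of(output: str, catalog: dict[str, Dataset]) -> set[str]:
    """Follow lineage edges from the output dataset to every dataset it depends on."""
    found: set[str] = set()
    stack = list(catalog[output].upstream)
    while stack:
        name = stack.pop()
        if name not in found:
            found.add(name)
            stack.extend(catalog[name].upstream)
    return found


def violates_profile_rule(ds: Dataset, reference_count: int, max_rel_dev: float) -> bool:
    """First rule: the profile (here just a record count) strays too far from its reference."""
    return abs(len(ds.records) - reference_count) > max_rel_dev * reference_count


def violates_value_rule(ds: Dataset, allowed: dict[str, set]) -> bool:
    """Second rule: some data element holds a value outside its allowable set."""
    return any(rec.get(fld) not in values
               for rec in ds.records
               for fld, values in allowed.items())


def select_suspects(output, catalog, reference_counts, allowed_values, max_rel_dev=0.1):
    """Apply whichever rules are configured to each upstream dataset; return the failures."""
    suspects = []
    for name in upstream_of(output, catalog):
        ds = catalog[name]
        bad_profile = (name in reference_counts and
                       violates_profile_rule(ds, reference_counts[name], max_rel_dev))
        bad_values = (name in allowed_values and
                      violates_value_rule(ds, allowed_values[name]))
        if bad_profile or bad_values:
            suspects.append(name)
    return sorted(suspects)


# Toy lineage: report <- joined <- {customers, orders}
catalog = {
    "report": Dataset("report", [], ["joined"]),
    "joined": Dataset("joined", [{"id": i} for i in range(3)], ["customers", "orders"]),
    "customers": Dataset("customers", [{"country": "SG"}, {"country": "US"}, {"country": "??"}]),
    "orders": Dataset("orders", [{"id": i} for i in range(5)]),
}
reference_counts = {"joined": 3, "customers": 3, "orders": 10}
allowed_values = {"customers": {"country": {"SG", "US"}}}

suspects = select_suspects("report", catalog, reference_counts, allowed_values)
print(suspects)
```

In this toy data, `customers` fails the value rule because of the `"??"` country code, and `orders` fails the profile rule because it holds half the reference record count. `joined` passes both rules.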

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562174997P 2015-06-12 2015-06-12
US15/175,793 US10409802B2 (en) 2015-06-12 2016-06-07 Data quality analysis

Publications (1)

Publication Number Publication Date
SG10201909389VA (en) 2019-11-28

Family

ID=56178502

Family Applications (1)

Application Number Title Priority Date Filing Date
SG10201909389V SG10201909389VA (en) 2015-06-12 2016-06-10 Data quality analysis

Country Status (10)

Country Link
US (2) US10409802B2 (en)
EP (2) EP3308297B1 (en)
JP (3) JP6707564B2 (en)
KR (1) KR102033971B1 (en)
CN (2) CN117807065A (en)
AU (2) AU2016274791B2 (en)
CA (2) CA3185178C (en)
HK (1) HK1250066A1 (en)
SG (1) SG10201909389VA (en)
WO (1) WO2016201176A1 (en)

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10409802B2 (en) 2015-06-12 2019-09-10 Ab Initio Technology Llc Data quality analysis
US9734188B1 (en) * 2016-01-29 2017-08-15 International Business Machines Corporation Systematic approach to determine source of data quality issue in data flow in an enterprise
US10776740B2 (en) 2016-06-07 2020-09-15 International Business Machines Corporation Detecting potential root causes of data quality issues using data lineage graphs
US10915508B2 (en) 2016-06-30 2021-02-09 Global Ids, Inc. Data linking
US10915545B2 (en) 2016-09-29 2021-02-09 Microsoft Technology Licensing, Llc Systems and methods for dynamically rendering data lineage
US10657120B2 (en) * 2016-10-03 2020-05-19 Bank Of America Corporation Cross-platform digital data movement control utility and method of use thereof
US11853529B2 (en) 2016-11-07 2023-12-26 Tableau Software, Inc. User interface to prepare and curate data for subsequent analysis
US10885057B2 (en) * 2016-11-07 2021-01-05 Tableau Software, Inc. Correlated incremental loading of multiple data sets for an interactive data prep application
US10242079B2 (en) 2016-11-07 2019-03-26 Tableau Software, Inc. Optimizing execution of data transformation flows
CA2989617A1 (en) 2016-12-19 2018-06-19 Capital One Services, Llc Systems and methods for providing data quality management
US10147040B2 (en) 2017-01-20 2018-12-04 Alchemy IoT Device data quality evaluator
US10855783B2 (en) * 2017-01-23 2020-12-01 Adobe Inc. Communication notification trigger modeling preview
US10298465B2 (en) * 2017-08-01 2019-05-21 Juniper Networks, Inc. Using machine learning to monitor link quality and predict link faults
US10394691B1 (en) * 2017-10-05 2019-08-27 Tableau Software, Inc. Resolution of data flow errors using the lineage of detected error conditions
US10783138B2 (en) * 2017-10-23 2020-09-22 Google Llc Verifying structured data
US10331660B1 (en) * 2017-12-22 2019-06-25 Capital One Services, Llc Generating a data lineage record to facilitate source system and destination system mapping
CN110413632B (en) * 2018-04-26 2023-05-30 腾讯科技(深圳)有限公司 Method, device, computer readable medium and electronic equipment for managing state
CN119090190A (en) 2018-06-12 2024-12-06 鹰图公司 Artificial Intelligence Application in Computer Aided Dispatch System
US10678660B2 (en) * 2018-06-26 2020-06-09 StreamSets, Inc. Transformation drift detection and remediation
JP7153500B2 (en) * 2018-08-09 2022-10-14 富士通株式会社 Data management device and data recommendation program
BR112021006722A2 (en) * 2018-10-09 2021-07-27 Tableau Software, Inc. correlated incremental loading of multiple datasets to an interactive data preparation application
US10691304B1 (en) 2018-10-22 2020-06-23 Tableau Software, Inc. Data preparation user interface with conglomerate heterogeneous process flow elements
US11250032B1 (en) 2018-10-22 2022-02-15 Tableau Software, Inc. Data preparation user interface with conditional remapping of data values
US11704494B2 (en) * 2019-05-31 2023-07-18 Ab Initio Technology Llc Discovering a semantic meaning of data fields from profile data of the data fields
US11157470B2 (en) * 2019-06-03 2021-10-26 International Business Machines Corporation Method and system for data quality delta analysis on a dataset
US11100097B1 (en) 2019-11-12 2021-08-24 Tableau Software, Inc. Visually defining multi-row table calculations in a data preparation application
US11886399B2 (en) 2020-02-26 2024-01-30 Ab Initio Technology Llc Generating rules for data processing values of data fields from semantic labels of the data fields
KR102240496B1 (en) * 2020-04-17 2021-04-15 주식회사 한국정보기술단 Data quality management system and method
US20220059238A1 (en) * 2020-08-24 2022-02-24 GE Precision Healthcare LLC Systems and methods for generating data quality indices for patients
CN112131303A (en) * 2020-09-18 2020-12-25 天津大学 Large-scale data lineage method based on neural network model
US11277473B1 (en) * 2020-12-01 2022-03-15 Adp, Llc Coordinating breaking changes in automatic data exchange
US12117978B2 (en) * 2020-12-09 2024-10-15 Kyndryl, Inc. Remediation of data quality issues in computer databases
KR102608736B1 (en) * 2020-12-15 2023-12-01 주식회사 포티투마루 Search method and device for query in document
US11921698B2 (en) 2021-04-12 2024-03-05 Torana Inc. System and method for data quality assessment
US12032994B1 (en) 2021-10-18 2024-07-09 Tableau Software, LLC Linking outputs for automatic execution of tasks
US20230185786A1 (en) * 2021-12-13 2023-06-15 International Business Machines Corporation Detect data standardization gaps
KR20230138074A (en) 2022-03-23 2023-10-05 배재대학교 산학협력단 Method and apparatus for managing data quality of academic information system using data profiling
KR102437098B1 (en) * 2022-04-15 2022-08-25 이찬영 Method and apparatus for determining error data based on artificial intenigence
US12242441B1 (en) * 2022-07-11 2025-03-04 Databricks, Inc. Data lineage tracking
AU2023347913A1 (en) 2022-09-20 2025-03-27 Ab Initio Technology Llc Techniques for discovering and updating semantic meaning of data fields
US12169486B2 (en) * 2022-10-19 2024-12-17 Snowflake Inc. File-based error handling during ingestion with transformation
US11822375B1 (en) * 2023-04-28 2023-11-21 Infosum Limited Systems and methods for partially securing data
EP4510001A1 (en) * 2023-08-15 2025-02-19 AB Initio Technology LLC Data set evaluation based on data lineage analysis
US12204538B1 (en) 2023-09-06 2025-01-21 Optum, Inc. Dynamically tailored time intervals for federated query system

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5966072A (en) 1996-07-02 1999-10-12 Ab Initio Software Corporation Executing computations expressed as graphs
CN1853181A (en) * 2003-09-15 2006-10-25 Ab开元软件公司 Data profiling
AU2004275334B9 (en) 2003-09-15 2011-06-16 Ab Initio Technology Llc. Data Profiling
US7328428B2 (en) * 2003-09-23 2008-02-05 Trivergent Technologies, Inc. System and method for generating data validation rules
US7743420B2 (en) * 2003-12-02 2010-06-22 Imperva, Inc. Dynamic learning method and adaptive normal behavior profile (NBP) architecture for providing fast protection of enterprise applications
KR100582896B1 (en) 2004-01-28 2006-05-24 삼성전자주식회사 Software Version Automated Management System and Version Control Method
US7716630B2 (en) 2005-06-27 2010-05-11 Ab Initio Technology Llc Managing parameters for graph-based computations
US20070174234A1 (en) 2006-01-24 2007-07-26 International Business Machines Corporation Data quality and validation within a relational database management system
JP2008265618A (en) 2007-04-23 2008-11-06 Toyota Motor Corp In-vehicle electronic control unit
CN101971165B (en) * 2008-02-26 2013-07-17 起元技术有限责任公司 Graphic representations of data relationships
CN101425078A (en) 2008-11-17 2009-05-06 阿里巴巴集团控股有限公司 Software source code updating method and device
CA2744463C (en) 2008-12-02 2019-05-28 Erik Bator Visualizing relationships between data elements
EP2440882B1 (en) * 2009-06-10 2020-02-12 Ab Initio Technology LLC Generating test data
CN102656554B (en) * 2009-09-16 2019-09-10 起元技术有限责任公司 Map data set element
JP2011253491A (en) * 2010-06-04 2011-12-15 Toshiba Corp Plant abnormality detector, method for the plant abnormality detector, and program
US8819010B2 (en) 2010-06-28 2014-08-26 International Business Machines Corporation Efficient representation of data lineage information
JP5331774B2 (en) * 2010-10-22 2013-10-30 株式会社日立パワーソリューションズ Equipment state monitoring method and apparatus, and equipment state monitoring program
JP2012146241A (en) 2011-01-14 2012-08-02 Canon Inc Software update method, software update device, and software update program
US10013439B2 (en) 2011-06-27 2018-07-03 International Business Machines Corporation Automatic generation of instantiation rules to determine quality of data migration
US9330148B2 (en) 2011-06-30 2016-05-03 International Business Machines Corporation Adapting data quality rules based upon user application requirements
US8812411B2 (en) * 2011-11-03 2014-08-19 Microsoft Corporation Domains for knowledge-based data quality solution
US9202174B2 (en) 2013-01-28 2015-12-01 Daniel A Dooley Automated tracker and analyzer
US10489360B2 (en) 2012-10-17 2019-11-26 Ab Initio Technology Llc Specifying and applying rules to data
US9063998B2 (en) 2012-10-18 2015-06-23 Oracle International Corporation Associated information propagation system
US9569342B2 (en) * 2012-12-20 2017-02-14 Microsoft Technology Licensing, Llc Test strategy for profile-guided code execution optimizers
US9558230B2 (en) 2013-02-12 2017-01-31 International Business Machines Corporation Data quality assessment
US9576036B2 (en) * 2013-03-15 2017-02-21 International Business Machines Corporation Self-analyzing data processing job to determine data quality issues
US9256656B2 (en) 2013-08-20 2016-02-09 International Business Machines Corporation Determining reliability of data reports
JP2014006933A (en) 2013-10-11 2014-01-16 Ricoh Co Ltd Information processing device, apparatus, information processing system, installing support method, and installing support program
US10409802B2 (en) 2015-06-12 2019-09-10 Ab Initio Technology Llc Data quality analysis

Also Published As

Publication number Publication date
JP6707564B2 (en) 2020-06-10
KR102033971B1 (en) 2019-10-18
AU2016274791B2 (en) 2019-07-25
JP7654699B2 (en) 2025-04-01
AU2019253860A1 (en) 2019-11-14
US20200057757A1 (en) 2020-02-20
JP7507602B2 (en) 2024-06-28
CA3185178A1 (en) 2016-12-15
CA2988256A1 (en) 2016-12-15
CN107810500A (en) 2018-03-16
EP3839758B1 (en) 2022-08-10
AU2019253860B2 (en) 2021-12-09
US20160364434A1 (en) 2016-12-15
JP2020161147A (en) 2020-10-01
US10409802B2 (en) 2019-09-10
KR20180030521A (en) 2018-03-23
JP2018523195A (en) 2018-08-16
CN107810500B (en) 2023-12-08
CN117807065A (en) 2024-04-02
HK1250066A1 (en) 2018-11-23
EP3308297A1 (en) 2018-04-18
WO2016201176A1 (en) 2016-12-15
US11249981B2 (en) 2022-02-15
JP2023062126A (en) 2023-05-02
EP3839758A1 (en) 2021-06-23
CA3185178C (en) 2023-09-26
AU2016274791A1 (en) 2017-11-30
EP3308297B1 (en) 2021-03-24

Similar Documents

Publication Publication Date Title
SG10201909389VA (en) Data quality analysis
AR109632A1 (en) SYSTEMS FOR DETERMINING AGRONOMIC RESULTS FOR A CULTIVABLE REGION AND RELATED METHODS AND APPLIANCES
MX2019013495A (en) System and method for biometric identification.
MX381169B (en) METHODS AND SYSTEMS FOR THE DETECTION OF COPY NUMBER VARIANTS.
WO2018039381A3 (en) Interface tool for asset fault analysis
GB2553451A (en) Implementing a neural network algorithm on a neurosynaptic substrate based on metadata associated with the neural network algorithm
PH12018501016B1 (en) Information recommendation method and apparatus
WO2015048212A3 (en) Evaluating rules applied to data
BR112018077322A2 (en) systems and methods for identifying match content
EP4375952A3 (en) Systems and methods for reducing data density in large datasets
WO2016164680A3 (en) Automated model development process
MX2015000193A (en) Private information hiding method and device.
GB2525719A8 (en) Method and system for providing a vulnerability management and verification service
PH12018550190A1 (en) Automation of image validation
EP4242892A3 (en) Code pointer authentication for hardware flow control
MX368852B (en) Setting different background model sensitivities by user defined regions and background filters.
MX2016006745A (en) Method and apparatus for determining associated user.
SG11201909119YA (en) Search method and apparatus and non-temporary computer-readable storage medium
SG11201909253QA (en) Method and apparatus of data processing for online analytical processing
MX2020013214A (en) Updating executable graphs.
EP3779835A4 (en) Device, method and program for analyzing customer attribute information
EP3842944A4 (en) Information processing device, abnormality analyzing method, and program
MX2017009709A (en) Method for evaluating the authenticity of a painting as well as a corresponding use.
SG10201901587VA (en) Application testing
EP3989235A4 (en) Program, testing device, information processing device, and information processing method