How To Use The Dataplex Data Lineage Feature

The document explains how to use the Dataplex data lineage feature for tracking metadata throughout the data lifecycle. It details the steps to enable the Data Lineage API and access the data lineage graph in both BigQuery and Dataplex. Key takeaways include the built-in tracking capabilities for data events in both platforms.

Uploaded by

Brou Brou

How to use the Dataplex data lineage feature
So far, you’ve learned that data tracing is the practice of tracking metadata to provide insight
into the path data has taken through an entire data lifecycle. Dataplex’s lineage feature
provides a built-in data tracing tool for all the data in your organization. This means you can
review and search metadata for available data across your organization, and trace its
transformation as it moves through the data lifecycle. In this reading, you’ll learn how to use
the Dataplex data lineage feature.

Enabling data lineage tracing


To use the automatic lineage tracing features offered by Google Cloud, you need to enable the
Data Lineage API in Dataplex. The Data Lineage API traces metadata from BigQuery, Cloud
Data Fusion, Cloud Composer, and Dataproc.
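
Enabling the API can also be scripted rather than done in the console. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. The helper names and the project ID `my-project` are placeholders introduced for illustration and do not come from this reading. The sketch builds the `gcloud services enable` command for the Data Lineage API service (`datalineage.googleapis.com`) and runs it only when `dry_run` is turned off.

```python
import subprocess

# Service name of the Data Lineage API.
DATA_LINEAGE_SERVICE = "datalineage.googleapis.com"


def build_enable_command(project_id: str) -> list[str]:
    """Return the gcloud command that enables the Data Lineage API."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return [
        "gcloud", "services", "enable", DATA_LINEAGE_SERVICE,
        "--project", project_id,
    ]


def enable_data_lineage(project_id: str, dry_run: bool = True) -> list[str]:
    """Build the command and, unless dry_run is set, execute it."""
    command = build_enable_command(project_id)
    if not dry_run:
        # Requires an installed, authenticated gcloud CLI.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # "my-project" is a placeholder; replace it with a real project ID.
    print(" ".join(enable_data_lineage("my-project")))
```

Keeping `dry_run=True` as the default lets you inspect the command before it changes anything in a project.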

Lineage metadata collected from BigQuery covers copy, load, and query jobs. Tracked query jobs include those that create tables and views, as well as SQL statements such as SELECT, MERGE, UPDATE, and DELETE.

Navigate to the data lineage graph in BigQuery

To access the data lineage graph, first navigate to BigQuery from the Google Cloud console.
Then within BigQuery, navigate directly to the data lineage graph with these steps:

1. Open the BigQuery SQL workspace page.

2. Open the table whose data lineage you want to review.

3. Click the “Lineage” tab.

4. Select each of the process buttons to learn more about the transformation or action
that occurred.

To learn more about an action, click any of the BigQuery icons to reveal a “Details”
table.
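
The same lineage information can also be read programmatically. The sketch below assumes that the `google-cloud-datacatalog-lineage` Python client library is installed and that application default credentials are configured. The helper names, the project, dataset, and table IDs, and the `us` location are all placeholders chosen for illustration. It asks the Data Lineage API for links whose target is a given BigQuery table and returns the fully qualified names of the sources.

```python
def table_fqn(project_id: str, dataset_id: str, table_id: str) -> str:
    """Fully qualified name of a BigQuery table, as used in lineage links."""
    return f"bigquery:{project_id}.{dataset_id}.{table_id}"


def lineage_parent(project_id: str, location: str) -> str:
    """Resource name of the project and location that hold the lineage."""
    return f"projects/{project_id}/locations/{location}"


def upstream_sources(project_id: str, location: str,
                     dataset_id: str, table_id: str) -> list[str]:
    """List the entities that feed directly into the given table."""
    # Imported lazily so the helpers above work without the client library.
    from google.cloud import datacatalog_lineage_v1 as lineage

    client = lineage.LineageClient()
    request = lineage.SearchLinksRequest(
        parent=lineage_parent(project_id, location),
        target=lineage.EntityReference(
            fully_qualified_name=table_fqn(project_id, dataset_id, table_id)
        ),
    )
    # Each link connects one source entity to one target entity.
    return [
        link.source.fully_qualified_name
        for link in client.search_links(request=request)
    ]


if __name__ == "__main__":
    # Placeholder identifiers; no API call is made here.
    print(table_fqn("my-project", "my_dataset", "my_table"))
```

Calling `upstream_sources("my-project", "us", "my_dataset", "my_table")` would contact the API, so it needs a project where the Data Lineage API is enabled and lineage has already been recorded.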

Navigate to the data lineage graph in Dataplex

Another way to access the data lineage graph is to navigate to Dataplex from the Google
Cloud console. In the Dataplex user interface, you can navigate directly to the data lineage
graph by completing these steps:

1. Open the Dataplex search page.

2. Navigate to the entry details page.

3. Click the “Lineage” tab.

4. Review the data lineage diagram, where icons indicate which service transformed the data.
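
To make the structure of such a diagram concrete, here is a small in-memory model of a lineage graph. It is an illustrative sketch only and is not how Dataplex stores lineage. The `Link` class, the `trace_upstream` function, and the sample table names are all hypothetical. Each link records a source, a target, and the service that performed the transformation, and the function walks the graph upstream from a chosen entity.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str   # entity the data was read from
    target: str   # entity the data was written to
    service: str  # service that ran the transformation


def trace_upstream(links: list[Link], entity: str) -> list[tuple[str, str]]:
    """Return (source, service) pairs found upstream of `entity`, breadth-first."""
    # Index links by target so incoming edges can be looked up quickly.
    by_target: dict[str, list[Link]] = {}
    for link in links:
        by_target.setdefault(link.target, []).append(link)

    seen = {entity}
    queue = deque([entity])
    found: list[tuple[str, str]] = []
    while queue:
        current = queue.popleft()
        for link in by_target.get(current, []):
            if link.source not in seen:
                seen.add(link.source)
                found.append((link.source, link.service))
                queue.append(link.source)
    return found


if __name__ == "__main__":
    # Hypothetical sample lineage, for illustration only.
    sample = [
        Link("raw.orders", "staging.orders", "Dataproc"),
        Link("staging.orders", "reports.daily_sales", "BigQuery"),
        Link("raw.customers", "reports.daily_sales", "BigQuery"),
    ]
    for source, service in trace_upstream(sample, "reports.daily_sales"):
        print(f"{source} (read by {service})")
```

The service paired with each source is the one that read from it, which corresponds to the icon shown on that step of the diagram.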

Key takeaways
BigQuery and Dataplex offer platform-native options to generate data lineage graphs. Data
that has been queried by BigQuery or managed by Dataplex comes with built-in tracking for
data events.

Resources for more information


For source documentation about how to create a data lineage graph, check out this link:
● Google Cloud documentation provides step-by-step directions on how to create a data
lineage graph:
https://cloud.google.com/data-catalog/docs/how-to/lineage-gcp#view-bq-lineage-graphs
