Use data lineage with Google Cloud systems

Data lineage shows the relationships between your project's resources and the processes that created them. You can view data lineage information as a graph visualization or a list view in the Google Cloud console, or retrieve it as JSON data from the Data Lineage API.

Lineage is captured across projects. When you view lineage that is generated from multiple projects, you can view the aggregated lineage information in any of the relevant projects.
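As a rough illustration of retrieving lineage as JSON, the following Python sketch builds and sends a request that looks up the links pointing at one entry. The endpoint path (`searchLinks` under a project and location), the request body shape, and the `bigquery:project.dataset.table` fully qualified name format are assumptions based on the public v1 REST API and should be checked against the Data Lineage API reference. The project, location, and table names are hypothetical.

```python
import json
import urllib.request

# Assumed REST root of the Data Lineage API (v1); verify against the API reference.
API_ROOT = "https://datalineage.googleapis.com/v1"


def build_search_links_request(project_id, location, target_fqn):
    """Return (url, body) for a searchLinks call that finds links whose target is target_fqn."""
    url = f"{API_ROOT}/projects/{project_id}/locations/{location}:searchLinks"
    body = {"target": {"fullyQualifiedName": target_fqn}}
    return url, body


def search_links(project_id, location, target_fqn, access_token):
    """Send the request and return the parsed JSON response.

    Needs network access and an OAuth access token for an account with a
    lineage viewer role; it is not called in this sketch.
    """
    url, body = build_search_links_request(project_id, location, target_fqn)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical identifiers, for illustration only.
url, body = build_search_links_request(
    "my-project", "us", "bigquery:my-project.sales.orders"
)
print(url)
print(json.dumps(body))
```

Separating request construction from sending keeps the sketch easy to inspect without credentials. In practice, a client library is likely a better choice than raw HTTP calls.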

Roles and permissions

To view lineage information, ask your administrator to grant you viewer roles as described in Predefined data lineage roles. You need access in both the project where you view lineage and the projects in which lineage is recorded.

Data Catalog tracks lineage information automatically when you enable the Data Lineage API. You don't need any administrator or editor roles to capture lineage for your data assets.

For more information about granting roles, see Manage access. You can also assign a role at a higher level, such as the folder or organization (see Grant or revoke a single role).

Enable data lineage

Enable data lineage to begin automatically tracking lineage information for supported systems. You must enable the Data Lineage API in both the project where you view lineage and the projects in which lineage is recorded. For more information, see Project types.

  1. To capture lineage information, do the following:

    1. In the Google Cloud console, on the Project selector page, select the project in which you want to record lineage.

      Go to Project selector

    2. Enable the Data Lineage API.

      Enable the Data Lineage API

    3. Repeat the previous steps for each project in which you want to record lineage.
  2. In the project where you view lineage, enable the Data Lineage API and the Data Catalog API.

    Enable the APIs
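The steps above amount to a small per-project plan: the Data Lineage API in every recording project, plus the Data Lineage API and the Data Catalog API in the viewing project. The Python sketch below computes that plan and prints the Service Usage API URL that would enable each service. The service names (`datalineage.googleapis.com`, `datacatalog.googleapis.com`), the `:enable` URL shape, and the project IDs are assumptions for illustration; confirm them before scripting against a real organization.

```python
# Service names as assumed here; confirm them in the API Library before use.
DATA_LINEAGE_API = "datalineage.googleapis.com"
DATA_CATALOG_API = "datacatalog.googleapis.com"


def services_to_enable(recording_projects, viewing_project):
    """Map each project ID to the sorted list of services it needs enabled."""
    # Every project that records lineage needs the Data Lineage API.
    plan = {project: {DATA_LINEAGE_API} for project in recording_projects}
    # The viewing project needs both APIs; it may also be a recording project.
    plan.setdefault(viewing_project, set()).update(
        {DATA_LINEAGE_API, DATA_CATALOG_API}
    )
    return {project: sorted(services) for project, services in plan.items()}


def enable_url(project_id, service):
    """Assumed Service Usage endpoint that enables one service (POST, empty body)."""
    return (
        "https://serviceusage.googleapis.com/v1/"
        f"projects/{project_id}/services/{service}:enable"
    )


# Hypothetical project IDs.
plan = services_to_enable(["etl-prod", "etl-staging"], "analytics-view")
for project, services in plan.items():
    for service in services:
        print(enable_url(project, service))
```

The sketch only prints URLs and does not call any API, so it can be reviewed before any change is made.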

View lineage in Dataplex UI

You can view data lineage information in the Dataplex UI in the form of a graph or a list.

Lineage graphs represent information gathered by the Data Lineage API for a particular entry.

A sample graph shows data from two tables being transformed and then merged.
Figure 1. Example of a lineage visualization graph in Dataplex UI.

The lineage list view (Preview) displays detailed lineage information for entities in a single table, including entities with many connections.

To view the lineage, follow these instructions:

  1. Open the Dataplex search page and find the asset for which you want to view lineage information.

    Open the Dataplex search page

    For more information, see How to search for data assets.

  2. On the entry details page, select the Lineage tab.

  3. Select the process or data source buttons to display the details panel.

  4. To view upstream or downstream lineage information for a resource, click Expand.

  5. To view lineage in list view instead of graph view, click List.
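To make the upstream and downstream terminology concrete, the following Python sketch treats lineage as a directed graph of (source, target) links and collects everything reachable from one entity in either direction. It is a simplified model for illustration, not how the console computes its views. The entity names are hypothetical and mirror the sample graph, in which two tables are transformed and then merged.

```python
from collections import defaultdict, deque


def lineage_closure(links, start, direction="upstream"):
    """Return every entity reachable from start by following links one way.

    links is an iterable of (source, target) pairs. "upstream" walks from a
    target back to its sources; "downstream" walks from a source to its targets.
    """
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")

    neighbors = defaultdict(set)
    for source, target in links:
        if direction == "upstream":
            neighbors[target].add(source)
        else:
            neighbors[source].add(target)

    # Breadth-first search; the seen set also guards against cycles.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical links: two raw tables are cleaned, then merged into a report.
links = [
    ("raw_orders", "clean_orders"),
    ("raw_customers", "clean_customers"),
    ("clean_orders", "report"),
    ("clean_customers", "report"),
]
upstream_of_report = lineage_closure(links, "report", "upstream")
downstream_of_raw_orders = lineage_closure(links, "raw_orders", "downstream")
print(sorted(upstream_of_report))
print(sorted(downstream_of_raw_orders))
```

In this model, expanding a node in the graph view corresponds to following one more hop of links in the chosen direction.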

View lineage in BigQuery UI

You can view data lineage information in the BigQuery UI in the form of a graph or a list (Preview).

To view the lineage, follow these instructions:

  1. In the Google Cloud console, go to the BigQuery page.

    Open the BigQuery page

  2. Open the table for which you want to see the data lineage.

  3. Click the Lineage tab.

  4. Select the process or data source buttons to display the details panel.

  5. To view upstream or downstream lineage information for a resource, click Expand.

  6. To view lineage in list view instead of graph view, click List.

View lineage in Vertex AI UI

Systems like Vertex AI Pipelines generate lineage data for Vertex AI models and datasets. You can view data lineage information in the Vertex AI UI in the form of a graph or a list (Preview).

View lineage for a managed dataset in Vertex AI

To view the lineage for a dataset, follow these instructions:

  1. In the Google Cloud console, go to the Datasets page.

    Open the Datasets page

  2. Click the dataset for which you want to see the data lineage.

  3. Click the Lineage tab.

  4. Select the process or data source buttons to display the details panel.

  5. To view upstream or downstream lineage information for a resource, click Expand.

  6. To view lineage in list view instead of graph view, click List.

View lineage for a model in Vertex AI

To view the lineage for a model, follow these instructions:

  1. In the Google Cloud console, go to the Model Registry page.

    Open the Model Registry page

  2. Click the model for which you want to see the data lineage.

  3. Click the Lineage tab.

  4. Select the process or data source buttons to display the details panel.

  5. To view upstream or downstream lineage information for a resource, click Expand.

  6. To view lineage in list view instead of graph view, click List.

What's next