[DOCS-10234] Test Health (new page) by joepeeples · Pull Request #28926 · DataDog/documentation
Merged · 8 commits · May 20, 2025
47 changes: 26 additions & 21 deletions config/_default/menus/main.en.yaml
@@ -4760,111 +4760,116 @@ menu:
parent: tests
identifier: tests_monitors
weight: 6
- name: Test Health
url: tests/test_health
parent: tests
identifier: tests_test_health
weight: 7
- name: Flaky Test Management
url: tests/flaky_test_management
parent: tests
identifier: tests_flaky_test_management
weight: 7
weight: 8
- name: Early Flake Detection
url: tests/flaky_test_management/early_flake_detection
parent: tests_flaky_test_management
identifier: tests_early_flake_detection
weight: 701
weight: 801
- name: Auto Test Retries
url: tests/flaky_test_management/auto_test_retries
parent: tests_flaky_test_management
identifier: tests_auto_test_retries
weight: 702
weight: 802
- name: Test Impact Analysis
url: tests/test_impact_analysis
parent: tests
identifier: test_impact_analysis
weight: 8
weight: 9
- name: Setup
url: tests/test_impact_analysis/setup/
parent: test_impact_analysis
identifier: test_impact_analysis_setup
weight: 801
weight: 901
- name: .NET
url: tests/test_impact_analysis/setup/dotnet/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_dotnet
weight: 8101
weight: 9101
- name: JavaScript and TypeScript
url: tests/test_impact_analysis/setup/javascript/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_javascript
weight: 8102
weight: 9102
- name: Python
url: tests/test_impact_analysis/setup/python/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_python
weight: 8103
weight: 9103
- name: Swift
url: tests/test_impact_analysis/setup/swift/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_swift
weight: 8104
weight: 9104
- name: Java
url: tests/test_impact_analysis/setup/java/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_java
weight: 8105
weight: 9105
- name: Ruby
url: tests/test_impact_analysis/setup/ruby/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_ruby
weight: 8106
weight: 9106
- name: Go
url: tests/test_impact_analysis/setup/go/
parent: test_impact_analysis_setup
identifier: test_impact_analysis_setup_go
weight: 8107
weight: 9107
- name: How It Works
url: tests/test_impact_analysis/how_it_works/
parent: test_impact_analysis
identifier: test_impact_analysis_how_it_works
weight: 802
weight: 902
- name: Troubleshooting
url: tests/test_impact_analysis/troubleshooting/
parent: test_impact_analysis
identifier: test_impact_analysis_troubleshooting
weight: 803
weight: 903
- name: Developer Workflows
url: tests/developer_workflows
parent: tests
identifier: tests_developer_workflows
weight: 9
weight: 10
- name: Code Coverage
url: tests/code_coverage
parent: tests
identifier: tests_code_coverage
weight: 10
weight: 11
- name: Instrument Browser Tests with RUM
url: tests/browser_tests
parent: tests
identifier: tests_browser_tests
weight: 11
weight: 12
- name: Instrument Swift Tests with RUM
url: tests/swift_tests
parent: tests
identifier: tests_swift_tests
weight: 12
weight: 13
- name: Correlate Logs and Tests
url: tests/correlate_logs_and_tests
parent: tests
identifier: tests_correlate_logs_and_tests
weight: 13
weight: 14
- name: Guides
url: tests/guides/
parent: tests
identifier: tests_guides
weight: 14
weight: 15
- name: Troubleshooting
url: tests/troubleshooting/
parent: tests
identifier: tests_troubleshooting
weight: 15
weight: 16
- name: Quality Gates
url: quality_gates/
pre: ci
106 changes: 106 additions & 0 deletions content/en/tests/test_health.md
@@ -0,0 +1,106 @@
---
title: Test Health
description: "Measure the impact of flaky tests and how Test Optimization improves CI usage."
further_reading:
- link: "tests/flaky_test_management"
tag: "Documentation"
text: "Learn about managing flaky tests"
- link: "tests/flaky_test_management/auto_test_retries"
tag: "Documentation"
text: "Learn about Auto Test Retries"
- link: "tests/test_impact_analysis"
tag: "Documentation"
text: "Learn about Test Impact Analysis"
- link: "tests/flaky_test_management/early_flake_detection"
tag: "Documentation"
text: "Learn about Early Flake Detection"
- link: "quality_gates"
tag: "Documentation"
text: "Learn about Quality Gates"
---

## Overview

The [Test Health][5] dashboard provides analytics to help teams manage and optimize their testing in CI. The dashboard includes sections that show the current impact of test flakiness and how Test Optimization mitigates these problems.

{{< img src="tests/test-health.png" alt="Test Health dashboard" style="width:100%;" >}}

## Summary metrics

Based on the current time frame and filters applied, the dashboard highlights the following key metrics:

- [**Pipelines Failed**](#pipelines-failed): Total number of pipelines that failed due to flaky tests
- [**Time Wasted in CI**](#time-wasted-in-ci): Total CI time lost due to flaky tests
- [**Pipelines Saved**](#pipelines-saved): How many pipelines were prevented from failing by Auto Test Retries
- [**Time Saved in CI**](#time-saved-in-ci): How much time has been saved by Test Impact Analysis and Auto Test Retries

### Pipelines Failed

This table provides details on pipeline executions, failures, and their impact on developer experience.

| Metric | Description |
|--------|-------------|
| **Pipeline Executions with Tests** | Number of pipeline executions with one or more test sessions. |
| **Failures Due to Flaky Tests** | Number of pipeline executions that failed solely due to flaky tests. Every failed test has at least one of the following tags: `@test.is_known_flaky` or `@test.is_new_flaky`. |
| **Failures Due to Non-Flaky Tests** | Number of pipeline executions that failed due to tests without any flakiness. None of the failing tests have any of the following tags: `@test.is_known_flaky`, `@test.is_new_flaky`, or `@test.is_flaky`. |
| **Dev Experience - Test Failure Breakdown** | Ratio of flaky to non-flaky test failures. When pipelines fail due to tests, how often is it a flaky test? A higher ratio of flaky test failures erodes trust in test results. Developers may stop paying attention to failing tests, assume they're flakes, and manually retry. |
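
To inspect the failures behind these numbers, you can filter failed tests down to flaky ones in the Test Optimization explorer. The query below is a hypothetical sketch that combines the flakiness tags listed above with a test status filter; adapt it to the facets available in your organization:

```text
@test.status:fail (@test.is_known_flaky:true OR @test.is_new_flaky:true)
```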

### Time Wasted in CI

This table provides details on testing time, time lost due to failures, and the impact on developer experience.

| Metric | Description |
|--------|-------------|
| **Total Testing Time** | Sum of the duration of all test sessions. |
| **Time Lost Due to Flaky Tests** | Total duration of test sessions that failed solely due to flaky tests. Every failed test has at least one of the following tags: `@test.is_known_flaky`, `@test.is_new_flaky`, or `@test.is_flaky`. |
| **Time Lost Due to Non-Flaky Tests** | Total duration of test sessions that failed due to tests without any flakiness. None of the failing tests have any of the following tags: `@test.is_known_flaky`, `@test.is_new_flaky`, or `@test.is_flaky`. |
| **Dev Experience - Time Lost Breakdown** | Ratio of time lost due to flaky vs. non-flaky test failures. When you lose time due to tests, how much is due to flaky tests? A higher ratio of time lost to flaky test failures leads to developer frustration. |

### Pipelines Saved

This table shows how many pipelines [Auto Test Retries][1] has prevented from failing.

<div class="alert alert-info">These metrics include test sessions in which a flaky test failed, and then was automatically retried and passed. Newer versions of test libraries provide more accurate results, as they include more precise telemetry regarding test retries.</div>

| Metric | Description |
|--------|-------------|
| **Pipeline Executions with Tests** | Number of pipeline executions with one or more test sessions. |
| **Saved by Auto Test Retries** | Number of CI pipelines with passed test sessions containing tests with `@test.is_retry:true` and `@test.is_new:false`. |

### Time Saved in CI

This table shows how much CI usage time [Test Impact Analysis][4] and [Auto Test Retries][1] have saved.

| Metric | Description |
|--------|-------------|
| **Total Testing Time** | Sum of the duration of all test sessions. |
| **Total Time Saved** | Sum of time saved by Test Impact Analysis and Auto Test Retries. **% of Testing Time** is the percentage of time saved out of total testing time. Total time saved can exceed total testing time if many unnecessary pipeline and job retries are avoided. |
| **Saved by Test Impact Analysis** | Total duration indicated by `@test_session.itr.time_saved`. |
| **Saved by Auto Test Retries** | Total duration of passed test sessions in which some tests initially failed but later passed due to Auto Test Retries. These tests are tagged with `@test.is_retry:true` and `@test.is_new:false`. |
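
As a purely illustrative example: if **Total Testing Time** is 50 hours and **Total Time Saved** is 20 hours, **% of Testing Time** is 40%. Because avoided pipeline and job retries also count toward savings, total time saved can add up to more than 50 hours, in which case the percentage exceeds 100%.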

## Use cases

### Enhance developer experience
Use **Dev Experience - Test Failure Breakdown** and **Dev Experience - Time Lost Breakdown** to identify how often flaky tests in particular cause failures and waste CI time.

These Test Optimization features improve developer experience by reducing test failures and wasted time:
- **[Auto Test Retries][1]** reduces the likelihood that a flaky test causes a pipeline to fail. This covers both known flaky tests and flaky tests that have not yet been identified. It also gives developers confidence that a failing test is actually broken, because a genuinely broken test fails all of its retries.
- **[Early Flake Detection][2]**, combined with **[Quality Gates][3]**, prevents new flaky tests from entering your default branch.
- **[Test Impact Analysis][4]** reduces the number of flaky tests that run by running only the tests relevant to a code change, based on code coverage. Skipping irrelevant tests also shortens the feedback loop for developers.

### Maximize pipeline efficiency and reduce costs
Lengthy test suites slow down feedback loops to developers, and running irrelevant tests incurs unnecessary costs.

These Test Optimization features help you save CI time and costs:
- **[Auto Test Retries][1]**: If a single flaky test fails during your session, the entire duration of the CI job is lost. Auto Test Retries allows flaky tests to be retried automatically, increasing the likelihood that the session passes (see the configuration sketch after this list).
- **[Test Impact Analysis][4]**: By running only the tests relevant to your code changes, you reduce the overall duration of the test session. Skipping irrelevant tests also prevents pipelines from failing due to unrelated flaky tests.
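
As a minimal sketch of how a team might turn on Auto Test Retries in CI, the following GitHub Actions-style job sets retry-related environment variables read by the Datadog test instrumentation. The variable names, default count, and `./run-tests.sh` command are assumptions used for illustration; confirm the exact settings for your language in the Auto Test Retries documentation.

```yaml
# Hypothetical CI job enabling Auto Test Retries via environment variables.
# Variable names and defaults are assumptions; check the Auto Test Retries docs
# for the exact settings supported by your language's test instrumentation.
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      DD_CIVISIBILITY_FLAKY_RETRY_ENABLED: "true"  # automatically retry failed tests
      DD_CIVISIBILITY_FLAKY_RETRY_COUNT: "5"       # assumed cap on retries per failed test
    steps:
      - uses: actions/checkout@v4
      - name: Run instrumented tests
        run: ./run-tests.sh  # placeholder for your test command
```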

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: /tests/flaky_test_management/auto_test_retries/
[2]: /tests/flaky_test_management/early_flake_detection/
[3]: /quality_gates/
[4]: /tests/test_impact_analysis/
[5]: https://app.datadoghq.com/ci/test/health
Binary file added static/images/tests/test-health.png