[build-tools] Store per-attempt results for Maestro test retries#3498
[build-tools] Store per-attempt results for Maestro test retries#3498
Conversation
Signed-off-by: Ash Wu <hsatac@gmail.com>
|
⏩ The changelog entry check has been skipped since the "no changelog" label is present. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3498 +/- ##
==========================================
+ Coverage 53.56% 53.62% +0.06%
==========================================
Files 815 815
Lines 34566 34600 +34
Branches 7175 7181 +6
==========================================
+ Hits 18513 18551 +38
+ Misses 15971 15967 -4
Partials 82 82 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Subscribed to pull request
Generated by CodeMention |
There was a problem hiding this comment.
Pull request overview
This PR updates the Maestro test pipeline in build-tools to preserve per-attempt artifacts and results when retries are enabled, instead of overwriting the final attempt’s output, and adjusts parsing/reporting to support multiple results per flow.
Changes:
- Generate per-attempt Maestro report and device-log filenames (e.g.,
*-attempt-N.*) so retries don’t overwrite prior artifacts. - Update Maestro JUnit parsing/merge logic to emit per-attempt flow results by grouping by flow name and extracting attempt index from the JUnit filename.
- Update result reporting to only treat duplicates as conflicts when they share the same
(name, retryCount)pair, and add/extend tests for the new behaviors.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| packages/build-tools/src/steps/functions/internalMaestroTest.ts | Writes unique report/device-log outputs per retry attempt. |
| packages/build-tools/src/steps/functions/maestroResultParser.ts | Parses JUnit per file and merges results into per-attempt flow outputs. |
| packages/build-tools/src/steps/functions/reportMaestroTestResults.ts | Adjusts duplicate detection to allow same name across different retry attempts. |
| packages/build-tools/src/steps/functions/tests/internalMaestroTest.test.ts | Adds coverage for per-attempt report filename generation. |
| packages/build-tools/src/steps/functions/tests/maestroResultParser.test.ts | Adds coverage for per-attempt parsing (including reuse_devices + sharding shapes + backward compat). |
| packages/build-tools/src/steps/functions/tests/reportMaestroTestResults.test.ts | Verifies per-attempt results with same name but different retryCount are reported. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Multiple results — per-attempt JUnit files. Sort by attempt index from filename. | ||
| const sorted = flowEntries | ||
| .map(entry => { | ||
| const match = entry.sourceFile.match(ATTEMPT_PATTERN); | ||
| const attemptIndex = match ? parseInt(match[1], 10) : 0; | ||
| return { ...entry, attemptIndex }; | ||
| }) | ||
| .sort((a, b) => a.attemptIndex - b.attemptIndex); |
There was a problem hiding this comment.
When multiple JUnit XML files exist for the same flow, entries without an attempt-N match are assigned attemptIndex = 0. If the directory contains multiple files for the same flow that don't follow the attempt-N naming (e.g., leftover files or a different naming scheme), all of those results will get retryCount: 0, which can lead to false conflicts and the entire report being skipped. Consider falling back to the sorted position (0..n-1) for unmatched files, or deriving retryCount from flowOccurrences when the attempt index can’t be parsed, to keep results distinct and reporting robust.
| // Multiple results — per-attempt JUnit files. Sort by attempt index from filename. | |
| const sorted = flowEntries | |
| .map(entry => { | |
| const match = entry.sourceFile.match(ATTEMPT_PATTERN); | |
| const attemptIndex = match ? parseInt(match[1], 10) : 0; | |
| return { ...entry, attemptIndex }; | |
| }) | |
| .sort((a, b) => a.attemptIndex - b.attemptIndex); | |
| // Multiple results — per-attempt JUnit files. | |
| // Prefer attempt index from filename (attempt-N); fall back to per-flow position (0..n-1) | |
| // to keep retryCount values distinct even when filenames don't follow the pattern. | |
| const sorted = flowEntries | |
| .map((entry, index) => { | |
| const match = entry.sourceFile.match(ATTEMPT_PATTERN); | |
| const parsed = match ? parseInt(match[1], 10) : NaN; | |
| const hasExplicitAttemptIndex = Number.isInteger(parsed) && parsed >= 0; | |
| const attemptIndex = hasExplicitAttemptIndex ? parsed : index; | |
| return { ...entry, attemptIndex, hasExplicitAttemptIndex }; | |
| }) | |
| .sort((a, b) => { | |
| if (a.hasExplicitAttemptIndex && b.hasExplicitAttemptIndex) { | |
| return a.attemptIndex - b.attemptIndex; | |
| } | |
| if (a.hasExplicitAttemptIndex) return -1; | |
| if (b.hasExplicitAttemptIndex) return 1; | |
| return a.attemptIndex - b.attemptIndex; | |
| }); |
Why
Maestro test retries currently overwrite JUnit reports and device logs on each attempt, so only the final attempt's results are stored. This means we lose visibility into earlier failed attempts, making it harder to diagnose flaky tests.
Depends on: expo/universe#25691 and expo/universe#25692 (must be deployed first — the new unique index on
(turtle_job_run_id, path, retry_count)is required to accept multiple results per test path).How
Three changes across
build-tools:internalMaestroTest.ts— MoveoutputPathand device log path inside the retry loop with anattempt-Nsuffix, so each attempt produces a separatefile instead of overwriting. Removed the now-unnecessary
fs.promises.rmcall.maestroResultParser.ts— Extract an internalparseJUnitFilehelper, then rewrite the merge logic inparseMaestroResultsto group JUnit results byflow name. When multiple files exist for the same flow, the attempt index is extracted from the filename (
attempt-N). For single-file flows (backwardcompat with
reuse_devices: trueold format),retryCountis still derived fromai-*.jsonoccurrence count.reportMaestroTestResults.ts— Update the duplicate-name guard to check(name, retryCount)pairs instead of just names. Same name with differentretryCountis expected (per-attempt results), while same(name, retryCount)indicates a real naming conflict.Test Plan
internalMaestroTest.test.tsmaestroResultParser.test.ts: per-attempt results,reuse_devices=trueper-attempt, sharding with multiple testsuites, and backward compat with old single JUnit filereportMaestroTestResults.test.tsfor same name with differentretryCountbeing sent (not skipped)