8000 Implement coverage reports for Starlark by arvi18 · Pull Request #11 · coderabbit-test/bazel · GitHub
[go: up one dir, main page]

Skip to content
  • Insights
  • Conversation

    @arvi18
    Copy link
    @arvi18 arvi18 commented Apr 16, 2025

    If the new flag --starlark_coverage_report is set to a non-empty value, the Starlark coverage recorder will be enabled for all .bzl files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format, containing function, line, and branch coverage information.

    Summary by CodeRabbit

    • New Features
      • Added support for collecting and reporting code coverage for Starlark files, outputting results in LCOV format.
      • Introduced a new command-line option to generate Starlark coverage reports for executed files during a build.
    • Bug Fixes
      • Improved handling of parameter offsets to avoid potential errors with missing identifiers.
    • Refactor
      • Centralized and streamlined logic for printing comprehension clauses in code output.
    • Chores
      • Added a new failure code for coverage report dumping errors.

    fmeum added 5 commits May 31, 2022 11:57
    A `ConditionalExpression`, differing from an `IfStatement`, always has
    an `else` case.
    The generated coverage report includes function, line, and branch
    coverage information in LCOV format.
    If the new flag `--starlark_coverage_report` is set to a non-empty
    value, the Starlark coverage recorder will be enabled for all `.bzl`
    files under the output base executed during the Blaze command. At the
    end of the command, the collected coverage data will be emitted to the
    file path given as the flag argument in LCOV format.
    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    CC @Wyverald

    I generated reports for some rulesets and found the results to be relatively accurate, but this would definitely benefit from more testing (it doesn't come with test cases yet) and polish. I'm also not sure whether it works for repository rules and/or functions executed during the execution phase, e.g. map_each callbacks. I wanted to open a PR to get early feedback and will happily fix bugs and supply tests if this is generally viewed favorably.

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    Thanks for the PR, Fabian! I'll leave it to Jon & Sasha to review this. (NB: Jon is OOO until 9 June.)

    Also cc @adonovan

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    I fixed some bugs and refactored the branch logic, but will not make further changes.

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    Thanks for sending this, but a change this big needs a design discussion before code. What problem are we trying to solve? What design do you propose? How can we make it simpler?

    Sorry, I certainly didn't intend to skip proper process here. I wrote this code out of a concrete need to better understand the edge cases of rulesets I maintain and to improve my understanding of the Java Starlark implementation. I just thought about sharing it hoping that it could be useful for others. I am completely open to rewriting this from scratch or scrapping it entirely, depending on what the outcome of the design discussion is.

    My initial questions are:

    • Does this make the interpreter slower? A lot of work has gone into making it as fast as it is (although there is still room for improvement). Benchmarking is time consuming because the JVM is noisy---execution time may vary by 5% or more---making it hard to find regressions down at the 1-2% level.

    The way the hooks are currently implemented, I won't be able to dismiss this concern with confidence. As you said, benchmarking doesn't necessarily catch all regressions on such a low level.

    I do think that there is a way to implement the coverage hooks without any overhead in the case where coverage isn't being collected:

    1. Make the static CoverageRecorder instance final and populate it based on e.g. a system property starlark.lcov_report_file being set to a non-empty value.
    2. Make --starlark_coverage_profile a startup option that sets this system property for the Bazel server JVM.
    3. Add logic that dumps the report on server shutdown.

    The static final field should be inlined by the JIT - if we want to be certain, we could also make it a simple boolean that triggers a coverage slow path if true. I could then double-check that the code emitted by the JIT doesn't change with the PR merged.

    Do you like this approach better?

    • Do we need all this new machinery? In principle, a coverage tool could be implemented as a debugger that simply steps through the whole program one instruction at a time recording the current pc. We already have a primitive debugger. Can its hooks be used to record coverage, with minimal changes?

    After a first read of the code, I'm inclined to say this would work for statement coverage, but not for more fine-grained notions of coverage.

    • Do we need branch coverage? I notice the tool records control jumps such as break statements and the short-circuit in if true || true {, and information about individual parameters. Is this necessary? Statement coverage is good enough for most purposes and is less invasive.

    I agree that statement coverage would already be quite useful. However, given the general conciseness of Starlark (especially with list/dict comprehensions and conditional expressions), a single statement can encode rather complex logic. With the rulesets I have tried this on, statement coverage has almost always been nearing 100%, but branch coverage revealed some interesting uncovered cases (e.g., handling of workspace vs. external repository, checks for missing/superfluous files). While branch coverage information for breaks may not be as useful, I found it better to cover all branches rather than just some of them.

    Assuming we find a way to ensure that coverage support introduces no overhead at all (as possibly described above), would you still prefer to remove some or all of the branch coverage hooks?.

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    @adonovan Would you like me to prototype the static final approach and take a look at the JIT output? I think that this has a fair chance of alleviating the performance worries entirely.

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    Finally taking a look at this, as I promised Fabian back at BazelCon.

    It looks like there's quite a bit more code to this PR than I originally thought. I'm interested in supporting more tooling for Starlark in the new year, coverage included, but in light of

    • the added code complexity (who will maintain all this extra logic?),
    • concerns of performance regression, and
    • uncertainty about the extent of the benefit of this feature to users, and the relative benefits of expression granularity vs statement granularity,

    it's clear that a quick acceptance of this PR is off the table. I'd love to have a more involved discussion centered around the benefit vs maintainability tradeoff in Q1. I'll keep the PR open for now.

    @arvi18
    Copy link
    Author
    arvi18 commented Apr 16, 2025

    @brandjon @tetromino Friendly ping on this.

    I could trim this down to just line coverage and ensure that there is no runtime overhead if coverage is disabled.

    @gandalf-repo
    Copy link

    @coderabbitai full review

    @coderabbitai
    Copy link
    coderabbitai bot commented Apr 17, 2025
    ✅ Actions performed

    Full review triggered.

    @coderabbitai
    Copy link
    coderabbitai bot commented Apr 17, 2025

    Walkthrough

    This change introduces comprehensive Starlark code coverage instrumentation and reporting support. It adds a new command-line option to request LCOV-format coverage reports for Starlark files, and integrates coverage collection into the Blaze command execution flow. The Starlark evaluation engine is instrumented to record coverage at the AST node level, including branches and function entries. New infrastructure for coverage recording, AST traversal, and LCOV report generation is implemented. The system is thread-safe and supports filtering by filename. An explicit failure code is added for coverage report dumping errors, and minor related refactoring and bug fixes are included.

    Changes

    File(s) Change Summary
    src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java Integrated Starlark coverage collection and report dumping into the Blaze command execution flow, including error handling for report file writing.
    src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java Added --starlark_coverage_report command-line option for specifying the output path of the Starlark LCOV report.
    src/main 8000 /java/net/starlark/java/eval/Eval.java Instrumented the evaluation engine with coverage recording hooks for statements, expressions, parameters, and control flow branches.
    src/main/java/net/starlark/java/eval/Starlark.java Added static methods to start coverage collection and to dump collected coverage data.
    src/main/java/net/starlark/java/syntax/BUILD Included new source files CoverageRecorder.java and CoverageVisitor.java in the build configuration.
    src/main/java/net/starlark/java/syntax/CoverageRecorder.java, src/main/java/net/starlark/java/syntax/CoverageVisitor.java Introduced a coverage recording system with AST traversal, branch/function/line tracking, and LCOV report generation.
    src/main/java/net/starlark/java/syntax/NodePrinter.java Refactored comprehension clause printing into a dedicated method for clarity and reuse.
    src/main/java/net/starlark/java/syntax/NodeVisitor.java Removed a null check before visiting the else-case in conditional expressions.
    src/main/java/net/starlark/java/syntax/Parameter.java Fixed getEndOffset() in the Star subclass to handle null identifiers safely.
    src/main/java/net/starlark/java/syntax/Program.java Registered new Program instances with the coverage recorder upon compilation.
    src/main/protobuf/failure_details.proto Added a new failure code for Starlark coverage report dump failures.

    Sequence Diagram(s)

    sequenceDiagram
        participant User
        participant BlazeCommandDispatcher
        participant Starlark
        participant CoverageRecorder
        participant FileSystem
    
        User->>BlazeCommandDispatcher: Run command with --starlark_coverage_report=path
        BlazeCommandDispatcher->>Starlark: startCoverageCollection(filenameMatcher)
        Starlark->>CoverageRecorder: startCoverageCollection(filenameMatcher)
        BlazeCommandDispatcher->>...: Execute command (Starlark files evaluated)
        ...->>CoverageRecorder: recordCoverage(node) (during evaluation)
        BlazeCommandDispatcher->>FileSystem: Open coverage report file for writing
        alt File open fails
            BlazeCommandDispatcher->>User: Print error, exit with STARLARK_COVERAGE_REPORT_DUMP_FAILURE
        else File open succeeds
            BlazeCommandDispatcher->>Starlark: dumpCoverage(PrintWriter)
            Starlark->>CoverageRecorder: dump(PrintWriter)
            CoverageRecorder->>FileSystem: Write LCOV report
            BlazeCommandDispatcher->>User: Command completes
        end
    
    Loading

    Poem

    In the warren of code, a new trail appears,
    Starlark’s coverage counted—let’s give three cheers!
    With branches and lines, and functions in tow,
    Reports sprout in LCOV, for all to know.
    If errors arise, a bunny hops back,
    But with these new tools, we’re right on track!
    🐇✨

    Tip

    ⚡💬 Agentic Chat (Pro Plan, General Availability)
    • We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.
    ✨ Finishing Touches
    • 📝 Generate Docstrings

    Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

    ❤️ Share
    🪧 Tips

    Chat

    There are 3 ways to chat with CodeRabbit:

    • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
      • I pushed a fix in commit <commit_id>, please review it.
      • Generate unit testing code for this file.
      • Open a follow-up GitHub issue for this discussion.
    • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
      • @coderabbitai generate unit testing code for this file.
      • @coderabbitai modularize this function.
    • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
      • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
      • @coderabbitai read src/utils.ts and generate unit testing code.
      • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
      • @coderabbitai help me debug CodeRabbit configuration file.

    Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

    CodeRabbit Commands (Invoked using PR comments)

    • @coderabbitai pause to pause the reviews on a PR.
    • @coderabbitai resume to resume the paused reviews.
    • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
    • @coderabbitai full review to do a full review from scratch and review all the files again.
    • @coderabbitai summary to regenerate the summary of the PR.
    • @coderabbitai generate docstrings to generate docstrings for this PR.
    • @coderabbitai resolve resolve all the CodeRabbit review comments.
    • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
    • @coderabbitai help to get help.

    Other keywords and placeholders

    • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
    • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
    • Add @coderabbitai anywhere in the PR title to generate the title automatically.

    CodeRabbit Configuration File (.coderabbit.yaml)

    • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
    • Please see the configuration documentation for more information.
    • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

    Documentation and Community

    • Visit our Documentation for detailed information on how to use CodeRabbit.
    • Join our Discord Community to get help, request features, and share feedback.
    • Follow us on X/Twitter for updates and announcements.

    Copy link
    @coderabbitai coderabbitai bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Actionable comments posted: 6

    🧹 Nitpick comments (9)
    src/main/java/net/starlark/java/syntax/Program.java (2)

    88-90: Consider also registering programs compiled from expressions

    The current implementation only registers programs created via compileFile with the coverage recorder, but not those created via compileExpr. This could lead to incomplete coverage data when expressions are executed.

    Consider whether expression coverage should also be tracked by adding similar registration code to the compileExpr method.


    88-90: Add error handling for the registration call

    The register call to the CoverageRecorder has no error handling. If registration fails for any reason, it could potentially cause compilation to fail or leave the coverage recorder in an inconsistent state.

    Consider adding a try-catch block to handle potential exceptions from the registration process.

    src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java (1)

    296-303: The command-line option is well-defined

    The option follows the established pattern in this file and provides clear documentation of its purpose. It's properly categorized under logging with the appropriate effect tag.

    One suggestion would be to enhance the help text with more details about the LCOV format, the types of coverage collected (functions, lines, branches), and any performance implications of enabling coverage.

    src/main/java/net/starlark/java/eval/Starlark.java (1)

    978-984: Consider adding a method to check if coverage is enabled

    To allow conditional behavior based on whether coverage is enabled, consider adding a method to check the coverage state. This would help optimize code paths that only need to execute when coverage is being collected.

    +/**
    + * Returns whether Starlark coverage collection is currently enabled.
    + *
    + * @return true if coverage collection is active, false otherwise
    + */
    +public static boolean isCoverageEnabled() {
    +  return CoverageRecorder.getInstance().isActive();
    +}
    src/main/java/net/starl 8000 ark/java/eval/Eval.java (1)

    493-494: Hot‑path overhead: guard coverage when disabled

    eval is invoked for every expression; unconditionally calling
    CoverageRecorder.getInstance().recordCoverage(expr) adds at least one
    virtual method dispatch per node even when coverage is turned off.
    If the recorder implements a cheap singleton noop this may be fine, but you can eliminate the call entirely:

    if (CoverageRecorder.isActive()) {
      CoverageRecorder.getInstance().recordCoverage(expr);
    }

    Worth considering if micro‑benchmarks show a noticeable slowdown.

    src/main/java/net/starlark/java/syntax/CoverageVisitor.java (1)

    85-92: String concatenation cost in deeply‑nested functions

    enterFunction rebuilds the full a > b > c name by streaming and joining the
    entire stack on every function entry.
    In files with many nested lambdas this is O(depth²) and allocates a new
    String each time.

    An incremental StringBuilder (push/pop) or caching the previously built
    prefix would avoid repeated traversals and allocations.

    Not critical, but easy to optimise later.

    src/main/java/net/starlark/java/syntax/CoverageRecorder.java (3)

    158-167: dump should not close the PrintWriter it did not open

    Closing the writer here prevents callers from writing additional data (e.g. merging multiple recorders) or from handling the resource lifecycle themselves (try‑with‑resources, error handling, etc.).
    Leave responsibility to the caller:

    -    registeredPrograms.stream()
    -        .sorted(Comparator.comparing(Program::getFilename))
    -        .forEachOrdered(program -> {
    -          CoverageNodeVisitor visitor = new CoverageNodeVisitor(program);
    -          visitor.visitAll(program.getResolvedFunction().getBody());
    -          visitor.dump(out);
    -        });
    -    out.close();
    +    registeredPrograms.stream()
    +        .sorted(Comparator.comparing(Program::getFilename))
    +        .forEachOrdered(program -> {
    +          CoverageNodeVisitor visitor = new CoverageNodeVisitor(program);
    +          visitor.visitAll(program.getResolvedFunction().getBody());
    +          visitor.dump(out);
    +        });
       }

    115-136: Avoid creating a new CoverageVisitor for every top‑level statement

    register instantiates a fresh anonymous CoverageVisitor for each top‑level statement, causing:

    1. Redundant allocations (can be thousands in large files).
    2. Missed opportunities to share state across statements (e.g., sets/maps).

    Refactor to create one visitor per program and call visitor.visitAll(program.getResolvedFunction().getBody()) once, mirroring the approach used later in dump.


    93-103: Potential memory leak: Node keys retain full AST

    counts / virtualJumpCounts store strong references from long‑lived maps to every Node, which in turn references the entire subtree and the containing Program.
    If the JVM server process lives for many Bazel commands, this can keep all compiled ASTs alive even after dumping coverage.

    Consider:

    • Using a WeakHashMap<Node, LongAdder> (synchronised wrapper) so entries are cleared once the AST is unreachable, or
    • Storing a lightweight key (e.g., (filename,line,column)).
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between c0c98a2 and b603de0.

    📒 Files selected for processing (12)
    • src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (3 hunks)
    • src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java (1 hunks)
    • src/main/java/net/starlark/java/eval/Eval.java (17 hunks)
    • src/main/java/net/starlark/java/eval/Starlark.java (2 hunks)
    • src/main/java/net/starlark/java/syntax/BUILD (1 hunks)
    • src/main/java/net/starlark/java/syntax/CoverageRecorder.java (1 hunks)
    • src/main/java/net/starlark/java/syntax/CoverageVisitor.java (1 hunks)
    • src/main/java/net/starlark/java/syntax/NodePrinter.java (3 hunks)
    • src/main/java/net/starlark/java/syntax/NodeVisitor.java (1 hunks)
    • src/main/java/net/starlark/java/syntax/Parameter.java (1 hunks)
    • src/main/java/net/starlark/java/syntax/Program.java (1 hunks)
    • src/main/protobuf/failure_details.proto (1 hunks)
    🧰 Additional context used
    🧬 Code Graph Analysis (2)
    src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (1)
    src/main/java/net/starlark/java/eval/Starlark.java (1)
    • Starlark (55-985)
    src/main/java/net/starlark/java/syntax/CoverageRecorder.java (2)
    src/main/java/net/starlark/java/syntax/Resolver.java (1)
    • Function (119-277)
    src/main/java/net/starlark/java/syntax/Program.java (1)
    • Program (24-105)
    🔇 Additional comments (9)
    src/main/java/net/starlark/java/syntax/NodeVisitor.java (1)

    177-177:

    Details

    ✅ Verification successful

    Verify the safety of removing the null check

    The null check before visiting the else-case node has been removed, making this visit unconditional. Other similar methods in this file (like visitBlock, visit for defaultValue, Result, etc.) still retain their null checks. This change could potentially cause NullPointerExceptions if getElseCase() returns null and visit() doesn't handle null inputs.

    This change is likely related to the new coverage tracking implementation. Please verify that either:

    1. ConditionalExpression.getElseCase() is guaranteed to never return null, or
    2. The CoverageVisitor implementation handles null nodes appropriately.

    🏁 Script executed:

    #!/bin/bash
    # Check if ConditionalExpression.getElseCase() can return null
    grep -A 10 "public .* getElseCase" src/main/java/net/starlark/java/syntax/ConditionalExpression.java
    
    # Look for null checks in the CoverageVisitor class
    grep -A 3 "visit.*ConditionalExpression" src/main/java/net/starlark/java/syntax/CoverageVisitor.java

    Length of output: 660


    🏁 Script executed:

    #!/bin/bash
    echo "=== visitBranch definition in CoverageVisitor.java ==="
    grep -R "visitBranch" -n src/main/java/net/starlark/java/syntax/CoverageVisitor.java -A20
    
    echo
    echo "=== Beginning of NodeVisitor.java (to inspect visit(Node)) ==="
    sed -n '1,200p' src/main/java/net/starlark/java/syntax/NodeVisitor.java

    Length of output: 9410


    Null check removal is safe

    ConditionalExpression.getElseCase() is set via the constructor and, per the language grammar, always carries a non‑null expression; it can never be null. The unguarded call in NodeVisitor.visit(ConditionalExpression) therefore cannot NPE, and CoverageVisitor.visitBranch already allows a nullable successor. No further changes are required.

    src/main/java/net/starlark/java/syntax/BUILD (1)

    24-25: LGTM! New coverage implementation files added

    The addition of CoverageRecorder.java and CoverageVisitor.java to the build file supports the new Starlark coverage reporting feature. These files will implement the necessary infrastructure for AST traversal with coverage tracking and LCOV-compatible output generation.

    src/main/protobuf/failure_details.proto (1)

    536-536: LGTM! Appropriate error code for coverage report failures

    Adding a specific failure code for Starlark coverage report dumping failures is appropriate. The exit code 36 is consistent with other file-related errors in the system like SERVER_FILE_WRITE_FAILURE and other I/O operation failures.

    src/main/java/net/starlark/java/syntax/Parameter.java (1)

    119-121: LGTM! Defensive null check prevents potential NPE

    This change adds proper null handling for Star parameters with null identifiers. Since the constructor accepts a nullable identifier, this defensive programming approach prevents NullPointerExceptions when calculating the end offset of a Star parameter without an identifier.

    src/main/java/net/starlark/java/syntax/NodePrinter.java (3)

    44-46: Good improvement for handling Comprehension.Clause nodes

    Adding explicit handling for Comprehension.Clause nodes in the printNode method improves the completeness of this method and ensures all node types can be properly processed.


    283-283: Cleaner code through refactoring

    Changing the inline conditional logic to call the new printClause method improves readability and maintainability.


    479-491: Well-structured helper method for printing clauses

    The newly added printClause method properly handles both For and If clauses with the correct syntax. The refactoring centralizes this logic in a single place, making it easier to maintain and extend if needed.

    src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (1)

    375-381: Ensure coverage collection is reset between commands

    Starlark.startCoverageCollection is invoked every time a command is executed when the flag is set, but there is no corresponding call that resets or guards against a previous invocation still being active.
    If two consecutive commands run in the same server process, coverage from the first may pollute the second (or the second call might be a no‑op, depending on startCoverageCollection’s implementation).

    Please verify that:

    1. startCoverageCollection idempotently resets any prior state, or
    2. Coverage is explicitly cleared after dumpCoverage, or
    3. We guard with a boolean to only start once per server lifetime.

    If none of the above holds, users will get confusing, merged LCOV reports.

    src/main/java/net/starlark/java/syntax/CoverageRecorder.java (1)

    190-199:

    Details

    ❌ Incorrect review comment

    Same NPE risk when populating BranchInfo

    counts.get(positiveUniqueSuccessor) (and negativeUniqueSuccessor) can return null if the successor hasn’t reached visitCode yet. Guard the lookup just like the previous comment or defer branch materialisation until after the whole subtree has been visited.


    I’ll locate how counts is declared and how it’s populated to confirm whether get(...) can return null.


    🏁 Script executed:

    #!/bin/bash
    grep -n "private.*counts" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
    grep -n "counts *= *new" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
    grep -n "counts\\.put" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
    rg -n "counts\\.get" src/main/java/net/starlark/java/syntax/CoverageRecorder.java

    Length of output: 948


    🏁 Script executed:

    #!/bin/bash
    # Show context around counts.put to see how and when entries are added
    rg -n -C5 'counts\.put' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

    Length of output: 362


    🏁 Script executed:

    #!/bin/bash
    sed -n '160,240p' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

    Length of output: 3373


    I’ll inspect CoverageVisitor to understand the traversal order of visitBranch vs. visitCode.


    🏁 Script executed:

    #!/bin/bash
    # Locate the CoverageVisitor class and its methods
    rg -n "class CoverageVisitor" src/main/java/net/starlark/java/syntax/CoverageVisitor.java
    rg -n -C5 "visitBranch" src/main/java/net/starlark/java/syntax/CoverageVisitor.java
    rg -n -C5 "visitCode" src/main/java/net/starlark/java/syntax/CoverageVisitor.java

    Length of output: 8760


    🏁 Script executed:

    #!/bin/bash
    sed -n '80,160p' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

    Length of output: 2725


    No NPE risk in visitBranch – all counts are pre‑populated in register

    The counts map is fully populated for every executable Node (including branch conditions and successors) by the anonymous CoverageVisitor in register(…) (its visitCode runs on every node in the AST). Likewise, virtualJumpCounts is initialized for any branch with a missing negativeUniqueSuccessor. By the time dump() runs and the CoverageNodeVisitor calls counts.get(…) or virtualJumpCounts.get(…), neither lookup can return null. You can safely remove the suggested null guards.

    Likely an incorrect or invalid review comment.

    Comment on lines +978 to +980
    8000
    public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorder.startCoverageCollection(filenameMatcher);
    }
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    🛠️ Refactor suggestion

    Add documentation for startCoverageCollection method

    This public API method lacks documentation explaining its purpose, when to use it, and how it integrates with the broader coverage system. Consider adding a Javadoc comment to clarify:

    • The purpose of the method
    • What the filenameMatcher parameter does exactly (what format it expects, examples)
    • When this should be called in relation to Starlark execution
    • Thread-safety guarantees
    • Any performance implications
    +/**
    + * Starts collecting coverage data for Starlark code execution.
    + * 
    + * <p>Coverage will be collected for all Starlark files that match the provided filename matcher.
    + * This method should be called before any Starlark code execution for which coverage data is
    + * desired.
    + *
    + * <p>This method is thread-safe and can be called concurrently with Starlark execution.
    + *
    + * @param filenameMatcher a function that returns true for filenames that should have coverage
    + *     collected, and false for files to exclude from coverage
    + */
     public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
       CoverageRecorder.startCoverageCollection(filenameMatcher);
     }
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorder.startCoverageCollection(filenameMatcher);
    }
    /**
    * Starts collecting coverage data for Starlark code execution.
    *
    * <p>Coverage will be collected for all Starlark files that match the provided filename matcher.
    * This method should be called before any Starlark code execution for which coverage data is
    * desired.
    *
    * <p>This method is thread-safe and can be called concurrently with Starlark execution.
    *
    * @param filenameMatcher a function that returns true for filenames that should have coverage
    * collected, and false for files to exclude from coverage
    */
    public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorder.startCoverageCollection(filenameMatcher);
    }

    Comment on lines +982 to +984
    public static void dumpCoverage(PrintWriter out) {
    CoverageRecorder.getInstance().dump(out);
    }
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    🛠️ Refactor suggestion

    Add documentation and error handling for dumpCoverage method

    This public API metho 67E6 d also lacks documentation and doesn't provide any error handling mechanism. If an error occurs while writing the coverage data, it would silently fail.

    Consider enhancing this method with proper documentation and error handling:

    +/**
    + * Writes the collected Starlark coverage data to the specified writer in LCOV format.
    + * 
    + * <p>This method should be called after Starlark code execution has completed to generate
    + * the coverage report. It will dump coverage for all files that matched the filter provided
    + * to {@link #startCoverageCollection}.
    + *
    + * <p>The coverage report includes function, line, and branch coverage information.
    + *
    + * @param out the writer to which the LCOV report will be written
    + * @throws RuntimeException if an error occurs while writing the coverage data
    + */
     public static void dumpCoverage(PrintWriter out) {
    -  CoverageRecorder.getInstance().dump(out);
    +  try {
    +    CoverageRecorder.getInstance().dump(out);
    +  } catch (Exception e) {
    +    throw new RuntimeException("Failed to dump Starlark coverage data", e);
    +  }
     }
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    public static void dumpCoverage(PrintWriter out) {
    CoverageRecorder.getInstance().dump(out);
    }
    /**
    * Writes the collected Starlark coverage data to the specified writer in LCOV format.
    *
    * <p>This method should be called after Starlark code execution has completed to generate
    * the coverage report. It will dump coverage for all files that matched the filter provided
    * to {@link #startCoverageCollection}.
    *
    * <p>The coverage report includes function, line, and branch coverage information.
    *
    * @param out the writer to which the LCOV report will be written
    * @throws RuntimeException if an error occurs while writing the coverage data
    */
    public static void dumpCoverage(PrintWriter out) {
    try {
    CoverageRecorder.getInstance().dump(out);
    } catch (Exception e) {
    throw new RuntimeException("Failed to dump Starlark coverage data", e);
    }
    }

    Comment on lines +613 to +624
    if (!commonOptions.starlarkCoverageReport.isEmpty()) {
    FileOutputStream out;
    try {
    out = new FileOutputStream(commonOptions.starlarkCoverageReport);
    } catch (IOException ex) {
    String message = "Starlark coverage recorder: " + ex.getMessage();
    outErr.printErrLn(message);
    return createDetailedCommandResult(
    message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
    }
    Starlark.dumpCoverage(new PrintWriter(out));
    }
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    ⚠️ Potential issue

    Leak‑prone I/O: use try‑with‑resources and flush PrintWriter

    FileOutputStream and the wrapping PrintWriter are opened but never closed.
    On long‑running servers this leaks file descriptors, and on short‑lived invocations you may miss buffered bytes because PrintWriter is not flushed.

    -        FileOutputStream out;
    -        try {
    -          out = new FileOutputStream(commonOptions.starlarkCoverageReport);
    -        } catch (IOException ex) {
    -          ...
    -        }
    -        Starlark.dumpCoverage(new PrintWriter(out));
    +        try (FileOutputStream fos =
    +                new FileOutputStream(commonOptions.starlarkCoverageReport);
    +             PrintWriter pw = new PrintWriter(fos)) {
    +          Starlark.dumpCoverage(pw);
    +        } catch (IOException ex) {
    +          ...
    +        }

    This guarantees flushing and closes both streams even if dumpCoverage throws.

    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    if (!commonOptions.starlarkCoverageReport.isEmpty()) {
    FileOutputStream out;
    try {
    out = new FileOutputStream(commonOptions.starlarkCoverageReport);
    } catch (IOException ex) {
    String message = "Starlark coverage recorder: " + ex.getMessage();
    outErr.printErrLn(message);
    return createDetailedCommandResult(
    message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
    }
    Starlark.dumpCoverage(new PrintWriter(out));
    }
    if (!commonOptions.starlarkCoverageReport.isEmpty()) {
    try (FileOutputStream fos =
    new FileOutputStream(commonOptions.starlarkCoverageReport);
    PrintWriter pw = new PrintWriter(fos)) {
    Starlark.dumpCoverage(pw);
    } catch (IOException ex) {
    String message = "Starlark coverage recorder: " + ex.getMessage();
    outErr.printErrLn(message);
    return createDetailedCommandResult(
    message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
    }
    }

    Comment on lines +166 to +168
    Parameter parameter = rfn.getParameters().get(i);
    CoverageRecorder.getInstance().recordCoverage(parameter.getIdentifier());
    Expression expr = parameter.getDefaultValue();
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    ⚠️ Potential issue

    Null‑safety for anonymous * / ** parameters

    Parameter.getIdentifier() may legally return null for the lone "*" that separates positional from keyword‑only parameters.
    Calling CoverageRecorder.recordCoverage(null) will throw a NullPointerException, crashing evaluation of such functions.

    -      CoverageRecorder.getInstance().recordCoverage(parameter.getIdentifier());
    +      Identifier pid = parameter.getIdentifier();
    +      if (pid != null) {
    +        CoverageRecorder.getInstance().recordCoverage(pid);
    +      }

    Apply the same guard inside the later loop that visits varargs/kwargs.

    Also applies to: 178-181

    Comment on lines +181 to +185
    protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
    functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
    counts.get(firstBodyStatement).sum()));
    }

    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    ⚠️ Potential issue

    counts.get(firstBodyStatement) can be null → potential NullPointerException

    visitFunction is executed before visitCode has visited the first body statement (pre‑order traversal), so the LongAdder may not yet be in counts. Dereferencing sum() without a null‑check will crash during initial registration for every function.

    -      functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
    -          counts.get(firstBodyStatement).sum()));
    +      LongAdder adder = counts.get(firstBodyStatement);
    +      long count = adder == null ? 0 : adder.sum();
    +      functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(), count));
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
    functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
    counts.get(firstBodyStatement).sum()));
    }
    protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
    LongAdder adder = counts.get(firstBodyStatement);
    long count = adder == null ? 0 : adder.sum();
    functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(), count));
    }

    Comment on lines +41 to +56
    static CoverageRecorder getInstance() {
    return CoverageRecorderHolder.INSTANCE;
    }

    /**
    * Collect coverage for all {@link Program}s compiled after the call whose
    * {@link Program#getFilename()} matches {@code filenameMatcher}.
    */
    sta A3E2 tic void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
    }

    class CoverageRecorderHolder {

    private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();

    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    ⚠️ Potential issue

    Make INSTANCE volatile (or use an AtomicReference) to avoid a data‑race

    startCoverageCollection can be invoked from a different thread than recordCoverage/getInstance.
    Because CoverageRecorderHolder.INSTANCE is a mutable static field without the volatile modifier, the write performed in line 50 may not become visible to threads already calling getInstance(), leading them to observe a partially‑constructed LcovCoverageRecorder or to keep using the NoopCoverageRecorder.

    -  class CoverageRecorderHolder {
    -    private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();
    +  class CoverageRecorderHolder {
    +    // Volatile guarantees safe publication of the recorder instance
    +    private static volatile CoverageRecorder INSTANCE = new NoopCoverageRecorder();

    Alternatively, replace the field with an AtomicReference<CoverageRecorder> and perform a CAS to protect against multiple concurrent initialisations.

    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    static CoverageRecorder getInstance() {
    return CoverageRecorderHolder.INSTANCE;
    }
    /**
    * Collect coverage for all {@link Program}s compiled after the call whose
    * {@link Program#getFilename()} matches {@code filenameMatcher}.
    */
    static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
    }
    class CoverageRecorderHolder {
    private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();
    static CoverageRecorder getInstance() {
    return CoverageRecorderHolder.INSTANCE;
    }
    /**
    * Collect coverage for all {@link Program}s compiled after the call whose
    * {@link Program#getFilename()} matches {@code filenameMatcher}.
    */
    static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
    CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
    }
    class CoverageRecorderHolder {
    // Volatile guarantees safe publication of the recorder instance
    private static volatile CoverageRecorder INSTANCE = new NoopCoverageRecorder();

    @visz11
    Copy link
    Collaborator
    visz11 commented Jul 26, 2025

    /refacto-test

    @refacto-test
    Copy link
    refacto-test bot commented Jul 26, 2025

    Refacto is reviewing this PR. Please wait for the review comments to be posted.

    }
    }
    }
    if (!commonOptions.starlarkCoverageReport.isEmpty()) {
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Security Issue: Potential path traversal vulnerability in coverage report file path handling. The code directly uses the user-provided file path without validation, which could allow directory traversal attacks (e.g., '../sensitive_file'). Consider validating the path by comparing canonical and absolute paths:

    String reportPath = commonOptions.starlarkCoverageReport;
    File reportFile = new File(reportPath);
    if (reportFile.getCanonicalPath().equals(reportFile.getAbsolutePath())) {
      out = new FileOutputStream(reportFile);
    } else {
      // Handle invalid path
    }

    String message = "Starlark coverage recorder: " + ex.getMessage();
    outErr.printErrLn(message);
    return createDetailedCommandResult(
    message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Resource Management Issue: The FileOutputStream isn't guaranteed to be closed if an exception occurs during Starlark.dumpCoverage(). This could lead to resource leaks in this long-running application. Consider using try-with-resources to ensure proper resource cleanup:

    try (FileOutputStream out = new FileOutputStream(commonOptions.starlarkCoverageReport);
         PrintWriter writer = new PrintWriter(out)) {
      Starlark.dumpCoverage(writer);
    } catch (IOException ex) {
      // Handle exception
    }

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    5 participants

    0