10000 fix(Srv/stdio): risk of goroutine leaks and concurrent reads in `readNextLine()` by cryo-zd · Pull Request #257 · mark3labs/mcp-go · GitHub
[go: up one dir, main page]

Skip to content

fix(Srv/stdio): risk of goroutine leaks and concurrent reads in readNextLine() #257

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 10, 2025

Conversation

cryo-zd
Copy link
Contributor
@cryo-zd cryo-zd commented May 8, 2025

Problem Summary

 The original implementation of readNextLine spawns a goroutine to perform a ReadString('\n') call and deliver the result via readChan or errChan. However, this goroutine did not return immediately after writing to a channel, which introduces two subtle concurrency issues caused by incorrect assumptions about goroutine scheduling and lifecycle.
[Problem]Data Loss or Goroutine Leak
 If the outer readNextLine function exits due to context cancellation (e.g., ctx.Done()), but the inner goroutine is still blocked on ReadString or attempting to send to a now-unread readChan, the following may happen:

  • The goroutine continues to read input and successfully captures a line,
  • But readChan is never read again,
  • Or the goroutine continues reading beyond its intended scope.

 ➡️ Result: The read data is lost, and the goroutine leaks, causing silent memory and input channel pressure over time.

[Problem] Concurrent Reads on bufio.Reader:
 If the spawned goroutine successfully delivers a line to readChan but does not return immediately, it may remain alive and proceed to call ReadString('\n') again. This can happen if the Go scheduler delays the main thread’s consumption of the value, allowing the same goroutine to continue running.
 ➡️ Result: Multiple goroutines may perform concurrent reads on the shared bufio.Reader, which is not safe for concurrent use. This can cause data corruption, misordered reads, or panics.

Root Cause: Invalid Timing Assumption

The original logic assumes:

That the readChan or errChan consumer will always read immediately after the line is sent, and that the goroutine will exit soon enough due to a closed done channel.

 This is a scheduling-dependent assumption, and does not hold reliably in Go's concurrent model. The race occurs in the tiny gap between delivering the line and reacting to done.

Fix: Exit Goroutine Immediately After Attempting to Deliver

 We fix this by ensuring the goroutine returns immediately after the select that attempts to deliver the line,:

select {
case readChan <- line:
case <-done:
}
return

This ensures:

  • Each goroutine performs at most one read,
  • The goroutine always exits quickly after delivery or cancellation,
  • All data is either consumed or explicitly dropped with awareness.

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability when reading lines by ensuring background processes exit promptly a 8000 fter completing their task.

Copy link
Contributor
coderabbitai bot commented May 8, 2025

Walkthrough

An explicit return statement was added after sending a read line to a channel within a goroutine in the readNextLine method. This change ensures the goroutine exits immediately after sending the line, modifying its control flow to prevent further execution beyond that point.

Changes

File(s) Change Summary
server/stdio.go Added an explicit return statement after sending a read line to a channel in a goroutine, altering the control flow in the readNextLine method.

Tip

⚡️ Faster reviews with caching
  • CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.

Enjoy the performance boost—your workflow just got faster.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd1e1e8 and 1737e4d.

📒 Files selected for processing (1)
  • server/stdio.go (1 hunks)
🔇 Additional comments (1)
server/stdio.go (1)

164-182: Excellent fix for preventing concurrent reads and goroutine leaks!

The addition of this explicit return statement is critical because it ensures the goroutine exits immediately after sending data to the channel. Without this return, the goroutine would implicitly continue to the end of the function, potentially leading to:

  1. Multiple goroutines concurrently reading from the shared bufio.Reader if the main thread is delayed in consuming data
  2. Potential data corruption or misordered reads due to concurrent access
  3. Goroutine leaks if the parent function exits due to context cancellation while the goroutine is blocked

This small change properly ensures each goroutine performs exactly one read operation and then exits, effectively preventing the concurrency issues identified in the PR objectives.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@cryo-zd cryo-zd changed the title fix(Srv/stdio): risk of concurrent reads and data loss in readNextLine() fix(Srv/stdio): risk of goroutine leak and data loss in readNextLine() May 8, 2025
@cryo-zd cryo-zd changed the title fix(Srv/stdio): risk of goroutine leak and data loss in readNextLine() fix(Srv/stdio): risk of goroutine leaks and concurrent reads in readNextLine() May 8, 2025
Copy link
Collaborator
@rwjblue-glean rwjblue-glean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very subtle bug! Thanks for tracking it down!

@ezynda3 ezynda3 merged commit 3dfa331 into mark3labs:main May 10, 2025
2 checks passed
@winterfx
Copy link
Contributor

Hi @cryo-zd, thank you for the detailed explanation of the concurrency issue — I really appreciate the insights. 🙏 I’m trying to fully understand the scenario you described and would like to clarify one point:

From the implementation of readNextLine, the spawned goroutine appears to execute ReadString('\n') once, then attempts to write the result to readChan or errChan, and then it returns (or at least, there's no loop). So it seems like the goroutine should terminate after that.

However, the comment mentions:

“If the Go scheduler delays the main thread’s consumption of the value, the same goroutine may continue running and call ReadString('\n') again.”

Could I ask for clarification here?

What exactly does "the same goroutine may continue running" mean in this context?

Since the goroutine isn’t written with a loop, how would it end up calling ReadString more than once?

Thanks again for your time and help — I just want to make sure I’m not missing something subtle here.

@cryo-zd
Copy link
Contributor Author
cryo-zd commented May 13, 2025

Hi @cryo-zd, thank you for the detailed explanation of the concurrency issue — I really appreciate the insights. 🙏 I’m trying to fully understand the scenario you described and would like to clarify one point:

From the implementation of readNextLine, the spawned goroutine appears to execute ReadString('\n') once, then attempts to write the result to readChan or errChan, and then it returns (or at least, there's no loop). So it seems like the goroutine should terminate after that.

However, the comment mentions:

“If the Go scheduler delays the main thread’s consumption of the value, the same goroutine may continue running and call ReadString('\n') again.”

Could I ask for clarification here?

What exactly does "the same goroutine may continue running" mean in this context?

Since the goroutine isn’t written with a loop, how would it end up calling ReadString more than once?

Thanks again for your time and help — I just want to make sure I’m not missing something subtle here.

oh, yes, I made a mistake here, ReadString('\n') will be called only once. I think the reason that I lost my input data is that the context was cancelled, and the spawned goroutine read the data and put it into readchan without a consumer reading from it(Because the the func readNextLine that spawned this goroutine has exited and no one can read from the readchan)
Thanks for pointing out this mistake here!
(scenario: I input one data ended with \n and then send SIGTERM immediately, then the context was cancelled, and readNextLine() exited due to <-ctx.Done(). Next the spawned goroutine returned from reader.ReadString('\n') and put the data into readchan(Goruntime selects case at random when multiple cases are ready). Then the input data is lost and I hope the input data could be processed by server because I implemented some metrics and logs in the server)
(And recently I found that my requirement is a little subtle: [1] we could not proceed to s.processMessage(ctx, line, stdout) because context is cancelled. [2] it's unreasonable to hope stdioServer still work because I have already sent SIGTERM).
So, it's not a bug and the implement is robust enough.

@cryo-zd cryo-zd deleted the fix/readline branch May 13, 2025 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0