fix: add status field in Resource class and Resource as a return type in CallToolResult by bzsurbhi · Pull Request #549 · modelcontextprotocol/modelcontextprotocol · GitHub



Open · wants to merge 3 commits into base: main

Conversation

bzsurbhi
@bzsurbhi bzsurbhi commented May 19, 2025
  • Add status field in Resource
  • Add Resource as a return type in CallToolResult

Motivation and Context

Enhanced support for long-running operations by implementing a Resource-based polling mechanism. The server can now return a Resource object (containing a URI and status) from call_tool(), allowing clients to poll the resource's status until it reaches the Ready state. Once ready, clients can retrieve the content using read_resource().

How Has This Been Tested?

Breaking Changes

This is not a breaking change

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

@jonathanhefner
Contributor

I previously left some feedback regarding the async resource approach: #491 (comment). I believe all of those points apply to this approach as well.

Some additional points for this approach:

  • The only way to check the status of a resource is by listing all resources and iterating through them until you find a matching uri.
  • If a tool returns multiple items in CallToolResult.content, the client may have to maintain a subscription for each item. (Though resources support multiple chunks of content, all chunks must share the same metadata, such as annotations; thus, individual CallToolResult.content items should map to individual resources rather than chunks.)

@000-000-000-000-000

The only way to check the status of a resource is by listing all resources and iterating through them until you find a matching uri.

+1 this is a general problem we need to solve -- clients should be able to GetResource wherein they provide a URI and get the details of that URI. I feel that is a key thing missing from this PR.

If a tool returns multiple items in CallToolResult.content, the client may have to maintain a subscription for each item. (Though resources support multiple chunks of content, all chunks must share the same metadata, such as annotations; thus, individual CallToolResult.content items should map to individual resources rather than chunks.)

Adding GetResource will provide an alternative to subscription.
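As a sketch of what such a hypothetical resources/get exchange might look like (the method name and result shape are assumptions from this discussion, not part of the current MCP spec):

```python
import json

# Hypothetical "resources/get" request: the client supplies a URI and gets
# that resource's details directly, instead of listing and filtering.
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "resources/get",            # hypothetical method name
    "params": {"uri": "resource://job/1"},
}

# A server could answer with the resource's metadata, including the status
# field proposed in this PR, without the client subscribing or polling a list.
response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {
        "resource": {
            "uri": "resource://job/1",
            "name": "job 1 output",
            "status": "pending",          # proposed field from this PR
        }
    },
}

wire = json.dumps(request)
```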

The resource-based approach changes the nature of progress notifications, results, and errors.
- Servers must use a different API to set these things on an async resource versus sending JSON-RPC objects.
- Clients must have a separate code path to extract these things from an async resource versus processing a stream of JSON-RPC objects.
- No support for progress messages.

I don't see why the nature of progress notifications, results, and errors changes at all. This is just adding a new possible return type for tools and ensuring resources have status. Progress messages are still supported via the normal channel. Think of it this way - this PR is just doing the following:

  1. Extending CallToolResult to allow for returning Resources (instead of just EmbeddedResources). This enables servers to give back resource references without supplying the actual data. This is useful when 1/ the server generates data in another location (a database, another system, etc.) that should not be returned directly to the caller, e.g. "I wrote 1 GB of data to S3 and am not going to pass it back over the wire from the MCP server", or 2/ the server is working on an async workflow that will eventually populate a result at the resource's location, e.g. "I will eventually write a file to S3 that you can consume at this location, but the file is not available yet."
  2. Adding a status to Resources to indicate whether they are ready to be consumed. This is useful for 1/ cases where a Resource might be present but not yet ready to be consumed (for example, a multi-part upload or multi-step workflow where multiple steps or independent nodes collaborate on a resource) and 2/ async workflows where a server wants to return a reference to a resource that might not exist yet.
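A rough sketch of those two changes as simple Python types (the field names, status values, and defaults are illustrative assumptions, not the PR's actual schema):

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative-only status values; the PR's exact enum may differ.
class ResourceStatus(str, Enum):
    PENDING = "pending"   # referenced location not populated yet
    READY = "ready"       # safe to read_resource()
    ERROR = "error"

@dataclass
class Resource:
    uri: str
    name: str
    # Change 2: a status field; defaulting to READY would keep old behavior.
    status: ResourceStatus = ResourceStatus.READY

@dataclass
class TextContent:
    text: str

@dataclass
class CallToolResult:
    # Change 1: content may now carry a Resource reference (URI + status)
    # alongside inline content, instead of only embedded data.
    content: list  # items may be TextContent, Resource, etc.
```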

The resource-based approach is problematic for partial results.

  • The proposal does not currently include an API for partial results. Such an API could be added, but there is an issue with adding cumulative partial results to a resource: the server has to buffer all previous partial results each time a new partial result is added. For example, if a tool returns many images, each as a partial result, the server will have to buffer N images in memory to append the N+1 image to the async resource.
  • Likewise, on the client side, if a client wants to fetch partial results, it must fetch N images that it may have already seen to get the N+1 image.

Why is the pattern for partial results any different? Servers still publish partial results over a stream. When servers eventually provide a final result, it now may have a Resource attached. Am I missing something?

The resource-based approach doesn't address server-side requests (e.g. sampling requests).
- If the server makes a request to the client, where will the request be stored? It could possibly be stored in the async resource, but that seems awkward.
- If the request is stored in the async resource, clients must use a different code path to extract the request from the async resource versus from a stream of JSON-RPC objects.

This seems specific to the prior PR - I don't know if it applies here.

The resource-based approach compels the client to subscribe to a resource.

  • Barring webhooks, the only way to actively poll for updates would be to fetch the resource, which may not be appropriate.
  • Incidentally, it's not clear to me how Streamable HTTP transport resumability works with resource update notifications. If a client first subscribes to an async resource after a server has already sent a resource update notification, what Last-Event-ID must the client use to get that notification?

This is a great point -- we really need a GetResource operation in addition to a ListResource / ReadResource one.

The resource-based approach doesn't address resource reclamation.
- How is expiry time communicated to the client?
- Does subscribing to an async resource refresh the expiry time? Such a side effect would be a notable difference from subscribing to a regular resource.
- When a client fetches an async resource that is marked as completed, can the resource be deleted immediately? Such a side effect would also be a notable difference from a regular resource.

This also seems specific to the prior PR.

@bzsurbhi bzsurbhi requested a review from Joffref May 22, 2025 21:40
@jonathanhefner
Contributor

If a tool returns multiple items in CallToolResult.content, the client may have to maintain a subscription for each item.

Adding GetResource will provide an alternative to subscription.

That would shift the problem from maintaining multiple subscriptions to polling multiple resources.

The resource-based approach changes the nature of progress notifications, results, and errors.

  • Servers must use a different API to set these things on an async resource versus sending JSON-RPC objects.
  • Clients must have a separate code path to extract these things from an async resource versus processing a stream of JSON-RPC objects.
  • No support for progress messages.

I don't see why the nature of progress notifications, results, and errors changes at all. This is just adding a new possible return type for tools and ensuring resources have status. Progress messages are still supported via the normal channel.

By "changing the nature of", I mean there will be a bifurcation. Think about how sync tools and async tools will require entirely separate code paths in the SDKs. Think about how defining a new property on, say, TextContent will also require defining that property on TextResourceContents to maintain feature parity. And what should we do for a new property on ImageContent? Think about how a tool author must decide whether to make their tool sync or async. Why not just make every tool async?

By "no support for progress messages", I was referring to the prior PR having a progress number and total, but not capturing progress messages. This PR has neither of those things; however, if clients must still use the normal channel for progress notifications, why not get a normal result type directly from that channel as well?

Think of it this way - this PR is just doing the following:

  1. Extending CallToolResult to allow for returning Resources (instead of just EmbeddedResources). This enables servers to give back resource references without supplying the actual data. This is useful when 1/ the server generates data in another location (a database, another system, etc.) that should not be returned directly to the caller, e.g. "I wrote 1 GB of data to S3 and am not going to pass it back over the wire from the MCP server", or 2/ the server is working on an async workflow that will eventually populate a result at the resource's location, e.g. "I will eventually write a file to S3 that you can consume at this location, but the file is not available yet."

This is an important use case, but it is orthogonal to async tools. Representing the data as a resource is one approach, but it does not require adding a status field. The downside to that approach is that binary data must be encoded as base64, which increases payload size by 33%. Other approaches have been discussed as well.
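The 33% figure can be checked directly: base64 maps every 3 input bytes to 4 output characters, so a payload grows by a factor of 4/3 (plus up to two padding characters when the length isn't a multiple of 3).

```python
import base64

# 3000 raw bytes; divisible by 3, so no padding is needed.
payload = bytes(3000)
encoded = base64.b64encode(payload)

# Exactly 4/3 of the input size: 3000 bytes -> 4000 characters.
overhead = len(encoded) / len(payload) - 1  # ~0.333, i.e. ~33%
```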

  2. Adding a status to Resources to indicate whether they are ready to be consumed. This is useful for 1/ cases where a Resource might be present but not yet ready to be consumed (for example, a multi-part upload or multi-step workflow where multiple steps or independent nodes collaborate on a resource) and 2/ async workflows where a server wants to return a reference to a resource that might not exist yet.

It's worth pointing out that the status field implies that all resources, even resources that are not tool results, now have new potential failure modes. Clients will now have to check the status of a resource and handle those potential states before any read.

Why is the pattern for partial results any different? Servers still publish partial results over a stream. When servers eventually provide a final result, it now may have a Resource attached. Am I missing something?

I was assuming that partial results would not be over a stream, because if they were over a stream, why have a resource attached?

The resource-based approach doesn't address server-side requests (e.g. sampling requests).

This seems specific to the prior PR - I don't know if it applies here.

Neither PR addresses what happens when an async tool makes a request to a disconnected client, so the point applies to both PRs.

Again, if the answer is that the request is sent over the resumable HTTP SSE stream, then why isn't the full tool result also sent over the stream? If we need an API to poll the status of the stream after a disconnection, then let's just add that.

The resource-based approach doesn't address resource reclamation.

This also seems specific to prior PR.

Neither PR addresses how the server should handle resource reclamation, so the point applies to both PRs.

When should a resource expire? How is that time frame communicated to the client? If a server sends a request to a disconnected client, how long must the server wait before deleting the "pending" resource? When an "available" or "error" resource has been read, can it be immediately deleted? How long must the server maintain a record of a "deleted" resource?

@davemssavage

I'm looking into this problem at the moment too. This may have already been considered, but I wonder if the simplest possible option is just to allow a resource URI reference in a progress notification, e.g. for FastMCP:

context.report_progress(
  progress=5, 
  total=10, 
  message='generating content', 
  resource_uri='resource:/content_5_10.mp4'
)

That way virtually no new concepts are required. I suggest it is up to server implementors how long they keep any in-progress resources around, i.e. no guarantees, and they are expected to be ephemeral; but I imagine most servers would/could keep these URIs valid for minutes with little overall cost, giving clients time to cache the results.

I suggest resource_uri should be an optional field that defaults to None for backwards compatibility.

If this is of interest I can draft an RFC; I'll take a pass at implementing it as a branch of the python-sdk to demonstrate the idea further.

@hesreallyhim
Contributor

This is a nuanced discussion involving modelcontextprotocol/python-sdk#800 as well. I commented there that it's already permitted (from a model POV) for Notifications to include arbitrary extra data (the models have ConfigDict(extra="allow") in their definitions), so this is a shortcoming of FastMCP if the user can't take advantage of that flexibility. Why not have a system of dynamic resource creation/identification where the creation of a resource itself becomes a resource (the base being job:// or task:// or something, with the rest of the URI perhaps inferable from the URI of the "pending" resource itself)?

@000-000-000-000-000

By "changing the nature of", I mean there will be a bifurcation. Think about how sync tools and async tools will require entirely separate code paths in the SDKs. Think about how defining a new property on, say, TextContent will also require defining that property on TextResourceContents to maintain feature parity. And what should we do for a new property on ImageContent?

FWIW I think this need to keep parity already exists - I don't think this PR changes the need to ensure TextResourceContents and TextContent evolve in sync. If we were to redesign this we might use the same Content objects for both, but at this point we need to continue with this abstraction.

Also, philosophically, while it is nice if clients can treat async results identically to sync ones, I don't think this is a necessity. The code on the client side will meaningfully differ for async regardless.

Think about how a tool author must decide whether to make their tool sync or async. Why not just make every tool async?

Servers can choose, similar to APIs. Tools which execute short, simple operations are probably better modeled as sync, while ones which are long-running are better modeled as async.

By "no support for progress messages", I was referring to the prior PR having a progress number and total, but not capturing progress messages. This PR has neither of those things; however, if clients must still use the normal channel for progress notifications, why not get a normal result type directly from that channel as well?

They can -- this is an alternative to sitting on the line waiting for streaming results.

This is an important use case, but it is orthogonal to async tools. Representing the data as a resource is one approach, but it does not require adding a status field. The downside to that approach is that binary data must be encoded as base64, which increases payload size by 33%. Other approaches (#90, reply in thread) have been discussed as well.

I don't see anything wrong with adding other content types to Resource in the fullness of time.

It's worth pointing out that the status field implies that all resources, even resources that are not tool results, now have new potential failure modes. Clients will now have to check the status of a resource and handle those potential states before any read.

That's a great point - adding status will be backwards incompatible for existing consumers. Keeping status in a separate extension class of Resource or creating a brand-new class would be two options here. @bzsurbhi, I would suggest thinking through this point.
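A sketch of the extension-class option (the class and field names here are hypothetical): plain Resource stays untouched, only async-aware code opts into a status, and status-unaware readers treat everything as readable exactly as before.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    # Unchanged existing type: no status, so existing consumers are unaffected.
    uri: str
    name: str

@dataclass
class AsyncResource(Resource):
    # Hypothetical extension class carrying the new state; only clients that
    # know about async tools ever see or check this field.
    status: str = "pending"

def is_readable(resource: Resource) -> bool:
    # Old-style resources have no status attribute and remain readable;
    # extended resources are readable only once their status is "ready".
    return getattr(resource, "status", "ready") == "ready"
```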

I was assuming that partial results would not be over a stream, because if they were over a stream, why have a resource attached?

Basically I'm just saying that servers can decide to still send progress notifications if they really want to and clients can decide whether to ignore them or not. That said, I don't expect this to happen in general. The point of this is that clients don't need to wait on the line for completion.

Neither PR addresses what happens when an async tool makes a request to a disconnected client, so the point applies to both PRs.
Again, if the answer is that the request is sent over the resumable HTTP SSE stream, then why isn't the full tool result also sent over the stream? If we need an API to poll the status of the stream after a disconnection, then let's just add that.

I'm not fully understanding this comment. In Streamable HTTP, client/server interactions can be completely stateless, i.e. a single request/response, similar to REST APIs. This PR is defining semantics for indicating to clients when some artifact of a request is still being processed, plus allowing tools to specify Resources as responses. When requests are not stateless or involve streaming, servers can still send these Resources, but the use cases for doing so would probably be different.

Neither PR addresses how the server should handle resource reclamation, so the point applies to both PRs.
When should a resource expire? How is that time frame communicated to the client? If a server sends a request to a disconnected client, how long must the server wait before deleting the "pending" resource? When an "available" or "error" resource has been read, can it be immediately deleted? How long must the server maintain a record of a "deleted" resource?

Resource reclamation seems like an orthogonal problem that already exists for Resources today. Not sure if we need to couple it with this proposal on enhancing Tool calls w/ Resources.

@000-000-000-000-000

I'm looking into this problem at the moment too. This may have already been considered, but I wonder if the simplest possible option is just to allow a resource URI reference in a progress notification, e.g. for FastMCP:

context.report_progress(
  progress=5, 
  total=10, 
  message='generating content', 
  resource_uri='resource:/content_5_10.mp4'
)

That way virtually no new concepts are required. I suggest it is up to server implementors how long they keep any in-progress resources around, i.e. no guarantees, and they are expected to be ephemeral; but I imagine most servers would/could keep these URIs valid for minutes with little overall cost, giving clients time to cache the results.

I suggest resource_uri should be an optional field that defaults to None for backwards compatibility.

If this is of interest I can draft an RFC; I'll take a pass at implementing it as a branch of the python-sdk to demonstrate the idea further.

I think this kind of defeats the purpose here of facilitating flows where clients do not need to wait on the line for servers to complete.

@davemssavage

I think this kind of defeats the purpose here of facilitating flows where clients do not need to wait on the line for servers to complete.

It occurs to me that rather than a resource with a status, a future result might be a better and more generalisable approach. This could then be solved at the SDK layer rather than the protocol layer, and it would allow any type of response, not just content, so it would also support the structuredContent approach.

This avoids complicating the resource protocol: a resource is always expected to be 'ready'; it just might only become ready in the future.

e.g. for the python sdk this might look like:

future: Future[ImageContent] = session.call_tool_soon('generate_image', progress_callback=progress_callback)
image = future.result(timeout=60) # 60 seconds

The naming convention here is meant to mirror start_task_soon from anyio but other naming options are obviously available.

In this case the python client can opt into this behaviour and use a callback, or wait for a result and cancel if it's taking too long. I think this covers the use case this change is trying to address and doesn't require a new concept at the protocol level.

This might be going off on a completely new use case but I'm interested in the idea of creating tools to perform open ended long running tasks like training a model.

In this scenario a tool will output many intermediary artefacts such as metrics, validation data, and checkpoints, and a single response object doesn't feel that useful. It would be neat if there were a way to return a sequence of future resources that the server could emit as they become available. That was the main reason for allowing progress to respond with resources; however, the limitation of this approach is that it's not obvious how a client would reconnect with a long-running task if it disconnects and then rejoins, say, the next day if the training is taking a long time.

It might be because it's late in the day in the UK, but this last part seems hard to implement both with this Future approach and with Resources that can be updated (as per the discussion here: modelcontextprotocol/python-sdk#800 (comment)).

@000-000-000-000-000

Solving this on the client side won't help, since it still requires a persistent connection. But I do think the idea of creating a new class like "Promise" and a new alternative to CallTool might be another way to achieve this.

@davemssavage
davemssavage commented May 27, 2025

Solving this on the client side won't help, since it still requires a persistent connection. But I do think the idea of creating a new class like "Promise" and a new alternative to CallTool might be another way to achieve this.

Yep, that's the sort of direction I'm thinking too, if there's an appetite for a new method. I agree supporting non-persistent connections is going to be really useful for these sorts of use cases.

Perhaps something along the lines of the following for the low level sdk:

task: TaskReference = session.start_task('train_model', training_config)
session.subscribe_updates(task, 'progress', progress_callback)
session.subscribe_updates(task, 'metrics', metrics_callback)
result = session.wait_result(task, timeout=60)

The start_task would behave a bit like call_tool, but instead of being expected to get a result immediately, the client gets a reference that can be used to get the result later. The task reference should be externalisable so another process can pick up the reference after a disconnect:

rejoined_task = session.join_task(old_task)
session.subscribe_updates(rejoined_task, 'progress', progress_callback)
session.subscribe_updates(rejoined_task, 'metrics', metrics_callback)
result = session.result(rejoined_task, timeout=60)

There's a question of how the client would request updates that it had missed whilst disconnected; perhaps allowing an optional position marker on rejoin would enable the client to say "replay updates after a certain point". It's not obvious how to pass the position markers around - I might need to think on that some more... somehow associated with the old_task?

The task reference might advertise the channels the client can subscribe to with a description e.g.

class TaskReference(BaseModel):
  task_token: str
  channels: Dict[str, str] 

I'm being very lazy with the channels description there, but the idea would be to provide at the very least a string description of the channels the client could subscribe to.

High level sdk libs might then wrap up this low level task protocol into Promises to make the simple cases easy but allow for more complex behaviour without requiring lots of new semantics.
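One way a high-level SDK could do that wrapping, sketched with stand-ins for the hypothetical low-level start_task/wait_result calls above, is to bridge them to a standard concurrent.futures.Future so simple callers can just block on result():

```python
import concurrent.futures
import threading

def start_task_as_future(start_task, wait_result, name, config):
    """Start a task via the (hypothetical) low-level protocol and expose
    its eventual result as a standard Future."""
    task = start_task(name, config)  # returns a task reference immediately
    future: concurrent.futures.Future = concurrent.futures.Future()

    def run():
        try:
            future.set_result(wait_result(task))
        except Exception as exc:     # surface task failures to the caller
            future.set_exception(exc)

    threading.Thread(target=run, daemon=True).start()
    return future

# Stand-ins for the low-level session API sketched in this thread.
def fake_start_task(name, config):
    return {"task_token": f"{name}-token", "config": config}

def fake_wait_result(task):
    return f"result-for-{task['task_token']}"
```

Simple cases then reduce to `future.result(timeout=...)`, while more complex callers could still use the channel subscriptions directly.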

I realise this is going way off piste from the original ticket.

@hesreallyhim
Contributor
hesreallyhim commented May 27, 2025

Lemme try to pull some strands together, maybe just to clear up my own thinking. The problem to address is how to deal with really long-running tool processes in a way that keeps the user informed about the status. Is that roughly accurate? So an LLM decides to output a tool_call to respond to a user prompt; maybe the prompt is even something like "solve this incredibly hard problem" ("What is the meaning of life?"), and the LLM has a "build-an-LLM" tool, so it decides the best thing to do is to train a new foundational model to solve the problem. So unlike a "normal" tool, you can't wait synchronously for the answer; that can't be the expected response. To me, the tool's function is to kick off a long process, with maybe many intermediate steps and many resources or outputs along the way.

The way I would think about it is you get a response with metadata for the overall "job", and maybe the response from the tool is a description of all the sub-tasks, like "data collection task", "training task", "evaluation task"; finally the new model is built and it gets to the "answer the original question" task. I would picture the big job as a resource with a URI, and the response from the tool is just this URI and a "plan" of all the sub-tasks or something. Then the client is happy and life goes on: either it subscribes to updates about this job://buildModel/{UUID}, or it can poll it, or just check in (i.e. read_resource), or whatever. As the job goes on, you get the creation of new tasks like task://dataCollection/{UUID} or task://buildModel/{UUID}/dataCollection, then task://buildModel/{UUID}/trainModel, then task://buildModel/{UUID}/fineTuning, and finally task://buildModel/{UUID}/promptModel, and you get "42", which is the answer to the original user prompt. For each of these sub-tasks you can also do normal "progressNotification"-style messaging if you like.

The closest thing I can think of is the "Deep Research" "tool" in ChatGPT, e.g.; it can take like 20 minutes, but it's synchronous and blocking. So are we trying to think about how to deal with situations like that, but on an even larger scale? Supposing I'm actually following the thread (maybe I'm just following my own thread), I guess I like this approach overall. Are this and this competing proposals? If so, it would be easier to reason about if they were both at the same level, since @davemssavage's PR is on the SDK level - or is that also part of the debate, like "is this a protocol change or an SDK change"?

OK, so if what I said above is off base, then sorry for clogging up the debate, just minimize this comment lol; but if I'm following, here are my closing thoughts:

  • a tool should not return a Resource, but it could return a URI for a new resource - in that case, it's really just a TextResponse but semantically it might be worth having a new class to differentiate it, so this line should really be a ResourceURI or something.
  • ResourceStatus - I kind of like this, although I don't love the idea of pre-determining an enum in advance, where we have a certain set of use cases in mind that is not general/future-proofed enough. Having it just be a "description" (free text or whatever) is also not great, but I prefer that over the list you have right now, which seems tailored to certain resources (e.g. "transferring a giant object to storage" or "an ETL job"); also, "available" is too vague to me. So if I had to pick I would prefer just "type": "string" - a MIME type, for example, is really almost an enum, since we mostly know what the options are, but it's listed as a string regardless.
  • In the process I described, the major change is really not adding a status to a resource but more like some notion of a sub-resource or related resource...
  • I think if you two are really trying to solve the same problem, @davemssavage 's PR should target the spec, not some high-level SDK.
  • I guess the suggestion I'm making, if coherent at all, is to "Add ResourceURI as a return type in CallToolResult", and "add subResources/relatedResources as field in Resource" - (I think adding "status" as another field is fine too)
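A sketch of those two suggestions as Python types; the names (ResourceURI, sub_resources) and the free-text status are hypothetical illustrations from this comment, not anything in the spec:

```python
from dataclasses import dataclass, field

@dataclass
class ResourceURI:
    # A reference only: semantically a URI, not embedded resource contents.
    uri: str                  # e.g. "job://buildModel/{UUID}"
    status: str = "available" # free-text status, per the discussion above

@dataclass
class Resource:
    uri: str
    name: str
    # Hypothetical subResources/relatedResources field: URIs of sub-tasks
    # created as the job progresses.
    sub_resources: list = field(default_factory=list)

# A long-running job advertises its sub-tasks as they are created.
job = Resource(uri="job://buildModel/123", name="build model")
job.sub_resources.append("task://buildModel/123/dataCollection")
```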

Hope this helped. 🤷

EDIT (caveat): Maybe I'm thinking of a resource too loosely. I think of it as literally pretty much any entity in the universe, where the contents are data about the entity, but a resource itself does not have to be a bit of data. Maybe that's contrary to the protocol, which says "Resources represent any kind of data that an MCP server wants to make available to clients", but I think it's nice to think of a resource as ANYTHING; if you want, you can argue that anything can be represented "as data", but maybe I am on my own in thinking that way.

EDIT: I noticed that the notion of a "sub-resource" is already mentioned (maybe not defined) in the schema.ts:

* The URI of the resource that has been updated. This might be a sub-resource of the one that the client actually subscribed to.

@davemssavage

Final thought, then time for some sleep: it would be neat if start_task played nicely with existing tools, e.g. server implementors might perform some mapping to make tools automatically appear as tasks so existing libraries don't need rewriting. Some of the semantics around channels in my previous post would require new server-side APIs, but these should be opt-in rather than mandatory.

@jonathanhefner
Contributor

By "changing the nature of", I mean there will be a bifurcation. Think about how sync tools and async tools will require entirely separate code paths in the SDKs. Think about how defining a new property on, say, TextContent will also require defining that property on TextResourceContents to maintain feature parity. And what should we do for a new property on ImageContent?

FWIW I think this need to keep parity already exists - I don't think this PR changes the need to ensure TextResourceContents and TextContent evolve in sync. If we were to redesign this we might use the same Content objects for both, but at this point we need to continue with this abstraction.

I don't disagree that it would be nice to unify many types, but, other than this PR, why would TextContent and TextResourceContents need to evolve in sync? Note that there is no ImageResourceContents for parity with ImageContent, nor AudioResourceContents for parity with AudioContent.

Also, philosophically, while it is nice if clients can treat async results identically to sync ones, I don't think this is a necessity. The code on the client side will meaningfully differ for async regardless.

If clients can treat sync and async results identically, why would the code meaningfully differ?

The spec supports resuming a disconnected stream. If a result is transmitted via the stream, then the client can handle it in the same way upon resume, regardless of whether the tool is sync or async. There just needs to be a mechanism to trigger the resume, such as a timer or a webhook.

Think about how a tool author must decide whether to make their tool sync or async. Why not just make every tool async?

Servers can choose, similar to APIs. Tools which execute short, simple operations are probably better modeled as sync, while long-running ones are better modeled as async.

What about a tool that executes a short, simple operation, except that a particular input causes it to run for a very long time?

By "no support for progress messages", I was referring to the prior PR having a progress number and total, but not capturing progress messages. This PR has neither of those things; however, if clients must still use the normal channel for progress notifications, why not get a normal result type directly from that channel as well?

They can -- this is an alternative to sitting on the line waiting for streaming results.

The client does not need to stay online while waiting for streamed results.

This is an important use case, but it is orthogonal to async tools. Representing the data as a resource is one approach, but it does not require adding a status field. The downside to that approach is that binary data must be encoded as base64, which increases payload size by 33%. Other approaches have been discussed as well.

I don't see anything wrong with adding other content types to Resource in fullness of time.

I don't understand the relationship between these two comments.

Neither PR addresses what happens when an async tool makes a request to a disconnected client, so the point applies to both PRs.
Again, if the answer is that the request is sent over the resumable HTTP SSE stream, then why isn't the full tool result also sent over the stream? If we need an API to poll the status of the stream after a disconnection, then let's just add that.

I'm not fully understanding this comment. In Streamable HTTP, client/server interactions can be completely stateless, i.e. a single request/response similar to REST APIs. This PR defines semantics for indicating to clients when some artifact of a request is still being processed, and allows tools to specify Resources as responses. When requests are not stateless or involve streaming, servers can still send these Resources, but the use cases for doing so would probably be different.

Servers can make requests to clients, e.g. sampling requests. Much like progress notifications, those requests are sent via the stream. If a client must read the stream to receive and handle such messages, why not also send the result via the stream? The only real purpose of the resource is to monitor the status of the result. Instead of altering the resource primitive to suit that purpose (and dealing with the ensuing consequences), let's add a separate API specifically for that purpose.

Neither PR addresses how the server should handle resource reclamation, so the point applies to both PRs.
When should a resource expire? How is that time frame communicated to the client? If a server sends a request to a disconnected client, how long must the server wait before deleting the "pending" resource? When an "available" or "error" resource has been read, can it be immediately deleted? How long must the server maintain a record of a "deleted" resource?

Resource reclamation seems like an orthogonal problem that already exists for Resources today. Not sure if we need to couple it with this proposal on enhancing Tool calls w/ Resources.

I'm not sure that resource reclamation is actually a problem that exists today. Regardless, there is a fundamental difference when you are creating resources on every tool call, so it is not an orthogonal problem. The questions I asked in my previous comment were specific to tool-created resources.

@jonathanhefner
Contributor

The start_task would behave a bit like call_tool, but instead of being expected to get a result immediately, the client gets a reference that can be used to get the result later. The task reference should be externalisable so another process can pick up the reference after a disconnect.

This sounds like the approach I proposed in #491 (comment). Ultimately, I concluded that it would be better to generalize the approach for all tools, not just (anticipated) long-running tasks. See #543.

There's a question of how the client would request updates that it had missed whilst disconnected, perhaps allowing an optional position marker on rejoin to enable the client to replay updates after a certain point. It's not obvious how to pass the position markers around; I might need to think on that some more... somehow associated with the old_task?

See #543 (comment).

@bzsurbhi
Author

@davemssavage
The TaskReference approach introduces a new concept (tasks) and multiple new methods (start_task, subscribe_updates, wait_result, join_task) to the protocol, making it more complex. Both clients and servers would need significant new code to support task management, persistence, and reconnection logic.

I agree that adding a status field to the existing Resource class overloads it and raises backwards-compatibility concerns for existing users

I overall like the idea of introducing a new class and an alternative to tool/call

# Imports assumed from the Python SDK and Pydantic (illustrative sketch):
from enum import Enum
from typing import Any, Literal

from pydantic import BaseModel, ConfigDict
from mcp.types import (EmbeddedResource, ImageContent, Request,
                       RequestParams, Result, TextContent)

class PromiseStatus(str, Enum):
    """The status of a promise."""
    PENDING = "pending"
    """Promise is still being processed."""
    FULFILLED = "fulfilled"
    """Promise has been fulfilled with a result."""
    REJECTED = "rejected"
    """Promise has been rejected with an error."""
    CANCELLED = "cancelled"
    """Promise was cancelled before completion."""

class Promise(BaseModel):
    """A promise representing an asynchronous operation."""
    id: str
    """Unique identifier for this promise."""
    status: PromiseStatus
    """Current status of the promise."""
    createdAt: str
    updatedAt: str
    expiresAt: str | None = None

class CallToolAsyncRequestParams(RequestParams):
    """Parameters for calling a tool asynchronously."""
    
    name: str
    """Name of the tool to call."""
    arguments: dict[str, Any] | None = None
    """Arguments to pass to the tool."""
    model_config = ConfigDict(extra="allow")

class CallToolAsyncRequest(Request[CallToolAsyncRequestParams, Literal["tools/callAsync"]]):
    """Used by the client to invoke a tool asynchronously."""
    
    method: Literal["tools/callAsync"]
    params: CallToolAsyncRequestParams

class CallToolAsyncResult(Result):
    """The server's response to an asynchronous tool call request."""
    
    promise: Promise
    """The promise representing the asynchronous operation."""

class GetPromiseRequestParams(RequestParams):
    """Parameters for getting the status of a promise."""
    
    id: str
    """The ID of the promise to check."""
    model_config = ConfigDict(extra="allow")

class GetPromiseRequest(Request[GetPromiseRequestParams, Literal["promises/get"]]):
    """Used by the client to check the status of a promise."""
    
    method: Literal["promises/get"]
    params: GetPromiseRequestParams

class GetPromiseResult(Result):
    """The server's response to a promise status check.""" 
    promise: Promise
    result: list[TextContent | ImageContent | EmbeddedResource] | None = None
    error: str | None = None

class CancelPromiseRequestParams(RequestParams):
    """Parameters for cancelling a promise."""
    
    id: str
    """The ID of the promise to cancel."""
    model_config = ConfigDict(extra="allow")

class CancelPromiseRequest(Request[CancelPromiseRequestParams, Literal["promises/cancel"]]):
    """Used by the client to cancel a promise."""
    
    method: Literal["promises/cancel"]
    params: CancelPromiseRequestParams

class CancelPromiseResult(Result):
    """The server's response to a promise cancellation request."""
    
    promise: Promise
    """The updated state of the promise."""

This implementation provides:

  • A Promise model with status tracking
  • An asynchronous tool call method (tools/callAsync)
  • Methods to check promise status and retrieve results (promises/get)
  • A way to cancel promises (promises/cancel)

The client workflow would be:

  • Call tools/callAsync to start an asynchronous operation
  • Receive a promise ID
  • Periodically poll with promises/get to check status
  • When status is FULFILLED, retrieve the result
  • Optionally cancel with promises/cancel if needed
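That polling workflow can be sketched as follows. This is a toy standing in for a real MCP session: FakeClient, its fulfillment-on-third-poll behavior, and poll_until_settled are all illustrative assumptions, not SDK APIs.

```python
import time
from dataclasses import dataclass
from enum import Enum


class PromiseStatus(str, Enum):
    PENDING = "pending"
    FULFILLED = "fulfilled"
    REJECTED = "rejected"
    CANCELLED = "cancelled"


@dataclass
class Promise:
    id: str
    status: PromiseStatus


class FakeClient:
    """Stand-in for an MCP client session; all methods here are illustrative."""

    def __init__(self):
        self._polls = 0

    def call_tool_async(self, name, arguments):
        # Corresponds to tools/callAsync: returns a pending promise.
        return Promise(id="promise-1", status=PromiseStatus.PENDING)

    def get_promise(self, promise_id):
        # Corresponds to promises/get; this fake fulfills on the third poll.
        self._polls += 1
        if self._polls >= 3:
            return Promise(promise_id, PromiseStatus.FULFILLED), [
                {"type": "text", "text": "done"}
            ]
        return Promise(promise_id, PromiseStatus.PENDING), None


def poll_until_settled(client, promise, interval=0.01, timeout=5.0):
    """Poll promises/get until the promise leaves PENDING or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        promise, result = client.get_promise(promise.id)
        if promise.status is not PromiseStatus.PENDING:
            return promise, result
        time.sleep(interval)
    raise TimeoutError(f"promise {promise.id} still pending after {timeout}s")


client = FakeClient()
promise = client.call_tool_async("slow_tool", {"n": 42})
promise, result = poll_until_settled(client, promise)
print(promise.status.value, result[0]["text"])  # fulfilled done
```

The deadline in poll_until_settled matters: without one, a server that never settles a promise would hang the client forever, which is one reason an expiresAt field on the promise is useful.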

This approach:

  • Separates resource representation from operation state
  • Is backwards compatible and does not overload Resource
  • Doesn't require persistent connections
  • Provides a clean API for async operations

Note: The class names, method names, and exact structure shown here are for demonstration purposes only and not final. The actual implementation would need to be refined through discussion and aligned with the protocol's naming conventions and design patterns.

Let me know what you think

@bzsurbhi
Author


@davemssavage

Let me know what you think

I agree that trying to keep things close to the original tool protocol is probably better, on reflection. Thinking about it some more, the JSONRPCRequest interface already has the idea of a requestId (https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/draft/schema.ts#L79). I wonder if what we're actually talking about is externalising this so a client can get the result of a previously submitted JSONRPCRequest?

A couple of points. I'm generally not a big fan of polling; given that the protocol already has support for server-side notifications, perhaps these should be used to notify a client that a task is complete or has failed, rather than the client having to keep polling for the result, which tends to lead to unnecessary delays. I guess the client should always be able to fall back to polling, but ideally it wouldn't be the primary mechanism.
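That notification-first, polling-fallback arrangement might look roughly like the following threading-based sketch. CompletionWaiter and its hookup are hypothetical, not existing SDK surface; poll_fn stands in for a status check such as promises/get or resources/read.

```python
import threading
import time


class CompletionWaiter:
    """Notification-first waiting with a polling fallback (illustrative sketch).

    A real client would call on_notification() from its handler for
    server-side notifications (e.g. a resource-updated notification).
    """

    def __init__(self, poll_fn, poll_interval=1.0):
        self._event = threading.Event()
        self._poll_fn = poll_fn  # returns True once the task has settled
        self._poll_interval = poll_interval

    def on_notification(self):
        # Wakes the waiter immediately when the server pushes completion.
        self._event.set()

    def wait(self, timeout=30.0):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            # Prefer the push path: wake as soon as a notification arrives...
            if self._event.wait(self._poll_interval):
                return True
            # ...but fall back to polling, so a missed notification costs
            # at most one poll interval rather than the whole timeout.
            if self._poll_fn():
                return True
        return False
```

With this shape, polling is a safety net rather than the primary mechanism: a delivered notification unblocks the client immediately, and the poll interval only bounds the latency in the lossy-notification case.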

Also, I think it would be sensible to support task progress for async tool calls, so notifications of async task progress should be allowed. If this change also added the idea of notifying of a resource alongside the string message, I think this would cover my original idea of reporting arbitrary info, e.g. metrics, validation data, checkpoints, etc., as per #549 (comment).

We should also make sure it works with structuredContent responses as well as TextContent | ImageContent | EmbeddedResource.

I think we're all thinking along the same lines; perhaps it's worth putting together a new RFC. I find it a bit easier to comment on code when it's in a merge request than buried in comments. I'd also be happy to contribute towards a python-sdk experiment to prove it works in reality if you need a volunteer for such work.

@bzsurbhi
Author
bzsurbhi commented May 29, 2025

I believe we could use the existing Resource subscription and notification mechanisms rather than implementing new asynchronous code. For polling support, we can add a status field through a new AsyncOperation/AsyncResource class instead of modifying the base Resource class directly. This approach leverages our existing infrastructure while maintaining cleaner separation of concerns.

# Imports assumed from the Python SDK and Pydantic (illustrative sketch):
from enum import Enum
from typing import Annotated, Any, Literal

from pydantic import AnyUrl, BaseModel, ConfigDict, UrlConstraints
from mcp.types import Request, RequestId, RequestParams, Result

class OperationStatus(str, Enum):
    """The status of an asynchronous operation."""

    PENDING = "pending"
    """Operation is still being processed."""
    FULFILLED = "fulfilled"
    """Operation has been fulfilled with a result."""
    REJECTED = "rejected"
    """Operation has been rejected with an error."""
    CANCELLED = "cancelled"
    """Operation was cancelled before completion."""

class AsyncOperation(BaseModel):
    """Information about an asynchronous operation."""

    requestId: RequestId
    """The ID of the request that created this operation."""
    status: OperationStatus
    """Current status of the operation."""
    resourceUri: Annotated[AnyUrl, UrlConstraints(host_required=False)]
    """The URI of the resource that will contain the result when ready."""
   ...

class CallToolAsyncRequestParams(RequestParams):
    """Parameters for calling a tool asynchronously."""
    
    name: str
    """Name of the tool to call."""
    arguments: dict[str, Any] | None = None
    """Arguments to pass to the tool."""
    model_config = ConfigDict(extra="allow")

class CallToolAsyncRequest(Request[CallToolAsyncRequestParams, Literal["tools/callAsync"]]):
    """Used by the client to invoke a tool asynchronously."""
    
    method: Literal["tools/callAsync"]
    params: CallToolAsyncRequestParams

class CallToolAsyncResult(Result):
    """The server's response to an asynchronous tool call request."""
    
    operation: AsyncOperation
    """Information about the asynchronous operation."""

class GetOperationRequestParams(RequestParams):
    """Parameters for getting the status of an operation."""
    
    requestId: RequestId
    """The ID of the request to check."""
    model_config = ConfigDict(extra="allow")

class GetOperationRequest(Request[GetOperationRequestParams, Literal["operations/get"]]):
    """Used by the client to check an operation's status."""
    
    method: Literal["operations/get"]
    params: GetOperationRequestParams

class GetOperationResult(Result):
    """The server's response to an operation status check."""
    
    operation: AsyncOperation
    """The current state of the operation."""

In this approach:

  • The existing Resource class is not modified
  • AsyncOperation has its own OperationStatus enum
  • AsyncOperation includes resourceUri to link to the resource

The server implementation would:

  • Create a new resource with a unique URI when tools/callAsync is called
  • Create an AsyncOperation object that references this resource
  • Return the AsyncOperation in CallToolAsyncResult
  • Update both the AsyncOperation status and the resource as the operation proceeds
  • When complete, populate the resource content

The client workflow would be:

  • Call tools/callAsync to start an async operation
  • Receive an AsyncOperation with resourceUri and status=PENDING
  • Either:
    -- Check operation status with operations/get
    -- Subscribe to the resource with resources/subscribe using resourceUri and receive notifications
    -- Poll the resource with resources/read using resourceUri
  • When operation status=FULFILLED, read the resource to get the result

This approach leverages the existing resource endpoints for content retrieval and notifications.
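A toy sketch of the server-side bookkeeping described above. The class and method names are illustrative only, not proposed protocol surface; real resources would carry MIME types, contents objects, and so on.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class OperationStatus(str, Enum):
    PENDING = "pending"
    FULFILLED = "fulfilled"
    REJECTED = "rejected"
    CANCELLED = "cancelled"


@dataclass
class AsyncOperation:
    request_id: str
    status: OperationStatus
    resource_uri: str


class AsyncToolServer:
    """Toy in-memory bookkeeping for the server side of this flow."""

    def __init__(self):
        self.operations = {}  # request_id -> AsyncOperation
        self.resources = {}   # uri -> content, populated on completion

    def call_tool_async(self, name, arguments):
        # tools/callAsync: mint a request id plus a result resource URI,
        # and hand back a PENDING operation that references the resource.
        request_id = str(uuid.uuid4())
        uri = f"resource://results/{request_id}"
        op = AsyncOperation(request_id, OperationStatus.PENDING, uri)
        self.operations[request_id] = op
        return op

    def complete(self, request_id, content):
        # Called by the background worker when the tool finishes: populate
        # the resource first, then flip the status, so any client that
        # observes FULFILLED can always read the content.
        op = self.operations[request_id]
        self.resources[op.resource_uri] = content
        op.status = OperationStatus.FULFILLED

    def get_operation(self, request_id):
        # operations/get
        return self.operations[request_id]

    def read_resource(self, uri):
        # resources/read
        return self.resources[uri]
```

Ordering inside complete() is the one subtle point: writing the resource before updating the status avoids a window where a client sees FULFILLED but a read of resourceUri still fails.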

I'm happy to create a new RFC once we agree on the high-level approach.

I'm also happy to continue this discussion offline to explore more ideas and reach a conclusion. Could you share your Discord username? I wasn't able to locate you there.

@davemssavage
davemssavage commented May 31, 2025

Hi @bzsurbhi, I have updated my nickname on the MCP Discord server; hopefully that makes me easier to find. I'm not a big user of Discord to date, so this might not be sufficient.

I took a pass at a merge request that captures some of my thinking here:

#617

Happy to progress this on Discord or via either pull request, as appropriate. I should probably flag that I'm working on this as a background project (evenings and weekends), so responses may be sporadic.
