OTEP: Enhancing OpenTelemetry for Large-Scale Metric Management by menderico · Pull Request #4672 · open-telemetry/opentelemetry-specification

Conversation

menderico

Changes

This proposal outlines an enhancement to OpenTelemetry's OpAmp control plane to address the challenges of large-scale metric management in push-based telemetry systems. It suggests extending OpAmp to include a standardized protocol for server-driven metric configuration, allowing backends to define sampling periods and routing instructions based on metric name and resource. This would enable proactive management of telemetry flow, reducing operational costs and improving efficiency, similar to capabilities found in pull-based systems like Prometheus, but without requiring server-side polling and registration. The proposal details specific protobuf extensions for ScheduleInfoRequest and ScheduleInfoResponse and describes a state machine for how agents and servers would interact to implement this dynamic metric scheduling.
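
For orientation, here is a rough sketch of the shape these messages might take, based only on details mentioned in this thread (per-metric sampling periods, `extra_resources`, `next_delay`); the field names and numbers below are illustrative assumptions, and the OTEP diff itself is authoritative:

```protobuf
// Illustrative sketch only - field names and numbers are assumptions drawn
// from this discussion; see the OTEP diff for the actual definitions.
syntax = "proto3";

// Carried in AgentToServer: the agent asks for schedules covering only the
// metrics and resource it actually exports.
message ScheduleInfoRequest {
  // Names of the metrics this agent exports.
  repeated string metric_names = 1;
  // Resource attributes identifying the agent (simplified to strings here;
  // a real definition would more likely reuse OTLP KeyValue).
  map<string, string> resource = 2;
}

// Carried in ServerToAgent: per-metric sampling periods, routing hints, and
// when to ask again.
message ScheduleInfoResponse {
  message Schedule {
    repeated string metric_names = 1;
    // Push period for these metrics, replacing the one set in code.
    uint64 sampling_period_ms = 2;
  }
  repeated Schedule schedules = 1;
  // Extra resource attributes the agent should attach to each request.
  map<string, string> extra_resources = 2;
  // How long the agent should wait before requesting schedule info again.
  uint64 next_delay_ms = 3;
}
```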

@menderico menderico requested review from a team as code owners October 1, 2025 16:23
linux-foundation-easycla bot commented Oct 1, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

them, but for some systems and metrics it might be preferable to have them
written over not having any data at all.

## Prior art and alternatives
Member

Have you considered using existing remote configuration capabilities of OpAMP?

OpAMP server can send a config setting that describes the sampling policy (what this proposal defines in ScheduleInfoResponse). You can define an Otel semconv that defines the name of a config file that contains sampling settings and include that file in AgentConfigMap. The format of the file can also be defined by the same semconv, or be delegated to the Otel Configuration SIG to make a decision on.

The information carried in ScheduleInfoRequest can be part of the existing AgentDescription message, recorded in non_identifying_attributes.

This way there is no need to make any changes to OpAMP itself.
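
As a purely illustrative reading of this suggestion, the whole exchange could reuse messages that already exist in OpAMP; the attribute key, file name, and content type below are hypothetical placeholders that a semconv would need to define:

```protobuf
// Illustrative only - no protocol changes, just existing OpAMP messages.
// The semconv attribute key and config file name are hypothetical.
//
// Agent -> Server: advertise what it can accept, via
// AgentToServer.agent_description.non_identifying_attributes, e.g.
//   { key: "otel.metric_schedule.format", value: "v1" }  // hypothetical key
//
// Server -> Agent: deliver the sampling policy as a named config file, via
// ServerToAgent.remote_config.config.config_map:
//   key: "otel_metric_schedule"                  // hypothetical semconv name
//   value: AgentConfigFile {
//     content_type: "application/yaml"           // format TBD by Config SIG
//     body: "<sampling settings for this agent>"
//   }
```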

Contributor

While we could use the AgentConfigMap as a vehicle to send config, OpAmp lacks the ability to declare which config-file semantics are acceptable.

E.g. how would we know that any particular Agent can accept the config file we want to send? How would we reserve the "name" portion of the ConfigMap? How do we know it's safe to attach our config into the config map for any agent?

This proposal first started with looking at using "custom capabilities" to advertise whether or not Metric config would be something an Agent would understand and accept.

While I appreciate your suggestion - I still think there are some missing pieces, or at least no clear interaction between AgentConfigMap + CustomCapabilities, needed to make this practical to use generically.

Member

E.g. how would we know that any particular Agent can accept the config file we want to send? How would we reserve the "name" portion of the ConfigMap? How do we know it's safe to attach our config into the config map for any agent?

This can all be semantic conventions. A convention for Agent to specify in non-identifying attributes what it can accept. Another convention for the "name" of the config.

I still think there are some missing pieces, or at least no clear interaction between AgentConfigMap + CustomCapabilities, needed to make this practical to use generically.

What exactly is impractical with the semantic conventions approach?

Author

Hi Tigran,

In this proposal we do not expect the config to be rolled out all at once, for a couple of reasons:

  • We would like to allow dynamic configuration updates and gradual rollout mechanisms. The server can distribute specific configurations to different agents at different times, enhancing system resilience against widespread outages and supporting features like rate limiting during high workloads.
  • Configuration sizes can be substantial in large-scale systems like the ones targeted by this proposal. In our existing systems, even agents requesting minimal configurations based on their metric usage can incur significant memory consumption for configuration handling if they export a significant amount of data, and most of these agents only require a fraction of the total config held by the server.

I took a look at the config and, at a high level, it seems this would not be possible with the current AgentConfigMap, or at least it would require a mechanism to permit configs to be rolled out slowly and based on the agent's needs, whereas the config currently needs to be sent all in a single step; at least that is my reading of https://opentelemetry.io/docs/specs/opamp/#configuration

Also, adopting a single configuration format for metric sampling periods would impose a strict standard across all systems. This proposal, conversely, focuses solely on the agent-server interface, allowing each system to define its own configuration. This offers greater flexibility for servers to establish their own rules for metric configuration and collection.

Let me know your thoughts.

Member
@tigrannajaryan tigrannajaryan Oct 2, 2025

We would like to allow dynamic configuration updates and gradual rollout mechanisms.

The Server controls the rollout of config changes and can do it gradually. There is nothing in the OpAMP spec which says when the Server is supposed to send a remote config change to the Agent. It is a Server decision and the Server can decide to do a gradual rollout using whatever strategy it chooses.

However, if the intent here is to be able to roll out an incremental change (e.g. just an update to a sampling rate - a single numeric value) then I can see how re-sending the entire config can be inefficient.

We have a long-standing issue to support incremental changes that we discussed briefly in the past but did not move forward with. It may be time to revive that.

Let me think a bit about this proposal. My strong preference is to minimize the changes to OpAMP protocol itself, especially changes that serve a relatively narrow use case.

Also, adopting a single configuration format for metric sampling periods would impose a strict standard across all systems.

I don't see that as a downside. To me a standardization of configuration file format for Otel-compatible systems is a benefit.

Author

I see your point about the update. There are a few requirements for this proposal that might not fit with the current configuration model:

  • One, as you said, is the fact that sending the config both ways is not ideal, since these configs can be very large. In this protocol we are proposing splitting the config into smaller pieces and only sending / resending pieces when there are differences between them; this could be done in some way other than the one proposed, but it is almost prohibitive to send the whole config (a rough sketch of the splitting idea follows this list).
  • The other important aspect is that we don't want to send the whole config, but only what the agent is expected to use. IIUC the effective config can be reported back, but there would be limits on how we would request information for additional metrics. One possibility is that whenever a new metric is added or another resource is being monitored, the agent extends its effective config and sends it to the server, which then provides the config, but this still means the whole config would be resent in this event.
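
A minimal sketch of that splitting idea, assuming hypothetical fingerprint fields (none of these are in OpAMP, and they are not necessarily the OTEP's actual definitions), so that unchanged schedules are never resent:

```protobuf
// Hypothetical sketch of an incremental exchange - field names here are
// assumptions for illustration, not the OTEP's actual definitions.
syntax = "proto3";

message Schedule {
  repeated string metric_names = 1;
  uint64 sampling_period_ms = 2;
}

message ScheduleInfoRequest {
  // Only the metrics this agent exports, so the server never has to ship
  // its full server-side configuration.
  repeated string metric_names = 1;
  // Fingerprint of the schedules the agent already holds; if it matches
  // the server's current state, the server can reply with no schedules.
  bytes schedule_fingerprint = 2;
}

message ScheduleInfoResponse {
  // Empty when the fingerprint matched; otherwise only the entries that
  // changed since the fingerprint was computed.
  repeated Schedule changed_schedules = 1;
  bytes new_fingerprint = 2;
}
```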

Member
@tigrannajaryan tigrannajaryan Oct 2, 2025

@jsuereth I think the problem that this proposal wants to solve is a type of configuration problem. If the generic configuration capabilities in OpAMP are not good enough to solve this particular problem, I prefer to spend time fixing/extending the generic OpAMP capabilities so that they can handle this particular use case without the need to implement a specialized new flow and a new set of messages. I suggest that we try to do that. This will allow future similar use cases to be handled without new effort (e.g. think trace sampling). Can we work together on this?

If we don't find a good way to solve this via general-purpose OpAMP config capabilities, I am happy to come back to this special-purpose solution and discuss it, but I think it should be our last-resort second choice.


Contributor

@jsuereth I think the problem that this proposal wants to solve is a type of configuration problem. If the generic configuration capabilities in OpAMP are not good enough to solve this particular problem, I prefer to spend time fixing/extending the generic OpAMP capabilities so that they can handle this particular use case without the need to implement a specialized new flow and a new set of messages. I suggest that we try to do that. This will allow future similar use cases to be handled without new effort (e.g. think trace sampling). Can we work together on this?

Happy to!

I also think this proposal will have direct ties into Sampling configuration, both for traces and logs.

The key use case to think about here is designing an OpAMP server that can know how to control sampling or logging without needing to know the implementations of clients that connect to it, and be certain it's not causing issues or downing SDKs/Collectors by sending bad configs where they aren't understood.

OpAMP already has a lot in place for this, but is just missing a few "connect the dots" pieces (a rough sketch follows the list below):

  • The ability to declare support for specific config formats that are supported by an agent.
  • The ability to request partial configuration, or specific config formats.
  • The interaction of the above system with the existing "full configuration" mode of OpAMP.
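
One possible shape for the first two pieces, sketched as hypothetical additions to existing OpAMP messages (none of these fields exist in the spec; names and numbers are invented for illustration):

```protobuf
// Hypothetical additions - none of these fields exist in the OpAMP spec;
// they only illustrate the "connect the dots" pieces listed above.
syntax = "proto3";

message AgentDescription {
  // ...existing OpAMP fields elided...

  // Config formats/schemas this agent can accept, so a server can send
  // targeted config without hard-coded knowledge of the agent.
  repeated string accepted_config_schemas = 100;  // invented field
}

message AgentToServer {
  // ...existing OpAMP fields elided...

  // Ask the server for only the named config sections instead of the full
  // AgentConfigMap ("partial configuration").
  repeated string requested_config_sections = 100;  // invented field
}
```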

If you compare OpAMP to xDS (Envoy's control plane), you'll see a huge deviation between the two, with Envoy allowing only specific use cases via its control plane. The idea here is that OpAMP is generally more flexible but allows that style of use case within it.

Again the goal here is that an implementation of an OpAMP server could control e.g. metric-reporting, log/span sampling without having to know if it's talking to a specific SDK or Collector. It may not even need (or want) full configuration control over things it talks to, just a key set of use cases.

When looking at the design of OpAMP, it seemed like custom capabilities were the mechanism best suited to this, but I agree that this doesn't answer how the interaction with general configuration would be done.

@tigrannajaryan - Ideally @menderico and I could join the OpAMP SIG to discuss, however the time is not very EU friendly. I've added it to my calendar and will see what I can do but let's continue this discussion offline.

Member
@tigrannajaryan tigrannajaryan Oct 6, 2025

The key use case to think about here is designing an OpAMP server that can know how to control sampling or logging without needing to know the implementations of clients that connect to it, and be certain it's not causing issues or downing SDKs/Collectors by sending bad configs where they aren't understood.

Absolutely agree. That's the philosophy behind OpAMP design and we should stick to this.

The ability to declare support for specific config formats that are supported by an agent.
The ability to request partial configuration, or specific config formats.
The interaction of the above system with the existing "full configuration" mode of OpAMP.

This is a good initial summary. Let's work on this.

We should definitely do this within OpAMP SIG, but if the time does not work, offline is fine too.

Member
@tigrannajaryan tigrannajaryan left a comment

I am blocking this proposal since it requests changes to OpAMP protocol without exploring less invasive alternatives sufficiently.

I am not against the capability, but I am not convinced the proposed solution is the right one.

Let's keep the PR open and discuss options.

@tigrannajaryan
Member

@open-telemetry/opamp-spec-maintainers @open-telemetry/opamp-spec-approvers FYI.

@tigrannajaryan
Member

Also CC @jaronoff97 since you were asking for incremental config updates a while back. This may be the opportunity for us to add that.


By extending the OpenTelemetry OpAmp control plane, we can introduce a
standardized protocol for metric configuration. This enhancement would provide a
common, interoperable language for the backend to instruct clients on how to


Is "client" in this an SDK or a collector? Right now collectors are the only OpAMP clients, which to me limits part of the effectiveness of this proposal. The idea of being able to control this on an SDK level would reach the goals in the proposal, whereas simply doing this in a collector is already possible through OpAMP remote configuration.


I see below that the proposal would involve SDK/API changes; I think it would be useful to specify these changes in the summary.


Comment on lines +32 to +35
batch, sample, and transmit telemetry data. While OpAmp's existing custom
capabilities offer a path for vendor-specific solutions, a standardized
mechanism is essential for the broad adoption and interoperability that
OpenTelemetry champions. This new capability would enable proactive,


I'm not sure what vendor-specific solutions this refers to; OpAMP's existing capabilities for the collector and bridge are both vendor-neutral and allow you to control batching and sampling.

Comment on lines +40 to +41
This protocol extension would unlock several powerful use cases for dynamically
managing telemetry data at the source.


Both of these are possible today via remote configuration.

Contributor

Not quite. We'd like to be able to have remote configuration where we don't need to know the exact details of what agent we're talking to. I.e. we need an abstraction of specific control use cases that we can push to any agent.

Today, OpAMP implementations (in my experience) require hard-coding knowledge of each agent implementation in every server. We need something to break that down, or the cost of supporting new agents grows with every new agent you want to support vs. the cost growing with every use case you want to support.

Comment on lines +123 to +124
For metrics we could extend the current `ScheduleInfoRequest` from
`AgentToServer` in a way that agents could use to request the sampling period


There is no current ScheduleInfoRequest as far as I can tell?

Comment on lines +258 to +260
2. Otherwise, the sampling period provided will then be used to configure
the push metric period **instead of using one set in the code**. Clients
are expected to spread writes over time to avoid overloading a server.


Wouldn't this potentially create a new exporter for every schedule? I would worry about the efficiency of doing this.

Contributor

No, we would not need a new exporter for every schedule.

When metrics scale per-SDK (e.g. even in large-scale Prometheus applications), filtering the metrics sent per batch, or altering batches to have different metrics at different intervals, is a must-have capability. This can be done efficiently and does not imply a new mechanism for export; it can be controlled purely at the "metric reader" layer.

or forwarded to a separate system that does some form of analytics.
2. Each request should have its resource description extended by the
extra_resources provided by the server.
4. Once the client receives the response, it should wait for `next_delay` and


Would these operations be blocking or async? I would worry about them blocking and causing a massive memory backup.

Contributor

This would operate similarly to how Jaeger-Remote-Sampler works today.

These calls are non-blocking, and there is fallback behavior used until the values from OpAmp are returned.

them, but for some systems and metrics it might be preferable to have them
written over not having any data at all.

## Prior art and alternatives


I also think a demo showing this working with an OpAMP client in an SDK would be helpful to understand more of the tradeoffs. Being able to see the difference in dev experience, user experience, performance based on a few of the approaches laid out here would make it easier to evaluate the options.

@jaronoff97

@tigrannajaryan yep, I think incremental updates would be useful here. In my view, the ultimate goal of this is to codify an extra configuration flow layer on top of the existing remote configuration layer. I'm not sure it's necessarily valuable to modify OpAMP to support this, but rather it would be useful to understand this flow as something that exists using OpAMP as an underlying abstraction. I do think there's a lot of value in codifying the flow described; I'm not sure we would need to do that in OpAMP, though.

@jmacd
Contributor
jmacd commented Oct 7, 2025

This topic is welcome for discussion in Sampling SIG.

See open-telemetry/oteps#240, @PeterF778's previous attempt.

For myself, while reviewing Collector extensions for rate-limiting and memory-limiting (e.g., open-telemetry/opentelemetry-collector#9591), I found myself evaluating Envoy's rate limiter configuration with the same rubric I applied to Jaeger Remote sampling and tailsamplingprocessor configuration, https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/tailsamplingprocessor/README.md.

I would be glad to see an OpAmp integration here, where multiple forms of "sampling" configuration can be distributed and applied to multiple signals, especially in the SDKs where users stand to gain the most.
