10BC0 OTEP: Allow Entity on InstrumentationScope by jsuereth · Pull Request #4665 · open-telemetry/opentelemetry-specification · GitHub
[go: up one dir, main page]

Skip to content

Conversation

jsuereth
Copy link
Contributor

Alternative to #4316 -

We propose allowing Entity to interact with InstrumentationScope, instead of allowing Resource to be mutable as a forward path for modelling "session" in browsers/devices. Additionally, we believe this will help with multi-tenant telemetry use cases.

E.g. This OTEP proposes the following

const sessionEntity: Entity = discoverSessionEntity();
// SDK has created an isolated meter for this session.
const meterForSession = meterProvider.getMeter("my meter", sessionEntity); 
const latencyForSession = meterForSession.getHistogram("latency", ...);
latencyForSession.record(...); // Record latency against session.

instead of:

const meter = meterProvider.getMeter("my meter");
const latency = meter.getHistogram("latency", ...);
const sessionEntity: Entity = discoverSessionEntity();
// On resource mutation, metrics must be flushed and/or new allocation for all metrics created.
entityProvider.addOrReplace(sessionEntity);  
latency.record(...); // Record latency against session.

- `Resource` *remains immutable*
- Building on [OTEP 264`](0264-resource-and-entities.md), identifying attirbutes
are clearly outlined in Resource going forward, addressing unclear real world usage of Resource attributes ([e,g, identifying attributes in OpAMP](https://github.com/open-telemetry/opamp-spec/blob/main/specification.md#agentdescriptionidentifying_attributes)).
- SDK will be given an explicit initialization stage where `Resource` is not in a complete state, addressing OpenTelemetry JS concerns around async resource detection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this concern is relevant to any case where you need to begin collecting telemetry before an async resource detection is resolved. Even in languages which can block, it may be necessary in situations like lambda functions to begin immediately collecting telemetry without blocking startup in order to prevent adding latency to user applications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - I think it's just IMPOSSIBLE in today's specification where you can't block.

I'll rephrase

Comment on lines +150 to +151
Internally, the entities discovered via resource detection MUST be merged in
the order provided to create the initial resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the order provided does this refer to the order in the detectors list parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - will clarify

### What is the expected impact on Collector components?

How should standard Collector processors (like the attributes processor,
resource processor, or filter processor) be updated to interact with Entities
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also include the transform processor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just examples, but YES - I think anything which interacts with OTTL is in-scope to be looked at.

jsuereth and others added 6 commits September 22, 2025 17:17
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
@jsuereth jsuereth changed the title Allow Entity on InstrumentationScope OTEP: Allow Entity on InstrumentationScope Sep 22, 2025

TODO: **In Progress**

## Future possibilities
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me another future use case would be a way to describe connections between scopes. For instance this instrumentation scope is a continuation of session xyz which is a use case for browser/mobile.

This also applies to cicd where a pipeline task run scope could be a child of the pipeline run scope.

Potentially Having this described in the messages makes it easier for vendors to implement as well as enabling other signals/sigs to leverage it due to the generic approach.

Going forward, these operations (`Get a Logger`, `Get a Meter`, `Get a Tracer`)
will be updated to include an additional optional parameter:

* `entities` (optional): Specifies the `Entity` set to associate with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly that every time an entity changes (including if only descriptive attributes change) you need to re-obtain a new Meter/Tracer? Or do we allow the entity to mutate and that mutation is reflected in subsequent exporting of data exported via existing Meter/Trace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the identity of the entitty is different, it's a different entity. You need a different Meter/Trace by identity.

What to do about descriptive attributes is an open question. I'm inclined to not update them right now, as I think this is what the Entity-Signal should be doing going forward, and Entities used in Resource or scope should only be leveraging descriptive attributes that are "unlikely to change, but valuable".


- Reporting data against "mutable" or "changing" entities, where `Resource`
is currently defined as an "immutable" identity in the data model.
- Providing true multi-tenant capabilities, where, e.g. metrics about one tenant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused by the usage of word "tenant" here. Do you refer to instrumentation libraries as tenants here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, a tenant would be something beyond that.

E.g. let's look at older Java Application Servers. One AppServer may have multiple Applications.

The instrumentation library may be for "jboss" or "tomcat" or whatever the app server is. The tenant would be the application itself. I'd instantiate one scope for each tenant vs. one for the entire app server in this world.

@tigrannajaryan
Copy link
Member

Does this OTEP imply removing EntityRefs from the Resource in proto?

@tigrannajaryan
Copy link
Member

I like this direction.


- Reporting data against "mutable" or "changing" entities, where `Resource`
is currently defined as an "immutable" identity in the data model.
- Providing true multi-tenant capabilities, where, e.g. metrics about one tenant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general pattern for instrumentation libraries is that they accept a *Provider, and create the Meter/Tracer/Logger within the instrumentation library. Having Entity on InstrumentationScope means that multi-tenancy must be implemented in instrumentation libraries, and can't easily be implemented prior to creating the instance of the instrumentation library. IIUC, if I instantiate one instance of a library for each "tenant", I would still need to create a new *Provider for each one.

If instead, we made adding an entity to a MeterProvider to create a "Bound" MeterProvider its own call (e.g. meterProvider.BindEntity(detectedEntity)), then adding an entity could happen outside of or inside of the instrumentation library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an alternative we should consider. I'll add that to options.

This would still give us a "stable" identity for the SDK itself, and instrumentation could more clearly denote what they're reporting against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in Entity SIG and we're going to move this direction, thanks for the suggestion!

### Protocol Details

The OpenTelemetry Protocol (OTLP) will be modified such that
`InstrumentationScope` will support carrying `Entity` objects. There are two
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to involve a data model change? Couldn't we just emit multiple resources?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question. I'll actually add it to open questions.

Pros of data model change:

  • Entity used in Scope can leverage context of Resource (e..g a process in the scope of a host), as entities only have locally-unique ids.
  • We have less repeated data in OTLP
  • We preserve the Resource -> InstrumentationScope -> Signal model for {Signal}Provider -> {Signal} -> {operation} we have in our API today.

Cons of data model change:

  • Many implementations ignore scope today, and its usage for identity is still relatively new to the data model. This would increase the need for them to interact with scope.
  • Entity information would be split between resource + scope. When would two metric streams be the same, e.g.
    • a Resource with Entity A + B and empty Instrumentation scope
    • a Resource with Entity A and Instrumentation Scope with Entity B

I'm not sure I have strong opinion here, but I still lean towards the data model change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also concerned about the additional complexity introduced in this OTEP. From the user’s perspective, recreating entities from EntityRefs is already not straightforward. Adding yet another source to retrieve them creates even more overhead for processing entities in the Collector and reconstructing them on the backend. I would therefore favor a solution that does not require adding EntityRef to the Instrumentation Scope.

Copy link
Contributor
@smith smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few typo suggestions. There's a lot here but it's an understandable summary of the proposal.

- `MeterProvider` MUST treat entity found on InstrumentationScope as identifying,
an aggregate reported events separately by scope. Note: this is the case in
the specification today, however many implementations do not yet respect
InstrumentationScope loose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something got lost here,

however many implementations do not yet respect InstrumentationScope [and] lose...


The `ResourceListener` MUST provide the following operations in the SDK:

* `On Resource Initialize`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this one have a space and the other two do not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably my fat fingers hitting space by accident, will fix :)

@jsuereth
Copy link
Contributor Author

Does this OTEP imply removing EntityRefs from the Resource in proto?

No. EntityRef in resource would interact with Entity on InstrumentationScope, and we need to sort out how.

Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
jsuereth and others added 4 commits September 24, 2025 15:00
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
- Building on [OTEP 264](0264-resource-and-entities.md), identifying attributes
are clearly outlined in Resource going forward, addressing unclear real world usage of Resource attributes ([e,g, identifying attributes in OpAMP](https://github.com/open-telemetry/opamp-spec/blob/main/specification.md#agentdescriptionidentifying_attributes)).
- SDK will be given an explicit initialization stage where `Resource` is not in a complete state, addressing OpenTelemetry JS concerns around async resource detection.
- `InstrumentationScope` is enhanced with `Entity` similar to `Resource`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is an entity on the scope, is there a specific implied relationship to the entities on the resource? In the call Thursday you used “runs-on”, but there could be a more generic way to phrase it that makes it more clear to everyone when you should use a scope attribute? “Nested-within” is sort of what i’m going for, but i can’t think of a good generic relationship.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree there's some term here. "runs-on", "in-context-of" or "nested-within" are all options. Yes - there's some kind of implicit relationship here.

5C7A

Should we consider this a failure or a feature?

For simplicity in modeling, we plan to prototype where this is *disallowed*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an instrumentation creates a scope with an entity, how do they know if there is already that entity on the resource? It may be impossible for the instrumentation author to know if they are following this restriction or not.

For example, an instrumentation which monitors memory usage wants to ensure there is a process entity to report against and calls meter.for(processEntity) but the top-level resource detector may or may not have already added a process entity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the open question.

My naive initial proposal is - that would turn into a failure. We basically have two options:

  1. IT's a failure, we consider that use case problematic
  2. We create a "smart" meter.for(entity), where any conflicts creates a new "identity" (combination of resource + scope) such that the conflicting higher-level is ignored for the lower-level. (So with Resource + Scope, the Scope entity overrides Resource, if we decide to flatten resource as @dashpole suggests, the Meter-level entity would override the global entity when reporting Resource.

I'm actually fine with either, caveat if we go with (2) I'm leaning against having Entity on Scope and instead have a 'layered' Resource in the SDK.

Copy link
Member
@dyladan dyladan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about the meeting yesterday and I want to write down my understanding of what we discussed and how I'm currently thinking about it. Below in no particular order are some of the discussion points we hit.

No entity on scope

Instead of adding entity to InstrumentationScope, we discussed adding it directly to the core resource as a "layer" and reporting multiple resources with their associated telemetry. One possibility for lifetime management is by creating new providers and using the existing shutdown methods.

// this reports metrics against the "core" resource
const meterProvider = new api.metrics.MeterProvider();

// Scope a provider to a session entity
// This "layers" a new resource on top of the core resource using the existing merge rules
const sessionEntity = getEntityForSession();
const sessionMeterProvider = meterProvider.for(sessionEntity);

// TODO: can providers be arbitrarily layered? e.g
// const pageViewMeterProvider = sessionMeterProvider.for(getPageViewEntity());

// Use the scoped provider to report metrics against the session entity
sessionMeterProvider.getMeter('example-meter').createCounter('example_counter').add(1);

// use the core provider to report metrics which are not specific to the session
meterProvider.getMeter('example-meter').createCounter('example_counter').add(1);

Does the provider need to listen to lifetime events?

The current proposal assumes the *Provider receives the EntityInitialized event and does something. It's not clear to me what that something is intended to be. The current specification requires processors to be called synchronously with API method invocations. For example SpanProcessor.onStart() is invoked synchronously with tracer.StartSpan(). Because of this, any spans started before the resource is resolved still need to have some sort of resource attached.

In JS this is solved by allowing individual attributes of the resource to be a Promise<AttributeValue>. It is incumbent upon the processor to ensure all resource attributes are resolved before first export (by awaiting resource.waitForAsyncAttributes(). Accessing attributes before async resource is resolved results in a logged warning. Resources are merged together in the order their detectors are configured. One drawback of the current strategy is that you have to know all possible attribute keys in advance. This could be improved by allowing the key to also resolve asynchronously.


If both above suggestions are accepted, using provider.for() to merge entities and allowing async resource attributes, I believe there may be no need for a ResourceProvider or ResourceInitializer.

  1. Resources are merged in the SDK before being passed to the *Provider constructors, async resource attributes continue to resolve even after this occurs
  2. New entities, including potentially async attribute values, are layered on the resource using provider.for(...).
  3. The export pipeline ensures all resource attributes are resolved before any export. If resource is resolved, wait is a noop. If a new entity was layered on top of the core resource with async attributes, this mechanism continues to work.

This is actually how the JS entity prototype currently works, except that *Provider.for is not yet implemented.

`On Change` registers a `ResourceListener` to be called when a Resource has
been detected or detection has failed.

If the `EntityProvider` is already initialized, then it MUST call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EntityProvider is not defined anywhere?


The SDK is updated to, explicitly, include three new components:

- `ResourceInitializer`: A formalized component that is responsible for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name ResourceInitializer implies it handles only initialization and not the rest of the lifetime. Do we want to consider a name like ResourceManager to leave the door open for future extensions like change/add/remove entities?

type ResourceStatus = 'detecting' 
5C7A
| 'initialized';

type ResourceChangeCallback = (
  resource: Resource,
  status: ResourceStatus
) => void;

interface ResourceProvider {
  readonly status: ResourceStatus;
  onResourceInitialize(callback: ResourceChangeCallback): void;
  onResourceChange(callback: ResourceChangeCallback): void;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0