-
Notifications
You must be signed in to change notification settings - Fork 933
OTEP: Allow Entity on InstrumentationScope #4665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- `Resource` *remains immutable* | ||
- Building on [OTEP 264`](0264-resource-and-entities.md), identifying attirbutes | ||
are clearly outlined in Resource going forward, addressing unclear real world usage of Resource attributes ([e,g, identifying attributes in OpAMP](https://github.com/open-telemetry/opamp-spec/blob/main/specification.md#agentdescriptionidentifying_attributes)). | ||
- SDK will be given an explicit initialization stage where `Resource` is not in a complete state, addressing OpenTelemetry JS concerns around async resource detection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this concern is relevant to any case where you need to begin collecting telemetry before an async resource detection is resolved. Even in languages which can block, it may be necessary in situations like lambda functions to begin immediately collecting telemetry without blocking startup in order to prevent adding latency to user applications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True - I think it's just IMPOSSIBLE in today's specification where you can't block.
I'll rephrase
Internally, the entities discovered via resource detection MUST be merged in | ||
the order provided to create the initial resource. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the order provided
does this refer to the order in the detectors
list parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - will clarify
### What is the expected impact on Collector components? | ||
|
||
How should standard Collector processors (like the attributes processor, | ||
resource processor, or filter processor) be updated to interact with Entities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also include the transform
processor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are just examples, but YES - I think anything which interacts with OTTL is in-scope to be looked at.
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
Co-authored-by: Christophe Kamphaus <christophe.kamphaus@gmail.com>
|
||
TODO: **In Progress** | ||
|
||
## Future possibilities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me another future use case would be a way to describe connections between scopes. For instance this instrumentation scope is a continuation of session xyz which is a use case for browser/mobile.
This also applies to cicd where a pipeline task run scope could be a child of the pipeline run scope.
Potentially Having this described in the messages makes it easier for vendors to implement as well as enabling other signals/sigs to leverage it due to the generic approach.
Going forward, these operations (`Get a Logger`, `Get a Meter`, `Get a Tracer`) | ||
will be updated to include an additional optional parameter: | ||
|
||
* `entities` (optional): Specifies the `Entity` set to associate with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand it correctly that every time an entity changes (including if only descriptive attributes change) you need to re-obtain a new Meter/Tracer? Or do we allow the entity to mutate and that mutation is reflected in subsequent exporting of data exported via existing Meter/Trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if the identity of the entitty is different, it's a different entity. You need a different Meter/Trace by identity.
What to do about descriptive attributes is an open question. I'm inclined to not update them right now, as I think this is what the Entity-Signal should be doing going forward, and Entities used in Resource or scope should only be leveraging descriptive attributes that are "unlikely to change, but valuable".
|
||
- Reporting data against "mutable" or "changing" entities, where `Resource` | ||
is currently defined as an "immutable" identity in the data model. | ||
- Providing true multi-tenant capabilities, where, e.g. metrics about one tenant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused by the usage of word "tenant" here. Do you refer to instrumentation libraries as tenants here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, a tenant would be something beyond that.
E.g. let's look at older Java Application Servers. One AppServer may have multiple Applications.
The instrumentation library may be for "jboss" or "tomcat" or whatever the app server is. The tenant would be the application itself. I'd instantiate one scope for each tenant vs. one for the entire app server in this world.
Does this OTEP imply removing EntityRefs from the Resource in proto? |
I like this direction. |
|
||
- Reporting data against "mutable" or "changing" entities, where `Resource` | ||
is currently defined as an "immutable" identity in the data model. | ||
- Providing true multi-tenant capabilities, where, e.g. metrics about one tenant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general pattern for instrumentation libraries is that they accept a *Provider, and create the Meter/Tracer/Logger within the instrumentation library. Having Entity on InstrumentationScope means that multi-tenancy must be implemented in instrumentation libraries, and can't easily be implemented prior to creating the instance of the instrumentation library. IIUC, if I instantiate one instance of a library for each "tenant", I would still need to create a new *Provider for each one.
If instead, we made adding an entity to a MeterProvider to create a "Bound" MeterProvider its own call (e.g. meterProvider.BindEntity(detectedEntity)
), then adding an entity could happen outside of or inside of the instrumentation library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is an alternative we should consider. I'll add that to options.
This would still give us a "stable" identity for the SDK itself, and instrumentation could more clearly denote what they're reporting against.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this in Entity SIG and we're going to move this direction, thanks for the suggestion!
### Protocol Details | ||
|
||
The OpenTelemetry Protocol (OTLP) will be modified such that | ||
`InstrumentationScope` will support carrying `Entity` objects. There are two |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to involve a data model change? Couldn't we just emit multiple resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. I'll actually add it to open questions.
Pros of data model change:
- Entity used in Scope can leverage context of Resource (e..g a process in the scope of a host), as entities only have locally-unique ids.
- We have less repeated data in OTLP
- We preserve the Resource -> InstrumentationScope -> Signal model for
{Signal}Provider -> {Signal} -> {operation}
we have in our API today.
Cons of data model change:
- Many implementations ignore
scope
today, and its usage for identity is still relatively new to the data model. This would increase the need for them to interact with scope. - Entity information would be split between resource + scope. When would two metric streams be the same, e.g.
- a Resource with Entity A + B and empty Instrumentation scope
- a Resource with Entity A and Instrumentation Scope with Entity B
I'm not sure I have strong opinion here, but I still lean towards the data model change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also concerned about the additional complexity introduced in this OTEP. From the user’s perspective, recreating entities from EntityRefs
is already not straightforward. Adding yet another source to retrieve them creates even more overhead for processing entities in the Collector and reconstructing them on the backend. I would therefore favor a solution that does not require adding EntityRef
to the Instrumentation Scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few typo suggestions. There's a lot here but it's an understandable summary of the proposal.
- `MeterProvider` MUST treat entity found on InstrumentationScope as identifying, | ||
an aggregate reported events separately by scope. Note: this is the case in | ||
the specification today, however many implementations do not yet respect | ||
InstrumentationScope loose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something got lost here,
however many implementations do not yet respect InstrumentationScope [and] lose...
|
||
The `ResourceListener` MUST provide the following operations in the SDK: | ||
|
||
* `On Resource Initialize` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this one have a space and the other two do not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably my fat fingers hitting space by accident, will fix :)
No. EntityRef in resource would interact with Entity on InstrumentationScope, and we need to sort out how. |
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
- Building on [OTEP 264](0264-resource-and-entities.md), identifying attributes | ||
are clearly outlined in Resource going forward, addressing unclear real world usage of Resource attributes ([e,g, identifying attributes in OpAMP](https://github.com/open-telemetry/opamp-spec/blob/main/specification.md#agentdescriptionidentifying_attributes)). | ||
- SDK will be given an explicit initialization stage where `Resource` is not in a complete state, addressing OpenTelemetry JS concerns around async resource detection. | ||
- `InstrumentationScope` is enhanced with `Entity` similar to `Resource`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is an entity on the scope, is there a specific implied relationship to the entities on the resource? In the call Thursday you used “runs-on”, but there could be a more generic way to phrase it that makes it more clear to everyone when you should use a scope attribute? “Nested-within” is sort of what i’m going for, but i can’t think of a good generic relationship.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree there's some term here. "runs-on", "in-context-of" or "nested-within" are all options. Yes - there's some kind of implicit relationship here.
|
||
Should we consider this a failure or a feature? | ||
|
||
For simplicity in modeling, we plan to prototype where this is *disallowed* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an instrumentation creates a scope with an entity, how do they know if there is already that entity on the resource? It may be impossible for the instrumentation author to know if they are following this restriction or not.
For example, an instrumentation which monitors memory usage wants to ensure there is a process
entity to report against and calls meter.for(processEntity)
but the top-level resource detector may or may not have already added a process entity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the open question.
My naive initial proposal is - that would turn into a failure. We basically have two options:
- IT's a failure, we consider that use case problematic
- We create a "smart"
meter.for(entity)
, where any conflicts creates a new "identity" (combination of resource + scope) such that the conflicting higher-level is ignored for the lower-level. (So with Resource + Scope, the Scope entity overrides Resource, if we decide to flatten resource as @dashpole suggests, the Meter-level entity would override the global entity when reportingResource
.
I'm actually fine with either, caveat if we go with (2) I'm leaning against having Entity on Scope
and instead have a 'layered' Resource in the SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about the meeting yesterday and I want to write down my understanding of what we discussed and how I'm currently thinking about it. Below in no particular order are some of the discussion points we hit.
No entity on scope
Instead of adding entity to InstrumentationScope
, we discussed adding it directly to the core resource as a "layer" and reporting multiple resources with their associated telemetry. One possibility for lifetime management is by creating new providers and using the existing shutdown
methods.
// this reports metrics against the "core" resource
const meterProvider = new api.metrics.MeterProvider();
// Scope a provider to a session entity
// This "layers" a new resource on top of the core resource using the existing merge rules
const sessionEntity = getEntityForSession();
const sessionMeterProvider = meterProvider.for(sessionEntity);
// TODO: can providers be arbitrarily layered? e.g
// const pageViewMeterProvider = sessionMeterProvider.for(getPageViewEntity());
// Use the scoped provider to report metrics against the session entity
sessionMeterProvider.getMeter('example-meter').createCounter('example_counter').add(1);
// use the core provider to report metrics which are not specific to the session
meterProvider.getMeter('example-meter').createCounter('example_counter').add(1);
Does the provider need to listen to lifetime events?
The current proposal assumes the *Provider
receives the EntityInitialized
event and does something. It's not clear to me what that something is intended to be. The current specification requires processors to be called synchronously with API method invocations. For example SpanProcessor.onStart()
is invoked synchronously with tracer.StartSpan()
. Because of this, any spans started before the resource is resolved still need to have some sort of resource attached.
In JS this is solved by allowing individual attributes of the resource to be a Promise<AttributeValue>
. It is incumbent upon the processor to ensure all resource attributes are resolved before first export (by awaiting resource.waitForAsyncAttributes()
. Accessing attributes before async resource is resolved results in a logged warning. Resources are merged together in the order their detectors are configured. One drawback of the current strategy is that you have to know all possible attribute keys in advance. This could be improved by allowing the key to also resolve asynchronously.
If both above suggestions are accepted, using provider.for()
to merge entities and allowing async resource attributes, I believe there may be no need for a ResourceProvider
or ResourceInitializer
.
- Resources are merged in the SDK before being passed to the
*Provider
constructors, async resource attributes continue to resolve even after this occurs - New entities, including potentially async attribute values, are layered on the resource using
provider.for(...)
. - The export pipeline ensures all resource attributes are resolved before any export. If resource is resolved, wait is a noop. If a new entity was layered on top of the core resource with async attributes, this mechanism continues to work.
This is actually how the JS entity prototype currently works, except that *Provider.for
is not yet implemented.
`On Change` registers a `ResourceListener` to be called when a Resource has | ||
been detected or detection has failed. | ||
|
||
If the `EntityProvider` is already initialized, then it MUST call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EntityProvider is not defined anywhere?
|
||
The SDK is updated to, explicitly, include three new components: | ||
|
||
- `ResourceInitializer`: A formalized component that is responsible for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name ResourceInitializer
implies it handles only initialization and not the rest of the lifetime. Do we want to consider a name like ResourceManager
to leave the door open for future extensions like change/add/remove entities?
type ResourceStatus = 'detecting'
5C7A
| 'initialized';
type ResourceChangeCallback = (
resource: Resource,
status: ResourceStatus
) => void;
interface ResourceProvider {
readonly status: ResourceStatus;
onResourceInitialize(callback: ResourceChangeCallback): void;
onResourceChange(callback: ResourceChangeCallback): void;
}
Alternative to #4316 -
We propose allowing Entity to interact with InstrumentationScope, instead of allowing Resource to be mutable as a forward path for modelling "session" in browsers/devices. Additionally, we believe this will help with multi-tenant telemetry use cases.
E.g. This OTEP proposes the following
instead of: