-
Notifications
You must be signed in to change notification settings - Fork 711
Improve shutdown handling of custom OpenTelemetry tracer #3805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Previously, custom OTLP header extraction was separated from the creation of the top-level request span - we used both a `TraceLayer` and a custom middleware function. We now register a single middleware function, which performs all of the necessarily logic. This removes the need to store data in `request.extesnions` - we can now just use normal function calls.
The shutdown handling is now done in `CustomTracer::drop`. Almost all of the logic around custom tracers is now called from 'tensorzero_tracing_middleware' (including the possible creation of a fresh `CustomTracer`). The 'build_with_context' is now very simple - it just looks up an entry in the context, and delegates `build_with_context` on an already-constructed `SdkTracer`. By performing the shutdown logic in `Drop`, we ensure that we only run the shutdown after all associated `Spans` have already been dropped, avoiding the problem of shutting down the tracer before we've finished executing all of our spans (e.g. if we create a span for the rate-limiting `return_tickets` call, which happens in the background) This also eliminates the need for a moka eviction handler, since dropping the final `Arc<CustomTracer>` from anywhere is now enough to perform a correct shutdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the OpenTelemetry custom tracer handling to improve shutdown behavior by implementing Drop
for CustomTracer
and simplifying the middleware architecture. The key improvement ensures that tracer shutdown only occurs after all associated spans have been dropped, preventing premature shutdown issues.
Key changes:
- Implements
Drop
forCustomTracer
to handle automatic shutdown when the tracer is no longer in use - Consolidates custom tracer logic into a single
tensorzero_tracing_middleware
function that handles both header parsing and tracer creation - Removes the need for cache eviction handlers by using
Arc<CustomTracer>
reference counting
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
tensorzero-core/src/observability/mod.rs | Refactors custom tracer architecture with Drop implementation, new middleware structure, and updated context handling |
gateway/src/main.rs | Updates router to pass tracer wrapper to the new middleware signature |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
//! This is automatically propagated to descendant spans by `tracing-opentelemetry`. Once all of descendant spans are dropped | ||
//! (i.e. all background processing for our request is finished), the `CustomTracer` will get automatically dropped, | ||
//! which triggers shutdown in our `Drop` impl for `CustomTracer` | ||
//! 4. When a span is exported using `TracerWrapper::build_with_context`, we inspect the `Context` for a `CustomTracerContextEntry`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step numbering is inconsistent - there are two steps labeled '4.' in the documentation. The second occurrence should be step '5.'
//! 4. When a span is exported using `TracerWrapper::build_with_context`, we inspect the `Context` for a `CustomTracerContextEntry`. | |
//! 5. When a span is exported using `TracerWrapper::build_with_context`, we inspect the `Context` for a `CustomTracerContextEntry`. |
Copilot uses AI. Check for mistakes.
if let Some(key) = parent_cx.get::<CustomTracerContextEntry>() { | ||
key.inner.inner.build_with_context(builder, parent_cx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context lookup should check for CustomTracerContextEntry
but the variable name suggests it expects a CustomTracerKey
. This appears to be correct since CustomTracerContextEntry
is the new context entry type, but the variable name key
is misleading.
if let Some(key) = parent_cx.get::<CustomTracerContextEntry>() { | |
key.inner.inner.build_with_context(builder, parent_cx) | |
if let Some(entry) = parent_cx.get::<CustomTracerContextEntry>() { | |
entry.inner.inner.build_with_context(builder, parent_cx) |
Copilot uses AI. Check for mistakes.
} | ||
} | ||
fn apply_otel_http_trace_layer(self, otel_tracer: Option<Arc<TracerWrapper>>) -> Self { | ||
// If OpenTelemetry is disable, then we don't need to create extra spans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'disable' to 'disabled'.
// If OpenTelemetry is disable, then we don't need to create extra spans | |
// If OpenTelemetry is disabled, then we don't need to create extra spans |
Copilot uses AI. Check for mistakes.
Depends on #3801 |
The shutdown handling is now done in
CustomTracer::drop
.Almost all of the logic around custom tracers is now called from
'tensorzero_tracing_middleware' (including the possible creation
of a fresh
CustomTracer
).The 'build_with_context' is now very simple - it just looks up
an entry in the context, and delegates
build_with_context
onan already-constructed
SdkTracer
.By performing the shutdown logic in
Drop
, we ensure that we only runthe shutdown after all associated
Spans
have already been dropped,avoiding the problem of shutting down the tracer before we've finished
executing all of our spans (e.g. if we create a span for the
rate-limiting
return_tickets
call, which happens in the background)This also eliminates the need for a moka eviction handler, since
dropping the final
Arc<CustomTracer>
from anywhere is now enoughto perform a correct shutdown.
Important
Improve shutdown handling of custom OpenTelemetry tracer by moving logic to
CustomTracer::drop
and updating middleware.CustomTracer::drop
to ensure shutdown occurs after all spans are dropped.tensorzero_tracing_middleware
now handles most custom tracer logic, including creating newCustomTracer
.build_with_context
to delegate to existingSdkTracer
.Arc<CustomTracer>
triggers shutdown.tensorzero_otlp_headers_middleware
totensorzero_tracing_middleware
.tensorzero_tracing_middleware
now creates top-level span and handles OTLP headers.apply_otel_http_trace_layer
to acceptotel_tracer
parameter inobservability/mod.rs
.observability/mod.rs
.This description was created by
for 6961268. You can customize this summary. It will automatically update as commits are pushed.