8000 Fix Zipkin receiver keep-alive flag not being applied to HTTP server by Parship999 · Pull Request #7389 · jaegertracing/jaeger · GitHub
[go: up one dir, main page]

Skip to content

Conversation

Parship999
Copy link
Contributor
@Parship999 Parship999 commented Jul 27, 2025

Which problem is this PR solving?

Description of the changes

  • Issue is the --collector.zipkin.keep-alive flag was not being applied to the HTTP server created by the OpenTelemetry Zipkin receiver. While the flag existed in the collector configuration (options.Zipkin.KeepAlive), it was never used to control the actual HTTP server's keep alive behavior.

  • Root Cause:

    • The OpenTelemetry zipkin receiver creates its HTTP server using confighttp.ServerConfig.ToServer(), but the KeepAlive setting from Jaeger's configuration was stored separately and never applied to the server.
  • My solution:

    • Created a zipkinReceiverWrapper that wraps the OpenTelemetry Zipkin receiver
    • The wrapper applies the keep-alive setting after the receiver starts by accessing the internal HTTP server using reflection and unsafe operations
    • when --collector.zipkin.keep-alive=false, the wrapper calls SetKeepAlivesEnabled(false) on the HTTP server
    • when --collector.zipkin.keep-alive=true(default) then no action takes place and keep alive remains enabled

How was this change tested?

  • added TestZipkinReceiverKeepAlive that verifies the wrapper correctly applies keep-alive settings
  • Build and run the collector with --collector.zipkin.keep-alive=false and --collector.zipkin.keep-alive=true
  • Verified HTTP response headers:
    • With keep-alive=false: HTTP responses include connection: close
    • With keep-alive=true: HTTP connections are kept alive (no connection: close header)
  • Manual testing:
    • Created test that starts collector with different keep-alive settings
    • used curl -v to see the HTTP connection behavior
    • verified that when keep-alive is disabled, connections are closed after each request and when enabled, connections are reused

Checklist

Signed-off-by: Parship Chowdhury <i.am.parship@gmail.com>
@Parship999 Parship999 requested a review from a team as a code owner July 27, 2025 14:48
@Parship999 Parship999 requested a review from jkowall July 27, 2025 14:48
Copy link
codecov bot commented Jul 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.44%. Comparing base (4d6ac93) to head (ffde2eb).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7389      +/-   ##
==========================================
- Coverage   96.44%   96.44%   -0.01%     
==========================================
  Files         375      375              
  Lines       22871    22922      +51     
==========================================
+ Hits        22058    22106      +48     
- Misses        615      617       +2     
- Partials      198      199       +1     
Flag Coverage Δ
badger_v1 9.11% <ø> (-0.10%) ⬇️
badger_v2 1.72% <ø> (-0.02%) ⬇️
cassandra-4.x-v1-manual 11.83% <ø> (-0.13%) ⬇️
cassandra-4.x-v2-auto 1.71% <ø> (-0.02%) ⬇️
cassandra-4.x-v2-manual 1.71% <ø> (-0.02%) ⬇️
cassandra-5.x-v1-manual 11.83% <ø> (-0.13%) ⬇️
cassandra-5.x-v2-auto 1.71% <ø> (-0.02%) ⬇️
cassandra-5.x-v2-manual 1.71% <ø> (-0.02%) ⬇️
elasticsearch-6.x-v1 16.80% <ø> (+<0.01%) ⬆️
elasticsearch-7.x-v1 16.85% <ø> (-0.01%) ⬇️
elasticsearch-8.x-v1 16.99% <ø> (-0.01%) ⬇️
elasticsearch-8.x-v2 1.72% <ø> (-0.02%) ⬇️
elasticsearch-9.x-v2 1.72% <ø> (?)
grpc_v1 10.35% <ø> (-0.11%) ⬇️
grpc_v2 1.72% <ø> (-0.02%) ⬇️
kafka-3.x-v1 9.28% <ø> (-0.10%) ⬇️
kafka-3.x-v2 1.72% <ø> (-0.02%) ⬇️
memory_v2 1.72% <ø> (-0.02%) ⬇️
opensearch-1.x-v1 16.89% <ø> (-0.01%) ⬇️
opensearch-2.x-v1 16.89% <ø> (-0.01%) ⬇️
opensearch-2.x-v2 1.72% <ø> (-0.02%) ⬇️
opensearch-3.x-v2 1.72% <ø> (-0.02%) ⬇️
query 1.72% <ø> (-0.02%) ⬇️
tailsampling-processor 0.47% <ø> (-0.01%) ⬇️
unittests 95.41% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Parship Chowdhury <i.am.parship@gmail.com>
Signed-off-by: Parship Chowdhury <i.am.parship@gmail.com>
Signed-off-by: Parship Chowdhury <i.am.parship@gmail.com>
@Parship999
Copy link
Contributor Author

@yurishkuro could you please take a look and let me know if the approach seems correct?

return errors.New("receiver is nil")
}

receiverValue := reflect.ValueOf(w.Traces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like you're trying to unwrap the zipkin receiver internals and set a flag. Why not just create a PR in OTEL contrib to expose the keepalive (assuming it's not exposed now)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created the pr as you suggested, now waiting for review:
open-telemetry/opentelemetry-collector-contrib#42531

@Parship999 Parship999 marked this pull request as draft July 29, 2025 12:29
@Parship999
Copy link
Contributor Author

@yurishkuro based on your feedback i have created this PR in the core repo:
open-telemetry/opentelemetry-collector#13783

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Zipkin server in the collector no longer uses keep-alive flag
2 participants
0