-
Notifications
You must be signed in to change notification settings - Fork 500
Implementing configurable aggregation cardinality limit #3624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing configurable aggregation cardinality limit #3624
Conversation
… emitting all views instead of last view Issues Cardinality of the Aggregation limit (2000) is hardcoded and no ability to configure as part of the meter/aggregation Meter stores meter-name to storage in storage_registry_. When multiple views are added, the last view overrides the previous view, and collector can only emit the last view's metrics that are added. Fixes Introducing cardinality as configurable parameter in AggregationConfig, and implementing the limit from AggregationConfig in Storage Changing the storage_registry_ to keep track of view_name to storage, instead of metric_name to storage. So when multiple views are added, different view-names will be collected by collector. Please provide a brief description of the changes here.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3624 +/- ##
==========================================
+ Coverage 90.03% 90.09% +0.07%
==========================================
Files 220 220
Lines 7097 7110 +13
==========================================
+ Hits 6389 6405 +16
+ Misses 708 705 -3
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements configurable aggregation cardinality limits for OpenTelemetry C++ metrics SDK and fixes a collector issue where only the last view was emitted instead of all views. The cardinality limit can now be configured per view through the AggregationConfig class.
Key Changes
- Added cardinality limit configuration to AggregationConfig with a default value of kAggregationCardinalityLimit
- Updated metric storage classes to use configurable cardinality limits from AggregationConfig
- Added comprehensive test coverage for cardinality limit functionality
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
sdk/include/opentelemetry/sdk/metrics/aggregation/aggregation_config.h | Added cardinality_limit_ field and constructor to AggregationConfig class |
sdk/include/opentelemetry/sdk/metrics/state/attributes_hashmap.h | Added hash map pre-allocation when cardinality limit exceeds default |
sdk/include/opentelemetry/sdk/metrics/state/sync_metric_storage.h | Updated constructor to store AggregationConfig and use configurable cardinality limit |
sdk/include/opentelemetry/sdk/metrics/state/async_metric_storage.h | Updated to use configurable cardinality limits from AggregationConfig |
sdk/src/metrics/state/sync_metric_storage.cc | Updated to use cardinality limit from AggregationConfig when creating new AttributesHashMap |
sdk/src/metrics/state/temporal_metric_storage.cc | Updated merged_metrics creation to use configurable cardinality limit |
sdk/test/metrics/sum_aggregation_test.cc | Added comprehensive tests for cardinality limit functionality with histogram and counter |
sdk/test/metrics/cardinality_limit_test.cc | Updated test to use AggregationConfig constructor |
sdk/test/metrics/async_metric_storage_test.cc | Fixed conditional include directives for ENABLE_METRICS_EXEMPLAR_PREVIEW |
ci/do_ci.sh | Added test timeout for bazel valgrind tests |
CHANGELOG.md | Added changelog entry for the new functionality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the follow up on the original PR.
A couple of questions:
About the title:
Implementing configurable aggregation cardinality limit and collector emitting all views instead of last view
This change seem to be only about "configurable aggregation cardinality limit",
I don't see the relevance with "collector emitting all views instead of last view",
please clarify.
A lot of code checks:
aggregation_config
? aggregation_config->cardinality_limit_
: kAggregationCardinalityLimit
Would it be simpler to always provide am aggregation config, so that testing for a nullptr is no longer needed, resulting in:
aggregation_config->cardinality_limit_
sdk/include/opentelemetry/sdk/metrics/state/sync_metric_storage.h
Outdated
Show resolved
Hide resolved
Addressed the feedback:
|
// Note: When the cardinality limit is set to n, the attributes hashmap emits n-1 distinct | ||
// attribute sets, plus an overflow bucket for additional attributes. The test logic below is made | ||
// generic to succeed for either n or n-1 total cardinality. If this behavior is unexpected, | ||
// please investigate and file an issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the spec, the overflow bucket is included in the cardinality limit, not added on top of it. We can remove this comment, or update accordingly.
{ | ||
// Something weird about attributes hashmap. If cardinality is setup to n, it emits n-1 | ||
// including overflow. Just making the logic generic here to succeed for n or n-1 total | ||
// cardinality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be contradicting the earlier comment
// Note: When the cardinality limit is set to n, the attributes hashmap emits n-1 distinct
// attribute sets, plus an overflow bucket for additional attributes. The test logic below is made
// generic to succeed for either n or n-1 total cardinality. If this behavior is unexpected,
// please investigate and file an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, n-1 is possible for now which is a bug in the metrics SDK, filed an issue (see the other reply) and will fix it in separate PR.
// including overflow. Just making the logic generic here to succeed for n or n-1 total | ||
// cardinality. | ||
EXPECT_GE(cardinality_limit, md.point_data_attr_.size()); | ||
EXPECT_LT(cardinality_limit / 2, md.point_data_attr_.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this range check done. can't we get exact ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems there is an issue when the metric collection process which could return 1 less aggregated metric than the configured cardinality. Filed below issue to track it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes looks good, as this is to introduce configuration for setting cardinality limit. The issue discussed (n-1 limit sometimes) to be handled as separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, see minor cleanup
Implementing configurable aggregation cardinality limit (open-telemetry#3624)
Changes
This resumes the PR #3299.
Please provide a brief description of the changes here.
For significant contributions please make sure you have completed the following items:
CHANGELOG.md
updated for non-trivial changes