8000 Implementing configurable aggregation cardinality limit by ThomsonTan · Pull Request #3624 · open-telemetry/opentelemetry-cpp · GitHub
[go: up one dir, main page]

Skip to content

Conversation

ThomsonTan
Copy link
Contributor
@ThomsonTan ThomsonTan commented Sep 3, 2025

Changes

This resumes the PR #3299.

Please provide a brief description of the changes here.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

PradSenn and others added 7 commits March 11, 2025 17:45
… emitting all views instead of last view

Issues

Cardinality of the Aggregation limit (2000) is hardcoded and no ability to configure as part of the meter/aggregation
Meter stores meter-name to storage in storage_registry_. When multiple views are added, the last view overrides the previous view, and collector can only emit the last view's metrics that are added.

Fixes
Introducing cardinality as configurable parameter in AggregationConfig, and implementing the limit from AggregationConfig in Storage
Changing the storage_registry_ to keep track of view_name to storage, instead of metric_name to storage. So when multiple views are added, different view-names will be collected by collector.
Please provide a brief description of the changes here.
@ThomsonTan ThomsonTan requested a review from a team as a code owner September 3, 2025 21:59
Copy link
codecov bot commented Sep 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.09%. Comparing base (a025766) to head (24ed437).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3624      +/-   ##
==========================================
+ Coverage   90.03%   90.09%   +0.07%     
==========================================
  Files         220      220              
  Lines        7097     7110      +13     
==========================================
+ Hits         6389     6405      +16     
+ Misses        708      705       -3     
Files with missing lines Coverage Δ
...metry/sdk/metrics/aggregation/aggregation_config.h 100.00% <100.00%> (ø)
...telemetry/sdk/metrics/state/async_metric_storage.h 94.60% <100.00%> (+0.16%) ⬆️
...entelemetry/sdk/metric 8000 s/state/attributes_hashmap.h 98.25% <100.00%> (+1.89%) ⬆️
...ntelemetry/sdk/metrics/state/sync_metric_storage.h 88.89% <100.00%> (+0.32%) ⬆️
sdk/src/metrics/state/sync_metric_storage.cc 100.00% <100.00%> (ø)
sdk/src/metrics/state/temporal_metric_storage.cc 97.34% <100.00%> (+0.04%) ⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ThomsonTan ThomsonTan added the pr:please-review This PR is ready for review label Sep 4, 2025
@lalitb lalitb requested a review from Copilot September 5, 2025 06:10
Copy link
@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements configurable aggregation cardinality limits for OpenTelemetry C++ metrics SDK and fixes a collector issue where only the last view was emitted instead of all views. The cardinality limit can now be configured per view through the AggregationConfig class.

Key Changes

  • Added cardinality limit configuration to AggregationConfig with a default value of kAggregationCardinalityLimit
  • Updated metric storage classes to use configurable cardinality limits from AggregationConfig
  • Added comprehensive test coverage for cardinality limit functionality

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
sdk/include/opentelemetry/sdk/metrics/aggregation/aggregation_config.h Added cardinality_limit_ field and constructor to AggregationConfig class
sdk/include/opentelemetry/sdk/metrics/state/attributes_hashmap.h Added hash map pre-allocation when cardinality limit exceeds default
sdk/include/opentelemetry/sdk/metrics/state/sync_metric_storage.h Updated constructor to store AggregationConfig and use configurable cardinality limit
sdk/include/opentelemetry/sdk/metrics/state/async_metric_storage.h Updated to use configurable cardinality limits from AggregationConfig
sdk/src/metrics/state/sync_metric_storage.cc Updated to use cardinality limit from AggregationConfig when creating new AttributesHashMap
sdk/src/metrics/state/temporal_metric_storage.cc Updated merged_metrics creation to use configurable cardinality limit
sdk/test/metrics/sum_aggregation_test.cc Added comprehensive tests for cardinality limit functionality with histogram and counter
sdk/test/metrics/cardinality_limit_test.cc Updated test to use AggregationConfig constructor
sdk/test/metrics/async_metric_storage_test.cc Fixed conditional include directives for ENABLE_METRICS_EXEMPLAR_PREVIEW
ci/do_ci.sh Added test timeout for bazel valgrind tests
CHANGELOG.md Added changelog entry for the new functionality

Copy link
Member
@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow up on the original PR.

A couple of questions:

About the title:

Implementing configurable aggregation cardinality limit and collector emitting all views instead of last view

This change seem to be only about "configurable aggregation cardinality limit",
I don't see the relevance with "collector emitting all views instead of last view",
please clarify.

A lot of code checks:

aggregation_config
  ? aggregation_config->cardinality_limit_
  : kAggregationCardinalityLimit

Would it be simpler to always provide am aggregation config, so that testing for a nullptr is no longer needed, resulting in:

aggregation_config->cardinality_limit_

@ThomsonTan ThomsonTan changed the title Implementing configurable aggregation cardinality limit and collector emitting all views instead of last view Implementing configurable aggregation cardinality limit Sep 8, 2025
@ThomsonTan
Copy link
Contributor Author

Thanks for the follow up on the original PR.

A couple of questions:

About the title:

Implementing configurable aggregation cardinality limit and collector emitting all views instead of last view

This change seem to be only about "configurable aggregation cardinality limit", I don't see the relevance with "collector emitting all views instead of last view", please clarify.

A lot of code checks:

aggregation_config
  ? aggregation_config->cardinality_limit_
  : kAggregationCardinalityLimit

Would it be simpler to always provide am aggregation config, so that testing for a nullptr is no longer needed, resulting in:

aggregation_config->cardinality_limit_

Addressed the feedback:

  1. Removed the emitting all views as suggested by @dbarker.
  2. Created a default AggregationConfig.

// Note: When the cardinality limit is set to n, the attributes hashmap emits n-1 distinct
// attribute sets, plus an overflow bucket for additional attributes. The test logic below is made
// generic to succeed for either n or n-1 total cardinality. If this behavior is unexpected,
// please investigate and file an issue.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the spec, the overflow bucket is included in the cardinality limit, not added on top of it. We can remove this comment, or update accordingly.

{
// Something weird about attributes hashmap. If cardinality is setup to n, it emits n-1
// including overflow. Just making the logic generic here to succeed for n or n-1 total
// cardinality.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be contradicting the earlier comment

 // Note: When the cardinality limit is set to n, the attributes hashmap emits n-1 distinct


  // attribute sets, plus an overflow bucket for additional attributes. The test logic below is made
  // generic to succeed for either n or n-1 total cardinality. If this behavior is unexpected,
  // please investigate and file an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, n-1 is possible for now which is a bug in the metrics SDK, filed an issue (see the other reply) and will fix it in separate PR.

// including overflow. Just making the logic generic here to succeed for n or n-1 total
// cardinality.
EXPECT_GE(cardinality_limit, md.point_data_attr_.size());
EXPECT_LT(cardinality_limit / 2, md.point_data_attr_.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this range check done. can't we get exact ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems there is an issue when the metric collection process which could return 1 less aggregated metric than the configured cardinality. Filed below issue to track it.

#3638

Copy link
Member
@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks good, as this is to introduce configuration for setting cardinality limit. The issue discussed (n-1 limit sometimes) to be handled as separate PR.

Copy link
Member
@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, see minor cleanup

@ThomsonTan ThomsonTan merged commit 4193b21 into open-telemetry:main Sep 11, 2025
66 checks passed
@ThomsonTan ThomsonTan deleted the add_cadinality_limit branch September 11, 2025 16:48
malkia added a commit to malkia/opentelemetry-cpp that referenced this pull request Sep 12, 2025
Implementing configurable aggregation cardinality limit (open-telemetry#3624)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr:please-review This PR is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

0