-
Notifications
You must be signed in to change notification settings - Fork 26.6k
Fix adaptive metrics decay when provider metrics are not updated #16048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 3.3
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 3.3 #16048 +/- ##
============================================
- Coverage 60.75% 60.73% -0.03%
+ Complexity 11757 11752 -5
============================================
Files 1952 1952
Lines 89012 89012
Branches 13421 13421
============================================
- Hits 54079 54059 -20
- Misses 29367 29382 +15
- Partials 5566 5571 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi maintainers, This PR fixes the adaptive metrics decay issue when provider metrics are not updated and adds a unit test covering the scenario. All checks are green. I’d really appreciate your review when you have time. Thanks! |
|
you'd better add a comparison test to ensure applying this PR also does well under high QPS circumstance. |
|
Hi @zrlw , thanks for the suggestion. |
|
Hi @zrlw , thanks for the suggestion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What is the purpose of the change?
This PR fixes an issue in AdaptiveLoadBalance / AdaptiveMetrics where latency decay behaves incorrectly when provider metrics are not updated for a period of time.
Currently, when no new provider metrics arrive, getLoad() may repeatedly apply the penalty branch or aggressively right-shift lastLatency, which can result in stale or extreme values dominating EWMA. This makes adaptive load balancing unstable, especially in low-QPS or intermittent-update scenarios.
This PR ensures that latency decays safely and progressively instead of collapsing or being stuck at penalty values.
Fixes #15810
What is changed?
1. Improved decay logic in AdaptiveMetrics#getLoad()
2. Added unit test
Added
testAdaptiveMetricsDecayWithoutProviderUpdateVerifies that when provider metrics are not updated:
Why is this needed?
Adaptive load balancing relies on EWMA latency to reflect recent performance trends.
Without this fix:
This change makes adaptive load balancing more stable, realistic, and robust under real-world traffic patterns.
Verifying this change
mvn -pl dubbo-cluster -am testChecklist