8000 Fix 'initialRequest' guard might be incorrect in DohResolver and potential 'rollback' on 'lastRequest' among concurrent requests. by LinZong · Pull Request #345 · dnsjava/dnsjava · GitHub
[go: up one dir, main page]

Skip to content

Fix 'initialRequest' guard might be incorrect in DohResolver and potential 'rollback' on 'lastRequest' among concurrent requests. #345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of servi 8000 ce and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

LinZong
Copy link
Contributor
@LinZong LinZong commented Oct 7, 2024

Hi all, thanks for such solid & amazing dns implementation in Java world!
Trying to employ the DoH capability to my project, and found that there might be two hard-to-find issues lying in current DoH implementation.

  1. 'initialRequest' guard might be incorrect if the initial value of 'idleConnectionTimeout' is set to a value larger than current (and later for a period) System.nanoTime().

Related code snippet:

long lastRequestTime = lastRequest.get();
boolean isInitialRequest =
(lastRequestTime < System.nanoTime() - idleConnectionTimeout.toNanos());
if (!isInitialRequest) {
initialRequestPermit.release();
}

Imaging that there is a service running DohResolver on a server booted recently, when DohResolver is about to send the first request, its lastRequest.get() == 0L, idleConnectionTimeout.get() == some user-defined value from constructor or 2mins by default. If idleConnectionTimeout is larger than current System.nanoTime(), the value of boolean guard isInitialRequest will be false, thing goes wrong.

(Eg: someone constructs a DohResolver with unary constructor but its server was booted 1min before. On my Ubuntu 18.04 server with Azul Zulu 8.0.322 JVM, System.nanoTime() is equals to my server's uptime.)
image

To address such case, we may need to introduce a new state variable to indicate the 'initial request' was acutally completed, to ensure the connection pool was created after the first connection was established successfully.

private boolean checkInitialRequest() {
// If initial request haven't been completed successfully yet, just return true.
if (!initialRequestSentMark.get()) {
return true;
}
// Otherwise, check whether such request is happened
// after last successful request plus idle connection timeout.
long lastRequestTime = lastRequest.get();
return (lastRequestTime + idleConnectionTimeout.toNanos() < System.nanoTime());
}

  1. Timestamp variable 'lastRequest' might occur race condition on concurrent DoH request/response.

Object httpClient = getHttpClient(executor);
requestBuilderTimeoutMethod.invoke(requestBuilder, remainingTimeout);
Object httpRequest = requestBuilderBuildMethod.invoke(requestBuilder);
Object bodyHandler = byteArrayBodyPublisherMethod.invoke(null);
CompletableFuture<Message> f =
((CompletableFuture<?>)
httpClientSendAsyncMethod.invoke(httpClient, httpRequest, bodyHandler))
.whenComplete(
(result, ex) -> {
if (ex == null) {
lastRequest.set(startTime);
}
maxConcurrentRequestPermit.release();
if (isInitialRequest) {
initialRequestPermit.release();
}
})

If there are two concurrent DoH requests happend, one with a smaller 'startTime' got the response from server behind of another one with a larger 'startTime', then the value 'lastRequest' will first be set to the large one, then 'bump' to the small one.

We may also need to introduce a CAS get-and-set operation like code snippet below to ensure value in 'lastRequest' field is always monotonic.

private void setLastRequestTime(long startTime) {
long current = lastRequest.get();
// Only update value of 'lastRequest' if timestamp in 'lastRequest' is smaller than incoming 'startTime' value.
if (current < startTime) {
while (!lastRequest.compareAndSet(current, startTime)) {
// CAS failed, re-verify the eligibility of timestamp in 'lastRequest' to be updated to the incoming 'startTime' value.
current = lastRequest.get();
if (current > startTime) {
return;
}
}
}
}

@LinZong LinZong closed this Oct 7, 2024
@LinZong LinZong changed the title Fix 'initialRequest' guard might be incorrect in DohResolver and potential ' Fix 'initialRequest' guard might be incorrect in DohResolver and potential 'rollback' on 'lastRequest' among concurrent requests. Oct 7, 2024
@LinZong LinZong reopened this Oct 7, 2024
@ibauersachs
Copy link
Member

Thanks, I'll need some more time to properly review this.

8000 LinZong and others added 2 commits January 5, 2025 15:26
… of 'idleConnectionTimeout' is set to a value larger than current (and later for a period) System.nanoTime().

2. Fix race condition on setting 'lastRequest' timestamp among concurrent requests, ensuring its value is always be monotonic.
@ibauersachs ibauersachs force-pushed the bugfix/fix_initial_request_guard_condition branch from e1e5c0c to a026ff3 Compare January 5, 2025 14:27
@ibauersachs
Copy link
Member

Thanks for the detailed report and test!

The issue was the simple mistake of not calculating as mandated by the System.nanoTime() documentation.

The test was flaky, I thus changed it to fake the provided time.

@ibauersachs ibauersachs merged commit 9a748a0 into dnsjava:master Jan 5, 2025
14 checks passed
@ibauersachs ibauersachs added this to the v3.6.3 milestone Jan 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0