net: defer self.destroy calls to nextTick#54857
net: defer self.destroy calls to nextTick#54857anilhelvaci wants to merge 5 commits intonodejs:mainfrom
Conversation
|
Review requested:
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #54857 +/- ##
==========================================
+ Coverage 88.46% 88.48% +0.02%
==========================================
Files 703 703
Lines 207368 207540 +172
Branches 39983 40020 +37
==========================================
+ Hits 183438 183639 +201
+ Misses 15933 15871 -62
- Partials 7997 8030 +33
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
nice. thank you and congrats on your first contribution!
This comment was marked as outdated.
This comment was marked as outdated.
Thank you, so excited to become a part of the family! @anonrig What are the next steps then? Do I have to do anything to initiate the merge? I see that my branch is behind a few commits, should I rebase and push again? |
Running the CI and getting it to pass. We'll take of running it, in case there are related failures it would up to you to fix them.
no need unless there are conflicts. |
|
Hey @mcollina , thanks for your reply! So the CI is currently failing. When I look up the details, I see two kinds of failures;
Stack trace shows that my tests timeout for some reason. For this file, I have no idea why this is failing. Here's the stack trace; There are other failing tests but their severity is I rebased my branch on top of the current |
That's a red flag, we don't want to merge flaky tests in our codebase. A timeout probably means there's a race condition somewhere that you forgot to take into account and result in the test never exiting. To try to reproduce the flakiness locally, you can try running: tools/test.py --repeat 9999 -t 2 test/parallel/test-http-client-immediate-error.jsOnce you have a repro, you can attempt getting a fix ready. |
|
Hey @aduh95 , thanks for the clarifying 🙏 Unfortunately the piece of code you suggested did not reproduce the problem for me. I'm on OSX, can you think of anything else that I can try? |
|
@mcollina FYI, this adds overhead to |
|
CI is failing |
|
Yep, it is @mcollina. Any suggestions on how to reproduce it? Any chance I can shell into the machine(or container) facing problems? Or, are there any snapshot/images I can pull into my machine and try to spin it up? In other words, what are the usual steps that you guys take when you face a problem in CI? |
|
In this specific case, it seems that any Linux box on top of any virtualization software would do. For more exotic systems @nodejs/build can provide access. |
4725298 to
fcc49db
Compare
|
Could anyone approve the CI workflow please? I've rebased my work on top of latest main. |
|
@anilhelvaci Done :) |
|
Hey @mcollina @ronag @ShogunPanda , I am very confused because of the ping pong about where the problem lies between Argument for
|
|
@anilhelvaci I'm sorry we are not responsive. Your work and engagement is appreciated and imho quite impressive. However, this is not a priority for me and also quite a complicated problem. I will try to make time for this and give some reasonable guidance. However, right now I just don't have time. |
Thanks a lot @ronag ! |
Wrote tests for the suggested fix in nodejs#48771 (comment) Fixes: nodejs#48771
…ll be squashed later.
e9ce892 to
4ab3376
Compare
Fixes: 48771 fix: revert net.js fix: rollback net changes
4ab3376 to
a69a7e1
Compare
|
IMHO whatever solution we come up with should be changes to the http client, and net should be left untouched, unless it can be shown it's a bug in net that affects other things as well. |
Yep @ronag , the current version leaves If we agree on the solution, I'll rephrase the PR title and commit messages. I think I'm not allowed to update the labels so I'll ask one of the reviewers to change it to |
|
Where can I see that version? |
You can take a look at the Files Changed tab of this PR. @ronag |
|
Now it shows. Certainly in the right direction! |
lib/_http_client.js
Outdated
|
|
||
| if (!err && socket) { | ||
| socket.onImmediateError = function onImmediateError(err) { | ||
| req.immediateErr = err; |
There was a problem hiding this comment.
I believe this should do it 793c722. Haven't been able to find a guide on when to use a k prefix for the symbols but in order keep things consistent still used it.
Failed to start CI⚠ Commits were pushed since the last approving review: ⚠ - net: defer self.destroy calls to nextTick ⚠ - debug: insert console.logs for detecting where the test times out. Wi… ⚠ - chore: try handling the error event onSocket ⚠ - http: use private symbol for immediateError property ⚠ - chore: apply private symbol to onImmediateError and address linting e… ✘ Refusing to run CI on potentially unsafe PRhttps://github.com/nodejs/node/actions/runs/18156288164 |
I believe the Jenkins wasn’t kicked off because there’s a standing change request by @mcollina. I remember @joyeecheung started a job manually once. Don’t know what’s the best thing to do here. Presumably having the change request resolved is the best but there’s some debugging logs left from the previous troubleshooting efforts. I don’t want to remove them yet simply because I’m not sure what to expect from the CI run and asking people to manually trigger Jenkins jobs on behalf of me is not convenient for anybody. cc @ronag |
|
@anilhelvaci you need somebody else to approve the PR to start the job. |
Fixes: #48771
What is the problem being solved?
#48771 Reported
requestobject returned fromhttp.requestmethod cannot catcherrorevents triggered when there’s an immediate failure trying to connect to an address returned from dns lookup.Solution
#51038 implemented changes suggested in #48771 (comment). However the #51038 couldn’t be merged due to lack of tests(#51038 (review)). In this PR, I apply the same fix but with some tests.
Testing Considerations
All
process.nextTick(() => self.destroy())are hit except one. Below is theself.destroy()call that is not hit in these tests provided:node/lib/net.js
Line 1113 in 9404d3a
I am not tagging this PR as "DRAFT" since the piece of code that isn't tested is for a connection timeout case.
@mcollina Please let me know if these tests are sufficient or not.