esp32: Webserver Socket stops responding after a minute. #15844
Comments
Hi, I've had the same issue and found no solution for it. Since I was already using a Seeed XIAO ESP32C3, I was able to replace it with an ESP32S3 (also Seeed XIAO), which is a little more expensive but has the same footprint and mostly the same pin configuration. In my investigations I also found that this is independent of whether you use a raw socket or asyncio.start_server(), so don't bother refactoring your code into a different server approach. It "feels" like there is buffer mishandling between the ESP32(C3) and the built-in Wi-Fi module.
I also encounter the same problem on esp32c3, except that instead of one minute, the web server stops responding after a random period of time, anywhere between a few minutes and a day or so. It never resumes responding, so I have to do a hard reset at least once a day. The randomness of this bug makes it very difficult to solve.
Thanks for the clear report and the reproduction. Using the instructions in the original report, with ESP32_GENERIC firmware, IDF 5.2.2 and a recent master MicroPython
This looks like the same root cause as with the issue #12819. It's because TCP sockets are remaining open on the ESP32 server for up to 2 minutes. I tested the fix suggested in #12819 which is to add
I also made a simpler client program in Python to show the same issue, so you don't need to test with a web browser. Just run this on the host PC:
import time
import requests
IP = '<ip of esp32>'
for i in range(1000):
    r = requests.get(f'http://{IP}/*JOY;{i};0;0;0;0;0')
    r.close()
    time.sleep(0.15)
Eventually that will stop sending requests.
The problem does not happen on a PYBD-SF6 (and presumably all other bare-metal lwIP MicroPython boards) because those boards have a very limited amount of static memory allocated to lwIP, and lwIP is then forced to close and reuse sockets that are sitting in the TIME-WAIT state. In particular, in the Ideally we need a way on ESP32 to limit the maximum number of opened PCBs, and force
Hi Damien,
Probably not, because the TCP socket waits in TIME-WAIT state for 2 minutes, after which it is reclaimed. So there's probably enough RAM to allocate a few thousand sockets before they need to be recycled, at which point the 2 minutes from the first sockets has passed and their memory is freed.
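As a rough back-of-the-envelope check of that estimate, the sketch below assumes the 0.15 s delay used by the test client earlier in this thread and the 2-minute TIME-WAIT period mentioned above. It ignores the round-trip time of each request, so the result is only an upper bound, and the function name is made up for illustration.

```python
# Rough upper bound on how many sockets can sit in TIME-WAIT at the same time.
TIME_WAIT_SECONDS = 120   # 2 minutes, as stated in this thread
REQUEST_INTERVAL = 0.15   # time.sleep(0.15) in the test client (assumed pacing)


def max_sockets_in_time_wait(time_wait_s, interval_s):
    """Connections closed during one TIME-WAIT period (hypothetical helper)."""
    return round(time_wait_s / interval_s)


print(max_sockets_in_time_wait(TIME_WAIT_SECONDS, REQUEST_INTERVAL))  # prints 800
```

With this client, then, at most about 800 sockets would be waiting at any moment, well below "a few thousand".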
Probably the same issue, or part of it: #14421
This comment was marked as duplicate.
This comment was marked as outdated.
I've hidden the comments above because I was reading the Wireshark output wrong, my mistake! The underlying problem is the same as described here. The conditions are:
What happens then is that the client sends TCP SYN, the server sends TCP ACK, the client recognises the ACK is invalid because it belongs to the old connection and sends TCP RST, but the server ignores this (as required by RFC 1337). It will stay stuck like this until after the 2 minutes expires, although the client usually times out first.
Reproduce without MicroPython
I can reproduce this in C if I modify the tcp_server example to close the socket after it sends the first response back to the client.
ESP-IDF server patch:
diff --git i/examples/protocols/sockets/tcp_server/main/tcp_server.c w/examples/protocols/sockets/tcp_server/main/tcp_server.c
index 5f1c284960..b4ac5c97b3 100644
--- i/examples/protocols/sockets/tcp_server/main/tcp_server.c
+++ w/examples/protocols/sockets/tcp_server/main/tcp_server.c
@@ -59,6 +59,7 @@ static void do_retransmit(const int sock)
                 to_write -= written;
             }
         }
+        break;
     } while (len > 0);
 }
Client program to repro with tcp_server:
#!/usr/bin/env python
import socket
import time
HOST="10.42.0.150" # <-- Server IP goes here
PORT=3333
while True:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)  # so fails when bug triggers
    s.connect((HOST, PORT))
    s.send(b"ABCD")
    print(time.time(), s.recv(4))
    s.close()
Workarounds
I can't find a perfect workaround, but I think any of these would work:
Fixes?
Linux and OpenBSD servers don't have this problem because if they see an out-of-sequence SYN for a TIME-WAIT socket then they allow it as a new connection. OpenBSD code, Linux code. We could request LWIP support be added to do the same thing. The issues I can see are:
A simpler/quicker "fix" would be to immediately clean up the socket if an RST is received in TIME-WAIT (i.e. change this bit). However this does open the possibility of "TIME-WAIT assassination", trading one bug for a worse one in some cases (old data arrives on new socket 🙀).
Isn't RFC 1337 saying that immediately cleaning up the socket if an RST is received in TIME-WAIT is a fix for TWA, rather than 'opening up the possibility of TWA' as you said?
My understanding is that TIME-WAIT assassination is when the TCP stack cleans up the socket too early (the bad effect being that after this, additional out-of-band packets from the old socket might be delivered to a new socket). The fix of ignoring RST means ignoring the RST packet and doing nothing, rather than cleaning up the socket early.
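To make the three behaviours discussed above easier to compare, here is a toy model in Python. It is only a sketch: the policy names, classes and return strings are invented for illustration, the sequence check is a plain integer comparison rather than real modular TCP arithmetic, and real stacks (Linux, OpenBSD, lwIP) consider more state, such as timestamps. It also assumes the client retransmits its SYN after sending the RST.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: str    # "SYN" or "RST" in this toy model
    seq: int = 0


@dataclass
class TimeWaitSocket:
    rcv_nxt: int  # next sequence number expected on the old connection


def handle_in_time_wait(sock, seg, policy):
    """Action taken for a segment that matches a TIME-WAIT socket.

    policy (hypothetical labels):
      "ignore_rst"       - behaviour described for the ESP32 server in this thread
      "accept_new_syn"   - Linux/OpenBSD-like: a SYN beyond the old window is a new connection
      "rst_frees_socket" - the quick "fix": an RST drops the TIME-WAIT socket
    """
    if seg.flags == "SYN":
        # Assumption: "out-of-sequence" is modelled as seq beyond rcv_nxt.
        if policy == "accept_new_syn" and seg.seq > sock.rcv_nxt:
            return "new_connection"
        return "ack_old_connection"
    if seg.flags == "RST":
        if policy == "rst_frees_socket":
            return "socket_freed"  # risks TIME-WAIT assassination
        return "ignored"
    return "ack_old_connection"


def connect_attempt(sock, syn_seq, policy):
    """Outcome for a client that reuses the source port of the old connection."""
    if handle_in_time_wait(sock, Segment("SYN", syn_seq), policy) == "new_connection":
        return "connected"
    # The client sees an ACK for the old connection and replies with RST.
    if handle_in_time_wait(sock, Segment("RST"), policy) == "socket_freed":
        return "connected on SYN retransmit"
    return "stuck until TIME-WAIT expires"


old = TimeWaitSocket(rcv_nxt=1000)
for policy in ("ignore_rst", "accept_new_syn", "rst_frees_socket"):
    print(policy, "->", connect_attempt(old, 5000, policy))
```

The model only captures the control flow. It does not show the downside of "rst_frees_socket", which is that delayed data from the old connection could then reach the new one.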
One more contributing factor to this is that ESP-IDF doesn't limit the number of sockets in TIME-WAIT. The config option that should limit this is not effective (see espressif/esp-idf#9670). If that bug was fixed then this problem would happen much less often because TIME-WAIT sockets would be cleaned up more rapidly when there are a lot of short lived connections. However, I think the problem might still happen sometimes - in my packet captures sometimes Linux reuses the same source port very quickly, for whatever reason.
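The effect of such a limit can be illustrated with a small Python model. This is a sketch under stated assumptions, not lwIP or ESP-IDF code: the class, its method names, the cap of 4 and the client address are all hypothetical, and the only behaviour modelled is "when the cap is reached, the oldest TIME-WAIT entry is dropped".

```python
from collections import deque


class PcbPool:
    """Toy connection table with a hard cap on TIME-WAIT entries (hypothetical)."""

    def __init__(self, max_time_wait):
        if max_time_wait < 1:
            raise ValueError("max_time_wait must be at least 1")
        self.max_time_wait = max_time_wait
        self.time_wait = deque()  # oldest entry on the left

    def close(self, conn):
        """Put a closed connection into TIME-WAIT; return the evicted entry, if any."""
        evicted = None
        if len(self.time_wait) >= self.max_time_wait:
            evicted = self.time_wait.popleft()
        self.time_wait.append(conn)
        return evicted

    def blocks(self, conn):
        """True if a new SYN from this (ip, port) would hit a TIME-WAIT entry."""
        return conn in self.time_wait


pool = PcbPool(max_time_wait=4)
for port in range(50000, 50006):
    pool.close(("10.42.0.1", port))  # made-up client address

print(pool.blocks(("10.42.0.1", 50000)))  # oldest entry has already been evicted
print(pool.blocks(("10.42.0.1", 50005)))  # most recent entry is still held
```

A source port that was used long ago is no longer held, but one reused immediately after closing still is, which matches the caveat about Linux sometimes reusing a port very quickly.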
Candidate fix in the linked PR, and I've posted some builds there if anyone is able to test: #15952 (comment)
Port, board and/or hardware
esp32 port, ESP32 and ESP32C3
ESP32 (lolin32-Lite, esp32 devkit v1) and ESP32C3 (esp32c3 supermini)
Both boards have the same issue.
MicroPython version
MicroPython v1.23.0 on 2024-06-02; ESP32C3 module with ESP32C3
In MicroPython v1.20.0 and older, the code works fine without issues:
MicroPython v1.19.1 works OK
MicroPython v1.20.0 works OK
MicroPython v1.21.0 NOT working
MicroPython v1.22.0 NOT working
MicroPython v1.23.0 NOT working
Reproduction
Copy the contents of the zip file to your ESP device. Set the correct SSID and password, then reset your board.
issue.zip
The web server serves index.html, which cyclically sends requests to the server, and the server answers back.

In the terminal you can see the positions of the virtual joysticks on the webpage.
After about a minute, the server stops responding. There is enough free RAM and I don't know where the problem is; everything works correctly on older versions.
There is another version of the web server available in the Zip archive, without the use of threads, but the problem is exactly the same.
Expected behaviour
I expect that the server will behave the same in all versions. It should respond indefinitely to web browser requests.
Observed behaviour
In Chrome, using developer tools, I check the responses from the server. After 1-5 minutes, the server stops responding and sometimes restarts.

Additional Information
The web server must run in the background, without blocking the REPL and WebREPL.
Code of webserver with threads:
Code of webserver without threads: