-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
CircuitPython version
Adafruit CircuitPython 9.0.0-alpha.1-50-gaa0d7aad83 on 2023-09-28; M5Stack Timer Camera X with ESP32
Code/REPL
https://github.com/bill88t/CircuitPython_FTP_Server, current master 53224fdd20302d4005054011b51a45591f3ac0fd loaded onto the board
code.py:
import wifi
from socketpool import SocketPool
from ftp_server import ftp
from supervisor import reload
if not wifi.radio.connected:
print("No wifi")
reload()
pool = SocketPool(wifi.radio)
ftps = ftp(pool, str(wifi.radio.ipv4_address), verbose=True)
try:
ftps.serve_till_quit()
except KeyboardInterrupt:
pass
finally:
ftps.deinit()
reload()
Behavior
When making a full filesystem dump (237 files), PASV fails with the client never connecting to the data socket.
Well what does that have to do with the core?
Well, first of all, it does work on 8.x without a single error or retry, same settings, same everything.
Second, the failure has to do with the sockets and listen
.
Debugging the failure led me to the conclusion that if a client connects very quickly after .listen(..)
is run, the client may be accepted by the networking stack, but stored nowhere and so .accept
times out.
When this bug happens, .accept()
will fail by timeout, raising it's regular OSError
, as if no client is waiting.
However the client is connected and waiting.
It is not rejected like when the socket is closed or no more connections are permitted.
Description
So this is some sort of race condition where a client connection comes in at just the wrong time to be accepted by the code but not stored anywhere.
FTP relies on making a new socket for every transfer and directory listing. For that reason we end up making ~300 sockets/connections for a full dump of 237 files.
It would probably be possible to make a simplier example, since we only need to spam (yes, around 7 connections / second) and testing if the connection object is not None
.
This issue has already received a workaround on the current (ftp) master, so it's not really a world ending bug.
However it's still an issue that could conceivably appear in prod for a single connection, assuming bad enough luck (timing).
FTP just happens to roll a broken dice enough times to make it break almost 100%.
Additional information
If anybody intends on actually going through the ftp code:
line 645: self._data_socket, self._client_pasv = self._pasv_sock.accept()
ends up returning None
forever while this bug happens.
The timeout will close the socket and tell the client the file transfer failed (message 25).
The client will then re-request a new socket which will work.