Possible Async Bug in MultiPart Form Parser #2927
-
@Kludex sorry for the ping, but I have a few people telling me this could be a security issue, since it is a denial-of-service attack vector (it stops the server from accepting new connections). I think that's a bit premature but would appreciate your thoughts. FYI I also tested out a solution using multipart as discussed in this issue. I got a nice performance improvement but still had to use …
-
This doesn't seem like proof of improvement.
No, it doesn't make sense to run CPU-bound code that doesn't release the GIL in a threadpool.
-
Could you please expand on how that's not showing improvement? Being able to process ~2x the number of concurrent requests with the same resources seems significant to me.
Upon further investigation, I think it's less about running it in a threadpool and more about giving the scheduler a place to break in and handle other, more pressing work (like accepting new connections); the sketch below illustrates what I mean. How else would you explain the doubling in handled requests? I'm going to try continuing my replacement of …
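A self-contained toy of that scheduling break, using `asyncio.to_thread` as a stand-in for starlette's `run_in_threadpool` (the timings and names here are illustrative, not from my benchmark):

```python
import asyncio
import time

def cpu_work() -> None:
    time.sleep(0.5)  # stand-in for parser.write() on a large chunk

async def heartbeat() -> None:
    # Plays the role of the event loop accepting new connections.
    for _ in range(4):
        print("loop is alive")
        await asyncio.sleep(0.25)

async def main() -> None:
    hb = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)             # let the heartbeat task start
    cpu_work()                         # synchronous: the heartbeat stalls here
    await asyncio.to_thread(cpu_work)  # awaited: heartbeats keep printing
    await hb

asyncio.run(main())
```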
-
Hello!
I think I might've discovered a possible issue with how starlette parses multi-part form data. It calls python-multipart's `parser.write()` on the main asyncio thread, which can cause issues in a high-concurrency, large-payload environment. I've constructed a minimal reproducible example and walked through what I think a possible solution is. Any feedback would be greatly appreciated, or let me know if I should move this to an issue.
Context
I am working on a data processing service which uses fastapi/starlette as its HTTP framework. The data comes in varying sizes but usually maxes out around 8-10 MB, so to get data into the service I use an `UploadFile` object supplied by the multi-part form parser.
Example
My code can be simplified to the following example:
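A minimal sketch of what that looks like; the route path and handler name here are illustrative rather than taken from my service:

```python
from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/process")
async def process(file: UploadFile):
    # By the time this handler runs, starlette's multipart parser has
    # already consumed the request body into a SpooledTemporaryFile.
    data = await file.read()
    return {"received_bytes": len(data)}
```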
Test Scenario
As a stress test on my HTTP server, I decided to send 1,000 concurrent requests, each with its own 10 MB file. I understand that 1k concurrent requests is way too high for a single fastapi process, and ~700 requests outright failed initially, but the rest of the results were interesting. The client side is sketched below.
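Roughly, the client looked like the following sketch (assuming httpx; the URL and payload are placeholders, not the original benchmark code):

```python
import asyncio
import httpx

PAYLOAD = b"x" * (10 * 1024 * 1024)  # one 10 MB body per request

async def upload(client: httpx.AsyncClient) -> httpx.Response:
    return await client.post(
        "http://localhost:8000/process",
        files={"file": ("data.bin", PAYLOAD)},
    )

async def main() -> None:
    # Raise the connection limit so all requests really run concurrently.
    limits = httpx.Limits(max_connections=1000)
    async with httpx.AsyncClient(timeout=None, limits=limits) as client:
        results = await asyncio.gather(
            *(upload(client) for _ in range(1000)),
            return_exceptions=True,
        )
    failures = sum(1 for r in results if isinstance(r, Exception))
    print(f"{failures} of {len(results)} requests failed")

asyncio.run(main())
```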
Here is a graph with the total request time in orange and the client connect time in blue. As expected, the total time continues to increase with the number of requests; however, I was surprised to see that the connect time was increasing as well.

After profiling the test with `viztracer`, I discovered that the reason client connects were slowing down was that the `MultiPartParser.parse()` function was holding onto the main thread:

*(viztracer profile screenshot)*

IMO this is incorrect and should not be happening. The main loop should be left free to respond to other requests while this parsing happens in the background. Starlette actually already uses `run_in_threadpool` to write multi-part file data into the spooled temporary file, but the core `parser.write()` happens on the main thread, roughly as in the sketch below.
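For reference, a heavily simplified sketch of what `MultiPartParser.parse()` does today; the names follow starlette/formparsers.py, but this is not the verbatim code:

```python
async for chunk in self.stream:
    # CPU-bound multipart parsing runs directly on the event loop:
    parser.write(chunk)
    # Only the resulting file I/O is deferred to a worker thread
    # (UploadFile.write uses run_in_threadpool once rolled to disk):
    for part, data in self._file_parts_to_write:
        await part.file.write(data)
    self._file_parts_to_write.clear()
```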
Proposed solution
My proposal is to call the entire `parser.write(chunk)` in a threadpool instead of just the file writing; this way the main thread is free to do other work. I tested out my changes in this draft PR and got significantly better results: we were able to get double the responses, and the average client connect time was significantly reduced. The only reason some requests still failed is that we ran out of threads for scheduling, though that can be controlled via the anyio thread pool count. The change amounts to something like the sketch below.
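Concretely, the change is on the order of this sketch (again simplified; the real diff lives in the draft PR):

```python
from starlette.concurrency import run_in_threadpool

async for chunk in self.stream:
    # Run the whole parser callback chain in a worker thread so the
    # event loop stays free to accept new connections:
    await run_in_threadpool(parser.write, chunk)
```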
Summary
I think we should merge the changes in this PR to allow starlette to process more multi-part forms in parallel. It will slightly slow down small-form parsing, but I think the scalability and predictability tradeoff is worth it. I'd love to hear any feedback or other possible solutions!