extmod/uasyncio Event.set() not safe in ISR #5795
I've been thinking about this one a bit and I would like to propose a solution that is broader than just Event.set(). I believe that fundamentally what we need is an event facility (different from uasyncio.Event) […]. Proposal:
In terms of semantics the only guarantee is that after the ISR signals the slot, the callback is executed at least once by the asyncio loop. In particular, the asyncio loop may invoke the callback anytime even without ISR signal (typically this only happens due to a race condition). Pseudo-code implementation:
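A rough sketch of what such a facility could look like (register_slot, signal and run_pending are illustrative names, not an existing API; the fixed flag array is an assumption of this sketch):

```python
# Illustrative sketch of an ISR-signallable "event slot" facility.
# An ISR may only touch the pre-allocated flag array; the registered
# callbacks run later, in the asyncio loop's normal context.
_callbacks = []            # one callback per slot
_flags = bytearray(8)      # fixed capacity, pre-allocated so an ISR never allocates

def register_slot(callback):
    idx = len(_callbacks)  # this sketch supports at most len(_flags) slots
    _callbacks.append(callback)
    return idx

def signal(idx):
    # Safe from an ISR: a single byte store, no allocation, no queue ops.
    _flags[idx] = 1

def run_pending():
    # Called by the asyncio loop on every pass.  A callback may run spuriously
    # (e.g. after racing with signal()), which the stated semantics allow.
    for idx in range(len(_callbacks)):
        if _flags[idx]:
            _flags[idx] = 0
            _callbacks[idx]()
```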
Sample usage with an Event:
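Building on the sketch above, usage could look something like this (the pin number and handler are arbitrary; the loop is assumed to call run_pending() regularly):

```python
# Hypothetical usage: the ISR only signals the slot; evt.set() then runs in
# the scheduler's context when the loop calls run_pending().
import uasyncio as asyncio
from machine import Pin

evt = asyncio.Event()
slot = register_slot(evt.set)        # callback executed by the loop, not the ISR

def pin_isr(pin):                    # hard IRQ handler: just a flag write
    signal(slot)

Pin(0, Pin.IN, Pin.PULL_UP).irq(pin_isr)   # pin 0 is arbitrary for this example

async def waiter():
    while True:
        await evt.wait()
        evt.clear()
        print("event!")
```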
I believe an alternative would be to make |
It certainly is important to think about general ways of interacting with objects from an ISR; however, a solution with a fixed number of slots will always have the problem of either wasting RAM or not providing enough slots. An easy solution (but one that comes with an overhead) for the Events could be something along these lines:
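A sketch of that idea (LinkedEvent and the helper names are illustrative only):

```python
# Sketch of the "link all Events together" idea: set() from an ISR only writes
# a flag, and the scheduler later walks every Event and releases the waiters
# of any Event that was flagged.
_all_events = []

class LinkedEvent:
    def __init__(self):
        self.flag = False          # may be written from an ISR
        self.waiting = []          # tasks blocked in wait()
        _all_events.append(self)   # this reference is what prevents GC reclaim

    def set(self):
        # ISR-safe: no allocation, no task-queue manipulation.
        self.flag = True

def process_set_events(run_queue):
    # Called by the loop after any wake-up: O(number of Events) per pass.
    for ev in _all_events:
        if ev.flag:
            ev.flag = False
            while ev.waiting:
                run_queue.append(ev.waiting.pop())
```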
The only cost of this solution is the performance of iterating over all events after one has been set. Drawback: because the Events are all linked, an Event will never be reclaimed by the garbage collector, since this reference won't go away (unless you introduce an Event.deinit() method, which would then be a weird MicroPython extension to the Event class). |
My argument wasn't about fixed slots but about creating a facility that is easy to use in native modules for purposes other than just uasyncio. TBH, not being able to reclaim Events […]. The more I think about it, the more I'm inclined towards making […]. But maybe dpgeorge already has it all figured out :-) |
The main problem to solve is how to wait for an event, efficiently. How does the async scheduler know that an event has occurred? Currently, without any IO (eg sockets), there can be no events, so the waiting is in such cases just a sleep until the next task should be scheduled. With IO (eg sockets) the waiting is either until the next task must be scheduled, or an IO entity becomes readable/writable/in-error. This is done using select/poll. To handle more general events (eg BLE scan result, pin change, SPI DMA transfer half complete) I can see two main options:

1. make everything act like a stream, so it can be registered with and waited on via select/poll (a sketch of this pattern follows below);
2. invent a new low-level event waiting mechanism (an "event waiter" object) that the loop can block on.
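To make option 1 concrete, a simplified sketch of the select.poll pattern (this is not uasyncio's actual IOQueue; some_stream, dt_ms and the schedule_* helpers are placeholders):

```python
# Simplified sketch of option 1: anything waitable behaves like a stream and
# is registered with select.poll; the loop blocks in poll with a timeout.
import select

poller = select.poll()
poller.register(some_stream, select.POLLIN | select.POLLOUT)

# Block for at most dt_ms, or until a registered object changes state.
for res in poller.ipoll(dt_ms):
    obj, flags = res[0], res[1]
    if flags & select.POLLIN:
        schedule_read_waiters(obj)       # hypothetical: move tasks to the run queue
    if flags & select.POLLOUT:
        schedule_write_waiters(obj)
    if flags & (select.POLLHUP | select.POLLERR):
        schedule_error_waiters(obj)
```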
This has been discussed a bit here: #5332 (comment) |
@dpgeorge, I'm not sure how to interpret your comment, let me try to start from the beginning. Let's take the example of ESP-now so we have something concrete: https://docs.espressif.com/projects/esp-idf/en/release-v4.0/api-reference/network/esp_now.html . The interface is based on callbacks: a send-status callback and a receive callback, registered via esp_now_register_send_cb() and esp_now_register_recv_cb().
The callbacks "run from a high-priority WiFi task" which means it's almost like an interrupt, I wouldn't be surprised if the callbacks actually ran on the PRO core which in some sense is worse than a regular same-core interrupt. Since ESP-now is about send/recv it would be possible to integrate it underneath the select/poll interface. In that case, a pile of C code would have the esp-now callback handlers and would implement the read/write/ioctl needed by the poll and stream interfaces. Synchronization between the ISR-like callback-handlers and the read/write/ioctl stuff would be "an internal implementation problem", presumably solved by using some FreeRTOS queues. Let's call that option 1. Option 2 might be to implement an esp_now python object that has an Option 3 is almost the same as option 2 but uses an A different use-case, such as BLE scan, would not have option 1 because it doesn't map well into read/write but both options 2 & 3 are pretty straight-forward. Am I missing something or perhaps not understanding your concerns? |
Re option 1:
In this particular case of ESP-NOW, how does stream read/write map to the ESP-NOW events? I guess the obvious thing is for the read callback to make the object readable (in the stream sense, with poll/select), and the object is initially writable and once send is called it becomes unwritable until the send callback fires.

So the ESP-NOW could map quite cleanly to the stream paradigm, but it wouldn't be so easy for objects like BLE which don't have a clear read/write interface.

For option 2&3, I understand what you are saying. Regardless of whether it's useful for uasyncio, it would make sense to make […]. But the point I was trying to make previously is: how does the main uasyncio loop know that an event (eg ISR) occurred and that a task was put on to the run queue? Schematically the uasyncio loop does this:

```python
while True:
    dt = get_minimum_time_to_next_waiting_task()
    wait_for_event_or_timeout(dt)  # this may push a task to the front of the queue
    t = pop_next_runnable_task()
    run_task(t)
```

and the question is how wait_for_event_or_timeout() learns that an event occurred.

This polling loop actually works quite efficiently on an ARM-based MCU because it uses WFI in the loop, so essentially waits for any IRQ (which is the only source of an event that could make an object readable/writeable) and then checks all objects again. But that's still not ideal because 1) it's O(N) each loop, where N is the number of registered objects; 2) it can't do a more efficient sleep, it has to keep polling. Ideally it would be something like:

```python
def wait_for_event_or_timeout(dt):
    t0 = ticks_ms()
    while True:
        t_remain = ticks_diff(ticks_add(t0, dt), ticks_ms())
        if t_remain <= 0:
            return  # timeout reached
        machine.lightsleep(t_remain)  # any events we wait on must wake this sleep state
        if event_of_interest_occurred():
            return  # event occurred which we were waiting on
```
|
So I believe there are two parts to this question: a data structure part and a wake-up part. I believe you're really asking about the second. I'm not familiar with the stm32 port, but I have used WFI :-). The steps I expect to see are:
Getting all the above to work right is obviously a bit tricky, but I believe it can be done. Did I miss something or misunderstand the problem? |
I think that's a good summary of what needs to be done. For the current implementation using select.poll, steps 2-6 are effectively what poll does for you. But it only does it for streams, not arbitrary events. That gets to my original comment above #5795 (comment) that either everything is made to act like a stream and be pollable, or a new low-level event waiting mechanism (event waiter object) is invented (and works, for example, like described in steps 2-6 just above). It's a hard problem because various ports like unix, zephyr, esp32, stm32 all have different underlying "operating systems" and support events in different ways.

Going back to the ESP-NOW example, it may not actually suit a stream protocol: for streams that have an error (POLLHUP, POLLERR) the stream is then considered unusable from that point, and all operations that are waiting (both read and write) are woken and raise an exception (OSError). For ESP-NOW that is not desired behaviour, because an error with send does not necessarily mean the connection is broken/lost. Receiving would still work and the next send might work as well. You don't want the reading code to raise an exception just because the send of one packet failed.

Also, a stream being in a readable/writeable state is just that, a state. And polling is about being notified when that state changes. But an event is not a state. An event would be a state change. And one could have a queue of events which is in a state of being empty or non-empty (eg readable or not). But events are not states, and trying to combine those concepts together using poll might get messy.
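To illustrate the "queue of events has a state" observation, a small sketch with a made-up EventQueue class, where non-empty plays the role of "readable":

```python
# Sketch only: discrete events are queued in a pre-allocated ring buffer, and
# the *state* of the queue (empty or not) is what a poll-like wait would test.
class EventQueue:
    def __init__(self, size=16):
        self._buf = [None] * size   # pre-allocated; the ISR side never allocates
        self._r = 0                 # read index (loop side)
        self._w = 0                 # write index (ISR side)

    def push(self, ev):             # called from the ISR / callback side
        nxt = (self._w + 1) % len(self._buf)
        if nxt != self._r:          # drop the event if the queue is full
            self._buf[self._w] = ev
            self._w = nxt

    def readable(self):             # the pollable "state"
        return self._r != self._w

    def pop(self):                  # called from the loop side
        ev = self._buf[self._r]
        self._r = (self._r + 1) % len(self._buf)
        return ev
```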
That means all interrupts/events that could possibly be used in uasyncio must have a way of notifying the loop. So there has to be some kind of registration of what events the loop is interested in. Or in the simple case just have a global "any_event" notification that is signalled by all possible ISRs, and to which anyone can subscribe, and the loop would just go back to sleep if it wasn't interested in any of the events that woke it (this is how WFI would work because WFI wakes on any IRQ).
There's already a simple mechanism to provide asynchronous callbacks on sockets that become readable, grep for |
I would have to agree. WRT ESP-now I would want to refresh my memory of how UDP works with select/poll. ESP-now would have to be similar, except for the reliability aspect (which is a big difference). But in the end, I think the only reason we're looking to shove everything into select/poll is that that's the only thing the inner loop currently checks.
Can you suggest an example use-case to think about or to prototype? Maybe a
You're right, having a thread in an infinite loop calling select/poll would be one option. Yuck... |
I spent some time looking at mp_sched_schedule on the ESP32 […]. This leaves out something like the ESP-Now callback and probably some other communication-related callbacks, but it's probably wiser to push the burden on them to deal with the multi-core aspect internally, e.g., have the ISR use a FreeRTOS primitive to signal a task that runs on the MP core and have that call mp_sched_schedule. An alternative would be to rewrite the scheduler to be multi-core safe using CAS instructions, which I believe is entirely doable.

I also noticed that a bunch of peripherals don't support completion interrupts: I2C, SPI, I2S come to mind. Instead they have a read-with-timeout function. For uasyncio I have the feeling the best is to wrap this with a coro that explicitly polls (a while loop with a read with zero timeout, and an await sleep); see the sketch below.

I believe other ports are simpler 'cause they're single-core and so the atomic sections around the scheduler primitives are entirely sufficient. Ugh, it's getting late here, my brain is fading... So why can't device events in asyncio be solved with mp_sched_schedule? With that […] |
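A rough sketch of that polling coro (dev.read_nonblocking is a placeholder for a zero-timeout read; real peripherals expose this differently):

```python
# Sketch: wrapping a blocking read-with-timeout peripheral in a coro that
# polls with a zero-timeout read and yields between attempts.
import uasyncio as asyncio

async def async_read(dev, nbytes, poll_ms=5):
    while True:
        data = dev.read_nonblocking(nbytes)  # returns None/empty if nothing yet
        if data:
            return data
        await asyncio.sleep_ms(poll_ms)      # let other tasks run between polls
```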
Fully disabling IRQs is quite drastic. There may be some very low-level IRQs which must still run without interruption (and are independent of MicroPython). All that's needed is to disable all IRQs that may run Python code and which may access the event queue. It also probably needs something like […]
The TaskQueue methods also need to be protected against re-entry (eg an IRQ can come in while adding a task to a queue). |
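For illustration only, the general critical-section pattern at the Python level (the real TaskQueue is C code and would use the port's atomic-section macros rather than machine.disable_irq(); push_task_safely and queue.push are placeholders):

```python
# Sketch: guarding a task-queue operation against IRQ re-entry.
import machine

def push_task_safely(queue, task):
    state = machine.disable_irq()   # enter critical section
    try:
        queue.push(task)            # placeholder for the actual queue operation
    finally:
        machine.enable_irq(state)   # always restore the previous IRQ state
```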
If you call Event.set() in an ISR, you could end up losing all tasks that were awaiting the event (e.g. if the ISR is executed while uasyncio is sorting tasks on the queue).
I made a test script a while ago to demonstrate this: https://gist.github.com/kevinkk525/5b89395604da3df3be7a015abcba04fa
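The problematic pattern looks roughly like this (a minimal sketch, not the linked gist; the pin and trigger are arbitrary):

```python
# Minimal sketch of the unsafe pattern: Event.set() called directly from a
# hard IRQ handler can corrupt uasyncio's task queue if the IRQ fires while
# the scheduler is manipulating its queues.
import uasyncio as asyncio
from machine import Pin

evt = asyncio.Event()

def handler(pin):
    evt.set()        # unsafe: may run while the loop is mutating its queues

Pin(0, Pin.IN, Pin.PULL_UP).irq(handler)   # pin 0 is arbitrary for this example

async def main():
    while True:
        await evt.wait()
        evt.clear()
        print("pin event")

asyncio.run(main())
```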