py/schedule: Convert micropython.schedule() to circular buffer. by andrewleech · Pull Request #4617 · micropython/micropython · GitHub

py/schedule: Convert micropython.schedule() to circular buffer. #4617


Closed · wants to merge 1 commit from the fifo_scheduler branch

Conversation

andrewleech (Contributor)

This means the schedule operates in a first-scheduled, first-executed manner rather than the current last-scheduled, first-executed one.

Benefit:
When the application or an interrupt schedules multiple functions, they are guaranteed to run in the order they were scheduled.

Currently, in a busy system that isn't sleeping very often, if multiple functions or interrupts schedule tasks before they can be serviced, the tasks will be run out of order, potentially causing confusion or unexpected side effects.

Costs:

  • This costs 32 bytes of extra flash on my build (-Os or -Og), with no extra RAM usage.
  • runtime.h/mp_sched_num_pending() is now a uint8_t subtraction rather than directly reporting a uint16_t.
  • The implementation requires MICROPY_SCHEDULER_DEPTH to be power-of-2 sized. This is enforced with a MP_STATIC_ASSERT (a minimal sketch of the masking this enables follows below).
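A minimal sketch of what that enables, assuming illustrative names (DEPTH stands in for MICROPY_SCHEDULER_DEPTH, and C11's static_assert for MP_STATIC_ASSERT):

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 8 // stand-in for MICROPY_SCHEDULER_DEPTH

// A power of 2 has exactly one bit set, so this rejects bad depths at
// compile time, much like the MP_STATIC_ASSERT mentioned above.
static_assert((DEPTH & (DEPTH - 1)) == 0, "DEPTH must be a power of 2");

static inline uint8_t idx_wrap(uint8_t i) {
    return i & (DEPTH - 1); // a single AND, instead of a modulo or a branch
}
```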

pi-anl force-pushed the fifo_scheduler branch 2 times, most recently from a8bc6a2 to 0949535 on March 20, 2019 20:56
andrewleech (Contributor, Author)

Unit tests for the win!

For reference, the current circular buffer implementation follows: https://www.snellman.net/blog/archive/2016-12-13-ring-buffers/
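For readers of that post, a hedged, self-contained restatement of the scheme in C (illustrative names, not the PR's code): head and tail run freely as uint8_t counters and wrap naturally at 256; indices are masked only when the array is touched, and the pending count is the unsigned subtraction mentioned in the cost list above.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEPTH 8 // power of 2 <= 256, so it divides the uint8_t range evenly

typedef struct { void (*func)(void *); void *arg; } entry_t;

static entry_t buf[DEPTH];
static uint8_t head, tail; // free-running; head == tail means empty

static inline uint8_t num_pending(void) {
    return head - tail; // unsigned subtraction is correct across wrap-around
}

static bool push(void (*func)(void *), void *arg) {
    if (num_pending() == DEPTH) {
        return false; // full
    }
    buf[head & (DEPTH - 1)] = (entry_t){func, arg};
    ++head; // masked only on access, never reset
    return true;
}

static bool pop(entry_t *out) {
    if (num_pending() == 0) {
        return false; // empty
    }
    *out = buf[tail & (DEPTH - 1)];
    ++tail;
    return true;
}
```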

andrewleech (Contributor, Author)

As much as I personally think my current implementation is quite elegant and efficient, I think it would be better migrated to an "index and depth" scheme rather than "head and tail", as that would change the mp_sched_num_pending function back to a simple report of a value. This function is run at high frequency by the VM, so efficiency there is most important. Pushing the arithmetic overhead into the push/get functions makes sense, as they're run far less often.
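A hedged sketch of that "index and depth" variant (again with illustrative names): the pending count lives directly in len, so the analogue of mp_sched_num_pending is a plain load, and the wrap arithmetic moves into the rarer push/pop:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEPTH 8 // power of 2

typedef struct { void (*func)(void *); void *arg; } entry_t;

static entry_t buf[DEPTH];
static uint8_t idx; // position of the oldest pending entry
static uint8_t len; // number of pending entries

static inline uint8_t num_pending(void) {
    return len; // simple report of a value, no subtraction
}

static bool push(void (*func)(void *), void *arg) {
    if (len == DEPTH) {
        return false; // full
    }
    uint8_t iput = (idx + len++) & (DEPTH - 1); // wrap with a mask
    buf[iput] = (entry_t){func, arg};
    return true;
}

static bool pop(entry_t *out) {
    if (len == 0) {
        return false; // empty
    }
    *out = buf[idx];
    idx = (idx + 1) & (DEPTH - 1);
    --len;
    return true;
}
```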

dpgeorge (Member)

Thanks for the patch. It looks like this would be a sensible change: since there's no concept of priority in the scheduling, a fair way to run the callbacks is FIFO rather than the current LIFO.


I was actually considering some more fundamental changes to how the scheduling allocates slots. Right now it's possible to lose scheduled events if the schedule stack/buffer is full, which is not good and in certain cases would lead to very bad system behaviour. E.g. a UART RX IRQ for an incoming character might not be able to schedule a callback, and the Python code would then wait forever for a notification that there is a character, when in fact there is one waiting.

To fix this I was thinking to more closely mirror how hardware IRQs work: each source that can schedule something has its own dedicated slot to store a pending scheduled callback, and the set of pending callbacks is a linked list (or similar) of all slots that are pending. So when an event needs to be scheduled for a given source, it just links its slot into the main list. This means 1) every source can always schedule a pending callback, which is then guaranteed to be executed; 2) multiple triggers for a given source that happen before the first one is handled (e.g. lots of incoming UART chars) only ever lead to a single callback into Python (the callback may then have multiple flags/events to respond to).

This scheme could be in addition to how it's currently done, so that users can still schedule their own callbacks via micropython.schedule().
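For concreteness, a hedged sketch of how such per-source slots might look (purely illustrative names; this is not code from the PR or the repo):

```c
#include <stddef.h>

typedef struct _slot_t {
    struct _slot_t *next;               // link in the pending list
    void (*callback)(struct _slot_t *); // per-source callback
    volatile int pending;               // nonzero while linked in
} slot_t;

static slot_t *pending_head; // list of all slots with a pending event

// Called by an IRQ source to schedule its callback; assumes interrupts
// are masked (or an equivalent critical section) around the list update.
static void slot_schedule(slot_t *slot, void (*cb)(slot_t *)) {
    if (!slot->pending) {
        slot->pending = 1;
        slot->callback = cb;
        slot->next = pending_head; // LIFO link-in, for brevity only
        pending_head = slot;
    }
    // already pending: repeated triggers coalesce into one callback
}

// Drained later from the VM's pending-event handling.
static void slot_run_pending(void) {
    while (pending_head != NULL) {
        slot_t *slot = pending_head;
        pending_head = slot->next;
        slot->pending = 0; // clear first, so the callback can re-schedule
        slot->callback(slot);
    }
}
```

Because every source owns its slot, scheduling can never fail for lack of buffer space, which is the guarantee described above.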

dpgeorge (Member)

> This function is run at high frequency by the VM, so efficiency there is most important.

Actually it's not run at high frequency: it was designed so that only a single variable (namely MP_STATE_VM(sched_state)) needs to be tested in the VM for any pending event, like an async exception or a scheduled callback. So it's OK for this function to be slightly less efficient, although it'd still be good to do some benchmarking to be sure.
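A hedged sketch of that design (illustrative names): the VM's hot path is a single load-and-compare, and anything slower only runs on the rare pending path:

```c
typedef enum { SCHED_IDLE, SCHED_PENDING } sched_state_t;

static volatile sched_state_t sched_state = SCHED_IDLE;

static void run_scheduled_callbacks(void) {
    // drain the queue here (omitted), then clear the flag
    sched_state = SCHED_IDLE;
}

// Called between bytecodes: one variable test in the common case, so
// the cost of a num_pending()-style helper stays off the hot path.
static inline void vm_poll_events(void) {
    if (sched_state == SCHED_PENDING) {
        run_scheduled_callbacks();
    }
}
```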

andrewleech (Contributor, Author)

I definitely like the sound of interrupt slots for scheduling tasks; it would certainly be safer and easier to manage for most interrupt uses.

I've just pushed a rework with idx & length, so there's no speed hit to mp_sched_num_pending anymore. It's also shrunk, somehow: now it only seems to be 16 bytes bigger on flash than the existing LIFO scheme.

py/scheduler.c (outdated)

```diff
@@ -30,6 +30,17 @@

 #if MICROPY_ENABLE_SCHEDULER

+static inline bool mp_sched_full(void) {
+    // MICROPY_SCHEDULER_DEPTH must be a power of 2
```
dpgeorge (Member)

This is a new restriction, although probably a fair one to impose (existing uses in the repo are a power of 2).

andrewleech (Contributor, Author) · Mar 21, 2019

That was a hard requirement of the previous head/tail rolling approach, and of the masking used in this newer approach... but it's not really necessary in an idx/len algorithm, so it would be just as easy to remove again, I'd say.

Actually, scratch that. Keeping the restriction and using a mask for overflow handling is smaller; going to a "(idx == MICROPY_SCHEDULER_DEPTH) -> idx = 0" style check adds an extra 32 bytes to the flash size.
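The two wrap styles being weighed, side by side in a hedged sketch (illustrative names):

```c
#include <stdint.h>

#define DEPTH 8 // stand-in for MICROPY_SCHEDULER_DEPTH

// Mask style: one AND, but requires DEPTH to be a power of 2.
static inline uint8_t wrap_mask(uint8_t i) {
    return i & (DEPTH - 1);
}

// Compare-and-reset style: works for any DEPTH, but costs a compare
// plus branch, which is where the extra flash reported above goes.
static inline uint8_t wrap_check(uint8_t i) {
    return (i == DEPTH) ? 0 : i;
}
```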

py/scheduler.c (outdated)

```diff
-    MP_STATE_VM(sched_stack)[MP_STATE_VM(sched_sp)].func = function;
-    MP_STATE_VM(sched_stack)[MP_STATE_VM(sched_sp)].arg = arg;
-    ++MP_STATE_VM(sched_sp);
+    uint8_t iput = MP_STATE_VM(sched_idx) + MP_STATE_VM(sched_len)++;
```
dpgeorge (Member)

I think this needs some &-logic to wrap it around.

andrewleech (Contributor, Author)

Yep, sorry, pushed a broken version.

This means the schedule operates in a first-in, first-executed manner rather than the current last-in, first-executed.
andrewleech (Contributor, Author)

This version (928c63c) should be correct and safe.
I do have another version without the power-of-2 restriction, but it's bigger and likely fractionally slower.

pfalcon (Contributor) · Mar 21, 2019

> each source that can schedule something has its own dedicated slot to store a pending scheduled callback, and the set of pending callbacks is a linked list (or similar) of all slots that are pending. So when an event needs to be scheduled for a given source, it just links its slot into the main list.

This is exactly how I wanted to implement uselect.poll() support for modlwip when working on the esp8266 port, i.e. refactoring the current awful busy-polling scheme of the baremetal ports to use a real wait queue. That would also allow integrating POSIX OS-level poll and MicroPython's native poll in a decent way. So, no need for all this MicroPython-specific ad-hockery with "scheduled callbacks", which, while synchronous at the C level, are still async at the Python level, and thus confusing. It can all be implemented within the scope of the proven and composable/scalable POSIX poll() model.

dpgeorge (Member)

OK, this is good as it is; merged in 8977c7e (with a minor addition to do a static assert on MICROPY_SCHEDULER_DEPTH fitting in 8 bits).

dpgeorge closed this on Mar 26, 2019
dpgeorge (Member)

> So, no need for all this MicroPython-specific ad-hockery with "scheduled callbacks", which, while synchronous at the C level, are still async at the Python level, and thus confusing. It can all be implemented within the scope of the proven and composable/scalable POSIX poll() model.

I'm not sure I understand this idea: do you mean to remove the ability to have a Python callback run on an interrupt/IRQ? E.g. a Timer timeout or a Pin level change.

pfalcon (Contributor) · Mar 26, 2019

> I'm not sure I understand this idea: do you mean to remove the ability to have a Python callback run on an interrupt/IRQ?

No, I don't mean removing anything; how could such an idea pop up from the comment I wrote? ;-) What I mean is to stop digging that hole further, and to think of more general and composable means of monitoring for events (an interrupt is nothing but an event). And such means exist in Python (and thus MicroPython): the (u)select.poll object. But its implementation for the baremetal ports isn't efficient enough, and that's where effort would be better invested.

dpgeorge (Member)

> No, I don't mean removing anything; how could such an idea pop up from the comment I wrote?

Because you used the words "no need for", and "it can all be implemented within the scope of".

> What I mean is to stop digging that hole further, and to think of more general and composable means of monitoring for events

There's no need to monitor for an interrupt: when it comes, it triggers the handler/callback directly. That's the most efficient way to do it.

> And such means exist in Python (and thus MicroPython): the (u)select.poll object.

How in this scheme does something like a pin interrupt call a callback (asynchronously)?

pfalcon (Contributor) · Mar 26, 2019

> Because you used the words "no need for", and "it can all be implemented within the scope of".

"No deed for" implementation of that "slotted schedule queue", because it rather would be implemented in the scope of more general and composable select.poll() support.

> There's no need to monitor for an interrupt: when it comes, it triggers the handler/callback directly. That's the most efficient way to do it.

But with this "scheduling" scheme nothing really happens "directly" anyway: the event is stored and then checked for at some other time, via a completely ad-hoc mechanism that is available only in MicroPython (not other Python implementations) and composes poorly, because it is still based on the "callback" concept.

> How in this scheme does something like a pin interrupt call a callback (asynchronously)?

As we know, nothing really happens truly asynchronously. It's always synchronous: to the cycles of the CPU, or the cycles of the VM, or, in the case of an explicit scheme based on uselect.poll(), to the application's state machine, which is the way least surprising to users (leading to the fewest bugs, or requiring the least explicit synchronization effort).

It works in an obvious manner: an application registers different sources of events, e.g. a UART available for reading, a network socket available for writing, a pin interrupt happening, and calls uselect.poll() to get notified when any of them happens, handling them in a similar and consistent way. All hopefully wrapped in a scheduler a la uasyncio.
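For context, a hedged C sketch of the POSIX poll() model being referred to (uselect.poll mirrors this shape at the Python level; the event source here is a stand-in):

```c
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    // One registered event source; a real program would list several
    // (sockets, UARTs, pin-event fds, ...) in the same array.
    struct pollfd fds[1] = {
        { .fd = STDIN_FILENO, .events = POLLIN },
    };
    for (;;) {
        if (poll(fds, 1, -1) < 0) { // block until any source fires
            perror("poll");
            return 1;
        }
        if (fds[0].revents & POLLIN) {
            char buf[64];
            ssize_t r = read(fds[0].fd, buf, sizeof(buf));
            if (r <= 0) {
                return 0; // EOF: stop
            }
            // handle the event here, synchronous with the app's own loop
        }
    }
}
```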

dpgeorge (Member)

> ... and calls uselect.poll() to get notified when any of them happens, handling them in a similar and consistent way

So the proposal is to remove the ability to have asynchronous callbacks on interrupts, or to have both async callbacks and polling? If there are no async callbacks, then how does one achieve a low-latency response to inputs while doing some background computation?

pfalcon (Contributor) · Mar 26, 2019

> So the proposal is to remove the ability to have asynchronous callbacks on interrupts, or to have both async callbacks and polling?

My suggestion is to leave the current interrupt handling facilities in MicroPython as they are, possibly applying simple localized changes/improvements, of which this patch is a good example, and instead to invest effort into the further development of more cross-platform and scalable facilities, like optimizing the poll protocol and extending it to as many object types as possible.

And I give this suggestion with full awareness that I'm biased towards exactly these "more cross-platform and scalable facilities", while other stakeholders may have different outlooks. But hopefully such suggestions help them be less biased towards their own outlooks, and find the best places to apply effort to maximize MicroPython's capabilities and usefulness.
