uasyncio tests can fail on windows ports · Issue #8301 · micropython/micropython
Closed
stinos opened this issue Feb 14, 2022 · 10 comments · Fixed by #8617
@stinos (Contributor) commented Feb 14, 2022

Since #8030 the dev variant yields 'random' test failures for uasyncio, probably depending on machine/version/...

For example, on my machine the uasyncio_basic test consistently prints took 30 50 0 as its last line, instead of the expected took 20 40 0. Some builds of the mingw port also fail on AppVeyor for me. The reason for this is mentioned in #7796:

most likely all sleep/wait implementations on windows by default use a 16mSec (or close to that) resolution

The dev variant enables both uasyncio and the scheduler; uasyncio calls poll_poll_internal, which uses MICROPY_EVENT_POLL_HOOK, which in turn calls mp_hal_delay_us(500). That last call is the culprit.
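The resolution problem is easy to observe from any desktop Python, independent of MicroPython internals (a rough sketch; the numbers vary per machine and OS):

```python
import time

def measure_sleep(requested_ms, repeats=5):
    """Average wall-clock time actually spent in a requested_ms sleep."""
    total = 0.0
    for _ in range(repeats):
        t0 = time.perf_counter()
        time.sleep(requested_ms / 1000)
        total += time.perf_counter() - t0
    return total / repeats * 1000  # average, in ms

# On Windows with the default ~15.6 ms timer tick, a 1 ms sleep often
# takes 10-15 ms; on Linux it is usually within a millisecond or so.
print("requested 1 ms, got %.1f ms" % measure_sleep(1))
```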

Setting MICROPY_ENABLE_SCHEDULER to 0 leaves MICROPY_EVENT_POLL_HOOK undefined, so the build fails because moduselect.c uses it unconditionally. Just adding #define MICROPY_EVENT_POLL_HOOK fixes that, and it also makes the uasyncio tests pass. But it turns poll_poll_internal into a busy loop. Likewise for mp_hal_delay_ms, because it is conditional on ifdef MICROPY_EVENT_POLL_HOOK, which arguably should be #if MICROPY_ENABLE_SCHEDULER.

Things also grew a bit messy over time, with there being multiple ways to sleep on Windows: mp_hal_delay_ms calls SleepEx, but mp_hal_delay_us uses an implementation based on a timer. I wrote that, but I don't remember why: I don't think it's better, worse, or very different from SleepEx. Could be that things used to be different back then. Anyway, these two should be consolidated.

Then the question remains how to fix this:

  • I don't think having busy waits is OK, even though they would make everything work and do so fairly accurately.
  • By default there's just no accurate wait on Windows. We could accept that, add a word of warning to the README, and change the tests so they pass.
  • Figure out a way to make the wait accurate. I know at least one way, but it's generally considered bad practice (see https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/). There might be other ways, though; not sure.
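For reference, the "bad practice" approach the linked article warns about is raising the system-wide timer resolution with winmm's timeBeginPeriod. A hedged ctypes sketch (the helper name is made up; it is a no-op off Windows, and a matching timeEndPeriod call should be made on teardown):

```python
import sys

def set_timer_resolution_ms(period_ms=1):
    """Request a finer global scheduler tick via winmm timeBeginPeriod.
    This affects the whole system and wastes power (see the article
    linked above), so it is only defensible in a dedicated test setup.
    Returns True on success, False on non-Windows platforms."""
    if sys.platform != "win32":
        return False
    import ctypes
    # timeBeginPeriod returns TIMERR_NOERROR (0) on success; pair it
    # with ctypes.windll.winmm.timeEndPeriod(period_ms) when done.
    return ctypes.windll.winmm.timeBeginPeriod(period_ms) == 0
```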
@dlech (Contributor) commented Feb 14, 2022

MICROPY_EVENT_POLL_HOOK which in turn calls mp_hal_delay_us(500)

I noticed this recently too (looking at the unix port). 500 µs seems like a long time to block. The event poll hook should actually poll() for events, similar to how embedded versions wait for interrupt. Or at the very least it should only delay for 1 ms so it doesn't block things for too long. That of course won't be exactly 1 ms, but the important part is that it lets other threads and processes run for a bit before resuming.
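The idea of blocking in the kernel instead of busy-sleeping can be sketched with POSIX poll from plain Python (not the MicroPython C hook itself):

```python
import select
import time

# An empty poll set behaves like an efficient, interruptible sleep:
# the process consumes no CPU until the timeout (or an fd event).
poller = select.poll()
t0 = time.perf_counter()
events = poller.poll(50)        # block for up to 50 ms in the kernel
elapsed_ms = (time.perf_counter() - t0) * 1000
print("woke after %.0f ms with events=%r" % (elapsed_ms, events))
```

With file descriptors registered (sockets, pipes), the same call wakes immediately when one becomes ready, which is what the hook would want instead of a fixed 500 µs delay loop.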

Since Linux/Mac/Windows aren't realtime systems, I don't think it's reasonable to expect accurate timers below 100 ms on these systems. And even above that point there can be hiccups where a thread might not be resumed for longer than that.

@stinos (Contributor, Author) commented Mar 1, 2022

I tested some things and the Appveyor test failures are remedied by increasing the sleep periods (mostly using >100mSec instead of the typical 20 to 50mSec used now, to account for the default 15.6mSec resolution windows uses) in the tests I've seen failing. This does mean tests run longer but that's not really an issue I think.

The other option (basically increasing the kernel timer interrupt rate, which affects the whole system) also works and makes the tests pass. It's convenient in the sense that it makes the windows port suitable for testing asyncio with small wait periods.

I'm inclined to implement the first option, and add a section in the readme which shows how to create a variant which implements the second. @dpgeorge what do you think?

@dpgeorge (Member)

I don't hink having busy waits is ok

I agree. Being power efficient is more important.

Since Linux/Mac/Windows aren't realtime systems, I don't think it is reasonable to expect accurate timers below 100ms on these system.

I agree. If these systems are intended to be semi-realtime then that's up to the Linux (or other OS) implementation to be tuned so that it does have better timing accuracy (either tuned by the distribution, or tuned by the user for their application). MicroPython should then do the best job it can within that system (and in particular not change the kernel systick on Windows).

I tested some things and the Appveyor test failures are remedied by increasing the sleep periods (mostly using >100mSec instead of the typical 20 to 50mSec used now, to account for the default 15.6mSec resolution windows uses) in the tests I've seen failing. This does mean tests run longer but that's not really an issue I think.

My aim when writing those tests was to have short delays so the tests didn't take too long. Having lots of tests and running them over and over again (when developing) does eat up time!

Changing to ~100ms for delays in those tests would probably be OK though. And it might even get these timing tests to pass on mac (which has similar issues with time resolution).

OTOH, I also think it's fair enough to increase the kernel systick on Windows when running under AppVeyor (eg with a special flag passed to MicroPython when running). Just consider it as creating a special Windows test environment to run the test suite.

Anyway, I'm happy to increase the timing of the relevant tests to ~100ms. Hopefully that's enough.

@dlech (Contributor) commented Mar 30, 2022

It would be more work and require a separate build, but a more deterministic way to run tests that depend on the passage of time is to create a "fake" clock that doesn't use the system clock at all. Instead, the clock state is driven programmatically. This means tests that use it would have to be modified to tell the clock to tick X times, then test a condition, tick again, test something else, and so on. But it eliminates the intermittent failures that inevitably come with relying on a real clock.
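A minimal sketch of such a fake clock (names hypothetical, plain Python):

```python
class FakeClock:
    """A clock driven by the test, not the OS: timing-dependent tests
    become deterministic because the test decides when time passes."""
    def __init__(self):
        self._now_ms = 0

    def ticks_ms(self):
        return self._now_ms

    def ticks_diff(self, new, old):
        return new - old

    def advance(self, ms):
        self._now_ms += ms

clock = FakeClock()
t0 = clock.ticks_ms()
clock.advance(20)               # the test "sleeps" exactly 20 ms
dt = clock.ticks_diff(clock.ticks_ms(), t0)
```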

@dpgeorge (Member)

It would be more work and require a separate build, but a more deterministic way to run tests that depend on the passage of time is to create a "fake" clock that doesn't use the system clock at all.

Another alternative is to have a build (or command line option) that runs the clock slowed down by a factor R but still reports "normal" ticks to the program. So if you ask for R=10 then the clock runs 10 times slower but tick values are divided by 10:

t0 = time.ticks_ms()
time.sleep_ms(10)  # actually sleeps for 100ms
t1 = time.ticks_ms()
dt = time.ticks_diff(t1, t0)  # returns 10
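A sketch of how such an R-scaled clock could look in plain Python (class and method names hypothetical; real sleeps are stretched by R while reported ticks are divided by R):

```python
import time

class ScaledClock:
    """Run real time R times slower than reported time: a program that
    asks to sleep 10 'ms' actually sleeps R*10 real ms, but ticks_ms()
    still advances by 10, so timing margins improve by a factor of R."""
    def __init__(self, r):
        self.r = r
        self._start = time.perf_counter()

    def ticks_ms(self):
        return int((time.perf_counter() - self._start) * 1000 / self.r)

    def sleep_ms(self, ms):
        time.sleep(ms * self.r / 1000)   # real sleep is R times longer

clk = ScaledClock(r=10)
t0 = clk.ticks_ms()
clk.sleep_ms(10)                         # really sleeps ~100 ms
dt = clk.ticks_ms() - t0                 # reported as ~10
```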

@stinos (Contributor, Author) commented Mar 31, 2022

And it might even get these timing tests to pass on mac (which has similar issues with time resolution).

Wasn't aware of this, but makes sense.

Another alternative is to have a build (or command line option) that runs the clock at a rate * R but still report "normal" ticks to the program

This is essentially the same as slowing down the tests, which, I found out today, still isn't sufficient to get them to always pass on windows: all my 100mSec sleeps now actually take 110mSec (which wasn't the case earlier). Combined with the overall test time increase, I'm thinking it's time to give up on this idea.

It would be more work and require a separate build, but a more deterministic way to run tests that depend on the passage of time is to create a "fake" clock that doesn't use the system clock at all. Instead, the clock state is driven programmatically.

That's what I usually do: have a single swappable time source from which everything else derives (this is great e.g. when the entire application runs on top of streaming data from an ADC and you use each sample as a tick, and then for the tests that just gets swapped for a fake timer to which the actual time gets set in the tests, very easy and versatile - as long as you don't need 'real clock' time somewhere).

This means tests that use this would have to be modified to tell the clock to tick X times

Not necessarily, or at least: not explicitly. For the type of tests MicroPython has it should be sufficient to make all tick functions increase a uSec tick, for instance (to make sure there are no infinite polling loops), and then have the sleep function increase that tick and round to the next mSec (so that a sequence of tick/sleep(10)/tick/tick_diff still returns 10). I quickly tested this for the windows port and even with a pretty dumb implementation it passes all tests.
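A toy version of that scheme (hypothetical names; every tick query bumps a µs counter so polling loops always make progress, and sleep rounds to a whole ms before adding the delay):

```python
_us = 0  # global microsecond counter standing in for "time"

def ticks_ms():
    global _us
    _us += 1                  # each query advances time by 1 us, so
    return _us // 1000        # polling loops can never spin forever

def sleep_ms(ms):
    global _us
    # round down to a whole ms, then add the requested delay, so that
    # ticks_ms()/sleep_ms(10)/ticks_ms() yields a difference of exactly 10
    _us = (_us // 1000 + ms) * 1000

t0 = ticks_ms()
sleep_ms(10)
dt = ticks_ms() - t0
```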
There's still mod_time_sleep() in unix/modtime.c which uses plain select (but only when MICROPY_PY_BUILTINS_FLOAT is enabled, which looks like an oversight). Not sure what to do there: get rid of the select and use sleep instead, or perhaps implement it via moduselect?
Would it be worth continuing down this path?

Another alternative: change all timing tests to detect whether they run on the unix/windows port and, if so, change all equality comparisons from exact matches to a range like +-10mSec of the expected value. There aren't that many tests, so that should be fine.
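That range check could look something like this (helper name made up; exact comparison kept for bare-metal ports):

```python
import sys

def assert_duration(dt_ms, expected_ms, tol_ms=10):
    """On desktop OSes accept +/- tol_ms of slack; elsewhere be exact."""
    if sys.platform in ("linux", "darwin", "win32"):
        assert expected_ms - tol_ms <= dt_ms <= expected_ms + tol_ms, dt_ms
    else:
        assert dt_ms == expected_ms, dt_ms

assert_duration(104, 100)       # within 10 ms: passes on desktop
```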

@dpgeorge (Member)

This is essentially the same as slowing down the tests, which, I found out today, still isn't sufficient to get them to always pass on windows: all my 100mSec sleeps now actually take 110mSec (which wasn't the case earlier).

I'm surprised this doesn't work. The tests already have round(dt, -1) for the check on the timing, which is essentially a +/- 10ms error margin. Doing a 10x slow down on the timing of the test should give 10 times better accuracy on the dt measurement, which should be within those error margins.

Yet another option: change the tests so they don't test timing, just ordering of events. Then have a few special tests that do test timing, one set which tests rough timing (for windows/mac/etc) and another that tests accurate timing (for bare metal).

@stinos (Contributor, Author) commented May 3, 2022

The tests already have round(dt, -1) for the check on the timing, which is essentially +/- 10ms error margin.

That's actually +-5mSec but that's covered in #8617 already :)

@dpgeorge (Member) commented May 3, 2022

That's actually +-5mSec

You are right! So changing it to round(dt, -2) will make it 50ms which seems good enough.
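The margin follows from how round snaps to the nearest multiple:

```python
# round(x, -1) snaps to the nearest multiple of 10: a measured duration
# only equals the target if it is within about +/-5 ms of it.
assert round(104, -1) == 100    # 4 ms late still matches
assert round(106, -1) == 110    # 6 ms late no longer matches

# round(x, -2) snaps to multiples of 100, widening the margin to ~+/-50 ms.
assert round(149, -2) == 100    # 49 ms late still matches
```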

@dpgeorge (Member) commented May 4, 2022

Test timing was increased in c90f097.

Windows sleep was improved/cleaned up in 8aa79c9 and 20028c7.
