-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
_thread module freezing whole device (RPI_PICO_W) #12980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Im using pymakr2 extension for VSCode, to connect to the device, maybe that plays a role |
I've has similar problems, the |
I had the same problem with 1.22.0 but everything works on 1.21.0 |
See #13288 |
I am having the same issue, after i soldered a dozen picos for a project, I started injecting code with thonny on all of them one by one. The 1.22 update came in the middle and I didn't notice while injecting. When i started testing them I thought at first that half my picos were defect, after a few hours losing my mind some intuition told me maybe it's not the same micropython i am running on all of them, googled, and found this page. |
@innatecuriosity thanks for reporting this issue. It is the same as that reported in #13288, and is fixed by #13312. I should have taken more notice of this issue and investigated it before the release v1.22.0 was published. This bug should have been a show-stopper for the release. |
Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org>
Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org>
Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues #12980 and #13288. Signed-off-by: Damien George <damien@micropython.org>
Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org>
Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org>
* rp2/rp2_flash: Lockout second core only when doing flash erase/write. Using the multicore lockout feature in the general atomic section makes it much more difficult to get correct. Signed-off-by: Damien George <damien@micropython.org> * rp2/mutex_extra: Implement additional mutex functions. These allow entering/exiting a mutex and also disabling/restoring interrupts, in an atomic way. Signed-off-by: Damien George <damien@micropython.org> * rp2/mpthreadport: Fix race with IRQ when entering atomic section. Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org> * all: Bump version to 1.22.1. Signed-off-by: Damien George <damien@micropython.org> * Generic STM32F401CD Port Compiles (not working yet...) * rp2/rp2_dma: Fix fetching 'write' buffers for writing not reading. Signed-off-by: Nicko van Someren <nicko@nicko.org> * rp2/machine_uart: Fix event wait in uart.flush() and uart.read(). Do not wait in the worst case up to the timeout. Fixes issue micropython#13377. Signed-off-by: robert-hh <robert@hammelrath.com> * renesas-ra/ra: Fix SysTick clock source. The SysTick_Config function must use the system/CPU clock to configure the ticks. Signed-off-by: iabdalkader <i.abdalkader@gmail.com> * renesas-ra/boards/ARDUINO_PORTENTA_C33: Fix the RTC clock source. Switch the RTC clock source to Sub-clock (XCIN). This board has an accurate LSE crystal, and it should be used for the RTC clock source. Signed-off-by: iabdalkader <i.abdalkader@gmail.com> * extmod/asyncio: Support gather of tasks that finish early. Adds support to asyncio.gather() for the case that one or more (or all) sub-tasks finish and/or raise an exception before the gather starts. Signed-off-by: Damien George <damien@micropython.org> * mimxrt/modmachine: Fix deepsleep wakeup pin ifdef. Signed-off-by: Kwabena W. Agyeman <kwagyeman@live.com> * extmod/modssl_mbedtls: Fix cipher iteration in SSLContext.get_ciphers. Prior to this commit it would skip every second cipher returned from mbedtls. The corresponding test is also updated and now passes on esp32, rp2, stm32 and unix. Signed-off-by: Damien George <damien@micropython.org> * rp2: Change machine.I2S and rp2.DMA to use shared DMA IRQ handlers. These separate drivers must share the DMA resource with each other. Fixes issue micropython#13380. Signed-off-by: Damien George <damien@micropython.org> * py/compile: Fix potential Py-stack overflow in try-finally with return. If a return is executed within the try block of a try-finally then the return value is stored on the top of the Python stack during the execution of the finally block. In this case the Python stack is one larger than it normally would be in the finally block. Prior to this commit, the compiler was not taking this case into account and could have a Python stack overflow if the Python stack used by the finally block was more than that used elsewhere in the function. In such a scenario the last argument of the function would be clobbered by the top-most temporary value used in the deepest Python expression/statement. This commit fixes that case by making sure enough Python stack is allocated to the function. Fixes issue micropython#13562. Signed-off-by: Damien George <damien@micropython.org> * renesas-ra/ra/ra_i2c: Fix 1 byte and 2 bytes read issue. Tested on Portenta C33 with AT24256B (addrsize=16) and SSD1306. Fixes issue micropython#13280. Signed-off-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com> * extmod/btstack: Reset pending_value_handle before calling write-done cb. The pending_value_handle needs to be freed and reset before calling mp_bluetooth_gattc_on_read_write_status(), which will call the Python IRQ handler, which may in turn call back into BTstack to perform an action like a write. In that case the pending_value_handle will need to be available for the write/read/etc to proceed. Fixes issue micropython#13611. Signed-off-by: Damien George <damien@micropython.org> * extmod/btstack: Reset pending_value_handle before calling read-done cb. Similar to the previous commit but for MP_BLUETOOTH_IRQ_GATTC_READ_DONE: the pending_value_handle needs to be reset before calling mp_bluetooth_gattc_on_read_write_status(), which will call the Python IRQ handler, which may in turn call back into BTstack to perform an action like a write. In that case the pending_value_handle will need to be available for the write/read/etc to proceed. Fixes issue micropython#13634. Signed-off-by: Damien George <damien@micropython.org> * esp32/mpnimbleport: Release the GIL while doing NimBLE port deinit. In case callbacks must run (eg a disconnect event happens during the deinit) and the GIL must be obtained to run the callback. Fixes part of issue micropython#12349. Signed-off-by: Damien George <damien@micropython.org> * esp32: Increase NimBLE task stack size and overflow detection headroom. The Python BLE IRQ handler will most likely run on the NimBLE task, so its C stack must be large enough to accommodate reasonably complicated Python code (eg a few call depths). So increase this stack size. Also increase the headroom from 1024 to 2048 bytes. This is needed because (1) the esp32 architecture uses a fair amount of stack in general; and (2) by the time execution gets to setting the Python stack top via `mp_stack_set_top()` in this interlock code, about 600 bytes of stack are already used, which reduces the amount available for Python. Fixes issue micropython#12349. Signed-off-by: Damien George <damien@micropython.org> * all: Bump version to 1.22.2. Signed-off-by: Damien George <damien@micropython.org> * Submodule update --------- Signed-off-by: Damien George <damien@micropython.org> Signed-off-by: Nicko van Someren <nicko@nicko.org> Signed-off-by: robert-hh <robert@hammelrath.com> Signed-off-by: iabdalkader <i.abdalkader@gmail.com> Signed-off-by: Kwabena W. Agyeman <kwagyeman@live.com> Signed-off-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com> Co-authored-by: Damien George <damien@micropython.org> Co-authored-by: Nicko van Someren <nicko@nicko.org> Co-authored-by: robert-hh <robert@hammelrath.com> Co-authored-by: iabdalkader <i.abdalkader@gmail.com> Co-authored-by: Kwabena W. Agyeman <kwagyeman@live.com> Co-authored-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com>
* rp2/rp2_flash: Lockout second core only when doing flash erase/write. Using the multicore lockout feature in the general atomic section makes it much more difficult to get correct. Signed-off-by: Damien George <damien@micropython.org> * rp2/mutex_extra: Implement additional mutex functions. These allow entering/exiting a mutex and also disabling/restoring interrupts, in an atomic way. Signed-off-by: Damien George <damien@micropython.org> * rp2/mpthreadport: Fix race with IRQ when entering atomic section. Prior to this commit there is a potential deadlock in mp_thread_begin_atomic_section(), when obtaining the atomic_mutex, in the following situation: - main thread calls mp_thread_begin_atomic_section() (for whatever reason, doesn't matter) - the second core is running so the main thread grabs the mutex via the call mp_thread_mutex_lock(&atomic_mutex, 1), and this succeeds - before the main thread has a chance to run save_and_disable_interrupts() a USB IRQ comes in and the main thread jumps off to process this IRQ - that USB processing triggers a call to the dcd_event_handler() wrapper from commit bcbdee2 - that then calls mp_sched_schedule_node() - that then attempts to obtain the atomic section, calling mp_thread_begin_atomic_section() - that call then blocks trying to obtain atomic_mutex - core0 is now deadlocked on itself, because the main thread has the mutex but the IRQ handler (which preempted the main thread) is blocked waiting for the mutex, which will never be free The solution in this commit is to use mutex enter/exit functions that also atomically disable/restore interrupts. Fixes issues micropython#12980 and micropython#13288. Signed-off-by: Damien George <damien@micropython.org> * all: Bump version to 1.22.1. Signed-off-by: Damien George <damien@micropython.org> * Generic STM32F401CD Port Compiles (not working yet...) * rp2/rp2_dma: Fix fetching 'write' buffers for writing not reading. Signed-off-by: Nicko van Someren <nicko@nicko.org> * rp2/machine_uart: Fix event wait in uart.flush() and uart.read(). Do not wait in the worst case up to the timeout. Fixes issue micropython#13377. Signed-off-by: robert-hh <robert@hammelrath.com> * renesas-ra/ra: Fix SysTick clock source. The SysTick_Config function must use the system/CPU clock to configure the ticks. Signed-off-by: iabdalkader <i.abdalkader@gmail.com> * renesas-ra/boards/ARDUINO_PORTENTA_C33: Fix the RTC clock source. Switch the RTC clock source to Sub-clock (XCIN). This board has an accurate LSE crystal, and it should be used for the RTC clock source. Signed-off-by: iabdalkader <i.abdalkader@gmail.com> * extmod/asyncio: Support gather of tasks that finish early. Adds support to asyncio.gather() for the case that one or more (or all) sub-tasks finish and/or raise an exception before the gather starts. Signed-off-by: Damien George <damien@micropython.org> * mimxrt/modmachine: Fix deepsleep wakeup pin ifdef. Signed-off-by: Kwabena W. Agyeman <kwagyeman@live.com> * extmod/modssl_mbedtls: Fix cipher iteration in SSLContext.get_ciphers. Prior to this commit it would skip every second cipher returned from mbedtls. The corresponding test is also updated and now passes on esp32, rp2, stm32 and unix. Signed-off-by: Damien George <damien@micropython.org> * rp2: Change machine.I2S and rp2.DMA to use shared DMA IRQ handlers. These separate drivers must share the DMA resource with each other. Fixes issue micropython#13380. Signed-off-by: Damien George <damien@micropython.org> * py/compile: Fix potential Py-stack overflow in try-finally with return. If a return is executed within the try block of a try-finally then the return value is stored on the top of the Python stack during the execution of the finally block. In this case the Python stack is one larger than it normally would be in the finally block. Prior to this commit, the compiler was not taking this case into account and could have a Python stack overflow if the Python stack used by the finally block was more than that used elsewhere in the function. In such a scenario the last argument of the function would be clobbered by the top-most temporary value used in the deepest Python expression/statement. This commit fixes that case by making sure enough Python stack is allocated to the function. Fixes issue micropython#13562. Signed-off-by: Damien George <damien@micropython.org> * renesas-ra/ra/ra_i2c: Fix 1 byte and 2 bytes read issue. Tested on Portenta C33 with AT24256B (addrsize=16) and SSD1306. Fixes issue micropython#13280. Signed-off-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com> * extmod/btstack: Reset pending_value_handle before calling write-done cb. The pending_value_handle needs to be freed and reset before calling mp_bluetooth_gattc_on_read_write_status(), which will call the Python IRQ handler, which may in turn call back into BTstack to perform an action like a write. In that case the pending_value_handle will need to be available for the write/read/etc to proceed. Fixes issue micropython#13611. Signed-off-by: Damien George <damien@micropython.org> * extmod/btstack: Reset pending_value_handle before calling read-done cb. Similar to the previous commit but for MP_BLUETOOTH_IRQ_GATTC_READ_DONE: the pending_value_handle needs to be reset before calling mp_bluetooth_gattc_on_read_write_status(), which will call the Python IRQ handler, which may in turn call back into BTstack to perform an action like a write. In that case the pending_value_handle will need to be available for the write/read/etc to proceed. Fixes issue micropython#13634. Signed-off-by: Damien George <damien@micropython.org> * esp32/mpnimbleport: Release the GIL while doing NimBLE port deinit. In case callbacks must run (eg a disconnect event happens during the deinit) and the GIL must be obtained to run the callback. Fixes part of issue micropython#12349. Signed-off-by: Damien George <damien@micropython.org> * esp32: Increase NimBLE task stack size and overflow detection headroom. The Python BLE IRQ handler will most likely run on the NimBLE task, so its C stack must be large enough to accommodate reasonably complicated Python code (eg a few call depths). So increase this stack size. Also increase the headroom from 1024 to 2048 bytes. This is needed because (1) the esp32 architecture uses a fair amount of stack in general; and (2) by the time execution gets to setting the Python stack top via `mp_stack_set_top()` in this interlock code, about 600 bytes of stack are already used, which reduces the amount available for Python. Fixes issue micropython#12349. Signed-off-by: Damien George <damien@micropython.org> * all: Bump version to 1.22.2. Signed-off-by: Damien George <damien@micropython.org> * Submodule update --------- Signed-off-by: Damien George <damien@micropython.org> Signed-off-by: Nicko van Someren <nicko@nicko.org> Signed-off-by: robert-hh <robert@hammelrath.com> Signed-off-by: iabdalkader <i.abdalkader@gmail.com> Signed-off-by: Kwabena W. Agyeman <kwagyeman@live.com> Signed-off-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com> Co-authored-by: Damien George <damien@micropython.org> Co-authored-by: Nicko van Someren <nicko@nicko.org> Co-authored-by: robert-hh <robert@hammelrath.com> Co-authored-by: iabdalkader <i.abdalkader@gmail.com> Co-authored-by: Kwabena W. Agyeman <kwagyeman@live.com> Co-authored-by: Takeo Takahashi <takeo.takahashi.xv@renesas.com>
Summary
When using firmware built from the master branch calling
_thread.start_new_thread
freezes the whole machine. v1.22.0-preview and earlier works as expected.Using Raspberry Pi Pico W.
I went back to find the commit where it breaks, and i found the breaking point: bcbdee2
Scenario
Im using a Raspberry Pi Pico W with firmware built from the repository (make etc.) In my application, right at the start of boot.py I started a process on the second thread that constantly measured the orientation, then continued as normal, but yesterday the whole program started freezing.
After some testing I found out that calling
_thread.start_new_thread(test_function, ())
freezes the machine, it cannot be stopped with ctrl-c or any other method.I tested with this file, then importing and running test()
With 2d363a2 and earlier I get the expected:
But after bcbdee2 I get:


then freezes completely. Power cycling does make the device responsive again.
Sorry about the format, first time creating an issue.
The text was updated successfully, but these errors were encountered: