Description
Description
There's a general DMA/Cache-management issue with the MicroPython's stm32 port; most drivers make assumptions about the memory (cachable or not DAM-acessible or not) which probably works fine for all (most?) of the default linker scripts, which happen to place everything in one memory region, but when buffers are moved around, drivers starts failing.
A few examples to demonstrate the issue (there are many more in sdcard, I2C etc..):
-
DAC should flush buffer before DMA call if memory is cacheable and fail if it's not DMA-accesible:
Line 201 in a4ab847
-
SPI regular transfer (fallback) path should be taken if the memory is not accessible by DMA:
Line 588 in a4ab847
For example:if (len == 1 || query_irq() == IRQ_STATE_DISABLED || !dma_is_accessible(src)) {
-
SDCARD check here is over-simplistic: On H7 SDMMC2's DMA has access to all memory while SDMMC1's DMA can only access D1 memory:
micropython/ports/stm32/sdcard.c
Line 527 in a4ab847
-
Something like this issue ports/stm32: Make ETH DMA buffer attributes configurable. #16633 could've been avoided, or at least easily detected, if there was a runtime assert that memory is "dmable".
-
All of those drivers, and others, should Not do any cache-management if memory is Not cacheable.
The problem will only get worse as we add more MCUs/ports with more complex memory and will not scale well in the future. Note I've been maintaining patches to all of those drivers, to fix these issues, but it starting to get out of hand.
Proposed fix:
To fix this issue, I propose we implement some sort of memory map per MCU family, override-able per-board (if needed), with attributes per region: cacheable, DAM-accessible, TCM, external etc.. And a new generic API to query those attributes. It would be great if this mechanism is generic somehow, so it can be reused by future ports with more complex memory.
Other aspects
Having this memory map/attributes query could also help in other areas. For example, it could help gc
allocate memory based on attributes (gc_alloc(size, MP_MEMORY_REGION_ATTR_TCM/EXTERNAL
). Then gc
could loop through gc blocks and try to fulfill such request with something like: if mp_memory_region_get_attributes(gc_block_start) & attr...
Toy API:
typdef enum {
MP_MEMORY_REGION_ATTR_CACHEABLE = (1 << 0),
MP_MEMORY_REGION_ATTR_TCM = (1 << 1),
MP_MEMORY_REGION_ATTR_SRAM = (1 << 2),
MP_MEMORY_REGION_ATTR_EXTERNAL = (1 << 3),
MP_MEMORY_REGION_ATTR_DMA = (1 << 4),
// Use to add port-specific/custom attributes.
MP_MEMORY_REGION_ATTR_LAST = (1 << 15)
} mp_memory_region_attr_t;
typedef struct {
uintptr_t base;
uintptr_t limit;
uint32_t attrs;
} mp_memory_region_t;
// Or add regions dynamically maybe?
uint32_t mp_memory_region_add(uintptr_t base, size_t limit);
uint32_t mp_memory_region_get_attributes(uintptr_t ptr);
// Some helper macros
#define mp_memory_region_is_cacheable(ptr) \
mp_memory_region_get_attributes(ptr) & MP_MEMORY_REGION_ATTR_CACHEABLE
#define mp_memory_region_is_dmable(ptr) \
mp_memory_region_get_attributes(ptr) & MP_MEMORY_REGION_ATTR_DMA
extern const mp_memory_region_t mp_memory_regions[];
// ======================
// MCU/board/port memory map with port-specific/custom attributes.
// Real-life example: H7 SDMMC1 DMA can only access D1 memory/devices.
// `sdcard.c` should check if `SDMMC1` && memory is Not D1, then allocate
// a D1 buffer possibly via: `gc_alloc(size, MP_MEMORY_REGION_ATTR_D1)`,
// or use a pre-allocated static D1 buffer.
#define MP_MEMORY_REGION_ATTR_D1 (MP_MEMORY_REGION_ATTR_LAST << 1)
const mp_memory_region_t mp_memory_regions[] = {
{ 0x90000000, 0xD0000000, MP_MEMORY_REGION_ATTR_DMA |
MP_MEMORY_REGION_ATTR_CACHEABLE | MP_MEMORY_REGION_ATTR_D1 },
{ 0x24000000, 0x24080000, MP_MEMORY_REGION_ATTR_DMA |
MP_MEMORY_REGION_ATTR_CACHEABLE | MP_MEMORY_REGION_ATTR_D1 },
{ 0, 0, 0 } // end
};
Implementation
I'd be very happy to work on this, but I need feedback on the API first before.
Code of Conduct
Yes, I agree