From f2847979eeea9b39d6ee09c52dbabe4bd439dc13 Mon Sep 17 00:00:00 2001
From: webreflection
Date: Wed, 24 Apr 2024 21:28:09 +0200
Subject: [PATCH] Added more F.A.Q. details around our ffi

---
 docs/user-guide/faq.md | 191 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/docs/user-guide/faq.md b/docs/user-guide/faq.md
index f73893b..dac55d3 100644
--- a/docs/user-guide/faq.md
+++ b/docs/user-guide/faq.md
@@ -530,3 +530,194 @@ def download_file(path, mime_type):
 !!! warning
 
     The presented utility works only on *Pyodide* at the moment, as there is no `from_` or `assign` convention in *MicroPython*. Once this is fixed or a better example is discovered the example will be updated too so that all of them should work in both interpreters.
+
+### create_proxy
+
+`create_proxy` is explained in detail [in the ffi page](../ffi/), but it's probably useful to cover here *when* it is needed at all.
+
+To start with, there's a subtle difference between *Pyodide* and *MicroPython* around this topic, with or without using our `pyscript.ffi`, as it just forwards the interpreter's utility behind the scenes.
+
+##### Background
+
+A *Python* function invoked from the *JS* world inevitably needs to be wrapped so that both the native (*Python*) function reference and any argument passed to it can be normalized to *Python* references before the invocation happens.
+
+The *JS* primitive for this is [Proxy](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Proxy), which enables "*traps*" such as [apply](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Proxy/Proxy/apply) to do extra work before any result is actually returned from the invocation.
+
+Once the `apply(target, self, args)` trap is invoked:
+
+ * the interpreter must find which `target`, in the currently running *WASM* code, needs to be invoked
+ * the `self` context is likely ignored for regular functions in common cases, but it's likely needed to define `python.method()` invocations when these happen in the *JS* world
+ * `args` is a list of passed arguments where any proxy coming from *Python* must be resolved as a reference, any primitive might be converted into its *Python* primitive representation, if needed, and any *JS* reference must be translated into *Python*-like objects / references
+
+This orchestration might feel convoluted to some and obvious to others, yet the detail behind the scenes is that such `target` reference *needs to exist* in the *WASM* runtime in order to be executed when the *JS* world asks for it. Here is the caveat: globally available functions might outlive any *JS* runtime interoperability in the *WASM* world, but locally scoped or runtime functions cannot be retained forever!
+
+```python title="A basic Python to JS callback"
+import js
+
+js.addEventListener(
+    "custom:event",
+    lambda e: print(e.type)
+)
+```
+
+In this scenario that `lambda` has no meaning or references in the running *Python* code: it's just delegated to the *JS* runtime / environment, but it *must exist* whenever that `custom:event` is dispatched, hence triggered or emitted, in the *JS* world.
+
+From a pure architectural point of view, nothing in that code states the user's explicit intent about how long that `lambda` should be kept alive in the current *Python* program, while from the *JS* point of view that callback might never even be needed or invoked (i.e. the `custom:event` never happens, which is a forever pending *lambda* use case).
+
+Because all interpreters do care about memory consumption and have some *WASM* memory constraints to deal with, `create_proxy` (or any similar API) has been provided to delegate to the user the responsibility to kill those references, especially for scenarios like the one described here, where it's unknown when, or if, the invocation happens.
+
+**On the other hand**, when a *Python* callback is attached to a specific foreign instance, as opposed to being just passed as an argument, it is fairly easy for the *WASM* runtime to know when such `lambda` function, or any other non-global function, could be freed from memory.
+
+```python title="A sticky lambda"
+from pyscript import document
+
+# logs "click" if nothing else stopped propagation
+document.onclick = lambda e: print(e.type)
+```
+
+"*How is that easy?*" is a valid question, and the answer is that if the runtime has *JS* bindings, hence it's capable of dealing with *JS* references, that `document` would be a well known *JSProxy* that points to some underlying *JS* reference.
+
+In this case there's usually no need to use `create_proxy`, because that reference is well understood and the interpreter can use the *FinalizationRegistry* to simply destroy that lambda, or decrease its reference count, whenever the underlying *JS* reference is not needed anymore, hence finalized after its own release from *JS*.
+
+Admittedly this example is fairly poor, because a `document` reference in the *JS* world would live "*forever*", but if instead of a `document` there was a live DOM element, as soon as that element gets replaced and it's neither live nor referenced anymore, the *FinalizationRegistry* would inform the *WASM* based runtime that such reference is gone, and whatever was attached to it behind the scenes can be gone too.
+
+#### In Pyodide
+
+The `create_proxy` utility is exported [among others](https://pyodide.org/en/stable/usage/api/python-api/ffi.html#module-pyodide.ffi.wrappers) to smooth out and circumvent memory leaks in the long run.
+
+Using it separately from other utilities, though, requires some special care: most importantly, the user must invoke the proxy's `destroy()` method when the callback is not needed anymore. This requires users to mentally track each callback's lifecycle, which is not always possible for at least these reasons:
+
+ * if the callback is passed to 3rd party libraries, the reference is kinda "*lost in a limbo*" where who knows when that reference could actually be freed
+ * if the callback is passed to listeners or timers, or even promise based operations, it's pretty unpredictable and counterintuitive, also a bad *DX*, to try to track those cases
+
+Luckily enough, the *Promise* use case is automatically handled by the *Pyodide* runtime, but we're left with other cases:
+
+```python title="Pyodide VS create_proxy"
+from pyscript import ffi, window
+
+# this is needed even if `print` won't ever need
+# to be freed from the Python runtime
+window.setTimeout(
+    ffi.create_proxy(print),
+    100,
+    "print"
+)
+
+# this is needed not because `print` is used
+# but because otherwise the lambda is gone
+window.setTimeout(
+    ffi.create_proxy(
+        lambda x: print(x)
+    ),
+    100,
+    "lambda"
+)
+
+def print_type(event):
+    print(event.type)
+
+# this is needed even if `print_type`
+# is not a scoped / local function, rather
+# a never freed global reference in this Python code
+window.addEventListener(
+    "some:event",
+    ffi.create_proxy(print_type),
+    # despite this intent, the proxy
+    # will be trapped forever if not destroyed
+    ffi.to_js({"once": True})
+)
+
+# this does NOT need create_proxy as it is
+# attached to an object reference, hence observed to free
+window.Object().no_create_function = lambda: print("ok")
+```
+
+To simplify some of this orchestration we landed the `experimental_create_proxy = "auto"` flag, whose goal is to intercept *JS* world callback invocations, automatically creating proxies and destroying any proxy that is not needed or used anymore in the *JS* environment.
+
+Please give it a try and actually try to *not* ever use, or need, `create_proxy` at all, and tell us when it's needed instead, thank you!
+
+!!! Note
+
+    When it comes to *worker* based code, no *Proxy* can survive a roundtrip to the *main* thread and back.
+    In this scenario we inevitably need to orchestrate the dance differently and reference *Python* callbacks, or even *JS* ones, by their unique *id*, as these travel by *id* and not by their identity on the *worker*.
+    We orchestrate the *free* dance automatically because nothing would work otherwise so, long story short, if your *pyodide* code runs from a *worker*, you likely never need to use `create_proxy` at all.
+
+#### In MicroPython
+
+Things are definitely easier to reason about in this environment, but mostly because it doesn't expose (yet?) a `destroy()` utility for created proxies.
+
+Accordingly, using `create_proxy` in *MicroPython* might be needed only to have portable code, as proxies are created anyway when *Python* code passes a callback to any *JS* utility, plus proxies won't be created multiple times if the value is already a proxy of some *Python* callback.
+
+All the examples that require `create_proxy` in *Pyodide* won't bother *MicroPython*, but there the explicit proxy is generally not needed.
+
+```python title="MicroPython VS create_proxy"
+from pyscript import ffi, window
+
+# this works
+window.setTimeout(print, 100, "print")
+
+# this also works
+window.setTimeout(lambda x: print(x), 100, "lambda")
+
+def print_type(event):
+    print(event.type)
+
+# this works too
+window.addEventListener(
+    "some:event",
+    print_type,
+    ffi.to_js({"once": True})
+)
+
+# and so does this
+window.Object().no_create_function = lambda: print("ok")
+```
+
+!!! Note
+
+    Currently *MicroPython* doesn't provide a `destroy()` method, so in *MicroPython* projects it's actually preferable to not use or need `create_proxy`: there is no control over destroying the proxy, and it's up to the interpreter to decide when or how proxies can be destroyed.
+
+### to_js
+
+`to_js` is also explained in detail [in the ffi page](../ffi/), but it's probably useful to cover here *when* it is needed at all.
+
+##### Background
+
+Despite their similar look on the surface, *Python* dictionaries and *JS* object literals are very different primitives:
+
+```python title="A Python dict"
+ref = {"some": "thing"}
+```
+
+```js title="A JS literal"
+const ref = {some: "thing"};
+// equally valid as ...
+const ref = {"some": "thing"};
+```
+
+In both worlds accessing `ref["some"]` would produce the same result: the `"thing"` string. However, in *JS* `ref.some` would also return the very same `"thing"`, and while in *Python* `ref.get("some")` would do the same, some interpreters preferred to map dictionaries to *JS* [Map](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map) instead, probably because [Map.get](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map/get) is really close to what *Python* dictionaries expect.
+
+Long story short, *Pyodide* opted for that default conversion, but unfortunately *JS* APIs usually expect object literals, and *JS* maps don't work seamlessly in their place. It's possible to define a different `dict_converter` in *Pyodide*, but that definition is verbose and not too *DX* friendly:
+
+```python title="A common Pyodide converter"
+import js
+from pyodide.ffi import to_js
+
+js.callback(
+    to_js(
+        {"async": False},
+        # transform a Map into an object literal
+        dict_converter=js.Object.fromEntries
+    )
+)
+```
+
+Besides the fact that the *MicroPython* `to_js` implementation already converts, by default, *Python* dictionaries to *JS* literals, after some experience with common use cases around *Python* and *JS* interoperability we decided to provide a `pyscript.ffi.to_js` that always results in a *JS* object literal, so that no converter, unless explicitly defined, is needed to have the desired result out of the box.
+
+#### Caveats
+
+One fundamental thing to consider when `to_js` is used is that it detaches the created reference from its original "*source*", in this case the *Python* dictionary, so that any change applied elsewhere to such reference won't ever be reflected in its original counterpart.
+
+This is probably one of the main reasons *Pyodide* stuck with the dictionary-like proxy when it passes its reference to *JS* callbacks, but at the same time no *JS* callback usually expects a foreign runtime reference to deal with, be it a *Python* one or one from any other programming language.
+
+Accordingly, if your *JS* code is written to explicitly target *Pyodide* kind of proxies, you probably never need to use `to_js`, as that won't reflect changes to the *Python* runtime if changes ever happen within the callback receiving such reference. If instead you are just passing *data* around, data that can be represented as *JSON*, for example to configure or pass some option argument to *JS*, you can simply use our `pyscript.ffi.to_js` utility and forget about all these details around the conversion: dictionaries will be object literals and lists or tuples will be arrays, that's all you need to remember!
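As a rough, pure *Python* analogy of the "*detached*" caveat above (this is not *PyScript* code and it runs without any browser), the sketch below uses a hypothetical `fake_to_js` helper that copies data through a *JSON* round trip. It is only an assumption-laden stand-in to show the behavior described: the receiver gets a snapshot, so its changes never reach the original dictionary, and tuples arrive as plain lists.

```python
import json


def fake_to_js(value):
    """Hypothetical stand-in for `to_js`: returns a detached, JSON-like copy."""
    return json.loads(json.dumps(value))


options = {"once": True, "tags": ["a", "b"]}

# the converted value is a snapshot, no longer tied to `options`
converted = fake_to_js(options)

# pretend a JS callback mutates what it received
converted["once"] = False
converted["tags"].append("c")

# the original Python dictionary is untouched by those changes
print(options)
print(converted)

# tuples come out as lists, similar to "lists or tuples will be arrays"
print(fake_to_js((1, 2)))
```

If the receiving side must see or cause changes in the original *Python* object, a copy like this is the wrong tool, which is the trade-off the caveat describes.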