-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
Unix embed #1279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unix embed #1279
Conversation
Very interesting. I have not had time lately to keep up with the forum so am a bit out of the loop right now. But I do have an alternative to this sitting in my private repo which is the following: a Python script which takes in the uPy source and processes it to make the API functions all start with "micropy_" (or whatever you like) and also inserts a context variable into each function and each call to completely eliminate global state. Then it provides wrapper functions like you have here to create uPy instances (and you can create more than one since the VM state is now local). It also makes an amalgamation of the source (33k line file!) so it's super easy to embed it anywhere. My code is not yet ready for public consumption so maybe your way is better for the moment. |
As far as interesting goes this definitely sounds like a win :P
I think it will be complementary in the end. When embedding you don't want to copy/paste the parts which reside in main() now so extracting it into non-static functions seems beneficial in any case |
I agree we should factor the boiler-plate code out into reusable functions. I think they should be named with prefix "mp_" to make them feel like part of the core. Something like mp_new_instance(...) to create a new uPy instance, and mp_free_instance(...) to clean up. This API extends nicely to the case where you can create multiple instances. The exec functions would be mp_exec_str(...), mp_exec_file(...), etc. As for the options (compile_only, emit_opts), I think they should be set by a special function like mp_exec_opts(...) so they apply to executing strs and files. |
Ok, so I didn't want to give such feedback before somebody else tries to review this patch "semantically", now that @dpgeorge tried to do that, I can give my "surface" review on one point. Fortunately, now I can just link to #1267 (comment) , and give summary: "Please do not start implementing functionality with file renames. Doing that can be warranted only if you want you patch to spend extended time in review, then possibly grew old and closed as out of date." The issue is not that renames are bad, the issue is that unneeded and bad renames are bad, and proving both functionality and goodness of renames are hard at the same time. So, how I'd do that? I'd pose a no-nonsense, yet simple, embedding problem. Example given in the original description can be achieved by So, an example I had in mind for this is: a C app reads file line by line, and writes out each line. But it allows user to specify a python file, that file should contain definition of function "process", a C app will run each line thru that function and write out returned result. So, that will be an adequate example of embedding. Now how it should be achieved: initial version should achieve that by patching out "static" declarations, and applying #ifdef's. Function renames are not allowed at this stage. That's next stage. And splitting out functions to separate source files in another stage. All these should be represented by separate commits, easy to follow. So, what about this patch? The only thing I can propose is that someone else (e.g. me) prepared alternative implementation per ideas above, then we compare which implementation is better. Because currently I cannot see whether this patch is good or bad, can be made better or not. Of course, if someone else can see that, feel free to merge. P.S. Sorry if this looks like criticism. It's actually genuine attempt to try to get contributors produce mergeable patches. We now have few unmergeable patches: #1234 , #1267 , now this one. They are unmergeable not because I don't want to merge them, they are such because they're stuck in the queue without sign of progress. While simple, clean, and down-to-earth patches get merged in a day or two. So, I see the problem, but other people don't, and keep producing such patches. So, I just tried to analyze where's problem and how to overcome it. Someone may say it's not fair - I can come up with any rename I think is needed, because I have commit access, but I challenge other peoples' decision to rename/split, as if I don't believe they had good reasons to. But the point is actually the same - I should not perform random and unclear renames/splits, in particular, I should not try to "squash" changes together, even if for me, it's clear they're needed and I think I can skip intermediate steps. And if 10 functions needs to be split out, generally having 10 commits which split a single function each is better than 1 which splits out 10. Perhaps, good real-world compromise is 3 commits with 3-4 functions. I also may try to split first, apply functionality then - to try to get more changes "after the last rename" (such changes are easier to query/follow). That works best with 1-2 functions. Applying that en-masse is shady technique, which I do criticize here. If I do that myself, I'd like to hear criticism. |
Don't worry, criticism approved: I certainly see why you consider this not mergeable.
So: I'll see what happens when I keep renames and moving code around to a minimum (I'll abandon current code and reuse this PR unless someone sees a reason not to?).
Yeah that's what you get with hello world samples - are they ever really useful? After all the example given can also be achieved and performed much faster by
Your example with the C app is one way to do embedding and should work with changes presented. In the extreme case it still is not adequate though: if you don't need performance you could do without embedding as you could pipe each line to
Don't you think in the end it would, apart from function names and file structure maybe, be pretty much the same? |
So here's take 2 with more gradual changes/smaller commits. Should be easy to follow: first some functions are extracted from main(), then those functions are made public and then some more public configuration is added. |
My suspicion that it may be like that is the reason of the above rant - "please make it more visible". But beyond that answer, can offer 2 other: 1) I don't know; 2) I'm not sure. As an example, I doubt I would produce this patch: stinos@edb2e13 . Why all that is needed if you can call gc_init() directly? |
See #1335. |
Split initialization phase into actual initialization of the list object and setting the path to a default or a user-supplied value. The first assuress uPy is setup completely after calling initialize() and makes initialization similar to how it is done for mp_sys_argv. The latter allows configuring the path using defaults or an arbitrary value which comes in handy when embedding.
When MICROPY_PY_EMBED is declared the main uPy initialization, configuration and execution functions are made non-static and main() itself is excluded from compilation. These functions are also declared in a seperate header to make them accessible. This allows embedding the source and running files, strings or repl.
When embedding uPy it isn't always desired to print to command line. Solve this by making the function used to report uncaught exceptions a variable.
When embedding one might want to specify the heap to use, in which case uPy must not free() it during deinitialization phase.
Because I intended to have a minimal external API with one single initialize() function which as the name implies initializes; completely, without exceptions. Maybe there's a cleaner way to do this, or maybe it should be left split after all (so require user to always call mp_stack_set_limit/gc_init)
yes that's a viable minimal sample. But still requires some work before being an alternative to this PR as it misses required functionality and is not pluggable into existing code nor very reusable.
|
That's impossible. Not until there will be a GC/memmgr implementation based on malloc(). And when it will be, single initialize() won't be doing what people want (you of course may argue that people are alwys looking for a new exciting ways to add DoS-ability to their apps, then yes, single initialize() is a good way to achieve that).
Of course it requires more work (if there're usecases for more advanced embedding, which should be posed first, because that PR is quite adequate for "print('hello world')"). Just as its description says, it starts that work from the start, and that's its main contribution. |
So what do you suggest? initialize() which just calls mp_init and inits mp_sys_path/mp_sys_argv and leave heap/stak initialization to the user?
I need pretty much everything main does now except argument parsing: initialize uPy, initialize path and argv, call do_str/do_file/do_repl and set custom exception handler. |
To write mergeable patches, that's the biggest culprit we discuss here ;-).
Now talking about embedding, 1st step would be to accept that MicroPython has support for embedding already. Like I wrote in the forum (that what unix' main.c does is to embeds MicroPython engine in (largely) CPython-compatible binary), and #1335 demonstrates. Now you can call it's low-level API and say that it would be nice to have higher-level API, and that will be good to consider. But before that, we need to solve few "logistical" issues. For example, do you want to have a libmicropython.a ? If so, that should be submitted as a separate and acked by @dpgeorge. I'd also like further fate of #1335 to be decided. Enough work items for starters. |
I conclude you want seperate PRs for pretty much each of the commits in this PR then? |
It's not that I "want", it's that I show a different approach. It will be sad if you don't agree that it's better and we'll spend resources in competing approaches. (I for one can't agree that your code is better, because I smell "enterprisey" style in it - "we need feature, yesterday, so mince and smash to have it done today, don't care about tomorrow, chances are nobody will care at all, or at least somebody else, not us, will". I have a dayjob too and am sometimes reduced to writing such code, so hope you can appreciate that I recognize it ;-) ). |
Answering direct question, the above would be ideal - done step by step, and each step being self-sufficient enough to be a separate PR. It's not always possible, but then those should be separate commits of one PR, each commit giving that feeling of a useful progress being made, not just moving things around (and certainly not "mince and smash" feeling). But again, that's only what I'd love to see. |
I'm not entirely sure what you mean with Actually this perfectly illustrates one problem I'm seeing here, not sure if I brought it up already: more often than not I apparently fail to properly understand what you mean because you use some lingo which my brain has no decoder for :) Like lots of vague phrases (from my point of view) which clashes with what I am used to namely a direct, practical manner of speech which calls things by name.
well that is the weirdest definition of enterprise I ever read. Which again illustrates my above point. (moreover I wrote this PR in my spare time and my dayjob is about as far from an actual enterprise as it can get so that only adds to the confusion) |
Good.
The usual disclaimer goes of: 1) I'm not native English speaker; 2) oftentimes a face choice not whether to write it in a way "A", or "B", but whether to write at all or not; 3) I make typos (and thinkos) as I write just as anybody else.
Going more specific is bordering on "going personal" or "telling somebody else what to do". Granted, I don't possess enough language magic to do "unobtrusive hinting" in a way native speakers can ;-).
Well, then I was wrong, sorry, and maybe it's just me write code in that style under constraint pressure ;-).
That's why the term "enterprisey". I picked it up on python-dev. It's certainly vague enough to not be mixed with black&while image of "enterprise" ;-).
I didn't say that, sorry ;-). (Or didn't mean it literally in "never-never" sense).
That's exactly what I won't be doing, per above (and more reasons, like I can't be so arrogant as to think that my way is better until it's fully comparable to yours, which it is not, and even then I still may be missing something). Ok, let's get back from communicative aspects to specifics:
Well, I'd like to make it a nice API which is pleasure and breeze to use for everyone, I hope we're on the same line here.
Ok, let's see what's not clear in my comment #1279 (comment) :
Yeah, "ticket" is missing after "separate", the entire phrase looks weird. (My friend who had a lot of that yet at school writing tests always told "a pen drags behind the thought" ;-) ).
So, let me be more explicit: I'm not interested to do any further work/discussion on this until:
|
Thanks for the elaborate explanation. I will look into static lib creation and maybe some small PR with only some of the commits here. But I'll double check if it makes sense :) And we'll just see what happens next. |
Thanks. I guess Damien has a backlog of stuff after ESA announcement and it may take him some time to get to these things, but well, the whole point is making it in sustainable manner and having gradual progress. |
Guys has been embedding support merged into main? Because this PR apparently hasn't… |
I'm interested in using MP as a library, what's the progress on that? |
I have not tried it yet but maybe this helps: https://github.com/micropython/micropython/tree/master/examples/embedding |
After some discussion on the forum it seems there is interest in embedding uPy into other applications, hence this PR with most of the changes I'm using myself. It basically consists of moving all actual uPy functionality out of main.c into publicly available functions in two seperate files called pyexec/pyrun appropriately (well that's open for discussing semantics of course).
This is for the unix port only: I quickly glanced over the other port's main/pyexec but there sure are a lot of differenecs and stuffing that all into common files would probably lead to an #ifdef mess.
Creating the simplest application with uPy would now come down to
Which I realize by reading this now, could furthermore be simplified by providing default pyexec_opts_t by passing NULL to pyexec_str..
There is basically still one missing piece regarding the build but I'm not sure how to continue with that, plus I do not have much experience with Makefiles: to actually use the above one has to replace the current main.c with the above code. Or add a target to the makefile to build a static (dynamic should work as well) library with all objects except main.o and then build using for instance
So one question is where to go from here: could leave it as is and people can do their own builds. Or we could alter the build to consist of building a static library + building main.c into an executable (which afaik yields the exact same binary?). Or leave the current build as-is and also provide an extra target to build a static lib.