8000 Unix embed by stinos · Pull Request #1279 · micropython/micropython · GitHub
[go: up one dir, main page]

Skip to content

Unix embed #1279

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed

Unix embed #1279

wants to merge 8 commits into from

Conversation

stinos
Copy link
Contributor
@stinos stinos commented May 22, 2015

After some discussion on the forum it seems there is interest in embedding uPy into other applications, hence this PR with most of the changes I'm using myself. It basically consists of moving all actual uPy functionality out of main.c into publicly available functions in two seperate files called pyexec/pyrun appropriately (well that's open for discussing semantics of course).

This is for the unix port only: I quickly glanced over the other port's main/pyexec but there sure are a lot of differenecs and stuffing that all into common files would probably lead to an #ifdef mess.

Creating the simplest application with uPy would now come down to

#include "py/compile.h"
#include "pyexec.h"
#include "pyrun.h"

int main(int argc, char **argv) {
#if MICROPY_ENABLE_GC
    pyrun_init(PYRUN_DEFAULT_STACK_LIMIT,
       malloc(PYRUN_DEFAULT_HEAP_SIZE), PYRUN_DEFAULT_HEAP_SIZE);
#else
    pyrun_init(PYRUN_DEFAULT_STACK_LIMIT);
#endif

    pyexec_opts_t opts = {.compile_only = false, .emit_opt = MP_EMIT_OPT_NONE};
    pyexec_str("print('hello embedded uPy world')", &opts);

    pyrun_deinit();
}

Which I realize by reading this now, could furthermore be simplified by providing default pyexec_opts_t by passing NULL to pyexec_str..

There is basically still one missing piece regarding the build but I'm not sure how to continue with that, plus I do not have much experience with Makefiles: to actually use the above one has to replace the current main.c with the above code. Or add a target to the makefile to build a static (dynamic should work as well) library with all objects except main.o and then build using for instance

gcc -I.. -I. -I./build -std=c99 micropython.a mymain.c

So one question is where to go from here: could leave it as is and people can do their own builds. Or we could alter the build to consist of building a static library + building main.c into an executable (which afaik yields the exact same binary?). Or leave the current build as-is and also provide an extra target to build a static lib.

@dpgeorge
Copy link
Member

Very interesting.

I have not had time lately to keep up with the forum so am a bit out of the loop right now. But I do have an alternative to this sitting in my private repo which is the following: a Python script which takes in the uPy source and processes it to make the API functions all start with "micropy_" (or whatever you like) and also inserts a context variable into each function and each call to completely eliminate global state. Then it provides wrapper functions like you have here to create uPy instances (and you can create more than one since the VM state is now local). It also makes an amalgamation of the source (33k line file!) so it's super easy to embed it anywhere.

My code is not yet ready for public consumption so maybe your way is better for the moment.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.0%) to 93.82% when pulling dcef882 on stinos:unix-embed into a3a14b9 on micropython:master.

@stinos
Copy link
Contributor Author
stinos commented May 22, 2015

alternative ... eliminate global state ... amalgamation

As far as interesting goes this definitely sounds like a win :P

maybe your way is better for the moment

I think it will be complementary in the end. When embedding you don't want to copy/paste the parts which reside in main() now so extracting it into non-static functions seems beneficial in any case

@dpgeorge
Copy link
Member

I agree we should factor the boiler-plate code out into reusable functions. I think they should be named with prefix "mp_" to make them feel like part of the core. Something like mp_new_instance(...) to create a new uPy instance, and mp_free_instance(...) to clean up. This API extends nicely to the case where you can create multiple instances.

The exec functions would be mp_exec_str(...), mp_exec_file(...), etc.

As for the options (compile_only, emit_opts), I think they should be set by a special function like mp_exec_opts(...) so they apply to executing strs and files.

@pfalcon
Copy link
Contributor
pfalcon commented Jun 2, 2015

Ok, so I didn't want to give such feedback before somebody else tries to review this patch "semantically", now that @dpgeorge tried to do that, I can give my "surface" review on one point. Fortunately, now I can just link to #1267 (comment) , and give summary: "Please do not start implementing functionality with file renames. Doing that can be warranted only if you want you patch to spend extended time in review, then possibly grew old and closed as out of date."

The issue is not that renames are bad, the issue is that unneeded and bad renames are bad, and proving both functionality and goodness of renames are hard at the same time.

So, how I'd do that? I'd pose a no-nonsense, yet simple, embedding problem. Example given in the original description can be achieved by micropython -c "print('hello embedded uPy world')". So, if that's all what it can achieve, why such embedding needed (especially if it paints a lot of stuff red and green). And if it can do more - well, that's not shown.

So, an example I had in mind for this is: a C app reads file line by line, and writes out each line. But it allows user to specify a python file, that file should contain definition of function "process", a C app will run each line thru that function and write out returned result.

So, that will be an adequate example of embedding. Now how it should be achieved: initial version should achieve that by patching out "static" declarations, and applying #ifdef's. Function renames are not allowed at this stage. That's next stage. And splitting out functions to separate source files in another stage. All these should be represented by separate commits, easy to follow.

So, what about this patch? The only thing I can propose is that someone else (e.g. me) prepared alternative implementation per ideas above, then we compare which implementation is better. Because currently I cannot see whether this patch is good or bad, can be made better or not. Of course, if someone else can see that, feel free to merge.

P.S. Sorry if this looks like criticism. It's actually genuine attempt to try to get contributors produce mergeable patches. We now have few unmergeable patches: #1234 , #1267 , now this one. They are unmergeable not because I don't want to merge them, they are such because they're stuck in the queue without sign of progress. While simple, clean, and down-to-earth patches get merged in a day or two. So, I see the problem, but other people don't, and keep producing such patches. So, I just tried to analyze where's problem and how to overcome it. Someone may say it's not fair - I can come up with any rename I think is needed, because I have commit access, but I challenge other peoples' decision to rename/split, as if I don't believe they had good reasons to. But the point is actually the same - I should not perform random and unclear renames/splits, in particular, I should not try to "squash" changes together, even if for me, it's clear they're needed and I think I can skip intermediate steps. And if 10 functions needs to be split out, generally having 10 commits which split a single function each is better than 1 which splits out 10. Perhaps, good real-world compromise is 3 commits with 3-4 functions. I also may try to split first, apply functionality then - to try to get more changes "after the last rename" (such changes are easier to query/follow). That works best with 1-2 functions. Applying that en-masse is shady technique, which I do criticize here. If I do that myself, I'd like to hear criticism.

@stinos
Copy link
Contributor Author
stinos commented Jun 4, 2015

Don't worry, criticism approved: I certainly see why you consider this not mergeable.
However the main reasons for this PR are not to get it merged as such, but rather:

  • sure I could have said in last forum post about embedding 'here check out my branch' but that would likely have resulted in zero responses. A PR at least draws attention and yields feedback - as proven here :]
  • multiple questions on the forum have been asked about embedding so far
  • I use embedding myself
  • to get that working I have to maintain a mix of removing a bunch of STATIC keywords, plus a seperate file with blatant copies of code in main()
  • getting embedded uPy working properly should fix above three points and as such be best for everyone in the end
  • main() is after all pretty crufty so some cleanup shouldn't hurt

So: I'll see what happens when I keep renames and moving code around to a minimum (I'll abandon current code and reuse this PR unless someone sees a reason not to?).

Example given in the original description can be achieved by micropython -c "print('hello embedded uPy world')".

Yeah that's what you get with hello world samples - are they ever really useful? After all the example given can also be achieved and performed much faster by echo hello embedded uPy world..

So, if that's all what it can achieve, why such embedding needed (especially if it paints a lot of stuff red and green). And if it can do more - well, that's not shown.

Your example with the C app is one way to do embedding and should work with changes presented. In the extreme case it still is not adequate though: if you don't need performance you could do without embedding as you could pipe each line to micropython myprocess.py after #1306 is merged.
Just for illustration: in one of our programs using uPy, shapes/images/video are displayed and describing what is shown where and when is controlled using uPy, from the command line or from a file or remotely using TCP, or a combination of those. That alone, and other reasons make it much harder if not impossible to achieve without embedding (we also require a custom main() for various reasons, have strict requirements regarding timing so are pretty much forced to use seperate threads for the render loop and the script thread, don't always want exceptions printed to the command line, possibly some other things I'm not thinking of).

alternative implementation per ideas above

Don't you think in the end it would, apart from function names and file structure maybe, be pretty much the same?

@stinos
Copy link
Contributor Author
stinos commented Jun 9, 2015

So here's take 2 with more gradual changes/smaller commits. Should be easy to follow: first some functions are extracted from main(), then those functions are made public and then some more public configuration is added.

@pfalcon
Copy link
Contributor
pfalcon commented Jun 17, 2015

Don't you think in the end it would, apart from function names and file structure maybe, be pretty much the same?

My suspicion that it may be like that is the reason of the above rant - "please make it more visible". But beyond that answer, can offer 2 other: 1) I don't know; 2) I'm not sure. As an example, I doubt I would produce this patch: stinos@edb2e13 . Why all that is needed if you can call gc_init() directly?

@pfalcon
8000 Copy link
Contributor
pfalcon commented Jun 17, 2015
  1. I don't know; 2) I'm not sure.

See #1335.

stinos added 8 commits June 22, 2015 09:31
Split initialization phase into actual initialization of the list object
and setting the path to a default or a user-supplied value.
The first assuress uPy is setup completely after calling initialize() and
makes initialization similar to how it is done for mp_sys_argv.
The latter allows configuring the path using defaults or an arbitrary value
which comes in handy when embedding.
When MICROPY_PY_EMBED is declared the main uPy initialization, configuration
and execution functions are made non-static and main() itself is excluded from compilation.
These functions are also declared in a seperate header to make them accessible.
This allows embedding the source and running files, strings or repl.
When embedding uPy it isn't always desired to print to command line. Solve
this by making the function used to report uncaught exceptions a variable.
When embedding one might want to specify the heap to use, in which case
uPy must not free() it during deinitialization phase.
@stinos
Copy link
Contributor Author
stinos commented Jun 22, 2015

Why all that is needed if you can call gc_init() directly?

Because I intended to have a minimal external API with one single initialize() function which as the name implies initializes; completely, without exceptions.
So don't tell the user "ok this is called initialize but before that you must also call init_xxx and init_yyy else you get a segfault".

Maybe there's a cleaner way to do this, or maybe it should be left split after all (so require user to always call mp_stack_set_limit/gc_init)

See #1335.

yes that's a viable minimal sample. But still requires some work before being an alternative to this PR as it misses required functionality and is not pluggable into existing code nor very reusable.
I added one small patch to build a static library from uPy to compare:

> make CFLAGS_EXTRA=-DMICROPY_PY_EMBED=1 staticlib
> cat > embed.c:
#define MICROPY_PY_EMBED 1
#include "pyexec.h"
int main(int argc, char **argv) {
    initialize();
    do_str("print('helo')");
    deinitialize();
}
> gcc -I . -I.. -Ibuild -std=gnu99 -DUNIX -DMICROPY_PY_EMBED=1 -lm -lffi -ldl embed.c micropython.a -o upyemb
> ./upyemb
helo

@pfalcon
Copy link
Contributor
pfalcon commented Jun 23, 2015

minimal external API with one single initialize()

That's impossible. Not until there will be a GC/memmgr implementation based on malloc(). And when it will be, single initialize() won't be doing what people want (you of course may argue that people are alwys looking for a new exciting ways to add DoS-ability to their apps, then yes, single initialize() is a good way to achieve that).

But still requires some work before being an alternative to this PR

Of course it requires more work (if there're usecases for more advanced embedding, which should be posed first, because that PR is quite adequate for "print('hello world')"). Just as its description says, it starts that work from the start, and that's its main contribution.

@stinos
Copy link
Contributor Author
stinos commented Jun 24, 2015

So what do you suggest? initialize() which just calls mp_init and inits mp_sys_path/mp_sys_argv and leave heap/stak initialization to the user?

if there're usecases for more advanced embedding, which should be posed first

I need pretty much everything main does now except argument parsing: initialize uPy, initialize path and argv, call do_str/do_file/do_repl and set custom exception handler.

@pfalcon
Copy link
Contributor
pfalcon commented Jun 24, 2015

So what do you suggest?

To write mergeable patches, that's the biggest culprit we discuss here ;-).

initialize() which just calls mp_init and inits mp_sys_path/mp_sys_argv and leave heap/stak initialization to the user?

Now talking about embedding, 1st step would be to accept that MicroPython has support for embedding already. Like I wrote in the forum (that what unix' main.c does is to embeds MicroPython engine in (largely) CPython-compatible binary), and #1335 demonstrates. Now you can call it's low-level API and say that it would be nice to have higher-level API, and that will be good to consider. But before that, we need to solve few "logistical" issues. For example, do you want to have a libmicropython.a ? If so, that should be submitted as a separate and acked by @dpgeorge. I'd also like further fate of #1335 to be decided. Enough work items for starters.

@stinos
Copy link
Contributor Author
stinos commented Jun 25, 2015

To write mergeable patches .... should be submitted as a separate

I conclude you want seperate PRs for pretty much each of the commits in this PR then?

@pfalcon
Copy link
Contributor
pfalcon commented Jun 25, 2015

It's not that I "want", it's that I show a different approach. It will be sad if you don't agree that it's better and we'll spend resources in competing approaches. (I for one can't agree that your code is better, because I smell "enterprisey" style in it - "we need feature, yesterday, so mince and smash to have it done today, don't care about tomorrow, chances are nobody will care at all, or at least somebody else, not us, will". I have a dayjob too and am sometimes reduced to writing such code, so hope you can appreciate that I recognize it ;-) ).

@pfalcon
Copy link
Contributor
pfalcon commented Jun 25, 2015

I conclude you want seperate PRs for pretty much each of the commits in this PR then?

Answering direct question, the above would be ideal - done step by step, and each step being self-sufficient enough to be a separate PR. It's not always possible, but then those should be separate commits of one PR, each commit giving that feeling of a useful progress being made, not just moving things around (and certainly not "mince and smash" feeling). But again, that's only what I'd love to see.

@stinos
Copy link
Contributor Author
stinos commented Jun 25, 2015

if you don't agree that it's better

I'm not entirely sure what you mean with it here, but if it's about #1335 vs this one I already told: yes it is a perfectly working sample but not usable for me at all now and since I can only guess which direction you are going with I cannot possibly tell if it will be usable in the future. Let alone if it's better. If on the other hand if means 'approach to how PRs are submitted' then yes I agree as i finally start to understand what you mean. I think.

Actually this perfectly illustrates one problem I'm seeing here, not sure if I brought it up already: more often than not I apparently fail to properly understand what you mean because you use some lingo which my brain has no decoder for :) Like lots of vague phrases (from my point of view) which clashes with what I am used to namely a direct, practical manner of speech which calls things by name.

"enterprisey" style in it - "we need feature, yesterday

well that is the weirdest definition of enterprise I ever read. Which again illustrates my above point. (moreover I wrote this PR in my spare time and my dayjob is about as far from an actual enterprise as it can get so that only adds to the confusion)
So apart from where you say you think initialize() should not call gc_init() you gave zero exact reasons of what is not good. Instead you say you have a certain feeling about it. How on earth am I supposed to do something with that? Why not just say "x and y and z is bad, do it over like this" or if that costs too much time, which I'd perfectly understand, just say "ditch y and z and do just x and then we'll see" or if all else fails just recommend to delete the entire PR and forget about it. At least that would be clear.

@pfalcon
Copy link
Contributor
pfalcon commented Jun 25, 2015

'approach to how PRs are submitted' then yes I agree as i finally start to understand what you mean. I think.

Good.

not sure if I brought it up already: more often than not I apparently fail to properly understand what you mean because you use some lingo which my brain has no decoder for :)

The usual disclaimer goes of: 1) I'm not native English speaker; 2) oftentimes a face choice not whether to write it in a way "A", or "B", but whether to write at all or not; 3) I make typos (and thinkos) as I write just as anybody else.

Like lots of vague phrases (from my point of view)

Going more specific is bordering on "going personal" or "telling somebody else what to do". Granted, I don't possess enough language magic to do "unobtrusive hinting" in a way native speakers can ;-).

well that is the weirdest definition of enterprise I ever read. (moreover I wrote this PR in my spare time

Well, then I was wrong, sorry, and maybe it's just me write code in that style under constraint pressure ;-).

from an actual enterprise as it can get so that only adds to the confusion)

That's why the term "enterprisey". I picked it up on python-dev. It's certainly vague enough to not be mixed with black&while image of "enterprise" ;-).

So apart from where you say you think initialize() should not call gc_init()

I didn't say that, sorry ;-). (Or didn't mean it literally in "never-never" sense).

Why not just say "x and y and z is bad, do it over like this"

That's exactly what I won't be doing, per above (and more reasons, like I can't be so arrogant as to think that my way is better until it's fully comparable to yours, which it is not, and even then I still may be missing something).

Ok, let's get back from communicative aspects to specifics:

I can only guess which direction you are going with I cannot possibly tell if it will be usable in the future.

Well, I'd like to make it a nice API which is pleasure and breeze to use for everyone, I hope we're on the same line here.

At least that would be clear.

Ok, let's see what's not clear in my comment #1279 (comment) :

For example, do you want to have a libmicropython.a ? If so, that should be submitted as a separate and acked by @dpgeorge.

Yeah, "ticket" is missing after "separate", the entire phrase looks weird. (My friend who had a lot of that yet at school writing tests always told "a pen drags behind the thought" ;-) ).

I'd also like further fate of #1335 to be decided. Enough work items for starters.

So, let me be more explicit: I'm not interested to do any further work/discussion on this until:

  1. A ticket regarding creation of Makefile target for MicroPython static lib is submitted and ack'ed by @dpgeorge (and hint - I already submit many tickets ;-) ).
  2. My existing PR unix: Add example of embedding interpreter in a standalone app. #1335 is reviewed by @dpgeorge and he gives some suggestions regarding it. For example, maybe he agrees to merge it ;-). (I myself wouldn't find it too suitable for merging, in a sense, that I envision it being reworked many times later. But that's the issue - we need to start somewhere with this embedding stuff, and unix: Add example of embedding interpreter in a standalone app. #1335 is arguably pretty unobtrusive way to that).

@stinos
Copy link
Contributor Author
stinos commented Jun 26, 2015

Thanks for the elaborate explanation. I will look into static lib creation and maybe some small PR with only some of the commits here. But I'll double check if it makes sense :) And we'll just see what happens next.

@stinos stinos closed this Jun 26, 2015
@pfalcon
Copy link
Contributor
pfalcon commented Jun 26, 2015

Thanks. I guess Damien has a backlog of stuff after ESA announcement and it may take him some time to get to these things, but well, the whole point is making it in sustainable manner and having gradual progress.

@stinos stinos deleted the unix-embed branch May 3, 2017 07:28
tannewt added a commit to tannewt/circuitpython that referenced this pull request Feb 4, 2019
@psprint
Copy link
psprint commented May 14, 2021

Guys has been embedding support merged into main? Because this PR apparently hasn't…

@dzarda
Copy link
dzarda commented Jul 9, 2021

I'm interested in using MP as a library, what's the progress on that?

@anyc
Copy link
anyc commented Apr 17, 2023

I have not tried it yet but maybe this helps: https://github.com/micropython/micropython/tree/master/examples/embedding

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0