8000 py/gc.c: decouple gc_sweep from allocation/free operations by vitiral · Pull Request #1321 · micropython/micropython · GitHub
[go: up one dir, main page]

Skip to content

py/gc.c: decouple gc_sweep from allocation/free operations #1321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

py/gc.c: decouple gc_sweep from allocation/free operations #1321

wants to merge 2 commits into from

Conversation

vitiral
Copy link
Contributor
@vitiral vitiral commented Jun 12, 2015

This commit is the first part in a series of commits to decouple
the alloc/dealloc component of the upython memory manager from the
garbage collector.

  • mem_free implemented to be independent of gc_free.
  • mem_first and mem_next functions for a standard api to walk
    through memory

This commit WILL have a small impact on performance. The mem_next
function is not quite as fast as the original implementation.
Also, mem_free performs some checks (in MEM_VALID) that have
essentially already been done by the loop.

I contend the following:

  • there is no way to make the "next" funcionality faster and allow
    it to be fully decoupled
  • we could make the mem_free function faster by implementing a
    mem_free_unsafe (or something) that would not perform the checks.
    This could be a macro for speed, and called by the mem_free
    function. Whether this complexity is required for the small
    increase in performance is up to the upython developers

This commit is the first part in a series of commits to decouple
the alloc/dealloc component of the upython memory manager from the
garbage collector.

- mem_free implemented to be independent of gc_free.
- mem_first and mem_next functions for a standard api to walk
    through memory

This commit WILL have a small impact on performance. The mem_next
function is not *quite* as fast as the original implementation.
Also, mem_free performs some checks (in MEM_VALID) that have
essentially already been done by the loop.

I contend the following:
- there is no way to make the "next" funcionality faster and allow
    it to be fully decoupled
- we could make the mem_free function faster by implementing a
    mem_free_unsafe (or something) that would not perform the checks.
    This could be a macro for speed, and called by the mem_free
    function. Whether this complexity is required for the small
    increase in performance is up to the upython developers
if (ATB_GET_KIND(block) == AT_MARK){
ATB_MARK_TO_HEAD(block);
} else {
// TODO: bad indentation kept intentionally for git diff (will move back next commit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such things should not be necessary: use git diff -w to ignore whitespace. Or for the diff on github itself: append ?w=1 to the end of this url (so you get https://github.com/micropython/micropython/pull/1321/files?w=1) That is invalueable anyway fot commits like this else all you see is one huge bunch of adds/deletes.

@stinos
Copy link
Contributor
stinos commented Jun 13, 2015

there is no way to make the "next" funcionality faster and allow it to be fully decoupled

Storing the result of MP_STATE_MEM( gc_alloc_table_byte_len ) * BLOCKS_PER_ATB in a variable (set in gc_init or so) should save some cycles already, especially if you also use in in the other macros. Also measure whether using an if is faster than the switch case, you never know. Another improvement: mem_next now does ATB_GET_KIND but the outer loop immediately does that as well afterwards. So if you return that result from mem_next (pass mp_uint_t* argument or so) that also saves cycles. Same idea for mem_free probably but harder to implement maybe: when leaving mem_free you know ATB_GET_KIND for the next block already yet mem_next has to fetch it again.

Some other remarks:

  • mem_first is implementation is basically the same as mem_next, maybe merge them?
  • in mem_free you are incrementing block in a loop but the main outer loop is unaware of that, so restarts at the original block. That could add an unnecessary amount of cycles (or I'm missing something)?
  • the set the last_free pointer to this block if it's earlier in the heap part isn't in the original code
  • DEBUG_printf("gc_sweep(%p)\n",PTR_FROM_BLOCK(block)) statement is gone
  • assert(ATB_GET_KIND(block) != AT_MARK) seems unnecessary as it follows an if/else on that exact condition

@dpgeorge
Copy link
Member

@cloudformdesign thanks for the effort trying to get this ready for merging, but I think we need to change how this is going to be done. This patch increases code size by about 150 bytes and (supposedly) decreases performance. By the time we add all patches to get to a state where the allocator is separated from the GC it's likely that the increase in code size will be at least a few times 150 bytes and performance will further suffer. Since the current GC/allocator is well proven, compact and relatively efficient it seems a shame to lose it, and so I don't want to make any major changes to it (such as those proposed here).

Secondly, I was under the impression that #1168 was about providing a completely independent memory allocator/GC to replace the current one. Ie a complete replacement for gc.c that implements all the functions in gc.h (gc_alloc, gc_free, ...). But it seems that you want to divide gc.c into allocator and GC, and then replace the allocator. (I may have missed something in the previous discussions about all this so sorry if what I'm saying has already been discussed, or I'm missing something.)

So, way forward would be to copy gc.c to some new file (eg gcplug.c) and work on that. There would be a config variable to select which allocator to use (gc.c or gcplug.c).

Also, please read CODECONVENTIONS carefully and follow them. You have some whitespace inconsistencies. And you need to write (void) for functions that take no arguments.

@dpgeorge
Copy link
Member

@cloudformdesign I hope you are not discouraged to work on this, it would be a shame to lose the work you have done so far. Note that you have picked probably the most critical (and therefore difficult) piece of the code to work on: the memory manager is critical for efficiency in time, efficiency in memory use, and robustness.

@vitiral
Copy link
Contributor Author
vitiral commented Jun 18, 2015

No no, my daughter was just born on Sunday night, so this has been on haitus!

I think separating it so it can be chosen at compile time is an excellent idea. I need to figure out how to do this for this project. Do you do it in the makefile, in header files, at the command line, or something else?

@pfalcon if the allocator was selected this way would you be more open to the single large commit I have already made? Obviously we would make it no longer change gc.c

@pfalcon
Copy link
Contributor
pfalcon commented Jun 18, 2015

@cloudformdesign : Congratulations!

I'm personally ok with any implementation which achieves realistic goals in a seamless manner (and there're other people to do reviews). Note that the way @dpgeorge described it is how I'd approach it - not starting with separating gc and memmgr part, but just making it be able to use gc_something.c instead of existing gc.c . The way you approached it is better, more formal, more reusable, more advanced, more cool. It's also more complicated, harder to do, harder to do right, and then harder to get merged. Note that I also predicted @dpgeorge's concerns, and that's again why I suggested to start with factoring out just a single function - to learn a way, and to prove, that it's possible to provide gc/memmgr separation without affecting performance.

@vitiral
Copy link
Contributor Author
vitiral commented Jun 18, 2015

We need to talk about the not affecting performance part. I don't think this is possible without implementing separate ifdef blocks that reach into the internals of the selected men manager for some functions, which harms plug and play.

If we require performance to not be impacted at all then this is going to be very difficult (read impossible). This is another reason to have separate pluggable functionality, as we can experiment a little as we simultaneously reach our goals. My goal would be to eventually have the faster current implementation and a lower performing but robust option. We can then do whatever we want to improve performance from there

@dpgeorge
Copy link
Member

Great news @cloudformdesign! Take your time, no hurry.

I think separating it so it can be chosen at compile time is an excellent idea. Do you do it in the makefile, in header files, at the command line, or something else?

It depends. All configurable stuff in py/ is done using #if guards; ie all files are compiled but sometimes they contain nothing. In unix/ some files are optionally compiled depending on a flag that the makefile has access to.

Part of the work making gc.c replaceable will be figuring out how to configure it :) Really, if you can just provide a minimal working example (ie passes most tests) of how to use something other than gc.c then that would be a great start.

If we require performance to not be impacted at all then this is going to be very difficult

If gc.c is replaceable with gc_X.c then there will be no change in performance using the existing gc.c. That is how it should work, a simple way of using different memory implementations that doesn't force anything on the implementation except that they should provide gc_alloc, etc (which can be renamed to mem_alloc, or something).

@vitiral
Copy link
Contributor Author
vitiral commented Jun 22, 2015

Hey @dpgeorge it will be a little while until I start work on this, but #1234 passes all tests for unix (still having build problems for ARM) and successfully splits the memory manager. Would you mind reviewing that? If the gc.c was renamed to gcplug.c and there were some selector magic, that might be 90% of the way there. (although there are still issues with the code style and the process I developed it with).

Personally I would solve this issue by creating a mem folder in the base directory, where you could mix and match the different garbage collectors. A minimal folder would have everything in gc.c. A plug folder would only implement the garbage collector (in gc.c), but would require a separate memory manager, call it mem.h, to define the required functions for de/allocation. mem.h would then be defined in a simplemem folder or something like that.

Then to add memory managers you just put them in a named folder, and create a mem.h header file that correctly imports and renames the required functions. The Makefile would then use a different -I folder depending on the options selected. I think this layout might require all files to import just gc.h instead of py/gc.h, so that would be one problem. Other problems are also possible.

@dpgeorge
Copy link
Member

The case of having a memory manager with separate and pluggable GC and malloc components is difficult. Let's first try to solve the case of having the whole thing pluggable as a single entity.

I don't think we should introduce subdirs in py/. Currently we use prefixes on filenames to separate parts of the implementation (eg mod_, obj_) so we can continue doing it that way. I also think we should have a single file implementing a given memory allocation scheme. As is currently done with the nlr implementations, all files are run through the compiler, but only one is actually used due to the correct macro being defined (so all files are wrapped in a #if). The relevant state for the memory manager will need to be selected in mpstate.h using the same #if macro.

gc.h is the main header that describes the memory manager interface. Any implementation must provide the functions listed there.

On file names, I can think of two options:

  1. Keep the current gc.h as the main header, and gc.c as the reference/standard implementation. New implementations are named gcXXX.c. Could possibly rename gc.c to gcstd.c to indicate it's the standard implementation.
  2. Rename to something more generic, eg prefix with heap* or mem* (after all it is a heap memory manager that we are talking about here). Then gc.h becomes heap.h or mem.h, gc.c becomes heapgc.c or memgc.c, and new implementations are heapXXX.c or memXXX.c.

Further down the track, when we want to implement sub components within a heap manager, there will still be one master file for a given implementation (heapXXX.c) but it will #include relevant sub-files to provide the actual functions. Eg heapplug.c could #include heapgcplug.c and heaptinymem.c.

@vitiral
Copy link
Contributor Author
vitiral commented Nov 4, 2015

I am now working on this. Closing this pull request and will be opening a new one.

I will be doing the following:

  • renaming gc.c/h to mem_micro.c/h (to reflect @dpgeorge comment that "the current GC/allocator is well proven, compact and relatively efficient")
  • creating a new mem_plugable.h that can be used to put together a garbage collector / heap packages.
  • creating heap_basic.c/h and gc_basic.c/h to be pretty much exactly the same as the current implementation, except split appart.

Does that all sound good?

Edit: made it heap instead of malloc

@vitiral vitiral closed this Nov 4, 2015
@dpgeorge
Copy link
Member
dpgeorge commented Nov 4, 2015

Having duplicates of code is not ideal, but I don't see any other way to proceed.

@pfalcon
Copy link
Contributor
pfalcon commented Nov 4, 2015

Does that all sound good?

No. As was discussed, avoid munging existing code until you have something of real-world value to show. Starting with renaming of files, which may show as a right step in distance future, isn't a good way. As was discussed before, you should optimize your work for following criteria: 1) change as little as possible in existing code; 2) do something useful as soon as possible and with as little effort as possible ("little effort" as in: "avoid avoidable changes and refactors", not as "write sloppy code").

As usual, that's just my personal opinion.

@pfalcon
Copy link
Contributor
pfalcon commented Nov 7, 2015

Here's good example of consequences of changing memalloc/gc code: http://forum.micropython.org/viewtopic.php?f=2&t=1112 . Fanalization is a pretty straightforward code. 1.5 years after it was added, it's still buggy (and that's of course not the only bug fixed over this time). That's not the reason to not work on memalloc/gc, it's a reason to have very rigid process for that.

tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Nov 10, 2018
Move atmel-samd to tinyusb and support nRF flash.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0