RFC: Add zipfile support #1797

dhylands · 2016-01-25T01:37:30Z

What I have coded so far seems to be working, although it needs some more testing. The next step after this is to hook up zipfiles into the import mechanism so that you can add a zipfile to sys.path.

Part of this adds mpfile.c/.h which gives a nice C API for accessing files or "file-like" objects. For example, a python script contained within a zipfile would be read using a "file-like" object (in particular an instance of the ZipExtFile object). File-like objects could also be written in python (see the ByteFile class in tests/extmod/zipfile1 for an example).
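For readers unfamiliar with the idea, here is a minimal sketch of a file-like object written in Python and handed to a zip reader. It uses CPython's standard `io` and `zipfile` modules rather than the mpfile C API from this PR, and the `ByteFile` below is a hypothetical stand-in, not the class in tests/extmod/zipfile1.

```python
import io
import zipfile


class ByteFile(io.RawIOBase):
    """Read-only, seekable file-like object backed by a bytes buffer."""

    def __init__(self, data):
        super().__init__()
        self._data = bytes(data)
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def readinto(self, buf):
        # Copy as many bytes as fit into the caller's buffer.
        chunk = self._data[self._pos:self._pos + len(buf)]
        buf[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

    def seek(self, offset, whence=io.SEEK_SET):
        bases = {
            io.SEEK_SET: 0,
            io.SEEK_CUR: self._pos,
            io.SEEK_END: len(self._data),
        }
        new_pos = bases[whence] + offset
        if new_pos < 0:
            raise OSError("negative seek position")
        self._pos = new_pos
        return new_pos

    def tell(self):
        return self._pos


def make_zip_bytes(name, payload):
    """Build a one-member zip archive entirely in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def read_member(zip_bytes, name):
    """Read one member back through the custom file-like object."""
    with zipfile.ZipFile(ByteFile(zip_bytes)) as zf:
        return zf.read(name)
```

The zip reader only needs `read`, `seek`, `tell` and `seekable` from the object it is given, which is the same kind of narrow interface a C-level file abstraction has to provide.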

mpfile would allow us to remove the OS-specific lexers and replace them with a single mpfile-based lexer.

To support importing from a zipfile, I think that I need to extend mp_raw_load_code_load_file and mp_lexer_new_from_file to support reading from a file or file-like object.

Using a compressed zipfile typically doubles the effective amount of storage that you have, and it also eliminates the average 256 bytes that are wasted per file on the regular filesystem.

On the unix build, this adds about 5.3K, and on stmhal it adds about 2.4K.

I've coded this so that it should work on a big-endian MCU, but I don't have anything to test it on.

One caveat to be aware of is that uzlib doesn't seem to support partial decompressing (or streaming), so the code needs to allocate enough memory for both the compressed and uncompressed data. The compressed data is freed as soon as the decompression is done.
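To make the memory trade-off concrete, the sketch below contrasts one-shot and streaming decompression. It uses CPython's `zlib` module purely as an illustration; it is not uzlib's API, and the chunk and output limits are arbitrary assumptions.

```python
import zlib


def decompress_all_at_once(compressed):
    # One-shot: the whole compressed input and the whole output
    # are held in memory at the same time.
    return zlib.decompress(compressed)


def decompress_streaming(compressed, chunk_size=64, out_limit=256):
    """Yield decompressed pieces while feeding small input chunks."""
    d = zlib.decompressobj()
    for start in range(0, len(compressed), chunk_size):
        data = compressed[start:start + chunk_size]
        while data:
            # Ask for at most out_limit bytes of output per call.
            piece = d.decompress(data, out_limit)
            if piece:
                yield piece
            # Input that was not consumed because the limit was hit.
            data = d.unconsumed_tail
    tail = d.flush()
    if tail:
        yield tail
```

With the streaming form, a caller can process each piece as it arrives instead of allocating a buffer for the full uncompressed size up front.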

peterhinch · 2016-01-26T07:40:57Z

That looks very useful given the limited flash space on the Pyboard. If this is accepted, corresponding mods to rshell to copy/edit/delete files in a zipped directory would be good.

stinos · 2016-01-26T08:31:34Z

Just out of interest, have you checked other compression algorithms? I seem to remember last time I tested something like this, plain zip was almost the worst option both speed- and compression-wise. I did not compare the code size though.

peterhinch · 2016-01-26T13:08:34Z

@stinos zlib is part of the Python standard library, so there are interoperability benefits in implementing the same algorithm.

dhylands · 2016-01-31T08:07:07Z

Importing modules from a zipfile seems to be working now. I've also coded (but not yet tested) importing byte-code modules.
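For reference, this is roughly what importing from a zipfile looks like from the user's side in CPython, which provides the same behavior through its built-in zipimport support. It is a sketch under that assumption; the module name and archive name are hypothetical, and the MicroPython implementation in this PR may differ in detail.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip archive and import it via sys.path."""
    tmp_dir = tempfile.mkdtemp()
    zip_path = os.path.join(tmp_dir, "modules.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(module_name + ".py", source)
    # The archive itself goes on sys.path, just like a directory would.
    sys.path.insert(0, zip_path)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(zip_path)


mod = import_from_zip("zipdemo_greeting", "MESSAGE = 'hello from a zipfile'\n")
print(mod.MESSAGE)
```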

I added the ability to configure zipfile support and/or zipimport support. Size increases for stmhal:

2040 - zipimport alone
2536 - ZipFile alone
3072 - Both ZipFile and zipimport

I have disabled the tests on Windows for now. We can turn them on if the Windows port decides to enable the options by default.

dpgeorge · 2016-02-01T17:31:51Z

Thanks @dhylands. There is a lot of stuff here, some of which brings underlying structural changes, like the mp_file stuff; if we used that, it would make sense to convert all lexers to use it. So it's going to take some time to consider the approach you've taken.

What is the main reason for this? Is it because of a lack of flash space? Can you please describe the use case that prompted you to do this?

PR #1811 (frozen bytecode) might eliminate a lot of the need for zipfile import. If you are going to be compiling the firmware yourself anyway, then frozen bytecode is the optimal choice. It requires no filesystem overhead, no RAM for decompression or compilation, and the bytecode runs from flash (extra qstrs from your scripts are also in flash).

dhylands · 2016-02-01T17:50:44Z

Yeah - my primary use case is filling the flash storage, in my case on the Espruino Pico. There is no SD card available on this device to expand the space.

Using precompiled bytecode takes about half the space of using source code.
Using zip compressed precompiled bytecode takes up about half the space of that.

With the filesystem, there is about half a block (256 bytes) of space wasted per file stored on the filesystem. Using a zipfile the wasted space per file is much less.
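The "half a block" figure can be checked with a little arithmetic. The sketch below assumes 512-byte blocks (implied by half a block being 256 bytes); the file sizes are hypothetical.

```python
BLOCK_SIZE = 512  # assumed: "half a block (256 bytes)" implies 512-byte blocks


def slack_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes left unused in the last block occupied by a file."""
    return (-file_size) % block_size


def total_slack(file_sizes, block_size=BLOCK_SIZE):
    return sum(slack_bytes(size, block_size) for size in file_sizes)


# Hypothetical file sizes in bytes.
sizes = [100, 700, 1500, 2048]
print([slack_bytes(s) for s in sizes])  # [412, 324, 36, 0]
print(total_slack(sizes))               # 772

# Averaged over every possible remainder, the waste is about half a block.
average = sum(slack_bytes(s) for s in range(1, BLOCK_SIZE + 1)) / BLOCK_SIZE
print(average)                          # 255.5
```

Inside a zipfile the members are packed back to back, so this per-file rounding is paid only once, for the archive as a whole.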

Using frozen bytecode would help, although the Espruino Pico flash is very close to full (about 11k available). Reducing the filesystem would give back some of that space (there is one 64K block of which we're only using 16K, so removing that 16K from the FS gives back the whole 64K block of flash).

The disadvantage of using frozen bytecode is that you need to rewrite the entire image for each update.

So using zipfile import seemed like a reasonable tradeoff (2K to implement), and it doesn't require reflashing the firmware.

I did make zipimport and zipfile support completely configurable.

Once precompiled bytecode (.mpy files) lands, that buys me a bunch of space, and I can use that just as easily as using zipfiles.

pfalcon · 2016-02-01T21:08:47Z

Codebase updated to uzlib 1.2.2 with the fix. @dhylands, please rebase.

dmazzella · 2017-07-17T20:12:04Z

news on this?

dhylands · 2017-07-17T21:57:53Z

Enough has changed that this probably needs to be totally redone. For personal reasons, I haven't had a large enough block of time to do anything with this. I don't mind if this is closed; I can reopen it in the future if I get a chance to rework it.

klardotsh · 2018-10-01T19:34:37Z

I'm quite tempted to take a look into reviving this - I've got something like 106k of FROZEN_MPY going to my device at this point and the project still isn't complete - phew! And that's down from about 115k of raw Python scripts (with comments and la-dee-da). Compare that to: 16K for a .tar.gz of my source tree, and 24K for a .zip (created with tar cjvf blah.tar.gz mysrc and 7z a blah.zip mysrc, respectively).

I think there's still more than plenty of value here (it's certainly easier than my other alternative once the project gets big enough - having the "Python part" of the project run only on the PC, and ultimately flash a compiled C hex to the PyBoard/NRF target. I've already long since blown past the flashable-with-stock-mpconfigport size on one of my previously-target boards...).

pfalcon · 2018-10-01T21:23:00Z

@klardotsh: How is it going to help you? Do you have too much RAM to trade for flash? That's an unlikely situation for a typical "deeply embedded" board. It would help low-end Linux boards, yeah, those which have 32MB RAM and 4MB flash.

If your frozen bytecode takes too much space, make sure you compile it with the right optimization settings. And if you do, the next step is to look into removing some of the reflection information that is included, like method/kwarg names, which is only required for overdynamicity in Python, which is an optional, extra feature in MicroPython.

pfalcon · 2018-10-01T21:31:36Z

A few comments on this PR:

py: Add C API for reading from file or file-like objects

This particular commit was authored 2016-01-24, but since 2014-01-08 we have already had a C API for reading from file-like objects. It's called py/stream.c.

extmod: Add uzipfile

This is apparently useful, but the commit message should describe which subset of CPython's zipfile API it implements.

One caveat to be aware of is that uzlib doesn't seem to support partial decompressing (or streaming)

Ok, since about 2016-08-17 it supports it.

Note that there's an API change in upstream uzlib 2.9.xx (pre-3.0); I'm waiting for my other patches to be processed before working on upgrading moduzlib to it.

klardotsh · 2018-10-01T22:17:49Z

@pfalcon I seem to get no difference in output filesize on most files no matter what optimization levels I call mpy-cross with. That said, I can easily take a 20k mpy file down to 8k by throwing it in a zip file.

Even being able to zip/gzip individual modules would be fantastic - my project isn't super heavy on RAM, so a few files (especially ones mostly made of consts) being deflated at runtime is doable.
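As a rough illustration of deflating a single module at runtime, here is a sketch in CPython using `zlib` and `exec`. It works on source text rather than .mpy files, the helper names are hypothetical, and it is not the import mechanism proposed in this PR.

```python
import types
import zlib


def pack_module(source):
    """Compress module source text for storage."""
    return zlib.compress(source.encode("utf-8"), 9)


def load_packed_module(name, packed):
    """Decompress stored source and execute it into a new module object."""
    source = zlib.decompress(packed).decode("utf-8")
    module = types.ModuleType(name)
    exec(compile(source, name + ".py", "exec"), module.__dict__)
    return module


# A hypothetical module made mostly of constants.
consts_source = "".join("KEY_%d = %d\n" % (i, i) for i in range(200))
packed = pack_module(consts_source)
consts = load_packed_module("keymap_consts", packed)
print(len(consts_source), len(packed), consts.KEY_42)
```

The RAM cost is the point pfalcon raises: the decompressed source and the compiled module both live in RAM while loading, so this only pays off when flash is the tighter constraint.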

FWIW my target devices are currently a PyBoard and an NRF52840 dev board, so I've got the flash on these (what, 1MB or so?). Other devices I'd enjoy being able to throw this project on aren't as lucky - for example the Adafruit Feather nRF52832 has something like 256k, and some STM32 devices I'd like to port to I believe are 128k (meaning I'm pretty sure I'd have to shrink my project's code somehow - even if I rip the compiler and REPL out of MicroPython, there's no way I'm fitting uPy into... what, 20K that I'd have left in my current form?)

It makes me wonder a little how folks actually run full projects on MicroPython/CircuitPython boards that don't have ROM as large as the PyBoard's - GPIO SD cards? Clever hackery I haven't discovered yet?

peterhinch · 2018-10-02T09:37:41Z

@klardotsh There is official support for SD cards connected via an SPI interface.

dpgeorge · 2021-05-18T04:17:44Z

Closing due to inactivity, and because it requires a lot of rework.

dhylands force-pushed the zipfile branch 15 times, most recently from 3bd1bab to c22ef4d Compare January 25, 2016 06:37

dhylands force-pushed the zipfile branch 5 times, most recently from a06f15f to fee7f18 Compare January 31, 2016 07:54

dhylands added 3 commits February 1, 2016 13:40

py: Add C API for reading from file or file-like objects

faeb8e1

extmod: Add uzipfile

88cf919

This supports decompressing stored files, and if MICROPY_PY_ZLIB is enabled then DEFLATED files (the default compression that zip uses) can be decompressed.
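The difference between stored and DEFLATED members can be seen with CPython's `zipfile` module, used here only as an illustration and not as the uzipfile API; the payload and member name are hypothetical.

```python
import io
import zipfile


def member_sizes(payload, method):
    """Return (uncompressed size, size inside the archive) for one member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("module.py", payload)
    with zipfile.ZipFile(buf) as zf:
        info = zf.getinfo("module.py")
        return info.file_size, info.compress_size


payload = b"x = 1\n" * 500  # 3000 bytes of repetitive source
stored = member_sizes(payload, zipfile.ZIP_STORED)
deflated = member_sizes(payload, zipfile.ZIP_DEFLATED)
print(stored)    # stored members keep their original size
print(deflated)  # deflated members are smaller but need a decompressor
```

A reader that handles only stored members needs no decompressor at all, which is why DEFLATED support can sit behind a separate configuration option.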

py: Add support for importing from a zipfile

793324b

dhylands force-pushed the zipfile branch from fee7f18 to 793324b Compare February 1, 2016 21:40

pfalcon force-pushed the master branch 6 times, most recently from 9167980 to 1cc81ed Compare April 10, 2016 22:16

pfalcon force-pushed the master branch from 91ecff0 to 56e7ebf Compare January 28, 2017 09:08

klardotsh mentioned this pull request Oct 12, 2018

Target upstream CircuitPython KMKfw/kmk_firmware#52

Closed

3 tasks

andrewleech mentioned this pull request Dec 14, 2018

stm32/main: Allow freezing boot.py and main.py into firmware image #4348

Closed

nickzoic pushed a commit to nickzoic/micropython that referenced this pull request Apr 16, 2019

Merge pull request micropython#1797 from adafruit/deshipu-patch-1

b6f0b41

Enable builds for ugame10

dpgeorge closed this May 18, 2021
