Enable link-time optimization for nrf targets by jepler · Pull Request #2031 · adafruit/circuitpython

Enable link-time optimization for nrf targets #2031


Merged: 2 commits, Aug 18, 2019


jepler commented Aug 3, 2019

The good: It gets back about 60kB of flash space and increases performance (pystone) by about +14%

The bad: It adds about 12-16 seconds to each target built

The ugly: This is exactly the same thing tried for #1396 by others, and it's not clear why it seems to be working for me today.

See the commit message for further info.

jepler added 2 commits August 2, 2019 07:53
Testing performed: installed freshly built .uf2 on a Particle Xenon.
Checked that circuitpython still starts.
Checked that the sizes of all .uf2 files for nrf builds are plausible.

Aside from memory savings, the performance of Python code (pystone)
increased by about +14%.

However, this adds about 12-16 seconds to each nrf build.

Timings & Sizes (build system: i5-3320M, -j5 parallelism on 4 threads):

Before:
$ make -j5 BOARD=particle_xenon
765004 bytes free in flash out of 1048576 bytes ( 1024.0 kb ).
232076 bytes free in ram for stack out of 245760 bytes ( 240.0 kb ).
68.54user 11.83system 0:34.34elapsed 234%CPU
pystones before: 570

After:
$ make -j5 BOARD=particle_xenon
804284 bytes free in flash out of 1048576 bytes ( 1024.0 kb ).
232072 bytes free in ram for stack out of 245760 bytes ( 240.0 kb ).
71.06user 11.77system 0:46.91elapsed 176%CPU
pystones after: 650

Timings on travis:

Before:
Build feather_nrf52840_express for pl took 55.79s and succeeded
Build feather_nrf52840_express for zh_Latn_pinyin took 3.18s and succeeded

After:
Build feather_nrf52840_express for pl took 62.72s and succeeded
Build feather_nrf52840_express for zh_Latn_pinyin took 19.10s

Closes: adafruit#1396
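
The deltas implied by the commit message can be recomputed directly from the quoted figures. The sketch below is only an illustration: the variable names are made up here, and the inputs are the numbers copied from the build output and Travis log lines above. For this one board the flash figure works out to 39280 bytes (about 38.4 KiB), which is lower than the "about 60kB" in the description and closer to the roughly 42KB reported in review, so the saving presumably varies by board and build.

```python
# Figures copied from the commit message (Particle Xenon, local build).
before = {"flash_free": 765004, "elapsed_s": 34.34, "pystones": 570}
after = {"flash_free": 804284, "elapsed_s": 46.91, "pystones": 650}

# Travis per-language build times in seconds, also from the commit message.
travis_before = {"pl": 55.79, "zh_Latn_pinyin": 3.18}
travis_after = {"pl": 62.72, "zh_Latn_pinyin": 19.10}

# Extra free flash after enabling LTO, in bytes.
flash_saved = after["flash_free"] - before["flash_free"]

# Extra wall-clock build time on the local machine, in seconds.
extra_build_s = round(after["elapsed_s"] - before["elapsed_s"], 2)

# Relative pystone improvement, in percent.
pystone_gain_pct = 100 * (after["pystones"] - before["pystones"]) / before["pystones"]

# Extra Travis build time per language, in seconds.
travis_extra_s = {
    lang: round(travis_after[lang] - travis_before[lang], 2)
    for lang in travis_before
}

print(f"flash recovered: {flash_saved} bytes ({flash_saved / 1024:.1f} KiB)")
print(f"extra local build time: {extra_build_s:.2f} s")
print(f"pystone gain: {pystone_gain_pct:.1f} %")
for lang, extra in travis_extra_s.items():
    print(f"extra Travis time for {lang}: {extra:.2f} s")
```

The local build and the zh_Latn_pinyin Travis build bracket the "12-16 seconds" quoted in the description; the pl build shows a smaller increase.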
@jepler jepler requested a review from dhalbert August 3, 2019 02:06
jepler (Author) commented Aug 3, 2019

Travis overall times changed from "4 hrs 11 min 5 sec" (https://travis-ci.com/adafruit/circuitpython/builds/121819759) to "4 hrs 38 min 26 sec" (https://travis-ci.com/adafruit/circuitpython/builds/121828350). In addition, because job # 1 is now badly imbalanced, the total wait time increased from 53 min 51 sec to 1 hr 21 min 13 sec.
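
As a quick check on the Travis figures in this comment, the two increases can be computed with `datetime.timedelta`. This is a minimal sketch using only the durations quoted above; the variable names are illustrative.

```python
from datetime import timedelta

# Durations quoted in the comment above.
total_before = timedelta(hours=4, minutes=11, seconds=5)
total_after = timedelta(hours=4, minutes=38, seconds=26)
wait_before = timedelta(minutes=53, seconds=51)
wait_after = timedelta(hours=1, minutes=21, seconds=13)

# Absolute increases in total job time and in wall-clock wait time.
total_increase = total_after - total_before
wait_increase = wait_after - wait_before

# Relative increases, in percent of the "before" duration.
total_pct = 100 * total_increase / total_before
wait_pct = 100 * wait_increase / wait_before

print(f"total job time: +{total_increase} ({total_pct:.1f} %)")
print(f"wait time:      +{wait_increase} ({wait_pct:.1f} %)")
```

Both increases are about 27 minutes in absolute terms, but the wait time grows far more in relative terms, which is what the job imbalance remark is pointing at.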

tannewt (Member) commented Aug 6, 2019

I'll leave it up to @dhalbert

dhalbert (Collaborator) left a comment

I tested this and it works with the RGB BLE example (server and client: two boards). I am seeing about 42KB of extra space.

It's possible the -flto-partition=none is the key, but I don't know. If it stops working again we can always turn it off again.

Thanks for being persistent in experimenting with this.

@dhalbert dhalbert merged commit 3f7321a into adafruit:master Aug 18, 2019
dhalbert (Collaborator) commented

I will look at the build times and re-balance the Travis jobs after this finishes, and then submit a PR for that. @tannewt is working on GitHub actions to use for CI instead of Travis, but until that is done we could increase the number of jobs to 6 or 7 for now.

@jepler jepler deleted the nrf-lto branch November 3, 2021 21:10
3 participants