-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Changes to allow different compiler optimizations per board #3190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is probably making it too complicated, but I wonder if it would even make sense to compile different objects with different optimization levels, so even if there isn't enough room to optimize everything, you can have the most vital parts optimized... |
ifdef OPTIMIZATION_LEVEL | ||
CFLAGS += -O$(OPTIMIZATION_LEVEL) | ||
endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't the CFLAGS from lines 90, 97 and 104 interfere with this?
Note that in |
@@ -15,3 +15,4 @@ LONGINT_IMPL = MPZ | |||
CIRCUITPY_AUDIOBUSIO = 0 | |||
|
|||
CIRCUITPY_BITBANG_APA102 = 1 | |||
OPTIMIZATION_LEVEL = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding this flag for individual boards, I think it would make sense to do -O2 by default for all SAMD51 and nRF52 Express boards. There may be space problems on non-Express boards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhalbert See comment below. I think we will need to implement this on a board-by-board basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can always use INTERNAL_FLASH_FILESYSTEM = 1 as a guide. There is at least one board (sam32) that has that set but has 1M of flash (i.e. plenty for the 10% size increase). And,as future chips may well fall into that same category. I would consider it safer to have a separate flag. Also, the non_SAMD chips need to be considered as well.
Thank you for doing all this testing! We have not pursued this, and it's really nice someone is. |
I tried compiling with -O2 for all SAMD51 and SAMD54 boards, but there were
three boards
(loc_ber_m4_base_board, pewpew_m4, and one other I don't remember) that did
not have sufficient
flash. Also, I am hesitant to make changes for a board I have not
tested.So, making it a global change
was not something I thought would be a good idea.
…On Wed, Jul 22, 2020 at 6:12 PM Dan Halbert ***@***.***> wrote:
Note that in py/py.mk, gc.o and vm.o will be compiled -O3 if SUPEROPT_GC
or SUPEROPT_VM are set to 1 (which is the default). So could you make
sure that this change is not overriding that higher optimization level?
Thanks.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3190 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFNJKESGFOG6QUEN533V6C3R45W43ANCNFSM4PFFPCFQ>
.
|
@dhalbert - Because gcc only pays attention to the last -O option, the -O3
for gc.o and vm.o will still be in effect.
From the (very long) compile line:
...
-DCFG_TUD_MIDI_TX_BUFSIZE=128 -DCFG_TUD_CDC_TX_BUFSIZE=256
-DCFG_TUD_MSC_BUFSIZE=1024 -O2 -flto -flto-partition=none
...
-O3 -c -MD -o build-pyportal/py/vm.o ../../py/vm.c
…On Wed, Jul 22, 2020 at 6:12 PM Dan Halbert ***@***.***> wrote:
Note that in py/py.mk, gc.o and vm.o will be compiled -O3 if SUPEROPT_GC
or SUPEROPT_VM are set to 1 (which is the default). So could you make
sure that this change is not overriding that higher optimization level?
Thanks.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3190 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFNJKESGFOG6QUEN533V6C3R45W43ANCNFSM4PFFPCFQ>
.
|
*@deshipu* - gcc only uses the last -O option on the command line. So, the
-Os higher up will be overridden
by the -O2 farther along.
…On Wed, Jul 22, 2020 at 6:03 PM Radomir Dopieralski < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In ports/atmel-samd/Makefile
<#3190 (comment)>
:
> @@ -106,6 +106,11 @@ CFLAGS += -Os -DNDEBUG
CFLAGS += -DCFG_TUSB_MCU=OPT_MCU_SAMD51 -DCFG_TUD_MIDI_RX_BUFSIZE=128 -DCFG_TUD_CDC_RX_BUFSIZE=256 -DCFG_TUD_MIDI_TX_BUFSIZE=128 -DCFG_TUD_CDC_TX_BUFSIZE=256 -DCFG_TUD_MSC_BUFSIZE=1024
endif
+# option to override compiler optimization level, set in boards/$(BOARD)/mpconfigboard.mk
+ifdef OPTIMIZATION_LEVEL
+CFLAGS += -O$(OPTIMIZATION_LEVEL)
+endif
+
Don't the CFLAGS from lines 90, 97 and 104 interfere with this?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3190 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFNJKETDHLNR2QC74RQIHXDR45V3JANCNFSM4PFFPCFQ>
.
|
These boards all have I have tried in the past to minimize the number of board-specific settings in |
Thanks, I was not sure the new setting would be the last one, since it depends on the order of applying The Also note in
|
That is what I am trying to avoid. I would like the defaults for a particular chip or chip family to be the most common case, and then the boards that don't work can be overriden in I did a big refactoring of the board files several months ago and was able to reduce their sizes significantly by choosing reasonable defaults for various options, instead of including a large amount of boilerplate in each file. I am trying to avoid creeping back to that. It makes it easier to add a new board when the number of required options is small.
I wouldn't worry about not being able to test the individual boards in person. The thing to do is to make the change, submit the PR, see which ones fail, and fix just those boards. These would be the general defaults, and then the A couple of other things:
ifeq ($(CHIP_FAMILY), samd21)
...
CFLAGS += -Os -DNDEBUG
...
endif
ifeq ($(CHIP_FAMILY), samd51)
...
CFLAGS += -Os -DNDEBUG
...
endif
ifeq ($(CHIP_FAMILY), same54)
...
CFLAGS += -Os -DNDEBUG
...
endif I would change the above to something like the below. Note the use of ifeq ($(CHIP_FAMILY), samd21)
...
OPTIMIZATION_FLAGS ?= -Os
...
endif
ifeq ($(CHIP_FAMILY), samd51)
...
# boards without external flash may need -Os
OPTIMIZATION_FLAGS ?= -O2
...
endif
ifeq ($(CHIP_FAMILY), same54)
...
OPTIMIZATION_FLAGS ?= -O2
...
endif
CFLAGS += $(OPTIMIZATION_FLAGS) -DNDEBUG After the above, all the SAMD51 boards will have |
@dhalbert - I see your point. I was just unsure about making changes for boards I have not tested. I did try using -O3 on a couple of boards and saw problems (like not booting properly). This made me cautious about increasing the optimization level globally. So, I will change the defaults in the Makefiles and then let them be overridden if needed by the boards mpconfigboard.mk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! pewpew_m4
is failing; it needs -Os
, I assume
One interesting thing which doesn't need to be done in this PR is to do the -Os
only for particular language builds that don't fit.
@@ -14,3 +14,5 @@ EXTERNAL_FLASH_DEVICES = "GD25Q16C" | |||
# We use a CFLAGS define here because there are include order issues | |||
# if we try to include "mpconfigport.h" into nrfx_config.h . | |||
CFLAGS += -DCIRCUITPY_NRF_NUM_I2C=2 | |||
|
|||
OPTIMIZATION_LEVEL = 2 |
There was a problem hiding this comment.
The reason will be displayed to describe this comment to others. Learn more.
I believe this is left-over from the first version of this PR and can be removed.
@@ -15,3 +15,5 @@ LD_COMMON = boards/common_default.ld | |||
LD_DEFAULT = boards/STM32F405_default.ld | |||
LD_BOOT = boards/STM32F405_boot.ld # UF2 boot option | |||
UF2_OFFSET = 0x8010000 | |||
|
|||
OPTIMIZATION_LEVEL = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, pewpew_m4 is still failing.
update from main
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. I am surprised more boards didn't fail: more than I expected have room for `-O2', which is good.
The pre-commit check is stuck, but it was fine before. Never mind. |
While working on some performance issues, it was found that using the -O2 option instead of -Os on boards with sufficient flash memory can greatly help performance.
Connie & I did some testing on the boards we have available. The tests we ran were:
benchmark
sprite.py (attached in zip file)
dispio_test.py (attached in zip file
Here is a table of our results showing the improvement pct. using -O2
The option has been implemented by adding a line:
OPTIMIZATION_LEVEL =
to the boards/$BOARD/mpconfigboard.mk file
and putting code into the various Makefiles to add that value to CFLAGS if present. This allows boards
for which no change is desired to build exactly as before. The modification is taking advantage
of the fact that if there are multiple optimization (-On) parameters given to gcc the last one wins.
test_scripts.zip
)