[DONE] inter procedural optimisation
Posted: Wed Apr 17, 2013 11:02 am
by alex31
Hello,
the newly released gcc 4.7.3 toolchain from linaro supports inter procedural optimisation.
this technique can optimize embedded applications a lot, for example by allowing
functions to be inlined even when they are not in the same compilation unit, like
halIsCounterWithin (just an example; in this case it eliminates the function call overhead for precise timing).
this is enabled by the -flto switch.
it seems, from reading another forum, that some function declarations in the startup module need to be modified:
http://www.coocox.org/forum/topic.php?id=3002
is there any plan to make chibios compatible with inter procedural optimisation ?
thanks
Alexandre
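To make the cross-unit inlining idea concrete, here is a minimal sketch in C. The names counter_within and deadline_pending are hypothetical and this is not the ChibiOS implementation of halIsCounterWithin; the two functions are shown together only for brevity and are imagined to live in separate source files.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, NOT the ChibiOS implementation of halIsCounterWithin.
 * In a real project it would live in its own file, e.g. timing.c. */
bool counter_within(uint32_t now, uint32_t start, uint32_t end) {
    /* Unsigned subtraction keeps the comparison valid across counter wrap-around. */
    return (uint32_t)(now - start) < (uint32_t)(end - start);
}

/* Hypothetical caller, imagined in a second file, e.g. main.c.
 * Without LTO the compiler sees only a declaration of counter_within there
 * and has to emit a real call; with -flto it may inline the body instead. */
bool deadline_pending(uint32_t now) {
    return counter_within(now, 100u, 200u);
}
```

For the optimizer to see across files, -flto has to be passed both when compiling and when linking. Whether the call actually gets inlined is up to the compiler's heuristics.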
Re: inter procedural optimisation
Posted: Wed Apr 17, 2013 12:50 pm
by Giovanni
It should be compatible already, I tested it a while ago using 4.7.1 but the resulting code was actually slower than the version without LTO. I don't know if this is still the case with 4.7.3.
Giovanni
Re: inter procedural optimisation
Posted: Tue Jul 09, 2013 12:54 pm
by alex31
I wanted to test with the latest linaro (4.7.4 pre) toolchain, but got errors:
Linking build/ch.elf
`chThdExit' referenced in section `.text' of /tmp/ccgojDkz.ltrans7.ltrans.o: defined in discarded section `.text' of build/obj/chthreads.o (symbol from plugin)
collect2: error: ld returned 1 exit status
I use the ARMCM4-STM32F407-DISCOVERY demo, and have only changed this in the Makefile:
added the -flto switch to the USE_OPT and LD variables
USE_OPT = -flto -O2 -ggdb -fomit-frame-pointer -falign-functions=16
.
.
LD = $(TRGT)gcc -flto
it seems that this is not enough to build with lto. if you have already succeeded in compiling with lto, could you tell me how ?
Thanks
Alexandre
Re: inter procedural optimisation
Posted: Tue Jul 09, 2013 1:00 pm
by Giovanni
Try disabling the garbage collector in the Makefile.
Giovanni
Re: inter procedural optimisation
Posted: Tue Jul 09, 2013 1:13 pm
by alex31
Thanks for your fast answer !
I have tried that; unfortunately, it doesn't resolve the issue.
Alexandre
Re: inter procedural optimisation
Posted: Tue Jul 09, 2013 6:40 pm
by Giovanni
I see that problem too but so far I don't know a solution, I don't remember this from my previous attempt.
Giovanni
Re: inter procedural optimisation
Posted: Thu Jul 11, 2013 1:23 am
by szmodz
Add -u chThdExit -u _vectors to linker options.
You also need to modify _port_switch_from_isr in ARMCMx/chcore_v7m.c:
asm volatile ("_port_exit_from_isr:" : : : "memory");
needs to become
asm volatile(
".global _port_exit_from_isr\n\t"
".thumb_func\n\t"
".type _port_exit_from_isr, %function\n\t"
"_port_exit_from_isr:" : : : "memory");
This is the quick and dirty solution, and should get things running, but there are actually more subtle issues with the code.
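As a side note, GCC also has a source-level way to keep a symbol that the compiler cannot see any reference to: the `used` function attribute. The sketch below uses a hypothetical function name and only shows the mechanism. It is an assumption, not something confirmed in this thread, that marking the affected ChibiOS functions this way would be enough to replace the -u options.

```c
/* Hypothetical stand-in for a routine that is only reached from assembly.
 * The GCC "used" attribute asks the compiler to emit the function even if
 * no C-level reference to it is visible, which matters once LTO starts
 * discarding apparently dead code. "noinline" keeps it a separate symbol. */
__attribute__((used, noinline))
int exit_hook_stub(int code) {
    return code + 1;
}
```

The attribute is a GCC extension, so this only applies to GCC-compatible toolchains.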
Re: inter procedural optimisation
Posted: Thu Jul 11, 2013 8:19 am
by szmodz
Giovanni wrote: I see that problem too but so far I don't know a solution, I don't remember this from my previous attempt.
Here, I fixed it for you:
https://github.com/szmodz/ChibiOS/commits/fixes
I need all of those to get ChibiOS working reliably on a disco f4 across optimization settings (including, but not limited to LTO). I've been using these for a while, just cleaned up the commits, and pushed them onto a branch.
Re: inter procedural optimisation
Posted: Thu Jul 11, 2013 11:55 am
by Giovanni
Thanks for the fix.
So, is the code compiled using LTO any better than the normal code? Just to understand whether this thing is worth pursuing.
Giovanni
Re: inter procedural optimisation
Posted: Thu Jul 11, 2013 12:22 pm
by szmodz
Yes, it's both faster and smaller. But that's NOT the main reason why it's worth pursuing. The problem is not that LTO breaks the code, but that the code is already broken, LTO just exposes the problems. Sooner or later they will come out and bite you anyway.
Those are the sort of annoying problems where you're trying to track down a bug, make some small change, rebuild, and the problem seemingly goes away, not because it's fixed, but because the small change caused the compiler to generate slightly different code that masks the original problem.