GCC 10 Inlining behavior breaks RT test 12-7 Topic is solved
Posted: Sat Oct 24, 2020 9:50 pm
by mtrescott
GCC 10 enables inlining by default at the -O2 optimization level. For some reason, this causes RT test sequence 12-7 to go into an infinite loop when attempting to reset the semaphore. I'm not able to fully make sense of the disassembly, but poking around the variables in GDB, it looks like the priorities of the test threads all get reset to zero, so __sch_ready_behind() is never able to find a thread to put the new one in front of.
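As an illustration of that symptom, here is a minimal sketch of a priority-ordered insertion. It is a hypothetical, simplified model and not the ChibiOS source: rnode_t, ready_init and ready_behind are made-up names, and it assumes the list header carries priority 0 as a sentinel. Under those assumptions the scan stops at the header only when the new thread's priority is greater than zero, so a thread whose priority had been reset to zero would keep the scan going forever.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a priority-ordered ready list.
   This is NOT ChibiOS code; all names here are illustrative. */
typedef struct rnode {
  struct rnode *next;
  struct rnode *prev;
  unsigned prio;      /* assumption: the list header holds 0 as a sentinel */
} rnode_t;

/* Makes hp an empty circular list with sentinel priority 0. */
static void ready_init(rnode_t *hp) {
  hp->next = hp;
  hp->prev = hp;
  hp->prio = 0U;
}

/* Inserts tp behind every element of equal or higher priority. */
static void ready_behind(rnode_t *hp, rnode_t *tp) {
  rnode_t *cp = hp;

  /* The scan ends at the first element with lower priority. With
     tp->prio > 0 the header (prio 0) always ends it. With tp->prio == 0
     the condition is always true and the scan never ends. */
  do {
    cp = cp->next;
  } while (cp->prio >= tp->prio);

  /* Link tp just in front of cp. */
  tp->next = cp;
  tp->prev = cp->prev;
  tp->prev->next = tp;
  cp->prev = tp;
}
```

With non-zero priorities the list stays sorted from highest to lowest, and threads of equal priority keep their arrival order.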
The quick fix is to add -fno-inline-small-functions to the USE_OPT line in your Makefile. But I'm not sure what the correct fix would be, and whether this is a GCC bug or a ChibiOS bug. Hopefully someone can shed some light on this.
FWIW, I'm using GCC 10.2.1 and Newlib 3.3.0 on openSUSE Tumbleweed. But those are both packages I built myself because openSUSE doesn't have a complete set of libgcc and Newlib binaries for Thumb (openSUSE Bugzilla #1106014).
Edit: I should also note that disabling LTO masks the problem, but it's the inlining that really causes it (the compiler can't inline the functions without LTO). If I add -finline-functions to my USE_OPT with GCC 9 and ChibiStudio on Windows, the same problem comes up.
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Sat Oct 24, 2020 9:56 pm
by Giovanni
Hi,
Compilers from distro repositories are almost always broken; this is not limited to GCC 10. Let's try this again when ARM releases its "official" GCC 10, at the end of this quarter I think.
Giovanni
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Sat Oct 24, 2020 9:58 pm
by mtrescott
Hi Giovanni,
Wow, that was a fast response! Not sure if you saw my edit; enabling inlining on GCC 9 also triggers this problem.
Matthew
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Sun Oct 25, 2020 7:53 am
by Giovanni
Hi,
The test suite is executed correctly using the ARM-built GCC compiler; I suggest you give it a try. It is what I am using here (Linux Mint 18).
Giovanni
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Tue Oct 27, 2020 5:52 am
by mtrescott
I tried ChibiStudio (with the ARM toolchain) on Linux and you are right, the problem does not appear! However, if I use ChibiStudio on Windows (also with the ARM toolchain) and compile with -finline-functions, then this bug appears. I tried both GCC 9 and GCC 7; both of them trigger the problem when -finline-functions is enabled. I even uninstalled the arm-none-eabi toolchain I had installed in MinGW to make sure it wasn't interfering, with no success there either. I guess it's not urgent since there's a workaround, but it is slightly concerning.
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Tue Oct 27, 2020 8:24 am
by Giovanni
Hi,
Which one of the demos triggers the problem? So far I have not seen it. Please also look at the compiler identification string in the report to make sure it is using the right compiler and not something else in the path.
Giovanni
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Tue Oct 27, 2020 1:37 pm
by mtrescott
Hi,
The demo that triggers the problem is the RT-TM4C123G-LAUNCHPAD from contrib. The problem shows up whether I build against ChibiOS 20.03 or trunk (although in 20.03 the benchmarks were test suite #11 of course). I added -v to my compiler options to verify that it's using the ARM Embedded toolchain shipped with ChibiOS. I know an STM32 board would be the gold standard but I don't have one to test with unfortunately.
Matthew
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Mon Dec 14, 2020 3:37 pm
by Giovanni
Well, I just tested GCC 10 from ARM and it breaks 12-7 too; I don't know the cause yet.
Did somebody already look into this?
Giovanni
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Mon Dec 14, 2020 3:55 pm
by mtrescott
I never had a chance to really dive into it. All I can tell you is that it's not GCC 10 specific; it's just that GCC 10 enables inlining by default at the -O2 optimization level, and inlining causes this bug.
Re: GCC 10 Inlining behavior breaks RT test 12-7
Posted: Tue Dec 15, 2020 11:31 am
by Giovanni
Well, I found the cause but I have not yet decided on the proper fix. The problem is here:
Code:
void chSemResetWithMessageI(semaphore_t *sp, cnt_t n, msg_t msg) {
  chDbgCheckClassI();
  chDbgCheck((sp != NULL) && (n >= (cnt_t)0));
  chDbgAssert(((sp->cnt >= (cnt_t)0) && queue_isempty(&sp->queue)) ||
              ((sp->cnt < (cnt_t)0) && queue_notempty(&sp->queue)),
              "inconsistent semaphore");
  sp->cnt = n;
  while (queue_notempty(&sp->queue)) {
    chSchReadyI(queue_lifo_remove(&sp->queue))->u.rdymsg = msg;
  }
}
Probably because of an aliasing problem, the compiler checks queue_notempty() only once, and then the loop executes endlessly without checking the condition again. It does not "see" queue_lifo_remove() change the memory, so the problem is there.
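To make the aliasing concern concrete, here is a minimal sketch of the same remove-from-tail steps on a circular doubly linked list. It assumes a single hypothetical node type, node_t, shared by the queue header and its elements; node_t, queue_init, queue_insert_tail and drain are illustrative names, not ChibiOS API. The point it shows is that when the last element is removed, the final store lands on the header's own next field, which is exactly the field the emptiness check reads. In the code quoted in this thread that store goes through a thread_t * obtained by casting the threads_queue_t *, so a compiler applying type-based aliasing rules could plausibly assume the header is unchanged. That is one reading of the symptom, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified node type. Both the queue header and every
   element use it, so all accesses below go through the same type. */
typedef struct node {
  struct node *next;
  struct node *prev;
} node_t;

/* An empty queue points at itself in both directions. */
static void queue_init(node_t *qp) {
  qp->next = qp;
  qp->prev = qp;
}

static bool queue_notempty(const node_t *qp) {
  return qp->next != qp;
}

/* Appends np at the tail of the queue headed by qp. */
static void queue_insert_tail(node_t *np, node_t *qp) {
  np->next = qp;
  np->prev = qp->prev;
  np->prev->next = np;
  qp->prev = np;
}

/* Same three steps as queue_lifo_remove() in the post. */
static node_t *queue_lifo_remove(node_t *qp) {
  node_t *np = qp->prev;

  qp->prev = np->prev;
  /* When np was the only element, qp->prev is now qp itself, so this
     store writes qp->next, the field queue_notempty() reads. */
  qp->prev->next = qp;
  return np;
}

/* Removes every element and returns how many were removed. */
static int drain(node_t *qp) {
  int n = 0;

  while (queue_notempty(qp)) {
    (void)queue_lifo_remove(qp);
    n++;
  }
  return n;
}
```

Because the header and the elements share one type here, the compiler has no type-based reason to treat the store and the later load as unrelated, so the loop condition is re-evaluated on every pass.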
I added a memory barrier and the problem disappears:
Code:
static inline thread_t *queue_lifo_remove(threads_queue_t *tqp) {
  thread_t *tp = tqp->prev;
  tqp->prev = tp->queue.prev;
  tqp->prev->queue.next = (thread_t *)tqp;
  asm volatile ("" : : : "memory");
  return tp;
}
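For readers unfamiliar with the idiom, here is a small sketch of what that asm line does, using a hypothetical COMPILER_BARRIER() macro and a made-up link_t list (neither is ChibiOS API). In GCC and Clang, an empty asm statement with a "memory" clobber emits no machine instruction, but the compiler must assume that any memory may have changed across it, so values cached in registers are reloaded afterwards. It is a compiler-only barrier and not a CPU memory fence.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical macro name. GCC/Clang extended asm: no instruction is
   emitted, but the "memory" clobber tells the compiler that memory may
   have been modified, so it cannot reuse values loaded before this point. */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" : : : "memory")

/* Made-up list type used only for this illustration. */
typedef struct link {
  struct link *next;
  struct link *prev;
} link_t;

static bool list_notempty(const link_t *lp) {
  return lp->next != lp;
}

/* Unlinks and returns the last element of the list headed by lp. */
static link_t *list_remove_last(link_t *lp) {
  link_t *np = lp->prev;

  lp->prev = np->prev;
  lp->prev->next = lp;
  COMPILER_BARRIER();   /* forces lp->next to be re-read by the caller */
  return np;
}

/* Removes every element and returns how many were removed. */
static int list_drain(link_t *lp) {
  int n = 0;

  while (list_notempty(lp)) {
    (void)list_remove_last(lp);
    n++;
  }
  return n;
}
```

In this well-typed sketch the barrier changes nothing observable. In the quoted ChibiOS code it presumably only hides the effect of the type-punned store, and it blocks optimization across every call, which may be why it is not seen as a real fix.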
But, of course, this is not an acceptable fix. I am not sure if the problem is in that list handling code or in GCC itself.
Giovanni