GCC 10 Inlining behavior breaks RT test 12-7 Topic is solved

mtrescott
Posts: 8
Joined: Sun Oct 11, 2020 4:57 am
Location: Michigan, USA
Has thanked: 1 time
Been thanked: 1 time

Postby mtrescott » Sat Oct 24, 2020 9:50 pm

GCC 10 enables inlining by default at the -O2 optimization level. For some reason, this causes RT test sequence 12-7 to go into an infinite loop when attempting to reset the semaphore. I'm not able to fully make sense of the disassembly, but poking around the variables in GDB, it looks like the priorities of the test threads all get reset to zero, so __sch_ready_behind() is never able to find a thread to put the new one in front of.

The quick fix is to add -fno-inline-small-functions to the USE_OPT line in your Makefile. But I'm not sure what the correct fix would be, and whether this is a GCC bug or a ChibiOS bug. Hopefully someone can shed some light on this.

FWIW, I'm using GCC 10.2.1 and Newlib 3.3.0 on openSUSE Tumbleweed. But those are both packages I built myself because openSUSE doesn't have a complete set of libgcc and Newlib binaries for Thumb (openSUSE Bugzilla #1106014).

Edit: Should also note that disabling LTO masks the problem but it's the inlining that really does it (it can't inline the functions without LTO). If I add -finline-functions to my USE_OPT with GCC 9 and ChibiStudio on Windows, the same problem comes up.
Last edited by mtrescott on Sat Oct 24, 2020 9:57 pm, edited 1 time in total.

Giovanni
Site Admin
Posts: 14704
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1146 times
Been thanked: 960 times

Postby Giovanni » Sat Oct 24, 2020 9:56 pm

Hi,

Compilers from distro repositories are almost always broken, and this is not limited to GCC 10. Let's try this again when ARM releases its "official" GCC 10, at the end of this quarter I think.

Giovanni

Postby mtrescott » Sat Oct 24, 2020 9:58 pm

Hi Giovanni,

Wow, that was a fast response! Not sure if you saw my edit; enabling inlining on GCC 9 also triggers this problem.

Matthew

Postby Giovanni » Sun Oct 25, 2020 7:53 am

Hi,

The test suite is executed correctly using the ARM-built GCC compiler, I suggest you give it a try. It is what I am using here (Linux Mint 18).

Giovanni

Postby mtrescott » Tue Oct 27, 2020 5:52 am

I tried ChibiStudio (with the ARM toolchain) on Linux and you are right, the problem does not appear! However, if I use ChibiStudio on Windows (also with the ARM toolchain) and compile with -finline-functions, then this bug appears. I tried both GCC 9 and GCC 7; both of them trigger the problem when -finline-functions is enabled. I even uninstalled the arm-none-eabi toolchain I had installed in MinGW to make sure it wasn't interfering, but no success there either. I guess it's not urgent since there's a workaround, but it is slightly concerning.

Postby Giovanni » Tue Oct 27, 2020 8:24 am

Hi,

Which one of the demos triggers the problem? So far I have not seen it. Please also look at the compiler identification string in the report to make sure it is using the right compiler and not something else in the path.

Giovanni

Postby mtrescott » Tue Oct 27, 2020 1:37 pm

Hi,

The demo that triggers the problem is the RT-TM4C123G-LAUNCHPAD from contrib. The problem shows up whether I build against ChibiOS 20.03 or trunk (although in 20.03 the benchmarks were test suite #11 of course). I added -v to my compiler options to verify that it's using the ARM Embedded toolchain shipped with ChibiOS. I know an STM32 board would be the gold standard but I don't have one to test with unfortunately.

Matthew

Postby Giovanni » Mon Dec 14, 2020 3:37 pm

Well, I just tested GCC 10 from ARM and it breaks 12-7 too. I don't know the cause yet.

Did somebody already look into this?

Giovanni

Postby mtrescott » Mon Dec 14, 2020 3:55 pm

I never had a chance to really dive into it. All I can tell you is that it's not GCC 10 specific: GCC 10 just enables inlining by default at the -O2 optimization level, and inlining causes this bug.

Postby Giovanni » Tue Dec 15, 2020 11:31 am

Well, I found the cause, but I have not yet decided on the proper fix. The problem is here:

void chSemResetWithMessageI(semaphore_t *sp, cnt_t n, msg_t msg) {

  chDbgCheckClassI();
  chDbgCheck((sp != NULL) && (n >= (cnt_t)0));
  chDbgAssert(((sp->cnt >= (cnt_t)0) && queue_isempty(&sp->queue)) ||
              ((sp->cnt < (cnt_t)0) && queue_notempty(&sp->queue)),
              "inconsistent semaphore");

  sp->cnt = n;
  while (queue_notempty(&sp->queue)) {
    chSchReadyI(queue_lifo_remove(&sp->queue))->u.rdymsg = msg;
  }
}


Probably because of an aliasing problem, the compiler checks queue_notempty() only once, and then the loop executes endlessly without checking the condition again. It does not "see" that queue_lifo_remove() changes the memory, so the problem is there.
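
To make the "checked only once" behavior concrete, here is a minimal hand-written model of it. This is only a sketch: node_t, the list_* helpers, drain_rechecking(), drain_hoisted() and demo_drain() are hypothetical names invented for illustration, not ChibiOS code, and the hoisting is simulated by hand rather than produced by the compiler. The limit parameter exists only so that the demo terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular list node; header and elements share this type. */
typedef struct node {
  struct node *next;
  struct node *prev;
} node_t;

static void list_init(node_t *head) {
  head->next = head;
  head->prev = head;
}

static int list_notempty(const node_t *head) {
  return head->next != head;
}

/* Appends np at the tail of the list. */
static void list_append(node_t *head, node_t *np) {
  assert(head != NULL && np != NULL);
  np->next = head;
  np->prev = head->prev;
  np->prev->next = np;
  head->prev = np;
}

/* Removes and returns the tail node, same shape as queue_lifo_remove().
   On an empty list it is a no-op that returns the header itself. */
static node_t *list_lifo_remove(node_t *head) {
  node_t *np = head->prev;
  head->prev = np->prev;
  head->prev->next = head;
  return np;
}

/* Intended behavior: the condition is re-read on every iteration. */
int drain_rechecking(node_t *head) {
  int removed = 0;
  while (list_notempty(head)) {
    (void)list_lifo_remove(head);
    removed++;
  }
  return removed;
}

/* Model of the suspected miscompilation: the condition is evaluated once
   and never re-read, so only the artificial limit stops the loop. */
int drain_hoisted(node_t *head, int limit) {
  int iterations = 0;
  int notempty = list_notempty(head);   /* evaluated once, then reused */
  while (notempty && iterations < limit) {
    (void)list_lifo_remove(head);
    iterations++;
  }
  return iterations;
}

/* Builds a three-node list and drains it with the chosen loop. */
int demo_drain(int hoisted, int limit) {
  node_t head, a, b, c;
  list_init(&head);
  list_append(&head, &a);
  list_append(&head, &b);
  list_append(&head, &c);
  return hoisted ? drain_hoisted(&head, limit) : drain_rechecking(&head);
}
```

With three queued nodes, demo_drain(0, 1000) returns 3, while demo_drain(1, 1000) runs into the limit and returns 1000. Once the list is empty, the remove operation keeps handing back the header, which is consistent with the endless loop described above.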

I added a memory barrier and the problem disappears:

static inline thread_t *queue_lifo_remove(threads_queue_t *tqp) {
  thread_t *tp = tqp->prev;

  tqp->prev             = tp->queue.prev;
  tqp->prev->queue.next = (thread_t *)tqp;

  asm volatile ("" : : : "memory");

  return tp;
}


But, of course, this is not an acceptable fix. I am not sure whether the problem is in the list handling code or in GCC itself.
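
If the list handling code is the culprit, one direction worth exploring is to avoid the (thread_t *)tqp cast entirely, so that the header and the queued elements are linked through the same node type and the compiler can see that they may alias. The following is only a sketch under that assumption: qnode_t, demo_thread_t, the q_* helpers, reset_all() and demo_reset() are hypothetical names, not the ChibiOS API, and nothing in this thread confirms that this is the fix that should be adopted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link type shared by the queue header and by every element. */
typedef struct qnode {
  struct qnode *next;
  struct qnode *prev;
} qnode_t;

/* Hypothetical thread record; the link must stay the first member so that
   a pointer to it can be converted back to the enclosing structure. */
typedef struct demo_thread {
  qnode_t queue;
  int     rdymsg;
} demo_thread_t;

static void q_init(qnode_t *qp) {
  qp->next = qp;
  qp->prev = qp;
}

static int q_notempty(const qnode_t *qp) {
  return qp->next != qp;
}

/* Links tp at the tail; only qnode_t lvalues are ever written. */
static void q_insert_tail(qnode_t *qp, demo_thread_t *tp) {
  tp->queue.next = qp;
  tp->queue.prev = qp->prev;
  tp->queue.prev->next = &tp->queue;
  qp->prev = &tp->queue;
}

/* Removes the tail element. The header is never accessed as a thread. */
static demo_thread_t *q_lifo_remove(qnode_t *qp) {
  assert(q_notempty(qp));
  qnode_t *np = qp->prev;
  qp->prev = np->prev;
  qp->prev->next = qp;
  /* Valid conversion: np points at the first member of a demo_thread_t. */
  return (demo_thread_t *)np;
}

/* Same shape as the reset loop: hand msg to every waiter, return the count. */
int reset_all(qnode_t *qp, int msg) {
  int n = 0;
  while (q_notempty(qp)) {
    q_lifo_remove(qp)->rdymsg = msg;
    n++;
  }
  return n;
}

/* Queues two waiters, resets, and returns 1 if both received msg. */
int demo_reset(int msg) {
  qnode_t q;
  demo_thread_t t1 = { .rdymsg = 0 };
  demo_thread_t t2 = { .rdymsg = 0 };
  q_init(&q);
  q_insert_tail(&q, &t1);
  q_insert_tail(&q, &t2);
  int n = reset_all(&q, msg);
  return n == 2 && t1.rdymsg == msg && t2.rdymsg == msg && !q_notempty(&q);
}
```

Every store that changes the list here goes through a qnode_t, which is also the type read by q_notempty(), so no barrier is needed for the loop condition to be re-evaluated. Whether the same restructuring is practical in the real kernel data structures is a separate question.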

Giovanni

