Getting stuck with tickless mode

kulve
Posts: 86
Joined: Mon Nov 28, 2011 8:36 pm
Location: Finland

Re: Getting stuck with tickless mode

Postby kulve » Sun Mar 29, 2015 3:16 pm

I haven't experienced the hang with the assert commented out.

In one of the assert cases in chVTIsTimeWithinX():

start == 141132
end == 141187
time == 141232
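
For readers following along, here is a minimal sketch of the kind of wrap-safe window check that the assertion relies on. It assumes a 32-bit `systime_t` and a half-open window `[start, end)`; the function name `time_is_within` is made up for this illustration and is not the ChibiOS source. With the values above, `time` lies 100 ticks after `start` while the window is only 55 ticks wide, so the check fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 32-bit free-running tick counter. */
typedef uint32_t systime_t;

/* Hypothetical helper: true when `time` lies in the half-open window
   [start, end). Unsigned modular subtraction keeps the comparison valid
   across a counter wrap. */
static bool time_is_within(systime_t time, systime_t start, systime_t end) {
  return (systime_t)(time - start) < (systime_t)(end - start);
}
```

Because both sides are distances measured from `start`, the same comparison still works when `end` has wrapped past zero and is numerically smaller than `start`.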

steved
Posts: 834
Joined: Fri Nov 09, 2012 2:22 pm
Has thanked: 12 times
Been thanked: 138 times

Re: Getting stuck with tickless mode

Postby steved » Sun Mar 29, 2015 3:26 pm

Just to add that I'm currently seeing a similar problem, as mentioned in another thread. I'm trying to debug it, and have reduced the code to a single serial port active, with DELTA=5. I'm also consistently seeing that assert in line 480 triggered when the system halts. I'll comment it out now and see what happens - crashes usually occur every 20-150 minutes.
One interesting point - normally I use the serial port UART driver interrupt-driven (32F051, using the DMA for I2C) with character by character receive. I changed back to 'normal' DMA use, and the system certainly didn't crash over a fairly long test period. That doesn't make a lot of sense! On the receive side, the interrupt-driven handler is a little more efficient than the DMA-driven one, since it eliminates the overhead of programming the DMA. On the transmit side, there's very little code executing in the ISR, and I'm only running at 9600 baud. I have today seen some indications of possible strange behaviour from the USART_ISR_TC bit (which I use to trigger a semaphore) which I am investigating.

Giovanni
Site Admin
Posts: 14704
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1146 times
Been thanked: 960 times

Re: Getting stuck with tickless mode

Postby Giovanni » Sun Mar 29, 2015 3:49 pm

I added that assertion after your reports, steved. Now I have to understand under which conditions that erroneous time window can occur.

Giovanni

kulve
Posts: 86
Joined: Mon Nov 28, 2011 8:36 pm
Location: Finland

Re: Getting stuck with tickless mode

Postby kulve » Sun Mar 29, 2015 4:07 pm

I'm happy to add any debug prints or checks you need, if it helps.

steved
Posts: 834
Joined: Fri Nov 09, 2012 2:22 pm
Has thanked: 12 times
Been thanked: 138 times

Re: Getting stuck with tickless mode

Postby steved » Sun Mar 29, 2015 11:11 pm

Giovanni wrote: I added that assertion after your reports, steved. Now I have to understand under which conditions that erroneous time window can occur.

Giovanni

Nearly eight hours of operation without a crash once I removed that assertion in line 480. I'll leave it on overnight.

Would it be more helpful to restore the assertion, then note any variable values or anything? (I have done some digging around the code to be confident that things are set up as expected. Have also looked for any callbacks or other system lock areas which might persist for an excessive time; nothing found so far).

Giovanni
Site Admin
Posts: 14704
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1146 times
Been thanked: 960 times

Re: Getting stuck with tickless mode

Postby Giovanni » Mon Mar 30, 2015 8:08 pm

A small update.

I am able to reproduce the problem with a couple of timers.

Apparently that assertion is triggered by a spurious interrupt associated with a timer that has already been reset. For some reason the interrupt remains pending and fires as soon as the critical zone is left.

Removing the assertion is safe because the code that follows simply ignores the interrupt. I am looking into the root cause, but this is very hard to debug. Tomorrow I will try enabling the timer freeze during debug, which should make things easier.

Giovanni

szmodz
Posts: 11
Joined: Thu Jul 11, 2013 1:07 am
Been thanked: 1 time

Re: Getting stuck with tickless mode

Postby szmodz » Mon Mar 30, 2015 9:03 pm

Giovanni wrote: Apparently that assertion is triggered by a spurious interrupt associated with a timer that has already been reset. For some reason the interrupt remains pending and fires as soon as the critical zone is left.


Disabling the timer interrupt through DIER won't clear the pending status if the interrupt is already pending. Perhaps st_lld_stop_alarm should also call NVIC_ClearPendingIRQ. Just a guess though.

Giovanni
Site Admin
Posts: 14704
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1146 times
Been thanked: 960 times

Re: Getting stuck with tickless mode

Postby Giovanni » Tue Mar 31, 2015 8:47 am

szmodz wrote:
Giovanni wrote: Apparently that assertion is triggered by a spurious interrupt associated with a timer that has already been reset. For some reason the interrupt remains pending and fires as soon as the critical zone is left.


Disabling the timer interrupt through DIER won't clear the pending status if the interrupt is already pending. Perhaps st_lld_stop_alarm should also call NVIC_ClearPendingIRQ. Just a guess though.


You nailed it :)

TIM interrupts are edge sensitive (this is not documented) and remain latched even if the SR register is cleared. With that taken into account, my test is now working.

Now I have to decide whether to discard one interrupt once in a while (what I am doing now) or to reset the NVIC each time a virtual timer is reprogrammed. Probably the former is slower, but I don't like the idea of having undesired interrupts.

Giovanni
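
To make the latching behaviour described in this post easier to picture, here is a small software model. It is only a sketch under stated assumptions: it is not STM32 or CMSIS code, every name in it (`irq_model_t`, `model_event`, and so on) is hypothetical, and it simply encodes the behaviour reported in this thread, namely that a pending interrupt stays latched after the peripheral flag and enable bit are cleared. The two stop functions correspond to the two options being weighed: tolerate and discard the stale interrupt, or also clear the pending state (which is what a call such as `NVIC_ClearPendingIRQ` would do on real hardware).

```c
#include <stdbool.h>

/* Hypothetical software model; not real hardware registers. */
typedef struct {
  bool sr_flag;       /* models a timer status flag (an SR bit)          */
  bool dier_enable;   /* models the timer interrupt enable (a DIER bit)  */
  bool nvic_pending;  /* models the latched pending state in the NVIC    */
} irq_model_t;

/* A compare event sets the flag and, if enabled, latches the interrupt. */
static void model_event(irq_model_t *m) {
  m->sr_flag = true;
  if (m->dier_enable) {
    m->nvic_pending = true;
  }
}

/* Stops the alarm at the peripheral only; the latch is left untouched. */
static void model_stop_alarm(irq_model_t *m) {
  m->dier_enable = false;
  m->sr_flag = false;
}

/* Stops the alarm and also clears the latched pending state. */
static void model_stop_alarm_and_clear(irq_model_t *m) {
  model_stop_alarm(m);
  m->nvic_pending = false;
}

/* Leaving the critical zone: reports whether the handler would run. */
static bool model_take_irq(irq_model_t *m) {
  bool taken = m->nvic_pending;
  m->nvic_pending = false;
  return taken;
}

/* Inside the handler: a stale interrupt finds no flag set and no alarm
   armed, so it can be ignored. */
static bool model_irq_is_spurious(const irq_model_t *m) {
  return !m->sr_flag && !m->dier_enable;
}
```

In this model, an event followed by `model_stop_alarm` still yields one handler entry that `model_irq_is_spurious` identifies as stale, while `model_stop_alarm_and_clear` yields none.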

szmodz
Posts: 11
Joined: Thu Jul 11, 2013 1:07 am
Been thanked: 1 time

Re: Getting stuck with tickless mode

Postby szmodz » Tue Mar 31, 2015 9:45 am

I think most, if not all, peripherals work this way. UARTs and SPIs sure do.

http://www.st.com/st-web-ui/static/acti ... 256689.pdf

Check out 1.3.1 on page 5 for an example. But yeah, the docs should make this more clear.

There is no need to clear the RXNE flag in the receive interrupt routine, as it was
automatically cleared by the DMA data read operation. However, the interrupt remains
pending in the NVIC (even though the RXNE flag is no longer set).

Giovanni
Site Admin
Posts: 14704
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1146 times
Been thanked: 960 times

Re: Getting stuck with tickless mode

Postby Giovanni » Tue Mar 31, 2015 5:08 pm

aaaaand in addition, we hit a compiler bug....

void chVTDoResetI(virtual_timer_t *vtp) {
#if CH_CFG_ST_TIMEDELTA > 0
  virtual_timer_t *first;
#endif

  chDbgCheckClassI();
  chDbgCheck(vtp != NULL);
  chDbgAssert(vtp->vt_func != NULL, "timer not set or already triggered");

  /* Checking if the element to be removed was the first in the list.*/
#if CH_CFG_ST_TIMEDELTA > 0
  first = ch.vtlist.vt_next;
#endif

  asm ("" : : : "memory");

  /* Removing the element from the delta list.*/
  vtp->vt_next->vt_delta += vtp->vt_delta;
  vtp->vt_prev->vt_next = vtp->vt_next;
  vtp->vt_next->vt_prev = vtp->vt_prev;
  vtp->vt_func = NULL;

  /* The above code changes the value in the header when the removed element
     is the last of the list, restoring it.*/
  ch.vtlist.vt_delta = (systime_t)-1;

#if CH_CFG_ST_TIMEDELTA > 0
  {
    systime_t nowdelta, delta;

    /* Just removed the last element in the list, alarm timer stopped and
       return.*/
    if (&ch.vtlist == (virtual_timers_list_t *)ch.vtlist.vt_next) {
      port_timer_stop_alarm();
      return;
    }

    /* If the removed element was not the first one then just return, the
       alarm is already set to the first element.*/
    if (vtp != first) {
      return;
    }

    /* If the new first element has a delta of zero then the alarm is not
       modified, the already programmed alarm will serve it.*/
    if (ch.vtlist.vt_next->vt_delta == 0) {
      return;
    }

    /* Distance in ticks between the last alarm event and current time.*/
    nowdelta = chVTGetSystemTimeX() - ch.vtlist.vt_lasttime;

    /* If the current time surpassed the time of the next element in list
       then the event interrupt is already pending, just return.*/
    if (nowdelta >= ch.vtlist.vt_next->vt_delta) {
      return;
    }

    /* Distance from the next scheduled event and now.*/
    delta = ch.vtlist.vt_next->vt_delta - nowdelta;

    /* Making sure to not schedule an event closer than CH_CFG_ST_TIMEDELTA
       ticks from now.*/
    if (delta < (systime_t)CH_CFG_ST_TIMEDELTA) {
      delta = (systime_t)CH_CFG_ST_TIMEDELTA;
    }

    port_timer_set_alarm(ch.vtlist.vt_lasttime + nowdelta + delta);
  }
#endif /* CH_CFG_ST_TIMEDELTA > 0 */
}


Note the "asm ("" : : : "memory");" line. Without that barrier the compiler assumes that ch.vtlist.vt_next is invariant throughout the function, whereas the following group of statements does modify it if the removed element is the first in the list:

  /* Removing the element from the delta list.*/
  vtp->vt_next->vt_delta += vtp->vt_delta;
  vtp->vt_prev->vt_next = vtp->vt_next;
  vtp->vt_next->vt_prev = vtp->vt_prev;


Now the question is... how to prevent the compiler from being too smart?

Giovanni
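
The thread does not settle that question, so the following is only a sketch of one possible direction, not the fix adopted in ChibiOS. It assumes (this is not confirmed above) that the compiler's reasoning comes from type-based alias analysis: the list header and the timer nodes are different struct types, so a store through a node pointer is assumed not to touch the header. In the sketch the header is a sentinel of the same type as the nodes, so such stores may legally alias it. All names (`vt_node_t`, `vt_list_remove`, `COMPILER_BARRIER`) are hypothetical, and the barrier macro uses the GCC/Clang inline-assembly syntax from the code above.

```c
#include <stdint.h>

/* Hypothetical node type. The list header reuses it as a sentinel, so a
   store through any node pointer may legally modify the header. */
typedef struct vt_node {
  struct vt_node *next;
  struct vt_node *prev;
  uint32_t        delta;
} vt_node_t;

/* GCC/Clang compiler barrier: forces memory to be re-read afterwards. */
#define COMPILER_BARRIER() __asm__ volatile ("" : : : "memory")

/* Empty list: the sentinel points at itself. */
static void vt_list_init(vt_node_t *head) {
  head->next  = head;
  head->prev  = head;
  head->delta = (uint32_t)-1;
}

/* Simplified append at the tail; the caller supplies the delta. */
static void vt_list_append(vt_node_t *head, vt_node_t *n, uint32_t delta) {
  n->delta = delta;
  n->next  = head;
  n->prev  = head->prev;
  head->prev->next = n;
  head->prev = n;
}

/* Unlinks n and returns 1 if it was the first element, 0 otherwise. */
static int vt_list_remove(vt_node_t *head, vt_node_t *n) {
  vt_node_t *first = head->next;   /* captured before the list changes  */

  COMPILER_BARRIER();              /* kept as an explicit safety net    */

  n->next->delta += n->delta;      /* hand the delta to the successor   */
  n->prev->next = n->next;         /* may write head->next              */
  n->next->prev = n->prev;

  /* If n was the last element the line above modified the sentinel's
     delta, so restore it. */
  head->delta = (uint32_t)-1;

  return n == first;
}
```

After `vt_list_remove` returns, reading `head->next` gives the new first element. Other commonly used options are building with `-fno-strict-aliasing` or keeping the explicit barrier; which is preferable here is a judgment call.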

