Threads are dying, I cannot find the issue :(

Postby **Giovanni** » Mon Mar 02, 2015 3:08 pm

It must be a timing-dependent issue; the compiler only hides the problem.

Do you have interrupts at a priority higher than kernel level? If so, are those "clean" of calls to the RTOS? Are you using the correct declarations for fast ISRs?

Giovanni

Postby **russian** » Mon Mar 02, 2015 4:26 pm

Which one would be the kernel priority you are referring to? Would that be CORTEX_PRIORITY_SYSTICK? Mine is set to 6 and I do have some IRQs with higher priorities.

Which RTOS calls are prohibited from higher priorities? For example, can I invoke chSysLockFromIsr from priorities higher than CORTEX_PRIORITY_SYSTICK?

Is there a way to validate this contract programmatically? All IRQ handlers start with CH_IRQ_PROLOGUE(). Suppose I add the priority of each IRQ as a parameter and pass it to dbg_check_enter_isr. Then dbg_check_enter_isr could maintain a stack of nested IRQ priorities, and the RTOS calls in question could validate the top of this stack. Does this sound like a possible design for a check of that priority constraint?
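A rough host-side sketch of the priority-stack idea, just to make it concrete. Everything here is hypothetical: isr_enter, isr_leave, rtos_call_allowed, MAX_IRQ_NESTING and KERNEL_PRIORITY_LIMIT (with its placeholder value) are names invented for illustration and are not part of ChibiOS. It assumes Cortex-M numbering, where a numerically lower priority is more urgent.

```c
#include <stdbool.h>

#define MAX_IRQ_NESTING       16  /* hypothetical nesting bound */
#define KERNEL_PRIORITY_LIMIT 2   /* placeholder: most urgent priority still allowed to call the RTOS */

static volatile int irq_priority_stack[MAX_IRQ_NESTING];
static volatile int irq_depth = 0;

/* Would be called from the ISR prologue with that ISR's configured priority. */
bool isr_enter(int priority) {
    int slot = irq_depth;
    if (slot >= MAX_IRQ_NESTING) {
        return false;                 /* nesting deeper than expected */
    }
    /* Reserve the slot first, then fill it: a nested ISR that preempts
     * between these two statements pushes and pops above this slot. */
    irq_depth = slot + 1;
    irq_priority_stack[slot] = priority;
    return true;
}

/* Would be called from the ISR epilogue. */
void isr_leave(void) {
    if (irq_depth > 0) {
        irq_depth = irq_depth - 1;
    }
}

/* Would be called at the top of an RTOS entry point. */
bool rtos_call_allowed(void) {
    int depth = irq_depth;
    if (depth == 0) {
        return true;                  /* thread context, no ISR active */
    }
    /* Numerically lower means more urgent, so the ISR on top of the
     * stack must not be more urgent than the kernel limit. */
    return irq_priority_stack[depth - 1] >= KERNEL_PRIORITY_LIMIT;
}
```

In a debug build rtos_call_allowed() would presumably feed an assertion or a halt rather than return a flag, and fast ISRs that skip the prologue would stay invisible to such a check.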

Postby **Giovanni** » Mon Mar 02, 2015 4:32 pm

Hi,

With the default values, priorities 0 and 1 are "fast interrupts". You can use those priorities, but the ISR declaration is different: no prologue/epilogue macros, and you cannot call OS code from there.

Fast interrupts can preempt the kernel even in between chSysLock() and chSysUnlock().

Giovanni

Postby **russian** » Mon Mar 02, 2015 4:42 pm

I am sorry, I do not understand your latest response.

Code: Select all

#define CORTEX_PRIORITY_SYSTICK 6
#define STM32_ADC_ADC1_DMA_PRIORITY         2
#define STM32_ADC_ADC2_DMA_PRIORITY         2
#define STM32_ADC_ADC3_DMA_PRIORITY         2
#define STM32_GPT_TIM5_IRQ_PRIORITY         4

https://svn.code.sf.net/p/rusefi/code/branches/2015027_chibistudio/config/stm32f4ems/chconf.h
https://svn.code.sf.net/p/rusefi/code/branches/2015027_chibistudio/config/stm32f4ems/mcuconf.h

My ADC DMA and TIM5 have priority higher than CORTEX_PRIORITY_SYSTICK.

From my ADC DMA and TIM5 handlers I believe I only call gptStartOneShotI/gptStopTimerI and sysLock/sysUnlock.

Do I break chibi contract by doing that?

Postby **Giovanni** » Mon Mar 02, 2015 6:32 pm

I never mentioned CORTEX_PRIORITY_SYSTICK, it is unrelated. The priority limit is CORTEX_MAX_KERNEL_PRIORITY (2).

Do you use any interrupt sources other than those in mcuconf.h? If you don't use priorities 0 and 1 then it is fine.

Giovanni

Postby **russian** » Mon Mar 02, 2015 6:37 pm

Gotcha, it's not CORTEX_PRIORITY_SYSTICK - that's why I was asking what it is.

All my interrupts are coming via Chibi HAL layer and it's all in this mcuconf.h - I am not touching any hardware directly.

So my STM32_ADC_ADC1_DMA_PRIORITY 2 is at the lowest kernel priority, so it's not violating anything.
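For what it's worth, the rule as stated above (priorities 0 and 1 must not call the OS, the limit is CORTEX_MAX_KERNEL_PRIORITY at 2) can be sketched as a compile-time check. This is a standalone illustration only: the three values are copied from this thread rather than pulled from the real headers, irq_may_call_rtos is a made-up helper, and _Static_assert assumes a C11 compiler.

```c
#include <stdbool.h>

/* Values quoted in this thread; in a real build they would come from
 * the ChibiOS port headers and mcuconf.h, not be redefined here. */
#define CORTEX_MAX_KERNEL_PRIORITY   2
#define STM32_ADC_ADC1_DMA_PRIORITY  2
#define STM32_GPT_TIM5_IRQ_PRIORITY  4

/* Numerically lower means more urgent, so an ISR that calls the RTOS
 * must have a priority value of at least CORTEX_MAX_KERNEL_PRIORITY. */
_Static_assert(STM32_ADC_ADC1_DMA_PRIORITY >= CORTEX_MAX_KERNEL_PRIORITY,
               "ADC1 DMA ISR is too urgent to call the RTOS");
_Static_assert(STM32_GPT_TIM5_IRQ_PRIORITY >= CORTEX_MAX_KERNEL_PRIORITY,
               "TIM5 ISR is too urgent to call the RTOS");

/* Hypothetical run-time form of the same rule. */
bool irq_may_call_rtos(int priority) {
    return priority >= CORTEX_MAX_KERNEL_PRIORITY;
}
```

With the values above both assertions pass, which matches the conclusion that priority 2 sits exactly at the limit.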

Postby **russian** » Tue Mar 03, 2015 6:03 am

I think I've found the issue, it has nothing to do with Chibi, sorry for bothering.

I actually believe it has something to do with the compiler, but it's midnight here so I am not at full capacity. I have what I believe should be a lock-free atomic read from a 64-bit timer counter, and that's where my code hangs:

Code: Select all

typedef struct {
   uint64_t highBits;
   uint32_t lowBits;
} State64;

   /**
    * this method is lock-free and thread-safe, that's because the 'update' method
    * is atomic with a critical zone requirement.
    *
    * http://stackoverflow.com/questions/5162673/how-to-read-two-32bit-counters-as-a-64bit-integer-without-race-condition
    */
   uint64_t localH;
   uint32_t localLow;
   int counter = 0;
   while (true) {
      localH = state.highBits;
      localLow = state.lowBits;
      uint64_t localH2 = state.highBits;
      if (localH == localH2)
         break;
   }

The idea was that I read highBits twice, and if there is a mismatch I re-read both fields until they match. It looks like the compiler has hoisted localH = state.highBits out of the loop, and I am hanging here. Probably my fault for not declaring the fields volatile?
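A minimal sketch of how the double-read loop might look with volatile fields, so the compiler has to perform every load on every iteration. Note the deviations from the snippet above: Counter64 and counter64_read are hypothetical names, and the high part is stored as 32 bits so that each field is read with a single load. It also assumes a single core, and that the writer updates both fields without the reader running in the middle of the update (for example from an ISR or inside a critical zone).

```c
#include <stdint.h>

/* Hypothetical layout: both halves are 32 bits wide and volatile. */
typedef struct {
    volatile uint32_t highBits;
    volatile uint32_t lowBits;
} Counter64;

uint64_t counter64_read(const Counter64 *c) {
    uint32_t high, low, high2;
    do {
        high  = c->highBits;   /* first read of the high half */
        low   = c->lowBits;
        high2 = c->highBits;   /* second read detects an update in between */
    } while (high != high2);   /* retry until both high reads agree */
    return ((uint64_t)high << 32) | low;
}
```

Because the fields are volatile, the loop cannot be collapsed into a single read followed by a spin on stale locals.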

Postby **Giovanni** » Tue Mar 03, 2015 8:54 am

That is probably it. Anyway, critical zones are more efficient, so just use them.

You are not going to improve anything by avoiding a critical zone for something as small as accessing a variable.
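As an illustration of the critical-zone approach, here is a small sketch that compiles on a host. The critical_enter/critical_leave functions are stand-ins invented for this example; in ChibiOS firmware, thread-level code would presumably call chSysLock() and chSysUnlock() at those two points. timer_value, timer_read and timer_advance are likewise hypothetical names.

```c
#include <stdint.h>

/* Host-side stand-ins for the kernel lock; they only track nesting. */
static int lock_depth = 0;
static void critical_enter(void) { lock_depth++; }
static void critical_leave(void) { lock_depth--; }

/* 64-bit value shared between the updater and its readers. Volatile
 * here because the stand-in lock is not a compiler barrier. */
static volatile uint64_t timer_value = 0;

/* Writer: the whole read-modify-write happens inside the zone. */
void timer_advance(uint32_t ticks) {
    critical_enter();
    timer_value = timer_value + ticks;
    critical_leave();
}

/* Reader: copy the value out inside the zone, use the copy outside. */
uint64_t timer_read(void) {
    critical_enter();
    uint64_t snapshot = timer_value;
    critical_leave();
    return snapshot;
}
```

The zone covers only a couple of loads and stores, so the time spent with interrupts masked stays very short.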

Giovanni

Postby **russian** » Tue Mar 03, 2015 2:24 pm

Now I am less sure. Something funny is going on:

My firmware is stuck, and I am pausing and resuming execution. Note how the current thread changes, but p_current is always the same?

Code: Select all

#define ON_UNLOCK_HOOK onUnlockHook()
#define dbg_leave_lock() {dbg_lock_cnt = 0;ON_UNLOCK_HOOK;}

void onUnlockHook(void) {
   uint64_t t = getTimeNowNt() - lastLockTime;
   if (t > maxLockTime) {
      maxLockTime = t;
   }
}

void onUnlockHook(void) {
 801b520:   b580         push   {r7, lr}
 801b522:   b082         sub   sp, #8
 801b524:   af00         add   r7, sp, #0
   uint64_t t = getTimeNowNt() - lastLockTime;
 801b526:   f003 fa0b    bl   801e940 <getTimeNowNt>
 801b52a:   f24f 5360    movw   r3, #62816   ; 0xf560
 801b52e:   f2c2 0300    movt   r3, #8192   ; 0x2000
 801b532:   e9d3 2300    ldrd   r2, r3, [r3]
 801b536:   1a82         subs   r2, r0, r2
 801b538:   eb61 0303    sbc.w   r3, r1, r3
 801b53c:   e9c7 2300    strd   r2, r3, [r7]
   if (t > maxLockTime) {
 801b540:   f24f 5368    movw   r3, #62824   ; 0xf568
 801b544:   f2c2 0300    movt   r3, #8192   ; 0x2000
 801b548:   681b         ldr   r3, [r3, #0]
 801b54a:   4618         mov   r0, r3
 801b54c:   f04f 0100    mov.w   r1, #0
 801b550:   e9d7 2300    ldrd   r2, r3, [r7]
 801b554:   4299         cmp   r1, r3
...
...
...

Postby **Giovanni** » Tue Mar 03, 2015 4:42 pm

Are you using a different working area for each thread ?

Giovanni
