LWIP: bug in sys_now Topic is solved

Report here problems in any of ChibiOS components. This forum is NOT for support.
drizzi
Posts: 4
Joined: Wed Apr 19, 2017 4:52 pm
Been thanked: 1 time

LWIP: bug in sys_now  Topic is solved

Postby drizzi » Thu Mar 07, 2024 5:18 pm

Hello,
regarding the LWIP bindings, I think there is a bug in sys_now (sys_arch.c), at least in the (OSAL_ST_FREQUENCY / 1000) >= 1 && OSAL_ST_FREQUENCY % 1000) == 0 case.

LWIP expects sys_now to wrap at 0xFFFFFFFF but, if e.g. OSAL_ST_FREQUENCY == 10kHz it wraps at 0x1999999A

Code: Select all

  return ((u32_t)chVTGetSystemTimeX() - 1) / (OSAL_ST_FREQUENCY / 1000) + 1;


(0xFFFFFFFF-1)/10 + 1 = 0x1999999A

This result in the TCP thread to hang after ~5 days (see sys_timeouts_sleeptime and TIME_LESS_THAN in LWIP timeouts.c).

User avatar
Giovanni
Site Admin
Posts: 14563
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1111 times
Been thanked: 937 times
Contact:

Re: LWIP: bug in sys_now

Postby Giovanni » Thu Mar 07, 2024 9:24 pm

Hi,

Moving in bug reports.

Giovanni

steved
Posts: 833
Joined: Fri Nov 09, 2012 2:22 pm
Has thanked: 12 times
Been thanked: 137 times

Re: LWIP: bug in sys_now

Postby steved » Tue Aug 20, 2024 5:44 pm

I concur. I've added a routine which extends the resolution of chVTGetSystemTimeX() for that specific case (it doesn't appear necessary to include the adjustments by 1):

Code: Select all

/**
 *  This routine creates a 32-bit incrementing timer tick which returns all possible
 *  values before wrapping round.
 *  Intended for situations where OSAL_ST_FREQUENCY is a multiple of =1kHz
 *  It's been tested with OSAL_ST_FREQUENCY=10000
 *  Must be called at least a few times for each complete cycling of the system tick
 */
static u32_t calcNext(u32_t ctr)
{
#define DIVISOR  (OSAL_ST_FREQUENCY / 1000)
  static u32_t topBit = 0;
  static u32_t lastCtr = 0;
  if (ctr < lastCtr)
  {
    /* wraparound */
    if (++topBit >= DIVISOR)
      topBit = 0;
  }
  lastCtr = ctr;
  return (u32_t)((((uint64_t)(topBit) << 32) + ctr)/DIVISOR);
}


u32_t sys_now(void) {

#if OSAL_ST_FREQUENCY == 1000
  return (u32_t)chVTGetSystemTimeX();
#elif (OSAL_ST_FREQUENCY / 1000) >= 1 && (OSAL_ST_FREQUENCY % 1000) == 0
  return calcNext((u32_t)chVTGetSystemTimeX());
  //return ((u32_t)chVTGetSystemTimeX() - 1) / (OSAL_ST_FREQUENCY / 1000) + 1;
#elif (1000 / OSAL_ST_FREQUENCY) >= 1 && (1000 % OSAL_ST_FREQUENCY) == 0
  return ((u32_t)chVTGetSystemTimeX() - 1) * (1000 / OSAL_ST_FREQUENCY) + 1;
#else
  return (u32_t)(((u64_t)(chVTGetSystemTimeX() - 1) * 1000) / OSAL_ST_FREQUENCY) + 1;
#endif
}


twarge
Posts: 7
Joined: Wed Dec 20, 2017 4:41 am
Been thanked: 2 times

Re: LWIP: bug in sys_now

Postby twarge » Tue Aug 20, 2024 9:13 pm

I am thrilled that you identified this; hopefully it is the reason our ethernet dies. It has been a problem for *years*.

User avatar
Giovanni
Site Admin
Posts: 14563
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1111 times
Been thanked: 937 times
Contact:

Re: LWIP: bug in sys_now

Postby Giovanni » Wed Aug 21, 2024 12:49 pm

This is actually very old, the following solution was also proposed: https://sourceforge.net/p/chibios/bugs/973/

Which one should I include?

Giovanni

steved
Posts: 833
Joined: Fri Nov 09, 2012 2:22 pm
Has thanked: 12 times
Been thanked: 137 times

Re: LWIP: bug in sys_now

Postby steved » Wed Aug 21, 2024 4:26 pm

Didn't find that proposal.

I'd tend towards the SourceForge suggestion, simply on the basis that it appears to have covered all the options (I'd really just covered the one that mattered to me).
However depends how well its been tested.
I'll try and run it through my test harness in the next few days.

steved
Posts: 833
Joined: Fri Nov 09, 2012 2:22 pm
Has thanked: 12 times
Been thanked: 137 times

Re: LWIP: bug in sys_now

Postby steved » Fri Aug 23, 2024 10:07 am

I've compared the two options for SYSTICK of 5kHz, 10kHz, 50kHz, 100kHz.

For the first two, both proposals agree.
For the second two, the results diverge.

Given this, I'd tend towards my solution because its easier to understand (extend the SYsTick counter beyond 32 bits, then divide), which is not to say the other solution is wrong.

I've attached my test harness (which runs on a PC) in case anyone wants to look further
Attachments
wraproundTest.7z
Test harness (for PC)
(56.29 KiB) Downloaded 71 times

User avatar
Giovanni
Site Admin
Posts: 14563
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1111 times
Been thanked: 937 times
Contact:

Re: LWIP: bug in sys_now

Postby Giovanni » Fri Aug 23, 2024 4:59 pm

Sorry, I am confused, they cannot diverge and be both not wrong.

In addition, the proposal does not address the following cases which are probably also wrong:

Code: Select all

#elif (1000 / OSAL_ST_FREQUENCY) >= 1 && (1000 % OSAL_ST_FREQUENCY) == 0
  return ((u32_t)chVTGetSystemTimeX() - 1) * (1000 / OSAL_ST_FREQUENCY) + 1;
#else
  return (u32_t)(((u64_t)(chVTGetSystemTimeX() - 1) * 1000) / OSAL_ST_FREQUENCY) + 1;
#endif


Another problem is that the system time can be 16 bits wide, in this case everything fails, even the case "OSAL_ST_FREQUENCY == 1000". There are combinations of counter width and frequency which are not compatible at all, imagine 100MHz and 16 bits counter, which is actually supported, such a condition should be detected (system time total time should be greater of 1mS).

Sufficient headache for today, will try again tomorrow :-)

Giovanni

User avatar
Giovanni
Site Admin
Posts: 14563
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1111 times
Been thanked: 937 times
Contact:

Re: LWIP: bug in sys_now

Postby Giovanni » Sat Aug 24, 2024 9:25 am

Hi,

What about something like this?

Code: Select all


#if (OSAL_ST_FREQUENCY < 1000) || ((OSAL_ST_FREQUENCY % 1000) != 0)
#error "LWIP requires a systick frequency that is a multiple of 1000"
#endif

#if (OSAL_ST_RESOLUTION >= 32) && (OSAL_ST_FREQUENCY == 1000)
u32_t sys_now(void) {return (u32_t)osalOsGetSystemTimeX();}

#else
u32_t sys_now(void) {
  static u32_t last_ms = (u32_t)0;
  static u32_t last_unprocessed = (u32_t)0;
  static systime_t last_system_time = (systime_t)0;
  u32_t delta;
  systime_t now_time;

  /* Calculating delta, in ticks, from the last acquired system time.*/
  now_time = osalOsGetSystemTimeX();
  delta = (u32_t)osalTimeDiffX(last_system_time, now_time) + last_unprocessed;
  last_system_time = now_time;

  last_ms          += delta / (OSAL_ST_FREQUENCY / 1000U);
  last_unprocessed  = delta % (OSAL_ST_FREQUENCY / 1000U);

  return last_ms;
}
#endif


Trying to avoid 64 bits math and handle 16 bits counters.

Note, not tested yet, no HW until September.

Giovanni

User avatar
Giovanni
Site Admin
Posts: 14563
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1111 times
Been thanked: 937 times
Contact:

Re: LWIP: bug in sys_now

Postby Giovanni » Tue Aug 27, 2024 9:42 am

Any feedback?

Giovanni


Return to “Bug Reports”

Who is online

Users browsing this forum: No registered users and 5 guests