Hi,
I noticed that the kernel statistics are broken somehow - let me try to explain.
Note: I used the RT-STM32F103RB-NUCLEO64 demo of the stable_21.11.x version to investigate the issue in-depth. However, I could reproduce the same issue with other demos as well.
When I read chVTGetSystemTimeX() right after system initialization, I get a correct value of 0. When I read currcore->kernel_stats.m_crit_thd.worst right after, I get an arbitrary value. At first glance, the values seemed random, but when resetting the board, the value would increase every time. Actually, there seems to be some correlation to time, as I found that the value would increment at ~16 MHz (every ~1 second the value increases by 0x01000000). When reading currcore->kernel_stats.m_crit_thd.worst multiple times within a loop afterwards, the value would not change, though. It is "updated" only by reset (and probably whenever the last measurement would exceed it).
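For reference, the rate estimate above can be reproduced with a few lines of C. This is only a sketch: `counter_rate_hz` is a hypothetical helper (not part of ChibiOS), and it assumes the two readings come from the same free-running 32-bit counter.

```c
#include <stdint.h>

/* Hypothetical helper: estimate the tick rate of a free-running 32-bit
   counter from two readings taken 'elapsed_s' seconds apart. */
static double counter_rate_hz(uint32_t first, uint32_t second, double elapsed_s) {
  /* Unsigned subtraction gives the right delta even across a wrap-around. */
  uint32_t delta = second - first;
  return (double)delta / elapsed_s;
}
```

An increase of 0x01000000 per second corresponds to 16777216 counts per second, which is where the "~16 MHz" figure comes from.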
Interestingly, the issue does not affect any of the other statistics. Neither the other members of m_crit_thd are affected, nor any member of m_crit_isr. All of those seem to be correct.
I had a look at the code, but I did not find any issues there. I also wonder what kind of error would result in such odd behaviour. It looks like some addresses are VERY off.
Maybe the issue is actually related to the compiler, but I tested with GCC 10.3.1 as well as GCC 9.3.1 with no effect.
I am sorry that I can only (rather vaguely) describe the issue and cannot provide any sort of fix.
Regards,
Thomas
invalid kernel stats Topic is solved
- Giovanni
- Site Admin
- Posts: 14444
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1074 times
- Been thanked: 921 times
- Contact:
Re: invalid kernel stats
Hi,
I am not sure I fully understand.
Perhaps it is measuring the critical zone when the system is started on chSysInit().
Giovanni
-
- Posts: 135
- Joined: Wed Feb 04, 2015 5:03 pm
- Location: CITEC, Bielefeld University, Germany
- Has thanked: 15 times
- Been thanked: 24 times
- Contact:
Re: invalid kernel stats
I am fairly sure that there is no invalid measurement during startup, because the reported values are much greater than the total uptime of the system.
For instance, I just got the value 2141351942 (0x7FA27006) right after startup (uptime < 100 ms). @ 72 MHz, however, that reported value would account for almost 30s of CPU time.
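The conversion behind that figure is simple enough to check in C. This is a minimal sketch; `cycles_to_seconds` is a hypothetical helper, and it assumes the statistic counts core clock cycles at the 72 MHz mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: convert a cycle count into seconds for a given
   core clock frequency in Hz. */
static double cycles_to_seconds(uint32_t cycles, uint32_t core_clock_hz) {
  return (double)cycles / (double)core_clock_hz;
}
```

Calling `cycles_to_seconds(2141351942u, 72000000u)` gives roughly 29.7 seconds, far more than an uptime below 100 ms.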
As mentioned before, this "reset value" seems to be correlated with some internal 16 MHz timer (at least for that particular demo). If I manually reset the value of currcore->kernel_stats.m_crit_thd.worst to 0 right after chSysInit(), subsequent readings seem to be correct.
Since I could not find any issues with the dbg-statistics code, my guess is that somewhere in the code, data is written to an invalid pointer.
- Thomas
-
Re: invalid kernel stats
I think I may have found the issue - and the solution is very simple.
When the TM object is initialized, all members (including 'last') are set to 0. This initialization is done within the chSysInit() function, which ends with a call to chSysUnlock(). That function in turn eventually calls chTMStopMeasurementX(), which calculates the best and worst values from the current result of chSysGetRealtimeCounterX() and the member variable 'last' of the TM object. Since 'last' was never set to a valid start value but only initialized with 0, the computed interval is meaningless (it is simply the current counter value), and that is what I originally observed.
I solved this issue by simply initializing the value of 'last' not with 0 but with chSysGetRealtimeCounterX(). This fixes the initial measurement during startup and does not interfere with any subsequent measurements, as 'last' will be overwritten by every call of chTMStartMeasurementX().
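To illustrate the effect, here is a simplified model in plain C. It is a sketch only and not the ChibiOS source: the `tm_model_*` names and `worst_after_startup` are hypothetical, the counter values are passed in by hand, and details of the real implementation (such as the measurement offset) are left out.

```c
#include <stdint.h>

/* Simplified stand-in for a time-measurement object (hypothetical type). */
typedef struct {
  uint32_t best;   /* shortest interval seen so far */
  uint32_t worst;  /* longest interval seen so far */
  uint32_t last;   /* start stamp while measuring, afterwards the last interval */
} tm_model_t;

/* Initialize the object; 'initial_last' is the value 'last' starts with. */
static void tm_model_init(tm_model_t *tm, uint32_t initial_last) {
  tm->best = UINT32_MAX;
  tm->worst = 0u;
  tm->last = initial_last;
}

/* Start a measurement: remember the current counter value. */
static void tm_model_start(tm_model_t *tm, uint32_t now) {
  tm->last = now;
}

/* Stop a measurement: compute the interval and update best/worst. */
static void tm_model_stop(tm_model_t *tm, uint32_t now) {
  tm->last = now - tm->last;  /* unsigned subtraction, safe across wrap-around */
  if (tm->last > tm->worst) {
    tm->worst = tm->last;
  }
  if (tm->last < tm->best) {
    tm->best = tm->last;
  }
}

/* Model of the startup sequence: a stop without a matching start at
   counter value 'now', followed by one real measurement of 'real_cycles'. */
static uint32_t worst_after_startup(uint32_t initial_last, uint32_t now,
                                    uint32_t real_cycles) {
  tm_model_t tm;
  tm_model_init(&tm, initial_last);
  tm_model_stop(&tm, now);              /* unmatched stop, as at the end of init */
  tm_model_start(&tm, now + 100u);      /* first real measurement begins */
  tm_model_stop(&tm, now + 100u + real_cycles);
  return tm.worst;
}
```

In this model, `worst_after_startup(0u, 2141351942u, 500u)` returns 2141351942 (the raw counter value), whereas initializing 'last' with the current counter, `worst_after_startup(2141351942u, 2141351942u, 500u)`, returns 500. Note that in the second case the unmatched stop still records an interval of 0 in the model.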
Regards
Thomas
-
Re: invalid kernel stats
I just noticed that subsequent readings of the kernel statistics are corrupted when compiling the code with GCC 10.3, even though it works fine with GCC 9.3.
Furthermore, there seems to be more to this, as I could reproduce the issue on a NUCLEO-F767ZI with both compilers. I rather suspect some broken code somewhere in ChibiOS (writing to an invalid address, perhaps) than multiple compilers generating bogus code on multiple (but not all!) devices.
- Giovanni
-
Re: invalid kernel stats
I am not sure how to do that. I just tried disabling the calls to SCB_EnableXCache() in crt1.c, but with no effect. Can you please give me some hints on how to disable the caches for the M7 platform?
- Giovanni
Re: invalid kernel stats
Thargon wrote: I am not sure how to do that. I just tried disabling the calls to SCB_EnableXCache() in crt1.c, but with no effect. Can you please give me some hints on how to disable the caches for the M7 platform?
That was the correct way to do it, so the problem is not related to the cache.
Could you explain how to reproduce and verify the problem?
Giovanni
- Giovanni
Re: invalid kernel stats
Hi,
Fixed as bug #1222 (initialization problem).
Any news about the GCC 10.3 problem? I am trying to look into it.
Giovanni
-
Re: invalid kernel stats
Giovanni wrote: Fixed as bug #1222. (initialization problem)
Thanks!
Giovanni wrote: Any news about the GCC 10.3 problem? I am trying to look into it.
Unfortunately, I could not track down the issue.
GCC 9.3 seems to work fine, though. After wiping all temporary data, a fresh build from scratch did not show that behaviour (usually GCC reports errors when old temp files are used *shrug*).
Actually, GCC 10 has so many issues (linker warnings, bigger code size etc.) that I decided to stick to 9.3 for the time being. If I can be of any help, please let me know.