Chibios OS RISC-V GD32VF103 Port and USB OTG lockup problem

KarlK90
Posts: 7
Joined: Fri Feb 05, 2021 10:53 am
Been thanked: 3 times

Postby KarlK90 » Fri Feb 05, 2021 12:39 pm

Building upon the great work of psyco (https://github.com/psycowithespn/ChibiOS-RISC-V), I ported ChibiOS to the GigaDevice GD32VF103 microcontroller. It is very similar to the STM32F103/GD32F103 MCUs, as it features nearly the exact same peripherals except for the USB peripheral. So far the HAL port is a rough 1:1 duplicate of the STM32 HAL drivers with some fixes for interrupt handling. The current state says STM32 everywhere and the register definitions are the STM32 ones as well, but my intent is to clean up the port once the USB bug is fixed and submit it to ChibiOS-Contrib. The current state is in the GD32VF103 branch in my repo:

https://github.com/KarlK90/ChibiOS-RISC ... /gd32vf103 (RT and OSLIB test cases are all successful; GPIO, PWM timer and UART are working!)

Now for the problem that has kept me busy for the past few days: I have been trying to track down a bug with the USB peripheral, which is not the USB Device peripheral of the STM32F103 but the well-known Synopsys DesignWare OTG FS peripheral found on the STM32F107/STM32F4 etc. It gives me quite a headache because it locks up after several seconds and I can't figure out the cause of this behaviour. Unfortunately my knowledge of USB is quite limited; in fact this is the first time I have worked with it at a low level.

Here is what I found out so far:

(Using the USB-CDC demo which is in the repo; there is also a compiled ELF binary in the ./build folder and an SVD descriptor in /docs)

Board: Sipeed Longan Nano and my own custom board for testing purposes

* Enumeration succeeds, terminal is functional
* Reading the Device Descriptor and Device Status with lsusb/usbhid-dump succeeds
* The CDC terminal becomes non-responsive after several seconds
* Lock-up occurs without input/output to the terminal
* Lock-up occurs with input/output to the terminal (e.g. the write command or issuing multiple test rt commands)
* The time varies: sometimes almost immediately, sometimes after 10 to 30 s
* Decreasing STM32_USB_OTG1_RX_FIFO_SIZE to 128 or 256 leads to faster lock-ups
* Disabling interrupts during FIFO access with STM32_USB_OTGFIFO_FILL_BASEPRI makes no difference
* Lock-up is reproducible on multiple boards and USB hosts (2.0/3.0)
* Reading the Device Descriptor succeeds after lock-up, reading the Device Status doesn't.
* Attaching the debugger after a lock-up, I can reliably see that the ITTXFE (IN Token received when TxFIFO empty) flag is set on IN Endpoint 1
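
For readers who want to check the endpoint interrupt flags mentioned in the last bullet, here is a minimal host-side sketch that decodes a raw DIEPINT value read in the debugger into flag names. The bit positions are the ones from the STM32 OTG FS register layout (which the port currently reuses, per the post above); that they apply unchanged to the GD32VF103 is an assumption. The helper name diepint_decode is hypothetical and not part of ChibiOS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DIEPINT bit positions as found in the STM32 OTG FS layout.
   ASSUMPTION: the GD32VF103 uses the same positions. */
#define DIEPINT_XFRC    (1U << 0)  /* transfer completed                    */
#define DIEPINT_EPDISD  (1U << 1)  /* endpoint disabled                     */
#define DIEPINT_TOC     (1U << 3)  /* timeout condition                     */
#define DIEPINT_ITTXFE  (1U << 4)  /* IN token received when TxFIFO empty   */
#define DIEPINT_INEPNE  (1U << 6)  /* IN endpoint NAK effective             */
#define DIEPINT_TXFE    (1U << 7)  /* transmit FIFO empty                   */

typedef struct {
    uint32_t    mask;
    const char *name;
} diepint_flag_t;

static const diepint_flag_t diepint_flags[] = {
    { DIEPINT_XFRC,   "XFRC"   },
    { DIEPINT_EPDISD, "EPDISD" },
    { DIEPINT_TOC,    "TOC"    },
    { DIEPINT_ITTXFE, "ITTXFE" },
    { DIEPINT_INEPNE, "INEPNE" },
    { DIEPINT_TXFE,   "TXFE"   },
};

/* Hypothetical helper: writes the names of all set flags into 'out',
   separated by spaces, and returns how many flags were set. */
static int diepint_decode(uint32_t value, char *out, size_t out_size)
{
    int count = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';
    for (size_t i = 0; i < sizeof diepint_flags / sizeof diepint_flags[0]; i++) {
        if (value & diepint_flags[i].mask) {
            if (count > 0) {
                strncat(out, " ", out_size - strlen(out) - 1);
            }
            strncat(out, diepint_flags[i].name, out_size - strlen(out) - 1);
            count++;
        }
    }
    return count;
}
```

Feeding it the value of the endpoint 1 DIEPINT register from a post-lock-up debug session gives a quick textual summary of which conditions are pending, which is easier to compare between a working and a locked-up run than raw hex.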

Compiling the official USB-HID example https://github.com/riscv-mcu/GD32VF103_ ... /HID_Mouse and this CDC example https://github.com/fabiopjve/ULWOS2/tre ... inal_GD32V doesn't lead to these lock-ups. Both use the official driver from the GD32VF103_Firmware_Library. So I'm confident that this isn't a hardware problem but a software bug in how the OTG peripheral is driven.

So far I have tried to isolate the bug and have read multiple bug reports on the OTG peripheral on forums (e.g. https://community.st.com/s/question/0D5 ... 1stm32f413) to find the root cause, but had no luck. I started to compare the official OTG driver to the ChibiOS driver; this hasn't led to any new clues yet, but I am sure that I am missing something.

My questions are: Does this bug seem familiar? Any ideas what could trigger this behaviour? What could I do to find the cause?

Thank you! :)

Giovanni
Site Admin
Posts: 14444
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1074 times
Been thanked: 921 times

Postby Giovanni » Fri Feb 05, 2021 2:31 pm

Hi,

Good work with the port!!

That OTG is quite complex; we faced several mysterious problems over the years because of required delays, clock ratios and other details. I am not sure how to help. I suggest verifying the OTG state before and after lock-ups; that could give a hint about the problem.

Giovanni

Postby KarlK90 » Fri Feb 05, 2021 3:47 pm

Thank you! :-)

This OTG peripheral seems to be quite delicate and fragile from what I have read on several sites, even within the same MCU line-up. :roll:

A quick look at the states after the lock-up:

The whole OTG driver is still in USB_ACTIVE state.
EP0 is USB_EP0_STP_WAITING.
The EP1 IN endpoint is still in the transmitting state, which is mysterious, as the OTG FIFO is empty.

It seems like a transfer complete interrupt is missing?!

A quick look at the working USB_HID mouse example shows the same ITTXFE (IN Token received when TxFIFO empty) flag, so maybe this is not connected.

Postby Giovanni » Fri Feb 05, 2021 8:48 pm

Note that this IP can be configured in silicon; it has a ton of options and variants. The current OTG driver is for this IP as it is found on STM32. It could have been configured differently on that device.

Giovanni

psyco
Posts: 21
Joined: Fri May 22, 2020 1:40 am
Been thanked: 11 times

Postby psyco » Sun Feb 07, 2021 10:06 pm

I have a similar board here, the "SeeedStudio GD32 RISC-V kit". With some minor mods, I was able to get it up and running on your repo and ChibiOS trunk. The VCP worked for about 30 seconds while I was interacting with the shell, then just stopped.

After enabling the chconf.h DBG asserts and state checks, I found that usbStartReceiveI is called with an in-progress receive on EP0 during setup / selection. This is way before the CDC code takes over. I'm not that familiar with USB, but this seems like it could be a lead for you:

(gdb) bt
#0  chSysHalt (reason=reason@entry=0x8013f54 <__func__.5.lto_priv.2> "usbStartReceiveI")
    at ../../../../ChibiOS/os/rt/src/chsys.c:142
#1  0x0800293c in usbStartReceiveI (usbp=usbp@entry=0x20000c34 <USBD1>, ep=ep@entry=0 '\000',
    buf=buf@entry=0x0, n=n@entry=0) at ../../../../ChibiOS/os/hal/src/hal_usb.c:459
#2  0x08002b6c in _usb_ep0in (usbp=0x20000c34 <USBD1>, ep=<optimized out>)
    at ../../../../ChibiOS/os/hal/src/hal_usb.c:908
#3  0x08003642 in otg_epin_handler (usbp=usbp@entry=0x20000c34 <USBD1>, ep=ep@entry=0 '\000')
    at ../../../os/hal/ports/GD/GD32VF103/OTGv1/hal_usb_lld.c:400
#4  0x08003a78 in usb_lld_serve_interrupt (usbp=usbp@entry=0x20000c34 <USBD1>)
    at ../../../os/hal/ports/GD/GD32VF103/OTGv1/hal_usb_lld.c:642             
#5  0x08003ac4 in USBFS () at ../../../os/hal/ports/GD/GD32VF103/OTGv1/hal_usb_lld.c:686
#6  0x08000634 in _irq_handler () at ../../../os/common/ports/RISCV-CLIC/compilers/GCC/chcoreasm.S:285
Backtrace stopped: frame did not save the PC
(gdb) p USBD1
$1 = {state = USB_SELECTED, config = 0x801536c <usbcfg>, transmitting = 0, receiving = 1, epc = {
     0x801416c <ep0config>, 0x0, 0x0, 0x0}, in_params = {0x20001b54 <SDU1>, 0x20001b54 <SDU1>, 0x0},
   out_params = {0x20001b54 <SDU1>, 0x0, 0x0}, ep0state = USB_EP0_OUT_WAITING_STS,
   ep0next = 0x801537c <vcom_configuration_descriptor_data> "\t\002C", ep0n = 67, ep0endcb = 0x0,
   setup = "\200\006\000\002\000\000\377", status = 0, address = 89 'Y', configuration = 0 '\000',
   saved_state = USB_READY, otg = 0x50000000, otgparams = 0x8014190 <fsparams>, pmnext = 144}


And the corresponding assert that failed:

osalDbgAssert(!usbGetReceiveStatusI(usbp, ep), "already receiving");
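
To relate the assert to the gdb dump above: the receiving field is a bit mask with one bit per endpoint, so receiving = 1 means bit 0 (EP0) is already marked as receiving when usbStartReceiveI is called for EP0 again. The following is a minimal, self-contained sketch of that bookkeeping. The struct and function names are hypothetical stand-ins, not the ChibiOS API, and where the real driver asserts, this sketch just returns false.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two USBDriver fields shown in the gdb dump. */
typedef struct {
    uint16_t transmitting;  /* one bit per IN endpoint with a transfer pending  */
    uint16_t receiving;     /* one bit per OUT endpoint with a transfer pending */
} usb_flags_t;

/* True if endpoint 'ep' already has a receive operation in progress. */
static bool ep_is_receiving(const usb_flags_t *f, unsigned ep)
{
    return (f->receiving & (uint16_t)(1U << ep)) != 0U;
}

/* Marks 'ep' as receiving. Returns false instead of asserting when a
   receive is already in progress on that endpoint. */
static bool start_receive(usb_flags_t *f, unsigned ep)
{
    if (ep_is_receiving(f, ep)) {
        return false;  /* the "already receiving" case from the backtrace */
    }
    f->receiving |= (uint16_t)(1U << ep);
    return true;
}
```

With the values from the dump (transmitting = 0, receiving = 1), a second start_receive on endpoint 0 is rejected, while endpoint 1 would still be accepted.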

Postby Giovanni » Sun Feb 07, 2021 10:25 pm

Hard to tell...

The driver relies on small delays in its code; are those correctly implemented in the RISC-V port? It looks like some kind of race condition.

Giovanni

Postby KarlK90 » Thu Mar 11, 2021 4:14 pm

I solved the lock-ups :-).

The problem, or rather problems, were not within the USB OTG driver but within the port code.

Nucleisys, who designed the Bumblebee core in the GD32VF103, extended the CLIC unit on this device with multiple non-standard extensions and registers and calls it ECLIC. One of the extensions is machine sub-modes for IRQ and NMI handling. One error was closely tied to these sub-modes. Another error was a race condition on thread context switching.

https://doc.nucleisys.com/nuclei_spec/i ... cess-modes

In detail:

1.) The msubm register wasn't saved and restored when serving an interrupt. This seemingly worked fine until a higher-priority interrupt pre-empted a lower-priority one. In this case msubm was never restored to normal operation mode but stayed in interrupt handling mode, and subsequent interrupts with a lower priority were not taken. The USB lock-ups were the direct result of this error, occurring when a higher-priority timer interrupt pre-empted the USB interrupt. From the outside this looked like a USB error because the system continued to work as normal, taking timer interrupts but never servicing the USB interrupts. So the simple fix was to save and restore msubm. (Simple fix, but it took me days to find the cause :P)

2.) The second error, a race condition, happened because interrupts were not disabled on thread context switching after an interrupt was taken. This led to strange jumps into nowhere, sometimes after 5 s, sometimes after 30 s and sometimes after minutes. Very strange, and at first I suspected a stack overflow. But the fix was to disable interrupts when leaving the interrupt handler in the old thread and re-enable them in the new thread after leaving interrupt handling. On RISC-V this is done by setting mstatus.mpie and subsequently executing mret.
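
The two fixes above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the port code: it assumes that on interrupt entry the core copies the current sub-mode into a "previous" field and that mret copies it back (a simplified reading of the Nuclei msubm TYP/PTYP fields), and it uses the standard RISC-V rule that mret copies mstatus.MPIE into mstatus.MIE. All type and function names are hypothetical.

```c
#include <stdbool.h>

enum { SUBM_NORMAL = 0, SUBM_IRQ = 1 };

/* Simplified model of msubm (ASSUMPTION: only TYP and PTYP matter here). */
typedef struct {
    unsigned typ;   /* current machine sub-mode  */
    unsigned ptyp;  /* previous machine sub-mode */
} msubm_model_t;

/* Hardware action on interrupt entry: remember the old sub-mode. */
static void msubm_on_irq_entry(msubm_model_t *m)
{
    m->ptyp = m->typ;
    m->typ  = SUBM_IRQ;
}

/* Hardware action on mret: go back to the remembered sub-mode. */
static void msubm_on_mret(msubm_model_t *m)
{
    m->typ = m->ptyp;
}

/* USB interrupt pre-empted by a timer interrupt. Returns the sub-mode
   the core ends up in after both handlers have returned. */
static unsigned mode_after_nested_irq(bool save_restore)
{
    msubm_model_t m = { SUBM_NORMAL, SUBM_NORMAL };

    msubm_on_irq_entry(&m);        /* USB interrupt taken                    */
    msubm_model_t saved = m;       /* handler prologue: save msubm           */
    msubm_on_irq_entry(&m);        /* timer pre-empts, overwrites "previous" */
    msubm_on_mret(&m);             /* timer handler returns                  */
    if (save_restore) {
        m = saved;                 /* handler epilogue: restore msubm        */
    }
    msubm_on_mret(&m);             /* USB handler returns                    */
    return m.typ;
}

/* Model of the two mstatus bits involved in the second fix. */
typedef struct {
    bool mie;   /* global machine interrupt enable */
    bool mpie;  /* value MIE takes after mret      */
} mstatus_model_t;

/* Standard RISC-V mret behaviour for these bits. */
static void mstatus_on_mret(mstatus_model_t *s)
{
    s->mie  = s->mpie;
    s->mpie = true;
}

/* Interrupts are off inside the handler; whether they are on after
   mret depends only on what MPIE was set to beforehand. */
static bool mie_after_handler_exit(bool mpie_before_mret)
{
    mstatus_model_t s = { false, mpie_before_mret };
    mstatus_on_mret(&s);
    return s.mie;
}
```

In this model, mode_after_nested_irq(false) leaves the core stuck in the interrupt sub-mode, which matches the symptom described in 1.), while mode_after_nested_irq(true) returns to normal mode. Likewise, mie_after_handler_exit shows why setting mstatus.mpie before mret is what re-enables interrupts in the new thread.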

I already pushed the fixes to the branch.

Now on to merging this port into ChibiOS-Contrib :-)

Postby KarlK90 » Thu Mar 11, 2021 5:21 pm

And about merging the port: I'm not entirely sure how to handle the driver situation. The GD32VF103 is nearly completely API compatible with the STM32 drivers and the STM32F103. There are subtle differences in the registers, like four different USB clock pre-scalers instead of two, but nothing dramatically different. At the moment the port just changes the ARM SysTick timer to the native RISC-V mtimer and replaces the NVIC calls with the compatible ECLIC calls.
So far so good. However, GigaDevice made the decision (or was forced) to give every peripheral, register and register field a different name, and this is the point where it gets hairy. My options so far are:

1) Copy the STM32 drivers, adjust for the differences and use the STM32 definitions. The upside is that any fixes from ChibiOS can easily be adapted and the process is rather straightforward.

2) Copy the STM32 drivers, adjust for the differences and rename every peripheral, register and field to match the GigaDevice names. Fixes from ChibiOS can't easily be merged and it is much more work. On the upside the reference manual and register definitions match the code.
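
As an illustration of the kind of "subtle difference" both options have to adjust for, here is a sketch of the USB clock pre-scaler selection mentioned above. The post only states that there are four pre-scalers instead of two; the concrete 2-bit encoding used below (divide by 1.5, 1, 2.5, 2) is my assumption and should be checked against the GD32VF103 user manual. The function names are hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: 2-bit pre-scaler field encoding, stored as divisor * 2
   to avoid floating point: 1.5, 1, 2.5, 2. Verify against the manual. */
static const uint8_t usb_div_x2[4] = { 3, 2, 5, 4 };

/* USB clock resulting from a given PLL clock and pre-scaler field value. */
static uint32_t usb_clock_hz(uint32_t pll_hz, unsigned psc_field)
{
    return (uint32_t)(((uint64_t)pll_hz * 2U) / usb_div_x2[psc_field & 3U]);
}

/* Returns the field value that yields the 48 MHz the USB peripheral
   needs, or -1 if no pre-scaler fits the given PLL clock. */
static int usb_prescaler_for(uint32_t pll_hz)
{
    for (unsigned field = 0; field < 4; field++) {
        if (usb_clock_hz(pll_hz, field) == 48000000U) {
            return (int)field;
        }
    }
    return -1;
}
```

A copied STM32 clock driver would only know about two of these four settings, so this is one of the places where the GD32VF103 tree has to diverge regardless of which naming option is chosen.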

What would you suggest?

Postby Giovanni » Thu Mar 11, 2021 5:56 pm

Hi,

You can start from the STM32 LLD tree but it should have its own tree; old STM32 peripherals don't get many changes anyway.

Giovanni

Postby KarlK90 » Thu Mar 11, 2021 6:02 pm

Thank you, then I will go with the second option :-)

