Multi Core: locking execution on the other core

electronic_eel
Posts: 83
Joined: Sat Mar 19, 2016 8:07 pm
Has thanked: 1 time
Been thanked: 18 times

Postby electronic_eel » Wed Oct 30, 2024 12:11 pm

Hi,

I'm currently using ChibiOS/RT on an RP2040 in Multi Core / SMP mode. If I understood the code correctly, when I call chSysLock() on one core to enter a critical section, the other core is blocked from also entering a critical section. But unless it also tries to get into a critical section with one of the lock functions it will continue to run normally. Correct?

Is there an easy way to ensure that the other core is also prevented from executing code at all? Could for example chSysNotifyInstance() somehow be used for this?

What I'm trying to do is to write to the flash IC connected to the RP2040. Usually code runs from the flash via XIP. But this is not possible while writing to the flash. So I have to disable XIP, write to the flash, re-enable XIP. The functions that deal with disabling XIP, writing and re-enabling XIP must all be executed from RAM.

I successfully moved them into the ".ram0_init" section and can run them from there. But this is all happening on one core while the other core still tries to use XIP and fails. So I have to ensure that the other core is stalled on some function in RAM and only proceed with my code when this is ensured.

I'm also not sure what the best way is to move the ChibiOS core functions that the locked core should busy-loop on into a specific linker section. I guess I'd have to add something like "__attribute__((section(".ram0_init.")))" to their code. But that would be a patch to core ChibiOS code, which I'd prefer to keep unmodified. Or do you see an easy way to force the other core into a function of mine that I could then easily move into RAM?
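For reference, a minimal sketch of how a function of one's own can be tagged for a RAM section without touching ChibiOS sources. The macro name `RAM_FUNCTION` and the per-function suffix are assumptions for illustration, and the sketch assumes a GCC-compatible compiler plus a linker script that collects `.ram0_init.*` input sections into RAM; check the linker script actually in use before relying on it.

```c
/* Hypothetical helper macro: places the named function into its own
 * ".ram0_init.<name>" input section. Assumes GCC/Clang on an ELF target;
 * on other toolchains it degrades to a plain function name. */
#if defined(__GNUC__) && defined(__ELF__)
#define RAM_FUNCTION(name) \
    __attribute__((noinline, section(".ram0_init." #name))) name
#else
#define RAM_FUNCTION(name) name
#endif

/* Example function (hypothetical): everything it calls must also live in
 * RAM or ROM, otherwise it will still fetch from flash via XIP. */
int RAM_FUNCTION(ram_example_increment)(int x)
{
    return x + 1;
}
```

The `noinline` attribute is there so the compiler does not inline the body into a caller that still lives in flash. Library helpers pulled in implicitly by the compiler (for example division routines) are not covered by this and would need separate checking in the map file.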

Thanks.

electronic_eel

Re: Multi Core: locking execution on the other core

Postby electronic_eel » Wed Oct 30, 2024 4:05 pm

Ok, I now implemented some "core catching" logic where the other core is instructed to spin until the flash-accessing core is finished and XIP can be used again:

Code:

typedef enum
{
    RELEASE = 0,
    WAITING_FOR_CATCH,
    OTHER_CORE_CAUGHT,
} flash_access_state_t;

static volatile flash_access_state_t flash_access_state = RELEASE;

void RAM_FUNCTION(catch_core0)(void)
{
    if (port_get_core_id() != 0)
    {
        // wrong core: nothing we can do about it here
        return;
    }

    if (flash_access_state == WAITING_FOR_CATCH)
    {
        // do not reschedule this thread anymore, block all IRQs
        chSysDisable();
       
        flash_access_state = OTHER_CORE_CAUGHT;
        __DMB();
       
        // spin here until the other core releases us again
        while (flash_access_state != RELEASE);
       
        chSysEnable();
    }
}

void RAM_FUNCTION(read_out_flash_id)(void)
{
    rom_connect_internal_flash_func();
    rom_flash_exit_xip_func();

    // TODO: real flash access functions go here
   
    rom_flash_flush_cache_func();
    flash_enable_xip_via_boot2();
   
    // XIP should now work again as before
}

void cmd_flash_id(BaseSequentialStream *chp, int argc, char *argv[])
{
    (void)argv; (void)argc;
   
    // this function may only be called from core1,
    // because we need to lock the other core (core0)
    // during direct flash access
    if (port_get_core_id() != 1)
    {
        chprintf(chp, "ERROR: flash_id called from the wrong core!\r\n");
        return;
    }

    flash_access_state = WAITING_FOR_CATCH;
    __DMB();
   
    chEvtSignal(&ch0.mainthread, EVT_CATCH_CORE0);

    // TODO: probably add some timeout
    while (flash_access_state != OTHER_CORE_CAUGHT);

    // now we can lock the OS
    chSysLock();
   
    read_out_flash_id();

    chSysUnlock();

    // release the other core again
    flash_access_state = RELEASE;
    __DMB();
   
    chprintf(chp, "we back to XIP\r\n");
}


I don't know if this is the best way to implement this, but brief testing suggests that it works.
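The timeout mentioned in the TODO of cmd_flash_id could be approached with a bounded spin, roughly as sketched here. The helper name `spin_until_equal` and the idea of counting polls instead of reading a hardware timer are assumptions for illustration, not something taken from ChibiOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: poll *flag until it equals `expected`, giving up
 * after `max_spins` polls. Returns true if the value was observed. */
static bool spin_until_equal(const volatile int *flag, int expected,
                             uint32_t max_spins)
{
    for (uint32_t i = 0; i < max_spins; i++) {
        if (*flag == expected) {
            return true;
        }
    }
    return false; /* timed out: caller decides how to recover */
}
```

On timeout the caller would presumably set the state back to RELEASE and report an error instead of touching the flash. Note that the other core might still catch the request just after the timeout expires, so the recovery path needs some thought of its own.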

I communicate between the two cores with the volatile enum flash_access_state and call __DMB() after writing to it. Is this enough to guarantee that the other core always sees a change?
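As a point of comparison for the volatile-plus-__DMB() approach, the same handshake state can be expressed with C11 `<stdatomic.h>` release/acquire operations, which put the ordering requirement on the accesses themselves instead of on separately placed barriers. This is only a sketch: the helper names `flash_state_publish` and `flash_state_is` are made up for illustration, and whether the toolchain emits the expected barriers for the Cortex-M0+ is best confirmed in the disassembly.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef enum {
    RELEASE = 0,
    WAITING_FOR_CATCH,
    OTHER_CORE_CAUGHT,
} flash_access_state_t;

/* Shared handshake state; stored as an atomic int so plain loads and
 * stores are enough (no read-modify-write operations are used). */
static _Atomic int flash_access_state = RELEASE;

/* Hypothetical helper: publish a new state. Release ordering makes all
 * earlier writes of this core visible before the state change. */
static inline void flash_state_publish(flash_access_state_t s)
{
    atomic_store_explicit(&flash_access_state, (int)s, memory_order_release);
}

/* Hypothetical helper: check the current state with acquire ordering,
 * so later reads are not reordered before the check. */
static inline bool flash_state_is(flash_access_state_t s)
{
    return atomic_load_explicit(&flash_access_state,
                                memory_order_acquire) == (int)s;
}
```

With these helpers the spin loops would read `while (!flash_state_is(RELEASE)) { }` and the state changes become `flash_state_publish(OTHER_CORE_CAUGHT);`, with no separate barrier call.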

Is calling chSysDisable() enough to ensure no other code can be executed on the non-flash-accessing core?

Giovanni
Site Admin
Posts: 14535
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1098 times
Been thanked: 934 times

Re: Multi Core: locking execution on the other core

Postby Giovanni » Wed Oct 30, 2024 4:43 pm

Hi,

Critical sections are global, so calling chSysLock() on one core prevents any other core from getting past its own chSysLock(). On the other hand, chSysDisable()/chSysEnable()/chSysSuspend() deal with interrupts, not critical sections, so they do not affect other cores.

Suspending the other core is non-trivial. You could wake up a high-priority thread running on the target core that disables interrupts, then jumps into RAM and waits for some memory flag to be set before returning to flash and proceeding. Better not to have active VTs while doing that, because they could go out of sync if interrupts are disabled for long.

Giovanni

electronic_eel

Re: Multi Core: locking execution on the other core

Postby electronic_eel » Wed Oct 30, 2024 5:03 pm

Giovanni wrote: Suspending the other core is non-trivial. You could wake up a high-priority thread running on the target core that disables interrupts, then jumps into RAM and waits for some memory flag to be set before returning to flash and proceeding.

Yes, this is basically what I implemented in the code snippet I posted. If that is also what you suggest, then this method can't be that bad :)

Giovanni wrote: Better not to have active VTs while doing that, because they could go out of sync if interrupts are disabled for long.

Yes, that could indeed be an issue. By "going out of sync" you don't mean that they just become "due but still unhandled", but that the underlying timer wraps so that they aren't detected as due even though they should be, correct?

The RP2040 has this nice 64-bit microsecond timer. It is unlikely that this is ever going to wrap in the lifetime of my circuit. But the alarms based on it only compare the lower 32 bits, and those will overflow. I haven't yet investigated in the code how this case is handled in detail. I guess issues in this case are what you mean by going out of sync.
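To put a number on the overflow: the lower 32 bits of a microsecond counter wrap every 2^32 µs, which is about 4295 seconds or roughly 71.6 minutes. The sketch here shows a common wrap-tolerant way to test whether a 32-bit deadline has passed. It is a generic illustration with a made-up function name, not how the ChibiOS virtual timers or the RP2040 alarm hardware are implemented, and it only gives correct answers while the distance between `now` and `deadline` stays below half the counter range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `now` is at or past `deadline`, treating
 * both as free-running 32-bit counters that may have wrapped. The unsigned
 * subtraction wraps modulo 2^32; reinterpreting the result as signed tells
 * on which side of the deadline we are. */
static inline bool deadline_reached(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}
```

For example, a deadline of 0xFFFFFFF0 checked at a counter value of 5 (just after a wrap) is reported as reached, while a plain `now >= deadline` comparison would say it is not.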

Giovanni

Re: Multi Core: locking execution on the other core

Postby Giovanni » Wed Oct 30, 2024 5:30 pm

electronic_eel wrote: Yes, that could indeed be an issue. By "going out of sync" you don't mean that they just become "due but still unhandled", but that the underlying timer wraps so that they aren't detected as due even though they should be, correct?


Correct, out of sync means not serving the timer interrupt in due time, which can mess with the next deadlines in the delta list.

Giovanni

electronic_eel

Re: Multi Core: locking execution on the other core

Postby electronic_eel » Wed Oct 30, 2024 5:51 pm

Thank you.

I will review the implementation of the virtual timers for the RP2040 and then try to trigger this case with some test code and asserts, to verify whether this issue can affect my code or not.
