FatFS interface on H7

Report here problems in any of ChibiOS components. This forum is NOT for support.
geebee
Posts: 33
Joined: Thu Dec 06, 2018 10:22 pm
Has thanked: 4 times
Been thanked: 15 times

FatFS interface on H7

Postby geebee » Tue Feb 09, 2021 9:06 pm

Hi,

I think there is an issue with the FatFS bindings on the STM32H7, and in general for systems witch cache.
I can't seem to get things working without adding the cache invalidate/flush suggested for DMA. However, even doing so did not get past initialization.

Upon debugging in depth, it turns out that FatFS's read/write buffer is not aligned to a cache line, and important internal state variables share the same cache line, so when invalidated before reading the second or third sector the state gets corrupted.

I attached a suggestion for a small fix for it:
- add cacheBufferInvalidateUnaligned/cacheBufferFlushUnaligned
- use them in disk_read/disk_write

Notes:
- I have developed and tested thoroughly on 20.3, and briefly checked that it still works with the master branch
- There are no comments and there are some formatting mismatch.

Thanks!

GB
Attachments
fatfs_patch.zip
(3.53 KiB) Downloaded 170 times

User avatar
alex31
Posts: 379
Joined: Fri May 25, 2012 10:23 am
Location: toulouse, france
Has thanked: 38 times
Been thanked: 62 times
Contact:

Re: FatFS interface on H7

Postby alex31 » Wed Feb 10, 2021 8:36 am

Hello,

I can confirm that we have unexplained filesystem corruption using SD/FATFS/ChibiOS 20.3 on L1 cache fitted MCU.
Don't know if our problem is related to the bug you mention, should made a try.

Alexandre

User avatar
Giovanni
Site Admin
Posts: 14455
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1076 times
Been thanked: 922 times
Contact:

Re: FatFS interface on H7

Postby Giovanni » Wed Feb 10, 2021 8:44 am

alex31 wrote:Hello,

I can confirm that we have unexplained filesystem corruption using SD/FATFS/ChibiOS 20.3 on L1 cache fitted MCU.
Don't know if our problem is related to the bug you mention, should made a try.

Alexandre


I think it is related to that, FatFS support is not cache-aware yet. Those unaligned functions are a good idea.

Giovanni

User avatar
alex31
Posts: 379
Joined: Fri May 25, 2012 10:23 am
Location: toulouse, france
Has thanked: 38 times
Been thanked: 62 times
Contact:

Re: FatFS interface on H7

Postby alex31 » Wed Feb 10, 2021 1:47 pm

Hello geebee,

I have question about your implementation,

You do up to 2 cacheBufferFlush in your invalidate function, can you explain the rational behind that ?

I don't say it's a problem, it's just i want to understand.

I have another concern about the size calculation :
in my opinion, if you want to flush a cross boundary buffer, you'll have to flush all the touched lines
example :
32 bytes memory buffer which is @address 16, 2 cache lines has to be flushed : 0-31 and 32-63, so
a flush (0x0, 64) should be emitted, but your macro will emit a flush(0x, 32) :

compiler explorer snip
(I have replaced uint32_t* by uint64_t* in the snip because compiler explorer runs on 64 bits machine, but of course uint32_t is correct in your initial code for chibios)

Thanks

Alexandre


edited 10/02/21 -- 14h25

geebee
Posts: 33
Joined: Thu Dec 06, 2018 10:22 pm
Has thanked: 4 times
Been thanked: 15 times

Re: FatFS interface on H7

Postby geebee » Thu Feb 11, 2021 8:11 pm

alex31 wrote:Hello geebee,

I have question about your implementation,

You do up to 2 cacheBufferFlush in your invalidate function, can you explain the rational behind that ?

I don't say it's a problem, it's just i want to understand.


FatFS has something like this (conceptually):

Code: Select all

struct __attribute__((packed)) state_and_buffer {
  uint32_t internal_state;
  uint8_t buffer[32];
};


If you do not flush, when you f_mount two cache lines are invalidated, one with internal_state and 28 bytes of buffer, and the other with 4 bytes of buffer plus some other stuff. Then buffer is filled in with data from DMA, and FatFS uses that data to compute some value for internal_state.

Now say that you want to open a file and need to read in another sector. You invalidate the same two lines and start off the DMA. But then once the operation is completed and you re-read the cache line in, internal_state is also re-read from SRAM, so the state computed at the previous iteration is discarded.


So flushing before invalidating is needed to ensure that the values written to the variables sharing cache lines with the buffer are not lost.

The reason why it's done up to two times is that you can have that situation at the beginning and at the end of the buffer.

Looking back at my code now, I see a bug when start is aligned but stop is not and the buffer fits inside a cache line. I was trying to be clever with not flushing twice if it's the same cache line, but either we should flush every time anyway, or change the condition to

Code: Select all

...
    if ((astop != astart) || (astart == start))                         \
      cacheBufferFlush(astop, CACHE_LINE_SIZE);                         \
...


alex31 wrote:I have another concern about the size calculation :
in my opinion, if you want to flush a cross boundary buffer, you'll have to flush all the touched lines
example :
32 bytes memory buffer which is @address 16, 2 cache lines has to be flushed : 0-31 and 32-63, so
a flush (0x0, 64) should be emitted, but your macro will emit a flush(0x, 32) :

compiler explorer snip
(I have replaced uint32_t* by uint64_t* in the snip because compiler explorer runs on 64 bits machine, but of course uint32_t is correct in your initial code for chibios)


That is correct, but also I think that's what my code does. You did not put this piece of the macro in that compiler explorer, which is what would increase astop to the next cache line in that situation:

Code: Select all

  if (astop != stop)                                                    \
    astop += CACHE_LINE_SIZE;                                           \


Thanks for looking things over! The more testing the better.

I'm re-attaching the changes with the bug above fixed. Again, this is really just a starting point as it's lacking comments, proper formatting, and probably clearer variable names.

GB
Attachments
fatfs_patch.zip
(3.54 KiB) Downloaded 150 times

User avatar
Giovanni
Site Admin
Posts: 14455
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1076 times
Been thanked: 922 times
Contact:

Re: FatFS interface on H7

Postby Giovanni » Thu Feb 11, 2021 8:20 pm

I think you need only to flush the two cache lines at boundaries then invalidate all the flash lines covering the whole buffer. The reason is that variables adjacent to the buffer would be corrupted if only the invalidation is performed (unrelated variables would be invalidated, the content is in cache and not yet in RAM).

I see another problem: imagine that the adjacent variables are modified by other threads or ISRs, the values could change after the "flush" and before the "invalidate". Because of this the whole procedure should be performed within a critical section. This could lead to very sporadic and hard to debug errors.

Giovanni

geebee
Posts: 33
Joined: Thu Dec 06, 2018 10:22 pm
Has thanked: 4 times
Been thanked: 15 times

Re: FatFS interface on H7

Postby geebee » Thu Feb 11, 2021 10:10 pm

True. The extra stuff at the beginning of the buffer is safe in this specific case, but not in general. And even for this case the linker could put data accessed by other threads after the end of the buffer.

The ideal fix would be for FatFS itself to move the buffer to the beginning of struct FATFS (or add extra fields to align it, but that gets complicated since the fields present depend on various configuration macros), and then we can just use the regular flush/invalidate. I don't see any downsides to that, so we could propose it and eventually it will make its way to ChibiOS.

Until then, we can probably use an additional aligned buffer to do DMA for fatfs_diskio if the MCU has cache. Not ideal, but maybe the MCUs for which this is a problem have enough memory that it's not too big of a problem? At least for our application, we can sacrifice 512 bytes for this, not sure how much it impacts other users.

GB

geebee
Posts: 33
Joined: Thu Dec 06, 2018 10:22 pm
Has thanked: 4 times
Been thanked: 15 times

Re: FatFS interface on H7

Postby geebee » Thu Feb 11, 2021 10:40 pm

Actually, I was experimenting a bit with things, and at least for our use the following seems to work well, requiring no changes in FatFS, and only requiring to add the (regular, aligned) flush/invalidate to fatfs_diskio.

Basically manually allocate a larger buffer, and incapsulate it into a method to return a pointer to a FATFS, which will adjust the offset:

Code: Select all


FATFS *fatfs(void)
{
  static __attribute__((aligned(CACHE_LINE_SIZE))) uint8_t _fatfs_buf[sizeof(FATFS) + CACHE_LINE_SIZE];
  const size_t offset = offsetof(FATFS, win) % CACHE_LINE_SIZE;
  return (FATFS*)(_fatfs_buf + (offset ? CACHE_LINE_SIZE - offset : 0));
}



and then use like

Code: Select all

  f_mount(fatfs(), "", 1);


For me, this gets compiled down to a constant function when building for debug, and probably gets fully inlined with optimizations. In any case for most uses the FATFS object is only needed during mounting.

On top of adding the (aligned) flush/invalidate, I would also check in disk_read and disk_write that the buffer is aligned and chSysHalt() otherwise, and in the code near the halt comment on how to properly allocate the buffer/object.

GB

geebee
Posts: 33
Joined: Thu Dec 06, 2018 10:22 pm
Has thanked: 4 times
Been thanked: 15 times

Re: FatFS interface on H7

Postby geebee » Thu Feb 11, 2021 10:59 pm

Thinking some more about it, this problem can potentially affect anything else that uses DMA. For example, if you allocate a small buffer for a SPI transfer that ends up on the same cache line as other global data used by other threads, you could have the same behavior you outlined.

So maybe the correct thing to do would be to check in flush/invalidate that the address is aligned and the size is a multiple of the cache line size. The check could be performed only when CH_DBG_ENABLE_CHECKS is true to not add overhead to release builds.

GB

User avatar
Giovanni
Site Admin
Posts: 14455
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1076 times
Been thanked: 922 times
Contact:

Re: FatFS interface on H7

Postby Giovanni » Fri Feb 12, 2021 10:28 am

The easiest solution is to add a critical section to the unaligned invalidate function. The best solution for performance are aligned buffers.

Giovanni


Return to “Bug Reports”

Who is online

Users browsing this forum: No registered users and 9 guests