I've been spending more time than I should thinking about this, and come up with my own set of random thoughts, mostly based on solution 3:
I'm assuming 512-byte data chunks, for compatibility with FatFS, although the principles can be applied to chunks of any size.
The flash I'm intending to use (Microchip SST26VF064) supports page writes of any size, and with any start address; its only the erase which operates on large (4K, typically) blocks.
And I'm sure other devices are much the same.
So an algorithm which appends data within blocks could be quite efficient, as well as being economical with RAM.
At the lowest level, work with the device's minimum page erase size as a basic block. Divide it into 514-byte buckets, where the first two bytes are a bucket number, and the rest is a chunk of data.
[Could use 516-byte buckets, with a 16-bit CRC on the data]
For an empty bucket, the bucket number is 0xffff.
For a used bucket, the bucket number is some other non-zero value which uniquely identifies the bucket within the block.
For a 'cast off' bucket, the pointer is zero.
When allocating buckets within a block, allow a percentage of free space to allow for writes (system configuration constant?)
At a higher level, assign some blocks to be a "directory"; these will follow the structure just described, although probably with a much smaller data size. These entries identify the
location of each block within the memory. Again, allow a percentage of free space (separately configurable from the previous value?) within the overall device.
Within each block, similar processes apply for both directory and data:
To read:
1. Calculate the initial bucket address as an offset within the block.
2. If the first two bytes of the bucket are 0xffff, return "uninitialised data"
3. If the first two bytes of the bucket match the expected bucket number, return the following data
4. Step through the "reallocation space" looking for the required bucket number; return data
To write:
1. Calculate the initial bucket address as an offset within the block.
2. If the first two bytes of the bucket are 0xffff, write the data and return
3. Find the bucket currently assigned; update the bucket address
3a. Possibly check to see if the write data has actually changed; if not, return
4. Find an unused bucket in the "reallocation space" of the current block
5. If reallocation space available (in current block)
Write new data into bucket
Write zero into the first two bytes of the previously assigned bucket
return
6. Find an unused block
7. Copy data from the currently assigned block, garbage collecting on the way, and merging new data
8. Update the directory entry to point to the new block
9. Erase the previously assigned block
At the highest level, the required location is determined initially from the directory block.
A 4K block size is going to hold 7.9 buckets total, so lots wasted space. Probably better to work with larger blocks (32K?) to give realistic numbers.
The directory of blocks is going to need around 4K entries for a 8Mbyte flash memory - a trivial overhead
This approach mostly does garbage collection on the fly, with a maximum of two page erases and two complete page writes. Some form of background garbage collection is also a
possibility, scanning and seeking blocks with less than a certain percentage of buckets free.
Treat directory blocks the same as data blocks? That way should help with wear levelling, since directory blocks probably reallocated more often.
If go for 32K block size, will often only need one 'active' directory block.
Keeping track of the directory location(s) could be interesting; these may have to be specifically identified, and the device scanned to find them at startup. This information
could then be cached in non-volatile storage elsewhere - FRAM/EEPROM/battery backed RAM.
The F767 supports a 'memory-mapped' mode for QSPI. Could this simplify things at all?
A level of RAM caching could sometimes be beneficial
This has also made me wonder whether the original MFS implementation is appropriate to a file system; with the right design, a single copy of everything may well suffice.
As an aside, there are some interesting comments on the
UBIFs web site, especially in relation to NAND flash.