-
Notifications
You must be signed in to change notification settings - Fork 797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forward-looking erase-state CRCs #497
Conversation
918bb02
to
31f4d26
Compare
d677a96
to
2d2dd8b
Compare
db2abfb
to
aa225a9
Compare
076f871
to
17c9665
Compare
17c9665
to
1f37eb5
Compare
This change is necessary to handle out-of-order writes found by pjsg's fuzzing work. The problem is that it is possible for (non-NOR) block devices to write pages in any order, or to even write random data in the case of a power-loss. This breaks littlefs's use of the first bit in a page to indicate the erase-state. pjsg notes this behavior is documented in the W25Q here: https://community.cypress.com/docs/DOC-10507 --- The basic idea here is to CRC the next page, and use this "erase-state CRC" to check if the next page is erased and ready to accept programs. .------------------. \ commit | metadata | | | | +---. | | | | |------------------| | | | erase-state CRC -----. | |------------------| | | | | commit CRC ---|-|-' |------------------| / | | padding | | padding (doesn't need CRC) | | | |------------------| \ | next prog | erased? | +-' | | | | | v | / | | | | '------------------' This is made a bit annoying since littlefs doesn't actually store the page (prog_size) in the superblock, since it doesn't need to know the size for any other operation. We can work around this by storing both the CRC and size of the next page when necessary. Another interesting note is that we don't need to any bit tweaking information, since we read the next page every time we would need to know how to clobber the erase-state CRC. And since we only read prog_size, this works really well with our caching, since the caches must be a multiple of prog_size. This also brings back the internal lfs_bd_crc function, in which we can use some optimizations added to lfs_bd_cmp. Needs some cleanup but the idea is passing most relevant tests.
- General cleanup from integration, including cleaning up some older commit code - Partial-prog tests do not make sense when prog_size == block_size (there can't be partial-progs!) - Fixed signed-comparison issue in modified filebd
Previously forward-looking CRCs was just two new CRC types, one for commits with forward-looking CRCs, one without. These both contained the CRC needed to complete the current commit (note that the commit CRC must come last!). [-- 32 --|-- 32 --|-- 32 --|-- 32 --] with: [ crc3 tag | nprog size | nprog crc | commit crc ] without: [ crc2 tag | commit crc ] This meant there had to be several checks for the two possible structure sizes, messying up the implementation. [-- 32 --|-- 32 --|-- 32 --|-- 32 --|-- 32 --] with: [nprogcrc tag| nprog size | nprog crc | commit tag | commit crc ] without: [ commit tag | commit crc ] But we already have a mechanism for storing optional metadata! The different metadata tags! So why not use a separate tage for the forward-looking CRC, separate from the commit CRC? I wasn't sure this would actually help that much, there are still necessary conditions for wether or not a forward-looking CRC is there, but in the end it simplified the code quite nicely, and resulted in a ~200 byte code-cost saving.
This fixes most of the remaining bugs (except one with multiple padding commits + noop erases in test_badblocks), with some other code tweaks. The biggest change was dropping reliance on end-of-block commits to know when to stop parsing commits. We can just continue to parse tags and rely on the crc for catch bad commits, avoiding a backwards-compatiblity hiccup. So no new commit tag. Also renamed nprogcrc -> fcrc and commitcrc -> ccrc and made naming in the code a bit more consistent.
Initially I thought the fcrc would be sufficient for all of the end-of-commit context, since indicating that there is a new commit is a simple as invalidating the fcrc. But it turns out there are cases that make this impossible. The surprising, and actually common, case, is that of an fcrc that will end up containing a full commit. This is common as soon as the prog_size is big, as small commits are padded to the prog_size at minimum. .------------------. \ | metadata | | | | | | | +-. |------------------| | | | foward CRC ------------. |------------------| / | | | commit CRC -----' | |------------------| | | padding | | | | | |------------------| \ \ | | metadata | | | | | | +-. | | | | | | +-' |------------------| / | | | commit CRC --------' | |------------------| | | | / '------------------' When the commit + crc is all contained in the fcrc, something silly happens with the math behind crcs. Everything in the commit gets canceled out: crc(m) = m(x) x^|P|-1 mod P(x) m ++ crc(m) = m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x)) crc(m ++ crc(m)) = (m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x))) x^|P|-1 mod P(x) crc(m ++ crc(m)) = (m(x) x^|P|-1 + m(x) x^|P|-1) x^|P|-1 mod P(x) crc(m ++ crc(m)) = 0 * x^|P|-1 mod P(x) This is the reason the crc of a message + naive crc is zero. Even with an initializer/bit-fiddling, the crc of the whole commit ends up as some constant. So no manipulation of the commit can change the fcrc... But even if this did work, or we changed this scheme to use two different checksums, it would still require calculating the fcrc of the whole commit to know if we need to tweak the first bit to invalidate the unlikely-but-problematic case where we happen to match the fcrc. This would add a large amount of complexity to the commit code. It's much simpler and cheaper to keep the 1-bit counter in the tag, even if it adds another moving part to the system.
…orphans This of course should never happen normally, two half-orphans requires two parents, which is disallowed in littlefs for this reason. But it can happen if there is an outdated half-orphan later in the metadata linked-list. The two half-orphans can cause the deorphan step to get stuck, constantly "fixing" the first half-orphan before it has a chance to remove the problematic, outdated half-orphan later in the list. The solution here is to do a full check for half-orphans before restarting the half-orphan loop. This strategy has the potential to visit more metadata blocks unnecessarily, but avoids situations where removing a later half-orphan will eventually cause an earlier half-orphan to resolve itself. Found with heuristic powerloss testing with test_relocations_reentrant_renames after 192 nested powerlosses.
This looks exciting! |
7e5f2fc
to
ba1c764
Compare
This wasn't implemented correctly anyways, as it would need to recursively rename directories that may not exist. Things would also get a bit complicated if only some files in a directory were renamed. Doable, but not needed for our use case. For now just ignore any directory components. Though this may be worth changing if the source directory structure becomes more complicated in the future (maybe with a -r/--recursive flag?).
9954939
to
81a7a1e
Compare
This is a bit tricky since we need two different version of littlefs in order to test for most compatibility concerns. Fortunately we already have scripts/changeprefix.py for version-specific symbols, so it's not that hard to link in the previous version of littlefs in CI as a separate set of symbols, "lfsp_" in this case. So that we can at least test the compatibility tests locally, I've added an ifdef against the expected define "LFSP" to define a set of aliases mapping "lfsp_" symbols to "lfs_" symbols. This is manual at the moment, and a bit hacky, but gets the job done. --- Also changed BUILDDIR creation to derive subdirectories from a few Makefile variables. This makes the subdirectories less manual and more flexible for things like LFSP. Note this wasn't possible until BUILDDIR was changed to default to "." when omitted.
This uses the "github.event.pull_request.base.ref" variable as the "lfsp" target for compatibility testing.
This just means a rewrite of the superblock entry with the new minor version. Though it's interesting to note, we don't need to rewrite the superblock entry until the first write operation in the filesystem, an optimization that is already in use for the fixing of orphans and in-flight moves. To keep track of any outdated minor version found during lfs_mount, we can carve out a bit from the reserved bits in our gstate. These are currently used for a counter tracking the number of orphans in the filesystem, but this is usually a very small number so this hopefully won't be an issue. In-device gstate tag: [-- 32 --] [1|- 11 -| 10 |1| 9 ] ^----^-----^--^--^-- 1-bit has orphans '-----|--|--|-- 11-bit move type '--|--|-- 10-bit move id '--|-- 1-bit needs superblock '-- 9-bit orphan count
See SPEC.md for more info. Also considered adding an explanation to DESIGN.md, but there's not a great place for it. Maybe FCRCs are too low-level for the high-level design document. Though may be worth reconsidering if DESIGN.md gets revisited.
Just added documentation about this change to the SPEC.md: Things look like they are working so this should be ready to merge soon. Sorry again about the delay, this is the first change that involves bumping the on-disk minor version. This sort of irreversible change can lead to unpleasant situations if a mistake is made, so I wanted to make sure forward/backward compatibility is decently tested first. 116332d and 4c93600 should hopefully provide decent confidence that the on-disk minor version is respected. This will also need a good explanation in the release notes... |
Original issue here: #245 (comment)
This is a proposal to fix the out-of-order writes found by @pjsg's fuzzing work.
The problem is that it is possible for (non-NOR) block devices to write pages in any order, or to even write random data in the case of a power-loss. This breaks littlefs's use of the first bit in a page to indicate the erase-state.
@pjsg notes this behavior is documented in the W25Q here:
https://community.cypress.com/docs/DOC-10507
The basic idea here is to CRC the next page, and use this "erase-state CRC" to check if the next page is erased and ready to accept programs.
This is made a bit annoying since littlefs doesn't actually store the page (prog_size) in the superblock, since it doesn't need to know the size for any other operation. We can work around this by storing both the CRC and size of the next page when necessary.
Another interesting note is that we don't need to any bit tweaking information, since we read the next page every time we would need to know how to clobber the erase-state CRC. And since we only read prog_size, this works really well with our caching, since the caches must be a multiple of prog_size.
This also brings back the internal lfs_bd_crc function, in which we can use some optimizations added to lfs_bd_cmp in #395.
TODO:
Depends on #495
Related to #245