flash: spi_nand: initial generic driver#100845
flash: spi_nand: initial generic driver#100845kartben merged 2 commits intozephyrproject-rtos:mainfrom
Conversation
2dbcf36 to
c8775c8
Compare
|
Ping @rogeryou and @daniel-0723, since I can't seem to add you as reviewers |
c8775c8 to
97f4c91
Compare
|
|
You could already check if the implemented NFTL disk driver module works for you (#100858). |
|
I also gave this a try with a Winbond W25N01GVZEIG |
|
@de-nordic @rghaddab is there anything blocking this review? I appreciate you probably don't have any hardware to test it against. |
Sorry, I will take a look this evening. |
|
@JordanYates how does this fit #100683? Are we just going to have the same as NOR, where we have an SPI NOR driver and then flash-specific ones? |
This driver issues the SET_FEATURE/GET_FEATURE command to addresses 0xa0, 0xb0, 0xc0. These are in the vendor specific range (0x80-0xff) according to ONFI spec. Most SPI NANDs have similar content there, e.g. ECC enable bit is the same on Micron MT29F4G01ABBFDWB and Winbond W25N01GVZEIG. But there are also differences, I haven't looked at everything but Micron MT29F4G01ABBFDWB has 3 bits of ECC status in the status register, but Winbond W25N01GVZEIG has only 2 bits of ECC status. Currently, the driver in this PR is not checking the ECC status bits. |
Thanks @tpambor for the feedback, but what I meant is how will we organize the code for NAND flashes in general. In the case of NOR, we have:
My question was whether we will have something similar here, in the lines of:
|
I wasn't aware that these were defined as vendor specific commands, but as I mentioned in the PR description I did my best to ensure that only defines that were common across the two manufacturers were used. Maybe it won't end up being true for all NAND flashes, but I would want to see a counter-example before complicating the driver.
Correct, the ECC bits are currently not checked since AFAIK its not required for this basic operation mode. Obviously when combined with something like Dhara they start to have meaning, but what is the flash API supposed to do when these bits start to be set? Note that the program and erase fail bits are common, and are checked to confirm operation success/failure. |
The NAND flashes supported by this PR don't do bad block allocation or wear leveling internally AFAIK. This driver is more the first option. The longer term goal would be to extend this driver with support for the extension commands that Dhara needs, so that it can be either used "raw" for use-cases that don't care about the unreliability, or with Dhara to get the error handling.
Sounds reasonable to me |
I had a closer look across vendor datasheets and these registers are sufficiently similar to implement a generic solution for the basic functionality.
I would expect that if ECC detects unrecoverable errors, this is propagated as an error (like -EIO) to the upper layers (FTL, FS, application). Dhara currently doesn't do anything too special: if a page contains unrecoverable errors, it propagates that error for data pages and avoids utilizing them for metadata. While handling this isn't strictly required for basic operation, "bad things" may happen if unrecoverable errors are present but not reported, regardless of whether Dhara is used or not.
This is fine and all that Dhara expects. Dhara would then mark the block as a bad but these extensions can be implemented in a follow-up PR. |
97f4c91 to
28477d0
Compare
Is your proposal the these features be handled elsewhere? |
My proposal is that advanced features can be debated in future PRs without bogging down the discussion for baseline support. That's what killed the other PR. |
Agreed! This is a good idea to move forward. |
de-nordic
left a comment
There was a problem hiding this comment.
Sorry it takes long, overworked and under the weather.
| k_sleep(K_USEC(poll_us)); | ||
| } while (!sys_timepoint_expired(timeout)); | ||
|
|
||
| LOG_ERR("Timeout waiting for flash ready (Status %02X)", *status); |
There was a problem hiding this comment.
This probably should be debug. Status does not matter that much in production code and you have to deal with timeout in calling code, so you will probably log the timeout info there too; probably with more accurate info like device you have been using.
There was a problem hiding this comment.
Personally I've found it useful even when not actively developing the driver. It is an error, and does provide context that isn't propagated further up, LOG_ERR seems appropriate to me.
132397d to
d104e63
Compare
a73e4ca to
1431609
Compare
de-nordic
left a comment
There was a problem hiding this comment.
I think it looks OK, the only controversial addition is the line that removes const, as it breaks one of our coding guidelines.
| { | ||
| const struct spi_nand_config *config = dev->config; | ||
| uint32_t write_block = config->parameters->write_block_size; | ||
| uint8_t *src_u8 = (void *)src; |
There was a problem hiding this comment.
We are not supposed to do that according to https://docs.zephyrproject.org/latest/contribute/coding_guidelines/index.html, rule 70
| /* Validate JEDEC ID */ | ||
| ret = spi_nand_cmd_read_dummy(dev, SPI_NAND_CMD_READ_ID, jedec_id, SPI_NAND_MAX_ID_LEN); | ||
| if (ret != 0) { | ||
| goto release; | ||
| } |
There was a problem hiding this comment.
Jesd 106 "Standard Manufacturer’s Identification Code" , the document only contains id codes.
Leave it as is. Till we figure common way to address it with NOR driver.
Initial generic driver for SPI-NAND devices that expose the ONFI parameter page. Does not support advanced features such as continuous reads, software ECC, or optional power down modes. Configuration comes from devicetree in order to support other modules, but is validated against the loaded ONFI data at boot. Signed-off-by: Jordan Yates <jordan@embeint.com>
Add `jedec,spi-nand` node to the build all test. Signed-off-by: Jordan Yates <jordan@embeint.com>
1431609 to
096ff17
Compare
|
tpambor
left a comment
There was a problem hiding this comment.
LGTM. Retested on hardware:
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Valid CRC: 3D0F
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Manufacturer: WINBOND
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Model: W25N01GV
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Page Size (data): 2048
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Page Size (spare): 64
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Pages per Block: 64
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Blocks per Unit: 1024
[00:00:00.002,000] <dbg> spi_nand: onfi_parameters_load: Units: 1
uart:~$ flash test spi-nand@0 0 20000 1
Erase OK.
Write OK.
Verified OK.
Erase-Write-Verify test done.
|
@JordanYates I was testing this driver, and got into an issue. The driver operation are normal, but when I analysed the data, I found duplication and overwriting on previous pages size_t len = SPI_NAND_PAGE_SIZE; (2048) after writing 3 pages, expected output on read is page 1: full of 1, page 2: full of 2 and page 3: full of 3 Can you advise if I am doing anything wrong |
|
@prashant3285 I only have tested with the shell but I haven't seen any such behavior |
|
After further digging I see a pattern where odd number blocks are not working as desired but the even number blocks work fine Works: Block 0 2 4 Logs |
|
This issue is due to NAND with multiple planes. |
|
@prashant3285 Could you fill a issue so that we can track and fix this. Thanks. |



Initial generic driver for SPI-NAND devices that expose the ONFI parameter page. Does not support advanced/optional features such as continuous reads, software ECC, or power down modes.
Configuration comes from devicetree in order to support other modules, but is validated against the loaded ONFI data at boot.
Can provide a base towards future work on #99908.
PR #50690 was used as inspiration and for some logic (copyrights preserved), but this PR attempts to sidestep the significant review challenges around the software ECC support by simply providing a baseline driver that uses hardware ECC.
Implementing the driver for the Micron flash devices also exposed non-generic register defines etc used in the previous PR. Everything in this PR should be generic across Micron (tested in hardware) and Macronix (datasheets and the other PR).
Tested on Micron MT29F4G01ABBFDWB. Example debug logs from a (erase-write-read-erase functionality test)