
flash: spi_nand: support bad block management #104685

Merged
MaureenHelm merged 3 commits into zephyrproject-rtos:main from tpambor:spi-nand-bbm
Mar 23, 2026

Conversation

@tpambor
Contributor

@tpambor tpambor commented Feb 27, 2026

This PR adds support for bad block management in SPI NAND flash devices. It adds functions to mark a block as bad and to check whether a block is bad, regardless of whether it was marked at the factory or by software. The bad block marker is stored in the first byte of the OOB area of the first page of each block.
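To make the marker location concrete, here is a minimal sketch of the address arithmetic, assuming the geometry from the demo log (2048 data bytes and 64 spare bytes per page, 64 pages per block, 1024 blocks). The struct and the `bbm_*` helper names are hypothetical and are not the driver's API. The sketch also assumes the spare area starts directly after the data bytes in the page's column address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative geometry container; field names are assumptions. */
struct nand_geometry {
	uint32_t page_size;       /* data bytes per page */
	uint32_t spare_size;      /* OOB (spare) bytes per page */
	uint32_t pages_per_block;
	uint32_t blocks;
};

/* Page index of the first page of the block that contains data address `addr`. */
static uint32_t bbm_marker_page(const struct nand_geometry *g, uint32_t addr)
{
	uint32_t block_size = g->page_size * g->pages_per_block;

	assert(addr < block_size * g->blocks);
	return (addr / block_size) * g->pages_per_block;
}

/* Column of the marker within that page: the first byte after the data area. */
static uint32_t bbm_marker_column(const struct nand_geometry *g)
{
	return g->page_size;
}

/* Any marker value other than 0xFF is treated as "bad". */
static bool bbm_marker_is_bad(uint8_t marker)
{
	return marker != 0xFF;
}
```

With this geometry, an address in the fourth block maps to page 192 and column 2048, and the `marker 00` value in the log above would be classified as bad.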

Demo:

[00:00:00.009,000] <dbg> spi_nand: spi_nand_wait_until_ready: Ready after 200 us (Op reset, Status 00)
[00:00:00.011,000] <dbg> spi_nand: spi_nand_wait_until_ready: Ready after 200 us (Op read, Status 00)
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load: Valid CRC: 3D0F
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:      Manufacturer: WINBOND
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:             Model: W25N01GV
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:  Page Size (data): 2048
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load: Page Size (spare): 64
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:   Pages per Block: 64
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:   Blocks per Unit: 1024
[00:00:00.011,000] <dbg> spi_nand: onfi_parameters_load:             Units: 1
[00:00:00.012,000] <dbg> spi_nand: spi_nand_wait_until_ready: Ready after 200 us (Op read, Status 00)
[00:00:00.012,000] <dbg> spi_nand: spi_nand_is_bad_block: Block at address 000000 is good
[00:00:00.012,000] <dbg> spi_nand: spi_nand_mark_bad_block: Marking block starting at 000000 as bad
[00:00:00.013,000] <dbg> spi_nand: spi_nand_wait_until_ready: Ready after 600 us (Op write, Status 00)
[00:00:00.014,000] <dbg> spi_nand: spi_nand_wait_until_ready: Ready after 200 us (Op read, Status 00)
[00:00:00.014,000] <dbg> spi_nand: spi_nand_is_bad_block: Block at address 000000 is bad (marker 00)

@tpambor tpambor force-pushed the spi-nand-bbm branch 2 times, most recently from 3bc14f4 to 68955ff Compare March 2, 2026 13:13
@tpambor
Contributor Author

tpambor commented Mar 2, 2026

I also validated this with the flash translation layer from #100858. All disk tests pass:

*** Booting Zephyr OS build v4.3.0-7034-g6367b7e0488c ***
Running TESTSUITE disk_driver
===================================================================
Disk reports 56132 sectors
Disk reports sector size 2048
START - test_erase
Testing erase of 8 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 16 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 24 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 32 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
 PASS - test_erase in 3.430 seconds
===================================================================
START - test_read
Testing reads of 8 sectors
E: Requested sectors are out of range
Testing reads of 1 sectors
Testing reads of 29 sectors
E: Requested sectors are out of range
Testing reads of 31 sectors
E: Requested sectors are out of range
 PASS - test_read in 0.059 seconds
===================================================================
START - test_write
Testing writes of 8 sectors
E: Requested sectors are out of range
Testing writes of 1 sectors
Testing writes of 29 sectors
E: Requested sectors are out of range
Testing writes of 31 sectors
E: Requested sectors are out of range
 PASS - test_write in 1.630 seconds
===================================================================
TESTSUITE disk_driver succeeded

------ TESTSUITE SUMMARY START ------

SUITE PASS - 100.00% [disk_driver]: pass = 3, fail = 0, skip = 0, total = 3 duration = 5.119 seconds
 - PASS - [disk_driver.test_erase] duration = 3.430 seconds
 - PASS - [disk_driver.test_read] duration = 0.059 seconds
 - PASS - [disk_driver.test_write] duration = 1.630 seconds

------ TESTSUITE SUMMARY END ------

I will contribute the tests once either this or #100858 is merged.

@tpambor tpambor requested a review from JordanYates March 2, 2026 15:52
@tpambor tpambor added this to the v4.4.0 milestone Mar 2, 2026
@tpambor tpambor mentioned this pull request Mar 2, 2026
Comment thread drivers/flash/spi_nand.c
page_address = addr >> config->addr_page_shift;

/* Copy data from main storage to cache (ignore ECC errors) */
ret = spi_nand_page_read_to_cache(dev, page_address);
Contributor

Am I wrong that the intended usage flow for this would be something like:

if (is_bad_block(dev, block)) {
   return -EINVAL;
}
flash_read(dev, block, mem, len);

If so, the current implementation will read from main memory to cache twice for every block read.

Contributor Author

@tpambor tpambor Mar 3, 2026

Bad blocks don't really matter for reads. E.g. if a block goes bad, i.e. one of its pages is no longer programmable, it is still possible to read the other pages stored in the block and migrate these to another block.

Bad blocks are relevant for writes/erases. Datasheets of NAND flashes usually state something like:

System software should initially check the first spare area location for non-FFH data on the first page of each block prior to performing any program or erase operations on the NAND Flash device.

Nevertheless, performance could be improved by doing a bad block scan on init and storing a bad block table in memory.
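A rough sketch of what such an in-memory bad block table could look like, one bit per block, filled by a single scan at init. This is not part of the PR. `bbt_*` and the probe callback are hypothetical names, and `probe_from_markers` only stands in for a real query of the flash driver by reading from a plain array of marker bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BBT_MAX_BLOCKS 1024 /* blocks per unit reported in the demo log */

struct bbt {
	uint8_t bits[BBT_MAX_BLOCKS / 8]; /* one bit per block, 1 = bad */
	uint32_t blocks;
};

/* Hypothetical probe: returns true if `block` carries a bad block marker. */
typedef bool (*bbt_probe_fn)(uint32_t block, void *ctx);

/* Stand-in probe for illustration: ctx is an array with one marker byte per block. */
static bool probe_from_markers(uint32_t block, void *ctx)
{
	const uint8_t *markers = ctx;

	return markers[block] != 0xFF;
}

/* Scan every block once and cache the result; returns the number of bad blocks. */
static uint32_t bbt_scan(struct bbt *t, uint32_t blocks, bbt_probe_fn probe, void *ctx)
{
	uint32_t bad = 0;

	assert(blocks <= BBT_MAX_BLOCKS);
	memset(t->bits, 0, sizeof(t->bits));
	t->blocks = blocks;

	for (uint32_t b = 0; b < blocks; b++) {
		if (probe(b, ctx)) {
			t->bits[b / 8] |= (uint8_t)(1u << (b % 8));
			bad++;
		}
	}
	return bad;
}

/* Lookup without touching the flash. */
static bool bbt_is_bad(const struct bbt *t, uint32_t block)
{
	assert(block < t->blocks);
	return (t->bits[block / 8] >> (block % 8)) & 1u;
}

/* Keep the table in sync when software marks a block bad at runtime. */
static void bbt_mark_bad(struct bbt *t, uint32_t block)
{
	assert(block < t->blocks);
	t->bits[block / 8] |= (uint8_t)(1u << (block % 8));
}
```

For 1024 blocks the table costs 128 bytes of RAM, and a write or erase path could consult `bbt_is_bad()` instead of reading the marker page each time.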

Comment thread drivers/flash/spi_nand.c
return ret;
}

/* Copy bad block marker to cache (all other bytes stay reset at 0xff) */
Contributor

How do we know the cache bytes are 0xFF?

Contributor Author

That is the difference between the PROGRAM LOAD (0x02) and PROGRAM LOAD RANDOM (0x84) instructions. PROGRAM LOAD first resets the cache to the unprogrammed state (0xff) and then stores the data transmitted over SPI in the cache. PROGRAM LOAD RANDOM skips the reset and just overwrites the data in the cache. The context here is that setting the bad block marker performs a partial write of only that byte. This is fine because the marker byte is also not protected by ECC.
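The difference between the two instructions can be illustrated with a toy model of the page cache. This is only a simulation in plain C, not driver code. `struct page_cache` and both function names are made up for the example, and the cache size assumes the 2048 + 64 byte page from the demo log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE (2048 + 64) /* data + spare bytes per page */

/* Toy model of the chip's internal page cache. */
struct page_cache {
	uint8_t data[CACHE_SIZE];
};

/* PROGRAM LOAD (0x02): reset the whole cache to 0xFF, then store the payload. */
static void program_load(struct page_cache *c, uint16_t column,
			 const uint8_t *buf, size_t len)
{
	assert((size_t)column + len <= CACHE_SIZE);
	memset(c->data, 0xFF, sizeof(c->data));
	memcpy(&c->data[column], buf, len);
}

/* PROGRAM LOAD RANDOM (0x84): keep the cache contents, overwrite only the payload. */
static void program_load_random(struct page_cache *c, uint16_t column,
				const uint8_t *buf, size_t len)
{
	assert((size_t)column + len <= CACHE_SIZE);
	memcpy(&c->data[column], buf, len);
}
```

Loading a single 0x00 byte at column 2048 with `program_load()` leaves every other cache byte at 0xFF. Since programming NAND can only clear bits, those 0xFF bytes leave the rest of the page unchanged when the cache is committed.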

@tpambor
Contributor Author

tpambor commented Mar 6, 2026

Rebased and resolved conflicts. No other changes.

Comment thread include/zephyr/drivers/flash.h Outdated
Add support for bad block management in SPI NAND flash devices.
This includes functions to check if a block is bad and to mark
a block as bad. The bad block marker is stored in the OOB area of
the first page of each block.

Signed-off-by: Tim Pambor <tim.pambor@codewrights.de>
@tpambor
Contributor Author

tpambor commented Mar 20, 2026

Rebase only, no other changes

tpambor added 2 commits March 20, 2026 17:13
Add a test for the FTL disk driver using the spi_nand flash driver. This
test utilizes the frdm_mcxn947 board with the MikroE Flash 5 Click shield.

Signed-off-by: Tim Pambor <tim.pambor@codewrights.de>
…lash

Add a test for the FTL disk driver using the spi_nand flash driver. This
test utilizes the frdm_mcxn947 board with the MikroE Flash 5 Click shield.

Signed-off-by: Tim Pambor <tim.pambor@codewrights.de>
@zephyrbot zephyrbot added the area: Tests and area: Disk Access labels Mar 20, 2026
@tpambor
Contributor Author

tpambor commented Mar 20, 2026

I have added test coverage by adding FTL disk tests on top of the spi_nand driver.

disk_access test:

*** Booting Zephyr OS build v4.3.0-9208-g46db77d8e54b ***
Running TESTSUITE disk_driver
===================================================================
Disk reports 56976 sectors
Disk reports sector size 2048
START - test_erase
Testing erase of 8 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 16 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 24 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
Testing erase of 32 sectors
E: Requested sectors are out of range
E: Requested sectors are out of range
E: Requested sectors are out of range
 PASS - test_erase in 5.160 seconds
===================================================================
START - test_read
Testing reads of 8 sectors
E: Requested sectors are out of range
Testing reads of 1 sectors
Testing reads of 29 sectors
E: Requested sectors are out of range
Testing reads of 31 sectors
E: Requested sectors are out of range
 PASS - test_read in 0.378 seconds
===================================================================
START - test_write
Testing writes of 8 sectors
E: Requested sectors are out of range
Testing writes of 1 sectors
Testing writes of 29 sectors
E: Requested sectors are out of range
Testing writes of 31 sectors
E: Requested sectors are out of range
 PASS - test_write in 1.996 seconds
===================================================================
TESTSUITE disk_driver succeeded

------ TESTSUITE SUMMARY START ------

SUITE PASS - 100.00% [disk_driver]: pass = 3, fail = 0, skip = 0, total = 3 duration = 7.534 seconds
 - PASS - [disk_driver.test_erase] duration = 5.160 seconds
 - PASS - [disk_driver.test_read] duration = 0.378 seconds
 - PASS - [disk_driver.test_write] duration = 1.996 seconds

------ TESTSUITE SUMMARY END ------

disk_performance:

*** Booting Zephyr OS build v4.3.0-9208-g46db77d8e54b ***
Running TESTSUITE disk_performance
===================================================================
Disk reports 56976 sectors
Disk reports sector size 2048
START - test_random_read
2048 Byte IOPS over 10 random reads: 208 IOPS
 PASS - test_random_read in 0.052 seconds
===================================================================
START - test_random_write
2048 Byte IOPS over 10 random writes: 170 IOPS
 PASS - test_random_write in 0.181 seconds
===================================================================
START - test_sequential_read
Average read speed over one sector: 272 KiB/s
Average read speed over 40 sectors: 320 KiB/s
 PASS - test_sequential_read in 2.189 seconds
===================================================================
START - test_sequential_write
Average write speed over one sector: 662 KiB/s
Average write speed over 40 sectors: 240 KiB/s
 PASS - test_sequential_write in 3.264 seconds
===================================================================
TESTSUITE disk_performance succeeded

------ TESTSUITE SUMMARY START ------

SUITE PASS - 100.00% [disk_performance]: pass = 4, fail = 0, skip = 0, total = 4 duration = 5.686 seconds
 - PASS - [disk_performance.test_random_read] duration = 0.052 seconds
 - PASS - [disk_performance.test_random_write] duration = 0.181 seconds
 - PASS - [disk_performance.test_sequential_read] duration = 2.189 seconds
 - PASS - [disk_performance.test_sequential_write] duration = 3.264 seconds

------ TESTSUITE SUMMARY END ------

@tpambor
Contributor Author

tpambor commented Mar 20, 2026

@de-nordic please take a look

@tpambor
Contributor Author

tpambor commented Mar 23, 2026

@de-nordic PTAL

@MaureenHelm MaureenHelm merged commit 571d7e2 into zephyrproject-rtos:main Mar 23, 2026
29 of 30 checks passed

Labels

area: Disk Access, area: Flash, area: Tests


6 participants