fix: LittleFS Filename encoding UTF-8 by Jason2866 · Pull Request #444 · pioarduino/platform-espressif32

Jason2866 · 2026-03-27T10:46:15Z

Description:

Related issue (if applicable): LittleFS build can not support Taiwan traditional

Bug Fixes
- Improved UTF-8 encoding support for filesystem operations

Checklist:

The pull request is done against the latest develop branch
Only relevant files were touched
Only one feature/fix was added per PR, more changes are allowed when changing boards.json
I accept the CLA

coderabbitai · 2026-03-27T10:46:24Z

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: LittleFS Filename encoding UTF-8' directly matches the main objective of the PR, which is to fix LittleFS filename encoding to support UTF-8. The changes involve adding UTF-8 encoding specification for LittleFS filenames and partition table CSV parsing.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch lttlfs_utf8

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Jason2866 · 2026-03-27T10:46:37Z

@coderabbitai full review

coderabbitai · 2026-03-27T10:46:42Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@builder/main.py`:
- Line 303: The CSV reader opens the partitions file with encoding="utf-8" which
does not strip a UTF-8 BOM and can leave a leading '\ufeff' on the first line
(breaking line.startswith("#") comment checks); change the open call that
references partitions_csv to use encoding="utf-8-sig" so the BOM is removed
automatically before parsing (update the with open(partitions_csv, ...) call in
builder/main.py accordingly).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a7966332-159f-4ea9-99ce-54487bdb99dc

📥 Commits

Reviewing files that changed from the base of the PR and between 7aab252 and 43ed175.

📒 Files selected for processing (1)

builder/main.py

Remove encoding specification when opening the CSV file.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

builder/main.py (1)
303-303: ⚠️ Potential issue | 🟡 Minor

AI summary inconsistency: Partition CSV encoding not actually changed.

The AI summary states that partition-table CSV parsing was updated to specify UTF-8 encoding, but this line has no encoding parameter and is not marked as changed. The commit message confirms encoding was actually removed.

Without explicit encoding="utf-8-sig", this relies on system default encoding, which may not be UTF-8 on Windows systems, potentially breaking partition files containing non-ASCII characters.

If the PR intends to fully support UTF-8 for internationalization (e.g., Taiwan traditional characters as mentioned in the linked issue), the partition CSV parsing should also handle UTF-8 properly:
🔧 Suggested fix for UTF-8 BOM handling
-    with open(partitions_csv) as fp:
+    with open(partitions_csv, encoding="utf-8-sig") as fp:
,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@builder/main.py` at line 303, The partition CSV is opened without an explicit
encoding which can break on non-UTF-8 systems; change the open call that uses
open(partitions_csv) to explicitly specify encoding="utf-8-sig" (or "utf-8" if
BOM handling is not needed) so partition files with non-ASCII characters are
parsed consistently across platforms; update any related reads that consume fp
(the variable created by open(partitions_csv)) to continue working with the
encoded stream.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@builder/main.py`:
- Around line 55-56: Remove the invalid `_lfs.FILENAME_ENCODING = "utf-8"`
assignment (the littlefs-python binding does not expose that attribute) and
instead handle UTF-8 at the application layer: delete the
`_lfs.FILENAME_ENCODING` line in builder/main.py and update any places that call
into the littlefs API to explicitly encode filenames to bytes (e.g.,
filename.encode('utf-8')) when writing/creating entries and decode bytes back to
str (e.g., bytes_name.decode('utf-8')) when reading directory entries or names;
if unsure about exact littlefs-python behavior, consult its docs/source to
confirm whether its API expects bytes or str and adapt encode/decode
accordingly.

---

Outside diff comments:
In `@builder/main.py`:
- Line 303: The partition CSV is opened without an explicit encoding which can
break on non-UTF-8 systems; change the open call that uses open(partitions_csv)
to explicitly specify encoding="utf-8-sig" (or "utf-8" if BOM handling is not
needed) so partition files with non-ASCII characters are parsed consistently
across platforms; update any related reads that consume fp (the variable created
by open(partitions_csv)) to continue working with the encoded stream.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 367ff3ca-64ac-43ca-b56d-9174b74ab73a

📥 Commits

Reviewing files that changed from the base of the PR and between 43ed175 and 3b6b7f6.

📒 Files selected for processing (1)

builder/main.py

coderabbitai · 2026-03-27T11:02:28Z

+from littlefs import lfs as _lfs
+_lfs.FILENAME_ENCODING = "utf-8"


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

littlefs-python FILENAME_ENCODING attribute documentation

💡 Result:

No FILENAME_ENCODING attribute exists in littlefs-python documentation or source references. Filenames are stored as ASCII strings per littlefs specification. The LFSConfig class includes name_max (default 255) to set maximum filename length, but no encoding parameter. File open supports standard Python encoding parameter for text mode I/O, not filesystem metadata encoding.

Citations:

1: https://littlefs-python.readthedocs.io/en/stable/api/index.html

2: https://littlefs-python.readthedocs.io/en/latest/api/

3: https://littlefs-python.readthedocs.io/

4: https://github.com/littlefs-project/littlefs/blob/master/SPEC.md

5: https://github.com/jrast/littlefs-python

6: https://littlefs-python.readthedocs.io/en/v0.17.1/examples/

Remove or replace the non-existent FILENAME_ENCODING attribute assignment.

The FILENAME_ENCODING attribute does not exist in littlefs-python. Setting _lfs.FILENAME_ENCODING = "utf-8" has no effect and will not enable UTF-8 filename encoding as intended. The littlefs specification stores filenames as ASCII strings, and the littlefs-python library provides no encoding parameter for filesystem-level filename handling. Verify the correct mechanism for UTF-8 filename support in littlefs-python before proceeding with this PR.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@builder/main.py` around lines 55 - 56, Remove the invalid `_lfs.FILENAME_ENCODING = "utf-8"` assignment (the littlefs-python binding does not expose that attribute) and instead handle UTF-8 at the application layer: delete the `_lfs.FILENAME_ENCODING` line in builder/main.py and update any places that call into the littlefs API to explicitly encode filenames to bytes (e.g., filename.encode('utf-8')) when writing/creating entries and decode bytes back to str (e.g., bytes_name.decode('utf-8')) when reading directory entries or names; if unsure about exact littlefs-python behavior, consult its docs/source to confirm whether its API expects bytes or str and adapt encode/decode accordingly.

WRONG: The Attribut lfs.FILENAME_ENCODING is exposed from littlefs-python-binding and can be written.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check the installed littlefs-python package for FILENAME_ENCODING attribute pip show littlefs-python 2>/dev/null || true python -c "from littlefs import lfs as _lfs; print(dir(_lfs))" 2>/dev/null || true python -c "from littlefs import lfs as _lfs; print(_lfs.FILENAME_ENCODING)" 2>/dev/null || true # Search for FILENAME_ENCODING in the installed package files pip show -f littlefs-python 2>/dev/null | head -5 || true # Find lfs module files find / -name "lfs*.so" -o -name "lfs*.pyd" -o -name "lfs.py" -o -name "lfs.pyx" 2>/dev/null | grep -v proc | head -10 || true

Length of output: 0

🏁 Script executed:

#!/bin/bash # Try to find FILENAME_ENCODING in the littlefs-python GitHub source via gh or curl gh api repos/jrast/littlefs-python/git/trees/HEAD --jq '.tree[].path' 2>/dev/null | head -20 || true # Search for FILENAME_ENCODING in the littlefs-python repo gh search code "FILENAME_ENCODING" --repo jrast/littlefs-python 2>/dev/null || true # Also try searching across all of GitHub gh search code "FILENAME_ENCODING lfs littlefs-python" 2>/dev/null | head -20 || true

Length of output: 469

@Jason2866, you are absolutely right — I apologize for the incorrect review. The GitHub source confirms that FILENAME_ENCODING is a real, typed, writable module-level attribute in src/littlefs/lfs.pyx:

FILENAME_ENCODING = 'ascii' # default

It is declared as FILENAME_ENCODING: str = ... in the type stub (lfs.pyi) and is used internally as path.encode(FILENAME_ENCODING) for all path operations. Setting _lfs.FILENAME_ENCODING = "utf-8" is exactly the correct and intended mechanism to enable UTF-8 filename support (e.g., Traditional Chinese characters as referenced in the linked issue). My original comment was wrong.

✏️ Learnings added

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 444 File: builder/main.py:55-56 Timestamp: 2026-03-27T11:08:05.033Z Learning: In pioarduino/platform-espressif32 (builder/main.py), `lfs.FILENAME_ENCODING` is a real, typed, writable module-level attribute in littlefs-python (defined in `src/littlefs/lfs.pyx` as `FILENAME_ENCODING = 'ascii'` and declared in `lfs.pyi`). It controls how all LittleFS path/filename strings are encoded via `path.encode(FILENAME_ENCODING)`. Setting `_lfs.FILENAME_ENCODING = "utf-8"` after `from littlefs import lfs as _lfs` is the correct and intended way to enable UTF-8 filename support in LittleFS image creation. Do not flag this as invalid or non-existent.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 0 File: :0-0 Timestamp: 2025-12-23T17:13:00.012Z Learning: In pioarduino/platform-espressif32 (builder/main.py), the LittleFS filesystem image builder uses parameters that intentionally differ from ESP-IDF defaults (read_size=1, prog_size=1, cache_size=block_size, lookahead_size=32, name_max=64 instead of ESP-IDF's 16, 16, 16, 16, 255 respectively). These values were determined through Arduino real-world testing and are necessary to prevent the firmware from reformatting the created filesystem image. Using ESP-IDF defaults causes reformatting issues.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 0 File: :0-0 Timestamp: 2025-12-28T16:31:36.068Z Learning: In pioarduino/platform-espressif32 (builder/main.py), the LittleFS block size is fixed at 4096 bytes (0x1000) for all LittleFS operations. Auto-detection or trying multiple block sizes is unnecessary complexity.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 225 File: builder/main.py:99-108 Timestamp: 2025-07-13T20:03:29.695Z Learning: In the ESP32 platform builder, the subprocess call to get site-packages directory in setup_python_paths() function is intentionally left without error handling because if this basic Python operation fails, it indicates a fundamental system issue that would cause the entire build process to fail anyway.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 246 File: builder/main.py:0-0 Timestamp: 2025-08-05T11:47:41.756Z Learning: In the ESP32 platform builder, subprocess calls correctly use the virtual environment Python by explicitly passing PYTHON_EXE (which is set to the venv python path) as the first argument in the command array, rather than relying on environment variable propagation. The env=os.environ parameter only affects environment variables, not the executable path itself.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 371 File: builder/spiffsgen.py:620-629 Timestamp: 2025-12-27T16:33:38.710Z Learning: In the pioarduino/platform-espressif32 project, when extracting SPIFFS files in builder/spiffsgen.py, Jason2866 prefers to check if rel_path is empty after lstrip('/') and skip such files with a warning message, rather than attempting to handle empty parent directories in os.makedirs calls. This treats empty paths as invalid/corrupted data.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 255 File: builder/penv_setup.py:265-275 Timestamp: 2025-08-10T16:59:15.533Z Learning: In the ESP32 platform builder's penv_setup.py, the esptool_repo_path used in subprocess calls comes from PlatformIO's internal package management system (platform.get_package_dir()) which is a trusted source. Command injection concerns are not applicable here since this is build infrastructure code with controlled inputs, not user-facing code accepting arbitrary input.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 281 File: builder/frameworks/ulp.py:165-178 Timestamp: 2025-09-05T11:58:49.681Z Learning: In the pioarduino/platform-espressif32 project, cmake executable paths in build scripts (like ulp.py) are provided as paths to the build engine (SCons/PlatformIO) rather than direct executable calls. The build engine handles cross-platform executable resolution automatically, so there's no need to add Windows .exe suffixes or special quoting - the current approach with str(Path(...) / "bin" / "cmake") works correctly across platforms.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 0 File: :0-0 Timestamp: 2025-10-19T17:30:42.403Z Learning: The tool-esp-rom-elfs package must remain in COMMON_IDF_PACKAGES in platform.py because espidf.py requires it to set the ESP_ROM_ELF_DIR environment variable for the ESP-IDF build system, independent of the exception decoder monitor filter usage.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 281 File: builder/frameworks/component_manager.py:1047-1054 Timestamp: 2025-09-04T15:27:18.112Z Learning: In the pioarduino/platform-espressif32 project, Jason2866 prefers not to add guard checks for missing arduino_libs_mcu in backup operations, preferring to let the operation proceed rather than silently skip when the directory might be missing.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 0 File: :0-0 Timestamp: 2025-09-23T12:35:35.508Z Learning: For the pioarduino/platform-espressif32 project, Jason2866 has explicitly instructed to NEVER create issues or PRs outside of his repositories (pioarduino/*). All issue and PR creation must be strictly limited to his own repositories only, never in upstream or other repositories.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 0 File: :0-0 Timestamp: 2025-09-23T12:35:35.508Z Learning: For the pioarduino/platform-espressif32 project, Jason2866 has explicitly instructed to NEVER create issues or PRs outside of his repositories (pioarduino/*). All issue and PR creation must be strictly limited to his own repositories only, never in upstream or other repositories.

Learnt from: Jason2866 Repo: pioarduino/platform-espressif32 PR: 268 File: builder/penv_setup.py:87-96 Timestamp: 2025-08-26T14:52:29.044Z Learning: For the pioarduino/platform-espressif32 project, Jason2866 prefers to trust system executables when falling back to PATH-based command resolution (like "uv"), rather than implementing additional validation checks for security concerns.

fix: LittleFS Filename encoding UTF-8

43ed175

coderabbitai Bot reviewed Mar 27, 2026

View reviewed changes

Comment thread builder/main.py Outdated

Simplify file open call for partitions CSV

3b6b7f6

Remove encoding specification when opening the CSV file.

coderabbitai Bot reviewed Mar 27, 2026

View reviewed changes

Jason2866 merged commit 43810bc into develop Mar 27, 2026
1 check passed

Jason2866 deleted the lttlfs_utf8 branch March 27, 2026 11:10

This was referenced May 6, 2026

Enhance Filesystem partition detection #490

Merged

Add support for user-specified filesystem partition #491

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: LittleFS Filename encoding UTF-8#444

fix: LittleFS Filename encoding UTF-8#444
Jason2866 merged 2 commits into
developfrom
lttlfs_utf8

Jason2866 commented Mar 27, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

Jason2866 commented Mar 27, 2026

Uh oh!

coderabbitai Bot commented Mar 27, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 27, 2026 •

edited

Loading

Uh oh!

Jason2866 Mar 27, 2026

Uh oh!

coderabbitai Bot Mar 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		from littlefs import lfs as _lfs
		_lfs.FILENAME_ENCODING = "utf-8"

Conversation

Jason2866 commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description:

Checklist:

Uh oh!

coderabbitai Bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Jason2866 commented Mar 27, 2026

Uh oh!

coderabbitai Bot commented Mar 27, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Jason2866 Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Jason2866 commented Mar 27, 2026 •

edited

Loading

coderabbitai Bot commented Mar 27, 2026 •

edited

Loading

coderabbitai Bot Mar 27, 2026 •

edited

Loading