
feat: better error message on unicode whitespace that isn't ascii whitespace #8295

Merged
asterite merged 5 commits into master from ab/better-error-when-non-unicode-whitespace
Apr 30, 2025

@asterite (Collaborator) commented Apr 30, 2025

Description

Problem

Resolves #5163

Summary

Here's an example error we get now:

error: Unknown start of token: \u{a0}
  ┌─ src/main.nr:3:16
  │
3 │     if true {}   else { // Placing "} else {" on the same line causes the issue
  │                - Unicode character ' ' (No-Break Space) looks like ' ' (Space), but it is not

This is similar to what Rust outputs.
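
As a rough illustration of the idea (not the code in this PR), a lexer can check whether an unexpected character is Unicode whitespace outside the ASCII range and, if so, attach a hint naming it. The function names `confusable_whitespace_name` and `unknown_token_hint`, the small table of names, and the generic fallback label are all assumptions made for this sketch; the real Noir lexer may be structured quite differently.

```rust
/// Hypothetical helper (not the actual Noir lexer code): returns a
/// human-readable name for a Unicode whitespace character that is not
/// ASCII, or `None` for every other character.
fn confusable_whitespace_name(c: char) -> Option<&'static str> {
    // ASCII characters are handled by the normal lexer rules, so they
    // never receive this hint.
    if c.is_ascii() {
        return None;
    }
    match c {
        '\u{00A0}' => Some("No-Break Space"),
        '\u{2003}' => Some("Em Space"),
        '\u{3000}' => Some("Ideographic Space"),
        // Any other Unicode whitespace still gets a hint, with a generic
        // label chosen for this sketch.
        _ if c.is_whitespace() => Some("Unicode whitespace"),
        _ => None,
    }
}

/// Builds the secondary message shown under the offending character.
fn unknown_token_hint(c: char) -> Option<String> {
    let name = confusable_whitespace_name(c)?;
    Some(format!(
        "Unicode character '{c}' ({name}) looks like ' ' (Space), but it is not"
    ))
}

fn main() {
    // A No-Break Space (U+00A0) sits between `}` and `else`.
    let source = "if true {}\u{a0}else {}";
    for (offset, c) in source.char_indices() {
        if let Some(hint) = unknown_token_hint(c) {
            println!("error: Unknown start of token: {}", c.escape_unicode());
            println!("  byte offset {offset}: {hint}");
        }
    }
}
```

Keeping the character classification separate from the message formatting makes it straightforward to cover each case with a small lexer test.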

Additional Context

Documentation

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@github-actions bot (Contributor) left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'ACVM Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

| Benchmark suite | Current: d875198 | Previous: 888f3a9 | Ratio |
|---|---|---|---|
| perfectly_parallel_batch_inversion_opcodes | 4090221 ns/iter (± 18999) | 3216630 ns/iter (± 3626) | 1.27 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

asterite and others added 2 commits April 30, 2025 14:59
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>
@TomAFrench (Member) left a comment

LGTM, a lexer test to check we get these errors would be nice.

@asterite
…com:noir-lang/noir into ab/better-error-when-non-unicode-whitespace
@asterite enabled auto-merge April 30, 2025 18:21
@asterite added this pull request to the merge queue Apr 30, 2025
Merged via the queue into master with commit 09d55f6 Apr 30, 2025
115 checks passed
@asterite deleted the ab/better-error-when-non-unicode-whitespace branch April 30, 2025 19:00
github-merge-queue bot pushed a commit to AztecProtocol/aztec-packages that referenced this pull request May 1, 2025
Automated pull of nightly from the
[noir](https://github.com/noir-lang/noir) programming language, a
dependency of Aztec.
BEGIN_COMMIT_OVERRIDE
feat: add `--debug-compile-stdin` to read `main.nr` from `STDIN` for
testing (noir-lang/noir#8253)
feat: better error message on unicode whitespace that isn't ascii
whitespace (noir-lang/noir#8295)
chore: update `quicksort` from iterative `noir_sort` version
(noir-lang/noir#7348)
fix: use correct meta attribute names in contract custom attributes
(noir-lang/noir#8273)
feat: `nargo expand` to show code after macro expansions
(noir-lang/noir#7613)
feat: allow specifying fuzz-related dirs when invoking `nargo test`
(noir-lang/noir#8293)
chore: redo typo PR by ciaranightingale
(noir-lang/noir#8292)
chore: Extend the bug list with issues found by the AST fuzzer
(noir-lang/noir#8285)
fix: don't disallow writing to memory after passing it to brillig
(noir-lang/noir#8276)
chore: test against zkpassport rsa lib
(noir-lang/noir#8278)
feat: omit element size array for more array types
(noir-lang/noir#8257)
chore: refactor array handling in ACIRgen
(noir-lang/noir#8256)
chore: document cast (noir-lang/noir#8268)
END_COMMIT_OVERRIDE

---------

Co-authored-by: AztecBot <tech@aztecprotocol.com>
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>

Successfully merging this pull request may close these issues.

Lexer does not handle U+00a0 and other unicode space characters
