Fix missing range checks #23

sgeisler · 2018-06-22T15:42:55Z

Fixes #22.

TODO:

rewrite from_str_lenient to catch all bad characters
adapt tests that use Error::InvalidChar since its signature was updated
validate that all bad characters are -1 in CHARSET_REV
write tests to show the bug is fixed
have a look at the checks for mixed case, they seem overly complicated

Formerly, input strings were processed byte-wise. Since they are now processed character-wise, the tests that expect an Error::InvalidChar(c) had to be changed.
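To illustrate why the expected error values in those tests change, here is a minimal sketch of the difference between scanning a string byte-wise and character-wise. The helper names `first_non_ascii_byte` and `first_non_ascii_char` are hypothetical and are not part of the crate.

```rust
/// Hypothetical helper: scan the UTF-8 bytes and report the first non-ASCII byte.
fn first_non_ascii_byte(s: &str) -> Option<u8> {
    s.bytes().find(|&b| b >= 0x80)
}

/// Hypothetical helper: scan by `char` and report the first non-ASCII character.
fn first_non_ascii_char(s: &str) -> Option<char> {
    s.chars().find(|&c| c > '\x7f')
}
```

For the input "abcé1", the byte-wise scan reports 0xC3 (the first byte of the two-byte UTF-8 encoding of 'é'), while the character-wise scan reports 'é' itself. A test that used to expect a single byte would now expect the whole character.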

sgeisler · 2018-06-22T21:33:34Z

The tests fail due to rustup issues with nightly: rust-lang/rust#51699.

tamasblummer · 2018-06-23T08:34:24Z

src/lib.rs

            }

-            // Uppercase
-            let c = if b >= b'A' && b <= b'Z' {


Was dropping the conversion to lower case intentional?

Yes, it is intentional. When I looked at CHARSET_REV I noticed that there are entries for both cases, so the conversion to lowercase should be redundant. But I still have to verify that CHARSET_REV actually satisfies all my assumptions about it, that's one of the reasons this PR is still WIP.

I like this use of CHARSET_REV in testing for proper range.
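The idea discussed above, a reverse table that holds entries for both cases and -1 for every invalid character, can be sketched as follows. This is an assumption-laden illustration, not the crate's code: `build_charset_rev` and `decode_char` are hypothetical names, and the table is built at runtime from the bech32 alphabet rather than copied from the crate's CHARSET_REV.

```rust
/// The bech32 data alphabet; a character's index is its 5-bit value.
const CHARSET: &[u8; 32] = b"qpzry9x8gf2tvdw0s3jn54khce6mua7l";

/// Hypothetical stand-in for CHARSET_REV: 128 entries, -1 marks an invalid
/// character, and both the lowercase and uppercase forms map to the same value.
fn build_charset_rev() -> [i8; 128] {
    let mut rev = [-1i8; 128];
    for (value, &b) in CHARSET.iter().enumerate() {
        rev[b as usize] = value as i8;
        rev[b.to_ascii_uppercase() as usize] = value as i8;
    }
    rev
}

/// Hypothetical lookup: returns None for anything outside the table
/// (the range check) or mapped to -1 (not in the alphabet).
fn decode_char(rev: &[i8; 128], c: char) -> Option<u8> {
    let idx = c as usize;
    if idx >= rev.len() {
        return None;
    }
    match rev[idx] {
        -1 => None,
        v => Some(v as u8),
    }
}
```

With such a table, 'q' and 'Q' both decode to 0, while 'b', '1', and any non-ASCII character are rejected without a separate lowercase conversion.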

sgeisler · 2018-06-25T22:55:56Z

src/lib.rs

-            // Lowercase
-            if b >= b'a' && b <= b'z' {
+
+            if c.is_lowercase() {


The is_lowercase and is_uppercase functions are probably slower than the simple range checks used before, since they work for all Unicode characters, but they seem much more idiomatic. The ASCII equivalent (is_ascii_lowercase) is still nightly-only, so if we wanted to avoid handwritten range checks we would have to use our own trait for that. But I don't see the need for such optimizations right now.

The implementation of is_lowercase is:

    pub fn is_lowercase(self) -> bool {
        match self {
            'a'...'z' => true,
            c if c > '\x7f' => derived_property::Lowercase(c),
            _ => false,
        }
    }

So for ASCII characters, it should be fairly performant. is_uppercase has a similar structure.
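As a quick sanity check of that claim, the following sketch compares `char::is_lowercase` with a handwritten range check over the whole ASCII range. The function names are hypothetical; only `char::is_lowercase` comes from the standard library.

```rust
/// Hypothetical handwritten check, equivalent to the old byte range test.
fn is_ascii_lower_by_range(c: char) -> bool {
    ('a'..='z').contains(&c)
}

/// Returns true if both checks agree for every ASCII character.
fn checks_agree_on_ascii() -> bool {
    (0u8..128)
        .map(char::from)
        .all(|c| c.is_lowercase() == is_ascii_lower_by_range(c))
}
```

The two checks only diverge outside ASCII: a character such as 'ß' is lowercase according to Unicode but fails the range check.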

sgeisler · 2018-06-25T22:58:28Z

src/lib.rs

-            // Lowercase
-            if b >= b'a' && b <= b'z' {
+
+            if c.is_lowercase() {
                has_lower = true;


I didn't find a better (and preferably shorter/simpler) way of doing the checks for "all characters have the same case". At least none that doesn't need some bigger refactoring of the HRP processing, so I will leave it for now and maybe open another PR dedicated to refactoring.
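For reference, a standalone version of the "all characters have the same case" rule could look like the sketch below. It is an assumption about how the check might be isolated, not the crate's implementation: `has_mixed_case` is a hypothetical name, and it makes two passes over the string instead of tracking flags inside the existing loop.

```rust
/// Hypothetical helper: true if the string contains both lowercase and
/// uppercase characters, which bech32 does not allow.
fn has_mixed_case(s: &str) -> bool {
    let has_lower = s.chars().any(|c| c.is_lowercase());
    let has_upper = s.chars().any(|c| c.is_uppercase());
    has_lower && has_upper
}
```

Digits count as neither case, so "bc1qw508d" and "BC1QW508D" both pass, while "Bc1qw508d" is rejected.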

sgeisler · 2018-06-25T23:07:01Z

Since this seems to be only an internal bug fix I increased the version from 0.4.1 to 0.4.2. Can anyone review the changes, please? @clarkmoody @tamasblummer

clarkmoody · 2018-06-27T18:31:54Z

src/lib.rs

@@ -402,7 +397,7 @@ pub enum Error {
    /// The data or human-readable part is too long or too short
    InvalidLength,
    /// Some part of the string contains an invalid character
-    InvalidChar(u8),
+    InvalidChar(char),


Is this enum variant change a breaking change that needs a major version bump?

Good catch, sorry, I will have to bump the version to 0.5.0. Would you generally agree that it's better to process Unicode strings char-wise instead of processing their UTF-8 representation byte-wise? This change is what made the API break necessary.
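To show why the variant change is breaking, here is a minimal sketch of downstream code that matches on the error. The enum is reduced to the two variants visible in the diff, and `describe` is a hypothetical caller, not part of the crate.

```rust
/// Reduced copy of the error type, limited to the variants shown in the diff.
#[derive(Debug, PartialEq)]
enum Error {
    InvalidLength,
    InvalidChar(char),
}

/// Hypothetical downstream function that inspects the payload.
fn describe(e: &Error) -> String {
    match e {
        Error::InvalidLength => "invalid length".to_string(),
        // With InvalidChar(u8) the payload here was a byte; code that compared
        // it against a byte literal such as b'b' no longer type-checks.
        Error::InvalidChar(c) => format!("invalid character {:?} (U+{:04X})", c, *c as u32),
    }
}
```

Any caller that binds or compares the payload as a u8 has to be updated. Because Cargo treats a change in the minor version of a 0.x crate as incompatible, going from 0.4.x to 0.5.0 signals that break.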

clarkmoody · 2018-07-05T15:21:31Z

Pushed 0.5.0 to crates.io

sgeisler added 2 commits June 22, 2018 17:38

rewrite from_str_lenient to fix rust-bitcoin#22

d9016f5

Adapt invalid_string tests to chars instead of bytes

866e15f

Formerly, input strings were processed byte-wise. Since they are now processed character-wise, the tests that expect an Error::InvalidChar(c) had to be changed.

sgeisler force-pushed the fix-missing-range-checks branch from f21e548 to 866e15f Compare June 22, 2018 21:24

tamasblummer reviewed Jun 23, 2018

sgeisler added 2 commits June 25, 2018 00:41

Add CHARSET_REV test

16cb139

Add bug test for rust-bitcoin#22

4929ff4

sgeisler commented Jun 25, 2018

sgeisler force-pushed the fix-missing-range-checks branch from 3f510a5 to 7a8c1d4 Compare June 25, 2018 23:05

sgeisler changed the title ~~WIP: fix missing range checks~~ Fix missing range checks Jun 25, 2018

sgeisler requested a review from clarkmoody June 25, 2018 23:07

tamasblummer approved these changes Jun 26, 2018

clarkmoody reviewed Jun 27, 2018

bump version to 0.5.0.

afa37d1

sgeisler force-pushed the fix-missing-range-checks branch from 7a8c1d4 to afa37d1 Compare June 27, 2018 19:25

clarkmoody approved these changes Jul 5, 2018

clarkmoody merged commit abb419a into rust-bitcoin:master Jul 5, 2018

sgeisler mentioned this pull request Jul 5, 2018

Suppress warnings caused by AsciiExt use statement #25

Merged

TheBlueMatt mentioned this pull request Jul 25, 2018

Use bech32 v0.8.0 rust-bitcoin/rust-bitcoin#100

Merged
