Document that String is ASCII alphanumerics-only, maybe have a way of generating more Unicode? #77

shepmaster · 2015-05-07T23:35:33Z

Right now, Strings only have ASCII alphanumeric characters. I tried to use it to generate strings with spaces in them and was surprised when my code was never called 😺

It would be cool if there were more of the Unicode set included. Even awesomer might be if we had some tuning knobs where we could say "more punctuation, fewer letters", perhaps using the Unicode character classes?
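To make the "tuning knobs" idea concrete, here is a minimal sketch of weighted character classes. Everything in it is hypothetical and not part of quickcheck: the `Weights` struct, the three ASCII classes, `weighted_string`, and the small xorshift generator that stands in for quickcheck's own randomness so the example needs no external crates.

```rust
/// Tiny xorshift64 generator (hypothetical stand-in for a real RNG).
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical tuning knobs: relative weight of each character class.
struct Weights {
    letters: u32,
    digits: u32,
    punctuation: u32,
}

// The classes are ASCII-only here; Unicode classes would slot in the same way.
const LETTERS: &str = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const DIGITS: &str = "0123456789";
const PUNCTUATION: &str = " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

/// Pick one character from an ASCII-only class.
fn pick(class: &str, rng: &mut XorShift) -> char {
    let bytes = class.as_bytes();
    bytes[(rng.next() % bytes.len() as u64) as usize] as char
}

/// Build a string of `len` characters, choosing each character's class
/// with probability proportional to its weight.
fn weighted_string(len: usize, w: &Weights, rng: &mut XorShift) -> String {
    let letters = w.letters as u64;
    let digits = w.digits as u64;
    let total = letters + digits + w.punctuation as u64;
    assert!(total > 0, "at least one weight must be non-zero");
    (0..len)
        .map(|_| {
            let roll = rng.next() % total;
            if roll < letters {
                pick(LETTERS, rng)
            } else if roll < letters + digits {
                pick(DIGITS, rng)
            } else {
                pick(PUNCTUATION, rng)
            }
        })
        .collect()
}
```

With weights such as `Weights { letters: 1, digits: 0, punctuation: 5 }`, most generated characters would be punctuation or spaces, which is the kind of bias the request describes.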

Thanks for the fun-to-use library!


BurntSushi · 2015-05-08T00:29:57Z

Yeah, this definitely needs to be fixed. I think the ASCII limitation has been there and nobody has had cause enough to fix it.

There are basically two concerns here. The first is ergonomics. It's really nice when your witnesses are limited to characters that are likely to be meaningful in your terminal (i.e., printable characters). The second is that, of course, strings outside the ASCII range should be tested. When I first wrote quickcheck, I didn't know how to balance these so I just chose to address the first concern.

It seems like the principle of least surprise should apply here: String should be fuzzed with the set of Unicode scalar values and some other type (perhaps defined in quickcheck, i.e., AsciiString) should be more limited when you want a guarantee of nicer witnesses. I think it could deref to &String, which should mostly provide the ergonomics of a plain String.
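As a rough illustration of the wrapper idea, the sketch below shows a newtype that guarantees ASCII content and derefs to `String`. The name `AsciiString` comes from the comment above, but the constructor `new` and the helper `count_spaces` are assumptions made for this example; no such type is confirmed to exist in quickcheck, and an `Arbitrary` impl is left out so the block depends only on the standard library.

```rust
use std::ops::Deref;

/// Hypothetical wrapper: a String known to contain only ASCII characters.
#[derive(Clone, Debug, PartialEq, Eq)]
struct AsciiString(String);

impl AsciiString {
    /// Returns None if `s` contains any non-ASCII character.
    fn new(s: String) -> Option<AsciiString> {
        if s.is_ascii() {
            Some(AsciiString(s))
        } else {
            None
        }
    }
}

impl Deref for AsciiString {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

/// Code written against plain string slices; it knows nothing about
/// AsciiString, yet accepts `&AsciiString` through deref coercion.
fn count_spaces(s: &str) -> usize {
    s.chars().filter(|&c| c == ' ').count()
}
```

Because of deref coercion, a value `a: AsciiString` can call `String` and `str` methods directly (`a.len()`, `a.contains("x")`) and can be passed as `&a` to functions taking `&String` or `&str`, which is where most of the ergonomics would come from.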

shepmaster · 2015-05-08T02:23:18Z

It seems like shrinking could also come into play. I could conceive of a string with only ASCII alphanumerics as being "smaller" than one with ASCII alphanumerics + ASCII punctuation. Then it could "grow" to include more common Unicode, then "grow" towards uncommon. Just thinking out loud, really.
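One way to picture this ordering is as a set of complexity tiers. The sketch below is only an illustration of the idea, not how quickcheck shrinks strings: the `Tier` enum, its three levels, and the functions `char_tier`, `string_tier`, and `simpler_candidates` are all hypothetical. It also lumps everything that is not ASCII alphanumeric, punctuation, or a space into one `Other` tier, since "common" versus "uncommon" Unicode has no definition in this thread.

```rust
/// Hypothetical complexity tiers, declared from "smallest" to "largest".
/// The derived Ord follows declaration order.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    AsciiAlphanumeric,
    AsciiPunctuation, // ASCII punctuation and the space character
    Other,            // everything else, including non-ASCII
}

fn char_tier(c: char) -> Tier {
    if c.is_ascii_alphanumeric() {
        Tier::AsciiAlphanumeric
    } else if c.is_ascii_punctuation() || c == ' ' {
        Tier::AsciiPunctuation
    } else {
        Tier::Other
    }
}

/// A string is as complex as its most complex character.
/// The empty string counts as the lowest tier.
fn string_tier(s: &str) -> Tier {
    s.chars()
        .map(char_tier)
        .max()
        .unwrap_or(Tier::AsciiAlphanumeric)
}

/// Shrink candidates: for each character above the lowest tier,
/// one copy of the string with that character replaced by 'a'.
fn simpler_candidates(s: &str) -> Vec<String> {
    let chars: Vec<char> = s.chars().collect();
    let mut out = Vec::new();
    for (i, &c) in chars.iter().enumerate() {
        if char_tier(c) > Tier::AsciiAlphanumeric {
            let mut copy = chars.clone();
            copy[i] = 'a';
            out.push(copy.into_iter().collect());
        }
    }
    out
}
```

A shrinker built on this would try the candidates in order and keep any that still fails the property, so a witness drifts toward plain alphanumerics whenever the exotic characters are not what triggers the failure.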

bluss · 2015-06-15T12:50:53Z

Has this changed? I think I saw wildly different results than this.

Just as a data point, this is what I ended up using for a string search testing / fuzzing:

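```rust
// Assumes `extern crate quickcheck as qc;` and `use qc::Arbitrary;`
// (the generic `G: qc::Gen` signature is the older quickcheck API).
#[derive(Clone, Debug)]
struct Text(String);

//static ALPHABET: &'static str = "ABCD";
static ALPHABET: &'static str = "\0\u{1}\u{2}\u{3}";

impl Arbitrary for Text {
    fn arbitrary<G: qc::Gen>(g: &mut G) -> Self {
        // Random length, then one random ALPHABET byte per position.
        let len = u16::arbitrary(g);
        let mut s = String::with_capacity(len as usize);
        for _ in 0..len {
            let i = usize::arbitrary(g);
            let i = i % ALPHABET.len();
            // Byte indexing is safe here because ALPHABET is pure ASCII.
            s.push(ALPHABET.as_bytes()[i] as char)
        }
        Text(s)
    }
    // `dyn` is required by newer compilers; it names the same type as before.
    fn shrink(&self) -> Box<dyn Iterator<Item = Self>> {
        // Reuse String's shrinker and re-wrap each candidate.
        Box::new(self.0.shrink().map(Text))
    }
}
```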

Since the algorithm was working on just the byte slice, I switched to an alphabet of the bytes 0-3 for even easier-to-read intermediate debug output.

shepmaster · 2015-06-15T22:18:00Z

@bluss it appears so! This could be considered closed via 848acee.

BurntSushi · 2015-06-16T10:50:22Z

Errm, right, I forgot to update this issue, but I grew tired of the ASCII-only behavior so I just changed it.

ASCII-only is still useful though, so we might want an AsciiString type or something.

shepmaster closed this as completed Jun 15, 2015

BurntSushi mentioned this issue Sep 28, 2015

Strings generated by default should have more usual characters #99

Closed
