Document that String is ASCII alphanumerics-only, maybe have a way of generating more Unicode? #77
Comments
Yeah, this definitely needs to be fixed. I think the ASCII limitation has been there and nobody has had cause enough to fix it. There are basically two concerns here. The first is ergonomics. It's really nice when your witnesses are limited to characters that are likely to be meaningful in your terminal (i.e., printable characters). The second is that, of course, strings outside the ASCII range should be tested. When I first wrote quickcheck, I didn't know how to balance these, so I just chose to address the first concern. It seems like the principle of least surprise should apply here:
It seems like shrinking could also come into play. I could conceive of a string with only ASCII alphanumerics as being "smaller" than one with ASCII alphanumerics + ASCII punctuation. Then it could "grow" to include more common Unicode, then "grow" towards uncommon. Just thinking out loud, really.
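To make the shrinking idea a bit more concrete, here is a rough sketch in plain Rust with no quickcheck dependency. The names `char_tier`, `string_tier` and `shrink_by_tier` are hypothetical, and the three-tier split (ASCII alphanumerics, other printable ASCII, everything else) is an assumption that collapses the "common" and "uncommon" Unicode steps into one. It is not how quickcheck shrinks strings.

```rust
/// Rough "size" tier of a character (an assumed ordering, for illustration):
/// 0 = ASCII alphanumeric, 1 = other printable ASCII, 2 = everything else.
fn char_tier(c: char) -> u8 {
    if c.is_ascii_alphanumeric() {
        0
    } else if c.is_ascii_graphic() || c == ' ' {
        1
    } else {
        2
    }
}

/// Tier of a whole string: the highest tier among its characters.
/// The empty string counts as the smallest tier.
fn string_tier(s: &str) -> u8 {
    s.chars().map(char_tier).max().unwrap_or(0)
}

/// Hypothetical shrink step: for each character above the smallest tier,
/// propose one candidate in which that character is replaced by a fixed
/// stand-in from the next tier down.
fn shrink_by_tier(s: &str) -> Vec<String> {
    let chars: Vec<char> = s.chars().collect();
    let mut out = Vec::new();
    for (i, &c) in chars.iter().enumerate() {
        let replacement = match char_tier(c) {
            2 => '!',      // non-ASCII or control -> ASCII punctuation
            1 => 'a',      // punctuation or space -> ASCII alphanumeric
            _ => continue, // already in the smallest tier
        };
        let mut candidate = chars.clone();
        candidate[i] = replacement;
        out.push(candidate.into_iter().collect());
    }
    out
}
```

Each candidate lowers the tier of exactly one character, so repeated shrinking always terminates at a string of ASCII alphanumerics. Length shrinking would still have to be layered on top of this.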
Has this changed? I think I saw wildly different results than this. Just as a data point, this is what I ended up using for string search testing / fuzzing:

    // Assumes quickcheck is in scope as `qc` and `Arbitrary` is imported.
    #[derive(Clone, Debug)]
    struct Text(String);

    //static ALPHABET: &'static str = "ABCD";
    static ALPHABET: &'static str = "\0\u{1}\u{2}\u{3}";

    impl Arbitrary for Text {
        fn arbitrary<G: qc::Gen>(g: &mut G) -> Self {
            // Random length, then draw every character from ALPHABET.
            let len = u16::arbitrary(g);
            let mut s = String::with_capacity(len as usize);
            for _ in 0..len {
                let i = usize::arbitrary(g);
                let i = i % ALPHABET.len();
                // Byte indexing is fine here because ALPHABET is ASCII-only.
                s.push(ALPHABET.as_bytes()[i] as char)
            }
            Text(s)
        }
        fn shrink(&self) -> Box<dyn Iterator<Item = Self>> {
            // Reuse String's shrinker and re-wrap each candidate.
            Box::new(self.0.shrink().map(Text))
        }
    }

Since the algorithm was working on just the byte slice, I switched to an alphabet of 0-4 for even easier-to-read intermediate debug output.
Errm, right, I forgot to update this issue, but I grew tired of the ASCII-only behavior so I just changed it. ASCII-only is still useful though, so we might want an
Right now, `String`s only have ASCII alphanumeric characters. I tried to use it to generate strings with spaces in it and was surprised when my code was never called 😺 It would be cool if there were more of the Unicode set included. Even awesomer might be if we had some tuning knobs where we could say "more punctuation, less letters", perhaps using the Unicode character classes?
Thanks for the fun-to-use library!
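For what it's worth, the "tuning knobs" idea could look roughly like the sketch below. Everything in it is hypothetical: the `Weights` struct, the `gen_string` function, the small hand-picked character pools standing in for real Unicode character classes, and the toy `XorShift` generator, which is only there so the example needs no external crates. A real version would presumably draw from quickcheck's own generator instead.

```rust
/// Toy xorshift generator (seed must be non-zero). Stand-in for a real RNG.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical tuning knobs: relative weight of each character class.
struct Weights {
    letters: u32,
    digits: u32,
    punctuation: u32,
    whitespace: u32,
}

// Small sample pools; assumed stand-ins for full Unicode character classes.
const LETTERS: &[char] = &['a', 'b', 'Z', 'é', 'ß', 'λ'];
const DIGITS: &[char] = &['0', '1', '9'];
const PUNCTUATION: &[char] = &['!', ',', '.', '?', '«', '»'];
const WHITESPACE: &[char] = &[' ', '\t', '\n'];

/// Build a string of `len` characters, picking a class for each character
/// with probability proportional to its weight, then a character from it.
fn gen_string(rng: &mut XorShift, w: &Weights, len: usize) -> String {
    let classes = [
        (w.letters, LETTERS),
        (w.digits, DIGITS),
        (w.punctuation, PUNCTUATION),
        (w.whitespace, WHITESPACE),
    ];
    let total: u64 = classes.iter().map(|(weight, _)| *weight as u64).sum();
    assert!(total > 0, "at least one class needs a non-zero weight");

    let mut s = String::with_capacity(len);
    for _ in 0..len {
        // Walk the cumulative weights until `pick` falls inside a class.
        let mut pick = rng.next_u64() % total;
        for (weight, class) in classes.iter() {
            let weight = *weight as u64;
            if pick < weight {
                let idx = (rng.next_u64() % class.len() as u64) as usize;
                s.push(class[idx]);
                break;
            }
            pick -= weight;
        }
    }
    s
}
```

"More punctuation, less letters" would then be something like `Weights { letters: 1, digits: 1, punctuation: 5, whitespace: 2 }`, and a class with weight 0 is never used.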