-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rules] [strings] Add support for some core Strings theory rules that reason about Regular expressions #46
Conversation
carcara/src/checker/rules/strings.rs
Outdated
if let Some(s_1) = args.first() { | ||
match s_1.as_ref() { | ||
Term::Const(Constant::String(s)) => Ok(s.len()), | ||
_ => Err(CheckerError::LengthCannotBeEvaluated(r.clone())), | ||
} | ||
} else { | ||
unreachable!() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here it's simpler to use .first().unwrap()
, since you're just pattern-matching on Some
.
if let Some(s_1) = args.first() { | |
match s_1.as_ref() { | |
Term::Const(Constant::String(s)) => Ok(s.len()), | |
_ => Err(CheckerError::LengthCannotBeEvaluated(r.clone())), | |
} | |
} else { | |
unreachable!() | |
} | |
let s_1 = args.first().unwrap(); | |
match s_1.as_ref() { | |
Term::Const(Constant::String(s)) => Ok(s.len()), | |
_ => Err(CheckerError::LengthCannotBeEvaluated(r.clone())), | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright!
carcara/src/checker/rules/strings.rs
Outdated
if let [r_1, r_2 @ ..] = &args[..] { | ||
let r_2_concat = concat(pool, r_2.to_vec()); | ||
Ok(str_fixed_len_re(pool, r_1.clone())? + str_fixed_len_re(pool, r_2_concat)?) | ||
} else { | ||
unreachable!() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? If r_2.len() >= 2
, concat
will return a str.++
term, which will lead to an error when you call str_fixed_len_r(pool, r_2_concat)
. If it was instead an re.++
term, I suppose it would work. Either way, I think it would be simpler to just call str_fixed_len_re
to each argument directly, and sum the results, e.g.
if let [r_1, r_2 @ ..] = &args[..] { | |
let r_2_concat = concat(pool, r_2.to_vec()); | |
Ok(str_fixed_len_re(pool, r_1.clone())? + str_fixed_len_re(pool, r_2_concat)?) | |
} else { | |
unreachable!() | |
} | |
let mut lengths = args.iter().map(|a| str_fixed_len_re(pool, a.clone())); | |
lengths.try_fold(0, |acc, x| Ok(acc + x?)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not correct, indeed :(
I confused this function with another one I previously implemented and ended up calling the wrong one. Also, I did not call str_fixed_len_re
for each argument directly because I got too focused on how CPC represents re.++
and str.++
terms ((str.++ a b c)
is transformed into (str.++ a (str.++ b (str.++ c "")))
). However, with the representation in arrays that we have internally, it is, in fact, the same as applying str_fixed_len_re
directly to each argument.
I am going to change that there and in other places where a similar behavior occurs.
carcara/src/checker/rules/strings.rs
Outdated
Term::Op(Operator::ReUnion, args) => { | ||
if let [r_1, r_2 @ ..] = &args[..] { | ||
let n = str_fixed_len_re(pool, r_1.clone())?; | ||
if r_2.len() == 1 { | ||
match r_2.first().unwrap().as_ref() { | ||
Term::Op(Operator::ReNone, _) => Ok(n), | ||
_ => Err(CheckerError::LengthCannotBeEvaluated(r.clone())), | ||
} | ||
} else { | ||
let new_r_2 = pool.add(Term::Op(Operator::ReUnion, r_2.to_vec())); | ||
if str_fixed_len_re(pool, new_r_2)? == n { | ||
Ok(n) | ||
} else { | ||
Err(CheckerError::LengthCannotBeEvaluated(r.clone())) | ||
} | ||
} | ||
} else { | ||
unreachable!() | ||
} | ||
} | ||
Term::Op(Operator::ReIntersection, args) => { | ||
if let [r_1, r_2 @ ..] = &args[..] { | ||
let n = str_fixed_len_re(pool, r_1.clone())?; | ||
if r_2.len() == 1 { | ||
match r_2.first().unwrap().as_ref() { | ||
Term::Op(Operator::ReAll, _) => Ok(n), | ||
_ => Err(CheckerError::LengthCannotBeEvaluated(r.clone())), | ||
} | ||
} else { | ||
let new_r_2 = pool.add(Term::Op(Operator::ReIntersection, r_2.to_vec())); | ||
if str_fixed_len_re(pool, new_r_2)? == n { | ||
Ok(n) | ||
} else { | ||
Err(CheckerError::LengthCannotBeEvaluated(r.clone())) | ||
} | ||
} | ||
} else { | ||
unreachable!() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to the ReConcat
case, in these two cases you are constructing a re.union
/re.inter
term with the remaining arguments just to call str_fixed_re_len
recursively, which is not very efficient (for each argument, you allocate a vector with all the remaining arguments). I think it would be best to call str_fixed_re_len
on each argument directly to get their lenghts, and then check if they are all the same.
Also, since these two cases are almost identical, differing only in which terms they ignore (re.none
or re.all
), it might make sense to put this logic in a separate function and call it in each case.
… reason about Regular expressions (#46) * Add some core Strings theory rules that reason about RegEx * Rename prefix and suffix functions * Replace sanity checks with unreachable!() * PR modifications * Modifications in str_fixed_len_re * Fix linting
This PR adds the
re_inter
,re_unfold_neg
andre_unfold_neg_concat_fixed
rules from the Strings theory.For simplification reasons, the
re_unfold_neg_concat_fixed
rule was splitted in a prefix version and suffix version (resp.re_unfold_neg_concat_fixed_prefix
andre_unfold_neg_concat_fixed_suffix
), just like it was done for other previously implemented rules.Future PR's will add the
re_unfold_pos
rule.Rules reference can be found in the cvc5 documentation and in the CPC specification.