Conversation
| freges = :((p ⟹ (q ⟹ r)) ⟹ ((p ⟹ q) ⟹ (p ⟹ r))) # Frege's theorem | ||
| params = SaturationParams(timeout = 12, eclasslimit = 10000, schedulerparams = (match_limit = 6000, ban_length = 5)) | ||
| @test prove(calculational_logic_theory, ex, 2, 10, params) | ||
| @test_broken true == prove(calculational_logic_theory, freges, 2, 10, params) |
There was a problem hiding this comment.
This fails now because saturation reaches the enodelimit before the theory can be proven. Increasing the enodelimit fixes the test.
The question is why it produces more enodes....
The reason seems to be the different order of applying rules (vector instead of stack).
In fact, this test succeeds:
@test true == prove(reverse(calculational_logic_theory), freges, 2, 10, params)There was a problem hiding this comment.
I would argue that the order of applying rules should not matter for EqSat.
I'm not 100% sure why reversing the order has an effect on the number of enodes after x iterations.
There was a problem hiding this comment.
The issue that the order of rules affects the results is probably not caused by the changes in this PR but already exists in ale/3.0.
There was a problem hiding this comment.
Thanks a lot @gkronber
I'm not 100% sure why reversing the order has an effect on the number of enodes after x iterations.
Me neither, it's probably because of union!-ing of e-classes that
# Make sure class 2 has fewer parents
if length(g.classes[id_1].parents) < length(g.classes[id_2].parents)
id_1, id_2 = id_2, id_1
end
union!(g.uf, id_1.val, id_2.val)
So the IDs likely change? I'm not sure why we take the e-class with fewer parents, it was taken from egg
There was a problem hiding this comment.
We should keep this. I suspect another issue in the rebuilding phase (where we clean up duplicate enodes).
To solve the problem in this PR, should we just increase the enodelimit for the failing tests? Another way, would be to reverse the order of applying the matches again (iterating from back to front). Then behaviour should be the same as before.
There was a problem hiding this comment.
@gkronber could it be something related to analysis_pending and rebuilding is processed?
Adding this
function rebuild_memo!(g::EGraph)
new_memo = Dict{VecExpr,Id}()
for eclass in values(g.classes)
for node in eclass.nodes
canonicalize!(g, node)
new_memo[node] = eclass.id
end
end
g.memo = new_memo
end
After every rebuild didn't seem to solve the issue at all, so it looks like the memo is alright
There was a problem hiding this comment.
I think we should not concern ourself with rebuilding in this PR. The changes in this PR are ok and we can fix the failing test by increasing the enodelimit. I have a branch ready with several improvements of the rebuild procedure.
|
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## ale/3.0 #261 +/- ##
===========================================
- Coverage 81.42% 81.42% -0.01%
===========================================
Files 19 19
Lines 1497 1491 -6
===========================================
- Hits 1219 1214 -5
+ Misses 278 277 -1 ☔ View full report in Codecov by Sentry. |
…ses to wrapped vector v.
…lics/Metatheory.jl into ale/3.0-ematch-hashes
@gkronber I've managed to simplify a lot the logic behind e-matching yesterday by tinkering around:
The ematch buffer is now using
UInt64. There is no pairing anymore, and no delimiters anymore. The structure of a match in the buffer is as follows. First element gives the id of the e-class that matched, the second gives the rule index in the theory (positive or negative based on direction). Third element containsisliteral_bitvec, a UInt64 bitmask, where a bit in positioniis1if and only if the pattern variable with Debruijn indexiin the match contains a literal.The length of a match group is given by
length(theory[rule_idx].patvars). No need for magic number delimiters (although it may be good to have them to double check during development.This
isliteral_bitveclimits the number of pattern variables to be 64 per rule, (we can easily extend to 128 by just using two elements of the buffer forisliteral_bitvec).Example. If we have
@rule f(~x, ~y::Int, ~z) --> g(~x, ~z) * ~ythen during pattern matching it will append to the buffer a vector of the formisliteral_bitvec = 0x0000000000000002because pattern variable at position 2 is matching a literalxyzinstantiate_enode!will know what pattern variable is a literal by receiving and checking againstisliteral_bitvec.I've also simplified the logic of consuming the buffer in
eqsat_apply!. It is now reading the buffer from position 0 to end (as a queue). Before, it was reading the buffer in reverse (as a stack of matches).This should use asymptotically less memory and in theory should be a lot faster. In practice, it turned out to be slightly slower, and some tests are broken.
Could you help me take a look at the broken tests?