Skip to content

Conversation

@michael-schwarz
Copy link
Member

This implements a cheap form of reachability that works on a set of dep collected during solving (and heavy-weight reachability).

Something that is a bit odd:

  • If a side-effect occurs from x to y, it still is necessary to record that x depends on y, to avoid those globals that only receive side-effects but are never read from being pruned. This has the effect that globals that are only side-effected to do not depend on anything, which I am not quite sure is the right thing here.

This stronger form of reachability can then be used to additionally filter warnings and accesses to replay: Only those that are superstable and still reachable need to be replayed.

I tested the performance impact on a small program, and this cheap reachability seems to be a lot cheaper than full reachability, meaning that it hopefully is not too expensive also for larger programs:

TOTAL                           0.676 s
  parse                           0.016 s
  convert to CIL                  0.010 s
  mergeCIL                        0.008 s
  analysis                        0.642 s
    createCFG                       0.007 s
      handle                          0.003 s
      iter_connect                    0.003 s
        computeSCCs                     0.002 s
    global_inits                    0.013 s
    solving                         0.612 s
      S.Dom.equal                     0.004 s
      postsolver                      0.183 s
        potentially_incremental_reach       0.176 s
        cheap_full_reach                0.001 s
    warn_global                     0.002 s
      access                          0.001 s
Timing used

There are still three incremental tests that fail, all of which use restarting. I'll try to have a look, but I am not so familiar with it: maybe @sim642 has an idea what could go wrong there?

Closes #710

@michael-schwarz michael-schwarz self-assigned this Apr 30, 2022
@michael-schwarz michael-schwarz changed the title Incremental: Full reachability to prune Incremental: (Cheap!) Full Reachability to Prune Apr 30, 2022
@michael-schwarz
Copy link
Member Author

There are more flaws than I thought: Upon reanalysis everything not in the main thread seems to be pruned away (!?)

@sim642 sim642 self-requested a review May 1, 2022 08:49
@michael-schwarz
Copy link
Member Author

There are more flaws than I thought: Upon reanalysis everything not in the main thread seems to be pruned away (!?)

The problem was everyone's favorite thing, hashconsing: I re-lifted only the keys in dep but the values need to be re-lifted too.

@michael-schwarz michael-schwarz marked this pull request as ready for review May 1, 2022 11:24
@sim642 sim642 mentioned this pull request May 2, 2022
5 tasks
@sim642
Copy link
Member

sim642 commented May 2, 2022

I suppose this should also fix the second problem (and the corresponding test) from #647, right? If so, then the test should be cherry-picked here.

Copy link
Member

@sim642 sim642 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I'm thinking about the init_reachable possibility and simplifications related to it, is to find a solution which minimizes the necessary changes to the postsolvers (and especially the incremental ones in TD3). It's risky to make so big changes here last minute, considering the subtle interactions and bugs in the current version, which has gotten a lot more testing.

Comment on lines 159 to 160
(* Tracks dependencies between an unknown and the things it depends on AND has side-effects to *)
let dep = data.dep in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition of dep containing unknowns which it has side-effects to can be very misleading because the dep definition in the TD3 paper doesn't include that, or does it?

Couldn't dep just be the inverse of infl, like we have previously had, but just use side_infl for the additional lookups when needed later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like side-effects are not always recorded into side_infl? Is there some specific reason for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, before this side_infl was only populated during postsolving for marshaling, but wasn't needed to collect during the solving itself.
Of course that can be changed, but I'm wondering if that's even necessary since leaves (side effect targets with no right-hand sides) themselves don't have warnings, which are emitted during their rhs evaluation. So there shouldn't even be anything to reuse from them in var_messages and rho_write.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the tests seem to fail but I am still a bit skeptical about using side_infl: it is explicitly cleared before we start the solver, which strikes me as very odd.

It seems that we would run the risk of removing globals that are only side-effected to from superstable things (but remain reachable) if we remove the things that side-effects are done to from dep (though I could not construct an example thus far).

        (* reachability will populate these tables for incremental global restarting *)
        HM.clear side_dep;
        HM.clear side_infl;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the better solution is to rename dep to uses or something that indicates that these are the unknowns that x reads or has side-effects to?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the tests seem to fail but I am still a bit skeptical about using side_infl: it is explicitly cleared before we start the solver, which strikes me as very odd.

Right, the previous use of side_infl was just for restarting, which is all done before starting the solving itself. So to minimize memory usage and not keep around data during the solving, which will be recollected during postsolving anyway.

It seems that we would run the risk of removing globals that are only side-effected to from superstable things (but remain reachable) if we remove the things that side-effects are done to from dep (though I could not construct an example thus far).

Hmm, you may have some point here, but if that's indeed a problem, then it's broader than just this PR. Maybe it becomes a problem if at least two incremental changes are made (so there's an intermediate run, which both loads and saves). Then I suppose that might lose the original data and not be able to perform restarting on the final execution correctly?

Since all the testing of incrementality has always been just one change, such an issue has never had a chance to reveal itself. I guess @stilscher's benchmarks that do longer chains of incremental changes might also expose these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe this sort of structure is needed indeed and uses name with fewer mixup opportunities.

Now I'm wondering if it's even necessary to collect this uses during the actual solving or again just during postsolving. The postsolver sees all the changed/new uses and all the reachable_and_superstable should also remain.

@michael-schwarz
Copy link
Member Author

Following the suggestions by @sim642 , I have now massively simplified it by passing in the intersection of superstable and reachable' as init_reachable, meaning the postsolvers can remain unchanged.

@michael-schwarz
Copy link
Member Author

I suppose this should also fix the second problem (and the corresponding test) from #647, right? If so, then the test should be cherry-picked here.

Done

@michael-schwarz
Copy link
Member Author

michael-schwarz commented May 2, 2022

I now went with a solution where we do not add the things that we cause side-effects to to dep, but instead also query side_infl, which we in turn never reset as its values supposedly need to persist across executions. The other question is whether in the post-solver that iterates over the part that might have changed , sides_infl x should be reset to the empty set when executing the RHS for x to make sure that old things that no side-effects are now caused to are purged from sides_infl...

@sim642
Copy link
Member

sim642 commented May 2, 2022

The other question is whether in the post-solver that iterates over the part that might have changed , sides_infl x should be reset to the empty set when executing the RHS for x to make sure that old things that no side-effects are now caused to are pruged from sides_infl...

That sounds like it should be sufficient. The one_constraint postsolver hook could be used to do that since it gets executed exactly when the right-hand side has been executed.

@michael-schwarz
Copy link
Member Author

Actually, resetting the set in this manner breaks the test case 11/09, so I'd suggest we leave it as currently in this PR for now and merge that.

@sim642
Copy link
Member

sim642 commented May 2, 2022

Actually, resetting the set in this manner breaks the test case 11/09, so I'd suggest we leave it as currently in this PR for now and merge that.

Would be good to have a TODO in there somewhere about clearing the sets. Otherwise we'll forget unless we immediately figure it out.

@michael-schwarz michael-schwarz force-pushed the interactive_postsolver_pruning branch from f106225 to 56d230f Compare May 3, 2022 06:45
@michael-schwarz michael-schwarz requested a review from sim642 May 3, 2022 07:21
@michael-schwarz
Copy link
Member Author

@stilscher: Attention, this contains breaking changes to the config options!

@michael-schwarz michael-schwarz merged commit 9326978 into interactive May 3, 2022
@michael-schwarz michael-schwarz deleted the interactive_postsolver_pruning branch May 3, 2022 07:51
@sim642 sim642 added this to the v2.0.0 milestone Aug 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Warnings not removed for unreachable code (contexts) with incremental post-solving

3 participants