Incremental: (Cheap!) Full Reachability to Prune #713

michael-schwarz · 2022-04-30T14:58:13Z

This implements a cheap form of reachability that works on a set of dep collected during solving (and heavy-weight reachability).

Something that is a bit odd:

If a side-effect occurs from x to y, it still is necessary to record that x depends on y, to avoid those globals that only receive side-effects but are never read from being pruned. This has the effect that globals that are only side-effected to do not depend on anything, which I am not quite sure is the right thing here.

This stronger form of reachability can then be used to additionally filter warnings and accesses to replay: Only those that are superstable and still reachable need to be replayed.

I tested the performance impact on a small program, and this cheap reachability seems to be a lot cheaper than full reachability, meaning that it hopefully is not too expensive also for larger programs:

TOTAL                           0.676 s
  parse                           0.016 s
  convert to CIL                  0.010 s
  mergeCIL                        0.008 s
  analysis                        0.642 s
    createCFG                       0.007 s
      handle                          0.003 s
      iter_connect                    0.003 s
        computeSCCs                     0.002 s
    global_inits                    0.013 s
    solving                         0.612 s
      S.Dom.equal                     0.004 s
      postsolver                      0.183 s
        potentially_incremental_reach       0.176 s
        cheap_full_reach                0.001 s
    warn_global                     0.002 s
      access                          0.001 s
Timing used

There are still three incremental tests that fail, all of which use restarting. I'll try to have a look, but I am not so familiar with it: maybe @sim642 has an idea what could go wrong there?

Closes #710

michael-schwarz · 2022-04-30T15:46:08Z

There are more flaws than I thought: Upon reanalysis everything not in the main thread seems to be pruned away (!?)

src/solvers/postSolver.ml

michael-schwarz · 2022-05-01T10:57:46Z

There are more flaws than I thought: Upon reanalysis everything not in the main thread seems to be pruned away (!?)

The problem was everyone's favorite thing, hashconsing: I re-lifted only the keys in dep but the values need to be re-lifted too.

sim642 · 2022-05-02T02:37:51Z

I suppose this should also fix the second problem (and the corresponding test) from #647, right? If so, then the test should be cherry-picked here.

sim642

The reason I'm thinking about the init_reachable possibility and simplifications related to it, is to find a solution which minimizes the necessary changes to the postsolvers (and especially the incremental ones in TD3). It's risky to make so big changes here last minute, considering the subtle interactions and bugs in the current version, which has gotten a lot more testing.

sim642 · 2022-05-02T02:57:54Z

src/solvers/td3.ml

+      (* Tracks dependencies between an unknown and the things it depends on AND has side-effects to *)
+      let dep = data.dep in


This definition of dep containing unknowns which it has side-effects to can be very misleading because the dep definition in the TD3 paper doesn't include that, or does it?

Couldn't dep just be the inverse of infl, like we have previously had, but just use side_infl for the additional lookups when needed later?

It seems like side-effects are not always recorded into side_infl? Is there some specific reason for that?

Well, before this side_infl was only populated during postsolving for marshaling, but wasn't needed to collect during the solving itself.
Of course that can be changed, but I'm wondering if that's even necessary since leaves (side effect targets with no right-hand sides) themselves don't have warnings, which are emitted during their rhs evaluation. So there shouldn't even be anything to reuse from them in var_messages and rho_write.

None of the tests seem to fail but I am still a bit skeptical about using side_infl: it is explicitly cleared before we start the solver, which strikes me as very odd.

It seems that we would run the risk of removing globals that are only side-effected to from superstable things (but remain reachable) if we remove the things that side-effects are done to from dep (though I could not construct an example thus far).

(* reachability will populate these tables for incremental global restarting *) HM.clear side_dep; HM.clear side_infl;

Maybe the better solution is to rename dep to uses or something that indicates that these are the unknowns that x reads or has side-effects to?

None of the tests seem to fail but I am still a bit skeptical about using side_infl: it is explicitly cleared before we start the solver, which strikes me as very odd.

Right, the previous use of side_infl was just for restarting, which is all done before starting the solving itself. So to minimize memory usage and not keep around data during the solving, which will be recollected during postsolving anyway.

It seems that we would run the risk of removing globals that are only side-effected to from superstable things (but remain reachable) if we remove the things that side-effects are done to from dep (though I could not construct an example thus far).

Hmm, you may have some point here, but if that's indeed a problem, then it's broader than just this PR. Maybe it becomes a problem if at least two incremental changes are made (so there's an intermediate run, which both loads and saves). Then I suppose that might lose the original data and not be able to perform restarting on the final execution correctly?

Since all the testing of incrementality has always been just one change, such an issue has never had a chance to reveal itself. I guess @stilscher's benchmarks that do longer chains of incremental changes might also expose these.

So maybe this sort of structure is needed indeed and uses name with fewer mixup opportunities.

Now I'm wondering if it's even necessary to collect this uses during the actual solving or again just during postsolving. The postsolver sees all the changed/new uses and all the reachable_and_superstable should also remain.

src/solvers/td3.ml

michael-schwarz · 2022-05-02T08:44:41Z

Following the suggestions by @sim642 , I have now massively simplified it by passing in the intersection of superstable and reachable' as init_reachable, meaning the postsolvers can remain unchanged.

src/solvers/td3.ml

src/solvers/postSolver.ml

src/solvers/td3.ml

michael-schwarz · 2022-05-02T09:47:31Z

I suppose this should also fix the second problem (and the corresponding test) from #647, right? If so, then the test should be cherry-picked here.

Done

michael-schwarz · 2022-05-02T10:11:52Z

I now went with a solution where we do not add the things that we cause side-effects to to dep, but instead also query side_infl, which we in turn never reset as its values supposedly need to persist across executions. The other question is whether in the post-solver that iterates over the part that might have changed , sides_infl x should be reset to the empty set when executing the RHS for x to make sure that old things that no side-effects are now caused to are purged from sides_infl...

sim642 · 2022-05-02T10:21:14Z

The other question is whether in the post-solver that iterates over the part that might have changed , sides_infl x should be reset to the empty set when executing the RHS for x to make sure that old things that no side-effects are now caused to are pruged from sides_infl...

That sounds like it should be sufficient. The one_constraint postsolver hook could be used to do that since it gets executed exactly when the right-hand side has been executed.

michael-schwarz · 2022-05-02T11:56:40Z

Actually, resetting the set in this manner breaks the test case 11/09, so I'd suggest we leave it as currently in this PR for now and merge that.

src/solvers/td3.ml

sim642 · 2022-05-02T13:04:42Z

Actually, resetting the set in this manner breaks the test case 11/09, so I'd suggest we leave it as currently in this PR for now and merge that.

Would be good to have a TODO in there somewhere about clearing the sets. Otherwise we'll forget unless we immediately figure it out.

src/solvers/td3.ml

michael-schwarz · 2022-05-03T07:51:01Z

@stilscher: Attention, this contains breaking changes to the config options!

michael-schwarz added 2 commits April 28, 2022 13:18

Add example where incremental postsolver does not remove warnings

a33457a

Steps towards cheap full reachability

bda1100

michael-schwarz added the bug label Apr 30, 2022

michael-schwarz requested review from sim642 and stilscher April 30, 2022 14:58

michael-schwarz self-assigned this Apr 30, 2022

michael-schwarz changed the title ~~Incremental: Full reachability to prune~~ Incremental: (Cheap!) Full Reachability to Prune Apr 30, 2022

Add fundamental problem where multi-threaded code becomes dead

080d474

sim642 reviewed May 1, 2022

View reviewed changes

src/solvers/postSolver.ml Outdated Show resolved Hide resolved

sim642 self-requested a review May 1, 2022 08:49

michael-schwarz added 2 commits May 1, 2022 12:33

Newline for dep log

5d8d36d

dep: Relift VS too

b9252dc

michael-schwarz marked this pull request as ready for review May 1, 2022 11:24

sim642 mentioned this pull request May 2, 2022

Interactive analysis #391

Merged

5 tasks

sim642 linked an issue May 2, 2022 that may be closed by this pull request

Warnings not removed for unreachable code (contexts) with incremental post-solving #710

Closed

sim642 mentioned this pull request May 2, 2022

Fix incremental warnings with reluctant destabilization #709

Merged

sim642 reviewed May 2, 2022

View reviewed changes

michael-schwarz added 3 commits May 2, 2022 10:18

Simplify

ee55218

Undo unneeded modifications

c1933db

minimize diff

5850d93

sim642 reviewed May 2, 2022

View reviewed changes

src/solvers/td3.ml Outdated Show resolved Hide resolved

sim642 reviewed May 2, 2022

View reviewed changes

src/solvers/postSolver.ml Outdated Show resolved Hide resolved

sim642 reviewed May 2, 2022

View reviewed changes

src/solvers/td3.ml Outdated Show resolved Hide resolved

michael-schwarz added 3 commits May 2, 2022 11:19

Move incr_verify specific changes back

83ec1a1

Cleanup

ddda093

Cleanup

f45f194

Add incremental test where race in removed call remains

9940dea

Record less in dep, do not clear side_infl and side_dep

da4564d

sim642 reviewed May 2, 2022

View reviewed changes

src/solvers/td3.ml Show resolved Hide resolved

src/solvers/td3.ml Show resolved Hide resolved

src/solvers/td3.ml Outdated Show resolved Hide resolved

src/solvers/td3.ml Outdated Show resolved Hide resolved

michael-schwarz added 2 commits May 2, 2022 15:29

Print same solver stats consistently

8b7d4d3

Typo

50fed60

sim642 reviewed May 2, 2022

View reviewed changes

src/solvers/td3.ml Outdated Show resolved Hide resolved

Merge branch 'interactive' into interactive_postsolver_pruning

56d230f

michael-schwarz force-pushed the interactive_postsolver_pruning branch from f106225 to 56d230f Compare May 3, 2022 06:45

michael-schwarz added 3 commits May 3, 2022 08:49

Comment about resetting side_infl

d622611

Rename incremental.verify -> incremental.postsolver.enabled

3f96c77

Add option incremental.postsolver.superstable-reached

44f1efa

michael-schwarz requested a review from sim642 May 3, 2022 07:21

sim642 added the precision label May 3, 2022

sim642 approved these changes May 3, 2022

View reviewed changes

michael-schwarz merged commit 9326978 into interactive May 3, 2022

michael-schwarz deleted the interactive_postsolver_pruning branch May 3, 2022 07:51

sim642 mentioned this pull request May 3, 2022

Minimal write-only restarting is insufficient: add ana.malloc.include-node #647

Closed

sim642 added this to the v2.0.0 milestone Aug 12, 2022

		(* Tracks dependencies between an unknown and the things it depends on AND has side-effects to *)
		let dep = data.dep in

Incremental: (Cheap!) Full Reachability to Prune #713

Incremental: (Cheap!) Full Reachability to Prune #713

Uh oh!

Conversation

michael-schwarz commented Apr 30, 2022

Uh oh!

michael-schwarz commented Apr 30, 2022

Uh oh!

Uh oh!

michael-schwarz commented May 1, 2022

Uh oh!

sim642 commented May 2, 2022

Uh oh!

sim642 left a comment

Choose a reason for hiding this comment

Uh oh!

sim642 May 2, 2022

Choose a reason for hiding this comment

Uh oh!

michael-schwarz May 2, 2022

Choose a reason for hiding this comment

Uh oh!

sim642 May 2, 2022

Choose a reason for hiding this comment

Uh oh!

michael-schwarz May 2, 2022

Choose a reason for hiding this comment

Uh oh!

michael-schwarz May 2, 2022

Choose a reason for hiding this comment

Uh oh!

sim642 May 2, 2022

Choose a reason for hiding this comment

Uh oh!

sim642 May 2, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

michael-schwarz commented May 2, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

michael-schwarz commented May 2, 2022

Uh oh!

michael-schwarz commented May 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sim642 commented May 2, 2022

Uh oh!

michael-schwarz commented May 2, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sim642 commented May 2, 2022

Uh oh!

Uh oh!

michael-schwarz commented May 3, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

michael-schwarz commented May 2, 2022 •

edited

Loading