Checkpoint targets #88

myronahn · 2013-09-05T04:30:48Z

(Will fill in more detail and a proposal soon)

Allow checkpoint targets to be specified in the workflow. When the workflow generates a checkpoint target, it deletes all intermediate targets generated to reach that checkpoint, but does not delete intermediate targets that are needed for subsequent targets or that are checkpoint targets themselves.

aboytsov · 2013-09-10T20:54:17Z

Would love an example :)

myronahn · 2013-10-30T13:56:20Z

Prototype: c0947a5

Unit tests pass

myronahn · 2013-10-31T07:48:42Z

I've been thinking about this - it might be better to mark parts of the workflow that are "temporary" (i.e. intermediate targets) rather than mark ones that are permanent. This way, everything by default is permanent (a "checkpoint") and you can gradually subtract out what you don't need permanently.

Major components:

Syntax of marking a target as "temporary":
- Mark a target as temporary by prepending a tilde, e.g. ~a
- Can mark a target as temporary in either the input or output list of a step
- If a target is marked anywhere as temporary, it is "globally" considered to be a temporary file, even in steps where it is not marked as temporary.

What happens during the build?
- Temp targets are mostly treated as normal targets
- As soon as all the steps that rely on a temporary target are built, the target is deleted (if possible in the underlying fs)
- Have an option to turn off deletion of temp targets: --keep-temp-files

Effects on the dependency tree:
- The problem is, we don't want deleted temp targets to unnecessarily trigger the re-execution of a step
- e.g. if we have steps C <- ~B and ~B <- A, if B is deleted, we don't want to re-trigger these steps as long as C is newer than A
- Algorithm: when checking input target timestamps against output target timestamps:
  - If a temp input target has data, treat it as a normal target
  - If a temp input target has no data, ignore it
  - If a temp output target has data, treat it as a normal target
  - If a temp output target has no data, recursively expand it to be the output targets of all the steps that depend on this step (that are in the current build tree). Recursively apply the rules for temp output targets.
  - Then you can compare input target dates against output target dates normally
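
A minimal sketch of the timestamp rules above, in Python rather than Drake's own Clojure. The names `Step`, `effective_outputs`, and `needs_rebuild` are hypothetical, `mtimes` is an assumed map from existing targets to modification times (a target absent from it is "missing"), and the choice to rebuild when a missing temp output has no dependents in the build tree is an assumption, not something stated here:

```python
from dataclasses import dataclass


@dataclass
class Step:
    outputs: list
    inputs: list


def is_missing_temp(target, temp, mtimes):
    # "Has no data" is modeled as "not present in mtimes".
    return target in temp and target not in mtimes


def effective_outputs(step, steps, temp, mtimes, seen=None):
    """Replace each missing temp output with the outputs of dependent steps."""
    seen = set() if seen is None else seen
    result = []
    for out in step.outputs:
        if is_missing_temp(out, temp, mtimes):
            for index, child in enumerate(steps):
                if out in child.inputs and index not in seen:
                    seen.add(index)
                    result.extend(
                        effective_outputs(child, steps, temp, mtimes, seen))
        else:
            result.append(out)
    return result


def needs_rebuild(step, steps, temp, mtimes):
    # Missing temp inputs are ignored; other inputs are assumed to exist.
    inputs = [t for t in step.inputs if not is_missing_temp(t, temp, mtimes)]
    outputs = effective_outputs(step, steps, temp, mtimes)
    # Assumption: nothing to compare against, or a missing output, forces a build.
    if not outputs or any(o not in mtimes for o in outputs):
        return True
    if not inputs:
        return False
    return max(mtimes[t] for t in inputs) > min(mtimes[o] for o in outputs)


# The C <- ~B, ~B <- A example, with B deleted.
steps = [Step(outputs=["B"], inputs=["A"]), Step(outputs=["C"], inputs=["B"])]
temp = {"B"}
c_newer_than_a = {"A": 1, "C": 2}
a_newer_than_c = {"A": 3, "C": 2}
```

With `c_newer_than_a`, neither step is re-triggered; with `a_newer_than_c`, the step producing B is. The sketch checks each step in isolation and does not model how a rebuild propagates to later steps.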

I have a prototype of this behavior now.

myronahn · 2013-10-31T08:01:06Z

Simple example:

~b <- a
  commands to build b from a

~c <- b
  commands to build c from b

~d <- c
  commands to build d from c

e <- d
  commands to build e from d

f <- d
  commands to build f from d

b, c, and d are temporary or intermediate files, and are globally marked as such.

Running from scratch:

b is built
c is built
b is deleted after c is built
d is built
c is deleted after d is built
e is built
f is built
d is deleted after e and f are built
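
The run above can be reproduced with a small simulation. This is only a sketch under the assumption that steps run in the listed order and that a temp target is deleted as soon as its last consumer has been built; `build_log` and the tuple layout are hypothetical, not Drake's API:

```python
def build_log(steps, temp):
    """steps: list of (output, [inputs]) pairs, already in build order."""
    # Count how many steps still need each target as an input.
    pending = {}
    for _, inputs in steps:
        for target in inputs:
            pending[target] = pending.get(target, 0) + 1

    log = []
    for output, inputs in steps:
        log.append(f"{output} is built")
        for target in inputs:
            pending[target] -= 1
            # Delete a temp target once no remaining step consumes it.
            if target in temp and pending[target] == 0:
                log.append(f"{target} is deleted")
    return log


workflow = [
    ("b", ["a"]),
    ("c", ["b"]),
    ("d", ["c"]),
    ("e", ["d"]),
    ("f", ["d"]),
]
log = build_log(workflow, temp={"b", "c", "d"})
```

Here `d` survives the build of `e` because `f` still needs it, and is deleted only after `f` is built.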

myronahn · 2013-10-31T08:01:57Z

It is probably good to establish a convention of marking temp targets in the output list - I think it is more clear this way.

aboytsov · 2013-11-01T07:26:12Z

I like the direction this is going. I agree this is much clearer than defining checkpoints, and I think this feature could be useful. I wanted to make a comment that temp status should only be defined in outputs, but then I noticed you had made that comment already. :)

I think timestamp-wise, you can simply propagate the dates of inputs to a temp output, i.e.:

~c <- a, b
d <- ~c
   if c is missing, max(a, b) will be assumed to be its timestamp during the dependency evaluation stage
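
A rough sketch of that propagation, assuming a hypothetical `producers` map from each target to the inputs of the step that builds it, and an `mtimes` map that only contains targets that currently exist. How a step with no inputs should behave is not specified here, so the sketch simply raises in that case:

```python
def effective_mtime(target, producers, temp, mtimes):
    """Timestamp of a target, inferring it from inputs if a temp target is missing."""
    if target in mtimes:
        return mtimes[target]
    if target not in temp:
        raise KeyError(f"non-temp target {target!r} is missing")
    inputs = producers.get(target, [])
    if not inputs:
        # Assumption: with no inputs there is nothing to propagate.
        raise ValueError(f"no timestamp can be inferred for {target!r}")
    # A missing temp target is assumed to be as new as its newest input.
    return max(effective_mtime(t, producers, temp, mtimes) for t in inputs)


producers = {"c": ["a", "b"], "d": ["c"]}
temp = {"c"}
mtimes = {"a": 10, "b": 20, "d": 30}  # c has been deleted

c_time = effective_mtime("c", producers, temp, mtimes)
d_is_stale = c_time > mtimes["d"]
```

In this example `c_time` is `max(a, b)`, and `d` is not considered stale because it is newer than both `a` and `b`.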

I do feel, however, that we could be missing a bunch of things this behavior can affect, and we should make sure we cover everything, for example:

- Invocation: is it drake +^~c or drake +^c? I would vote for ~ to be outside of referencing.
- If drake +^c is invoked, c should be built and then d. But drake +=c is logically a no-op, correct? The distinction between explicitly specifying a target on the command line and pulling it in through dependencies is actually a moot one and I don't recommend relying on it.
- This can make things tricky if one wants to build things in stages, which could theoretically be addressed by your --keep-temp-files flag.
- It's also a bit unclear, if a subset of the dependency tree is selected, how you will judge whether a temporary file should be deleted. It's probably risky to evaluate this relative to the subset rather than the whole graph, but the latter has nuances, too. I think it's possible to come up with a consistent behavior, but we should be able to articulate it in a way that is easily understood.
- Not sure what you mean by "has no data" - you mean "missing", right? I read "has no data" as "exists, but empty", but I think this is not what you meant.

Finally, I am really excited to see Drake's development marching on! You've been doing awesome work on it lately, Myron, and I think this is useful to a lot of people outside Factual as well! You guys rock!

myronahn · 2013-11-01T08:00:52Z

@aboytsov I'm really glad you were able to take a look at this, and also very happy that you like the direction of these changes! Finally, I'm glad to see that some of my work is appreciated 👍

To address your comments:

re: timestamp checking - I considered several methods for checking timestamps when a temp file is missing:

- Reach upwards (what you proposed, I think): if an input is temp/missing, recursively reach upwards and check the timestamps of the parent's input files instead.
- Reach downwards (current implementation): if an output is temp/missing, recursively reach downwards and check the timestamps of the children's output files instead.
- Reach upwards/downwards: basically a combination of the two.

It turns out that reaching downwards is easier: when you reach upwards, you have to handle the special case of a step with no inputs and force the build based on this. There is no special case when reaching downwards.

myronahn · 2013-11-01T08:21:20Z

More comments:

- I agree, I think drake +^c is better - coincidentally, this is how it works now.
- Hmm, good question. I think drake +=c should work the same as if c were not a temp target, that is, it should try to build it w/o building dependencies, since the user is asking for c. I believe this is how it works now.
- Yes, true enough - on the other hand, I'd assume that the workflow would define appropriate checkpoints that correspond to stages. And you're right - if all else fails, the --keep-temp-files flag can be used. I'd assume that people would start using temp targets once a workflow is fairly tested and mature and would use --keep-temp-files (or not even use temp targets) if they were actively developing it.
- Good point, I thought about this as well. Right now if a subset of a dependency tree is selected, I only delete the file based on the dependencies on the temp file in the subset. This can be a bit surprising, you're right (especially if you want to run other parts of the workflow later), and you're right that the alternative (delete a target based on the dependencies in the full tree) is also surprising, and in my opinion, even more full of pitfalls. I'm open to suggestions here - perhaps a flag to flip between the two alternatives?
- Ah, my bad, I meant missing, or more specifically, (not (fs data/in? target))
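
To make the subset-versus-full-tree question concrete, here is a small sketch. The function `deletable`, the `consumers` map, and the `scope` parameter are all hypothetical names used for illustration; this is not how Drake represents its dependency tree:

```python
def deletable(temp_target, built, consumers, scope):
    """True if every consumer of temp_target inside scope has been built."""
    relevant = [step for step in consumers.get(temp_target, []) if step in scope]
    return all(step in built for step in relevant)


# d is a temp target consumed by the steps that build e and f.
consumers = {"d": ["e", "f"]}
full_tree = {"b", "c", "d", "e", "f"}
selected_subset = {"d", "e"}  # the user asked only for e and what it needs
built = {"d", "e"}

delete_by_subset = deletable("d", built, consumers, scope=selected_subset)
delete_by_full_tree = deletable("d", built, consumers, scope=full_tree)
```

Judged against the selected subset, `d` is deleted once `e` is built, so a later run that builds `f` has to regenerate it. Judged against the full tree, `d` is kept until `f` is built, even though `f` was never requested.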

myronahn · 2013-11-01T08:29:02Z

As an overall philosophy, I tried to make it so that if the user doesn't use temp targets, then Drake will act exactly the same as before.

If the user wants to use temp targets, they should definitely be warned in the docs that:

- It is an advanced feature that may be surprising.
- It will automatically recursively delete stuff off of the file system.
- It might slow things down if expensive temp targets are constantly being rebuilt.
- One should only use it on a fully debugged/mature workflow as a form of space optimization.
- One should definitely know about the --keep-temp-files flag.

myronahn · 2013-11-01T08:34:06Z

Oh, and I'll modify the parser so temp files can only be defined in the output list.

Squashed all the commits into one - in another branch

Unit tests pass

#106: Standardize on name "temp targets". Use -> for cleaner code. Add comments for ramifications of error when deleting temp target. Also: made sure temp target testing is in the regtest suite.

#106: Standardize on name "temp targets". Use -> for cleaner code. Add comments for ramifications of error when deleting temp target. Also: made sure temp target testing is in the regtest suite. Conflicts: resources/regtest/run-all.sh, src/drake/core.clj

calfzhou · 2014-05-06T03:03:42Z

This feature is just what i'm looking for. 👍

myronahn pushed a commit that referenced this issue Oct 30, 2013

Put in temp file deleting code for #88 and fixed some bugs

32cc44c

myronahn pushed a commit that referenced this issue Oct 31, 2013

More comments for checkpointing, fixed small bug #88

c852fe7

Unit tests pass

myronahn pushed a commit that referenced this issue Oct 31, 2013

Turned off some logging, option to keep temp files, tests #88

6ae0304

myronahn pushed a commit that referenced this issue Oct 31, 2013

Put in regression tests, fixed small bugs, improved exceptions #88

65bb110

myronahn pushed a commit that referenced this issue Nov 1, 2013

Removed ability to specify temp files in inputs #88

14303d9

myronahn pushed a commit that referenced this issue Nov 1, 2013

Implementation of checkpoints/temp targets #88

d42da6f

Squashed all the commits into one - in another branch

myronahn pushed a commit that referenced this issue Nov 1, 2013

Added missing regression tests #88

5ee0535

myronahn pushed a commit that referenced this issue Nov 1, 2013

Final modification for regression tests #88

54cb040

myronahn pushed a commit that referenced this issue Nov 1, 2013

Implementation of checkpoints/temp targets #88

df9b8c0

Squashed all the commits into one - in another branch

myronahn pushed a commit that referenced this issue Nov 5, 2013

Added checkpoints testing to run-all regtest script #88

46df84b

myronahn pushed a commit that referenced this issue Nov 5, 2013

Put in temp file deleting code for #88 and fixed some bugs

1013ee1

myronahn pushed a commit that referenced this issue Nov 5, 2013

More comments for checkpointing, fixed small bug #88

ae9553e

Unit tests pass

myronahn pushed a commit that referenced this issue Nov 5, 2013

Turned off some logging, option to keep temp files, tests #88

55c3f8c

myronahn pushed a commit that referenced this issue Nov 5, 2013

Put in regression tests, fixed small bugs, improved exceptions #88

d6ea5a2

myronahn pushed a commit that referenced this issue Nov 5, 2013

Removed ability to specify temp files in inputs #88

99efa6f

myronahn pushed a commit that referenced this issue Nov 5, 2013

Added checkpoints testing to run-all regtest script #88

6799321

myronahn pushed a commit that referenced this issue Nov 8, 2013

Updated changelog for #88

5359d4e

myronahn pushed a commit that referenced this issue Nov 14, 2013

Fix intermittent problem with calculating temp targets #88

110b12e

myronahn pushed a commit that referenced this issue Nov 14, 2013

Fix intermittent problem with calculating temp targets #88

a48cab0

myronahn pushed a commit that referenced this issue Jan 22, 2014

Updated changelog for #88

7bcf123
