Add 'ignore_archives' flag to functions that fetch runs #64

as2388 · 2020-01-20T16:36:50Z

Defaults to False, so the default behaviour is to always fetch from archives.

The intention here is that pipelines can set ignore_archives to True when incrementally fetching new data for a dataset they have fetched in the past. This is a cheap and simple way to work around the performance problem of pipelines going to archives if the last message received was more than 3 months ago. Of course, this would cause us problems if more than 3 months pass between pipeline runs and we are fetching a flow that had some data in archives 3 months ago, but this is extremely unlikely to happen in practice, because we run pipelines frequently and then terminate flows permanently. In any case, we can work around it by removing the Raw Data folder.

(In other words, this prevents the incremental fetching performance problem originally described in #60.)
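
As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It does not use the real client or the functions changed in this PR: `ToyRunStore`, `fetch_runs`, `update_runs_incrementally`, and the 90-day `ARCHIVE_AGE` cutoff are all hypothetical names and values chosen for the example. It only shows the idea that an incremental fetch with `ignore_archives=True` never touches the slow archive path.

```python
from datetime import datetime, timedelta, timezone

# Assumption: runs older than roughly 3 months live in a slow "archive" tier.
ARCHIVE_AGE = timedelta(days=90)


class ToyRunStore:
    """Hypothetical in-memory stand-in for a server with live and archived runs."""

    def __init__(self, runs, now):
        self.live = [r for r in runs if now - r["modified_on"] <= ARCHIVE_AGE]
        self.archived = [r for r in runs if now - r["modified_on"] > ARCHIVE_AGE]
        self.archive_reads = 0  # counts how often the slow archive path is hit

    @staticmethod
    def _newer(runs, after):
        # Keep only runs modified strictly after the given timestamp (or all if None).
        return [r for r in runs if after is None or r["modified_on"] > after]

    def fetch_live(self, after=None):
        return self._newer(self.live, after)

    def fetch_archived(self, after=None):
        self.archive_reads += 1
        return self._newer(self.archived, after)


def fetch_runs(store, after=None, ignore_archives=False):
    """Fetch runs modified after `after`; skip the archive tier if asked to."""
    runs = []
    if not ignore_archives:
        runs.extend(store.fetch_archived(after))
    runs.extend(store.fetch_live(after))
    return runs


def update_runs_incrementally(store, prev_runs, ignore_archives=False):
    """Fetch only runs newer than the latest one already held, then merge by id."""
    latest = max((r["modified_on"] for r in prev_runs), default=None)
    merged = {r["id"]: r for r in prev_runs}
    for run in fetch_runs(store, after=latest, ignore_archives=ignore_archives):
        merged[run["id"]] = run
    return sorted(merged.values(), key=lambda r: r["modified_on"])
```

In this sketch, a pipeline would call `fetch_runs(store)` on its first run, then `update_runs_incrementally(store, prev_runs, ignore_archives=True)` on later runs, which leaves `store.archive_reads` unchanged.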

For example usage, see AfricasVoices/Project-IOM#19

as2388 added 4 commits January 20, 2020 14:37

Add a "ignore_archives" flag to get_raw_runs_for_flow_id

c6b4c7a

Add "ignore_archives" flag to update_raw_runs_with_latest_modified

fa09515

Add missing comma

f955ad2

Update docstring for ignore_archives

41e84e0

as2388 requested review from lukechurch and IsaackMwenda January 20, 2020 16:36

as2388 mentioned this pull request Jan 20, 2020

Ignore archives when fetching incrementally AfricasVoices/Project-IOM#19

Merged

IsaackMwenda approved these changes Jan 21, 2020

lukechurch approved these changes Jan 24, 2020

as2388 merged commit 7f01306 into master Jan 24, 2020
