Adopt a linear-time replay algorithm #302
Conversation
ConnorMcMahon
left a comment
Super excited to see this. Left some initial comments on areas to clear up the code from a reader's perspective, but this does seem substantially cleaner than our old implementation.
ConnorMcMahon
left a comment
Looks good, though there is a rename that it looks like we haven't quite done yet.
@@ -0,0 +1,314 @@
from azure.durable_functions.models.RetryOptions import RetryOptions
Looks like we still haven't resolved this task.
My latest commits include changes for event_sent and raise_event processing :)
ConnorMcMahon
left a comment
LGTM!
Problem statement 🙀
The DF Python SDK's main goal is to rehydrate an orchestrator's state by processing its internal history log. Previously, our algorithm for processing an orchestration's history was quite naive: each DF API invocation could incur a full scan of the history log, yielding quadratic runtime complexity.
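To make the cost concrete, here is a hypothetical sketch of that quadratic pattern; it is not the SDK's actual code, and names like `find_completion` and `replay_naive` are made up for illustration:

```python
# Hypothetical sketch of the *old* quadratic behavior (not the SDK's real code):
# every awaited DF API call triggers a fresh scan over the entire history log.

def find_completion(history, task_id):
    """Full linear scan of the history for one task's completion event."""
    for event in history:  # O(n) per lookup
        if event.get("TaskScheduledId") == task_id:
            return event.get("Result")
    return None

def replay_naive(history, scheduled_task_ids):
    # One full scan per awaited task: O(k) lookups * O(n) scan ~ O(n^2) overall.
    return [find_completion(history, tid) for tid in scheduled_task_ids]

history = [{"TaskScheduledId": i, "Result": i * 10} for i in range(5)]
print(replay_naive(history, [0, 2, 4]))  # [0, 20, 40]
```

With many awaited APIs over a long history, these repeated scans dominate replay time, which is exactly what this PR removes.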
This PR's main changes 😎
Among other things, this PR replaces this algorithm with a more efficient one. The new approach does a single linear scan of the history, irrespective of the number of DF API invocations. The approach borrows heavily from the `TaskOrchestrationExecutor` in DurableTask, but it also takes a few organizational liberties to maximize code reuse.

This PR represents an algorithmic change to the core of the DF Python SDK, and so it touches almost all code paths that would get invoked during day-to-day usage of the library. As a result, this PR demands significant manual testing before merging to prevent regressions.
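The core idea can be sketched as follows. This is a simplified illustration under assumed names (`PendingTask`, `replay_linear`), not the SDK's actual implementation: pending tasks are indexed by id once, and each history event is processed exactly once.

```python
# Simplified sketch of the linear-replay idea (hypothetical names, not the
# SDK's actual API): process each history event exactly once, resolving
# pending tasks through an id -> task lookup table instead of re-scanning.

class PendingTask:
    def __init__(self, task_id):
        self.task_id = task_id
        self.result = None
        self.completed = False

    def set_value(self, value):
        self.result = value
        self.completed = True

def replay_linear(history, tasks):
    # Index pending tasks by id once: O(k).
    by_id = {t.task_id: t for t in tasks}
    # Single pass over the history: O(n) total, regardless of task count.
    for event in history:
        task = by_id.get(event.get("TaskScheduledId"))
        if task is not None:
            task.set_value(event.get("Result"))
    return [t.result for t in tasks if t.completed]

history = [{"TaskScheduledId": i, "Result": i * 10} for i in range(5)]
tasks = [PendingTask(0), PendingTask(2), PendingTask(4)]
print(replay_linear(history, tasks))  # [0, 20, 40]
```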
Outline of changes
Task-management classes
A major change of this PR is to provide a state-machine implementation of Tasks. Here we implement `TaskBase`, which is a base class for all Tasks and provides logic shared by all its subclasses. Similarly, `CompoundTask` is a child of `TaskBase` but also a base class for all compound tasks, such as `WhenAll` and `WhenAny`.

A particularly notable Task abstraction introduced is the
`RetryAbleTask`. This class encapsulates the logic for determining whether a DF API with the `-WithRetry` suffix is done executing. Introducing this abstraction was necessary for two reasons: (1) it kept the task-management code simple, as it allowed us to have one Task per DF API, and (2) it allowed us to piggyback on the shared Task logic by framing the scheduling of timers and retries of the original task as "sub-tasks" of the RetryAble task.
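The shape of that hierarchy might look something like the sketch below. The class names mirror this PR's description, but the bodies are simplified guesses for illustration, not the SDK's actual implementation:

```python
# Illustrative sketch of a state-machine Task hierarchy. The names TaskBase
# and CompoundTask come from this PR's description; the bodies here are
# simplified stand-ins, not the SDK's real code.

class TaskState:
    RUNNING, SUCCEEDED, FAILED = range(3)

class TaskBase:
    def __init__(self):
        self.state = TaskState.RUNNING
        self.result = None
        self.parent = None  # a CompoundTask observing this task, if any

    def set_value(self, is_error, value):
        self.state = TaskState.FAILED if is_error else TaskState.SUCCEEDED
        self.result = value
        if self.parent is not None:
            self.parent.handle_completion(self)

class CompoundTask(TaskBase):
    def __init__(self, children):
        super().__init__()
        self.children = children
        self.pending = len(children)
        for child in children:
            child.parent = self

    def handle_completion(self, child):
        raise NotImplementedError

class WhenAllLike(CompoundTask):
    # Completes only once every child task has completed.
    def handle_completion(self, child):
        self.pending -= 1
        if self.pending == 0:
            self.set_value(False, [c.result for c in self.children])

a, b = TaskBase(), TaskBase()
all_task = WhenAllLike([a, b])
a.set_value(False, "A")
b.set_value(False, "B")
print(all_task.state == TaskState.SUCCEEDED, all_task.result)  # True ['A', 'B']
```

A `RetryAbleTask` fits naturally into this scheme: it is a compound task whose `handle_completion` either finishes with the child's result or schedules a timer-plus-retry pair as new sub-tasks.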
The new replay driver

The next big change in this PR is the introduction of the `TaskOrchestrationExecutor` class, which takes care of interleaving the linear pass through the orchestration history with the resumption of the orchestration generator function. This class was written with code reuse in mind, so it introduces many shared utilities that abstract over common task-creating and task-updating procedures that occur in response to specific history events.
ToDos and other FYIs

As an FYI, I had to remove two very small test files which tested the behavior of utilities that no longer exist. I don't feel great about removing tests, so I'll look to re-implement a version of these tests that targets the same behaviors before merging.
In addition to re-implementing some tests, we should do a lot of manual testing before merging. A first step will be to verify that all samples continue working. After that, we can run a bug bash of sorts to exercise further edge cases.
Finally, this PR wouldn't be complete without a full set of comparative benchmark results of this new algorithm against the current approach. Before merge, I'll be posting those results for us to understand the performance impact of these changes.
I'm really excited about this change! Thanks y'all! ✨✨