Skip to content

Conversation

@davidmrdavid
Copy link
Collaborator

@davidmrdavid davidmrdavid commented Jul 15, 2020

Addresses: #148

This PR fixes continue_as_new, a feature that was failing silently before. Judging from the JS code, it appears that continue_as_new actions should mark the orchestrator as "completed", and fixing this marker made continue_as_new work as expected.

Before merging, I could love some clarification on why this fix worked, which I believe must have to do with the expectations of durable-extention. I still do not understand why this failed silently, as I was expecting, at least, some kind of "non-determinism" exception.

@davidmrdavid davidmrdavid changed the title WIP: Fixing ContinueAsNew ContinueAsNew works, needed to mark the orchestrator as completed Jul 17, 2020
@ConnorMcMahon
Copy link

@davidmrdavid, the reason why is because of how OutOfProc handles "uncompleted" orchestrations. It essentially just waits on that thread forever, meaning that even though the orchestration is effectively finished because it calls ContinueAsNew(), it won't treat the C# side of the replay as completed.

See this code for a bit more detail: https://github.com/Azure/azure-functions-durable-extension/blob/dev/src/WebJobs.Extensions.DurableTask/Listener/OutOfProcOrchestrationShim.cs#L97.

Copy link

@ConnorMcMahon ConnorMcMahon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though we should validate isDone = true in the tests.

Copy link

@ConnorMcMahon ConnorMcMahon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregard previous review, I have some new comments.

@davidmrdavid
Copy link
Collaborator Author

Heads up, this is already being validated in the tests. One of my changes is in the tests :)

And I'll look to not return a Task

@davidmrdavid
Copy link
Collaborator Author

davidmrdavid commented Jul 29, 2020

@ConnorMcMahon , I rewrote the code such that it would enable continue_as_new to work without a yield. All we needed was some shared state, and it turned out to be fairly clean. The test now validates this scenario (no yielding) as well. Please give this another look when you get the chance :)

Copy link

@ConnorMcMahon ConnorMcMahon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new approach, but I have some questions about the details of the new approach.

@davidmrdavid
Copy link
Collaborator Author

There seems to be some confusion about my helper function. I'll try to explain it below :)

The _handle_continue_as_new helper function addresses two possible scenarios:

  1. continue_as_new is the last "durable API" usage in the code. This is the correct usage.
  2. continue_as_new was followed by some yield statement to some other durable API. This is what we call "bad usage".

For case (1), _handle_continue_as_new is called with add_to_actions as true because the only pre-existing way to add to the actions array was via a yield statement. But, since continue_as_new is now designed to work without a yield, we need to compensate for it.

Here, we return None because the orchestrator cannot return a value when continue_as_new is called and because the handle method is on the execution path to set a return value for the orchestrator.

For case (2), we know that continue_as_new was followed by some yield statement to some other Durable API. In this case, we call _handle_continue_as_new with add_to_actions as false. When we do this, _handle_continue_as_new essentially swaps, via its return statement, the yield'ed task with the continue_as_new task, and this is enough to guarantee that the pre-existing logic adds the continue_as_new to the actions array.

Said differently, here we return the continue_as_new task because the orchestrator is already in the execution path to process tasks by adding them to the actions array.

Please let me know if you believe I should be adding more comments (or anything else) to make this clearer in the code :)

@ConnorMcMahon
Copy link

I think we can simplify this given the dicussion we have had around not needing continueasnew to return a task?

@davidmrdavid
Copy link
Collaborator Author

Hmm, it does not return a Task at the moment :)
You can see that in the lack of a return statement in azure/durable_functions/tasks/continue_as_new.py
a Task is generated, but not really returned, just passed via shared state and is a Task for convenience because that helps it flow through the handle method easily.

Happy to not even construct a Task though if that's what you're opposed to. I'm not entirely convinced that the _handle_continue_as_new method is all that complex, or is it?

Happy to also discuss verbally, I realize the Python SDK may look a little foreign, so perhaps the flow of data is not entirely clear.

self.generator = fn_output

else:
if not isinstance(fn_output, Iterator):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inverted some if-else logic, which frees us from having to check for continue_as_new in so many places


def _add_to_actions(self, generation_state):
# Do not add new tasks to action if continue_as_new was called
if self.durable_context.will_continue_as_new:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not add to the actions array if continue_as_new was called

custom_status=self.durable_context.custom_status)

# No output if continue_as_new was called
if self.durable_context.will_continue_as_new:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not set output if continue_as_new was called

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly for consistency. I'm trying to avoid continue_as_new from sometimes returning something and sometimes not. Just to avoid subtle bugs

return self._function_context

@property
def will_continue_as_new(self) -> bool:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property alias exists just for readability

@davidmrdavid
Copy link
Collaborator Author

@ConnorMcMahon , I gave this one last refactor, I think it's a lot more readable and does not have the old, confusing, helper method. Let me know what you think and, if it looks good, please give it a thumbs up and GitHub so we can merge :) . Happy to take on more feedback though

Copy link

@ConnorMcMahon ConnorMcMahon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much happier with this final iteration, it's very clear to me now :)

@davidmrdavid davidmrdavid merged commit aae83c9 into dev Aug 5, 2020
@davidmrdavid davidmrdavid deleted the dajusto/continue_as_new branch August 5, 2020 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants