"Non-Deterministic workflow detected: TaskScheduledEvent: 0 TaskScheduled" ERROR when activity task failed #189

@Josephjxu

Hello,

I am running into issues with a Durable Functions orchestration. The function is written in Python (listed below), but my questions concern Durable Functions and the related tables in general.

import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    request = context.get_input()

    retry_options = df.RetryOptions(first_retry_interval_in_milliseconds=5000, max_number_of_attempts=3)
    parallel_tasks = []

    my_request = request.copy()
    context.set_custom_status("Creating Tasks")
    task_messages = yield context.call_activity_with_retry("task_a_activity", retry_options=retry_options, input_=my_request)

    context.set_custom_status("Preparing Tasks")
    for task_request in task_messages:
        parallel_tasks.append(context.call_activity_with_retry("task_b_activity", retry_options=retry_options, input_=task_request))

    context.set_custom_status(f"Running on {len(task_messages)} tasks")
    outputs = yield context.task_all(parallel_tasks)

    context.set_custom_status("Completed")
    complete_request = request.copy()
    yield context.call_activity("task_c_activity", complete_request)
    return "Orchestration Completed"

main = df.Orchestrator.create(orchestrator_function)

A single orchestration request can generate hundreds of activity tasks (I/O- and CPU-bound), each running for a few minutes. Occasionally a runtime error in task_b_activity causes a taskFailed error (e.g., the Python process exits unexpectedly). However, the orchestration sometimes shows the following error:

**Non-Deterministic workflow detected: TaskScheduledEvent: 0 TaskScheduled task_a_activity**
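For context, here is a toy model of replay-based determinism checking, the general kind of check that produces messages like the one above. It is a simplified sketch, not the Durable Functions implementation: `replay`, `NonDeterminismError`, the string-yielding orchestrators, and the `(name, result)` history format are all hypothetical, and it models only sequential activities (no fan-out or retries).

```python
class NonDeterminismError(Exception):
    """Raised when replayed code diverges from the recorded history."""


def replay(orchestrator, history):
    """Replay a generator-based orchestrator against a recorded history.

    history is a list of (activity_name, result) tuples in the order the
    activities were originally scheduled (a hypothetical format).
    Returns ("completed", value) or ("pending", next_activity_name).
    """
    gen = orchestrator()
    result = None
    for event_id, (recorded_name, recorded_result) in enumerate(history):
        try:
            # Resume the orchestrator; it yields the next activity name.
            scheduled_name = gen.send(result)
        except StopIteration:
            raise NonDeterminismError(
                f"history expects TaskScheduled {recorded_name} at event "
                f"{event_id}, but the orchestrator already finished"
            )
        if scheduled_name != recorded_name:
            # The code asked for something the history does not contain.
            raise NonDeterminismError(
                f"TaskScheduledEvent: {event_id} TaskScheduled {recorded_name} "
                f"(code scheduled {scheduled_name})"
            )
        # Feed the recorded result back instead of re-running the activity.
        result = recorded_result
    try:
        next_name = gen.send(result)
    except StopIteration as done:
        return ("completed", done.value)
    return ("pending", next_name)


def stable_orchestrator():
    # Same shape as the orchestrator above, reduced to sequential yields.
    messages = yield "task_a_activity"
    for _ in messages:
        yield "task_b_activity"
    yield "task_c_activity"
    return "Orchestration Completed"


def changed_orchestrator():
    # Hypothetical variant whose first scheduled activity differs from history.
    yield "task_x_activity"
    return "Orchestration Completed"


if __name__ == "__main__":
    recorded = [("task_a_activity", ["m1", "m2"]), ("task_b_activity", "ok")]
    print(replay(stable_orchestrator, recorded))
    try:
        replay(changed_orchestrator, recorded)
    except NonDeterminismError as err:
        print("Non-Deterministic workflow detected:", err)
```

In this model the check fails at event 0 whenever the first activity the code schedules during replay differs from the first one recorded, regardless of which later activity actually failed. Whether the real extension reaches the error the same way in my case is exactly what I am unsure about.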

My questions are:

  1. What exactly does the error "Non-Deterministic workflow detected: TaskScheduledEvent: 0 TaskScheduled XYZ" mean?
  2. Is there anything wrong with the above code?
  3. The error occurred in task_b, so why does the orchestration report an error on task_a?
  4. When the orchestration failed, I noticed that the orchestration history table was no longer updated, but all activity functions in the work queue were still being picked up. Is this expected? Is there a way to keep updating the orchestration history table while activity functions are still running?
  5. The function runs on App Service with a VNet. Is there a way to verify whether we are using the latest Durable extension?

Your assistance is greatly appreciated.

Best regards.

Labels: bug (Something isn't working)
