@tadasant

Description

This PR adds foundational support for configurable task timeouts in sub-recipes by allowing the task_timeout field to be stored in task payloads and providing a method to extract it.

Motivation

Currently, all sub-recipe tasks use a global default timeout. Some sub-recipes may legitimately need more time to complete, such as:

  • Long-running data processing tasks
  • Complex build processes
  • Tasks that involve downloading large files
  • Integration tests that take significant time

This PR lays the groundwork for implementing task-specific timeouts by:

  1. Storing the timeout value in the task payload when specified
  2. Providing a method to extract the timeout from tasks

Implementation Details

Changes Made

  1. Modified create_tasks_from_params in sub_recipe_tools.rs:

    • Extracts task_timeout from sub_recipe values if present
    • Adds the timeout to the task payload when creating tasks
    • Gracefully handles invalid timeout values
  2. Added get_task_timeout() method to Task struct (see the sketch after this list):

    • Returns Option<u64> with the timeout value in seconds
    • Supports both sub_recipe and text_instruction task types
    • Returns None if no timeout is specified or if the value is invalid
  3. Added comprehensive tests:

    • Tests for creating tasks with and without timeouts
    • Tests for invalid timeout values
    • Tests for multiple tasks sharing the same timeout
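As a rough illustration, the accessor might look something like the following. The struct fields and payload layout here are assumptions for illustration, not the actual Task definition in the codebase:

use serde_json::Value;

// Assumed shape of Task for illustration; the real struct differs.
pub struct Task {
    pub task_type: String, // "sub_recipe" or "text_instruction"
    pub payload: Value,
}

impl Task {
    /// Returns the configured timeout in seconds, or None if the field
    /// is absent or its value cannot be read as a u64.
    pub fn get_task_timeout(&self) -> Option<u64> {
        self.payload.get("task_timeout").and_then(|v| match v {
            Value::Number(n) => n.as_u64(),
            Value::String(s) => s.parse::<u64>().ok(),
            _ => None,
        })
    }
}

Accepting both numeric and string values keeps the accessor tolerant of however the YAML value ends up serialized into the payload.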

Usage Example

Users can specify a timeout in their sub-recipe configuration:

sub_recipes:
  - name: long_running_build
    path: recipes/build.yaml
    values:
      task_timeout: "7200"  # 2 hours
      target: "release"
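Inside create_tasks_from_params, the extraction can plausibly reduce to a single parse that yields None for invalid input. This helper is hypothetical; the real function works over the full sub-recipe structure:

use std::collections::HashMap;

// Hypothetical helper: read task_timeout from the sub-recipe values map.
// A non-numeric string such as "soon" parses to None and is ignored.
fn extract_task_timeout(values: &HashMap<String, String>) -> Option<u64> {
    values.get("task_timeout").and_then(|s| s.parse::<u64>().ok())
}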

Next Steps

This PR only adds the ability to store and retrieve timeout values. The actual timeout enforcement would need to be implemented in the task execution layer, taking into account the current architecture's use of TaskConfig and cancellation tokens.
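As a sketch of what that enforcement could look like once wired in, building on the hypothetical Task above (run_task stands in for the real execution path and is not an existing function):

use std::time::Duration;
use tokio::time::timeout;

async fn run_task(_task: &Task) -> Result<(), String> {
    // Placeholder for the real execution layer (TaskConfig, cancellation tokens).
    Ok(())
}

// Wrap execution in a timeout only when the task specifies one;
// tasks without a timeout keep today's unbounded behaviour.
async fn execute(task: &Task) -> Result<(), String> {
    match task.get_task_timeout() {
        Some(secs) => timeout(Duration::from_secs(secs), run_task(task))
            .await
            .map_err(|_| format!("task exceeded {secs}s timeout"))?,
        None => run_task(task).await,
    }
}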

Testing

All new functionality is covered by unit tests that verify the following (a representative sketch follows this list):

  • Timeout values are correctly extracted and stored
  • Invalid timeouts are gracefully ignored
  • The feature is backward compatible (tasks without timeouts work as before)
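For flavour, a representative test over the hypothetical Task sketch above might look like this:

#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn timeout_handles_valid_invalid_and_absent_values() {
        let valid = Task {
            task_type: "sub_recipe".into(),
            payload: json!({ "task_timeout": "7200" }),
        };
        assert_eq!(valid.get_task_timeout(), Some(7200));

        let invalid = Task {
            task_type: "sub_recipe".into(),
            payload: json!({ "task_timeout": "soon" }),
        };
        assert_eq!(invalid.get_task_timeout(), None);

        let absent = Task {
            task_type: "text_instruction".into(),
            payload: json!({}),
        };
        assert_eq!(absent.get_task_timeout(), None);
    }
}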

Note on Architecture

The current main branch has a different architecture than v1.1.4, using TaskConfig and cancellation tokens for task management. This PR provides the data foundation that can be integrated with the existing timeout/cancellation system in a future update.
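One plausible integration point, assuming each running task already holds a tokio_util CancellationToken (the arm_timeout helper is illustrative, not part of the codebase):

use std::time::Duration;
use tokio_util::sync::CancellationToken;

// Cancel the task's token once the per-task deadline passes, letting the
// existing cancellation machinery handle the actual teardown.
fn arm_timeout(token: &CancellationToken, timeout_secs: u64) {
    let token = token.clone();
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_secs(timeout_secs)).await;
        token.cancel();
    });
}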

- Add task_timeout extraction from sub_recipe values in create_tasks_from_params
- Add get_task_timeout() method to Task struct for future timeout implementation
- Include comprehensive tests for timeout field handling
- Support both sub_recipe and text_instruction task types

This commit adds the foundation for configurable task timeouts by storing the
timeout value in task payloads. The actual timeout enforcement will need to be
implemented in the task execution layer based on the current architecture.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Collaborator

@DOsinga DOsinga left a comment

nice

@DOsinga
Collaborator

DOsinga commented Aug 2, 2025

let me know if you want to merge this since it says draft

@tadasant
Author

tadasant commented Aug 3, 2025

@DOsinga - @lifeizhou-ap mentioned to me here that

We used to include a default timeout for running sub-recipe tasks. However, we received similar feedback to yours, so we removed the default timeout, and we don't enable configuration for the timeout at the moment. So task_timeout is not a valid configuration in the codebase. However, the change has not been released yet.

So my understanding is that we don't want the feature in this PR, but maybe I'm not reading that correctly?

Note that this is in draft because it was entirely coded by Claude Code (I don't have any Rust experience). I'm pretty sure it works as intended, though Lifei's comment makes me wonder if my manual testing was skewed because the timeout is already off by default? I'd need to do more testing to feel confident it achieves the intended behavior relative to the current behavior.

I'm not sure I will have the capacity to shepherd this change through imminently; if someone wants to take over or implement it separately, please go ahead.

@michaelneale michaelneale marked this pull request as ready for review August 12, 2025 02:47
@michaelneale michaelneale self-assigned this Aug 12, 2025
@michaelneale
Collaborator

cc @lifeizhou-ap do you want to take a closer look and indicate if you think it is ok to include?

@lifeizhou-ap
Collaborator

lifeizhou-ap commented Aug 13, 2025

Hi @tadasant @michaelneale

I added the timeout in this PR originally but later removed it based on the comments. I removed the global default timeout on both the configuring side and the applying side. So at the moment (since release 1.2.0) there are no timeout constraints on running sub-recipes, and long-running sub-recipes can execute.

If I understand correctly, this PR only sets the timeout. So we need to add the "applying timeout" code back.

Now the question is whether we would like to add a customised timeout. It would be good to add, but it would also be good to hear users' feedback first. @tadasant, could you check whether the current behaviours are good enough? Thank you!

@michaelneale
Collaborator

thanks @lifeizhou-ap - I might close this for now - @tadasant feel free to re-open or ping (it will probably need updating again) if you still want this.

@tadasant
Author

Yeah, as a user I don't feel strongly that a timeout needs to exist; I wrote this PR when we were in that in-between phase where the "set the timeout" step didn't have parity with the "apply the timeout" step. But now that the timeout is off by default, I think we can skip this.
