To-Do Tool for Goose #3902

tlongwell-block · 2025-08-06T21:13:31Z

🎯 Add To-Do Tools for Task Management in Goose

Summary

Implements session-scoped todo list functionality to help agents track and manage complex multi-step tasks. This feature provides two simple platform tools (todo__read and todo__write) that allow agents to maintain a working memory throughout their session.

Motivation

Agents often work on complex tasks that span multiple steps, files, or conversation turns. Without a way to track progress, agents may:

Forget to complete all requested steps
Lose context between operations
Struggle to communicate progress to users
Have difficulty resuming interrupted work

This PR addresses these issues by providing a simple, reliable task tracking mechanism.

Implementation Details

Core Changes

New module: crates/goose/src/agents/todo_tools.rs (~140 lines)
- Thread-safe implementation using Arc<Mutex<String>>
- Two tools: read (returns todo content) and write (replaces todo content)
- Format-agnostic: agents decide structure (markdown, plain text, etc.)
- Session-scoped: todo list exists only during agent lifetime
Integration: Minimal changes to crates/goose/src/agents/agent.rs (~20 lines)
- Registers todo tools alongside existing platform tools
- Follows established patterns exactly
System prompt: Added Task Management section to guide agent usage
- Describes when and how to use todo tools
- Provides markdown checkbox examples
- Maintains consistent tone with existing documentation
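A minimal sketch of the storage model described above, assuming a standard-library `Mutex` and a hypothetical wrapper type named `TodoStore` (neither the type name nor the method names come from the PR; the actual code lives in `todo_tools.rs` and may differ):

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical session-scoped todo store (illustrative name, not from the PR).
/// Cloning shares the same underlying string, so every clone sees the same list.
#[derive(Clone, Default)]
pub struct TodoStore {
    content: Arc<Mutex<String>>,
}

impl TodoStore {
    /// Creates an empty store; it lives only as long as its owner does.
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns a copy of the current todo content.
    pub fn read(&self) -> String {
        self.content.lock().expect("todo lock poisoned").clone()
    }

    /// Replaces the entire todo content (no append, update, or delete).
    pub fn write(&self, new_content: &str) {
        let mut guard = self.content.lock().expect("todo lock poisoned");
        *guard = new_content.to_string();
    }
}
```

The store is format-agnostic because it only ever holds an opaque string; whether that string is a markdown checklist or plain text is left to the agent.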

Testing

Comprehensive test suite: crates/goose/tests/todo_tools_test.rs (11 tests)
- Unit tests for tool creation and configuration
- Integration tests for read/write operations
- Edge cases: empty lists, large content (100KB), unicode/emoji
- Concurrency testing with 10 simultaneous operations
- All tests passing ✅
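As an illustration of what the concurrency check could look like, here is a sketch using plain OS threads and `std::sync::Mutex`. The function name `concurrent_writes` is hypothetical, and the real test suite may use an async runtime instead:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `writers` threads that each replace the shared todo content,
/// waits for all of them, and returns whatever content was written last.
pub fn concurrent_writes(writers: usize) -> String {
    let todo = Arc::new(Mutex::new(String::new()));

    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let todo = Arc::clone(&todo);
            thread::spawn(move || {
                // Each write is a full replacement performed under the lock.
                let mut guard = todo.lock().expect("todo lock poisoned");
                *guard = format!("- [ ] Task {}", i);
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("writer thread panicked");
    }

    let final_content = todo.lock().expect("todo lock poisoned").clone();
    final_content
}
```

Because every write replaces the whole string, the final content is always exactly one writer's value, never an interleaving of several; which writer wins is not deterministic.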

Design Decisions

Minimal API: Just read/write operations (no append, update, delete)
Complete replacement: Write replaces entire content for simplicity
No persistence: Session-scoped by design
No size limits: Agents self-regulate content
Thread-safe: Proper synchronization for concurrent access

Usage Example

// Agent writes a todo list
todo__write(content: "- [ ] Review code\n- [ ] Run tests\n- [ ] Update docs")

// Agent reads current todos
todo__read() // Returns: "- [ ] Review code\n- [ ] Run tests\n- [ ] Update docs"

// Agent updates with progress
todo__write(content: "- [x] Review code\n- [ ] Run tests\n- [ ] Update docs")

Quality Checklist

All tests passing (11/11)
Code compiles without warnings
Formatted with cargo fmt
Clippy clean (./scripts/clippy-lint.sh)
Thread-safe implementation verified
Documentation added to system prompt
Follows existing Goose patterns

Impact

No breaking changes: Additive feature only
Minimal footprint: ~160 lines of new code
Zero dependencies: Uses only standard library
Backward compatible: Existing agents unaffected

Files Changed

crates/goose/src/agents/agent.rs - Register todo tools
crates/goose/src/agents/todo_tools.rs - Core implementation (new)
crates/goose/src/agents/mod.rs - Module export
crates/goose/src/prompts/system.md - Task Management section
crates/goose/tests/todo_tools_test.rs - Test suite (new)

This PR delivers a simple solution for task tracking that enhances agent capabilities without adding complexity or breaking existing functionality.

crates/goose/src/agents/agent.rs

michaelneale · 2025-08-06T22:47:47Z

@tlongwell-block very nice - some ideas:

can we have it return an error if it is greater than a certain size (not sure what it is, but it should keep the list modest and brief)
is it worth looking at how to throw it in the latest message each time (but only appended to last)
is it worth benchmarking a fine grained (vs blob) version?
have you tried it with the goose bench tests (or any rough benchmark or even casual A/B tests - would be good to know).

I like this version more than mine as it is in the right place

crates/goose/src/prompts/system.md

michaelneale

I think it would be good to make this granular (and cap the number of tasks) rather than a markdown blob - that keeps the system prompt much smaller (see the old linked PR for what those signatures would look like). It would also be good to see before/after benchmarks to make sure it doesn't degrade things (or even a casual A/B on the same task). Otherwise I think this is good, but I would love @katzdave's and @DOsinga's opinions. I think we need this, but it needs to be nice and lean: just one tool with simple actions (maybe the markdown blob approach is good in that sense)

bonus: is there an obvious way to tack on the list to each latest message as it is sent (but not kept in the session)?

tlongwell-block · 2025-08-07T00:58:55Z

> @tlongwell-block very nice - some ideas:
>
> can we have it return an error if it is greater than a certain size (not sure what it is, but it should keep the list modest and brief)

Yes, though I wonder if amending the prompt to say something like "keep the todo list brief" would be sufficient. Since LLMs are the ones using the tool, they should self-limit its size. I will definitely set some tunable upper limit, though.

> is it worth looking at how to throw it in the latest message each time (but only appended to last)

We did discuss this a bit in a chat earlier. I had some concerns about how to avoid polluting the sessions file and/or context. We discussed either dynamically editing sessions to keep the to-do from being a part of every turn, or simply tacking on the todo transparently before submitting the session context to the LLM provider. Neither seemed... great.

tlongwell-block · 2025-08-07T01:40:09Z

> is it worth benchmarking a fine grained (vs blob) version?

I don't think so. The blob is just as effective at conveying the information as a structured list, perhaps even more so since individual LLMs can format it as they please.

The lack of structure won't impact the LLM's ability to read the list. And it will definitely minimize the cognitive overhead required to use it.

tlongwell-block · 2025-08-07T01:43:12Z

Add Character Limit to Todo List Tool

Summary

Implements a configurable character-based size limit for the todo list tool to prevent unbounded growth and provide clear feedback to agents about usage.

Changes

Core Implementation

Added character limit validation in dispatch_tool_call for TODO_WRITE_TOOL_NAME
- Validates content size while holding the lock (prevents race conditions)
- Rejects writes that exceed the limit with clear error message
- Returns character count on successful writes
Configuration via environment variable
- GOOSE_TODO_MAX_CHARS with default of 50,000 characters
- Setting to 0 disables the limit (unlimited)
Clean read responses
- TODO_READ_TOOL_NAME returns pure content without metadata
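The write path described above can be sketched roughly as follows. This is not the code from `dispatch_tool_call`; the helper name `write_todo` and its signature are assumptions, and only the message formats are taken from the Error Messages section of this comment:

```rust
use std::sync::Mutex;

/// Hypothetical helper: validates the size and replaces the content while
/// holding the lock, so no other writer can slip in between check and update.
/// A `max_chars` of 0 is treated as "no limit".
pub fn write_todo(
    todo: &Mutex<String>,
    content: &str,
    max_chars: usize,
) -> Result<String, String> {
    let mut guard = todo
        .lock()
        .map_err(|_| "todo lock poisoned".to_string())?;

    // Count characters (Unicode scalar values), not bytes.
    let count = content.chars().count();
    if max_chars > 0 && count > max_chars {
        // Rejected writes leave the existing list untouched.
        return Err(format!(
            "Todo list too large: {} chars (max: {})",
            count, max_chars
        ));
    }

    *guard = content.to_string();
    Ok(format!("Updated ({} chars)", count))
}
```

Returning the character count on success gives the agent a running sense of how close the list is to the limit without adding metadata to reads.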

Files Modified

crates/goose/src/agents/agent.rs: Added limit validation and helper function
crates/goose/tests/todo_tools_test.rs: Added comprehensive tests

Testing

Added tests for:

Character limit enforcement
Character count in write responses
Clean read responses (no metadata)
Unlimited mode with GOOSE_TODO_MAX_CHARS=0
Unicode character counting
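To show why Unicode counting deserves its own test, here is a small sketch that assumes "characters" means Unicode scalar values as counted by `str::chars` (the PR does not spell out the counting method, and `char_len` is a hypothetical name):

```rust
/// Counts Unicode scalar values rather than UTF-8 bytes.
pub fn char_len(s: &str) -> usize {
    s.chars().count()
}
```

For a string such as `"✅ done"`, `char_len` returns 6 while `str::len` returns 8, because the check mark occupies three bytes in UTF-8. A byte-based limit would therefore reject emoji-heavy lists earlier than a character-based one.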

Benefits

Prevents memory issues from unbounded todo list growth
Clear feedback to agents about size constraints
Thread-safe validation with proper lock ordering
Configurable via environment variable
Simple implementation with minimal code changes (~20 lines)

Configuration

Set the environment variable to customize the limit:

export GOOSE_TODO_MAX_CHARS=100000  # Allow up to 100k characters
export GOOSE_TODO_MAX_CHARS=0       # Disable limit (unlimited)

Default is 50,000 characters (~12,500 tokens).
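One way the limit could be resolved from the environment is sketched below. The function and constant names are hypothetical, and falling back to the default on an unparsable value is an assumption rather than documented behavior:

```rust
use std::env;

/// Default limit stated in this comment.
const DEFAULT_TODO_MAX_CHARS: usize = 50_000;

/// Parses a raw setting. Missing or unparsable values fall back to the
/// default (assumed behavior); "0" is passed through and means unlimited.
pub fn parse_max_chars(raw: Option<&str>) -> usize {
    raw.and_then(|value| value.trim().parse::<usize>().ok())
        .unwrap_or(DEFAULT_TODO_MAX_CHARS)
}

/// Reads GOOSE_TODO_MAX_CHARS from the process environment.
pub fn todo_max_chars() -> usize {
    parse_max_chars(env::var("GOOSE_TODO_MAX_CHARS").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the rule testable without mutating process-wide state, which matters when tests run in parallel.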

Error Messages

When limit is exceeded:

"Todo list too large: 51234 chars (max: 50000)"

On successful write:

"Updated (8456 chars)"

Migration

No breaking changes for existing sessions
Todo lists are session-scoped and not persisted
Generous default limit unlikely to affect normal usage

michaelneale · 2025-08-07T03:54:50Z

yeah I wouldn't worry about making a variable for the limit - as long as there is one - say 2000 chars (and it should throw back an error if it tries to update one that is larger telling it to be briefer)

katzdave

Nice job, really excited about this change!

crates/goose/src/agents/agent.rs

crates/goose/src/prompts/system.md

Done

tlongwell-block added 3 commits August 6, 2025 15:44

To-Do Tool for Goose

4d3580d

To-Do Tool for Goose

e1e4144

system prompt update

c0ea20f

michaelneale self-assigned this Aug 6, 2025

michaelneale reviewed Aug 6, 2025

michaelneale reviewed Aug 6, 2025

michaelneale mentioned this pull request Aug 6, 2025

feat: task tracker for session efficiency #3848

Closed

michaelneale assigned katzdave Aug 6, 2025

michaelneale previously requested changes Aug 7, 2025

michaelneale added labels Aug 7, 2025: status: in progress; p1 (Priority 1 - High, supports roadmap); performance (Performance related)

upper size limit for todo list. Size hint for LLM when writing it

0d084e7

tlongwell-block and others added 5 commits August 7, 2025 09:59

update comments

3353614

comment fixes

189af99

render output

42d1a1d

oops

a68d56b

stronger system prompt

847b03e

michaelneale removed their assignment Aug 8, 2025

tlongwell-block and others added 6 commits August 8, 2025 10:17

trim system prompt

a54d3ae

Merge branch 'main' into tlongwell/todo

f98e87c

heed clippy and move map_or to is_some_and

5329d9a

format

c66d06d

test fix

31e1886

test env var handling

d0fd64a

serialize checks

063caeb

katzdave approved these changes Aug 11, 2025

update prompt just a little

4af80ec

tlongwell-block merged commit 566b9dc into main Aug 11, 2025
11 checks passed

tlongwell-block deleted the tlongwell/todo branch August 11, 2025 19:57

alexhancock mentioned this pull request Aug 13, 2025

chore(release): release version 1.4.0 #4069

Merged

ayax79 pushed a commit to ayax79/goose that referenced this pull request Aug 21, 2025

To-Do Tools (block#3902)

932ade3

Co-authored-by: David Katz <[email protected]> Signed-off-by: Jack Wright <[email protected]>
