fix: revert #913 to restore #914 + #916 changes lost in bad squash (#920)
Walkthrough
This PR coordinates a multi-faceted evolution: it removes multi-turn transcript support from the promptfoo provider to simplify single-prompt behavior, migrates evaluation configs and documentation accordingly, introduces four new UK tax-focused evaluation specifications, extends the CLI with provider discovery and free-port selection, and implements non-interactive login mode with signal handling.
Changes
- Evaluation Framework Refactoring and New Test Specifications
- CLI Commands and Infrastructure
- Developer Setup Documentation
Estimated code review effort: 3 (Moderate), ~25 minutes
…onal-finance evals (#921)
@lobu/promptfoo-provider gains vars.transcript: string[] support: it replays sequential turns in one Lobu thread and returns the final assistant response for assertion. Single-turn callers via plain prompt are unchanged.
Migrates the 4 dormant personal-finance behavioural YAMLs (gap-surfacing, sa102-employment, sa105-property, sa108-cgt) into promptfooconfig.yaml using vars.transcript. Deletes the original YAML files.
Strictly additive atop current main (which already includes #918's tool_use SSE events). Re-do of #913 after #920 reverted that PR; the original landing accidentally undid #914 and #916 because of a bad rebase-and-soft-reset.
What
Reverts the squash commit of #913 (`69151a9d`).
Why
When I rebased #913 (promptfoo-multiturn) onto post-#918-merge main, I did a `git reset --soft origin/main` on a branch that had been pre-rebased onto `feat/tool-use-sse` — NOT onto a state that also included #914 (build-hygiene) and #916 (scaffold-dx).
The resulting `git status` diff included MY multiturn additions PLUS the inverse of #914 and #916 (because my branch didn't have them but main did). I committed all of it as "the multiturn delta" and squash-merged. That silently undid #914 and #916 in the same commit.
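To make the failure mode concrete, here is a toy model of what a soft reset stages. It is only a sketch: each tree is modelled as a plain dict of path to content, the file names are hypothetical stand-ins (not the real files touched by #914 or #916), and `staged_diff` is an illustrative helper, not a git API. The point it illustrates is that after `git reset --soft origin/main`, HEAD is main's tree while the index is still the branch's tree, so anything main has and the branch lacks shows up as a staged removal.

```python
# Toy model of `git reset --soft`: a "tree" is a dict of path -> content.
# All file names below are hypothetical placeholders.

def staged_diff(head_tree, index_tree):
    """Compare the index against HEAD, roughly what `git status` lists as staged."""
    added = sorted(p for p in index_tree if p not in head_tree)
    deleted = sorted(p for p in head_tree if p not in index_tree)
    modified = sorted(
        p for p in index_tree
        if p in head_tree and index_tree[p] != head_tree[p]
    )
    return {"added": added, "deleted": deleted, "modified": modified}

base = {"provider.ts": "single-turn"}

# main = base plus the two PRs that were merged in the meantime.
main = {**base, "build-hygiene.cfg": "#914", "scaffold-dx.md": "#916"}

# The feature branch has the multiturn work but never picked up #914 / #916.
branch = {"provider.ts": "multi-turn", "transcript.test.ts": "new"}

# Soft reset onto main: HEAD becomes main, the index stays the branch's tree.
result = staged_diff(head_tree=main, index_tree=branch)
print(result)
```

In this model the intended multiturn change appears under `added` and `modified`, while the two files standing in for #914 and #916 appear under `deleted`. Committing everything that is staged therefore records the feature and the unintended reversal in one commit.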
Damage scope
These changes are silently reverted on main HEAD (`69151a9d`):
Test plan
Follow-up
#913's multiturn work needs to be re-applied as a NEW, additive-only PR off post-revert main. That PR will carry ONLY the multiturn-specific changes (~4 files).
Summary by CodeRabbit
New Features
- `init` command now supports `--list-providers` flag and automatically selects free ports to prevent collisions.
- `login` command adds `-q/--quiet` mode for non-interactive environments with proper signal handling.
Documentation
Chores