
[PoC DO NOT MERGE] hfjobs uv run #7

Draft
davanstrien wants to merge 16 commits into lhoestq:main from davanstrien:feat/uv-run-command

Conversation

@davanstrien
Contributor

⚠️ This is a Proof of Concept PR - NOT intended for merge yet. Seeking feedback on the UX.

Most of the code was written by Claude via Claude Code. If we decide to add this feature, I will probably open a new clean PR to better align with the rest of the hfjobs code.

This PR introduces a new hfjobs uv command suite focused on making it easier to run UV scripts on HF infrastructure.

New subcommands:

  • hfjobs uv init - Create an HF dataset repository for UV scripts
  • hfjobs uv push - Push scripts to an existing repository
  • hfjobs uv sync - Sync local scripts to repository
  • hfjobs uv run - Run a UV script directly on HF infrastructure (main feature)
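For context, a "UV script" here is a plain Python file that declares its own dependencies via PEP 723 inline metadata, which uv reads before running the file. A minimal sketch (the filename and contents are illustrative, not taken from this PR):

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
"""my_script.py -- a minimal, hypothetical UV script.

With no third-party dependencies it also runs under plain Python;
`uv run my_script.py` would resolve the metadata block first.
"""
import platform


def describe_runtime() -> str:
    # Report the interpreter version the script ended up running on.
    return f"Running on Python {platform.python_version()}"


if __name__ == "__main__":
    print(describe_runtime())
```

A file like this is exactly what `hfjobs uv run my_script.py` would upload and execute.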

Key feature: hfjobs uv run

The highlight is the ability to run local UV scripts with a single command:

# Run a local UV script immediately
hfjobs uv run my_script.py

# With arguments
hfjobs uv run my_script.py arg1 arg2 --option value

# With specific hardware
hfjobs uv run my_script.py --flavor gpu-a10g-small

# Using a persistent repository for multiple runs
hfjobs uv run my_script.py --repo my-scripts

How it works:

1. Automatically uploads the script to an HF dataset repository
2. Executes using the official UV Docker container
3. Streams output back to the terminal

Repository management:

- Ephemeral repos: Auto-created for one-off runs (username/hfjobs-uv-run-TIMESTAMP-HASH)
- Persistent repos: Specified with --repo for organizing script collections
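The upload flow above can be sketched with huggingface_hub directly. This is a hypothetical sketch, not the PR's actual code: the helper names and the exact timestamp/hash formats are my assumptions, inferred from the username/hfjobs-uv-run-TIMESTAMP-HASH pattern described above.

```python
"""Hypothetical sketch of the ephemeral-repo upload flow (not the PR's code)."""
import time
import uuid


def ephemeral_repo_id(username: str) -> str:
    # Mirrors the username/hfjobs-uv-run-TIMESTAMP-HASH naming convention;
    # the concrete timestamp and hash formats here are assumptions.
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    short_hash = uuid.uuid4().hex[:8]
    return f"{username}/hfjobs-uv-run-{timestamp}-{short_hash}"


def upload_script(script_path: str, repo_id: str) -> str:
    # Requires `huggingface_hub` and a logged-in token; performs network calls.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id, repo_type="dataset", exist_ok=True)
    api.upload_file(
        path_or_fileobj=script_path,
        path_in_repo=script_path.rsplit("/", 1)[-1],
        repo_id=repo_id,
        repo_type="dataset",
    )
    return repo_id


if __name__ == "__main__":
    repo = ephemeral_repo_id("davanstrien")
    print(repo)
    # upload_script("my_script.py", repo)  # uncomment to actually upload
```

After the upload, the job runner only needs the repo id to fetch and execute the script inside the UV container.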

Questions for feedback:

1. Is the UX intuitive? Any suggestions for the command structure?
2. Should we add a --from-repo option to run scripts directly from shared repos?
3. Any concerns about the ephemeral repository approach?

It could make things simpler to enforce a single script per repo, but for related scripts, it might be nicer to have them in the same repo. 

One feature that could be quite cool is the ability to run scripts directly from a shared repo:

```bash
hfjobs uv run --from-repo davanstrien/my-cool-script --args 1 --arg2
```

This would require some thought about how to manage a default image, suggested hardware, etc. It could be done via a config file in the repo, but I'm not sure what the best approach would be. I would suggest exploring this only once we've seen some in-the-wild usage, to see how people are using and sharing scripts with hfjobs.

Try it out:

 uv pip install git+https://github.com/davanstrien/hfjobs.git@feat/uv-run-command

- Explain what UV is and how it works with hfjobs
- Add concrete hello world example with cowsay
- Document key benefits for ML workflows
- Set up document structure for future sections
- Add uv init --script command for creating templates
- Show example output of generated script
- Focus on uv add --script as recommended approach
- Document alternative package indexes with vLLM example
- Simplify Python version requirements section
- Add links to official UV documentation
- Add 'hfjobs scripts init' to create HF dataset repos for UV scripts
- Add 'hfjobs scripts push' to update scripts in existing repos
- Auto-generate README with usage instructions
- Add 'hfjobs-uv-script' tag for discovery
- Add documentation for UV script sharing

This minimal MVP makes it easy to share and run UV scripts with hfjobs.
- Rename ScriptsCommand to UvCommand
- Update imports in cli.py
- Maintains all existing functionality
- Reset README.md to upstream version
- Remove uv-script-sharing.md documentation
- Documentation will be provided in PR description
