Updated Dependencies, Better Docker Support, and Segmentation Demo #480

Open: wants to merge 10 commits into master


@tim-win tim-win commented Sep 1, 2024

This PR introduces several significant improvements and updates to code surrounding the core YOLO-World project, addressing multiple issues and enhancing overall functionality and ease of use. When I finally got out of dependency hell, I decided to put down a ladder!

Key Changes

  1. Dependency Updates:

    • Updated dependencies to match the latest recommended versions from issue #364, including torch >2.0.0 (phew!).
    • Upgraded to CUDA 12.1 to ensure compatibility with the latest GPU architectures and because thank god it works.
  2. Docker Support:

    • Took the existing Docker demo system under my wing and cleaned it right up: it automatically handles the mm* dependency issues everyone has run into, as well as torch and others required for the demo.
    • Added a build_and_run.sh script for easy building and running of Docker containers with different model configurations, matching configs to models, so no one else needs the headache I have.
  3. Segmentation Demo:

    • Added demo/segmentation_demo.py to showcase YOLO-World's open-vocabulary segmentation capabilities. The guts of it were stolen shamelessly from @onuralpszr's excellent Hugging Face Space, https://huggingface.co/spaces/onuralpszr/YOLO-World-Seg, which did not work but showed me enough to get this running.
    • Integrated segmentation support into the Docker container, allowing for easy testing and demonstration of this feature.
  4. Issue Resolutions:

    • This PR covers much of the work done in #419, bringing it up to date as of August 2024.
    • Implicitly fixes issues #279, #364, and #425.
  5. Tested Configurations:

    • Verified functionality with pretrain-x-1280ft, which performs excellently.
    • Tested the seg-l and seg-l-seghead configurations, which show good performance but don't really work well with my use case ( :/ )

Detailed Improvements

  • Refactored the Dockerfile for better efficiency and clarity.
  • Updated pyproject.toml and requirements files with pinned dependency versions.
  • Made minor changes to configuration files to remove some local paths.
  • Documented the Docker-based demo workflow.

How to Use

Users can now easily run YOLO-World demos, including the new segmentation demo, using the provided Docker build system. For example:

./build_and_run.sh pretrain-x-1280ft  # For gradio object detection demo
./build_and_run.sh seg-l              # For segmentation demo

Note: while this PR is still open, the fixes are not on master, so you have to replace this line in the Dockerfile:

RUN git clone --recursive https://github.com/AILab-CVC/YOLO-World /yolo/

With this line:

RUN git clone --recursive https://github.com/tim-win/YOLO-World /yolo/
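The one-line swap above can also be scripted rather than edited by hand. The following is a minimal sketch, not part of this PR: the function name use_fork_clone is hypothetical, and the location of the Dockerfile in your checkout is an assumption you will need to adjust.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): point the Dockerfile's clone
# step at the fork instead of upstream.
# Usage: use_fork_clone path/to/Dockerfile
use_fork_clone() {
    dockerfile="$1"
    tmp="$(mktemp)" || return 1
    # Rewrite only the upstream clone URL; every other line is left untouched.
    # A temp file is used instead of `sed -i`, whose syntax differs between
    # GNU and BSD sed.
    if sed 's#https://github.com/AILab-CVC/YOLO-World#https://github.com/tim-win/YOLO-World#' \
        "$dockerfile" > "$tmp"; then
        # Write back in place so the original file's permissions are kept.
        cat "$tmp" > "$dockerfile"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling use_fork_clone with the path to the Dockerfile before running ./build_and_run.sh should have the same effect as the manual edit. Once the PR is merged, this step is no longer needed.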

Hopefully this PR will save the people who come after me significant amounts of time. Feedback and further testing are welcome!


tim-win commented Sep 3, 2024

Pinging @wondervictor as you may be able to review!
