This repository contains the code supporting the OWLv2 base model for use with Autodistill.
OWLv2 is a zero-shot object detection model that builds on the OWL-ViT architecture. OWLv2 has an open vocabulary, which means you can provide arbitrary text prompts to the model. You can use OWLv2 with autodistill for object detection.
Read the full Autodistill documentation.
Read the OWLv2 Autodistill documentation.
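Because the vocabulary is open, the caption you send to the model does not have to match the class label saved in your dataset. As a quick illustration (the prompt and label below are hypothetical examples, not from this repository):

```python
from autodistill.detection import CaptionOntology

# map a descriptive open-vocabulary prompt to the short class label
# that will be written to the generated annotations; both strings
# here are illustrative examples
ontology = CaptionOntology({"person wearing a hard hat": "worker"})
```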
To use OWLv2 with autodistill, you need to install the following dependency:
```bash
pip3 install autodistill-owlv2
```
```python
from autodistill_owlv2 import OWLv2
from autodistill.detection import CaptionOntology
from autodistill.utils import plot
import cv2

# define an ontology to map class names to our OWLv2 prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the
# label that will be saved for that caption in the generated annotations
# then, load the model
base_model = OWLv2(
    ontology=CaptionOntology(
        {
            "person": "person",
            "dog": "dog"
        }
    )
)

# run inference on a single image
results = base_model.predict("dog.jpeg")

plot(
    image=cv2.imread("dog.jpeg"),
    classes=base_model.ontology.classes(),
    detections=results
)
```
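Autodistill detection base models return results as supervision `Detections` objects, so you can post-process predictions before plotting or saving them. Here is a minimal sketch that keeps only high-confidence predictions (the `0.5` threshold is an arbitrary example value, not a model default):

```python
# `results` is a supervision Detections object, so it supports
# boolean indexing over its confidence array; 0.5 is an
# arbitrary example threshold, not a model default
filtered = results[results.confidence > 0.5]

print(f"kept {len(filtered)} of {len(results)} detections")
```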
```python
# label a folder of images
base_model.label("./context_images", extension=".jpeg")
```
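From here, the typical Autodistill workflow is to train a smaller target model on the labeled dataset. Below is a sketch of that step, assuming the optional `autodistill-yolov8` package is installed and that `label()` wrote its output to the default `./context_images_labeled` folder:

```python
from autodistill_yolov8 import YOLOv8

# train a YOLOv8 target model on the auto-labeled dataset;
# the dataset path assumes label() used its default output folder
target_model = YOLOv8("yolov8n.pt")
target_model.train("./context_images_labeled/data.yaml", epochs=50)
```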
This model is licensed under an Apache 2.0 license (see the original model implementation license and the corresponding HuggingFace Transformers documentation for more information).
We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!