Merged
23 changes: 14 additions & 9 deletions docs/source/en/tasks/object_detection.md
@@ -46,11 +46,11 @@ The task illustrated in this tutorial is supported by the following model archit
Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install -q datasets transformers evaluate timm albumentations
pip install -q datasets transformers accelerate evaluate albumentations
```

You'll use 🤗 Datasets to load a dataset from the Hugging Face Hub, 🤗 Transformers to train your model,
and `albumentations` to augment the data. `timm` is currently required to load a convolutional backbone for the DETR model.
and `albumentations` to augment the data.
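
As a rough sketch (not part of this diff), one way to confirm that the packages from the updated install command are importable is to probe for them with the standard-library `importlib`. The helper name `missing_packages` is hypothetical and introduced here only for illustration.

```python
import importlib.util

# Package names taken from the updated `pip install` line in this diff.
REQUIRED = ["datasets", "transformers", "accelerate", "evaluate", "albumentations"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Note that `timm` is deliberately absent from the list, since the change drops it from the install command.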

We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the Hub.
When prompted, enter your token to log in:
@@ -347,6 +347,7 @@ and `id2label` maps that you created earlier from the dataset's metadata. Additi
... id2label=id2label,
... label2id=label2id,
... ignore_mismatched_sizes=True,
... revision="no_timm", # DETR models can be loaded without timm
... )
```
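
For context, the `id2label` and `label2id` arguments passed above are plain dictionaries that invert each other. The following is a minimal sketch of how such maps could be built; the hard-coded category list is an assumption made for illustration, whereas the tutorial derives the names from the dataset's metadata.

```python
# Assumed category names, for illustration only; the tutorial reads them
# from the dataset's metadata instead of hard-coding them.
categories = ["Coverall", "Face_Shield", "Gloves", "Goggles", "Mask"]

# Map integer class ids to names, then invert the map for the reverse lookup.
id2label = {index: name for index, name in enumerate(categories)}
label2id = {name: index for index, name in id2label.items()}

print(id2label)
print(label2id)
```

Passing both maps lets the model report human-readable class names instead of bare integer ids.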

@@ -362,7 +363,7 @@ Face to upload your model).
>>> training_args = TrainingArguments(
... output_dir="detr-resnet-50_finetuned_cppe5",
... per_device_train_batch_size=8,
... num_train_epochs=10,
... num_train_epochs=100,
Contributor

Did you manage to investigate this? 100 epochs is very large. If we train for this many steps do we still see a change in metrics over the first 10 epochs?

Contributor Author

cc @qubvel, perhaps this would be great to investigate as part of making it easier to train object detection models.

... fp16=True,
... save_steps=200,
... logging_steps=50,
@@ -492,10 +493,10 @@ Next, prepare an instance of a `CocoDetection` class that can be used with `coco
... return {"pixel_values": pixel_values, "labels": target}


>>> im_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")
>>> image_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")

>>> path_output_cppe5, path_anno = save_cppe5_annotation_file_images(cppe5["test"])
>>> test_ds_coco_format = CocoDetection(path_output_cppe5, im_processor, path_anno)
>>> test_ds_coco_format = CocoDetection(path_output_cppe5, image_processor, path_anno)
```
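
The evaluation loop in the next hunk converts predictions to Pascal VOC format `(xmin, ymin, xmax, ymax)`, while COCO annotations store boxes as `(x_min, y_min, width, height)`. The sketch below illustrates the relationship between the two layouts; the function names are hypothetical and are not part of the tutorial or of any library.

```python
def coco_to_pascal_voc(box):
    """Convert (x_min, y_min, width, height) to (xmin, ymin, xmax, ymax)."""
    x_min, y_min, width, height = box
    return (x_min, y_min, x_min + width, y_min + height)


def pascal_voc_to_coco(box):
    """Convert (xmin, ymin, xmax, ymax) back to (x_min, y_min, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


coco_box = (10, 20, 30, 40)
voc_box = coco_to_pascal_voc(coco_box)
print(voc_box)  # (10, 20, 40, 60)
```

Converting back with `pascal_voc_to_coco` recovers the original box, which makes a quick sanity check when comparing predictions against annotations.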

Finally, load the metrics and run the evaluation.
@@ -510,10 +511,13 @@ Finally, load the metrics and run the evaluation.
... test_ds_coco_format, batch_size=8, shuffle=False, num_workers=4, collate_fn=collate_fn
... )

>>> device = torch.device("cuda") if torch.cuda.is_available() else "cpu"
>>> model.to(device)

>>> with torch.no_grad():
... for idx, batch in enumerate(tqdm(val_dataloader)):
... pixel_values = batch["pixel_values"]
... pixel_mask = batch["pixel_mask"]
... pixel_values = batch["pixel_values"].to(device)
... pixel_mask = batch["pixel_mask"].to(device)

... labels = [
... {k: v for k, v in t.items()} for t in batch["labels"]
@@ -523,8 +527,9 @@ Finally, load the metrics and run the evaluation.
... outputs = model(pixel_values=pixel_values, pixel_mask=pixel_mask)

... orig_target_sizes = torch.stack([target["orig_size"] for target in labels], dim=0)
... results = im_processor.post_process(outputs, orig_target_sizes) # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)

... # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)
... results = image_processor.post_process_object_detection(outputs, threshold=0, target_sizes=orig_target_sizes)
...
... module.add(prediction=results, reference=labels)
... del batch
