Improve object detection task guideline #29967
Changes from all commits
````diff
@@ -46,11 +46,11 @@ The task illustrated in this tutorial is supported by the following model archit
 Before you begin, make sure you have all the necessary libraries installed:

 ```bash
-pip install -q datasets transformers evaluate timm albumentations
+pip install -q datasets transformers accelerate evaluate albumentations
 ```

 You'll use 🤗 Datasets to load a dataset from the Hugging Face Hub, 🤗 Transformers to train your model,
-and `albumentations` to augment the data. `timm` is currently required to load a convolutional backbone for the DETR model.
+and `albumentations` to augment the data.

 We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the Hub.
 When prompted, enter your token to log in:
````
|
````diff
@@ -347,6 +347,7 @@ and `id2label` maps that you created earlier from the dataset's metadata. Additi
 ...     id2label=id2label,
 ...     label2id=label2id,
 ...     ignore_mismatched_sizes=True,
+...     revision="no_timm",  # DETR models can be loaded without timm
 ... )
 ```
````
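The hunk above also mentions the `id2label` and `label2id` maps built earlier in the guide from the dataset's metadata. They are plain dictionaries mapping class indices to names and back. A minimal sketch, assuming the five CPPE-5-style category names below (illustrative, not read from the dataset):

```python
# Category names are an assumption for illustration; the guide derives
# them from the dataset's features rather than hardcoding them.
categories = ["coverall", "face_shield", "gloves", "goggles", "mask"]

# index -> name, as expected by the model config's `id2label`
id2label = {index: name for index, name in enumerate(categories)}
# name -> index, the inverse map for `label2id`
label2id = {name: index for name, index in zip(categories, id2label)}
```

Passing both maps to `from_pretrained` lets the model report human-readable class names at inference time.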
|
```diff
@@ -362,7 +363,7 @@ Face to upload your model).
 >>> training_args = TrainingArguments(
 ...     output_dir="detr-resnet-50_finetuned_cppe5",
 ...     per_device_train_batch_size=8,
-...     num_train_epochs=10,
+...     num_train_epochs=100,
 ...     fp16=True,
 ...     save_steps=200,
 ...     logging_steps=50,
```

> **Contributor:** Did you manage to investigate this? 100 epochs is very large. If we train for this many steps, do we still see a change in metrics after the first 10 epochs?

> **Contributor (Author):** cc @qubvel perhaps this would be great to investigate as part of making it easier to train object detection models
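To put the reviewer's concern in numbers: total optimization steps scale linearly with `num_train_epochs`. A back-of-the-envelope sketch, where the training-set size of 1,000 images is an assumption (only `per_device_train_batch_size=8` and `save_steps=200` come from the diff):

```python
import math

train_examples = 1000              # assumed dataset size, not from the diff
per_device_train_batch_size = 8    # from the TrainingArguments above
save_steps = 200                   # from the TrainingArguments above

# Steps per epoch: one optimizer step per batch (single device, no accumulation).
steps_per_epoch = math.ceil(train_examples / per_device_train_batch_size)

steps_10_epochs = steps_per_epoch * 10
steps_100_epochs = steps_per_epoch * 100

# Checkpoints written under each setting, given save_steps=200.
checkpoints_10 = steps_10_epochs // save_steps
checkpoints_100 = steps_100_epochs // save_steps
```

Under these assumptions, 100 epochs means roughly ten times the compute and checkpoint volume, which is why the reviewer asks whether the extra epochs still move the metrics.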
|
|
````diff
@@ -492,10 +493,10 @@ Next, prepare an instance of a `CocoDetection` class that can be used with `coco
 ...     return {"pixel_values": pixel_values, "labels": target}

->>> im_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")
+>>> image_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")

 >>> path_output_cppe5, path_anno = save_cppe5_annotation_file_images(cppe5["test"])
->>> test_ds_coco_format = CocoDetection(path_output_cppe5, im_processor, path_anno)
+>>> test_ds_coco_format = CocoDetection(path_output_cppe5, image_processor, path_anno)
 ```

 Finally, load the metrics and run the evaluation.
````
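The `save_cppe5_annotation_file_images` helper used in this hunk is defined earlier in the guide; the key idea is that `CocoDetection` reads annotations from a JSON file in COCO format. A simplified, hypothetical sketch of that file layout (field names follow the COCO spec; all sample values are invented):

```python
import json
import os
import tempfile

# Minimal COCO-style annotation file: "images", "annotations", and
# "categories" are the three top-level keys a COCO reader expects.
coco_annotations = {
    "images": [{"id": 0, "file_name": "img_0.png", "width": 640, "height": 480}],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,
            "category_id": 1,
            "bbox": [10.0, 20.0, 100.0, 50.0],  # COCO boxes are [x, y, width, height]
            "area": 100.0 * 50.0,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "mask"}],
}

path_anno = os.path.join(tempfile.mkdtemp(), "cppe5_ann.json")
with open(path_anno, "w") as f:
    json.dump(coco_annotations, f)
```

Note that COCO stores boxes as `[x, y, width, height]`, not the `(xmin, ymin, xmax, ymax)` format the evaluation code below converts predictions into.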
|
|
```diff
@@ -510,10 +511,13 @@ Finally, load the metrics and run the evaluation.
 ...     test_ds_coco_format, batch_size=8, shuffle=False, num_workers=4, collate_fn=collate_fn
 ... )

+>>> device = torch.device("cuda") if torch.cuda.is_available() else "cpu"
+>>> model.to(device)

 >>> with torch.no_grad():
 ...     for idx, batch in enumerate(tqdm(val_dataloader)):
-...         pixel_values = batch["pixel_values"]
-...         pixel_mask = batch["pixel_mask"]
+...         pixel_values = batch["pixel_values"].to(device)
+...         pixel_mask = batch["pixel_mask"].to(device)

 ...         labels = [
 ...             {k: v for k, v in t.items()} for t in batch["labels"]
```
|
```diff
@@ -523,8 +527,9 @@ Finally, load the metrics and run the evaluation.
 ...         outputs = model(pixel_values=pixel_values, pixel_mask=pixel_mask)

 ...         orig_target_sizes = torch.stack([target["orig_size"] for target in labels], dim=0)
-...         results = im_processor.post_process(outputs, orig_target_sizes)  # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)
+...         # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)
+...         results = image_processor.post_process_object_detection(outputs, threshold=0, target_sizes=orig_target_sizes)
 ...
 ...         module.add(prediction=results, reference=labels)
 ...         del batch
```
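The renamed call performs the same geometric conversion as before: DETR predicts boxes as normalized `(center_x, center_y, width, height)`, and post-processing rescales them to absolute Pascal VOC `(xmin, ymin, xmax, ymax)` pixel coordinates. A standalone sketch of that conversion (an illustration of the math, not the library's implementation):

```python
def center_to_pascal_voc(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to absolute (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    xmin = (cx - w / 2) * image_width
    ymin = (cy - h / 2) * image_height
    xmax = (cx + w / 2) * image_width
    ymax = (cy + h / 2) * image_height
    return xmin, ymin, xmax, ymax

# A box centered in a 640x480 image, spanning half of each dimension.
box = center_to_pascal_voc((0.5, 0.5, 0.5, 0.5), 640, 480)
# box == (160.0, 120.0, 480.0, 360.0)
```

Passing `threshold=0` keeps every predicted box regardless of confidence, which is what COCO-style mAP evaluation needs.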