Binary file removed assets/56_fine_tune_segformer/output.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/pizza-scene.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/segformer.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/sidewalk-examples.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/sidewalk-labeling-crop-poster.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/sidewalk-labeling-crop.mp4
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/widget-poster.png
Binary file not shown.
Binary file removed assets/56_fine_tune_segformer/widget.mp4
Binary file not shown.
22 changes: 11 additions & 11 deletions fine-tune-segformer.md
@@ -14,7 +14,7 @@ authors:

<script async defer src="https://unpkg.com/medium-zoom-element@0/dist/medium-zoom-element.min.js"></script>

<a target="_blank" href="https://colab.research.google.com/drive/1MdkavsjGHYcuGyjmsf9wmeAK3WvtYLty?usp=sharing">
<a target="_blank" href="https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/56_fine_tune_segformer.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

@@ -27,7 +27,7 @@ Because semantic segmentation is a type of classification, the network architect
[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer) is a model for semantic segmentation introduced by Xie et al. in 2021. It has a hierarchical Transformer encoder that doesn't use positional encodings (in contrast to ViT) and a simple multi-layer perceptron decoder. SegFormer achieves state-of-the-art performance on multiple common datasets. Let's see how our pizza delivery robot performs for sidewalk images.
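
If you'd like to get a feel for SegFormer before fine-tuning anything, the minimal sketch below runs an off-the-shelf checkpoint on a single image. The checkpoint name and image path are assumptions for illustration; any SegFormer checkpoint on the Hub works the same way.

```python
from PIL import Image
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation

# Assumed checkpoint and local image path, purely for illustration
checkpoint = "nvidia/segformer-b0-finetuned-ade-512-512"
feature_extractor = SegformerFeatureExtractor.from_pretrained(checkpoint)
model = SegformerForSemanticSegmentation.from_pretrained(checkpoint)

image = Image.open("sidewalk.jpg")
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# The logits have shape (batch_size, num_labels, height / 4, width / 4)
print(outputs.logits.shape)
```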

<figure class="image table text-center m-0 w-full">
<medium-zoom background="rgba(0,0,0,.7)" alt="Pizza delivery robot segmenting a scene" src="assets/56_fine_tune_segformer/pizza-scene.png"></medium-zoom>
<medium-zoom background="rgba(0,0,0,.7)" alt="Pizza delivery robot segmenting a scene" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/pizza-scene.png"></medium-zoom>
</figure>

Let's get started by installing the necessary dependencies. Because we're going to push our dataset and model to the Hugging Face Hub, we need to install [Git LFS](https://git-lfs.github.com/) and log in to Hugging Face.
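
As a rough sketch of that setup, a notebook cell along these lines installs the Python dependencies and logs in to the Hub. The exact package list is an assumption for illustration, not the canonical install cell.

```python
# In a notebook cell; the package list is an assumption for illustration
# !pip install -q transformers datasets segments-ai
# Git LFS itself is installed at the system level, e.g. via your OS package manager

from huggingface_hub import notebook_login

notebook_login()  # prompts for a Hugging Face access token with write access
```
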
@@ -59,7 +59,7 @@ To create your semantic segmentation dataset, you'll need two things:
We went ahead and captured a thousand images of sidewalks in Belgium. Collecting and labeling such a dataset can take a long time, so you can start with a smaller dataset and expand it if the model does not perform well enough.

<figure class="image table text-center m-0 w-full">
<medium-zoom background="rgba(0,0,0,.7)" alt="Example images from the sidewalk dataset" src="assets/56_fine_tune_segformer/sidewalk-examples.png"></medium-zoom>
<medium-zoom background="rgba(0,0,0,.7)" alt="Example images from the sidewalk dataset" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-examples.png"></medium-zoom>
<figcaption>Some examples of the raw images in the sidewalk dataset.</figcaption>
</figure>

@@ -68,7 +68,7 @@ To obtain segmentation labels, we need to indicate the classes of all the region
### Set up the labeling task on Segments.ai

First, create an account at [https://segments.ai/join](https://segments.ai/join?utm_source=hf&utm_medium=colab&utm_campaign=sem_seg).
Next, create a new dataset and upload your images. You can either do this from the web interface or via the Python SDK (see the [notebook](https://colab.research.google.com/drive/1BImTyBjW3KtvHGVcjGpYYFZdRGXzM3-j?usp=sharing)).
Next, create a new dataset and upload your images. You can either do this from the web interface or via the Python SDK (see the [notebook](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/56_fine_tune_segformer.ipynb)).
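
If you go the SDK route, a minimal sketch looks roughly like the following. The API key, dataset name, and file names are placeholders, and the exact keyword arguments can differ between versions of the `segments-ai` package.

```python
from segments import SegmentsClient

client = SegmentsClient("YOUR_SEGMENTS_API_KEY")  # placeholder API key

# Create a dataset configured for bitmap (semantic) segmentation labels
client.add_dataset(
    "sidewalk-imagery",  # placeholder dataset name
    description="Sidewalk scenes for semantic segmentation",
    task_type="segmentation-bitmap",
)

# Upload an image as an asset, then register it as a sample
with open("image_0001.jpg", "rb") as f:  # placeholder file name
    asset = client.upload_asset(f, filename="image_0001.jpg")

client.add_sample(
    "your-username/sidewalk-imagery",  # full dataset identifier
    name="image_0001.jpg",
    attributes={"image": {"url": asset.url}},
)
```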


### Label the images
@@ -81,7 +81,7 @@ Now that the raw data is loaded, go to [segments.ai/home](https://segments.ai/ho
style="max-width: 70%; margin: auto;"
autoplay loop autobuffer muted playsinline
>
<source src="assets/56_fine_tune_segformer/sidewalk-labeling-crop.mp4" poster="assets/56_fine_tune_segformer/sidewalk-labeling-crop-poster.png" type="video/mp4">
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-labeling-crop.mp4" poster="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-labeling-crop-poster.png" type="video/mp4">
</video>
<figcaption>Tip: when using the superpixel tool, scroll to change the superpixel size, and click and drag to select segments.</figcaption>
</figure>
@@ -92,7 +92,7 @@ When you're done labeling, create a new dataset release containing the labeled d

Note that creating the release can take a few seconds. You can check the releases tab on Segments.ai to see whether your release is still being created.

Now, we'll convert the release to a [Hugging Face dataset](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset) via the Segments.ai Python SDK. If you haven't set up the Segments Python client yet, follow the instructions in the "Set up the labeling task on Segments.ai" section of the [notebook](https://colab.research.google.com/drive/1BImTyBjW3KtvHGVcjGpYYFZdRGXzM3-j#scrollTo=9T2Jr9t9y4HD).
Now, we'll convert the release to a [Hugging Face dataset](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset) via the Segments.ai Python SDK. If you haven't set up the Segments Python client yet, follow the instructions in the "Set up the labeling task on Segments.ai" section of the [notebook](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/56_fine_tune_segformer.ipynb#scrollTo=9T2Jr9t9y4HD).

*Note that the conversion can take a while, depending on the size of your dataset.*
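
As a rough sketch of that conversion (the API key, dataset identifier, and release name below are placeholders), the SDK's Hugging Face integration looks roughly like this:

```python
from segments import SegmentsClient
from segments.huggingface import release2dataset

client = SegmentsClient("YOUR_SEGMENTS_API_KEY")  # placeholder API key
release = client.get_release("your-username/sidewalk-imagery", "v0.1")  # placeholder identifiers

# Convert the Segments.ai release into a Hugging Face Dataset
hf_dataset = release2dataset(release)
print(hf_dataset)
```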

@@ -192,7 +192,7 @@ repo_id = f"datasets/{hf_dataset_identifier}"
filename = "id2label.json"
id2label = json.load(open(hf_hub_download(repo_id=hf_dataset_identifier, filename=filename, repo_type="dataset"), "r"))
id2label = {int(k): v for k, v in id2label.items()}
label2id = {v: k for k, v in id2label.items()
label2id = {v: k for k, v in id2label.items()}

num_labels = len(id2label)
```
@@ -237,7 +237,7 @@ test_ds.set_transform(val_transforms)
The SegFormer authors define 5 models with increasing sizes: B0 to B5. The following chart (taken from the original paper) shows the performance of these different models on the ADE20K dataset, compared to other models.
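
To make the size trade-off concrete, here is a sketch of how a pre-trained encoder of your chosen size is loaded for fine-tuning. The label mappings come from the `id2label.json` step above, and the checkpoint names follow the `nvidia/mit-b*` pattern on the Hub.

```python
from transformers import SegformerForSemanticSegmentation

# "nvidia/mit-b0" is the smallest encoder; "nvidia/mit-b1" through "nvidia/mit-b5"
# trade extra accuracy for more parameters and slower training and inference.
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0",
    id2label=id2label,  # mappings built from id2label.json earlier
    label2id=label2id,
)
```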

<figure class="image table text-center m-0 w-full">
<medium-zoom background="rgba(0,0,0,.7)" alt="SegFormer model variants compared with other segmentation models" src="assets/56_fine_tune_segformer/segformer.png"></medium-zoom>
<medium-zoom background="rgba(0,0,0,.7)" alt="SegFormer model variants compared with other segmentation models" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/segformer.png"></medium-zoom>
<figcaption><a href="https://arxiv.org/abs/2105.15203">Source</a></figcaption>
</figure>

@@ -324,7 +324,7 @@ def compute_metrics(eval_pred):
references=labels,
num_labels=len(id2label),
ignore_index=0,
reduce_labels=feature_extractor.reduce_labels,
reduce_labels=feature_extractor.do_reduce_labels,
)

# add per category metrics as individual key-value pairs
@@ -387,7 +387,7 @@ However, you can also try out your model directly on the Hugging Face Hub, thank
style="max-width: 70%; margin: auto;"
autoplay loop autobuffer muted playsinline
>
<source src="assets/56_fine_tune_segformer/widget.mp4" poster="assets/56_fine_tune_segformer/widget-poster.png" type="video/mp4">
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/widget.mp4" poster="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/widget-poster.png" type="video/mp4">
</video>
</figure>
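
You can also query a model on the Hub from Python. Here is a minimal sketch using the `image-segmentation` pipeline; the model id and image path are placeholders, so point them at your own fine-tuned checkpoint and image.

```python
from transformers import pipeline

# Placeholder model id; replace with your own fine-tuned checkpoint on the Hub
segmenter = pipeline(
    "image-segmentation",
    model="your-username/segformer-b0-finetuned-segments-sidewalk",
)

results = segmenter("sidewalk.jpg")  # placeholder image path
# Each entry contains a class label and a PIL mask for that class
for result in results:
    print(result["label"], result["mask"].size)
```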

@@ -438,7 +438,7 @@ pred_seg = upsampled_logits.argmax(dim=1)[0]
Now it's time to display the result. We'll show the prediction next to the ground-truth mask.
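
A simple way to produce that side-by-side view is a small matplotlib sketch. Here, `pred_seg` comes from the cell above and `gt_seg` stands in for the ground-truth segmentation map of the same image (both 2-D arrays of label ids); mapping the label ids to a color palette makes the plot easier to read, but the basic layout is:

```python
import matplotlib.pyplot as plt
import numpy as np

# pred_seg: predicted label map from the previous cell
# gt_seg: assumed ground-truth label map for the same image
fig, axs = plt.subplots(1, 2, figsize=(12, 6))
axs[0].imshow(np.array(pred_seg))
axs[0].set_title("Prediction")
axs[1].imshow(np.array(gt_seg))
axs[1].set_title("Ground truth")
for ax in axs:
    ax.axis("off")
plt.show()
```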

<figure class="image table text-center m-0 w-full">
<medium-zoom background="rgba(1,1,1,1)" alt="SegFormer prediction vs the ground truth" src="assets/56_fine_tune_segformer/output.png"></medium-zoom>
<medium-zoom background="rgba(1,1,1,1)" alt="SegFormer prediction vs the ground truth" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/output.png"></medium-zoom>
</figure>

What do you think? Would you send our pizza delivery robot on the road with this segmentation information?