box_loss, obj_loss, and cls_loss #5052

Closed
karl-gardner opened this issue Oct 5, 2021 · 8 comments
Labels
question Further information is requested

Comments

karl-gardner commented Oct 5, 2021

Hello Glenn et al.,

I am wondering what the different losses in the results figure mean and where I can learn more about them. If you could give the equations for these losses, that would be great, specifically for the box, obj, and cls losses. Does the box loss refer to the Generalized IoU (GIoU) loss?

[image: results]

Thanks,

Karl Gardner | Texas Tech University

@karl-gardner karl-gardner added the question Further information is requested label Oct 5, 2021
@glenn-jocher
Member

@kgardner330 box loss is the regression loss for the output xywh bounding boxes. The loss criterion in use is CIoU(). You can see the details in loss.py:

yolov5/utils/loss.py

Lines 131 to 137 in 5afc9c2

# Regression
pxy = ps[:, :2].sigmoid() * 2. - 0.5
pwh = (ps[:, 2:4].sigmoid() * 2) ** 2 * anchors[i]
pbox = torch.cat((pxy, pwh), 1) # predicted box
iou = bbox_iou(pbox.T, tbox[i], x1y1x2y2=False, CIoU=True) # iou(prediction, target)
lbox += (1.0 - iou).mean() # iou loss

For a general description of the YOLO losses you should read the first 3 YOLO papers:
https://pjreddie.com/publications/
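
As a rough illustration of the regression excerpt above, here is a minimal pure-Python sketch for a single box pair. It assumes the standard CIoU definition, CIoU = IoU - rho^2 / c^2 - alpha * v, with box loss = 1 - CIoU. The helper names `sigmoid`, `decode`, and `ciou` are hypothetical and are not the repository's `bbox_iou`, which works on batched tensors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode(raw, anchor_wh):
    """Map raw outputs (tx, ty, tw, th) to an xywh box, mirroring the excerpt."""
    tx, ty, tw, th = raw
    x = sigmoid(tx) * 2.0 - 0.5                  # centre offset in (-0.5, 1.5)
    y = sigmoid(ty) * 2.0 - 0.5
    w = (sigmoid(tw) * 2.0) ** 2 * anchor_wh[0]  # between 0 and 4x the anchor width
    h = (sigmoid(th) * 2.0) ** 2 * anchor_wh[1]  # between 0 and 4x the anchor height
    return x, y, w, h

def ciou(box1, box2, eps=1e-7):
    """Complete IoU between two (x_center, y_center, w, h) boxes."""
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    # Convert centre format to corner coordinates.
    b1x1, b1x2, b1y1, b1y2 = x1 - w1 / 2, x1 + w1 / 2, y1 - h1 / 2, y1 + h1 / 2
    b2x1, b2x2, b2y1, b2y2 = x2 - w2 / 2, x2 + w2 / 2, y2 - h2 / 2, y2 + h2 / 2
    # Plain IoU.
    inter_w = max(0.0, min(b1x2, b2x2) - max(b1x1, b2x1))
    inter_h = max(0.0, min(b1y2, b2y2) - max(b1y1, b2y1))
    inter = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union
    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    cw = max(b1x2, b2x2) - min(b1x1, b2x1)
    ch = max(b1y2, b2y2) - min(b1y1, b2y1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between the two centres (rho^2).
    rho2 = (x2 - x1) ** 2 + (y2 - y1) ** 2
    # Aspect-ratio consistency term v and its weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(w2 / (h2 + eps)) - math.atan(w1 / (h1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - (rho2 / c2 + alpha * v)

pred = decode((0.0, 0.0, 0.0, 0.0), anchor_wh=(10.0, 20.0))
target = (0.5, 0.5, 12.0, 18.0)
box_loss = 1.0 - ciou(pred, target)
print(pred, box_loss)
```

Unlike plain IoU, the centre-distance term still gives a useful signal when the boxes do not overlap at all, which is why CIoU can go below zero for disjoint boxes.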

gepaohhh commented Dec 5, 2022

Hello Glenn et al.,
I am wondering why there is nothing in my plots. Is something wrong with my CUDA version?
[image]

@glenn-jocher
Member

@gepaohhh your plots show mAP, so you do have validation data. I'm not sure why your losses are NaN (NaN values are not plotted).
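
One way to confirm that empty loss panels come from NaN values is to scan the training log directly. The sketch below assumes the run wrote a CSV log with one column per metric; the column names in the sample are illustrative assumptions, and `nan_columns` is a hypothetical helper, not part of YOLOv5.

```python
import csv
import io
import math

def nan_columns(csv_text):
    """Return the names of columns that contain at least one NaN value."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    flagged = set()
    for row in reader:
        for name, cell in zip(header, row):
            try:
                value = float(cell)
            except ValueError:
                continue  # ignore non-numeric cells
            if math.isnan(value):
                flagged.add(name)
    # Preserve the original column order in the result.
    return [name for name in header if name in flagged]

# Illustrative log contents; real column names may differ.
sample = """epoch, train/box_loss, val/box_loss, metrics/mAP_0.5
0, 0.09, nan, 0.12
1, 0.08, nan, 0.20
"""
print(nan_columns(sample))
```

For a real run, the text could be read with `open(path).read()` from wherever the log file was saved.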

gepaohhh commented Dec 6, 2022

@glenn-jocher thank you. I later found that my losses are NaN (which are not plotted) because something is wrong with my CUDA and cuDNN setup; according to the official website, my CUDA version is not suitable for my convolution operations.
So you consider my dataset to be good? I also wonder whether having only one class in my dataset could affect my results. Can you give me some tips on single-class training in YOLOv5? Thank you.

I'm sorry, my English is not very good.

@glenn-jocher
Member

@gepaohhh no changes are needed for single-class training, just label your dataset with class 0 and train normally.
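
If an existing dataset uses other class indices, the labels can be collapsed to class 0 with a small script. This is a minimal sketch that assumes YOLO-format label text, one object per line as `class x_center y_center width height`; `to_single_class` is a hypothetical helper name.

```python
def to_single_class(label_text):
    """Rewrite every YOLO label line so that its class index is 0."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields per label line, got: {line!r}")
        # Keep the four box coordinates, replace only the class index.
        out.append(" ".join(["0"] + parts[1:]))
    return "\n".join(out)

original = "3 0.5 0.5 0.2 0.2\n1 0.1 0.2 0.3 0.4"
print(to_single_class(original))
```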

@pderrenger
Member

For CUDA-related issues, first verify your environment with import torch; print(torch.__version__, torch.cuda.is_available()). Ensure you're using compatible versions: YOLOv5 works best with CUDA 11.x and PyTorch ≥1.9. If issues persist, try a clean reinstall following our CUDA setup guide.
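
A slightly fuller version of the one-line check above might look like the following sketch. It also reports the CUDA version PyTorch was built against and the cuDNN version, and it degrades gracefully when PyTorch is not installed; `cuda_report` is a hypothetical helper name.

```python
import importlib.util

def cuda_report():
    """Collect basic PyTorch/CUDA/cuDNN version information."""
    if importlib.util.find_spec("torch") is None:
        return {"torch": None}  # PyTorch is not installed in this environment
    import torch
    return {
        "torch": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cuda_build": torch.version.cuda,          # None on CPU-only builds
        "cudnn": torch.backends.cudnn.version(),   # None if cuDNN is unavailable
    }

print(cuda_report())
```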

For cloud-based alternatives that avoid local CUDA dependencies, consider Ultralytics HUB Cloud Training.

baraa-boujneh commented Feb 17, 2025

> @glenn-jocher thank you. I later found that my losses are NaN (which are not plotted) because something is wrong with my CUDA and cuDNN setup; according to the official website, my CUDA version is not suitable for my convolution operations. So you consider my dataset to be good? I also wonder whether having only one class in my dataset could affect my results. Can you give me some tips on single-class training in YOLOv5? Thank you.
>
> I'm sorry, my English is not very good.

I am replying to this for anyone facing a similar problem.
The NaN validation-loss problem is not related to the CUDA version. It is mainly caused by either unannotated or duplicated images.

A good way to resolve this problem is to use Roboflow. Create a new project and upload your dataset (make sure that your dataset follows the YOLO format). Roboflow's integrated preprocessing will identify all unannotated and duplicated images. Roboflow will then generate a download API for the new preprocessed dataset.

That worked for me.
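
The same two checks can also be approximated locally. The sketch below is not Roboflow's method: it assumes images and YOLO `.txt` labels live in two flat directories with matching file stems, treats byte-identical files as duplicates, and uses the hypothetical helper name `audit_dataset`. A missing or empty label file can be intentional for background images, so the output is a list to review rather than a list to delete.

```python
import hashlib
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}

def audit_dataset(image_dir, label_dir):
    """Return (unannotated image names, pairs of byte-identical image names)."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    unannotated, duplicates, seen = [], [], {}
    for img in sorted(image_dir.iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        # An image counts as unannotated if its label file is missing or empty.
        label = label_dir / (img.stem + ".txt")
        if not label.exists() or not label.read_text().strip():
            unannotated.append(img.name)
        # Hash the raw bytes to detect exact duplicates.
        digest = hashlib.sha256(img.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append((seen[digest], img.name))
        else:
            seen[digest] = img.name
    return unannotated, duplicates
```

A call such as `audit_dataset("images/train", "labels/train")` would return the two lists; the directory names here are only an example layout.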

@pderrenger
Member

Thank you for sharing your solution! For single-class YOLOv5 training, simply label all objects as class 0 in your dataset - no architecture changes needed. To verify CUDA compatibility, run python -c "import torch; print(torch.__version__, torch.cuda.is_available())" and compare with our CUDA troubleshooting guide. For dataset validation, we recommend our Roboflow integration which helps detect annotation issues. Happy training! 🚀
