Should we control the pwh range for bbox regression from (0, 4) to (1/4, 4)? #8265
daikankan wants to merge 3 commits into ultralytics:master
|
@daikankan very interesting, thanks for sharing your results! Something doesn't seem right about your equation though. It should be definable without an exp() function, especially since I believe exp and sigmoid cancel each other out. Aside from that, it is true that narrowing the range would leave less room for error, especially in the early stages of training and possibly later in training near the lower boundary. |
|
@daikankan can you see if you can simplify this equation? BTW we cannot merge this PR currently, as it would break all existing models, but it might be suitable to merge with a new release that arrives with new models. |
|
@glenn-jocher thanks for your suggestion, but I think there is no way to simplify this equation, which is equivalent to pwh = torch.exp(torch.tanh(0.5 * ps[:, 2:4]) * 1.3862) * anchors[i]. |
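A quick numerical check of the equivalence claimed above can be sketched as follows. It assumes the PR's expression has the form exp((sigmoid(x) - 0.5) * 2.7724), which is inferred from the constant quoted in the description rather than shown in this thread. The function names are hypothetical, the anchor multiplication is left out, and plain Python floats stand in for tensors.

```python
import math

LOG4 = math.log(4.0)  # 1.3862..., so 2 * LOG4 = 2.7724...

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pwh_current(x):
    # Current scale factor quoted in the thread: (2 * sigmoid(x)) ** 2, range (0, 4)
    return (2.0 * sigmoid(x)) ** 2

def pwh_sigmoid_form(x):
    # Assumed form of the PR expression: exp((sigmoid(x) - 0.5) * 2 * ln 4)
    return math.exp((sigmoid(x) - 0.5) * 2.0 * LOG4)

def pwh_tanh_form(x):
    # Form quoted in the comment above: exp(tanh(0.5 * x) * ln 4)
    return math.exp(math.tanh(0.5 * x) * LOG4)

for x in (-20.0, -2.0, 0.0, 2.0, 20.0):
    print(f"x={x:6.1f}  current={pwh_current(x):.6f}  "
          f"sigmoid_form={pwh_sigmoid_form(x):.6f}  "
          f"tanh_form={pwh_tanh_form(x):.6f}")
```

The two exponential forms agree because (sigmoid(x) - 0.5) * 2 = tanh(x / 2). Both stay inside (1/4, 4), while the current factor approaches 0 for large negative x.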
|
@daikankan we've implemented deterministic training now in PR #8213, so every training run will be identical unless the seed is changed. I've updated your branch with the latest changes from master, including #8213. Can you re-run your experiment and replot your results? This time the only differences will be due to the box regression change you made; there will be no more random differences between the two results. |
|
@LUO77123 yes, you're right. If you are interested, maybe you can help check the PR with experiments, because I have no spare GPU resources at the moment. @glenn-jocher |
|
@daikankan got it! Yes, I'll leave this open and run experiments when our GPU resources free up. I think this is a good change, but I believe it must be possible to simplify the equation. I haven't sat down to try to simplify it myself yet though, been super busy with repo maintenance. |
|
Can we try this? Change (y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]) into (y[..., 2:4] = (y[..., 2:4] * 3.75 + 0.25) * self.anchor_grid[i]), which is linear growth rather than exponential growth. The formula is simplified. |
|
@LUO77123 no. Nominal input should produce nominal output: 1.0 = fcn(0.5). Your equation does not respect this constraint. |
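The constraint 1.0 = fcn(0.5) can be checked directly for the candidates discussed in this thread. This is a minimal sketch in which y stands for the sigmoid output in [0, 1]. The function names are hypothetical, and the exponential form is an assumption reconstructed from the constant 2.7724 = 2 * ln 4.

```python
import math

def current(y):
    # Expression quoted in the thread: (y * 2) ** 2
    return (y * 2.0) ** 2

def linear(y):
    # Linear proposal from the thread: y * 3.75 + 0.25
    return y * 3.75 + 0.25

def exp_form(y):
    # Assumed PR expression written in terms of y: exp((y - 0.5) * 2 * ln 4)
    return math.exp((y - 0.5) * 2.0 * math.log(4.0))

for name, fcn in (("current", current), ("linear", linear), ("exp_form", exp_form)):
    print(f"{name:9s} fcn(0)={fcn(0.0):.4f}  fcn(0.5)={fcn(0.5):.4f}  fcn(1)={fcn(1.0):.4f}")
```

The linear form covers the range [1/4, 4] but maps the nominal input 0.5 to 2.125 rather than 1.0, which is the objection raised above. The other two both return 1.0 at y = 0.5.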
I got the square and cube functions by fitting the curve with Matlab. |
|
@LUO77123 thanks! Can you plot these from x=0 to x=4 against the current equation? |
|
@glenn-jocher I think the most important principle is the constraint mu(torch.log(pwh)) = 0, for pwh in range (1/4, 4). |
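One way to read this constraint, taking mu to mean the average over raw outputs x that are symmetric around zero (an assumption, since the thread does not define it), is sketched below. The helper names and the grid on [-6, 6] are hypothetical choices for illustration, and the proposed form is written as exp(tanh(0.5 * x) * ln 4) as quoted earlier in the thread.

```python
import math

def log_pwh_current(x):
    # log of the current factor (2 * sigmoid(x)) ** 2
    s = 1.0 / (1.0 + math.exp(-x))
    return 2.0 * math.log(2.0 * s)

def log_pwh_proposed(x):
    # log of exp(tanh(0.5 * x) * ln 4), an odd function of x
    return math.tanh(0.5 * x) * math.log(4.0)

# Symmetric grid of raw outputs on [-6, 6]; an illustrative choice only
xs = [i / 10.0 for i in range(-60, 61)]

mean_current = sum(log_pwh_current(x) for x in xs) / len(xs)
mean_proposed = sum(log_pwh_proposed(x) for x in xs) / len(xs)

print(f"mean log(pwh), current:  {mean_current:.4f}")
print(f"mean log(pwh), proposed: {mean_proposed:.4f}")
```

Under this reading the proposed form averages to zero in log space because log(pwh) is odd in x, while the current form is biased toward shrinking the box, since log(2 * sigmoid(x)) + log(2 * sigmoid(-x)) is never positive.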
|
@daikankan really interesting point, the orange data is much better distributed... |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions YOLOv5 🚀 and Vision AI ⭐. |
|
👋 Hello there! We wanted to let you know that we've decided to close this pull request due to inactivity. We appreciate the effort you put into contributing to our project, but unfortunately, not all contributions are suitable or aligned with our product roadmap. We hope you understand our decision, and please don't let it discourage you from contributing to open source projects in the future. We value all of our community members and their contributions, and we encourage you to keep exploring new projects and ways to get involved. For additional resources and information, please see the links below:
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐ |







where 2.7724 = 2 * 1.3862 = 2 * np.log(4). Why not restrict pwh to (1/4, 4) instead of (0, 4) for anchor_t=4?
With this change the bbox regression should be more stable. Below is a comparison from my personal project (not the official YOLOv5):
🛠️ PR Summary
🌟 Summary
Improved bounding box size prediction in YOLOv5 object detection models.
📊 Key Changes
yolo.py and the loss calculation in utils/loss.py.
🎯 Purpose & Impact