
Should we control the pwh range for bbox regression from (0, 4) to (1/4, 4)? #8265

Closed
daikankan wants to merge 3 commits into ultralytics:master from daikankan:master

@daikankan
Contributor

@daikankan daikankan commented Jun 20, 2022

[Image: compare]
where 2.7724 = 2 * 1.3862 ≈ 2 * np.log(4). Why not constrain pwh to (1/4, 4) instead of (0, 4) for anchor_t=4?
With this change the bbox regression should be more stable. Below is a comparison from my personal project (not the official YOLOv5):
[Image: results]
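
To make the two ranges concrete, here is a minimal sketch in plain Python (no PyTorch) that evaluates both width/height scale factors for a raw network output `p`. The diff image is not reproduced in this text, so the proposed form below is taken from the equivalent expression given later in the thread; the helper names are hypothetical.

```python
import math

LOG4 = math.log(4.0)  # about 1.3863; the thread rounds this to 1.3862


def sigmoid(p: float) -> float:
    return 1.0 / (1.0 + math.exp(-p))


def wh_scale_current(p: float) -> float:
    """Current YOLOv5 form: (2 * sigmoid(p)) ** 2, which lies in (0, 4)."""
    return (2.0 * sigmoid(p)) ** 2


def wh_scale_proposed(p: float) -> float:
    """Proposed form: exp(tanh(p / 2) * log(4)), which lies in (1/4, 4)."""
    return math.exp(math.tanh(0.5 * p) * LOG4)


# Both forms give 1.0 at p = 0 and approach 4 for large p,
# but only the current form can shrink towards 0 for very negative p.
for p in (-20.0, -2.0, 0.0, 2.0, 20.0):
    print(f"p={p:6.1f}  current={wh_scale_current(p):.4f}  "
          f"proposed={wh_scale_proposed(p):.4f}")
```

The scale factor is what gets multiplied by the anchor size, so the lower bound of 1/4 matches the 4x ratio implied by anchor_t=4.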

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improved bounding box size prediction in YOLOv5 object detection models.

📊 Key Changes

  • Altered the computation of the width and height for bounding box predictions from using a squared scale factor to using an exponential function.
  • Updated the forward method in yolo.py and the loss calculation in utils/loss.py.

🎯 Purpose & Impact

  • Purpose: To enhance the accuracy of bounding box predictions by refining the way width and height are calculated.
  • Impact: Users can expect more accurate object detection, particularly in the size and scale of detected objects within an image. This may result in better performance for tasks where exact object dimensions are crucial. 🎯📈

@glenn-jocher
Member

@daikankan very interesting, thanks for sharing your results!

Something doesn't seem right about your equation though, it should be definable without an exp() function, especially since exp and sigmoid cancel each other out I believe.

Aside from that it is true that narrowing the range would provide less room for error especially in the early stages of training and possibly in later training also near the lower boundary.

@glenn-jocher
Member

@daikankan can you see if you can simplify this equation?

BTW we can not merge this PR currently as it would break all existing models but it might be suitable to merge at a new release that arrives with new models.

@daikankan
Contributor Author

@glenn-jocher thanks for your suggestion, but I think there is no way to simplify this equation, which is equivalent to pwh = torch.exp(torch.tanh(0.5 * ps[:, 2:4]) * 1.3862) * anchors[i].
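
The equivalence mentioned here rests on the identity tanh(p / 2) = 2 * sigmoid(p) - 1. The sketch below checks it numerically in plain Python. Two things are assumptions rather than statements from the thread: the sigmoid-based form reads the constant 2.7724 as 2 * log(4) applied to sigmoid(p) - 0.5, and the power-of-4 form is simply an algebraic rewriting of the same expression that nobody in the conversation proposed.

```python
import math

LOG4 = math.log(4.0)


def sigmoid(p: float) -> float:
    return 1.0 / (1.0 + math.exp(-p))


def scale_tanh(p: float) -> float:
    # Form quoted in the comment above: exp(tanh(0.5 * p) * log(4))
    return math.exp(math.tanh(0.5 * p) * LOG4)


def scale_sigmoid(p: float) -> float:
    # Assumed sigmoid-based form: exp((sigmoid(p) - 0.5) * 2 * log(4))
    return math.exp((sigmoid(p) - 0.5) * 2.0 * LOG4)


def scale_power(p: float) -> float:
    # Same quantity written without exp(): 4 ** (2 * sigmoid(p) - 1)
    return 4.0 ** (2.0 * sigmoid(p) - 1.0)


# Largest disagreement between the three forms on a grid of p in [-10, 10].
grid = [i / 10.0 for i in range(-100, 101)]
max_gap = max(
    max(abs(scale_tanh(p) - scale_sigmoid(p)), abs(scale_tanh(p) - scale_power(p)))
    for p in grid
)
print(f"max difference between the three forms: {max_gap:.2e}")
```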

@LUO77123

LUO77123 commented Jul 7, 2022

> [Image: compare] where 2.7724 = 2 * 1.3862 = 2 * np.log(4), why not make the pwh in (1/4, 4) (instead of (0, 4)) for anchor_t=4? After which the bbox regression should be more stable, below is my personal projects for comparison (not the official yolov5): [Image: results]

Hello, I want to try your scheme. I found that there are two places that need to be modified in yolo.py (y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]) and loss.py (pwh = (pwh.sigmoid() * 2) ** 2 * anchors[i]). Is that so, or are there other modifications I didn't find?

@glenn-jocher
Member

@daikankan we've implemented deterministic training now in PR #8213, so every training will be identical unless the seed is changed.

I've updated your branch with the latest changes from master including the #8213. Can you re-run your experiment and replot your results? This time the only changes will be due to the box regression change you made, there will be no more random differences between the two results.

@daikankan
Contributor Author

@LUO77123 yes, you're right. Maybe you can help check the PR with experiments if you are interested, because I have no spare GPU resources at the moment, @glenn-jocher.

@glenn-jocher
Member

@daikankan got it! Yes I'll leave this open and run experiments when our GPU resources free up. I think this is a good change but I believe the equation must be able to be simplified. I haven't sat down to try to simplify it myself yet though, been super busy with repo maintenance.

@LUO77123

Can we try this? Change (y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]) into (y[..., 2:4] = (y[..., 2:4] * 3.75 + 0.25) * self.anchor_grid[i]), which is linear growth rather than exponential growth. The formula is simplified.

@glenn-jocher
Member

@LUO77123 no. Nominal input should produce nominal output: 1.0 = fcn(0.5)

Your equation does not respect this constraint.
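
A quick way to see the constraint, assuming the 0.5 here is the sigmoid output for a raw prediction of 0 (so the scale factor should be exactly 1 and the box keeps the anchor size). The function names are hypothetical.

```python
def scale_current(s: float) -> float:
    """Current form applied to the sigmoid output s in (0, 1)."""
    return (2.0 * s) ** 2


def scale_linear(s: float) -> float:
    """Linear proposal from the comment above: 3.75 * s + 0.25."""
    return 3.75 * s + 0.25


# Evaluate both at the lower end, the nominal input, and the upper end.
for name, fn in (("current", scale_current), ("linear", scale_linear)):
    print(f"{name:8s} f(0)={fn(0.0):.3f}  f(0.5)={fn(0.5):.3f}  f(1)={fn(1.0):.3f}")
```

The linear map does cover (0.25, 4), but it returns 2.125 at the nominal input instead of 1.0.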

@LUO77123

> @LUO77123 no. Nominal input should produce nominal output: 1.0 = fcn(0.5)
>
> Your equation does not respect this constraint.

I got square and cube functions by fitting the curve with Matlab.
square:
(-0.17662 + 2.121*x)^2 + 0.218806
[Image]
cube:
(0.277629 + 1.27898*x)^3 + 0.228605
[Image]
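
As a sanity check, the two fitted curves can be evaluated at the three points the discussion cares about, assuming `x` is the sigmoid output in (0, 1). The coefficients are copied from the comment above; the function names are hypothetical.

```python
def fitted_square(s: float) -> float:
    return (-0.17662 + 2.121 * s) ** 2 + 0.218806


def fitted_cube(s: float) -> float:
    return (0.277629 + 1.27898 * s) ** 3 + 0.228605


# Targets: 0.25 at s = 0, 1.0 at s = 0.5 (nominal), 4.0 at s = 1.
for s in (0.0, 0.5, 1.0):
    print(f"s={s:.1f}  square={fitted_square(s):.4f}  cube={fitted_cube(s):.4f}")
```

Both fits hit the three targets to within rounding of the coefficients. One detail worth noting: the square fit has its vertex at about s = 0.083, so it dips slightly below 0.25 (to about 0.219) before rising, while the cube fit is monotonic on (0, 1).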

@glenn-jocher
Member

@LUO77123 thanks! Can you plot these from x=0 to x=4 against the current equation?

@LUO77123

> (-0.17662 + 2.121*x)^2 + 0.218806

[Image]
I drew four plots in Matlab: the original with range 0 to 4, and the other three with range 0.25 to 4, as labeled. Do you think that's OK?

@daikankan
Contributor Author

@glenn-jocher I think the most important principle is the constraint mu(torch.log(pwh)) = 0, for pwh in the range (1/4, 4).
For instance, suppose x follows a standard normal distribution; then we get this comparison:
[Image: x]
[Image: y]
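
The plots are not reproduced in this text, so the following is only a sketch of the kind of experiment described: draw x from a standard normal distribution, map it through both scale functions, and compare the mean of the log scale. It uses plain Python instead of PyTorch, and the sample size and seed are arbitrary choices.

```python
import math
import random

LOG4 = math.log(4.0)


def sigmoid(p: float) -> float:
    return 1.0 / (1.0 + math.exp(-p))


def log_scale_current(p: float) -> float:
    # log of (2 * sigmoid(p)) ** 2
    return 2.0 * math.log(2.0 * sigmoid(p))


def log_scale_proposed(p: float) -> float:
    # log of exp(tanh(p / 2) * log(4)); an odd function of p
    return math.tanh(0.5 * p) * LOG4


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

mean_current = sum(log_scale_current(p) for p in samples) / len(samples)
mean_proposed = sum(log_scale_proposed(p) for p in samples) / len(samples)

print(f"mean log-scale, current form:  {mean_current:+.3f}")
print(f"mean log-scale, proposed form: {mean_proposed:+.3f}")
```

Because the proposed log scale is an odd function of x, its mean over a symmetric input distribution is zero up to sampling noise, whereas the current form comes out negative, meaning the predicted boxes are biased towards sizes smaller than the anchor.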

@glenn-jocher
Member

@daikankan really interesting point, the orange data is much better distributed...

@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions YOLOv5 🚀 and Vision AI ⭐.

github-actions bot added the Stale label Mar 22, 2023
github-actions bot removed the Stale label Apr 10, 2023
@github-actions
Contributor

github-actions bot commented Oct 3, 2023

👋 Hello there! We wanted to let you know that we've decided to close this pull request due to inactivity. We appreciate the effort you put into contributing to our project, but unfortunately, not all contributions are suitable or aligned with our product roadmap.

We hope you understand our decision, and please don't let it discourage you from contributing to open source projects in the future. We value all of our community members and their contributions, and we encourage you to keep exploring new projects and ways to get involved.

For additional resources and information, please see the links below:

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions bot added the Stale label Oct 3, 2023
github-actions bot closed this Nov 3, 2023