Can Yolo3 take different width-height-ratio images as training input? #800
Comments
Thank you, @AlexeyAB.
Question 1: I also notice that there is a configuration of "dim" in darknet/src/detector.c#L87. If Yolo v1/v2/v3 can take different width/height/ratio of images as training/validation/test input, then what is the point of configuring something like "dim*2"? Does it mean that I should just keep it as the original when I am combining images of different widths/heights/ratios as my training data?
Question 2: I am also confused by another thing. The instructions from Google Groups mention that "detector.c" should be in the src folder (https://github.com/AlexeyAB/darknet/tree/5bc62b14e06a3fcfda4e3a19fba77589920eddee/src), however I can only find "detector.c" in the examples folder (https://github.com/pjreddie/darknet/tree/master/examples). Should I just leave detector.c in the examples folder if I am using pjreddie's yolo3 repo (https://github.com/pjreddie/darknet)?
@AlexeyAB I am planning to train on images captured by drones using YOLOv3, and I would like to ask if resizing the images would help the detector become more accurate. If so, what size would be recommended? Appreciate your help, thank you!
@danieltwx Hi, you shouldn't resize images.
@AlexeyAB
Appreciate your help, thank you!
@AlexeyAB I'm training YOLOv3 on a dataset with just 1 object to be detected and classified per image (classes=4). The object is a rectangle that almost always takes 80-95% of the image space (it is a business card). The ratio of the images is approximately 1:1.5. Given that the borders of the object are very close to the limits of the image (sometimes even touching them), I've set width=640, height=416 in my .cfg file for the moment. Is it safe to set both width and height to 416 as recommended? Or am I risking losing valuable information due to the closeness of the object to the image limits? Thanks for your great contribution and support to the community!
Hello, Thanks
@MurreyCode you don't need to adjust height and width differently in your config or resize your dataset images. The YOLO architecture does it by itself, keeping the aspect ratio intact (no information will be ignored) according to the resolution in the .cfg file. For example, if you have an image of size 1248 x 936, YOLO will resize it to 416 x 312 and then pad the extra space with black bars to fit into a 416 x 416 network.
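For concreteness, here is a minimal sketch of the letterbox-style resize described above (illustrative Python, not darknet's actual code; whether darknet letterboxes or simply stretches depends on the version and settings such as the `letter_box` option in AlexeyAB's fork):

```python
def letterbox_dims(img_w, img_h, net_w=416, net_h=416):
    """Scale an image to fit inside net_w x net_h while keeping its aspect ratio,
    and report how much padding (black/gray bars) is needed to fill the rest."""
    scale = min(net_w / img_w, net_h / img_h)
    new_w, new_h = int(img_w * scale), int(img_h * scale)
    pad_x, pad_y = net_w - new_w, net_h - new_h  # total padding to distribute on each axis
    return new_w, new_h, pad_x, pad_y

# The example from the comment above: a 1248 x 936 image into a 416 x 416 network
print(letterbox_dims(1248, 936))  # -> (416, 312, 0, 104): 52 px of padding top and bottom
```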
For which version?
Hi Alexey, Looking forward to your answer & help on this.
@maheshmechengg you can increase the resolution as much as you like as long as it's divisible by 32, but you will need to decrease your batch size. As you increase your training resolution, the images take up more memory on your GPU, so you need to decrease the mini-batch size to allow them to fit in GPU memory. Decreasing the batch size does slowly decrease accuracy, but in my experience a higher resolution (to an extent) with a decreased batch size results in better accuracy. You adjust the effective batch size by increasing subdivisions in your config, as per the instructions for out-of-memory issues, if or when these arise as you increase your resolution. The number of steps or iterations does not need to increase along with a resolution increase.
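As a rough illustration of the advice above (a sketch with made-up helper names, not darknet code): the network resolution must be a multiple of 32, and the number of images processed per GPU pass is batch / subdivisions, which is why higher resolutions are usually paired with more subdivisions.

```python
def round_to_stride(x, stride=32):
    """Round a requested network width/height down to the nearest multiple of 32."""
    return (x // stride) * stride

def images_per_gpu_pass(batch, subdivisions):
    """darknet splits each batch into `subdivisions` mini-batches run one at a time."""
    return batch // subdivisions

print(round_to_stride(1000))        # -> 992, a valid width/height value
print(images_per_gpu_pass(64, 16))  # -> 4 images held on the GPU at once
print(images_per_gpu_pass(64, 32))  # -> 2 images, roughly halving activation memory
```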
Yes, thanks, I did it the same way as you said.
Hi Alexey, I have a question about the .cfg file of YOLOv3. How does changing the width or height affect the model? Isn't it taking fixed-shape images as input? When I increase the height and width while testing the model's performance, it increases the detection score and decreases the FPS, and frankly I couldn't find the reason. Thanks in advance.
@sekomer increasing the height and width increases the number of pixels the model can use to detect objects. More pixels equate to better accuracy because there is more detail in the image for the model to utilise. An image sized 100x100 px has far less detail than an image of 1000x1000 px. It runs slower because the model needs to scan across more pixels (more work through the residual blocks). Suggest reading this: https://www.section.io/engineering-education/introduction-to-yolo-algorithm-for-object-detection/
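One way to see why a larger input helps (a hedged sketch, assuming the standard YOLOv3 output strides of 32/16/8): the three detection grids grow with the input resolution, so each grid cell covers fewer source pixels and small objects span more cells.

```python
def yolov3_grid_sizes(width, height, strides=(32, 16, 8)):
    """Return the (w, h) size of each YOLOv3 detection grid for a given input resolution."""
    return [(width // s, height // s) for s in strides]

print(yolov3_grid_sizes(416, 416))  # -> [(13, 13), (26, 26), (52, 52)]
print(yolov3_grid_sizes(832, 832))  # -> [(26, 26), (52, 52), (104, 104)]
```

The same fully convolutional filters simply slide over a larger feature map, which is also why inference gets slower as the resolution grows.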
First, thanks for your answer. We're on the same page about what you said, but what I don't understand is what changes when I double the h and w values in the cfg file during testing. Is it splitting the image into 4 sub-images and iterating over them, or doing some other black magic? I want to learn this.
Hi people. I'm using YOLOv4 to train on 5K images of 3180 x 2160 for object detection, 1 class. The training seems to complete successfully (mAP@0.5 = 98%), and the training charts also look OK, but the problem comes when I run inference on some images. When I conduct inference, the predicted bounding box is shifted from the real object, by around 100 pixels in X and Y, respectively. The object seems to be recognized, but the BB is not located exactly in the right position. I have in my cyolov4-custom.cfg: Do you think this shift could be because my training images are non-square (3180 x 2160, a 1:1.7 proportion), while the width and height values in the .cfg are a 1:1 proportion (416x416)? Could this mismatch be responsible for such a shift in the predicted bounding box? Please, any light or hints to clarify this would be extremely helpful, thanks.
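The hunch in this question can be made concrete with a small sketch (an illustration of one common cause of constant offsets, not a diagnosis of this specific case): if detections made in padded 416 x 416 network space are mapped back to a non-square image without removing the padding offset, every box shifts by a fixed amount along the padded axis.

```python
def to_image_coords(x_net, y_net, img_w, img_h, net=416):
    """Map a point from padded net x net network space back to original image pixels."""
    scale = min(net / img_w, net / img_h)
    pad_x = (net - img_w * scale) / 2.0  # padding added on the left/right
    pad_y = (net - img_h * scale) / 2.0  # padding added on the top/bottom
    return (x_net - pad_x) / scale, (y_net - pad_y) / scale

img_w, img_h = 3180, 2160                        # non-square image size from the question
print(to_image_coords(208, 208, img_w, img_h))   # correct: roughly the image centre (~1590, ~1080)

scale = min(416 / img_w, 416 / img_h)
print(208 / scale, 208 / scale)                  # naive mapping without the pad: y lands near 1590 instead
```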
Hi @vongracia Few things:
If the above doesn't solve your issue (it should) you can do the below to increase bounding box tightness:
Can you publish the cfg files or explain how to train on 360-degree camera images of 3180 x 2160 pixels? Thanks in advance.
@saktheeswaranswan Are you using this repo? You should be using: https://github.com/AlexeyAB/darknet |
@vongracia
Images from VOC and some other datasets do not share exactly the same width-height ratio. For example, in VOC2012 some images are 334x500, some are 500x332, and some are 486x500. In the KITTI dataset, the width is always roughly 3 times the height (1200x300).
I don't see any fully connected layers in yolo3. Does it mean that yolo3 can take images with different width-height ratios as training input?
Or do I need to crop images to the same size, or apply the SPP-Net technique to yolo3 before training? If SPP-Net is needed, before which yolo3 layer should I apply it?