Choice of skip variable [flag: query] #41

Closed
AntiLibrary5 opened this issue Jul 23, 2021 · 2 comments

Comments

@AntiLibrary5

Hi,
I mainly want to clarify how the images are paired for the loss, in case I have misunderstood. MapNet ensures global consistency via its clever relative loss. I'm just having a hard time grasping intuitively why skip defaults to 10. It is a hyperparameter and can certainly be tuned, but providing it as the default suggests you had good results with that value.

As I understand it, skip=10 means that if the dataloader picks an index of, say, 36, then with steps=3 the loader would pick the images indexed [26, 36, 46], i.e. with a gap of 10 frames between them.
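
The pairing described above can be sketched as follows. This is a minimal illustration, not the repository's actual loader: the function name `frame_indices`, the centering on the sampled index, and the clamping at the sequence boundaries are all assumptions.

```python
def frame_indices(index, steps, skip, num_frames):
    """Return `steps` frame indices spaced `skip` apart, centered on `index`.

    Hypothetical helper: clamping to [0, num_frames - 1] is an assumed
    boundary policy, not taken from the MapNet code.
    """
    # Offsets relative to the first frame of the tuple: 0, skip, 2*skip, ...
    offsets = [k * skip for k in range(steps)]
    # Shift so the middle offset is zero, which centers the tuple on `index`.
    center = offsets[len(offsets) // 2]
    indices = [index + off - center for off in offsets]
    # Keep every index inside the sequence.
    return [min(max(i, 0), num_frames - 1) for i in indices]


print(frame_indices(36, steps=3, skip=10, num_frames=100))  # [26, 36, 46]
print(frame_indices(36, steps=3, skip=1, num_frames=100))   # [35, 36, 37]
```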

But doesn't that mean you're picking images that are farther apart chronologically, and thus also in translation? (For example, at one frame per second, a person collecting the data while moving at 1 m/s would yield 3 images that are 10 meters apart, so we lose the point of the relative loss.)

I actually trained my own model with the default hyperparameters and then with skip=1, and I got poorer results, so I wanted to check my understanding of how the images are paired for the loss.

Thank you for your time.

@samarth-robo
Contributor

Hi @AntiLibrary5, you are right in your understanding of skip.

The reason for skip > 1 is that the camera motion and image change between two consecutive frames at 30 fps are often quite small. This is true both of the camera pose and of the motion of dynamic objects in the scene. So a pair of consecutive frames might not provide a strong learning signal, because the network will predict a similar pose anyway (since the input images are almost the same).
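
To put rough numbers on this, the sketch below estimates how far the camera travels between paired frames, assuming constant speed. The 1 m/s walking speed comes from the question and the 30 fps frame rate from this reply; the helper name `frame_spacing_m` is made up for illustration.

```python
def frame_spacing_m(speed_m_per_s, skip, fps):
    """Distance travelled between two frames that are `skip` frames apart."""
    return speed_m_per_s * skip / fps


# Consecutive frames (skip=1) at 30 fps and 1 m/s: about 3 cm apart.
print(round(frame_spacing_m(1.0, skip=1, fps=30), 3))   # 0.033
# skip=10 under the same assumptions: about a third of a meter.
print(round(frame_spacing_m(1.0, skip=10, fps=30), 3))  # 0.333
```

Under these assumptions, a 10 m gap between paired frames would only arise at roughly one frame per second.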

On the other hand, if you connect two far-away images with the relative loss, it provides a stronger learning signal. For example, the network's prediction might jump by a large amount between two images 10 frames apart (let's say because of the large image change from the motion of a dynamic object like a car, or the sudden brightness change from the appearance of a new light).
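
A toy sketch of why such a jump is penalized: a relative term compares the predicted displacement between two paired frames with the ground-truth displacement. This is a simplified, translation-only L1 version written for illustration. The function name and all numbers are hypothetical, and it is not the loss as implemented in MapNet, which also covers rotation.

```python
def relative_translation_loss(pred_a, pred_b, true_a, true_b):
    """L1 error between predicted and ground-truth displacement a -> b."""
    pred_delta = [b - a for a, b in zip(pred_a, pred_b)]
    true_delta = [b - a for a, b in zip(true_a, true_b)]
    return sum(abs(p - t) for p, t in zip(pred_delta, true_delta))


# Ground-truth camera positions (x, y, z) in meters for two paired frames.
true_a, true_b = [0.0, 0.0, 0.0], [0.3, 0.0, 0.0]

# Both predictions share the same 0.1 m offset, so the displacement is right
# and the relative term is (numerically) zero.
consistent = relative_translation_loss(
    [0.1, 0.0, 0.0], [0.4, 0.0, 0.0], true_a, true_b)

# The second prediction jumps by 2 m along y, so the relative term is large.
jump = relative_translation_loss(
    [0.1, 0.0, 0.0], [0.4, 2.0, 0.0], true_a, true_b)

print(consistent, jump)
```

Note that the consistent case still has an absolute error of 0.1 m per frame; the relative term only measures whether the predicted motion between the two frames matches the true motion.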

At the same time, if skip is set to a very large value, the two images will no longer have any overlap in their view frustums.

So skip needs to be a compromise between all these considerations, and good skip values are also likely to be different for different datasets.

@AntiLibrary5
Author

Thank you very much for the explanation and insight!
