Some questions in paper #16

Open
Breeze-Zero opened this issue Jan 5, 2022 · 1 comment

Comments

@Breeze-Zero

Breeze-Zero commented Jan 5, 2022

[Image: 1641399242(1)]

Hello, I have been studying your paper recently. I noticed that your slides describe pre-train Task 2 (region-level) as shown in the picture above. But doesn't the actual code input local images into the teacher model? In addition, I am not quite clear about the region-level loss function. Is it to calculate the similarity matrix of local features output by student model and global features of teacher model? I hope you can answer these two questions at your convenience.
@ChunyuanLI
Contributor

> But doesn't the actual code input local images into the teacher model?

Yes, the actual code implements both 2-crop and multi-crop training. Multi-crop includes both large crops and small crops (I guess the small crops are the "local images" you mentioned). The slides illustrate the 2-crop case for simplicity.

> Is it to calculate the similarity matrix of local features output by student model and global features of teacher model?

No. The similarity matrix is computed between every pair of local features (that is, grid features), one taken from the student and one from the teacher.
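
To make the reply above concrete, here is a minimal NumPy sketch of a region-level loss built on a student-teacher similarity matrix of grid features. It is an illustration under stated assumptions, not the repository's code: the function name `region_matching_loss`, the argmax matching of each student region to its most similar teacher region, the cross-entropy form, and the temperature values are all assumptions, and teacher-side details such as output centering are left out.

```python
import numpy as np

def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def region_matching_loss(s_feat, t_feat, s_logits, t_logits,
                         s_temp=0.1, t_temp=0.04):
    """Hypothetical region-level loss for one student/teacher view pair.

    s_feat:   (Ns, D) student grid (region) features
    t_feat:   (Nt, D) teacher grid (region) features
    s_logits: (Ns, K) student head outputs, one row per region
    t_logits: (Nt, K) teacher head outputs, one row per region
    The temperatures are assumed values, not taken from the thread.
    """
    # Cosine similarity between every student region and every teacher region.
    s_norm = s_feat / np.linalg.norm(s_feat, axis=1, keepdims=True)
    t_norm = t_feat / np.linalg.norm(t_feat, axis=1, keepdims=True)
    sim = s_norm @ t_norm.T                      # (Ns, Nt)

    # Assumption: each student region is paired with its best-matching
    # teacher region.
    match = sim.argmax(axis=1)                   # (Ns,)

    # Cross-entropy between matched teacher and student distributions.
    t_prob = np.exp(log_softmax(t_logits / t_temp))[match]  # (Ns, K)
    s_logp = log_softmax(s_logits / s_temp)                 # (Ns, K)
    loss = -(t_prob * s_logp).sum(axis=1).mean()
    return loss, sim, match

# Example: a 3x3 student grid (small crop) against a 7x7 teacher grid (large crop).
rng = np.random.default_rng(0)
loss, sim, match = region_matching_loss(
    rng.normal(size=(9, 16)), rng.normal(size=(49, 16)),
    rng.normal(size=(9, 32)), rng.normal(size=(49, 32)),
)
print(sim.shape, match.shape, float(loss))
```

Note that both inputs to the similarity matrix are per-region features; no global (image-level) feature enters this term, which is the distinction the reply draws.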
