
Initialization for Log Alpha #5

Open
hxu105 opened this issue Jan 4, 2025 · 0 comments
hxu105 commented Jan 4, 2025

Howdy, thank you for sharing this amazing work. I have a question about the log alpha initialization. For both Llama and GPT2, you initialize the log alphas from N(10, 0.01). Is there a specific reason for choosing this sharp Gaussian? What would be the impact of using a standard normal or another initialization instead?
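For context, a minimal PyTorch sketch of the two initializations being compared. This assumes the log alphas parameterize sigmoid-style gates (as in hard-concrete / L0-style pruning); the variable names are illustrative, not taken from the repository:

```python
import torch

# Illustrative gate count; in practice this would match the number of
# prunable units (heads, channels, etc.).
num_gates = 8

# The initialization asked about: a sharp Gaussian N(mean=10, std=0.01).
log_alpha_sharp = torch.nn.Parameter(
    torch.empty(num_gates).normal_(mean=10.0, std=0.01)
)

# The alternative raised in the question: a standard normal N(0, 1).
log_alpha_standard = torch.nn.Parameter(
    torch.empty(num_gates).normal_(mean=0.0, std=1.0)
)

# If the gates are sigmoid(log_alpha), a large positive mean makes every
# gate start almost fully open (sigmoid(10) ~ 0.99995), so training begins
# from the unpruned network. A standard normal scatters gates around 0.5.
print(torch.sigmoid(log_alpha_sharp))     # all values very close to 1
print(torch.sigmoid(log_alpha_standard))  # values spread around 0.5
```

Under that gate interpretation, the tiny std (0.01) just keeps all gates starting in the same near-open state rather than introducing random early pruning; whether that matches the authors' actual motivation is exactly the question above.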
