What do I need to retrain a model for a specific interaction? #1063

Closed
brunosamp-usp opened this issue Mar 17, 2025 · 9 comments

@brunosamp-usp

What would you like to report?

I am working on a reaction with about 12 intermediates, all of which adsorb on platinum. I want to predict the adsorption energy of each one, and I would also like to train the available models further to get more accurate adsorption energies for these intermediates.
To do this, I was planning to build a .db file containing all of the structures adsorbed on Pt. Is that the right approach? What else do I need?
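
For reference, an adsorption energy is commonly defined as the total energy of the adsorbate-slab system minus the energies of the clean slab and of a gas-phase reference for the adsorbate. The sketch below only illustrates that bookkeeping. It assumes all three energies are in eV and come from the same level of theory. The function name and the numbers are hypothetical placeholders, not results for any real system.

```python
def adsorption_energy(e_slab_ads: float, e_slab: float, e_gas_ref: float) -> float:
    """Return E_ads = E(slab + adsorbate) - E(slab) - E(gas reference), in eV."""
    return e_slab_ads - e_slab - e_gas_ref


# Placeholder total energies in eV, chosen only to illustrate the arithmetic.
e_ads = adsorption_energy(e_slab_ads=-105.0, e_slab=-100.0, e_gas_ref=-4.0)
print(f"E_ads = {e_ads:.2f} eV")
```

With this sign convention, a more negative value means stronger binding to the surface.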

@mshuaibii
Collaborator

mshuaibii commented Mar 24, 2025

Hi -

Yes, you are definitely on the right track. I would suggest checking out this tutorial for more concrete steps on how to use your dataset to fine-tune existing models for your specific application - https://fair-chem.github.io/core/fine-tuning/fine-tuning-oxides.html.

I would not be surprised if the models worked out of the box without any additional training, so maybe start there first and see how they do! We did something similar in this work - https://arxiv.org/abs/2405.02078.

@mshuaibii mshuaibii self-assigned this Mar 24, 2025
@brunosamp-usp
Author

brunosamp-usp commented Mar 25, 2025

Thank you @mshuaibii! I am running a training job with the SchNet model on S2EF. I don't have a GPU, so I am running it on the CPU. Is it normal for the energy MAE to be this slow to decrease during training? It has been oscillating between 8.00e+00 and 1.00e+01 for about 5 days.

2025-03-25 09:52:14 /home/brunoss/programs/fairchem/src/fairchem/core/trainers/ocp_trainer.py:194: (INFO): energy_mae: 8.35e+00, forces_mae: 9.22e-02, forces_cosine_similarity: 8.37e-02, forces_magnitude_error: 1.42e-01, energy_forces_within_threshold: 0.00e+00, loss: 9.47e+00, lr: 8.59e-05, epoch: 3.09e-01, step: 2.58e+04

2025-03-25 09:52:14 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:52:14 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:52:26 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:52:27 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:52:38 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:52:38 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:52:46 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:52:46 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:52:58 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:52:58 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:53:08 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
2025-03-25 09:53:08 /home/brunoss/programs/fairchem/src/fairchem/core/datasets/lmdb_dataset.py:187: (WARNING): LMDB does not contain edge index information, set otf_graph=True
2025-03-25 09:53:21 /home/brunoss/programs/fairchem/src/fairchem/core/models/base.py:80: (WARNING): Turning otf_graph=True as required attributes not present in data object
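
To follow a metric over time rather than reading individual lines, the INFO lines can be parsed into numbers. This is a minimal sketch that assumes the `name: value` layout seen in the INFO line above. `parse_metrics` and `METRIC_RE` are hypothetical helper names, not part of fairchem. Note that the line above reports `epoch: 3.09e-01`, so the run had not yet completed one full pass over the data.

```python
import re

# Matches "name: value" pairs such as "energy_mae: 8.35e+00".
METRIC_RE = re.compile(r"(\w+): ([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_metrics(line: str) -> dict[str, float]:
    """Extract metric name/value pairs from one trainer INFO line."""
    # Keep only the text after the "(INFO):" tag, skipping the timestamp and path.
    _, tag, tail = line.partition("(INFO):")
    if not tag:
        return {}  # WARNING lines and other non-metric lines are ignored
    return {name: float(value) for name, value in METRIC_RE.findall(tail)}


line = (
    "2025-03-25 09:52:14 /home/brunoss/programs/fairchem/src/fairchem/core/"
    "trainers/ocp_trainer.py:194: (INFO): energy_mae: 8.35e+00, "
    "forces_mae: 9.22e-02, forces_cosine_similarity: 8.37e-02, "
    "forces_magnitude_error: 1.42e-01, energy_forces_within_threshold: 0.00e+00, "
    "loss: 9.47e+00, lr: 8.59e-05, epoch: 3.09e-01, step: 2.58e+04"
)
metrics = parse_metrics(line)
print(metrics["energy_mae"], metrics["epoch"], metrics["step"])
```

Collecting `energy_mae` against `step` for every INFO line in the log makes it easier to tell a slow downward trend from a real plateau.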

@mshuaibii
Collaborator

What energy value are you training on here, and can you share an example config? SchNet is a very old model that doesn't perform very well; I recommend you try at least EquiformerV2 or GemNet-OC. You can find sample configs for these models here - https://github.com/FAIR-Chem/fairchem/tree/main/configs/oc20/s2ef/2M.

@brunosamp-usp
Author

@mshuaibii Ah, okay, thank you! I want to train my model to predict adsorption energies with good accuracy, but I was not sure which dataset I should use for this. Do you think equiformer_v2 would be good for this?

@brunosamp-usp
Author

@mshuaibii To explain exactly what I want to do: I want to train a model to predict adsorption energies. After training, I want to use the trained model to predict the adsorption energies of some systems related to my PhD project. So my idea was to use the EquiformerV2 config https://github.com/FAIR-Chem/fairchem/blob/main/configs/oc20/s2ef/2M/equiformer_v2/equiformer_v2_N%4012_L%406_M%402.yml
and train it on the S2EF 2M split.

@mshuaibii
Collaborator

Got it - in that case you don't need to train it from scratch on your end. If you are interested in adsorption energies, I would advise starting with the pretrained models we have available (trained on the S2EF All split). This example shows you how to do so - https://github.com/FAIR-Chem/fairchem?tab=readme-ov-file#quick-start.

This is the quickest way to get adsorption energies for your project.

@brunosamp-usp
Author

brunosamp-usp commented Mar 25, 2025

@mshuaibii Ah, okay, I see. The next step in my project was to take the results from the model (e.g., the pre-trained one) and compare them to the experimental values. Depending on the error, we wanted to retrain/fine-tune the model to improve the accuracy of the adsorption energies. So I could just use the pre-trained model and fine-tune it using this tutorial https://fair-chem.github.io/core/fine-tuning/fine-tuning-oxides.html ? Thank you.
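
One way to make "depending on the error" concrete is to compute the mean absolute error between predicted and reference adsorption energies over the intermediates and compare it with a chosen tolerance. This is only a sketch under stated assumptions. The intermediate labels, the energies, and the 0.1 eV tolerance are hypothetical placeholders, not measured or predicted values.

```python
def mean_absolute_error(predicted: dict[str, float], reference: dict[str, float]) -> float:
    """Mean absolute error (eV) over the intermediates present in both dicts."""
    shared = sorted(predicted.keys() & reference.keys())
    if not shared:
        raise ValueError("no intermediates in common")
    return sum(abs(predicted[k] - reference[k]) for k in shared) / len(shared)


# Placeholder adsorption energies in eV, for illustration only.
predicted = {"intermediate_1": -1.5, "intermediate_2": 0.5, "intermediate_3": -0.25}
reference = {"intermediate_1": -1.75, "intermediate_2": 0.75, "intermediate_3": -0.5}

TOLERANCE_EV = 0.1  # hypothetical acceptance threshold, set per project
mae = mean_absolute_error(predicted, reference)
print(f"MAE = {mae:.2f} eV; consider fine-tuning: {mae > TOLERANCE_EV}")
```

If the error is already within the tolerance, the pre-trained model can be used as is. Otherwise the same comparison can be repeated after fine-tuning to check whether it helped.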

@mshuaibii
Collaborator

Yup, exactly! You can use that model as the starting point and finetune that.

@brunosamp-usp
Author

@mshuaibii Thank you for your patience, it's now much clearer to me. Best regards!
