Got low accuracy when replicating the experiments on ogbn-mag dataset (due to torch-geometric 2.0.0). Fixed by downgrading to torch-geometric 1.5.0 #40
Comments
The log is very weird. I guess it's probably due to some pyg update. I'll take a look at it later. In the meantime, could you try our reported pyg version? |
It is 2.0.0. I checked by running pip install to see the version, since I had already installed pyg. |
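For reference, the installed versions can also be read without re-running pip install. The following is a minimal sketch using only the Python standard library (Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution names listed are assumed to be the ones used on PyPI.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your environment.
    for name in ("torch", "torch-geometric", "torch-scatter", "torch-sparse"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Printing all four versions together makes it easier to see whether torch-scatter and torch-sparse match the installed PyTorch build.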
The strange thing is that when we ran the experiment on the OAG dataset with your original configuration, we got almost the same accuracy as reported in the paper (around 51% in MRR). |
I can try that version (pytorch_geometric 1.3.2). However, I think the code should work with the new version of pytorch-geometric, since I would need to downgrade torch-scatter and torch-sparse, and they might not get along well with PyTorch 1.8.1 (it's impossible to install PyTorch 1.3 on the latest RTX 3090). |
I downgraded to torch-geometric 1.3.2 and got an error during preprocessing, and the training file fails with an error as well. |
I checked and saw that the latest update to ogbn-mag was 7 months ago, when torch-geometric 1.6.3 was the latest version. I tried to run preprocess_.py but it still has the same error as with 1.3. However, train_.py is working now. I will send you the results when they are available. |
With torch-geometric 1.6.3, the accuracy is also very low in my training.
|
It seems that I got the code working with much better results using torch-geometric 1.5.0 (released in May 2020). I also needed to make some small fixes to your code. For example, the preprocess.py file was missing the declaration of Evaluator, and node_year_dict should be updated to node_year. Here is what I got. Does that seem normal?
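The two fixes described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual patch: the `Evaluator` import path is the one exposed by the `ogb` package and is left commented out so the snippet runs without it, and `get_node_year` is a hypothetical helper that accepts either attribute name instead of hard-coding one.

```python
from types import SimpleNamespace

# Assumed fix for the missing declaration in preprocess.py (requires the ogb package):
# from ogb.nodeproppred import Evaluator


def get_node_year(data):
    """Return the node-year mapping stored under either attribute name."""
    for attr in ("node_year", "node_year_dict"):
        value = getattr(data, attr, None)
        if value is not None:
            return value
    raise AttributeError("data has neither 'node_year' nor 'node_year_dict'")


# Stand-in objects used in place of a real ogbn-mag data object.
data_a = SimpleNamespace(node_year={"paper": [2015, 2018]})
data_b = SimpleNamespace(node_year_dict={"paper": [2015, 2018]})
print(get_node_year(data_a) == get_node_year(data_b))
```

A lookup like this avoids tying the preprocessing script to one particular library version, at the cost of hiding which attribute name was actually found.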
|
Hi:
Thanks for figuring out the problem. Do you happen to know which part made
your previous experiment fail?
…On Fri, Sep 17, 2021, 21:27 Hung PHAN ***@***.***> wrote:
The log is very weird. I guess it's probably due to some pyg update. I'll take a look at it later. In the meantime, could you try our reported pyg version?
It seems that I got the code working with much better results using torch-geometric 1.5.0 (released in May 2020). I also needed to make some small fixes to your code. For example, the preprocess.py file was missing the declaration of Evaluator, and node_year_dict should be updated to node_year.
Here is what I got. Does that seem normal?
Epoch: 51 LR: 0.00026 Train Loss: 1.4745 Train Acc: 0.5646 Valid Acc: 0.4391 Test Acc: 0.4235
Data Preparation: 22.6s
Epoch: 52 LR: 0.00025 Train Loss: 1.4691 Train Acc: 0.5634 Valid Acc: 0.4395 Test Acc: 0.4405
Data Preparation: 21.9s
Epoch: 53 LR: 0.00025 Train Loss: 1.4615 Train Acc: 0.5677 Valid Acc: 0.4418 Test Acc: 0.4142
Data Preparation: 21.6s
Epoch: 54 LR: 0.00024 Train Loss: 1.4544 Train Acc: 0.5687 Valid Acc: 0.4288 Test Acc: 0.4125
Data Preparation: 21.4s
Epoch: 55 LR: 0.00024 Train Loss: 1.4552 Train Acc: 0.5692 Valid Acc: 0.4374 Test Acc: 0.4250
Data Preparation: 21.2s
UPDATE!!! 0.4516340750319941
|
Let me check a little bit to see which part caused the problem. I got a final test accuracy of 47% (I guess there are still some bugs, since the reported accuracy is around 50%). I think the ogbn-mag code in the repository currently works with torch-geometric 1.5.0, but not with the latest 2.0.0 or with the reported 1.3.2. |
Hi
|
Hi: I just noticed that the PyG group re-implemented the HGT model using their updated API. Have you tried it? |
No, I haven't, since I downgraded my torch-geometric to the version your code uses. I will try it after I finish my experiments on the current torch-geometric. |
They implemented the hgt_loader, which should be the same as my
implementation.
…On Fri, Sep 24, 2021, 04:01 Hung PHAN ***@***.***> wrote:
No, I haven't, since I downgraded my torch-geometric to the version your code uses. I
will try it after I finish my experiments on the current torch-geometric.
It seems that the new HGT model doesn't include the graph sampling code
that your version has. Is that correct?
|
I see |
I will check and notify you |
Any updates? |
Sorry for the late reply. I am able to run both this version (torch-geometric 1.5) and the newer version by Microsoft (torch-geometric 2.0) on the ogbn-mag dataset. The accuracies of the two versions are quite similar, at around 50%. |
Hello
My friend and I tried to replicate the experiment on the ogbn-mag dataset. We haven't changed anything in the configuration compared to the original code. You can see the files I ran here:
https://github.com/pdhung3012/pyHGT/blob/master/ogbn-mag/preprocess_ogbn_mag_RTX3090.py
https://github.com/pdhung3012/pyHGT/blob/master/ogbn-mag/train_ogbn_mag_RTX3090.py
https://github.com/pdhung3012/pyHGT/blob/master/ogbn-mag/eval_ogbn_mag_RTX3090.py
However, we couldn't reach an accuracy of around 0.5, as in your experiment using a Tesla graphics card. Here are the configurations of my computer and my friend's:
Python 3.8
NVIDIA RTX 3090 (my friend used a Titan Xp)
PyTorch 1.8.0
CUDA 11.1
Configuration:
/home/hungphd/anaconda3/envs/py38/bin/python /home/hungphd/git/pyHGT/ogbn-mag/eval_ogbn_mag_RTX3090.py --prev_norm --last_norm --use_RTE
+--------------+-----------------------+
| Parameter    | Value                 |
+--------------+-----------------------+
| data_dir     | dataset_v1/OGB_MAG.pk |
+--------------+-----------------------+
| model_dir    | ./hgt_4layer          |
+--------------+-----------------------+
| task_type    | variance_reduce       |
+--------------+-----------------------+
| vr_num       | 8                     |
+--------------+-----------------------+
| n_pool       | 8                     |
+--------------+-----------------------+
| n_batch      | 32                    |
+--------------+-----------------------+
| batch_size   | 128                   |
+--------------+-----------------------+
| conv_name    | hgt                   |
+--------------+-----------------------+
| n_hid        | 512                   |
+--------------+-----------------------+
| n_heads      | 8                     |
+--------------+-----------------------+
| n_layers     | 4                     |
+--------------+-----------------------+
| cuda         | 0                     |
+--------------+-----------------------+
| dropout      | 0.200                 |
+--------------+-----------------------+
| sample_depth | 6                     |
+--------------+-----------------------+
| sample_width | 520                   |
+--------------+-----------------------+
| prev_norm    | 1                     |
+--------------+-----------------------+
| last_norm    | 1                     |
+--------------+-----------------------+
| use_RTE      | 1                     |
+--------------+-----------------------+
We both achieved quite low accuracy:
Model #Params: 21173389
eval: 100%|██████████| 328/328 [1:07:18<00:00, 12.31s/it, accuracy=0.002]
0.0021459739144948616
Here is what we see when training:
Epoch: 93 LR: 0.00004 Train Loss: 4.2447 Train Acc: 0.1353 Valid Acc: 0.0956 Test Acc: 0.0021
Data Preparation: 17.0s
Epoch: 94 LR: 0.00003 Train Loss: 4.2646 Train Acc: 0.1341 Valid Acc: 0.0928 Test Acc: 0.0021
Data Preparation: 16.2s
Epoch: 95 LR: 0.00003 Train Loss: 4.2497 Train Acc: 0.1360 Valid Acc: 0.0945 Test Acc: 0.0017
Data Preparation: 12.8s
Epoch: 96 LR: 0.00002 Train Loss: 4.2595 Train Acc: 0.1328 Valid Acc: 0.0989 Test Acc: 0.0021
Data Preparation: 13.3s
Epoch: 97 LR: 0.00002 Train Loss: 4.2609 Train Acc: 0.1349 Valid Acc: 0.1002 Test Acc: 0.0016
Data Preparation: 13.3s
Epoch: 98 LR: 0.00001 Train Loss: 4.2525 Train Acc: 0.1352 Valid Acc: 0.0954 Test Acc: 0.0012
Data Preparation: 14.1s
Epoch: 99 LR: 0.00001 Train Loss: 4.2530 Train Acc: 0.1346 Valid Acc: 0.0967 Test Acc: 0.0021
Data Preparation: 12.8s
Epoch: 100 LR: 0.00000 Train Loss: 4.2650 Train Acc: 0.1340 Valid Acc: 0.0981 Test Acc: 0.0019
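A log in this format can be summarized with a short script, which makes the gap between validation and test accuracy easier to see. The sketch below is only an illustration: `parse_log` and `FIELDS` are hypothetical names, the sample text is copied from the last two epochs above, and the class count of 349 is an assumption taken from the public ogbn-mag description, used here to estimate chance-level accuracy (about 0.0029).

```python
import re

# One training-log entry; \s+ between tokens tolerates wrapped or flattened logs.
EPOCH_RE = re.compile(
    r"Epoch:\s*(\d+)\s+LR:\s*([\d.]+)\s+Train\s+Loss:\s*([\d.]+)\s+"
    r"Train\s+Acc:\s*([\d.]+)\s+Valid\s+Acc:\s*([\d.]+)\s+Test\s+Acc:\s*([\d.]+)"
)
FIELDS = ("epoch", "lr", "train_loss", "train_acc", "valid_acc", "test_acc")


def parse_log(text):
    """Extract one dict of metrics per 'Epoch: ...' entry found in the log text."""
    rows = []
    for match in EPOCH_RE.finditer(text):
        row = dict(zip(FIELDS, (float(v) for v in match.groups())))
        row["epoch"] = int(row["epoch"])
        rows.append(row)
    return rows


NUM_CLASSES = 349  # assumption: number of venue classes in ogbn-mag
CHANCE_ACC = 1.0 / NUM_CLASSES

LOG = (
    "Epoch: 99 LR: 0.00001 Train Loss: 4.2530 Train Acc: 0.1346 "
    "Valid Acc: 0.0967 Test Acc: 0.0021 Data Preparation: 12.8s "
    "Epoch: 100 LR: 0.00000 Train Loss: 4.2650 Train Acc: 0.1340 "
    "Valid Acc: 0.0981 Test Acc: 0.0019"
)

rows = parse_log(LOG)
for row in rows:
    note = "test at or below chance" if row["test_acc"] <= CHANCE_ACC else ""
    print(row["epoch"], row["valid_acc"], row["test_acc"], note)
```

Under that assumption, the test accuracy sits at roughly chance level while the validation accuracy is well above it, which may suggest a problem specific to how the test split is handled rather than a model that fails to train at all.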
Is there a compatibility problem between the code and newer versions of PyTorch, CUDA, or the graphics card?
Sincerely