
Single-Task Results on CelebA Dataset #2

Closed
bhsimon0810 opened this issue Dec 15, 2023 · 13 comments
@bhsimon0810

Hi, could you provide the single-task results on the 40-task CelebA dataset? I am running experiments with an MTL method and want to compare $\Delta_m$ against FAMO and the other baselines. Unlike Cityscapes, NYUv2, and QM9, the single-task results for CelebA do not seem to be included in the code in this repo. Although I can run the single-task experiments myself, my results may differ slightly from those in your paper, which would make the $\Delta_m$ comparison unfair. I would greatly appreciate any details you could share about the single-task results on CelebA. Thanks in advance!

@Cranial-XIX
Owner

Hi, here are the single-task results for the 40 tasks:

[0.6736886  0.68121034 0.81524944 0.5760289  0.7205613  0.8555076
 0.38203922 0.58225113 0.787647   0.8321292  0.5029583  0.68694085
 0.6781237  0.5240381  0.5161666  0.95694304 0.6968786  0.67976356
 0.8808315  0.8582131  0.97034    0.93267566 0.5057539  0.40307626
 0.9703734  0.48644206 0.60786104 0.5261031  0.56907415 0.59815097
 0.6858371  0.924108   0.5424991  0.7406311  0.71019936 0.87365365
 0.9305602  0.33704284 0.7647628  0.91907   ]

Please let me know if you have any further questions :)

@bhsimon0810
Author

Thanks! That helps a lot!

@bhsimon0810
Author

Sorry to bother you again. Which epoch do you choose to compute the final $\Delta_m$? Did you use the results from the last epoch or from the best epoch (selected by the best validation accuracy averaged over the 40 tasks)?

@Cranial-XIX
Owner

I used the best epoch.
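For concreteness, selecting the evaluation epoch this way can be sketched as follows (a minimal example, not from the repo; the validation numbers are made up, and the array is assumed to be epochs × tasks):

```python
import numpy as np

# Hypothetical validation accuracies: rows = epochs, columns = tasks.
val_acc = np.array([
    [0.60, 0.70],
    [0.65, 0.72],
    [0.64, 0.71],
])

# Pick the epoch with the best validation accuracy averaged over tasks.
best_epoch = int(np.argmax(val_acc.mean(axis=1)))
print(best_epoch)  # epoch index 1 in this toy example
```

The test-set results from that epoch are then the ones compared against the single-task numbers.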

@bhsimon0810
Author

Thanks again!

@bhsimon0810
Author

Hi, sorry to bother you. Could you provide the per-task results on CelebA for the other 11 baselines and for FAMO, as listed in Table 3 of your paper? I am developing an MTL method and want to compare the Mean Rank (MR), but computing MR requires the per-task results, so I am reaching out to request these data. Thanks for your understanding.

@bhsimon0810 bhsimon0810 reopened this Jan 13, 2024
@zzzx1224

> Hi, here are the single-task results for the 40 tasks: [...]

Hi, sorry to bother you. I'm wondering whether these results are from single-task learning or from FAMO, because I reran the FAMO code on CelebA and got a very low delta compared with these numbers, around 0.15%. Thanks!

@Cranial-XIX
Owner

> I'm wondering whether these results are from single-task learning or from FAMO, because I reran the FAMO code on CelebA and got a very low delta compared with these numbers, around 0.15%.

Yes, they are the single-task learning results I got on my side. FAMO may get a better result on your side :) I am averaging over 3 seeds, so maybe you got lucky this time? Can you run another baseline like CAGrad or NashMTL to confirm?

@Cranial-XIX
Owner

Cranial-XIX commented Jan 17, 2024

> Could you provide the per-task results on CelebA of the other 11 baselines and your FAMO, as listed in Table 3 in your paper?

Here is the link to all results; you can load it with torch.load, which gives a dictionary with self-explanatory keys/values.
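As a side note, once per-task results for several methods are available, MR can be computed by ranking the methods within each task and averaging each method's ranks. A sketch (not from the repo; the numbers are made up, and ties are not handled):

```python
import numpy as np

# Toy per-task accuracies: rows = methods, columns = tasks (made-up numbers).
results = np.array([
    [0.90, 0.80, 0.70],  # method A
    [0.85, 0.82, 0.75],  # method B
    [0.88, 0.79, 0.72],  # method C
])

# Rank methods within each task (rank 1 = best accuracy); the double
# argsort turns a descending sort order into per-task ranks.
ranks = np.argsort(np.argsort(-results, axis=0), axis=0) + 1

# Mean Rank: average each method's rank across tasks.
mean_rank = ranks.mean(axis=1)
print(mean_rank)  # approximately [2.0, 1.67, 2.33] for these toy numbers
```

If ties are possible, an average-rank scheme (e.g. `scipy.stats.rankdata`) would be more appropriate than the plain double-argsort above.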

@zzzx1224

> Yes, they are the single-task learning results I got on my side. [...] Can you run another baseline like CAGrad or NashMTL to confirm?

Thanks a lot for the reply! I ran my evaluation again with the single-task learning results and (famo, 20000) from the results you shared, but still got a very low delta of around 1.75. Here is my evaluation function, based on the equations in the paper.

def calculate_delta(famo, stl):
    # Delta_m: per-task relative drop vs. single-task learning, averaged (%).
    total = 0.0
    for i in range(famo.shape[0]):
        total += -1 * (famo[i] - stl[i]) / stl[i] * 100

    return total / famo.shape[0]

I think I might have made a mistake in the evaluation function. Could you share yours? I would much appreciate it.
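For reference, the loop above can also be written in vectorized NumPy form (a sketch under the assumption that every task's metric is an accuracy, where higher is better; on CelebA all 40 tasks are binary classification, so this assumption holds):

```python
import numpy as np

def calculate_delta_m(method_acc, stl_acc):
    """Average relative performance drop vs. single-task learning, in percent.

    Positive values mean the multi-task method is worse than STL on average;
    assumes higher is better for every task.
    """
    method_acc = np.asarray(method_acc, dtype=float)
    stl_acc = np.asarray(stl_acc, dtype=float)
    return float(np.mean(-(method_acc - stl_acc) / stl_acc) * 100)

print(calculate_delta_m([0.90, 0.80], [1.00, 0.80]))  # 5.0: a 10% drop on one task, 0% on the other
```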

@bhsimon0810
Author

> Here is the link to all results, you can use torch.load to load it [...]

Thanks very much! Appreciate your help!

@zzzx1224

> I ran my evaluation again with the single-task learning results and (famo, 20000) from the results you shared, but still got a very low delta, around 1.75. [...] Could you share your evaluation function?

Thanks for your time; I think I have solved the problem. There is a typo in the paper: the numbers in the "MR" and "$\Delta_m$" columns are swapped in the CelebA table. The calculated value of 1.75% is correct.

@Cranial-XIX
Owner

Closing the issue :)
