Regarding using compute_metrics() with HuggingFace Trainer #4220
Unanswered
anmolagarwal999
asked this question in
Q&A
Replies: 0 comments
I am using DeepSpeed ZeRO Stage 3 and passing a custom compute_metrics() to the Trainer. I have 4 GPUs (devices). The compute_metrics function is invoked on every device. Moreover, all the data points in the eval dataset (say there are N of them) seem to be sent to compute_metrics on every device, which seems redundant and inefficient. Am I missing something here? My expectation was that either (1) compute_metrics would be called only once, or (2) the evaluation dataset would be distributed across the compute_metrics calls.
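A minimal sketch for reproducing the observation, not the asker's actual code. It assumes the launcher exports a `RANK` environment variable (falling back to 0 if it does not), and it uses a hypothetical `EvalPred` namedtuple as a stand-in for the object the Trainer passes in, so the snippet runs without the library installed. The function reports which process called it and how many examples it received:

```python
import os
from collections import namedtuple

# Hypothetical stand-in for the (predictions, label_ids) object the Trainer
# passes to compute_metrics; used here only so the sketch is self-contained.
EvalPred = namedtuple("EvalPred", ["predictions", "label_ids"])


def compute_metrics(eval_pred):
    """Return accuracy and report the calling rank and the example count."""
    # Assumption: the distributed launcher sets RANK; default to 0 otherwise.
    rank = int(os.environ.get("RANK", "0"))
    logits, labels = eval_pred.predictions, eval_pred.label_ids

    # Arg-max over each row of logits, written without numpy on purpose.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    n = len(labels)

    # If every rank prints the same n (the full eval set size N), each
    # process received the whole dataset rather than its own shard.
    print(f"[rank {rank}] compute_metrics received {n} examples")
    return {"accuracy": correct / n if n else 0.0, "num_examples": n}


if __name__ == "__main__":
    demo = EvalPred(
        predictions=[[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]],
        label_ids=[1, 0, 0],
    )
    print(compute_metrics(demo))
```

Passing a function like this as the compute_metrics argument and comparing the per-rank log lines against N would show whether each device sees the full eval set or only a part of it.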