Gradient accumulation not properly implemented #53

clemsgrs · 2023-07-14T08:17:11Z

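Hi, gradient accumulation does not seem to be properly implemented. In the lines below, the loss is divided by gc, but optimizer.step() and optimizer.zero_grad() are called on every batch, so gradients are never accumulated across batches: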

HIPT/2-Weakly-Supervised-Subtyping/utils/core_utils.py

Lines 285 to 290 in a9b5bb8

    
loss = loss / gc
loss.backward()
# step
optimizer.step()
optimizer.zero_grad()

A proper implementation should look like the following:

loss = loss / gc
loss.backward()

if (batch_idx + 1) % gc == 0:
    optimizer.step()
    optimizer.zero_grad()
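To make the effect of the conditional step concrete, here is a dependency-free toy sketch (plain Python, no PyTorch) of the same control flow. Everything in it is an assumption made for illustration: the scalar model, the helper names `grad_mse` and `train`, and the `grad_buffer` variable standing in for the parameter's `.grad`. It is not code from the HIPT repository.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(batches, gc, lr=0.01, w=0.0):
    """Plain SGD with gradient accumulation over `gc` consecutive batches.

    Returns the final weight and the number of optimizer steps taken.
    """
    grad_buffer = 0.0  # hypothetical stand-in for param.grad
    steps = 0
    for batch_idx, (xs, ys) in enumerate(batches):
        # Mirrors: loss = loss / gc; loss.backward()
        grad_buffer += grad_mse(w, xs, ys) / gc
        if (batch_idx + 1) % gc == 0:
            w -= lr * grad_buffer  # mirrors optimizer.step()
            grad_buffer = 0.0      # mirrors optimizer.zero_grad()
            steps += 1
    return w, steps
```

With equally sized batches, running `train` with `gc=2` on four batches should give the same weight as running it with `gc=1` on the same data merged into two batches of twice the size, which is the behaviour gradient accumulation is meant to reproduce. Note that in this sketch, as in the snippet above, batches left over when the number of batches is not a multiple of gc never trigger a step.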

vildesboe · 2023-12-12T10:32:39Z

Hi! I'm also working on reproducing this HIPT paper. Would you be interested in some discussion?

clemsgrs · 2023-12-12T13:40:49Z

Sure, happy to chat.
I’ve made my own version of the code here: https://github.com/clemsgrs/hipt

You can contact me at: clement (dot) grisi (at) radboudumc (dot) nl
