Running the model using comparison mode turns up an issue with to_layout:
models/experimental/functional_unet/tt/unet_shallow_ttnn.py:518: in __call__
x = self.upblock1(x, c4_residual)
models/experimental/functional_unet/tt/unet_shallow_ttnn.py:304: in __call__
residual = ttnn.to_layout(residual, ttnn.ROW_MAJOR_LAYOUT)
RuntimeError: ttnn.to_layout: Comparing output tensor 0 against CPU locally failed: pcc is 0.9840820378995527 but should be >=0.9990000128746033
This may or may not be related to the upblock3 failure; it needs further investigation.
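For readers unfamiliar with the check in the error above, PCC here is the Pearson correlation coefficient between the device output and a CPU reference, compared against a threshold (roughly 0.999 in the message). The following is a minimal pure-Python sketch of that kind of check, not the ttnn implementation; the function names `pcc` and `passes_pcc` are hypothetical, and the handling of constant inputs is an assumption made for illustration.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient of two equal-length flat sequences."""
    if len(expected) != len(actual) or len(expected) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        # Correlation is undefined for constant data. Assumption for this
        # sketch: identical inputs count as a match, anything else does not.
        return 1.0 if list(expected) == list(actual) else 0.0
    return cov / math.sqrt(var_e * var_a)


def passes_pcc(expected, actual, threshold=0.999):
    """Return True when the correlation meets the required threshold."""
    return pcc(expected, actual) >= threshold
```

Under this definition, the reported value of 0.9840820378995527 falls below the 0.999 threshold, which is why the `to_layout` comparison raised.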
Summary
Since 282a7b2, a temporary workaround has been required to get the model working correctly:
tt-metal/models/experimental/functional_unet/tt/unet_shallow_ttnn.py
Line 465 in 7866642
The temporary workaround was to move the upblock3 inputs to interleaved memory, which causes a small performance regression. The larger concern is that this only masks an underlying problem, which will likely resurface when any further changes are made to the model.
Without the workaround, the model first shows bad PCC on the upblock3 ttnn.concat. Interestingly, each board reset seems to change the PCC of this failing op, which might indicate some kind of out-of-bounds memory access. Enabling TT_METAL_CLEAR_L1=1 makes it consistent.
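One way to apply the TT_METAL_CLEAR_L1=1 setting reproducibly is to launch the test in a child process with the variable set, so that it is in place before the runtime starts. The sketch below only shows that mechanism; the helper names `env_with_clear_l1` and `run_with_clear_l1` are hypothetical, and the command to run is left to the caller.

```python
import os
import subprocess


def env_with_clear_l1(base_env=None):
    """Return a copy of the environment with TT_METAL_CLEAR_L1 set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["TT_METAL_CLEAR_L1"] = "1"
    return env


def run_with_clear_l1(command):
    """Run `command` (a list of arguments) with the flag set; return its exit code."""
    completed = subprocess.run(command, env=env_with_clear_l1(), check=False)
    return completed.returncode
```

For example, `run_with_clear_l1(["pytest", test_path])` would run a test with the flag enabled, where `test_path` is whichever UNet test reproduces the failure. Running it across several board resets should then show whether the failing PCC stays constant, as described above.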