
The mathematical trick in self-attention, why it returns false for torch.allclose(xbow, xbow2)? #43

Open
Ryan-ZL-Lin opened this issue Feb 28, 2024 · 2 comments

Comments

@Ryan-ZL-Lin
Hi,
I noticed that torch.allclose(xbow, xbow2) and torch.allclose(xbow, xbow3) both return False when running the Colab example gpt-dev.ipynb, in the "The mathematical trick in self-attention" section. Here is what I got. Has anyone encountered the same issue?

[screenshot of the False results omitted]
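The two tensors being compared come from mathematically equivalent computations: a loop that averages x[:t+1] and a matrix multiply with a normalized lower-triangular weight matrix. The following is a minimal pure-Python sketch of that logic for a single made-up channel (the notebook itself uses PyTorch tensors; the values here are illustrative). Both versions agree to within normal floating-point tolerance, but not necessarily bit-for-bit:

```python
import math

T = 4
x = [0.5, -1.2, 0.0020, 0.75]  # one channel of one batch element (made-up values)

# Version 1: explicit loop -- xbow[t] is the mean of x[:t+1]
xbow = [sum(x[:t + 1]) / (t + 1) for t in range(T)]

# Version 2: lower-triangular weight matrix with rows summing to 1,
# applied as a matrix-vector product (the "mathematical trick")
wei = [[(1.0 / (t + 1)) if j <= t else 0.0 for j in range(T)] for t in range(T)]
xbow2 = [sum(wei[t][j] * x[j] for j in range(T)) for t in range(T)]

# Mathematically identical, but the floating-point operations happen in a
# different order, so the last bits of the results can disagree
print(all(math.isclose(a, b, rel_tol=1e-6) for a, b in zip(xbow, xbow2)))  # → True
```

Because the two versions perform additions and multiplications in different orders, exact equality (and even allclose at a very tight tolerance) is not guaranteed.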


0xArwa commented Mar 24, 2024

@Ryan-ZL-Lin You can adjust the relative tolerance for a less strict comparison; the default value is 1e-05 in PyTorch 2.2.

This snippet will output True:

torch.allclose(xbow, xbow2, rtol=1e-04)  # default rtol is 1e-05
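To see why loosening rtol flips the result, here is the comparison torch.allclose documents, reimplemented for two scalars (the sample values are made up to mimic a small element like the 0.0020 mentioned below; for small magnitudes, rtol * |b| is tiny, so even a small absolute difference fails the default test):

```python
def allclose(a, b, rtol=1e-5, atol=1e-8):
    # torch.allclose's documented test: |a - b| <= atol + rtol * |b|
    return abs(a - b) <= atol + rtol * abs(b)

a, b = 0.0020002, 0.0020          # differ by 2e-7
print(allclose(a, b))             # → False (threshold is 1e-8 + 1e-5 * 0.002 = 3e-8)
print(allclose(a, b, rtol=1e-4))  # → True  (threshold is 1e-8 + 1e-4 * 0.002 = 2.1e-7)
```

Raising rtol hides the discrepancy rather than removing it, which is fine here since the difference is pure floating-point noise.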

@yyinsomnia

I also looked into this in detail and found that very small values cause the failure: the element at [1, 5, 1], which is 0.0020, makes allclose return False.
This is interesting. I remember running this code in 2021 without this problem, but it shows up now. I also ran Andrej's original notebook and it is False as well.
So most likely it is caused by the Python and torch version upgrades?
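One plausible mechanism for a version-dependent difference is that newer PyTorch builds may dispatch the matmul to a kernel that accumulates in a different order. The sketch below (pure Python, using struct to emulate float32 rounding; all values are made up) shows that two orderings of the same float32 reduction can produce results that agree only approximately:

```python
import struct

def f32(x):
    # round a Python float to the nearest float32 value
    return struct.unpack('f', struct.pack('f', x))[0]

xs = [0.1, 0.2, 0.3, 0.0020]

# Ordering 1: sum first, divide last (like an explicit loop mean)
s1 = 0.0
for v in xs:
    s1 = f32(s1 + f32(v))
mean_loop = f32(s1 / len(xs))

# Ordering 2: scale each term by 1/T, then sum (like a matmul with a
# normalized weight row)
w = f32(1.0 / len(xs))
s2 = 0.0
for v in xs:
    s2 = f32(s2 + f32(f32(v) * w))
mean_matmul = s2

# The two means agree to float32 precision but may differ in the last bits
print(mean_loop, mean_matmul)
```

Since the discrepancy lives in the last bits of float32, which kernel the library picks (and hence which allclose result you see) can change across versions and hardware.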
