🐛 Bug

I ran into correctness issues while using Dynamo + OpenXLA. After investigation, I found that Torch-XLA can generate the same graph hash for two different graphs. The two graphs have the same inputs, the same outputs, and the same set of operators, but corresponding operators take their operands from different inputs. Torch-XLA therefore treats them as identical computation graphs, which leads to incorrect results.
To Reproduce
I simplified the computation graph I encountered and constructed the following test example:
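A minimal sketch of such an example (not the exact original code), assuming the Dynamo `openxla` backend: `test1` and `test2` below run the same operators, but the second add draws one operand from a different input.

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()

def test1(a, b):
    # (a + b) + a: the outer add's second operand comes from input a
    return (a + b) + a

def test2(a, b):
    # (a + b) + b: same operators and shapes, but the second operand comes from input b
    return (a + b) + b

a = torch.randn(4, device=device)
b = torch.randn(4, device=device)

compiled1 = torch.compile(test1, backend="openxla")
compiled2 = torch.compile(test2, backend="openxla")

print(compiled1(a, b))  # compiles and caches the first graph
print(compiled2(a, b))  # incorrectly reuses the first graph's executable
```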
The above test1 and test2 produce the same hash, so Torch-XLA only compiles the first graph; the second graph reuses the first graph's compiled executable without recompilation, which leads to incorrect results.
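One way to confirm that only a single compilation occurred is to inspect the metrics report after running both functions (a minimal check; the relevant counter names, such as CachedCompile, may differ across torch_xla versions):

```python
import torch_xla.debug.metrics as met

# After running test1 and test2, the report should show a single compilation
# being reused rather than two separate compilations.
print(met.metrics_report())
```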
Additional context
The root cause is that the current Torch-XLA hash calculation does not take into account where each operator's operands come from within the computation graph. I believe the fix is to adjust the hash computation that runs after the PostOrder traversal of the graph has been obtained. One possible approach is to walk the PostOrder sequence and include each operand's index order in the hash, as sketched below. I'm not sure whether there are better methods.
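As a rough illustration of that idea (a standalone sketch, not Torch-XLA's actual hashing code), the hash below walks a post-order traversal and, for each operator, mixes in the post-order index of every operand, so two graphs that differ only in which producer feeds an operand no longer collide:

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                              # operator name, e.g. "input", "add"
    operands: list = field(default_factory=list)

def post_order(root):
    """Post-order traversal of the graph rooted at `root`."""
    seen, order = set(), []
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for operand in node.operands:
            visit(operand)
        order.append(node)
    visit(root)
    return order

def graph_hash(root):
    """Hash the graph, mixing in each operand's position in the post-order sequence."""
    order = post_order(root)
    index = {id(node): i for i, node in enumerate(order)}
    h = hashlib.sha256()
    for node in order:
        h.update(node.op.encode())
        for operand in node.operands:
            # Recording *which* node in the post-order sequence produces each
            # operand distinguishes graphs whose operators read from different inputs.
            h.update(index[id(operand)].to_bytes(4, "little"))
    return h.hexdigest()

# Two graphs with identical operators, differing only in operand sources:
a, b = Node("input"), Node("input")
g1 = Node("add", [Node("add", [a, b]), a])   # (a + b) + a
g2 = Node("add", [Node("add", [a, b]), b])   # (a + b) + b
print(graph_hash(g1) != graph_hash(g2))      # True: the hashes now differ
```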