Description
In metrics/eval.py, each dataset (e.g. X_gt, X_syn) is encoded separately, so a separate sklearn.preprocessing.LabelEncoder is fitted for each dataset. If the unique values of a column are not identical in X_gt and X_syn, the same integer code can refer to different categories in the two datasets, so an encoded value in X_gt no longer denotes the same category as in X_syn.
How to Reproduce
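A minimal sketch of the mismatch, not the original reproduction script: it uses plain `LabelEncoder` calls rather than the project's own `encode` method, and the column values (`col_gt`, `col_syn`) are hypothetical.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical values for a single categorical column.
col_gt = ["a", "b", "c"]
col_syn = ["b", "c", "c"]

# Encoding each dataset separately fits two unrelated mappings.
gt_separate = LabelEncoder().fit_transform(col_gt).tolist()
syn_separate = LabelEncoder().fit_transform(col_syn).tolist()

# Reusing the encoder fitted on the ground truth keeps the codes aligned.
shared = LabelEncoder().fit(col_gt)
syn_shared = shared.transform(col_syn).tolist()

print(gt_separate)   # [0, 1, 2]
print(syn_separate)  # [0, 1, 1]  ("b" is 0 here, but 1 in the ground truth)
print(syn_shared)    # [1, 2, 2]
```

With separate encoders, code 0 means "a" in the ground truth but "b" in the synthetic data, so any metric comparing the two encoded columns is comparing different categories.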
Expected Behavior
In the example above, the processed df_syn should be [1,2,2], i.e. encoded with the same mapping as the ground-truth data.
Fix
It seems we can simply return the fitted encoders when calling X_gt.encode() and pass them to all other encode calls.
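One way this could look, sketched with plain pandas and scikit-learn. The helpers `fit_encoders` and `apply_encoders` are hypothetical names for illustration and are not the project's actual `encode` signature; the data frames are made-up examples.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def fit_encoders(df):
    """Fit one LabelEncoder per non-numeric column (hypothetical helper)."""
    encoders = {}
    for col in df.columns:
        if not pd.api.types.is_numeric_dtype(df[col]):
            encoders[col] = LabelEncoder().fit(df[col])
    return encoders


def apply_encoders(df, encoders):
    """Encode df with already-fitted encoders (hypothetical helper)."""
    out = df.copy()
    for col, enc in encoders.items():
        out[col] = enc.transform(out[col])
    return out


# Made-up ground-truth and synthetic data.
df_gt = pd.DataFrame({"cat": ["a", "b", "c"], "num": [0.1, 0.2, 0.3]})
df_syn = pd.DataFrame({"cat": ["b", "c", "c"], "num": [0.4, 0.5, 0.6]})

# Fit once on the ground truth, then reuse for every other dataset.
encoders = fit_encoders(df_gt)
gt_encoded = apply_encoders(df_gt, encoders)
syn_encoded = apply_encoders(df_syn, encoders)

print(syn_encoded["cat"].tolist())  # [1, 2, 2]
```

Note that `LabelEncoder.transform` raises a `ValueError` for labels it did not see during fitting, so a synthetic dataset containing a category absent from the ground truth would fail with this approach. Fitting on the union of values across datasets is one possible alternative.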