-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Input names are silently modified, leading to inputs mismatch during inference #1153
Comments
Changing the default behaviour would probably hurt some existing code but we could add an option to change it? Would that work for you or feel free to suggest anything you would like to see implemented. |
I think it depends on what the Onnx team plans to do. If they just drop the requirement from the specification then it can safely be drop everywhere without breaking anything. In the meantime an option to just leave the names untouched would indeed solve my case! |
The |
I was not working on onnx when this statement was made, maybe it was for possible future developments. In practice, onnx, onnxruntime do not use this constraint. |
Yes they never actually enforced it, and now it has even been removed from the specification (from MUST to SHOULD). |
When converting models using
convert_sklearn
,initial_inputs
names are silently modified to complywith the
onnx
specification which asks for identifiers to follow the C90 identifier specification.As a result, the input names will not match when trying to perform inference even when using exactly the same input data.
Furthermore, this behaviour is quite hidden, and if we look at the following basic tutorial, the column
home.dest
is conveniently dropped.Trying to use it results in a renaming into
home_dest
, which makes theonnx
models fail at inference because the input data doesn't match anymore.This is a really treacherous behavior, and in fact it is probably not even necessary as
onnx
doesn't even enforce their own specification.It is also the kind of behaviour users might be tempted to overcome by being "smart", but in fact most smart solutions are wrong, especially when taking into account the possibility that a renamed identifier might collide with an existing one.
By the way looking at the code (and testing it) the line that turn input names into C-style identifiers is not correct:
re.sub("[^\\w+]", "_", seed)
let some special characters such as "+" unchanged.The text was updated successfully, but these errors were encountered: