-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Use dtype=int for the indices returned by TopK #11031
Comments
This could break existing code. Sometimes the resulting array of indices is subject to another operation. Then the result can become different if the dtype of the index array changes. Or it can make an error as some opertions can be applied to only floating point arrays. nd.topk(nd.arange(10), k=2).mean() Currently the result is nd.linalg.gemm2(nd.topk(x, k=2), y) Currently this is allowed, but it'll cause an error if the dtype changes to integer.
|
Yes, this would be backward incompatible (So I create the issue to discuss how we should proceed). Using int32 will avoid bugs in the future when users run |
A few general remarks about indices in mxnet: There is no specific convention in MXNet about what datatype to be used when encoding indices. It is basically left to the user or inferred from other data. This is generally fine and consistent. For example look at all operators that are indeed consumers of index arrays (pick, take, gather_nd): The situation is slightly different for operators that actually produce index arrays (without having any index array as input). Such operators are topk, argsort, argmax, argmin, argmax_channel (hope I didn't forget one here). Currently they infer the index type (used in the output array of the operator) from the datatype used in the input. But that is weird as there is no objective correlation between data type and index type. If data input is float, why should be then the index type also a float? Moreover it can cause weird problems: If you want to get the index of the maximum element of a 10M large array of floats, then the float-type used as output would not be able to represent all potential indices (while an int would be able). To make such thing work, you currently would have to cast the entire data input to doubles, then receive indices as a double, then cast back to int. And even that may not work as the operator may not support doubles (which is the case for topk). So my proposal would be to make a change for all index-producing operators (listed above) to allow explicit setting of the output index type by a parameter. We may decide to keep backward compatibility by making the default behaviour match the current behaviour or choose int as the default and accepting non-backward compatibility. But we should do something;-) |
Sounds good. We can add a argument called ‘idx_dtype’, which is set to be backward-compatible by default.
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: moin <[email protected]>
Sent: Tuesday, May 29, 2018 8:08:21 PM
To: apache/incubator-mxnet
Cc: Xingjian SHI; Author
Subject: Re: [apache/incubator-mxnet] Use dtype=int for the indices returned by TopK (#11031)
A few general remarks about indices in mxnet:
There is no specific convention in MXNet about what datatype to be used when encoding indices. It is basically left to the user or inferred from other data. This is generally fine and consistent. For example look at all operators that are indeed consumers of index arrays (pick, take, gather_nd):
They are all written in a way such that the datatype of the indices can be anything, as long as it can be cast to an int internally. In particular, the datatype does not have to be compliant with other inputs to such operators (i.e. take() will accept the data array to be of type double and the index array of type int/float/half_t/whatever). I think this is a really good design and we should not impose any restrictions that the incoming indices must be of integral type. Neither that every index should be integral.
The situation is slightly different for operators that actually produce index arrays (without having any index array as input). Such operators are topk, argsort, argmax, argmin, argmax_channel (hope I didn't forget one here). Currently they infer the index type (used in the output array of the operator) from the datatype used in the input. But that is weird as there is no objective correlation between data type and index type. If data input is float, why should be then the index type also a float? Moreover it can cause weird problems: If you want to get the index of the maximum element of a 10M large array of floats, then the float-type used as output would not be able to represent all potential indices (while an int would be able). To make such thing work, you currently would have to cast the entire data input to doubles, then receive indices as a double, then cast back to int. And even that may not work as the operator may not support doubles (which is the case for topk).
So my proposal would be to make a change for all index-producing operators (listed above) to allow explicit setting of the output index type by a parameter. In order to keep backward compatibility, the default should stay as is (inferred from data etc).
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#11031 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AE8D7k1F_IUwfgiNxu41J0qBAAdm107Wks5t3To1gaJpZM4UJ-Oc>.
|
I think that |
Yes, dtype should be better. I can try to add the dtype option for topk first. One issue is that the compiling speed would be slower if we use a nested macro to consider all the combinations.
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: Deokjae Lee <[email protected]>
Sent: Tuesday, May 29, 2018 10:33:26 PM
To: apache/incubator-mxnet
Cc: Xingjian SHI; Author
Subject: Re: [apache/incubator-mxnet] Use dtype=int for the indices returned by TopK (#11031)
I think that random.multinomial also need to be in the list. I'm positive on the introducion of the new name idx_dtype, but using the old dtype also looks not bad for the consistency, as the operators in random package use dtype to specify the data type of the output array. We need to set the backward compatible default currently, but I agree on changing it as integer in 2.0. Anyway an index is an integer by its nature.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#11031 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AE8D7venndZZjQAn9yfZMgWOs0XeROLLks5t3Vw2gaJpZM4UJ-Oc>.
|
I think we should make it consistent from the beginning, i.e. add all affected operators in one shot. |
The dtype of
ret_indices
in "topk" is currently real_t. Should we revise it toint
to avoid possible loss of precision? (https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/ordering_op-inl.h#L432)The text was updated successfully, but these errors were encountered: