Mini Julep: skipmissing indexing #30606
Labels
julep: Julia Enhancement Proposal
missing data: Base.missing and related functionality
needs decision: A decision on this change is needed
This mini Julep aims to address issues that currently block progress on two essential use cases of `skipmissing`. The first is how to compute reductions over dimensions of an array while skipping missing values (#28027). The other is how to find the index of the maximum/minimum value in an array while skipping missing values (#29305).

The solution I suggest is to make `SkipMissing` an "enhanced iterator" which would also support indexing. More specifically, it would use the same indices as the original array:

- `keys(itr::SkipMissing)` would return `(i for i in keys(itr.x) if !ismissing(itr.x[i]))`, or an equivalent iterator.
- `getindex` using indices of the original array would either return the corresponding value, or throw an error if the value is missing: `getindex(itr::SkipMissing, i...) = (v = itr.x[i...]; ismissing(v) ? throw(...) : v)`
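The two methods above can be tried out without touching Base. The following is only a sketch: `IndexableSkipMissing` is a hypothetical stand-in for `SkipMissing` (it is not part of Julia), and the choice of `MissingException` for the elided `throw(...)` is an assumption.

```julia
# Hypothetical stand-in for Base.SkipMissing, used only for illustration.
struct IndexableSkipMissing{T}
    x::T
end

# Indices of the parent array whose entries are not missing.
Base.keys(itr::IndexableSkipMissing) =
    (i for i in keys(itr.x) if !ismissing(itr.x[i]))

# Index with the parent's own indices; hitting a missing entry is an error.
# The exception type is an assumption, the proposal leaves it unspecified.
function Base.getindex(itr::IndexableSkipMissing, i...)
    v = itr.x[i...]
    ismissing(v) && throw(MissingException("the value at index $i is missing"))
    return v
end

x = [3, missing, 7, missing, 5]
itr = IndexableSkipMissing(x)

collect(keys(itr))   # [1, 3, 5]
itr[3]               # 7
# itr[2] would throw a MissingException
```

Because the wrapper reuses the parent's indices, any index obtained from `keys(itr)` can be used on `x` directly.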
Consistent with this proposal, `argmax(skipmissing(x))` would return the index `i` such that `x[i]` is the highest non-missing value in `x` (fixing #29305). And `reduce(+, skipmissing(x), dims=i)` would compute the sum of non-missing values over dimension `i` of `x`, with the same shape as `reduce(+, x, dims=i)` (fixing #28027). In both cases, if `x` contains no missing values, the result would be indistinguishable from applying the operation directly to `x`.

`IteratorSize(SkipMissing)` would still return `SizeUnknown`, since computing the length is an O(N) operation.
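To make the intended results concrete, here is a minimal sketch of the two behaviours written as standalone functions on a plain array. The names `argmax_skipping` and `sum_skipping` are hypothetical helpers introduced for illustration, not existing or proposed API, and replacing missing entries with `0` is a shortcut that is only valid for sums of numbers.

```julia
# Hypothetical helper: index of the largest non-missing value of `x`,
# expressed in the indices of `x` itself.
function argmax_skipping(x)
    best = nothing
    for i in keys(x)
        v = x[i]
        ismissing(v) && continue
        if best === nothing || isless(x[best], v)
            best = i
        end
    end
    best === nothing && throw(ArgumentError("no non-missing values"))
    return best
end

# Hypothetical helper: sum over `dims`, counting missing entries as zero,
# so the result has the same shape as sum(x, dims=dims).
sum_skipping(x; dims) = sum(v -> coalesce(v, 0), x; dims=dims)

x = [1 missing 3; missing 5 2]

argmax_skipping(x)         # CartesianIndex(2, 2), since x[2, 2] == 5
sum_skipping(x, dims=1)    # 1×3 matrix [1 5 5]
sum_skipping(x, dims=2)    # 2×1 matrix [4; 7]
```

On an array without missing values both helpers agree with `argmax(x)` and `sum(x, dims=i)`, which matches the requirement that the result be indistinguishable in that case.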