Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

peakprom can't find a peak within the minimum distance w of the boundary #9

Closed
apbraun opened this issue May 14, 2020 · 6 comments · Fixed by #13
Closed

peakprom can't find a peak within the minimum distance w of the boundary #9

apbraun opened this issue May 14, 2020 · 6 comments · Fixed by #13

Comments

@apbraun
Copy link
Contributor

apbraun commented May 14, 2020

Calculating the extrema of an array with peakprom it would be nice to be able to switch off the strictbounds imposed by the maxima/minima.
Maybe something like

...
@eval begin
    function peakprom(x::AbstractVector{T}, ::$Extrema, w=1, minprom::T=zero(T), strictbounds::Bool=true) where T
        if ($Extrema) === Maxima
            m = maxima(x, w, strictbounds)
        else
            m = minima(x, w, strictbounds)
        end
...

would already be enough.

@apbraun
Copy link
Contributor Author

apbraun commented May 14, 2020

Apparently

...
lcan = ($argcomp1)(skipmissing(uview(x, lb:(m[i] - 1)))) # Slice from lower bound to the index prior to the current extrema
rcan = ($argcomp1)(skipmissing(uview(x, (m[i] + 1):rb))) # Slice corollary upper side
...

would also need to be changed to

...
lcan = m[i] == lbegin ? x[m[i]] : ($argcomp1)(skipmissing(uview(x, lb:(m[i] - 1)))) # Slice from lower bound to the index prior to the current extrema
rcan = m[i] == lend ? x[m[i]] : ($argcomp1)(skipmissing(uview(x, (m[i] + 1):rb))) # Slice corollary upper side
...

I've never made an issue before should I add these changes inside a pull request?

@halleysfifthinc
Copy link
Owner

You are welcome to give it a shot! However, the peakprom function is in need of a big overhaul and needs more tests to ensure parity/correctness with the minima/maxima functions.

For reference, the minima/maxima functions were significantly refactored for the latest version of Peaks.jl to enable correct and full support for missing/NaN elements. Since then, I haven't had the time to update the peakprom function to also correctly support missing/NaN elements. The strictbounds argument encompasses both missing/NaN allowances and extrema within w of the boundary, so simply adding the strictbounds argument to the peakprom function in its current state could result in incorrect and/or incomplete results.

Unfortunately, I won't have the time to work on this until August, but I should be able to give you some pointers and/or review a PR if you open one.

@apbraun
Copy link
Contributor Author

apbraun commented Jun 3, 2020

How would you like to handle the case of missing/NaN elements in the peakprom function? Since minima/maxima with strictbounds=true only allows for extrema where there is no missing/NaN within w of the boundary, what should happen if there is a missing/NaN element within the left/right boundary of the extremum that's considered for the calculation of the prominence? Should the prominence also default to missing/NaN since the value is just unknown? For strictbounds=false a missing/NaN value should just be ignored.

@halleysfifthinc
Copy link
Owner

That is the main question with updating the peakprom function. The algorithm for peak prominence is the same as what is used in MATLAB's findpeaks function, however, I don't know if/how the MATLAB algorithm works in the presence of NaNs, and obviously MATLAB doesn't have missings.

There are several key API decisions/concerns in my mind:

  1. Should the returned prominences for given peaks (indices) be (roughly, see point 2) the same for both strictbounds options?
  • That would require the same boundary extrema to be used, which would suggest that the boundary extrema should always be calculated with strictbounds=false. (The alternative would be to always pass on the strictbounds variable given to peakprom)
  • Example/testcase: A peak that has no missing/NaN within ±w adjacent elements (ie it will be returned for strictbounds = true|false), and where either/both of the possible boundary extrema do contain a missing/NaN within ±w adjacent elements.
  • I think the least surprising and most intuitive result would be for the calculated prominence to be the same in both cases.
  1. Regarding whether the prominence should be missing/NaN:
  • I hadn't thought of that! This will depend in part on what we decide for point 1; but either way, I think that it would probably be best to match the common Julia convention of returning missing whenever missings are involved (also see the last sub-bullet of point 3).
  1. On a somewhat related topic of conventions and naming:
  • I realized recently that there are some naming changes that would be helpful to make the API of the minima/maxima functions match base Julia extrema related functions better:
  • There are three types of (relevant) extrema functions in base: minimum (returns value), argmin (returns index), and findmin (returns tuple(value, index)). I plan on updating to match that naming scheme over several release cycles of Peaks.jl: deprecate minima/maxima to argminima/argmaxima and add findminima/findmaxima, re-add minima/maxima (returning the values) in another release.
  • The handling of missing in the minima/maxima functions might be viewed by some as not matching the typical conventions for missing behavior. None of the 3 extrema functions in Base I mentioned currently work with missings (relevant base PR JuliaLang/julia#35989) so there isn't a direct precedent, but typical conventions for missing behavior in the minima/maxima functions would possibly call for returning multiple missings whenever they occurred in the w window amidst the other valid peaks. However I think that (or similar results) would be confusing and unhelpful; I think the current behavior of just not returning any peaks that would be "poisoned" by a missing is more reasonable.

Thoughts on any and/or all of this?

@apbraun
Copy link
Contributor Author

apbraun commented Jul 3, 2020

  1. I think it is reasonable to have the same prominence for identical peaks independent of strictbounds, since strictbounds applies to w which mainly limits the distance of the peak to its neighbors.
  2. I would agree with you on the last bullet of 3. which would inflict, that no point with a missing/NaN value is returned for strictbounds = true.
  • If in this case there is a missing/NaN between a considered peak and the next equally or higher leveled peak outside the w window, I would agree that the most reasonable value is missing/NaN for the prominence.
  • If strictbounds = false, minima/maxima returns the peak even if there is a missing/NaN value inside the w window, so maybe peakprom could just ignore/skip over missing/NaN as well or just return missing/NaN. I would start working on the first variant for the moment. It should be easy to revert to returning missing/NaN in both cases.
  1. In my understanding peakprom would return peaks inside the boundary region for strictbounds = false. This would open up another boundary case to be considered namely what the peak prominence of a point on the boundary should be. I'm not entirely sure whether 0 is a reasonable value for this case.

@apbraun
Copy link
Contributor Author

apbraun commented Jul 3, 2020

* I would start working on the first variant for the moment. It should be easy to revert to returning `missing`/`NaN` in both cases

Skipping over NaNs and missings is more costly than I thought. This would mean that the strictsbounds=false option would prove about 18 times as long as the strictbounds=true option that just returns NaN or missing in case it encounters either of these values. I'm not entirely sure whether that's useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants