Skip to content

Conversation

@brechmos-stsci
Copy link

@brechmos-stsci brechmos-stsci commented Aug 24, 2018

This update will allow SpectralRegions to have multiple sub-regions defined that may not be continuous. Needed, for example, if one wants to define a "noise region" that encompasses the left side and right side of a spectrum but not the middle portion.

Fixes #286
Fixes #285
Fixes #287

@brechmos-stsci brechmos-stsci self-assigned this Aug 24, 2018
@brechmos-stsci
Copy link
Author

The current version of code allows:

Spectral region based on one range of values.

# Spectral region with just one range (lower and upper bound)
sr = SpectralRegion(0.45*u.um, 0.6*u.um)

Spectral region with two ranges of values:

# Spectral region with just two ranges
sr = SpectralRegion([(0.45*u.um, 0.6*u.um), (0.8*u.um, 0.9*u.um)])

Combine two Spectral regions into one:

# Add two spectral regions together
sr = SpectralRegion(0.45*u.um, 0.6*u.um) + SpectralRegion(0.8*u.um, 0.9*u.um)

In-place adding spectral regions:

# Add one spectral region into another.
sr1 = SpectralRegion(0.45*u.um, 0.6*u.um)
sr2 = SpectralRegion(0.8*u.um, 0.9*u.um)
sr1 += sr2

Get a single region:

# Get on spectral region (returns a SpectralRegion instance)
sone = sr1[0]

Get lower and upper bounds:

# Bounds on the spectral region (most minimum and maximum bound)
print(sr.bounds)

Get lower bound:

# Lower bound on the spectral region (most minimum)
print(sr.lower)

Get lower bound of a single sub-region:

# Lower bound on one element of the spectral region.
print(sr[0].lower)

Delete a sub-region:

# Remove a sub-region from the spectral region
del sr1[1]

Iterate over sub-regions:

# Iterate over a spectral region with multiple regions defined.
sr = SpectralRegion(0.15*u.um, 0.2*u.um) + SpectralRegion(0.3*u.um, 0.4*u.um) +\
     SpectralRegion(0.45*u.um, 0.6*u.um) + SpectralRegion(0.8*u.um, 0.9*u.um) +\
     SpectralRegion(1.0*u.um, 1.2*u.um) + SpectralRegion(1.3*u.um, 1.5*u.um)
for s in sr:
    print(s.lower)

Slicing a SpectralRegion:

# Slice spectral region.
subsr = sr[3:5]
# SpectralRegion: 0.8 um - 0.9 um, 1.0 um - 1.2 um

Invert a SpectralRegion:

sr_inverted = sr.invert(0.05*u.um, 3*u.um)
print(sr_inverted)
# SpectralRegion: 0.05 um - 0.15 um, 0.2 um - 0.3 um, 0.4 um - 0.45 um, 0.6 um - 0.8 um, 0.9 um - 1.0 um, 1.2 um - 1 .3 um, 1.5 um - 3.0 um

For SpectralRegion invert, pretend you have a SpectralRegion with 3 sub-regions that define peaks in a spectrum. Then you want to get all the other parts of the spectrum NOT in those 3 sub-regions (between a lower and upper bound). That is the idea of invert.

@brechmos-stsci
Copy link
Author

@eteq @hcferguson @robelgeda @nmearl

This is a first pass at the functionality. @robelgeda are there other things you can think you might want for your specific methods? (extract from spectrum is still there).

If there is general consensus that this is the right direction of functionality, then I will start adding tests for each and add documentation.

@robelgeda
Copy link
Contributor

@brechmos-stsci: As far as #286 goes this is really good and extensive. The only suggestion I have is that for

Get lower and upper bounds:

# Bounds on the spectral region (most minimum and maximum bound)
print(sr.bounds)

We should show the bounds of all the regions in the container along with the most max and min bounds. Other than that the summary looks very good to me.

@eteq
Copy link
Member

eteq commented Aug 24, 2018

@brechmos-stsci and I just chatted a bit out-of-band about this. I'm writing down a summary of some of the key points so that we don't lose them.

A baseline motivation for this is something like what one might otherwise use boolean array masks for: to cut out non-contiguous pieces of a spectrum that are to be used together, so it's not the same as having a list of regions. I've changed the title of this PR to reflect that a bit more.

But since the motivation is to replace masks, it leads to the question of what to do for situations where we already have masks. For example, if you want the region surrounding some spectral line, the easiest thing to do is:

line_sides = spectrum.flux < 10*u.nJy
line_side_spectrum = spectrum[line_sides]
... do whatever you wanted to do with line_side_spectrum

so we would want the region implementation here to fairly easily mock that up. I'd propose as a way to do this:

  1. A method SpectralRegion.mask(spectrum) which yields the boolean mask. It could also yield a non-boolean mask, where the edges are the fraction of overlap between the SpectralRegion and that one pixel (which is only 1 if the edges ox pixels are exactly aligned with the region edges)?
  2. A classmethod SpectralRegion.from_mask(mask, spectrum) which creates a new SpectralRegion given the mask and a spectrum. Should be symmetric with SpectralRegion.mask (although that's only possible of non-boolean masks are allowed).

The above could quite reasonably be separate from this PR though - it came up because it demonstrates that this PR's approach does not block that functionality.

However (and this was my biggest hangup initially): this does not mean we can use something like bitmasks to replace spectral regions, though: spectral regions have utility independent of a spectrum (e.g., "near the wl of Halpha" is a spectral region regardless of whether you have a spectrum with the line), whereas bitmasks can only be assigned a spectral_axis value when applied to a specific spectrum.

@brechmos-stsci
Copy link
Author

brechmos-stsci commented Aug 27, 2018

@eteq said it above, but I want to re-iterate...

My intent of spectral regions is that they are independent of a particular Spectrum1D object. Therefore, one could define a spectral region halpha_region = SpectralRegion(650*u.nm, 660*u.nm) and then halpha_region could be applied to spectra that are sampled differently.

This is why I went with a list of 2-tuples rather than a mask implementation.

@hcferguson
Copy link
Contributor

This looks very good. And I agree that having spectral regions that are independent of the pixel sampling could be quite powerful.

@brechmos-stsci
Copy link
Author

It sounds like there is general consensus that this is generally going in the right direction. I'll add in/fix tests add documentation.

@brechmos-stsci
Copy link
Author

brechmos-stsci commented Aug 28, 2018

@robelgeda

We should show the bounds of all the regions in the container along with the most max and min bounds. Other than that the summary looks very good to me.

I would suggest that this remain the way it is regarding the bounds. The idea is that this is the bounds for the whole spectral region.

If one wants to get the bounds for each sub-region one could just do:

lower_bounds = [sr[0] for sr in spectral_region]
upper_bounds = [sr[1] for sr in spectral_region]

So, I would suggest this stays the way it is for now. If others have an opinion, I am happy to listen (and change!).

@brechmos-stsci
Copy link
Author

This ticket is close but I want to wait to hear back from @nmearl about #288 as this ticket assumes the wcs adapter world_to_pixel method works properly.

@robelgeda
Copy link
Contributor

That sounds good to me

@brechmos-stsci
Copy link
Author

@eteq @crawfordsm @hcferguson @nmearl @hcferguson

In the current implementation of SpectralRegion there is an extract method that works like: sub_spectrum = interesting_region.extract(large_spectrum). This makes sense (to me).

The new enhanced SpectralRegion may contain multiple sub-regions, but in this case what would the extract method look like?

I don't think it makes that much sense to create a single sub_spectrum that is a sliced-up version of the large_spectrum.

So, do you think the SpectralRegion extract method should return a list of Spectrum1D objects, one for each sub-region?

@hcferguson
Copy link
Contributor

Seems sensible, although I can't immediately think of a science use case for doing this.

Or maybe I can.

oxygen_lines = SpectralRegion[(0.3725*u.um, 0.3729*u.um), (0.4957*u.um, 0.4961*u.um), (0.5005*u.um, 0.5009*u.um)
h_beta = SpectralRegion((0.4859*u.um, 0.4863*u.um)

r23 = flux(my_spectrum,region=oxygen_lines)/flux(my_spectrum,region=h_beta)

Although this use case means we have to have a function flux which takes these arguments. I don't think this exists yet. And of course one probably would make this more complex by doing some type of continuum subtraction along the way.

@brechmos-stsci
Copy link
Author

brechmos-stsci commented Aug 28, 2018

The other use case I can think of is to compute the RMS in the baseline parts of the spectrum defined by a SpectralRegion made up of multiple sub-regions.

But maybe this all begs for the "extract" to go the other way, as you say. Maybe there does need to be an "extract flux" (and dispersion?) method defined in Spectrum1D that takes a spectral region and returns a numpy array of flux over those spectral regions. (Rather than an extract in SpectralRegion)

[a little later]

I was thinking about this and I wonder if I should add methods into Spectrum1D:

  • flux_from_region(SpectralRegion)
  • uncertainty_from_region(SpectralRegion)
  • dispersion_from_region(SpectralRegion)

Then one can use that information as in @hcferguson's examples (and mine above).

@hcferguson
Copy link
Contributor

hcferguson commented Aug 28, 2018

In my example the function flux() was returning the integral of the flux (accounting for fractional pixels). A potential pitfall to having functions like flux_from_region(SpectralRegion) return numpy arrays would be that you are losing the benefit of whatever machinery we build to deal with fractional pixels.

@brechmos-stsci
Copy link
Author

@hcferguson Maybe the flux, in your example, could take an operator too so one could do sum, mean, median, std etc but then with the proper machinery for fractional pixels (?).

@hcferguson
Copy link
Contributor

Well, in the case of flux() in my example, the sum would have units of energy while the mean, median, std would have units of flux density (e.g. energy/frequency). So I'm a bit hesitant to combine them.

@nmearl
Copy link
Contributor

nmearl commented Aug 30, 2018

I don't think it makes that much sense to create a single sub_spectrum that is a sliced-up version of the large_spectrum.

Why not? If I knowing cut out regions around some absorption features, then my expectation would be that my dispersion solution would look like, e.g., spec.wavelength = [5, 6, 7, 8, 21, 22, 23, 24, 25] AA. It'd be the same expectation if I had applied a mask, with the benefit of fractional pixel data.

So, do you think the SpectralRegion extract method should return a list of Spectrum1D objects, one for each sub-region?

This is kind of interesting. In one case, e.g., extracting information about the continuum, a list of Spectrum1D would be the last thing that I'd want -- a single continuum spectrum is what I'd expect. However, when I'm wanted to extract e.g. SiIV lines from a spectrum in velocity space, I do generally want to be able to grab just one as a single Spectrum1D object. With that said, I'd expect to do something like interest_regions[0].extract(large_spectrum) to get a particular absorption feature.

In the current implementation of SpectralRegion there is an extract method that works like: sub_spectrum = interesting_region.extract(large_spectrum). This makes sense (to me).

Can I just mention that this seems completely semantically backwards to me -- conceptually, I want to extract a region from a spectrum, not a spectrum from a region. I feel like interesting_spectrum = large_spectrum.extract(interesting_region) makes more sense.

@hcferguson
Copy link
Contributor

I agree that spectrum.extract(region) seems more intuitive than region.extract(spectrum).

However, in reading the APE, I inferred there was a design decision to try to keep the Spectrum1D object lightweight, without a lot of methods apart from arithmetic. I guess the rationale for that was so that it would be easier to serialize, although that wasn't mentioned explicitly? If so, one should be cautious about adding any new methods.

@eteq
Copy link
Member

eteq commented Aug 31, 2018

Re:

I don't think it makes that much sense to create a single sub_spectrum that is a sliced-up version of the large_spectrum.

and

I was thinking about this and I wonder if I should add methods into Spectrum1D:

I agree with @nmearl here - I think it does make sense to use Spectrum1d~ - either the bits-of-continuum-to-get-RMS or the example @hcferguson gave could be implemented this way. I think it's much *better* to do it that way vs custom flux/unc/etc accessors, because we know we need flux*anyway* in the spectrum objects. So then it's entirely the business ofextract` to get the WCS right, but once that's done the user can trust it (rather than needing to worry about carrying the flux around carefully).

The hardest thing is about what to do on the edges: my vote is for the edge pixels in the resulting spectra to have the flux (and appropriately error-propogated uncertainties) from assuming they are "fractional" pixels based on the region edge+wcs, as we've discussed previously. The plot might look a bit weird because the edge pixels are all going to be low, but sum/var etc will all make more sense.

Put another way, we get all of those accessors "for free" if we return a set of spectra, but we are just throwing the user back in the current situation of "deal with arrays" if we implement the methods.

So, do you think the SpectralRegion extract method should return a list of Spectrum1D objects, one for each sub-region?

What about returning a SpectrumCollection? That has the advantage that we might eventually implement helpers like tools that will measure the total fluxes and such...

@eteq
Copy link
Member

eteq commented Aug 31, 2018

Re: @nmearl's

Can I just mention that this seems completely semantically backwards to me

I think this might be flip-a-coin. It's like the unix problem of linking: are you linking file a to location b, or creating link a and pointing it to file b? Reasonable people would think it could go either way, so we have to just pick one. But I'd vote for a more descriptive name. E.g., either:

  1. subspec = spectrum.extract_region(region)
  2. subspec = region.extract_from(spectrum)

@brechmos-stsci
Copy link
Author

So, there seem to be two things we need to decide on:

  1. What is returned from the extract region method (in particular, when there are multiple sub-regions in a spectral region)?

@nmearl says that a single spectrum should be returned regardless of the number of sub-regions.

@eteq suggests a SpectrumCollection could be returned.

Decision (?): Return a single Spectrum1D object.

  1. Where should the spectral-extract-region method live? (could be either in Spectrum1D or in SpectralRegion)

@nmearl would like to see it spectrum.extract(region), ie that the extraction should live in Spectrum1D

@hcferguson says the APE suggests the Spectrum1D should be lightweight and therefore extract may not fit in the Spectrum1D object.

@eteq the method could live in either but maybe it should be a more descriptive name.

Decision (?): Make the method name more descriptive and leave in SpectralRegion. (So Spectrum1D remains light-weight).

@eteq @hcferguson @nmearl @anyoneelse What do you guys think of the suggested "decisions" for points 1 and 2 above? (These are both reasonably fundamental but there is other work waiting on these so I would like to move this along a bit).

@nmearl nmearl added this to the v0.6.0 milestone Sep 5, 2018
@nmearl nmearl requested review from eteq and hcferguson September 5, 2018 15:15
@astrofrog
Copy link
Member

astrofrog commented Sep 5, 2018

FWIW I agree with @nmearl here:

Can I just mention that this seems completely semantically backwards to me -- conceptually, I want to extract a region from a spectrum, not a spectrum from a region. I feel like interesting_spectrum = large_spectrum.extract(interesting_region) makes more sense.

Or maybe extract could be a function, but I find it confusing to have region.extract(spectrum) (personally)

@brechmos-stsci
Copy link
Author

Based on the above comments I am going to:

  1. Change the extract method to return a single Spectrum1D object based on the regions
  2. Move the extract method to a method in the manipulation package and make a function that takes a Spectrum1D and a SpectralRegion and returns a Spectrum1D object.

If someone would like otherwise, feel free to comment.

@eteq
Copy link
Member

eteq commented Sep 6, 2018

Re: @brechmos-stsci 's 2 - To get something out for 0.4, lets go with the function approach as you suggested in #289 (comment). It's a lot easier to "upgrade" that to one of the method-approaches later (the function then just turns into a method call).

For later consideration: What about the case where someone wants a custom form of extraction? There was a case just like this for #210 ... In that case the best solution is definitely region.extract_from(spectrum), because the user can then do a simple subclass of region and override only the extract_from method. So to me that has swung the pendulum in favor of method-of-region (but I agree with @astrofrog and @nmearl that just extract is not sufficient and extract_from is better).

Re: item 1 - I realized SpectrumCollection won't work because they are not square. but I guess I still favor list-of-Spectrum1D? I don't really see the motivation for combining them into one big spectrum. It is possible but it seems to me to be not desirable.

@brechmos-stsci
Copy link
Author

After a short discussion with @eteq, I am going to have the output of extract_region as a list of Spectrum1D objects, one for each sub-region.

But... I think there is a need for a Spectrum1D concatenate method that takes some number of Spectrum1D objects and outputs a single one. There are a lot of sketchy issues with that, but it could be useful for applying to the output of extract_region in order to get the one Spectrum1D.

Copy link
Contributor

@hcferguson hcferguson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants