Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parameter for changing index order #87

Merged
merged 21 commits into from
Apr 8, 2019

Conversation

simeks
Copy link
Contributor

@simeks simeks commented Mar 14, 2019

This is a proposal for the problem discussed in #75. This should probably be seen as a work-in-progress and a base for further discussion. I'm not sure whether this is the best approach so feedback would be appreciated.

The first addition is the index_order argument for nrrd.read which allows the user to specify the desired index order. I set the default to 'F' so the behavior should stay the same as with previous versions.

I also changed nrrd.write. This function did previously not work on C-contiguous arrays and this should now be fixed. Here I added an index_order flag, but my hope is that it should only be required in a few cases. Similar to most numpy functions there are three options; F, C, and A. The A (default) option automatically detects the order of the array. So for example, when trying to write an array that was read using nrrd.read(index_order='F'), the function should correctly derive the shape and write the array in Fortran order.

At first sight I see only one problem with the A option and that is that it breaks down if you do something like this:

arr = nrrd.read('in', 'F')
arr = np.ascontiguousarray(arr)
np.write('out', arr)

Since np.ascontiguousarray will keep the shape but change the underlying memory order, changing it to C. This is maybe not something we should worry about though.

Also, to by knowledge there is no way to do parameterized tests using the standard unittest module so I did some jury-rigging to test the index_order argument. This could probably use some clean-up.

nrrd/writer.py Outdated

# Write the raw data directly to the file
fh.write(raw_data)
elif header['encoding'].lower() in ['ascii', 'text', 'txt']:
# savetxt only works for 1D and 2D arrays, so reshape any > 2 dim arrays into one long 1D array
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any reasons for using savetxt for 2D? I mostly removed it to simplify my solution but I can put it back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, and it has been awhile, it's so it will make the ASCII data look like:

2D:

1 2 3
4 5 6
7 8 9

Would be nice to have to be consistent with NRRD spec, but not crucial. Maybe there's a cleaner way to set this up so that it works well with the index order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. This should be an easy fix so I'll change it back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense so I will revert this. Making it work with the ordering should be an easy fix.

@addisonElliott
Copy link
Collaborator

addisonElliott commented Mar 14, 2019

I just have some general comments for now. I will need to sit down and vigorously test this later.

However, I think the general implementation looks good and I think the general concepts seem right.

Comments

  • First, we need to fix the tests which from what I saw, it's a minor issue.
  • Would definitely want to add more documentation to the docstrings and potentially even a separate page on the pynrrd docs itself.
    • It's good for now until we get the implementation down though!
  • One concern of mine is that using index_order=F will be less efficient to loop through the slowest-varying axis (e.g. the z axis) than if using index_order=C. Although our current method is a bit hacky by reading the image as Fortran-order even though it is C-order to transpose the axes, doesn't it maintain efficiency when looping about the z axis?
  • I like how you modified the tests to run in C-order and Fortran-order. I'm surprised you didn't have to transpose any of the expected_data in the nrrd.read tests? Did I miss something?
  • Since the index_order parameter is likely to be a confusing topic, I think we should have more specialized tests in addition to the ones you've added. Not sure if this is necessary, will need to think on it whether the tests now fully test the capabilities of the index_order parameter.
  • Does line 290 raw_data = data.tostring(order=index_order) do what we expect it to?
  • For nrrd.write, do we want to use np.isfortran? I thought that determines the memory order. Don't we just care about the index order?

Also, let me make sure I'm understanding this correctly. The index_order parameter, as it states, only affects how the indices are represented. Python typically uses C-order but we are basically just transposing the data to make it Fortran-order.

Thus, for nrrd.write, the index order is just specifying whether we transpose the data again before saving?

Sounds like index_order=C should be the recommended approach. Probably should include some notes so people can be aware.

@simeks
Copy link
Contributor Author

simeks commented Mar 15, 2019

  • First, we need to fix the tests which from what I saw, it's a minor issue.

Yeah, I don't really understand what went wrong there since it only seems to fail on Python 2.

  • One concern of mine is that using index_order=F will be less efficient to loop through the slowest-varying axis (e.g. the z axis) than if using index_order=C. Although our current method is a bit hacky by reading the image as Fortran-order even though it is C-order to transpose the axes, doesn't it maintain efficiency when looping about the z axis?

Practically this should produce the same result as the current method. The transpose is not actually changing the layout of the data in memory, it is simply reversing the strides and flagging the array as Fortran contiguous. The same thing is happening when you're doing np.reshape(data, tuple(header['sizes']), order='F'). The x-axis (Given x, y, z in Fortran or z, y, x in C) will still be the fastest in memory.

  • I like how you modified the tests to run in C-order and Fortran-order. I'm surprised you didn't have to transpose any of the expected_data in the nrrd.read tests? Did I miss something?

I did change expected_data by setting the order to self.index_order when first reading it from the file. Still a bit unsure whether there are any implications by doing this in the test though. It seemed to work when I did some tests on real data.

  • Since the index_order parameter is likely to be a confusing topic, I think we should have more specialized tests in addition to the ones you've added. Not sure if this is necessary, will need to think on it whether the tests now fully test the capabilities of the index_order parameter.

Yes, definitely. One thing that hit me is that the A option in the writer still is untested. Writing a test for that could be a bit confusing though.

  • Does line 290 raw_data = data.tostring(order=index_order) do what we expect it to?

Hm, I would hope so. tostring writes the element in the order specified by index_order. Given an array of shape (W,H,D), F would produce the results with the W axis being the fastest, independent of the underlying memory layout. C would do the opposite. A on the other hand will write in C order unless the array is flagged as Fortran contiguous.

  • For nrrd.write, do we want to use np.isfortran? I thought that determines the memory order. Don't we just care about the index order?

Yes, this is a bit weird and not really intuitive for me neither. The problem is that numpy is always in C index order so there's no real way to distinguish the index orders. However, this follows the way numpy seems to deal with order using the A option. The alternative would be to force the user to specify either F or C which could cause a lot of confusion. That said, there is definitely room for discussion here.

Also, let me make sure I'm understanding this correctly. The index_order parameter, as it states, only affects how the indices are represented. Python typically uses C-order but we are basically just transposing the data to make it Fortran-order.

Thus, for nrrd.write, the index order is just specifying whether we transpose the data again before saving?

Exactly, and the A option was an attempt to make the write function automatically detect the index_order.

Sounds like index_order=C should be the recommended approach. Probably should include some notes so people can be aware.

In my eyes, definitely. Especially since numpy uses C order. I understand the importance of backward compatibility though.

@addisonElliott
Copy link
Collaborator

  • First, we need to fix the tests which from what I saw, it's a minor issue.

Yeah, I don't really understand what went wrong there since it only seems to fail on Python 2.

Try to run the tests locally on Python 2 if you're able. I can rerun the Travis CI if it was just a freak occurrence that it failed.

  • One concern of mine is that using index_order=F will be less efficient to loop through the slowest-varying axis (e.g. the z axis) than if using index_order=C. Although our current method is a bit hacky by reading the image as Fortran-order even though it is C-order to transpose the axes, doesn't it maintain efficiency when looping about the z axis?

Practically this should produce the same result as the current method. The transpose is not actually changing the layout of the data in memory, it is simply reversing the strides and flagging the array as Fortran contiguous. The same thing is happening when you're doing np.reshape(data, tuple(header['sizes']), order='F'). The x-axis (Given x, y, z in Fortran or z, y, x in C) will still be the fastest in memory.

Yeah, you're probably right. Just to be safe, I might play around with this and ensure the two options give the exact same strides.

  • I like how you modified the tests to run in C-order and Fortran-order. I'm surprised you didn't have to transpose any of the expected_data in the nrrd.read tests? Did I miss something?

I did change expected_data by setting the order to self.index_order when first reading it from the file. Still a bit unsure whether there are any implications by doing this in the test though. It seemed to work when I did some tests on real data.

Hmm, this is where things seem a bit odd. The order parameter for the Numpy constructor is referring to the memory order, right? Maybe it would be better to explicitly transpose the data. This would be less likely to confuse people if they're reading the tests. And it will make a clear distinction that we are only altering the index order, not the memory order.

  • Does line 290 raw_data = data.tostring(order=index_order) do what we expect it to?

Hm, I would hope so. tostring writes the element in the order specified by index_order. Given an array of shape (W,H,D), F would produce the results with the W axis being the fastest, independent of the underlying memory layout. C would do the opposite. A on the other hand will write in C order unless the array is flagged as Fortran contiguous.

I'll need to do a bit more research on this, from what I've found there are some cases where the order parameter refers to the memory order and other cases where it refers to the index order. In numpy.tostring, there are three options: C, F, or None. When Noneis given,tostringwill check theF_CONTIGUOUSflag, basically doing the same thing asnumpy.isfortran`.

The dangerous part here is that what if the user has a C-ordered array that they are treating as a Fortran array. A lot of users aren't even paying attention to what memory-order their data is in. For example, let's say I'm used to MATLAB so I want to created a 5 x 10 x 15 array (x, y, z). Here's how I would do it: x = np.zeros((5, 10, 15)). Well, isn't this C-ordered actually? So this would trick np.isfortran and potentially mess up things here I think...

Also, let me make sure I'm understanding this correctly. The index_order parameter, as it states, only affects how the indices are represented. Python typically uses C-order but we are basically just transposing the data to make it Fortran-order.
Thus, for nrrd.write, the index order is just specifying whether we transpose the data again before saving?

Exactly, and the A option was an attempt to make the write function automatically detect the index_order.

If we want to truly separate the index order concept from the memory order, then I'm not sure a Numpy array will contain any information on the index order because that is arbitrary. So, I don't think an automatic order parameter is possible. Rather, my vote would be to make the index_order default to 'F' for the read and write functions. Then, at the release of 1.0 maybe we will switch that to 'C'.

This is what users would have to do for C-order which is kinda a pain, but not that bad really.

data, header = nrrd.read(...., index_order='C')
...
nrrd.write('xxx', data, header, index_order='C')

@codecov-io
Copy link

codecov-io commented Mar 18, 2019

Codecov Report

Merging #87 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #87      +/-   ##
==========================================
+ Coverage   99.45%   99.46%   +<.01%     
==========================================
  Files           6        6              
  Lines         368      374       +6     
  Branches      116      119       +3     
==========================================
+ Hits          366      372       +6     
  Misses          1        1              
  Partials        1        1
Impacted Files Coverage Δ
nrrd/reader.py 100% <100%> (ø) ⬆️
nrrd/writer.py 98.33% <100%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 77a0af6...4f301d3. Read the comment docs.

@simeks
Copy link
Contributor Author

simeks commented Mar 18, 2019

Try to run the tests locally on Python 2 if you're able. I can rerun the Travis CI if it was just a freak occurrence that it failed.

Odd, I thought I solved the issue this time since the tests seem to pass when I run them on Python 2. I have to look into this a bit further. First, it seemed like same warnings won't trigger twice on Python 2 unless you tell them to.

Edit: nevermind, the test actually passed. It was just Codecov that complained.

Hmm, this is where things seem a bit odd. The order parameter for the Numpy constructor is referring to the memory order, right? Maybe it would be better to explicitly transpose the data. This would be less likely to confuse people if they're reading the tests. And it will make a clear distinction that we are only altering the index order, not the memory order.

As far as I know, the order in this case only refers to the order you parse the linear data read from RAW_DATA_FILE_PATH. I.e., should x or z be the fastest axis. You're right about clarity though so it would probably be a good idea to do something like this:

if index_order == 'F':
    data = data.T

The dangerous part here is that what if the user has a C-ordered array that they are treating as a Fortran array. A lot of users aren't even paying attention to what memory-order their data is in. For example, let's say I'm used to MATLAB so I want to created a 5 x 10 x 15 array (x, y, z). Here's how I would do it: x = np.zeros((5, 10, 15)). Well, isn't this C-ordered actually? So this would trick np.isfortran and potentially mess up things here I think...

If we want to truly separate the index order concept from the memory order, then I'm not sure a Numpy array will contain any information on the index order because that is arbitrary. So, I don't think an automatic order parameter is possible. Rather, my vote would be to make the index_order default to 'F' for the read and write functions. Then, at the release of 1.0 maybe we will switch that to 'C'.

This is what users would have to do for C-order which is kinda a pain, but not that bad really.

data, header = nrrd.read(...., index_order='C')
...
nrrd.write('xxx', data, header, index_order='C')

Yeah, as mentioned in my first post, I realized that you could quite easily break things by doing something as simple as np.ascontiguousarray. My thought with A was to allow people to use it without thinking but that may be a bad idea. I know from experience that users (including me) are lazy 😄.

So, I agree, forcing F or C makes sense but this has to be clearly described in the documentation. Hopefully the users who find out about index_order='C' actually found it there and therefore know about the requirements for using nrrd.write. This shouldn't matter as much for the F users since this will continue to be the default behaviour.

@addisonElliott
Copy link
Collaborator

addisonElliott commented Mar 18, 2019

Well, I think we're in agreement for what to do now. I'll leave this PR up to you, but let me know if you have questions or need help. Also, may be useful to put [WIP] in the PR title until you're done. Just mention me when you are ready for a review. I will grab this branch and try to do a thorough testing of it before merging it in.

By the way, I probably need to add a threshold to the codecov diff. Having a greater coverage does not mean that your code is tested the best.

@ihnorton @tashrifbillah @AAAArcus Any comments or concerns? This PR will not break any of your code, but if you want to utilize the Python-default C-order index ordering for the data only, a new argument named index_order can be passed to essentially transpose the data in nrrd.read and nrrd.write.

A future version, potentially v1.0.0 whenever that will be, will set the default to use C-order which will break code but ultimately is easy to fix.

@simeks simeks changed the title Index order [WIP] Index order Mar 19, 2019
@simeks
Copy link
Contributor Author

simeks commented Mar 22, 2019

@addisonElliott I think this should be all, I hope I'm not forgetting anything. I made sure to clarify this in the user manual as well.

Copy link
Collaborator

@addisonElliott addisonElliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me. Documentation was a good start but I added a few more suggestions of content to add. Index ordering can be a confusing concept so I want to be as thorough as possible.

@tashrifbillah emailed me to let me know that he wants to review your PR just to make sure it doesn't break any of his existing code. Never hurts to have a second set of eyes.

From what I can tell, this code should be backwards compatible and I don't see any reason why it would break existing code.

@@ -182,6 +182,8 @@ Reading NRRD files
------------------
There are three functions that are used to read NRRD files: :meth:`read`, :meth:`read_header` and :meth:`read_data`. :meth:`read` is a convenience function that opens the file located at :obj:`filename` and calls :meth:`read_header` and :meth:`read_data` in succession and returns the data and header.

Reading NRRD files using :meth:`read` or :meth:`read_data` will by default return arrays being indexed in Fortran order, i.e array elements are accessed by `data[x,y,z]`. This differs to the C-order, i.e. `data[z,y,x]`, which is more common in numpy. The :obj:`index_order` parameter can be used to specify which index ordering should be used on the returned array ('C' or 'F'). The :obj:`index_order` parameter needs to be consistent with the parameter of same name in :meth:`write`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reading NRRD files using :meth:`read` or :meth:`read_data` will by default return arrays being indexed in Fortran order, i.e array elements are accessed by `data[x,y,z]`. This differs to the C-order, i.e. `data[z,y,x]`, which is more common in numpy. The :obj:`index_order` parameter can be used to specify which index ordering should be used on the returned array ('C' or 'F'). The :obj:`index_order` parameter needs to be consistent with the parameter of same name in :meth:`write`.
Reading NRRD files using :meth:`read` or :meth:`read_data` will by default return arrays being indexed in Fortran-order, i.e array elements are accessed by `data[x, y, z]`. This differs from C-order where array elements are accessed by `data[z, y, x]`, which is the more common order in Python and libraries (e.g. NumPy, scikit-image, PIL, OpenCV). The :obj:`index_order` parameter can be used to specify which index ordering should be used on the returned array ('C' or 'F').

Small comments on wording.

I think the last line mentioning that this should be consistent with nrrd.write should be removed. The reason why is because the index_order in nrrd.read can be specified as whichever order preferred with no consequences. However, the important distinction is that the user must specify the correct index_order in nrrd.write.

Having that last sentence in the reading NRRD section could make others think "So I need to know what order the data was saved in, C-order or Fortran-order?", which is not the case. In terms of nrrd.read, the index order is a preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it's worth reiterating on the fact that the user needs to be aware that the choices they make when reading will impact writing later on. This was the goal of the last sentence. Could probably do with some rephrasing though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm, I think this note that was added describes what you were trying to note. Do you agree?

The :obj:`index_order` parameter must be consistent with the index order specified in :meth:`read`. Reading an NRRD file in C-order and then writing as Fortran-order or vice versa will result in the data being transposed in the NRRD file.

If so, then we just need to think about where it should be placed. It is currently placed in nrrd.write and I think that's the best spot personally. Maybe we could even place in the function docstring.

@@ -194,12 +196,16 @@ Writing NRRD files
------------------
Writing to NRRD files can be done with the single function :meth:`write`. The :obj:`filename` parameter to the function specifies the absolute or relative filename to write the NRRD file. If the :obj:`filename` extension is .nhdr, then the :obj:`detached_header` parameter is set to true automatically. If the :obj:`detached_header` parameter is set to :obj:`True` and the :obj:`filename` ends in .nrrd, then the header file will have the same path and base name as the :obj:`filename` but with an extension of .nhdr. In all other cases, the header and data are saved in the same file.

The :obj:`data` parameter is a :class:`numpy.ndarray` of data to be saved. :obj:`header` is an optional parameter of type :class:`dict` containing the field/values to be saved to the NRRD file.
The :obj:`data` parameter is a :class:`numpy.ndarray` of data to be saved. :obj:`header` is an optional parameter of type :class:`dict` containing the field/values to be saved to the NRRD file. :obj:`index_order` specifies the index order of the array :obj:`data`, this should be consistent with the argument of the same name in :meth:`read`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The :obj:`data` parameter is a :class:`numpy.ndarray` of data to be saved. :obj:`header` is an optional parameter of type :class:`dict` containing the field/values to be saved to the NRRD file. :obj:`index_order` specifies the index order of the array :obj:`data`, this should be consistent with the argument of the same name in :meth:`read`.
The :obj:`data` parameter is a :class:`numpy.ndarray` of data to be saved. :obj:`header` is an optional parameter of type :class:`dict` containing the field/values to be saved to the NRRD file.
Writing NRRD files will by default index the :obj:`data` array in Fortran-order where array elements are accessed by `data[x, y, z]`. This differs from C-order where array elements are accessed by `data[z, y, x]`, which is the more common order in Python and libraries (e.g. NumPy, scikit-image, PIL, OpenCV). The :obj:`index_order` parameter can be used to specify which index ordering should be used for the given :obj:`data` array ('C' or 'F').

My thoughts here are to keep the same wording and description from the reading NRRD files section just to be consistent. In addition, the note will draw to the user's attention that they need to be careful what they put here.

docs/source/user-guide.rst Outdated Show resolved Hide resolved
docs/source/user-guide.rst Show resolved Hide resolved
docs/source/user-guide.rst Show resolved Hide resolved
nrrd/writer.py Outdated Show resolved Hide resolved
nrrd/writer.py Outdated
header['dimension'] = data.ndim
header['sizes'] = list(data.shape)

if index_order == 'F':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would inline if it all:

header['sizes'] = list(data.shape) if index_order == 'F' else list(data.shape[::-1])

nrrd/reader.py Outdated Show resolved Hide resolved
nrrd/reader.py Outdated
Parameters
----------
filename : :class:`str`
Filename of the NRRD file
custom_field_map : :class:`dict` (:class:`str`, :class:`str`), optional
Dictionary used for parsing custom field types where the key is the custom field name and the value is a
string identifying datatype for the custom field.
index_order : :class: `str`, optional
Specifies the index ordering of the outputted array, either 'F' (Fortran order) or 'C' (C order). Using 'F' (default) the image will be transposed, providing an index ordering in this manneri; `data[x][y][z]`. However, Numpy generally uses 'C' order, (e.g. `data[z][y][x]`) and 'C' order is therefore adviced to keep consistent with other libraries.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Specifies the index ordering of the outputted array, either 'F' (Fortran order) or 'C' (C order). Using 'F' (default) the image will be transposed, providing an index ordering in this manneri; `data[x][y][z]`. However, Numpy generally uses 'C' order, (e.g. `data[z][y][x]`) and 'C' order is therefore adviced to keep consistent with other libraries.
Specifies the index order used for reading. Either 'C' (C-order) where the dimensions are ordered from
slowest-varying to fastest-varying (e.g. (z, y, x)), or 'F' (Fortran-order) where the dimensions are ordered
from fastest-varying to slowest-varying (e.g. (x, y, z)).

I basically just copied the wording from the write function. Not sure if it captures the same message you had or not. Just an idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could cause some confusion in the same manner as you mentioned in a previous comment. "Specifies the index order used for reading" implies that you are reading in a certain order from the file, which is not the case. The order is simply the index order (whether to transpose the data or not before returning).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned that "slowest-varying to fastest-varying" is vague without a description of memory layout which I agree with. Whatever language we come up with to describe C-order & Fortran-order, it should be here.

Fair enough, how about the below wording? It's basically changing "outputted" to "resulting data"
"Specifies the index order of the resulting data array."

nrrd/reader.py Show resolved Hide resolved
@addisonElliott addisonElliott changed the title [WIP] Index order Add parameter for changing index order Mar 23, 2019
@simeks
Copy link
Contributor Author

simeks commented Mar 25, 2019

The Github UI is a mess, but I think I covered all the comments.

Copy link
Collaborator

@addisonElliott addisonElliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the new suggestions sounds like it may be buggy. I think there was one mess up with committing suggestions.

In addition, there was the inline if comment that wasn't addressed. Maybe that was overlooked or somehow your changes weren't committed?

docs/source/user-guide.rst Show resolved Hide resolved
@@ -182,6 +182,8 @@ Reading NRRD files
------------------
There are three functions that are used to read NRRD files: :meth:`read`, :meth:`read_header` and :meth:`read_data`. :meth:`read` is a convenience function that opens the file located at :obj:`filename` and calls :meth:`read_header` and :meth:`read_data` in succession and returns the data and header.

Reading NRRD files using :meth:`read` or :meth:`read_data` will by default return arrays being indexed in Fortran order, i.e array elements are accessed by `data[x,y,z]`. This differs to the C-order, i.e. `data[z,y,x]`, which is more common in numpy. The :obj:`index_order` parameter can be used to specify which index ordering should be used on the returned array ('C' or 'F'). The :obj:`index_order` parameter needs to be consistent with the parameter of same name in :meth:`write`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm, I think this note that was added describes what you were trying to note. Do you agree?

The :obj:`index_order` parameter must be consistent with the index order specified in :meth:`read`. Reading an NRRD file in C-order and then writing as Fortran-order or vice versa will result in the data being transposed in the NRRD file.

If so, then we just need to think about where it should be placed. It is currently placed in nrrd.write and I think that's the best spot personally. Maybe we could even place in the function docstring.

nrrd/reader.py Show resolved Hide resolved
nrrd/reader.py Outdated
Parameters
----------
filename : :class:`str`
Filename of the NRRD file
custom_field_map : :class:`dict` (:class:`str`, :class:`str`), optional
Dictionary used for parsing custom field types where the key is the custom field name and the value is a
string identifying datatype for the custom field.
index_order : :class: `str`, optional
Specifies the index ordering of the outputted array, either 'F' (Fortran order) or 'C' (C order). Using 'F' (default) the image will be transposed, providing an index ordering in this manneri; `data[x][y][z]`. However, Numpy generally uses 'C' order, (e.g. `data[z][y][x]`) and 'C' order is therefore adviced to keep consistent with other libraries.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned that "slowest-varying to fastest-varying" is vague without a description of memory layout which I agree with. Whatever language we come up with to describe C-order & Fortran-order, it should be here.

Fair enough, how about the below wording? It's basically changing "outputted" to "resulting data"
"Specifies the index order of the resulting data array."

nrrd/writer.py Show resolved Hide resolved
Copy link
Contributor Author

@simeks simeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this got really confusing due to me struggling with the interface. I think I managed to cover it all in the latest commit.

docs/source/user-guide.rst Show resolved Hide resolved
nrrd/reader.py Show resolved Hide resolved
Copy link
Collaborator

@addisonElliott addisonElliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the documentation side of this now.

There's one minor fix to make and then I want to take another look at the unit testing to make sure we're thorough. I think we are, but I just want to make absolutely sure the tests are running as expected.

@tashrifbillah hopefully will be able to review this soon so we can get it merged in.

docs/source/user-guide.rst Outdated Show resolved Hide resolved
@simeks
Copy link
Contributor Author

simeks commented Mar 29, 2019

👍

I want to take another look at the unit testing to make sure we're thorough. I think we are, but I just want to make absolutely sure the tests are running as expected.

Yes, of course. There are other approaches as well, like https://pypi.org/project/parameterized/, but I didn't want to add additional dependencies.

Copy link
Collaborator

@addisonElliott addisonElliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked over the tests closely. LGTM

Edit: One more minor change to make. I've found you enabled warnings for the code which is a good idea. However, I have a PR where it is at the top of each file and will fix some of these warnings.

@@ -176,6 +178,9 @@ def test_read_dup_field_error_and_warn(self):

import warnings
with warnings.catch_warnings(record=True) as w:
# Enable all warnings
warnings.simplefilter("always")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go ahead and remove this line. When running tests on your branch, I noticed I received some warnings that I didn't get before. I finally figured out it's because of this line.

I have an upcoming PR where I enable filters for the entire tests (I put this code at the top of each file). I also will be fixing some of these warnings too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . This makes the test fail though, but since you're enabling it in another PR I guess it's fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, that's the reason why the one test was failing. I'm curious as to why this started happening only in your PR. I'm sure there's some reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is mostly likely caused by the fact that we want to trigger the same warning twice (for both 'C' and 'F'). I don't know if it is intended but it seems that Python 2 only triggers warnings once, omitting the warning in subsequent calls.

@addisonElliott addisonElliott merged commit 3708341 into mhe:master Apr 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants